< draft-ietf-tsvwg-ecn-tunnel-08.txt | draft-ietf-tsvwg-ecn-tunnel-09.txt > | |||
---|---|---|---|---|
Transport Area Working Group B. Briscoe | Transport Area Working Group B. Briscoe | |||
Internet-Draft BT | Internet-Draft BT | |||
Updates: 3168, 4301 March 03, 2010 | Updates: 3168, 4301, 4774 July 30, 2010 | |||
(if approved) | (if approved) | |||
Intended status: Standards Track | Intended status: Standards Track | |||
Expires: September 4, 2010 | Expires: January 31, 2011 | |||
Tunnelling of Explicit Congestion Notification | Tunnelling of Explicit Congestion Notification | |||
draft-ietf-tsvwg-ecn-tunnel-08 | draft-ietf-tsvwg-ecn-tunnel-09 | |||
Abstract | Abstract | |||
This document redefines how the explicit congestion notification | This document redefines how the explicit congestion notification | |||
(ECN) field of the IP header should be constructed on entry to and | (ECN) field of the IP header should be constructed on entry to and | |||
exit from any IP in IP tunnel. On encapsulation it updates RFC3168 | exit from any IP in IP tunnel. On encapsulation it updates RFC3168 | |||
to bring all IP in IP tunnels (v4 or v6) into line with RFC4301 IPsec | to bring all IP in IP tunnels (v4 or v6) into line with RFC4301 IPsec | |||
ECN processing. On decapsulation it updates both RFC3168 and RFC4301 | ECN processing. On decapsulation it updates both RFC3168 and RFC4301 | |||
to add new behaviours for previously unused combinations of inner and | to add new behaviours for previously unused combinations of inner and | |||
outer header. The new rules ensure the ECN field is correctly | outer header. The new rules ensure the ECN field is correctly | |||
propagated across a tunnel whether it is used to signal one or two | propagated across a tunnel whether it is used to signal one or two | |||
severity levels of congestion, whereas before only one severity level | severity levels of congestion, whereas before only one severity level | |||
was supported. Tunnel endpoints can be updated in any order without | was supported. Tunnel endpoints can be updated in any order without | |||
affecting pre-existing uses of the ECN field, providing backward | affecting pre-existing uses of the ECN field, thus ensuring backward | |||
compatibility. Nonetheless, operators wanting to support two | compatibility. Nonetheless, operators wanting to support two | |||
severity levels (e.g. for pre-congestion notification--PCN) can | severity levels (e.g. for pre-congestion notification--PCN) can | |||
require compliance with this new specification. A thorough analysis | require compliance with this new specification. A thorough analysis | |||
of the reasoning for these changes and the implications is included. | of the reasoning for these changes and the implications is included. | |||
In the unlikely event that the new rules do not meet a specific need, | In the unlikely event that the new rules do not meet a specific need, | |||
RFC4774 gives guidance on designing alternate ECN semantics and this | RFC4774 gives guidance on designing alternate ECN semantics and this | |||
document extends that to include tunnelling issues. | document extends that to include tunnelling issues. | |||
Status of This Memo | Status of This Memo | |||
This Internet-Draft is submitted to IETF in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF), its areas, and its working groups. Note that | Task Force (IETF). Note that other groups may also distribute | |||
other groups may also distribute working documents as Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts. | Drafts is at http://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
The list of current Internet-Drafts can be accessed at | This Internet-Draft will expire on January 31, 2011. | |||
http://www.ietf.org/ietf/1id-abstracts.txt. | ||||
The list of Internet-Draft Shadow Directories can be accessed at | ||||
http://www.ietf.org/shadow.html. | ||||
This Internet-Draft will expire on September 4, 2010. | ||||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2010 IETF Trust and the persons identified as the | Copyright (c) 2010 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
described in the BSD License. | described in the Simplified BSD License. | |||
This document may contain material from IETF Documents or IETF | ||||
Contributions published or made publicly available before November | ||||
10, 2008. The person(s) controlling the copyright in some of this | ||||
material may not have granted the IETF Trust the right to allow | ||||
modifications of such material outside the IETF Standards Process. | ||||
Without obtaining an adequate license from the person(s) controlling | ||||
the copyright in such materials, this document may not be modified | ||||
outside the IETF Standards Process, and derivative works of it may | ||||
not be created outside the IETF Standards Process, except to format | ||||
it for publication as an RFC or to translate it into languages other | ||||
than English. | ||||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 9 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 10 | |||
1.1. Scope . . . . . . . . . . . . . . . . . . . . . . . . . . 11 | 1.1. Scope . . . . . . . . . . . . . . . . . . . . . . . . . . 12 | |||
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 11 | 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 12 | |||
3. Summary of Pre-Existing RFCs . . . . . . . . . . . . . . . . . 12 | 3. Summary of Pre-Existing RFCs . . . . . . . . . . . . . . . . . 14 | |||
3.1. Encapsulation at Tunnel Ingress . . . . . . . . . . . . . 12 | 3.1. Encapsulation at Tunnel Ingress . . . . . . . . . . . . . 14 | |||
3.2. Decapsulation at Tunnel Egress . . . . . . . . . . . . . . 13 | 3.2. Decapsulation at Tunnel Egress . . . . . . . . . . . . . . 15 | |||
4. New ECN Tunnelling Rules . . . . . . . . . . . . . . . . . . . 14 | 4. New ECN Tunnelling Rules . . . . . . . . . . . . . . . . . . . 16 | |||
4.1. Default Tunnel Ingress Behaviour . . . . . . . . . . . . . 15 | 4.1. Default Tunnel Ingress Behaviour . . . . . . . . . . . . . 16 | |||
4.2. Default Tunnel Egress Behaviour . . . . . . . . . . . . . 15 | 4.2. Default Tunnel Egress Behaviour . . . . . . . . . . . . . 17 | |||
4.3. Encapsulation Modes . . . . . . . . . . . . . . . . . . . 17 | 4.3. Encapsulation Modes . . . . . . . . . . . . . . . . . . . 19 | |||
4.4. Single Mode of Decapsulation . . . . . . . . . . . . . . . 19 | 4.4. Single Mode of Decapsulation . . . . . . . . . . . . . . . 20 | |||
5. Updates to Earlier RFCs . . . . . . . . . . . . . . . . . . . 20 | 5. Updates to Earlier RFCs . . . . . . . . . . . . . . . . . . . 21 | |||
5.1. Changes to RFC4301 ECN processing . . . . . . . . . . . . 20 | 5.1. Changes to RFC4301 ECN processing . . . . . . . . . . . . 21 | |||
5.2. Changes to RFC3168 ECN processing . . . . . . . . . . . . 20 | 5.2. Changes to RFC3168 ECN processing . . . . . . . . . . . . 22 | |||
5.3. Motivation for Changes . . . . . . . . . . . . . . . . . . 22 | 5.3. Motivation for Changes . . . . . . . . . . . . . . . . . . 23 | |||
5.3.1. Motivation for Changing Encapsulation . . . . . . . . 22 | 5.3.1. Motivation for Changing Encapsulation . . . . . . . . 23 | |||
5.3.2. Motivation for Changing Decapsulation . . . . . . . . 23 | 5.3.2. Motivation for Changing Decapsulation . . . . . . . . 24 | |||
6. Backward Compatibility . . . . . . . . . . . . . . . . . . . . 25 | 6. Backward Compatibility . . . . . . . . . . . . . . . . . . . . 27 | |||
6.1. Non-Issues Updating Decapsulation . . . . . . . . . . . . 25 | 6.1. Non-Issues Updating Decapsulation . . . . . . . . . . . . 27 | |||
6.2. Non-Update of RFC4301 IPsec Encapsulation . . . . . . . . 26 | 6.2. Non-Update of RFC4301 IPsec Encapsulation . . . . . . . . 27 | |||
6.3. Update to RFC3168 Encapsulation . . . . . . . . . . . . . 26 | 6.3. Update to RFC3168 Encapsulation . . . . . . . . . . . . . 28 | |||
7. Design Principles for Alternate ECN Tunnelling Semantics . . . 27 | 7. Design Principles for Alternate ECN Tunnelling Semantics . . . 28 | |||
8. Security Considerations . . . . . . . . . . . . . . . . . . . 29 | 8. IANA Considerations (to be removed on publication): . . . . . 30 | |||
9. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 30 | 9. Security Considerations . . . . . . . . . . . . . . . . . . . 30 | |||
10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 31 | 10. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 32 | |||
11. References . . . . . . . . . . . . . . . . . . . . . . . . . . 31 | 11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 32 | |||
11.1. Normative References . . . . . . . . . . . . . . . . . . . 31 | 12. References . . . . . . . . . . . . . . . . . . . . . . . . . . 33 | |||
11.2. Informative References . . . . . . . . . . . . . . . . . . 32 | 12.1. Normative References . . . . . . . . . . . . . . . . . . . 33 | |||
Editorial Comments . . . . . . . . . . . . . . . . . . . . . . . . | 12.2. Informative References . . . . . . . . . . . . . . . . . . 33 | |||
Appendix A. Early ECN Tunnelling RFCs . . . . . . . . . . . . . . 34 | Appendix A. Early ECN Tunnelling RFCs . . . . . . . . . . . . . . 35 | |||
Appendix B. Design Constraints . . . . . . . . . . . . . . . . . 35 | Appendix B. Design Constraints . . . . . . . . . . . . . . . . . 35 | |||
B.1. Security Constraints . . . . . . . . . . . . . . . . . . . 35 | B.1. Security Constraints . . . . . . . . . . . . . . . . . . . 36 | |||
B.2. Control Constraints . . . . . . . . . . . . . . . . . . . 37 | B.2. Control Constraints . . . . . . . . . . . . . . . . . . . 38 | |||
B.3. Management Constraints . . . . . . . . . . . . . . . . . . 38 | B.3. Management Constraints . . . . . . . . . . . . . . . . . . 39 | |||
Appendix C. Contribution to Congestion across a Tunnel . . . . . 39 | Appendix C. Contribution to Congestion across a Tunnel . . . . . 39 | |||
Appendix D. Why Losing ECT(1) on Decapsulation Impedes PCN | Appendix D. Compromise on Decap with ECT(1) Inner and ECT(0) | |||
(to be removed before publication) . . . . . . . . . 40 | Outer . . . . . . . . . . . . . . . . . . . . . . . . 40 | |||
Appendix E. Why Resetting ECN on Encapsulation Impedes PCN | Appendix E. Open Issues . . . . . . . . . . . . . . . . . . . . . 41 | |||
(to be removed before publication) . . . . . . . . . 41 | ||||
Appendix F. Compromise on Decap with ECT(1) Inner and ECT(0) | ||||
Outer . . . . . . . . . . . . . . . . . . . . . . . . 42 | ||||
Appendix G. Open Issues . . . . . . . . . . . . . . . . . . . . . 43 | ||||
Request to the RFC Editor (to be removed on publication): | Request to the RFC Editor (to be removed on publication): | |||
In the RFC index, RFC3168 should be identified as an update to | In the RFC index, RFC3168 should be identified as an update to | |||
RFC2003. RFC4301 should be identified as an update to RFC3168. | RFC2003. RFC4301 should be identified as an update to RFC3168. | |||
Changes from previous drafts (to be removed by the RFC Editor) | Changes from previous drafts (to be removed by the RFC Editor) | |||
Full text differences between IETF draft versions are available at | Full text differences between IETF draft versions are available at | |||
<http://tools.ietf.org/wg/tsvwg/draft-ietf-tsvwg-ecn-tunnel/>, and | <http://tools.ietf.org/wg/tsvwg/draft-ietf-tsvwg-ecn-tunnel/>, and | |||
between earlier individual draft versions at | between earlier individual draft versions at | |||
<http://www.briscoe.net/pubs.html#ecn-tunnel> | <http://www.briscoe.net/pubs.html#ecn-tunnel> | |||
From ietf-06 to ietf-07 (current): | From ietf-08 to ietf-09 (current): Added change log entry for -07 to | |||
-08 that was previously omitted. | ||||
* Changes to standards action text: | ||||
+ Added RFC4774 to 'Updates:' header (the draft always has | ||||
extended the advice in RFC4774 (BCP124) which said very | ||||
little about tunnels. The GENART reviewer merely pointed | ||||
out that the header did not highlight this fact.) | ||||
* Editorial changes: | ||||
+ Abstract: s/providing backward compatibility./thus ensuring | ||||
backward compatibility./ | ||||
+ Moved PCN-related text motivating changes to decapsulation | ||||
from "Default Tunnel Egress Behaviour" (Section 4.2) to | ||||
"Motivation for Changing Decapsulation" (Section 5.3.2) | ||||
where it was merged with existing similar text. | ||||
+ In the non-normative Design Principles avoided using words | ||||
in lower case where they were in contexts that might make | ||||
them confusable with upper case RFC2119 normative language. | ||||
+ Added Stephen Hanna and Ben Campbell to acks and corrected | ||||
spelling of Agarwal. | ||||
+ Deleted endnote discussing corner case with IKEv2 manual | ||||
keying (identified as "to be removed before publication | ||||
following SecDir review"). | ||||
+ Deleted Appendices D & E on why existing ingress & egress | ||||
tunnelling behaviour impedes PCN and the endnotes that | ||||
referred to them (identified as "to be removed before | ||||
publication"). | ||||
+ Various minor corrections pointed out by reviewers. | ||||
From ietf-07 to ietf-08: | ||||
* Changes to standards actions: | ||||
+ Section 4: Changed non-RFC2119 phrase 'NOT RECOMMENDED' to | ||||
'SHOULD be avoided', wrt alternate ECN tunnelling schemes. | ||||
+ Section 4.2: Used upper-case in 'Alarms SHOULD be rate- | ||||
limited'. | ||||
+ Section 7: Made bullet #1 in the decapsulation guidelines | ||||
for alternate schemes more precise. Also changed any upper- | ||||
case keywords in this informative section to lower case. | ||||
* Editorial changes: | ||||
+ Changed copyright notice to allow for pre-5378 material. | ||||
+ Shifted supporting text intended for deletion on publication | ||||
into editorial comments. | ||||
+ Explained how to read the decapsulation matrices in their | ||||
captions. | ||||
+ Minor clarifications throughout. | ||||
From ietf-06 to ietf-07: | ||||
* Emphasised that this is the opposite of a fork in the RFC | * Emphasised that this is the opposite of a fork in the RFC | |||
series. | series. | |||
* Altered Section 5 to focus on updates to implementations of | * Altered Section 5 to focus on updates to implementations of | |||
earlier RFCs, rather than on updates to the text of the RFCs. | earlier RFCs, rather than on updates to the text of the RFCs. | |||
* Removed potential loop-holes in normative text that | * Removed potential loop-holes in normative text that | |||
implementers might have used to claim compliance without | implementers might have used to claim compliance without | |||
implementing normal mode. Highlighted the deliberate | implementing normal mode. Highlighted the deliberate | |||
skipping to change at page 6, line 36 | skipping to change at page 7, line 51 | |||
Schemes" after all the normative sections. | Schemes" after all the normative sections. | |||
+ Added Appendix A on early history of ECN tunnelling RFCs. | + Added Appendix A on early history of ECN tunnelling RFCs. | |||
+ Removed specialist appendix on "Relative Placement of | + Removed specialist appendix on "Relative Placement of | |||
Tunnelling and In-Path Load Regulation" (Appendix D in the | Tunnelling and In-Path Load Regulation" (Appendix D in the | |||
-02 draft) | -02 draft) | |||
+ Moved and updated specialist text on "Compromise on Decap | + Moved and updated specialist text on "Compromise on Decap | |||
with ECT(1) Inner and ECT(0) Outer" from Security | with ECT(1) Inner and ECT(0) Outer" from Security | |||
Considerations to Appendix F | Considerations to Appendix D | |||
* Textual changes: | * Textual changes: | |||
+ Simplified vocabulary for non-native-English speakers | + Simplified vocabulary for non-native-English speakers | |||
+ Simplified Introduction and defined regularly used terms in | + Simplified Introduction and defined regularly used terms in | |||
an expanded Terminology section. | an expanded Terminology section. | |||
+ More clearly distinguished statically configured tunnels | + More clearly distinguished statically configured tunnels | |||
from dynamic tunnel endpoint discovery, before explaining | from dynamic tunnel endpoint discovery, before explaining | |||
skipping to change at page 9, line 9 | skipping to change at page 10, line 27 | |||
Roadmap), added new Introductory subsection on "Scope" and | Roadmap), added new Introductory subsection on "Scope" and | |||
improved clarity; | improved clarity; | |||
* Added Design Guidelines for New Encapsulations of Congestion | * Added Design Guidelines for New Encapsulations of Congestion | |||
Notification; | Notification; | |||
* Considerably clarified the Backward Compatibility section | * Considerably clarified the Backward Compatibility section | |||
(Section 6); | (Section 6); | |||
* Considerably extended the Security Considerations section | * Considerably extended the Security Considerations section | |||
(Section 8); | (Section 9); | |||
* Summarised the primary rationale much better in the | * Summarised the primary rationale much better in the | |||
conclusions; | conclusions; | |||
* Added numerous extra acknowledgements; | * Added numerous extra acknowledgements; | |||
* Added Appendix E. "Why resetting CE on encapsulation harms | * Added Appendix E. "Why resetting CE on encapsulation harms | |||
PCN", Appendix C. "Contribution to Congestion across a Tunnel" | PCN", Appendix C. "Contribution to Congestion across a Tunnel" | |||
and Appendix D. "Ideal Decapsulation Rules"; | and Appendix D. "Ideal Decapsulation Rules"; | |||
skipping to change at page 17, line 26 | skipping to change at page 18, line 49 | |||
will not amplify into a flood of alarm messages. It MUST be | will not amplify into a flood of alarm messages. It MUST be | |||
possible to suppress alarms or logging, e.g. if it becomes | possible to suppress alarms or logging, e.g. if it becomes | |||
apparent that a combination that previously was not used has | apparent that a combination that previously was not used has | |||
started to be used for legitimate purposes such as a new standards | started to be used for legitimate purposes such as a new standards | |||
action. | action. | |||
The above logic allows for ECT(0) and ECT(1) to both represent the | The above logic allows for ECT(0) and ECT(1) to both represent the | |||
same severity of congestion marking (e.g. "not congestion marked"). | same severity of congestion marking (e.g. "not congestion marked"). | |||
But it also allows future schemes to be defined where ECT(1) is a | But it also allows future schemes to be defined where ECT(1) is a | |||
more severe marking than ECT(0), in particular enabling the simplest | more severe marking than ECT(0), in particular enabling the simplest | |||
possible encoding for PCN [I-D.ietf-pcn-3-in-1-encoding]. Before the | possible encoding for PCN [I-D.ietf-pcn-3-in-1-encoding] (see | |||
present specification was written, the PCN working-group had proposed | Section 5.3.2). Treating ECT(1) as either the same as ECT(0) or as a | |||
a number of work-rounds to the problem of a tunnel egress not | higher severity level is explained in the discussion of the ECN nonce | |||
propagating two severity levels of congestion. Without wishing to | ||||
disparage the ingenuity of these work-rounds, none were chosen for | [RFC3540] in Section 9, which in turn refers to Appendix D. | |||
the standards track because they were either somewhat wasteful, | ||||
imprecise or complicated [Note_PCN_egress]. Treating ECT(1) as | ||||
either the same as ECT(0) or as a higher severity level is explained | ||||
in the discussion of the ECN nonce [RFC3540] in Section 8, which in | ||||
turn refers to Appendix F. | ||||
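Non-normative illustration: the alarm requirements quoted earlier in this hunk (alarms must not amplify into a flood, and it MUST be possible to suppress them) could be met by something like the Python sketch below. The class name EcnAlarm, the 60-second default interval and the injected clock are all assumptions made for illustration; the draft does not prescribe a rate, an API or which combinations trigger the alarm.

```python
import time


class EcnAlarm:
    """Rate-limited, suppressible alarm for unused inner/outer ECN combinations."""

    def __init__(self, min_interval_s=60.0, clock=time.monotonic):
        # Hypothetical limit: at most one alarm per interval (not from the draft).
        self.min_interval_s = min_interval_s
        self.clock = clock
        self.suppressed = False        # operator switch to silence the alarm
        self.last_combination = None   # most recent combination that raised it
        self._last_raised = None

    def report(self, inner, outer):
        """Return True if an alarm was actually raised for this packet."""
        if self.suppressed:
            return False
        now = self.clock()
        if (self._last_raised is not None
                and now - self._last_raised < self.min_interval_s):
            return False  # still inside the quiet period, so stay silent
        self._last_raised = now
        self.last_combination = (inner, outer)
        return True
```

Setting `suppressed = True` models the case where a previously unused combination has started to be used legitimately; packet forwarding is unaffected either way, since the sketch only decides whether to raise the alarm.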
4.3. Encapsulation Modes | 4.3. Encapsulation Modes | |||
Section 4.1 introduces two encapsulation modes, normal mode and | Section 4.1 introduces two encapsulation modes, normal mode and | |||
compatibility mode, defining their encapsulation behaviour (i.e. | compatibility mode, defining their encapsulation behaviour (i.e. | |||
header copying or zeroing respectively). Note that these are modes | header copying or zeroing respectively). Note that these are modes | |||
of the ingress tunnel endpoint only, not the tunnel as a whole. | of the ingress tunnel endpoint only, not the tunnel as a whole. | |||
To comply with this specification, a tunnel ingress MUST at least | To comply with this specification, a tunnel ingress MUST at least | |||
implement `normal mode'. Unless it will never be used with legacy | implement `normal mode'. Unless it will never be used with legacy | |||
skipping to change at page 18, line 42 | skipping to change at page 20, line 11 | |||
it recognises that proprietary tunnel endpoint discovery protocols | it recognises that proprietary tunnel endpoint discovery protocols | |||
exist. It therefore sets down some constraints on discovery | exist. It therefore sets down some constraints on discovery | |||
protocols to ensure safe interworking. | protocols to ensure safe interworking. | |||
If dynamic tunnel endpoint discovery might pair an ingress with a | If dynamic tunnel endpoint discovery might pair an ingress with a | |||
legacy egress (RFC2003, RFC2401 or RFC2481 or the limited | legacy egress (RFC2003, RFC2401 or RFC2481 or the limited | |||
functionality mode of RFC3168), the ingress MUST implement both | functionality mode of RFC3168), the ingress MUST implement both | |||
normal and compatibility mode. If the tunnel discovery process is | normal and compatibility mode. If the tunnel discovery process is | |||
arranged to only ever find a tunnel egress that propagates ECN | arranged to only ever find a tunnel egress that propagates ECN | |||
(RFC3168 full functionality mode, RFC4301 or this present | (RFC3168 full functionality mode, RFC4301 or this present | |||
specification), then a tunnel ingress can be complaint with the | specification), then a tunnel ingress can be compliant with the | |||
present specification without implementing compatibility mode. | present specification without implementing compatibility mode. | |||
While a compliant tunnel ingress is discovering an egress, it MUST | While a compliant tunnel ingress is discovering an egress, it MUST | |||
send packets in compatibility mode in case the egress it discovers | send packets in compatibility mode in case the egress it discovers | |||
is a legacy egress. If, through the discovery protocol, the | is a legacy egress. If, through the discovery protocol, the | |||
egress indicates that it is compliant with the present | egress indicates that it is compliant with the present | |||
specification, with RFC4301 or with RFC3168 full functionality | specification, with RFC4301 or with RFC3168 full functionality | |||
mode, the ingress can switch itself into normal mode. If the | mode, the ingress can switch itself into normal mode. If the | |||
egress denies compliance with any of these or returns an error | egress denies compliance with any of these or returns an error | |||
that implies it does not understand a request to work to any of | that implies it does not understand a request to work to any of | |||
skipping to change at page 19, line 33 | skipping to change at page 20, line 50 | |||
Implementation note: If a compliant node is the ingress for multiple | Implementation note: If a compliant node is the ingress for multiple | |||
tunnels, a mode setting will need to be stored for each tunnel | tunnels, a mode setting will need to be stored for each tunnel | |||
ingress. However, if a node is the egress for multiple tunnels, | ingress. However, if a node is the egress for multiple tunnels, | |||
none of the tunnels will need to store a mode setting, because a | none of the tunnels will need to store a mode setting, because a | |||
compliant egress only needs one mode. | compliant egress only needs one mode. | |||
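Non-normative illustration: the mode rules of Section 4.3 can be sketched in Python as below. An ingress keeps one mode setting per tunnel, sends in compatibility mode while the egress is unknown, and switches to normal mode only when the egress reports that it propagates ECN. The names Mode, TunnelIngress, ECN_PROPAGATING and on_discovery_reply are hypothetical, and the capability strings stand in for whatever a (proprietary) discovery protocol would signal. The quoted text is cut off before it says what the ingress does on a denial or error; staying in compatibility mode is assumed here.

```python
from enum import Enum

NOT_ECT = 0b00  # ECN codepoint for "not ECN-capable transport"


class Mode(Enum):
    NORMAL = "normal"                # copy the inner ECN field to the outer header
    COMPATIBILITY = "compatibility"  # zero the outer ECN field


# Hypothetical labels for egress behaviours that propagate ECN.
ECN_PROPAGATING = {"this-spec", "rfc4301", "rfc3168-full"}


class TunnelIngress:
    def __init__(self):
        # Compatibility mode is used while the egress is still being discovered.
        self.mode = Mode.COMPATIBILITY

    def on_discovery_reply(self, egress_capability):
        # None models an error or a reply the ingress cannot interpret.
        # Assumption: anything not known to propagate ECN keeps the ingress
        # in compatibility mode.
        if egress_capability in ECN_PROPAGATING:
            self.mode = Mode.NORMAL
        else:
            self.mode = Mode.COMPATIBILITY

    def outer_ecn(self, inner_ecn):
        """ECN field to place in the outer header on encapsulation."""
        return inner_ecn if self.mode is Mode.NORMAL else NOT_ECT


# One mode setting per tunnel for which this node is the ingress.
ingress_by_tunnel = {"tunnel-a": TunnelIngress(), "tunnel-b": TunnelIngress()}
ingress_by_tunnel["tunnel-a"].on_discovery_reply("rfc4301")
ingress_by_tunnel["tunnel-b"].on_discovery_reply(None)
```

No corresponding table is needed on the egress side, because a compliant egress has a single mode of decapsulation.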
4.4. Single Mode of Decapsulation | 4.4. Single Mode of Decapsulation | |||
A compliant decapsulator only needs one mode of operation. However, | A compliant decapsulator only needs one mode of operation. However, | |||
if a complaint egress is implemented to be dynamically discoverable, | if a compliant egress is implemented to be dynamically discoverable, | |||
it may need to respond to discovery requests from various types of | it may need to respond to discovery requests from various types of | |||
legacy tunnel ingress. This specification does not define how | legacy tunnel ingress. This specification does not define how | |||
dynamic negotiation might be done by (proprietary) discovery | dynamic negotiation might be done by (proprietary) discovery | |||
protocols, but it sets down some constraints to ensure safe | protocols, but it sets down some constraints to ensure safe | |||
interworking. | interworking. | |||
Through the discovery protocol, a tunnel ingress compliant with the | Through the discovery protocol, a tunnel ingress compliant with the | |||
present specification might ask if the egress is compliant with the | present specification might ask if the egress is compliant with the | |||
present specification, with RFC4301 or with RFC3168 full | present specification, with RFC4301 or with RFC3168 full | |||
functionality mode. Or an RFC3168 tunnel ingress might try to | functionality mode. Or an RFC3168 tunnel ingress might try to | |||
skipping to change at page 20, line 44 | skipping to change at page 22, line 13 | |||
dropped rather than forwarded as Not-ECT; | dropped rather than forwarded as Not-ECT; | |||
* Certain combinations of inner and outer ECN field have been | * Certain combinations of inner and outer ECN field have been | |||
identified as currently unused. These can trigger logging | identified as currently unused. These can trigger logging | |||
and/or raise alarms. | and/or raise alarms. | |||
Modes: RFC4301 tunnel endpoints do not need modes and are not | Modes: RFC4301 tunnel endpoints do not need modes and are not | |||
updated by the modes in the present specification. Effectively an | updated by the modes in the present specification. Effectively an | |||
RFC4301 IPsec ingress solely uses the REQUIRED normal mode of | RFC4301 IPsec ingress solely uses the REQUIRED normal mode of | |||
encapsulation, which is unchanged from RFC4301 encapsulation. It | encapsulation, which is unchanged from RFC4301 encapsulation. It | |||
will never [Note_Manual_Keying] need the OPTIONAL compatibility | will never need the OPTIONAL compatibility mode as explained in | |||
mode as explained in Section 4.3. | Section 4.3. | |||
5.2. Changes to RFC3168 ECN processing | 5.2. Changes to RFC3168 ECN processing | |||
Ingress: On encapsulation, the new rule in Figure 3 that a normal | Ingress: On encapsulation, the new rule in Figure 3 that a normal | |||
mode tunnel ingress copies any ECN field into the outer header | mode tunnel ingress copies any ECN field into the outer header | |||
updates the full functionality behaviour of an RFC3168 ingress. | updates the full functionality behaviour of an RFC3168 ingress. | |||
Nonetheless, the new compatibility mode encapsulates packets | Nonetheless, the new compatibility mode encapsulates packets | |||
identically to the limited functionality mode of an RFC3168 | identically to the limited functionality mode of an RFC3168 | |||
ingress. | ingress. | |||
Egress: An RFC3168 egress will need to be updated to the new | Egress: An RFC3168 egress will need to be updated to the new | |||
decapsulation behaviour in Figure 4, in order to comply with the | decapsulation behaviour in Figure 4, in order to comply with the | |||
present specification. However, the changes are backward | present specification. However, the changes are backward | |||
skipping to change at page 21, line 46 | skipping to change at page 23, line 11 | |||
behaviour covers all cases. | behaviour covers all cases. | |||
The normal mode of encapsulation is an update to the encapsulation | The normal mode of encapsulation is an update to the encapsulation | |||
behaviour of the full functionality mode of an RFC3168 ingress. | behaviour of the full functionality mode of an RFC3168 ingress. | |||
The compatibility mode of encapsulation is identical to the | The compatibility mode of encapsulation is identical to the | |||
encapsulation behaviour of the limited functionality mode of an | encapsulation behaviour of the limited functionality mode of an | |||
RFC3168 ingress, except it is optional. | RFC3168 ingress, except it is optional. | |||
The constraints on how tunnel discovery protocols set modes in | The constraints on how tunnel discovery protocols set modes in | |||
Section 4.3 and Section 4.4 are an update to RFC3168, but they are | Section 4.3 and Section 4.4 are an update to RFC3168, but they are | |||
unlikely to require code changes as they document safe practice. | unlikely to require code changes as they document existing safe | |||
practice. | ||||
5.3. Motivation for Changes | 5.3. Motivation for Changes | |||
An overriding goal is to ensure the same ECN signals can mean the | An overriding goal is to ensure the same ECN signals can mean the | |||
same thing whatever tunnels happen to encapsulate an IP packet flow. | same thing whatever tunnels happen to encapsulate an IP packet flow. | |||
This removes gratuitous inconsistency, which otherwise constrains the | This removes gratuitous inconsistency, which otherwise constrains the | |||
available design space and makes it harder to design networks and new | available design space and makes it harder to design networks and new | |||
protocols that work predictably. | protocols that work predictably. | |||
5.3.1. Motivation for Changing Encapsulation | 5.3.1. Motivation for Changing Encapsulation | |||
skipping to change at page 22, line 32 | skipping to change at page 23, line 41 | |||
compatibility with legacy decapsulators that do not propagate ECN | compatibility with legacy decapsulators that do not propagate ECN | |||
correctly. | correctly. | |||
The trigger that motivated this update to RFC3168 encapsulation was a | The trigger that motivated this update to RFC3168 encapsulation was a | |||
standards track proposal for pre-congestion notification (PCN | standards track proposal for pre-congestion notification (PCN | |||
[RFC5670]). PCN excess rate marking only works correctly if the ECN | [RFC5670]). PCN excess rate marking only works correctly if the ECN | |||
field is copied on encapsulation (as in RFC4301 and RFC5129); it does | field is copied on encapsulation (as in RFC4301 and RFC5129); it does | |||
not work if ECN is reset (as in RFC3168). This is because PCN excess | not work if ECN is reset (as in RFC3168). This is because PCN excess | |||
rate marking depends on the outer header revealing any congestion | rate marking depends on the outer header revealing any congestion | |||
experienced so far on the whole path, not just since the last tunnel | experienced so far on the whole path, not just since the last tunnel | |||
ingress [Note_PCN_ingress]. | ingress. | |||
PCN allows a network operator to add flow admission and termination | PCN allows a network operator to add flow admission and termination | |||
for inelastic traffic at the edges of a Diffserv domain, but without | for inelastic traffic at the edges of a Diffserv domain, but without | |||
any per-flow mechanisms in the interior and without the generous | any per-flow mechanisms in the interior and without the generous | |||
provisioning typical of Diffserv, aiming to significantly reduce | provisioning typical of Diffserv, aiming to significantly reduce | |||
costs. The PCN architecture [RFC5559] states that RFC3168 IP in IP | costs. The PCN architecture [RFC5559] states that RFC3168 IP in IP | |||
tunnelling of the ECN field cannot be used for any tunnel ingress in | tunnelling of the ECN field cannot be used for any tunnel ingress in | |||
a PCN domain. Prior to the present specification, this left a stark | a PCN domain. Prior to the present specification, this left a stark | |||
choice between not being able to use PCN for inelastic traffic | choice between not being able to use PCN for inelastic traffic | |||
control or not being able to use the many tunnels already deployed | control or not being able to use the many tunnels already deployed | |||
skipping to change at page 24, line 19 | skipping to change at page 25, line 32 | |||
As well as being useful for general future-proofing, this problem | As well as being useful for general future-proofing, this problem | |||
is immediately pressing for standardisation of pre-congestion | is immediately pressing for standardisation of pre-congestion | |||
notification (PCN), which uses two severity levels of congestion. | notification (PCN), which uses two severity levels of congestion. | |||
If a congested queue used ECT(1) in the outer header to signal | If a congested queue used ECT(1) in the outer header to signal | |||
more severe congestion than ECT(0), the pre-existing | more severe congestion than ECT(0), the pre-existing | |||
decapsulation rules would have thrown away this congestion | decapsulation rules would have thrown away this congestion | |||
signal, preventing tunnelled traffic from ever knowing that it | signal, preventing tunnelled traffic from ever knowing that it | |||
should reduce its load. | should reduce its load. | |||
The PCN working group has had to consider a number of wasteful or | Before the present specification was written, the PCN working | |||
convoluted work-rounds to this problem [Note_PCN_egress]. But by | group had to consider a number of wasteful or convoluted work- | |||
far the simplest approach is just to remove the covert channel | rounds to this problem. Without wishing to disparage the | |||
blockages from tunnelling behaviour--now deemed unnecessary | ingenuity of these work-rounds, none were chosen for the | |||
anyway. Then network operators that want to support two | standards track because they were either somewhat wasteful, | |||
congestion severity-levels for PCN can specify that every tunnel | imprecise or complicated. Instead a baseline PCN encoding was | |||
egress in a PCN region must comply with this latest | specified [RFC5696] that supported only one severity level of | |||
specification. | congestion but allowed space for these work-rounds as | |||
experimental extensions. | ||||
But by far the simplest approach is that taken by the current | ||||
specification: just to remove the covert channel blockages from | ||||
tunnelling behaviour--now deemed unnecessary anyway. Then | ||||
network operators that want to support two congestion severity- | ||||
levels for PCN can specify that every tunnel egress in a PCN | ||||
region must comply with this latest specification. Having taken | ||||
this step, the simplest possible encoding for PCN with two | ||||
severity levels of congestion [I-D.ietf-pcn-3-in-1-encoding] can | ||||
be used. | ||||
Not only does this make two congestion severity-levels available | Not only does this make two congestion severity-levels available | |||
for PCN standardisation, but also for other potential uses of the | for PCN, but also for other potential uses of the extra ECN | |||
extra ECN codepoint (e.g. [VCP]). | codepoint (e.g. [VCP]). | |||
2. Cases are documented where a middlebox (e.g. a firewall) drops | 2. Cases are documented where a middlebox (e.g. a firewall) drops | |||
packets with header values that were currently unused (CU) when | packets with header values that were currently unused (CU) when | |||
the box was deployed, often on the grounds that anything | the box was deployed, often on the grounds that anything | |||
unexpected might be an attack. This tends to bar future use of | unexpected might be an attack. This tends to bar future use of | |||
CU values. The new decapsulation rules specify optional logging | CU values. The new decapsulation rules specify optional logging | |||
and/or alarms for specific combinations of inner and outer header | and/or alarms for specific combinations of inner and outer header | |||
that are currently unused. The aim is to give implementers a | that are currently unused. The aim is to give implementers a | |||
recourse other than drop if they are concerned about the security | recourse other than drop if they are concerned about the security | |||
of CU values. It recognises legitimate security concerns about | of CU values. It recognises legitimate security concerns about | |||
skipping to change at page 27, line 26 | skipping to change at page 28, line 50 | |||
7. Design Principles for Alternate ECN Tunnelling Semantics | 7. Design Principles for Alternate ECN Tunnelling Semantics | |||
This section is informative, not normative. | This section is informative, not normative. | |||
S.5 of RFC3168 permits the Diffserv codepoint (DSCP) [RFC2474] to | S.5 of RFC3168 permits the Diffserv codepoint (DSCP) [RFC2474] to | |||
'switch in' alternative behaviours for marking the ECN field, just as | 'switch in' alternative behaviours for marking the ECN field, just as | |||
it switches in different per-hop behaviours (PHBs) for scheduling. | it switches in different per-hop behaviours (PHBs) for scheduling. | |||
[RFC4774] gives best current practice for designing such alternative | [RFC4774] gives best current practice for designing such alternative | |||
ECN semantics and very briefly mentions in section 5.4 that | ECN semantics and very briefly mentions in section 5.4 that | |||
tunnelling should be considered. The guidance below extends RFC4774, | tunnelling needs to be considered. The guidance below complements | |||
giving additional guidance on designing any alternate ECN semantics | and extends RFC4774, giving additional guidance on designing any | |||
that would also require alternate tunnelling semantics. | alternate ECN semantics that would also require alternate tunnelling | |||
semantics. | ||||
The overriding guidance is: "Avoid designing alternate ECN tunnelling | The overriding guidance is: "Avoid designing alternate ECN tunnelling | |||
semantics, if at all possible." If a scheme requires tunnels to | semantics, if at all possible." If a scheme requires tunnels to | |||
implement special processing of the ECN field for certain DSCPs, it | implement special processing of the ECN field for certain DSCPs, it | |||
will be hard to guarantee that every implementer of every tunnel will | will be hard to guarantee that every implementer of every tunnel will | |||
have added the required exception or that operators will have | have added the required exception or that operators will have | |||
ubiquitously deployed the required updates. It is unlikely a single | ubiquitously deployed the required updates. It is unlikely a single | |||
authority is even aware of all the tunnels in a network, which may | authority is even aware of all the tunnels in a network, which may | |||
include tunnels set up by applications between endpoints, or | include tunnels set up by applications between endpoints, or | |||
dynamically created in the network. Therefore it is highly likely | dynamically created in the network. Therefore it is highly likely | |||
that some tunnels within a network or on hosts connected to it will | that some tunnels within a network or on hosts connected to it will | |||
not implement the required special case. | not implement the required special case. | |||
That said, if a non-default scheme for tunnelling the ECN field is | That said, if a non-default scheme for tunnelling the ECN field is | |||
really required, the following guidelines may prove useful in its | really required, the following guidelines might prove useful in its | |||
design: | design: | |||
On encapsulation in any alternate scheme: | On encapsulation in any alternate scheme: | |||
1. The ECN field of the outer header should be cleared to Not-ECT | 1. The ECN field of the outer header ought to be cleared to Not- | |||
("00") unless it is guaranteed that the corresponding tunnel | ECT ("00") unless it is guaranteed that the corresponding | |||
egress will correctly propagate congestion markings introduced | tunnel egress will correctly propagate congestion markings | |||
across the tunnel in the outer header. | introduced across the tunnel in the outer header. | |||
2. If it has established that ECN will be correctly propagated, | 2. If it has established that ECN will be correctly propagated, | |||
an encapsulator should also copy incoming congestion | an encapsulator ought to also copy incoming congestion | |||
notification into the outer header. The general principle | notification into the outer header. The general principle | |||
here is that the outer header should reflect congestion | here is that the outer header should reflect congestion | |||
accumulated along the whole upstream path, not just since the | accumulated along the whole upstream path, not just since the | |||
tunnel ingress (as Appendix B.3 on management and monitoring | tunnel ingress (as Appendix B.3 on management and monitoring | |||
explains). | explains). | |||
In some circumstances (e.g. pseudowires, PCN), the whole path | In some circumstances (e.g. pseudowires, PCN), the whole path | |||
is divided into segments, each with its own congestion | is divided into segments, each with its own congestion | |||
notification and feedback loop. In these cases, the function | notification and feedback loop. In these cases, the function | |||
that regulates load at the start of each segment will need to | that regulates load at the start of each segment will need to | |||
reset congestion notification for its segment. Often the | reset congestion notification for its segment. Often the | |||
point where congestion notification is reset will also be | point where congestion notification is reset will also be | |||
located at the start of a tunnel. However, the resetting | located at the start of a tunnel. However, the resetting | |||
function should be thought of as being applied to packets | function can be thought of as being applied to packets after | |||
after the encapsulation function--two logically separate | the encapsulation function--two logically separate functions | |||
functions even though they might run on the same physical box. | even though they might run on the same physical box. Then the | |||
Then the code module doing encapsulation can keep to the | code module doing encapsulation can keep to the copying rule | |||
copying rule and the load regulator module can reset | and the load regulator module can reset congestion, without | |||
congestion, without any code in either module being | any code in either module being conditional on whether the | |||
conditional on whether the other is there. | other is there. | |||
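The two encapsulation guidelines, and the separation between the encapsulation function and the load regulator, can be illustrated with a minimal sketch. It is not taken from the draft: the function names and the `egress_propagates_ecn` flag are hypothetical, and the reset behaviour shown (replacing CE with ECT(0), as the RFC3168 full-functionality ingress did) is only an assumption, since an alternate scheme would define its own reset. The codepoint values are the two-bit encodings from RFC3168.

```python
# ECN codepoints as two-bit values (RFC3168).
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11


def encapsulate_ecn(inner_ecn: int, egress_propagates_ecn: bool) -> int:
    """Choose the outer ECN field at a tunnel ingress (guidelines 1 and 2)."""
    if not egress_propagates_ecn:
        # Guideline 1: no guarantee that the egress propagates markings
        # added within the tunnel, so clear the outer field to Not-ECT.
        return NOT_ECT
    # Guideline 2: copy, so the outer header reflects congestion
    # accumulated along the whole upstream path.
    return inner_ecn


def reset_segment_congestion(outer_ecn: int) -> int:
    """Hypothetical load-regulator step, applied after encapsulation.

    Assumption: 'reset' means replacing a CE mark with ECT(0); a real
    alternate scheme would define its own reset rule.
    """
    return ECT0 if outer_ecn == CE else outer_ecn


def segment_ingress(inner_ecn: int, egress_propagates_ecn: bool) -> int:
    """Compose the two logically separate functions on one box."""
    outer = encapsulate_ecn(inner_ecn, egress_propagates_ecn)
    return reset_segment_congestion(outer)
```

Neither function tests for the presence of the other: the encapsulator always keeps to the copying rule, and the reset is simply applied to whatever outer header it is given.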
On decapsulation in any new scheme: | On decapsulation in any new scheme: | |||
1. If the arriving inner header is Not-ECT it implies the | 1. If the arriving inner header is Not-ECT it implies the | |||
transport will not understand other ECN codepoints. If the | transport will not understand other ECN codepoints. If the | |||
outer header carries an explicit congestion marking, the | outer header carries an explicit congestion marking, the | |||
alternate scheme would be expected to drop the packet--the | alternate scheme would be expected to drop the packet--the | |||
only indication of congestion the transport will understand. | only indication of congestion the transport will understand. | |||
If the alternate scheme recommends forwarding rather than | If the alternate scheme recommends forwarding rather than | |||
dropping such a packet, it must clearly justify this decision. | dropping such a packet, it will need to clearly justify this | |||
If the inner is Not-ECT and the outer carries any other ECN | decision. If the inner is Not-ECT and the outer carries any | |||
codepoint that does not indicate congestion, the alternate | other ECN codepoint that does not indicate congestion, the | |||
scheme can forward the packet, but probably only as Not-ECT. | alternate scheme can forward the packet, but probably only as | |||
Not-ECT. | ||||
2. If the arriving inner header is other than Not-ECT, the ECN | 2. If the arriving inner header is other than Not-ECT, the ECN | |||
field that the alternate decapsulation scheme forwards should | field that the alternate decapsulation scheme forwards ought | |||
reflect the more severe congestion marking of the arriving | to reflect the more severe congestion marking of the arriving | |||
inner and outer headers. | inner and outer headers. | |||
3. Any alternate scheme must define a behaviour for all | 3. Any alternate scheme will need to define a behaviour for all | |||
combinations of inner and outer headers, even those that would | combinations of inner and outer headers, even those that would | |||
not be expected to result from standards known at the time and | not be expected to result from standards known at the time and | |||
even those that would not be expected from the tunnel ingress | even those that would not be expected from the tunnel ingress | |||
paired with the egress at run-time. Consideration should be | paired with the egress at run-time. Consideration should be | |||
given to logging such unexpected combinations and raising an | given to logging such unexpected combinations and raising an | |||
alarm, particularly if there is a danger that the invalid | alarm, particularly if there is a danger that the invalid | |||
combination implies congestion signals are not being | combination implies congestion signals are not being | |||
propagated correctly. The presence of currently unused | propagated correctly. The presence of currently unused | |||
combinations may represent an attack, but the new scheme | combinations may represent an attack, but the new scheme | |||
should try to define a way to forward such packets, at least | should try to define a way to forward such packets, at least | |||
if a safe outgoing codepoint can be defined. | if a safe outgoing codepoint can be defined. | |||
Raising an alarm allows a management system to decide whether | Raising an alarm allows a management system to decide whether | |||
the anomaly is indeed an attack, in which case it can decide | the anomaly is indeed an attack, in which case it can decide | |||
to drop such packets. This is a preferable approach to hard- | to drop such packets. This is a preferable approach to hard- | |||
coded discard of packets that seem anomalous today, but may be | coded discard of packets that seem anomalous today, but may be | |||
needed tomorrow in future standards actions. | needed tomorrow in future standards actions. | |||
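The decapsulation guidelines can be sketched in the same way. This is an illustration under stated assumptions, not the normative behaviour of any scheme: the severity ordering (Not-ECT < ECT(0) < ECT(1) < CE), the `congestion_marks` parameter and the use of `None` to mean "drop" are all hypothetical choices. Logging and alarms for unexpected combinations are left out for brevity.

```python
from typing import FrozenSet, Optional

# ECN codepoints as two-bit values (RFC3168).
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11

# Assumed severity ordering for a hypothetical alternate scheme.
SEVERITY = {NOT_ECT: 0, ECT0: 1, ECT1: 2, CE: 3}


def decapsulate_ecn(
    inner: int,
    outer: int,
    congestion_marks: FrozenSet[int] = frozenset({CE}),
) -> Optional[int]:
    """Return the outgoing ECN field, or None if the packet is dropped."""
    if inner == NOT_ECT:
        # Guideline 1: the transport understands only drop as a
        # congestion signal, so a marked outer header means drop.
        if outer in congestion_marks:
            return None
        # Any other outer codepoint: forward, but only as Not-ECT.
        return NOT_ECT
    # Guideline 2: forward the more severe of the inner and outer markings.
    return max(inner, outer, key=lambda codepoint: SEVERITY[codepoint])


# Guideline 3: every one of the 16 combinations has a defined result.
TABLE = {
    (i, o): decapsulate_ecn(i, o)
    for i in (NOT_ECT, ECT0, ECT1, CE)
    for o in (NOT_ECT, ECT0, ECT1, CE)
}
```

A scheme that uses ECT(1) as a second congestion level could pass `congestion_marks=frozenset({ECT1, CE})`, in which case a Not-ECT inner header with an ECT(1) outer header would also be dropped.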
IANA Considerations (to be removed on publication): | 8. IANA Considerations (to be removed on publication): | |||
This memo includes no request to IANA. | This memo includes no request to IANA. | |||
8. Security Considerations | 9. Security Considerations | |||
Appendix B.1 discusses the security constraints imposed on ECN tunnel | Appendix B.1 discusses the security constraints imposed on ECN tunnel | |||
processing. The new rules for ECN tunnel processing (Section 4) | processing. The new rules for ECN tunnel processing (Section 4) | |||
trade off between information security (covert channels) and traffic | trade off between information security (covert channels) and traffic | |||
security (congestion monitoring & control). Ensuring congestion | security (congestion monitoring & control). Ensuring congestion | |||
markings are not lost is itself an aspect of security, because if we | markings are not lost is itself an aspect of security, because if we | |||
allowed congestion notification to be lost, any attempt to enforce a | allowed congestion notification to be lost, any attempt to enforce a | |||
response to congestion would be much harder. | response to congestion would be much harder. | |||
Specialist security issues: | Security issues in unlikely but possible scenarios: | |||
Tunnels intersecting Diffserv regions with alternate ECN semantics: | Tunnels intersecting Diffserv regions with alternate ECN semantics: | |||
If alternate congestion notification semantics are defined for a | If alternate congestion notification semantics are defined for a | |||
certain Diffserv PHB, the scope of the alternate semantics might | certain Diffserv PHB, the scope of the alternate semantics might | |||
typically be bounded by the limits of a Diffserv region or | typically be bounded by the limits of a Diffserv region or | |||
regions, as envisaged in [RFC4774] (e.g. the pre-congestion | regions, as envisaged in [RFC4774] (e.g. the pre-congestion | |||
notification architecture [RFC5559]). The inner headers in | notification architecture [RFC5559]). The inner headers in | |||
tunnels crossing the boundary of such a Diffserv region but ending | tunnels crossing the boundary of such a Diffserv region but ending | |||
within the region can potentially leak the external congestion | within the region can potentially leak the external congestion | |||
notification semantics into the region, or leak the internal | notification semantics into the region, or leak the internal | |||
skipping to change at page 30, line 10 | skipping to change at page 31, line 34 | |||
other outside the domain. [RFC5559] gives specific advice on this | other outside the domain. [RFC5559] gives specific advice on this | |||
for the PCN case, but other definitions of alternate semantics | for the PCN case, but other definitions of alternate semantics | |||
will need to discuss the specific security implications in each | will need to discuss the specific security implications in each | |||
case. | case. | |||
ECN nonce tunnel coverage: The new decapsulation rules improve the | ECN nonce tunnel coverage: The new decapsulation rules improve the | |||
coverage of the ECN nonce [RFC3540] relative to the previous rules | coverage of the ECN nonce [RFC3540] relative to the previous rules | |||
in RFC3168 and RFC4301. However, nonce coverage is still not | in RFC3168 and RFC4301. However, nonce coverage is still not | |||
perfect, as this would have led to a safety problem in another | perfect, as this would have led to a safety problem in another | |||
case. Both are corner-cases, so discussion of the compromise | case. Both are corner-cases, so discussion of the compromise | |||
between them is deferred to Appendix F. | between them is deferred to Appendix D. | |||
Covert channel not turned off: A legacy (RFC3168) tunnel ingress | Covert channel not turned off: A legacy (RFC3168) tunnel ingress | |||
could ask an RFC3168 egress to turn off ECN processing as well as | could ask an RFC3168 egress to turn off ECN processing as well as | |||
itself turning off ECN. An egress compliant with the present | itself turning off ECN. An egress compliant with the present | |||
specification will agree to such a request from a legacy ingress, | specification will agree to such a request from a legacy ingress, | |||
but it relies on the ingress always sending Not-ECT in the outer. | but it relies on the ingress always sending Not-ECT in the outer. | |||
If the egress receives other ECN codepoints in the outer it will | If the egress receives other ECN codepoints in the outer it will | |||
process them as normal, so it will actually still copy congestion | process them as normal, so it will actually still copy congestion | |||
markings from the outer to the outgoing header. Referring for | markings from the outer to the outgoing header. Referring for | |||
example to Figure 5 (Appendix B.1), although the tunnel ingress | example to Figure 5 (Appendix B.1), although the tunnel ingress | |||
'I' will set all ECN fields in outer headers to Not-ECT, 'M' could | 'I' will set all ECN fields in outer headers to Not-ECT, 'M' could | |||
still toggle CE or ECT(1) on and off to communicate covertly with | still toggle CE or ECT(1) on and off to communicate covertly with | |||
'B', because we have specified that 'E' only has one mode | 'B', because we have specified that 'E' only has one mode | |||
regardless of what mode it says it has negotiated. We could have | regardless of what mode it says it has negotiated. We could have | |||
specified that 'E' should have a limited functionality mode and | specified that 'E' should have a limited functionality mode and | |||
check for such behaviour. But we decided not to add the extra | check for such behaviour. But we decided not to add the extra | |||
complexity of two modes on a compliant tunnel egress merely to | complexity of two modes on a compliant tunnel egress merely to | |||
cater for an historic security concern that is now considered | cater for an historic security concern that is now considered | |||
manageable. | manageable. | |||
9. Conclusions | 10. Conclusions | |||
This document allows tunnels to propagate an extra level of | This document allows tunnels to propagate an extra level of | |||
congestion severity. It uses previously unused combinations of inner | congestion severity. It uses previously unused combinations of inner | |||
and outer header to augment the rules for calculating the ECN field | and outer header to augment the rules for calculating the ECN field | |||
when decapsulating IP packets at the egress of IPsec (RFC4301) and | when decapsulating IP packets at the egress of IPsec (RFC4301) and | |||
non-IPsec (RFC3168) tunnels. | non-IPsec (RFC3168) tunnels. | |||
This document also updates the ingress tunnelling encapsulation of | This document also updates the ingress tunnelling encapsulation of | |||
RFC3168 ECN to bring all IP in IP tunnels into line with the new | RFC3168 ECN to bring all IP in IP tunnels into line with the new | |||
behaviour in the IPsec architecture of RFC4301, which copies rather | behaviour in the IPsec architecture of RFC4301, which copies rather | |||
skipping to change at page 31, line 22 | skipping to change at page 32, line 47 | |||
At the same time as removing these legacy constraints, the | At the same time as removing these legacy constraints, the | |||
opportunity has been taken to draw together diverging tunnel | opportunity has been taken to draw together diverging tunnel | |||
specifications into a single consistent behaviour. Then any tunnel | specifications into a single consistent behaviour. Then any tunnel | |||
can be deployed unilaterally, and it will support the full range of | can be deployed unilaterally, and it will support the full range of | |||
congestion control and management schemes without any modes or | congestion control and management schemes without any modes or | |||
configuration. Further, any host or router can expect the ECN field | configuration. Further, any host or router can expect the ECN field | |||
to behave in the same way, whatever type of tunnel might intervene in | to behave in the same way, whatever type of tunnel might intervene in | |||
the path. This new certainty could enable new uses of the ECN field | the path. This new certainty could enable new uses of the ECN field | |||
that would otherwise be confounded by ambiguity. | that would otherwise be confounded by ambiguity. | |||
10. Acknowledgements | 11. Acknowledgements | |||
Thanks to David Black for his insightful reviews and patient | Thanks to David Black for his insightful reviews and patient | |||
explanations of better ways to think about function placement and | explanations of better ways to think about function placement and | |||
alarms. Thanks to David and to Anil Agawaal for pointing out cases | alarms. Thanks to David and to Anil Agarwal for pointing out cases | |||
where it is safe to forward CU combinations of headers. Also thanks | where it is safe to forward CU combinations of headers. Also thanks | |||
to Arnaud Jacquet for the idea for Appendix C. Thanks to Gorry | to Arnaud Jacquet for the idea for Appendix C. Thanks to Gorry | |||
Fairhurst, Teco Boot, Michael Menth, Bruce Davie, Toby Moncaster, | Fairhurst, Teco Boot, Michael Menth, Bruce Davie, Toby Moncaster, | |||
Sally Floyd, Alfred Hoenes, Gabriele Corliano, Ingemar Johansson and | Sally Floyd, Alfred Hoenes, Gabriele Corliano, Ingemar Johansson and | |||
Phil Eardley for their thoughts and careful review comments. | Philip Eardley for their thoughts and careful review comments, and to | |||
Stephen Hanna and Ben Campbell respectively for conducting the | ||||
Security Directorate and General Area reviews. | ||||
Bob Briscoe is partly funded by Trilogy, a research project (ICT- | Bob Briscoe is partly funded by Trilogy, a research project (ICT- | |||
216372) supported by the European Community under its Seventh | 216372) supported by the European Community under its Seventh | |||
Framework Programme. The views expressed here are those of the | Framework Programme. The views expressed here are those of the | |||
author only. | author only. | |||
Comments Solicited (to be removed by the RFC Editor): | Comments Solicited (to be removed by the RFC Editor): | |||
Comments and questions are encouraged and very welcome. They can be | Comments and questions are encouraged and very welcome. They can be | |||
addressed to the IETF Transport Area working group mailing list | addressed to the IETF Transport Area working group mailing list | |||
<tsvwg@ietf.org>, and/or to the authors. | <tsvwg@ietf.org>, and/or to the authors. | |||
11. References | 12. References | |||
11.1. Normative References | ||||
[RFC2003] Perkins, C., "IP Encapsulation | ||||
within IP", RFC 2003, October 1996. | ||||
[RFC2119] Bradner, S., "Key words for use in | ||||
RFCs to Indicate Requirement | ||||
Levels", BCP 14, RFC 2119, | ||||
March 1997. | ||||
[RFC3168] Ramakrishnan, K., Floyd, S., and D. | ||||
Black, "The Addition of Explicit | ||||
Congestion Notification (ECN) to | ||||
IP", RFC 3168, September 2001. | ||||
[RFC4301] Kent, S. and K. Seo, "Security | ||||
Architecture for the Internet | ||||
Protocol", RFC 4301, December 2005. | ||||
11.2. Informative References | ||||
[I-D.ietf-pcn-3-in-1-encoding] Briscoe, B. and T. Moncaster, "PCN | ||||
3-State Encoding Extension in a | ||||
single DSCP", | ||||
draft-ietf-pcn-3-in-1-encoding-01 | ||||
(work in progress), February 2010. | ||||
[I-D.ietf-pcn-3-state-encoding] Briscoe, B., Moncaster, T., and M. | ||||
Menth, "A PCN encoding using 2 | ||||
DSCPs to provide 3 or more states", | ||||
draft-ietf-pcn-3-state-encoding-01 | ||||
(work in progress), February 2010. | ||||
[I-D.ietf-pcn-psdm-encoding] Menth, M., Babiarz, J., Moncaster, | 12.1. Normative References | |||
T., and B. Briscoe, "PCN Encoding | ||||
for Packet-Specific Dual Marking | ||||
(PSDM)", | ||||
draft-ietf-pcn-psdm-encoding-00 | ||||
(work in progress), June 2009. | ||||
[I-D.ietf-pcn-sm-edge-behaviour] Charny, A., Karagiannis, G., Menth, | [RFC2003] Perkins, C., "IP Encapsulation within | |||
M., and T. Taylor, "PCN Boundary | IP", RFC 2003, October 1996. | |||
Node Behaviour for the Single | ||||
Marking (SM) Mode of Operation", | ||||
draft-ietf-pcn-sm-edge-behaviour-01 | ||||
(work in progress), October 2009. | ||||
[I-D.satoh-pcn-st-marking] Satoh, D., Ueno, H., Maeda, Y., and | [RFC2119] Bradner, S., "Key words for use in | |||
O. Phanachet, "Single PCN Threshold | RFCs to Indicate Requirement Levels", | |||
Marking by using PCN baseline | BCP 14, RFC 2119, March 1997. | |||
encoding for both admission and | ||||
termination controls", | ||||
draft-satoh-pcn-st-marking-02 (work | ||||
in progress), September 2009. | ||||
[RFC2401] Kent, S. and R. Atkinson, "Security | [RFC3168] Ramakrishnan, K., Floyd, S., and D. | |||
Architecture for the Internet | Black, "The Addition of Explicit | |||
Protocol", RFC 2401, November 1998. | Congestion Notification (ECN) to IP", | |||
RFC 3168, September 2001. | ||||
[RFC2474] Nichols, K., Blake, S., Baker, F., | [RFC4301] Kent, S. and K. Seo, "Security | |||
and D. Black, "Definition of the | Architecture for the Internet | |||
Differentiated Services Field (DS | Protocol", RFC 4301, December 2005. | |||
Field) in the IPv4 and IPv6 | ||||
Headers", RFC 2474, December 1998. | ||||
[RFC2481] Ramakrishnan, K. and S. Floyd, "A | 12.2. Informative References | |||
Proposal to add Explicit Congestion | ||||
Notification (ECN) to IP", | ||||
RFC 2481, January 1999. | ||||
[RFC2983] Black, D., "Differentiated Services | [I-D.ietf-pcn-3-in-1-encoding] Briscoe, B., Moncaster, T., and M. | |||
and Tunnels", RFC 2983, | Menth, "Encoding 3 PCN-States in the | |||
October 2000. | IP header using a single DSCP", | |||
draft-ietf-pcn-3-in-1-encoding-03 | ||||
(work in progress), July 2010. | ||||
[RFC3540] Spring, N., Wetherall, D., and D. | [RFC2401] Kent, S. and R. Atkinson, "Security | |||
Ely, "Robust Explicit Congestion | Architecture for the Internet | |||
Notification (ECN) Signaling with | Protocol", RFC 2401, November 1998. | |||
Nonces", RFC 3540, June 2003. | ||||
[RFC4306] Kaufman, C., "Internet Key Exchange | [RFC2474] Nichols, K., Blake, S., Baker, F., | |||
(IKEv2) Protocol", RFC 4306, | and D. Black, "Definition of the | |||
December 2005. | Differentiated Services Field (DS | |||
Field) in the IPv4 and IPv6 Headers", | ||||
RFC 2474, December 1998. | ||||
[RFC4774] Floyd, S., "Specifying Alternate | [RFC2481] Ramakrishnan, K. and S. Floyd, "A | |||
Semantics for the Explicit | Proposal to add Explicit Congestion | |||
Congestion Notification (ECN) | Notification (ECN) to IP", RFC 2481, | |||
Field", BCP 124, RFC 4774, | January 1999. | |||
November 2006. | ||||
[RFC5129] Davie, B., Briscoe, B., and J. Tay, | [RFC2983] Black, D., "Differentiated Services | |||
"Explicit Congestion Marking in | and Tunnels", RFC 2983, October 2000. | |||
MPLS", RFC 5129, January 2008. | ||||
[RFC5559] Eardley, P., "Pre-Congestion | [RFC3540] Spring, N., Wetherall, D., and D. | |||
Notification (PCN) Architecture", | Ely, "Robust Explicit Congestion | |||
RFC 5559, June 2009. | Notification (ECN) Signaling with | |||
Nonces", RFC 3540, June 2003. | ||||
[RFC5670] Eardley, P., "Metering and Marking | [RFC4306] Kaufman, C., "Internet Key Exchange | |||
Behaviour of PCN-Nodes", RFC 5670, | (IKEv2) Protocol", RFC 4306, | |||
November 2009. | December 2005. | |||
[RFC5696] Moncaster, T., Briscoe, B., and M. | [RFC4774] Floyd, S., "Specifying Alternate | |||
Menth, "Baseline Encoding and | Semantics for the Explicit Congestion | |||
Transport of Pre-Congestion | Notification (ECN) Field", BCP 124, | |||
Information", RFC 5696, | RFC 4774, November 2006. | |||
November 2009. | ||||
[VCP] Xia, Y., Subramanian, L., Stoica, | [RFC5129] Davie, B., Briscoe, B., and J. Tay, | |||
I., and S. Kalyanaraman, "One more | "Explicit Congestion Marking in | |||
bit is enough", Proc. SIGCOMM'05, | MPLS", RFC 5129, January 2008. | |||
ACM CCR 35(4)37--48, 2005, <http:// | ||||
doi.acm.org/10.1145/ | ||||
1080091.1080098>. | ||||
Editorial Comments | [RFC5559] Eardley, P., "Pre-Congestion | |||
Notification (PCN) Architecture", | ||||
RFC 5559, June 2009. | ||||
[Note_Manual_Keying] Bob Briscoe: Note (To be removed by the RFC | [RFC5670] Eardley, P., "Metering and Marking | |||
Editor): One corner case can exist where an | Behaviour of PCN-Nodes", RFC 5670, | |||
RFC4301 ingress does not use IKEv2, but uses | November 2009. | |||
manual keying instead. Then an RFC4301 ingress | ||||
could conceivably be configured to tunnel to an | ||||
egress with limited functionality ECN handling. | ||||
Strictly, for this corner-case, the requirement | ||||
to use compatibility mode in this specification | ||||
updates RFC4301. However, this is such a remote | ||||
possibility that RFC4301 IPsec implementations | ||||
are not required to implement compatibility | ||||
mode. It is planned to remove this note after | ||||
the review process has completed to avoid | ||||
unnecessarily complicating the document with a | ||||
largely theoretical corner case. | ||||
[Note_PCN_egress] Bob Briscoe: During the review process Appendix | [RFC5696] Moncaster, T., Briscoe, B., and M. | |||
D is provided to expand on this point, but it | Menth, "Baseline Encoding and | |||
will be deleted before publication. | Transport of Pre-Congestion | |||
Information", RFC 5696, | ||||
November 2009. | ||||
[Note_PCN_ingress] Bob Briscoe: During the review process Appendix | [VCP] Xia, Y., Subramanian, L., Stoica, I., | |||
E is provided to expand on this point, but it | and S. Kalyanaraman, "One more bit is | |||
will be deleted before publication. | enough", Proc. SIGCOMM'05, ACM | |||
CCR 35(4)37--48, 2005, <http:// | ||||
doi.acm.org/10.1145/1080091.1080098>. | ||||
Appendix A. Early ECN Tunnelling RFCs | Appendix A. Early ECN Tunnelling RFCs | |||
IP in IP tunnelling was originally defined in [RFC2003]. On | IP in IP tunnelling was originally defined in [RFC2003]. On | |||
encapsulation, the incoming header was copied to the outer and on | encapsulation, the incoming header was copied to the outer and on | |||
decapsulation the outer was simply discarded. Initially, IPsec | decapsulation the outer was simply discarded. Initially, IPsec | |||
tunnelling [RFC2401] followed the same behaviour. | tunnelling [RFC2401] followed the same behaviour. | |||
When ECN was introduced experimentally in [RFC2481], legacy (RFC2003 | When ECN was introduced experimentally in [RFC2481], legacy (RFC2003 | |||
or RFC2401) tunnels would have discarded any congestion markings | or RFC2401) tunnels would have discarded any congestion markings | |||
skipping to change at page 40, line 18 | skipping to change at page 40, line 28 | |||
| | | represents 100 packets | | | | represents 100 packets | |||
| 30 | | | | 30 | | | |||
| | | p_t = 12/(100-30) | | | | p_t = 12/(100-30) | |||
p_t + +---------+ = 12/70 | p_t + +---------+ = 12/70 | |||
| | 12 | = 17% | | | 12 | = 17% | |||
0 +-----+---------+---> | 0 +-----+---------+---> | |||
0 30% 100% inner header marking | 0 30% 100% inner header marking | |||
Figure 7: Tunnel Marking of Packets Already Marked at Ingress | Figure 7: Tunnel Marking of Packets Already Marked at Ingress | |||
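The arithmetic in Figure 7 can be checked with a short sketch. It assumes, as the figure does, that 100 packets enter the tunnel, 30 of them already marked at the ingress, and that the tunnel marks 12 of the remaining 70. The function and parameter names are illustrative only and do not appear in either draft.

```python
from fractions import Fraction


def tunnel_marking_fraction(total, marked_at_ingress, marked_in_tunnel):
    """Return p_t: the fraction of packets marked inside the tunnel,
    measured only over packets that arrived unmarked at the ingress."""
    unmarked_at_ingress = total - marked_at_ingress
    if unmarked_at_ingress <= 0:
        raise ValueError("no unmarked packets entered the tunnel")
    return Fraction(marked_in_tunnel, unmarked_at_ingress)


# Values taken from Figure 7: 100 packets, 30 marked at ingress, 12 marked by the tunnel.
p_t = tunnel_marking_fraction(total=100, marked_at_ingress=30, marked_in_tunnel=12)
print(f"p_t = {p_t} = {float(p_t):.0%}")
```

Dividing by the 70 unmarked packets rather than all 100 is the point of the figure: packets already marked at the ingress cannot be counted as marked by the tunnel.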
Appendix D. Why Losing ECT(1) on Decapsulation Impedes PCN (to be | Appendix D. Compromise on Decap with ECT(1) Inner and ECT(0) Outer | |||
removed before publication) | ||||
Congestion notification with two severity levels is currently on the | ||||
IETF's standards track agenda in the Congestion and Pre-Congestion | ||||
Notification (PCN) working group. PCN needs all four possible states | ||||
of congestion signalling in the 2-bit ECN field to be propagated at | ||||
the egress, but pre-existing tunnels only propagate three. The four | ||||
PCN states are: not PCN-enabled, not marked and two increasingly | ||||
severe levels of congestion marking. The less severe marking means | ||||
'stop admitting new traffic' and the more severe marking means | ||||
'terminate some existing flows', which may be needed after reroutes | ||||
(see [RFC5559] for more details). (Note on terminology: wherever | ||||
this document counts four congestion states, the PCN working group | ||||
would count this as three PCN states plus a not-PCN-enabled state.) | ||||
Figure 2 (Section 3.2) shows that pre-existing decapsulation | ||||
behaviour would have discarded any ECT(1) markings in outer headers | ||||
if the inner was ECT(0). This prevented the PCN working group from | ||||
using ECT(1) -- if a PCN node used ECT(1) to indicate one of the | ||||
severity levels of congestion, any later tunnel egress would revert | ||||
the marking to ECT(0) as if nothing had happened. Effectively the | ||||
decapsulation rules of RFC4301 and RFC3168 waste one ECT codepoint; | ||||
they treat the ECT(0) and ECT(1) codepoints as a single codepoint. | ||||
A number of work-rounds to this problem were proposed in the PCN w-g; | ||||
to add the fourth state another way or avoid needing it. Without | ||||
wishing to disparage the ingenuity of these work-rounds, none were | ||||
chosen for the standards track because they were either somewhat | ||||
wasteful, imprecise or complicated: | ||||
o One uses a pair of Diffserv codepoint(s) in place of each PCN DSCP | ||||
to encode the extra state [I-D.ietf-pcn-3-state-encoding], using | ||||
up the rapidly exhausting DSCP space while leaving an ECN | ||||
codepoint unused. | ||||
o Another survives tunnelling without an extra DSCP | ||||
[I-D.ietf-pcn-psdm-encoding], but it requires the PCN edge | ||||
gateways to share the initial state of a packet out of band. | ||||
o Another proposes a more involved marking algorithm in forwarding | ||||
elements to encode the three congestion notification states using | ||||
only two ECN codepoints [I-D.satoh-pcn-st-marking]. | ||||
o Another takes a different approach; it compromises the precision | ||||
of the admission control mechanism in some network scenarios, but | ||||
manages to work with just three encoding states and a single | ||||
marking algorithm [I-D.ietf-pcn-sm-edge-behaviour]. | ||||
Rather than require the IETF to bless any of these experimental | ||||
encoding work-rounds, the present specification fixes the root cause | ||||
of the problem so that operators deploying PCN can simply require | ||||
that tunnel end-points within a PCN region should comply with this | ||||
new ECN tunnelling specification. On the public Internet it would | ||||
not be possible to know whether all tunnels complied with this new | ||||
specification, but universal compliance is feasible for PCN, because | ||||
it is intended to be deployed in a controlled Diffserv region. | ||||
Given the present specification, the PCN w-g could progress a | ||||
trivially simple four-state ECN encoding | ||||
[I-D.ietf-pcn-3-in-1-encoding]. This would replace the interim | ||||
standards track baseline encoding of just three states [RFC5696] | ||||
which makes a fourth state available for any of the experimental | ||||
alternatives. | ||||
Appendix E. Why Resetting ECN on Encapsulation Impedes PCN (to be | ||||
removed before publication) | ||||
The PCN architecture says "...if encapsulation is done within the | ||||
PCN-domain: Any PCN-marking is copied into the outer header. Note: A | ||||
tunnel will not provide this behaviour if it complies with [RFC3168] | ||||
tunnelling in either mode, but it will if it complies with [RFC4301] | ||||
IPsec tunnelling. " | ||||
The specific issue here concerns PCN excess rate marking [RFC5670]. | ||||
The purpose of excess rate marking is to provide a bulk mechanism for | ||||
interior nodes within a PCN domain to mark traffic that is exceeding | ||||
a configured threshold bit-rate, perhaps after an unexpected event | ||||
such as a reroute, a link or node failure, or a more widespread | ||||
disaster. Reroutes are a common cause of QoS degradation in IP | ||||
networks. After reroutes it is common for multiple links in a | ||||
network to become stressed at once. Therefore, PCN excess rate | ||||
marking has been carefully designed to ensure traffic marked at one | ||||
queue will not be counted again for marking at subsequent queues (see | ||||
the `Excess traffic meter function' of [RFC5670]). | ||||
However, if an RFC3168 tunnel ingress intervenes, it resets the ECN | ||||
field in all the outer headers. This will cause excess traffic to be | ||||
counted more than once, leading to many flows being removed that did | ||||
not need to be removed at all. This is why an RFC3168 tunnel | ||||
ingress cannot be used in a PCN domain. | ||||
The ECN reset in RFC3168 is no longer deemed necessary, it is | ||||
inconsistent with RFC4301, it is not as simple as RFC4301 and it is | ||||
impeding deployment of new protocols like PCN. The present | ||||
specification corrects this perverse situation. | ||||
Appendix F. Compromise on Decap with ECT(1) Inner and ECT(0) Outer | ||||
A packet with an ECT(1) inner and an ECT(0) outer should never arise | A packet with an ECT(1) inner and an ECT(0) outer should never arise | |||
from any known IETF protocol. Without giving a reason, RFC3168 and | from any known IETF protocol. Without giving a reason, RFC3168 and | |||
RFC4301 both say the outer should be ignored when decapsulating such | RFC4301 both say the outer should be ignored when decapsulating such | |||
a packet. This appendix explains why it was decided not to change | a packet. This appendix explains why it was decided not to change | |||
this advice. | this advice. | |||
In summary, ECT(0) always means 'not congested' and ECT(1) may imply | In summary, ECT(0) always means 'not congested' and ECT(1) may imply | |||
the same [RFC3168] or it may imply a higher severity congestion | the same [RFC3168] or it may imply a higher severity congestion | |||
signal [RFC4774], [I-D.ietf-pcn-3-in-1-encoding], depending on the | signal [RFC4774], [I-D.ietf-pcn-3-in-1-encoding], depending on the | |||
skipping to change at page 43, line 20 | skipping to change at page 41, line 32 | |||
Superficially, the opposite case where the inner and outer carry | Superficially, the opposite case where the inner and outer carry | |||
different ECT values, but with an ECT(1) outer and ECT(0) inner, | different ECT values, but with an ECT(1) outer and ECT(0) inner, | |||
seems to require a similar compromise. However, because that case is | seems to require a similar compromise. However, because that case is | |||
reversed, no compromise is necessary; it is best to forward the outer | reversed, no compromise is necessary; it is best to forward the outer | |||
whether the transport expects the ECT(1) to mean a higher severity | whether the transport expects the ECT(1) to mean a higher severity | |||
than ECT(0) or the same severity. Forwarding the outer either | than ECT(0) or the same severity. Forwarding the outer either | |||
preserves a higher value (if it is higher) or it reveals an anomaly | preserves a higher value (if it is higher) or it reveals an anomaly | |||
to the transport (if the two ECT codepoints mean the same severity). | to the transport (if the two ECT codepoints mean the same severity). | |||
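The two mismatched-ECT cases discussed in this appendix can be summarised in a minimal sketch. It covers only the combinations described above (both headers carrying ECT(0) or ECT(1)); the handling of Not-ECT and CE is defined elsewhere in the draft and is deliberately left out. The function name and the string constants are assumptions made for illustration, not part of the specification.

```python
ECT0 = "ECT(0)"
ECT1 = "ECT(1)"


def decap_ect_pair(inner, outer):
    """Return the ECN codepoint forwarded after decapsulation when both the
    inner and outer headers carry an ECT codepoint.

    - ECT(0) inner, ECT(1) outer: forward the outer ECT(1).
    - ECT(1) inner, ECT(0) outer: ignore the outer and forward the inner ECT(1).
    - Matching codepoints: forward that codepoint unchanged.
    """
    if inner not in (ECT0, ECT1) or outer not in (ECT0, ECT1):
        # Not-ECT and CE combinations are outside the scope of this sketch.
        raise ValueError("only ECT(0)/ECT(1) combinations are covered here")
    if inner == ECT0 and outer == ECT1:
        return outer
    return inner


for inner, outer in [(ECT0, ECT0), (ECT0, ECT1), (ECT1, ECT0), (ECT1, ECT1)]:
    print(f"inner={inner} outer={outer} -> forwarded={decap_ect_pair(inner, outer)}")
```

In both mismatched cases the forwarded codepoint is ECT(1), but for different reasons: in one the outer is propagated, in the other it is ignored as RFC3168 and RFC4301 already advise.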
Appendix G. Open Issues | Appendix E. Open Issues | |||
The new decapsulation behaviour defined in Section 4.2 adds support | The new decapsulation behaviour defined in Section 4.2 adds support | |||
for propagation of 2 severity levels of congestion. However | for propagation of 2 severity levels of congestion. However | |||
transports have no way to discover whether there are any legacy | transports have no way to discover whether there are any legacy | |||
tunnels on their path that will not propagate 2 severity levels. It | tunnels on their path that will not propagate 2 severity levels. It | |||
would have been nice to add a feature for transports to check path | would have been nice to add a feature for transports to check path | |||
support, but this remains an open issue that will have to be | support, but this remains an open issue that will have to be | |||
addressed in any future standards action to define an end-to-end | addressed in any future standards action to define an end-to-end | |||
scheme that requires 2-severity levels of congestion. PCN avoids | scheme that requires 2-severity levels of congestion. PCN avoids | |||
this problem because it is only for a controlled region, so all | this problem because it is only for a controlled region, so all | |||
End of changes. 61 change blocks. | ||||
349 lines changed or deleted | 248 lines changed or added | |||