< draft-ietf-tsvwg-ecn-tunnel-07.txt | draft-ietf-tsvwg-ecn-tunnel-08.txt > | |||
---|---|---|---|---|
Transport Area Working Group B. Briscoe | Transport Area Working Group B. Briscoe | |||
Internet-Draft BT | Internet-Draft BT | |||
Updates: 3168, 4301 February 11, 2010 | Updates: 3168, 4301 March 03, 2010 | |||
(if approved) | (if approved) | |||
Intended status: Standards Track | Intended status: Standards Track | |||
Expires: August 15, 2010 | Expires: September 4, 2010 | |||
Tunnelling of Explicit Congestion Notification | Tunnelling of Explicit Congestion Notification | |||
draft-ietf-tsvwg-ecn-tunnel-07 | draft-ietf-tsvwg-ecn-tunnel-08 | |||
Abstract | Abstract | |||
This document redefines how the explicit congestion notification | This document redefines how the explicit congestion notification | |||
(ECN) field of the IP header should be constructed on entry to and | (ECN) field of the IP header should be constructed on entry to and | |||
exit from any IP in IP tunnel. On encapsulation it updates RFC3168 | exit from any IP in IP tunnel. On encapsulation it updates RFC3168 | |||
to bring all IP in IP tunnels (v4 or v6) into line with RFC4301 IPsec | to bring all IP in IP tunnels (v4 or v6) into line with RFC4301 IPsec | |||
ECN processing. On decapsulation it updates both RFC3168 and RFC4301 | ECN processing. On decapsulation it updates both RFC3168 and RFC4301 | |||
to add new behaviours for previously unused combinations of inner and | to add new behaviours for previously unused combinations of inner and | |||
outer header. The new rules ensure the ECN field is correctly | outer header. The new rules ensure the ECN field is correctly | |||
propagated across a tunnel whether it is used to signal one or two | propagated across a tunnel whether it is used to signal one or two | |||
severity levels of congestion, whereas before only one severity level | severity levels of congestion, whereas before only one severity level | |||
was supported. Tunnel endpoints can be updated in any order without | was supported. Tunnel endpoints can be updated in any order without | |||
affecting pre-existing uses of the ECN field (backward compatible). | affecting pre-existing uses of the ECN field, providing backward | |||
Nonetheless, operators wanting to support two severity levels (e.g. | compatibility. Nonetheless, operators wanting to support two | |||
for pre-congestion notification--PCN) can require compliance with | severity levels (e.g. for pre-congestion notification--PCN) can | |||
this new specification. A thorough analysis of the reasoning for | require compliance with this new specification. A thorough analysis | |||
these changes and the implications is included. In the unlikely | of the reasoning for these changes and the implications is included. | |||
event that the new rules do not meet a specific need, RFC4774 gives | In the unlikely event that the new rules do not meet a specific need, | |||
guidance on designing alternate ECN semantics and this document | RFC4774 gives guidance on designing alternate ECN semantics and this | |||
extends that to include tunnelling issues. | document extends that to include tunnelling issues. | |||
Status of This Memo | Status of This Memo | |||
This Internet-Draft is submitted to IETF in full conformance with the | This Internet-Draft is submitted to IETF in full conformance with the | |||
provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF), its areas, and its working groups. Note that | Task Force (IETF), its areas, and its working groups. Note that | |||
other groups may also distribute working documents as Internet- | other groups may also distribute working documents as Internet- | |||
Drafts. | Drafts. | |||
skipping to change at page 2, line 9 | skipping to change at page 2, line 9 | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
http://www.ietf.org/ietf/1id-abstracts.txt. | http://www.ietf.org/ietf/1id-abstracts.txt. | |||
The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
http://www.ietf.org/shadow.html. | http://www.ietf.org/shadow.html. | |||
This Internet-Draft will expire on August 15, 2010. | This Internet-Draft will expire on September 4, 2010. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2010 IETF Trust and the persons identified as the | Copyright (c) 2010 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
described in the BSD License. | described in the BSD License. | |||
This document may contain material from IETF Documents or IETF | ||||
Contributions published or made publicly available before November | ||||
10, 2008. The person(s) controlling the copyright in some of this | ||||
material may not have granted the IETF Trust the right to allow | ||||
modifications of such material outside the IETF Standards Process. | ||||
Without obtaining an adequate license from the person(s) controlling | ||||
the copyright in such materials, this document may not be modified | ||||
outside the IETF Standards Process, and derivative works of it may | ||||
not be created outside the IETF Standards Process, except to format | ||||
it for publication as an RFC or to translate it into languages other | ||||
than English. | ||||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 9 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 9 | |||
1.1. Scope . . . . . . . . . . . . . . . . . . . . . . . . . . 11 | 1.1. Scope . . . . . . . . . . . . . . . . . . . . . . . . . . 11 | |||
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 11 | 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 11 | |||
3. Summary of Pre-Existing RFCs . . . . . . . . . . . . . . . . . 12 | 3. Summary of Pre-Existing RFCs . . . . . . . . . . . . . . . . . 12 | |||
3.1. Encapsulation at Tunnel Ingress . . . . . . . . . . . . . 12 | 3.1. Encapsulation at Tunnel Ingress . . . . . . . . . . . . . 12 | |||
3.2. Decapsulation at Tunnel Egress . . . . . . . . . . . . . . 13 | 3.2. Decapsulation at Tunnel Egress . . . . . . . . . . . . . . 13 | |||
4. New ECN Tunnelling Rules . . . . . . . . . . . . . . . . . . . 14 | 4. New ECN Tunnelling Rules . . . . . . . . . . . . . . . . . . . 14 | |||
4.1. Default Tunnel Ingress Behaviour . . . . . . . . . . . . . 15 | 4.1. Default Tunnel Ingress Behaviour . . . . . . . . . . . . . 15 | |||
4.2. Default Tunnel Egress Behaviour . . . . . . . . . . . . . 15 | 4.2. Default Tunnel Egress Behaviour . . . . . . . . . . . . . 15 | |||
4.3. Encapsulation Modes . . . . . . . . . . . . . . . . . . . 17 | 4.3. Encapsulation Modes . . . . . . . . . . . . . . . . . . . 17 | |||
4.4. Single Mode of Decapsulation . . . . . . . . . . . . . . . 19 | 4.4. Single Mode of Decapsulation . . . . . . . . . . . . . . . 19 | |||
5. Updates to Earlier RFCs . . . . . . . . . . . . . . . . . . . 20 | 5. Updates to Earlier RFCs . . . . . . . . . . . . . . . . . . . 20 | |||
5.1. Changes to RFC4301 ECN processing . . . . . . . . . . . . 20 | 5.1. Changes to RFC4301 ECN processing . . . . . . . . . . . . 20 | |||
5.2. Changes to RFC3168 ECN processing . . . . . . . . . . . . 21 | 5.2. Changes to RFC3168 ECN processing . . . . . . . . . . . . 20 | |||
5.3. Motivation for Changes . . . . . . . . . . . . . . . . . . 22 | 5.3. Motivation for Changes . . . . . . . . . . . . . . . . . . 22 | |||
5.3.1. Motivation for Changing Encapsulation . . . . . . . . 22 | 5.3.1. Motivation for Changing Encapsulation . . . . . . . . 22 | |||
5.3.2. Motivation for Changing Decapsulation . . . . . . . . 23 | 5.3.2. Motivation for Changing Decapsulation . . . . . . . . 23 | |||
6. Backward Compatibility . . . . . . . . . . . . . . . . . . . . 25 | 6. Backward Compatibility . . . . . . . . . . . . . . . . . . . . 25 | |||
6.1. Non-Issues Updating Decapsulation . . . . . . . . . . . . 25 | 6.1. Non-Issues Updating Decapsulation . . . . . . . . . . . . 25 | |||
6.2. Non-Update of RFC4301 IPsec Encapsulation . . . . . . . . 26 | 6.2. Non-Update of RFC4301 IPsec Encapsulation . . . . . . . . 26 | |||
6.3. Update to RFC3168 Encapsulation . . . . . . . . . . . . . 26 | 6.3. Update to RFC3168 Encapsulation . . . . . . . . . . . . . 26 | |||
7. Design Principles for Alternate ECN Tunnelling Semantics . . . 27 | 7. Design Principles for Alternate ECN Tunnelling Semantics . . . 27 | |||
8. Security Considerations . . . . . . . . . . . . . . . . . . . 29 | 8. Security Considerations . . . . . . . . . . . . . . . . . . . 29 | |||
9. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 30 | 9. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 30 | |||
10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 31 | 10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 31 | |||
11. References . . . . . . . . . . . . . . . . . . . . . . . . . . 31 | 11. References . . . . . . . . . . . . . . . . . . . . . . . . . . 31 | |||
11.1. Normative References . . . . . . . . . . . . . . . . . . . 31 | 11.1. Normative References . . . . . . . . . . . . . . . . . . . 31 | |||
11.2. Informative References . . . . . . . . . . . . . . . . . . 32 | 11.2. Informative References . . . . . . . . . . . . . . . . . . 32 | |||
Editorial Comments . . . . . . . . . . . . . . . . . . . . . . . . | ||||
Appendix A. Early ECN Tunnelling RFCs . . . . . . . . . . . . . . 34 | Appendix A. Early ECN Tunnelling RFCs . . . . . . . . . . . . . . 34 | |||
Appendix B. Design Constraints . . . . . . . . . . . . . . . . . 35 | Appendix B. Design Constraints . . . . . . . . . . . . . . . . . 35 | |||
B.1. Security Constraints . . . . . . . . . . . . . . . . . . . 35 | B.1. Security Constraints . . . . . . . . . . . . . . . . . . . 35 | |||
B.2. Control Constraints . . . . . . . . . . . . . . . . . . . 37 | B.2. Control Constraints . . . . . . . . . . . . . . . . . . . 37 | |||
B.3. Management Constraints . . . . . . . . . . . . . . . . . . 38 | B.3. Management Constraints . . . . . . . . . . . . . . . . . . 38 | |||
Appendix C. Contribution to Congestion across a Tunnel . . . . . 38 | Appendix C. Contribution to Congestion across a Tunnel . . . . . 39 | |||
Appendix D. Why Losing ECT(1) on Decapsulation Impedes PCN | Appendix D. Why Losing ECT(1) on Decapsulation Impedes PCN | |||
(to be removed before publication) . . . . . . . . . 39 | (to be removed before publication) . . . . . . . . . 40 | |||
Appendix E. Why Resetting ECN on Encapsulation Impedes PCN | Appendix E. Why Resetting ECN on Encapsulation Impedes PCN | |||
(to be removed before publication) . . . . . . . . . 41 | (to be removed before publication) . . . . . . . . . 41 | |||
Appendix F. Compromise on Decap with ECT(1) Inner and ECT(0) | Appendix F. Compromise on Decap with ECT(1) Inner and ECT(0) | |||
Outer . . . . . . . . . . . . . . . . . . . . . . . . 41 | Outer . . . . . . . . . . . . . . . . . . . . . . . . 42 | |||
Appendix G. Open Issues . . . . . . . . . . . . . . . . . . . . . 42 | Appendix G. Open Issues . . . . . . . . . . . . . . . . . . . . . 43 | |||
Request to the RFC Editor (to be removed on publication): | Request to the RFC Editor (to be removed on publication): | |||
In the RFC index, RFC3168 should be identified as an update to | In the RFC index, RFC3168 should be identified as an update to | |||
RFC2003. RFC4301 should be identified as an update to RFC3168. | RFC2003. RFC4301 should be identified as an update to RFC3168. | |||
Changes from previous drafts (to be removed by the RFC Editor) | Changes from previous drafts (to be removed by the RFC Editor) | |||
Full text differences between IETF draft versions are available at | Full text differences between IETF draft versions are available at | |||
<http://tools.ietf.org/wg/tsvwg/draft-ietf-tsvwg-ecn-tunnel/>, and | <http://tools.ietf.org/wg/tsvwg/draft-ietf-tsvwg-ecn-tunnel/>, and | |||
skipping to change at page 9, line 32 | skipping to change at page 9, line 32 | |||
regulation (changed title from "In-path Load Regulation" to | regulation (changed title from "In-path Load Regulation" to | |||
"Non-Dependence of Tunnelling on In-path Load Regulation"), but | "Non-Dependence of Tunnelling on In-path Load Regulation"), but | |||
explained how an in-path load regulation function must be | explained how an in-path load regulation function must be | |||
carefully placed with respect to tunnel encapsulation (in a new | carefully placed with respect to tunnel encapsulation (in a new | |||
sub-section entitled "Dependence of In-Path Load Regulation on | sub-section entitled "Dependence of In-Path Load Regulation on | |||
Tunnelling"). | Tunnelling"). | |||
1. Introduction | 1. Introduction | |||
Explicit congestion notification (ECN [RFC3168]) allows a forwarding | Explicit congestion notification (ECN [RFC3168]) allows a forwarding | |||
element to notify the onset of congestion without having to drop | element (e.g. a router) to notify the onset of congestion without | |||
packets. Instead it can explicitly mark a proportion of packets in | having to drop packets. Instead it can explicitly mark a proportion | |||
the 2-bit ECN field in the IP header (Table 1 recaps the ECN | of packets in the 2-bit ECN field in the IP header (Table 1 recaps | |||
codepoints). | the ECN codepoints). | |||
The outer header of an IP packet can encapsulate one or more IP | The outer header of an IP packet can encapsulate one or more IP | |||
headers for tunnelling. A forwarding element using ECN to signify | headers for tunnelling. A forwarding element using ECN to signify | |||
congestion will only mark the immediately visible outer IP header. | congestion will only mark the immediately visible outer IP header. | |||
When a tunnel decapsulator later removes this outer header, it | When a tunnel decapsulator later removes this outer header, it | |||
follows rules to propagate congestion markings by combining the ECN | follows rules to propagate congestion markings by combining the ECN | |||
fields of the inner and outer IP header into one outgoing IP header. | fields of the inner and outer IP header into one outgoing IP header. | |||
This document updates those rules for IPsec [RFC4301] and non-IPsec | This document updates those rules for IPsec [RFC4301] and non-IPsec | |||
[RFC3168] tunnels to add new behaviours for previously unused | [RFC3168] tunnels to add new behaviours for previously unused | |||
skipping to change at page 14, line 16 | skipping to change at page 14, line 16 | |||
|Incoming | Incoming Outer Header | | |Incoming | Incoming Outer Header | | |||
| Inner +---------+------------+------------+------------+ | | Inner +---------+------------+------------+------------+ | |||
| Header | Not-ECT | ECT(0) | ECT(1) | CE | | | Header | Not-ECT | ECT(0) | ECT(1) | CE | | |||
+---------+---------+------------+------------+------------+ | +---------+---------+------------+------------+------------+ | |||
RFC3168->| Not-ECT | Not-ECT |Not-ECT |Not-ECT | drop | | RFC3168->| Not-ECT | Not-ECT |Not-ECT |Not-ECT | drop | | |||
RFC4301->| Not-ECT | Not-ECT |Not-ECT |Not-ECT |Not-ECT | | RFC4301->| Not-ECT | Not-ECT |Not-ECT |Not-ECT |Not-ECT | | |||
| ECT(0) | ECT(0) | ECT(0) | ECT(0) | CE | | | ECT(0) | ECT(0) | ECT(0) | ECT(0) | CE | | |||
| ECT(1) | ECT(1) | ECT(1) | ECT(1) | CE | | | ECT(1) | ECT(1) | ECT(1) | ECT(1) | CE | | |||
| CE | CE | CE | CE | CE | | | CE | CE | CE | CE | CE | | |||
+---------+---------+------------+------------+------------+ | +---------+---------+------------+------------+------------+ | |||
| Outgoing Header | | ||||
+------------------------------------------------+ | In pre-existing RFCs, the ECN field in the outgoing header was set to | |||
| | the codepoint at the intersection of the appropriate incoming inner | |||
| | header (row) and incoming outer header (column). | |||
Figure 2: IP in IP Decapsulation; Recap of Pre-existing Behaviour | Figure 2: IP in IP Decapsulation; Recap of Pre-existing Behaviour | |||
The behaviour in the table derives from the logic given in RFC3168 | The behaviour in the table derives from the logic given in RFC3168 | |||
and RFC4301, briefly recapped as follows: | and RFC4301, briefly recapped as follows: | |||
o On decapsulation, if the inner ECN field is Not-ECT the outer is | o On decapsulation, if the inner ECN field is Not-ECT the outer is | |||
ignored. RFC3168 (but not RFC4301) also specified that the | ignored. RFC3168 (but not RFC4301) also specified that the | |||
decapsulator must drop a packet with a Not-ECT inner and CE in the | decapsulator must drop a packet with a Not-ECT inner and CE in the | |||
outer. | outer. | |||
o In all other cases, if the outer is CE, the outgoing ECN field is | o In all other cases, if the outer is CE, the outgoing ECN field is | |||
set to CE, but otherwise the outer is ignored and the inner is | set to CE, but otherwise the outer is ignored and the inner is | |||
used for the outgoing ECN field. | used for the outgoing ECN field. | |||
RFC3168 also made it an auditable event for an IPsec tunnel "if the | Section 9.2.2 of RFC3168 also made it an auditable event for an IPsec | |||
ECN Field is changed inappropriately within an IPsec tunnel...". | tunnel "if the ECN Field is changed inappropriately within an IPsec | |||
Inappropriate changes were not specifically enumerated. RFC4301 did | tunnel...". Inappropriate changes were not specifically enumerated. | |||
not mention inappropriate ECN changes. | RFC4301 did not mention inappropriate ECN changes. | |||
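
As a minimal sketch of the pre-existing behaviour recapped in Figure 2 (not text from either draft version), the following Python function returns the outgoing ECN codepoint for a given inner and outer header. The function name `legacy_decap`, the `ipsec` flag and the string constants are illustrative assumptions, not identifiers defined by RFC3168 or RFC4301.

```python
# Hypothetical constants naming the four ECN codepoints and the drop action.
NOT_ECT, ECT0, ECT1, CE = "Not-ECT", "ECT(0)", "ECT(1)", "CE"
DROP = "drop"


def legacy_decap(inner, outer, ipsec=False):
    """Outgoing ECN field per the pre-existing rules recapped in Figure 2.

    ipsec=False follows the RFC3168 row, ipsec=True the RFC4301 row.
    """
    if inner == NOT_ECT:
        # The outer is ignored, except that RFC3168 (but not RFC4301)
        # drops a packet with a Not-ECT inner and CE in the outer.
        if outer == CE and not ipsec:
            return DROP
        return NOT_ECT
    if outer == CE:
        # In all other cases a CE outer sets the outgoing field to CE.
        return CE
    # Otherwise the outer is ignored and the inner is used.
    return inner
```

Note that under these rules an ECT(1) outer never changes the outgoing header, which is why only one severity level of congestion could be propagated.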
4. New ECN Tunnelling Rules | 4. New ECN Tunnelling Rules | |||
The standards actions below in Section 4.1 (ingress encapsulation) | The standards actions below in Section 4.1 (ingress encapsulation) | |||
and Section 4.2 (egress decapsulation) define new default ECN tunnel | and Section 4.2 (egress decapsulation) define new default ECN tunnel | |||
processing rules for any IP packet (v4 or v6) with any Diffserv | processing rules for any IP packet (v4 or v6) with any Diffserv | |||
codepoint. | codepoint. | |||
If these defaults do not meet a particular requirement, an alternate | If these defaults do not meet a particular requirement, an alternate | |||
ECN tunnelling scheme can be introduced as part of the definition of | ECN tunnelling scheme can be introduced as part of the definition of | |||
an alternate congestion marking scheme used by a specific Diffserv | an alternate congestion marking scheme used by a specific Diffserv | |||
PHB (see S.5 of [RFC3168] and [RFC4774]). When designing such | PHB (see S.5 of [RFC3168] and [RFC4774]). When designing such | |||
alternate ECN tunnelling schemes, the principles in Section 7 should | alternate ECN tunnelling schemes, the principles in Section 7 should | |||
be followed. However, alternate ECN tunnelling schemes are NOT | be followed. However, alternate ECN tunnelling schemes SHOULD be | |||
RECOMMENDED as the deployment burden of handling exceptional PHBs in | avoided whenever possible as the deployment burden of handling | |||
implementations of all affected tunnels should not be underestimated. | exceptional PHBs in implementations of all affected tunnels should | |||
| | not be underestimated. There is no requirement for a PHB definition | |||
There is no requirement for a PHB definition to state anything about | to state anything about ECN tunnelling behaviour if the default | |||
ECN tunnelling behaviour if the default behaviour in the present | behaviour in the present specification is sufficient. | |||
specification is sufficient. | ||||
4.1. Default Tunnel Ingress Behaviour | 4.1. Default Tunnel Ingress Behaviour | |||
Two modes of encapsulation are defined here; a REQUIRED `normal mode' | Two modes of encapsulation are defined here; a REQUIRED `normal mode' | |||
and a `compatibility mode', which is for backward compatibility with | and a `compatibility mode', which is for backward compatibility with | |||
tunnel decapsulators that do not understand ECN. Note that these are | tunnel decapsulators that do not understand ECN. Note that these are | |||
modes of the ingress tunnel endpoint only, not the whole tunnel. | modes of the ingress tunnel endpoint only, not the whole tunnel. | |||
Section 4.3 explains why two modes are necessary and specifies the | Section 4.3 explains why two modes are necessary and specifies the | |||
circumstances in which it is sufficient to solely implement normal | circumstances in which it is sufficient to solely implement normal | |||
mode. | mode. | |||
skipping to change at page 16, line 15 | skipping to change at page 16, line 15 | |||
+---------+------------------------------------------------+ | +---------+------------------------------------------------+ | |||
|Incoming | Incoming Outer Header | | |Incoming | Incoming Outer Header | | |||
| Inner +---------+------------+------------+------------+ | | Inner +---------+------------+------------+------------+ | |||
| Header | Not-ECT | ECT(0) | ECT(1) | CE | | | Header | Not-ECT | ECT(0) | ECT(1) | CE | | |||
+---------+---------+------------+------------+------------+ | +---------+---------+------------+------------+------------+ | |||
| Not-ECT | Not-ECT |Not-ECT(!!!)|Not-ECT(!!!)| drop(!!!)| | | Not-ECT | Not-ECT |Not-ECT(!!!)|Not-ECT(!!!)| drop(!!!)| | |||
| ECT(0) | ECT(0) | ECT(0) | ECT(1) | CE | | | ECT(0) | ECT(0) | ECT(0) | ECT(1) | CE | | |||
| ECT(1) | ECT(1) | ECT(1) (!) | ECT(1) | CE | | | ECT(1) | ECT(1) | ECT(1) (!) | ECT(1) | CE | | |||
| CE | CE | CE | CE(!!!)| CE | | | CE | CE | CE | CE(!!!)| CE | | |||
+---------+---------+------------+------------+------------+ | +---------+---------+------------+------------+------------+ | |||
| Outgoing Header | | ||||
+------------------------------------------------+ | The ECN field in the outgoing header is set to the codepoint at the | |||
Currently unused combinations are indicated by '(!!!)' or '(!)' | intersection of the appropriate incoming inner header (row) and | |||
| | incoming outer header (column). Currently unused combinations are | |||
| | indicated by '(!!!)' or '(!)' | |||
Figure 4: New IP in IP Decapsulation Behaviour | Figure 4: New IP in IP Decapsulation Behaviour | |||
This table for decapsulation behaviour is derived from the following | This table for decapsulation behaviour is derived from the following | |||
logic: | logic: | |||
o If the inner ECN field is Not-ECT the decapsulator MUST NOT | o If the inner ECN field is Not-ECT the decapsulator MUST NOT | |||
propagate any other ECN codepoint onwards. This is because the | propagate any other ECN codepoint onwards. This is because the | |||
inner Not-ECT marking is set by transports that use drop as an | inner Not-ECT marking is set by transports that use drop as an | |||
indication of congestion and would not understand or respond to | indication of congestion and would not understand or respond to | |||
skipping to change at page 17, line 12 | skipping to change at page 17, line 15 | |||
Just because the highlighted combinations are currently unused, | Just because the highlighted combinations are currently unused, | |||
does not mean that all the other combinations are always valid. | does not mean that all the other combinations are always valid. | |||
Some are only valid if they have arrived from a particular type of | Some are only valid if they have arrived from a particular type of | |||
legacy ingress, and dangerous otherwise. Therefore an | legacy ingress, and dangerous otherwise. Therefore an | |||
implementation MAY allow an operator to configure logging and | implementation MAY allow an operator to configure logging and | |||
alarms for such additional header combinations known to be | alarms for such additional header combinations known to be | |||
dangerous or CU for the particular configuration of tunnel | dangerous or CU for the particular configuration of tunnel | |||
endpoints deployed at run-time. | endpoints deployed at run-time. | |||
Alarms should be rate-limited so that the anomalous combinations | Alarms SHOULD be rate-limited so that the anomalous combinations | |||
will not amplify into a flood of alarm messages. It MUST be | will not amplify into a flood of alarm messages. It MUST be | |||
possible to suppress alarms or logging, e.g. if it becomes | possible to suppress alarms or logging, e.g. if it becomes | |||
apparent that a combination that previously was not used has | apparent that a combination that previously was not used has | |||
started to be used for legitimate purposes such as a new standards | started to be used for legitimate purposes such as a new standards | |||
action. | action. | |||
The above logic allows for ECT(0) and ECT(1) to both represent the | The above logic allows for ECT(0) and ECT(1) to both represent the | |||
same severity of congestion marking (e.g. "not congestion marked"). | same severity of congestion marking (e.g. "not congestion marked"). | |||
But it also allows future schemes to be defined where ECT(1) is a | But it also allows future schemes to be defined where ECT(1) is a | |||
more severe marking than ECT(0), in particular enabling the simplest | more severe marking than ECT(0), in particular enabling the simplest | |||
possible encoding for PCN [I-D.ietf-pcn-3-in-1-encoding]. This | possible encoding for PCN [I-D.ietf-pcn-3-in-1-encoding]. Before the | |||
approach is discussed in Appendix D and in the discussion of the ECN | present specification was written, the PCN working-group had proposed | |||
nonce [RFC3540] in Section 8, which in turn refers to Appendix F. | a number of work-rounds to the problem of a tunnel egress not | |||
| | propagating two severity levels of congestion. Without wishing to | |||
| | disparage the ingenuity of these work-rounds, none were chosen for | |||
| | the standards track because they were either somewhat wasteful, | |||
| | imprecise or complicated [Note_PCN_egress]. Treating ECT(1) as | |||
| | either the same as ECT(0) or as a higher severity level is explained | |||
| | in the discussion of the ECN nonce [RFC3540] in Section 8, which in | |||
| | turn refers to Appendix F. | |||
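
The new decapsulation behaviour of Figure 4 can be sketched as a table lookup. This is an illustrative Python transcription of the figure, not part of either draft version; the names `NEW_DECAP`, `FLAGGED` and `new_decap` are assumptions. The sketch reports the '(!)' and '(!!!)' markers from the figure as opaque strings so that a caller could decide whether to log or raise an alarm.

```python
# Hypothetical constants naming the four ECN codepoints and the drop action.
NOT_ECT, ECT0, ECT1, CE = "Not-ECT", "ECT(0)", "ECT(1)", "CE"
DROP = "drop"

# NEW_DECAP[inner][outer] -> outgoing header, transcribed from Figure 4.
NEW_DECAP = {
    NOT_ECT: {NOT_ECT: NOT_ECT, ECT0: NOT_ECT, ECT1: NOT_ECT, CE: DROP},
    ECT0:    {NOT_ECT: ECT0,    ECT0: ECT0,    ECT1: ECT1,    CE: CE},
    ECT1:    {NOT_ECT: ECT1,    ECT0: ECT1,    ECT1: ECT1,    CE: CE},
    CE:      {NOT_ECT: CE,      ECT0: CE,      ECT1: CE,      CE: CE},
}

# (inner, outer) combinations that Figure 4 marks as currently unused.
FLAGGED = {
    (NOT_ECT, ECT0): "(!!!)",
    (NOT_ECT, ECT1): "(!!!)",
    (NOT_ECT, CE):   "(!!!)",
    (ECT1, ECT0):    "(!)",
    (CE, ECT1):      "(!!!)",
}


def new_decap(inner, outer):
    """Return (outgoing, marker); marker is '' for combinations in normal use."""
    return NEW_DECAP[inner][outer], FLAGGED.get((inner, outer), "")
```

The only cell whose outcome differs from the ECT rows of Figure 2 is an ECT(0) inner with an ECT(1) outer, which now yields ECT(1); this is what lets ECT(1) act as a more severe marking than ECT(0).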
4.3. Encapsulation Modes | 4.3. Encapsulation Modes | |||
Section 4.1 introduces two encapsulation modes, normal mode and | Section 4.1 introduces two encapsulation modes, normal mode and | |||
compatibility mode, defining their encapsulation behaviour (i.e. | compatibility mode, defining their encapsulation behaviour (i.e. | |||
header copying or zeroing respectively). Note that these are modes | header copying or zeroing respectively). Note that these are modes | |||
of the ingress tunnel endpoint only, not the tunnel as a whole. | of the ingress tunnel endpoint only, not the tunnel as a whole. | |||
To comply with this specification, a tunnel ingress MUST at least | To comply with this specification, a tunnel ingress MUST at least | |||
implement `normal mode'. Unless it will never be used with legacy | implement `normal mode'. Unless it will never be used with legacy | |||
tunnel egress nodes (RFC2003, RFC2401 or RFC2481 or the limited | tunnel egress nodes (RFC2003, RFC2401 or RFC2481 or the limited | |||
functionality mode of RFC3168), an ingress MUST also implement | functionality mode of RFC3168), an ingress MUST also implement | |||
`compatibility mode' for backward compatibility with tunnel egresses | `compatibility mode' for backward compatibility with tunnel egresses | |||
that do not propagate explicit congestion notifications [RFC4774]. | that do not propagate explicit congestion notifications [RFC4774]. | |||
We can categorise the way that an ingress tunnel endpoint is paired | We can categorise the way that an ingress tunnel endpoint is paired | |||
with an egress as either: | with an egress as either static or dynamically discovered: | |||
static: those paired together by prior configuration or; | ||||
dynamically discovered: those paired together by some form of tunnel | Static: Tunnel endpoints paired together by prior configuration. | |||
endpoint discovery, typically finding an egress on the path taken | ||||
by the first packet. | ||||
Static: Some implementations of encapsulator might always be | Some implementations of encapsulator might always be statically | |||
statically deployed, and constrained to never be paired with a | deployed, and constrained to never be paired with a legacy | |||
legacy decapsulator (RFC2003, RFC2401 or RFC2481 or the limited | decapsulator (RFC2003, RFC2401 or RFC2481 or the limited | |||
functionality mode of RFC3168). In such a case, only normal mode | functionality mode of RFC3168). In such a case, only normal mode | |||
needs to be implemented. | needs to be implemented. | |||
For instance, RFC4301-compatible IPsec tunnel endpoints invariably | For instance, RFC4301-compatible IPsec tunnel endpoints invariably | |||
use IKEv2 [RFC4306] for key exchange, which was introduced | use IKEv2 [RFC4306] for key exchange, which was introduced | |||
alongside RFC4301. Therefore both endpoints of an RFC4301 tunnel | alongside RFC4301. Therefore both endpoints of an RFC4301 tunnel | |||
can be sure that the other end is RFC4301-compatible, because the | can be sure that the other end is RFC4301-compatible, because the | |||
tunnel is only formed after IKEv2 key management has completed, at | tunnel is only formed after IKEv2 key management has completed, at | |||
which point both ends will be RFC4301-compliant by definition. | which point both ends will be RFC4301-compliant by definition. | |||
Therefore an IPsec tunnel ingress does not need compatibility | Therefore an IPsec tunnel ingress does not need compatibility | |||
mode, as it will never interact with legacy ECN tunnels. To | mode, as it will never interact with legacy ECN tunnels. To | |||
comply with the present specification, it only needs to implement | comply with the present specification, it only needs to implement | |||
the required normal mode, which is identical to the pre-existing | the required normal mode, which is identical to the pre-existing | |||
RFC4301 behaviour. | RFC4301 behaviour. | |||
Dynamic Discovery: This specification does not require or recommend | Dynamic Discovery: Tunnel endpoints paired together by some form of | |||
dynamic discovery and it does not define how dynamic negotiation | tunnel endpoint discovery, typically finding an egress on the path | |||
might be done, but it recognises that proprietary tunnel endpoint | taken by the first packet. | |||
discovery protocols exist. It therefore sets down some | ||||
constraints on discovery protocols to ensure safe interworking. | This specification does not require or recommend dynamic discovery | |||
and it does not define how dynamic negotiation might be done, but | ||||
it recognises that proprietary tunnel endpoint discovery protocols | ||||
exist. It therefore sets down some constraints on discovery | ||||
protocols to ensure safe interworking. | ||||
If dynamic tunnel endpoint discovery might pair an ingress with a | If dynamic tunnel endpoint discovery might pair an ingress with a | |||
legacy egress (RFC2003, RFC2401 or RFC2481 or the limited | legacy egress (RFC2003, RFC2401 or RFC2481 or the limited | |||
functionality mode of RFC3168), the ingress MUST implement both | functionality mode of RFC3168), the ingress MUST implement both | |||
normal and compatibility mode. If the tunnel discovery process is | normal and compatibility mode. If the tunnel discovery process is | |||
arranged to only ever find a tunnel egress that propagates ECN | arranged to only ever find a tunnel egress that propagates ECN | |||
(RFC3168 full functionality mode, RFC4301 or this present | (RFC3168 full functionality mode, RFC4301 or this present | |||
specification), then a tunnel ingress can be compliant with the | specification), then a tunnel ingress can be compliant with the | |||
present specification without implementing compatibility mode. | present specification without implementing compatibility mode. | |||
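
The rule above on which modes a dynamically discovered ingress has to implement can be sketched as a small decision function. This is only an illustration: the names EgressKind, LEGACY and required_ingress_modes are hypothetical and appear in neither draft revision; the only facts taken from the text are which egress behaviours count as legacy and that normal mode is always required.

```python
from enum import Enum


class EgressKind(Enum):
    # Legacy egresses named in the text (they do not propagate ECN).
    RFC2003 = "RFC2003"
    RFC2401 = "RFC2401"
    RFC2481 = "RFC2481"
    RFC3168_LIMITED = "RFC3168 limited functionality mode"
    # Egresses that propagate ECN.
    RFC3168_FULL = "RFC3168 full functionality mode"
    RFC4301 = "RFC4301"
    THIS_SPEC = "present specification"


# Hypothetical grouping of the legacy behaviours listed in the text.
LEGACY = {
    EgressKind.RFC2003,
    EgressKind.RFC2401,
    EgressKind.RFC2481,
    EgressKind.RFC3168_LIMITED,
}


def required_ingress_modes(possible_egresses):
    """Return the encapsulation modes an ingress must implement.

    possible_egresses: every egress behaviour that tunnel endpoint
    discovery might pair this ingress with.
    """
    modes = {"normal"}  # normal mode is always required
    if any(egress in LEGACY for egress in possible_egresses):
        # Discovery might find a legacy egress: both modes are needed.
        modes.add("compatibility")
    return modes
```

For example, an ingress whose discovery process can only ever find RFC4301 or present-specification egresses would need normal mode alone, whereas adding RFC2003 to the set would make compatibility mode necessary as well.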
skipping to change at page 19, line 42 | skipping to change at page 19, line 50 | |||
Through the discovery protocol, a tunnel ingress compliant with the | Through the discovery protocol, a tunnel ingress compliant with the | |||
present specification might ask if the egress is compliant with the | present specification might ask if the egress is compliant with the | |||
present specification, with RFC4301 or with RFC3168 full | present specification, with RFC4301 or with RFC3168 full | |||
functionality mode. Or an RFC3168 tunnel ingress might try to | functionality mode. Or an RFC3168 tunnel ingress might try to | |||
negotiate to use limited functionality or full functionality mode | negotiate to use limited functionality or full functionality mode | |||
[RFC3168]. In all these cases, a decapsulating tunnel egress | [RFC3168]. In all these cases, a decapsulating tunnel egress | |||
compliant with this specification MUST agree to any of these | compliant with this specification MUST agree to any of these | |||
requests, since it will behave identically in all these cases. | requests, since it will behave identically in all these cases. | |||
If no ECN-related mode is requested, a compliant tunnel egress MUST | If no ECN-related mode is requested, a compliant tunnel egress MUST | |||
continue without raising any error or warning as its egress behaviour | continue without raising any error or warning, because its egress | |||
is compatible with all the legacy ingress behaviours that do not | behaviour is compatible with all the legacy ingress behaviours that | |||
negotiate capabilities. | do not negotiate capabilities. | |||
A compliant tunnel egress SHOULD raise a warning alarm about any | A compliant tunnel egress SHOULD raise a warning alarm about any | |||
requests to enter modes it does not recognise but, for 'forward | requests to enter modes it does not recognise but, for 'forward | |||
compatibility' with standards actions possibly defined after it was | compatibility' with standards actions possibly defined after it was | |||
implemented, it SHOULD continue operating. | implemented, it SHOULD continue operating. | |||
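
The three egress requirements above (agree to any recognised request, continue silently when nothing is requested, warn but keep operating on an unrecognised request) might be sketched as follows. The mode identifiers and the function name handle_mode_request are assumptions made for illustration; the text deliberately does not define a negotiation protocol, so no real discovery protocol is being modelled here.

```python
import logging

log = logging.getLogger("ecn_tunnel_egress")

# Hypothetical identifiers for the requests mentioned in the text. A
# compliant egress decapsulates identically whichever one is requested.
KNOWN_MODES = {
    "this-spec",
    "rfc4301",
    "rfc3168-full",
    "rfc3168-limited",
}


def handle_mode_request(requested_mode):
    """Return (action, warned) for an ECN mode request from an ingress.

    requested_mode is None when the ingress requests no ECN-related mode.
    action is "agree" or "continue"; warned says whether a warning was raised.
    """
    if requested_mode is None:
        # MUST continue without raising any error or warning.
        return ("continue", False)
    if requested_mode in KNOWN_MODES:
        # MUST agree: egress behaviour is the same in all these cases.
        return ("agree", False)
    # SHOULD raise a warning, but SHOULD continue operating for
    # forward compatibility with modes defined later.
    log.warning("unrecognised ECN tunnel mode requested: %r", requested_mode)
    return ("continue", True)
```

Note that in every branch the egress keeps operating; the branches differ only in whether a request is acknowledged and whether a warning is logged.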
5. Updates to Earlier RFCs | 5. Updates to Earlier RFCs | |||
5.1. Changes to RFC4301 ECN processing | 5.1. Changes to RFC4301 ECN processing | |||
skipping to change at page 20, line 38 | skipping to change at page 20, line 44 | |||
dropped rather than forwarded as Not-ECT; | dropped rather than forwarded as Not-ECT; | |||
* Certain combinations of inner and outer ECN field have been | * Certain combinations of inner and outer ECN field have been | |||
identified as currently unused. These can trigger logging | identified as currently unused. These can trigger logging | |||
and/or raise alarms. | and/or raise alarms. | |||
Modes: RFC4301 tunnel endpoints do not need modes and are not | Modes: RFC4301 tunnel endpoints do not need modes and are not | |||
updated by the modes in the present specification. Effectively an | updated by the modes in the present specification. Effectively an | |||
RFC4301 IPsec ingress solely uses the REQUIRED normal mode of | RFC4301 IPsec ingress solely uses the REQUIRED normal mode of | |||
encapsulation, which is unchanged from RFC4301 encapsulation. It | encapsulation, which is unchanged from RFC4301 encapsulation. It | |||
will never need the OPTIONAL compatibility mode as explained in | will never [Note_Manual_Keying] need the OPTIONAL compatibility | |||
Section 4.3 (except in one corner-case described below). | mode as explained in Section 4.3. | |||
{ToDo: Question to Security Directorate: Although this corner-case | ||||
theoretically exists, it would be preferable to delete any mention | ||||
of it for simplicity & clarity. Agree?} | ||||
One corner case can exist where an RFC4301 ingress does not use | ||||
IKEv2, but uses manual keying instead. Then an RFC4301 ingress | ||||
could conceivably be configured to tunnel to an egress with | ||||
limited functionality ECN handling. Strictly, for this corner- | ||||
case, the requirement to use compatibility mode in this | ||||
specification updates RFC4301. However, this is such a remote | ||||
possibility that RFC4301 IPsec implementations are NOT REQUIRED to | ||||
implement compatibility mode. | ||||
5.2. Changes to RFC3168 ECN processing | 5.2. Changes to RFC3168 ECN processing | |||
Ingress: On encapsulation, the new rule in Figure 3 that a normal | Ingress: On encapsulation, the new rule in Figure 3 that a normal | |||
mode tunnel ingress copies any ECN field into the outer header | mode tunnel ingress copies any ECN field into the outer header | |||
updates the full functionality behaviour of an RFC3168 ingress. | updates the full functionality behaviour of an RFC3168 ingress. | |||
Nonetheless, the new compatibility mode encapsulates packets | Nonetheless, the new compatibility mode encapsulates packets | |||
identically to the limited functionality mode of an RFC3168 | identically to the limited functionality mode of an RFC3168 | |||
ingress. | ingress. | |||
Egress: An RFC3168 egress will need to be updated to the new | Egress: An RFC3168 egress will need to be updated to the new | |||
decapsulation behaviour in Figure 4, in order to comply with the | decapsulation behaviour in Figure 4, in order to comply with the | |||
present specification. However, the changes are backward | present specification. However, the changes are backward | |||
skipping to change at page 22, line 32 | skipping to change at page 22, line 32 | |||
compatibility with legacy decapsulators that do not propagate ECN | compatibility with legacy decapsulators that do not propagate ECN | |||
correctly. | correctly. | |||
The trigger that motivated this update to RFC3168 encapsulation was a | The trigger that motivated this update to RFC3168 encapsulation was a | |||
standards track proposal for pre-congestion notification (PCN | standards track proposal for pre-congestion notification (PCN | |||
[RFC5670]). PCN excess rate marking only works correctly if the ECN | [RFC5670]). PCN excess rate marking only works correctly if the ECN | |||
field is copied on encapsulation (as in RFC4301 and RFC5129); it does | field is copied on encapsulation (as in RFC4301 and RFC5129); it does | |||
not work if ECN is reset (as in RFC3168). This is because PCN excess | not work if ECN is reset (as in RFC3168). This is because PCN excess | |||
rate marking depends on the outer header revealing any congestion | rate marking depends on the outer header revealing any congestion | |||
experienced so far on the whole path, not just since the last tunnel | experienced so far on the whole path, not just since the last tunnel | |||
ingress (see Appendix E for a full explanation). | ingress [Note_PCN_ingress]. | |||
PCN allows a network operator to add flow admission and termination | PCN allows a network operator to add flow admission and termination | |||
for inelastic traffic at the edges of a Diffserv domain, but without | for inelastic traffic at the edges of a Diffserv domain, but without | |||
any per-flow mechanisms in the interior and without the generous | any per-flow mechanisms in the interior and without the generous | |||
provisioning typical of Diffserv, aiming to significantly reduce | provisioning typical of Diffserv, aiming to significantly reduce | |||
costs. The PCN architecture [RFC5559] states that RFC3168 IP in IP | costs. The PCN architecture [RFC5559] states that RFC3168 IP in IP | |||
tunnelling of the ECN field cannot be used for any tunnel ingress in | tunnelling of the ECN field cannot be used for any tunnel ingress in | |||
a PCN domain. Prior to the present specification, this left a stark | a PCN domain. Prior to the present specification, this left a stark | |||
choice between not being able to use PCN for inelastic traffic | choice between not being able to use PCN for inelastic traffic | |||
control or not being able to use the many tunnels already deployed | control or not being able to use the many tunnels already deployed | |||
skipping to change at page 24, line 20 | skipping to change at page 24, line 20 | |||
As well as being useful for general future-proofing, this problem | As well as being useful for general future-proofing, this problem | |||
is immediately pressing for standardisation of pre-congestion | is immediately pressing for standardisation of pre-congestion | |||
notification (PCN), which uses two severity levels of congestion. | notification (PCN), which uses two severity levels of congestion. | |||
If a congested queue used ECT(1) in the outer header to signal | If a congested queue used ECT(1) in the outer header to signal | |||
more severe congestion than ECT(0), the pre-existing | more severe congestion than ECT(0), the pre-existing | |||
decapsulation rules would have thrown away this congestion | decapsulation rules would have thrown away this congestion | |||
signal, preventing tunnelled traffic from ever knowing that it | signal, preventing tunnelled traffic from ever knowing that it | |||
should reduce its load. | should reduce its load. | |||
The PCN working group has had to consider a number of wasteful or | The PCN working group has had to consider a number of wasteful or | |||
convoluted work-rounds to this problem (see Appendix D). But by | convoluted work-rounds to this problem [Note_PCN_egress]. But by | |||
far the simplest approach is just to remove the covert channel | far the simplest approach is just to remove the covert channel | |||
blockages from tunnelling behaviour--now deemed unnecessary | blockages from tunnelling behaviour--now deemed unnecessary | |||
anyway. Then network operators that want to support two | anyway. Then network operators that want to support two | |||
congestion severity-levels for PCN can specify that every tunnel | congestion severity-levels for PCN can specify that every tunnel | |||
egress in a PCN region must comply with this latest | egress in a PCN region must comply with this latest | |||
specification. | specification. | |||
Not only does this make two congestion severity-levels available | Not only does this make two congestion severity-levels available | |||
for PCN standardisation, but also for other potential uses of the | for PCN standardisation, but also for other potential uses of the | |||
extra ECN codepoint (e.g. [VCP]). | extra ECN codepoint (e.g. [VCP]). | |||
skipping to change at page 28, line 33 | skipping to change at page 28, line 33 | |||
Then the code module doing encapsulation can keep to the | Then the code module doing encapsulation can keep to the | |||
copying rule and the load regulator module can reset | copying rule and the load regulator module can reset | |||
congestion, without any code in either module being | congestion, without any code in either module being | |||
conditional on whether the other is there. | conditional on whether the other is there. | |||
On decapsulation in any new scheme: | On decapsulation in any new scheme: | |||
1. If the arriving inner header is Not-ECT it implies the | 1. If the arriving inner header is Not-ECT it implies the | |||
transport will not understand other ECN codepoints. If the | transport will not understand other ECN codepoints. If the | |||
outer header carries an explicit congestion marking, the | outer header carries an explicit congestion marking, the | |||
alternate scheme will probably need to drop the packet--the | alternate scheme would be expected to drop the packet--the | |||
only indication of congestion the transport will understand. | only indication of congestion the transport will understand. | |||
If the outer carries any other ECN codepoint that does not | If the alternate scheme recommends forwarding rather than | |||
indicate congestion, the alternate scheme can forward the | dropping such a packet, it must clearly justify this decision. | |||
packet, but probably only as Not-ECT. | If the inner is Not-ECT and the outer carries any other ECN | |||
codepoint that does not indicate congestion, the alternate | ||||
scheme can forward the packet, but probably only as Not-ECT. | ||||
2. If the arriving inner header is other than Not-ECT, the ECN | 2. If the arriving inner header is other than Not-ECT, the ECN | |||
field that the alternate decapsulation scheme forwards should | field that the alternate decapsulation scheme forwards should | |||
reflect the more severe congestion marking of the arriving | reflect the more severe congestion marking of the arriving | |||
inner and outer headers. | inner and outer headers. | |||
3. Any alternate scheme MUST define a behaviour for all | 3. Any alternate scheme must define a behaviour for all | |||
combinations of inner and outer headers, even those that would | combinations of inner and outer headers, even those that would | |||
not be expected to result from standards known at the time and | not be expected to result from standards known at the time and | |||
even those that would not be expected from the tunnel ingress | even those that would not be expected from the tunnel ingress | |||
paired with the egress at run-time. Consideration should be | paired with the egress at run-time. Consideration should be | |||
given to logging such unexpected combinations and raising an | given to logging such unexpected combinations and raising an | |||
alarm, particularly if there is a danger that the invalid | alarm, particularly if there is a danger that the invalid | |||
combination implies congestion signals are not being | combination implies congestion signals are not being | |||
propagated correctly. The presence of currently unused | propagated correctly. The presence of currently unused | |||
combinations may represent an attack, but the new scheme | combinations may represent an attack, but the new scheme | |||
should try to define a way to forward such packets, at least | should try to define a way to forward such packets, at least | |||
skipping to change at page 34, line 15 | skipping to change at page 34, line 15 | |||
Information", RFC 5696, | Information", RFC 5696, | |||
November 2009. | November 2009. | |||
[VCP] Xia, Y., Subramanian, L., Stoica, | [VCP] Xia, Y., Subramanian, L., Stoica, | |||
I., and S. Kalyanaraman, "One more | I., and S. Kalyanaraman, "One more | |||
bit is enough", Proc. SIGCOMM'05, | bit is enough", Proc. SIGCOMM'05, | |||
ACM CCR 35(4)37--48, 2005, <http:// | ACM CCR 35(4)37--48, 2005, <http:// | |||
doi.acm.org/10.1145/ | doi.acm.org/10.1145/ | |||
1080091.1080098>. | 1080091.1080098>. | |||
Editorial Comments | ||||
[Note_Manual_Keying] Bob Briscoe: Note (To be removed by the RFC | ||||
Editor): One corner case can exist where an | ||||
RFC4301 ingress does not use IKEv2, but uses | ||||
manual keying instead. Then an RFC4301 ingress | ||||
could conceivably be configured to tunnel to an | ||||
egress with limited functionality ECN handling. | ||||
Strictly, for this corner-case, the requirement | ||||
to use compatibility mode in this specification | ||||
updates RFC4301. However, this is such a remote | ||||
possibility that RFC4301 IPsec implementations | ||||
are not required to implement compatibility | ||||
mode. It is planned to remove this note after | ||||
the review process has completed to avoid | ||||
unnecessarily complicating the document with a | ||||
largely theoretical corner case. | ||||
[Note_PCN_egress] Bob Briscoe: During the review process Appendix | ||||
D is provided to expand on this point, but it | ||||
will be deleted before publication. | ||||
[Note_PCN_ingress] Bob Briscoe: During the review process Appendix | ||||
E is provided to expand on this point, but it | ||||
will be deleted before publication. | ||||
Appendix A. Early ECN Tunnelling RFCs | Appendix A. Early ECN Tunnelling RFCs | |||
IP in IP tunnelling was originally defined in [RFC2003]. On | IP in IP tunnelling was originally defined in [RFC2003]. On | |||
encapsulation, the incoming header was copied to the outer and on | encapsulation, the incoming header was copied to the outer and on | |||
decapsulation the outer was simply discarded. Initially, IPsec | decapsulation the outer was simply discarded. Initially, IPsec | |||
tunnelling [RFC2401] followed the same behaviour. | tunnelling [RFC2401] followed the same behaviour. | |||
When ECN was introduced experimentally in [RFC2481], legacy (RFC2003 | When ECN was introduced experimentally in [RFC2481], legacy (RFC2003 | |||
or RFC2401) tunnels would have discarded any congestion markings | or RFC2401) tunnels would have discarded any congestion markings | |||
added to the outer header, so RFC2481 introduced rules for | added to the outer header, so RFC2481 introduced rules for | |||
skipping to change at page 35, line 25 | skipping to change at page 35, line 47 | |||
Information security can be assured by using various end to end | Information security can be assured by using various end to end | |||
security solutions (including IPsec in transport mode [RFC4301]), but | security solutions (including IPsec in transport mode [RFC4301]), but | |||
a commonly used scenario involves the need to communicate between two | a commonly used scenario involves the need to communicate between two | |||
physically protected domains across the public Internet. In this | physically protected domains across the public Internet. In this | |||
case there are certain management advantages to using IPsec in tunnel | case there are certain management advantages to using IPsec in tunnel | |||
mode solely across the publicly accessible part of the path. The | mode solely across the publicly accessible part of the path. The | |||
path followed by a packet then crosses security 'domains'; the ones | path followed by a packet then crosses security 'domains'; the ones | |||
protected by physical or other means before and after the tunnel and | protected by physical or other means before and after the tunnel and | |||
the one protected by an IPsec tunnel across the otherwise unprotected | the one protected by an IPsec tunnel across the otherwise unprotected | |||
domain. We will use the scenario in Figure 5 where endpoints 'A' and | domain. The scenario in Figure 5 will be used where endpoints 'A' | |||
'B' communicate through a tunnel. The tunnel ingress 'I' and egress | and 'B' communicate through a tunnel. The tunnel ingress 'I' and | |||
'E' are within physically protected edge domains, while the tunnel | egress 'E' are within physically protected edge domains, while the | |||
spans an unprotected internetwork where there may be 'men in the | tunnel spans an unprotected internetwork where there may be 'men in | |||
middle', M. | the middle', M. | |||
physically unprotected physically | physically unprotected physically | |||
<-protected domain-><--domain--><-protected domain-> | <-protected domain-><--domain--><-protected domain-> | |||
+------------------+ +------------------+ | +------------------+ +------------------+ | |||
| | M | | | | | M | | | |||
| A-------->I=========>==========>E-------->B | | | A-------->I=========>==========>E-------->B | | |||
| | | | | | | | | | |||
+------------------+ +------------------+ | +------------------+ +------------------+ | |||
<----IPsec secured----> | <----IPsec secured----> | |||
tunnel | tunnel | |||
skipping to change at page 36, line 22 | skipping to change at page 36, line 45 | |||
from a congested resource towards downstream nodes. Typically a | from a congested resource towards downstream nodes. Typically a | |||
downstream transport might feed the information back somehow to the | downstream transport might feed the information back somehow to the | |||
point upstream of the congestion that can regulate the load on the | point upstream of the congestion that can regulate the load on the | |||
congested resource, but other actions are possible (see [RFC3168] | congested resource, but other actions are possible (see [RFC3168] | |||
S.6). In terms of the above unicast scenario, ECN effectively | S.6). In terms of the above unicast scenario, ECN effectively | |||
intends to create an information channel (for congestion signalling) | intends to create an information channel (for congestion signalling) | |||
from 'M' to 'B' (for 'B' to feed back to 'A'). Therefore the goals | from 'M' to 'B' (for 'B' to feed back to 'A'). Therefore the goals | |||
of IPsec and ECN are mutually incompatible, requiring some | of IPsec and ECN are mutually incompatible, requiring some | |||
compromise. | compromise. | |||
With respect to the DS or ECN fields, S.5.1.2 of RFC4301 says, | With respect to using the DS or ECN fields as covert channels, | |||
"controls are provided to manage the bandwidth of this [covert] | S.5.1.2 of RFC4301 says, "controls are provided to manage the | |||
channel". Using the ECN processing rules of RFC4301, the channel | bandwidth of this channel". Using the ECN processing rules of | |||
bandwidth is two bits per datagram from 'A' to 'M' and one bit per | RFC4301, the channel bandwidth is two bits per datagram from 'A' to | |||
datagram from 'M' to 'A' (because 'E' limits the combinations of the | 'M' and one bit per datagram from 'M' to 'A' (because 'E' limits the | |||
2-bit ECN field that it will copy). In both cases the covert channel | combinations of the 2-bit ECN field that it will copy). In both | |||
bandwidth is further reduced by noise from any real congestion | cases the covert channel bandwidth is further reduced by noise from | |||
marking. RFC4301 implies that these covert channels are sufficiently | any real congestion marking. RFC4301 implies that these covert | |||
limited to be considered a manageable threat. However, with respect | channels are sufficiently limited to be considered a manageable | |||
to the larger (6b) DS field, the same section of RFC4301 says not | threat. However, with respect to the larger (6b) DS field, the same | |||
copying is the default, but a configuration option can allow copying | section of RFC4301 says not copying is the default, but a | |||
"to allow a local administrator to decide whether the covert channel | configuration option can allow copying "to allow a local | |||
provided by copying these bits outweighs the benefits of copying". | administrator to decide whether the covert channel provided by | |||
Of course, an administrator considering copying of the DS field has | copying these bits outweighs the benefits of copying". Of course, an | |||
to take into account that it could be concatenated with the ECN field | administrator considering copying of the DS field has to take into | |||
giving an 8b per datagram covert channel. | account that it could be concatenated with the ECN field giving an 8b | |||
per datagram covert channel. | ||||
For tunnelling the 6b Diffserv field two conceptual models have had | For tunnelling the 6b Diffserv field two conceptual models have had | |||
to be defined so that administrators can trade off security against | to be defined so that administrators can trade off security against | |||
the needs of traffic conditioning [RFC2983]: | the needs of traffic conditioning [RFC2983]: | |||
The uniform model: where the Diffserv field is preserved end-to-end | The uniform model: where the Diffserv field is preserved end-to-end | |||
by copying into the outer header on encapsulation and copying from | by copying into the outer header on encapsulation and copying from | |||
the outer header on decapsulation. | the outer header on decapsulation. | |||
The pipe model: where the outer header is independent of that in the | The pipe model: where the outer header is independent of that in the | |||
skipping to change at page 37, line 15 | skipping to change at page 37, line 38 | |||
It deemed that simplicity was more important than allowing | It deemed that simplicity was more important than allowing | |||
administrators the option of a tiny increment in security, especially | administrators the option of a tiny increment in security, especially | |||
given not copying congestion indications could seriously harm | given not copying congestion indications could seriously harm | |||
everyone's network service. | everyone's network service. | |||
B.2. Control Constraints | B.2. Control Constraints | |||
Congestion control requires that any congestion notification marked | Congestion control requires that any congestion notification marked | |||
into packets by a resource will be able to traverse a feedback loop | into packets by a resource will be able to traverse a feedback loop | |||
back to a function capable of controlling the load on that resource. | back to a function capable of controlling the load on that resource. | |||
To be precise, rather than calling this function the data source, we | To be precise, rather than calling this function the data source, it | |||
will call it the Load Regulator. This will allow us to deal with | will be called the Load Regulator. This allows for exceptional cases | |||
exceptional cases where load is not regulated by the data source, but | where load is not regulated by the data source, but usually the two | |||
usually the two terms will be synonymous. Note the term "a function | terms will be synonymous. Note the term "a function _capable of_ | |||
_capable of_ controlling the load" deliberately includes a source | controlling the load" deliberately includes a source application that | |||
application that doesn't actually control the load but ought to (e.g. | doesn't actually control the load but ought to (e.g. an application | |||
an application without congestion control that uses UDP). | without congestion control that uses UDP). | |||
A--->R--->I=========>M=========>E-------->B | A--->R--->I=========>M=========>E-------->B | |||
Figure 6: Simple Tunnel Scenario | Figure 6: Simple Tunnel Scenario | |||
We now consider a similar tunnelling scenario to the IPsec one just | A similar tunnelling scenario to the IPsec one just described will | |||
described, but without the different security domains so we can just | now be considered, but without the different security domains, | |||
focus on ensuring the control loop and management monitoring can work | because the focus now shifts to whether the control loop and | |||
(Figure 6). If we want resources in the tunnel to be able to | management monitoring work (Figure 6). If resources in the tunnel | |||
explicitly notify congestion and the feedback path is from 'B' to | are to be able to explicitly notify congestion and the feedback path | |||
'A', it will certainly be necessary for 'E' to copy any CE marking | is from 'B' to 'A', it will certainly be necessary for 'E' to copy | |||
from the outer header to the inner header for onward transmission to | any CE marking from the outer header to the inner header for onward | |||
'B', otherwise congestion notification from resources like 'M' cannot | transmission to 'B', otherwise congestion notification from resources | |||
be fed back to the Load Regulator ('A'). But it does not seem | like 'M' cannot be fed back to the Load Regulator ('A'). But it does | |||
necessary for 'I' to copy CE markings from the inner to the outer | not seem necessary for 'I' to copy CE markings from the inner to the | |||
header. For instance, if resource 'R' is congested, it can send | outer header. For instance, if resource 'R' is congested, it can | |||
congestion information to 'B' using the congestion field in the inner | send congestion information to 'B' using the congestion field in the | |||
header without 'I' copying the congestion field into the outer header | inner header without 'I' copying the congestion field into the outer | |||
and 'E' copying it back to the inner header. 'E' can still write any | header and 'E' copying it back to the inner header. 'E' can still | |||
additional congestion marking introduced across the tunnel into the | write any additional congestion marking introduced across the tunnel | |||
congestion field of the inner header. | into the congestion field of the inner header. | |||
All this shows that 'E' can preserve the control loop irrespective of | All this shows that 'E' can preserve the control loop irrespective of | |||
whether 'I' copies congestion notification into the outer header or | whether 'I' copies congestion notification into the outer header or | |||
resets it. | resets it. | |||
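
The argument above can be checked with a deliberately simplified model of Figure 6. The sketch assumes an ECN-capable packet and reduces each header's ECN field to a single boolean meaning "CE marked"; the function names are hypothetical and the model ignores every other codepoint.

```python
def ingress(inner_ce, copy):
    """Encapsulate at 'I': the outer CE is copied from the inner, or reset."""
    outer_ce = inner_ce if copy else False
    return inner_ce, outer_ce


def egress(inner_ce, outer_ce):
    """Decapsulate at 'E': write any CE marking on the outer into the inner."""
    return inner_ce or outer_ce


def deliver(r_congested, m_congested, copy):
    """Send one packet A->R->I->M->E->B and return whether 'B' sees CE."""
    inner_ce = r_congested                    # 'R' marks before the tunnel
    inner_ce, outer_ce = ingress(inner_ce, copy)
    outer_ce = outer_ce or m_congested        # 'M' marks only the outer header
    return egress(inner_ce, outer_ce)


# Enumerate every combination of congestion at 'R' and 'M'.
for r in (False, True):
    for m in (False, True):
        copied = deliver(r, m, copy=True)
        reset = deliver(r, m, copy=False)
        print(f"R={r!s:5} M={m!s:5} copy->{copied!s:5} reset->{reset!s:5}")
```

In this model 'B' receives a CE marking exactly when 'R' or 'M' was congested, whether 'I' copies or resets, because the inner header carries the marking from 'R' through the tunnel untouched.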
That is the situation for existing control arrangements but, because | That is the situation for existing control arrangements but, because | |||
copying reveals more information, it would open up possibilities for | copying reveals more information, it would open up possibilities for | |||
better control system designs. For instance, Appendix E describes | better control system designs. For instance, resetting CE marking on | |||
how resetting CE marking on encapsulation breaks a proposed | encapsulation breaks the standards track PCN congestion marking | |||
congestion marking scheme on the standards track. It ends up | scheme [RFC5670]. It ends up removing excessive amounts of traffic | |||
removing excessive amounts of traffic unnecessarily. Whereas copying | unnecessarily. Whereas copying CE markings at ingress leads to the | |||
CE markings at ingress leads to the correct control behaviour. | correct control behaviour. | |||
B.3. Management Constraints | B.3. Management Constraints | |||
As well as control, there are also management constraints. | As well as control, there are also management constraints. | |||
Specifically, a management system may monitor congestion markings in | Specifically, a management system may monitor congestion markings in | |||
passing packets, perhaps at the border between networks as part of a | passing packets, perhaps at the border between networks as part of a | |||
service level agreement. For instance, monitors at the borders of | service level agreement. For instance, monitors at the borders of | |||
autonomous systems may need to measure how much congestion has | autonomous systems may need to measure how much congestion has | |||
accumulated so far along the path, perhaps to determine between them | accumulated so far along the path, perhaps to determine between them | |||
how much of the congestion is contributed by each domain. | how much of the congestion is contributed by each domain. | |||
In this document we define the baseline of congestion marking (or the | In this document the baseline of congestion marking (or the | |||
Congestion Baseline) as the source of the layer that created (or most | Congestion Baseline) is defined as the source of the layer that | |||
recently reset) the congestion notification field. When monitoring | created (or most recently reset) the congestion notification field. | |||
congestion it would be desirable if the Congestion Baseline did not | When monitoring congestion it would be desirable if the Congestion | |||
depend on whether packets were tunnelled or not. Given some tunnels | Baseline did not depend on whether packets were tunnelled or not. | |||
cross domain borders (e.g. consider M in Figure 6 is monitoring a | Given some tunnels cross domain borders (e.g. consider M in Figure 6 | |||
border), it would therefore be desirable for 'I' to copy congestion | is monitoring a border), it would therefore be desirable for 'I' to | |||
accumulated so far into the outer headers, so that it is exposed | copy congestion accumulated so far into the outer headers, so that it | |||
across the tunnel. | is exposed across the tunnel. | |||
For management purposes it might be useful for the tunnel egress to | For management purposes it might be useful for the tunnel egress to | |||
be able to monitor whether congestion occurred across a tunnel or | be able to monitor whether congestion occurred across a tunnel or | |||
upstream of it. Superficially it appears that copying congestion | upstream of it. Superficially it appears that copying congestion | |||
markings at the ingress would make this difficult, whereas it was | markings at the ingress would make this difficult, whereas it was | |||
straightforward when an RFC3168 ingress reset them. However, | straightforward when an RFC3168 ingress reset them. However, | |||
Appendix C gives a simple and precise method for a tunnel egress to | Appendix C gives a simple and precise method for a tunnel egress to | |||
infer the congestion level introduced across a tunnel. It works | infer the congestion level introduced across a tunnel. It works | |||
irrespective of whether the ingress copies or resets congestion | irrespective of whether the ingress copies or resets congestion | |||
markings. | markings. | |||
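Appendix C is not reproduced in this part of the diff, so the following is only a sketch of one way an egress could make such an inference, under stated assumptions. It assumes the egress can see both headers before decapsulation, and that congestion added across the tunnel can be estimated as the fraction of packets with a CE outer header among those whose inner header is ECN-capable but not CE. The function name and the packet representation are hypothetical.

```python
# ECN codepoints as two-bit values (RFC 3168).
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11


def tunnel_congestion_fraction(packets):
    """Estimate the congestion marking fraction introduced across the tunnel.

    packets: iterable of (inner_ecn, outer_ecn) pairs observed at the
    egress before decapsulation (an assumed representation).
    Returns None when no packet is eligible for the estimate.
    """
    eligible = 0
    marked = 0
    for inner, outer in packets:
        # Only packets that entered the tunnel ECN-capable and unmarked
        # can show a marking that was added inside the tunnel.
        if inner in (ECT0, ECT1):
            eligible += 1
            if outer == CE:
                marked += 1
    return marked / eligible if eligible else None


if __name__ == "__main__":
    sample = [(CE, CE)] * 30 + [(ECT0, CE)] * 12 + [(ECT0, ECT0)] * 58
    print(tunnel_congestion_fraction(sample))
```

Because packets that already carried CE in the inner header are excluded from the sample, the estimate does not depend on whether the ingress wrote CE or ECT(0) into their outer headers, which is consistent with the claim that the method works whether the ingress copies or resets.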
End of changes. 37 change blocks. | ||||
139 lines changed or deleted | 177 lines changed or added | |||