| draft-ietf-tsvwg-ecn-tunnel-07.txt | draft-ietf-tsvwg-ecn-tunnel-08.txt | |||
|---|---|---|---|---|
| Transport Area Working Group B. Briscoe | Transport Area Working Group B. Briscoe | |||
| Internet-Draft BT | Internet-Draft BT | |||
| Updates: 3168, 4301 February 11, 2010 | Updates: 3168, 4301 March 03, 2010 | |||
| (if approved) | (if approved) | |||
| Intended status: Standards Track | Intended status: Standards Track | |||
| Expires: August 15, 2010 | Expires: September 4, 2010 | |||
| Tunnelling of Explicit Congestion Notification | Tunnelling of Explicit Congestion Notification | |||
| draft-ietf-tsvwg-ecn-tunnel-07 | draft-ietf-tsvwg-ecn-tunnel-08 | |||
| Abstract | Abstract | |||
| This document redefines how the explicit congestion notification | This document redefines how the explicit congestion notification | |||
| (ECN) field of the IP header should be constructed on entry to and | (ECN) field of the IP header should be constructed on entry to and | |||
| exit from any IP in IP tunnel. On encapsulation it updates RFC3168 | exit from any IP in IP tunnel. On encapsulation it updates RFC3168 | |||
| to bring all IP in IP tunnels (v4 or v6) into line with RFC4301 IPsec | to bring all IP in IP tunnels (v4 or v6) into line with RFC4301 IPsec | |||
| ECN processing. On decapsulation it updates both RFC3168 and RFC4301 | ECN processing. On decapsulation it updates both RFC3168 and RFC4301 | |||
| to add new behaviours for previously unused combinations of inner and | to add new behaviours for previously unused combinations of inner and | |||
| outer header. The new rules ensure the ECN field is correctly | outer header. The new rules ensure the ECN field is correctly | |||
| propagated across a tunnel whether it is used to signal one or two | propagated across a tunnel whether it is used to signal one or two | |||
| severity levels of congestion, whereas before only one severity level | severity levels of congestion, whereas before only one severity level | |||
| was supported. Tunnel endpoints can be updated in any order without | was supported. Tunnel endpoints can be updated in any order without | |||
| affecting pre-existing uses of the ECN field (backward compatible). | affecting pre-existing uses of the ECN field, providing backward | |||
| Nonetheless, operators wanting to support two severity levels (e.g. | compatibility. Nonetheless, operators wanting to support two | |||
| for pre-congestion notification--PCN) can require compliance with | severity levels (e.g. for pre-congestion notification--PCN) can | |||
| this new specification. A thorough analysis of the reasoning for | require compliance with this new specification. A thorough analysis | |||
| these changes and the implications is included. In the unlikely | of the reasoning for these changes and the implications is included. | |||
| event that the new rules do not meet a specific need, RFC4774 gives | In the unlikely event that the new rules do not meet a specific need, | |||
| guidance on designing alternate ECN semantics and this document | RFC4774 gives guidance on designing alternate ECN semantics and this | |||
| extends that to include tunnelling issues. | document extends that to include tunnelling issues. | |||
| Status of This Memo | Status of This Memo | |||
| This Internet-Draft is submitted to IETF in full conformance with the | This Internet-Draft is submitted to IETF in full conformance with the | |||
| provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF), its areas, and its working groups. Note that | Task Force (IETF), its areas, and its working groups. Note that | |||
| other groups may also distribute working documents as Internet- | other groups may also distribute working documents as Internet- | |||
| Drafts. | Drafts. | |||
| skipping to change at page 2, line 9 | skipping to change at page 2, line 9 | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
| http://www.ietf.org/ietf/1id-abstracts.txt. | http://www.ietf.org/ietf/1id-abstracts.txt. | |||
| The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
| http://www.ietf.org/shadow.html. | http://www.ietf.org/shadow.html. | |||
| This Internet-Draft will expire on August 15, 2010. | This Internet-Draft will expire on September 4, 2010. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2010 IETF Trust and the persons identified as the | Copyright (c) 2010 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
| to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
| include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
| the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
| described in the BSD License. | described in the BSD License. | |||
| This document may contain material from IETF Documents or IETF | ||||
| Contributions published or made publicly available before November | ||||
| 10, 2008. The person(s) controlling the copyright in some of this | ||||
| material may not have granted the IETF Trust the right to allow | ||||
| modifications of such material outside the IETF Standards Process. | ||||
| Without obtaining an adequate license from the person(s) controlling | ||||
| the copyright in such materials, this document may not be modified | ||||
| outside the IETF Standards Process, and derivative works of it may | ||||
| not be created outside the IETF Standards Process, except to format | ||||
| it for publication as an RFC or to translate it into languages other | ||||
| than English. | ||||
| Table of Contents | Table of Contents | |||
| 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 9 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 9 | |||
| 1.1. Scope . . . . . . . . . . . . . . . . . . . . . . . . . . 11 | 1.1. Scope . . . . . . . . . . . . . . . . . . . . . . . . . . 11 | |||
| 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 11 | 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 11 | |||
| 3. Summary of Pre-Existing RFCs . . . . . . . . . . . . . . . . . 12 | 3. Summary of Pre-Existing RFCs . . . . . . . . . . . . . . . . . 12 | |||
| 3.1. Encapsulation at Tunnel Ingress . . . . . . . . . . . . . 12 | 3.1. Encapsulation at Tunnel Ingress . . . . . . . . . . . . . 12 | |||
| 3.2. Decapsulation at Tunnel Egress . . . . . . . . . . . . . . 13 | 3.2. Decapsulation at Tunnel Egress . . . . . . . . . . . . . . 13 | |||
| 4. New ECN Tunnelling Rules . . . . . . . . . . . . . . . . . . . 14 | 4. New ECN Tunnelling Rules . . . . . . . . . . . . . . . . . . . 14 | |||
| 4.1. Default Tunnel Ingress Behaviour . . . . . . . . . . . . . 15 | 4.1. Default Tunnel Ingress Behaviour . . . . . . . . . . . . . 15 | |||
| 4.2. Default Tunnel Egress Behaviour . . . . . . . . . . . . . 15 | 4.2. Default Tunnel Egress Behaviour . . . . . . . . . . . . . 15 | |||
| 4.3. Encapsulation Modes . . . . . . . . . . . . . . . . . . . 17 | 4.3. Encapsulation Modes . . . . . . . . . . . . . . . . . . . 17 | |||
| 4.4. Single Mode of Decapsulation . . . . . . . . . . . . . . . 19 | 4.4. Single Mode of Decapsulation . . . . . . . . . . . . . . . 19 | |||
| 5. Updates to Earlier RFCs . . . . . . . . . . . . . . . . . . . 20 | 5. Updates to Earlier RFCs . . . . . . . . . . . . . . . . . . . 20 | |||
| 5.1. Changes to RFC4301 ECN processing . . . . . . . . . . . . 20 | 5.1. Changes to RFC4301 ECN processing . . . . . . . . . . . . 20 | |||
| 5.2. Changes to RFC3168 ECN processing . . . . . . . . . . . . 21 | 5.2. Changes to RFC3168 ECN processing . . . . . . . . . . . . 20 | |||
| 5.3. Motivation for Changes . . . . . . . . . . . . . . . . . . 22 | 5.3. Motivation for Changes . . . . . . . . . . . . . . . . . . 22 | |||
| 5.3.1. Motivation for Changing Encapsulation . . . . . . . . 22 | 5.3.1. Motivation for Changing Encapsulation . . . . . . . . 22 | |||
| 5.3.2. Motivation for Changing Decapsulation . . . . . . . . 23 | 5.3.2. Motivation for Changing Decapsulation . . . . . . . . 23 | |||
| 6. Backward Compatibility . . . . . . . . . . . . . . . . . . . . 25 | 6. Backward Compatibility . . . . . . . . . . . . . . . . . . . . 25 | |||
| 6.1. Non-Issues Updating Decapsulation . . . . . . . . . . . . 25 | 6.1. Non-Issues Updating Decapsulation . . . . . . . . . . . . 25 | |||
| 6.2. Non-Update of RFC4301 IPsec Encapsulation . . . . . . . . 26 | 6.2. Non-Update of RFC4301 IPsec Encapsulation . . . . . . . . 26 | |||
| 6.3. Update to RFC3168 Encapsulation . . . . . . . . . . . . . 26 | 6.3. Update to RFC3168 Encapsulation . . . . . . . . . . . . . 26 | |||
| 7. Design Principles for Alternate ECN Tunnelling Semantics . . . 27 | 7. Design Principles for Alternate ECN Tunnelling Semantics . . . 27 | |||
| 8. Security Considerations . . . . . . . . . . . . . . . . . . . 29 | 8. Security Considerations . . . . . . . . . . . . . . . . . . . 29 | |||
| 9. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 30 | 9. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 30 | |||
| 10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 31 | 10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 31 | |||
| 11. References . . . . . . . . . . . . . . . . . . . . . . . . . . 31 | 11. References . . . . . . . . . . . . . . . . . . . . . . . . . . 31 | |||
| 11.1. Normative References . . . . . . . . . . . . . . . . . . . 31 | 11.1. Normative References . . . . . . . . . . . . . . . . . . . 31 | |||
| 11.2. Informative References . . . . . . . . . . . . . . . . . . 32 | 11.2. Informative References . . . . . . . . . . . . . . . . . . 32 | |||
| Editorial Comments . . . . . . . . . . . . . . . . . . . . . . . . | ||||
| Appendix A. Early ECN Tunnelling RFCs . . . . . . . . . . . . . . 34 | Appendix A. Early ECN Tunnelling RFCs . . . . . . . . . . . . . . 34 | |||
| Appendix B. Design Constraints . . . . . . . . . . . . . . . . . 35 | Appendix B. Design Constraints . . . . . . . . . . . . . . . . . 35 | |||
| B.1. Security Constraints . . . . . . . . . . . . . . . . . . . 35 | B.1. Security Constraints . . . . . . . . . . . . . . . . . . . 35 | |||
| B.2. Control Constraints . . . . . . . . . . . . . . . . . . . 37 | B.2. Control Constraints . . . . . . . . . . . . . . . . . . . 37 | |||
| B.3. Management Constraints . . . . . . . . . . . . . . . . . . 38 | B.3. Management Constraints . . . . . . . . . . . . . . . . . . 38 | |||
| Appendix C. Contribution to Congestion across a Tunnel . . . . . 38 | Appendix C. Contribution to Congestion across a Tunnel . . . . . 39 | |||
| Appendix D. Why Losing ECT(1) on Decapsulation Impedes PCN | Appendix D. Why Losing ECT(1) on Decapsulation Impedes PCN | |||
| (to be removed before publication) . . . . . . . . . 39 | (to be removed before publication) . . . . . . . . . 40 | |||
| Appendix E. Why Resetting ECN on Encapsulation Impedes PCN | Appendix E. Why Resetting ECN on Encapsulation Impedes PCN | |||
| (to be removed before publication) . . . . . . . . . 41 | (to be removed before publication) . . . . . . . . . 41 | |||
| Appendix F. Compromise on Decap with ECT(1) Inner and ECT(0) | Appendix F. Compromise on Decap with ECT(1) Inner and ECT(0) | |||
| Outer . . . . . . . . . . . . . . . . . . . . . . . . 41 | Outer . . . . . . . . . . . . . . . . . . . . . . . . 42 | |||
| Appendix G. Open Issues . . . . . . . . . . . . . . . . . . . . . 42 | Appendix G. Open Issues . . . . . . . . . . . . . . . . . . . . . 43 | |||
| Request to the RFC Editor (to be removed on publication): | Request to the RFC Editor (to be removed on publication): | |||
| In the RFC index, RFC3168 should be identified as an update to | In the RFC index, RFC3168 should be identified as an update to | |||
| RFC2003. RFC4301 should be identified as an update to RFC3168. | RFC2003. RFC4301 should be identified as an update to RFC3168. | |||
| Changes from previous drafts (to be removed by the RFC Editor) | Changes from previous drafts (to be removed by the RFC Editor) | |||
| Full text differences between IETF draft versions are available at | Full text differences between IETF draft versions are available at | |||
| <http://tools.ietf.org/wg/tsvwg/draft-ietf-tsvwg-ecn-tunnel/>, and | <http://tools.ietf.org/wg/tsvwg/draft-ietf-tsvwg-ecn-tunnel/>, and | |||
| skipping to change at page 9, line 32 | skipping to change at page 9, line 32 | |||
| regulation (changed title from "In-path Load Regulation" to | regulation (changed title from "In-path Load Regulation" to | |||
| "Non-Dependence of Tunnelling on In-path Load Regulation"), but | "Non-Dependence of Tunnelling on In-path Load Regulation"), but | |||
| explained how an in-path load regulation function must be | explained how an in-path load regulation function must be | |||
| carefully placed with respect to tunnel encapsulation (in a new | carefully placed with respect to tunnel encapsulation (in a new | |||
| sub-section entitled "Dependence of In-Path Load Regulation on | sub-section entitled "Dependence of In-Path Load Regulation on | |||
| Tunnelling"). | Tunnelling"). | |||
| 1. Introduction | 1. Introduction | |||
| Explicit congestion notification (ECN [RFC3168]) allows a forwarding | Explicit congestion notification (ECN [RFC3168]) allows a forwarding | |||
| element to notify the onset of congestion without having to drop | element (e.g. a router) to notify the onset of congestion without | |||
| packets. Instead it can explicitly mark a proportion of packets in | having to drop packets. Instead it can explicitly mark a proportion | |||
| the 2-bit ECN field in the IP header (Table 1 recaps the ECN | of packets in the 2-bit ECN field in the IP header (Table 1 recaps | |||
| codepoints). | the ECN codepoints). | |||
| The outer header of an IP packet can encapsulate one or more IP | The outer header of an IP packet can encapsulate one or more IP | |||
| headers for tunnelling. A forwarding element using ECN to signify | headers for tunnelling. A forwarding element using ECN to signify | |||
| congestion will only mark the immediately visible outer IP header. | congestion will only mark the immediately visible outer IP header. | |||
| When a tunnel decapsulator later removes this outer header, it | When a tunnel decapsulator later removes this outer header, it | |||
| follows rules to propagate congestion markings by combining the ECN | follows rules to propagate congestion markings by combining the ECN | |||
| fields of the inner and outer IP header into one outgoing IP header. | fields of the inner and outer IP header into one outgoing IP header. | |||
| This document updates those rules for IPsec [RFC4301] and non-IPsec | This document updates those rules for IPsec [RFC4301] and non-IPsec | |||
| [RFC3168] tunnels to add new behaviours for previously unused | [RFC3168] tunnels to add new behaviours for previously unused | |||
| skipping to change at page 14, line 16 | skipping to change at page 14, line 16 | |||
| |Incoming | Incoming Outer Header | | |Incoming | Incoming Outer Header | | |||
| | Inner +---------+------------+------------+------------+ | | Inner +---------+------------+------------+------------+ | |||
| | Header | Not-ECT | ECT(0) | ECT(1) | CE | | | Header | Not-ECT | ECT(0) | ECT(1) | CE | | |||
| +---------+---------+------------+------------+------------+ | +---------+---------+------------+------------+------------+ | |||
| RFC3168->| Not-ECT | Not-ECT |Not-ECT |Not-ECT | drop | | RFC3168->| Not-ECT | Not-ECT |Not-ECT |Not-ECT | drop | | |||
| RFC4301->| Not-ECT | Not-ECT |Not-ECT |Not-ECT |Not-ECT | | RFC4301->| Not-ECT | Not-ECT |Not-ECT |Not-ECT |Not-ECT | | |||
| | ECT(0) | ECT(0) | ECT(0) | ECT(0) | CE | | | ECT(0) | ECT(0) | ECT(0) | ECT(0) | CE | | |||
| | ECT(1) | ECT(1) | ECT(1) | ECT(1) | CE | | | ECT(1) | ECT(1) | ECT(1) | ECT(1) | CE | | |||
| | CE | CE | CE | CE | CE | | | CE | CE | CE | CE | CE | | |||
| +---------+---------+------------+------------+------------+ | +---------+---------+------------+------------+------------+ | |||
| | Outgoing Header | | ||||
| +------------------------------------------------+ | In pre-existing RFCs, the ECN field in the outgoing header was set to | |||
| the codepoint at the intersection of the appropriate incoming inner | ||||
| header (row) and incoming outer header (column). | ||||
| Figure 2: IP in IP Decapsulation; Recap of Pre-existing Behaviour | Figure 2: IP in IP Decapsulation; Recap of Pre-existing Behaviour | |||
| The behaviour in the table derives from the logic given in RFC3168 | The behaviour in the table derives from the logic given in RFC3168 | |||
| and RFC4301, briefly recapped as follows: | and RFC4301, briefly recapped as follows: | |||
| o On decapsulation, if the inner ECN field is Not-ECT the outer is | o On decapsulation, if the inner ECN field is Not-ECT the outer is | |||
| ignored. RFC3168 (but not RFC4301) also specified that the | ignored. RFC3168 (but not RFC4301) also specified that the | |||
| decapsulator must drop a packet with a Not-ECT inner and CE in the | decapsulator must drop a packet with a Not-ECT inner and CE in the | |||
| outer. | outer. | |||
| o In all other cases, if the outer is CE, the outgoing ECN field is | o In all other cases, if the outer is CE, the outgoing ECN field is | |||
| set to CE, but otherwise the outer is ignored and the inner is | set to CE, but otherwise the outer is ignored and the inner is | |||
| used for the outgoing ECN field. | used for the outgoing ECN field. | |||
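
The two recapped rules for pre-existing decapsulation can be sketched as a small function. This is only an illustration, not text from either draft version: the name `legacy_decap_ecn`, the `rfc3168` flag and the use of `None` to mean "drop" are assumptions made for the example. The numeric codepoint values are those defined in RFC3168.

```python
from typing import Optional

# ECN codepoints as defined in RFC3168 (2-bit field).
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11


def legacy_decap_ecn(inner: int, outer: int, rfc3168: bool) -> Optional[int]:
    """Return the outgoing ECN codepoint under the pre-existing rules.

    `rfc3168` selects RFC3168 behaviour (True) or RFC4301 behaviour (False).
    A return value of None is this sketch's (hypothetical) way of saying
    "drop the packet".
    """
    if inner == NOT_ECT:
        # The outer is ignored, except that RFC3168 drops on a CE outer.
        if outer == CE and rfc3168:
            return None
        return NOT_ECT
    if outer == CE:
        # In all other cases a CE outer is propagated.
        return CE
    # Otherwise the outer is ignored and the inner is used.
    return inner
```

The only input on which the two RFCs differ is a Not-ECT inner with a CE outer, which is why a single flag is enough to cover both rows of the figure.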
| RFC3168 also made it an auditable event for an IPsec tunnel "if the | Section 9.2.2 of RFC3168 also made it an auditable event for an IPsec | |||
| ECN Field is changed inappropriately within an IPsec tunnel...". | tunnel "if the ECN Field is changed inappropriately within an IPsec | |||
| Inappropriate changes were not specifically enumerated. RFC4301 did | tunnel...". Inappropriate changes were not specifically enumerated. | |||
| not mention inappropriate ECN changes. | RFC4301 did not mention inappropriate ECN changes. | |||
| 4. New ECN Tunnelling Rules | 4. New ECN Tunnelling Rules | |||
| The standards actions below in Section 4.1 (ingress encapsulation) | The standards actions below in Section 4.1 (ingress encapsulation) | |||
| and Section 4.2 (egress decapsulation) define new default ECN tunnel | and Section 4.2 (egress decapsulation) define new default ECN tunnel | |||
| processing rules for any IP packet (v4 or v6) with any Diffserv | processing rules for any IP packet (v4 or v6) with any Diffserv | |||
| codepoint. | codepoint. | |||
| If these defaults do not meet a particular requirement, an alternate | If these defaults do not meet a particular requirement, an alternate | |||
| ECN tunnelling scheme can be introduced as part of the definition of | ECN tunnelling scheme can be introduced as part of the definition of | |||
| an alternate congestion marking scheme used by a specific Diffserv | an alternate congestion marking scheme used by a specific Diffserv | |||
| PHB (see S.5 of [RFC3168] and [RFC4774]). When designing such | PHB (see S.5 of [RFC3168] and [RFC4774]). When designing such | |||
| alternate ECN tunnelling schemes, the principles in Section 7 should | alternate ECN tunnelling schemes, the principles in Section 7 should | |||
| be followed. However, alternate ECN tunnelling schemes are NOT | be followed. However, alternate ECN tunnelling schemes SHOULD be | |||
| RECOMMENDED as the deployment burden of handling exceptional PHBs in | avoided whenever possible as the deployment burden of handling | |||
| implementations of all affected tunnels should not be underestimated. | exceptional PHBs in implementations of all affected tunnels should | |||
| not be underestimated. There is no requirement for a PHB definition | ||||
| There is no requirement for a PHB definition to state anything about | to state anything about ECN tunnelling behaviour if the default | |||
| ECN tunnelling behaviour if the default behaviour in the present | behaviour in the present specification is sufficient. | |||
| specification is sufficient. | ||||
| 4.1. Default Tunnel Ingress Behaviour | 4.1. Default Tunnel Ingress Behaviour | |||
| Two modes of encapsulation are defined here; a REQUIRED `normal mode' | Two modes of encapsulation are defined here; a REQUIRED `normal mode' | |||
| and a `compatibility mode', which is for backward compatibility with | and a `compatibility mode', which is for backward compatibility with | |||
| tunnel decapsulators that do not understand ECN. Note that these are | tunnel decapsulators that do not understand ECN. Note that these are | |||
| modes of the ingress tunnel endpoint only, not the whole tunnel. | modes of the ingress tunnel endpoint only, not the whole tunnel. | |||
| Section 4.3 explains why two modes are necessary and specifies the | Section 4.3 explains why two modes are necessary and specifies the | |||
| circumstances in which it is sufficient to solely implement normal | circumstances in which it is sufficient to solely implement normal | |||
| mode. | mode. | |||
| skipping to change at page 16, line 15 | skipping to change at page 16, line 15 | |||
| +---------+------------------------------------------------+ | +---------+------------------------------------------------+ | |||
| |Incoming | Incoming Outer Header | | |Incoming | Incoming Outer Header | | |||
| | Inner +---------+------------+------------+------------+ | | Inner +---------+------------+------------+------------+ | |||
| | Header | Not-ECT | ECT(0) | ECT(1) | CE | | | Header | Not-ECT | ECT(0) | ECT(1) | CE | | |||
| +---------+---------+------------+------------+------------+ | +---------+---------+------------+------------+------------+ | |||
| | Not-ECT | Not-ECT |Not-ECT(!!!)|Not-ECT(!!!)| drop(!!!)| | | Not-ECT | Not-ECT |Not-ECT(!!!)|Not-ECT(!!!)| drop(!!!)| | |||
| | ECT(0) | ECT(0) | ECT(0) | ECT(1) | CE | | | ECT(0) | ECT(0) | ECT(0) | ECT(1) | CE | | |||
| | ECT(1) | ECT(1) | ECT(1) (!) | ECT(1) | CE | | | ECT(1) | ECT(1) | ECT(1) (!) | ECT(1) | CE | | |||
| | CE | CE | CE | CE(!!!)| CE | | | CE | CE | CE | CE(!!!)| CE | | |||
| +---------+---------+------------+------------+------------+ | +---------+---------+------------+------------+------------+ | |||
| | Outgoing Header | | ||||
| +------------------------------------------------+ | The ECN field in the outgoing header is set to the codepoint at the | |||
| Currently unused combinations are indicated by '(!!!)' or '(!)' | intersection of the appropriate incoming inner header (row) and | |||
| incoming outer header (column). Currently unused combinations are | ||||
| indicated by '(!!!)' or '(!)' | ||||
| Figure 4: New IP in IP Decapsulation Behaviour | Figure 4: New IP in IP Decapsulation Behaviour | |||
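
As an illustration only, the new decapsulation table in Figure 4 can be transcribed into a lookup structure. The names `ECN`, `DECAP` and `decapsulate_ecn`, and the use of `None` to represent "drop", are assumptions of this sketch and do not come from the draft. The logging and alarm behaviour for the '(!!!)' and '(!)' combinations is deliberately left out.

```python
from enum import Enum
from typing import Optional


class ECN(Enum):
    """ECN codepoints with the 2-bit values defined in RFC3168."""
    NOT_ECT = 0b00
    ECT_1 = 0b01
    ECT_0 = 0b10
    CE = 0b11


_N, _E0, _E1, _CE = ECN.NOT_ECT, ECN.ECT_0, ECN.ECT_1, ECN.CE

# DECAP[inner][outer] -> outgoing codepoint, or None for "drop".
# Rows are the incoming inner header, columns the incoming outer header.
DECAP = {
    _N:  {_N: _N,  _E0: _N,  _E1: _N,  _CE: None},
    _E0: {_N: _E0, _E0: _E0, _E1: _E1, _CE: _CE},
    _E1: {_N: _E1, _E0: _E1, _E1: _E1, _CE: _CE},
    _CE: {_N: _CE, _E0: _CE, _E1: _CE, _CE: _CE},
}


def decapsulate_ecn(inner: ECN, outer: ECN) -> Optional[ECN]:
    """Look up the outgoing ECN field for one inner/outer combination."""
    return DECAP[inner][outer]
```

Compared with the pre-existing behaviour, the visible difference in this transcription is the ECT(0) inner with ECT(1) outer cell, which now yields ECT(1) rather than ECT(0).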
| This table for decapsulation behaviour is derived from the following | This table for decapsulation behaviour is derived from the following | |||
| logic: | logic: | |||
| o If the inner ECN field is Not-ECT the decapsulator MUST NOT | o If the inner ECN field is Not-ECT the decapsulator MUST NOT | |||
| propagate any other ECN codepoint onwards. This is because the | propagate any other ECN codepoint onwards. This is because the | |||
| inner Not-ECT marking is set by transports that use drop as an | inner Not-ECT marking is set by transports that use drop as an | |||
| indication of congestion and would not understand or respond to | indication of congestion and would not understand or respond to | |||
| skipping to change at page 17, line 12 | skipping to change at page 17, line 15 | |||
| Just because the highlighted combinations are currently unused, | Just because the highlighted combinations are currently unused, | |||
| does not mean that all the other combinations are always valid. | does not mean that all the other combinations are always valid. | |||
| Some are only valid if they have arrived from a particular type of | Some are only valid if they have arrived from a particular type of | |||
| legacy ingress, and dangerous otherwise. Therefore an | legacy ingress, and dangerous otherwise. Therefore an | |||
| implementation MAY allow an operator to configure logging and | implementation MAY allow an operator to configure logging and | |||
| alarms for such additional header combinations known to be | alarms for such additional header combinations known to be | |||
| dangerous or CU for the particular configuration of tunnel | dangerous or CU for the particular configuration of tunnel | |||
| endpoints deployed at run-time. | endpoints deployed at run-time. | |||
| Alarms should be rate-limited so that the anomalous combinations | Alarms SHOULD be rate-limited so that the anomalous combinations | |||
| will not amplify into a flood of alarm messages. It MUST be | will not amplify into a flood of alarm messages. It MUST be | |||
| possible to suppress alarms or logging, e.g. if it becomes | possible to suppress alarms or logging, e.g. if it becomes | |||
| apparent that a combination that previously was not used has | apparent that a combination that previously was not used has | |||
| started to be used for legitimate purposes such as a new standards | started to be used for legitimate purposes such as a new standards | |||
| action. | action. | |||
| The above logic allows for ECT(0) and ECT(1) to both represent the | The above logic allows for ECT(0) and ECT(1) to both represent the | |||
| same severity of congestion marking (e.g. "not congestion marked"). | same severity of congestion marking (e.g. "not congestion marked"). | |||
| But it also allows future schemes to be defined where ECT(1) is a | But it also allows future schemes to be defined where ECT(1) is a | |||
| more severe marking than ECT(0), in particular enabling the simplest | more severe marking than ECT(0), in particular enabling the simplest | |||
| possible encoding for PCN [I-D.ietf-pcn-3-in-1-encoding]. This | possible encoding for PCN [I-D.ietf-pcn-3-in-1-encoding]. Before the | |||
| approach is discussed in Appendix D and in the discussion of the ECN | present specification was written, the PCN working-group had proposed | |||
| nonce [RFC3540] in Section 8, which in turn refers to Appendix F. | a number of work-rounds to the problem of a tunnel egress not | |||
| propagating two severity levels of congestion. Without wishing to | ||||
| disparage the ingenuity of these work-rounds, none were chosen for | ||||
| the standards track because they were either somewhat wasteful, | ||||
| imprecise or complicated [Note_PCN_egress]. Treating ECT(1) as | ||||
| either the same as ECT(0) or as a higher severity level is explained | ||||
| in the discussion of the ECN nonce [RFC3540] in Section 8, which in | ||||
| turn refers to Appendix F. | ||||
| 4.3. Encapsulation Modes | 4.3. Encapsulation Modes | |||
| Section 4.1 introduces two encapsulation modes, normal mode and | Section 4.1 introduces two encapsulation modes, normal mode and | |||
| compatibility mode, defining their encapsulation behaviour (i.e. | compatibility mode, defining their encapsulation behaviour (i.e. | |||
| header copying or zeroing respectively). Note that these are modes | header copying or zeroing respectively). Note that these are modes | |||
| of the ingress tunnel endpoint only, not the tunnel as a whole. | of the ingress tunnel endpoint only, not the tunnel as a whole. | |||
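
A minimal sketch of the "copying or zeroing" distinction follows. It assumes that normal mode copies the inner ECN field into the outer header and that compatibility mode clears the outer ECN field to Not-ECT. The names `IngressMode` and `encapsulate_ecn` are hypothetical and are not taken from the draft.

```python
from enum import Enum

NOT_ECT = 0b00  # ECN codepoint value from RFC3168


class IngressMode(Enum):
    """Modes of the tunnel ingress only, not of the tunnel as a whole."""
    NORMAL = "normal"
    COMPATIBILITY = "compatibility"


def encapsulate_ecn(inner_ecn: int, mode: IngressMode) -> int:
    """Return the ECN field to place in the outer header on encapsulation."""
    if mode is IngressMode.NORMAL:
        # Normal mode: copy the 2-bit inner ECN field to the outer header.
        return inner_ecn & 0b11
    # Compatibility mode: zero the outer ECN field (Not-ECT).
    return NOT_ECT
```

In this sketch the inner header is never modified by the ingress. Only the value written to the outer header depends on the mode.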
| To comply with this specification, a tunnel ingress MUST at least | To comply with this specification, a tunnel ingress MUST at least | |||
| implement `normal mode'. Unless it will never be used with legacy | implement `normal mode'. Unless it will never be used with legacy | |||
| tunnel egress nodes (RFC2003, RFC2401 or RFC2481 or the limited | tunnel egress nodes (RFC2003, RFC2401 or RFC2481 or the limited | |||
| functionality mode of RFC3168), an ingress MUST also implement | functionality mode of RFC3168), an ingress MUST also implement | |||
| `compatibility mode' for backward compatibility with tunnel egresses | `compatibility mode' for backward compatibility with tunnel egresses | |||
| that do not propagate explicit congestion notifications [RFC4774]. | that do not propagate explicit congestion notifications [RFC4774]. | |||
| We can categorise the way that an ingress tunnel endpoint is paired | We can categorise the way that an ingress tunnel endpoint is paired | |||
| with an egress as either: | with an egress as either static or dynamically discovered: | |||
| static: those paired together by prior configuration or; | ||||
| dynamically discovered: those paired together by some form of tunnel | Static: Tunnel endpoints paired together by prior configuration. | |||
| endpoint discovery, typically finding an egress on the path taken | ||||
| by the first packet. | ||||
| Static: Some implementations of encapsulator might always be | Some implementations of encapsulator might always be statically | |||
| statically deployed, and constrained to never be paired with a | deployed, and constrained to never be paired with a legacy | |||
| legacy decapsulator (RFC2003, RFC2401 or RFC2481 or the limited | decapsulator (RFC2003, RFC2401 or RFC2481 or the limited | |||
| functionality mode of RFC3168). In such a case, only normal mode | functionality mode of RFC3168). In such a case, only normal mode | |||
| needs to be implemented. | needs to be implemented. | |||
| For instance, RFC4301-compatible IPsec tunnel endpoints invariably | For instance, RFC4301-compatible IPsec tunnel endpoints invariably | |||
| use IKEv2 [RFC4306] for key exchange, which was introduced | use IKEv2 [RFC4306] for key exchange, which was introduced | |||
| alongside RFC4301. Therefore both endpoints of an RFC4301 tunnel | alongside RFC4301. Therefore both endpoints of an RFC4301 tunnel | |||
| can be sure that the other end is RFC4301-compatible, because the | can be sure that the other end is RFC4301-compatible, because the | |||
| tunnel is only formed after IKEv2 key management has completed, at | tunnel is only formed after IKEv2 key management has completed, at | |||
| which point both ends will be RFC4301-compliant by definition. | which point both ends will be RFC4301-compliant by definition. | |||
| Therefore an IPsec tunnel ingress does not need compatibility | Therefore an IPsec tunnel ingress does not need compatibility | |||
| mode, as it will never interact with legacy ECN tunnels. To | mode, as it will never interact with legacy ECN tunnels. To | |||
| comply with the present specification, it only needs to implement | comply with the present specification, it only needs to implement | |||
| the required normal mode, which is identical to the pre-existing | the required normal mode, which is identical to the pre-existing | |||
| RFC4301 behaviour. | RFC4301 behaviour. | |||
| Dynamic Discovery: This specification does not require or recommend | Dynamic Discovery: Tunnel endpoints paired together by some form of | |||
| dynamic discovery and it does not define how dynamic negotiation | tunnel endpoint discovery, typically finding an egress on the path | |||
| might be done, but it recognises that proprietary tunnel endpoint | taken by the first packet. | |||
| discovery protocols exist. It therefore sets down some | ||||
| constraints on discovery protocols to ensure safe interworking. | This specification does not require or recommend dynamic discovery | |||
| and it does not define how dynamic negotiation might be done, but | ||||
| it recognises that proprietary tunnel endpoint discovery protocols | ||||
| exist. It therefore sets down some constraints on discovery | ||||
| protocols to ensure safe interworking. | ||||
| If dynamic tunnel endpoint discovery might pair an ingress with a | If dynamic tunnel endpoint discovery might pair an ingress with a | |||
| legacy egress (RFC2003, RFC2401 or RFC2481 or the limited | legacy egress (RFC2003, RFC2401 or RFC2481 or the limited | |||
| functionality mode of RFC3168), the ingress MUST implement both | functionality mode of RFC3168), the ingress MUST implement both | |||
| normal and compatibility mode. If the tunnel discovery process is | normal and compatibility mode. If the tunnel discovery process is | |||
| arranged to only ever find a tunnel egress that propagates ECN | arranged to only ever find a tunnel egress that propagates ECN | |||
| (RFC3168 full functionality mode, RFC4301 or this present | (RFC3168 full functionality mode, RFC4301 or this present | |||
| specification), then a tunnel ingress can be compliant with the | specification), then a tunnel ingress can be compliant with the | |||
| present specification without implementing compatibility mode. | present specification without implementing compatibility mode. | |||
| skipping to change at page 19, line 42 | skipping to change at page 19, line 50 | |||
| Through the discovery protocol, a tunnel ingress compliant with the | Through the discovery protocol, a tunnel ingress compliant with the | |||
| present specification might ask if the egress is compliant with the | present specification might ask if the egress is compliant with the | |||
| present specification, with RFC4301 or with RFC3168 full | present specification, with RFC4301 or with RFC3168 full | |||
| functionality mode. Or an RFC3168 tunnel ingress might try to | functionality mode. Or an RFC3168 tunnel ingress might try to | |||
| negotiate to use limited functionality or full functionality mode | negotiate to use limited functionality or full functionality mode | |||
| [RFC3168]. In all these cases, a decapsulating tunnel egress | [RFC3168]. In all these cases, a decapsulating tunnel egress | |||
| compliant with this specification MUST agree to any of these | compliant with this specification MUST agree to any of these | |||
| requests, since it will behave identically in all these cases. | requests, since it will behave identically in all these cases. | |||
| If no ECN-related mode is requested, a compliant tunnel egress MUST | If no ECN-related mode is requested, a compliant tunnel egress MUST | |||
| continue without raising any error or warning as its egress behaviour | continue without raising any error or warning, because its egress | |||
| is compatible with all the legacy ingress behaviours that do not | behaviour is compatible with all the legacy ingress behaviours that | |||
| negotiate capabilities. | do not negotiate capabilities. | |||
| A compliant tunnel egress SHOULD raise a warning alarm about any | A compliant tunnel egress SHOULD raise a warning alarm about any | |||
| requests to enter modes it does not recognise but, for 'forward | requests to enter modes it does not recognise but, for 'forward | |||
| compatibility' with standards actions possibly defined after it was | compatibility' with standards actions possibly defined after it was | |||
| implemented, it SHOULD continue operating. | implemented, it SHOULD continue operating. | |||
| 5. Updates to Earlier RFCs | 5. Updates to Earlier RFCs | |||
| 5.1. Changes to RFC4301 ECN processing | 5.1. Changes to RFC4301 ECN processing | |||
| skipping to change at page 20, line 38 | skipping to change at page 20, line 44 | |||
| dropped rather than forwarded as Not-ECT; | dropped rather than forwarded as Not-ECT; | |||
| * Certain combinations of inner and outer ECN field have been | * Certain combinations of inner and outer ECN field have been | |||
| identified as currently unused. These can trigger logging | identified as currently unused. These can trigger logging | |||
| and/or raise alarms. | and/or raise alarms. | |||
| Modes: RFC4301 tunnel endpoints do not need modes and are not | Modes: RFC4301 tunnel endpoints do not need modes and are not | |||
| updated by the modes in the present specification. Effectively an | updated by the modes in the present specification. Effectively an | |||
| RFC4301 IPsec ingress solely uses the REQUIRED normal mode of | RFC4301 IPsec ingress solely uses the REQUIRED normal mode of | |||
| encapsulation, which is unchanged from RFC4301 encapsulation. It | encapsulation, which is unchanged from RFC4301 encapsulation. It | |||
| will never need the OPTIONAL compatibility mode as explained in | will never [Note_Manual_Keying] need the OPTIONAL compatibility | |||
| Section 4.3 (except in one corner-case described below). | mode as explained in Section 4.3. | |||
| {ToDo: Question to Security Directorate: Although this corner-case | ||||
| theoretically exists, it would be preferable to delete any mention | ||||
| of it for simplicity & clarity. Agree?} | ||||
| One corner case can exist where an RFC4301 ingress does not use | ||||
| IKEv2, but uses manual keying instead. Then an RFC4301 ingress | ||||
| could conceivably be configured to tunnel to an egress with | ||||
| limited functionality ECN handling. Strictly, for this corner- | ||||
| case, the requirement to use compatibility mode in this | ||||
| specification updates RFC4301. However, this is such a remote | ||||
| possibility that RFC4301 IPsec implementations are NOT REQUIRED to | ||||
| implement compatibility mode. | ||||
| 5.2. Changes to RFC3168 ECN processing | 5.2. Changes to RFC3168 ECN processing | |||
| Ingress: On encapsulation, the new rule in Figure 3 that a normal | Ingress: On encapsulation, the new rule in Figure 3 that a normal | |||
| mode tunnel ingress copies any ECN field into the outer header | mode tunnel ingress copies any ECN field into the outer header | |||
| updates the full functionality behaviour of an RFC3168 ingress. | updates the full functionality behaviour of an RFC3168 ingress. | |||
| Nonetheless, the new compatibility mode encapsulates packets | Nonetheless, the new compatibility mode encapsulates packets | |||
| identically to the limited functionality mode of an RFC3168 | identically to the limited functionality mode of an RFC3168 | |||
| ingress. | ingress. | |||
| Egress: An RFC3168 egress will need to be updated to the new | Egress: An RFC3168 egress will need to be updated to the new | |||
| decapsulation behaviour in Figure 4, in order to comply with the | decapsulation behaviour in Figure 4, in order to comply with the | |||
| present specification. However, the changes are backward | present specification. However, the changes are backward | |||
| skipping to change at page 22, line 32 | skipping to change at page 22, line 32 | |||
| compatibility with legacy decapsulators that do not propagate ECN | compatibility with legacy decapsulators that do not propagate ECN | |||
| correctly. | correctly. | |||
| The trigger that motivated this update to RFC3168 encapsulation was a | The trigger that motivated this update to RFC3168 encapsulation was a | |||
| standards track proposal for pre-congestion notification (PCN | standards track proposal for pre-congestion notification (PCN | |||
| [RFC5670]). PCN excess rate marking only works correctly if the ECN | [RFC5670]). PCN excess rate marking only works correctly if the ECN | |||
| field is copied on encapsulation (as in RFC4301 and RFC5129); it does | field is copied on encapsulation (as in RFC4301 and RFC5129); it does | |||
| not work if ECN is reset (as in RFC3168). This is because PCN excess | not work if ECN is reset (as in RFC3168). This is because PCN excess | |||
| rate marking depends on the outer header revealing any congestion | rate marking depends on the outer header revealing any congestion | |||
| experienced so far on the whole path, not just since the last tunnel | experienced so far on the whole path, not just since the last tunnel | |||
| ingress (see Appendix E for a full explanation). | ingress [Note_PCN_ingress]. | |||
| PCN allows a network operator to add flow admission and termination | PCN allows a network operator to add flow admission and termination | |||
| for inelastic traffic at the edges of a Diffserv domain, but without | for inelastic traffic at the edges of a Diffserv domain, but without | |||
| any per-flow mechanisms in the interior and without the generous | any per-flow mechanisms in the interior and without the generous | |||
| provisioning typical of Diffserv, aiming to significantly reduce | provisioning typical of Diffserv, aiming to significantly reduce | |||
| costs. The PCN architecture [RFC5559] states that RFC3168 IP in IP | costs. The PCN architecture [RFC5559] states that RFC3168 IP in IP | |||
| tunnelling of the ECN field cannot be used for any tunnel ingress in | tunnelling of the ECN field cannot be used for any tunnel ingress in | |||
| a PCN domain. Prior to the present specification, this left a stark | a PCN domain. Prior to the present specification, this left a stark | |||
| choice between not being able to use PCN for inelastic traffic | choice between not being able to use PCN for inelastic traffic | |||
| control or not being able to use the many tunnels already deployed | control or not being able to use the many tunnels already deployed | |||
| skipping to change at page 24, line 20 | skipping to change at page 24, line 20 | |||
| As well as being useful for general future-proofing, this problem | As well as being useful for general future-proofing, this problem | |||
| is immediately pressing for standardisation of pre-congestion | is immediately pressing for standardisation of pre-congestion | |||
| notification (PCN), which uses two severity levels of congestion. | notification (PCN), which uses two severity levels of congestion. | |||
| If a congested queue used ECT(1) in the outer header to signal | If a congested queue used ECT(1) in the outer header to signal | |||
| more severe congestion than ECT(0), the pre-existing | more severe congestion than ECT(0), the pre-existing | |||
| decapsulation rules would have thrown away this congestion | decapsulation rules would have thrown away this congestion | |||
| signal, preventing tunnelled traffic from ever knowing that it | signal, preventing tunnelled traffic from ever knowing that it | |||
| should reduce its load. | should reduce its load. | |||
| The PCN working group has had to consider a number of wasteful or | The PCN working group has had to consider a number of wasteful or | |||
| convoluted work-rounds to this problem (see Appendix D). But by | convoluted work-rounds to this problem [Note_PCN_egress]. But by | |||
| far the simplest approach is just to remove the covert channel | far the simplest approach is just to remove the covert channel | |||
| blockages from tunnelling behaviour--now deemed unnecessary | blockages from tunnelling behaviour--now deemed unnecessary | |||
| anyway. Then network operators that want to support two | anyway. Then network operators that want to support two | |||
| congestion severity-levels for PCN can specify that every tunnel | congestion severity-levels for PCN can specify that every tunnel | |||
| egress in a PCN region must comply with this latest | egress in a PCN region must comply with this latest | |||
| specification. | specification. | |||
| Not only does this make two congestion severity-levels available | Not only does this make two congestion severity-levels available | |||
| for PCN standardisation, but also for other potential uses of the | for PCN standardisation, but also for other potential uses of the | |||
| extra ECN codepoint (e.g. [VCP]). | extra ECN codepoint (e.g. [VCP]). | |||
| skipping to change at page 28, line 33 | skipping to change at page 28, line 33 | |||
| Then the code module doing encapsulation can keep to the | Then the code module doing encapsulation can keep to the | |||
| copying rule and the load regulator module can reset | copying rule and the load regulator module can reset | |||
| congestion, without any code in either module being | congestion, without any code in either module being | |||
| conditional on whether the other is there. | conditional on whether the other is there. | |||
| On decapsulation in any new scheme: | On decapsulation in any new scheme: | |||
| 1. If the arriving inner header is Not-ECT it implies the | 1. If the arriving inner header is Not-ECT it implies the | |||
| transport will not understand other ECN codepoints. If the | transport will not understand other ECN codepoints. If the | |||
| outer header carries an explicit congestion marking, the | outer header carries an explicit congestion marking, the | |||
| alternate scheme will probably need to drop the packet--the | alternate scheme would be expected to drop the packet--the | |||
| only indication of congestion the transport will understand. | only indication of congestion the transport will understand. | |||
| If the outer carries any other ECN codepoint that does not | If the alternate scheme recommends forwarding rather than | |||
| indicate congestion, the alternate scheme can forward the | dropping such a packet, it must clearly justify this decision. | |||
| packet, but probably only as Not-ECT. | If the inner is Not-ECT and the outer carries any other ECN | |||
| codepoint that does not indicate congestion, the alternate | ||||
| scheme can forward the packet, but probably only as Not-ECT. | ||||
| 2. If the arriving inner header is other than Not-ECT, the ECN | 2. If the arriving inner header is other than Not-ECT, the ECN | |||
| field that the alternate decapsulation scheme forwards should | field that the alternate decapsulation scheme forwards should | |||
| reflect the more severe congestion marking of the arriving | reflect the more severe congestion marking of the arriving | |||
| inner and outer headers. | inner and outer headers. | |||
| 3. Any alternate scheme MUST define a behaviour for all | 3. Any alternate scheme must define a behaviour for all | |||
| combinations of inner and outer headers, even those that would | combinations of inner and outer headers, even those that would | |||
| not be expected to result from standards known at the time and | not be expected to result from standards known at the time and | |||
| even those that would not be expected from the tunnel ingress | even those that would not be expected from the tunnel ingress | |||
| paired with the egress at run-time. Consideration should be | paired with the egress at run-time. Consideration should be | |||
| given to logging such unexpected combinations and raising an | given to logging such unexpected combinations and raising an | |||
| alarm, particularly if there is a danger that the invalid | alarm, particularly if there is a danger that the invalid | |||
| combination implies congestion signals are not being | combination implies congestion signals are not being | |||
| propagated correctly. The presence of currently unused | propagated correctly. The presence of currently unused | |||
| combinations may represent an attack, but the new scheme | combinations may represent an attack, but the new scheme | |||
| should try to define a way to forward such packets, at least | should try to define a way to forward such packets, at least | |||
| skipping to change at page 34, line 15 | skipping to change at page 34, line 15 | |||
| Information", RFC 5696, | Information", RFC 5696, | |||
| November 2009. | November 2009. | |||
| [VCP] Xia, Y., Subramanian, L., Stoica, | [VCP] Xia, Y., Subramanian, L., Stoica, | |||
| I., and S. Kalyanaraman, "One more | I., and S. Kalyanaraman, "One more | |||
| bit is enough", Proc. SIGCOMM'05, | bit is enough", Proc. SIGCOMM'05, | |||
| ACM CCR 35(4):37--48, 2005, <http:// | ACM CCR 35(4):37--48, 2005, <http:// | |||
| doi.acm.org/10.1145/ | doi.acm.org/10.1145/ | |||
| 1080091.1080098>. | 1080091.1080098>. | |||
| Editorial Comments | ||||
| [Note_Manual_Keying] Bob Briscoe: Note (To be removed by the RFC | ||||
| Editor): One corner case can exist where an | ||||
| RFC4301 ingress does not use IKEv2, but uses | ||||
| manual keying instead. Then an RFC4301 ingress | ||||
| could conceivably be configured to tunnel to an | ||||
| egress with limited functionality ECN handling. | ||||
| Strictly, for this corner-case, the requirement | ||||
| to use compatibility mode in this specification | ||||
| updates RFC4301. However, this is such a remote | ||||
| possibility that RFC4301 IPsec implementations | ||||
| are not required to implement compatibility | ||||
| mode. It is planned to remove this note after | ||||
| the review process has completed to avoid | ||||
| unnecessarily complicating the document with a | ||||
| largely theoretical corner case. | ||||
| [Note_PCN_egress] Bob Briscoe: During the review process Appendix | ||||
| D is provided to expand on this point, but it | ||||
| will be deleted before publication. | ||||
| [Note_PCN_ingress] Bob Briscoe: During the review process Appendix | ||||
| E is provided to expand on this point, but it | ||||
| will be deleted before publication. | ||||
| Appendix A. Early ECN Tunnelling RFCs | Appendix A. Early ECN Tunnelling RFCs | |||
| IP in IP tunnelling was originally defined in [RFC2003]. On | IP in IP tunnelling was originally defined in [RFC2003]. On | |||
| encapsulation, the incoming header was copied to the outer and on | encapsulation, the incoming header was copied to the outer and on | |||
| decapsulation the outer was simply discarded. Initially, IPsec | decapsulation the outer was simply discarded. Initially, IPsec | |||
| tunnelling [RFC2401] followed the same behaviour. | tunnelling [RFC2401] followed the same behaviour. | |||
| When ECN was introduced experimentally in [RFC2481], legacy (RFC2003 | When ECN was introduced experimentally in [RFC2481], legacy (RFC2003 | |||
| or RFC2401) tunnels would have discarded any congestion markings | or RFC2401) tunnels would have discarded any congestion markings | |||
| added to the outer header, so RFC2481 introduced rules for | added to the outer header, so RFC2481 introduced rules for | |||
| skipping to change at page 35, line 25 | skipping to change at page 35, line 47 | |||
| Information security can be assured by using various end to end | Information security can be assured by using various end to end | |||
| security solutions (including IPsec in transport mode [RFC4301]), but | security solutions (including IPsec in transport mode [RFC4301]), but | |||
| a commonly used scenario involves the need to communicate between two | a commonly used scenario involves the need to communicate between two | |||
| physically protected domains across the public Internet. In this | physically protected domains across the public Internet. In this | |||
| case there are certain management advantages to using IPsec in tunnel | case there are certain management advantages to using IPsec in tunnel | |||
| mode solely across the publicly accessible part of the path. The | mode solely across the publicly accessible part of the path. The | |||
| path followed by a packet then crosses security 'domains'; the ones | path followed by a packet then crosses security 'domains'; the ones | |||
| protected by physical or other means before and after the tunnel and | protected by physical or other means before and after the tunnel and | |||
| the one protected by an IPsec tunnel across the otherwise unprotected | the one protected by an IPsec tunnel across the otherwise unprotected | |||
| domain. We will use the scenario in Figure 5 where endpoints 'A' and | domain. The scenario in Figure 5 will be used where endpoints 'A' | |||
| 'B' communicate through a tunnel. The tunnel ingress 'I' and egress | and 'B' communicate through a tunnel. The tunnel ingress 'I' and | |||
| 'E' are within physically protected edge domains, while the tunnel | egress 'E' are within physically protected edge domains, while the | |||
| spans an unprotected internetwork where there may be 'men in the | tunnel spans an unprotected internetwork where there may be 'men in | |||
| middle', M. | the middle', M. | |||
| physically unprotected physically | physically unprotected physically | |||
| <-protected domain-><--domain--><-protected domain-> | <-protected domain-><--domain--><-protected domain-> | |||
| +------------------+ +------------------+ | +------------------+ +------------------+ | |||
| | | M | | | | | M | | | |||
| | A-------->I=========>==========>E-------->B | | | A-------->I=========>==========>E-------->B | | |||
| | | | | | | | | | | |||
| +------------------+ +------------------+ | +------------------+ +------------------+ | |||
| <----IPsec secured----> | <----IPsec secured----> | |||
| tunnel | tunnel | |||
| skipping to change at page 36, line 22 | skipping to change at page 36, line 45 | |||
| from a congested resource towards downstream nodes. Typically a | from a congested resource towards downstream nodes. Typically a | |||
| downstream transport might feed the information back somehow to the | downstream transport might feed the information back somehow to the | |||
| point upstream of the congestion that can regulate the load on the | point upstream of the congestion that can regulate the load on the | |||
| congested resource, but other actions are possible (see [RFC3168] | congested resource, but other actions are possible (see [RFC3168] | |||
| S.6). In terms of the above unicast scenario, ECN effectively | S.6). In terms of the above unicast scenario, ECN effectively | |||
| intends to create an information channel (for congestion signalling) | intends to create an information channel (for congestion signalling) | |||
| from 'M' to 'B' (for 'B' to feed back to 'A'). Therefore the goals | from 'M' to 'B' (for 'B' to feed back to 'A'). Therefore the goals | |||
| of IPsec and ECN are mutually incompatible, requiring some | of IPsec and ECN are mutually incompatible, requiring some | |||
| compromise. | compromise. | |||
| With respect to the DS or ECN fields, S.5.1.2 of RFC4301 says, | With respect to using the DS or ECN fields as covert channels, | |||
| "controls are provided to manage the bandwidth of this [covert] | S.5.1.2 of RFC4301 says, "controls are provided to manage the | |||
| channel". Using the ECN processing rules of RFC4301, the channel | bandwidth of this channel". Using the ECN processing rules of | |||
| bandwidth is two bits per datagram from 'A' to 'M' and one bit per | RFC4301, the channel bandwidth is two bits per datagram from 'A' to | |||
| datagram from 'M' to 'A' (because 'E' limits the combinations of the | 'M' and one bit per datagram from 'M' to 'A' (because 'E' limits the | |||
| 2-bit ECN field that it will copy). In both cases the covert channel | combinations of the 2-bit ECN field that it will copy). In both | |||
| bandwidth is further reduced by noise from any real congestion | cases the covert channel bandwidth is further reduced by noise from | |||
| marking. RFC4301 implies that these covert channels are sufficiently | any real congestion marking. RFC4301 implies that these covert | |||
| limited to be considered a manageable threat. However, with respect | channels are sufficiently limited to be considered a manageable | |||
| to the larger (6b) DS field, the same section of RFC4301 says not | threat. However, with respect to the larger (6b) DS field, the same | |||
| copying is the default, but a configuration option can allow copying | section of RFC4301 says not copying is the default, but a | |||
| "to allow a local administrator to decide whether the covert channel | configuration option can allow copying "to allow a local | |||
| provided by copying these bits outweighs the benefits of copying". | administrator to decide whether the covert channel provided by | |||
| Of course, an administrator considering copying of the DS field has | copying these bits outweighs the benefits of copying". Of course, an | |||
| to take into account that it could be concatenated with the ECN field | administrator considering copying of the DS field has to take into | |||
| giving an 8b per datagram covert channel. | account that it could be concatenated with the ECN field giving an 8b | |||
| per datagram covert channel. | ||||
| For tunnelling the 6b Diffserv field two conceptual models have had | For tunnelling the 6b Diffserv field two conceptual models have had | |||
| to be defined so that administrators can trade off security against | to be defined so that administrators can trade off security against | |||
| the needs of traffic conditioning [RFC2983]: | the needs of traffic conditioning [RFC2983]: | |||
| The uniform model: where the Diffserv field is preserved end-to-end | The uniform model: where the Diffserv field is preserved end-to-end | |||
| by copying into the outer header on encapsulation and copying from | by copying into the outer header on encapsulation and copying from | |||
| the outer header on decapsulation. | the outer header on decapsulation. | |||
| The pipe model: where the outer header is independent of that in the | The pipe model: where the outer header is independent of that in the | |||
| skipping to change at page 37, line 15 | skipping to change at page 37, line 38 | |||
| It deemed that simplicity was more important than allowing | It deemed that simplicity was more important than allowing | |||
| administrators the option of a tiny increment in security, especially | administrators the option of a tiny increment in security, especially | |||
| given not copying congestion indications could seriously harm | given not copying congestion indications could seriously harm | |||
| everyone's network service. | everyone's network service. | |||
| B.2. Control Constraints | B.2. Control Constraints | |||
| Congestion control requires that any congestion notification marked | Congestion control requires that any congestion notification marked | |||
| into packets by a resource will be able to traverse a feedback loop | into packets by a resource will be able to traverse a feedback loop | |||
| back to a function capable of controlling the load on that resource. | back to a function capable of controlling the load on that resource. | |||
| To be precise, rather than calling this function the data source, we | To be precise, rather than calling this function the data source, it | |||
| will call it the Load Regulator. This will allow us to deal with | will be called the Load Regulator. This allows for exceptional cases | |||
| exceptional cases where load is not regulated by the data source, but | where load is not regulated by the data source, but usually the two | |||
| usually the two terms will be synonymous. Note the term "a function | terms will be synonymous. Note the term "a function _capable of_ | |||
| _capable of_ controlling the load" deliberately includes a source | controlling the load" deliberately includes a source application that | |||
| application that doesn't actually control the load but ought to (e.g. | doesn't actually control the load but ought to (e.g. an application | |||
| an application without congestion control that uses UDP). | without congestion control that uses UDP). | |||
| A--->R--->I=========>M=========>E-------->B | A--->R--->I=========>M=========>E-------->B | |||
| Figure 6: Simple Tunnel Scenario | Figure 6: Simple Tunnel Scenario | |||
| We now consider a similar tunnelling scenario to the IPsec one just | A similar tunnelling scenario to the IPsec one just described will | |||
| described, but without the different security domains so we can just | now be considered, but without the different security domains, | |||
| focus on ensuring the control loop and management monitoring can work | because the focus now shifts to whether the control loop and | |||
| (Figure 6). If we want resources in the tunnel to be able to | management monitoring work (Figure 6). If resources in the tunnel | |||
| explicitly notify congestion and the feedback path is from 'B' to | are to be able to explicitly notify congestion and the feedback path | |||
| 'A', it will certainly be necessary for 'E' to copy any CE marking | is from 'B' to 'A', it will certainly be necessary for 'E' to copy | |||
| from the outer header to the inner header for onward transmission to | any CE marking from the outer header to the inner header for onward | |||
| 'B', otherwise congestion notification from resources like 'M' cannot | transmission to 'B', otherwise congestion notification from resources | |||
| be fed back to the Load Regulator ('A'). But it does not seem | like 'M' cannot be fed back to the Load Regulator ('A'). But it does | |||
| necessary for 'I' to copy CE markings from the inner to the outer | not seem necessary for 'I' to copy CE markings from the inner to the | |||
| header. For instance, if resource 'R' is congested, it can send | outer header. For instance, if resource 'R' is congested, it can | |||
| congestion information to 'B' using the congestion field in the inner | send congestion information to 'B' using the congestion field in the | |||
| header without 'I' copying the congestion field into the outer header | inner header without 'I' copying the congestion field into the outer | |||
| and 'E' copying it back to the inner header. 'E' can still write any | header and 'E' copying it back to the inner header. 'E' can still | |||
| additional congestion marking introduced across the tunnel into the | write any additional congestion marking introduced across the tunnel | |||
| congestion field of the inner header. | into the congestion field of the inner header. | |||
| All this shows that 'E' can preserve the control loop irrespective of | All this shows that 'E' can preserve the control loop irrespective of | |||
| whether 'I' copies congestion notification into the outer header or | whether 'I' copies congestion notification into the outer header or | |||
| resets it. | resets it. | |||
| That is the situation for existing control arrangements but, because | That is the situation for existing control arrangements but, because | |||
| copying reveals more information, it would open up possibilities for | copying reveals more information, it would open up possibilities for | |||
| better control system designs. For instance, Appendix E describes | better control system designs. For instance, resetting CE marking on | |||
| how resetting CE marking on encapsulation breaks a proposed | encapsulation breaks the standards track PCN congestion marking | |||
| congestion marking scheme on the standards track. It ends up | scheme [RFC5670]. It ends up removing excessive amounts of traffic | |||
| removing excessive amounts of traffic unnecessarily. Whereas copying | unnecessarily. Whereas copying CE markings at ingress leads to the | |||
| CE markings at ingress leads to the correct control behaviour. | correct control behaviour. | |||
| B.3. Management Constraints | B.3. Management Constraints | |||
| As well as control, there are also management constraints. | As well as control, there are also management constraints. | |||
| Specifically, a management system may monitor congestion markings in | Specifically, a management system may monitor congestion markings in | |||
| passing packets, perhaps at the border between networks as part of a | passing packets, perhaps at the border between networks as part of a | |||
| service level agreement. For instance, monitors at the borders of | service level agreement. For instance, monitors at the borders of | |||
| autonomous systems may need to measure how much congestion has | autonomous systems may need to measure how much congestion has | |||
| accumulated so far along the path, perhaps to determine between them | accumulated so far along the path, perhaps to determine between them | |||
| how much of the congestion is contributed by each domain. | how much of the congestion is contributed by each domain. | |||
| In this document we define the baseline of congestion marking (or the | In this document the baseline of congestion marking (or the | |||
| Congestion Baseline) as the source of the layer that created (or most | Congestion Baseline) is defined as the source of the layer that | |||
| recently reset) the congestion notification field. When monitoring | created (or most recently reset) the congestion notification field. | |||
| congestion it would be desirable if the Congestion Baseline did not | When monitoring congestion it would be desirable if the Congestion | |||
| depend on whether packets were tunnelled or not. Given some tunnels | Baseline did not depend on whether packets were tunnelled or not. | |||
| cross domain borders (e.g. consider M in Figure 6 is monitoring a | Given some tunnels cross domain borders (e.g. consider M in Figure 6 | |||
| border), it would therefore be desirable for 'I' to copy congestion | is monitoring a border), it would therefore be desirable for 'I' to | |||
| accumulated so far into the outer headers, so that it is exposed | copy congestion accumulated so far into the outer headers, so that it | |||
| across the tunnel. | is exposed across the tunnel. | |||
| For management purposes it might be useful for the tunnel egress to | For management purposes it might be useful for the tunnel egress to | |||
| be able to monitor whether congestion occurred across a tunnel or | be able to monitor whether congestion occurred across a tunnel or | |||
| upstream of it. Superficially it appears that copying congestion | upstream of it. Superficially it appears that copying congestion | |||
| markings at the ingress would make this difficult, whereas it was | markings at the ingress would make this difficult, whereas it was | |||
| straightforward when an RFC3168 ingress reset them. However, | straightforward when an RFC3168 ingress reset them. However, | |||
| Appendix C gives a simple and precise method for a tunnel egress to | Appendix C gives a simple and precise method for a tunnel egress to | |||
| infer the congestion level introduced across a tunnel. It works | infer the congestion level introduced across a tunnel. It works | |||
| irrespective of whether the ingress copies or resets congestion | irrespective of whether the ingress copies or resets congestion | |||
| markings. | markings. | |||
| End of changes. 37 change blocks. | ||||
| 139 lines changed or deleted | 177 lines changed or added | |||
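Section B.3 above says that a tunnel egress can infer the congestion introduced across a tunnel whether the ingress copies or resets congestion markings. Appendix C is not reproduced in this excerpt, so the following Python sketch is only an illustration of one way such an inference could work. It assumes that the egress counts only packets whose inner header arrived ECN-capable but not CE-marked, and reports the fraction of those whose outer header is CE. The function names (`encapsulate`, `tunnel_congestion`, `arrivals`), the "reset" behaviour modelled on an RFC3168-style ingress, and the packet counts are all hypothetical choices made for this example.

```python
# Two-bit ECN codepoints of the IP header.
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11


def encapsulate(inner, mode):
    """Outer ECN field chosen by ingress 'I' (assumed behaviours).

    "copy"  : the outer header carries the same codepoint as the inner.
    "reset" : a CE inner is replaced by ECT(0) in the outer, so marking
              accumulated upstream is hidden across the tunnel.
    """
    if mode == "copy":
        return inner
    if mode == "reset":
        return ECT0 if inner == CE else inner
    raise ValueError("mode must be 'copy' or 'reset'")


def tunnel_congestion(packets):
    """Estimate the marking fraction introduced inside the tunnel.

    packets is an iterable of (inner, outer) codepoints seen at egress 'E'.
    Only packets whose inner header is ECT(0) or ECT(1) are counted: they
    entered the tunnel unmarked, so a CE outer header on them can only
    have been set between ingress and egress.
    """
    eligible = [(i, o) for i, o in packets if i in (ECT0, ECT1)]
    if not eligible:
        return None  # nothing to measure
    marked = sum(1 for _, outer in eligible if outer == CE)
    return marked / len(eligible)


def arrivals(mode):
    """Illustrative traffic: 100 packets, 30 already CE before the tunnel,
    and 12 of the remaining 70 marked CE inside the tunnel."""
    packets = []
    for n in range(100):
        inner = CE if n < 30 else ECT0
        outer = encapsulate(inner, mode)
        if 30 <= n < 42:
            outer = CE  # congestion marking applied within the tunnel
        packets.append((inner, outer))
    return packets


if __name__ == "__main__":
    for mode in ("copy", "reset"):
        print(mode, tunnel_congestion(arrivals(mode)))
```

With these illustrative numbers the estimate is 12/70 in both modes, because packets that were already CE-marked upstream are excluded from the count regardless of what the ingress wrote into their outer header.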