| draft-ietf-tsvwg-ecn-tunnel-00.txt | draft-ietf-tsvwg-ecn-tunnel-01.txt | |||
|---|---|---|---|---|
| Transport Area Working Group B. Briscoe | Transport Area Working Group B. Briscoe | |||
| Internet-Draft BT | Internet-Draft BT | |||
| Intended status: Standards Track Oct 16, 2008 | Intended status: Standards Track Oct 27, 2008 | |||
| Expires: April 19, 2009 | Expires: April 30, 2009 | |||
| Layered Encapsulation of Congestion Notification | Layered Encapsulation of Congestion Notification | |||
| draft-ietf-tsvwg-ecn-tunnel-00 | draft-ietf-tsvwg-ecn-tunnel-01 | |||
| Status of this Memo | Status of this Memo | |||
| By submitting this Internet-Draft, each author represents that any | By submitting this Internet-Draft, each author represents that any | |||
| applicable patent or other IPR claims of which he or she is aware | applicable patent or other IPR claims of which he or she is aware | |||
| have been or will be disclosed, and any of which he or she becomes | have been or will be disclosed, and any of which he or she becomes | |||
| aware will be disclosed, in accordance with Section 6 of BCP 79. | aware will be disclosed, in accordance with Section 6 of BCP 79. | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF), its areas, and its working groups. Note that | Task Force (IETF), its areas, and its working groups. Note that | |||
| skipping to change at page 1, line 34 | skipping to change at page 1, line 34 | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
| http://www.ietf.org/ietf/1id-abstracts.txt. | http://www.ietf.org/ietf/1id-abstracts.txt. | |||
| The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
| http://www.ietf.org/shadow.html. | http://www.ietf.org/shadow.html. | |||
| This Internet-Draft will expire on April 19, 2009. | This Internet-Draft will expire on April 30, 2009. | |||
| Abstract | Abstract | |||
| This document redefines how the explicit congestion notification | This document redefines how the explicit congestion notification | |||
| (ECN) field of the outer IP header of a tunnel should be constructed. | (ECN) field of the outer IP header of a tunnel should be constructed. | |||
| It brings all IP in IP tunnels (v4 or v6) into line with the way | It brings all IP in IP tunnels (v4 or v6) into line with the way | |||
| IPsec tunnels now construct the ECN field. It includes a thorough | IPsec tunnels now construct the ECN field. It includes a thorough | |||
| analysis of the reasoning for this change and the implications. It | analysis of the reasoning for this change and the implications. It | |||
| also gives guidelines on the encapsulation of IP congestion | also gives guidelines on the encapsulation of IP congestion | |||
| notification by any outer header, whether encapsulated in an IP | notification by any outer header, whether encapsulated in an IP | |||
| skipping to change at page 2, line 12 | skipping to change at page 2, line 12 | |||
| help interworking, if the IETF or other standards bodies specify any | help interworking, if the IETF or other standards bodies specify any | |||
| new encapsulation of congestion notification. | new encapsulation of congestion notification. | |||
| Table of Contents | Table of Contents | |||
| 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
| 1.1. The Need for Rationalisation . . . . . . . . . . . . . . . 5 | 1.1. The Need for Rationalisation . . . . . . . . . . . . . . . 5 | |||
| 1.2. Document Roadmap . . . . . . . . . . . . . . . . . . . . . 6 | 1.2. Document Roadmap . . . . . . . . . . . . . . . . . . . . . 6 | |||
| 1.3. Scope . . . . . . . . . . . . . . . . . . . . . . . . . . 7 | 1.3. Scope . . . . . . . . . . . . . . . . . . . . . . . . . . 7 | |||
| 2. Requirements Language . . . . . . . . . . . . . . . . . . . . 8 | 2. Requirements Language . . . . . . . . . . . . . . . . . . . . 8 | |||
| 3. Design Constraints . . . . . . . . . . . . . . . . . . . . . . 8 | 3. Design Constraints . . . . . . . . . . . . . . . . . . . . . . 9 | |||
| 3.1. Security Constraints . . . . . . . . . . . . . . . . . . . 8 | 3.1. Security Constraints . . . . . . . . . . . . . . . . . . . 9 | |||
| 3.2. Control Constraints . . . . . . . . . . . . . . . . . . . 10 | 3.2. Control Constraints . . . . . . . . . . . . . . . . . . . 11 | |||
| 3.3. Management Constraints . . . . . . . . . . . . . . . . . . 12 | 3.3. Management Constraints . . . . . . . . . . . . . . . . . . 12 | |||
| 4. Design Principles . . . . . . . . . . . . . . . . . . . . . . 12 | 4. Design Principles . . . . . . . . . . . . . . . . . . . . . . 13 | |||
| 4.1. Design Guidelines for New Encapsulations of Congestion | 4.1. Design Guidelines for New Encapsulations of Congestion | |||
| Notification . . . . . . . . . . . . . . . . . . . . . . . 14 | Notification . . . . . . . . . . . . . . . . . . . . . . . 14 | |||
| 5. Default ECN Tunnelling Rules . . . . . . . . . . . . . . . . . 15 | 5. Default ECN Tunnelling Rules . . . . . . . . . . . . . . . . . 16 | |||
| 6. Backward Compatibility . . . . . . . . . . . . . . . . . . . . 16 | 6. Backward Compatibility . . . . . . . . . . . . . . . . . . . . 17 | |||
| 7. Changes from Earlier RFCs . . . . . . . . . . . . . . . . . . 18 | 7. Changes from Earlier RFCs . . . . . . . . . . . . . . . . . . 19 | |||
| 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 19 | 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 20 | |||
| 9. Security Considerations . . . . . . . . . . . . . . . . . . . 19 | 9. Security Considerations . . . . . . . . . . . . . . . . . . . 20 | |||
| 10. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 21 | 10. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 22 | |||
| 11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 22 | 11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 23 | |||
| 12. Comments Solicited . . . . . . . . . . . . . . . . . . . . . . 23 | 12. Comments Solicited . . . . . . . . . . . . . . . . . . . . . . 23 | |||
| 13. References . . . . . . . . . . . . . . . . . . . . . . . . . . 23 | 13. References . . . . . . . . . . . . . . . . . . . . . . . . . . 23 | |||
| 13.1. Normative References . . . . . . . . . . . . . . . . . . . 23 | 13.1. Normative References . . . . . . . . . . . . . . . . . . . 23 | |||
| 13.2. Informative References . . . . . . . . . . . . . . . . . . 23 | 13.2. Informative References . . . . . . . . . . . . . . . . . . 24 | |||
| Appendix A. Why resetting CE on encapsulation harms PCN . . . . . 25 | Editorial Comments . . . . . . . . . . . . . . . . . . . . . . . . | |||
| Appendix B. Contribution to Congestion across a Tunnel . . . . . 26 | Appendix A. Why resetting CE on encapsulation harms PCN . . . . . 26 | |||
| Appendix C. Ideal Decapsulation Rules . . . . . . . . . . . . . . 27 | Appendix B. Contribution to Congestion across a Tunnel . . . . . 27 | |||
| | Appendix C. Comprehensive Decapsulation Rules . . . . . . . . . . 28 | |||
| | C.1. Ways to Introduce the Comprehensive Decapsulation Rules . 31 | |||
| Appendix D. Non-Dependence of Tunnelling on In-path Load | Appendix D. Non-Dependence of Tunnelling on In-path Load | |||
| Regulation . . . . . . . . . . . . . . . . . . . . . 29 | Regulation . . . . . . . . . . . . . . . . . . . . . 32 | |||
| D.1. Dependence of In-Path Load Regulation on Tunnelling . . . 30 | D.1. Dependence of In-Path Load Regulation on Tunnelling . . . 33 | |||
| Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 33 | Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 36 | |||
| Intellectual Property and Copyright Statements . . . . . . . . . . 34 | Intellectual Property and Copyright Statements . . . . . . . . . . 37 | |||
| Changes from previous drafts (to be removed by the RFC Editor) | Changes from previous drafts (to be removed by the RFC Editor) | |||
| From briscoe-01 to ietf-00 (current): | Full text differences between IETF draft versions are available at | |||
| | <http://tools.ietf.org/wg/tsvwg/draft-ietf-tsvwg-ecn-tunnel/>, and | |||
| | between earlier individual draft versions at | |||
| | <http://www.cs.ucl.ac.uk/staff/B.Briscoe/pubs.html#ecn-tunnel> | |||
| | From ietf-00 to ietf-01 (current): | |||
| | * Identified two additional alarm states in the decapsulation | |||
| | rules (Figure 3) if ECT(X) in outer and inner contradict each | |||
| | other. | |||
| | * Altered Comprehensive Decapsulation Rules (Appendix C) so that | |||
| | ECT(0) in the outer no longer overrides ECT(1) in the inner. | |||
| | Used the term 'Comprehensive' instead of 'Ideal'. And | |||
| | considerably updated the text in this appendix. | |||
| | * Added Appendix C.1 to weigh up the various ways the | |||
| | Comprehensive Decapsulation Rules might be introduced. This | |||
| | replaces the previous contradictory statements saying complex | |||
| | backwards compatibility interactions would be introduced while | |||
| | also saying there would be no backwards compatibility issues. | |||
| | * Updated references. | |||
| From briscoe-01 to ietf-00: | ||||
| * Re-wrote Appendix B giving much simpler technique to measure | * Re-wrote Appendix B giving much simpler technique to measure | |||
| contribution to congestion across a tunnel. | contribution to congestion across a tunnel. | |||
| * Added discussion of backward compatibility of the ideal | * Added discussion of backward compatibility of the ideal | |||
| decapsulation scheme in Appendix C | decapsulation scheme in Appendix C | |||
| * Updated references. Minor corrections & clarifications | * Updated references. Minor corrections & clarifications | |||
| throughout. | throughout. | |||
| skipping to change at page 16, line 22 | skipping to change at page 17, line 5 | |||
| encapsulation behaviour MUST only be used if the tunnel ingress is in | encapsulation behaviour MUST only be used if the tunnel ingress is in | |||
| `normal state'. A `compatibility state' with a different | `normal state'. A `compatibility state' with a different | |||
| encapsulation behaviour is also specified in Section 6 for backward | encapsulation behaviour is also specified in Section 6 for backward | |||
| compatibility with legacy tunnel egresses that do not understand ECN. | compatibility with legacy tunnel egresses that do not understand ECN. | |||
| To decapsulate the inner header at the tunnel egress, a compliant | To decapsulate the inner header at the tunnel egress, a compliant | |||
| tunnel egress MUST set the outgoing ECN field to the codepoint at the | tunnel egress MUST set the outgoing ECN field to the codepoint at the | |||
| intersection of the appropriate incoming inner header (row) and outer | intersection of the appropriate incoming inner header (row) and outer | |||
| header (column) in Figure 3. | header (column) in Figure 3. | |||
| +---------------------------------------------+ | +----------------------------------------------+ | |||
| | Incoming Outer Header | | | Incoming Outer Header | | |||
| +---------------------+---------+-----------+-----------+-----------+ | +------------------+---------+------------+------------+----------+ | |||
| | Incoming Inner | Not-ECT | ECT(0) | ECT(1) | CE | | | Incoming Inner | Not-ECT | ECT(0) | ECT(1) | CE | | |||
| | Header | | | | | | | Header | | | | | | |||
| +---------------------+---------+-----------+-----------+-----------+ | +------------------+---------+------------+------------+----------+ | |||
| | Not-ECT | Not-ECT | drop(!!!) | drop(!!!) | drop(!!!) | | | Not-ECT | Not-ECT | drop(!!!) | drop(!!!) | drop(!!!) | | |||
| | ECT(0) | ECT(0) | ECT(0) | ECT(0) | CE | | | ECT(0) | ECT(0) | ECT(0) | ECT(0)(!!!)| CE | | |||
| | ECT(1) | ECT(1) | ECT(1) | ECT(1) | CE | | | ECT(1) | ECT(1) | ECT(1)(!!!)| ECT(1) | CE | | |||
| | CE | CE | CE | CE (!!!) | CE | | | CE | CE | CE | CE (!!!) | CE | | |||
| +---------------------+---------+-----------+-----------+-----------+ | +------------------+---------+------------+------------+----------+ | |||
| | Outgoing Header | | | Outgoing Header | | |||
| +---------------------------------------------+ | +----------------------------------------------+ | |||
| Figure 3: IP in IP Decapsulation | Figure 3: IP in IP Decapsulation | |||
| The exclamation marks '(!!!)' in Figure 3 indicate that this | The exclamation marks '(!!!)' in Figure 3 indicate that this | |||
| combination of inner and outer headers should not be possible if only | combination of inner and outer headers should not be possible if only | |||
| legal transitions have taken place. So, the decapsulator should drop | legal transitions have taken place. So, the decapsulator should drop | |||
| or mark the ECN field as the table specifies, but it MAY also raise | or mark the ECN field as the table in Figure 3 specifies, but it MAY | |||
| an appropriate alarm. It MUST NOT raise an alarm so often that the | also raise an appropriate alarm. It MUST NOT raise an alarm so often | |||
| illegal combinations would amplify into a flood of alarm messages. | that the illegal combinations would amplify into a flood of alarm | |||
| | messages. | |||
| 6. Backward Compatibility | 6. Backward Compatibility | |||
| Note: in RFC3168, a tunnel was in one of two modes: limited | Note: in RFC3168, a tunnel was in one of two modes: limited | |||
| functionality or full functionality. Rather than working with modes | functionality or full functionality. Rather than working with modes | |||
| of the tunnel as a whole, this specification uses the term `state' to | of the tunnel as a whole, this specification uses the term `state' to | |||
| refer separately to the state of each tunnel end point, which is how | refer separately to the state of each tunnel end point, which is how | |||
| implementations have to work. | implementations have to work. | |||
| If one end of an IPsec tunnel is compliant with [RFC4301], the other | If one end of an IPsec tunnel is compliant with [RFC4301], the other | |||
| skipping to change at page 18, line 50 | skipping to change at page 19, line 33 | |||
| 7. Changes from Earlier RFCs | 7. Changes from Earlier RFCs | |||
| The rule that a normal state tunnel ingress MUST copy any ECN field | The rule that a normal state tunnel ingress MUST copy any ECN field | |||
| into the outer header is a change to the ingress behaviour of | into the outer header is a change to the ingress behaviour of | |||
| RFC3168, but it is the same as the rules for IPsec tunnels in | RFC3168, but it is the same as the rules for IPsec tunnels in | |||
| RFC4301. | RFC4301. | |||
| The rules for calculating the outgoing ECN field on decapsulation at | The rules for calculating the outgoing ECN field on decapsulation at | |||
| a tunnel egress are in line with the full functionality mode of ECN | a tunnel egress are in line with the full functionality mode of ECN | |||
| in RFC3168 and with RFC4301, except that neither identified that an | in RFC3168 and with RFC4301, except that neither identified the | |||
| outer header of ECT(1) combined with an inner header of CE was an | following illegal combinations: outer ECT(1) with inner ECT(0) or | |||
| illegal combination. | with CE; outer ECT(0) with inner ECT(1). | |||
| The rules for how a tunnel establishes whether the egress has full | The rules for how a tunnel establishes whether the egress has full | |||
| functionality ECN capabilities are an update to RFC3168. For all the | functionality ECN capabilities are an update to RFC3168. For all the | |||
| typical cases, RFC4301 is not updated by the ECN capability check in | typical cases, RFC4301 is not updated by the ECN capability check in | |||
| this specification, because a typical RFC4301 tunnel ingress will | this specification, because a typical RFC4301 tunnel ingress will | |||
| have already established that it is talking to an RFC4301 tunnel | have already established that it is talking to an RFC4301 tunnel | |||
| egress (e.g. if it uses IKEv2). However, there may be some corner | egress (e.g. if it uses IKEv2). However, there may be some corner | |||
| cases (e.g. manual keying) where an RFC4301 tunnel ingress talks with | cases (e.g. manual keying) where an RFC4301 tunnel ingress talks with | |||
| an egress with limited functionality ECN handling. Strictly, for | an egress with limited functionality ECN handling. Strictly, for | |||
| such corner cases, the requirement to use compatibility mode in this | such corner cases, the requirement to use compatibility mode in this | |||
| skipping to change at page 20, line 41 | skipping to change at page 21, line 24 | |||
| detect if a CE marking had been applied then subsequently removed. | detect if a CE marking had been applied then subsequently removed. | |||
| The source could detect this by weaving a pseudo-random sequence of | The source could detect this by weaving a pseudo-random sequence of | |||
| ECT(0) and ECT(1) values into a stream of packets, which is termed an | ECT(0) and ECT(1) values into a stream of packets, which is termed an | |||
| ECN nonce. By the decapsulation rules in RFC3168 and RFC4301, if the | ECN nonce. By the decapsulation rules in RFC3168 and RFC4301, if the | |||
| inner and outer headers carry contradictory ECT values only the inner | inner and outer headers carry contradictory ECT values only the inner | |||
| header is preserved for onward forwarding. So if a CE marking added | header is preserved for onward forwarding. So if a CE marking added | |||
| to the outer ECN field has been illegally (or accidentally) | to the outer ECN field has been illegally (or accidentally) | |||
| suppressed by a subsequent node in the tunnel, the decapsulator will | suppressed by a subsequent node in the tunnel, the decapsulator will | |||
| revert the ECN field to its value before tampering, hiding all | revert the ECN field to its value before tampering, hiding all | |||
| evidence of the crime from the onward feedback loop. To close this | evidence of the crime from the onward feedback loop. To close this | |||
| loophole, we could have specified that an outer header value of ECT | minor loophole, we could have specified that an outer header value of | |||
| should overwrite a contradictory ECT value in the inner header (for | ECT should overwrite a contradictory ECT value in the inner header. | |||
| how, see the ideal decapsulation rules proposed in Appendix C). But | But currently we choose to keep the 'broken' behaviour defined in | |||
| currently we choose to keep the 'broken' behaviour defined in RFC3168 | RFC3168 & RFC4301 for all the following reasons: | |||
| & RFC4301 for all the following reasons: | ||||
| 1. We wanted to avoid any changes to IPsec tunnelling behaviour; | 1. We wanted to avoid any changes to IPsec tunnelling behaviour; | |||
| 2. Allowing ECT values in the outer header to override the inner | 2. Allowing ECT values in the outer header to override the inner | |||
| header would have increased the bandwidth of the covert channel | header would have increased the bandwidth of the covert channel | |||
| through the egress gateway from 1 to 1.5 bit per datagram, | through the egress gateway from 1 to 1.5 bit per datagram, | |||
| potentially threatening to upset the consensus established in the | potentially threatening to upset the consensus established in the | |||
| security area that says that the bandwidth of this covert channel | security area that says that the bandwidth of this covert channel | |||
| can now be safely managed; | can now be safely managed; | |||
| skipping to change at page 23, line 35 | skipping to change at page 24, line 19 | |||
| [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition | [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition | |||
| of Explicit Congestion Notification (ECN) to IP", | of Explicit Congestion Notification (ECN) to IP", | |||
| RFC 3168, September 2001. | RFC 3168, September 2001. | |||
| [RFC4301] Kent, S. and K. Seo, "Security Architecture for the | [RFC4301] Kent, S. and K. Seo, "Security Architecture for the | |||
| Internet Protocol", RFC 4301, December 2005. | Internet Protocol", RFC 4301, December 2005. | |||
| 13.2. Informative References | 13.2. Informative References | |||
| | [I-D.briscoe-pcn-3-in-1-encoding] | |||
| | Briscoe, B., "PCN 3-State Encoding Extension in a single | |||
| | DSCP", draft-briscoe-pcn-3-in-1-encoding-00 (work in | |||
| | progress), October 2008. | |||
| [I-D.ietf-pcn-architecture] | [I-D.ietf-pcn-architecture] | |||
| Eardley, P., "Pre-Congestion Notification (PCN) | Eardley, P., "Pre-Congestion Notification (PCN) | |||
| Architecture", draft-ietf-pcn-architecture-07 (work in | Architecture", draft-ietf-pcn-architecture-08 (work in | |||
| progress), September 2008. | progress), October 2008. | |||
| | [I-D.ietf-pcn-baseline-encoding] | |||
| | Moncaster, T., Briscoe, B., and M. Menth, "Baseline | |||
| | Encoding and Transport of Pre-Congestion Information", | |||
| | draft-ietf-pcn-baseline-encoding-01 (work in progress), | |||
| | October 2008. | |||
| [I-D.ietf-pcn-marking-behaviour] | [I-D.ietf-pcn-marking-behaviour] | |||
| Eardley, P., "Marking behaviour of PCN-nodes", | Eardley, P., "Marking behaviour of PCN-nodes", | |||
| draft-ietf-pcn-marking-behaviour-00 (work in progress), | draft-ietf-pcn-marking-behaviour-01 (work in progress), | |||
| October 2008. | October 2008. | |||
| [I-D.ietf-pwe3-congestion-frmwk] | [I-D.ietf-pwe3-congestion-frmwk] | |||
| Bryant, S., Davie, B., Martini, L., and E. Rosen, | Bryant, S., Davie, B., Martini, L., and E. Rosen, | |||
| "Pseudowire Congestion Control Framework", | "Pseudowire Congestion Control Framework", | |||
| draft-ietf-pwe3-congestion-frmwk-01 (work in progress), | draft-ietf-pwe3-congestion-frmwk-01 (work in progress), | |||
| May 2008. | May 2008. | |||
| | [I-D.menth-pcn-psdm-encoding] | |||
| | Menth, M., Babiarz, J., Moncaster, T., and B. Briscoe, | |||
| | "PCN Encoding for Packet-Specific Dual Marking (PSDM)", | |||
| | draft-menth-pcn-psdm-encoding-00 (work in progress), | |||
| | July 2008. | |||
| [I-D.moncaster-pcn-3-state-encoding] | [I-D.moncaster-pcn-3-state-encoding] | |||
| Moncaster, T., Briscoe, B., and M. Menth, "A three state | Moncaster, T., Briscoe, B., and M. Menth, "A three state | |||
| extended PCN encoding scheme", | extended PCN encoding scheme", | |||
| draft-moncaster-pcn-3-state-encoding-00 (work in | draft-moncaster-pcn-3-state-encoding-00 (work in | |||
| progress), June 2008. | progress), June 2008. | |||
| [IEEE802.1au] | [IEEE802.1au] | |||
| IEEE, "IEEE Standard for Local and Metropolitan Area | IEEE, "IEEE Standard for Local and Metropolitan Area | |||
| Networks--Virtual Bridged Local Area Networks - Amendment | Networks--Virtual Bridged Local Area Networks - Amendment | |||
| 10: Congestion Notification", 2008, | 10: Congestion Notification", 2008, | |||
| <http://www.ieee802.org/1/pages/802.1au.html>. | <http://www.ieee802.org/1/pages/802.1au.html>. | |||
| (Work in Progress; Access Controlled link within page) | (Work in Progress; Access Controlled link within page) | |||
| [ITU-T.I.371] | [ITU-T.I.371] | |||
| ITU-T, "Traffic Control and Congestion Control in | ITU-T, "Traffic Control and Congestion Control in B-ISDN", | |||
| {B-ISDN}", ITU-T Rec. I.371 (03/04), March 2004. | ITU-T Rec. I.371 (03/04), March 2004. | |||
| [PCNcharter] | [PCNcharter] | |||
| IETF, "Congestion and Pre-Congestion Notification (pcn)", | IETF, "Congestion and Pre-Congestion Notification (pcn)", | |||
| IETF w-g charter , Feb 2007, | IETF w-g charter , Feb 2007, | |||
| <http://www.ietf.org/html.charters/pcn-charter.html>. | <http://www.ietf.org/html.charters/pcn-charter.html>. | |||
| [Patterns_Arch] | [Patterns_Arch] | |||
| Day, J., "Patterns in Network Architecture: A Return to | Day, J., "Patterns in Network Architecture: A Return to | |||
| Fundamentals", Pub: Prentice Hall ISBN-13: 9780132252423, | Fundamentals", Pub: Prentice Hall ISBN-13: 9780132252423, | |||
| Jan 2008. | Jan 2008. | |||
| skipping to change at page 25, line 31 | skipping to change at page 26, line 31 | |||
| [RFC5129] Davie, B., Briscoe, B., and J. Tay, "Explicit Congestion | [RFC5129] Davie, B., Briscoe, B., and J. Tay, "Explicit Congestion | |||
| Marking in MPLS", RFC 5129, January 2008. | Marking in MPLS", RFC 5129, January 2008. | |||
| [Shayman] "Using ECN to Signal Congestion Within an MPLS Domain", | [Shayman] "Using ECN to Signal Congestion Within an MPLS Domain", | |||
| 2000, <http://www.ee.umd.edu/~shayman/papers.d/ | 2000, <http://www.ee.umd.edu/~shayman/papers.d/ | |||
| draft-shayman-mpls-ecn-00.txt>. | draft-shayman-mpls-ecn-00.txt>. | |||
| (Expired) | (Expired) | |||
| | Editorial Comments | |||
| | [Note_Nonce_Compr] Note that even the tentatively proposed | |||
| | Comprehensive Decapsulation Rules in Appendix C | |||
| | do not fix the minor compromise to the protection | |||
| | of the ECN nonce that RFC3168 and RFC4301 both | |||
| | suffer from (described under Security | |||
| | Considerations above). An attacker with control | |||
| | over a tunnel interior node can revert a packet | |||
| | previously marked CE within the same tunnel to | |||
| | its original marking. It can do this by changing | |||
| | CE markings to ECT(0) because the decapsulator | |||
| | rules give precedence to the inner header if the | |||
| | outer is ECT(0). To fix this, we could have | |||
| | specified that the outgoing header should be | |||
| | ECT(0) when the incoming outer is ECT(0) but the | |||
| | inner is ECT(1). Although this would close the | |||
| | minor loophole in the nonce, it would raise a | |||
| | minor safety issue if multilevel ECN or PCN were | |||
| | used. A less severe marking in the inner header | |||
| | would override a more severe one in the outer. | |||
| | Both are corner cases so it is difficult to | |||
| | decide which is more important: i) the loophole | |||
| | in the nonce is only for a minor case of one | |||
| | tunnel node attacking another in the same tunnel; | |||
| | and ii) the severity inversion would not result | |||
| | from any legal codepoint transition. If the | |||
| | Comprehensive Decapsulation Rules of Appendix C | |||
| | are taken up, we currently believe i) safety | |||
| | against misconfiguration is slightly more | |||
| | important than ii) securing against an attack | |||
| | that has little, if any, clear motivation. | |||
| Appendix A. Why resetting CE on encapsulation harms PCN | Appendix A. Why resetting CE on encapsulation harms PCN | |||
| Regarding encapsulation, the section of the PCN architecture | Regarding encapsulation, the section of the PCN architecture | |||
| [I-D.ietf-pcn-architecture] on tunnelling says that header copying | [I-D.ietf-pcn-architecture] on tunnelling says that header copying | |||
| (RFC4301) allows PCN to work correctly. However, resetting CE | (RFC4301) allows PCN to work correctly. Whereas resetting CE | |||
| markings confuses PCN marking. | markings confuses PCN marking. | |||
| The specific issue here concerns PCN excess rate marking | The specific issue here concerns PCN excess rate marking | |||
| [I-D.ietf-pcn-marking-behaviour], i.e. the bulk marking of traffic | [I-D.ietf-pcn-marking-behaviour], i.e. the bulk marking of traffic | |||
| that exceeds a configured threshold rate. One of the goals of excess | that exceeds a configured threshold rate. One of the goals of excess | |||
| rate marking is to enable the speedy removal of excess admission | rate marking is to enable the speedy removal of excess admission | |||
| controlled traffic following re-routes caused by link failures or | controlled traffic following re-routes caused by link failures or | |||
| other disasters. This maintains a share of the capacity for | other disasters. This maintains a share of the capacity for | |||
| competing admission controlled traffic and for traffic in lower | competing admission controlled traffic and for traffic in lower | |||
| priority classes. After failures, traffic re-routed onto remaining | priority classes. After failures, traffic re-routed onto remaining | |||
| skipping to change at page 27, line 17 | skipping to change at page 28, line 37 | |||
| | 30 | | | | 30 | | | |||
| | | | The large square | | | | The large square | |||
| | +---------+p_t represents 100 packets | | +---------+p_t represents 100 packets | |||
| | | 12 | | | | 12 | | |||
| +-----+---------+0 | +-----+---------+0 | |||
| 0 30% 100% | 0 30% 100% | |||
| inner header marking | inner header marking | |||
| Figure 4: Tunnel Marking of Packets Already Marked at Ingress | Figure 4: Tunnel Marking of Packets Already Marked at Ingress | |||
| Appendix C. Ideal Decapsulation Rules | Appendix C. Comprehensive Decapsulation Rules | |||
| This appendix is not normative. Compliance with this appendix is NOT | ||||
| REQUIRED for compliance with the present specification. | ||||
| If the default ECN encapsulation behaviour does not offer suitable | This appendix is not currently normative. Compliance with this | |||
| trade offs, procedures exist for associating a new behaviour with a | appendix is NOT REQUIRED for compliance with the present | |||
| new Diffserv PHB. However, it is unrealistic to expect vendors of | specification. | |||
| all IPSec and all IP in IP tunnel endpoints to cater for the | ||||
| exceptional behaviour of PHB XXX. If all tunnels did require XXX- | ||||
| specific behaviour, the resulting patchy and error-prone deployment | ||||
| would probably cause XXX to suffer byzantine feature interactions | ||||
| with poorly implemented tunnels. The default rules for tunnel | ||||
| endpoints to handle both the Diffserv field and the ECN field should | ||||
| 'just work' when handling packets with an XXX Diffserv codepoint. | ||||
| Given this specification requests a standards action to update the | Given this specification requests a standards action to update the | |||
| RFC3168 encapsulation behaviour, this appendix explores a further | RFC3168 encapsulation behaviour, this appendix explores a further | |||
| change to decapsulation that we ought to specify at the same time. | change to decapsulation that we ought to specify at the same time. | |||
| If instead this further change is added later, it will add another | If instead this further change is added later, it will add another | |||
| set of backward compatibility combinations to the already complicated | optional mode to the already complicated change history of ECN | |||
| change history of ECN tunnelling. | tunnelling. | |||
| Multi-level congestion notification is currently on the IETF's | Multi-level congestion notification is currently on the IETF's | |||
| standards track agenda in the Congestion and Pre-Congestion | standards track agenda in the Congestion and Pre-Congestion | |||
| Notification (PCN) working group. The PCN working group requires | Notification (PCN) working group. The PCN working group eventually | |||
| three congestion states (not marked and two levels of congestion | requires three congestion states (not marked and two increasingly | |||
| marking) [I-D.ietf-pcn-architecture]. The aim is for the first level | severe levels of congestion marking) [I-D.ietf-pcn-architecture]. | |||
| of marking to stop admitting new traffic and the second level to | The aim is for the less severe level of marking to stop admitting new | |||
| terminate sufficient existing flows to bring a network back to its | traffic and the more severe level to terminate sufficient existing | |||
| operating point after a serious failure. | flows to bring a network back to its operating point after a serious | |||
| failure. | ||||
| Although the ECN field gives sufficient codepoints for these three | Although the ECN field gives sufficient codepoints for these three | |||
| states, the PCN working group cannot use them in case any tunnel | states, current ECN tunnelling RFCs prevent the PCN working group | |||
| decapsulations occur within a PCN region. If a node in a tunnel sets | from using them in case any tunnel decapsulations occur within a PCN | |||
| the ECN field to ECT(0) or ECT(1), this change will be discarded by a | region (see Appendix A of [I-D.ietf-pcn-baseline-encoding]). If a | |||
| tunnel egress compliant with RFC4301 and RFC3168. This can be seen | node in a tunnel sets the ECN field to ECT(0) or ECT(1), this change | |||
| in Figure 3, where the ECT values in the outer header are ignored | will be discarded by a tunnel egress compliant with RFC4301 or | |||
| unless the inner header is the same. Effectively the ECT(0) and | RFC3168. This can be seen in Figure 3, where the ECT values in the | |||
| ECT(1) codepoints have to be treated as just one codepoint when they | outer header are ignored unless the inner header is the same. | |||
| could otherwise have been used for their intended purpose of | Effectively the ECT(0) and ECT(1) codepoints have to be treated as | |||
| congestion notification. Instead, the PCN w-g has had to propose | just one codepoint when they could otherwise have been used for their | |||
| using extra Diffserv codepoint(s) to encode the extra states | intended purpose of congestion notification. | |||
| [I-D.moncaster-pcn-3-state-encoding], using up the rapidly exhausting | ||||
| DSCP space while leaving ECN codepoints unused. | ||||
| Although this is currently most pressing for the PCN working group, | As a consequence, the PCN w-g has initially confined itself to two | |||
| the issue is more general. Under Security Considerations (Section 9) | encoding states as a baseline encoding | |||
| it has already been explained that a data sender cannot use the | [I-D.ietf-pcn-baseline-encoding]. And it has had to propose an | |||
| experimental ECN nonce [RFC3540] to detect suppression of congestion | experimental extension using extra Diffserv codepoint(s) to encode | |||
| notification along a tunnel. | the extra states [I-D.moncaster-pcn-3-state-encoding], using up the | |||
| rapidly exhausting DSCP space while leaving ECN codepoints unused. | ||||
| Another PCN encoding has been proposed that would survive tunnelling | ||||
| without an extra DSCP [I-D.menth-pcn-psdm-encoding], but it requires | ||||
| the PCN edge gateways to somehow share state so the egress can | ||||
| determine which marking a packet started with at the ingress. Also a | ||||
| PCN ingress node can game the system by initiating packets with | ||||
| inappropriate markings. | ||||
| More generally, the currently standardised tunnel decapsulation | Although this issue is currently most pressing for the PCN working | |||
| behaviour unnecessarily wastes a quarter of two bits (i.e. half a | group, it is more general. The currently standardised tunnel | |||
| bit) in the IP (v4 & v6) header. As explained in Section 3.1, the | decapsulation behaviour unnecessarily wastes a quarter of two bits | |||
| original reason for not copying down outer ECT codepoints for onward | (i.e. half a bit) in the IP (v4 & v6) header. As explained in | |||
| forwarding was to limit the covert channel across a decapsulator to 1 | Section 3.1, the original reason for not copying down outer ECT | |||
| bit per packet. However, now that the IETF Security Area has deemed | codepoints for onward forwarding was to limit the covert channel | |||
| that a 2-bit covert channel through an encapsulator is a manageable | across a decapsulator to 1 bit per packet. However, now that the | |||
| risk, the same should be true for a decapsulator. | IETF Security Area has deemed that a 2-bit covert channel through an | |||
| encapsulator is a manageable risk, the same should be true for a | ||||
| decapsulator. | ||||
| Figure 5 proposes a more ideal layered decapsulation behaviour. | Figure 5 proposes a more comprehensive layered decapsulation | |||
| Note: this table is only to support discussion. It is not currently | behaviour that would properly support simpler experimental 3-state | |||
| proposed for standards action. The only difference from Figure 3 | ECN encodings such as | |||
| (that is proposed for standards action), is the swapping of the cells | [I-D.briscoe-pcn-3-in-1-encoding].[Note_Nonce_Compr] Note that the | |||
| highlighted as *ECT(X)*. | proposal tabulated in Figure 5 is only to support discussion. It is | |||
| not currently proposed for standards action. The only difference | ||||
| from Figure 3 (which _is_ proposed for standards action) is the | ||||
| change to the cell highlighted as *ECT(1)*. | ||||
| +---------------------------------------------+ | +----------------------------------------------+ | |||
| | Incoming Outer Header | | | Incoming Outer Header | | |||
| +---------------------+---------+-----------+-----------+-----------+ | +------------------+---------+------------+------------+----------+ | |||
| | Incoming Inner | Not-ECT | ECT(0) | ECT(1) | CE | | | Incoming Inner | Not-ECT | ECT(0) | ECT(1) | CE | | |||
| | Header | | | | | | | Header | | | | | | |||
| +---------------------+---------+-----------+-----------+-----------+ | +------------------+---------+------------+------------+----------+ | |||
| | Not-ECT | Not-ECT | drop(!!!) | drop(!!!) | drop(!!!) | | | Not-ECT | Not-ECT | drop(!!!) | drop(!!!) | drop(!!!) | | |||
| | ECT(0) | ECT(0) | ECT(0) | *ECT(1)* | CE | | | ECT(0) | ECT(0) | ECT(0) | *ECT(1)* | CE | | |||
| | ECT(1) | ECT(1) | *ECT(0)* | ECT(1) | CE | | | ECT(1) | ECT(1) | ECT(1)(!!!)| ECT(1) | CE | | |||
| | CE | CE | CE | CE (!!!) | CE | | | CE | CE | CE | CE (!!!) | CE | | |||
| +---------------------+---------+-----------+-----------+-----------+ | +------------------+---------+------------+------------+----------+ | |||
| | Outgoing Header | | | Outgoing Header | | |||
| +---------------------------------------------+ | +----------------------------------------------+ | |||
| Figure 5: Ideal IP in IP Decapsulation (currently informative, not | Figure 5: Comprehensive IP in IP Decapsulation (currently | |||
| normative) | informative, not normative) | |||
| Note that, if this ideal proposal were taken up, a tunnel egress | The table is derived from the following logic: | |||
| complying with it would be backwards compatible with all previous | ||||
| specifications for encapsulation of ECN at the ingress (RFC4301, both | o On decapsulation, if the inner ECN field is Not-ECT but the outer | |||
| modes of RFC3168, both modes of RFC2481 and RFC2003). In comparison | ECN field is anything but Not-ECT the decapsulator must drop the | |||
| with an RFC3168 or RFC4301 tunnel egress, it would require no | packet. This is because the Not-ECT marking on the inner header | |||
| additional configuration at the ingress nor any additional | is set by transports that do not know how to respond to an | |||
| negotiation with the ingress. The only new issue would be the burden | explicit congestion marking; | |||
| of an extra standard to be compliant with, adding to the already | ||||
| complex history of ECN tunnelling RFCs. | o In all other cases, the outgoing ECN field is set to the more | |||
| severe marking of the outer and inner ECN fields, where the | ||||
| ranking of severity from highest to lowest is CE, ECT(1), ECT(0), | ||||
| Not-ECT; | ||||
| o There are cases where no legal transition in any current or | ||||
| previous ECN tunnelling specification would result in certain | ||||
| combinations of inner and outer ECN fields. In these cases | ||||
| (indicated in the table by '(!!!)'), the decapsulator may also | ||||
| raise an alarm, but not so often that the illegal combinations | ||||
| would amplify into a flood of alarm messages. | ||||
| If this more comprehensive decapsulation proposal were taken up, it | ||||
| would be backwards compatible with all previous encapsulations of ECN | ||||
| at the ingress (RFC4301, both modes of RFC3168, both modes of RFC2481 | ||||
| and RFC2003). The outgoing header is different for one combination | ||||
| of inner & outer headers, but that combination was previously illegal | ||||
| anyway, so no known mechanisms in the Internet rely on the previous | ||||
| behaviour. The proposed tunnel egress requires no additional option | ||||
| configuration at the ingress or egress nor any additional negotiation | ||||
| with the ingress. | ||||
| C.1. Ways to Introduce the Comprehensive Decapsulation Rules | ||||
| There would be a number of ways for this more comprehensive | ||||
| decapsulation proposal to be introduced: | ||||
| o It could be specified in the present standards track proposal | ||||
| (preferred) or in an experimental extension; | ||||
| o it could be specified as a new default for all Diffserv PHBs | ||||
| (preferred) or as an option to be configured only for Diffserv | ||||
| PHBs requiring it. | ||||
| The argument for making this change now, rather than in a separate | ||||
| experimental extension, is to avoid the burden of an extra standard | ||||
| to be compliant with and to be backwards compatible with--so we don't | ||||
| add to the already complex history of ECN tunnelling RFCs. The | ||||
| argument for a separate experimental extension is that we may never | ||||
| need this change (if PCN is never successfully deployed and if no-one | ||||
| ever needs three ECN or PCN encoding states rather than two). | ||||
| However, the change does no harm to existing mechanisms and stops | ||||
| tunnels wasting a quarter of a 2-bit codepoint (i.e. half a bit). | ||||
| The argument for making this new decapsulation behaviour the default | ||||
| for all PHBs is that it doesn't change any expected behaviour that | ||||
| existing mechanisms rely on already. Also, by ending the present | ||||
| waste of a codepoint, in the future a use of that codepoint could be | ||||
| proposed for all PHBs, even if PCN isn't successfully deployed. | ||||
| In practice, if this comprehensive decapsulation was specified | ||||
| straightaway as the normative default for all PHBs, a network | ||||
| operator deploying 3-state PCN would be able to request that tunnels | ||||
| comply with the latest specification. Implementers of non-PCN | ||||
| tunnels would not need to comply but, if they did, their code would | ||||
| be future proofed and no harm would be done to legacy operations. | ||||
| Therefore, rather than branching their code base, it would be easiest | ||||
| for implementers to make all their new tunnel code comply with this | ||||
| specification, whether or not it was for PCN. But they could leave | ||||
| old code untouched, unless it was for PCN. | ||||
| The alternatives are worse. Implementers would otherwise have to | ||||
| provide configurable decapsulation options and operators would have | ||||
| to configure all IPsec and IP in IP tunnel endpoints for the | ||||
| exceptional behaviour of certain PHBs. The rules for tunnel | ||||
| endpoints to handle both the Diffserv field and the ECN field should | ||||
| 'just work' when handling packets with any Diffserv codepoint. | ||||
| Appendix D. Non-Dependence of Tunnelling on In-path Load Regulation | Appendix D. Non-Dependence of Tunnelling on In-path Load Regulation | |||
| We have said that at any point in a network, the Congestion Baseline | We have said that at any point in a network, the Congestion Baseline | |||
| (where congestion notification starts from zero) should be the | (where congestion notification starts from zero) should be the | |||
| previous upstream Load Regulator. We have also said that the ingress | previous upstream Load Regulator. We have also said that the ingress | |||
| of an IP in IP tunnel must copy congestion indications to the | of an IP in IP tunnel must copy congestion indications to the | |||
| encapsulating outer headers it creates. If the Load Regulator is in- | encapsulating outer headers it creates. If the Load Regulator is in- | |||
| path rather than at the source, and also a tunnel ingress, these two | path rather than at the source, and also a tunnel ingress, these two | |||
| requirements seem to be contradictory. A tunnel ingress must not | requirements seem to be contradictory. A tunnel ingress must not | |||
| End of changes. 41 change blocks. | ||||
| 117 lines changed or deleted | 260 lines changed or added | |||
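
The decapsulation logic that the -01 text gives for Figure 5 (drop when the inner header is Not-ECT but the outer is not, otherwise forward the more severe of the two markings) can be sketched in a few lines. The following Python is only an illustration under stated assumptions, not part of either draft: the names `ECN`, `ALARM_COMBINATIONS` and `decapsulate` are hypothetical, and the enum values are severity ranks chosen for this sketch, not the on-the-wire ECN codepoints.

```python
from enum import IntEnum
from typing import Optional, Tuple


class ECN(IntEnum):
    """ECN markings ordered by severity (lowest to highest).

    Assumption for this sketch: the integer values are ranks only,
    not the 2-bit codepoints carried in the IP header.
    """
    NOT_ECT = 0
    ECT0 = 1
    ECT1 = 2
    CE = 3


# (inner, outer) combinations marked '(!!!)' in the right-hand Figure 5
# that are forwarded rather than dropped.
ALARM_COMBINATIONS = {
    (ECN.ECT1, ECN.ECT0),
    (ECN.CE, ECN.ECT1),
}


def decapsulate(inner: ECN, outer: ECN) -> Tuple[Optional[ECN], bool]:
    """Return (outgoing ECN field or None if dropped, may_raise_alarm)."""
    # Inner Not-ECT with any other outer marking: drop the packet,
    # because the transport cannot respond to explicit congestion marking.
    if inner is ECN.NOT_ECT and outer is not ECN.NOT_ECT:
        return None, True
    # Otherwise forward the more severe of the two markings.
    outgoing = max(inner, outer)
    return outgoing, (inner, outer) in ALARM_COMBINATIONS


# Example: the cell highlighted as *ECT(1)* in Figure 5.
outgoing, alarm = decapsulate(inner=ECN.ECT0, outer=ECN.ECT1)
```

The boolean only signals that the combination is one the draft treats as not resulting from any legal transition; rate-limiting of any resulting alarms, as the text requires, is left out of the sketch.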