draft-ietf-tsvwg-ecn-tunnel-00.txt | draft-ietf-tsvwg-ecn-tunnel-01.txt | |||
---|---|---|---|---|
Transport Area Working Group B. Briscoe | Transport Area Working Group B. Briscoe | |||
Internet-Draft BT | Internet-Draft BT | |||
Intended status: Standards Track Oct 16, 2008 | Intended status: Standards Track Oct 27, 2008 | |||
Expires: April 19, 2009 | Expires: April 30, 2009 | |||
Layered Encapsulation of Congestion Notification | Layered Encapsulation of Congestion Notification | |||
draft-ietf-tsvwg-ecn-tunnel-00 | draft-ietf-tsvwg-ecn-tunnel-01 | |||
Status of this Memo | Status of this Memo | |||
By submitting this Internet-Draft, each author represents that any | By submitting this Internet-Draft, each author represents that any | |||
applicable patent or other IPR claims of which he or she is aware | applicable patent or other IPR claims of which he or she is aware | |||
have been or will be disclosed, and any of which he or she becomes | have been or will be disclosed, and any of which he or she becomes | |||
aware will be disclosed, in accordance with Section 6 of BCP 79. | aware will be disclosed, in accordance with Section 6 of BCP 79. | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF), its areas, and its working groups. Note that | Task Force (IETF), its areas, and its working groups. Note that | |||
skipping to change at page 1, line 34 | skipping to change at page 1, line 34 | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
http://www.ietf.org/ietf/1id-abstracts.txt. | http://www.ietf.org/ietf/1id-abstracts.txt. | |||
The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
http://www.ietf.org/shadow.html. | http://www.ietf.org/shadow.html. | |||
This Internet-Draft will expire on April 19, 2009. | This Internet-Draft will expire on April 30, 2009. | |||
Abstract | Abstract | |||
This document redefines how the explicit congestion notification | This document redefines how the explicit congestion notification | |||
(ECN) field of the outer IP header of a tunnel should be constructed. | (ECN) field of the outer IP header of a tunnel should be constructed. | |||
It brings all IP in IP tunnels (v4 or v6) into line with the way | It brings all IP in IP tunnels (v4 or v6) into line with the way | |||
IPsec tunnels now construct the ECN field. It includes a thorough | IPsec tunnels now construct the ECN field. It includes a thorough | |||
analysis of the reasoning for this change and the implications. It | analysis of the reasoning for this change and the implications. It | |||
also gives guidelines on the encapsulation of IP congestion | also gives guidelines on the encapsulation of IP congestion | |||
notification by any outer header, whether encapsulated in an IP | notification by any outer header, whether encapsulated in an IP | |||
skipping to change at page 2, line 12 | skipping to change at page 2, line 12 | |||
help interworking, if the IETF or other standards bodies specify any | help interworking, if the IETF or other standards bodies specify any | |||
new encapsulation of congestion notification. | new encapsulation of congestion notification. | |||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
1.1. The Need for Rationalisation . . . . . . . . . . . . . . . 5 | 1.1. The Need for Rationalisation . . . . . . . . . . . . . . . 5 | |||
1.2. Document Roadmap . . . . . . . . . . . . . . . . . . . . . 6 | 1.2. Document Roadmap . . . . . . . . . . . . . . . . . . . . . 6 | |||
1.3. Scope . . . . . . . . . . . . . . . . . . . . . . . . . . 7 | 1.3. Scope . . . . . . . . . . . . . . . . . . . . . . . . . . 7 | |||
2. Requirements Language . . . . . . . . . . . . . . . . . . . . 8 | 2. Requirements Language . . . . . . . . . . . . . . . . . . . . 8 | |||
3. Design Constraints . . . . . . . . . . . . . . . . . . . . . . 8 | 3. Design Constraints . . . . . . . . . . . . . . . . . . . . . . 9 | |||
3.1. Security Constraints . . . . . . . . . . . . . . . . . . . 8 | 3.1. Security Constraints . . . . . . . . . . . . . . . . . . . 9 | |||
3.2. Control Constraints . . . . . . . . . . . . . . . . . . . 10 | 3.2. Control Constraints . . . . . . . . . . . . . . . . . . . 11 | |||
3.3. Management Constraints . . . . . . . . . . . . . . . . . . 12 | 3.3. Management Constraints . . . . . . . . . . . . . . . . . . 12 | |||
4. Design Principles . . . . . . . . . . . . . . . . . . . . . . 12 | 4. Design Principles . . . . . . . . . . . . . . . . . . . . . . 13 | |||
4.1. Design Guidelines for New Encapsulations of Congestion | 4.1. Design Guidelines for New Encapsulations of Congestion | |||
Notification . . . . . . . . . . . . . . . . . . . . . . . 14 | Notification . . . . . . . . . . . . . . . . . . . . . . . 14 | |||
5. Default ECN Tunnelling Rules . . . . . . . . . . . . . . . . . 15 | 5. Default ECN Tunnelling Rules . . . . . . . . . . . . . . . . . 16 | |||
6. Backward Compatibility . . . . . . . . . . . . . . . . . . . . 16 | 6. Backward Compatibility . . . . . . . . . . . . . . . . . . . . 17 | |||
7. Changes from Earlier RFCs . . . . . . . . . . . . . . . . . . 18 | 7. Changes from Earlier RFCs . . . . . . . . . . . . . . . . . . 19 | |||
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 19 | 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 20 | |||
9. Security Considerations . . . . . . . . . . . . . . . . . . . 19 | 9. Security Considerations . . . . . . . . . . . . . . . . . . . 20 | |||
10. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 21 | 10. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 22 | |||
11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 22 | 11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 23 | |||
12. Comments Solicited . . . . . . . . . . . . . . . . . . . . . . 23 | 12. Comments Solicited . . . . . . . . . . . . . . . . . . . . . . 23 | |||
13. References . . . . . . . . . . . . . . . . . . . . . . . . . . 23 | 13. References . . . . . . . . . . . . . . . . . . . . . . . . . . 23 | |||
13.1. Normative References . . . . . . . . . . . . . . . . . . . 23 | 13.1. Normative References . . . . . . . . . . . . . . . . . . . 23 | |||
13.2. Informative References . . . . . . . . . . . . . . . . . . 23 | 13.2. Informative References . . . . . . . . . . . . . . . . . . 24 | |||
Appendix A. Why resetting CE on encapsulation harms PCN . . . . . 25 | Editorial Comments . . . . . . . . . . . . . . . . . . . . . . . . | |||
Appendix B. Contribution to Congestion across a Tunnel . . . . . 26 | Appendix A. Why resetting CE on encapsulation harms PCN . . . . . 26 | |||
Appendix C. Ideal Decapsulation Rules . . . . . . . . . . . . . . 27 | Appendix B. Contribution to Congestion across a Tunnel . . . . . 27 | |||
Appendix C. Comprehensive Decapsulation Rules . . . . . . . . . . 28 | ||||
C.1. Ways to Introduce the Comprehensive Decapsulation Rules . 31 | ||||
Appendix D. Non-Dependence of Tunnelling on In-path Load | Appendix D. Non-Dependence of Tunnelling on In-path Load | |||
Regulation . . . . . . . . . . . . . . . . . . . . . 29 | Regulation . . . . . . . . . . . . . . . . . . . . . 32 | |||
D.1. Dependence of In-Path Load Regulation on Tunnelling . . . 30 | D.1. Dependence of In-Path Load Regulation on Tunnelling . . . 33 | |||
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 33 | Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 36 | |||
Intellectual Property and Copyright Statements . . . . . . . . . . 34 | Intellectual Property and Copyright Statements . . . . . . . . . . 37 | |||
Changes from previous drafts (to be removed by the RFC Editor) | Changes from previous drafts (to be removed by the RFC Editor) | |||
From briscoe-01 to ietf-00 (current): | Full text differences between IETF draft versions are available at | |||
<http://tools.ietf.org/wg/tsvwg/draft-ietf-tsvwg-ecn-tunnel/>, and | ||||
between earlier individual draft versions at | ||||
<http://www.cs.ucl.ac.uk/staff/B.Briscoe/pubs.html#ecn-tunnel> | ||||
From ietf-00 to ietf-01 (current): | ||||
* Identified two additional alarm states in the decapsulation | ||||
rules (Figure 3) if ECT(X) in outer and inner contradict each | ||||
other. | ||||
* Altered Comprehensive Decapsulation Rules (Appendix C) so that | ||||
ECT(0) in the outer no longer overrides ECT(1) in the inner. | ||||
Used the term 'Comprehensive' instead of 'Ideal'. And | ||||
considerably updated the text in this appendix. | ||||
* Added Appendix C.1 to weigh up the various ways the | ||||
Comprehensive Decapsulation Rules might be introduced. This | ||||
replaces the previous contradictory statements saying complex | ||||
backwards compatibility interactions would be introduced while | ||||
also saying there would be no backwards compatibility issues. | ||||
* Updated references. | ||||
From briscoe-01 to ietf-00: | ||||
* Re-wrote Appendix B giving much simpler technique to measure | * Re-wrote Appendix B giving much simpler technique to measure | |||
contribution to congestion across a tunnel. | contribution to congestion across a tunnel. | |||
* Added discussion of backward compatibility of the ideal | * Added discussion of backward compatibility of the ideal | |||
decapsulation scheme in Appendix C | decapsulation scheme in Appendix C | |||
* Updated references. Minor corrections & clarifications | * Updated references. Minor corrections & clarifications | |||
throughout. | throughout. | |||
skipping to change at page 16, line 22 | skipping to change at page 17, line 5 | |||
encapsulation behaviour MUST only be used if the tunnel ingress is in | encapsulation behaviour MUST only be used if the tunnel ingress is in | |||
`normal state'. A `compatibility state' with a different | `normal state'. A `compatibility state' with a different | |||
encapsulation behaviour is also specified in Section 6 for backward | encapsulation behaviour is also specified in Section 6 for backward | |||
compatibility with legacy tunnel egresses that do not understand ECN. | compatibility with legacy tunnel egresses that do not understand ECN. | |||
To decapsulate the inner header at the tunnel egress, a compliant | To decapsulate the inner header at the tunnel egress, a compliant | |||
tunnel egress MUST set the outgoing ECN field to the codepoint at the | tunnel egress MUST set the outgoing ECN field to the codepoint at the | |||
intersection of the appropriate incoming inner header (row) and outer | intersection of the appropriate incoming inner header (row) and outer | |||
header (column) in Figure 3. | header (column) in Figure 3. | |||
+---------------------------------------------+ | +----------------------------------------------+ | |||
| Incoming Outer Header | | | Incoming Outer Header | | |||
+---------------------+---------+-----------+-----------+-----------+ | +------------------+---------+------------+------------+----------+ | |||
| Incoming Inner | Not-ECT | ECT(0) | ECT(1) | CE | | | Incoming Inner | Not-ECT | ECT(0) | ECT(1) | CE | | |||
| Header | | | | | | | Header | | | | | | |||
+---------------------+---------+-----------+-----------+-----------+ | +------------------+---------+------------+------------+----------+ | |||
| Not-ECT | Not-ECT | drop(!!!) | drop(!!!) | drop(!!!) | | | Not-ECT | Not-ECT | drop(!!!) | drop(!!!) | drop(!!!) | | |||
| ECT(0) | ECT(0) | ECT(0) | ECT(0) | CE | | | ECT(0) | ECT(0) | ECT(0) | ECT(0)(!!!)| CE | | |||
| ECT(1) | ECT(1) | ECT(1) | ECT(1) | CE | | | ECT(1) | ECT(1) | ECT(1)(!!!)| ECT(1) | CE | | |||
| CE | CE | CE | CE (!!!) | CE | | | CE | CE | CE | CE (!!!) | CE | | |||
+---------------------+---------+-----------+-----------+-----------+ | +------------------+---------+------------+------------+----------+ | |||
| Outgoing Header | | | Outgoing Header | | |||
+---------------------------------------------+ | +----------------------------------------------+ | |||
Figure 3: IP in IP Decapsulation | Figure 3: IP in IP Decapsulation | |||
The exclamation marks '(!!!)' in Figure 3 indicate that this | The exclamation marks '(!!!)' in Figure 3 indicate that this | |||
combination of inner and outer headers should not be possible if only | combination of inner and outer headers should not be possible if only | |||
legal transitions have taken place. So, the decapsulator should drop | legal transitions have taken place. So, the decapsulator should drop | |||
or mark the ECN field as the table specifies, but it MAY also raise | or mark the ECN field as the table in Figure 3 specifies, but it MAY | |||
an appropriate alarm. It MUST NOT raise an alarm so often that the | also raise an appropriate alarm. It MUST NOT raise an alarm so often | |||
illegal combinations would amplify into a flood of alarm messages. | that the illegal combinations would amplify into a flood of alarm | |||
messages. | ||||
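
The -01 version of Figure 3 can be read as a lookup table. The following is a minimal, non-normative sketch in Python, assuming the right-hand (-01) column of the figure; the names ECN, DECAP_TABLE and decapsulate are illustrative and do not appear in the draft. A result of None stands for "drop", and the boolean records whether the cell carries the '(!!!)' alarm marker. Rate limiting of alarms, which the text requires, is left out of the sketch.

```python
from enum import Enum


class ECN(Enum):
    """The four ECN codepoints, with the bit values defined in RFC 3168."""
    NOT_ECT = 0b00
    ECT1 = 0b01
    ECT0 = 0b10
    CE = 0b11


N, E0, E1, CE = ECN.NOT_ECT, ECN.ECT0, ECN.ECT1, ECN.CE

# Rows are the incoming inner header, columns the incoming outer header,
# in the same order as Figure 3 (-01): Not-ECT, ECT(0), ECT(1), CE.
# Each cell is (outgoing codepoint or None for drop, alarm flag for '(!!!)').
DECAP_TABLE = {
    N:  {N: (N, False),  E0: (None, True), E1: (None, True), CE: (None, True)},
    E0: {N: (E0, False), E0: (E0, False),  E1: (E0, True),   CE: (CE, False)},
    E1: {N: (E1, False), E0: (E1, True),   E1: (E1, False),  CE: (CE, False)},
    CE: {N: (CE, False), E0: (CE, False),  E1: (CE, True),   CE: (CE, False)},
}


def decapsulate(inner, outer):
    """Return (outgoing, alarm) for one packet at a compliant tunnel egress.

    outgoing is None when the table says the packet is dropped. alarm is True
    for combinations that should not arise from legal transitions; the egress
    MAY raise an alarm for these, subject to rate limiting.
    """
    return DECAP_TABLE[inner][outer]


if __name__ == "__main__":
    for inner in (N, E0, E1, CE):
        for outer in (N, E0, E1, CE):
            out, alarm = decapsulate(inner, outer)
            label = "drop" if out is None else out.name
            print(f"inner={inner.name:8} outer={outer.name:8} -> {label}"
                  f"{' (!!!)' if alarm else ''}")
```

Keeping the table explicit, rather than collapsing it into conditionals, makes it easy to compare cell by cell against the figure when the rules change between draft versions.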
6. Backward Compatibility | 6. Backward Compatibility | |||
Note: in RFC3168, a tunnel was in one of two modes: limited | Note: in RFC3168, a tunnel was in one of two modes: limited | |||
functionality or full functionality. Rather than working with modes | functionality or full functionality. Rather than working with modes | |||
of the tunnel as a whole, this specification uses the term `state' to | of the tunnel as a whole, this specification uses the term `state' to | |||
refer separately to the state of each tunnel end point, which is how | refer separately to the state of each tunnel end point, which is how | |||
implementations have to work. | implementations have to work. | |||
If one end of an IPsec tunnel is compliant with [RFC4301], the other | If one end of an IPsec tunnel is compliant with [RFC4301], the other | |||
skipping to change at page 18, line 50 | skipping to change at page 19, line 33 | |||
7. Changes from Earlier RFCs | 7. Changes from Earlier RFCs | |||
The rule that a normal state tunnel ingress MUST copy any ECN field | The rule that a normal state tunnel ingress MUST copy any ECN field | |||
into the outer header is a change to the ingress behaviour of | into the outer header is a change to the ingress behaviour of | |||
RFC3168, but it is the same as the rules for IPsec tunnels in | RFC3168, but it is the same as the rules for IPsec tunnels in | |||
RFC4301. | RFC4301. | |||
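
As a non-normative illustration of this change to ingress behaviour, the sketch below contrasts the normal state copying rule with the full functionality encapsulation of RFC3168, which set the outer header to ECT(0) when the inner header carried CE. The function names are hypothetical and the compatibility state of Section 6 is deliberately not modelled.

```python
# ECN codepoints as two-bit values, as defined in RFC 3168.
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11


def encap_normal_state(inner_ecn):
    """Normal state ingress: copy the inner ECN field to the outer header.

    This matches the RFC4301 rule for IPsec tunnels.
    """
    return inner_ecn


def encap_rfc3168_full_functionality(inner_ecn):
    """RFC3168 full functionality ingress, shown only for comparison.

    A CE marking on the inner header is reset to ECT(0) in the outer header;
    every other codepoint is copied.
    """
    return ECT0 if inner_ecn == CE else inner_ecn


if __name__ == "__main__":
    names = {NOT_ECT: "Not-ECT", ECT1: "ECT(1)", ECT0: "ECT(0)", CE: "CE"}
    for inner in (NOT_ECT, ECT0, ECT1, CE):
        print(f"inner {names[inner]:8} "
              f"normal state outer: {names[encap_normal_state(inner)]:8} "
              f"RFC3168 outer: {names[encap_rfc3168_full_functionality(inner)]}")
```

The two functions differ only for an inner header of CE, which is the case Appendix A is concerned with.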
The rules for calculating the outgoing ECN field on decapsulation at | The rules for calculating the outgoing ECN field on decapsulation at | |||
a tunnel egress are in line with the full functionality mode of ECN | a tunnel egress are in line with the full functionality mode of ECN | |||
in RFC3168 and with RFC4301, except that neither identified that an | in RFC3168 and with RFC4301, except that neither identified the | |||
outer header of ECT(1) combined with an inner header of CE was an | following illegal combinations: outer ECT(1) with inner ECT(0) or | |||
illegal combination. | with CE; outer ECT(0) with inner ECT(1). | |||
The rules for how a tunnel establishes whether the egress has full | The rules for how a tunnel establishes whether the egress has full | |||
functionality ECN capabilities are an update to RFC3168. For all the | functionality ECN capabilities are an update to RFC3168. For all the | |||
typical cases, RFC4301 is not updated by the ECN capability check in | typical cases, RFC4301 is not updated by the ECN capability check in | |||
this specification, because a typical RFC4301 tunnel ingress will | this specification, because a typical RFC4301 tunnel ingress will | |||
have already established that it is talking to an RFC4301 tunnel | have already established that it is talking to an RFC4301 tunnel | |||
egress (e.g. if it uses IKEv2). However, there may be some corner | egress (e.g. if it uses IKEv2). However, there may be some corner | |||
cases (e.g. manual keying) where an RFC4301 tunnel ingress talks with | cases (e.g. manual keying) where an RFC4301 tunnel ingress talks with | |||
an egress with limited functionality ECN handling. Strictly, for | an egress with limited functionality ECN handling. Strictly, for | |||
such corner cases, the requirement to use compatibility mode in this | such corner cases, the requirement to use compatibility mode in this | |||
skipping to change at page 20, line 41 | skipping to change at page 21, line 24 | |||
detect if a CE marking had been applied then subsequently removed. | detect if a CE marking had been applied then subsequently removed. | |||
The source could detect this by weaving a pseudo-random sequence of | The source could detect this by weaving a pseudo-random sequence of | |||
ECT(0) and ECT(1) values into a stream of packets, which is termed an | ECT(0) and ECT(1) values into a stream of packets, which is termed an | |||
ECN nonce. By the decapsulation rules in RFC3168 and RFC4301, if the | ECN nonce. By the decapsulation rules in RFC3168 and RFC4301, if the | |||
inner and outer headers carry contradictory ECT values only the inner | inner and outer headers carry contradictory ECT values only the inner | |||
header is preserved for onward forwarding. So if a CE marking added | header is preserved for onward forwarding. So if a CE marking added | |||
to the outer ECN field has been illegally (or accidentally) | to the outer ECN field has been illegally (or accidentally) | |||
suppressed by a subsequent node in the tunnel, the decapsulator will | suppressed by a subsequent node in the tunnel, the decapsulator will | |||
revert the ECN field to its value before tampering, hiding all | revert the ECN field to its value before tampering, hiding all | |||
evidence of the crime from the onward feedback loop. To close this | evidence of the crime from the onward feedback loop. To close this | |||
loophole, we could have specified that an outer header value of ECT | minor loophole, we could have specified that an outer header value of | |||
should overwrite a contradictory ECT value in the inner header (for | ECT should overwrite a contradictory ECT value in the inner header. | |||
how, see the ideal decapsulation rules proposed in Appendix C). But | But currently we choose to keep the 'broken' behaviour defined in | |||
currently we choose to keep the 'broken' behaviour defined in RFC3168 | RFC3168 & RFC4301 for all the following reasons: | |||
& RFC4301 for all the following reasons: | ||||
1. We wanted to avoid any changes to IPsec tunnelling behaviour; | 1. We wanted to avoid any changes to IPsec tunnelling behaviour; | |||
2. Allowing ECT values in the outer header to override the inner | 2. Allowing ECT values in the outer header to override the inner | |||
header would have increased the bandwidth of the covert channel | header would have increased the bandwidth of the covert channel | |||
through the egress gateway from 1 to 1.5 bits per datagram, | through the egress gateway from 1 to 1.5 bits per datagram, | |||
potentially threatening to upset the consensus established in the | potentially threatening to upset the consensus established in the | |||
security area that says that the bandwidth of this covert channel | security area that says that the bandwidth of this covert channel | |||
can now be safely managed; | can now be safely managed; | |||
skipping to change at page 23, line 35 | skipping to change at page 24, line 19 | |||
[RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition | [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition | |||
of Explicit Congestion Notification (ECN) to IP", | of Explicit Congestion Notification (ECN) to IP", | |||
RFC 3168, September 2001. | RFC 3168, September 2001. | |||
[RFC4301] Kent, S. and K. Seo, "Security Architecture for the | [RFC4301] Kent, S. and K. Seo, "Security Architecture for the | |||
Internet Protocol", RFC 4301, December 2005. | Internet Protocol", RFC 4301, December 2005. | |||
13.2. Informative References | 13.2. Informative References | |||
[I-D.briscoe-pcn-3-in-1-encoding] | ||||
Briscoe, B., "PCN 3-State Encoding Extension in a single | ||||
DSCP", draft-briscoe-pcn-3-in-1-encoding-00 (work in | ||||
progress), October 2008. | ||||
[I-D.ietf-pcn-architecture] | [I-D.ietf-pcn-architecture] | |||
Eardley, P., "Pre-Congestion Notification (PCN) | Eardley, P., "Pre-Congestion Notification (PCN) | |||
Architecture", draft-ietf-pcn-architecture-07 (work in | Architecture", draft-ietf-pcn-architecture-08 (work in | |||
progress), September 2008. | progress), October 2008. | |||
[I-D.ietf-pcn-baseline-encoding] | ||||
Moncaster, T., Briscoe, B., and M. Menth, "Baseline | ||||
Encoding and Transport of Pre-Congestion Information", | ||||
draft-ietf-pcn-baseline-encoding-01 (work in progress), | ||||
October 2008. | ||||
[I-D.ietf-pcn-marking-behaviour] | [I-D.ietf-pcn-marking-behaviour] | |||
Eardley, P., "Marking behaviour of PCN-nodes", | Eardley, P., "Marking behaviour of PCN-nodes", | |||
draft-ietf-pcn-marking-behaviour-00 (work in progress), | draft-ietf-pcn-marking-behaviour-01 (work in progress), | |||
October 2008. | October 2008. | |||
[I-D.ietf-pwe3-congestion-frmwk] | [I-D.ietf-pwe3-congestion-frmwk] | |||
Bryant, S., Davie, B., Martini, L., and E. Rosen, | Bryant, S., Davie, B., Martini, L., and E. Rosen, | |||
"Pseudowire Congestion Control Framework", | "Pseudowire Congestion Control Framework", | |||
draft-ietf-pwe3-congestion-frmwk-01 (work in progress), | draft-ietf-pwe3-congestion-frmwk-01 (work in progress), | |||
May 2008. | May 2008. | |||
[I-D.menth-pcn-psdm-encoding] | ||||
Menth, M., Babiarz, J., Moncaster, T., and B. Briscoe, | ||||
"PCN Encoding for Packet-Specific Dual Marking (PSDM)", | ||||
draft-menth-pcn-psdm-encoding-00 (work in progress), | ||||
July 2008. | ||||
[I-D.moncaster-pcn-3-state-encoding] | [I-D.moncaster-pcn-3-state-encoding] | |||
Moncaster, T., Briscoe, B., and M. Menth, "A three state | Moncaster, T., Briscoe, B., and M. Menth, "A three state | |||
extended PCN encoding scheme", | extended PCN encoding scheme", | |||
draft-moncaster-pcn-3-state-encoding-00 (work in | draft-moncaster-pcn-3-state-encoding-00 (work in | |||
progress), June 2008. | progress), June 2008. | |||
[IEEE802.1au] | [IEEE802.1au] | |||
IEEE, "IEEE Standard for Local and Metropolitan Area | IEEE, "IEEE Standard for Local and Metropolitan Area | |||
Networks--Virtual Bridged Local Area Networks - Amendment | Networks--Virtual Bridged Local Area Networks - Amendment | |||
10: Congestion Notification", 2008, | 10: Congestion Notification", 2008, | |||
<http://www.ieee802.org/1/pages/802.1au.html>. | <http://www.ieee802.org/1/pages/802.1au.html>. | |||
(Work in Progress; Access Controlled link within page) | (Work in Progress; Access Controlled link within page) | |||
[ITU-T.I.371] | [ITU-T.I.371] | |||
ITU-T, "Traffic Control and Congestion Control in | ITU-T, "Traffic Control and Congestion Control in B-ISDN", | |||
{B-ISDN}", ITU-T Rec. I.371 (03/04), March 2004. | ITU-T Rec. I.371 (03/04), March 2004. | |||
[PCNcharter] | [PCNcharter] | |||
IETF, "Congestion and Pre-Congestion Notification (pcn)", | IETF, "Congestion and Pre-Congestion Notification (pcn)", | |||
IETF w-g charter, Feb 2007, | IETF w-g charter, Feb 2007, | |||
<http://www.ietf.org/html.charters/pcn-charter.html>. | <http://www.ietf.org/html.charters/pcn-charter.html>. | |||
[Patterns_Arch] | [Patterns_Arch] | |||
Day, J., "Patterns in Network Architecture: A Return to | Day, J., "Patterns in Network Architecture: A Return to | |||
Fundamentals", Pub: Prentice Hall ISBN-13: 9780132252423, | Fundamentals", Pub: Prentice Hall ISBN-13: 9780132252423, | |||
Jan 2008. | Jan 2008. | |||
skipping to change at page 25, line 31 | skipping to change at page 26, line 31 | |||
[RFC5129] Davie, B., Briscoe, B., and J. Tay, "Explicit Congestion | [RFC5129] Davie, B., Briscoe, B., and J. Tay, "Explicit Congestion | |||
Marking in MPLS", RFC 5129, January 2008. | Marking in MPLS", RFC 5129, January 2008. | |||
[Shayman] "Using ECN to Signal Congestion Within an MPLS Domain", | [Shayman] "Using ECN to Signal Congestion Within an MPLS Domain", | |||
2000, <http://www.ee.umd.edu/~shayman/papers.d/ | 2000, <http://www.ee.umd.edu/~shayman/papers.d/ | |||
draft-shayman-mpls-ecn-00.txt>. | draft-shayman-mpls-ecn-00.txt>. | |||
(Expired) | (Expired) | |||
Editorial Comments | ||||
[Note_Nonce_Compr] Note that even the tentatively proposed | ||||
Comprehensive Decapsulation Rules in Appendix C | ||||
do not fix the minor compromise to the protection | ||||
of the ECN nonce that RFC3168 and RFC4301 both | ||||
suffer from (described under Security | ||||
Considerations above). An attacker with control | ||||
over a tunnel interior node can revert a packet | ||||
previously marked CE within the same tunnel to | ||||
its original marking. It can do this by changing | ||||
CE markings to ECT(0) because the decapsulator | ||||
rules give precedence to the inner header if the | ||||
outer is ECT(0). To fix this, we could have | ||||
specified that the outgoing header should be | ||||
ECT(0) when the incoming outer is ECT(0) but the | ||||
inner is ECT(1). Although this would close the | ||||
minor loophole in the nonce, it would raise a | ||||
minor safety issue if multilevel ECN or PCN were | ||||
used. A less severe marking in the inner header | ||||
would override a more severe one in the outer. | ||||
Both are corner cases so it is difficult to | ||||
decide which is more important: i) the loophole | ||||
in the nonce is only for a minor case of one | ||||
tunnel node attacking another in the same tunnel; | ||||
and ii) the severity inversion would not result | ||||
from any legal codepoint transition. If the | ||||
Comprehensive Decapsulation Rules of Appendix C | ||||
are taken up, we currently believe i) safety | ||||
against misconfiguration is slightly more | ||||
important than ii) securing against an attack | ||||
that has little, if any, clear motivation. | ||||
Appendix A. Why resetting CE on encapsulation harms PCN | Appendix A. Why resetting CE on encapsulation harms PCN | |||
Regarding encapsulation, the section of the PCN architecture | Regarding encapsulation, the section of the PCN architecture | |||
[I-D.ietf-pcn-architecture] on tunnelling says that header copying | [I-D.ietf-pcn-architecture] on tunnelling says that header copying | |||
(RFC4301) allows PCN to work correctly. However, resetting CE | (RFC4301) allows PCN to work correctly. Whereas resetting CE | |||
markings confuses PCN marking. | markings confuses PCN marking. | |||
The specific issue here concerns PCN excess rate marking | The specific issue here concerns PCN excess rate marking | |||
[I-D.ietf-pcn-marking-behaviour], i.e. the bulk marking of traffic | [I-D.ietf-pcn-marking-behaviour], i.e. the bulk marking of traffic | |||
that exceeds a configured threshold rate. One of the goals of excess | that exceeds a configured threshold rate. One of the goals of excess | |||
rate marking is to enable the speedy removal of excess admission | rate marking is to enable the speedy removal of excess admission | |||
controlled traffic following re-routes caused by link failures or | controlled traffic following re-routes caused by link failures or | |||
other disasters. This maintains a share of the capacity for | other disasters. This maintains a share of the capacity for | |||
competing admission controlled traffic and for traffic in lower | competing admission controlled traffic and for traffic in lower | |||
priority classes. After failures, traffic re-routed onto remaining | priority classes. After failures, traffic re-routed onto remaining | |||
skipping to change at page 27, line 17 | skipping to change at page 28, line 37 | |||
| 30 | | | | 30 | | | |||
| | | The large square | | | | The large square | |||
| +---------+p_t represents 100 packets | | +---------+p_t represents 100 packets | |||
| | 12 | | | | 12 | | |||
+-----+---------+0 | +-----+---------+0 | |||
0 30% 100% | 0 30% 100% | |||
inner header marking | inner header marking | |||
Figure 4: Tunnel Marking of Packets Already Marked at Ingress | Figure 4: Tunnel Marking of Packets Already Marked at Ingress | |||
Appendix C. Ideal Decapsulation Rules | Appendix C. Comprehensive Decapsulation Rules | |||
This appendix is not normative. Compliance with this appendix is NOT | ||||
REQUIRED for compliance with the present specification. | ||||
If the default ECN encapsulation behaviour does not offer suitable | This appendix is not currently normative. Compliance with this | |||
trade offs, procedures exist for associating a new behaviour with a | appendix is NOT REQUIRED for compliance with the present | |||
new Diffserv PHB. However, it is unrealistic to expect vendors of | specification. | |||
all IPSec and all IP in IP tunnel endpoints to cater for the | ||||
exceptional behaviour of PHB XXX. If all tunnels did require XXX- | ||||
specific behaviour, the resulting patchy and error-prone deployment | ||||
would probably cause XXX to suffer byzantine feature interactions | ||||
with poorly implemented tunnels. The default rules for tunnel | ||||
endpoints to handle both the Diffserv field and the ECN field should | ||||
'just work' when handling packets with an XXX Diffserv codepoint. | ||||
Given this specification requests a standards action to update the | Given this specification requests a standards action to update the | |||
RFC3168 encapsulation behaviour, this appendix explores a further | RFC3168 encapsulation behaviour, this appendix explores a further | |||
change to decapsulation that we ought to specify at the same time. | change to decapsulation that we ought to specify at the same time. | |||
If instead this further change is added later, it will add another | If instead this further change is added later, it will add another | |||
set of backward compatibility combinations to the already complicated | optional mode to the already complicated change history of ECN | |||
change history of ECN tunnelling. | tunnelling. | |||
Multi-level congestion notification is currently on the IETF's | Multi-level congestion notification is currently on the IETF's | |||
standards track agenda in the Congestion and Pre-Congestion | standards track agenda in the Congestion and Pre-Congestion | |||
Notification (PCN) working group. The PCN working group requires | Notification (PCN) working group. The PCN working group eventually | |||
three congestion states (not marked and two levels of congestion | requires three congestion states (not marked and two increasingly | |||
marking) [I-D.ietf-pcn-architecture]. The aim is for the first level | severe levels of congestion marking) [I-D.ietf-pcn-architecture]. | |||
of marking to stop admitting new traffic and the second level to | The aim is for the less severe level of marking to stop admitting new | |||
terminate sufficient existing flows to bring a network back to its | traffic and the more severe level to terminate sufficient existing | |||
operating point after a serious failure. | flows to bring a network back to its operating point after a serious | |||
failure. | ||||
Although the ECN field gives sufficient codepoints for these three | Although the ECN field gives sufficient codepoints for these three | |||
states, the PCN working group cannot use them in case any tunnel | states, current ECN tunnelling RFCs prevent the PCN working group | |||
decapsulations occur within a PCN region. If a node in a tunnel sets | from using them in case any tunnel decapsulations occur within a PCN | |||
the ECN field to ECT(0) or ECT(1), this change will be discarded by a | region (see Appendix A of [I-D.ietf-pcn-baseline-encoding]). If a | |||
tunnel egress compliant with RFC4301 and RFC3168. This can be seen | node in a tunnel sets the ECN field to ECT(0) or ECT(1), this change | |||
in Figure 3, where the ECT values in the outer header are ignored | will be discarded by a tunnel egress compliant with RFC4301 or | |||
unless the inner header is the same. Effectively the ECT(0) and | RFC3168. This can be seen in Figure 3, where the ECT values in the | |||
ECT(1) codepoints have to be treated as just one codepoint when they | outer header are ignored unless the inner header is the same. | |||
could otherwise have been used for their intended purpose of | Effectively the ECT(0) and ECT(1) codepoints have to be treated as | |||
congestion notification. Instead, the PCN w-g has had to propose | just one codepoint when they could otherwise have been used for their | |||
using extra Diffserv codepoint(s) to encode the extra states | intended purpose of congestion notification. | |||
[I-D.moncaster-pcn-3-state-encoding], using up the rapidly exhausting | ||||
DSCP space while leaving ECN codepoints unused. | ||||
Although this is currently most pressing for the PCN working group, | As a consequence, the PCN w-g has initially confined itself to two | |||
the issue is more general. Under Security Considerations (Section 9) | encoding states as a baseline encoding | |||
it has already been explained that a data sender cannot use the | [I-D.ietf-pcn-baseline-encoding]. And it has had to propose an | |||
experimental ECN nonce [RFC3540] to detect suppression of congestion | experimental extension using extra Diffserv codepoint(s) to encode | |||
notification along a tunnel. | the extra states [I-D.moncaster-pcn-3-state-encoding], using up the | |||
rapidly exhausting DSCP space while leaving ECN codepoints unused. | ||||
Another PCN encoding has been proposed that would survive tunnelling | ||||
without an extra DSCP [I-D.menth-pcn-psdm-encoding], but it requires | ||||
the PCN edge gateways to somehow share state so the egress can | ||||
determine which marking a packet started with at the ingress. Also a | ||||
PCN ingress node can game the system by initiating packets with | ||||
inappropriate markings. | ||||
More generally, the currently standardised tunnel decapsulation | Although this issue is currently most pressing for the PCN working | |||
behaviour unnecessarily wastes a quarter of two bits (i.e. half a | group, it is more general. The currently standardised tunnel | |||
bit) in the IP (v4 & v6) header. As explained in Section 3.1, the | decapsulation behaviour unnecessarily wastes a quarter of two bits | |||
original reason for not copying down outer ECT codepoints for onward | (i.e. half a bit) in the IP (v4 & v6) header. As explained in | |||
forwarding was to limit the covert channel across a decapsulator to 1 | Section 3.1, the original reason for not copying down outer ECT | |||
bit per packet. However, now that the IETF Security Area has deemed | codepoints for onward forwarding was to limit the covert channel | |||
that a 2-bit covert channel through an encapsulator is a manageable | across a decapsulator to 1 bit per packet. However, now that the | |||
risk, the same should be true for a decapsulator. | IETF Security Area has deemed that a 2-bit covert channel through an | |||
encapsulator is a manageable risk, the same should be true for a | ||||
decapsulator. | ||||
Figure 5 proposes a more ideal layered decapsulation behaviour. | Figure 5 proposes a more comprehensive layered decapsulation | |||
Note: this table is only to support discussion. It is not currently | behaviour that would properly support a simpler experimental 3-state | |||
proposed for standards action. The only difference from Figure 3 | ECN encoding such as | |||
(that is proposed for standards action), is the swapping of the cells | [I-D.briscoe-pcn-3-in-1-encoding].[Note_Nonce_Compr] Note that the | |||
highlighted as *ECT(X)*. | proposal tabulated in Figure 5 is only to support discussion. It is | |||
not currently proposed for standards action. The only difference | ||||
from Figure 3 (which _is_ proposed for standards action) is the | ||||
change to the cell highlighted as *ECT(1)*. | ||||
+---------------------------------------------+ | +----------------------------------------------+ | |||
| Incoming Outer Header | | | Incoming Outer Header | | |||
+---------------------+---------+-----------+-----------+-----------+ | +------------------+---------+------------+------------+----------+ | |||
| Incoming Inner | Not-ECT | ECT(0) | ECT(1) | CE | | | Incoming Inner | Not-ECT | ECT(0) | ECT(1) | CE | | |||
| Header | | | | | | | Header | | | | | | |||
+---------------------+---------+-----------+-----------+-----------+ | +------------------+---------+------------+------------+----------+ | |||
| Not-ECT | Not-ECT | drop(!!!) | drop(!!!) | drop(!!!) | | | Not-ECT | Not-ECT | drop(!!!) | drop(!!!) | drop(!!!) | | |||
| ECT(0) | ECT(0) | ECT(0) | *ECT(1)* | CE | | | ECT(0) | ECT(0) | ECT(0) | *ECT(1)* | CE | | |||
| ECT(1) | ECT(1) | *ECT(0)* | ECT(1) | CE | | | ECT(1) | ECT(1) | ECT(1)(!!!)| ECT(1) | CE | | |||
| CE | CE | CE | CE (!!!) | CE | | | CE | CE | CE | CE (!!!) | CE | | |||
+---------------------+---------+-----------+-----------+-----------+ | +------------------+---------+------------+------------+----------+ | |||
| Outgoing Header | | | Outgoing Header | | |||
+---------------------------------------------+ | +----------------------------------------------+ | |||
Figure 5: Ideal IP in IP Decapsulation (currently informative, not | Figure 5: Comprehensive IP in IP Decapsulation (currently | |||
normative) | informative, not normative) | |||
Note that, if this ideal proposal were taken up, a tunnel egress | The table is derived from the following logic: | |||
complying with it would be backwards compatible with all previous | ||||
specifications for encapsulation of ECN at the ingress (RFC4301, both | o On decapsulation, if the inner ECN field is Not-ECT but the outer | |||
modes of RFC3168, both modes of RFC2481 and RFC2003). In comparison | ECN field is anything but Not-ECT the decapsulator must drop the | |||
with an RFC3168 or RFC4301 tunnel egress, it would require no | packet. This is because the Not-ECT marking on the inner header | |||
additional configuration at the ingress nor any additional | is set by transports that do not know how to respond to an | |||
negotiation with the ingress. The only new issue would be the burden | explicit congestion marking; | |||
of an extra standard to be compliant with, adding to the already | ||||
complex history of ECN tunnelling RFCs. | o In all other cases, the outgoing ECN field is set to the more | |||
severe marking of the outer and inner ECN fields, where the | ||||
ranking of severity from highest to lowest is CE, ECT(1), ECT(0), | ||||
Not-ECT; | ||||
o There are cases where no legal transition in any current or | ||||
previous ECN tunnelling specification would result in certain | ||||
combinations of inner and outer ECN fields. In these cases | ||||
(indicated in the table by '(!!!)'), the decapsulator may also | ||||
raise an alarm, but not so often that the illegal combinations | ||||
would amplify into a flood of alarm messages. | ||||
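
The three rules above can be sketched in a few lines of code. This is only an illustrative sketch of the informative Figure 5 table, not part of either draft: the names `ECN`, `UNEXPECTED` and `decapsulate` are hypothetical, the enum values are chosen to express the severity ranking (they are not the wire codepoints), and alarm rate-limiting is left out.

```python
from enum import IntEnum
from typing import Optional, Tuple


class ECN(IntEnum):
    # Values encode the severity ranking from the text
    # (CE > ECT(1) > ECT(0) > Not-ECT), NOT the on-the-wire codepoints.
    NOT_ECT = 0
    ECT0 = 1
    ECT1 = 2
    CE = 3


# (inner, outer) combinations marked '(!!!)' in Figure 5: no legal
# transition in any current or previous specification produces them.
UNEXPECTED = {
    (ECN.NOT_ECT, ECN.ECT0),
    (ECN.NOT_ECT, ECN.ECT1),
    (ECN.NOT_ECT, ECN.CE),
    (ECN.ECT1, ECN.ECT0),
    (ECN.CE, ECN.ECT1),
}


def decapsulate(inner: ECN, outer: ECN) -> Tuple[Optional[ECN], bool]:
    """Return (outgoing ECN field, alarm flag); None means drop the packet."""
    alarm = (inner, outer) in UNEXPECTED
    # Rule 1: the inner transport does not understand ECN, so any
    # marking on the outer header cannot be forwarded; drop instead.
    if inner == ECN.NOT_ECT and outer != ECN.NOT_ECT:
        return None, alarm
    # Rule 2: otherwise forward the more severe of the two markings.
    return max(inner, outer), alarm
```

For example, `decapsulate(ECN.ECT0, ECN.ECT1)` yields ECT(1) with no alarm, which is the cell highlighted as *ECT(1)* in the table. A real decapsulator would also have to rate-limit any alarms it raises, as the third rule requires.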
If this more comprehensive decapsulation proposal were taken up, it | ||||
would be backwards compatible with all previous encapsulations of ECN | ||||
at the ingress (RFC4301, both modes of RFC3168, both modes of RFC2481 | ||||
and RFC2003). The outgoing header is different for one combination | ||||
of inner & outer headers, but that combination was previously illegal | ||||
anyway, so no known mechanisms in the Internet rely on the previous | ||||
behaviour. The proposed tunnel egress requires no additional option | ||||
configuration at the ingress or egress nor any additional negotiation | ||||
with the ingress. | ||||
C.1. Ways to Introduce the Comprehensive Decapsulation Rules | ||||
There would be a number of ways for this more comprehensive | ||||
decapsulation proposal to be introduced: | ||||
o It could be specified in the present standards track proposal | ||||
(preferred) or in an experimental extension; | ||||
o it could be specified as a new default for all Diffserv PHBs | ||||
(preferred) or as an option to be configured only for Diffserv | ||||
PHBs requiring it. | ||||
The argument for making this change now, rather than in a separate | ||||
experimental extension, is to avoid the burden of an extra standard | ||||
to be compliant with and to be backwards compatible with--so we don't | ||||
add to the already complex history of ECN tunnelling RFCs. The | ||||
argument for a separate experimental extension is that we may never | ||||
need this change (if PCN is never successfully deployed and if no-one | ||||
ever needs three ECN or PCN encoding states rather than two). | ||||
However, the change does no harm to existing mechanisms and stops | ||||
tunnels wasting a quarter of two bits (one 2-bit codepoint). | ||||
The argument for making this new decapsulation behaviour the default | ||||
for all PHBs is that it doesn't change any expected behaviour that | ||||
existing mechanisms rely on already. Also, by ending the present | ||||
waste of a codepoint, in the future a use of that codepoint could be | ||||
proposed for all PHBs, even if PCN isn't successfully deployed. | ||||
In practice, if this comprehensive decapsulation were specified | ||||
straightaway as the normative default for all PHBs, a network | ||||
operator deploying 3-state PCN would be able to request that tunnels | ||||
comply with the latest specification. Implementers of non-PCN | ||||
tunnels would not need to comply but, if they did, their code would | ||||
be future proofed and no harm would be done to legacy operations. | ||||
Therefore, rather than branching their code base, it would be easiest | ||||
for implementers to make all their new tunnel code comply with this | ||||
specification, whether or not it was for PCN. But they could leave | ||||
old code untouched, unless it was for PCN. | ||||
The alternatives are worse. Implementers would otherwise have to | ||||
provide configurable decapsulation options and operators would have | ||||
to configure all IPsec and IP in IP tunnel endpoints for the | ||||
exceptional behaviour of certain PHBs. The rules for tunnel | ||||
endpoints to handle both the Diffserv field and the ECN field should | ||||
'just work' when handling packets with any Diffserv codepoint. | ||||
Appendix D. Non-Dependence of Tunnelling on In-path Load Regulation | Appendix D. Non-Dependence of Tunnelling on In-path Load Regulation | |||
We have said that at any point in a network, the Congestion Baseline | We have said that at any point in a network, the Congestion Baseline | |||
(where congestion notification starts from zero) should be the | (where congestion notification starts from zero) should be the | |||
previous upstream Load Regulator. We have also said that the ingress | previous upstream Load Regulator. We have also said that the ingress | |||
of an IP in IP tunnel must copy congestion indications to the | of an IP in IP tunnel must copy congestion indications to the | |||
encapsulating outer headers it creates. If the Load Regulator is in- | encapsulating outer headers it creates. If the Load Regulator is in- | |||
path rather than at the source, and also a tunnel ingress, these two | path rather than at the source, and also a tunnel ingress, these two | |||
requirements seem to be contradictory. A tunnel ingress must not | requirements seem to be contradictory. A tunnel ingress must not | |||
End of changes. 41 change blocks. | ||||
117 lines changed or deleted | 260 lines changed or added | |||
This html diff was produced by rfcdiff 1.35. The latest version is available from http://tools.ietf.org/tools/rfcdiff/ |