draft-davie-ecn-mpls-00.txt | draft-davie-ecn-mpls-01.txt | |||
---|---|---|---|---|
Network Working Group B. Davie | Network Working Group B. Davie | |||
Internet-Draft Cisco Systems, Inc. | Internet-Draft Cisco Systems, Inc. | |||
Expires: December 20, 2006 B. Briscoe | Intended status: Standards Track B. Briscoe | |||
J. Tay | Expires: April 21, 2007 J. Tay | |||
BT Research | BT Research | |||
June 18, 2006 | October 18, 2006 | |||
Explicit Congestion Marking in MPLS | Explicit Congestion Marking in MPLS | |||
draft-davie-ecn-mpls-00.txt | draft-davie-ecn-mpls-01.txt | |||
Status of this Memo | Status of this Memo | |||
By submitting this Internet-Draft, each author represents that any | By submitting this Internet-Draft, each author represents that any | |||
applicable patent or other IPR claims of which he or she is aware | applicable patent or other IPR claims of which he or she is aware | |||
have been or will be disclosed, and any of which he or she becomes | have been or will be disclosed, and any of which he or she becomes | |||
aware will be disclosed, in accordance with Section 6 of BCP 79. | aware will be disclosed, in accordance with Section 6 of BCP 79. | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF), its areas, and its working groups. Note that | Task Force (IETF), its areas, and its working groups. Note that | |||
skipping to change at page 1, line 36 | skipping to change at page 1, line 36 | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
http://www.ietf.org/ietf/1id-abstracts.txt. | http://www.ietf.org/ietf/1id-abstracts.txt. | |||
The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
http://www.ietf.org/shadow.html. | http://www.ietf.org/shadow.html. | |||
This Internet-Draft will expire on December 20, 2006. | This Internet-Draft will expire on April 21, 2007. | |||
Copyright Notice | Copyright Notice | |||
Copyright (C) The Internet Society (2006). | Copyright (C) The Internet Society (2006). | |||
Abstract | Abstract | |||
RFC 3270 defines how to support the Diffserv arhitecture in MPLS | RFC 3270 defines how to support the Diffserv architecture in MPLS | |||
networks, including how to encode Diffserv Code Points (DSCPs) in an | networks, including how to encode Diffserv Code Points (DSCPs) in an | |||
MPLS header. DSCPs may be encoded in the EXP field, while other uses | MPLS header. DSCPs may be encoded in the EXP field, while other uses | |||
of that field are not precluded. RFC 3270 makes no statement about | of that field are not precluded. RFC 3270 makes no statement about | |||
how Explicit Congestion Notification (ECN) marking might be encoded | how Explicit Congestion Notification (ECN) marking might be encoded | |||
in the MPLS header. This draft defines how an operator might define | in the MPLS header. This draft defines how an operator might define | |||
some of the EXP codepoints for explicit congestion notification, | some of the EXP codepoints for explicit congestion notification, | |||
without precluding other uses. | without precluding other uses. | |||
Requirements Language | Requirements Language | |||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
document are to be interpreted as described in RFC 2119 [RFC2119]. | document are to be interpreted as described in RFC 2119 [RFC2119]. | |||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
1.1. Background . . . . . . . . . . . . . . . . . . . . . . . . 4 | 1.1. Changes From Previous (-00) Version . . . . . . . . . . . 4 | |||
1.2. Intent . . . . . . . . . . . . . . . . . . . . . . . . . . 4 | 1.2. Background . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
1.3. Terminology . . . . . . . . . . . . . . . . . . . . . . . 5 | 1.3. Intent . . . . . . . . . . . . . . . . . . . . . . . . . . 5 | |||
2. Use of MPLS EXP Field for ECN . . . . . . . . . . . . . . . . 5 | 1.4. Terminology . . . . . . . . . . . . . . . . . . . . . . . 5 | |||
3. Per-domain ECT checking . . . . . . . . . . . . . . . . . . . 7 | 2. Use of MPLS EXP Field for ECN . . . . . . . . . . . . . . . . 6 | |||
4. ECN-enabled MPLS domain . . . . . . . . . . . . . . . . . . . 8 | 3. Per-domain ECT checking . . . . . . . . . . . . . . . . . . . 8 | |||
4.1. Pushing (adding) one or more labels to an IP packet . . . 8 | 4. ECN-enabled MPLS domain . . . . . . . . . . . . . . . . . . . 9 | |||
4.2. Pushing one or more labels onto an MPLS labelled packet . 8 | 4.1. Pushing (adding) one or more labels to an IP packet . . . 9 | |||
4.2. Pushing one or more labels onto an MPLS labelled packet . 9 | ||||
4.3. Congestion experienced in an interior MPLS node . . . . . 9 | 4.3. Congestion experienced in an interior MPLS node . . . . . 9 | |||
4.4. Crossing a Diffserv Domain Boundary . . . . . . . . . . . 9 | 4.4. Crossing a Diffserv Domain Boundary . . . . . . . . . . . 9 | |||
4.5. Popping an MPLS label (not the end of the stack) . . . . . 9 | 4.5. Popping an MPLS label (not the end of the stack) . . . . . 10 | |||
4.6. Popping the last MPLS label in the stack . . . . . . . . . 9 | 4.6. Popping the last MPLS label in the stack . . . . . . . . . 10 | |||
4.7. Diffserv Tunneling Models . . . . . . . . . . . . . . . . 10 | 4.7. Diffserv Tunneling Models . . . . . . . . . . . . . . . . 11 | |||
4.8. Extension to Pre-Congestion Notification . . . . . . . . . 10 | 4.8. Extension to Pre-Congestion Notification . . . . . . . . . 11 | |||
4.8.1. Label Push onto IP packet . . . . . . . . . . . . . . 10 | 4.8.1. Label Push onto IP packet . . . . . . . . . . . . . . 12 | |||
4.8.2. Pushing Additional MPLS Labels . . . . . . . . . . . . 10 | 4.8.2. Pushing Additional MPLS Labels . . . . . . . . . . . . 12 | |||
4.8.3. Admission Control or Pre-emption Marking inside | 4.8.3. Admission Control or Pre-emption Marking inside | |||
MPLS domain . . . . . . . . . . . . . . . . . . . . . 11 | MPLS domain . . . . . . . . . . . . . . . . . . . . . 12 | |||
4.8.4. Popping an MPLS Label (not end of stack) . . . . . . . 11 | 4.8.4. Popping an MPLS Label (not end of stack) . . . . . . . 12 | |||
4.8.5. Popping the last MPLS Label to expose IP header . . . 11 | 4.8.5. Popping the last MPLS Label to expose IP header . . . 12 | |||
5. ECN-disabled MPLS domain . . . . . . . . . . . . . . . . . . . 11 | 5. ECN-disabled MPLS domain . . . . . . . . . . . . . . . . . . . 13 | |||
6. The use of more codepoints with E-LSPs and L-LSPs . . . . . . 11 | 6. The use of more codepoints with E-LSPs and L-LSPs . . . . . . 13 | |||
7. Relationship to tunnel behavior in RFC 3168 . . . . . . . . . 12 | 7. Relationship to tunnel behavior in RFC 3168 . . . . . . . . . 13 | |||
7.1. Alternative approach to support ECN in an MPLS domain . . 12 | 7.1. Alternative approach to support ECN in an MPLS domain . . 14 | |||
8. Example Uses . . . . . . . . . . . . . . . . . . . . . . . . . 13 | 8. Example Uses . . . . . . . . . . . . . . . . . . . . . . . . . 15 | |||
8.1. RFC3168-style ECN . . . . . . . . . . . . . . . . . . . . 13 | 8.1. RFC3168-style ECN . . . . . . . . . . . . . . . . . . . . 15 | |||
8.2. ECN Co-existence with Diffserv E-LSPs . . . . . . . . . . 14 | 8.2. ECN Co-existence with Diffserv E-LSPs . . . . . . . . . . 15 | |||
8.3. Congestion-feedback-based Traffic Engineering . . . . . . 14 | 8.3. Congestion-feedback-based Traffic Engineering . . . . . . 16 | |||
8.4. PCN flow admission control and flow pre-emption . . . . . 14 | 8.4. PCN flow admission control and flow pre-emption . . . . . 16 | |||
9. Deployment Considerations . . . . . . . . . . . . . . . . . . 15 | 9. Deployment Considerations . . . . . . . . . . . . . . . . . . 17 | |||
9.1. Marking non-ECN Capable Packets . . . . . . . . . . . . . 15 | 9.1. Marking non-ECN Capable Packets . . . . . . . . . . . . . 17 | |||
9.2. Non-ECN capable routers in an MPLS Domain . . . . . . . . 16 | 9.2. Non-ECN capable routers in an MPLS Domain . . . . . . . . 17 | |||
10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 16 | 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 18 | |||
11. Security Considerations . . . . . . . . . . . . . . . . . . . 16 | 11. Security Considerations . . . . . . . . . . . . . . . . . . . 18 | |||
12. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 17 | 12. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 19 | |||
13. References . . . . . . . . . . . . . . . . . . . . . . . . . . 17 | 13. References . . . . . . . . . . . . . . . . . . . . . . . . . . 19 | |||
13.1. Normative References . . . . . . . . . . . . . . . . . . . 17 | 13.1. Normative References . . . . . . . . . . . . . . . . . . . 19 | |||
13.2. Informative References . . . . . . . . . . . . . . . . . . 18 | 13.2. Informative References . . . . . . . . . . . . . . . . . . 20 | |||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 20 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 21 | |||
Intellectual Property and Copyright Statements . . . . . . . . . . 21 | Intellectual Property and Copyright Statements . . . . . . . . . . 22 | |||
1. Introduction | 1. Introduction | |||
1.1. Background | 1.1. Changes From Previous (-00) Version | |||
[RFC3270] defines how to support the Diffserv arhitecture in MPLS | [Note to RFC Editor: This section to be removed before publication] | |||
o Corrected the description of ECN-MPLS marking proposed in | ||||
[Shayman], which closely corresponds to that proposed in this | ||||
document. | ||||
o Pre-congestion notification (PCN) marking is now described in a | ||||
way that does not require normative references to PCN | ||||
specifications. PCN discussion now serves only to illustrate how | ||||
the ECN marking concepts can be extended to cover more complex | ||||
scenarios, with PCN being an example. | ||||
o Added specification of behavior when MPLS encapsulated packets | ||||
cross from an ECN-enabled domain to a domain that is not ECN- | ||||
enabled. | ||||
o Clarified that copying MPLS ECN or PCN marking into exposed IP | ||||
header on egress is not mandatory | ||||
o Fixed typos and nits | ||||
1.2. Background | ||||
[RFC3270] defines how to support the Diffserv architecture in MPLS | ||||
networks, including how to encode Diffserv Code Points (DSCPs) in an | networks, including how to encode Diffserv Code Points (DSCPs) in an | |||
MPLS header. DSCPs may be encoded in the EXP field, while other uses | MPLS header. DSCPs may be encoded in the EXP field, while other uses | |||
of that field are not precluded. RFC 3270 makes no statement about | of that field are not precluded. RFC 3270 makes no statement about | |||
how Explicit Congestion Notification (ECN) marking might be encoded | how Explicit Congestion Notification (ECN) marking might be encoded | |||
in the MPLS header. This draft defines how an operator might define | in the MPLS header. This draft defines how an operator might define | |||
some of the EXP codepoints for explicit congestion notification, | some of the EXP codepoints for explicit congestion notification, | |||
without precluding other uses. In parallel to the activity defining | without precluding other uses. In parallel to the activity defining | |||
the addition of ECN to IP [RFC3168], two proposals were made to add | the addition of ECN to IP [RFC3168], two proposals were made to add | |||
ECN to MPLS [Floyd][Shayman]. These proposals, however, fell by the | ECN to MPLS [Floyd][Shayman]. These proposals, however, fell by the | |||
way-side. With ECN for IP now being a proposed standard, and | wayside. With ECN for IP now being a proposed standard, and | |||
developing interest in using pre-congestion notification (PCN) for | developing interest in using pre-congestion notification (PCN) for | |||
admission control and flow pre-emption[I-D.briscoe-tsvwg-cl- | admission control and flow pre-emption | |||
architecture], there is consequent interest in being able to support | [I-D.briscoe-tsvwg-cl-architecture], there is consequent interest in | |||
ECN across IP networks consisting of MPLS-enabled domains. Therefore | being able to support ECN across IP networks consisting of MPLS- | |||
it is necessary to specify the protocol for including ECN or PCN in | enabled domains. Therefore it is necessary to specify the protocol | |||
the MPLS shim header, and the protocol behaviour of edge MPLS nodes. | for including ECN in the MPLS shim header, and the protocol behavior | |||
of edge MPLS nodes. | ||||
We note that in [RFC3168] there are four codepoints used for ECN | We note that in [RFC3168] there are four codepoints used for ECN | |||
marking, which are encoded using two bits of the IP header. The MPLS | marking, which are encoded using two bits of the IP header. The MPLS | |||
EXP field is the logical place to encode ECN codepoints, but with | EXP field is the logical place to encode ECN codepoints, but with | |||
only 3 bits (8 codepoints) available, and with the same field being | only 3 bits (8 codepoints) available, and with the same field being | |||
used to convey DSCP information as well, there is a clear incentive | used to convey DSCP information as well, there is a clear incentive | |||
to conserve the number of codepoints consumed for ECN purposes. | to conserve the number of codepoints consumed for ECN purposes. | |||
Efficient use of the EXP field has been a focus of prior drafts | Efficient use of the EXP field has been a focus of prior drafts | |||
[Floyd] [Shayman] and we draw on those efforts in this draft as well. | [Floyd] [Shayman] and we draw on those efforts in this draft as well. | |||
1.2. Intent | 1.3. Intent | |||
Our intent is to specify how the MPLS shim header [RFC3032] should | Our intent is to specify how the MPLS shim header [RFC3032] should | |||
denote ECN marking and how MPLS nodes should understand whether the | denote ECN marking and how MPLS nodes should understand whether the | |||
transport for a packet will be ECN capable. We offer this as a | transport for a packet will be ECN capable. We offer this as a | |||
building block, from which to build different congestion notification | building block, from which to build different congestion notification | |||
systems. We do not intend to specify how the resulting congestion | systems. We do not intend to specify how the resulting congestion | |||
notification is fed back to an upstream node that can mitigate | notification is fed back to an upstream node that can mitigate | |||
congestion. For instance, unlike [Shayman], we do not specify edge- | congestion. For instance, unlike [Shayman], we do not specify edge- | |||
to-edge MPLS domain feedback, but we also do not preclude it. | to-edge MPLS domain feedback, but we also do not preclude it. | |||
Nonetheless, we do specify how the egress node of an MPLS domain | Nonetheless, we do specify how the egress node of an MPLS domain | |||
should copy congestion notification from the MPLS shim into the | should copy congestion notification from the MPLS shim into the | |||
underlying IP header if the ECN is to be carried onward towards the | underlying IP header if the ECN is to be carried onward towards the | |||
IP receiver. But we do NOT mandate that MPLS congestion notification | IP receiver. But we do NOT mandate that MPLS congestion notification | |||
must be copied into the IP header for onward transmission. This | must be copied into the IP header for onward transmission. This | |||
draft aims to be generic for any use of congestion notification in | draft aims to be generic for any use of congestion notification in | |||
MPLS. PCN or traffic engineering are merely two of many motivating | MPLS. PCN or traffic engineering are merely two of many motivating | |||
applications (see Section 8). | applications (see Section 8). | |||
1.3. Terminology | 1.4. Terminology | |||
This document draws freely on the terminology of ECN [RFC3168] and | This document draws freely on the terminology of ECN [RFC3168] and | |||
MPLS [RFC3031]. For ease of reference, we have included some | MPLS [RFC3031]. For ease of reference, we have included some | |||
definitions here, but refer the reader to the references above for | definitions here, but refer the reader to the references above for | |||
complete specifications of the relevant technologies: | complete specifications of the relevant technologies: | |||
o CE: Congestion Experienced. One of the states with which a packet | o CE: Congestion Experienced. One of the states with which a packet | |||
may be marked in a network supporting ECN. A packet is marked in | may be marked in a network supporting ECN. A packet is marked in | |||
this state by an ECN-capable router, to indicate that this router | this state by an ECN-capable router, to indicate that this router | |||
was experiencing congestion at the time the packet arrived. | was experiencing congestion at the time the packet arrived. | |||
skipping to change at page 6, line 7 | skipping to change at page 6, line 29 | |||
already defines use of codepoints in the EXP field for differentiated | already defines use of codepoints in the EXP field for differentiated | |||
services. Although it does not preclude other compatible uses of the | services. Although it does not preclude other compatible uses of the | |||
EXP field, this clearly seems to limit the space available for ECN, | EXP field, this clearly seems to limit the space available for ECN, | |||
given the field is only 3 bits (8 codepoints). | given the field is only 3 bits (8 codepoints). | |||
RFC 3270 defines two possible approaches for requesting | RFC 3270 defines two possible approaches for requesting | |||
differentiated service treatment from an LSR. | differentiated service treatment from an LSR. | |||
o In the E-LSP approach, different codepoints of the EXP field in | o In the E-LSP approach, different codepoints of the EXP field in | |||
the MPLS shim header are used to indicate the packet's per hop | the MPLS shim header are used to indicate the packet's per hop | |||
behaviour (PHB). | behavior (PHB). | |||
o In the L-LSP approach, an MPLS label is assigned for each PHB | o In the L-LSP approach, an MPLS label is assigned for each PHB | |||
scheduling class (PSC, as defined in [RFC3260]), so that an LSR | scheduling class (PSC, as defined in [RFC3260]), so that an LSR | |||
determines both its forwarding and its scheduling behaviour from | determines both its forwarding and its scheduling behavior from | |||
the label. | the label. | |||
If an MPLS domain uses the L-LSP approach, there is likely to be | If an MPLS domain uses the L-LSP approach, there is likely to be | |||
space in the EXP field for ECN codepoint(s). Where the E-LSP | space in the EXP field for ECN codepoint(s). Where the E-LSP | |||
approach is used, then codepoint space in the EXP field is likely to | approach is used, then codepoint space in the EXP field is likely to | |||
be scarce. This draft focuses on interworking ECN marking with the | be scarce. This draft focuses on interworking ECN marking with the | |||
E-LSP approach as it is the tougher problem. Consequently the same | E-LSP approach as it is the tougher problem. Consequently the same | |||
approach can also be applied with L-LSPs. | approach can also be applied with L-LSPs. | |||
We recommend that explicit congestion notification in MPLS should use | We recommend that explicit congestion notification in MPLS should use | |||
codepoints instead of bits in the EXP field. Since not every DSCP | codepoints instead of bits in the EXP field. Since not every PHB | |||
will need an associated ECN codepoint and some DSCPs might need two | will need an associated ECN codepoint and in some applications a | |||
ECN codepoints [I-D.briscoe-tsvwg-cl-architecture], it would be | given PHB might need two ECN codepoints (see, for | |||
wasteful and incorrect to assign a bit for ECN. | example, [I-D.briscoe-tsvwg-cl-architecture]), it would be wasteful to | |||
assign a dedicated bit for ECN. | ||||
For each PHB that uses ECN marking, we assume one EXP codepoint will | For each PHB that uses ECN marking, we assume one EXP codepoint will | |||
be defined meaning not congestion marked (Not-CM), and at least one | be defined meaning not congestion marked (Not-CM), and at least one | |||
other codepoint will be defined meaning congestion marked (CM). | other codepoint will be defined meaning congestion marked (CM). | |||
Therefore, each PHB that uses ECN marking will consume at least two | Therefore, each PHB that uses ECN marking will consume at least two | |||
EXP codepoints. But PHBs that do not use ECN marking will only | EXP codepoints. But PHBs that do not use ECN marking will only | |||
consume one. | consume one. | |||
Further, we wish to use minimal space in the MPLS shim header to tell | Further, we wish to use minimal space in the MPLS shim header to tell | |||
interior LSRs whether each packet will be received by an ECN-capable | interior LSRs whether each packet will be received by an ECN-capable | |||
skipping to change at page 6, line 50 | skipping to change at page 7, line 26 | |||
o One possible approach is for congested LSRs to mark the ECN field | o One possible approach is for congested LSRs to mark the ECN field | |||
in the underlying IP header at the bottom of the label stack. | in the underlying IP header at the bottom of the label stack. | |||
Although many commercial LSRs routinely access the IP header for | Although many commercial LSRs routinely access the IP header for | |||
other reasons (ECMP), there are numerous drawbacks to attempting | other reasons (ECMP), there are numerous drawbacks to attempting | |||
to find an IP header beneath an MPLS label stack. Notably, there | to find an IP header beneath an MPLS label stack. Notably, there | |||
is the challenge of detecting the absence of an IP header when | is the challenge of detecting the absence of an IP header when | |||
non-IP packets are carried on an LSP. Therefore we will not | non-IP packets are carried on an LSP. Therefore we will not | |||
consider this approach further. | consider this approach further. | |||
o In the schemes suggested by [Floyd] and [Shayman], ECT and CE are | o In the scheme suggested by [Floyd] ECT and CE are overloaded into | |||
overloaded into one bit, so that a 0 means ECT while a 1 might | one bit, so that a 0 means ECT while a 1 might either mean Not-ECT | |||
either mean Not-ECT or it might mean CE. A packet that has been | or it might mean CE. A packet that has been marked as having | |||
marked as having experienced congestion upstream, and then is | experienced congestion upstream, and then is picked out for | |||
picked out for marking at a second congested LSR, will be dropped | marking at a second congested LSR, will be dropped by the second | |||
by the second LSR since it cannot determine whether the packet has | LSR since it cannot determine whether the packet has previously | |||
previously experienced congestion or if ECN is not supported by | experienced congestion or if ECN is not supported by the | |||
the transport. | transport. | |||
While such an approach seemed potentially palatable for | While such an approach seemed potentially palatable, we do not | |||
traditional ECN, we do not recommend it here for the following | recommend it here for the following reasons. In some cases we | |||
reasons. In some cases we wish to be able to use ECN marking long | wish to be able to use ECN marking long before actual congestion | |||
before actual congestion (e.g. pre-congestion notification). In | (e.g. pre-congestion notification). In these circumstances, | |||
these circumstances, marking rates at each LSR might be non- | marking rates at each LSR might be non-negligible most of the | |||
negligible most of the time, so the chances of a previously marked | time, so the chances of a previously marked packet encountering an | |||
packet encountering an LSR that wants to mark it again will also | LSR that wants to mark it again will also be non-negligible. In | |||
be non-negligible. This will lead to unacceptable drop rates. | the case where CE and not-ECT are indistinguishable to core | |||
For instance, if the typical marking rate at every router or LSR | routers, such a scenario could lead to unacceptable drop rates. | |||
is p, and the typical diameter of the network of LSRs is d, then | If the typical marking rate at every router or LSR is p, and the | |||
the probability that a marked packet will be marked again is 1- | typical diameter of the network of LSRs is d, then the probability | |||
[1+p(d-1)][1-p]^(d-1). For instance, with 6 LSRs in a row, each | that a marked packet will be chosen for marking more than once is | |||
marking ECN with 1% probability, this bit overloading scheme would | 1-[Pr(never marked) + Pr(marked at exactly one hop)] = 1- [(1-p)^d | |||
introduce a drop rate of 0.15% unnecessarily. Given most modern | + dp(1-p)^(d-1)]. For instance, with 6 LSRs in a row, each | |||
core networks are sized to introduce near-zero packet drop, it may | marking ECN with 1% probability, the chance of a packet that is | |||
be unacceptable to drop over one in a thousand packets | already marked being chosen for marking a second time is 0.15%. | |||
unnecessarily. | The bit overloading scheme would therefore introduce a drop rate | |||
of 0.15% unnecessarily. Given that most modern core networks are | ||||
sized to introduce near-zero packet drop, it may be unacceptable | ||||
to drop over one in a thousand packets unnecessarily. | ||||
o A third possible approach is for interior LSRs to assume that the | o A third possible approach was suggested by [Shayman]. In this | |||
endpoints are ECN-capable, but this assumption is checked when the | scheme, interior LSRs assume that the endpoints are ECN-capable, | |||
final label is popped. If an interior LSR has marked ECN in the | but this assumption is checked when the final label is popped. If | |||
EXP field of the shim, but the IP header says the endpoints are | an interior LSR has marked ECN in the EXP field of the shim | |||
not ECN capable, the edge router (or penultimate if using | header, but the IP header says the endpoints are not ECN capable, | |||
penultimate hop popping) drops the packet. We recommend this | the edge router (or penultimate router, if using penultimate hop | |||
scheme, which we call `per-domain ECT checking'; and define it | popping) drops the packet. We recommend this scheme, which we | |||
more precisely in the following section. Its chief drawback is | call `per-domain ECT checking', and define it more precisely in | |||
that it can involve packets continuing to be forwarded after | the following section. Its chief drawback is that it can cause | |||
encountering congestion only to be dropped at the egress of the | packets to be forwarded after encountering congestion only to be | |||
MPLS domain. The rationale for this decision is given in | dropped at the egress of the MPLS domain. The rationale for this | |||
Section 9.1. | decision is given in Section 9.1. | |||
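As a numeric check on the drop rate quoted for the bit-overloading scheme, the following Python sketch evaluates both forms of the expression given above (the -00 form and the -01 form) for 6 LSRs each marking with 1% probability. The function names are illustrative and appear in neither draft, and the sketch assumes every LSR marks independently with the same probability p.

```python
def prob_marked_more_than_once(p: float, d: int) -> float:
    """Probability that a packet is chosen for marking at two or more of d hops.

    Assumes (for illustration only) that each hop marks independently
    with the same probability p.
    """
    never_marked = (1 - p) ** d
    marked_exactly_once = d * p * (1 - p) ** (d - 1)
    return 1 - (never_marked + marked_exactly_once)


def prob_marked_more_than_once_v00(p: float, d: int) -> float:
    """The same quantity in the factored form used by the -00 text."""
    return 1 - (1 + p * (d - 1)) * (1 - p) ** (d - 1)


if __name__ == "__main__":
    p, d = 0.01, 6  # 6 LSRs in a row, each marking with 1% probability
    print(f"-01 form: {prob_marked_more_than_once(p, d):.4%}")
    print(f"-00 form: {prob_marked_more_than_once_v00(p, d):.4%}")
```

Both forms agree, since (1-p)^d + dp(1-p)^(d-1) factors as [1+p(d-1)](1-p)^(d-1), and for these inputs the result is roughly 0.15%, matching the figure in the text.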
3. Per-domain ECT checking | 3. Per-domain ECT checking | |||
For the purposes of this discussion, we define the egress nodes of an | For the purposes of this discussion, we define the egress nodes of an | |||
MPLS domain as the nodes that pop the last MPLS label from the label | MPLS domain as the nodes that pop the last MPLS label from the label | |||
stack, exposing the IP (or, potentially, non-IP) header. Note that | stack, exposing the IP (or, potentially, non-IP) header. Note that | |||
such a node may be the ultimate or penultimate hop of an LSP, | such a node may be the ultimate or penultimate hop of an LSP, | |||
depending on whether penultimate hop popping (PHP) is employed. | depending on whether penultimate hop popping (PHP) is employed. | |||
In the per-domain ECT checking approach, the egress nodes take | In the per-domain ECT checking approach, the egress nodes take | |||
skipping to change at page 9, line 14 | skipping to change at page 9, line 41 | |||
push to the newly added outer label. If more than one label is being | push to the newly added outer label. If more than one label is being | |||
pushed, the same EXP value is copied to all label stack entries. | pushed, the same EXP value is copied to all label stack entries. | |||
4.3. Congestion experienced in an interior MPLS node | 4.3. Congestion experienced in an interior MPLS node | |||
If the EXP codepoint of the packet maps to a PHB that uses ECN | If the EXP codepoint of the packet maps to a PHB that uses ECN | |||
marking and the marking algorithm requires the packet to be marked, | marking and the marking algorithm requires the packet to be marked, | |||
the CM state is set (irrespective of whether it is already in the CM | the CM state is set (irrespective of whether it is already in the CM | |||
state). | state). | |||
If the buffer is full, the packet would be dropped. | If the buffer is full, a packet is dropped. | |||
4.4. Crossing a Diffserv Domain Boundary | 4.4. Crossing a Diffserv Domain Boundary | |||
If an MPLS-encapsulated packet crosses a Diffserv domain boundary, it | If an MPLS-encapsulated packet crosses a Diffserv domain boundary, it | |||
may be the case that the two domains use different encodings of the | may be the case that the two domains use different encodings of the | |||
same PHB in the EXP field. In such cases, the EXP field must be | same PHB in the EXP field. In such cases, the EXP field must be | |||
rewritten at the domain boundary. If the PHB is one that supports | rewritten at the domain boundary. If the PHB is one that supports | |||
ECN, then the appropriate ECN marking should also be preserved when | ECN, then the appropriate ECN marking should also be preserved when | |||
the EXP field is mapped at the boundary. | the EXP field is mapped at the boundary. | |||
If an MPLS-encapsulated packet that is in the CM state crosses from a | ||||
domain that is ECN-enabled (as defined in Section 3) to a domain that | ||||
is not ECN-enabled, then it is necessary to perform the egress | ||||
checking procedures at the egress LSR of the ECN-enabled domain. | ||||
This means that if the encapsulated packet is not ECN capable, the | ||||
packet MUST be dropped. Note that this implies the egress LSR must | ||||
be able to look beneath the MPLS header without popping the label | ||||
stack. | ||||
The related issue of Diffserv tunnel models is discussed in | The related issue of Diffserv tunnel models is discussed in | |||
Section 4.7. | Section 4.7. | |||
4.5. Popping an MPLS label (not the end of the stack) | 4.5. Popping an MPLS label (not the end of the stack) | |||
When a packet has more than one MPLS label in the stack and the top | When a packet has more than one MPLS label in the stack and the top | |||
label is popped, another MPLS label is exposed. In this case the ECN | label is popped, another MPLS label is exposed. In this case the ECN | |||
information should be transferred from the outer EXP field to the | information should be transferred from the outer EXP field to the | |||
inner MPLS label in the following manner. If the inner EXP field is | inner MPLS label in the following manner. If the inner EXP field is | |||
Not-CM, the inner EXP field is set to the same CM or Not-CM state as | Not-CM, the inner EXP field is set to the same CM or Not-CM state as | |||
skipping to change at page 9, line 49 | skipping to change at page 10, line 38 | |||
4.6. Popping the last MPLS label in the stack | 4.6. Popping the last MPLS label in the stack | |||
When the last MPLS label is popped from the packet, its payload is | When the last MPLS label is popped from the packet, its payload is | |||
exposed. If that packet is not IP, and does not have any capability | exposed. If that packet is not IP, and does not have any capability | |||
equivalent to ECT, it is assumed Not-ECT and treated as such. That | equivalent to ECT, it is assumed Not-ECT and treated as such. That | |||
means that if the EXP value of the MPLS header was CM, the packet | means that if the EXP value of the MPLS header was CM, the packet | |||
MUST be dropped. | MUST be dropped. | |||
Assuming an IP packet was exposed, we have to examine whether that | Assuming an IP packet was exposed, we have to examine whether that | |||
packet is ECT or not. If the inner IP packet is Not-ECT, its ECN | packet is ECT or not. A Not-ECT packet MUST be dropped if the EXP | |||
field remains unchanged if the EXP field is Not-CM. However, a Not- | field is CM. | |||
ECT packet MUST be dropped if the EXP field is CM. | ||||
If the ECN field of the inner packet is set to ECT(0), ECT(1) or CE, | For the remainder of this section, we describe the behavior that is | |||
the ECN field remains unchanged if the EXP field is set to Not-CM. | required if the ECN information is to be transferred from the MPLS | |||
The ECN field is set to CE if the EXP field is CM. Note that an | header into the exposed IP header for onward transmission. As noted | |||
inner value of CE and an outer value of not-CM should be considered | in Section 1.3, such behavior is not mandated by this document, but | |||
anomalous, and SHOULD be logged in some way by the LSR. | may be selected by an operator. | |||
If the inner IP packet is Not-ECT, its ECN field remains unchanged if | ||||
the EXP field is Not-CM. If the ECN field of the inner packet is set | ||||
to ECT(0), ECT(1) or CE, the ECN field remains unchanged if the EXP | ||||
field is set to Not-CM. The ECN field is set to CE if the EXP field | ||||
is CM. Note that an inner value of CE and an outer value of not-CM | ||||
should be considered anomalous, and SHOULD be logged in some way by | ||||
the LSR. | ||||
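
The decision logic of this section can be sketched in a few lines of Python. This is an illustration only and is not part of either draft revision. It assumes the operator has chosen to transfer ECN information into the exposed IP header. The function name `pop_last_label`, the string constants and the `log` callback are hypothetical names introduced for the example. A payload that is not IP and has no ECT equivalent would be passed in as Not-ECT.

```python
from typing import Callable, Optional

# ECN field values of the exposed IP header (names are illustrative).
NOT_ECT, ECT0, ECT1, CE = "Not-ECT", "ECT(0)", "ECT(1)", "CE"


def pop_last_label(exp_is_cm: bool,
                   inner_ecn: str,
                   log: Callable[[str], None] = print) -> Optional[str]:
    """Return the ECN field to forward, or None if the packet is dropped.

    exp_is_cm -- True if the EXP field of the popped label was CM.
    inner_ecn -- ECN field of the exposed packet (Not-ECT if it is not IP).
    """
    if inner_ecn == NOT_ECT:
        # A Not-ECT packet is dropped if the EXP field is CM,
        # otherwise its ECN field is left unchanged.
        return None if exp_is_cm else NOT_ECT
    if exp_is_cm:
        # ECT(0), ECT(1) or CE with an outer CM becomes CE.
        return CE
    if inner_ecn == CE:
        # Inner CE with outer Not-CM is anomalous and is logged.
        log("anomaly: inner CE with outer Not-CM")
    return inner_ecn
```

Returning None for a drop keeps the drop case distinct from every valid ECN value, so a caller cannot forward a packet that should have been discarded by mistake.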
4.7. Diffserv Tunneling Models | 4.7. Diffserv Tunneling Models | |||
[RFC3270] describes three tunneling models for Diffserv support | [RFC3270] describes three tunneling models for Diffserv support | |||
across MPLS Domains, referred to as the uniform, short pipe, and pipe | across MPLS Domains, referred to as the uniform, short pipe, and pipe | |||
models. The differences between these models lie in whether the | models. The differences between these models lie in whether the | |||
Diffserv treatment that applies to a packet while it travels along a | Diffserv treatment that applies to a packet while it travels along a | |||
particular LSP is carried to the last hop of the LSP and beyond the | particular LSP is carried to the last hop of the LSP and beyond the | |||
last hop. Depending on which mode is preferred by an operator, the | last hop. Depending on which mode is preferred by an operator, the | |||
EXP value or DSCP value of an exposed header following a label pop | EXP value or DSCP value of an exposed header following a label pop | |||
may or may not be dependent on the EXP value of the label that is | may or may not be dependent on the EXP value of the label that is | |||
removed by the pop operation. We believe that in the case of ECN | removed by the pop operation. We believe that in the case of ECN | |||
marking, the use of these models should only apply to the encoding of | marking, the use of these models should only apply to the encoding of | |||
the Diffserv PHB in the EXP value, and that the choice of codepoint | the Diffserv PHB in the EXP value, and that the choice of codepoint | |||
for ECN should always be made based on the procedures described | for ECN should always be made based on the procedures described | |||
above, independent of the tunneling model. | above, independent of the tunneling model. | |||
4.8. Extension to Pre-Congestion Notification | 4.8. Extension to Pre-Congestion Notification | |||
To fully support PCN [I-D.briscoe-tsvwg-cl-architecture] in an MPLS | This section describes how the preceding mechanisms can be extended | |||
domain for a particular PHB, a total of 3 codepoints need to be | to support PCN [I-D.briscoe-tsvwg-cl-architecture]. Our intent here | |||
allocated for that PHB. (See Section 8.4 for further discussion of | is to show that the mechanisms are readily extended to more complex | |||
PCN and the possibility of using fewer codepoints.) These 3 | scenarios than ECN, but this section may be safely ignored if one is | |||
interested only in supporting ECN. | ||||
The relevant aspects of PCN for the purposes of this discussion are: | ||||
o PCN uses 3 states rather than 2 for ECN - these are referred to as | ||||
admission marked (AM), pre-emption marked (PM) and not marked (NM) | ||||
states. (See Section 8.4 for further discussion of PCN and the | ||||
possibility of using fewer codepoints.) | ||||
o A packet can go from NM to AM, from NM to PM, or from AM to PM, | ||||
but no other transition is possible. | ||||
o Whereas ECN-capable packets are identified by the ECT value in the | ||||
IP header, PCN-capability is determined by the PHB of the packet. | ||||
Thus, to support PCN fully in an MPLS domain for a particular PHB, a | ||||
total of 3 codepoints need to be allocated for that PHB. These 3 | ||||
codepoints represent the admission marked (AM), pre-emption marked | codepoints represent the admission marked (AM), pre-emption marked | |||
(PM) and not marked (NM) states. The procedures described above need | (PM) and not marked (NM) states. The procedures described above need | |||
to be slightly modified to support this scenario. The following | to be slightly modified to support this scenario. The following | |||
procedures are invoked when the topmost DSCP or EXP value indicates a | procedures are invoked when the topmost DSCP or EXP value indicates a | |||
PHB that supports PCN. | PHB that supports PCN. | |||
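
As a rough illustration of the three PCN states and the permitted transitions listed above, the states can be modelled as an ordered enumeration. This sketch is not taken from the draft. The names `Pcn` and `transition_allowed` and the numeric ordering NM < AM < PM are assumptions chosen so that the three permitted transitions are exactly the strictly increasing ones.

```python
from enum import IntEnum


class Pcn(IntEnum):
    """PCN marking states, ordered by severity (ordering is an assumption)."""
    NM = 0  # not marked
    AM = 1  # admission marked
    PM = 2  # pre-emption marked


def transition_allowed(old: Pcn, new: Pcn) -> bool:
    """True for NM->AM, NM->PM and AM->PM; False for any other change."""
    return new > old
```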
4.8.1. Label Push onto IP packet | 4.8.1. Label Push onto IP packet | |||
If the IP packet header indicates AM, set the EXP value of all | If the IP packet header indicates AM, set the EXP value of all | |||
entries in the label stack to AM. If the IP packet header indicates | entries in the label stack to AM. If the IP packet header indicates | |||
skipping to change at page 11, line 8 | skipping to change at page 12, line 20 | |||
any other marking of the IP header, set the EXP value of all entries | any other marking of the IP header, set the EXP value of all entries | |||
in the label stack to NM. | in the label stack to NM. | |||
4.8.2. Pushing Additional MPLS Labels | 4.8.2. Pushing Additional MPLS Labels | |||
The procedures of Section 4.2 apply. | The procedures of Section 4.2 apply. | |||
4.8.3. Admission Control or Pre-emption Marking inside MPLS domain | 4.8.3. Admission Control or Pre-emption Marking inside MPLS domain | |||
The EXP value can be set to AM or PM according to the same procedures | The EXP value can be set to AM or PM according to the same procedures | |||
as described in [I-D.briscoe-tsvwg-cl-phb]. | as described in [I-D.briscoe-tsvwg-cl-phb]. For the purposes of this | |||
document, it does not matter exactly what algorithms are used to | ||||
decide when to set AM or PM; all that matters is that if a router | ||||
would have marked AM (or PM) in the IP header, it should set the EXP | ||||
value in the MPLS header to the AM (or PM) codepoint. | ||||
4.8.4. Popping an MPLS Label (not end of stack) | 4.8.4. Popping an MPLS Label (not end of stack) | |||
When popping an MPLS Label exposes another MPLS label, the AM or PM | When popping an MPLS Label exposes another MPLS label, the AM or PM | |||
marking should be transferred to the exposed EXP field in the | marking should be transferred to the exposed EXP field in the | |||
following manner: if the inner EXP value is NM, then it should be set | following manner: if the inner EXP value is NM, then it should be set | |||
to the same marking state as the EXP value of the popped label stack | to the same marking state as the EXP value of the popped label stack | |||
entry. If the inner EXP value is AM, it should be unchanged if the | entry. If the inner EXP value is AM, it should be unchanged if the | |||
popped EXP value was AM, and it should be set to PM if the popped EXP | popped EXP value was AM, and it should be set to PM if the popped EXP | |||
value was PM. If the popped EXP value was NM, this should be logged | value was PM. If the popped EXP value was NM, this should be logged | |||
in some way and the inner EXP value should be unchanged. If the | in some way and the inner EXP value should be unchanged. If the | |||
inner EXP value is PM, it should be unchanged whatever the popped EXP | inner EXP value is PM, it should be unchanged whatever the popped EXP | |||
value was, but any EXP value other than PM should be logged. | value was, but any EXP value other than PM should be logged. | |||
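
The rules of this subsection amount to keeping the more severe of the two markings and logging the case where the popped label carries the less severe one. A minimal Python sketch follows. It is an illustration, not text from the draft. The names `Pcn` and `pop_to_inner_label`, the `log` callback and the numeric ordering NM < AM < PM are assumptions made for the example.

```python
from enum import IntEnum
from typing import Callable


class Pcn(IntEnum):
    """PCN marking states, ordered by severity (ordering is an assumption)."""
    NM = 0  # not marked
    AM = 1  # admission marked
    PM = 2  # pre-emption marked


def pop_to_inner_label(popped: Pcn, inner: Pcn,
                       log: Callable[[str], None] = print) -> Pcn:
    """Return the marking state to write into the exposed EXP field."""
    if popped < inner:
        # The exposed label is already marked more severely than the
        # popped one: log the anomaly and leave the inner value unchanged.
        log(f"anomaly: popped {popped.name}, inner {inner.name}")
        return inner
    # Inner NM takes the popped state; inner AM stays AM or becomes PM;
    # inner PM with popped PM stays PM.
    return popped
```

For example, `pop_to_inner_label(Pcn.PM, Pcn.AM)` yields `Pcn.PM`, while `pop_to_inner_label(Pcn.NM, Pcn.AM)` logs the mismatch and yields `Pcn.AM`.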
4.8.5. Popping the last MPLS Label to expose IP header | 4.8.5. Popping the last MPLS Label to expose IP header | |||
When popping the last MPLS Label exposes the IP header, the AM or PM | When popping the last MPLS Label exposes the IP header, there are two | |||
marking should be transferred to the exposed IP header field in the | cases to consider: | |||
following manner: if the inner IP header value is neither AM nor PM, | ||||
and the EXP value was NM, then the IP header should be unchanged. | o the popping LSR is NOT the egress router of the PCN region, in | |||
For any other EXP value, the IP header should be set to the same | which case AM or PM marking should be transferred to the exposed | |||
marking state as the EXP value of the popped label stack entry. If | IP header field; or | |||
the inner IP header value is AM, it should be unchanged if the popped | ||||
EXP value was AM, and it should be set to PM if the popped EXP value | o the popping LSR IS the egress router of the PCN region. | |||
was PM. If the popped EXP value was NM, this should be logged in | ||||
some way and the inner IP header value should be unchanged. If the | In the latter case, the behavior of the egress LSR is defined in | |||
IP header value is PM, it should be unchanged whatever the popped EXP | [I-D.briscoe-tsvwg-cl-architecture] and is beyond the scope of this | |||
value was, but any EXP value other than PM should be logged. | document. In the former case, the marking should be transferred from | |||
the popped MPLS header to the exposed IP header as follows: if the | ||||
inner IP header value is neither AM nor PM, and the EXP value was NM, | ||||
then the IP header should be unchanged. For any other EXP value, the | ||||
IP header should be set to the same marking state as the EXP value of | ||||
the popped label stack entry. If the inner IP header value is AM, it | ||||
should be unchanged if the popped EXP value was AM, and it should be | ||||
set to PM if the popped EXP value was PM. If the popped EXP value | ||||
was NM, this should be logged in some way and the inner IP header | ||||
value should be unchanged. If the IP header value is PM, it should | ||||
be unchanged whatever the popped EXP value was, but any EXP value | ||||
other than PM should be logged. | ||||
5. ECN-disabled MPLS domain | 5. ECN-disabled MPLS domain | |||
If ECN is not enabled on all the egress LSRs of a domain, ECN MUST | If ECN is not enabled on all the egress LSRs of a domain, ECN MUST | |||
NOT be enabled on any LSRs throughout the domain. If congestion is | NOT be enabled on any LSRs throughout the domain. If congestion is | |||
experienced on any LSR in an ECN-disabled MPLS domain, packets MUST | experienced on any LSR in an ECN-disabled MPLS domain, packets MUST | |||
be dropped NOT marked. The exact algorithm for deciding when to drop | be dropped, NOT marked. The exact algorithm for deciding when to | |||
packets during congestion (e.g. tail-drop, RED, etc.) is a local | drop packets during congestion (e.g. tail-drop, RED, etc.) is a local | |||
matter for the operator of the domain. | matter for the operator of the domain. | |||
6. The use of more codepoints with E-LSPs and L-LSPs | 6. The use of more codepoints with E-LSPs and L-LSPs | |||
RFC 3270 gives different options with E-LSPs and L-LSPs and some of | RFC 3270 gives different options with E-LSPs and L-LSPs and some of | |||
those could potentially provide ample EXP codepoints for ECN/PCN. | those could potentially provide ample EXP codepoints for ECN/PCN. | |||
However, deploying L-LSPs vs E-LSPs has many implications such as | However, deploying L-LSPs vs E-LSPs has many implications such as | |||
platform support and operational complexity. The above ECN/PCN MPLS | platform support and operational complexity. The above ECN/PCN MPLS | |||
solution should provide some flexibility. If the operator has | solution should provide some flexibility. If the operator has | |||
deployed one L-LSP per PHB scheduling class, then EXP space will be a | deployed one L-LSP per PHB scheduling class, then EXP space will be a | |||
non-issue and it could be used to achieve more sophisticated ECN/PCN | non-issue and it could be used to achieve more sophisticated ECN/PCN | |||
behavior if required. If the operator wants to stick to E-LSPs and | behavior if required. If the operator wants to stick to E-LSPs and | |||
uses a handful of EXP codepoints for Diffserv, it may be desirable to | uses a handful of EXP codepoints for Diffserv, it may be desirable to | |||
operate with a minimum number of extra ECN/PCN codepoints, even if | operate with a minimum number of extra ECN/PCN codepoints, even if | |||
this comes with some compromise on ECN/PCN optimality. See Section 8 | this comes with some compromise on ECN/PCN optimality. See Section 8 | |||
for discussion of some possible deployment scenarios. | for discussion of some possible deployment scenarios. | |||
7. Relationship to tunnel behavior in RFC 3168 | 7. Relationship to tunnel behavior in RFC 3168 | |||
[RFC3168] defines two modes of encapsulating ECN-marked IP packets | [RFC3168] defines two modes of encapsulating ECN-marked IP packets | |||
inside additonal IP headers when tunnels are used. The two modes are | inside additional IP headers when tunnels are used. The two modes | |||
the "full functionality" and "limited functionality" modes. In the | are the "full functionality" and "limited functionality" modes. In | |||
full functionality mode, the ECT information from the inner header is | the full functionality mode, the ECT information from the inner | |||
copied to the outer header at the tunnel ingress, but the CE | header is copied to the outer header at the tunnel ingress, but the | |||
information is not. In the limited functionality mode, neither ECT | CE information is not. In the limited functionality mode, neither | |||
nor CE information is copied to the outer header, and thus ECN cannot | ECT nor CE information is copied to the outer header, and thus ECN | |||
be applied to the encapsulated packet. | cannot be applied to the encapsulated packet. | |||
The behavior that is specified in Section 4 of this document | The behavior that is specified in Section 4 of this document | |||
resembles the "full functionality" mode in the sense that it conveys | resembles the "full functionality" mode in the sense that it conveys | |||
some information from inner to outer header, and in the sense that it | some information from inner to outer header, and in the sense that it | |||
enables full ECN support along the MPLS LSP (which is analogous to an | enables full ECN support along the MPLS LSP (which is analogous to an | |||
IP tunnel in this context). However it differs in one respect, which | IP tunnel in this context). However it differs in one respect, which | |||
is that the CE information is conveyed from the inner header to the | is that the CE information is conveyed from the inner header to the | |||
outer header. Our reason for this different design choice is to give | outer header. Our reason for this different design choice is to give | |||
interior routers and LSRs more information about upstream marking in | interior routers and LSRs more information about upstream marking in | |||
multi-bottleneck cases. For instance, the flow pre-emption marking | multi-bottleneck cases. For instance, the flow pre-emption marking | |||
skipping to change at page 13, line 42 | skipping to change at page 15, line 21 | |||
on whether a packet has already suffered upstream marking. The | on whether a packet has already suffered upstream marking. The | |||
currently proposed pre-emption marking in PCN is an example where | currently proposed pre-emption marking in PCN is an example where | |||
such an exception would be necessary (see the discussion at the start | such an exception would be necessary (see the discussion at the start | |||
of Section 7). | of Section 7). | |||
8. Example Uses | 8. Example Uses | |||
8.1. RFC3168-style ECN | 8.1. RFC3168-style ECN | |||
[RFC3168] proposes the use of ECN in TCP and introduces the use of | [RFC3168] proposes the use of ECN in TCP and introduces the use of | |||
ECN-Echo and CWR flags in the TCP header for initialisation. When a | ECN-Echo and CWR flags in the TCP header for initialization. When a | |||
TCP sender receives an ECN-Echo (ECE) ACK packet (that is, an ACK | TCP sender receives an ECN-Echo (ECE) ACK packet (that is, an ACK | |||
packet with the ECN-Echo flag set in the TCP header), it knows that | packet with the ECN-Echo flag set in the TCP header), it knows that | |||
congestion was encountered in the network on the path from the | congestion was encountered in the network on the path from the | |||
sender to the receiver, and it responds accordingly (for example, by | sender to the receiver, and it responds accordingly (for example, by | |||
not increasing the congestion window). | not increasing the congestion window). | |||
It would be possible to enable ECN in an MPLS domain for Diffserv | It would be possible to enable ECN in an MPLS domain for Diffserv | |||
PHBs like AF and best efforts that are expected to be used by TCP and | PHBs like AF and best efforts that are expected to be used by TCP and | |||
similar transports (e.g. DCCP [RFC4340]). Then end-to-end | similar transports (e.g. DCCP [RFC4340]). Then end-to-end | |||
congestion control in transports capable of understanding ECN would | congestion control in transports capable of understanding ECN would | |||
skipping to change at page 15, line 4 | skipping to change at page 16, line 32 | |||
[I-D.briscoe-tsvwg-cl-architecture] proposes using pre-congestion | [I-D.briscoe-tsvwg-cl-architecture] proposes using pre-congestion | |||
notification (PCN) on routers within an edge-to-edge Diffserv region | notification (PCN) on routers within an edge-to-edge Diffserv region | |||
to control admission of new flows to the region and, if necessary, to | to control admission of new flows to the region and, if necessary, to | |||
pre-empt existing flows in response to disasters and other anomalous | pre-empt existing flows in response to disasters and other anomalous | |||
routing events. In this approach, the current level of PCN marking | routing events. In this approach, the current level of PCN marking | |||
is picked up by the signalling used to initiate each flow in order to | is picked up by the signalling used to initiate each flow in order to | |||
inform the admission control decision for the whole region at once. | inform the admission control decision for the whole region at once. | |||
As an example, a minor extension to RSVP signalling has been proposed | As an example, a minor extension to RSVP signalling has been proposed | |||
[I-D.lefaucheur-rsvp-ecn] to carry this message, but a similar | [I-D.lefaucheur-rsvp-ecn] to carry this message, but a similar | |||
approach has also been proposed that uses NSIS signalling [I-D.ietf- | approach has also been proposed that uses NSIS signalling | |||
nsis-rmd]. | [I-D.ietf-nsis-rmd]. | |||
If it is possible for LSRs to signify congestion in MPLS, PCN marking | If it is possible for LSRs to signify congestion in MPLS, PCN marking | |||
could be used for admission control and flow pre-emption across a | could be used for admission control and flow pre-emption across a | |||
Diffserv region, irrespective of whether it contained pure IP | Diffserv region, irrespective of whether it contained pure IP | |||
routers, MPLS LSRs, or both. Indeed, the solution could be somewhat | routers, MPLS LSRs, or both. Indeed, the solution could be somewhat | |||
more efficient to implement if aggregates could identify themselves | more efficient to implement if aggregates could identify themselves | |||
by their MPLS label. Section 4.8 describes the mechanisms by which | by their MPLS label. Section 4.8 describes the mechanisms by which | |||
the necessary markings for PCN could be carried in the MPLS header. | the necessary markings for PCN could be carried in the MPLS header. | |||
As an illustrative example of how the EXP field might be used in this | As an illustrative example of how the EXP field might be used in this | |||
skipping to change at page 16, line 13 | skipping to change at page 17, line 40 | |||
dropped later should become less prevalent as more transports use | dropped later should become less prevalent as more transports use | |||
ECN. This is why we chose not to use the [Floyd] alternative which | ECN. This is why we chose not to use the [Floyd] alternative which | |||
introduced a low but persistent level of unnecessary packet drop for | introduced a low but persistent level of unnecessary packet drop for | |||
all time. Although that scheme did not carry droppable traffic to | all time. Although that scheme did not carry droppable traffic to | |||
the edge of the MPLS domain, we felt this was a small price to pay, | the edge of the MPLS domain, we felt this was a small price to pay, | |||
and it was anyway only of concern until ECN had become more widely | and it was anyway only of concern until ECN had become more widely | |||
deployed. | deployed. | |||
A partial solution would be to preferentially drop packets arriving | A partial solution would be to preferentially drop packets arriving | |||
at a congested router that were already marked. There is no solution | at a congested router that were already marked. There is no solution | |||
to the problem of marking a packet congested by another packet that | to the problem of marking a packet when congestion is caused by | |||
should have been dropped. However, the chance of such an occurrence | another packet that should have been dropped. However, the chance of | |||
is very low and the consequences are not significant. It merely | such an occurrence is very low and the consequences are not | |||
causes an application to very occasionally slow down its rate when it | significant. It merely causes an application to very occasionally | |||
did not have to. | slow down its rate when it did not have to. | |||
9.2. Non-ECN capable routers in an MPLS Domain | 9.2. Non-ECN capable routers in an MPLS Domain | |||
What if an MPLS domain wants to use ECN, but not all legacy routers | What if an MPLS domain wants to use ECN, but not all legacy routers | |||
are able to support it? | are able to support it? | |||
If the legacy router(s) are used in the interior, this is not a | If the legacy router(s) are used in the interior, this is not a | |||
problem. They will simply have to drop the packets if they are | problem. They will simply have to drop the packets if they are | |||
congested, rather than mark them, which is the standard behaviour for | congested, rather than mark them, which is the standard behavior for | |||
IP routers that are not ECN-enabled. | IP routers that are not ECN-enabled. | |||
If the legacy router were used as an egress router, it would not be | If the legacy router were used as an egress router, it would not be | |||
able to check the ECN capability of the transport correctly. An | able to check the ECN capability of the transport correctly. An | |||
operator in this position would not be able to use this solution and | operator in this position would not be able to use this solution and | |||
therefore MUST NOT enable ECN unless all egress routers are ECN- | therefore MUST NOT enable ECN unless all egress routers are ECN- | |||
capable. | capable. | |||
10. IANA Considerations | 10. IANA Considerations | |||
skipping to change at page 17, line 21 | skipping to change at page 18, line 47 | |||
without requiring any specific support from the proposal in this | without requiring any specific support from the proposal in this | |||
draft. The nonce does not need to be present in the MPLS shim | draft. The nonce does not need to be present in the MPLS shim | |||
header. As long as the nonce is present in the IP header when the | header. As long as the nonce is present in the IP header when the | |||
ECN information is copied from the last MPLS shim header, it will be | ECN information is copied from the last MPLS shim header, it will be | |||
overwritten if congestion has been experienced by an LSR. This is | overwritten if congestion has been experienced by an LSR. This is | |||
all that is necessary for the sender to detect a misbehaving | all that is necessary for the sender to detect a misbehaving | |||
receiver. | receiver. | |||
An alternative proposal currently in progress in the IETF | An alternative proposal currently in progress in the IETF | |||
[I-D.briscoe-tsvwg-re-ecn-tcp] allows the network to prevent | [I-D.briscoe-tsvwg-re-ecn-tcp] allows the network to prevent | |||
misbehaviour by senders or receivers or other routers. Like the ECN | misbehavior by senders or receivers or other routers. Like the ECN | |||
nonce, it works correctly without requiring any specific support from | nonce, it works correctly without requiring any specific support from | |||
the proposal in this draft. It uses a bit in the IP header (the RE | the proposal in this draft. It uses a bit in the IP header (the RE | |||
bit) which is set by the sender and never changed along the path; it | bit) which is set by the sender and never changed along the path; it | |||
is only read by certain policing elements in the network. There is | is only read by certain policing elements in the network. There is | |||
no need for a copy of this bit in the MPLS shim, as policing nodes | no need for a copy of this bit in the MPLS shim, as policing nodes | |||
can examine the IP header if they need to, particularly given they | can examine the IP header if they need to, particularly given they | |||
are intended to only be necessary at domain borders where MPLS | are intended to only be necessary at domain borders where MPLS | |||
headers are often removed. | headers are often removed. | |||
12. Acknowledgements | 12. Acknowledgements | |||
Thanks to K.K. Ramakrishnan and Sally Floyd for getting us thinking | Thanks to K.K. Ramakrishnan and Sally Floyd for getting us thinking | |||
about this in the first place and for providing advice on tunneling | about this in the first place and for providing advice on tunneling | |||
of ECN packets, and to Joe Babiarz and Ben Niven-Jenkins for their | of ECN packets, and to Joe Babiarz, Ben Niven-Jenkins, Phil Eardley, | |||
comments on the draft. | and Ruediger Geib for their comments on the draft. | |||
13. References | 13. References | |||
13.1. Normative References | 13.1. Normative References | |||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", BCP 14, RFC 2119, March 1997. | Requirement Levels", BCP 14, RFC 2119, March 1997. | |||
[RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z., | [RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z., | |||
and W. Weiss, "An Architecture for Differentiated | and W. Weiss, "An Architecture for Differentiated | |||
skipping to change at page 18, line 30 | skipping to change at page 20, line 13 | |||
Services", RFC 3270, May 2002. | Services", RFC 3270, May 2002. | |||
13.2. Informative References | 13.2. Informative References | |||
[Floyd] "A Proposal to Incorporate ECN in MPLS", 1999. | [Floyd] "A Proposal to Incorporate ECN in MPLS", 1999. | |||
Work in progress. http://www.icir.org/floyd/papers/ | Work in progress. http://www.icir.org/floyd/papers/ | |||
draft-ietf-mpls-ecn-00.txt | draft-ietf-mpls-ecn-00.txt | |||
[I-D.briscoe-tsvwg-cl-architecture] | [I-D.briscoe-tsvwg-cl-architecture] | |||
Briscoe, B., "A Framework for Admission Control over | Briscoe, B., "An edge-to-edge Deployment Model for Pre- | |||
DiffServ using Pre-Congestion Notification", | Congestion Notification: Admission Control over a | |||
draft-briscoe-tsvwg-cl-architecture-02 (work in progress), | DiffServ Region", draft-briscoe-tsvwg-cl-architecture-03 | |||
March 2006. | (work in progress), June 2006. | |||
[I-D.briscoe-tsvwg-cl-phb] | [I-D.briscoe-tsvwg-cl-phb] | |||
Briscoe, B., "Pre-Congestion Notification marking", | Briscoe, B., "Pre-Congestion Notification marking", | |||
draft-briscoe-tsvwg-cl-phb-01 (work in progress), | draft-briscoe-tsvwg-cl-phb-02 (work in progress), | |||
March 2006. | June 2006. | |||
[I-D.briscoe-tsvwg-re-ecn-border-cheat] | [I-D.briscoe-tsvwg-re-ecn-border-cheat] | |||
Briscoe, B., "Emulating Border Flow Policing using Re-ECN | Briscoe, B., "Emulating Border Flow Policing using Re-ECN | |||
on Bulk Data", draft-briscoe-tsvwg-re-ecn-border-cheat-00 | on Bulk Data", draft-briscoe-tsvwg-re-ecn-border-cheat-01 | |||
(work in progress), February 2006. | (work in progress), June 2006. | |||
[I-D.briscoe-tsvwg-re-ecn-tcp] | [I-D.briscoe-tsvwg-re-ecn-tcp] | |||
Briscoe, B., "Re-ECN: Adding Accountability for Causing | Briscoe, B., "Re-ECN: Adding Accountability for Causing | |||
Congestion to TCP/IP", draft-briscoe-tsvwg-re-ecn-tcp-01 | Congestion to TCP/IP", draft-briscoe-tsvwg-re-ecn-tcp-02 | |||
(work in progress), March 2006. | (work in progress), June 2006. | |||
[I-D.chan-tsvwg-diffserv-class-aggr] | [I-D.chan-tsvwg-diffserv-class-aggr] | |||
Chan, K., "Aggregation of DiffServ Service Classes", | Chan, K., "Aggregation of DiffServ Service Classes", | |||
draft-chan-tsvwg-diffserv-class-aggr-03 (work in | draft-chan-tsvwg-diffserv-class-aggr-03 (work in | |||
progress), January 2006. | progress), January 2006. | |||
[I-D.ietf-nsis-rmd] | [I-D.ietf-nsis-rmd] | |||
Bader, A., "RMD-QOSM - The Resource Management in Diffserv | Bader, A., "RMD-QOSM - The Resource Management in Diffserv | |||
QOS Model", draft-ietf-nsis-rmd-06 (work in progress), | QOS Model", draft-ietf-nsis-rmd-07 (work in progress), | |||
February 2006. | June 2006. | |||
[I-D.lefaucheur-rsvp-ecn] | [I-D.lefaucheur-rsvp-ecn] | |||
Faucheur, F., "RSVP Extensions for Admission Control over | Faucheur, F., "RSVP Extensions for Admission Control over | |||
Diffserv using Pre-congestion Notification", | Diffserv using Pre-congestion Notification (PCN)", | |||
draft-lefaucheur-rsvp-ecn-00 (work in progress), | draft-lefaucheur-rsvp-ecn-01 (work in progress), | |||
October 2005. | June 2006. | |||
[RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit | [RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit | |||
Congestion Notification (ECN) Signaling with Nonces", | Congestion Notification (ECN) Signaling with Nonces", | |||
RFC 3540, June 2003. | RFC 3540, June 2003. | |||
[RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram | [RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram | |||
Congestion Control Protocol (DCCP)", RFC 4340, March 2006. | Congestion Control Protocol (DCCP)", RFC 4340, March 2006. | |||
[Shayman] "Using ECN to Signal Congestion Within an MPLS Domain", | [Shayman] "Using ECN to Signal Congestion Within an MPLS Domain", | |||
2000. | 2000. | |||
skipping to change at page 21, line 5 | skipping to change at page 22, line 5 | |||
BT Research | BT Research | |||
B54/77, Sirius House | B54/77, Sirius House | |||
Adastral Park | Adastral Park | |||
Martlesham Heath | Martlesham Heath | |||
Ipswich | Ipswich | |||
Suffolk IP5 3RE | Suffolk IP5 3RE | |||
United Kingdom | United Kingdom | |||
Email: june.tay@bt.com | Email: june.tay@bt.com | |||
Intellectual Property Statement | Full Copyright Statement | |||
Copyright (C) The Internet Society (2006). | ||||
This document is subject to the rights, licenses and restrictions | ||||
contained in BCP 78, and except as set forth therein, the authors | ||||
retain all their rights. | ||||
This document and the information contained herein are provided on an | ||||
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS | ||||
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET | ||||
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, | ||||
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE | ||||
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED | ||||
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. | ||||
Intellectual Property | ||||
The IETF takes no position regarding the validity or scope of any | The IETF takes no position regarding the validity or scope of any | |||
Intellectual Property Rights or other rights that might be claimed to | Intellectual Property Rights or other rights that might be claimed to | |||
pertain to the implementation or use of the technology described in | pertain to the implementation or use of the technology described in | |||
this document or the extent to which any license under such rights | this document or the extent to which any license under such rights | |||
might or might not be available; nor does it represent that it has | might or might not be available; nor does it represent that it has | |||
made any independent effort to identify any such rights. Information | made any independent effort to identify any such rights. Information | |||
on the procedures with respect to rights in RFC documents can be | on the procedures with respect to rights in RFC documents can be | |||
found in BCP 78 and BCP 79. | found in BCP 78 and BCP 79. | |||
skipping to change at page 21, line 29 | skipping to change at page 22, line 45 | |||
such proprietary rights by implementers or users of this | such proprietary rights by implementers or users of this | |||
specification can be obtained from the IETF on-line IPR repository at | specification can be obtained from the IETF on-line IPR repository at | |||
http://www.ietf.org/ipr. | http://www.ietf.org/ipr. | |||
The IETF invites any interested party to bring to its attention any | The IETF invites any interested party to bring to its attention any | |||
copyrights, patents or patent applications, or other proprietary | copyrights, patents or patent applications, or other proprietary | |||
rights that may cover technology that may be required to implement | rights that may cover technology that may be required to implement | |||
this standard. Please address the information to the IETF at | this standard. Please address the information to the IETF at | |||
ietf-ipr@ietf.org. | ietf-ipr@ietf.org. | |||
Disclaimer of Validity | ||||
This document and the information contained herein are provided on an | ||||
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS | ||||
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET | ||||
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, | ||||
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE | ||||
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED | ||||
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. | ||||
Copyright Statement | ||||
Copyright (C) The Internet Society (2006). This document is subject | ||||
to the rights, licenses and restrictions contained in BCP 78, and | ||||
except as set forth therein, the authors retain all their rights. | ||||
Acknowledgment | Acknowledgment | |||
Funding for the RFC Editor function is currently provided by the | Funding for the RFC Editor function is provided by the IETF | |||
Internet Society. | Administrative Support Activity (IASA). | |||
End of changes. 46 change blocks. | ||||
177 lines changed or deleted | 252 lines changed or added | |||