draft-ietf-tsvwg-ecn-mpls-00.txt | draft-ietf-tsvwg-ecn-mpls-01.txt | |||
---|---|---|---|---|
Network Working Group B. Davie | Network Working Group B. Davie | |||
Internet-Draft Cisco Systems, Inc. | Internet-Draft Cisco Systems, Inc. | |||
Intended status: Standards Track B. Briscoe | Intended status: Standards Track B. Briscoe | |||
Expires: August 24, 2007 J. Tay | Expires: December 21, 2007 J. Tay | |||
BT Research | BT Research | |||
February 20, 2007 | June 19, 2007 | |||
Explicit Congestion Marking in MPLS | Explicit Congestion Marking in MPLS | |||
draft-ietf-tsvwg-ecn-mpls-00.txt | draft-ietf-tsvwg-ecn-mpls-01.txt | |||
Status of this Memo | Status of this Memo | |||
By submitting this Internet-Draft, each author represents that any | By submitting this Internet-Draft, each author represents that any | |||
applicable patent or other IPR claims of which he or she is aware | applicable patent or other IPR claims of which he or she is aware | |||
have been or will be disclosed, and any of which he or she becomes | have been or will be disclosed, and any of which he or she becomes | |||
aware will be disclosed, in accordance with Section 6 of BCP 79. | aware will be disclosed, in accordance with Section 6 of BCP 79. | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF), its areas, and its working groups. Note that | Task Force (IETF), its areas, and its working groups. Note that | |||
skipping to change at page 1, line 36 | skipping to change at page 1, line 36 | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
http://www.ietf.org/ietf/1id-abstracts.txt. | http://www.ietf.org/ietf/1id-abstracts.txt. | |||
The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
http://www.ietf.org/shadow.html. | http://www.ietf.org/shadow.html. | |||
This Internet-Draft will expire on August 24, 2007. | This Internet-Draft will expire on December 21, 2007. | |||
Copyright Notice | Copyright Notice | |||
Copyright (C) The IETF Trust (2007). | Copyright (C) The IETF Trust (2007). | |||
Abstract | Abstract | |||
RFC 3270 defines how to support the Diffserv architecture in MPLS | RFC 3270 defines how to support the Diffserv architecture in MPLS | |||
networks, including how to encode Diffserv Code Points (DSCPs) in an | networks, including how to encode Diffserv Code Points (DSCPs) in an | |||
MPLS header. DSCPs may be encoded in the EXP field, while other uses | MPLS header. DSCPs may be encoded in the EXP field, while other uses | |||
skipping to change at page 3, line 5 | skipping to change at page 2, line 13 | |||
in the MPLS header. This draft defines how an operator might define | in the MPLS header. This draft defines how an operator might define | |||
some of the EXP codepoints for explicit congestion notification, | some of the EXP codepoints for explicit congestion notification, | |||
without precluding other uses. | without precluding other uses. | |||
Requirements Language | Requirements Language | |||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
document are to be interpreted as described in RFC 2119 [RFC2119]. | document are to be interpreted as described in RFC 2119 [RFC2119]. | |||
Table of Contents | Change History | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 | [Note to RFC Editor: This section to be removed before publication] | |||
1.1. Change History . . . . . . . . . . . . . . . . . . . . . . 4 | ||||
1.2. Background . . . . . . . . . . . . . . . . . . . . . . . . 4 | ||||
1.3. Intent . . . . . . . . . . . . . . . . . . . . . . . . . . 5 | ||||
1.4. Terminology . . . . . . . . . . . . . . . . . . . . . . . 5 | ||||
2. Use of MPLS EXP Field for ECN . . . . . . . . . . . . . . . . 6 | ||||
3. Per-domain ECT checking . . . . . . . . . . . . . . . . . . . 8 | ||||
4. ECN-enabled MPLS domain . . . . . . . . . . . . . . . . . . . 9 | ||||
4.1. Pushing (adding) one or more labels to an IP packet . . . 9 | ||||
4.2. Pushing one or more labels onto an MPLS labelled packet . 9 | ||||
4.3. Congestion experienced in an interior MPLS node . . . . . 9 | ||||
4.4. Crossing a Diffserv Domain Boundary . . . . . . . . . . . 10 | ||||
4.5. Popping an MPLS label (not the end of the stack) . . . . . 10 | ||||
4.6. Popping the last MPLS label in the stack . . . . . . . . . 10 | ||||
4.7. Diffserv Tunneling Models . . . . . . . . . . . . . . . . 11 | ||||
4.8. Extension to Pre-Congestion Notification . . . . . . . . . 11 | ||||
4.8.1. Label Push onto IP packet . . . . . . . . . . . . . . 12 | ||||
4.8.2. Pushing Additional MPLS Labels . . . . . . . . . . . . 12 | ||||
4.8.3. Admission Control or Pre-emption Marking inside | ||||
MPLS domain . . . . . . . . . . . . . . . . . . . . . 12 | ||||
4.8.4. Popping an MPLS Label (not end of stack) . . . . . . . 12 | ||||
4.8.5. Popping the last MPLS Label to expose IP header . . . 12 | ||||
5. ECN-disabled MPLS domain . . . . . . . . . . . . . . . . . . . 13 | ||||
6. The use of more codepoints with E-LSPs and L-LSPs . . . . . . 13 | ||||
7. Relationship to tunnel behavior in RFC 3168 . . . . . . . . . 14 | ||||
7.1. Alternative approach to support ECN in an MPLS domain . . 14 | ||||
8. Example Uses . . . . . . . . . . . . . . . . . . . . . . . . . 15 | ||||
8.1. RFC3168-style ECN . . . . . . . . . . . . . . . . . . . . 15 | ||||
8.2. ECN Co-existence with Diffserv E-LSPs . . . . . . . . . . 15 | ||||
8.3. Congestion-feedback-based Traffic Engineering . . . . . . 16 | ||||
8.4. PCN flow admission control and flow pre-emption . . . . . 16 | ||||
9. Deployment Considerations . . . . . . . . . . . . . . . . . . 17 | ||||
9.1. Marking non-ECN Capable Packets . . . . . . . . . . . . . 17 | ||||
9.2. Non-ECN capable routers in an MPLS Domain . . . . . . . . 18 | ||||
10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 18 | ||||
11. Security Considerations . . . . . . . . . . . . . . . . . . . 18 | ||||
12. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 19 | ||||
13. References . . . . . . . . . . . . . . . . . . . . . . . . . . 19 | ||||
13.1. Normative References . . . . . . . . . . . . . . . . . . . 19 | ||||
13.2. Informative References . . . . . . . . . . . . . . . . . . 20 | ||||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 21 | ||||
Intellectual Property and Copyright Statements . . . . . . . . . . 22 | ||||
1. Introduction | Changes in this version (draft-ietf-tsvwg-ecn-mpls-01.txt) relative | |||
to the last (draft-ietf-tsvwg-ecn-mpls-00.txt): | ||||
1.1. Change History | o Moved the detailed discussion of marking procedures for Pre- | |||
Congestion Notification (PCN) to an appendix. | ||||
[Note to RFC Editor: This section to be removed before publication] | o Removed PCN as a motivation for the efficient code-point usage in | |||
Section 2. | ||||
This version (draft-ietf-tsvwg-ecn-mpls-00.txt) differs from the last | o Clarified the rationale for preferring the ECT-checking approach | |||
(draft-davie-mpls-ecn-01.txt) only in title, date, and updated | over the approach of [Floyd] in Section 9.1. | |||
references. | ||||
Changes from draft-davie-ecn-mpls-00 to draft-davie-ecn-mpls-01: | o Updated discussion of relationship to RFC3168 in Section 7 | |||
o Removed discussion of re-ECN from Security Considerations. | ||||
o Fixed typos and nits. | ||||
Changes in draft-ietf-tsvwg-ecn-mpls-00.txt relative to | ||||
draft-davie-ecn-mpls-00: | ||||
o Corrected the description of ECN-MPLS marking proposed in | o Corrected the description of ECN-MPLS marking proposed in | |||
[Shayman], which closely corresponds to that proposed in this | [Shayman], which closely corresponds to that proposed in this | |||
document. | document. | |||
o Pre-congestion notification (PCN) marking is now described in a | o Pre-congestion notification (PCN) marking is now described in a | |||
way that does not require normative references to PCN | way that does not require normative references to PCN | |||
specifications. PCN discussion now serves only to illustrate how | specifications. PCN discussion now serves only to illustrate how | |||
the ECN marking concepts can be extended to cover more complex | the ECN marking concepts can be extended to cover more complex | |||
scenarios, with PCN being an example. | scenarios, with PCN being an example. | |||
o Added specification of behavior when MPLS encapsulated packets | o Added specification of behavior when MPLS encapsulated packets | |||
cross from an ECN-enabled domain to a domain that is not ECN- | cross from an ECN-enabled domain to a domain that is not ECN- | |||
enabled. | enabled. | |||
o Clarified that copying MPLS ECN or PCN marking into exposed IP | o Clarified that copying MPLS ECN or PCN marking into exposed IP | |||
header on egress is not mandatory | header on egress is not mandatory | |||
o Fixed typos and nits | o Fixed typos and nits | |||
1.2. Background | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 | ||||
1.1. Background . . . . . . . . . . . . . . . . . . . . . . . . 4 | ||||
1.2. Intent . . . . . . . . . . . . . . . . . . . . . . . . . . 4 | ||||
1.3. Terminology . . . . . . . . . . . . . . . . . . . . . . . 5 | ||||
2. Use of MPLS EXP Field for ECN . . . . . . . . . . . . . . . . 6 | ||||
3. Per-domain ECT checking . . . . . . . . . . . . . . . . . . . 8 | ||||
4. ECN-enabled MPLS domain . . . . . . . . . . . . . . . . . . . 8 | ||||
4.1. Pushing (adding) one or more labels to an IP packet . . . 9 | ||||
4.2. Pushing one or more labels onto an MPLS labelled packet . 9 | ||||
4.3. Congestion experienced in an interior MPLS node . . . . . 9 | ||||
4.4. Crossing a Diffserv Domain Boundary . . . . . . . . . . . 9 | ||||
4.5. Popping an MPLS label (not the end of the stack) . . . . . 10 | ||||
4.6. Popping the last MPLS label in the stack . . . . . . . . . 10 | ||||
4.7. Diffserv Tunneling Models . . . . . . . . . . . . . . . . 10 | ||||
5. ECN-disabled MPLS domain . . . . . . . . . . . . . . . . . . . 11 | ||||
6. The use of more codepoints with E-LSPs and L-LSPs . . . . . . 11 | ||||
7. Relationship to tunnel behavior in RFC 3168 . . . . . . . . . 11 | ||||
8. Example Uses . . . . . . . . . . . . . . . . . . . . . . . . . 12 | ||||
8.1. RFC3168-style ECN . . . . . . . . . . . . . . . . . . . . 12 | ||||
8.2. ECN Co-existence with Diffserv E-LSPs . . . . . . . . . . 12 | ||||
8.3. Congestion-feedback-based Traffic Engineering . . . . . . 13 | ||||
8.4. PCN flow admission control and flow pre-emption . . . . . 13 | ||||
9. Deployment Considerations . . . . . . . . . . . . . . . . . . 14 | ||||
9.1. Marking non-ECN Capable Packets . . . . . . . . . . . . . 14 | ||||
9.2. Non-ECN capable routers in an MPLS Domain . . . . . . . . 15 | ||||
10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 15 | ||||
11. Security Considerations . . . . . . . . . . . . . . . . . . . 15 | ||||
12. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 16 | ||||
Appendix A. Extension to Pre-Congestion Notification . . . . . . 16 | ||||
Appendix A.1. Label Push onto IP packet . . . . . . . . . . . . . 17 | ||||
Appendix A.2. Pushing Additional MPLS Labels . . . . . . . . . . . 17 | ||||
Appendix A.3. Admission Control or Pre-emption Marking inside | ||||
MPLS domain . . . . . . . . . . . . . . . . . . . . 17 | ||||
Appendix A.4. Popping an MPLS Label (not end of stack) . . . . . . 17 | ||||
Appendix A.5. Popping the last MPLS Label to expose IP header . . 18 | ||||
13. References . . . . . . . . . . . . . . . . . . . . . . . . . . 18 | ||||
13.1. Normative References . . . . . . . . . . . . . . . . . . . 18 | ||||
13.2. Informative References . . . . . . . . . . . . . . . . . . 19 | ||||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 20 | ||||
Intellectual Property and Copyright Statements . . . . . . . . . . 22 | ||||
1. Introduction | ||||
1.1. Background | ||||
[RFC3168] defines Explicit Congestion Notification for IP. The | ||||
primary purpose of ECN is to allow congestion to be signalled without | ||||
dropping packets. | ||||
[RFC3270] defines how to support the Diffserv architecture in MPLS | [RFC3270] defines how to support the Diffserv architecture in MPLS | |||
networks, including how to encode Diffserv Code Points (DSCPs) in an | networks, including how to encode Diffserv Code Points (DSCPs) in an | |||
MPLS header. DSCPs may be encoded in the EXP field, while other uses | MPLS header. DSCPs may be encoded in the EXP field, while other uses | |||
of that field are not precluded. RFC3270 makes no statement about | of that field are not precluded. RFC3270 makes no statement about | |||
how Explicit Congestion Notification (ECN) marking might be encoded | how Explicit Congestion Notification (ECN) marking might be encoded | |||
in the MPLS header. This draft defines how an operator might define | in the MPLS header. | |||
some of the EXP codepoints for explicit congestion notification, | ||||
without precluding other uses. In parallel to the activity defining | This draft defines how an operator might define some of the EXP | |||
the addition of ECN to IP [RFC3168], two proposals were made to add | codepoints for explicit congestion notification, without precluding | |||
ECN to MPLS [Floyd][Shayman]. These proposals, however, fell by the | other uses. In parallel to the activity defining the addition of ECN | |||
wayside. With ECN for IP now being a proposed standard, and | to IP [RFC3168], two proposals were made to add ECN to MPLS | |||
developing interest in using pre-congestion notification (PCN) for | [Floyd][Shayman]. These proposals, however, fell by the wayside. | |||
admission control and flow pre-emption | With ECN for IP now being a proposed standard, and developing | |||
[I-D.briscoe-tsvwg-cl-architecture], there is consequent interest in | interest in using pre-congestion notification (PCN) for admission | |||
being able to support ECN across IP networks consisting of MPLS- | control and flow pre-emption [I-D.briscoe-tsvwg-cl-architecture], | |||
enabled domains. Therefore it is necessary to specify the protocol | there is consequent interest in being able to support ECN across IP | |||
for including ECN in the MPLS shim header, and the protocol behavior | networks consisting of MPLS-enabled domains. Therefore it is | |||
of edge MPLS nodes. | necessary to specify the protocol for including ECN in the MPLS shim | |||
header, and the protocol behavior of edge MPLS nodes. | ||||
We note that in [RFC3168] there are four codepoints used for ECN | We note that in [RFC3168] there are four codepoints used for ECN | |||
marking, which are encoded using two bits of the IP header. The MPLS | marking, which are encoded using two bits of the IP header. The MPLS | |||
EXP field is the logical place to encode ECN codepoints, but with | EXP field is the logical place to encode ECN codepoints, but with | |||
only 3 bits (8 codepoints) available, and with the same field being | only 3 bits (8 codepoints) available, and with the same field being | |||
used to convey DSCP information as well, there is a clear incentive | used to convey DSCP information as well, there is a clear incentive | |||
to conserve the number of codepoints consumed for ECN purposes. | to conserve the number of codepoints consumed for ECN purposes. | |||
Efficient use of the EXP field has been a focus of prior drafts | Efficient use of the EXP field has been a focus of prior drafts | |||
[Floyd] [Shayman] and we draw on those efforts in this draft as well. | [Floyd] [Shayman] and we draw on those efforts in this draft as well. | |||
1.3. Intent | We also note that [RFC3168] defines default usage of the ECN field | |||
but allows for the possibility that some Diffserv PHBs might include | ||||
different specifications on how the ECN field is to be used. This | ||||
draft seeks to preserve that capability. | ||||
1.2. Intent | ||||
Our intent is to specify how the MPLS shim header [RFC3032] should | Our intent is to specify how the MPLS shim header [RFC3032] should | |||
denote ECN marking and how MPLS nodes should understand whether the | denote ECN marking and how MPLS nodes should understand whether the | |||
transport for a packet will be ECN capable. We offer this as a | transport for a packet will be ECN capable. We offer this as a | |||
building block, from which to build different congestion notification | building block, from which to build different congestion notification | |||
systems. We do not intend to specify how the resulting congestion | systems. We do not intend to specify how the resulting congestion | |||
notification is fed back to an upstream node that can mitigate | notification is fed back to an upstream node that can mitigate | |||
congestion. For instance, unlike [Shayman], we do not specify edge- | congestion. For instance, unlike [Shayman], we do not specify edge- | |||
to-edge MPLS domain feedback, but we also do not preclude it. | to-edge MPLS domain feedback, but we also do not preclude it. | |||
Nonetheless, we do specify how the egress node of an MPLS domain | Nonetheless, we do specify how the egress node of an MPLS domain | |||
should copy congestion notification from the MPLS shim into the | should copy congestion notification from the MPLS shim into the | |||
underlying IP header if the ECN is to be carried onward towards the | encapsulated IP header if the ECN is to be carried onward towards the | |||
IP receiver. But we do NOT mandate that MPLS congestion notification | IP receiver. But we do NOT mandate that MPLS congestion notification | |||
must be copied into the IP header for onward transmission. This | must be copied into the IP header for onward transmission. This | |||
draft aims to be generic for any use of congestion notification in | draft aims to be generic for any use of congestion notification in | |||
MPLS. PCN or traffic engineering are merely two of many motivating | MPLS. Support of [RFC3168] is our primary motivation; some | |||
applications (see Section 8.) | additional potential applications to illustrate the flexibility of | |||
our approach are described in Section 8. In particular, we aim to | ||||
support possible future schemes that may use more than one level of | ||||
congestion marking. | ||||
1.4. Terminology | 1.3. Terminology | |||
This document draws freely on the terminology of ECN [RFC3168] and | This document draws freely on the terminology of ECN [RFC3168] and | |||
MPLS [RFC3031]. For ease of reference, we have included some | MPLS [RFC3031]. For ease of reference, we have included some | |||
definitions here, but refer the reader to the references above for | definitions here, but refer the reader to the references above for | |||
complete specifications of the relevant technologies: | complete specifications of the relevant technologies: | |||
o CE: Congestion Experienced. One of the states with which a packet | o CE: Congestion Experienced. One of the states with which a packet | |||
may be marked in a network supporting ECN. A packet is marked in | may be marked in a network supporting ECN. A packet is marked in | |||
this state by an ECN-capable router, to indicate that this router | this state by an ECN-capable router, to indicate that this router | |||
was experiencing congestion at the time the packet arrived. | was experiencing congestion at the time the packet arrived. | |||
skipping to change at page 6, line 13 | skipping to change at page 5, line 45 | |||
the transport protocol are ECN-capable. A router may not mark a | the transport protocol are ECN-capable. A router may not mark a | |||
packet as CE unless the packet was marked ECT when it arrived. | packet as CE unless the packet was marked ECT when it arrived. | |||
o Not-ECT: Not ECN capable transport. An end system marks a packet | o Not-ECT: Not ECN capable transport. An end system marks a packet | |||
with this codepoint to indicate that the end-points of the | with this codepoint to indicate that the end-points of the | |||
transport protocol are not ECN-capable. A congested router cannot | transport protocol are not ECN-capable. A congested router cannot | |||
mark such packets as CE, and thus can only drop them to indicate | mark such packets as CE, and thus can only drop them to indicate | |||
congestion. | congestion. | |||
o EXP field. A 3 bit field in the MPLS label header [RFC3032] which | o EXP field. A 3 bit field in the MPLS label header [RFC3032] which | |||
may be used to convey Diffserv information (and used in this draft | may be used to convey Diffserv information (and is also used in | |||
to carry ECN information). | this draft to carry ECN information). | |||
o PHP. Penultimate Hop Popping. An MPLS operation in which the | o PHP. Penultimate Hop Popping. An MPLS operation in which the | |||
penultimate Label Switching Router (LSR) on a Label Switched Path | penultimate Label Switching Router (LSR) on a Label Switched Path | |||
(LSP) removes the top label from the packet before forwarding the | (LSP) removes the top label from the packet before forwarding the | |||
packet to the final LSR on the LSP. | packet to the final LSR on the LSP. | |||
2. Use of MPLS EXP Field for ECN | 2. Use of MPLS EXP Field for ECN | |||
We propose that LSRs configured for explicit congestion notification | We propose that LSRs configured for explicit congestion notification | |||
should use the EXP field in the MPLS shim header. However, RFC 3270 | should use the EXP field in the MPLS shim header. However, [RFC3270] | |||
already defines use of codepoints in the EXP field for differentiated | already defines use of codepoints in the EXP field for differentiated | |||
services. Although it does not preclude other compatible uses of the | services. Although it does not preclude other compatible uses of the | |||
EXP field, this clearly seems to limit the space available for ECN, | EXP field, this clearly seems to limit the space available for ECN, | |||
given the field is only 3 bits (8 codepoints). | given the field is only 3 bits (8 codepoints). | |||
RFC 3270 defines two possible approaches for requesting | [RFC3270] defines two possible approaches for requesting | |||
differentiated service treatment from an LSR. | differentiated service treatment from an LSR. | |||
o In the E-LSP approach, different codepoints of the EXP field in | o In the E-LSP approach, different codepoints of the EXP field in | |||
the MPLS shim header are used to indicate the packet's per hop | the MPLS shim header are used to indicate the packet's per hop | |||
behavior (PHB). | behavior (PHB). | |||
o In the L-LSP approach, an MPLS label is assigned for each PHB | o In the L-LSP approach, an MPLS label is assigned for each PHB | |||
scheduling class (PSC, as defined in [RFC3260]), so that an LSR | scheduling class (PSC, as defined in [RFC3260]), so that an LSR | |||
determines both its forwarding and its scheduling behavior from | determines both its forwarding and its scheduling behavior from | |||
the label. | the label. | |||
If an MPLS domain uses the L-LSP approach, there is likely to be | If an MPLS domain uses the L-LSP approach, there is likely to be | |||
space in the EXP field for ECN codepoint(s). Where the E-LSP | space in the EXP field for ECN codepoint(s). Where the E-LSP | |||
approach is used, then codepoint space in the EXP field is likely to | approach is used, then codepoint space in the EXP field is likely to | |||
be scarce. This draft focuses on interworking ECN marking with the | be scarce. This draft focuses on interworking ECN marking with the | |||
E-LSP approach as it is the tougher problem. Consequently the same | E-LSP approach as it is the tougher problem. Consequently the same | |||
approach can also be applied with L-LSPs. | approach can also be applied with L-LSPs. | |||
We recommend that explicit congestion notification in MPLS should use | We recommend that explicit congestion notification in MPLS should use | |||
codepoints instead of bits in the EXP field. Since not every PHB | codepoints instead of bits in the EXP field. Since not every PHB | |||
will need an associated ECN codepoint and in some applications a | will necessarily require an associated ECN codepoint it would be | |||
given PHB might need two ECN codepoints (see, for | wasteful to assign a dedicated bit for ECN. (There may also be cases | |||
example,[I-D.briscoe-tsvwg-cl-architecture]) it would be wasteful to | where a given PHB might need more than one ECN-like codepoint; see | |||
assign a dedicated bit for ECN. | Section 8.4 for an example.) | |||
For each PHB that uses ECN marking, we assume one EXP codepoint will | For each PHB that uses ECN marking, we assume one EXP codepoint will | |||
be defined meaning not congestion marked (Not-CM), and at least one | be defined meaning not congestion marked (Not-CM), and at least one | |||
other codepoint will be defined meaning congestion marked (CM). | other codepoint will be defined meaning congestion marked (CM). | |||
Therefore, each PHB that uses ECN marking will consume at least two | Therefore, each PHB that uses ECN marking will consume at least two | |||
EXP codepoints. But PHBs that do not use ECN marking will only | EXP codepoints. But PHBs that do not use ECN marking will only | |||
consume one. | consume one. | |||
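The codepoint accounting above can be illustrated with a short sketch. It assumes nothing beyond the counts stated in the text (one codepoint per PHB without ECN marking, one Not-CM codepoint plus at least one CM codepoint per PHB with ECN marking, and 8 codepoints in the 3-bit EXP field). The function and parameter names are hypothetical and do not appear in either draft version.

```python
EXP_CODEPOINTS = 8  # the EXP field is 3 bits wide, so 2**3 codepoints


def exp_codepoints_needed(phbs_with_ecn: int, phbs_without_ecn: int,
                          cm_levels: int = 1) -> int:
    """Count EXP codepoints consumed by a set of PHBs.

    Each PHB using ECN marking needs one Not-CM codepoint plus
    `cm_levels` congestion-marked codepoints (at least one).
    Each PHB not using ECN marking needs a single codepoint.
    """
    if cm_levels < 1:
        raise ValueError("a PHB using ECN needs at least one CM codepoint")
    return phbs_with_ecn * (1 + cm_levels) + phbs_without_ecn


def fits_in_exp(phbs_with_ecn: int, phbs_without_ecn: int,
                cm_levels: int = 1) -> bool:
    """True if the allocation fits within the 8 available EXP codepoints."""
    needed = exp_codepoints_needed(phbs_with_ecn, phbs_without_ecn, cm_levels)
    return needed <= EXP_CODEPOINTS
```

For instance, two PHBs with ECN marking and four without consume exactly 8 codepoints, whereas three of each would need 9 and could not be carried in a single E-LSP.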
Further, we wish to use minimal space in the MPLS shim header to tell | Further, we wish to use minimal space in the MPLS shim header to tell | |||
interior LSRs whether each packet will be received by an ECN-capable | interior LSRs whether each packet will be received by an ECN-capable | |||
skipping to change at page 9, line 12 | skipping to change at page 8, line 40 | |||
In the per-domain ECT checking approach, only the egress nodes check | In the per-domain ECT checking approach, only the egress nodes check | |||
whether an IP packet is destined for an ECN-capable transport. | whether an IP packet is destined for an ECN-capable transport. | |||
Therefore, any single LSR within an MPLS domain MUST NOT be | Therefore, any single LSR within an MPLS domain MUST NOT be | |||
configured to enable ECN marking unless all the egress LSRs | configured to enable ECN marking unless all the egress LSRs | |||
surrounding it are already configured to handle ECN marking. | surrounding it are already configured to handle ECN marking. | |||
We call a domain surrounded by ECN-capable egress LSRs an ECN-enabled | We call a domain surrounded by ECN-capable egress LSRs an ECN-enabled | |||
MPLS domain. This term only implies that all the egress LSRs are | MPLS domain. This term only implies that all the egress LSRs are | |||
ECN-enabled; some interior LSRs may not be ECN-enabled. For | ECN-enabled; some interior LSRs may not be ECN-enabled. For | |||
instance, it would be possible to use legacy LSRs incapable of | instance, it would be possible to use some legacy LSRs incapable of | |||
supporting ECN in the interior of an MPLS domain as long as all the | supporting ECN in the interior of an MPLS domain as long as all the | |||
egress LSRs were ECN-capable. Note that if PHP is used, the | egress LSRs were ECN-capable. Note that if PHP is used, the | |||
"penultimate hop" routers which perform the pop operation do need to | "penultimate hop" routers which perform the pop operation do need to | |||
be ECN-enabled, since they are acting in this context as egress LSRs. | be ECN-enabled, since they are acting in this context as egress LSRs. | |||
4. ECN-enabled MPLS domain | 4. ECN-enabled MPLS domain | |||
In the following subsections we describe various operations affecting | In the following subsections we describe various operations affecting | |||
the ECN marking of a packet that may be performed at MPLS edge and | the ECN marking of a packet that may be performed at MPLS edge and | |||
core LSRs. | core LSRs. | |||
skipping to change at page 11, line 4 | skipping to change at page 10, line 32 | |||
means that if the EXP value of the MPLS header was CM, the packet | means that if the EXP value of the MPLS header was CM, the packet | |||
MUST be dropped. | MUST be dropped. | |||
Assuming an IP packet was exposed, we have to examine whether that | Assuming an IP packet was exposed, we have to examine whether that | |||
packet is ECT or not. A Not-ECT packet MUST be dropped if the EXP | packet is ECT or not. A Not-ECT packet MUST be dropped if the EXP | |||
field is CM. | field is CM. | |||
For the remainder of this section, we describe the behavior that is | For the remainder of this section, we describe the behavior that is | |||
required if the ECN information is to be transferred from the MPLS | required if the ECN information is to be transferred from the MPLS | |||
header into the exposed IP header for onward transmission. As noted | header into the exposed IP header for onward transmission. As noted | |||
in Section 1.3, such behavior is not mandated by this document, but | in Section 1.2, such behavior is not mandated by this document, but | |||
may be selected by an operator. | may be selected by an operator. | |||
If the inner IP packet is Not-ECT, its ECN field remains unchanged if | If the inner IP packet is Not-ECT, its ECN field remains unchanged if | |||
the EXP field is Not-CM. If the ECN field of the inner packet is set | the EXP field is Not-CM. If the ECN field of the inner packet is set | |||
to ECT(0), ECT(1) or CE, the ECN field remains unchanged if the EXP | to ECT(0), ECT(1) or CE, the ECN field remains unchanged if the EXP | |||
field is set to Not-CM. The ECN field is set to CE if the EXP field | field is set to Not-CM. The ECN field is set to CE if the EXP field | |||
is CM. Note that an inner value of CE and an outer value of Not-CM | is CM. Note that an inner value of CE and an outer value of Not-CM | |||
should be considered anomalous, and SHOULD be logged in some way by | should be considered anomalous, and SHOULD be logged in some way by | |||
the LSR. | the LSR. | |||
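The rules above for popping the last label can be summarised as a small decision function. This is a minimal sketch, assuming the operator has chosen to transfer the ECN information into the exposed IP header. The names Ecn, Exp and pop_last_label are hypothetical and are not defined by the draft.

```python
from enum import Enum
from typing import Optional, Tuple


class Ecn(Enum):
    """ECN field of the exposed (inner) IP header."""
    NOT_ECT = "Not-ECT"
    ECT0 = "ECT(0)"
    ECT1 = "ECT(1)"
    CE = "CE"


class Exp(Enum):
    """Congestion state carried by the EXP codepoint of the popped label."""
    NOT_CM = "Not-CM"
    CM = "CM"


def pop_last_label(inner: Ecn, outer: Exp) -> Tuple[Optional[Ecn], bool]:
    """Return (ECN field to forward, anomaly flag).

    A result of None for the ECN field means the packet is dropped.
    The anomaly flag marks the inner-CE / outer-Not-CM combination,
    which the text says SHOULD be logged.
    """
    if outer is Exp.CM:
        if inner is Ecn.NOT_ECT:
            # A Not-ECT packet MUST be dropped if the EXP field is CM.
            return None, False
        # ECT(0), ECT(1) and CE all become (or stay) CE.
        return Ecn.CE, False
    # EXP is Not-CM: the inner ECN field remains unchanged.
    return inner, inner is Ecn.CE
```

Under these assumptions, pop_last_label(Ecn.ECT0, Exp.CM) yields CE with no anomaly, while pop_last_label(Ecn.CE, Exp.NOT_CM) leaves CE in place and raises the anomaly flag.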
skipping to change at page 11, line 31 | skipping to change at page 11, line 10 | |||
particular LSP is carried to the last hop of the LSP and beyond the | particular LSP is carried to the last hop of the LSP and beyond the | |||
last hop. Depending on which mode is preferred by an operator, the | last hop. Depending on which mode is preferred by an operator, the | |||
EXP value or DSCP value of an exposed header following a label pop | EXP value or DSCP value of an exposed header following a label pop | |||
may or may not be dependent on the EXP value of the label that is | may or may not be dependent on the EXP value of the label that is | |||
removed by the pop operation. We believe that in the case of ECN | removed by the pop operation. We believe that in the case of ECN | |||
marking, the use of these models should only apply to the encoding of | marking, the use of these models should only apply to the encoding of | |||
the Diffserv PHB in the EXP value, and that the choice of codepoint | the Diffserv PHB in the EXP value, and that the choice of codepoint | |||
for ECN should always be made based on the procedures described | for ECN should always be made based on the procedures described | |||
above, independent of the tunneling model. | above, independent of the tunneling model. | |||
4.8. Extension to Pre-Congestion Notification | ||||
This section describes how the preceding mechanisms can be extended | ||||
to support PCN [I-D.briscoe-tsvwg-cl-architecture]. Our intent here | ||||
is to show that the mechanisms are readily extended to more complex | ||||
scenarios than ECN, but this section may be safely ignored if one is | ||||
interested only in supporting ECN. | ||||
The relevant aspects of PCN for the purposes of this discussion are: | ||||
o PCN uses 3 states rather than 2 for ECN - these are referred to as | ||||
admission marked (AM), pre-emption marked (PM) and not marked (NM) | ||||
states. (See Section 8.4 for further discussion of PCN and the | ||||
possibility of using fewer codepoints.) | ||||
o A packet can go from NM to AM, from NM to PM, or from AM to PM, | ||||
but no other transition is possible. | ||||
o Whereas ECN-capable packets are identified by the ECT value in the | ||||
IP header, PCN-capability is determined by the PHB of the packet. | ||||
Thus, to support PCN fully in an MPLS domain for a particular PHB, a | ||||
total of 3 codepoints need to be allocated for that PHB. These 3 | ||||
codepoints represent the admission marked (AM), pre-emption marked | ||||
(PM) and not marked (NM) states. The procedures described above need | ||||
to be slightly modified to support this scenario. The following | ||||
procedures are invoked when the topmost DSCP or EXP value indicates a | ||||
PHB that supports PCN. | ||||
4.8.1. Label Push onto IP packet | ||||
If the IP packet header indicates AM, set the EXP value of all | ||||
entries in the label stack to AM. If the IP packet header indicates | ||||
PM, set the EXP value of all entries in the label stack to PM. For | ||||
any other marking of the IP header, set the EXP value of all entries | ||||
in the label stack to NM. | ||||
4.8.2. Pushing Additional MPLS Labels | ||||
The procedures of Section 4.2 apply. | ||||
4.8.3. Admission Control or Pre-emption Marking inside MPLS domain | ||||
The EXP value can be set to AM or PM according to the same procedures | ||||
as described in [I-D.briscoe-tsvwg-cl-phb]. For the purposes of this | ||||
document, it does not matter exactly what algorithms are used to | ||||
decide when to set AM or PM; all that matters is that if a router | ||||
would have marked AM (or PM) in the IP header, it should set the EXP | ||||
value in the MPLS header to the AM (or PM) codepoint. | ||||
4.8.4. Popping an MPLS Label (not end of stack) | ||||
When popping an MPLS Label exposes another MPLS label, the AM or PM | ||||
marking should be transferred to the exposed EXP field in the | ||||
following manner: if the inner EXP value is NM, then it should be set | ||||
to the same marking state as the EXP value of the popped label stack | ||||
entry. If the inner EXP value is AM, it should be unchanged if the | ||||
popped EXP value was AM, and it should be set to PM if the popped EXP | ||||
value was PM. If the popped EXP value was NM, this should be logged | ||||
in some way and the inner EXP value should be unchanged. If the | ||||
inner EXP value is PM, it should be unchanged whatever the popped EXP | ||||
value was, but any EXP value other than PM should be logged. | ||||
4.8.5. Popping the last MPLS Label to expose IP header | ||||
When popping the last MPLS Label exposes the IP header, there are two | ||||
cases to consider: | ||||
o the popping LSR is NOT the egress router of the PCN region, in | ||||
which case AM or PM marking should be transferred to the exposed | ||||
IP header field; or | ||||
o the popping LSR IS the egress router of the PCN region. | ||||
In the latter case, the behavior of the egress LSR is defined in | ||||
[I-D.briscoe-tsvwg-cl-architecture] and is beyond the scope of this | ||||
document. In the former case, the marking should be transferred from | ||||
the popped MPLS header to the exposed IP header as follows: if the | ||||
inner IP header value is neither AM nor PM, and the EXP value was NM, | ||||
then the IP header should be unchanged. For any other EXP value, the | ||||
IP header should be set to the same marking state as the EXP value of | ||||
the popped label stack entry. If the inner IP header value is AM, it | ||||
should be unchanged if the popped EXP value was AM, and it should be | ||||
set to PM if the popped EXP value was PM. If the popped EXP value | ||||
was NM, this should be logged in some way and the inner IP header | ||||
value should be unchanged. If the IP header value is PM, it should | ||||
be unchanged whatever the popped EXP value was, but any EXP value | ||||
other than PM should be logged. | ||||
5. ECN-disabled MPLS domain | 5. ECN-disabled MPLS domain | |||
If ECN is not enabled on all the egress LSRs of a domain, ECN MUST | If ECN is not enabled on all the egress LSRs of a domain, ECN MUST | |||
NOT be enabled on any LSRs throughout the domain. If congestion is | NOT be enabled on any LSRs throughout the domain. If congestion is | |||
experienced on any LSR in an ECN-disabled MPLS domain, packets MUST | experienced on any LSR in an ECN-disabled MPLS domain, packets MUST | |||
be dropped, NOT marked. The exact algorithm for deciding when to | be dropped, NOT marked. The exact algorithm for deciding when to | |||
drop packets during congestion (e.g. tail-drop, RED, etc.) is a local | drop packets during congestion (e.g. tail-drop, RED, etc.) is a local | |||
matter for the operator of the domain. | matter for the operator of the domain. | |||
6. The use of more codepoints with E-LSPs and L-LSPs | 6. The use of more codepoints with E-LSPs and L-LSPs | |||
RFC 3270 gives different options with E-LSPs and L-LSPs and some of | [RFC3270] gives different options with E-LSPs and L-LSPs and some of | |||
those could potentially provide ample EXP codepoints for ECN/PCN. | those could potentially provide ample EXP codepoints for ECN. | |||
However, deploying L-LSPs vs E-LSPs has many implications such as | However, deploying L-LSPs vs E-LSPs has many implications such as | |||
platform support and operational complexity. The above ECN/PCN MPLS | platform support and operational complexity. The above ECN MPLS | |||
solution should provide some flexibility. If the operator has | solution should provide some flexibility. If the operator has | |||
deployed one L-LSP per PHB scheduling class, then EXP space will be a | deployed one L-LSP per PHB scheduling class, then EXP space will be a | |||
non-issue and it could be used to achieve more sophisticated ECN/PCN | non-issue and it could be used to achieve more sophisticated ECN | |||
behavior if required. If the operator wants to stick to E-LSPs and | behavior if required. If the operator wants to stick to E-LSPs and | |||
uses a handful of EXP codepoints for Diffserv, it may be desirable to | uses a handful of EXP codepoints for Diffserv, it may be desirable to | |||
operate with a minimum number of extra ECN/PCN codepoints, even if | operate with a minimum number of extra ECN codepoints, even if this | |||
this comes with some compromise on ECN/PCN optimality. See Section 8 | comes with some compromise on ECN optimality. See Section 8 for | |||
for discussion of some possible deployment scenarios. | discussion of some possible deployment scenarios. | |||
7. Relationship to tunnel behavior in RFC 3168 | 7. Relationship to tunnel behavior in RFC 3168 | |||
[RFC3168] defines two modes of encapsulating ECN-marked IP packets | [RFC3168] defines two modes of encapsulating ECN-marked IP packets | |||
inside additional IP headers when tunnels are used. The two modes | inside additional IP headers when tunnels are used. The two modes | |||
are the "full functionality" and "limited functionality" modes. In | are the "full functionality" and "limited functionality" modes. In | |||
the full functionality mode, the ECT information from the inner | the full functionality mode, the ECT information from the inner | |||
header is copied to the outer header at the tunnel ingress, but the | header is copied to the outer header at the tunnel ingress, but the | |||
CE information is not. In the limited functionality mode, neither | CE information is not. In the limited functionality mode, neither | |||
ECT nor CE information is copied to the outer header, and thus ECN | ECT nor CE information is copied to the outer header, and thus ECN | |||
cannot be applied to the encapsulated packet. | cannot be applied to the encapsulated packet. | |||
The behavior that is specified in Section 4 of this document | The behavior that is specified in Section 4 of this document | |||
resembles the "full functionality" mode in the sense that it conveys | resembles the "full functionality" mode in the sense that it conveys | |||
some information from inner to outer header, and in the sense that it | some information from inner to outer header, and in the sense that it | |||
enables full ECN support along the MPLS LSP (which is analogous to an | enables full ECN support along the MPLS LSP (which is analogous to an | |||
IP tunnel in this context). However, it differs in one respect, which | IP tunnel in this context). However, it differs in one respect, which | |||
is that the CE information is conveyed from the inner header to the | is that the CE information is conveyed from the inner header to the | |||
outer header. Our reason for this different design choice is to give | outer header. Our original reason for this different design choice | |||
interior routers and LSRs more information about upstream marking in | was to give interior routers and LSRs more information about upstream | |||
multi-bottleneck cases. For instance, the flow pre-emption marking | marking in multi-bottleneck cases. For instance, the flow pre- | |||
mechanism proposed for PCN works by only considering packets for | emption marking mechanism proposed for PCN works by only considering | |||
marking that have not already been marked upstream. Unless existing | packets for marking that have not already been marked upstream. | |||
pre-emption marking is copied from the inner to the outer header at | Unless existing pre-emption marking is copied from the inner to the | |||
tunnel ingress, the mechanism doesn't pre-empt enough traffic in | outer header at tunnel ingress, the mechanism doesn't pre-empt enough | |||
cases where anomalous events hit multiple MPLS domains at once. | traffic in cases where anomalous events hit multiple domains at once. | |||
[RFC3168] does not give any reasons against conveying CE information | [RFC3168] does not give any reasons against conveying CE information | |||
from the inner header to the outer in the "full functionality" mode. | from the inner header to the outer in the "full functionality" mode. | |||
So, rather than define different encapsulation methods for ECN and | Furthermore, [RFC4301] specifies that the ECN marking should be | |||
PCN, Section 4 defines a common approach for both. | copied from inner header to outer header in IPsec tunnels, consistent | |||
with the approach defined here. [Briscoe] discusses this issue in | ||||
7.1. Alternative approach to support ECN in an MPLS domain | more detail. In summary, the approach described in Section 4 appears | |||
to be both a sound technical choice and consistent with the current | ||||
It is possible to define an approach for MPLS support of ECN that | state of thinking in the IETF. | |||
more closely resembles that of the full functionality mode of | ||||
[RFC3168]. This approach would differ from that described in | ||||
Section 4 in the following ways: | ||||
o when pushing one or more MPLS labels onto an IP packet, the not-CM | ||||
state is set in the EXP field of all label stack entries | ||||
o when pushing one or more MPLS labels onto an MPLS packet, the | ||||
not-CM state is set in the EXP field of all newly added label | ||||
stack entries | ||||
o when popping an MPLS label and the exposed header is MPLS (i.e. | ||||
this is not the end of stack), the EXP field of the MPLS packet | ||||
should be set to CM if the popped label's EXP value was CM and | ||||
left unchanged otherwise | ||||
o when popping an MPLS label and the exposed header is IP, the IP | ||||
ECN field should be set to CE if the EXP value was CM and if the | ||||
IP header indicated that the packet was ECN capable. If the IP | ||||
header indicated not-ECT and the EXP value was CM, the packet MUST | ||||
be dropped. If the EXP value was not-CM, the ECN field in the IP | ||||
header is unchanged. | ||||
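The four rules of this alternative approach can be sketched as follows. This is an illustrative Python sketch only; the function names and the use of a boolean for the CM/not-CM state are assumptions made for the example.

```python
from typing import List, Optional

NOT_ECT, ECT0, ECT1, CE = "Not-ECT", "ECT(0)", "ECT(1)", "CE"


def alt_push(count: int) -> List[bool]:
    """Push `count` labels; each new entry starts as not-CM (False)."""
    return [False] * count


def alt_pop_to_mpls(popped_cm: bool, exposed_cm: bool) -> bool:
    """Exposed EXP becomes CM if the popped EXP was CM, else is unchanged."""
    return popped_cm or exposed_cm


def alt_pop_to_ip(popped_cm: bool, inner_ecn: str) -> Optional[str]:
    """Return the new IP ECN value, or None if the packet is dropped."""
    if not popped_cm:
        return inner_ecn          # EXP was not-CM: ECN field unchanged
    if inner_ecn == NOT_ECT:
        return None               # marked but not ECN-capable: drop
    return CE                     # marked and ECN-capable: set CE
```

Because `alt_push` never copies an upstream marking into the new labels, a CM value seen at the end of the LSP can only have been set along that LSP.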
The advantages of this scheme over that described in Section 4 are | ||||
greater similarity to [RFC3168], and the ability to determine, at the | ||||
end of an LSP, that congestion either did or did not occur along that | ||||
LSP (since the initial state is always not-CM at the start of an | ||||
LSP). | ||||
A disadvantage of this approach is that exceptions to this rule are | ||||
necessary in cases where the marking process on LSRs needs to depend | ||||
on whether a packet has already suffered upstream marking. The | ||||
currently proposed pre-emption marking in PCN is an example where | ||||
such an exception would be necessary (see the discussion at the start | ||||
of Section 7). | ||||
8. Example Uses | 8. Example Uses | |||
8.1. RFC3168-style ECN | 8.1. RFC3168-style ECN | |||
[RFC3168] proposes the use of ECN in TCP and introduces the use of | [RFC3168] proposes the use of ECN in TCP and introduces the use of | |||
ECN-Echo and CWR flags in the TCP header for initialization. The TCP | ECN-Echo and CWR flags in the TCP header for initialization. The TCP | |||
sender responds accordingly (such as not increasing the congestion | sender responds accordingly (such as not increasing the congestion | |||
window) when it receives an ECN-Echo (ECE) ACK packet (that is, an | window) when it receives an ECN-Echo (ECE) ACK packet (that is, an | |||
ACK packet with ECN-Echo flag set in the TCP header), then the sender | ACK packet with ECN-Echo flag set in the TCP header), then the sender | |||
skipping to change at page 16, line 12 | skipping to change at page 13, line 11 | |||
accomplished by simply allocating a second codepoint to the PHB for | accomplished by simply allocating a second codepoint to the PHB for | |||
the "CM" state of that PHB and retaining the old codepoint for the | the "CM" state of that PHB and retaining the old codepoint for the | |||
"not-CM" state. An operator with only four deployed PHBs could of | "not-CM" state. An operator with only four deployed PHBs could of | |||
course enable ECN marking on all those PHBs. It is easy to imagine | course enable ECN marking on all those PHBs. It is easy to imagine | |||
cases where some PHBs might benefit more from ECN than others - for | cases where some PHBs might benefit more from ECN than others - for | |||
example, an operator might use ECN on a premium data service but not | example, an operator might use ECN on a premium data service but not | |||
on a PHB used for best effort internet traffic. | on a PHB used for best effort internet traffic. | |||
As an illustrative example of how the EXP field might be used in this | As an illustrative example of how the EXP field might be used in this | |||
case, consider the example of an operator who is using the aggregated | case, consider the example of an operator who is using the aggregated | |||
service classes described in [I-D.chan-tsvwg-diffserv-class-aggr]. | service classes proposed in [I-D.ietf-tsvwg-diffserv-class-aggr]. He | |||
He may choose to support ECN only for the Assured Elastic Treatment | may choose to support ECN only for the Assured Elastic Treatment | |||
Aggregate, using the EXP codepoint 010 for the not-CM state and 011 | Aggregate, using the EXP codepoint 010 for the not-CM state and 011 | |||
for the CM state. All other codepoints could be the same as in | for the CM state. All other codepoints could be the same as in | |||
[I-D.chan-tsvwg-diffserv-class-aggr]. Of course any other | [I-D.ietf-tsvwg-diffserv-class-aggr]. Of course any other | |||
combination of EXP values can be used according to the specific set | combination of EXP values can be used according to the specific set | |||
of PHBs and marking conventions used within that operator's network. | of PHBs and marking conventions used within that operator's network. | |||
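As a rough illustration of the example codepoints above, the sketch below rewrites the EXP bits of a 32-bit label stack entry from 010 (not-CM) to 011 (CM). It relies on the label stack entry layout of [RFC3032] (20-bit label, 3-bit EXP, 1-bit S, 8-bit TTL). The function and constant names are hypothetical, and the decision of when to mark is not modelled.

```python
EXP_SHIFT = 9                    # EXP sits above the S bit and the 8-bit TTL
EXP_MASK = 0b111 << EXP_SHIFT

AE_NOT_CM = 0b010                # example not-CM codepoint from the text
AE_CM = 0b011                    # example CM codepoint from the text


def get_exp(entry: int) -> int:
    """Extract the 3-bit EXP field from a label stack entry."""
    return (entry & EXP_MASK) >> EXP_SHIFT


def set_exp(entry: int, exp: int) -> int:
    """Return the entry with its EXP field replaced; other bits are kept."""
    return (entry & ~EXP_MASK & 0xFFFFFFFF) | ((exp & 0b111) << EXP_SHIFT)


def mark_congestion(entry: int) -> int:
    """Change not-CM to CM for the ECN-enabled PHB; leave other EXP values."""
    if get_exp(entry) == AE_NOT_CM:
        return set_exp(entry, AE_CM)
    return entry
```

Entries carrying any other EXP value pass through `mark_congestion` untouched, which matches an operator enabling ECN on a single PHB only.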
8.3. Congestion-feedback-based Traffic Engineering | 8.3. Congestion-feedback-based Traffic Engineering | |||
Shayman's traffic engineering [Shayman] proposed the use of ECN by an | Shayman's traffic engineering [Shayman] proposed the use of ECN by an | |||
egress LSR feeding back congestion to an ingress LSR to mitigate | egress LSR feeding back congestion to an ingress LSR to mitigate | |||
congestion by employing dynamic traffic engineering techniques such | congestion by employing dynamic traffic engineering techniques such | |||
as shifting flows to an alternate path. It proposed a new RSVP | as shifting flows to an alternate path. It proposed a new RSVP | |||
TUNNEL CONGESTION message which was sent to the ingress LSR and | TUNNEL CONGESTION message which was sent to the ingress LSR and | |||
skipping to change at page 16, line 48 | skipping to change at page 13, line 47 | |||
As an example, a minor extension to RSVP signalling has been proposed | As an example, a minor extension to RSVP signalling has been proposed | |||
[I-D.lefaucheur-rsvp-ecn] to carry this message, but a similar | [I-D.lefaucheur-rsvp-ecn] to carry this message, but a similar | |||
approach has also been proposed that uses NSIS signalling | approach has also been proposed that uses NSIS signalling | |||
[I-D.ietf-nsis-rmd]. | [I-D.ietf-nsis-rmd]. | |||
If it is possible for LSRs to signify congestion in MPLS, PCN marking | If it is possible for LSRs to signify congestion in MPLS, PCN marking | |||
could be used for admission control and flow pre-emption across a | could be used for admission control and flow pre-emption across a | |||
Diffserv region, irrespective of whether it contained pure IP | Diffserv region, irrespective of whether it contained pure IP | |||
routers, MPLS LSRs, or both. Indeed, the solution could be somewhat | routers, MPLS LSRs, or both. Indeed, the solution could be somewhat | |||
more efficient to implement if aggregates could identify themselves | more efficient to implement if aggregates could identify themselves | |||
by their MPLS label. Section 4.8 describes the mechanisms by which | by their MPLS label. Appendix A describes the mechanisms by which | |||
the necessary markings for PCN could be carried in the MPLS header. | the necessary markings for PCN could be carried in the MPLS header. | |||
As an illustrative example of how the EXP field might be used in this | As an illustrative example of how the EXP field might be used in this | |||
case, consider the example of an operator who is using the aggregated | case, consider the example of an operator who is using the aggregated | |||
service classes described in [I-D.chan-tsvwg-diffserv-class-aggr]. | service classes proposed in [I-D.ietf-tsvwg-diffserv-class-aggr]. He | |||
He may choose to support PCN only for the Real Time Treatment | may choose to support PCN only for the Real Time Treatment Aggregate, | |||
Aggregate, using the EXP codepoint 100 for the not-marked (NM) state, | using the EXP codepoint 100 for the not-marked (NM) state, 101 for | |||
101 for the Admission Marked (AM) state, and 111 for the Pre-emption | the Admission Marked (AM) state, and 111 for the Pre-emption Marked | |||
Marked (PM) state. All other codepoints could be the same as in | (PM) state. All other codepoints could be the same as in | |||
[I-D.chan-tsvwg-diffserv-class-aggr]. Of course any other | [I-D.ietf-tsvwg-diffserv-class-aggr]. Of course any other | |||
combination of EXP values can be used according to the specific set | combination of EXP values can be used according to the specific set | |||
of PHBs and marking conventions used within that operator's network. | of PHBs and marking conventions used within that operator's network. | |||
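The example PCN codepoints above, together with the rule that a packet may only move from NM to AM, from NM to PM, or from AM to PM, can be sketched as a lookup table and a re-marking helper. The names below are hypothetical, and the codepoint values are simply those of the example, not a recommendation.

```python
from typing import Optional

EXP_TO_PCN = {0b100: "NM", 0b101: "AM", 0b111: "PM"}
PCN_TO_EXP = {state: exp for exp, state in EXP_TO_PCN.items()}
SEVERITY = {"NM": 0, "AM": 1, "PM": 2}


def pcn_state(exp: int) -> Optional[str]:
    """Return the PCN state for an EXP value, or None if not a PCN PHB."""
    return EXP_TO_PCN.get(exp)


def remark(exp: int, wanted: str) -> int:
    """Apply a marking decision, allowing only NM->AM, NM->PM, AM->PM."""
    current = pcn_state(exp)
    if current is None:
        return exp                      # not the PCN-enabled PHB
    if SEVERITY[wanted] > SEVERITY[current]:
        return PCN_TO_EXP[wanted]
    return exp                          # never downgrade a marking
```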
It might also be possible to deploy a similar solution using PCN | It might also be possible to deploy a similar solution using PCN | |||
marking over MPLS for just admission control alone, or just flow pre- | marking over MPLS for just admission control alone, or just flow pre- | |||
emption alone, particularly if codepoint space was at a premium in | emption alone, particularly if codepoint space was at a premium in | |||
the MPLS EXP field. However, the feasibility of deploying one | the MPLS EXP field. However, the feasibility of deploying one | |||
without the other would require further study. | without the other would require further study. We also note that an | |||
approach to deploying PCN using only a single marking codepoint to | ||||
support both pre-emption and admission control has been | ||||
proposed [I-D.charny-pcn-single-marking]. | ||||
9. Deployment Considerations | 9. Deployment Considerations | |||
9.1. Marking non-ECN Capable Packets | 9.1. Marking non-ECN Capable Packets | |||
What is the consequences of marking a packet that is not ECN-capable? | What are the consequences of marking a packet that is not ECN- | |||
Even if it will be dropped before leaving the domain, doesn't this | capable? Even if it will be dropped before leaving the domain, | |||
consume resources unnecessarily? | doesn't this consume resources unnecessarily? | |||
The problem only arises if there is congestion downstream of an | The problem only arises if there is congestion downstream of an | |||
earlier congested node. It might be that marked packets are carried | earlier congested queue in the same MPLS domain. Downstream | |||
through this second congested router when, within the underlying IP | congested LSRs might forward packets already marked, even though they | |||
header they are not ECN capable, so they will be dropped later. Such | will be dropped later when the inner IP header is found to be Not-ECT | |||
packets might cause other packets to be marked (or dropped) that | on decapsulation. Such packets might cause the downstream LSRs to | |||
would not otherwise have been. | mark (or drop) other packets that they would otherwise not have had | |||
to. | ||||
We decided to use the per-domain ECT checking approach because it | We expect congestion will typically be rare in MPLS networks, but it | |||
would become optimal as ECN deployment became prevalent. The | might not be. The extra unnecessary load at downstream LSRs will not | |||
situation where traffic is carried beyond a congested LSR only to be | be more than the fraction of marked packets from upstream LSRs, even | |||
dropped later should become less prevalent as more transports use | in the worst case where no transports are ECN capable. Therefore the | |||
ECN. This is why we chose not to use the [Floyd] alternative which | amount of unnecessary marking (or drop) on an LSR will not be more | |||
introduced a low but persistent level of unnecessary packet drop for | than the product of its local marking rate and the marking rate due | |||
all time. Although that scheme did not carry droppable traffic to | to upstream LSRs within the same domain - typically the product of | |||
the edge of the MPLS domain, we felt this was a small price to pay, | two small (often zero) probabilities. | |||
and it was anyway only of concern until ECN had become more widely | ||||
deployed. | This is why we decided to use the per-domain ECT checking approach - | |||
because the most likely effect would be a very slightly increased | ||||
marking rate, which would result in very slightly higher drop only | ||||
for non-ECN-capable transports. We chose not to use the [Floyd] | ||||
alternative which introduced a low but persistent level of | ||||
unnecessary packet drop for all time, even for ECN-capable | ||||
transports. Although that scheme did not carry traffic to the edge | ||||
of the MPLS domain only to be dropped on decapsulation, we felt our | ||||
minor inefficiency was a small price to pay. And it would get | ||||
smaller still if ECN deployment widened. | ||||
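The bound on unnecessary marking discussed in the newer text of this section (the product of the local marking rate and the upstream marking rate) can be made concrete with a short calculation. The rates used here are invented purely for illustration, and the function name is hypothetical.

```python
from fractions import Fraction


def unnecessary_marking_bound(local_rate: Fraction,
                              upstream_rate: Fraction) -> Fraction:
    """Upper bound on unnecessary marking (or drop) at one LSR.

    Worst case: no transport is ECN-capable, so every packet already
    marked upstream is extra load that will be dropped at decapsulation.
    """
    return local_rate * upstream_rate


# Illustrative figures only: 1% local marking, 2% marked upstream.
bound = unnecessary_marking_bound(Fraction(1, 100), Fraction(2, 100))
```

With these assumed rates the bound is 1/5000, that is 0.02% of packets. If either rate is zero, the bound is zero.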
A partial solution would be to preferentially drop packets arriving | A partial solution would be to preferentially drop packets arriving | |||
at a congested router that were already marked. There is no solution | at a congested router that were already marked. There is no solution | |||
to the problem of marking a packet when congestion is caused by | to the problem of marking a packet when congestion is caused by | |||
another packet that should have been dropped. However, the chance of | another packet that should have been dropped. However, the chance of | |||
such an occurrence is very low and the consequences are not | such an occurrence is very low and the consequences are not | |||
significant. It merely causes an application to very occasionally | significant. It merely causes an application to very occasionally | |||
slow down its rate when it did not have to. | slow down its rate when it did not have to. | |||
9.2. Non-ECN capable routers in an MPLS Domain | 9.2. Non-ECN capable routers in an MPLS Domain | |||
skipping to change at page 19, line 7 | skipping to change at page 16, line 19 | |||
An ECN sender can use the ECN nonce [RFC3540] to detect a misbehaving | An ECN sender can use the ECN nonce [RFC3540] to detect a misbehaving | |||
receiver. The ECN nonce works correctly across an MPLS domain | receiver. The ECN nonce works correctly across an MPLS domain | |||
without requiring any specific support from the proposal in this | without requiring any specific support from the proposal in this | |||
draft. The nonce does not need to be present in the MPLS shim | draft. The nonce does not need to be present in the MPLS shim | |||
header. As long as the nonce is present in the IP header when the | header. As long as the nonce is present in the IP header when the | |||
ECN information is copied from the last MPLS shim header, it will be | ECN information is copied from the last MPLS shim header, it will be | |||
overwritten if congestion has been experienced by an LSR. This is | overwritten if congestion has been experienced by an LSR. This is | |||
all that is necessary for the sender to detect a misbehaving | all that is necessary for the sender to detect a misbehaving | |||
receiver. | receiver. | |||
An alternative proposal currently in progress in the IETF | ||||
[I-D.briscoe-tsvwg-re-ecn-tcp] allows the network to prevent | ||||
misbehavior by senders or receivers or other routers. Like the ECN | ||||
nonce, it works correctly without requiring any specific support from | ||||
the proposal in this draft. It uses a bit in the IP header (the RE | ||||
bit) which is set by the sender and never changed along the path - it | ||||
is only read by certain policing elements in the network. There is | ||||
no need for a copy of this bit in the MPLS shim, as policing nodes | ||||
can examine the IP header if they need to, particularly given they | ||||
are intended to only be necessary at domain borders where MPLS | ||||
headers are often removed. | ||||
12. Acknowledgments | 12. Acknowledgments | |||
Thanks to K.K. Ramakrishnan and Sally Floyd for getting us thinking | Thanks to K.K. Ramakrishnan and Sally Floyd for getting us thinking | |||
about this in the first place and for providing advice on tunneling | about this in the first place and for providing advice on tunneling | |||
of ECN packets, and to Joe Babiarz, Ben Niven-Jenkins, Phil Eardley, | of ECN packets, and to Sally Floyd, Joe Babiarz, Ben Niven-Jenkins, | |||
and Ruediger Geib for their comments on the draft. | Phil Eardley, Ruediger Geib, and Magnus Westerlund for their comments | |||
on the draft. | ||||
Appendix A. Extension to Pre-Congestion Notification | ||||
This appendix describes how the mechanisms described in the body of | ||||
the document can be extended to support PCN | ||||
[I-D.briscoe-tsvwg-cl-architecture]. Our intent here is to show that | ||||
the mechanisms are readily extended to more complex scenarios than | ||||
ECN, particularly in the case where more codepoints are needed, but | ||||
this appendix may be safely ignored if one is interested only in | ||||
supporting ECN. Note that the PCN standards are still very much | ||||
under development at the time of writing, hence the precise details | ||||
contained in this appendix may be subject to change, and we stress | ||||
that this appendix is for illustrative purposes only. | ||||
The relevant aspects of PCN for the purposes of this discussion are: | ||||
o PCN uses 3 states rather than 2 for ECN - these are referred to as | ||||
admission marked (AM), pre-emption marked (PM) and not marked (NM) | ||||
states. (See Section 8.4 for further discussion of PCN and the | ||||
possibility of using fewer codepoints.) | ||||
o A packet can go from NM to AM, from NM to PM, or from AM to PM, | ||||
but no other transition is possible. | ||||
o The determination of whether a packet is subject to PCN is based | ||||
on the PHB of the packet. | ||||
Thus, to support PCN fully in an MPLS domain for a particular PHB, a | ||||
total of 3 codepoints need to be allocated for that PHB. These 3 | ||||
codepoints represent the admission marked (AM), pre-emption marked | ||||
(PM) and not marked (NM) states. The procedures described in | ||||
Section 4 above need to be slightly modified to support this | ||||
scenario. The following procedures are invoked when the topmost DSCP | ||||
or EXP value indicates a PHB that supports PCN. | ||||
Appendix A.1. Label Push onto IP packet | ||||
If the IP packet header indicates AM, set the EXP value of all | ||||
entries in the label stack to AM. If the IP packet header indicates | ||||
PM, set the EXP value of all entries in the label stack to PM. For | ||||
any other marking of the IP header, set the EXP value of all entries | ||||
in the label stack to NM. | ||||
Appendix A.2. Pushing Additional MPLS Labels | ||||
The procedures of Section 4.2 apply. | ||||
Appendix A.3. Admission Control or Pre-emption Marking inside MPLS | ||||
domain | ||||
The EXP value can be set to AM or PM according to the same procedures | ||||
as described in [I-D.briscoe-tsvwg-cl-phb]. For the purposes of this | ||||
document, it does not matter exactly what algorithms are used to | ||||
decide when to set AM or PM; all that matters is that if a router | ||||
would have marked AM (or PM) in the IP header, it should set the EXP | ||||
value in the MPLS header to the AM (or PM) codepoint. | ||||
Appendix A.4. Popping an MPLS Label (not end of stack) | ||||
When popping an MPLS Label exposes another MPLS label, the AM or PM | ||||
marking should be transferred to the exposed EXP field in the | ||||
following manner: | ||||
o If the inner EXP value is NM, then it should be set to the same | ||||
marking state as the EXP value of the popped label stack entry. | ||||
o If the inner EXP value is AM, it should be unchanged if the popped | ||||
EXP value was AM, and it should be set to PM if the popped EXP | ||||
value was PM. If the popped EXP value was NM, this should be | ||||
logged in some way and the inner EXP value should be unchanged. | ||||
o If the inner EXP value is PM, it should be unchanged whatever the | ||||
popped EXP value was, but any EXP value other than PM should be | ||||
logged. | ||||
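The three bullets above amount to a small merge table. A minimal Python sketch follows; the function name and the string state labels are hypothetical. It returns a flag instead of logging so that the caller decides how anomalies are recorded.

```python
from typing import Tuple

STATES = ("NM", "AM", "PM")


def pop_to_exposed_label(inner: str, popped: str) -> Tuple[str, bool]:
    """Return (new inner EXP state, should_log) after a label pop."""
    if inner not in STATES or popped not in STATES:
        raise ValueError("unknown PCN state")
    if inner == "NM":
        # Inner is unmarked: take whatever the popped entry carried.
        return popped, False
    if inner == "AM":
        if popped == "PM":
            return "PM", False
        # Popped AM: unchanged. Popped NM: unchanged, but log it.
        return "AM", popped == "NM"
    # Inner is PM: always unchanged; log any popped value other than PM.
    return "PM", popped != "PM"
```

The rules in Appendix A.5 have the same shape, with the marking of the exposed IP header taking the place of the inner EXP value.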
Appendix A.5. Popping the last MPLS Label to expose IP header | ||||
When popping the last MPLS Label exposes the IP header, there are two | ||||
cases to consider: | ||||
o the popping LSR is NOT the egress router of the PCN region, in | ||||
which case AM or PM marking should be transferred to the exposed | ||||
IP header field; or | ||||
o the popping LSR IS the egress router of the PCN region. | ||||
In the latter case, the behavior of the egress LSR is defined in | ||||
[I-D.briscoe-tsvwg-cl-architecture] and is beyond the scope of this | ||||
document. In the former case, the marking should be transferred from | ||||
the popped MPLS header to the exposed IP header as follows: | ||||
o If the inner IP header value is neither AM nor PM, and the EXP | ||||
value was NM, then the IP header should be unchanged. For any | ||||
other EXP value, the IP header should be set to the same marking | ||||
state as the EXP value of the popped label stack entry. | ||||
o If the inner IP header value is AM, it should be unchanged if the | ||||
popped EXP value was AM, and it should be set to PM if the popped | ||||
EXP value was PM. If the popped EXP value was NM, this should be | ||||
logged in some way and the inner IP header value should be | ||||
unchanged. | ||||
o If the IP header value is PM, it should be unchanged whatever the | ||||
popped EXP value was, but any EXP value other than PM should be | ||||
logged. | ||||
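For the case where the popping LSR is not the egress router of the PCN region, the transfer rules above might be sketched as follows. This is an illustrative sketch only. The names `Mark` and `pop_to_ip_header` are hypothetical, the marking states are symbolic, an IP header that is neither AM nor PM is represented as `None`, and a returned marking of `None` means "leave the IP header unchanged".

```python
from enum import Enum
from typing import Optional, Tuple


class Mark(Enum):
    """Symbolic marking states; real codepoint values are not assumed."""
    NM = "not marked"
    AM = "admission marked"
    PM = "pre-emption marked"


def pop_to_ip_header(popped: Mark,
                     ip_mark: Optional[Mark]) -> Tuple[Optional[Mark], bool]:
    """Return (marking to write into the IP header, whether to log).

    `popped` is the EXP marking of the last popped label stack entry.
    `ip_mark` is Mark.AM or Mark.PM, or None when the IP header is
    neither. A returned marking of None leaves the IP header as it is."""
    if ip_mark is Mark.PM:
        # IP header PM: never changed; log any popped value other than PM.
        return None, popped is not Mark.PM
    if ip_mark is Mark.AM:
        if popped is Mark.PM:
            return Mark.PM, False            # upgrade AM to PM
        # Popped AM: unchanged. Popped NM: unchanged, but log it.
        return None, popped is Mark.NM
    # IP header is neither AM nor PM.
    if popped is Mark.NM:
        return None, False                   # nothing to transfer
    return popped, False                     # copy AM or PM into the header
```

Returning `None` rather than a marking for the unchanged cases keeps any other value already present in the IP header intact, which is what the first bullet requires when the popped EXP value was NM.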
13. References | 13. References | |||
13.1. Normative References | 13.1. Normative References | |||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", BCP 14, RFC 2119, March 1997. | Requirement Levels", BCP 14, RFC 2119, March 1997. | |||
[RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z., | ||||
and W. Weiss, "An Architecture for Differentiated | ||||
Services", RFC 2475, December 1998. | ||||
[RFC3031] Rosen, E., Viswanathan, A., and R. Callon, "Multiprotocol | [RFC3031] Rosen, E., Viswanathan, A., and R. Callon, "Multiprotocol | |||
Label Switching Architecture", RFC 3031, January 2001. | Label Switching Architecture", RFC 3031, January 2001. | |||
[RFC3032] Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y., | [RFC3032] Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y., | |||
Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack | Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack | |||
Encoding", RFC 3032, January 2001. | Encoding", RFC 3032, January 2001. | |||
[RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition | [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition | |||
of Explicit Congestion Notification (ECN) to IP", | of Explicit Congestion Notification (ECN) to IP", | |||
RFC 3168, September 2001. | RFC 3168, September 2001. | |||
[RFC3260] Grossman, D., "New Terminology and Clarifications for | ||||
Diffserv", RFC 3260, April 2002. | ||||
[RFC3270] Le Faucheur, F., Wu, L., Davie, B., Davari, S., Vaananen, | [RFC3270] Le Faucheur, F., Wu, L., Davie, B., Davari, S., Vaananen, | |||
P., Krishnan, R., Cheval, P., and J. Heinanen, "Multi- | P., Krishnan, R., Cheval, P., and J. Heinanen, "Multi- | |||
Protocol Label Switching (MPLS) Support of Differentiated | Protocol Label Switching (MPLS) Support of Differentiated | |||
Services", RFC 3270, May 2002. | Services", RFC 3270, May 2002. | |||
[RFC4301] Kent, S. and K. Seo, "Security Architecture for the | ||||
Internet Protocol", RFC 4301, December 2005. | ||||
13.2. Informative References | 13.2. Informative References | |||
[Briscoe] "Layered Encapsulation of Congestion Notification", | ||||
June 2007. | ||||
Work in progress. | ||||
[Floyd] "A Proposal to Incorporate ECN in MPLS", 1999. | [Floyd] "A Proposal to Incorporate ECN in MPLS", 1999. | |||
Work in progress. http://www.icir.org/floyd/papers/ | Work in progress. http://www.icir.org/floyd/papers/ | |||
draft-ietf-mpls-ecn-00.txt | draft-ietf-mpls-ecn-00.txt | |||
[I-D.briscoe-tsvwg-cl-architecture] | [I-D.briscoe-tsvwg-cl-architecture] | |||
Briscoe, B., "An edge-to-edge Deployment Model for Pre- | Briscoe, B., "An edge-to-edge Deployment Model for Pre- | |||
Congestion Notification: Admission Control over a | Congestion Notification: Admission Control over a | |||
DiffServ Region", draft-briscoe-tsvwg-cl-architecture-04 | DiffServ Region", draft-briscoe-tsvwg-cl-architecture-04 | |||
(work in progress), October 2006. | (work in progress), October 2006. | |||
[I-D.briscoe-tsvwg-cl-phb] | [I-D.briscoe-tsvwg-cl-phb] | |||
Briscoe, B., "Pre-Congestion Notification marking", | Briscoe, B., "Pre-Congestion Notification marking", | |||
draft-briscoe-tsvwg-cl-phb-03 (work in progress), | draft-briscoe-tsvwg-cl-phb-03 (work in progress), | |||
October 2006. | October 2006. | |||
[I-D.briscoe-tsvwg-re-ecn-border-cheat] | [I-D.charny-pcn-single-marking] | |||
Briscoe, B., "Emulating Border Flow Policing using Re-ECN | Charny, A., "Pre-Congestion Notification Using Single | |||
on Bulk Data", draft-briscoe-tsvwg-re-ecn-border-cheat-01 | Marking for Admission and Pre-emption", | |||
(work in progress), June 2006. | draft-charny-pcn-single-marking-01 (work in progress), | |||
March 2007. | ||||
[I-D.briscoe-tsvwg-re-ecn-tcp] | ||||
Briscoe, B., "Re-ECN: Adding Accountability for Causing | ||||
Congestion to TCP/IP", draft-briscoe-tsvwg-re-ecn-tcp-03 | ||||
(work in progress), October 2006. | ||||
[I-D.chan-tsvwg-diffserv-class-aggr] | ||||
Chan, K., "Aggregation of DiffServ Service Classes", | ||||
draft-chan-tsvwg-diffserv-class-aggr-03 (work in | ||||
progress), January 2006. | ||||
[I-D.ietf-nsis-rmd] | [I-D.ietf-nsis-rmd] | |||
Bader, A., "RMD-QOSM - The Resource Management in Diffserv | Bader, A., "RMD-QOSM - The Resource Management in Diffserv | |||
QOS Model", draft-ietf-nsis-rmd-08 (work in progress), | QOS Model", draft-ietf-nsis-rmd-09 (work in progress), | |||
October 2006. | March 2007. | |||
[I-D.ietf-tsvwg-diffserv-class-aggr] | ||||
Chan, K., "Aggregation of DiffServ Service Classes", | ||||
draft-ietf-tsvwg-diffserv-class-aggr-02 (work in | ||||
progress), March 2007. | ||||
[I-D.lefaucheur-rsvp-ecn] | [I-D.lefaucheur-rsvp-ecn] | |||
Le Faucheur, F., "RSVP Extensions for Admission Control over | Le Faucheur, F., "RSVP Extensions for Admission Control over | |||
Diffserv using Pre-congestion Notification (PCN)", | Diffserv using Pre-congestion Notification (PCN)", | |||
draft-lefaucheur-rsvp-ecn-01 (work in progress), | draft-lefaucheur-rsvp-ecn-01 (work in progress), | |||
June 2006. | June 2006. | |||
[RFC3260] Grossman, D., "New Terminology and Clarifications for | ||||
Diffserv", RFC 3260, April 2002. | ||||
[RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit | [RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit | |||
Congestion Notification (ECN) Signaling with Nonces", | Congestion Notification (ECN) Signaling with Nonces", | |||
RFC 3540, June 2003. | RFC 3540, June 2003. | |||
[RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram | [RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram | |||
Congestion Control Protocol (DCCP)", RFC 4340, March 2006. | Congestion Control Protocol (DCCP)", RFC 4340, March 2006. | |||
[Shayman] "Using ECN to Signal Congestion Within an MPLS Domain", | [Shayman] "Using ECN to Signal Congestion Within an MPLS Domain", | |||
2000. | 2000. | |||
End of changes. 47 change blocks. | ||||
293 lines changed or deleted | 303 lines changed or added | |||