| draft-ietf-tsvwg-ecn-mpls-00.txt | draft-ietf-tsvwg-ecn-mpls-01.txt | |||
|---|---|---|---|---|
| Network Working Group B. Davie | Network Working Group B. Davie | |||
| Internet-Draft Cisco Systems, Inc. | Internet-Draft Cisco Systems, Inc. | |||
| Intended status: Standards Track B. Briscoe | Intended status: Standards Track B. Briscoe | |||
| Expires: August 24, 2007 J. Tay | Expires: December 21, 2007 J. Tay | |||
| BT Research | BT Research | |||
| February 20, 2007 | June 19, 2007 | |||
| Explicit Congestion Marking in MPLS | Explicit Congestion Marking in MPLS | |||
| draft-ietf-tsvwg-ecn-mpls-00.txt | draft-ietf-tsvwg-ecn-mpls-01.txt | |||
| Status of this Memo | Status of this Memo | |||
| By submitting this Internet-Draft, each author represents that any | By submitting this Internet-Draft, each author represents that any | |||
| applicable patent or other IPR claims of which he or she is aware | applicable patent or other IPR claims of which he or she is aware | |||
| have been or will be disclosed, and any of which he or she becomes | have been or will be disclosed, and any of which he or she becomes | |||
| aware will be disclosed, in accordance with Section 6 of BCP 79. | aware will be disclosed, in accordance with Section 6 of BCP 79. | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF), its areas, and its working groups. Note that | Task Force (IETF), its areas, and its working groups. Note that | |||
| skipping to change at page 1, line 36 | skipping to change at page 1, line 36 | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
| http://www.ietf.org/ietf/1id-abstracts.txt. | http://www.ietf.org/ietf/1id-abstracts.txt. | |||
| The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
| http://www.ietf.org/shadow.html. | http://www.ietf.org/shadow.html. | |||
| This Internet-Draft will expire on August 24, 2007. | This Internet-Draft will expire on December 21, 2007. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (C) The IETF Trust (2007). | Copyright (C) The IETF Trust (2007). | |||
| Abstract | Abstract | |||
| RFC 3270 defines how to support the Diffserv architecture in MPLS | RFC 3270 defines how to support the Diffserv architecture in MPLS | |||
| networks, including how to encode Diffserv Code Points (DSCPs) in an | networks, including how to encode Diffserv Code Points (DSCPs) in an | |||
| MPLS header. DSCPs may be encoded in the EXP field, while other uses | MPLS header. DSCPs may be encoded in the EXP field, while other uses | |||
| skipping to change at page 3, line 5 | skipping to change at page 2, line 13 | |||
| in the MPLS header. This draft defines how an operator might define | in the MPLS header. This draft defines how an operator might define | |||
| some of the EXP codepoints for explicit congestion notification, | some of the EXP codepoints for explicit congestion notification, | |||
| without precluding other uses. | without precluding other uses. | |||
| Requirements Language | Requirements Language | |||
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
| "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
| document are to be interpreted as described in RFC 2119 [RFC2119]. | document are to be interpreted as described in RFC 2119 [RFC2119]. | |||
| Table of Contents | Change History | |||
| 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 | [Note to RFC Editor: This section to be removed before publication] | |||
| 1.1. Change History . . . . . . . . . . . . . . . . . . . . . . 4 | ||||
| 1.2. Background . . . . . . . . . . . . . . . . . . . . . . . . 4 | ||||
| 1.3. Intent . . . . . . . . . . . . . . . . . . . . . . . . . . 5 | ||||
| 1.4. Terminology . . . . . . . . . . . . . . . . . . . . . . . 5 | ||||
| 2. Use of MPLS EXP Field for ECN . . . . . . . . . . . . . . . . 6 | ||||
| 3. Per-domain ECT checking . . . . . . . . . . . . . . . . . . . 8 | ||||
| 4. ECN-enabled MPLS domain . . . . . . . . . . . . . . . . . . . 9 | ||||
| 4.1. Pushing (adding) one or more labels to an IP packet . . . 9 | ||||
| 4.2. Pushing one or more labels onto an MPLS labelled packet . 9 | ||||
| 4.3. Congestion experienced in an interior MPLS node . . . . . 9 | ||||
| 4.4. Crossing a Diffserv Domain Boundary . . . . . . . . . . . 10 | ||||
| 4.5. Popping an MPLS label (not the end of the stack) . . . . . 10 | ||||
| 4.6. Popping the last MPLS label in the stack . . . . . . . . . 10 | ||||
| 4.7. Diffserv Tunneling Models . . . . . . . . . . . . . . . . 11 | ||||
| 4.8. Extension to Pre-Congestion Notification . . . . . . . . . 11 | ||||
| 4.8.1. Label Push onto IP packet . . . . . . . . . . . . . . 12 | ||||
| 4.8.2. Pushing Additional MPLS Labels . . . . . . . . . . . . 12 | ||||
| 4.8.3. Admission Control or Pre-emption Marking inside | ||||
| MPLS domain . . . . . . . . . . . . . . . . . . . . . 12 | ||||
| 4.8.4. Popping an MPLS Label (not end of stack) . . . . . . . 12 | ||||
| 4.8.5. Popping the last MPLS Label to expose IP header . . . 12 | ||||
| 5. ECN-disabled MPLS domain . . . . . . . . . . . . . . . . . . . 13 | ||||
| 6. The use of more codepoints with E-LSPs and L-LSPs . . . . . . 13 | ||||
| 7. Relationship to tunnel behavior in RFC 3168 . . . . . . . . . 14 | ||||
| 7.1. Alternative approach to support ECN in an MPLS domain . . 14 | ||||
| 8. Example Uses . . . . . . . . . . . . . . . . . . . . . . . . . 15 | ||||
| 8.1. RFC3168-style ECN . . . . . . . . . . . . . . . . . . . . 15 | ||||
| 8.2. ECN Co-existence with Diffserv E-LSPs . . . . . . . . . . 15 | ||||
| 8.3. Congestion-feedback-based Traffic Engineering . . . . . . 16 | ||||
| 8.4. PCN flow admission control and flow pre-emption . . . . . 16 | ||||
| 9. Deployment Considerations . . . . . . . . . . . . . . . . . . 17 | ||||
| 9.1. Marking non-ECN Capable Packets . . . . . . . . . . . . . 17 | ||||
| 9.2. Non-ECN capable routers in an MPLS Domain . . . . . . . . 18 | ||||
| 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 18 | ||||
| 11. Security Considerations . . . . . . . . . . . . . . . . . . . 18 | ||||
| 12. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 19 | ||||
| 13. References . . . . . . . . . . . . . . . . . . . . . . . . . . 19 | ||||
| 13.1. Normative References . . . . . . . . . . . . . . . . . . . 19 | ||||
| 13.2. Informative References . . . . . . . . . . . . . . . . . . 20 | ||||
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 21 | ||||
| Intellectual Property and Copyright Statements . . . . . . . . . . 22 | ||||
| 1. Introduction | Changes in this version (draft-ietf-tsvwg-ecn-mpls-01.txt) relative | |||
| to the last (draft-ietf-tsvwg-ecn-mpls-00.txt): | ||||
| 1.1. Change History | o Moved the detailed discussion of marking procedures for Pre- | |||
| Congestion Notification (PCN) to an appendix. | ||||
| [Note to RFC Editor: This section to be removed before publication] | o Removed PCN as a motivation for the efficient code-point usage in | |||
| Section 2. | ||||
| This version (draft-ietf-tsvwg-ecn-mpls-00.txt) differs from the last | o Clarified the rationale for preferring the ECT-checking approach | |||
| (draft-davie-mpls-ecn-01.txt) only in title, date, and updated | over the approach of [Floyd] in Section 9.1. | |||
| references. | ||||
| Changes from draft-davie-ecn-mpls-00 to draft-davie-ecn-mpls-01: | o Updated discussion of relationship to RFC3168 in Section 7. | |||
| o Removed discussion of re-ECN from Security Considerations. | ||||
| o Fixed typos and nits. | ||||
| Changes in draft-ietf-tsvwg-ecn-mpls-00.txt relative to | ||||
| draft-davie-ecn-mpls-00: | ||||
| o Corrected the description of ECN-MPLS marking proposed in | o Corrected the description of ECN-MPLS marking proposed in | |||
| [Shayman], which closely corresponds to that proposed in this | [Shayman], which closely corresponds to that proposed in this | |||
| document. | document. | |||
| o Pre-congestion notification (PCN) marking is now described in a | o Pre-congestion notification (PCN) marking is now described in a | |||
| way that does not require normative references to PCN | way that does not require normative references to PCN | |||
| specifications. PCN discussion now serves only to illustrate how | specifications. PCN discussion now serves only to illustrate how | |||
| the ECN marking concepts can be extended to cover more complex | the ECN marking concepts can be extended to cover more complex | |||
| scenarios, with PCN being an example. | scenarios, with PCN being an example. | |||
| o Added specification of behavior when MPLS encapsulated packets | o Added specification of behavior when MPLS encapsulated packets | |||
| cross from an ECN-enabled domain to a domain that is not ECN- | cross from an ECN-enabled domain to a domain that is not ECN- | |||
| enabled. | enabled. | |||
| o Clarified that copying MPLS ECN or PCN marking into exposed IP | o Clarified that copying MPLS ECN or PCN marking into exposed IP | |||
| header on egress is not mandatory. | header on egress is not mandatory. | |||
| o Fixed typos and nits. | o Fixed typos and nits. | |||
| 1.2. Background | Table of Contents | |||
| 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 | ||||
| 1.1. Background . . . . . . . . . . . . . . . . . . . . . . . . 4 | ||||
| 1.2. Intent . . . . . . . . . . . . . . . . . . . . . . . . . . 4 | ||||
| 1.3. Terminology . . . . . . . . . . . . . . . . . . . . . . . 5 | ||||
| 2. Use of MPLS EXP Field for ECN . . . . . . . . . . . . . . . . 6 | ||||
| 3. Per-domain ECT checking . . . . . . . . . . . . . . . . . . . 8 | ||||
| 4. ECN-enabled MPLS domain . . . . . . . . . . . . . . . . . . . 8 | ||||
| 4.1. Pushing (adding) one or more labels to an IP packet . . . 9 | ||||
| 4.2. Pushing one or more labels onto an MPLS labelled packet . 9 | ||||
| 4.3. Congestion experienced in an interior MPLS node . . . . . 9 | ||||
| 4.4. Crossing a Diffserv Domain Boundary . . . . . . . . . . . 9 | ||||
| 4.5. Popping an MPLS label (not the end of the stack) . . . . . 10 | ||||
| 4.6. Popping the last MPLS label in the stack . . . . . . . . . 10 | ||||
| 4.7. Diffserv Tunneling Models . . . . . . . . . . . . . . . . 10 | ||||
| 5. ECN-disabled MPLS domain . . . . . . . . . . . . . . . . . . . 11 | ||||
| 6. The use of more codepoints with E-LSPs and L-LSPs . . . . . . 11 | ||||
| 7. Relationship to tunnel behavior in RFC 3168 . . . . . . . . . 11 | ||||
| 8. Example Uses . . . . . . . . . . . . . . . . . . . . . . . . . 12 | ||||
| 8.1. RFC3168-style ECN . . . . . . . . . . . . . . . . . . . . 12 | ||||
| 8.2. ECN Co-existence with Diffserv E-LSPs . . . . . . . . . . 12 | ||||
| 8.3. Congestion-feedback-based Traffic Engineering . . . . . . 13 | ||||
| 8.4. PCN flow admission control and flow pre-emption . . . . . 13 | ||||
| 9. Deployment Considerations . . . . . . . . . . . . . . . . . . 14 | ||||
| 9.1. Marking non-ECN Capable Packets . . . . . . . . . . . . . 14 | ||||
| 9.2. Non-ECN capable routers in an MPLS Domain . . . . . . . . 15 | ||||
| 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 15 | ||||
| 11. Security Considerations . . . . . . . . . . . . . . . . . . . 15 | ||||
| 12. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 16 | ||||
| Appendix A. Extension to Pre-Congestion Notification . . . . . . 16 | ||||
| Appendix A.1. Label Push onto IP packet . . . . . . . . . . . . . 17 | ||||
| Appendix A.2. Pushing Additional MPLS Labels . . . . . . . . . . . 17 | ||||
| Appendix A.3. Admission Control or Pre-emption Marking inside | ||||
| MPLS domain . . . . . . . . . . . . . . . . . . . . 17 | ||||
| Appendix A.4. Popping an MPLS Label (not end of stack) . . . . . . 17 | ||||
| Appendix A.5. Popping the last MPLS Label to expose IP header . . 18 | ||||
| 13. References . . . . . . . . . . . . . . . . . . . . . . . . . . 18 | ||||
| 13.1. Normative References . . . . . . . . . . . . . . . . . . . 18 | ||||
| 13.2. Informative References . . . . . . . . . . . . . . . . . . 19 | ||||
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 20 | ||||
| Intellectual Property and Copyright Statements . . . . . . . . . . 22 | ||||
| 1. Introduction | ||||
| 1.1. Background | ||||
| [RFC3168] defines Explicit Congestion Notification for IP. The | ||||
| primary purpose of ECN is to allow congestion to be signalled without | ||||
| dropping packets. | ||||
| [RFC3270] defines how to support the Diffserv architecture in MPLS | [RFC3270] defines how to support the Diffserv architecture in MPLS | |||
| networks, including how to encode Diffserv Code Points (DSCPs) in an | networks, including how to encode Diffserv Code Points (DSCPs) in an | |||
| MPLS header. DSCPs may be encoded in the EXP field, while other uses | MPLS header. DSCPs may be encoded in the EXP field, while other uses | |||
| of that field are not precluded. RFC3270 makes no statement about | of that field are not precluded. RFC3270 makes no statement about | |||
| how Explicit Congestion Notification (ECN) marking might be encoded | how Explicit Congestion Notification (ECN) marking might be encoded | |||
| in the MPLS header. This draft defines how an operator might define | in the MPLS header. | |||
| some of the EXP codepoints for explicit congestion notification, | ||||
| without precluding other uses. In parallel to the activity defining | This draft defines how an operator might define some of the EXP | |||
| the addition of ECN to IP [RFC3168], two proposals were made to add | codepoints for explicit congestion notification, without precluding | |||
| ECN to MPLS [Floyd][Shayman]. These proposals, however, fell by the | other uses. In parallel to the activity defining the addition of ECN | |||
| wayside. With ECN for IP now being a proposed standard, and | to IP [RFC3168], two proposals were made to add ECN to MPLS | |||
| developing interest in using pre-congestion notification (PCN) for | [Floyd][Shayman]. These proposals, however, fell by the wayside. | |||
| admission control and flow pre-emption | With ECN for IP now being a proposed standard, and developing | |||
| [I-D.briscoe-tsvwg-cl-architecture], there is consequent interest in | interest in using pre-congestion notification (PCN) for admission | |||
| being able to support ECN across IP networks consisting of MPLS- | control and flow pre-emption [I-D.briscoe-tsvwg-cl-architecture], | |||
| enabled domains. Therefore it is necessary to specify the protocol | there is consequent interest in being able to support ECN across IP | |||
| for including ECN in the MPLS shim header, and the protocol behavior | networks consisting of MPLS-enabled domains. Therefore it is | |||
| of edge MPLS nodes. | necessary to specify the protocol for including ECN in the MPLS shim | |||
| header, and the protocol behavior of edge MPLS nodes. | ||||
| We note that in [RFC3168] there are four codepoints used for ECN | We note that in [RFC3168] there are four codepoints used for ECN | |||
| marking, which are encoded using two bits of the IP header. The MPLS | marking, which are encoded using two bits of the IP header. The MPLS | |||
| EXP field is the logical place to encode ECN codepoints, but with | EXP field is the logical place to encode ECN codepoints, but with | |||
| only 3 bits (8 codepoints) available, and with the same field being | only 3 bits (8 codepoints) available, and with the same field being | |||
| used to convey DSCP information as well, there is a clear incentive | used to convey DSCP information as well, there is a clear incentive | |||
| to conserve the number of codepoints consumed for ECN purposes. | to conserve the number of codepoints consumed for ECN purposes. | |||
| Efficient use of the EXP field has been a focus of prior drafts | Efficient use of the EXP field has been a focus of prior drafts | |||
| [Floyd] [Shayman] and we draw on those efforts in this draft as well. | [Floyd] [Shayman] and we draw on those efforts in this draft as well. | |||
| 1.3. Intent | We also note that [RFC3168] defines default usage of the ECN field | |||
| but allows for the possibility that some Diffserv PHBs might include | ||||
| different specifications on how the ECN field is to be used. This | ||||
| draft seeks to preserve that capability. | ||||
| 1.2. Intent | ||||
| Our intent is to specify how the MPLS shim header [RFC3032] should | Our intent is to specify how the MPLS shim header [RFC3032] should | |||
| denote ECN marking and how MPLS nodes should understand whether the | denote ECN marking and how MPLS nodes should understand whether the | |||
| transport for a packet will be ECN capable. We offer this as a | transport for a packet will be ECN capable. We offer this as a | |||
| building block, from which to build different congestion notification | building block, from which to build different congestion notification | |||
| systems. We do not intend to specify how the resulting congestion | systems. We do not intend to specify how the resulting congestion | |||
| notification is fed back to an upstream node that can mitigate | notification is fed back to an upstream node that can mitigate | |||
| congestion. For instance, unlike [Shayman], we do not specify edge- | congestion. For instance, unlike [Shayman], we do not specify edge- | |||
| to-edge MPLS domain feedback, but we also do not preclude it. | to-edge MPLS domain feedback, but we also do not preclude it. | |||
| Nonetheless, we do specify how the egress node of an MPLS domain | Nonetheless, we do specify how the egress node of an MPLS domain | |||
| should copy congestion notification from the MPLS shim into the | should copy congestion notification from the MPLS shim into the | |||
| underlying IP header if the ECN is to be carried onward towards the | encapsulated IP header if the ECN is to be carried onward towards the | |||
| IP receiver. But we do NOT mandate that MPLS congestion notification | IP receiver. But we do NOT mandate that MPLS congestion notification | |||
| must be copied into the IP header for onward transmission. This | must be copied into the IP header for onward transmission. This | |||
| draft aims to be generic for any use of congestion notification in | draft aims to be generic for any use of congestion notification in | |||
| MPLS. PCN or traffic engineering are merely two of many motivating | MPLS. Support of [RFC3168] is our primary motivation; some | |||
| applications (see Section 8). | additional potential applications to illustrate the flexibility of | |||
| our approach are described in Section 8. In particular, we aim to | ||||
| support possible future schemes that may use more than one level of | ||||
| congestion marking. | ||||
| 1.4. Terminology | 1.3. Terminology | |||
| This document draws freely on the terminology of ECN [RFC3168] and | This document draws freely on the terminology of ECN [RFC3168] and | |||
| MPLS [RFC3031]. For ease of reference, we have included some | MPLS [RFC3031]. For ease of reference, we have included some | |||
| definitions here, but refer the reader to the references above for | definitions here, but refer the reader to the references above for | |||
| complete specifications of the relevant technologies: | complete specifications of the relevant technologies: | |||
| o CE: Congestion Experienced. One of the states with which a packet | o CE: Congestion Experienced. One of the states with which a packet | |||
| may be marked in a network supporting ECN. A packet is marked in | may be marked in a network supporting ECN. A packet is marked in | |||
| this state by an ECN-capable router, to indicate that this router | this state by an ECN-capable router, to indicate that this router | |||
| was experiencing congestion at the time the packet arrived. | was experiencing congestion at the time the packet arrived. | |||
| skipping to change at page 6, line 13 | skipping to change at page 5, line 45 | |||
| the transport protocol are ECN-capable. A router may not mark a | the transport protocol are ECN-capable. A router may not mark a | |||
| packet as CE unless the packet was marked ECT when it arrived. | packet as CE unless the packet was marked ECT when it arrived. | |||
| o Not-ECT: Not ECN capable transport. An end system marks a packet | o Not-ECT: Not ECN capable transport. An end system marks a packet | |||
| with this codepoint to indicate that the end-points of the | with this codepoint to indicate that the end-points of the | |||
| transport protocol are not ECN-capable. A congested router cannot | transport protocol are not ECN-capable. A congested router cannot | |||
| mark such packets as CE, and thus can only drop them to indicate | mark such packets as CE, and thus can only drop them to indicate | |||
| congestion. | congestion. | |||
| o EXP field. A 3-bit field in the MPLS label header [RFC3032] which | o EXP field. A 3-bit field in the MPLS label header [RFC3032] which | |||
| may be used to convey Diffserv information (and used in this draft | may be used to convey Diffserv information (and is also used in | |||
| to carry ECN information). | this draft to carry ECN information). | |||
| o PHP. Penultimate Hop Popping. An MPLS operation in which the | o PHP. Penultimate Hop Popping. An MPLS operation in which the | |||
| penultimate Label Switching Router (LSR) on a Label Switched Path | penultimate Label Switching Router (LSR) on a Label Switched Path | |||
| (LSP) removes the top label from the packet before forwarding the | (LSP) removes the top label from the packet before forwarding the | |||
| packet to the final LSR on the LSP. | packet to the final LSR on the LSP. | |||
| 2. Use of MPLS EXP Field for ECN | 2. Use of MPLS EXP Field for ECN | |||
| We propose that LSRs configured for explicit congestion notification | We propose that LSRs configured for explicit congestion notification | |||
| should use the EXP field in the MPLS shim header. However, RFC 3270 | should use the EXP field in the MPLS shim header. However, [RFC3270] | |||
| already defines use of codepoints in the EXP field for differentiated | already defines use of codepoints in the EXP field for differentiated | |||
| services. Although it does not preclude other compatible uses of the | services. Although it does not preclude other compatible uses of the | |||
| EXP field, this clearly seems to limit the space available for ECN, | EXP field, this clearly seems to limit the space available for ECN, | |||
| given the field is only 3 bits (8 codepoints). | given the field is only 3 bits (8 codepoints). | |||
| RFC 3270 defines two possible approaches for requesting | [RFC3270] defines two possible approaches for requesting | |||
| differentiated service treatment from an LSR. | differentiated service treatment from an LSR. | |||
| o In the E-LSP approach, different codepoints of the EXP field in | o In the E-LSP approach, different codepoints of the EXP field in | |||
| the MPLS shim header are used to indicate the packet's per hop | the MPLS shim header are used to indicate the packet's per hop | |||
| behavior (PHB). | behavior (PHB). | |||
| o In the L-LSP approach, an MPLS label is assigned for each PHB | o In the L-LSP approach, an MPLS label is assigned for each PHB | |||
| scheduling class (PSC, as defined in [RFC3260]), so that an LSR | scheduling class (PSC, as defined in [RFC3260]), so that an LSR | |||
| determines both its forwarding and its scheduling behavior from | determines both its forwarding and its scheduling behavior from | |||
| the label. | the label. | |||
| If an MPLS domain uses the L-LSP approach, there is likely to be | If an MPLS domain uses the L-LSP approach, there is likely to be | |||
| space in the EXP field for ECN codepoint(s). Where the E-LSP | space in the EXP field for ECN codepoint(s). Where the E-LSP | |||
| approach is used, then codepoint space in the EXP field is likely to | approach is used, then codepoint space in the EXP field is likely to | |||
| be scarce. This draft focuses on interworking ECN marking with the | be scarce. This draft focuses on interworking ECN marking with the | |||
| E-LSP approach as it is the tougher problem. Consequently the same | E-LSP approach as it is the tougher problem. Consequently the same | |||
| approach can also be applied with L-LSPs. | approach can also be applied with L-LSPs. | |||
| We recommend that explicit congestion notification in MPLS should use | We recommend that explicit congestion notification in MPLS should use | |||
| codepoints instead of bits in the EXP field. Since not every PHB | codepoints instead of bits in the EXP field. Since not every PHB | |||
| will need an associated ECN codepoint and in some applications a | will necessarily require an associated ECN codepoint, it would be | |||
| given PHB might need two ECN codepoints (see, for | wasteful to assign a dedicated bit for ECN. (There may also be cases | |||
| example, [I-D.briscoe-tsvwg-cl-architecture]), it would be wasteful to | where a given PHB might need more than one ECN-like codepoint; see | |||
| assign a dedicated bit for ECN. | Section 8.4 for an example.) | |||
| For each PHB that uses ECN marking, we assume one EXP codepoint will | For each PHB that uses ECN marking, we assume one EXP codepoint will | |||
| be defined meaning not congestion marked (Not-CM), and at least one | be defined meaning not congestion marked (Not-CM), and at least one | |||
| other codepoint will be defined meaning congestion marked (CM). | other codepoint will be defined meaning congestion marked (CM). | |||
| Therefore, each PHB that uses ECN marking will consume at least two | Therefore, each PHB that uses ECN marking will consume at least two | |||
| EXP codepoints. But PHBs that do not use ECN marking will only | EXP codepoints. But PHBs that do not use ECN marking will only | |||
| consume one. | consume one. | |||
| Further, we wish to use minimal space in the MPLS shim header to tell | Further, we wish to use minimal space in the MPLS shim header to tell | |||
| interior LSRs whether each packet will be received by an ECN-capable | interior LSRs whether each packet will be received by an ECN-capable | |||
| skipping to change at page 9, line 12 | skipping to change at page 8, line 40 | |||
| In the per-domain ECT checking approach, only the egress nodes check | In the per-domain ECT checking approach, only the egress nodes check | |||
| whether an IP packet is destined for an ECN-capable transport. | whether an IP packet is destined for an ECN-capable transport. | |||
| Therefore, any single LSR within an MPLS domain MUST NOT be | Therefore, any single LSR within an MPLS domain MUST NOT be | |||
| configured to enable ECN marking unless all the egress LSRs | configured to enable ECN marking unless all the egress LSRs | |||
| surrounding it are already configured to handle ECN marking. | surrounding it are already configured to handle ECN marking. | |||
| We call a domain surrounded by ECN-capable egress LSRs an ECN-enabled | We call a domain surrounded by ECN-capable egress LSRs an ECN-enabled | |||
| MPLS domain. This term only implies that all the egress LSRs are | MPLS domain. This term only implies that all the egress LSRs are | |||
| ECN-enabled; some interior LSRs may not be ECN-enabled. For | ECN-enabled; some interior LSRs may not be ECN-enabled. For | |||
| instance, it would be possible to use legacy LSRs incapable of | instance, it would be possible to use some legacy LSRs incapable of | |||
| supporting ECN in the interior of an MPLS domain as long as all the | supporting ECN in the interior of an MPLS domain as long as all the | |||
| egress LSRs were ECN-capable. Note that if PHP is used, the | egress LSRs were ECN-capable. Note that if PHP is used, the | |||
| "penultimate hop" routers which perform the pop operation do need to | "penultimate hop" routers which perform the pop operation do need to | |||
| be ECN-enabled, since they are acting in this context as egress LSRs. | be ECN-enabled, since they are acting in this context as egress LSRs. | |||
| 4. ECN-enabled MPLS domain | 4. ECN-enabled MPLS domain | |||
| In the following subsections we describe various operations affecting | In the following subsections we describe various operations affecting | |||
| the ECN marking of a packet that may be performed at MPLS edge and | the ECN marking of a packet that may be performed at MPLS edge and | |||
| core LSRs. | core LSRs. | |||
| skipping to change at page 11, line 4 | skipping to change at page 10, line 32 | |||
| means that if the EXP value of the MPLS header was CM, the packet | means that if the EXP value of the MPLS header was CM, the packet | |||
| MUST be dropped. | MUST be dropped. | |||
| Assuming an IP packet was exposed, we have to examine whether that | Assuming an IP packet was exposed, we have to examine whether that | |||
| packet is ECT or not. A Not-ECT packet MUST be dropped if the EXP | packet is ECT or not. A Not-ECT packet MUST be dropped if the EXP | |||
| field is CM. | field is CM. | |||
| For the remainder of this section, we describe the behavior that is | For the remainder of this section, we describe the behavior that is | |||
| required if the ECN information is to be transferred from the MPLS | required if the ECN information is to be transferred from the MPLS | |||
| header into the exposed IP header for onward transmission. As noted | header into the exposed IP header for onward transmission. As noted | |||
| in Section 1.3, such behavior is not mandated by this document, but | in Section 1.2, such behavior is not mandated by this document, but | |||
| may be selected by an operator. | may be selected by an operator. | |||
| If the inner IP packet is Not-ECT, its ECN field remains unchanged if | If the inner IP packet is Not-ECT, its ECN field remains unchanged if | |||
| the EXP field is Not-CM. If the ECN field of the inner packet is set | the EXP field is Not-CM. If the ECN field of the inner packet is set | |||
| to ECT(0), ECT(1) or CE, the ECN field remains unchanged if the EXP | to ECT(0), ECT(1) or CE, the ECN field remains unchanged if the EXP | |||
| field is set to Not-CM. The ECN field is set to CE if the EXP field | field is set to Not-CM. The ECN field is set to CE if the EXP field | |||
| is CM. Note that an inner value of CE and an outer value of Not-CM | is CM. Note that an inner value of CE and an outer value of Not-CM | |||
| should be considered anomalous, and SHOULD be logged in some way by | should be considered anomalous, and SHOULD be logged in some way by | |||
| the LSR. | the LSR. | |||
| skipping to change at page 11, line 31 | skipping to change at page 11, line 10 | |||
| particular LSP is carried to the last hop of the LSP and beyond the | particular LSP is carried to the last hop of the LSP and beyond the | |||
| last hop. Depending on which mode is preferred by an operator, the | last hop. Depending on which mode is preferred by an operator, the | |||
| EXP value or DSCP value of an exposed header following a label pop | EXP value or DSCP value of an exposed header following a label pop | |||
| may or may not be dependent on the EXP value of the label that is | may or may not be dependent on the EXP value of the label that is | |||
| removed by the pop operation. We believe that in the case of ECN | removed by the pop operation. We believe that in the case of ECN | |||
| marking, the use of these models should only apply to the encoding of | marking, the use of these models should only apply to the encoding of | |||
| the Diffserv PHB in the EXP value, and that the choice of codepoint | the Diffserv PHB in the EXP value, and that the choice of codepoint | |||
| for ECN should always be made based on the procedures described | for ECN should always be made based on the procedures described | |||
| above, independent of the tunneling model. | above, independent of the tunneling model. | |||
| 4.8. Extension to Pre-Congestion Notification | ||||
| This section describes how the preceding mechanisms can be extended | ||||
| to support PCN [I-D.briscoe-tsvwg-cl-architecture]. Our intent here | ||||
| is to show that the mechanisms are readily extended to more complex | ||||
| scenarios than ECN, but this section may be safely ignored if one is | ||||
| interested only in supporting ECN. | ||||
| The relevant aspects of PCN for the purposes of this discussion are: | ||||
| o PCN uses 3 states rather than 2 for ECN - these are referred to as | ||||
| admission marked (AM), pre-emption marked (PM) and not marked (NM) | ||||
| states. (See Section 8.4 for further discussion of PCN and the | ||||
| possibility of using fewer codepoints.) | ||||
| o A packet can go from NM to AM, from NM to PM, or from AM to PM, | ||||
| but no other transition is possible. | ||||
| o Whereas ECN-capable packets are identified by the ECT value in the | ||||
| IP header, PCN-capability is determined by the PHB of the packet. | ||||
| Thus, to support PCN fully in an MPLS domain for a particular PHB, a | ||||
| total of 3 codepoints need to be allocated for that PHB. These 3 | ||||
| codepoints represent the admission marked (AM), pre-emption marked | ||||
| (PM) and not marked (NM) states. The procedures described above need | ||||
| to be slightly modified to support this scenario. The following | ||||
| procedures are invoked when the topmost DSCP or EXP value indicates a | ||||
| PHB that supports PCN. | ||||
| 4.8.1. Label Push onto IP packet | ||||
| If the IP packet header indicates AM, set the EXP value of all | ||||
| entries in the label stack to AM. If the IP packet header indicates | ||||
| PM, set the EXP value of all entries in the label stack to PM. For | ||||
| any other marking of the IP header, set the EXP value of all entries | ||||
| in the label stack to NM. | ||||
| 4.8.2. Pushing Additional MPLS Labels | ||||
| The procedures of Section 4.2 apply. | ||||
| 4.8.3. Admission Control or Pre-emption Marking inside MPLS domain | ||||
| The EXP value can be set to AM or PM according to the same procedures | ||||
| as described in [I-D.briscoe-tsvwg-cl-phb]. For the purposes of this | ||||
| document, it does not matter exactly what algorithms are used to | ||||
| decide when to set AM or PM; all that matters is that if a router | ||||
| would have marked AM (or PM) in the IP header, it should set the EXP | ||||
| value in the MPLS header to the AM (or PM) codepoint. | ||||
| 4.8.4. Popping an MPLS Label (not end of stack) | ||||
| When popping an MPLS Label exposes another MPLS label, the AM or PM | ||||
| marking should be transferred to the exposed EXP field in the | ||||
| following manner: if the inner EXP value is NM, then it should be set | ||||
| to the same marking state as the EXP value of the popped label stack | ||||
| entry. If the inner EXP value is AM, it should be unchanged if the | ||||
| popped EXP value was AM, and it should be set to PM if the popped EXP | ||||
| value was PM. If the popped EXP value was NM, this should be logged | ||||
| in some way and the inner EXP value should be unchanged. If the | ||||
| inner EXP value is PM, it should be unchanged whatever the popped EXP | ||||
| value was, but any EXP value other than PM should be logged. | ||||
| 4.8.5. Popping the last MPLS Label to expose IP header | ||||
| When popping the last MPLS Label exposes the IP header, there are two | ||||
| cases to consider: | ||||
| o the popping LSR is NOT the egress router of the PCN region, in | ||||
| which case AM or PM marking should be transferred to the exposed | ||||
| IP header field; or | ||||
| o the popping LSR IS the egress router of the PCN region. | ||||
| In the latter case, the behavior of the egress LSR is defined in | ||||
| [I-D.briscoe-tsvwg-cl-architecture] and is beyond the scope of this | ||||
| document. In the former case, the marking should be transferred from | ||||
| the popped MPLS header to the exposed IP header as follows: if the | ||||
| inner IP header value is neither AM nor PM, and the EXP value was NM, | ||||
| then the IP header should be unchanged. For any other EXP value, the | ||||
| IP header should be set to the same marking state as the EXP value of | ||||
| the popped label stack entry. If the inner IP header value is AM, it | ||||
| should be unchanged if the popped EXP value was AM, and it should be | ||||
| set to PM if the popped EXP value was PM. If the popped EXP value | ||||
| was NM, this should be logged in some way and the inner IP header | ||||
| value should be unchanged. If the IP header value is PM, it should | ||||
| be unchanged whatever the popped EXP value was, but any EXP value | ||||
| other than PM should be logged. | ||||
| 5. ECN-disabled MPLS domain | 5. ECN-disabled MPLS domain | |||
| If ECN is not enabled on all the egress LSRs of a domain, ECN MUST | If ECN is not enabled on all the egress LSRs of a domain, ECN MUST | |||
| NOT be enabled on any LSRs throughout the domain. If congestion is | NOT be enabled on any LSRs throughout the domain. If congestion is | |||
| experienced on any LSR in an ECN-disabled MPLS domain, packets MUST | experienced on any LSR in an ECN-disabled MPLS domain, packets MUST | |||
| be dropped, NOT marked. The exact algorithm for deciding when to | be dropped, NOT marked. The exact algorithm for deciding when to | |||
| drop packets during congestion (e.g. tail-drop, RED, etc.) is a local | drop packets during congestion (e.g. tail-drop, RED, etc.) is a local | |||
| matter for the operator of the domain. | matter for the operator of the domain. | |||
| 6. The use of more codepoints with E-LSPs and L-LSPs | 6. The use of more codepoints with E-LSPs and L-LSPs | |||
| RFC 3270 gives different options with E-LSPs and L-LSPs and some of | [RFC3270] gives different options with E-LSPs and L-LSPs and some of | |||
| those could potentially provide ample EXP codepoints for ECN/PCN. | those could potentially provide ample EXP codepoints for ECN. | |||
| However, deploying L-LSPs vs E-LSPs has many implications such as | However, deploying L-LSPs vs E-LSPs has many implications such as | |||
| platform support and operational complexity. The above ECN/PCN MPLS | platform support and operational complexity. The above ECN MPLS | |||
| solution should provide some flexibility. If the operator has | solution should provide some flexibility. If the operator has | |||
| deployed one L-LSP per PHB scheduling class, then EXP space will be a | deployed one L-LSP per PHB scheduling class, then EXP space will be a | |||
| non-issue and it could be used to achieve more sophisticated ECN/PCN | non-issue and it could be used to achieve more sophisticated ECN | |||
| behavior if required. If the operator wants to stick to E-LSPs and | behavior if required. If the operator wants to stick to E-LSPs and | |||
| uses a handful of EXP codepoints for Diffserv, it may be desirable to | uses a handful of EXP codepoints for Diffserv, it may be desirable to | |||
| operate with a minimum number of extra ECN/PCN codepoints, even if | operate with a minimum number of extra ECN codepoints, even if this | |||
| this comes with some compromise on ECN/PCN optimality. See Section 8 | comes with some compromise on ECN optimality. See Section 8 for | |||
| for discussion of some possible deployment scenarios. | discussion of some possible deployment scenarios. | |||
| 7. Relationship to tunnel behavior in RFC 3168 | 7. Relationship to tunnel behavior in RFC 3168 | |||
| [RFC3168] defines two modes of encapsulating ECN-marked IP packets | [RFC3168] defines two modes of encapsulating ECN-marked IP packets | |||
| inside additional IP headers when tunnels are used. The two modes | inside additional IP headers when tunnels are used. The two modes | |||
| are the "full functionality" and "limited functionality" modes. In | are the "full functionality" and "limited functionality" modes. In | |||
| the full functionality mode, the ECT information from the inner | the full functionality mode, the ECT information from the inner | |||
| header is copied to the outer header at the tunnel ingress, but the | header is copied to the outer header at the tunnel ingress, but the | |||
| CE information is not. In the limited functionality mode, neither | CE information is not. In the limited functionality mode, neither | |||
| ECT nor CE information is copied to the outer header, and thus ECN | ECT nor CE information is copied to the outer header, and thus ECN | |||
| cannot be applied to the encapsulated packet. | cannot be applied to the encapsulated packet. | |||
| The behavior that is specified in Section 4 of this document | The behavior that is specified in Section 4 of this document | |||
| resembles the "full functionality" mode in the sense that it conveys | resembles the "full functionality" mode in the sense that it conveys | |||
| some information from inner to outer header, and in the sense that it | some information from inner to outer header, and in the sense that it | |||
| enables full ECN support along the MPLS LSP (which is analogous to an | enables full ECN support along the MPLS LSP (which is analogous to an | |||
| IP tunnel in this context). However it differs in one respect, which | IP tunnel in this context). However it differs in one respect, which | |||
| is that the CE information is conveyed from the inner header to the | is that the CE information is conveyed from the inner header to the | |||
| outer header. Our reason for this different design choice is to give | outer header. Our original reason for this different design choice | |||
| interior routers and LSRs more information about upstream marking in | was to give interior routers and LSRs more information about upstream | |||
| multi-bottleneck cases. For instance, the flow pre-emption marking | marking in multi-bottleneck cases. For instance, the flow pre- | |||
| mechanism proposed for PCN works by only considering packets for | emption marking mechanism proposed for PCN works by only considering | |||
| marking that have not already been marked upstream. Unless existing | packets for marking that have not already been marked upstream. | |||
| pre-emption marking is copied from the inner to the outer header at | Unless existing pre-emption marking is copied from the inner to the | |||
| tunnel ingress, the mechanism doesn't pre-empt enough traffic in | outer header at tunnel ingress, the mechanism doesn't pre-empt enough | |||
| cases where anomalous events hit multiple MPLS domains at once. | traffic in cases where anomalous events hit multiple domains at once. | |||
| [RFC3168] does not give any reasons against conveying CE information | [RFC3168] does not give any reasons against conveying CE information | |||
| from the inner header to the outer in the "full functionality" mode. | from the inner header to the outer in the "full functionality" mode. | |||
| So, rather than define different encapsulation methods for ECN and | Furthermore, [RFC4301] specifies that the ECN marking should be | |||
| PCN, Section 4 defines a common approach for both. | copied from inner header to outer header in IPsec tunnels, consistent | |||
| with the approach defined here. [Briscoe] discusses this issue in | ||||
| 7.1. Alternative approach to support ECN in an MPLS domain | more detail. In summary, the approach described in Section 4 appears | |||
| to be both a sound technical choice and consistent with the current | ||||
| It is possible to define an approach for MPLS support of ECN that | state of thinking in the IETF. | |||
| more closely resembles that of the full functionality mode of | ||||
| [RFC3168]. This approach would differ from that described in | ||||
| Section 4 in the following ways: | ||||
| o when pushing one or more MPLS labels onto an IP packet, the not-CM | ||||
| state is set in the EXP field of all label stack entries | ||||
| o when pushing one or more MPLS labels onto an MPLS packet, the | ||||
| not-CM state is set in the EXP field of all newly added label | ||||
| stack entries | ||||
| o when popping an MPLS label and the exposed header is MPLS (i.e. | ||||
| this is not the end of stack), the EXP field of the MPLS packet | ||||
| should be set to CM if the popped label's EXP value was CM and | ||||
| left unchanged otherwise | ||||
| o when popping an MPLS label and the exposed header is IP, the IP | ||||
| ECN field should be set to CE if the EXP value was CM and if the | ||||
| IP header indicated that the packet was ECN capable. If the IP | ||||
| header indicated not-ECT and the EXP value was CM, the packet MUST | ||||
| be dropped. If the EXP value was not-CM, the ECN field in the IP | ||||
| header is unchanged. | ||||
| The advantages of this scheme over that described in Section 4 are | ||||
| greater similarity to [RFC3168], and the ability to determine, at the | ||||
| end of an LSP, that congestion either did or did not occur along that | ||||
| LSP (since the initial state is always not-CM at the start of an | ||||
| LSP). | ||||
| A disadvantage of this approach is that exceptions to this rule are | ||||
| necessary in cases where the marking process on LSRs needs to depend | ||||
| on whether a packet has already suffered upstream marking. The | ||||
| currently proposed pre-emption marking in PCN is an example where | ||||
| such an exception would be necessary (see the discussion at the start | ||||
| of Section 7). | ||||
| 8. Example Uses | 8. Example Uses | |||
| 8.1. RFC3168-style ECN | 8.1. RFC3168-style ECN | |||
| [RFC3168] proposes the use of ECN in TCP and introduces the use of | [RFC3168] proposes the use of ECN in TCP and introduces the use of | |||
| ECN-Echo and CWR flags in the TCP header for initialization. The TCP | ECN-Echo and CWR flags in the TCP header for initialization. The TCP | |||
| sender responds accordingly (such as not increasing the congestion | sender responds accordingly (such as not increasing the congestion | |||
| window) when it receives an ECN-Echo (ECE) ACK packet (that is, an | window) when it receives an ECN-Echo (ECE) ACK packet (that is, an | |||
| ACK packet with ECN-Echo flag set in the TCP header), then the sender | ACK packet with ECN-Echo flag set in the TCP header), then the sender | |||
| skipping to change at page 16, line 12 | skipping to change at page 13, line 11 | |||
| accomplished by simply allocating a second codepoint to the PHB for | accomplished by simply allocating a second codepoint to the PHB for | |||
| the "CM" state of that PHB and retaining the old codepoint for the | the "CM" state of that PHB and retaining the old codepoint for the | |||
| "not-CM" state. An operator with only four deployed PHBs could of | "not-CM" state. An operator with only four deployed PHBs could of | |||
| course enable ECN marking on all those PHBs. It is easy to imagine | course enable ECN marking on all those PHBs. It is easy to imagine | |||
| cases where some PHBs might benefit more from ECN than others - for | cases where some PHBs might benefit more from ECN than others - for | |||
| example, an operator might use ECN on a premium data service but not | example, an operator might use ECN on a premium data service but not | |||
| on a PHB used for best effort internet traffic. | on a PHB used for best effort internet traffic. | |||
| As an illustrative example of how the EXP field might be used in this | As an illustrative example of how the EXP field might be used in this | |||
| case, consider the example of an operator who is using the aggregated | case, consider the example of an operator who is using the aggregated | |||
| service classes described in [I-D.chan-tsvwg-diffserv-class-aggr]. | service classes proposed in [I-D.ietf-tsvwg-diffserv-class-aggr]. He | |||
| He may choose to support ECN only for the Assured Elastic Treatment | may choose to support ECN only for the Assured Elastic Treatment | |||
| Aggregate, using the EXP codepoint 010 for the not-CM state and 011 | Aggregate, using the EXP codepoint 010 for the not-CM state and 011 | |||
| for the CM state. All other codepoints could be the same as in | for the CM state. All other codepoints could be the same as in | |||
| [I-D.chan-tsvwg-diffserv-class-aggr]. Of course any other | [I-D.ietf-tsvwg-diffserv-class-aggr]. Of course any other | |||
| combination of EXP values can be used according to the specific set | combination of EXP values can be used according to the specific set | |||
| of PHBs and marking conventions used within that operator's network. | of PHBs and marking conventions used within that operator's network. | |||
| 8.3. Congestion-feedback-based Traffic Engineering | 8.3. Congestion-feedback-based Traffic Engineering | |||
| Shayman's traffic engineering [Shayman] proposed the use of ECN by an | Shayman's traffic engineering [Shayman] proposed the use of ECN by an | |||
| egress LSR feeding back congestion to an ingress LSR to mitigate | egress LSR feeding back congestion to an ingress LSR to mitigate | |||
| congestion by employing dynamic traffic engineering techniques such | congestion by employing dynamic traffic engineering techniques such | |||
| as shifting flows to an alternate path. It proposed a new RSVP | as shifting flows to an alternate path. It proposed a new RSVP | |||
| TUNNEL CONGESTION message which was sent to the ingress LSR and | TUNNEL CONGESTION message which was sent to the ingress LSR and | |||
| skipping to change at page 16, line 48 | skipping to change at page 13, line 47 | |||
| As an example, a minor extension to RSVP signalling has been proposed | As an example, a minor extension to RSVP signalling has been proposed | |||
| [I-D.lefaucheur-rsvp-ecn] to carry this message, but a similar | [I-D.lefaucheur-rsvp-ecn] to carry this message, but a similar | |||
| approach has also been proposed that uses NSIS signalling | approach has also been proposed that uses NSIS signalling | |||
| [I-D.ietf-nsis-rmd]. | [I-D.ietf-nsis-rmd]. | |||
| If it is possible for LSRs to signify congestion in MPLS, PCN marking | If it is possible for LSRs to signify congestion in MPLS, PCN marking | |||
| could be used for admission control and flow pre-emption across a | could be used for admission control and flow pre-emption across a | |||
| Diffserv region, irrespective of whether it contained pure IP | Diffserv region, irrespective of whether it contained pure IP | |||
| routers, MPLS LSRs, or both. Indeed, the solution could be somewhat | routers, MPLS LSRs, or both. Indeed, the solution could be somewhat | |||
| more efficient to implement if aggregates could identify themselves | more efficient to implement if aggregates could identify themselves | |||
| by their MPLS label. Section 4.8 describes the mechanisms by which | by their MPLS label. Appendix A describes the mechanisms by which | |||
| the necessary markings for PCN could be carried in the MPLS header. | the necessary markings for PCN could be carried in the MPLS header. | |||
| As an illustrative example of how the EXP field might be used in this | As an illustrative example of how the EXP field might be used in this | |||
| case, consider the example of an operator who is using the aggregated | case, consider the example of an operator who is using the aggregated | |||
| service classes described in [I-D.chan-tsvwg-diffserv-class-aggr]. | service classes proposed in [I-D.ietf-tsvwg-diffserv-class-aggr]. He | |||
| He may choose to support PCN only for the Real Time Treatment | may choose to support PCN only for the Real Time Treatment Aggregate, | |||
| Aggregate, using the EXP codepoint 100 for the not-marked (NM) state, | using the EXP codepoint 100 for the not-marked (NM) state, 101 for | |||
| 101 for the Admission Marked (AM) state, and 111 for the Pre-emption | the Admission Marked (AM) state, and 111 for the Pre-emption Marked | |||
| Marked (PM) state. All other codepoints could be the same as in | (PM) state. All other codepoints could be the same as in | |||
| [I-D.chan-tsvwg-diffserv-class-aggr]. Of course any other | [I-D.ietf-tsvwg-diffserv-class-aggr]. Of course any other | |||
| combination of EXP values can be used according to the specific set | combination of EXP values can be used according to the specific set | |||
| of PHBs and marking conventions used within that operator's network. | of PHBs and marking conventions used within that operator's network. | |||
| It might also be possible to deploy a similar solution using PCN | It might also be possible to deploy a similar solution using PCN | |||
| marking over MPLS for just admission control alone, or just flow pre- | marking over MPLS for just admission control alone, or just flow pre- | |||
| emption alone, particularly if codepoint space was at a premium in | emption alone, particularly if codepoint space was at a premium in | |||
| the MPLS EXP field. However, the feasibility of deploying one | the MPLS EXP field. However, the feasibility of deploying one | |||
| without the other would require further study. | without the other would require further study. We also note that an | |||
| approach to deploying PCN using only a single marking codepoint to | ||||
| support both pre-emption and admission control has been | ||||
| proposed [I-D.charny-pcn-single-marking]. | ||||
| 9. Deployment Considerations | 9. Deployment Considerations | |||
| 9.1. Marking non-ECN Capable Packets | 9.1. Marking non-ECN Capable Packets | |||
| What is the consequences of marking a packet that is not ECN-capable? | What are the consequences of marking a packet that is not ECN- | |||
| Even if it will be dropped before leaving the domain, doesn't this | capable? Even if it will be dropped before leaving the domain, | |||
| consume resources unnecessarily? | doesn't this consume resources unnecessarily? | |||
| The problem only arises if there is congestion downstream of an | The problem only arises if there is congestion downstream of an | |||
| earlier congested node. It might be that marked packets are carried | earlier congested queue in the same MPLS domain. Downstream | |||
| through this second congested router when, within the underlying IP | congested LSRs might forward packets already marked, even though they | |||
| header they are not ECN capable, so they will be dropped later. Such | will be dropped later when the inner IP header is found to be Not-ECT | |||
| packets might cause other packets to be marked (or dropped) that | on decapsulation. Such packets might cause the downstream LSRs to | |||
| would not otherwise have been. | mark (or drop) other packets that they would otherwise not have had | |||
| to. | ||||
| We decided to use the per-domain ECT checking approach because it | We expect congestion will typically be rare in MPLS networks, but it | |||
| would become optimal as ECN deployment became prevalent. The | might not be. The extra unnecessary load at downstream LSRs will not | |||
| situation where traffic is carried beyond a congested LSR only to be | be more than the fraction of marked packets from upstream LSRs, even | |||
| dropped later should become less prevalent as more transports use | in the worst case where no transports are ECN capable. Therefore the | |||
| ECN. This is why we chose not to use the [Floyd] alternative which | amount of unnecessary marking (or drop) on an LSR will not be more | |||
| introduced a low but persistent level of unnecessary packet drop for | than the product of its local marking rate and the marking rate due | |||
| all time. Although that scheme did not carry droppable traffic to | to upstream LSRs within the same domain - typically the product of | |||
| the edge of the MPLS domain, we felt this was a small price to pay, | two small (often zero) probabilities. | |||
| and it was anyway only of concern until ECN had become more widely | ||||
| deployed. | This is why we decided to use the per-domain ECT checking approach - | |||
| because the most likely effect would be a very slightly increased | ||||
| marking rate, which would result in very slightly higher drop only | ||||
| for non-ECN-capable transports. We chose not to use the [Floyd] | ||||
| alternative which introduced a low but persistent level of | ||||
| unnecessary packet drop for all time, even for ECN-capable | ||||
| transports. Although that scheme did not carry traffic to the edge | ||||
| of the MPLS domain only to be dropped on decapsulation, we felt our | ||||
| minor inefficiency was a small price to pay. And it would get | ||||
| smaller still if ECN deployment widened. | ||||
| A partial solution would be to preferentially drop packets arriving | A partial solution would be to preferentially drop packets arriving | |||
| at a congested router that were already marked. There is no solution | at a congested router that were already marked. There is no solution | |||
| to the problem of marking a packet when congestion is caused by | to the problem of marking a packet when congestion is caused by | |||
| another packet that should have been dropped. However, the chance of | another packet that should have been dropped. However, the chance of | |||
| such an occurrence is very low and the consequences are not | such an occurrence is very low and the consequences are not | |||
| significant. It merely causes an application to very occasionally | significant. It merely causes an application to very occasionally | |||
| slow down its rate when it did not have to. | slow down its rate when it did not have to. | |||
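The bound on unnecessary marking stated in the newer text of this section can be illustrated with a short calculation. This is a sketch under stated assumptions: the function name `unnecessary_marking_bound` and the example rates are hypothetical, and the two marking rates are treated as independent probabilities, as the "product of two small probabilities" wording suggests.

```python
def unnecessary_marking_bound(local_rate: float, upstream_rate: float) -> float:
    """Upper bound on the fraction of packets an LSR marks (or drops)
    unnecessarily, taken as the product of its own marking rate and the
    marking rate of upstream LSRs in the same MPLS domain.
    """
    for rate in (local_rate, upstream_rate):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("marking rates must lie in [0, 1]")
    return local_rate * upstream_rate


# Hypothetical rates: 1% local marking, 2% marking from upstream LSRs.
bound = unnecessary_marking_bound(0.01, 0.02)
print(f"unnecessary marking is at most {bound:.4%} of packets")
```

With these example rates the bound is about 0.02% of packets, and it is zero whenever either the local LSR or the upstream LSRs are uncongested.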
| 9.2. Non-ECN capable routers in an MPLS Domain | 9.2. Non-ECN capable routers in an MPLS Domain | |||
| skipping to change at page 19, line 7 | skipping to change at page 16, line 19 | |||
| An ECN sender can use the ECN nonce [RFC3540] to detect a misbehaving | An ECN sender can use the ECN nonce [RFC3540] to detect a misbehaving | |||
| receiver. The ECN nonce works correctly across an MPLS domain | receiver. The ECN nonce works correctly across an MPLS domain | |||
| without requiring any specific support from the proposal in this | without requiring any specific support from the proposal in this | |||
| draft. The nonce does not need to be present in the MPLS shim | draft. The nonce does not need to be present in the MPLS shim | |||
| header. As long as the nonce is present in the IP header when the | header. As long as the nonce is present in the IP header when the | |||
| ECN information is copied from the last MPLS shim header, it will be | ECN information is copied from the last MPLS shim header, it will be | |||
| overwritten if congestion has been experienced by an LSR. This is | overwritten if congestion has been experienced by an LSR. This is | |||
| all that is necessary for the sender to detect a misbehaving | all that is necessary for the sender to detect a misbehaving | |||
| receiver. | receiver. | |||
| An alternative proposal currently in progress in the IETF | ||||
| [I-D.briscoe-tsvwg-re-ecn-tcp] allows the network to prevent | ||||
| misbehavior by senders or receivers or other routers. Like the ECN | ||||
| nonce, it works correctly without requiring any specific support from | ||||
| the proposal in this draft. It uses a bit in the IP header (the RE | ||||
| bit) which is set by the sender and never changed along the path; it | ||||
| is only read by certain policing elements in the network. There is | ||||
| no need for a copy of this bit in the MPLS shim, as policing nodes | ||||
| can examine the IP header if they need to, particularly given they | ||||
| are intended to only be necessary at domain borders where MPLS | ||||
| headers are often removed. | ||||
| 12. Acknowledgments | 12. Acknowledgments | |||
| Thanks to K.K. Ramakrishnan and Sally Floyd for getting us thinking | Thanks to K.K. Ramakrishnan and Sally Floyd for getting us thinking | |||
| about this in the first place and for providing advice on tunneling | about this in the first place and for providing advice on tunneling | |||
| of ECN packets, and to Joe Babiarz, Ben Niven-Jenkins, Phil Eardley, | of ECN packets, and to Sally Floyd, Joe Babiarz, Ben Niven-Jenkins, | |||
| and Ruediger Geib for their comments on the draft. | Phil Eardley, Ruediger Geib, and Magnus Westerlund for their comments | |||
| on the draft. | ||||
| Appendix A. Extension to Pre-Congestion Notification | ||||
| This appendix describes how the mechanisms described in the body of | ||||
| the document can be extended to support PCN | ||||
| [I-D.briscoe-tsvwg-cl-architecture]. Our intent here is to show that | ||||
| the mechanisms are readily extended to more complex scenarios than | ||||
| ECN, particularly in the case where more codepoints are needed, but | ||||
| this appendix may be safely ignored if one is interested only in | ||||
| supporting ECN. Note that the PCN standards are still very much | ||||
| under development at the time of writing, hence the precise details | ||||
| contained in this appendix may be subject to change, and we stress | ||||
| that this appendix is for illustrative purposes only. | ||||
| The relevant aspects of PCN for the purposes of this discussion are: | ||||
| o PCN uses 3 states rather than 2 for ECN - these are referred to as | ||||
| admission marked (AM), pre-emption marked (PM) and not marked (NM) | ||||
| states. (See Section 8.4 for further discussion of PCN and the | ||||
| possibility of using fewer codepoints.) | ||||
| o A packet can go from NM to AM, from NM to PM, or from AM to PM, | ||||
| but no other transition is possible. | ||||
| o The determination of whether a packet is subject to PCN is based | ||||
| on the PHB of the packet. | ||||
| Thus, to support PCN fully in an MPLS domain for a particular PHB, a | ||||
| total of 3 codepoints need to be allocated for that PHB. These 3 | ||||
| codepoints represent the admission marked (AM), pre-emption marked | ||||
| (PM) and not marked (NM) states. The procedures described in | ||||
| Section 4 above need to be slightly modified to support this | ||||
| scenario. The following procedures are invoked when the topmost DSCP | ||||
| or EXP value indicates a PHB that supports PCN. | ||||
| Appendix A.1. Label Push onto IP packet | ||||
| If the IP packet header indicates AM, set the EXP value of all | ||||
| entries in the label stack to AM. If the IP packet header indicates | ||||
| PM, set the EXP value of all entries in the label stack to PM. For | ||||
| any other marking of the IP header, set the EXP value of all entries | ||||
| in the label stack to NM. | ||||
| Appendix A.2. Pushing Additional MPLS Labels | ||||
| The procedures of Section 4.2 apply. | ||||
| Appendix A.3. Admission Control or Pre-emption Marking inside MPLS domain | ||||
| The EXP value can be set to AM or PM according to the same procedures | ||||
| as described in [I-D.briscoe-tsvwg-cl-phb]. For the purposes of this | ||||
| document, it does not matter exactly what algorithms are used to | ||||
| decide when to set AM or PM; all that matters is that if a router | ||||
| would have marked AM (or PM) in the IP header, it should set the EXP | ||||
| value in the MPLS header to the AM (or PM) codepoint. | ||||
| Appendix A.4. Popping an MPLS Label (not end of stack) | ||||
| When popping an MPLS Label exposes another MPLS label, the AM or PM | ||||
| marking should be transferred to the exposed EXP field in the | ||||
| following manner: | ||||
| o If the inner EXP value is NM, then it should be set to the same | ||||
| marking state as the EXP value of the popped label stack entry. | ||||
| o If the inner EXP value is AM, it should be unchanged if the popped | ||||
| EXP value was AM, and it should be set to PM if the popped EXP | ||||
| value was PM. If the popped EXP value was NM, this should be | ||||
| logged in some way and the inner EXP value should be unchanged. | ||||
| o If the inner EXP value is PM, it should be unchanged whatever the | ||||
| popped EXP value was, but any EXP value other than PM should be | ||||
| logged. | ||||
| Appendix A.5. Popping the last MPLS Label to expose IP header | ||||
| When popping the last MPLS Label exposes the IP header, there are two | ||||
| cases to consider: | ||||
| o the popping LSR is NOT the egress router of the PCN region, in | ||||
| which case AM or PM marking should be transferred to the exposed | ||||
| IP header field; or | ||||
| o the popping LSR IS the egress router of the PCN region. | ||||
| In the latter case, the behavior of the egress LSR is defined in | ||||
| [I-D.briscoe-tsvwg-cl-architecture] and is beyond the scope of this | ||||
| document. In the former case, the marking should be transferred from | ||||
| the popped MPLS header to the exposed IP header as follows: | ||||
| o If the inner IP header value is neither AM nor PM, and the EXP | ||||
| value was NM, then the IP header should be unchanged. For any | ||||
| other EXP value, the IP header should be set to the same marking | ||||
| state as the EXP value of the popped label stack entry. | ||||
| o If the inner IP header value is AM, it should be unchanged if the | ||||
| popped EXP value was AM, and it should be set to PM if the popped | ||||
| EXP value was PM. If the popped EXP value was NM, this should be | ||||
| logged in some way and the inner IP header value should be | ||||
| unchanged. | ||||
| o If the inner IP header value is PM, it should be unchanged whatever the | ||||
| popped EXP value was, but any EXP value other than PM should be | ||||
| logged. | ||||
| 13. References | 13. References | |||
| 13.1. Normative References | 13.1. Normative References | |||
| [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
| Requirement Levels", BCP 14, RFC 2119, March 1997. | Requirement Levels", BCP 14, RFC 2119, March 1997. | |||
| [RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z., | ||||
| and W. Weiss, "An Architecture for Differentiated | ||||
| Services", RFC 2475, December 1998. | ||||
| [RFC3031] Rosen, E., Viswanathan, A., and R. Callon, "Multiprotocol | [RFC3031] Rosen, E., Viswanathan, A., and R. Callon, "Multiprotocol | |||
| Label Switching Architecture", RFC 3031, January 2001. | Label Switching Architecture", RFC 3031, January 2001. | |||
| [RFC3032] Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y., | [RFC3032] Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y., | |||
| Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack | Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack | |||
| Encoding", RFC 3032, January 2001. | Encoding", RFC 3032, January 2001. | |||
| [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition | [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition | |||
| of Explicit Congestion Notification (ECN) to IP", | of Explicit Congestion Notification (ECN) to IP", | |||
| RFC 3168, September 2001. | RFC 3168, September 2001. | |||
| [RFC3260] Grossman, D., "New Terminology and Clarifications for | ||||
| Diffserv", RFC 3260, April 2002. | ||||
| [RFC3270] Le Faucheur, F., Wu, L., Davie, B., Davari, S., Vaananen, | [RFC3270] Le Faucheur, F., Wu, L., Davie, B., Davari, S., Vaananen, | |||
| P., Krishnan, R., Cheval, P., and J. Heinanen, "Multi- | P., Krishnan, R., Cheval, P., and J. Heinanen, "Multi- | |||
| Protocol Label Switching (MPLS) Support of Differentiated | Protocol Label Switching (MPLS) Support of Differentiated | |||
| Services", RFC 3270, May 2002. | Services", RFC 3270, May 2002. | |||
| [RFC4301] Kent, S. and K. Seo, "Security Architecture for the | ||||
| Internet Protocol", RFC 4301, December 2005. | ||||
| 13.2. Informative References | 13.2. Informative References | |||
| [Briscoe] "Layered Encapsulation of Congestion Notification", | ||||
| June 2007. | ||||
| Work in progress. | ||||
| [Floyd] "A Proposal to Incorporate ECN in MPLS", 1999. | [Floyd] "A Proposal to Incorporate ECN in MPLS", 1999. | |||
| Work in progress. http://www.icir.org/floyd/papers/ | Work in progress. http://www.icir.org/floyd/papers/ | |||
| draft-ietf-mpls-ecn-00.txt | draft-ietf-mpls-ecn-00.txt | |||
| [I-D.briscoe-tsvwg-cl-architecture] | [I-D.briscoe-tsvwg-cl-architecture] | |||
| Briscoe, B., "An edge-to-edge Deployment Model for Pre- | Briscoe, B., "An edge-to-edge Deployment Model for Pre- | |||
| Congestion Notification: Admission Control over a | Congestion Notification: Admission Control over a | |||
| DiffServ Region", draft-briscoe-tsvwg-cl-architecture-04 | DiffServ Region", draft-briscoe-tsvwg-cl-architecture-04 | |||
| (work in progress), October 2006. | (work in progress), October 2006. | |||
| [I-D.briscoe-tsvwg-cl-phb] | [I-D.briscoe-tsvwg-cl-phb] | |||
| Briscoe, B., "Pre-Congestion Notification marking", | Briscoe, B., "Pre-Congestion Notification marking", | |||
| draft-briscoe-tsvwg-cl-phb-03 (work in progress), | draft-briscoe-tsvwg-cl-phb-03 (work in progress), | |||
| October 2006. | October 2006. | |||
| [I-D.briscoe-tsvwg-re-ecn-border-cheat] | [I-D.charny-pcn-single-marking] | |||
| Briscoe, B., "Emulating Border Flow Policing using Re-ECN | Charny, A., "Pre-Congestion Notification Using Single | |||
| on Bulk Data", draft-briscoe-tsvwg-re-ecn-border-cheat-01 | Marking for Admission and Pre-emption", | |||
| (work in progress), June 2006. | draft-charny-pcn-single-marking-01 (work in progress), | |||
| March 2007. | ||||
| [I-D.briscoe-tsvwg-re-ecn-tcp] | ||||
| Briscoe, B., "Re-ECN: Adding Accountability for Causing | ||||
| Congestion to TCP/IP", draft-briscoe-tsvwg-re-ecn-tcp-03 | ||||
| (work in progress), October 2006. | ||||
| [I-D.chan-tsvwg-diffserv-class-aggr] | ||||
| Chan, K., "Aggregation of DiffServ Service Classes", | ||||
| draft-chan-tsvwg-diffserv-class-aggr-03 (work in | ||||
| progress), January 2006. | ||||
| [I-D.ietf-nsis-rmd] | [I-D.ietf-nsis-rmd] | |||
| Bader, A., "RMD-QOSM - The Resource Management in Diffserv | Bader, A., "RMD-QOSM - The Resource Management in Diffserv | |||
| QOS Model", draft-ietf-nsis-rmd-08 (work in progress), | QOS Model", draft-ietf-nsis-rmd-09 (work in progress), | |||
| October 2006. | March 2007. | |||
| [I-D.ietf-tsvwg-diffserv-class-aggr] | ||||
| Chan, K., "Aggregation of DiffServ Service Classes", | ||||
| draft-ietf-tsvwg-diffserv-class-aggr-02 (work in | ||||
| progress), March 2007. | ||||
| [I-D.lefaucheur-rsvp-ecn] | [I-D.lefaucheur-rsvp-ecn] | |||
| Le Faucheur, F., "RSVP Extensions for Admission Control over | Le Faucheur, F., "RSVP Extensions for Admission Control over | |||
| Diffserv using Pre-congestion Notification (PCN)", | Diffserv using Pre-congestion Notification (PCN)", | |||
| draft-lefaucheur-rsvp-ecn-01 (work in progress), | draft-lefaucheur-rsvp-ecn-01 (work in progress), | |||
| June 2006. | June 2006. | |||
| [RFC3260] Grossman, D., "New Terminology and Clarifications for | ||||
| Diffserv", RFC 3260, April 2002. | ||||
| [RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit | [RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit | |||
| Congestion Notification (ECN) Signaling with Nonces", | Congestion Notification (ECN) Signaling with Nonces", | |||
| RFC 3540, June 2003. | RFC 3540, June 2003. | |||
| [RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram | [RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram | |||
| Congestion Control Protocol (DCCP)", RFC 4340, March 2006. | Congestion Control Protocol (DCCP)", RFC 4340, March 2006. | |||
| [Shayman] "Using ECN to Signal Congestion Within an MPLS Domain", | [Shayman] "Using ECN to Signal Congestion Within an MPLS Domain", | |||
| 2000. | 2000. | |||
| End of changes. 47 change blocks. | ||||
| 293 lines changed or deleted | 303 lines changed or added | |||
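
The marking rules in Appendix A.1, A.4 and A.5 of the -00 text can be read as a single "keep the more severe marking" rule, with NM < AM < PM. The sketch below is illustrative only and is not part of either draft. The `Mark` type and the function names `push_marks` and `merge_on_pop` are hypothetical. An IP header that is "neither AM nor PM" is modelled as `NM`, and the "should be logged" cases are reduced to a boolean flag. For Appendix A.5 the sketch covers only the case where the popping LSR is not the egress router of the PCN region.

```python
from enum import IntEnum


class Mark(IntEnum):
    """PCN marking states, ordered by severity (hypothetical encoding)."""
    NM = 0  # not marked
    AM = 1  # admission marked
    PM = 2  # pre-emption marked


def push_marks(ip_mark: Mark, depth: int) -> list[Mark]:
    """Appendix A.1: every pushed label stack entry copies the IP marking.

    Any IP marking other than AM or PM is assumed to be passed in as NM.
    """
    return [ip_mark] * depth


def merge_on_pop(inner: Mark, popped: Mark) -> tuple[Mark, bool]:
    """Appendix A.4 / A.5: transfer the popped marking to the exposed header.

    Returns the new inner marking and a flag that is True when the text
    says the event should be logged, i.e. the popped marking is less
    severe than the inner one, which no permitted transition can produce.
    """
    should_log = popped < inner
    return max(inner, popped), should_log


if __name__ == "__main__":
    print(push_marks(Mark.AM, 2))
    print(merge_on_pop(Mark.NM, Mark.PM))
    print(merge_on_pop(Mark.AM, Mark.NM))
```

Using an ordered enumeration makes each bullet of A.4 and A.5 fall out of one comparison: an inner NM always takes the popped value, an inner AM is raised only by a popped PM, and an inner PM is never changed.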