| draft-davie-ecn-mpls-00.txt | draft-davie-ecn-mpls-01.txt | |||
|---|---|---|---|---|
| Network Working Group B. Davie | Network Working Group B. Davie | |||
| Internet-Draft Cisco Systems, Inc. | Internet-Draft Cisco Systems, Inc. | |||
| Expires: December 20, 2006 B. Briscoe | Intended status: Standards Track B. Briscoe | |||
| J. Tay | Expires: April 21, 2007 J. Tay | |||
| BT Research | BT Research | |||
| June 18, 2006 | October 18, 2006 | |||
| Explicit Congestion Marking in MPLS | Explicit Congestion Marking in MPLS | |||
| draft-davie-ecn-mpls-00.txt | draft-davie-ecn-mpls-01.txt | |||
| Status of this Memo | Status of this Memo | |||
| By submitting this Internet-Draft, each author represents that any | By submitting this Internet-Draft, each author represents that any | |||
| applicable patent or other IPR claims of which he or she is aware | applicable patent or other IPR claims of which he or she is aware | |||
| have been or will be disclosed, and any of which he or she becomes | have been or will be disclosed, and any of which he or she becomes | |||
| aware will be disclosed, in accordance with Section 6 of BCP 79. | aware will be disclosed, in accordance with Section 6 of BCP 79. | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF), its areas, and its working groups. Note that | Task Force (IETF), its areas, and its working groups. Note that | |||
| skipping to change at page 1, line 36 | skipping to change at page 1, line 36 | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
| http://www.ietf.org/ietf/1id-abstracts.txt. | http://www.ietf.org/ietf/1id-abstracts.txt. | |||
| The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
| http://www.ietf.org/shadow.html. | http://www.ietf.org/shadow.html. | |||
| This Internet-Draft will expire on December 20, 2006. | This Internet-Draft will expire on April 21, 2007. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (C) The Internet Society (2006). | Copyright (C) The Internet Society (2006). | |||
| Abstract | Abstract | |||
| RFC 3270 defines how to support the Diffserv arhitecture in MPLS | RFC 3270 defines how to support the Diffserv architecture in MPLS | |||
| networks, including how to encode Diffserv Code Points (DSCPs) in an | networks, including how to encode Diffserv Code Points (DSCPs) in an | |||
| MPLS header. DSCPs may be encoded in the EXP field, while other uses | MPLS header. DSCPs may be encoded in the EXP field, while other uses | |||
| of that field are not precluded. RFC3270 makes no statement about | of that field are not precluded. RFC3270 makes no statement about | |||
| how Explicit Congestion Notification (ECN) marking might be encoded | how Explicit Congestion Notification (ECN) marking might be encoded | |||
| in the MPLS header. This draft defines how an operator might define | in the MPLS header. This draft defines how an operator might define | |||
| some of the EXP codepoints for explicit congestion notification, | some of the EXP codepoints for explicit congestion notification, | |||
| without precluding other uses. | without precluding other uses. | |||
| Requirements Language | Requirements Language | |||
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
| "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
| document are to be interpreted as described in RFC 2119 [RFC2119]. | document are to be interpreted as described in RFC 2119 [RFC2119]. | |||
| Table of Contents | Table of Contents | |||
| 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
| 1.1. Background . . . . . . . . . . . . . . . . . . . . . . . . 4 | 1.1. Changes From Previous (-00) Version . . . . . . . . . . . 4 | |||
| 1.2. Intent . . . . . . . . . . . . . . . . . . . . . . . . . . 4 | 1.2. Background . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
| 1.3. Terminology . . . . . . . . . . . . . . . . . . . . . . . 5 | 1.3. Intent . . . . . . . . . . . . . . . . . . . . . . . . . . 5 | |||
| 2. Use of MPLS EXP Field for ECN . . . . . . . . . . . . . . . . 5 | 1.4. Terminology . . . . . . . . . . . . . . . . . . . . . . . 5 | |||
| 3. Per-domain ECT checking . . . . . . . . . . . . . . . . . . . 7 | 2. Use of MPLS EXP Field for ECN . . . . . . . . . . . . . . . . 6 | |||
| 4. ECN-enabled MPLS domain . . . . . . . . . . . . . . . . . . . 8 | 3. Per-domain ECT checking . . . . . . . . . . . . . . . . . . . 8 | |||
| 4.1. Pushing (adding) one or more labels to an IP packet . . . 8 | 4. ECN-enabled MPLS domain . . . . . . . . . . . . . . . . . . . 9 | |||
| 4.2. Pushing one or more labels onto an MPLS labelled packet . 8 | 4.1. Pushing (adding) one or more labels to an IP packet . . . 9 | |||
| 4.2. Pushing one or more labels onto an MPLS labelled packet . 9 | ||||
| 4.3. Congestion experienced in an interior MPLS node . . . . . 9 | 4.3. Congestion experienced in an interior MPLS node . . . . . 9 | |||
| 4.4. Crossing a Diffserv Domain Boundary . . . . . . . . . . . 9 | 4.4. Crossing a Diffserv Domain Boundary . . . . . . . . . . . 9 | |||
| 4.5. Popping an MPLS label (not the end of the stack) . . . . . 9 | 4.5. Popping an MPLS label (not the end of the stack) . . . . . 10 | |||
| 4.6. Popping the last MPLS label in the stack . . . . . . . . . 9 | 4.6. Popping the last MPLS label in the stack . . . . . . . . . 10 | |||
| 4.7. Diffserv Tunneling Models . . . . . . . . . . . . . . . . 10 | 4.7. Diffserv Tunneling Models . . . . . . . . . . . . . . . . 11 | |||
| 4.8. Extension to Pre-Congestion Notification . . . . . . . . . 10 | 4.8. Extension to Pre-Congestion Notification . . . . . . . . . 11 | |||
| 4.8.1. Label Push onto IP packet . . . . . . . . . . . . . . 10 | 4.8.1. Label Push onto IP packet . . . . . . . . . . . . . . 12 | |||
| 4.8.2. Pushing Additional MPLS Labels . . . . . . . . . . . . 10 | 4.8.2. Pushing Additional MPLS Labels . . . . . . . . . . . . 12 | |||
| 4.8.3. Admission Control or Pre-emption Marking inside | 4.8.3. Admission Control or Pre-emption Marking inside | |||
| MPLS domain . . . . . . . . . . . . . . . . . . . . . 11 | MPLS domain . . . . . . . . . . . . . . . . . . . . . 12 | |||
| 4.8.4. Popping an MPLS Label (not end of stack) . . . . . . . 11 | 4.8.4. Popping an MPLS Label (not end of stack) . . . . . . . 12 | |||
| 4.8.5. Popping the last MPLS Label to expose IP header . . . 11 | 4.8.5. Popping the last MPLS Label to expose IP header . . . 12 | |||
| 5. ECN-disabled MPLS domain . . . . . . . . . . . . . . . . . . . 11 | 5. ECN-disabled MPLS domain . . . . . . . . . . . . . . . . . . . 13 | |||
| 6. The use of more codepoints with E-LSPs and L-LSPs . . . . . . 11 | 6. The use of more codepoints with E-LSPs and L-LSPs . . . . . . 13 | |||
| 7. Relationship to tunnel behavior in RFC 3168 . . . . . . . . . 12 | 7. Relationship to tunnel behavior in RFC 3168 . . . . . . . . . 13 | |||
| 7.1. Alternative approach to support ECN in an MPLS domain . . 12 | 7.1. Alternative approach to support ECN in an MPLS domain . . 14 | |||
| 8. Example Uses . . . . . . . . . . . . . . . . . . . . . . . . . 13 | 8. Example Uses . . . . . . . . . . . . . . . . . . . . . . . . . 15 | |||
| 8.1. RFC3168-style ECN . . . . . . . . . . . . . . . . . . . . 13 | 8.1. RFC3168-style ECN . . . . . . . . . . . . . . . . . . . . 15 | |||
| 8.2. ECN Co-existence with Diffserv E-LSPs . . . . . . . . . . 14 | 8.2. ECN Co-existence with Diffserv E-LSPs . . . . . . . . . . 15 | |||
| 8.3. Congestion-feedback-based Traffic Engineering . . . . . . 14 | 8.3. Congestion-feedback-based Traffic Engineering . . . . . . 16 | |||
| 8.4. PCN flow admission control and flow pre-emption . . . . . 14 | 8.4. PCN flow admission control and flow pre-emption . . . . . 16 | |||
| 9. Deployment Considerations . . . . . . . . . . . . . . . . . . 15 | 9. Deployment Considerations . . . . . . . . . . . . . . . . . . 17 | |||
| 9.1. Marking non-ECN Capable Packets . . . . . . . . . . . . . 15 | 9.1. Marking non-ECN Capable Packets . . . . . . . . . . . . . 17 | |||
| 9.2. Non-ECN capable routers in an MPLS Domain . . . . . . . . 16 | 9.2. Non-ECN capable routers in an MPLS Domain . . . . . . . . 17 | |||
| 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 16 | 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 18 | |||
| 11. Security Considerations . . . . . . . . . . . . . . . . . . . 16 | 11. Security Considerations . . . . . . . . . . . . . . . . . . . 18 | |||
| 12. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 17 | 12. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 19 | |||
| 13. References . . . . . . . . . . . . . . . . . . . . . . . . . . 17 | 13. References . . . . . . . . . . . . . . . . . . . . . . . . . . 19 | |||
| 13.1. Normative References . . . . . . . . . . . . . . . . . . . 17 | 13.1. Normative References . . . . . . . . . . . . . . . . . . . 19 | |||
| 13.2. Informative References . . . . . . . . . . . . . . . . . . 18 | 13.2. Informative References . . . . . . . . . . . . . . . . . . 20 | |||
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 20 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 21 | |||
| Intellectual Property and Copyright Statements . . . . . . . . . . 21 | Intellectual Property and Copyright Statements . . . . . . . . . . 22 | |||
| 1. Introduction | 1. Introduction | |||
| 1.1. Background | 1.1. Changes From Previous (-00) Version | |||
| [RFC3270] defines how to support the Diffserv arhitecture in MPLS | [Note to RFC Editor: This section to be removed before publication] | |||
| o Corrected the description of ECN-MPLS marking proposed in | ||||
| [Shayman], which closely corresponds to that proposed in this | ||||
| document. | ||||
| o Pre-congestion notification (PCN) marking is now described in a | ||||
| way that does not require normative references to PCN | ||||
| specifications. PCN discussion now serves only to illustrate how | ||||
| the ECN marking concepts can be extended to cover more complex | ||||
| scenarios, with PCN being an example. | ||||
| o Added specification of behavior when MPLS encapsulated packets | ||||
| cross from an ECN-enabled domain to a domain that is not ECN- | ||||
| enabled. | ||||
| o Clarified that copying MPLS ECN or PCN marking into exposed IP | ||||
| header on egress is not mandatory. | ||||
| o Fixed typos and nits | ||||
| 1.2. Background | ||||
| [RFC3270] defines how to support the Diffserv architecture in MPLS | ||||
| networks, including how to encode Diffserv Code Points (DSCPs) in an | networks, including how to encode Diffserv Code Points (DSCPs) in an | |||
| MPLS header. DSCPs may be encoded in the EXP field, while other uses | MPLS header. DSCPs may be encoded in the EXP field, while other uses | |||
| of that field are not precluded. RFC3270 makes no statement about | of that field are not precluded. RFC3270 makes no statement about | |||
| how Explicit Congestion Notification (ECN) marking might be encoded | how Explicit Congestion Notification (ECN) marking might be encoded | |||
| in the MPLS header. This draft defines how an operator might define | in the MPLS header. This draft defines how an operator might define | |||
| some of the EXP codepoints for explicit congestion notification, | some of the EXP codepoints for explicit congestion notification, | |||
| without precluding other uses. In parallel to the activity defining | without precluding other uses. In parallel to the activity defining | |||
| the addition of ECN to IP [RFC3168], two proposals were made to add | the addition of ECN to IP [RFC3168], two proposals were made to add | |||
| ECN to MPLS [Floyd][Shayman]. These proposals, however, fell by the | ECN to MPLS [Floyd][Shayman]. These proposals, however, fell by the | |||
| way-side. With ECN for IP now being a proposed standard, and | wayside. With ECN for IP now being a proposed standard, and | |||
| developing interest in using pre-congestion notification (PCN) for | developing interest in using pre-congestion notification (PCN) for | |||
| admission control and flow pre-emption[I-D.briscoe-tsvwg-cl- | admission control and flow pre-emption | |||
| architecture], there is consequent interest in being able to support | [I-D.briscoe-tsvwg-cl-architecture], there is consequent interest in | |||
| ECN across IP networks consisting of MPLS-enabled domains. Therefore | being able to support ECN across IP networks consisting of MPLS- | |||
| it is necessary to specify the protocol for including ECN or PCN in | enabled domains. Therefore it is necessary to specify the protocol | |||
| the MPLS shim header, and the protocol behaviour of edge MPLS nodes. | for including ECN in the MPLS shim header, and the protocol behavior | |||
| of edge MPLS nodes. | ||||
| We note that in [RFC3168] there are four codepoints used for ECN | We note that in [RFC3168] there are four codepoints used for ECN | |||
| marking, which are encoded using two bits of the IP header. The MPLS | marking, which are encoded using two bits of the IP header. The MPLS | |||
| EXP field is the logical place to encode ECN codepoints, but with | EXP field is the logical place to encode ECN codepoints, but with | |||
| only 3 bits (8 codepoints) available, and with the same field being | only 3 bits (8 codepoints) available, and with the same field being | |||
| used to convey DSCP information as well, there is a clear incentive | used to convey DSCP information as well, there is a clear incentive | |||
| to conserve the number of codepoints consumed for ECN purposes. | to conserve the number of codepoints consumed for ECN purposes. | |||
| Efficient use of the EXP field has been a focus of prior drafts | Efficient use of the EXP field has been a focus of prior drafts | |||
| [Floyd] [Shayman] and we draw on those efforts in this draft as well. | [Floyd] [Shayman] and we draw on those efforts in this draft as well. | |||
| 1.2. Intent | 1.3. Intent | |||
| Our intent is to specify how the MPLS shim header [RFC3032] should | Our intent is to specify how the MPLS shim header [RFC3032] should | |||
| denote ECN marking and how MPLS nodes should understand whether the | denote ECN marking and how MPLS nodes should understand whether the | |||
| transport for a packet will be ECN capable. We offer this as a | transport for a packet will be ECN capable. We offer this as a | |||
| building block, from which to build different congestion notification | building block, from which to build different congestion notification | |||
| systems. We do not intend to specify how the resulting congestion | systems. We do not intend to specify how the resulting congestion | |||
| notification is fed back to an upstream node that can mitigate | notification is fed back to an upstream node that can mitigate | |||
| congestion. For instance, unlike [Shayman], we do not specify edge- | congestion. For instance, unlike [Shayman], we do not specify edge- | |||
| to-edge MPLS domain feedback, but we also do not preclude it. | to-edge MPLS domain feedback, but we also do not preclude it. | |||
| Nonetheless, we do specify how the egress node of an MPLS domain | Nonetheless, we do specify how the egress node of an MPLS domain | |||
| should copy congestion notification from the MPLS shim into the | should copy congestion notification from the MPLS shim into the | |||
| underlying IP header if the ECN is to be carried onward towards the | underlying IP header if the ECN is to be carried onward towards the | |||
| IP receiver. But we do NOT mandate that MPLS congestion notification | IP receiver. But we do NOT mandate that MPLS congestion notification | |||
| must be copied into the IP header for onward transmission. This | must be copied into the IP header for onward transmission. This | |||
| draft aims to be generic for any use of congestion notification in | draft aims to be generic for any use of congestion notification in | |||
| MPLS. PCN or traffic engineering are merely two of many motivating | MPLS. PCN or traffic engineering are merely two of many motivating | |||
| applications (see Section 8). | applications (see Section 8). | |||
| 1.3. Terminology | 1.4. Terminology | |||
| This document draws freely on the terminology of ECN [RFC3168] and | This document draws freely on the terminology of ECN [RFC3168] and | |||
| MPLS [RFC3031]. For ease of reference, we have included some | MPLS [RFC3031]. For ease of reference, we have included some | |||
| definitions here, but refer the reader to the references above for | definitions here, but refer the reader to the references above for | |||
| complete specifications of the relevant technologies: | complete specifications of the relevant technologies: | |||
| o CE: Congestion Experienced. One of the states with which a packet | o CE: Congestion Experienced. One of the states with which a packet | |||
| may be marked in a network supporting ECN. A packet is marked in | may be marked in a network supporting ECN. A packet is marked in | |||
| this state by an ECN-capable router, to indicate that this router | this state by an ECN-capable router, to indicate that this router | |||
| was experiencing congestion at the time the packet arrived. | was experiencing congestion at the time the packet arrived. | |||
| skipping to change at page 6, line 7 | skipping to change at page 6, line 29 | |||
| already defines use of codepoints in the EXP field for differentiated | already defines use of codepoints in the EXP field for differentiated | |||
| services. Although it does not preclude other compatible uses of the | services. Although it does not preclude other compatible uses of the | |||
| EXP field, this clearly seems to limit the space available for ECN, | EXP field, this clearly seems to limit the space available for ECN, | |||
| given the field is only 3 bits (8 codepoints). | given the field is only 3 bits (8 codepoints). | |||
| RFC 3270 defines two possible approaches for requesting | RFC 3270 defines two possible approaches for requesting | |||
| differentiated service treatment from an LSR. | differentiated service treatment from an LSR. | |||
| o In the E-LSP approach, different codepoints of the EXP field in | o In the E-LSP approach, different codepoints of the EXP field in | |||
| the MPLS shim header are used to indicate the packet's per hop | the MPLS shim header are used to indicate the packet's per hop | |||
| behaviour (PHB). | behavior (PHB). | |||
| o In the L-LSP approach, an MPLS label is assigned for each PHB | o In the L-LSP approach, an MPLS label is assigned for each PHB | |||
| scheduling class (PSC, as defined in [RFC3260]), so that an LSR | scheduling class (PSC, as defined in [RFC3260]), so that an LSR | |||
| determines both its forwarding and its scheduling behaviour from | determines both its forwarding and its scheduling behavior from | |||
| the label. | the label. | |||
| If an MPLS domain uses the L-LSP approach, there is likely to be | If an MPLS domain uses the L-LSP approach, there is likely to be | |||
| space in the EXP field for ECN codepoint(s). Where the E-LSP | space in the EXP field for ECN codepoint(s). Where the E-LSP | |||
| approach is used, then codepoint space in the EXP field is likely to | approach is used, then codepoint space in the EXP field is likely to | |||
| be scarce. This draft focuses on interworking ECN marking with the | be scarce. This draft focuses on interworking ECN marking with the | |||
| E-LSP approach as it is the tougher problem. Consequently the same | E-LSP approach as it is the tougher problem. Consequently the same | |||
| approach can also be applied with L-LSPs. | approach can also be applied with L-LSPs. | |||
| We recommend that explicit congestion notification in MPLS should use | We recommend that explicit congestion notification in MPLS should use | |||
| codepoints instead of bits in the EXP field. Since not every DSCP | codepoints instead of bits in the EXP field. Since not every PHB | |||
| will need an associated ECN codepoint and some DSCPs might need two | will need an associated ECN codepoint and in some applications a | |||
| ECN codepoints [I-D.briscoe-tsvwg-cl-architecture], it would be | given PHB might need two ECN codepoints (see, for | |||
| wasteful and incorrect to assign a bit for ECN. | example, [I-D.briscoe-tsvwg-cl-architecture]), it would be wasteful to | |||
| assign a dedicated bit for ECN. | ||||
| For each PHB that uses ECN marking, we assume one EXP codepoint will | For each PHB that uses ECN marking, we assume one EXP codepoint will | |||
| be defined meaning not congestion marked (Not-CM), and at least one | be defined meaning not congestion marked (Not-CM), and at least one | |||
| other codepoint will be defined meaning congestion marked (CM). | other codepoint will be defined meaning congestion marked (CM). | |||
| Therefore, each PHB that uses ECN marking will consume at least two | Therefore, each PHB that uses ECN marking will consume at least two | |||
| EXP codepoints. But PHBs that do not use ECN marking will only | EXP codepoints. But PHBs that do not use ECN marking will only | |||
| consume one. | consume one. | |||
| Further, we wish to use minimal space in the MPLS shim header to tell | Further, we wish to use minimal space in the MPLS shim header to tell | |||
| interior LSRs whether each packet will be received by an ECN-capable | interior LSRs whether each packet will be received by an ECN-capable | |||
| skipping to change at page 6, line 50 | skipping to change at page 7, line 26 | |||
| o One possible approach is for congested LSRs to mark the ECN field | o One possible approach is for congested LSRs to mark the ECN field | |||
| in the underlying IP header at the bottom of the label stack. | in the underlying IP header at the bottom of the label stack. | |||
| Although many commercial LSRs routinely access the IP header for | Although many commercial LSRs routinely access the IP header for | |||
| other reasons (ECMP), there are numerous drawbacks to attempting | other reasons (ECMP), there are numerous drawbacks to attempting | |||
| to find an IP header beneath an MPLS label stack. Notably, there | to find an IP header beneath an MPLS label stack. Notably, there | |||
| is the challenge of detecting the absence of an IP header when | is the challenge of detecting the absence of an IP header when | |||
| non-IP packets are carried on an LSP. Therefore we will not | non-IP packets are carried on an LSP. Therefore we will not | |||
| consider this approach further. | consider this approach further. | |||
| o In the schemes suggested by [Floyd] and [Shayman], ECT and CE are | o In the scheme suggested by [Floyd] ECT and CE are overloaded into | |||
| overloaded into one bit, so that a 0 means ECT while a 1 might | one bit, so that a 0 means ECT while a 1 might either mean Not-ECT | |||
| either mean Not-ECT or it might mean CE. A packet that has been | or it might mean CE. A packet that has been marked as having | |||
| marked as having experienced congestion upstream, and then is | experienced congestion upstream, and then is picked out for | |||
| picked out for marking at a second congested LSR, will be dropped | marking at a second congested LSR, will be dropped by the second | |||
| by the second LSR since it cannot determine whether the packet has | LSR since it cannot determine whether the packet has previously | |||
| previously experienced congestion or if ECN is not supported by | experienced congestion or if ECN is not supported by the | |||
| the transport. | transport. | |||
| While such an approach seemed potentially palatable for | While such an approach seemed potentially palatable, we do not | |||
| traditional ECN, we do not recommend it here for the following | recommend it here for the following reasons. In some cases we | |||
| reasons. In some cases we wish to be able to use ECN marking long | wish to be able to use ECN marking long before actual congestion | |||
| before actual congestion (e.g. pre-congestion notification). In | (e.g. pre-congestion notification). In these circumstances, | |||
| these circumstances, marking rates at each LSR might be non- | marking rates at each LSR might be non-negligible most of the | |||
| negligible most of the time, so the chances of a previously marked | time, so the chances of a previously marked packet encountering an | |||
| packet encountering an LSR that wants to mark it again will also | LSR that wants to mark it again will also be non-negligible. In | |||
| be non-negligible. This will lead to unacceptable drop rates. | the case where CE and not-ECT are indistinguishable to core | |||
| For instance, if the typical marking rate at every router or LSR | routers, such a scenario could lead to unacceptable drop rates. | |||
| is p, and the typical diameter of the network of LSRs is d, then | If the typical marking rate at every router or LSR is p, and the | |||
| the probability that a marked packet will be marked again is 1- | typical diameter of the network of LSRs is d, then the probability | |||
| [1+p(d-1)][1-p]^(d-1). For instance, with 6 LSRs in a row, each | that a marked packet will be chosen for marking more than once is | |||
| marking ECN with 1% probability, this bit overloading scheme would | 1-[Pr(never marked) + Pr(marked at exactly one hop)] = 1- [(1-p)^d | |||
| introduce a drop rate of 0.15% unnecessarily. Given most modern | + dp(1-p)^(d-1)]. For instance, with 6 LSRs in a row, each | |||
| core networks are sized to introduce near-zero packet drop, it may | marking ECN with 1% probability, the chance of a packet that is | |||
| be unacceptable to drop over one in a thousand packets | already marked being chosen for marking a second time is 0.15%. | |||
| unnecessarily. | The bit overloading scheme would therefore introduce a drop rate | |||
| of 0.15% unnecessarily. Given that most modern core networks are | ||||
| sized to introduce near-zero packet drop, it may be unacceptable | ||||
| to drop over one in a thousand packets unnecessarily. | ||||
| o A third possible approach is for interior LSRs to assume that the | o A third possible approach was suggested by [Shayman]. In this | |||
| endpoints are ECN-capable, but this assumption is checked when the | scheme, interior LSRs assume that the endpoints are ECN-capable, | |||
| final label is popped. If an interior LSR has marked ECN in the | but this assumption is checked when the final label is popped. If | |||
| EXP field of the shim, but the IP header says the endpoints are | an interior LSR has marked ECN in the EXP field of the shim | |||
| not ECN capable, the edge router (or penultimate if using | header, but the IP header says the endpoints are not ECN capable, | |||
| penultimate hop popping) drops the packet. We recommend this | the edge router (or penultimate router, if using penultimate hop | |||
| scheme, which we call `per-domain ECT checking'; and define it | popping) drops the packet. We recommend this scheme, which we | |||
| more precisely in the following section. Its chief drawback is | call `per-domain ECT checking', and define it more precisely in | |||
| that it can involve packets continuing to be forwarded after | the following section. Its chief drawback is that it can cause | |||
| encountering congestion only to be dropped at the egress of the | packets to be forwarded after encountering congestion only to be | |||
| MPLS domain. The rationale for this decision is given in | dropped at the egress of the MPLS domain. The rationale for this | |||
| Section 9.1. | decision is given in Section 9.1. | |||
| 3. Per-domain ECT checking | 3. Per-domain ECT checking | |||
| For the purposes of this discussion, we define the egress nodes of an | For the purposes of this discussion, we define the egress nodes of an | |||
| MPLS domain as the nodes that pop the last MPLS label from the label | MPLS domain as the nodes that pop the last MPLS label from the label | |||
| stack, exposing the IP (or, potentially non-IP) header. Note that | stack, exposing the IP (or, potentially non-IP) header. Note that | |||
| such a node may be the ultimate or penultimate hop of an LSP, | such a node may be the ultimate or penultimate hop of an LSP, | |||
| depending on whether penultimate hop popping (PHP) is employed. | depending on whether penultimate hop popping (PHP) is employed. | |||
| In the per-domain ECT checking approach, the egress nodes take | In the per-domain ECT checking approach, the egress nodes take | |||
| skipping to change at page 9, line 14 | skipping to change at page 9, line 41 | |||
| push to the newly added outer label. If more than one label is being | push to the newly added outer label. If more than one label is being | |||
| pushed, the same EXP value is copied to all label stack entries. | pushed, the same EXP value is copied to all label stack entries. | |||
| 4.3. Congestion experienced in an interior MPLS node | 4.3. Congestion experienced in an interior MPLS node | |||
| If the EXP codepoint of the packet maps to a PHB that uses ECN | If the EXP codepoint of the packet maps to a PHB that uses ECN | |||
| marking and the marking algorithm requires the packet to be marked, | marking and the marking algorithm requires the packet to be marked, | |||
| the CM state is set (irrespective of whether it is already in the CM | the CM state is set (irrespective of whether it is already in the CM | |||
| state). | state). | |||
| If the buffer is full, the packet would be dropped. | If the buffer is full, a packet is dropped. | |||
| 4.4. Crossing a Diffserv Domain Boundary | 4.4. Crossing a Diffserv Domain Boundary | |||
| If an MPLS-encapsulated packet crosses a Diffserv domain boundary, it | If an MPLS-encapsulated packet crosses a Diffserv domain boundary, it | |||
| may be the case that the two domains use different encodings of the | may be the case that the two domains use different encodings of the | |||
| same PHB in the EXP field. In such cases, the EXP field must be | same PHB in the EXP field. In such cases, the EXP field must be | |||
| rewritten at the domain boundary. If the PHB is one that supports | rewritten at the domain boundary. If the PHB is one that supports | |||
| ECN, then the appropriate ECN marking should also be preserved when | ECN, then the appropriate ECN marking should also be preserved when | |||
| the EXP field is mapped at the boundary. | the EXP field is mapped at the boundary. | |||
| If an MPLS-encapsulated packet that is in the CM state crosses from a | ||||
| domain that is ECN-enabled (as defined in Section 3) to a domain that | ||||
| is not ECN-enabled, then it is necessary to perform the egress | ||||
| checking procedures at the egress LSR of the ECN-enabled domain. | ||||
| This means that if the encapsulated packet is not ECN capable, the | ||||
| packet MUST be dropped. Note that this implies the egress LSR must | ||||
| be able to look beneath the MPLS header without popping the label | ||||
| stack. | ||||
| The related issue of Diffserv tunnel models is discussed in | The related issue of Diffserv tunnel models is discussed in | |||
| Section 4.7. | Section 4.7. | |||
| 4.5. Popping an MPLS label (not the end of the stack) | 4.5. Popping an MPLS label (not the end of the stack) | |||
| When a packet has more than one MPLS label in the stack and the top | When a packet has more than one MPLS label in the stack and the top | |||
| label is popped, another MPLS label is exposed. In this case the ECN | label is popped, another MPLS label is exposed. In this case the ECN | |||
| information should be transferred from the outer EXP field to the | information should be transferred from the outer EXP field to the | |||
| inner MPLS label in the following manner. If the inner EXP field is | inner MPLS label in the following manner. If the inner EXP field is | |||
| Not-CM, the inner EXP field is set to the same CM or Not-CM state as | Not-CM, the inner EXP field is set to the same CM or Not-CM state as | |||
| skipping to change at page 9, line 49 | skipping to change at page 10, line 38 | |||
| 4.6. Popping the last MPLS label in the stack | 4.6. Popping the last MPLS label in the stack | |||
| When the last MPLS label is popped from the packet, its payload is | When the last MPLS label is popped from the packet, its payload is | |||
| exposed. If that packet is not IP, and does not have any capability | exposed. If that packet is not IP, and does not have any capability | |||
| equivalent to ECT, it is assumed Not-ECT and treated as such. That | equivalent to ECT, it is assumed Not-ECT and treated as such. That | |||
| means that if the EXP value of the MPLS header was CM, the packet | means that if the EXP value of the MPLS header was CM, the packet | |||
| MUST be dropped. | MUST be dropped. | |||
| Assuming an IP packet was exposed, we have to examine whether that | Assuming an IP packet was exposed, we have to examine whether that | |||
| packet is ECT or not. If the inner IP packet is Not-ECT, its ECN | packet is ECT or not. A Not-ECT packet MUST be dropped if the EXP | |||
| field remains unchanged if the EXP field is Not-CM. However, a Not- | field is CM. | |||
| ECT packet MUST be dropped if the EXP field is CM. | ||||
| If the ECN field of the inner packet is set to ECT(0), ECT(1) or CE, | For the remainder of this section, we describe the behavior that is | |||
| the ECN field remains unchanged if the EXP field is set to Not-CM. | required if the ECN information is to be transferred from the MPLS | |||
| The ECN field is set to CE if the EXP field is CM. Note that an | header into the exposed IP header for onward transmission. As noted | |||
| inner value of CE and an outer value of Not-CM should be considered | in Section 1.3, such behavior is not mandated by this document, but | |||
| anomalous, and SHOULD be logged in some way by the LSR. | may be selected by an operator. | |||
| If the inner IP packet is Not-ECT, its ECN field remains unchanged if | ||||
| the EXP field is Not-CM. If the ECN field of the inner packet is set | ||||
| to ECT(0), ECT(1) or CE, the ECN field remains unchanged if the EXP | ||||
| field is set to Not-CM. The ECN field is set to CE if the EXP field | ||||
| is CM. Note that an inner value of CE and an outer value of Not-CM | ||||
| should be considered anomalous, and SHOULD be logged in some way by | ||||
| the LSR. | ||||
| 4.7. Diffserv Tunneling Models | 4.7. Diffserv Tunneling Models | |||
| [RFC3270] describes three tunneling models for Diffserv support | [RFC3270] describes three tunneling models for Diffserv support | |||
| across MPLS Domains, referred to as the uniform, short pipe, and pipe | across MPLS Domains, referred to as the uniform, short pipe, and pipe | |||
| models. The differences between these models lie in whether the | models. The differences between these models lie in whether the | |||
| Diffserv treatment that applies to a packet while it travels along a | Diffserv treatment that applies to a packet while it travels along a | |||
| particular LSP is carried to the last hop of the LSP and beyond the | particular LSP is carried to the last hop of the LSP and beyond the | |||
| last hop. Depending on which mode is preferred by an operator, the | last hop. Depending on which mode is preferred by an operator, the | |||
| EXP value or DSCP value of an exposed header following a label pop | EXP value or DSCP value of an exposed header following a label pop | |||
| may or may not be dependent on the EXP value of the label that is | may or may not be dependent on the EXP value of the label that is | |||
| removed by the pop operation. We believe that in the case of ECN | removed by the pop operation. We believe that in the case of ECN | |||
| marking, the use of these models should only apply to the encoding of | marking, the use of these models should only apply to the encoding of | |||
| the Diffserv PHB in the EXP value, and that the choice of codepoint | the Diffserv PHB in the EXP value, and that the choice of codepoint | |||
| for ECN should always be made based on the procedures described | for ECN should always be made based on the procedures described | |||
| above, independent of the tunneling model. | above, independent of the tunneling model. | |||
| 4.8. Extension to Pre-Congestion Notification | 4.8. Extension to Pre-Congestion Notification | |||
| To fully support PCN [I-D.briscoe-tsvwg-cl-architecture] in an MPLS | This section describes how the preceding mechanisms can be extended | |||
| domain for a particular PHB, a total of 3 codepoints need to be | to support PCN [I-D.briscoe-tsvwg-cl-architecture]. Our intent here | |||
| allocated for that PHB. (See Section 8.4 for further discussion of | is to show that the mechanisms are readily extended to more complex | |||
| PCN and the possibility of using fewer codepoints.) These 3 | scenarios than ECN, but this section may be safely ignored if one is | |||
| interested only in supporting ECN. | ||||
| The relevant aspects of PCN for the purposes of this discussion are: | ||||
| o PCN uses 3 states rather than 2 for ECN - these are referred to as | ||||
| admission marked (AM), pre-emption marked (PM) and not marked (NM) | ||||
| states. (See Section 8.4 for further discussion of PCN and the | ||||
| possibility of using fewer codepoints.) | ||||
| o A packet can go from NM to AM, from NM to PM, or from AM to PM, | ||||
| but no other transition is possible. | ||||
| o Whereas ECN-capable packets are identified by the ECT value in the | ||||
| IP header, PCN-capability is determined by the PHB of the packet. | ||||
| Thus, to support PCN fully in an MPLS domain for a particular PHB, a | ||||
| total of 3 codepoints need to be allocated for that PHB. These 3 | ||||
| codepoints represent the admission marked (AM), pre-emption marked | codepoints represent the admission marked (AM), pre-emption marked | |||
| (PM) and not marked (NM) states. The procedures described above need | (PM) and not marked (NM) states. The procedures described above need | |||
| to be slightly modified to support this scenario. The following | to be slightly modified to support this scenario. The following | |||
| procedures are invoked when the topmost DSCP or EXP value indicates a | procedures are invoked when the topmost DSCP or EXP value indicates a | |||
| PHB that supports PCN. | PHB that supports PCN. | |||
| 4.8.1. Label Push onto IP packet | 4.8.1. Label Push onto IP packet | |||
| If the IP packet header indicates AM, set the EXP value of all | If the IP packet header indicates AM, set the EXP value of all | |||
| entries in the label stack to AM. If the IP packet header indicates | entries in the label stack to AM. If the IP packet header indicates | |||
| skipping to change at page 11, line 8 | skipping to change at page 12, line 20 | |||
| any other marking of the IP header, set the EXP value of all entries | any other marking of the IP header, set the EXP value of all entries | |||
| in the label stack to NM. | in the label stack to NM. | |||
| 4.8.2. Pushing Additional MPLS Labels | 4.8.2. Pushing Additional MPLS Labels | |||
| The procedures of Section 4.2 apply. | The procedures of Section 4.2 apply. | |||
| 4.8.3. Admission Control or Pre-emption Marking inside MPLS domain | 4.8.3. Admission Control or Pre-emption Marking inside MPLS domain | |||
| The EXP value can be set to AM or PM according to the same procedures | The EXP value can be set to AM or PM according to the same procedures | |||
| as described in [I-D.briscoe-tsvwg-cl-phb]. | as described in [I-D.briscoe-tsvwg-cl-phb]. For the purposes of this | |||
| document, it does not matter exactly what algorithms are used to | ||||
| decide when to set AM or PM; all that matters is that if a router | ||||
| would have marked AM (or PM) in the IP header, it should set the EXP | ||||
| value in the MPLS header to the AM (or PM) codepoint. | ||||
| 4.8.4. Popping an MPLS Label (not end of stack) | 4.8.4. Popping an MPLS Label (not end of stack) | |||
| When popping an MPLS Label exposes another MPLS label, the AM or PM | When popping an MPLS Label exposes another MPLS label, the AM or PM | |||
| marking should be transferred to the exposed EXP field in the | marking should be transferred to the exposed EXP field in the | |||
| following manner: if the inner EXP value is NM, then it should be set | following manner: if the inner EXP value is NM, then it should be set | |||
| to the same marking state as the EXP value of the popped label stack | to the same marking state as the EXP value of the popped label stack | |||
| entry. If the inner EXP value is AM, it should be unchanged if the | entry. If the inner EXP value is AM, it should be unchanged if the | |||
| popped EXP value was AM, and it should be set to PM if the popped EXP | popped EXP value was AM, and it should be set to PM if the popped EXP | |||
| value was PM. If the popped EXP value was NM, this should be logged | value was PM. If the popped EXP value was NM, this should be logged | |||
| in some way and the inner EXP value should be unchanged. If the | in some way and the inner EXP value should be unchanged. If the | |||
| inner EXP value is PM, it should be unchanged whatever the popped EXP | inner EXP value is PM, it should be unchanged whatever the popped EXP | |||
| value was, but any EXP value other than PM should be logged. | value was, but any EXP value other than PM should be logged. | |||
| 4.8.5. Popping the last MPLS Label to expose IP header | 4.8.5. Popping the last MPLS Label to expose IP header | |||
| When popping the last MPLS Label exposes the IP header, the AM or PM | When popping the last MPLS Label exposes the IP header, there are two | |||
| marking should be transferred to the exposed IP header field in the | cases to consider: | |||
| following manner: if the inner IP header value is neither AM nor PM, | ||||
| and the EXP value was NM, then the IP header should be unchanged. | o the popping LSR is NOT the egress router of the PCN region, in | |||
| For any other EXP value, the IP header should be set to the same | which case AM or PM marking should be transferred to the exposed | |||
| marking state as the EXP value of the popped label stack entry. If | IP header field; or | |||
| the inner IP header value is AM, it should be unchanged if the popped | ||||
| EXP value was AM, and it should be set to PM if the popped EXP value | o the popping LSR IS the egress router of the PCN region. | |||
| was PM. If the popped EXP value was NM, this should be logged in | ||||
| some way and the inner IP header value should be unchanged. If the | In the latter case, the behavior of the egress LSR is defined in | |||
| IP header value is PM, it should be unchanged whatever the popped EXP | [I-D.briscoe-tsvwg-cl-architecture] and is beyond the scope of this | |||
| value was, but any EXP value other than PM should be logged. | document. In the former case, the marking should be transferred from | |||
| the popped MPLS header to the exposed IP header as follows: if the | ||||
| inner IP header value is neither AM nor PM, and the EXP value was NM, | ||||
| then the IP header should be unchanged. For any other EXP value, the | ||||
| IP header should be set to the same marking state as the EXP value of | ||||
| the popped label stack entry. If the inner IP header value is AM, it | ||||
| should be unchanged if the popped EXP value was AM, and it should be | ||||
| set to PM if the popped EXP value was PM. If the popped EXP value | ||||
| was NM, this should be logged in some way and the inner IP header | ||||
| value should be unchanged. If the IP header value is PM, it should | ||||
| be unchanged whatever the popped EXP value was, but any EXP value | ||||
| other than PM should be logged. | ||||
| 5. ECN-disabled MPLS domain | 5. ECN-disabled MPLS domain | |||
| If ECN is not enabled on all the egress LSRs of a domain, ECN MUST | If ECN is not enabled on all the egress LSRs of a domain, ECN MUST | |||
| NOT be enabled on any LSRs throughout the domain. If congestion is | NOT be enabled on any LSRs throughout the domain. If congestion is | |||
| experienced on any LSR in an ECN-disabled MPLS domain, packets MUST | experienced on any LSR in an ECN-disabled MPLS domain, packets MUST | |||
| be dropped NOT marked. The exact algorithm for deciding when to drop | be dropped, NOT marked. The exact algorithm for deciding when to | |||
| packets during congestion (e.g. tail-drop, RED, etc.) is a local | drop packets during congestion (e.g. tail-drop, RED, etc.) is a local | |||
| matter for the operator of the domain. | matter for the operator of the domain. | |||
| 6. The use of more codepoints with E-LSPs and L-LSPs | 6. The use of more codepoints with E-LSPs and L-LSPs | |||
| RFC 3270 gives different options with E-LSPs and L-LSPs and some of | RFC 3270 gives different options with E-LSPs and L-LSPs and some of | |||
| those could potentially provide ample EXP codepoints for ECN/PCN. | those could potentially provide ample EXP codepoints for ECN/PCN. | |||
| However, deploying L-LSPs vs E-LSPs has many implications such as | However, deploying L-LSPs vs E-LSPs has many implications such as | |||
| platform support and operational complexity. The above ECN/PCN MPLS | platform support and operational complexity. The above ECN/PCN MPLS | |||
| solution should provide some flexibility. If the operator has | solution should provide some flexibility. If the operator has | |||
| deployed one L-LSP per PHB scheduling class, then EXP space will be a | deployed one L-LSP per PHB scheduling class, then EXP space will be a | |||
| non-issue and it could be used to achieve more sophisticated ECN/PCN | non-issue and it could be used to achieve more sophisticated ECN/PCN | |||
| behavior if required. If the operator wants to stick to E-LSPs and | behavior if required. If the operator wants to stick to E-LSPs and | |||
| uses a handful of EXP codepoints for Diffserv, it may be desirable to | uses a handful of EXP codepoints for Diffserv, it may be desirable to | |||
| operate with a minimum number of extra ECN/PCN codepoints, even if | operate with a minimum number of extra ECN/PCN codepoints, even if | |||
| this comes with some compromise on ECN/PCN optimality. See Section 8 | this comes with some compromise on ECN/PCN optimality. See Section 8 | |||
| for discussion of some possible deployment scenarios. | for discussion of some possible deployment scenarios. | |||
| 7. Relationship to tunnel behavior in RFC 3168 | 7. Relationship to tunnel behavior in RFC 3168 | |||
| [RFC3168] defines two modes of encapsulating ECN-marked IP packets | [RFC3168] defines two modes of encapsulating ECN-marked IP packets | |||
| inside additonal IP headers when tunnels are used. The two modes are | inside additional IP headers when tunnels are used. The two modes | |||
| the "full functionality" and "limited functionality" modes. In the | are the "full functionality" and "limited functionality" modes. In | |||
| full functionality mode, the ECT information from the inner header is | the full functionality mode, the ECT information from the inner | |||
| copied to the outer header at the tunnel ingress, but the CE | header is copied to the outer header at the tunnel ingress, but the | |||
| information is not. In the limited functionality mode, neither ECT | CE information is not. In the limited functionality mode, neither | |||
| nor CE information is copied to the outer header, and thus ECN cannot | ECT nor CE information is copied to the outer header, and thus ECN | |||
| be applied to the encapsulated packet. | cannot be applied to the encapsulated packet. | |||
| The behavior that is specified in Section 4 of this document | The behavior that is specified in Section 4 of this document | |||
| resembles the "full functionality" mode in the sense that it conveys | resembles the "full functionality" mode in the sense that it conveys | |||
| some information from inner to outer header, and in the sense that it | some information from inner to outer header, and in the sense that it | |||
| enables full ECN support along the MPLS LSP (which is analogous to an | enables full ECN support along the MPLS LSP (which is analogous to an | |||
| IP tunnel in this context). However it differs in one respect, which | IP tunnel in this context). However it differs in one respect, which | |||
| is that the CE information is conveyed from the inner header to the | is that the CE information is conveyed from the inner header to the | |||
| outer header. Our reason for this different design choice is to give | outer header. Our reason for this different design choice is to give | |||
| interior routers and LSRs more information about upstream marking in | interior routers and LSRs more information about upstream marking in | |||
| multi-bottleneck cases. For instance, the flow pre-emption marking | multi-bottleneck cases. For instance, the flow pre-emption marking | |||
| skipping to change at page 13, line 42 | skipping to change at page 15, line 21 | |||
| on whether a packet has already suffered upstream marking. The | on whether a packet has already suffered upstream marking. The | |||
| currently proposed pre-emption marking in PCN is an example where | currently proposed pre-emption marking in PCN is an example where | |||
| such an exception would be necessary (see the discussion at the start | such an exception would be necessary (see the discussion at the start | |||
| of Section 7). | of Section 7). | |||
| 8. Example Uses | 8. Example Uses | |||
| 8.1. RFC3168-style ECN | 8.1. RFC3168-style ECN | |||
| [RFC3168] proposes the use of ECN in TCP and introduces the use of | [RFC3168] proposes the use of ECN in TCP and introduces the use of | |||
| ECN-Echo and CWR flags in the TCP header for initialisation. The TCP | ECN-Echo and CWR flags in the TCP header for initialization. The TCP | |||
| sender responds accordingly (such as not increasing the congestion | sender responds accordingly (such as not increasing the congestion | |||
| window) when it receives an ECN-Echo (ECE) ACK packet (that is, an | window) when it receives an ECN-Echo (ECE) ACK packet (that is, an | |||
| ACK packet with ECN-Echo flag set in the TCP header), because the sender then | ACK packet with ECN-Echo flag set in the TCP header), because the sender then | |||
| knows that congestion was encountered in the network on the path from | knows that congestion was encountered in the network on the path from | |||
| the sender to the receiver. | the sender to the receiver. | |||
| It would be possible to enable ECN in an MPLS domain for Diffserv | It would be possible to enable ECN in an MPLS domain for Diffserv | |||
| PHBs like AF and best efforts that are expected to be used by TCP and | PHBs like AF and best efforts that are expected to be used by TCP and | |||
| similar transports (e.g. DCCP [RFC4340]). Then end-to-end | similar transports (e.g. DCCP [RFC4340]). Then end-to-end | |||
| congestion control in transports capable of understanding ECN would | congestion control in transports capable of understanding ECN would | |||
| skipping to change at page 15, line 4 | skipping to change at page 16, line 32 | |||
| [I-D.briscoe-tsvwg-cl-architecture] proposes using pre-congestion | [I-D.briscoe-tsvwg-cl-architecture] proposes using pre-congestion | |||
| notification (PCN) on routers within an edge-to-edge Diffserv region | notification (PCN) on routers within an edge-to-edge Diffserv region | |||
| to control admission of new flows to the region and, if necessary, to | to control admission of new flows to the region and, if necessary, to | |||
| pre-empt existing flows in response to disasters and other anomalous | pre-empt existing flows in response to disasters and other anomalous | |||
| routing events. In this approach, the current level of PCN marking | routing events. In this approach, the current level of PCN marking | |||
| is picked up by the signalling used to initiate each flow in order to | is picked up by the signalling used to initiate each flow in order to | |||
| inform the admission control decision for the whole region at once. | inform the admission control decision for the whole region at once. | |||
| As an example, a minor extension to RSVP signalling has been proposed | As an example, a minor extension to RSVP signalling has been proposed | |||
| [I-D.lefaucheur-rsvp-ecn] to carry this message, but a similar | [I-D.lefaucheur-rsvp-ecn] to carry this message, but a similar | |||
| approach has also been proposed that uses NSIS signalling [I-D.ietf- | approach has also been proposed that uses NSIS signalling | |||
| nsis-rmd]. | [I-D.ietf-nsis-rmd]. | |||
| If it is possible for LSRs to signify congestion in MPLS, PCN marking | If it is possible for LSRs to signify congestion in MPLS, PCN marking | |||
| could be used for admission control and flow pre-emption across a | could be used for admission control and flow pre-emption across a | |||
| Diffserv region, irrespective of whether it contained pure IP | Diffserv region, irrespective of whether it contained pure IP | |||
| routers, MPLS LSRs, or both. Indeed, the solution could be somewhat | routers, MPLS LSRs, or both. Indeed, the solution could be somewhat | |||
| more efficient to implement if aggregates could identify themselves | more efficient to implement if aggregates could identify themselves | |||
| by their MPLS label. Section 4.8 describes the mechanisms by which | by their MPLS label. Section 4.8 describes the mechanisms by which | |||
| the necessary markings for PCN could be carried in the MPLS header. | the necessary markings for PCN could be carried in the MPLS header. | |||
| As an illustrative example of how the EXP field might be used in this | As an illustrative example of how the EXP field might be used in this | |||
| skipping to change at page 16, line 13 | skipping to change at page 17, line 40 | |||
| dropped later should become less prevalent as more transports use | dropped later should become less prevalent as more transports use | |||
| ECN. This is why we chose not to use the [Floyd] alternative which | ECN. This is why we chose not to use the [Floyd] alternative which | |||
| introduced a low but persistent level of unnecessary packet drop for | introduced a low but persistent level of unnecessary packet drop for | |||
| all time. Although that scheme did not carry droppable traffic to | all time. Although that scheme did not carry droppable traffic to | |||
| the edge of the MPLS domain, we felt this was a small price to pay, | the edge of the MPLS domain, we felt this was a small price to pay, | |||
| and it was anyway only of concern until ECN had become more widely | and it was anyway only of concern until ECN had become more widely | |||
| deployed. | deployed. | |||
| A partial solution would be to preferentially drop packets arriving | A partial solution would be to preferentially drop packets arriving | |||
| at a congested router that were already marked. There is no solution | at a congested router that were already marked. There is no solution | |||
| to the problem of marking a packet congested by another packet that | to the problem of marking a packet when congestion is caused by | |||
| should have been dropped. However, the chance of such an occurrence | another packet that should have been dropped. However, the chance of | |||
| is very low and the consequences are not significant. It merely | such an occurrence is very low and the consequences are not | |||
| causes an application to very occasionally slow down its rate when it | significant. It merely causes an application to very occasionally | |||
| did not have to. | slow down its rate when it did not have to. | |||
| 9.2. Non-ECN capable routers in an MPLS Domain | 9.2. Non-ECN capable routers in an MPLS Domain | |||
| What if an MPLS domain wants to use ECN, but not all legacy routers | What if an MPLS domain wants to use ECN, but not all legacy routers | |||
| are able to support it? | are able to support it? | |||
| If the legacy router(s) are used in the interior, this is not a | If the legacy router(s) are used in the interior, this is not a | |||
| problem. They will simply have to drop the packets if they are | problem. They will simply have to drop the packets if they are | |||
| congested, rather than mark them, which is the standard behaviour for | congested, rather than mark them, which is the standard behavior for | |||
| IP routers that are not ECN-enabled. | IP routers that are not ECN-enabled. | |||
| If the legacy router were used as an egress router, it would not be | If the legacy router were used as an egress router, it would not be | |||
| able to check the ECN capability of the transport correctly. An | able to check the ECN capability of the transport correctly. An | |||
| operator in this position would not be able to use this solution and | operator in this position would not be able to use this solution and | |||
| therefore MUST NOT enable ECN unless all egress routers are ECN- | therefore MUST NOT enable ECN unless all egress routers are ECN- | |||
| capable. | capable. | |||
| 10. IANA Considerations | 10. IANA Considerations | |||
| skipping to change at page 17, line 21 | skipping to change at page 18, line 47 | |||
| without requiring any specific support from the proposal in this | without requiring any specific support from the proposal in this | |||
| draft. The nonce does not need to be present in the MPLS shim | draft. The nonce does not need to be present in the MPLS shim | |||
| header. As long as the nonce is present in the IP header when the | header. As long as the nonce is present in the IP header when the | |||
| ECN information is copied from the last MPLS shim header, it will be | ECN information is copied from the last MPLS shim header, it will be | |||
| overwritten if congestion has been experienced by an LSR. This is | overwritten if congestion has been experienced by an LSR. This is | |||
| all that is necessary for the sender to detect a misbehaving | all that is necessary for the sender to detect a misbehaving | |||
| receiver. | receiver. | |||
| An alternative proposal currently in progress in the IETF | An alternative proposal currently in progress in the IETF | |||
| [I-D.briscoe-tsvwg-re-ecn-tcp] allows the network to prevent | [I-D.briscoe-tsvwg-re-ecn-tcp] allows the network to prevent | |||
| misbehaviour by senders or receivers or other routers. Like the ECN | misbehavior by senders or receivers or other routers. Like the ECN | |||
| nonce, it works correctly without requiring any specific support from | nonce, it works correctly without requiring any specific support from | |||
| the proposal in this draft. It uses a bit in the IP header (the RE | the proposal in this draft. It uses a bit in the IP header (the RE | |||
| bit) which is set by the sender and never changed along the path; it | bit) which is set by the sender and never changed along the path; it | |||
| is only read by certain policing elements in the network. There is | is only read by certain policing elements in the network. There is | |||
| no need for a copy of this bit in the MPLS shim, as policing nodes | no need for a copy of this bit in the MPLS shim, as policing nodes | |||
| can examine the IP header if they need to, particularly given they | can examine the IP header if they need to, particularly given they | |||
| are intended to only be necessary at domain borders where MPLS | are intended to only be necessary at domain borders where MPLS | |||
| headers are often removed. | headers are often removed. | |||
| 12. Acknowledgements | 12. Acknowledgements | |||
| Thanks to K.K. Ramakrishnan and Sally Floyd for getting us thinking | Thanks to K.K. Ramakrishnan and Sally Floyd for getting us thinking | |||
| about this in the first place and for providing advice on tunneling | about this in the first place and for providing advice on tunneling | |||
| of ECN packets, and to Joe Babiarz and Ben Niven-Jenkins for their | of ECN packets, and to Joe Babiarz, Ben Niven-Jenkins, Phil Eardley, | |||
| comments on the draft. | and Ruediger Geib for their comments on the draft. | |||
| 13. References | 13. References | |||
| 13.1. Normative References | 13.1. Normative References | |||
| [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
| Requirement Levels", BCP 14, RFC 2119, March 1997. | Requirement Levels", BCP 14, RFC 2119, March 1997. | |||
| [RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z., | [RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z., | |||
| and W. Weiss, "An Architecture for Differentiated | and W. Weiss, "An Architecture for Differentiated | |||
| skipping to change at page 18, line 30 | skipping to change at page 20, line 13 | |||
| Services", RFC 3270, May 2002. | Services", RFC 3270, May 2002. | |||
| 13.2. Informative References | 13.2. Informative References | |||
| [Floyd] "A Proposal to Incorporate ECN in MPLS", 1999. | [Floyd] "A Proposal to Incorporate ECN in MPLS", 1999. | |||
| Work in progress. http://www.icir.org/floyd/papers/ | Work in progress. http://www.icir.org/floyd/papers/ | |||
| draft-ietf-mpls-ecn-00.txt | draft-ietf-mpls-ecn-00.txt | |||
| [I-D.briscoe-tsvwg-cl-architecture] | [I-D.briscoe-tsvwg-cl-architecture] | |||
| Briscoe, B., "A Framework for Admission Control over | Briscoe, B., "An edge-to-edge Deployment Model for Pre- | |||
| DiffServ using Pre-Congestion Notification", | Congestion Notification: Admission Control over a | |||
| draft-briscoe-tsvwg-cl-architecture-02 (work in progress), | DiffServ Region", draft-briscoe-tsvwg-cl-architecture-03 | |||
| March 2006. | (work in progress), June 2006. | |||
| [I-D.briscoe-tsvwg-cl-phb] | [I-D.briscoe-tsvwg-cl-phb] | |||
| Briscoe, B., "Pre-Congestion Notification marking", | Briscoe, B., "Pre-Congestion Notification marking", | |||
| draft-briscoe-tsvwg-cl-phb-01 (work in progress), | draft-briscoe-tsvwg-cl-phb-02 (work in progress), | |||
| March 2006. | June 2006. | |||
| [I-D.briscoe-tsvwg-re-ecn-border-cheat] | [I-D.briscoe-tsvwg-re-ecn-border-cheat] | |||
| Briscoe, B., "Emulating Border Flow Policing using Re-ECN | Briscoe, B., "Emulating Border Flow Policing using Re-ECN | |||
| on Bulk Data", draft-briscoe-tsvwg-re-ecn-border-cheat-00 | on Bulk Data", draft-briscoe-tsvwg-re-ecn-border-cheat-01 | |||
| (work in progress), February 2006. | (work in progress), June 2006. | |||
| [I-D.briscoe-tsvwg-re-ecn-tcp] | [I-D.briscoe-tsvwg-re-ecn-tcp] | |||
| Briscoe, B., "Re-ECN: Adding Accountability for Causing | Briscoe, B., "Re-ECN: Adding Accountability for Causing | |||
| Congestion to TCP/IP", draft-briscoe-tsvwg-re-ecn-tcp-01 | Congestion to TCP/IP", draft-briscoe-tsvwg-re-ecn-tcp-02 | |||
| (work in progress), March 2006. | (work in progress), June 2006. | |||
| [I-D.chan-tsvwg-diffserv-class-aggr] | [I-D.chan-tsvwg-diffserv-class-aggr] | |||
| Chan, K., "Aggregation of DiffServ Service Classes", | Chan, K., "Aggregation of DiffServ Service Classes", | |||
| draft-chan-tsvwg-diffserv-class-aggr-03 (work in | draft-chan-tsvwg-diffserv-class-aggr-03 (work in | |||
| progress), January 2006. | progress), January 2006. | |||
| [I-D.ietf-nsis-rmd] | [I-D.ietf-nsis-rmd] | |||
| Bader, A., "RMD-QOSM - The Resource Management in Diffserv | Bader, A., "RMD-QOSM - The Resource Management in Diffserv | |||
| QOS Model", draft-ietf-nsis-rmd-06 (work in progress), | QOS Model", draft-ietf-nsis-rmd-07 (work in progress), | |||
| February 2006. | June 2006. | |||
| [I-D.lefaucheur-rsvp-ecn] | [I-D.lefaucheur-rsvp-ecn] | |||
| Faucheur, F., "RSVP Extensions for Admission Control over | Faucheur, F., "RSVP Extensions for Admission Control over | |||
| Diffserv using Pre-congestion Notification", | Diffserv using Pre-congestion Notification (PCN)", | |||
| draft-lefaucheur-rsvp-ecn-00 (work in progress), | draft-lefaucheur-rsvp-ecn-01 (work in progress), | |||
| October 2005. | June 2006. | |||
| [RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit | [RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit | |||
| Congestion Notification (ECN) Signaling with Nonces", | Congestion Notification (ECN) Signaling with Nonces", | |||
| RFC 3540, June 2003. | RFC 3540, June 2003. | |||
| [RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram | [RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram | |||
| Congestion Control Protocol (DCCP)", RFC 4340, March 2006. | Congestion Control Protocol (DCCP)", RFC 4340, March 2006. | |||
| [Shayman] "Using ECN to Signal Congestion Within an MPLS Domain", | [Shayman] "Using ECN to Signal Congestion Within an MPLS Domain", | |||
| 2000. | 2000. | |||
| skipping to change at page 21, line 5 | skipping to change at page 22, line 5 | |||
| BT Research | BT Research | |||
| B54/77, Sirius House | B54/77, Sirius House | |||
| Adastral Park | Adastral Park | |||
| Martlesham Heath | Martlesham Heath | |||
| Ipswich | Ipswich | |||
| Suffolk IP5 3RE | Suffolk IP5 3RE | |||
| United Kingdom | United Kingdom | |||
| Email: june.tay@bt.com | Email: june.tay@bt.com | |||
| Intellectual Property Statement | Full Copyright Statement | |||
| Copyright (C) The Internet Society (2006). | ||||
| This document is subject to the rights, licenses and restrictions | ||||
| contained in BCP 78, and except as set forth therein, the authors | ||||
| retain all their rights. | ||||
| This document and the information contained herein are provided on an | ||||
| "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS | ||||
| OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET | ||||
| ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, | ||||
| INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE | ||||
| INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED | ||||
| WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. | ||||
| Intellectual Property | ||||
| The IETF takes no position regarding the validity or scope of any | The IETF takes no position regarding the validity or scope of any | |||
| Intellectual Property Rights or other rights that might be claimed to | Intellectual Property Rights or other rights that might be claimed to | |||
| pertain to the implementation or use of the technology described in | pertain to the implementation or use of the technology described in | |||
| this document or the extent to which any license under such rights | this document or the extent to which any license under such rights | |||
| might or might not be available; nor does it represent that it has | might or might not be available; nor does it represent that it has | |||
| made any independent effort to identify any such rights. Information | made any independent effort to identify any such rights. Information | |||
| on the procedures with respect to rights in RFC documents can be | on the procedures with respect to rights in RFC documents can be | |||
| found in BCP 78 and BCP 79. | found in BCP 78 and BCP 79. | |||
| skipping to change at page 21, line 29 | skipping to change at page 22, line 45 | |||
| such proprietary rights by implementers or users of this | such proprietary rights by implementers or users of this | |||
| specification can be obtained from the IETF on-line IPR repository at | specification can be obtained from the IETF on-line IPR repository at | |||
| http://www.ietf.org/ipr. | http://www.ietf.org/ipr. | |||
| The IETF invites any interested party to bring to its attention any | The IETF invites any interested party to bring to its attention any | |||
| copyrights, patents or patent applications, or other proprietary | copyrights, patents or patent applications, or other proprietary | |||
| rights that may cover technology that may be required to implement | rights that may cover technology that may be required to implement | |||
| this standard. Please address the information to the IETF at | this standard. Please address the information to the IETF at | |||
| ietf-ipr@ietf.org. | ietf-ipr@ietf.org. | |||
| Disclaimer of Validity | ||||
| This document and the information contained herein are provided on an | ||||
| "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS | ||||
| OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET | ||||
| ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, | ||||
| INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE | ||||
| INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED | ||||
| WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. | ||||
| Copyright Statement | ||||
| Copyright (C) The Internet Society (2006). This document is subject | ||||
| to the rights, licenses and restrictions contained in BCP 78, and | ||||
| except as set forth therein, the authors retain all their rights. | ||||
| Acknowledgment | Acknowledgment | |||
| Funding for the RFC Editor function is currently provided by the | Funding for the RFC Editor function is provided by the IETF | |||
| Internet Society. | Administrative Support Activity (IASA). | |||
| End of changes. 46 change blocks. | ||||
| 177 lines changed or deleted | 252 lines changed or added | |||