draft-briscoe-tsvwg-cl-architecture-03.txt | draft-briscoe-tsvwg-cl-architecture-04.txt | |||
---|---|---|---|---|
TSVWG B. Briscoe | TSVWG B. Briscoe | |||
Internet Draft P. Eardley | Internet Draft P. Eardley | |||
draft-briscoe-tsvwg-cl-architecture-03.txt D. Songhurst | draft-briscoe-tsvwg-cl-architecture-04.txt D. Songhurst | |||
Expires: December 2006 BT | Expires: April 2007 BT | |||
F. Le Faucheur | F. Le Faucheur | |||
A. Charny | A. Charny | |||
Cisco Systems, Inc | Cisco Systems, Inc | |||
J. Babiarz | J. Babiarz | |||
K. Chan | K. Chan | |||
S. Dudley | S. Dudley | |||
Nortel | Nortel | |||
G. Karagiannis | G. Karagiannis | |||
University of Twente / Ericsson | University of Twente / Ericsson | |||
A. Bader | A. Bader | |||
L. Westberg | L. Westberg | |||
Ericsson | Ericsson | |||
26 June, 2006 | 25 October, 2006 | |||
An edge-to-edge Deployment Model for Pre-Congestion Notification: | An edge-to-edge Deployment Model for Pre-Congestion Notification: | |||
Admission Control over a DiffServ Region | Admission Control over a DiffServ Region | |||
draft-briscoe-tsvwg-cl-architecture-03.txt | draft-briscoe-tsvwg-cl-architecture-04.txt | |||
Status of this Memo | Status of this Memo | |||
By submitting this Internet-Draft, each author represents that any | By submitting this Internet-Draft, each author represents that any | |||
applicable patent or other IPR claims of which he or she is aware | applicable patent or other IPR claims of which he or she is aware | |||
have been or will be disclosed, and any of which he or she becomes | have been or will be disclosed, and any of which he or she becomes | |||
aware will be disclosed, in accordance with Section 6 of BCP 79. | aware will be disclosed, in accordance with Section 6 of BCP 79. | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF), its areas, and its working groups. Note that | Task Force (IETF), its areas, and its working groups. Note that | |||
skipping to change at page 2, line 16 | skipping to change at page 2, line 16 | |||
This Internet-Draft will expire on September 6, 2006. | This Internet-Draft will expire on September 6, 2006. | |||
Copyright Notice | Copyright Notice | |||
Copyright (C) The Internet Society (2006). All Rights Reserved. | Copyright (C) The Internet Society (2006). All Rights Reserved. | |||
Abstract | Abstract | |||
This document describes a deployment model for pre-congestion | This document describes a deployment model for pre-congestion | |||
notification (PCN). PCN-based flow admission control and if necessary | notification (PCN) operating in a large DiffServ-based region of the | |||
flow pre-emption preserve the Controlled Load service to admitted | Internet. PCN-based admission control protects the quality of service | |||
flows. Routers in a large DiffServ-based region of the Internet use | of existing flows in normal circumstances, whilst if necessary (e.g. | |||
new pre-congestion notification marking to give early warning of | after a large failure) pre-emption of some flows preserves the quality | |||
their own congestion. Gateways around the edges of the region convert | of service of the remaining flows. Each link has a configured- | |||
measurements of this packet granularity marking into admission | admission-rate and a configured-pre-emption-rate, and a router marks | |||
control and pre-emption functions at flow granularity. Note that | packets that exceed these rates. Hence routers give an early warning of | |||
interior routers of the DiffServ-based region do not require flow | their own potential congestion, before packets need to be dropped. | |||
state or signalling - they only have to do the bulk packet marking of | Gateways around the edges of the PCN-region convert measurements of | |||
PCN. Hence an end-to-end Controlled Load service can be achieved | packet rates and their markings into decisions about whether to admit | |||
without any scalability impact on interior routers. | new flows, and (if necessary) into the rate of excess traffic that | |||
should be pre-empted. Per-flow admission states are kept at the | ||||
gateways only, while the PCN markers that are required for all routers | ||||
operate on the aggregate traffic - hence there is no scalability impact | ||||
on interior routers. | ||||
Authors' Note (TO BE DELETED BY THE RFC EDITOR UPON PUBLICATION) | Authors' Note (TO BE DELETED BY THE RFC EDITOR UPON PUBLICATION) | |||
This document is posted as an Internet-Draft with the intention of | This document is posted as an Internet-Draft with the intention of | |||
eventually becoming an INFORMATIONAL RFC. | eventually becoming an INFORMATIONAL RFC. | |||
Table of Contents | Table of Contents | |||
1. Introduction......................................... 5 | 1. Introduction................................................5 | |||
1.1. Summary......................................... 5 | 1.1. Summary................................................5 | |||
1.1.1. Flow admission control........................ 7 | 1.2. Key benefits...........................................8 | |||
1.1.2. Flow pre-emption............................. 9 | 1.3. Terminology............................................9 | |||
1.1.3. Both admission control and pre-emption.......... 10 | 1.4. Existing terminology...................................11 | |||
1.2. Terminology.................................... 10 | 1.5. Standardisation requirements...........................11 | |||
1.3. Existing terminology............................. 12 | 1.6. Structure of rest of the document......................12 | |||
1.4. Standardisation requirements...................... 12 | 2. Key aspects of the deployment model.........................13 | |||
1.5. Structure of rest of the document.................. 13 | 2.1. Key goals.............................................13 | |||
2. Key aspects of the deployment model..................... 14 | 2.2. Key assumptions........................................14 | |||
2.1. Key goals...................................... 14 | 3. Deployment model...........................................17 | |||
2.2. Key assumptions................................. 15 | 3.1. Admission control......................................17 | |||
2.3. Key benefits ................................... 17 | 3.1.1. Pre-Congestion Notification for Admission Marking..17 | |||
3. Deployment model.................................... 19 | 3.1.2. Measurements to support admission control..........17 | |||
3.1. Admission control ............................... 19 | ||||
3.1.1. Pre-Congestion Notification for Admission Marking. 19 | ||||
3.1.2. Measurements to support admission control........ 19 | ||||
3.1.3. How edge-to-edge admission control supports end-to-end | 3.1.3. How edge-to-edge admission control supports end-to-end | |||
QoS signalling ................................... 20 | QoS signalling..........................................18 | |||
3.1.4. Use case................................... 20 | 3.1.4. Use case.........................................18 | |||
3.2. Flow pre-emption................................ 22 | 3.2. Flow pre-emption.......................................20 | |||
3.2.1. Alerting an ingress gateway that flow pre-emption may be | 3.2.1. Alerting an ingress gateway that flow pre-emption may be | |||
needed.......................................... 22 | needed..................................................20 | |||
3.2.2. Determining the right amount of CL traffic to drop 24 | 3.2.2. Determining the right amount of CL traffic to drop.23 | |||
3.2.3. Use case for flow pre-emption ................. 25 | 3.2.3. Use case for flow pre-emption.....................24 | |||
4. Summary of Functionality.............................. 27 | 3.3. Both admission control and pre-emption.................25 | |||
4.1. Ingress gateways................................ 27 | 4. Summary of Functionality....................................27 | |||
4.2. Interior routers................................ 28 | 4.1. Ingress gateways.......................................27 | |||
4.3. Egress gateways................................. 28 | 4.2. Interior routers.......................................28 | |||
4.4. Failures....................................... 29 | 4.3. Egress gateways........................................28 | |||
5. Limitations and some potential solutions................. 31 | 4.4. Failures..............................................29 | |||
5.1. ECMP.......................................... 31 | 5. Limitations and some potential solutions....................31 | |||
5.2. Beat down effect................................ 33 | 5.1. ECMP..................................................31 | |||
5.3. Bi-directional sessions .......................... 35 | 5.2. Beat down effect.......................................33 | |||
5.4. Global fairness................................. 37 | 5.3. Bi-directional sessions................................35 | |||
5.5. Flash crowds ................................... 39 | 5.4. Global fairness........................................37 | |||
5.6. Pre-empting too fast............................. 40 | 5.5. Flash crowds..........................................39 | |||
5.7. Other potential extensions........................ 42 | 5.6. Pre-empting too fast...................................41 | |||
5.7.1. Tunnelling................................. 42 | 5.7. Other potential extensions.............................42 | |||
5.7.2. Multi-domain and multi-operator usage........... 43 | 5.7.1. Tunnelling........................................42 | |||
5.7.3. Preferential dropping of pre-emption marked packets43 | 5.7.2. Multi-domain and multi-operator usage.............43 | |||
5.7.4. Adaptive bandwidth for the Controlled Load service 44 | 5.7.3. Preferential dropping of pre-emption marked packets44 | |||
5.7.4. Adaptive bandwidth for the Controlled Load service.44 | ||||
5.7.5. Controlled Load service with end-to-end Pre-Congestion | 5.7.5. Controlled Load service with end-to-end Pre-Congestion | |||
Notification..................................... 44 | Notification............................................45 | |||
5.7.6. MPLS-TE ................................... 45 | 5.7.6. MPLS-TE..........................................45 | |||
6. Relationship to other QoS mechanisms.................... 46 | 6. Relationship to other QoS mechanisms........................46 | |||
6.1. IntServ Controlled Load .......................... 46 | 6.1. IntServ Controlled Load................................46 | |||
6.2. Integrated services operation over DiffServ.......... 46 | 6.2. Integrated services operation over DiffServ............46 | |||
6.3. Differentiated Services .......................... 46 | 6.3. Differentiated Services................................46 | |||
6.4. ECN........................................... 47 | 6.4. ECN...................................................47 | |||
6.5. RTECN......................................... 47 | 6.5. RTECN.................................................47 | |||
6.6. RMD........................................... 47 | 6.6. RMD...................................................48 | |||
6.7. RSVP Aggregation over MPLS-TE...................... 48 | 6.7. RSVP Aggregation over MPLS-TE..........................48 | |||
7. Security Considerations............................... 49 | 6.8. Other Network Admission Control Approaches.............48 | |||
8. Acknowledgements.................................... 49 | 7. Security Considerations.....................................49 | |||
9. Comments solicited................................... 49 | 8. Acknowledgements...........................................49 | |||
10. Changes from earlier versions of the draft.............. 50 | 9. Comments solicited.........................................50 | |||
11. Appendices ........................................ 51 | 10. Changes from earlier versions of the draft.................50 | |||
11.1. Appendix A: Explicit Congestion Notification ........ 51 | 11. Appendices................................................52 | |||
11.1. Appendix A: Explicit Congestion Notification..........52 | ||||
11.2. Appendix B: What is distributed measurement-based admission | 11.2. Appendix B: What is distributed measurement-based admission | |||
control?........................................... 52 | control?...................................................53 | |||
11.3. Appendix C: Calculating the Exponentially weighted moving | 11.3. Appendix C: Calculating the Exponentially weighted moving | |||
average (EWMA)...................................... 53 | average (EWMA).............................................54 | |||
12. References ........................................ 55 | 12. References................................................56 | |||
Authors' Addresses..................................... 60 | Authors' Addresses............................................61 | |||
Intellectual Property Statement .......................... 62 | Intellectual Property Statement................................63 | |||
Disclaimer of Validity.................................. 62 | Disclaimer of Validity........................................63 | |||
Copyright Statement.................................... 62 | Copyright Statement...........................................63 | |||
1. Introduction | 1. Introduction | |||
1.1. Summary | 1.1. Summary | |||
This document describes a deployment model to achieve an end-to-end | This document describes a deployment model to achieve an end-to-end | |||
Controlled Load service by using (within a large region of the | Controlled Load service by using (within a large region of the | |||
Internet) DiffServ and edge-to-edge distributed measurement-based | Internet) DiffServ and edge-to-edge distributed measurement-based | |||
admission control and flow pre-emption. Controlled load service is a | admission control and flow pre-emption. Controlled load service is a | |||
quality of service (QoS) closely approximating the QoS that the same | quality of service (QoS) closely approximating the QoS that the same | |||
flow would receive from a lightly loaded network element [RFC2211]. | flow would receive from a lightly loaded network element [RFC2211]. | |||
Controlled Load (CL) is useful for inelastic flows such as those for | Controlled Load (CL) is useful for inelastic flows such as those for | |||
real-time media. | real-time media. | |||
In line with the "IntServ over DiffServ" framework defined in | In line with the "IntServ over DiffServ" framework defined in | |||
[RFC2998], the CL service is supported end-to-end and RSVP signalling | [RFC2998], the CL service is supported end-to-end and RSVP signalling | |||
[RFC2205] is used end-to-end, over an edge-to-edge DiffServ region. | [RFC2205] is used end-to-end, over an edge-to-edge DiffServ region. | |||
We call the DiffServ region the "CL-region". | ||||
___ ___ _______________________________________ ____ ___ | ___ ___ _______________________________________ ____ ___ | |||
| | | | | | | | | | | ||||
| | | | |Ingress Interior Egress| | | | | | | | | | |Ingress Interior Egress| | | | | | |||
| | | | |gateway routers gateway| | | | | | | | | | |gateway routers gateway| | | | | | |||
| | | | |-------+ +-------+ +-------+ +------| | | | | | | | | | |-------+ +-------+ +-------+ +------| | | | | | |||
| | | | | PCN- | | PCN- | | PCN- | | | | | | | | | | | | | PCN- | | PCN- | | PCN- | | | | | | | | |||
| |..| |..|marking|..|marking|..|marking|..| Meter|..| |..| | | | |..| |..|marking|..|marking|..|marking|..| Meter|..| |..| | | |||
| | | | |-------+ +-------+ +-------+ +------| | | | | | | | | | |-------+ +-------+ +-------+ +------| | | | | | |||
| | | | | \ / | | | | | | | | | | | \ / | | | | | | |||
| | | | | \ / | | | | | | | | | | | \ / | | | | | | |||
| | | | | \ Congestion-Level-Estimate / | | | | | | | | | | | \ Congestion-Level-Estimate / | | | | | | |||
| | | | | \ (for admission control) / | | | | | | | | | | | \ (for admission control) / | | | | | | |||
skipping to change at page 6, line 8 | skipping to change at page 6, line 8 | |||
<------ edge-to-edge signalling -----> | <------ edge-to-edge signalling -----> | |||
(for admission control & flow pre-emption) | (for admission control & flow pre-emption) | |||
<-------------------end-to-end QoS signalling protocol---------------> | <-------------------end-to-end QoS signalling protocol---------------> | |||
Figure 1: Overall QoS architecture (NB terminology explained later) | Figure 1: Overall QoS architecture (NB terminology explained later) | |||
Figure 1 shows an example of an overall QoS architecture, where the | Figure 1 shows an example of an overall QoS architecture, where the | |||
two access networks are connected by a CL-region. Another possibility | two access networks are connected by a CL-region. Another possibility | |||
is that there are several CL-regions between the access networks - | is that there are several CL-regions between the access networks - | |||
each would operate the Pre-Congestion Notification mechanisms | each would operate the Pre-Congestion Notification mechanisms | |||
separately. | separately. The document assumes RSVP as the end-to-end QoS | |||
signalling protocol. However, the RSVP signalling may itself be | ||||
In Section 1.1.1 we summarise how admission of new CL microflows is | originated or terminated by proxies still closer to the edge of the | |||
controlled so as to deliver the required QoS. In abnormal | network, such as home hubs or the like, triggered in turn by | |||
circumstances, for instance a disaster affecting multiple interior | application layer signalling. [RFC2998] and our approach are compared | |||
routers, then the QoS on existing CL microflows may degrade even if | further in Section 6.2. | |||
care was exercised when admitting those microflows before those | ||||
circumstances. Therefore we also propose a mechanism (summarised in | ||||
Section 1.1.2) to pre-empt some of the existing microflows. Then | ||||
remaining microflows retain their expected QoS, while improved QoS is | ||||
quickly restored to lower priority traffic. | ||||
As a fundamental building block to support these two mechanisms, we | ||||
introduce "Pre-Congestion Notification". Pre-Congestion Notification | ||||
(PCN) builds on the concepts of RFC 3168, "The addition of Explicit | ||||
Congestion Notification to IP". The [PCN] document defines the | ||||
respective algorithms that determine when a PCN-enabled router marks | ||||
a packet with Admission Marking or Pre-emption Marking, depending on | ||||
the traffic level. | ||||
In order to support CL traffic we would expect PCN to supplement the | ||||
existing Expedited Forwarding (EF). Within the controlled edge-to- | ||||
edge region, a particular packet receives the Pre-Congestion | ||||
Notification (PCN) behaviour if the packet's differentiated services | ||||
codepoint (DSCP) is set to EF and also the ECN field indicates ECN | ||||
Capable Transport. However, PCN is not only intended to supplement | ||||
EF. PCN is specified (in [PCN]) as a building block which can | ||||
supplement the scheduling behaviour of other PHBs. | ||||
There are various possible ways to encode the markings into a packet, | ||||
using the ECN field and perhaps other DSCPs, which are discussed in | ||||
[PCN]. In this draft we use the abstract names Admission Marking and | ||||
Pre-emption Marking. | ||||
This framework assumes that the Pre-Congestion Notification behaviour | Flows must enter and leave the CL-region through its ingress and | |||
is used in a controlled environment, i.e. within the controlled edge- | egress gateways, and they need traffic descriptors that are policed | |||
to-edge region. | by the ingress gateway (NB the policing function is out of this | |||
document's scope). The overall CL-traffic between two border routers | ||||
is called a "CL-region-aggregate". | ||||
1.1.1. Flow admission control | The document introduces a mechanism for flow admission control: | |||
should a new flow be admitted into a specific CL-region-aggregate? | ||||
Admission control protects the QoS of existing CL-flows in normal | ||||
circumstances. In abnormal circumstances, for instance a disaster | ||||
affecting multiple interior routers, then the QoS on existing CL | ||||
microflows may degrade even if care was exercised when admitting | ||||
those microflows before those circumstances. Therefore we also | ||||
propose a mechanism for flow pre-emption: how much traffic, in a | ||||
specific CL-region-aggregate, should be pre-empted in order to | ||||
preserve the QoS of the remaining CL-flows? Flow pre-emption also | ||||
restores QoS to lower priority traffic. | ||||
This document describes a new admission control procedure for an | As a fundamental building block to enable these two mechanisms, each | |||
edge-to-edge region, which uses new per-hop Pre-Congestion | link of the CL-region is associated with a configured-admission-rate | |||
Notification 'admission marking' as a fundamental building block. In | and configured-pre-emption-rate; the former is usually significantly | |||
turn, an end-to-end CL service would use this as a building block | smaller than the latter. If traffic in a specific DiffServ class ("CL- | |||
within a broader QoS architecture. | traffic") on the link exceeds these rates then packets are marked | |||
with "Admission Marking" or "Pre-emption Marking". The algorithms | ||||
that determine the number of packets marked are outlined in Section 3 | ||||
and detailed in [PCN]. PCN marking (Pre-Congestion Notification) | ||||
builds on the concepts of RFC 3168, "The addition of Explicit | ||||
Congestion Notification to IP" (which is briefly summarised in | ||||
Appendix A). | ||||
The per-hop, edge-to-edge and end-to-end aspects are now briefly | Traffic rate on link ^ | |||
introduced in turn. | | | |||
| Drop packets | ||||
link bandwidth -|--------------------------- | ||||
| | ||||
| Pre-emption Mark packets | ||||
configured-pre-emption-rate -|--------------------------- | ||||
| | ||||
| Admission Mark packets | ||||
configured-admission-rate -|--------------------------- | ||||
| | ||||
| No marking of packets | ||||
| | ||||
+--------------------------- | ||||
Appendix A provides a brief summary of Explicit Congestion | Figure 2: Packet Marking by Routers | |||
Notification (ECN) [RFC3168]. It specifies that a router sets the ECN | ||||
field to the Congestion Experienced (CE) value as a warning of | ||||
incipient congestion. RFC3168 doesn't specify a particular algorithm | ||||
for setting the CE codepoint, although Random Early Detection (RED) | ||||
is expected to be used. | ||||
Pre-Congestion Notification (PCN) builds on the concepts of ECN. PCN | Gateways of the CL-region make measurements of packet rates and their | |||
introduces a new algorithm that Admission Marks packets before there | PCN markings and convert them into decisions about whether to admit | |||
is any significant build-up of CL packets in the queue. Admission | new flows, and (if necessary) into the rate of excess traffic that | |||
marked packets therefore act as an "early warning" when the amount of | should be pre-empted. These mechanisms are detailed in Section 3 and | |||
packets flowing is getting close to the engineered capacity. Hence it | briefly outlined in the next few paragraphs. | |||
can be used with per-hop behaviours (PHBs) designed to operate with | ||||
very low queue occupancy, such as Expedited Forwarding (EF). Note | ||||
that our use of the ECN field operates across the CL-region, i.e. | ||||
edge-to-edge, and not host-to-host as in [RFC3168]. | ||||
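As a rough illustration of the per-link behaviour summarised in Figure 2, the following Python sketch reports which marking region a link is in for a given CL-traffic rate. It is a simplification and not the algorithm of [PCN]: a real marker decides packet by packet, and Pre-emption Marking applies only to packets amounting to the excess rate. The names LinkConfig and marking_region are hypothetical and do not appear in the draft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkConfig:
    """Per-link rates from Figure 2 (hypothetical container, any consistent unit)."""
    admission_rate: float   # configured-admission-rate
    preemption_rate: float  # configured-pre-emption-rate
    bandwidth: float        # link bandwidth


def marking_region(cfg: LinkConfig, cl_rate: float) -> str:
    """Return the Figure 2 region that the measured CL-traffic rate falls into."""
    # Figure 2 orders the rates: admission <= pre-emption <= link bandwidth.
    if not (cfg.admission_rate <= cfg.preemption_rate <= cfg.bandwidth):
        raise ValueError("expected admission <= pre-emption <= bandwidth")
    if cl_rate > cfg.bandwidth:
        return "drop"
    if cl_rate > cfg.preemption_rate:
        return "pre-emption-mark"
    if cl_rate > cfg.admission_rate:
        return "admission-mark"
    return "no-marking"
```

For example, with assumed rates of 60, 80 and 100 units, a CL-traffic rate of 70 falls in the Admission Marking region and a rate of 90 in the Pre-emption Marking region.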
Turning next to the edge-to-edge aspect. All routers within a region | The admission control mechanism for a new flow entering the network | |||
of the Internet, which we call the CL-region, apply the PHB used for | at ingress gateway G0 and leaving it at egress gateway G1 relies on | |||
CL traffic and the Pre-Congestion Notification behaviour. Traffic | feedback from the egress gateway G1 about the existing CL-region- | |||
must enter/leave the CL-region through ingress/egress gateways, which | aggregate between G0 and G1. This feedback is generated as follows. | |||
have special functionality. Typically the CL-region is the core or | All routers meter the rate of the CL-traffic on their outgoing links | |||
backbone of an operator. The CL service is achieved "edge-to-edge" | and mark the packets with the Admission Mark if the configured- | |||
across the CL-region, by using distributed measurement-based | admission-rate is exceeded. Egress gateway G1 measures the Admission | |||
admission control: the decision whether to admit a new microflow | Marks for each of its CL-region-aggregates separately. If the | |||
depends on a measurement of the existing traffic between the same | fraction of traffic on a CL-region-aggregate that is Admission Marked | |||
pair of ingress and egress gateways (i.e. the same pair as the | exceeds some threshold, no further flows should be admitted into this | |||
prospective new microflow). (See Appendix B for further discussion on | CL-region-aggregate. Because sources vary their data rates (amongst | |||
"What is distributed measurement-based admission control?") | other reasons) the rate of the CL-traffic on a link may fluctuate | |||
above and below the configured-admission-rate. Hence to get more | ||||
stable information, the egress gateway measures the fraction as a | ||||
moving average, called the Congestion-Level-Estimate. This is | ||||
signalled from the egress G1 to the ingress G0, to enable the ingress | ||||
to block new flows. | ||||
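The admission feedback described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the draft's definition: the egress is assumed to update the estimate once per measurement interval from bit counts, the smoothing weight is an arbitrary placeholder, and the class and function names are hypothetical. The draft's own EWMA calculation is given in Appendix C.

```python
class CongestionLevelEstimate:
    """Moving average of the Admission Marked fraction of one CL-region-aggregate."""

    def __init__(self, weight: float = 0.1) -> None:
        # 'weight' is an assumed smoothing factor; the draft does not fix a value here.
        if not 0.0 < weight <= 1.0:
            raise ValueError("weight must be in (0, 1]")
        self.weight = weight
        self.value = 0.0

    def update(self, marked_bits: int, total_bits: int) -> float:
        """Fold one measurement interval into the exponentially weighted average."""
        if total_bits <= 0:
            return self.value  # no traffic seen: keep the previous estimate
        fraction = marked_bits / total_bits
        self.value = (1.0 - self.weight) * self.value + self.weight * fraction
        return self.value


def admit_new_flow(cle: float, cle_threshold: float) -> bool:
    """Ingress decision: admit only while the estimate is below the threshold."""
    return cle < cle_threshold
```

In this sketch the egress gateway would keep one CongestionLevelEstimate per ingress gateway and signal its value to that ingress, which then applies admit_new_flow to each new flow request.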
As CL packets travel across the CL-region, routers will admission | Admission control seems most useful for the Controlled Load | |||
mark packets (according to the Pre-Congestion Notification algorithm) | service. In order to support CL traffic we would expect PCN to | |||
as an "early warning" of potential congestion, i.e. before there is | supplement the existing scheduling behaviour Expedited Forwarding | |||
any significant build-up of CL packets in the queue. For traffic from | (EF). Since PCN gives an "early warning" of potential congestion | |||
each remote ingress gateway, the CL-region's egress gateway measures | (hence "pre-congestion notification"), admission control can kick in | |||
the fraction of CL traffic that is admission marked. The egress | before there is any significant build up of packets in routers - | |||
gateway calculates the value on a per bit basis as a moving average | which is exactly the performance required for CL. However, PCN is not | |||
(exponentially weighted is suggested), and which we term Congestion- | only intended to supplement EF. PCN is specified (in [PCN]) as a | |||
Level-Estimate (CLE). Then it reports it to the CL-region's ingress | building block which can supplement the scheduling behaviour of other | |||
gateway, piggy-backed on the signalling for a new flow. The ingress | PHBs. | |||
gateway only admits the new CL microflow if the Congestion-Level- | ||||
Estimate is less than the value of the CLE-threshold. Hence | ||||
previously accepted CL microflows will suffer minimal queuing delay, | ||||
jitter and loss. | ||||
In turn, the edge-to-edge architecture is a building block in | The function to pre-empt flows (or allow the potential to pre-empt | |||
delivering an end-to-end CL service. The approach is similar to that | them) relies on feedback from the egress gateway about the CL-region- | |||
described in [RFC2998] for Integrated services operation over | aggregates. This feedback is generated as follows. All routers meter | |||
DiffServ networks. Like [RFC2998], an IntServ class (CL in our case) | the rate of the CL-traffic on their outgoing links, and if the rate | |||
is achieved end-to-end, with a CL-region viewed as a single | is in excess of the configured-pre-emption-rate then packets | |||
reservation hop in the total end-to-end path. Interior routers of the | amounting to the excess rate are Pre-emption Marked. If the egress | |||
CL-region do not process flow signalling nor do they hold per flow | gateway G1 sees a Pre-emption Marked packet then it measures, for | |||
state. We assume that the end-to-end signalling mechanism is RSVP | this CL-region-aggregate, the rate of all received packets that | |||
(Section 2.2). However, the RSVP signalling may itself be originated | aren't Pre-emption Marked. This is the rate of CL-traffic that the | |||
or terminated by proxies still closer to the edge of the network, | network can actually support from G0 to G1, and we thus call it the | |||
such as home hubs or the like, triggered in turn by application layer | Sustainable-Aggregate-Rate. The ingress gateway G0 compares the | |||
signalling. [RFC2998] and our approach are compared further in | Sustainable-Aggregate-Rate with the rate that it is sending towards | |||
Section 6.2. | G1, and hence determines the required traffic rate reduction. The | |||
document assumes flow pre-emption as the way of reacting to this | ||||
information, i.e. stopping sufficient flows to reduce the rate to the | ||||
Sustainable-Aggregate-Rate. However, this isn't mandated, for | ||||
instance policy or regulation may prevent pre-emption of some flows - | ||||
such considerations are out of scope of this document. | ||||
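The pre-emption arithmetic described above can be sketched in Python as follows. The rate calculations follow the text; the flow selection is purely an assumption for illustration (flows are taken in the order supplied until enough rate is freed), since the choice of which flows to stop depends on priority and policy. All function names are hypothetical.

```python
from typing import Dict, List


def sustainable_aggregate_rate(unmarked_bits: float, interval_s: float) -> float:
    """Egress: rate of received CL-traffic that is not Pre-emption Marked."""
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    return unmarked_bits / interval_s


def required_reduction(ingress_rate: float, sar: float) -> float:
    """Ingress: excess of the rate sent towards the egress over the sustainable rate."""
    return max(0.0, ingress_rate - sar)


def flows_to_preempt(flow_rates: Dict[str, float], reduction: float) -> List[str]:
    """Pick flows, in the order given, until their combined rate covers the reduction.

    The ordering is an assumed stand-in for whatever priority policy applies.
    """
    chosen: List[str] = []
    freed = 0.0
    for flow_id, rate in flow_rates.items():
        if freed >= reduction:
            break
        chosen.append(flow_id)
        freed += rate
    return chosen
```

If no reduction is required, flows_to_preempt returns an empty list, so the same code path can run on every report from the egress gateway.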
An important benefit compared with the IntServ over DiffServ model | 1.2. Key benefits | |||
[RFC2998] arises from the fact that the load is controlled | ||||
dynamically rather than with traffic conditioning agreements (TCAs). | ||||
TCAs were originally introduced in the (informational) DiffServ | ||||
architecture [RFC2475] as an alternative to reservation processing in | ||||
the interior region in order to reduce the burden on interior | ||||
routers. With TCAs, in practice service providers rely on | ||||
subscription-time Service Level Agreements that statically define the | ||||
parameters of the traffic that will be accepted from a customer. The | ||||
problem arises because the TCA at the ingress must allow any | ||||
destination address, if it is to remain scalable. But for longer | ||||
topologies, the chances increase that traffic will focus on an | ||||
interior resource, even though it is within contract at the ingress | ||||
[Reid], e.g. all flows converge on the same egress gateway. Even | ||||
though networks can be engineered to make such failures rare, when | ||||
they occur all inelastic flows through the congested resource fail | ||||
catastrophically. | ||||
Distributed measurement-based admission control avoids reservation | We believe that the mechanisms described in this document are simple, | |||
processing (whether per flow or aggregated) on interior routers but | scalable, and robust because: | |||
flows are still blocked dynamically in response to actual congestion | ||||
on any interior router. Hence there is no need for accurate or | ||||
conservative prediction of the traffic matrix. | ||||
1.1.2. Flow pre-emption | o Per flow state is only required at the ingress gateways to prevent | |||
non-admitted CL traffic from entering the PCN-region. Other | ||||
network entities are not aware of individual flows. | ||||
An essential QoS issue in core and backbone networks is being able to | o For each of its links a router has Admission Marking and Pre- | |||
cope with failures of routers and links. The consequent re-routing | emption Marking behaviours. These markers operate on the overall | |||
can cause severe congestion on some links and hence degrade the QoS | CL traffic of the respective link. Therefore, there are no | |||
experienced by on-going microflows and other, lower priority traffic. | scalability concerns. | |||
Even when the network is engineered to sustain a single link failure, | ||||
multiple link failures (e.g. due to a fibre cut, router failure or a | ||||
natural disaster) can cause violation of capacity constraints and | ||||
resulting QoS failures. Our solution uses rate-based flow pre- | ||||
emption, so that sufficient of the previously admitted CL microflows | ||||
are dropped to ensure that the remaining ones again receive QoS | ||||
commensurate with the CL service and at least some QoS is quickly | ||||
restored to other traffic classes. | ||||
The solution involves four steps. First, triggering the ingress | o The information of these measurements is implicitly signalled to | |||
gateway to test whether pre-emption may be needed. A router enhanced | the egress gateways by the marks in the packet headers. No | |||
with Pre-Congestion Notification may optionally include an algorithm | protocol actions (explicit messages) are required. | |||
that Pre-emption Marks packets. Reception of a packet with such a | ||||
marking alerts the egress gateway that pre-emption may be needed, | ||||
which in turn sends a Pre-emption Alert message to the ingress | ||||
gateway. Secondly, calculating the right amount of traffic to drop. | ||||
This involves the egress gateway measuring, and reporting to the | ||||
ingress gateway, the current rate of CL traffic received from that | ||||
particular ingress gateway. This is the CL rate which the network can | ||||
actually support from that ingress gateway to that egress gateway, | ||||
and we thus call it the Sustainable-Aggregate-Rate. The ingress | ||||
gateway compares the Sustainable-Aggregate-Rate with the rate that | ||||
it is sending and hence determines how much traffic needs to be pre- | ||||
empted. Thirdly, choosing which flows to shed in order to drop the | ||||
traffic calculated in the second step. Information on the priority of | ||||
flows may be held by the ingress gateway, or by some out of band | ||||
policy decision point. How these systems co-ordinate to determine | ||||
which flows to drop is outside the scope of this document, but | ||||
between them they have all the information necessary to make the | ||||
decision. Fourthly, tearing down reservations for the chosen flows. | ||||
The ingress gateway triggers standard tear-down messages for the | ||||
reservation protocol in use. In turn, this is expected to result in | ||||
end-systems tearing down the corresponding sessions (e.g. voice | ||||
calls) using the corresponding session control protocols. | ||||
The focus of this document is on the first two steps, i.e. | o The egress gateways make separate measurements for each ingress | |||
determining that pre-emption may be needed and estimating how much | gateway of packets. Each meter operates on the overall CL traffic | |||
traffic needs to be pre-empted. We provide some hints about the | of a particular CL-region-aggregate. Therefore, there are no | |||
latter two steps in Section 3.2.3, but don't try to provide full | scalability concerns as long as the number of ingress gateways is | |||
guidance as it greatly depends on the particular detailed operational | not overwhelmingly large. | |||
situation. | ||||
The solution operates within a little over one round trip time - the | o Feedback signalling is required between all pairs of ingress and | |||
time required for microflow packets that have experienced Pre-emption | egress gateways and the signalled information is on the basis of | |||
Marking to travel downstream through the CL-region and arrive at the | the corresponding CL-region-aggregate, i.e. it is also unaware of | |||
egress gateway, plus some additional time for the egress gateway to | individual flows. | |||
measure the rate seen after it has been alerted that pre-emption may | ||||
be needed, and the time for the egress gateway to report this | ||||
information to the ingress gateway. | ||||
1.1.3. Both admission control and pre-emption | o The configured-admission-rates can be chosen small enough that | |||
admitted traffic can still be carried after a rerouting in most | ||||
failure cases. This is an important feature as QoS violations in | ||||
core networks due to link failures are more likely than QoS | ||||
violations due to increased traffic volume. | ||||
This document describes both the admission control and pre-emption | o The admitted load is controlled dynamically. Therefore it adapts | |||
mechanisms, and we suggest that an operator uses both. However, we do | as the traffic matrix changes, and also if the network topology | |||
not require this and some operators may want to implement only one. | changes (e.g. after a link failure). Hence an operator can be less | |||
conservative when deploying network capacity, and less accurate in | ||||
their prediction of the traffic matrix. Also, controlling the load | ||||
using statically provisioned capacity per ingress (regardless of | ||||
the egress of a flow), as is typical in the DiffServ architecture | ||||
[RFC2475], can lead to focussed overload: many flows happen to | ||||
focus on a particular link and then all flows through the | ||||
congested link fail catastrophically (Section 6.2). | ||||
For example, an operator could use just admission control, solving | o The pre-emption function complements admission control. It allows | |||
heavy congestion (caused by re-routing) by 'just waiting' - as | the network to recover from sudden unexpected surges of CL-traffic | |||
sessions end, existing microflows naturally depart from the system | on some links, thus restoring QoS to the remaining flows. Such | |||
over time, and the admission control mechanism will prevent admission | scenarios are very unlikely but not impossible. They can be caused | |||
of new microflows that use the affected links. So the CL-region will | by large network failures that redirect lots of admitted CL- | |||
naturally return to normal controlled load service, but with reduced | traffic to other links, or by malfunction of the measurement-based | |||
capacity. The drawback of this approach would be that until flows | admission control in the presence of admitted flows that send for | |||
naturally depart to relieve the congestion, all flows and lower | a while with an atypically low rate and increase their rates in a | |||
priority services will be adversely affected. As another example, an | correlated way. | |||
operator could use just admission control, avoiding heavy congestion | ||||
(caused by re-routing) by 'capacity planning' - by configuring | ||||
admission control thresholds to lower levels than the network could | ||||
accept in normal situations such that the load after failure is | ||||
expected to stay below acceptable levels even with reduced network | ||||
resources. | ||||
On the other hand, an operator could just rely for admission control | 1.3. Terminology | |||
on the traffic conditioning agreements of the DiffServ architecture | ||||
[RFC2475]. The pre-emption mechanism described in this document would | ||||
be used to counteract the problem described at the end of Section | ||||
1.1.1. | ||||
1.2. Terminology | EDITOR'S NOTE: Terminology in this document is (hopefully) consistent | |||
with that in [PCN]. However, it may not be consistent with the | ||||
terminology in other PCN-related documents. The PCN Working Group (if | ||||
formed) will need to agree a single set of terminology. | ||||
This terminology is copied from the pre-congestion notification | This terminology is copied from the pre-congestion notification | |||
marking draft [PCN]: | marking draft [PCN]: | |||
o Pre-Congestion Notification (PCN): two new algorithms that | o Pre-Congestion Notification (PCN): two new algorithms that | |||
determine when a PCN-enabled router Admission Marks and Pre- | determine when a PCN-enabled router Admission Marks and Pre- | |||
emption Marks a packet, depending on the traffic level. | emption Marks a packet, depending on the traffic level. | |||
o Admission Marking condition: the traffic level is such that the | o Admission Marking condition: the traffic level is such that the | |||
router Admission Marks packets. The router provides an "early | router Admission Marks packets. The router provides an "early | |||
skipping to change at page 12, line 23 | skipping to change at page 11, line 27 | |||
o Sustainable-Aggregate-Rate: the rate of traffic that the network | o Sustainable-Aggregate-Rate: the rate of traffic that the network | |||
can actually support for a specific CL-region-aggregate. So it is | can actually support for a specific CL-region-aggregate. So it is | |||
measured by an egress gateway for the CL packets from a particular | measured by an egress gateway for the CL packets from a particular | |||
ingress gateway. | ingress gateway. | |||
o Ingress-Aggregate-Rate: the rate of traffic that is being sent on | o Ingress-Aggregate-Rate: the rate of traffic that is being sent on | |||
a specific CL-region-aggregate. So it is measured by an ingress | a specific CL-region-aggregate. So it is measured by an ingress | |||
gateway for the CL packets sent towards a particular egress | gateway for the CL packets sent towards a particular egress | |||
gateway. | gateway. | |||
1.3. Existing terminology | 1.4. Existing terminology | |||
This is a placeholder for useful terminology that is defined | This is a placeholder for useful terminology that is defined | |||
elsewhere. | elsewhere. | |||
1.4. Standardisation requirements | 1.5. Standardisation requirements | |||
The framework described in this document has two new standardisation | The framework described in this document has two new standardisation | |||
requirements: | requirements: | |||
o new Pre-Congestion Notification for Admission Marking and Pre- | o new Pre-Congestion Notification for Admission Marking and Pre- | |||
emption Marking are required, as detailed in [PCN]. | emption Marking are required, as detailed in [PCN]. | |||
o the end-to-end signalling protocol needs to be modified to carry | o the end-to-end signalling protocol needs to be modified to carry | |||
the Congestion-Level-Estimate report (for admission control) and | the Congestion-Level-Estimate report (for admission control) and | |||
the Sustainable-Aggregate-Rate (for flow pre-emption). With our | the Sustainable-Aggregate-Rate (for flow pre-emption). With our | |||
skipping to change at page 13, line 8 | skipping to change at page 12, line 11 | |||
detailed in [RSVP-PCN], for example to carry the Congestion-Level- | detailed in [RSVP-PCN], for example to carry the Congestion-Level- | |||
Estimate and Sustainable-Aggregate-Rate information from egress | Estimate and Sustainable-Aggregate-Rate information from egress | |||
gateway to ingress gateway. | gateway to ingress gateway. | |||
o We are discussing what to standardise about the gateway's | o We are discussing what to standardise about the gateway's | |||
behaviour. | behaviour. | |||
Other than these things, the arrangement uses existing IETF protocols | Other than these things, the arrangement uses existing IETF protocols | |||
throughout, although not in their usual architecture. | throughout, although not in their usual architecture. | |||
1.5. Structure of rest of the document | 1.6. Structure of rest of the document | |||
Section 2 describes some key aspects of the deployment model: our | Section 2 describes some key aspects of the deployment model: our | |||
goals, assumptions and the benefits we believe it has. Section 3 | goals and assumptions. Section 3 describes the deployment model, | |||
describes the deployment model, whilst Section 4 summarises the | whilst Section 4 summarises the required changes to the various | |||
required changes to the various routers in the CL-region. Section 5 | routers in the CL-region. Section 5 outlines some limitations of PCN | |||
outlines some limitations of PCN that we've identified in this | that we've identified in this deployment model; it also discusses | |||
deployment model; it also discusses some potential solutions, and | some potential solutions, and other possible extensions. Section 6 | |||
other possible extensions. Section 6 provides some comparison with | provides some comparison with existing QoS mechanisms. | |||
existing QoS mechanisms. | ||||
2. Key aspects of the deployment model | 2. Key aspects of the deployment model | |||
EDITOR'S NOTE: The material in Section 2 will eventually disappear, | ||||
as it will be covered by the problem statement of the PCN Working | ||||
Group (if formed). | ||||
In this section we discuss the key aspects of the deployment model: | In this section we discuss the key aspects of the deployment model: | |||
o At a high level, our key goals, i.e. the functionality that we | o At a high level, our key goals, i.e. the functionality that we | |||
want to achieve | want to achieve | |||
o The assumptions that we're prepared to make | o The assumptions that we're prepared to make | |||
o The consequent benefits they bring | ||||
2.1. Key goals | 2.1. Key goals | |||
The deployment model achieves an end-to-end controlled load (CL) | The deployment model achieves an end-to-end controlled load (CL) | |||
service where a segment of the end-to-end path is an edge-to-edge | service where a segment of the end-to-end path is an edge-to-edge | |||
Pre-Congestion Notification region. CL is a quality of service (QoS) | Pre-Congestion Notification region. CL is a quality of service (QoS) | |||
closely approximating the QoS that the same flow would receive from a | closely approximating the QoS that the same flow would receive from a | |||
lightly loaded network element [RFC2211]. It is useful for inelastic | lightly loaded network element [RFC2211]. It is useful for inelastic | |||
flows such as those for real-time media. | flows such as those for real-time media. | |||
o The CL service should be achieved despite varying load levels of | o The CL service should be achieved despite varying load levels of | |||
skipping to change at page 17, line 18 | skipping to change at page 17, line 5 | |||
Expedited Forwarding's PHB, but supplemented with Pre-Congestion | Expedited Forwarding's PHB, but supplemented with Pre-Congestion | |||
Notification. If this is possible, other PHBs (like Assured | Notification. If this is possible, other PHBs (like Assured | |||
Forwarding) could be supplemented with the same new behaviours. | Forwarding) could be supplemented with the same new behaviours. | |||
This is similar to how RFC3168 ECN was defined to supplement any | This is similar to how RFC3168 ECN was defined to supplement any | |||
PHB. | PHB. | |||
o Routing: we are looking in greater detail at the solution in the | o Routing: we are looking in greater detail at the solution in the | |||
presence of Equal Cost Multi-Path routing and at suitable | presence of Equal Cost Multi-Path routing and at suitable | |||
enhancements. See also the 'ECMP' section 5.1 later. | enhancements. See also the 'ECMP' section 5.1 later. | |||
2.3. Key benefits | ||||
We believe that the mechanism described in this document has several | ||||
advantages: | ||||
o It achieves statistical guarantees of quality of service for | ||||
microflows, delivering a very low delay, jitter and packet loss | ||||
service suitable for applications like voice and video calls that | ||||
generate real time inelastic traffic. This is because of its per | ||||
microflow admission control scheme, combined with its dynamic on- | ||||
path "early warning" of potential congestion. The guarantee is at | ||||
least as strong as with IntServ Controlled Load (Section 6.1 | ||||
mentions why the guarantee may be somewhat better), but without | ||||
the scalability problems of per-microflow IntServ. | ||||
o It can support "Emergency" and military Multi-Level Pre-emption | ||||
and Priority (MLPP) services, even in times of heavy congestion | ||||
(perhaps caused by failure of a router within the CL-region), by | ||||
pre-empting on-going "ordinary CL microflows". See also Section | ||||
4.5. | ||||
o It scales well, because there is no signal processing or per flow | ||||
state held by the interior routers of the CL-region. Note that | ||||
interior routers only hold state per outgoing interface - they do | ||||
not hold state per CL-region-aggregate nor per flow. | ||||
o It is resilient, again because no per flow state is held by the | ||||
interior routers of the CL-region. Hence during an interior | ||||
routing change caused by a router failure, no microflow state has | ||||
to be relocated. The flow pre-emption mechanism further helps | ||||
resilience because it rapidly reduces the load to one that the CL- | ||||
region can support. | ||||
o It helps preserve, through the flow pre-emption mechanism, QoS to | ||||
as many microflows as possible and to lower priority traffic in | ||||
times of heavy congestion (e.g. caused by failure of an interior | ||||
router). Otherwise long-lived microflows could cause loss on all | ||||
CL microflows for a long time. | ||||
o It avoids the potential catastrophic failure problem when the | ||||
DiffServ architecture is used in large networks using statically | ||||
provisioned capacity. This is achieved by controlling the load | ||||
dynamically, based on edge-to-edge-path real-time measurement of | ||||
Pre-Congestion Notification, as discussed in Section 1.1.1. | ||||
o It requires minimal new standardisation, because it reuses | ||||
existing QoS protocols and algorithms. | ||||
o It can be deployed incrementally, region by region or network by | ||||
network. Not all the regions or networks on the end-to-end path | ||||
need to have it deployed. Two CL-regions can even be separated by | ||||
a network that uses another QoS mechanism (e.g. MPLS-TE). | ||||
o It provides a deployment path for use of ECN for real-time | ||||
applications. Operators can gain experience of ECN before its | ||||
applicability to end-systems is understood and end terminals are | ||||
ECN capable. | ||||
3. Deployment model | 3. Deployment model | |||
3.1. Admission control | 3.1. Admission control | |||
In this section we describe the admission control mechanism. We | In this section we describe the admission control mechanism. We | |||
discuss the three pieces of the solution and then give an example of | discuss the three pieces of the solution and then give an example of | |||
how they fit together in a use case: | how they fit together in a use case: | |||
o the new Pre-Congestion Notification for Admission Marking used by | o the new Pre-Congestion Notification for Admission Marking used by | |||
all routers in the CL-region | all routers in the CL-region | |||
skipping to change at page 22, line 19 | skipping to change at page 20, line 19 | |||
they fit together in a use case: | they fit together in a use case: | |||
o How an ingress gateway is triggered to test whether flow pre- | o How an ingress gateway is triggered to test whether flow pre- | |||
emption may be needed | emption may be needed | |||
o How an ingress gateway determines the right amount of CL traffic | o How an ingress gateway determines the right amount of CL traffic | |||
to drop | to drop | |||
The mechanism is defined in [PCN] and [RSVP-PCN]. | The mechanism is defined in [PCN] and [RSVP-PCN]. | |||
Two subsequent steps could be: | ||||
o Choose which flows to shed, influenced by their priority and other | ||||
policy information | ||||
o Tear down the reservations for the chosen flows | ||||
We provide some hints about these latter two steps in Section 3.2.3, | ||||
but don't try to provide full guidance as it greatly depends on the | ||||
particular detailed operational situation. | ||||
An essential QoS issue in core and backbone networks is being able to | ||||
cope with failures of routers and links. The consequent re-routing | ||||
can cause severe congestion on some links and hence degrade the QoS | ||||
experienced by on-going microflows and other, lower priority traffic. | ||||
Even when the network is engineered to sustain a single link failure, | ||||
multiple link failures (e.g. due to a fibre cut, router failure or a | ||||
natural disaster) can cause violation of capacity constraints and | ||||
resulting QoS failures. Our solution uses rate-based flow pre- | ||||
emption, so that sufficient of the previously admitted CL microflows | ||||
are dropped to ensure that the remaining ones again receive QoS | ||||
commensurate with the CL service and at least some QoS is quickly | ||||
restored to other traffic classes. | ||||
3.2.1. Alerting an ingress gateway that flow pre-emption may be needed | 3.2.1. Alerting an ingress gateway that flow pre-emption may be needed | |||
Alerting an ingress gateway that flow pre-emption may be needed is a | Alerting an ingress gateway that flow pre-emption may be needed is a | |||
two stage process: a router in the CL-region alerts an egress gateway | two stage process: a router in the CL-region alerts an egress gateway | |||
that flow pre-emption may be needed; in turn the egress gateway | that flow pre-emption may be needed; in turn the egress gateway | |||
alerts the relevant ingress gateway. Every router in the CL-region | alerts the relevant ingress gateway. Every router in the CL-region | |||
has the ability to alert egress gateways, which may be done either | has the ability to alert egress gateways, which may be done either | |||
explicitly or implicitly: | explicitly or implicitly: | |||
o Explicit - the router per-hop behaviour is supplemented with a new | o Explicit - the router per-hop behaviour is supplemented with a new | |||
skipping to change at page 23, line 5 | skipping to change at page 21, line 30 | |||
that packets are pre-emption marked before the actual queue builds | that packets are pre-emption marked before the actual queue builds | |||
up. The algorithm's main parameter is the configured-pre-emption- | up. The algorithm's main parameter is the configured-pre-emption- | |||
rate, which is set lower than the link speed (but higher than the | rate, which is set lower than the link speed (but higher than the | |||
configured-admission-rate). Thus pre-emption marked packets indicate | configured-admission-rate). Thus pre-emption marked packets indicate | |||
that the CL traffic rate is reaching the configured-pre-emption-rate | that the CL traffic rate is reaching the configured-pre-emption-rate | |||
and so act as an "early warning" that the engineered capacity is | and so act as an "early warning" that the engineered capacity is | |||
nearly reached. Therefore they indicate that it may be advisable to | nearly reached. Therefore they indicate that it may be advisable to | |||
pre-empt some of the existing CL flows in order to preserve the QoS | pre-empt some of the existing CL flows in order to preserve the QoS | |||
of the others. | of the others. | |||
Note that the pre-emption marking algorithm doesn't measure the | ||||
packets that are already Pre-emption Marked. This ensures that in a | ||||
scenario with several links that are above their configured-pre- | ||||
emption-rate, then at the egress gateway the rate of packets | ||||
excluding Pre-emption Marked ones truly does represent the | ||||
Sustainable-Aggregate-Rate (see below for explanation). | ||||
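As an illustration only, the egress-side measurement described above could be sketched as follows. The Packet type, its field names and the fixed measurement interval are assumptions made for this sketch; the actual marking and metering behaviour is the one defined in [PCN].

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Packet:
    """Hypothetical view of a CL packet seen by the egress gateway."""
    size_bytes: int
    preemption_marked: bool


def sustainable_aggregate_rate(packets: Iterable[Packet],
                               interval_seconds: float) -> float:
    """Return the rate (bit/s) of CL packets from one ingress gateway
    that arrived WITHOUT Pre-emption Marking during the interval.

    Assumption: all packets passed in belong to a single
    CL-region-aggregate and were received within interval_seconds.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    # Pre-emption Marked packets are excluded from the measurement.
    unmarked_bytes = sum(p.size_bytes for p in packets
                         if not p.preemption_marked)
    return unmarked_bytes * 8 / interval_seconds
```

Excluding the marked packets is what lets the measured value stand for the rate the path can actually support, even when several links are above their configured-pre-emption-rate.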
Note that the explicit mechanism only makes sense if all the routers | Note that the explicit mechanism only makes sense if all the routers | |||
in the CL-region have the functionality so that the egress gateways | in the CL-region have the functionality so that the egress gateways | |||
can rely on the explicit mechanism. Otherwise there is the danger | can rely on the explicit mechanism. Otherwise there is the danger | |||
that the traffic happens to focus on a router without it, and egress | that the traffic happens to focus on a router without it, and egress | |||
gateways then have also to watch for implicit pre-emption alerts. | gateways then have also to watch for implicit pre-emption alerts. | |||
When one or more packets in a CL-region-aggregate alert the egress | When one or more packets in a CL-region-aggregate alert the egress | |||
gateway of the need for flow pre-emption, whether explicitly or | gateway of the need for flow pre-emption, whether explicitly or | |||
implicitly, the egress puts that CL-region-aggregate into the Pre- | implicitly, the egress puts that CL-region-aggregate into the Pre- | |||
emption Alert state. For each CL-region-aggregate in alert state it | emption Alert state. For each CL-region-aggregate in alert state it | |||
skipping to change at page 26, line 8 | skipping to change at page 24, line 40 | |||
this packet is part of (by using a five-tuple filter and comparing it | this packet is part of (by using a five-tuple filter and comparing it | |||
with state installed at admission) and hence which ingress gateway | with state installed at admission) and hence which ingress gateway | |||
the packet came from. It sets up a meter to measure the traffic rate | the packet came from. It sets up a meter to measure the traffic rate | |||
from this ingress gateway, and as soon as possible sends a message to | from this ingress gateway, and as soon as possible sends a message to | |||
the ingress gateway. This message alerts the ingress gateway that | the ingress gateway. This message alerts the ingress gateway that | |||
pre-emption may be needed and contains the traffic rate measured by | pre-emption may be needed and contains the traffic rate measured by | |||
the egress gateway. Then the ingress gateway determines the traffic | the egress gateway. Then the ingress gateway determines the traffic | |||
rate that it is sending towards this egress gateway and hence it can | rate that it is sending towards this egress gateway and hence it can | |||
calculate the amount of traffic that needs to be pre-empted. | calculate the amount of traffic that needs to be pre-empted. | |||
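A minimal sketch of the ingress gateway's calculation described above, assuming both rates are expressed in the same unit (bit/s here). The function and parameter names are hypothetical and are not taken from [PCN] or [RSVP-PCN].

```python
def traffic_to_preempt(ingress_aggregate_rate: float,
                       sustainable_aggregate_rate: float) -> float:
    """Return how much CL traffic (bit/s) the ingress gateway should
    shed for one CL-region-aggregate.

    ingress_aggregate_rate: rate the ingress gateway is sending
        towards the egress gateway.
    sustainable_aggregate_rate: rate reported by the egress gateway.
    """
    if ingress_aggregate_rate < 0 or sustainable_aggregate_rate < 0:
        raise ValueError("rates must be non-negative")
    # Nothing needs to be pre-empted if the ingress gateway is already
    # sending at or below the rate the network can support.
    return max(0.0, ingress_aggregate_rate - sustainable_aggregate_rate)
```

For example, with an Ingress-Aggregate-Rate of 100 Mbit/s and a reported Sustainable-Aggregate-Rate of 80 Mbit/s, the sketch gives 20 Mbit/s to be pre-empted. Which flows make up that amount is a separate decision.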
The solution operates within a little over one round trip time - the | ||||
time required for microflow packets that have experienced Pre-emption | ||||
Marking to travel downstream through the CL-region and arrive at the | ||||
egress gateway, plus some additional time for the egress gateway to | ||||
measure the rate seen after it has been alerted that pre-emption may | ||||
be needed, and the time for the egress gateway to report this | ||||
information to the ingress gateway. | ||||
The ingress gateway could now just shed random microflows, but it is | The ingress gateway could now just shed random microflows, but it is | |||
better if the least important ones are dropped. The ingress gateway | better if the least important ones are dropped. The ingress gateway | |||
could use information stored locally in each reservation's state | could use information stored locally in each reservation's state | |||
(such as for example the RSVP pre-emption priority of [RSVP- | (such as for example the RSVP pre-emption priority of [RSVP- | |||
PREEMPTION] or the RSVP admission priority of [RSVP-EMERGENCY]) as | PREEMPTION] or the RSVP admission priority of [RSVP-EMERGENCY]) as | |||
well as information provided by a policy decision point in order to | well as information provided by a policy decision point in order to | |||
decide which of the flows to shed (or perhaps which ones not to | decide which of the flows to shed (or perhaps which ones not to | |||
shed). This way, flow pre-emption can also help emergency/military | shed). This way, flow pre-emption can also help emergency/military | |||
calls by taking into account the corresponding priorities (as | calls by taking into account the corresponding priorities (as | |||
conveyed in RSVP policy elements) when selecting calls to be pre- | conveyed in RSVP policy elements) when selecting calls to be pre- | |||
skipping to change at page 27, line 5 | skipping to change at page 25, line 36 | |||
significantly less than the physical line capacity, flow pre-emption | significantly less than the physical line capacity, flow pre-emption | |||
may be triggered before any congestion has actually occurred and | may be triggered before any congestion has actually occurred and | |||
before any packet is dropped. | before any packet is dropped. | |||
We extend the scenario further by imagining that (due to a disaster | We extend the scenario further by imagining that (due to a disaster | |||
of some kind) further routers in the CL-region fail during the time | of some kind) further routers in the CL-region fail during the time | |||
taken by the pre-emption process described above. This is handled | taken by the pre-emption process described above. This is handled | |||
naturally, as packets will continue to be pre-emption marked and so | naturally, as packets will continue to be pre-emption marked and so | |||
the pre-emption process will happen for a second time. | the pre-emption process will happen for a second time. | |||
3.3. Both admission control and pre-emption | ||||
This document describes both the admission control and pre-emption | ||||
mechanisms, and we suggest that an operator uses both. However, we do | ||||
not require this and some operators may want to implement only one. | ||||
For example, an operator could use just admission control, solving | ||||
heavy congestion (caused by re-routing) by 'just waiting' - as | ||||
sessions end, existing microflows naturally depart from the system | ||||
over time, and the admission control mechanism will prevent admission | ||||
of new microflows that use the affected links. So the CL-region will | ||||
naturally return to normal controlled load service, but with reduced | ||||
capacity. The drawback of this approach would be that until flows | ||||
naturally depart to relieve the congestion, all flows and lower | ||||
priority services will be adversely affected. As another example, an | ||||
operator could use just admission control, avoiding heavy congestion | ||||
(caused by re-routing) by 'capacity planning' - by configuring | ||||
admission control thresholds to lower levels than the network could | ||||
accept in normal situations such that the load after failure is | ||||
expected to stay below acceptable levels even with reduced network | ||||
resources. | ||||
On the other hand, an operator could just rely for admission control | ||||
on the traffic conditioning agreements of the DiffServ architecture | ||||
[RFC2475]. The pre-emption mechanism described in this document would | ||||
be used to counteract the problem described at the end of Section | ||||
1.1.1. | ||||
4. Summary of Functionality | 4. Summary of Functionality | |||
This section is intended to provide a systematic summary of the new | This section is intended to provide a systematic summary of the new | |||
functionality required by the routers in the CL-region. | functionality required by the routers in the CL-region. | |||
A network operator upgrades normal IP routers by: | A network operator upgrades normal IP routers by: | |||
o Adding functionality related to admission control and flow pre- | o Adding functionality related to admission control and flow pre- | |||
emption to all its ingress and egress gateways | emption to all its ingress and egress gateways | |||
skipping to change at page 31, line 13 | skipping to change at page 31, line 13 | |||
(and, if needed, the pre-emption mechanism) to sort things out. | (and, if needed, the pre-emption mechanism) to sort things out. | |||
5. Limitations and some potential solutions | 5. Limitations and some potential solutions | |||
In this section we describe various limitations of the deployment | In this section we describe various limitations of the deployment | |||
model, and some suggestions about potential ways of alleviating them. | model, and some suggestions about potential ways of alleviating them. | |||
The limitations fall into three broad categories: | The limitations fall into three broad categories: | |||
o ECMP (Section 5.1): the assumption about routing (Section 2.2) is | o ECMP (Section 5.1): the assumption about routing (Section 2.2) is | |||
that all packets between a pair of ingress and egress gateways | that all packets between a pair of ingress and egress gateways | |||
follow the same path; ECMP breaks this assumption | follow the same path; ECMP breaks this assumption. A study | |||
regarding the accuracy of load balancing schemes can be found in | ||||
[LoadBalancing-a] and [LoadBalancing-b]. | ||||
o The lack of global coordination (Sections 5.2, 5.3 and 5.4): a | o The lack of global coordination (Sections 5.2, 5.3 and 5.4): a | |||
decision about admission control or flow pre-emption is made for | decision about admission control or flow pre-emption is made for | |||
one aggregate independently of other aggregates | one aggregate independently of other aggregates | |||
o Timing and accuracy of measurements (Sections 5.5 and 5.6): the | o Timing and accuracy of measurements (Sections 5.5 and 5.6): the | |||
assumption (Section 2.2) that additional load, offered within the | assumption (Section 2.2) that additional load, offered within the | |||
reaction time of the measurement-based admission control | reaction time of the measurement-based admission control | |||
mechanism, doesn't move the system directly from no congestion to | mechanism, doesn't move the system directly from no congestion to | |||
overload (dropping packets). A 'flash crowd' may break this | overload (dropping packets). A 'flash crowd' may break this | |||
skipping to change at page 32, line 42 | skipping to change at page 32, line 43 | |||
or are pre-empted), and there is still the danger that for some | or are pre-empted), and there is still the danger that for some | |||
traffic mixes the operator hasn't been cautious enough. | traffic mixes the operator hasn't been cautious enough. | |||
o for admission control, probe to obtain a flow-specific congestion- | o for admission control, probe to obtain a flow-specific congestion- | |||
level-estimate. Earlier this document suggests continuously | level-estimate. Earlier this document suggests continuously | |||
monitoring the congestion-level-estimate. Instead, probe packets | monitoring the congestion-level-estimate. Instead, probe packets | |||
could be sent for each prospective new flow. The probe packets | could be sent for each prospective new flow. The probe packets | |||
have the same IP address etc as the data packets would have, and | have the same IP address etc as the data packets would have, and | |||
hence follow the same ECMP path. However, probing is an extra | hence follow the same ECMP path. However, probing is an extra | |||
overhead, depending on how many probe packets need to be sent to | overhead, depending on how many probe packets need to be sent to | |||
get a sufficiently accurate congestion-level-estimate. | get a sufficiently accurate congestion-level-estimate. Probes also | |||
cause a processing overhead, either for the machine at the | ||||
destination address or for the egress gateway to identify and | ||||
remove the probe packets. | ||||
o for flow pre-emption, only select flows for pre-emption from | o for flow pre-emption, only select flows for pre-emption from | |||
amongst those that have actually received a Pre-emption Marked | amongst those that have actually received a Pre-emption Marked | |||
packet, because these flows must have followed an ECMP path that | packet, because these flows must have followed an ECMP path that | |||
goes through an overloaded router. However, it needs some extra | goes through an overloaded router. However, it needs some extra | |||
work by the egress gateway, to record this information and report | work by the egress gateway, to record this information and report | |||
it to the ingress gateway. | it to the ingress gateway. | |||
o for flow pre-emption, a variant of this idea involves introducing | o for flow pre-emption, a variant of this idea involves introducing | |||
a new marking behaviour, 'Router Marking'. A router that is pre- | a new marking behaviour, 'Router Marking'. A router that is pre- | |||
skipping to change at page 43, line 36 | skipping to change at page 43, line 47 | |||
(Section 2.2), so that the CL-region could consist of multiple | (Section 2.2), so that the CL-region could consist of multiple | |||
domains run by different operators that did not trust each other. | domains run by different operators that did not trust each other. | |||
Then only the ingress and egress gateways of the CL-region would take | Then only the ingress and egress gateways of the CL-region would take | |||
part in the admission control procedure, i.e. at the ingress to the | part in the admission control procedure, i.e. at the ingress to the | |||
first domain and the egress from the final domain. The border routers | first domain and the egress from the final domain. The border routers | |||
between operators within the CL-region would only have to do bulk | between operators within the CL-region would only have to do bulk | |||
accounting - they wouldn't do per microflow metering and policing, | accounting - they wouldn't do per microflow metering and policing, | |||
and they wouldn't take part in signal processing or hold per flow | and they wouldn't take part in signal processing or hold per flow | |||
state [Briscoe]. [Re-feedback] explains how a downstream domain can | state [Briscoe]. [Re-feedback] explains how a downstream domain can | |||
police that its upstream domain does not 'cheat' by admitting traffic | police that its upstream domain does not 'cheat' by admitting traffic | |||
when the downstream path is over-congested. [Re-PCN] proposes how to | when the downstream path is congested. [Re-PCN] proposes how to | |||
achieve this with the help of another recently proposed extension to | achieve this with the help of another recently proposed extension to | |||
ECN, involving re-echoing ECN feedback [Re-ECN]. | ECN, involving re-echoing ECN feedback [Re-ECN]. | |||
5.7.3. Preferential dropping of pre-emption marked packets | 5.7.3. Preferential dropping of pre-emption marked packets | |||
When the rate of real-time traffic in the specified class exceeds the | When the rate of real-time traffic in the specified class exceeds the | |||
maximum configured rate, then a router has to drop some packet(s) | maximum configured rate, then a router has to drop some packet(s) | |||
instead of forwarding them on the out-going link. Now when the egress | instead of forwarding them on the out-going link. Now when the egress | |||
gateway measures the Sustainable-Aggregate-Rate, neither dropped | gateway measures the Sustainable-Aggregate-Rate, neither dropped | |||
packets nor pre-emption marked packets contribute to it. Dropping | packets nor pre-emption marked packets contribute to it. Dropping | |||
skipping to change at page 45, line 9 | skipping to change at page 45, line 20 | |||
aggregation assumption (Section 2.2) doesn't hold. In the extreme it | aggregation assumption (Section 2.2) doesn't hold. In the extreme it | |||
may be possible to operate the framework end-to-end, i.e. between end | may be possible to operate the framework end-to-end, i.e. between end | |||
hosts. One potential method is to send probe packets to test whether | hosts. One potential method is to send probe packets to test whether | |||
the network can support a prospective new CL microflow. The probe | the network can support a prospective new CL microflow. The probe | |||
packets would be sent at the same traffic rate as expected for the | packets would be sent at the same traffic rate as expected for the | |||
actual microflow, but in order not to disturb existing CL traffic a | actual microflow, but in order not to disturb existing CL traffic a | |||
router would always schedule probe packets behind CL ones (compare | router would always schedule probe packets behind CL ones (compare | |||
[Breslau00]); this implies they have a new DSCP. Otherwise the | [Breslau00]); this implies they have a new DSCP. Otherwise the | |||
routers would treat probe packets identically to CL packets. In order | routers would treat probe packets identically to CL packets. In order | |||
to perform admission control quickly, in parts of the network where | to perform admission control quickly, in parts of the network where | |||
there are only a few CL microflows, the Pre-Congestion marking | there are only a few CL microflows, the algorithm for Admission | |||
behaviour for probe packets would switch from admission marking no | Marking described in [PCN] would need to "switch on" very rapidly, ie | |||
packets to admission marking them all for only a minimal increase in | go from marking no packets to marking them all for only a minimal | |||
load. | increase in the size of the virtual queue. | |||
5.7.6. MPLS-TE | 5.7.6. MPLS-TE | |||
[ECN-MPLS] discusses how to extend the deployment model to MPLS, i.e. | [ECN-MPLS] discusses how to extend the deployment model to MPLS, i.e. | |||
for admission control of microflows into a set of MPLS-TE aggregates | for admission control of microflows into a set of MPLS-TE aggregates | |||
(Multi-protocol label switching traffic engineering). It would | (Multi-protocol label switching traffic engineering). It would | |||
require that the MPLS header could include the ECN field, which is | require that the MPLS header could include the ECN field, which is | |||
not precluded by RFC3270. See [ECN-MPLS]. | not precluded by RFC3270. See [ECN-MPLS]. | |||
6. Relationship to other QoS mechanisms | 6. Relationship to other QoS mechanisms | |||
skipping to change at page 46, line 50 | skipping to change at page 46, line 50 | |||
indications of network resource availability. In practice, service | indications of network resource availability. In practice, service | |||
providers rely on subscription-time Service Level Agreements (SLAs) | providers rely on subscription-time Service Level Agreements (SLAs) | |||
that statically define the parameters of the traffic that will be | that statically define the parameters of the traffic that will be | |||
accepted from a customer. The CL mechanism allows dynamic reservation | accepted from a customer. The CL mechanism allows dynamic reservation | |||
of resources through the DiffServ domain and, with the potential | of resources through the DiffServ domain and, with the potential | |||
extension mentioned in Section 5.7.2, it can span multiple domains | extension mentioned in Section 5.7.2, it can span multiple domains | |||
without active policing mechanisms at the borders (unlike DiffServ). | without active policing mechanisms at the borders (unlike DiffServ). | |||
Therefore we do not use the traffic conditioning agreements (TCAs) of | Therefore we do not use the traffic conditioning agreements (TCAs) of | |||
the (informational) DiffServ architecture [RFC2475]. | the (informational) DiffServ architecture [RFC2475]. | |||
An important benefit arises from the fact that the load is controlled | ||||
dynamically rather than with traffic conditioning agreements (TCAs). | ||||
TCAs were originally introduced in the (informational) DiffServ | ||||
architecture [RFC2475] as an alternative to reservation processing in | ||||
the interior region in order to reduce the burden on interior | ||||
routers. With TCAs, in practice service providers rely on | ||||
subscription-time Service Level Agreements that statically define the | ||||
parameters of the traffic that will be accepted from a customer. The | ||||
problem arises because the TCA at the ingress must allow any | ||||
destination address, if it is to remain scalable. But for longer | ||||
topologies, the chances increase that traffic will focus on an | ||||
interior resource, even though it is within contract at the ingress | ||||
[Reid], e.g. all flows converge on the same egress gateway. Even | ||||
though networks can be engineered to make such failures rare, when | ||||
they occur all inelastic flows through the congested resource fail | ||||
catastrophically. | ||||
[Johnson] compares admission control with a 'generously dimensioned' | [Johnson] compares admission control with a 'generously dimensioned' | |||
DiffServ network as ways to achieve QoS. The former is recommended. | DiffServ network as ways to achieve QoS. The former is recommended. | |||
6.4. ECN | 6.4. ECN | |||
The marking behaviour described in this document complies with the | The marking behaviour described in this document complies with the | |||
ECN aspects of the IP wire protocol RFC3168, but provides its own | ECN aspects of the IP wire protocol RFC3168, but provides its own | |||
edge-to-edge feedback instead of the TCP aspects of RFC3168. All | edge-to-edge feedback instead of the TCP aspects of RFC3168. All | |||
routers within the CL-region are upgraded with the admission marking | routers within the CL-region are upgraded with the admission marking | |||
and pre-emption marking of Pre-Congestion Notification, so the | and pre-emption marking of Pre-Congestion Notification, so the | |||
skipping to change at page 49, line 5 | skipping to change at page 48, line 38 | |||
Multi-protocol label switching traffic engineering (MPLS-TE) allows | Multi-protocol label switching traffic engineering (MPLS-TE) allows | |||
scalable reservation of resources in the core for an aggregate of | scalable reservation of resources in the core for an aggregate of | |||
many microflows. To achieve end-to-end reservations, admission | many microflows. To achieve end-to-end reservations, admission | |||
control and policing of microflows into the aggregate can be achieved | control and policing of microflows into the aggregate can be achieved | |||
using techniques such as RSVP Aggregation over MPLS TE Tunnels as per | using techniques such as RSVP Aggregation over MPLS TE Tunnels as per | |||
[AGGRE-TE]. However, in the case of inter-provider environments, | [AGGRE-TE]. However, in the case of inter-provider environments, | |||
these techniques require that admission control and policing be | these techniques require that admission control and policing be | |||
repeated at each trust boundary or that MPLS TE tunnels span multiple | repeated at each trust boundary or that MPLS TE tunnels span multiple | |||
domains. | domains. | |||
6.8. Other Network Admission Control Approaches | ||||
Link admission control (LAC) describes how admission control (AC) can | ||||
be done on a single link and comprises, e.g., the calculation of | ||||
effective bandwidths, which may be the basis for a parameter-based AC. | ||||
In contrast, network AC (NAC) describes how AC can be done for a | ||||
network and focuses on the locations from which data is gathered for | ||||
the admission decision. Most approaches implement a link budget based | ||||
NAC (LB NAC) where each link has a certain AC-budget. RSVP works | ||||
according to that principle, and the new concept likewise admits | ||||
additional flows as long as each link on the new flow's path still | ||||
has resources available. The border-to-border budget based NAC (BBB | ||||
NAC) pre-configures an AC budget for all border-to-border | ||||
relationships (= CL-region-aggregates) and if this capacity budget is | ||||
exhausted, new flows are rejected. The TCA-based admission control | ||||
which is associated with the DiffServ architecture implements an | ||||
ingress budget based NAC (IB NAC). These basically different concepts | ||||
have different flexibility and efficiency with regard to the use of | ||||
link bandwidths [NAC-a,NAC-b]. They can be made resilient by choosing | ||||
the budgets in such a way that the network will not be congested | ||||
after rerouting due to a failure. The efficiency of the approaches | ||||
differs with and without such resilience requirements. | ||||
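The three budget styles named above can be contrasted with a minimal
sketch. It is illustrative only and is not taken from [NAC-a] or
[NAC-b]: the Budget class, the function names and the dictionary
layouts are all hypothetical, and rates are assumed to be plain numbers
in one common unit.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    """A capacity budget and the amount already committed (hypothetical type)."""
    capacity: float
    used: float = 0.0

    def fits(self, rate: float) -> bool:
        return self.used + rate <= self.capacity

    def reserve(self, rate: float) -> None:
        self.used += rate


def admit_lb_nac(link_budgets, path, rate):
    """Link budget based NAC: every link on the flow's path must have room."""
    budgets = [link_budgets[link] for link in path]
    if all(b.fits(rate) for b in budgets):
        for b in budgets:
            b.reserve(rate)
        return True
    return False


def admit_bbb_nac(b2b_budgets, ingress, egress, rate):
    """Border-to-border budget based NAC: one budget per (ingress, egress) pair."""
    b = b2b_budgets[(ingress, egress)]
    if b.fits(rate):
        b.reserve(rate)
        return True
    return False


def admit_ib_nac(ingress_budgets, ingress, rate):
    """Ingress budget based NAC: one budget per ingress, whatever the destination."""
    b = ingress_budgets[ingress]
    if b.fits(rate):
        b.reserve(rate)
        return True
    return False
```

The sketch only shows where the admission decision looks for its
budget. It says nothing about how the budgets are dimensioned, which is
the point on which the approaches differ in efficiency and resilience.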
7. Security Considerations | 7. Security Considerations | |||
To protect against denial of service attacks, the ingress gateway of | To protect against denial of service attacks, the ingress gateway of | |||
the CL-region needs to police all CL packets and drop packets in | the CL-region needs to police all CL packets and drop packets in | |||
excess of the reservation. This is similar to operations with | excess of the reservation. This is similar to operations with | |||
existing IntServ behaviour. | existing IntServ behaviour. | |||
For pre-emption, it is considered acceptable from a security | For pre-emption, it is considered acceptable from a security | |||
perspective that the ingress gateway can treat "emergency/military" | perspective that the ingress gateway can treat "emergency/military" | |||
CL flows preferentially compared with "ordinary" CL flows. However, | CL flows preferentially compared with "ordinary" CL flows. However, | |||
skipping to change at page 49, line 39 | skipping to change at page 50, line 5 | |||
The admission control mechanism evolved from the work led by Martin | The admission control mechanism evolved from the work led by Martin | |||
Karsten on the Guaranteed Stream Provider developed in the M3I | Karsten on the Guaranteed Stream Provider developed in the M3I | |||
project [GSPa, GSP-TR], which in turn was based on the theoretical | project [GSPa, GSP-TR], which in turn was based on the theoretical | |||
work of Gibbens and Kelly [DCAC]. Kennedy Cheng, Gabriele Corliano, | work of Gibbens and Kelly [DCAC]. Kennedy Cheng, Gabriele Corliano, | |||
Carla Di Cairano-Gilfedder, Kashaf Khan, Peter Hovell, Arnaud Jacquet | Carla Di Cairano-Gilfedder, Kashaf Khan, Peter Hovell, Arnaud Jacquet | |||
and June Tay (BT) helped develop and evaluate this approach. | and June Tay (BT) helped develop and evaluate this approach. | |||
Many thanks to those who have commented on this work at Transport | Many thanks to those who have commented on this work at Transport | |||
Area Working Group meetings and on the mailing list, including: Ken | Area Working Group meetings and on the mailing list, including: Ken | |||
Carlberg, Ruediger Geib, Lars Westberg, David Black, Robert Hancock, | Carlberg, Ruediger Geib, Lars Westberg, David Black, Robert Hancock, | |||
Cornelia Kappler. | Cornelia Kappler, Michael Menth. | |||
9. Comments solicited | 9. Comments solicited | |||
Comments and questions are encouraged and very welcome. They can be | Comments and questions are encouraged and very welcome. They can be | |||
sent to the Transport Area Working Group's mailing list, | sent to the Transport Area Working Group's mailing list, | |||
tsvwg@ietf.org, and/or to the authors. | tsvwg@ietf.org, and/or to the authors. | |||
10. Changes from earlier versions of the draft | 10. Changes from earlier versions of the draft | |||
The main changes are: | The main changes are: | |||
skipping to change at page 51, line 5 | skipping to change at page 51, line 5 | |||
Section 5 has been updated and expanded. It is now about the | Section 5 has been updated and expanded. It is now about the | |||
'limitations' of the PCN mechanism, as described in the earlier | 'limitations' of the PCN mechanism, as described in the earlier | |||
sections, plus discussion of 'possible solutions' to those | sections, plus discussion of 'possible solutions' to those | |||
limitations. | limitations. | |||
The measurement of the Congestion-Level-Estimate now includes pre- | The measurement of the Congestion-Level-Estimate now includes pre- | |||
emption marked packets as well as admission marked ones. Section | emption marked packets as well as admission marked ones. Section | |||
3.1.2 explains. | 3.1.2 explains. | |||
From -03 to -04 | ||||
Detailed review by Michael Menth. In response, Abstract, Summary and | ||||
Key benefits sections re-written. Numerous detailed comments on | ||||
Section 5 and the following sections. | ||||
11. Appendices | 11. Appendices | |||
11.1. Appendix A: Explicit Congestion Notification | 11.1. Appendix A: Explicit Congestion Notification | |||
This Appendix provides a brief summary of Explicit Congestion | This Appendix provides a brief summary of Explicit Congestion | |||
Notification (ECN). | Notification (ECN). | |||
[RFC3168] specifies the incorporation of ECN to TCP and IP, including | [RFC3168] specifies the incorporation of ECN to TCP and IP, including | |||
ECN's use of two bits in the IP header. It specifies a method for | ECN's use of two bits in the IP header. It specifies a method for | |||
indicating incipient congestion to end-hosts (e.g. as in RED, Random | indicating incipient congestion to end-hosts (e.g. as in RED, Random | |||
skipping to change at page 52, line 5 | skipping to change at page 53, line 7 | |||
The CE codepoint '11' is set by a router to indicate congestion to | The CE codepoint '11' is set by a router to indicate congestion to | |||
the end hosts. The term 'CE packet' denotes a packet that has the CE | the end hosts. The term 'CE packet' denotes a packet that has the CE | |||
codepoint set. | codepoint set. | |||
The ECN-Capable Transport (ECT) codepoints '10' and '01' (ECT(0) and | The ECN-Capable Transport (ECT) codepoints '10' and '01' (ECT(0) and | |||
ECT(1) respectively) are set by the data sender to indicate that the | ECT(1) respectively) are set by the data sender to indicate that the | |||
end-points of the transport protocol are ECN-capable. Routers treat | end-points of the transport protocol are ECN-capable. Routers treat | |||
the ECT(0) and ECT(1) codepoints as equivalent. Senders are free to | the ECT(0) and ECT(1) codepoints as equivalent. Senders are free to | |||
use either the ECT(0) or the ECT(1) codepoint to indicate ECT, on a | use either the ECT(0) or the ECT(1) codepoint to indicate ECT, on a | |||
packet-by-packet basis. The use of both the two codepoints for ECT is | packet-by-packet basis. The motivation for having two codepoints (the | |||
motivated primarily by the desire to allow mechanisms for the data | 'ECN nonce') is the desire to check two things: for the data sender | |||
sender to verify that network elements are not erasing the CE | to verify that network elements are not erasing the CE codepoint; and | |||
codepoint, and that data receivers are properly reporting to the | for the data sender to verify that data receivers are properly | |||
sender the receipt of packets with the CE codepoint set. | reporting to the sender the receipt of packets with the CE codepoint | |||
set. | ||||
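As an illustration of the codepoints just described, the following
sketch decodes the two-bit ECN field. It assumes the field occupies the
two least-significant bits of the former IPv4 TOS octet (or IPv6
Traffic Class), as in [RFC3168]. The '00' (Not-ECT) value comes from
[RFC3168] rather than from the text above, and the function names are
hypothetical.

```python
ECN_MASK = 0b11  # the ECN field is the low two bits of the octet

ECN_NAMES = {
    0b00: "Not-ECT",  # transport is not ECN-capable
    0b01: "ECT(1)",
    0b10: "ECT(0)",
    0b11: "CE",       # congestion experienced, set by a router
}


def ecn_codepoint(tos_octet: int) -> str:
    """Return the name of the ECN codepoint carried in the octet."""
    return ECN_NAMES[tos_octet & ECN_MASK]


def is_ect(tos_octet: int) -> bool:
    """True for ECT(0) or ECT(1), which routers treat as equivalent."""
    return (tos_octet & ECN_MASK) in (0b01, 0b10)


def mark_ce(tos_octet: int) -> int:
    """Set the CE codepoint, leaving the upper six (DSCP) bits untouched."""
    return tos_octet | ECN_MASK
```

A router would be expected to call mark_ce only on a packet for which
is_ect is true; the sketch leaves that check to the caller.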
ECN requires support from the transport protocol, in addition to the | ECN requires support from the transport protocol, in addition to the | |||
functionality given by the ECN field in the IP packet header. | functionality given by the ECN field in the IP packet header. | |||
[RFC3168] addresses the addition of ECN Capability to TCP, specifying | [RFC3168] addresses the addition of ECN Capability to TCP, specifying | |||
three new pieces of functionality: negotiation between the endpoints | three new pieces of functionality: negotiation between the endpoints | |||
during connection setup to determine if they are both ECN-capable; an | during connection setup to determine if they are both ECN-capable; an | |||
ECN-Echo (ECE) flag in the TCP header so that the data receiver can | ECN-Echo (ECE) flag in the TCP header so that the data receiver can | |||
inform the data sender when a CE packet has been received; and a | inform the data sender when a CE packet has been received; and a | |||
Congestion Window Reduced (CWR) flag in the TCP header so that the | Congestion Window Reduced (CWR) flag in the TCP header so that the | |||
data sender can inform the data receiver that the congestion window | data sender can inform the data receiver that the congestion window | |||
skipping to change at page 55, line 5 | skipping to change at page 56, line 5 | |||
bits]n ) | bits]n ) | |||
[EWMA-AM-bits]'n+1 = (B * bits-in-packet) + (w' * [EWMA-AM-bits]n | [EWMA-AM-bits]'n+1 = (B * bits-in-packet) + (w' * [EWMA-AM-bits]n | |||
) | ) | |||
where w' = (1-w)/w. | where w' = (1-w)/w. | |||
If w' is arranged to be a power of 2, these per packet algorithms can | If w' is arranged to be a power of 2, these per packet algorithms can | |||
be implemented solely with a shift and an add. | be implemented solely with a shift and an add. | |||
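A minimal sketch of the shift-and-add form follows. It makes several
assumptions: w' = 2^k for a non-negative integer k, all quantities are
integers, B is 1 for an admission marked packet and 0 otherwise, and
the total-bits average is updated by the analogous formula with B fixed
at 1. The function and parameter names are hypothetical.

```python
def scaled_ewma_update(total_bits, am_bits, bits_in_packet, admission_marked, k):
    """One per-packet update of the scaled (primed) averages, with w' = 2**k.

    Multiplying by w' becomes a left shift by k, so each average costs
    one shift and one add.
    """
    new_total = (total_bits << k) + bits_in_packet
    # B * bits-in-packet: add the packet's bits only if it was admission marked
    new_am = (am_bits << k) + (bits_in_packet if admission_marked else 0)
    return new_total, new_am
```

For example, with k = 3 (so w' = 8) the previous averages are shifted
left by three bits before the current packet's bits are added.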
There are alternative possibilities for smoothing out the congestion- | ||||
level-estimate. For example [TEWMA] deals better with the issue of | ||||
stale information when the traffic rate for | ||||
12. References | 12. References | |||
A later version will distinguish normative and informative | A later version will distinguish normative and informative | |||
references. | references. | |||
[AGGRE-TE] Francois Le Faucheur, Michael Dibiasio, Bruce Davie, | [AGGRE-TE] Francois Le Faucheur, Michael Dibiasio, Bruce Davie, | |||
Michael Davenport, Chris Christou, Jerry Ash, Bur | Michael Davenport, Chris Christou, Jerry Ash, Bur | |||
Goode, 'Aggregation of RSVP Reservations over MPLS | Goode, 'Aggregation of RSVP Reservations over MPLS | |||
TE/DS-TE Tunnels', draft-ietf-tsvwg-rsvp-dste-03 (work | TE/DS-TE Tunnels', draft-ietf-tsvwg-rsvp-dste-03 (work | |||
[ANSI.MLPP.Spec] American National Standards Institute, | [ANSI.MLPP.Spec] American National Standards Institute, | |||
skipping to change at page 56, line 37 | skipping to change at page 58, line 4 | |||
http://www.kom.e-technik.tu- | http://www.kom.e-technik.tu- | |||
darmstadt.de/publications/abstracts/KS02-5.html (May, | darmstadt.de/publications/abstracts/KS02-5.html (May, | |||
2002) | 2002) | |||
[ITU.MLPP.1990] International Telecommunications Union, "Multilevel | [ITU.MLPP.1990] International Telecommunications Union, "Multilevel | |||
Precedence and Pre-emption Service (MLPP)", ITU-T | Precedence and Pre-emption Service (MLPP)", ITU-T | |||
Recommendation I.255.3, 1990. | Recommendation I.255.3, 1990. | |||
[Johnson] DM Johnson, 'QoS control versus generous | [Johnson] DM Johnson, 'QoS control versus generous | |||
dimensioning', BT Technology Journal, Vol 23 No 2, | dimensioning', BT Technology Journal, Vol 23 No 2, | |||
[LoadBalancing-a] Ruediger Martin, Michael Menth, and Michael | ||||
Hemmkeppler: "Accuracy and Dynamics of Hash-Based Load | ||||
Balancing Algorithms for Multipath Internet Routing", | ||||
IEEE Broadnets, San Jose, CA, USA, October 2006 | ||||
http://www3.informatik.uni- | ||||
wuerzburg.de/~menth/Publications/Menth06p.pdf | ||||
[LoadBalancing-b] Ruediger Martin, Michael Menth, and Michael | ||||
Hemmkeppler: "Accuracy and Dynamics of Multi-Stage | ||||
Load Balancing for Multipath Internet Routing", | ||||
currently under submission http://www3.informatik.uni- | ||||
wuerzburg.de/~menth/Publications/Menth07-Sub-6.pdf | ||||
[Low] S. Low, L. Andrew, B. Wydrowski, 'Understanding XCP: | [Low] S. Low, L. Andrew, B. Wydrowski, 'Understanding XCP: | |||
equilibrium and fairness', IEEE InfoCom 2005 | equilibrium and fairness', IEEE InfoCom 2005 | |||
[NAC-a] Michael Menth: "Efficient Admission Control and | ||||
Routing in Resilient Communication Networks", PhD | ||||
thesis, July 2004, http://opus.bibliothek.uni- | ||||
wuerzburg.de/opus/volltexte/2004/994/pdf/Menth04.pdf | ||||
[NAC-b] Michael Menth, Stefan Kopf, Joachim Charzinski, and | ||||
Karl Schrodi: "Resilient Network Admission Control", | ||||
currently under submission. | ||||
http://www3.informatik.uni- | ||||
wuerzburg.de/~menth/Publications/Menth07-Sub-3.pdf | ||||
[PCN] B. Briscoe, P. Eardley, D. Songhurst, F. Le Faucheur, | [PCN] B. Briscoe, P. Eardley, D. Songhurst, F. Le Faucheur, | |||
A. Charny, V. Liatsos, S. Dudley, J. Babiarz, K. Chan, | A. Charny, V. Liatsos, S. Dudley, J. Babiarz, K. Chan, | |||
G. Karagiannis, A. Bader, L. Westberg. 'Pre-Congestion | G. Karagiannis, A. Bader, L. Westberg. 'Pre-Congestion | |||
Notification marking', draft-briscoe-tsvwg-cl-phb-02 | Notification marking', draft-briscoe-tsvwg-cl-phb-02 | |||
(work in progress), June 2006. | (work in progress), June 2006. | |||
[Re-ECN] Bob Briscoe, Arnaud Jacquet, Alessandro Salvatori, | [Re-ECN] Bob Briscoe, Arnaud Jacquet, Alessandro Salvatori, | |||
'Re-ECN: Adding Accountability for Causing Congestion | 'Re-ECN: Adding Accountability for Causing Congestion | |||
to TCP/IP', draft-briscoe-tsvwg-re-ecn-tcp-01 (work in | to TCP/IP', draft-briscoe-tsvwg-re-ecn-tcp-01 (work in | |||
progress), March 2006. | progress), March 2006. | |||
End of changes. 57 change blocks. | ||||
358 lines changed or deleted | 405 lines changed or added | |||