draft-irtf-iccrg-welzl-congestion-control-open-research-02.txt   draft-irtf-iccrg-welzl-congestion-control-open-research-03.txt 
Network Working Group Michael Welzl Network Working Group Michael Welzl
Internet Draft Dimitri Papadimitriou Internet Draft Dimitri Papadimitriou
Document: draft-irtf-iccrg-welzl- Editors Document: draft-irtf-iccrg-welzl- Editors
congestion-control-open-research-02.txt congestion-control-open-research-03.txt
Michael Scharf Expires: October 16, 2009 Michael Scharf
Bob Briscoe Bob Briscoe
April 17, 2009
Open Research Issues in Internet Congestion Control Open Research Issues in Internet Congestion Control
draft-irtf-iccrg-welzl-congestion-control-open-research-02.txt draft-irtf-iccrg-welzl-congestion-control-open-research-03.txt
Status of this Memo Status of this Memo
By submitting this Internet-Draft, each author represents that any This Internet-Draft is submitted to IETF in full conformance with the
applicable patent or other IPR claims of which he or she is aware provisions of BCP 78 and BCP 79.
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that other Task Force (IETF), its areas, and its working groups. Note that
groups may also distribute working documents as Internet- other groups may also distribute working documents as Internet-
Drafts. Drafts.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt. http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
This Internet-Draft will expire on January 30, 2009.
Copyright Notice
Copyright (C) The IETF Trust (2008).
Abstract Abstract
This document describes some of the open problems in Internet This document describes some of the open problems in Internet
congestion control that are known today. This includes several new congestion control that are known today. This includes several new
challenges that are becoming important as the network grows, as well challenges that are becoming important as the network grows, as well
as some issues that have been known for many years. These challenges as some issues that have been known for many years. These challenges
are generally considered to be open research topics that may require are generally considered to be open research topics that may require
more study or application of innovative techniques before Internet- more study or application of innovative techniques before Internet-
scale solutions can be confidently engineered and deployed. scale solutions can be confidently engineered and deployed.
skipping to change at page 2, line 23 skipping to change at page 2, line 18
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC-2119 [i]. document are to be interpreted as described in RFC-2119 [i].
Table of Contents Table of Contents
1. Introduction...................................................3 1. Introduction...................................................3
2. Global Challenges..............................................4 2. Global Challenges..............................................4
2.1 Heterogeneity..............................................4 2.1 Heterogeneity..............................................4
2.2 Stability..................................................6 2.2 Stability..................................................6
2.3 Fairness...................................................7 2.3 Fairness...................................................7
3. Detailed Challenges............................................8 3. Detailed Challenges............................................9
3.1 Challenge 1: Router Support................................8 3.1 Challenge 1: Network Support...............................9
3.2 Challenge 2: Corruption Loss..............................12 3.2 Challenge 2: Corruption Loss..............................14
3.3 Challenge 3: Small Packets................................14 3.3 Challenge 3: Packets Sizes................................16
3.4 Challenge 4: Pseudo-Wires.................................18 3.4 Challenge 4: Flow Startup.................................20
3.5 Challenge 5: Multi-domain Congestion Control..............20 3.5 Challenge 5: Multi-domain Congestion Control..............22
3.6 Challenge 6: Precedence for Elastic Traffic...............21 3.6 Challenge 6: Precedence for Elastic Traffic...............25
3.7 Challenge 7: Misbehaving Senders and Receivers............22 3.7 Challenge 7: Misbehaving Senders and Receivers............26
3.8 Other challenges..........................................23 3.8 Other challenges..........................................27
4. Security Considerations.......................................25 4. Security Considerations.......................................32
5. Contributors..................................................26 5. Contributors..................................................32
6. References....................................................26 6. References....................................................32
6.1 Normative References.........................................26 6.1 Normative References.........................................32
Acknowledgments...............................................32 Acknowledgments...............................................40
1. Introduction 1. Introduction
This document describes some of the open research topics in the This document describes some of the open research topics in the
domain of Internet congestion control that are known today. We begin domain of Internet congestion control that are known today. We begin
by reviewing some proposed definitions of congestion and congestion by reviewing some proposed definitions of congestion and congestion
control based on current understandings. control based on current understandings.
Congestion can be defined as the reduction in utility due to overload Congestion can be defined as a state or condition that occurs when
in networks that support both spatial and temporal multiplexing, but the network resources are overloaded resulting into impairments for
no reservation [Keshav]. Congestion control is a (typically network users as objectively measured by the probability of loss
distributed) algorithm to share network resources among competing and/or of delay). The overload results in the reduction of utility in
traffic sources. Two components of distributed congestion control networks that support both spatial and temporal multiplexing, but no
have been defined in the context of prima-dual modeling [Kelly98]. reservation [Keshav]. Congestion control is a (typically distributed)
Primal congestion control refers to the algorithm executed by the algorithm to share network resources among competing traffic sources.
traffic sources algorithm for controlling their sending rates or Two components of distributed congestion control have been defined in
window sizes. This is normally a closed-loop control, where this the context of primal-dual modeling [Kelly98]. Primal congestion
operation depends on feedback. TCP algorithms fall in this category. control refers to the algorithm executed by the traffic sources
Dual congestion control is implemented by the routers through algorithm for controlling their sending rates or window sizes. This
gathering information about the traffic traversing them. A dual is normally a closed-loop control, where this operation depends on
congestion control algorithm updates, implicitly or explicitly, a feedback. TCP algorithms fall in this category. Dual congestion
congestion measure and sends it back, implicitly or explicitly, to control is implemented by the routers through gathering information
the traffic sources that use that link. Queue management algorithms about the traffic traversing them. A dual congestion control
such as Random Early Detection (RED) [Floyd93] or Random Exponential algorithm updates, implicitly or explicitly, a congestion measure or
Marking (REM) [Ath01] fall in the "dual" category. congestion rate and sends it back, implicitly or explicitly, to the
traffic sources that use that link. Queue management algorithms such
as Random Early Detection (RED) [Floyd93] or Random Exponential
Marking (REM) [Ath01] fall into the "dual" category.
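As an illustration of the source/router split described above, the following is a minimal sketch of a price-based (dual-style) update on a single link. It is not taken from [Kelly98]; the logarithmic utility, the gain value, and all function and parameter names are assumptions chosen only to keep the example small.

```python
def dual_price_iteration(weights, capacity, price=2.0, gain=0.1, steps=200):
    """Iterate a single-link price update with log-utility sources.

    The log utility, the gain, and every name here are illustrative
    assumptions, not part of any specification.
    """
    rates = []
    for _ in range(steps):
        # Source side: each source picks the rate that maximizes
        # w * log(x) - price * x, which gives x = w / price.
        rates = [w / price for w in weights]
        # Router side: the link raises its congestion measure when the
        # aggregate rate exceeds capacity and lowers it otherwise.
        price = max(1e-9, price + gain * (sum(rates) - capacity))
    return price, rates


price, rates = dual_price_iteration(weights=[1.0, 2.0], capacity=3.0)
print(round(price, 3), [round(x, 3) for x in rates])
```

In this sketch the congestion measure settles where the aggregate rate matches the link capacity, and the sources share that capacity in proportion to their weights.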
Congestion control provides for a fundamental set of mechanisms for Congestion control provides for a fundamental set of mechanisms for
maintaining the stability and efficiency of the Internet. Congestion maintaining the stability and efficiency of the Internet. Congestion
control has been associated with TCP since Van Jacobson's work in control has been associated with TCP since Van Jacobson's work in
1988, but there is also congestion control outside of TCP (e.g. for 1988, but there is also congestion control outside of TCP (e.g. for
real-time multimedia applications, multicast, and router-based real-time multimedia applications, multicast, and router-based
mechanisms). The Van Jacobson end-to-end congestion control mechanisms) [ICCRG-RFCs]. The Van Jacobson end-to-end congestion
algorithms [Jacobson88] [RFC2581] are used by the Internet transport control algorithms [Jacobson88] [RFC2581] are used by the Internet
protocol TCP [RFC4614]. They have been proven to be highly successful transport protocol TCP [RFC4614]. They have been proven to be highly
over many years but have begun to reach their limits, as the successful over many years but have begun to reach their limits, as
heterogeneity of both the data link and physical layer and the heterogeneity of both the data link and physical layer and
applications are pulling TCP congestion control (which performs applications are pulling TCP congestion control (which performs
poorly as the bandwidth or delay increases) outside of its natural poorly as the bandwidth or delay increases) beyond its natural
operating regime. A side effect of these deficits is that there is an operating regime. A side effect of these deficits is that there is an
increasing share of hosts that use non-standardized congestion increasing share of hosts that use non-standardized congestion
control enhancements (for instance, many Linux distributions have control enhancements (for instance, many Linux distributions have
been shipped with "CUBIC" as default TCP congestion control been shipped with "CUBIC" as default TCP congestion control
mechanism). mechanism).
While the original Jacobson algorithm requires no congestion-related While the original Jacobson algorithm requires no congestion-related
state in routers, more recent modifications have departed from the state in routers, more recent modifications have departed from the
strict application of the end-to-end principle [Saltzer84]. Active strict application of the end-to-end principle [Saltzer84] in order
Queue Management (AQM) in routers, e.g., RED and its variants such as to avoid congestion collapse. Active Queue Management (AQM) in
xCHOKE [Pan00], RED with In/Out (RIO) [Clark98], improves performance routers, e.g., RED and its variants such as Weighted RED (WRED),
by keeping queues small (implicit feedback via dropped packets), Stabilized RED (SRED), Adaptive RED (ARED), xCHOKE [Pan00], RED with
while Explicit Congestion Notification (ECN) [Floyd94] [RFC3168] In/Out (RIO) [Clark98], improves performance by keeping queues small
passes one bit of congestion information back to senders when an AQM (implicit feedback via dropped packets), while Explicit Congestion
would normally drop a packet. These measures do improve performance, Notification (ECN) [Floyd94] [RFC3168] passes one bit of congestion
but there is a limit to how much can be accomplished without more information back to senders when an AQM would normally drop a packet.
information from routers. The requirement of extreme scalability These measures do improve performance, but there is a limit to how
together with robustness has been a difficult hurdle to accelerating much can be accomplished without more information from routers. The
information flow. Primal-Dual TCP/AQM distributed algorithm stability requirement of extreme scalability together with robustness has been
and equilibrium properties have been extensively studied (cf. [Low02] a difficult hurdle to accelerating information flow. Primal-Dual
[Low03]). TCP/AQM distributed algorithm stability and equilibrium properties
have been extensively studied (cf. [Low02], [Low03], [Kelly98],
[Kelly05]).
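The queue-based feedback described above can be sketched as follows. This is a simplified reading of basic RED, assuming a linear ramp between two thresholds and an exponentially weighted average of the queue length; it omits the refinements of [Floyd93], and the function names and default weight are illustrative assumptions. With ECN, the same probability would be used to mark a packet rather than drop it.

```python
def update_average(avg_queue, sample, weight=0.002):
    """Exponentially weighted moving average of the queue length."""
    return (1.0 - weight) * avg_queue + weight * sample


def red_probability(avg_queue, min_th, max_th, max_p):
    """Drop (or ECN-mark) probability for a given average queue length."""
    if avg_queue < min_th:
        return 0.0  # queue is short: no congestion signal
    if avg_queue >= max_th:
        return 1.0  # queue is long: signal on every packet
    # Between the thresholds the probability grows linearly up to max_p.
    return max_p * (avg_queue - min_th) / (max_th - min_th)


for avg in (4, 10, 14, 20):
    print(avg, red_probability(avg, min_th=5, max_th=15, max_p=0.1))
```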
Congestion control includes many new challenges that are becoming Congestion control includes many new challenges that are becoming
important as the network grows in addition to the issues that have important as the network grows in addition to the issues that have
been known for many years. These are generally considered to be open been known for many years. These are generally considered to be open
research topics that may require more study or application of research topics that may require more study or application of
innovative techniques before Internet-scale solutions can be innovative techniques before Internet-scale solutions can be
confidently engineered and deployed. In what follows, an overview of confidently engineered and deployed. In what follows, an overview of
some of these challenges is given. some of these challenges is given.
2. Global Challenges 2. Global Challenges
skipping to change at page 4, line 37 skipping to change at page 4, line 42
2.1 Heterogeneity 2.1 Heterogeneity
The Internet encompasses a large variety of heterogeneous IP networks The Internet encompasses a large variety of heterogeneous IP networks
that are realized by a multitude of technologies, which result in a that are realized by a multitude of technologies, which result in a
tremendous variety of link and path characteristics: capacity can be tremendous variety of link and path characteristics: capacity can be
either scarce in very slow speed radio links (several kbps), or there either scarce in very slow speed radio links (several kbps), or there
may be an abundant supply in high-speed optical links (several may be an abundant supply in high-speed optical links (several
gigabit per second). Concerning latency, scenarios range from local gigabit per second). Concerning latency, scenarios range from local
interconnects (much less than a millisecond) to certain wireless and interconnects (much less than a millisecond) to certain wireless and
satellite links with very large latencies (up to a second). Even satellite links with very large latencies (up to a second). Even
higher latencies can occur in interstellar communication. As a higher latencies can occur in space communication. As a consequence,
consequence, both the available bandwidth and the end-to-end delay in both the available bandwidth and the end-to-end delay in the Internet
the Internet may vary over many orders of magnitude, and it is likely may vary over many orders of magnitude, and it is likely that the
that the range of parameters will further increase in future. range of parameters will further increase in future.
Additionally, neither the available bandwidth nor the end-to-end Additionally, neither the available bandwidth nor the end-to-end
delay is constant. At the IP layer, competing cross-traffic, traffic delay is constant. At the IP layer, competing cross-traffic, traffic
management in routers, and dynamic routing can result in sudden management in routers, and dynamic routing can result in sudden
changes of the characteristics of an end-to-end path. Additional changes of the characteristics of an end-to-end path. Additional
dynamics can be caused by link layer mechanisms, such as shared media dynamics can be caused by link layer mechanisms, such as shared media
access (e.g., in wireless networks), changes of links access (e.g., in wireless networks), changes of links due to mobility
(horizontal/vertical handovers), topology modifications (e.g., in (horizontal/vertical handovers), topology modifications (e.g., in
ad-hoc networks), link layer error correction and dynamic bandwidth ad-hoc or meshed networks), link layer error correction and dynamic
provisioning schemes. From this follows that path characteristics can bandwidth provisioning schemes. From this follows that path
be subject to substantial changes within short time frames. characteristics can be subject to substantial changes within short
time frames.
The congestion control algorithms have to deal with this variety in Congestion control algorithms have to deal with this variety in an
an efficient way. The congestion control principles introduced by Van efficient and stable way. The congestion control principles
Jacobson assume a rather static scenario and implicitly target introduced by Van Jacobson assume a rather static scenario and
configurations where the bandwidth-delay product is of the order of implicitly target configurations where the bandwidth-delay product is
some dozens of packets at most. While these principles have proved to of the order of some dozens of packets at most. While these
work well in the Internet for almost two decades, much larger principles have proved to work well in the Internet for almost two
bandwidth-delay products and increased dynamics challenge them more decades, much larger bandwidth-delay products and increased dynamics
and more. There are many situations where today's congestion control challenge them more and more. There are many situations where today's
algorithms react in a suboptimal way, resulting in low resource congestion control algorithms react in a suboptimal way, resulting in
utilization, non-optimal congestion avoidance, or unfairness. low resource utilization, non-optimal congestion avoidance, or
unequal flow rates.
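To make the range of bandwidth-delay products concrete, the following sketch computes the number of packets in flight needed to fill a path. The 1500-byte packet size, the 100 ms RTT, and the two link speeds are assumptions chosen for illustration.

```python
def bdp_packets(bandwidth_bps, rtt_s, packet_bytes=1500):
    """Bandwidth-delay product expressed in packets."""
    return bandwidth_bps * rtt_s / (packet_bytes * 8)


# Hypothetical paths, both with a 100 ms round-trip time.
for label, bandwidth in (("1 Mbit/s", 1e6), ("10 Gbit/s", 1e10)):
    print(label, round(bdp_packets(bandwidth, 0.1), 1), "packets")
```

Under these assumptions the slow path needs fewer than a dozen packets in flight, while the fast path needs tens of thousands, which is far outside the regime of "some dozens of packets".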
This gave rise to a multitude of new proposals for congestion control This has resulted into a multitude of new proposals for congestion
algorithms. For instance, since the Additive-Increase Multiplicative control algorithms. For instance, since the Additive Increase
Decrease (AIMD) behavior of TCP is too conservative in practical Multiplicative Decrease (AIMD) behavior of TCP is too conservative in
environments when the congestion window is large, several high-speed practical environments when the congestion window is large, several
congestion control extensions have been developed. However, these new high-speed congestion control extensions have been developed.
algorithms raise fairness issues, and they may be less robust in However, these new algorithms raise rate equality issues, and they
certain situations for which they have not been designed. Up to now, may be less robust in certain situations for which they have not been
there is still no common agreement in the IETF on which algorithm and designed. Up to now, there is still no common agreement in the IETF
protocol to choose. on which algorithm(s) and protocol(s) to choose.
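The conservativeness of AIMD at large windows can be illustrated with a small calculation. The sketch below assumes the classic parameters (halve the window on a congestion event, add one packet per RTT) and counts the round trips needed to regain the previous window; the window of 83,333 packets and the RTT of 100 ms are hypothetical values.

```python
def rtts_to_recover(window, increase=1.0, decrease=0.5):
    """Round trips needed to grow back to `window` after one decrease."""
    current = window * decrease  # multiplicative decrease
    rtts = 0
    while current < window:
        current += increase  # additive increase, once per RTT
        rtts += 1
    return rtts


rtt_s = 0.1  # hypothetical round-trip time
for window in (100, 83333):
    rtts = rtts_to_recover(window)
    print(window, rtts, "RTTs,", round(rtts * rtt_s / 60, 1), "minutes")
```

With these assumptions, a single congestion event at the large window costs on the order of an hour of recovery time, whereas the small window recovers in a few seconds.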
It is always possible to tune congestion control parameters based on It is always possible to tune congestion control parameters based on
some knowledge about the environment and the application scenario. some knowledge of the environment and the application scenario.
However, the fundamental question is whether it is possible to define However, the interaction of multiple congestion control techniques
one congestion control mechanism that operates reasonable well in the interacting with each other is not yet well understood. The
fundamental question is whether it is possible to define one
congestion control mechanism that operates reasonably well in the
whole range of scenarios that exist in the Internet. Hence, it is an whole range of scenarios that exist in the Internet. Hence, it is an
important research question how new Internet congestion control important research question how new Internet congestion control
mechanisms would have to be designed, which maximum degree of mechanisms would have to be designed, which maximum degree of
dynamics it could efficiently handle, and whether it could keep the dynamics they can efficiently handle, and whether they can keep the
generality of the existing end-to-end solutions. generality of the existing end-to-end solutions.
Some improvements of congestion control could be realized by simple Some improvements to congestion control could be realized by simple
changes of single functions in end-system or network components. changes of single functions in end-system or network components.
However, new mechanism can also require a fundamental redesign of the However, new mechanism(s) might also require a fundamental redesign
overall network architecture, and they may even affect the design of of the overall network architecture, and they may even affect the
Internet applications. This can imply significant interoperability design of Internet applications. This can imply significant
and backward compatibility challenges and/or create network interoperability and backward compatibility challenges and/or create
accessibility obstacles. In particular, networks and/or applications network accessibility obstacles. In particular, networks and/or
that do not use or support a new congestion control mechanism could applications that do not use or support a new congestion control
be penalized by a significantly worse performance compared to what mechanism could be penalized by a significantly worse performance
they would get if everybody used the existing mechanisms (cf. the compared to what they would get if everybody used the existing
discussion on fairness in section 2.3). [RFC5033] defines several mechanisms (cf. the discussion on fairness in section 2.3). [RFC5033]
criteria to evaluate the appropriateness of a new congestion control defines several criteria to evaluate the appropriateness of a new
mechanism. However, the fundamental question is how much performance congestion control mechanism. However, the fundamental question is
deterioration is acceptable for "legacy" applications. This tradeoff how much performance deterioration is acceptable for "legacy"
between performance and cost has to be very carefully examined for applications. This tradeoff between performance and cost has to be
all new congestion control schemes. very carefully examined for all new congestion control schemes.
2.2 Stability 2.2 Stability
Control theory, which is a mathematical tool for describing dynamic Control theory is a mathematical tool for describing dynamic systems.
systems, lends itself to modeling congestion control - TCP is a It lends itself to modeling congestion control - TCP is a perfect
perfect example of a typical "closed loop" system that can be example of a typical "closed loop" system that can be described in
described in control theoretic terms. In control theory, there is a control theoretic terms. However, control theory has had to be
mathematically defined notion of system stability. In a stable extended to model the interactions between multiple control loops in
system, for any bounded input over any amount of time, the output a network. In control theory, there is a mathematically defined
will also be bounded. For congestion control, what is actually meant notion of system stability. In a stable system, for any bounded input
with stability is typically asymptotic stability: a mechanism should over any amount of time, the output will also be bounded. For
converge to a certain state irrespective of the initial state of the congestion control, what is actually meant by global stability is
network. typically asymptotic stability: a mechanism should converge to a
certain state irrespective of the initial state of the network. Local
stability means that if the system is perturbed from its stable state
it will quickly return towards the locally stable state.
Control theoretic modeling of a realistic network can be quite Control theoretic modeling of a realistic network can be quite
difficult, especially when taking distinct packet sizes and difficult, especially when taking distinct packet sizes and
heterogeneous RTTs into account. It has therefore become common heterogeneous RTTs into account. It has therefore become common
practice to model simpler cases and leave the more complicated practice to model simpler cases and to leave the more complicated
(realistic) situations for simulations. Clearly, if a mechanism is (realistic) situations for simulations. Clearly, if a mechanism is
not stable in a simple scenario, it is generally useless; this method not stable in a simple scenario, it is generally useless; this method
therefore helps to eliminate faulty congestion control candidates at therefore helps to eliminate faulty congestion control candidates at
an early stage. an early stage.
Some fundamental facts known from control theory are Some fundamental facts known from control theory are
useful as guidelines when designing a congestion control mechanism. useful as guidelines when designing a congestion control mechanism.
For instance, a controller should only be fed a system state that For instance, a controller should only be fed a system state that
reflects its output. A (low-pass) filter function should be used in reflects its output. A (low-pass) filter function should be used in
order to pass only states to the controller that are expected to last order to pass only states to the controller that are expected to last
skipping to change at page 6, line 53 skipping to change at page 7, line 14
reached "steady state", tries to maintain an equal amount of packets reached "steady state", tries to maintain an equal amount of packets
in flight at any time by only sending a packet into the network when in flight at any time by only sending a packet into the network when
a packet has left the network (as indicated by an ACK arriving at the a packet has left the network (as indicated by an ACK arriving at the
sender). The latter aspect has guided many decisions regarding sender). The latter aspect has guided many decisions regarding
changes that were made to TCP over the years. changes that were made to TCP over the years.
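The behavior of sending a packet only when another has left the network can be sketched as a toy ACK-clocked sender. This is not TCP: the class, its fixed window, and the absence of loss handling are simplifying assumptions made for illustration.

```python
class AckClockedSender:
    """Toy sender that keeps a constant number of packets in flight."""

    def __init__(self, window):
        self.window = window
        self.in_flight = 0
        self.sent = 0

    def _send(self):
        self.in_flight += 1
        self.sent += 1

    def fill_window(self):
        # Initial phase: put packets into the network up to the window.
        while self.in_flight < self.window:
            self._send()

    def on_ack(self):
        # An ACK means one packet has left the network, so exactly one
        # new packet may enter it.
        if self.in_flight == 0:
            return
        self.in_flight -= 1
        self._send()


sender = AckClockedSender(window=4)
sender.fill_window()
for _ in range(10):
    sender.on_ack()
print(sender.in_flight, sender.sent)
```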
The reasoning in [Jacobson88] assumes all senders to be acting at the The reasoning in [Jacobson88] assumes all senders to be acting at the
same time. The stability of TCP under more realistic network same time. The stability of TCP under more realistic network
conditions has been investigated in a large number of ensuing works, conditions has been investigated in a large number of ensuing works,
leading to no clear conclusion that TCP would also be asymptotically leading to no clear conclusion that TCP would also be asymptotically
stable under arbitrary network conditions. The stability impact of stable under arbitrary network conditions. On the other hand,
Slow Start (which can be significant as short-lived HTTP flows often research has concluded that stability can be assured with constraints
never leave this phase) is also not entirely clear. on dynamics that are less stringent than the "conservation of packets
principle". From control theory, only rate increase (not the target
rate) needs to be inversely proportional to RTT (whereas window-based
control converges on a target rate inversely proportional to RTT).
With rate-based control, high-speed congestion control converges on a
rate that is independent of RTT as long as its dynamics depends on
RTT (e.g. FAST TCP [Jin04]).
However in the stability analysis of TCP and of these more modern
controls the stability impact of Slow Start (which can be significant
as short-lived HTTP flows often never leave this phase) is not
entirely clear.
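The inverse dependence of a window-based rate on RTT can be illustrated with the widely used square-root approximation for AIMD throughput. That approximation is not taken from this document; it is assumed here as a simplified steady-state model, and the segment size, loss probability, and function name are illustrative.

```python
import math


def aimd_rate_bps(mss_bytes, rtt_s, loss_prob):
    """Approximate steady-state AIMD rate: (MSS / RTT) * sqrt(3 / (2 p))."""
    return (mss_bytes * 8 / rtt_s) * math.sqrt(1.5 / loss_prob)


# Two hypothetical flows seeing the same loss probability.
short_rtt = aimd_rate_bps(1500, 0.01, 1e-4)
long_rtt = aimd_rate_bps(1500, 0.1, 1e-4)
print(round(short_rtt / long_rtt, 1))
```

Under this model, two flows that see the same congestion level obtain rates in inverse proportion to their RTTs.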
2.3 Fairness 2.3 Fairness
Recently, the way the Internet community reasons about fairness has Recently, the way the Internet community reasons about fairness has
been called into deep questioning [Bri07]. Much of the community has been called into deep questioning [Bri07]. Much of the community has
taken fairness to mean approximate equality between the rates of taken fairness to mean approximate equality between the rates of
flows (flow rate fairness) that experience equivalent path congestion flows (flow rate fairness) that experience equivalent path congestion
as with TCP [RFC2581] and TFRC [RFC3448]. [RFC3714] depicts the as with TCP [RFC2581] and TFRC [RFC3448]. [RFC3714] depicts the
resulting situation as "The Amorphous Problem of Fairness". resulting situation as "The Amorphous Problem of Fairness".
skipping to change at page 7, line 31 skipping to change at page 8, line 6
In comparison, the debate between max-min, proportional and TCP In comparison, the debate between max-min, proportional and TCP
fairness is about mere details. These three all share the assumption fairness is about mere details. These three all share the assumption
that equal flow rates are desirable; they merely differ in the second that equal flow rates are desirable; they merely differ in the second
order issue of how to share out excess capacity in a network of many order issue of how to share out excess capacity in a network of many
bottlenecks. In contrast, cost fairness should lead to extremely bottlenecks. In contrast, cost fairness should lead to extremely
unequal flow rates by design. Equivalently, equal flow rates would unequal flow rates by design. Equivalently, equal flow rates would
typically be considered extremely unfair. typically be considered extremely unfair.
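As a point of reference for the flow rate view, the following sketch computes a max-min fair allocation on a single bottleneck by progressive filling. The single-link setting, the flow names, and the demands are assumptions for illustration; networks with many bottlenecks require a more general procedure.

```python
def max_min_share(capacity, demands):
    """Max-min fair allocation of `capacity` among flows with demands."""
    allocation = {}
    remaining = capacity
    # Serve the smallest demands first; whatever they leave unused is
    # shared equally among the flows that are still unsatisfied.
    pending = sorted(demands.items(), key=lambda item: item[1])
    for index, (flow, demand) in enumerate(pending):
        fair_share = remaining / (len(pending) - index)
        granted = min(demand, fair_share)
        allocation[flow] = granted
        remaining -= granted
    return allocation


print(max_min_share(10, {"a": 2, "b": 4, "c": 10}))
```

Flows that ask for less than an equal share receive their demand, and the remaining flows are held to equal rates, which is the equal-rate assumption that cost fairness does not share.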
The two traditional approaches are not protocol options that can each The two traditional approaches are not protocol options that can each
be followed in different parts of a network. They result in research be followed in different parts of an inter-network. They result in
agendas and issues that are different in their respective objectives research agendas and issues that are different in their respective
resulting in different sets of open issues. objectives resulting in different sets of open issues.
If we assume TCP-friendliness as a goal with flow rate as the metric, If we assume TCP-friendliness as a goal with flow rate as the metric,
open issues would be: open issues would be:
- Should rate fairness depend on the packet rate or the bit rate? - Should rate fairness depend on the packet rate or the bit rate?
- Should the flow rate depend on RTT (as in TCP) or should only flow - Should the flow rate depend on RTT (as in TCP) or should only flow
dynamics depend on RTT (e.g. as in FAST TCP [Jin04])? dynamics depend on RTT (e.g. as in FAST TCP [Jin04])?
- How to estimate whether a particular flow start strategy is fair, - How to estimate whether a particular flow start strategy is fair,
or whether a particular fast recovery strategy after a reduction in or whether a particular fast recovery strategy after a reduction in
rate due to congestion is fair? rate due to congestion is fair?
- How should we judge what is reasonably fair if an application - Should we judge what is reasonably fair if an application needs,
needs, for example, even smoother flows than TFRC, or it needs to for example, even smoother flows than TFRC, or it needs to
burst occasionally, or with any other application behavior? burst occasionally, or with any other application behavior?
- During brief congestion bursts (e.g. due to new flow arrivals) how - During brief congestion bursts (e.g. due to new flow arrivals) how
to judge at what point it becomes unfair for some flows to continue to judge at what point it becomes unfair for some flows to continue
at a smooth rate while others reduce their rate? at a smooth rate while others reduce their rate?
- Which mechanism(s) should be used to enforce approximate flow rate - Which mechanism(s) should be used to enforce approximate flow rate
fairness? fairness?
- How can we introduce some degree of fairness that takes account of - Should we introduce some degree of fairness that takes account of
flow duration? different users' flow activity over time?
- How to judge the fairness of applications using a large number of - How to judge the fairness of applications using a large number of
flows over separate paths (e.g. via an overlay)? flows over separate paths (e.g. via an overlay)?
If we assume cost fairness as a goal with congestion volume as the If we assume cost fairness as a goal with congestion volume as the
metric, open issues would be: metric, open issues would be:
- Can one application's sensitivity to instantaneous congestion - Can one application's sensitivity to instantaneous congestion
really be protected by longer-term accountability of competing really be protected by longer-term accountability of competing
applications? applications?
- Which protocol mechanism(s) should give accountability for causing - Which protocol mechanism(s) are needed to give accountability for
congestion? causing congestion?
- How to design one or two generic transport protocols (such as - How to design one or two generic transport protocols (such as
TCP, UDP, etc.) with the addition of application policy control? TCP, UDP, etc.) with the addition of application policy control?
- Which policy enforcement should be used by networks and which - Which policy enforcement should be used by networks and what are
interactions between application policy and network policy the interactions between application policy and network policy
enforcement? enforcement?
- How to design a new policy enforcement framework that will - How to design a new policy enforcement framework that will
appropriately compete with existing flows aiming for rate equality appropriately compete with existing flows aiming for rate equality
(e.g. TCP)? (e.g. TCP)?
The question of how to reason about fairness is a pre-requisite to The question of how to reason about fairness is a pre-requisite to
agreeing on the research agenda. However, that question does not agreeing on the research agenda. If the relevant metric is flow-rate
require more research in itself, it is merely a debate that needs to it places constraints at protocol design-time, whereas if the metric
be resolved by studying existing research and by assessing how bad is congestion volume the constraints move to run-time, while design-
fairness problems could become if they are not addressed rigorously. time constraints can be relaxed [Bri08]. However, that question does
not require more research in itself, it is merely a debate that needs
to be resolved by studying existing research and by assessing how bad
fairness problems could become if they are not addressed rigorously,
and whether we can rely on trust to maintain approximate fairness
without requiring policing complexity [Floyd08]. The latter points
may themselves lead to additional research. However, it is also
accepted that more research will not necessarily convince
either side to change their opinions. More debate would be needed. It
seems also that if an architecture is built to support cost-fairness
then equal-costs result in flow-rate fairness as a degenerate case;
that is, flow-rate fairness can be seen as a special case of cost-
fairness. One can be used to build the other, but not vice-versa.
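
The difference between the two metrics can be illustrated with a
small sketch. It assumes that congestion volume is approximated per
measurement interval as the bytes sent multiplied by the fraction of
packets dropped or marked in that interval. The function name and the
sample figures are hypothetical and serve only as an illustration.

```python
def congestion_volume(samples):
    """Sum of bytes sent weighted by the congestion level they met.

    samples: iterable of (bytes_sent, congestion_fraction) pairs, one
    pair per measurement interval (an assumed input format).
    """
    return sum(sent * fraction for sent, fraction in samples)


# Two hypothetical users transfer the same 3 MB in total.
# User A sends steadily, including during a congested interval.
user_a = [(1_000_000, 0.0), (1_000_000, 0.01), (1_000_000, 0.0)]
# User B shifts the whole transfer into an uncongested interval.
user_b = [(3_000_000, 0.0)]

volume_a = congestion_volume(user_a)
volume_b = congestion_volume(user_b)
```

Both users move the same number of bytes, but only user A contributes
to congestion. A flow-rate metric does not register the difference
between the two transfer patterns, whereas congestion volume does.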
3. Detailed Challenges 3. Detailed Challenges
3.1 Challenge 1: Router Support 3.1 Challenge 1: Network Support
Routers can be involved in congestion control in two ways: first, This challenge is the most critical to get right. Changes to the
they can implicitly optimize their functions, such as queue balance of functions between the endpoints and network equipment
could require a change to the per-datagram data plane interface
between the transport and network layers. Network equipment vendors
need to be assured that any new interface is stable enough (on decade
timescales) to build into firmware and hardware, and OS vendors will
not use a new interface unless it is likely to be widely deployed.
Network components can be involved in congestion control in two ways:
first, they can implicitly optimize their functions, such as queue
management and scheduling strategies, in order to support the management and scheduling strategies, in order to support the
operation of an end-to-end congestion control. Second, routers can operation of an end-to-end congestion control. Second, network
participate in congestion control via explicit notification components can participate in congestion control via explicit
mechanisms. notification mechanisms. Explicit notification mechanisms require a
communication between network components and end-systems. In the
Internet, network interconnection is realized at the IP layer. As a
consequence, notification signals can only be realized within the IP
layer or in higher protocol layers. Only network components that
process IP packets can trigger such notifications. This includes
routers and potentially also middleboxes, but not pure link layer
devices. The following sections distinguish clearly between the term
"network component" and the term "router"; the term "router" is used
whenever the processing of IP packets is explicitly required. One
fundamental challenge of network supported congestion control is that
typically not all network components along a path are routers (cf.
Section 3.1.3).
In the first category, various approaches have been proposed and also The first category of implicit mechanisms can be implemented in any
deployed, such as different AQM techniques. Even though these network component that processes and stores packets. Various
implicit techniques are known to improve network performance during approaches have been proposed and also deployed, such as different
congestion phases, they are still only partly deployed in the AQM techniques. Even though these implicit techniques are known to
Internet. This may be due to the fact that finding optimal and robust improve network performance during congestion phases, they are still
parameterizations for these mechanisms is a non-trivial problem. only partly deployed in the Internet. This may be due to the fact
Indeed, the problem with various AQM schemes is the difficulty to that finding optimal and robust parameterizations for these
identify correct values of the parameter set that affects the mechanisms is a non-trivial problem. Indeed, the problem with various
performance of the queuing scheme (due to variation in the number of AQM schemes is the difficulty to identify correct values of the
sources, the capacity and the feedback delay) [Fioriu00] [Hollot01] parameter set that affects the performance of the queuing scheme (due
[Zhang03]. Many AQM schemes (RED, REM, BLUE, PI-Controller but also to variation in the number of sources, the capacity and the feedback
Adaptive Virtual Queue (AVQ)) do not define a systematic rule for delay) [Fioriu00] [Hollot01] [Zhang03]. Many AQM schemes (RED, REM,
setting their parameters. BLUE, PI-Controller but also Adaptive Virtual Queue (AVQ)) do not
define a systematic rule for setting their parameters.
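
The parameterization problem can be made concrete with a minimal
sketch of the basic RED marking rule. It is a simplification: the
count-based spacing of marks and the "gentle" variant are omitted,
and the parameter values below are illustrative assumptions, not
recommendations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedParams:
    min_th: float  # lower threshold on the average queue (packets)
    max_th: float  # upper threshold on the average queue (packets)
    max_p: float   # marking probability reached at max_th
    w_q: float     # weight of the moving average


def update_average(avg, queue_len, params):
    """Exponentially weighted moving average of the queue length."""
    return (1.0 - params.w_q) * avg + params.w_q * queue_len


def mark_probability(avg, params):
    """Basic RED: linear ramp between the two thresholds."""
    if avg < params.min_th:
        return 0.0
    if avg >= params.max_th:
        return 1.0
    return params.max_p * (avg - params.min_th) / (params.max_th - params.min_th)


def final_average(queue_trace, params):
    """Run the averaging filter over a trace of queue samples."""
    avg = 0.0
    for queue_len in queue_trace:
        avg = update_average(avg, queue_len, params)
    return avg


# Same queue trace (idle, then a burst), two averaging weights.
trace = [0] * 50 + [30] * 50
slow = RedParams(min_th=5, max_th=15, max_p=0.1, w_q=0.002)
fast = RedParams(min_th=5, max_th=15, max_p=0.1, w_q=0.2)
p_slow = mark_probability(final_average(trace, slow), slow)
p_fast = mark_probability(final_average(trace, fast), fast)
```

With the small weight the average is still below min_th at the end of
the burst, so nothing is marked. With the large weight the average is
already above max_th, so every packet is marked. The appropriate
setting depends on the traffic mix, the capacity and the feedback
delay.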
By using explicit feedback from the network, connection endpoints can The second class of approaches uses explicit notification. By using
obtain more accurate information about the current network explicit feedback from the network, connection endpoints can obtain
characteristics on the path. This allows endpoints to make more more accurate information about the current network characteristics
precise decisions that can better prevent packet loss and that can on the path. This allows endpoints to make more precise decisions
also improve fairness among different flows. Examples for explicit that can better prevent packet loss and that can also improve rate
router feedback include Explicit Congestion Notification (ECN) equality among different flows.
[RFC3168], Quick-Start [RFC4782], the eXplicit Control Protocol (XCP)
[Katabi02] [Falk07], the Rate Control Protocol (RCP) [Dukk06], and Explicit feedback techniques fall into three broad categories:
o Explicit congestion feedback: whether one bit Explicit Congestion
Notification (ECN) [RFC3168] or proposals for more than one bit
[Xia05];
o Explicit per-datagram rate feedback: the eXplicit Control Protocol
(XCP) [Katabi02] [Falk07], the Rate Control Protocol (RCP)
[Dukki05];
o Explicit rate feedback: by in-band signaling, such as by
Quick-Start [RFC4782], or by means of out-of-band signaling, e.g.
CADPC/PTP [Welzl03]. CADPC/PTP [Welzl03].
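
As an illustration of the first category, the following sketch shows
how a router might signal congestion on a single packet using the
two-bit ECN field defined in [RFC3168] (the two low-order bits of the
former IPv4 TOS octet). The function names are hypothetical, and the
decision of when to signal (e.g. by an AQM algorithm) is outside the
scope of the sketch.

```python
# ECN codepoints from RFC 3168.
NOT_ECT = 0b00  # transport is not ECN-capable
ECT_1 = 0b01    # ECN-capable transport, codepoint ECT(1)
ECT_0 = 0b10    # ECN-capable transport, codepoint ECT(0)
CE = 0b11       # congestion experienced

ECN_MASK = 0b11


def ecn_field(tos_octet):
    """Extract the ECN field from the TOS / traffic class octet."""
    return tos_octet & ECN_MASK


def signal_congestion(tos_octet):
    """Return (new_tos_octet, dropped) for a packet chosen by the AQM.

    ECN-capable packets are marked CE, all others are dropped.
    """
    if ecn_field(tos_octet) == NOT_ECT:
        return tos_octet, True
    return tos_octet | CE, False
```

For example, signal_congestion(0b10) yields (0b11, False): the packet
is forwarded with a CE mark instead of being dropped. The upper six
bits (the DiffServ codepoint) are left unchanged.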
Explicit router feedback can address some of the inherent Explicit router feedback can address some of the inherent
shortcomings of TCP. For instance, XCP has been developed to overcome shortcomings of TCP. For instance, XCP was developed to overcome the
the inefficiency, unfairness and instability that TCP suffers from inefficiency, unfairness and instability that TCP suffers from when
when the per-flow bandwidth-delay product increases. By decoupling the per-flow bandwidth-delay product increases. By decoupling
resource utilization/congestion control from fairness control, XCP resource utilization/congestion control from fairness control, XCP
achieves fair bandwidth allocation, high utilization, a small achieves fair bandwidth allocation, high utilization, a small
standing queue size, and near-zero packet drops, with both steady and standing queue size, and near-zero packet drops, with both steady and
highly varying traffic. Importantly, XCP does not maintain any per- highly varying traffic. Importantly, XCP does not maintain any per-
flow state in routers and requires few CPU cycles per packet, hence flow state in routers and requires few CPU cycles per packet, hence
making it potentially applicable in high-speed routers. However, XCP making it potentially applicable in high-speed routers. However, XCP
is still subject to research: as [Andrew05] has pointed out, XCP is is still subject to research: as [Andrew05] has pointed out, XCP is
locally stable but globally unstable when the maximum RTT of a flow locally stable but globally unstable when the maximum RTT of a flow
is much larger than the mean RTT. This instability can be removed by is much larger than the mean RTT. This instability can be removed by
changing the update strategy for the estimation interval, but this changing the update strategy for the estimation interval, but this
makes the system vulnerable to erroneous RTT advertisements. The makes the system vulnerable to erroneous RTT advertisements. The
authors of [PAP02] have shown that, when flows with different RTTs authors of [PAP02] have shown that, when flows with different RTTs
are applied, XCP sometimes discriminates among heterogeneous traffic are applied, XCP sometimes discriminates among heterogeneous traffic
flows, even if XCP is generally fair to different flows. [Low05] flows, even if XCP generally equalizes rate among different flows.
provides for a complete characterization of the XCP equilibrium [Low05] provides for a complete characterization of the XCP
properties. equilibrium properties.
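
The efficiency part of this decoupling can be sketched as follows.
The sketch assumes the aggregate feedback rule and the constants
alpha = 0.4 and beta = 0.226 given in [Katabi02]. The function and
argument names are hypothetical, and the fairness controller that
distributes the aggregate feedback among packets is omitted.

```python
ALPHA = 0.4    # gain on spare bandwidth (constant from [Katabi02])
BETA = 0.226   # gain on persistent queue (constant from [Katabi02])


def xcp_aggregate_feedback(avg_rtt_s, capacity_Bps, input_rate_Bps,
                           persistent_queue_B):
    """Aggregate feedback in bytes for one control interval.

    A positive value asks the senders, in aggregate, to increase
    their windows; a negative value asks them to decrease.
    """
    spare_Bps = capacity_Bps - input_rate_Bps
    return ALPHA * avg_rtt_s * spare_Bps - BETA * persistent_queue_B


# Underused link, empty queue: the feedback is positive.
grow = xcp_aggregate_feedback(0.1, 1_000_000, 800_000, 0)
# Same load but a standing queue of 50 kB: the feedback is negative.
shrink = xcp_aggregate_feedback(0.1, 1_000_000, 800_000, 50_000)
```

The average RTT enters the rule directly, which is one reason why the
choice of the estimation interval and the trustworthiness of the RTT
values advertised by senders matter for stability.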
Several other explicit router feedback schemes have been developed Several other explicit router feedback schemes have been developed
with different design objectives. For instance, RCP uses a per-packet with different design objectives. For instance, RCP uses per-packet
feedback similar to XCP. Different to XCP, RCP focuses on the feedback similar to XCP. But unlike XCP, RCP focuses on the reduction
reduction of flow completion times and therefore tolerates larger of flow completion times [Dukki06], taking an optimistic approach to
instantaneous queue sizes [Dukk06]. flows likely to arrive in the next RTT and tolerating larger
instantaneous queue sizes [Dukki05]. XCP on the other hand gives very
poor flow completion times for short flows.
Both implicit and explicit router support should be considered in the Both implicit and explicit router support should be considered in the
context of the end-to-end argument [Saltzer84], which is one of the context of the end-to-end argument [Saltzer84], which is one of the
key design principles of the Internet. It suggests that functions key design principles of the Internet. It suggests that functions
that can be realized both in the end-systems and in the network that can be realized both in the end-systems and in the network
should be implemented in the end-systems. This principle ensures that should be implemented in the end-systems. This principle ensures that
the network provides a general service and that remains as simple as the network provides a general service and that remains as simple as
possible (any additional complexity is placed above the IP layer, possible (any additional complexity is placed above the IP layer,
i.e., at the edges) so as to ensure reliability and robustness. In i.e., at the edges) so as to ensure evolvability, reliability and
particular, this means that Internet protocols should not rely on the robustness. Furthermore, the fate-sharing principle, enunciated by
maintenance of applicative state (i.e., information about the state Dave Clark in "Design Philosophy of the DARPA Internet Protocols",
of the end-to-end communication) inside the network [RFC1958] and mandates that an end-to-end Internet protocol design should not rely
that the network state (e.g. routing state) maintained by the on the maintenance of any per-flow state (i.e., information about the
state of the end-to-end communication) inside the network [RFC1958]
and that the network state (e.g. routing state) maintained by the
Internet shall minimize its interaction with the states maintained at Internet shall minimize its interaction with the states maintained at
the end-points/hosts. the end-points/hosts.
However, as discussed for instance in [Moors02], congestion control However, as discussed for instance in [Moors02], congestion control
cannot be realized as a pure end-to-end function only. Congestion is cannot be realized as a pure end-to-end function only. Congestion is
an inherent network phenomenon and can only be resolved efficiently an inherent network phenomenon and can only be resolved efficiently
by some cooperation of end-systems and the network. Congestion by some cooperation of end-systems and the network. Congestion
control in today's Internet protocols follows the end-to-end design control in today's Internet protocols follows the end-to-end design
principle insofar as only minimal feedback from the network is used principle insofar as only minimal feedback from the network is used
(e. g., packet loss and delay). The end-systems only decide how to (e. g., packet loss and delay). The end-systems only decide how to
react and how to avoid congestion. The crux is that, on the one hand, react and how to avoid congestion. The crux is that, on the one hand,
there would be substantial benefit by further assistance from the there would be substantial benefit by further assistance from the
network, but, on the other hand, such router support could lead to network, but, on the other hand, such router support could lead to
duplication of functions, which might even harmfully interact with duplication of functions, which might even harmfully interact with
end-to-end protocol mechanisms. The different requirements of end-to-end protocol mechanisms. The different requirements of
applications (cf. the fairness discussion in Section 2.3) call for a applications (cf. the fairness discussion in Section 2.3) call for a
variety of different congestion control approaches, but putting such variety of different congestion control approaches, but putting such
application-specific behavior inside the network should be avoided, per-flow behavior inside the network should be avoided, as such
as such design would clearly be at odds with the end-to-end design design would clearly be at odds with the end-to-end and fate sharing
principle. design principles.
The end-to-end argument is generally regarded as a key ingredient for The end-to-end and fate sharing principles are generally regarded as
ensuring a scalable network design. In order to ensure that new the key ingredients for ensuring a scalable and survivable network
congestion control mechanisms are scalable, violating this principle design. In order to ensure that new congestion control mechanisms are
must therefore be avoided. scalable, violating these principles must therefore be avoided.
In general, router support raises many issues that have not been In general, network support of congestion control raises many issues
completely solved yet. that have not been completely solved yet.
3.1.1 Performance and robustness 3.1.1 Performance and robustness
Congestion control is subject to some tradeoffs: on one hand, it must Congestion control is subject to some tradeoffs: on one hand, it must
allow high link utilizations and fair resource sharing but on the allow high link utilizations and fair resource sharing but on the
other hand, the algorithms must also be robust and conservative in other hand, the algorithms must also be robust in particular during
particular during congestion phases. congestion phases.
Router support can help to improve performance and fairness, but it Router support can help to improve performance but it can also result
can also result in additional complexity and more control loops. This in additional complexity and more control loops. This requires a
requires a careful design of the algorithms in order to ensure careful design of the algorithms in order to ensure stability and
stability and avoid e.g. oscillations. A further challenge is the avoid e.g. oscillations. A further challenge is the fact that
fact that information may be imprecise. For instance, severe information may be imprecise. For instance, severe congestion can
congestion can delay feedback signals. Also, the measurement of delay feedback signals. Also, in-network measurement of parameters
parameters such as RTTs or data rates may contain estimation errors. such as RTTs or data rates may contain estimation errors. Even though
Even though there has been significant progress in providing there has been significant progress in providing fundamental
fundamental theoretical models for such effects, research has not theoretical models for such effects, research has not completely
completely explored the whole problem space yet. explored the whole problem space yet.
Open questions are: Open questions are:
- How much can routers theoretically improve performance in the - How much can network elements theoretically improve performance in
complete range of communication scenarios that exists in the the complete range of communication scenarios that exists in the
Internet without damaging or impacting end-to-end mechanisms Internet without damaging or impacting end-to-end mechanisms
already in place? already in place?
- Is it possible to design robust mechanisms that offer significant - Is it possible to design robust mechanisms that offer significant
benefits without additional risks? benefits with minimum additional risks?
- What is the minimum support that is needed from routers in order - What is the minimum support that is needed from the network in
to achieve significantly better performance than with end-to-end order to achieve significantly better performance than with
mechanisms? end-to-end mechanisms and the current IP header limitations that
provide at most unary ECN signals?
3.1.2 Granularity of router functions 3.1.2 Granularity of network component functions
There are several degrees of freedom concerning router involvement, There are several degrees of freedom concerning the involvement of
ranging from some few additional functions in network management network entities, ranging from some few additional functions in
procedures one the one end, and additional per packet processing on network management procedures on the one end, and additional per
the other end of the solution space. Furthermore, different amounts packet processing on the other end of the solution space.
of state can be kept in routers (no per-flow state, partial per-flow Furthermore, different amounts of state can be kept in routers (no
state, soft state, hard state). The additional router processing is a per-flow state, partial per-flow state, soft state, hard state). The
challenge for Internet scalability and could also increase end-to-end additional router processing is a challenge for Internet scalability
latencies. and could also increase end-to-end latencies.
There are many solutions that do not require per-flow state and thus There are many solutions that do not require per-flow state and thus
do not cause a large processing overhead. However, scalability issues do not cause a large processing overhead. However, scalability issues
could also be caused, for instance, by synchronization mechanisms for could also be caused, for instance, by synchronization mechanisms for
state information among parallel processing entities, which are e. g. state information among parallel processing entities, which are e. g.
used in high-speed router hardware designs. used in high-speed router hardware designs.
Open questions are: Open questions are:
- What granularity of router processing can be realized without - What granularity of router processing can be realized without
affecting Internet scalability? affecting Internet scalability?
- How can additional processing efforts be kept at a minimum? - How can additional processing efforts be kept at a minimum?
3.1.3 Information acquisition 3.1.3 Information acquisition
In order to support congestion control, routers have to obtain at In order to support congestion control, network components have to
least a subset of the following information. Obtaining that obtain at least a subset of the following information. Obtaining that
information may result in complex tasks. information may result in complex tasks.
1. Capacity of (outgoing) links 1. Capacity of (outgoing) links
Link characteristics depend on the realization of lower protocol Link characteristics depend on the realization of lower protocol
layers. Routers do not necessarily know the link layer network layers. Routers operating at IP layer do not necessarily know the
topology and link capacities, and these are not always constant (e. link layer network topology and link capacities, and these are not
g., on shared wireless links). Depending on the network technology, always constant (e. g., on shared wireless links or bandwidth-on-
there can be queues or bottlenecks that are not directly visible at demand links). Depending on the network technology, there can be
the IP layer. Difficulties also arise when using IP-in-IP tunnels queues or bottlenecks that are not directly visible at the IP layer.
[RFC 2003] or MPLS [RFC3031] [RFC3032]. In these cases, link
information could be determined by cross-layer information exchange, Difficulties also arise when using IP-in-IP tunnels [RFC 2003], IPsec
but this requires link layer technology specific interfaces. An tunnels [RFC4301], IP encapsulated in L2TP [RFC2661], GRE [RFC1701],
alternative could be online measurements, but this can cause PPTP [RFC2637] or MPLS [RFC3031] [RFC3032] [RFC5129]. In these cases,
significant additional network overhead. link information could be determined by cross-layer information
exchange, but this requires link layer technology specific
interfaces. An alternative could be online measurements, but this can
cause significant additional network overhead. General guidelines for
encapsulation and decapsulation of explicit congestion information
are currently in preparation [ECN-tunnel].
2. Traffic carried over (outgoing) links 2. Traffic carried over (outgoing) links
Accurate online measurement of data rates is challenging when traffic Accurate online measurement of data rates is challenging when traffic
is bursty. For instance, measuring a "current link load" requires is bursty. For instance, measuring a "current link load" requires
defining the right measurement interval/sampling interval. This is a defining the right measurement interval/sampling interval. This is a
challenge for proposals that require knowledge e.g. about the current challenge for proposals that require knowledge e.g. about the current
link utilization. link utilization.
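
The effect of the measurement interval can be seen in a small sketch.
It computes link utilization per interval from a list of packet
arrivals. The function name, the input format and the traffic pattern
are assumptions made for illustration only.

```python
def utilization(arrivals, capacity_bps, interval_s, duration_s):
    """Per-interval utilization of a link.

    arrivals: list of (timestamp_s, size_bytes) tuples.
    Returns one utilization value (1.0 = fully loaded) per interval.
    """
    n = int(round(duration_s / interval_s))
    bits = [0.0] * n
    for timestamp_s, size_bytes in arrivals:
        index = min(int(timestamp_s / interval_s), n - 1)
        bits[index] += size_bytes * 8
    return [b / (capacity_bps * interval_s) for b in bits]


# A burst of ten 1250-byte packets within the first 10 ms,
# followed by silence, on a 10 Mbit/s link observed for 1 s.
burst = [(i * 0.001, 1250) for i in range(10)]
coarse = utilization(burst, 10_000_000, 1.0, 1.0)
fine = utilization(burst, 10_000_000, 0.01, 1.0)
```

Measured over one second, the link appears about 1% loaded. Measured
over 10 ms intervals, the first interval is fully loaded and all
others are idle. Neither value is wrong; which one is useful depends
on the time scale at which the feedback mechanism operates.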
3. Internal buffer statistics 3. Internal buffer statistics
Some proposals use buffer statistics such as a virtual queue length Some proposals use buffer statistics such as a virtual queue length
to trigger feedback. However, routers can include multiple to trigger feedback. However, network components can include multiple
distributed buffer stages that make it difficult to obtain such distributed buffer stages that make it difficult to obtain such
metrics. metrics.
Open questions are: Can and should this information be made Open questions are: Can and should this information be made
available, e.g., by additional interfaces or protocols? available, e.g., by additional interfaces or protocols?
3.1.4 Feedback signaling 3.1.4 Feedback signaling
Explicit notification mechanisms can be realized either by in-band Explicit notification mechanisms can be realized either by in-band
signaling (notifications piggybacked along with the data traffic) or signaling (notifications piggybacked along with the data traffic) or
by out-of-band signaling. The latter case requires additional by out-of-band signaling [Sarola07]. The latter case requires
protocols and can be further subdivided into path-coupled and path- additional protocols and a secure binding between the signals and the
decoupled approaches. packets they refer to. Out-of-band signaling can be further
subdivided into path-coupled and path-decoupled approaches.
Open questions concerning feedback signaling include: Open questions concerning feedback signaling include:
- At which protocol layer should the feedback signaling occur - At which protocol layer should the feedback signaling occur
(IP/network layer assisted, transport layer assisted, hybrid (IP/network layer assisted, transport layer assisted, hybrid
solutions, shim layer, intermediate sub-layer, etc.) ? solutions, shim layer, intermediate sub-layer, etc.)? Should the
feedback signaling be path-coupled or path-decoupled?
- What is the optimal frequency of feedback (only in case of - What is the optimal frequency of feedback (only in case of
congestion events, per RTT, per packet, etc.)? congestion events, per RTT, per packet, etc.)?
- What direction should feedback take (from resource via receiver to
sender, or directly back to sender)?
3.2 Challenge 2: Corruption Loss 3.2 Challenge 2: Corruption Loss
It is common for congestion control mechanisms to interpret packet It is common for congestion control mechanisms to interpret packet
loss as a sign of congestion. This is appropriate when packets are loss as a sign of congestion. This is appropriate when packets are
dropped in routers because of a queue that overflows, but there are dropped in routers because of a queue that overflows, but there are
other possible reasons for packet drops. In particular, in wireless other possible reasons for packet drops. In particular, in wireless
networks, packets can be dropped because of corruption, rendering the networks, packets can be dropped because of corruption, rendering the
typical reaction of a congestion control mechanism inappropriate. typical reaction of a congestion control mechanism inappropriate.
TCP over wireless and satellite is a topic that has been investigated TCP over wireless and satellite is a topic that has been investigated
for a long time [Krishnan04]. There are some proposals where the for a long time [Krishnan04]. There are some proposals where the
congestion control mechanism would react as if a packet had not been congestion control mechanism would react as if a packet had not been
dropped in the presence of corruption (cf. TCP HACK [BALAN01]), but dropped in the presence of corruption (cf. TCP HACK [Balan01]), but
discussions in the IETF have shown that there is no agreement that discussions in the IETF have shown that there is no agreement that
this type of reaction is appropriate. For instance, it has been said this type of reaction is appropriate. For instance, it has been said
that congestion can manifest itself as corruption on shared wireless that congestion can manifest itself as corruption on shared wireless
links, and it is questionable whether a source that sends packets links, and it is questionable whether a source that sends packets
that are continuously impaired by link noise should keep sending at a that are continuously impaired by link noise should keep sending at a
high rate. high rate because it has lost the integrity of the feedback loop.
Generally, two questions must be addressed when designing congestion Generally, two questions must be addressed when designing congestion
control mechanisms that take corruption into account: control mechanisms that take corruption into account:
1. How is corruption detected? 1. How is corruption detected?
2. What should be the reaction? 2. What should be the reaction?
In addition to question 1 above, it may be useful to consider In addition to question 1 above, it may be useful to consider
detecting the reason for corruption, but this has not yet been done detecting the reason for corruption, but this has not yet been done
skipping to change at page 14, line 6 skipping to change at page 15, line 36
checksum does not show an error, it is possible for errors to be checksum does not show an error, it is possible for errors to be
found in the payload using a second checksum. Such error detection is found in the payload using a second checksum. Such error detection is
possible with UDP-Lite and DCCP; it was found to work well over a possible with UDP-Lite and DCCP; it was found to work well over a
GPRS network in a study [Chester04] and poorly over a WiFi network in GPRS network in a study [Chester04] and poorly over a WiFi network in
another study [Rossi06] [Welzl08]. Note that, while UDP-Lite and DCCP another study [Rossi06] [Welzl08]. Note that, while UDP-Lite and DCCP
enable the detection of corruption, the specifications of these enable the detection of corruption, the specifications of these
protocols do not foresee any specific reaction to it for the time protocols do not foresee any specific reaction to it for the time
being. being.
The idea of having a transport endpoint detect and accordingly react The idea of having a transport endpoint detect and accordingly react
to corruption poses a number of interesting questions regarding (or not) to corruption poses a number of interesting questions
cross-layer interactions. As IP is designed to operate over arbitrary regarding cross-layer interactions. As IP is designed to operate over
link layers, it is therefore difficult to design a congestion control arbitrary link layers, it is therefore difficult to design a
mechanism on top of it, which appropriately reacts to corruption - congestion control mechanism on top of it, which appropriately reacts
especially as the specific data link layers that are in use along an to corruption - especially as the specific data link layers that are
end-to-end path are typically unknown to entities at the transport in use along an end-to-end path are typically unknown to entities at
layer. the transport layer.
While the IETF has not yet specified how a congestion control While the IETF has not yet specified how a congestion control
mechanism should react to corruption, proposals exist in the mechanism should react to corruption, proposals exist in the
literature. For instance, TCP Westwood sets the congestion window literature. For instance, TCP Westwood sets the congestion window
equal to the measured bandwidth at time of congestion in response to equal to the measured bandwidth at time of congestion in response to
three DupACKs or a timeout. This measurement is obtained by counting three DupACKs or a timeout. This measurement is obtained by counting
and filtering the ACK rate. This setting provides a significant and filtering the ACK rate. This setting provides a significant
goodput improvement in noisy channels because the "blind" by half goodput improvement in noisy channels because the "blind" by half
window reduction of standard TCP is avoided, i.e. the window is not window reduction of standard TCP is avoided, i.e. the window is not
reduced by too much [Mascolo01]. reduced by too much [Mascolo01].
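
As a rough illustration of the Westwood idea described above, the
following Python sketch turns a bandwidth estimate into a window after
a loss and compares it with the by-half reduction. The function names,
the use of the minimum RTT to convert a rate into a window, and the
two-segment floor are assumptions made for this sketch; the counting
and filtering of the ACK rate that produces the estimate is not
modelled here.

```python
def westwood_window(bandwidth_est_bytes_per_s, rtt_min_s, segment_bytes):
    """Window (in segments) matching the estimated bandwidth.

    Assumption: the rate is converted to a window by multiplying it
    with the minimum RTT; a floor of 2 segments is also assumed.
    """
    segments = int(bandwidth_est_bytes_per_s * rtt_min_s / segment_bytes)
    return max(2, segments)


def halved_window(cwnd_segments):
    """The "blind" by-half reduction of standard TCP (floor of 2 assumed)."""
    return max(2, cwnd_segments // 2)


# Hypothetical numbers: 10 Mbit/s estimate, 100 ms RTT, 1460-byte
# segments, and a window of 100 segments when a corruption loss occurs.
after_westwood = westwood_window(1_250_000, 0.1, 1460)
after_halving = halved_window(100)
```

With these hypothetical numbers the estimate-based window stays close
to what the path can carry, while halving gives up half of the window
regardless of the cause of the loss.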
Open questions concerning corruption loss include: Open questions concerning corruption loss include:
- How should corruption loss be detected? - How should corruption loss be detected?
- How should a source react when it is known that corruption has - How should a source react when it is known that corruption has
occurred? occurred?
3.3 Challenge 3: Small Packets - Can an ECN-capable flow infer that loss must be due to corruption
just from lack of explicit congestion notifications around a loss
episode [LT-TCP]? Or could this inference be dangerous given the
transport doesn't know whether queues on the path are all ECN-
capable?
Over past years, the performance of TCP congestion avoidance 3.3 Challenge 3: Packet Sizes
TCP does not take packet size into account when responding to losses
or ECN. Over past years, the performance of TCP congestion avoidance
algorithms has been extensively studied. The well known "square root algorithms has been extensively studied. The well known "square root
formula" provides the performance of the TCP congestion avoidance formula" provides the performance of the TCP congestion avoidance
algorithm for TCP Reno [RFC2581]. [Padhye98] enhances the model to algorithm for TCP Reno [RFC2581]. [Padhye98] enhances the model to
account for timeouts, receiver window, and delayed ACKs. account for timeouts, receiver window, and delayed ACKs.
For the sake of the present discussion, we will assume that the TCP For the sake of the present discussion, we will assume that the TCP
throughput is expressed using the simplified formula. Using this throughput is expressed using the simplified formula. Using this
formula, the TCP throughput is proportional to the packet size and formula, the TCP throughput is proportional to the segment size and
inversely proportional to the RTT and the square root of the drop inversely proportional to the RTT and the square root of the drop
probability: probability:
         MSS     1                               S      1
   B ~ C --- -------                     B ~ C --- -------
         RTT sqrt(p)                           RTT sqrt(p)

   where                                 where,

   MSS is the TCP segment size           S is the TCP segment size
   (in bytes)                            (in bytes)
   RTT is the end-to-end round trip      RTT is the end-to-end round trip
   time of the TCP connection            time of the TCP connection
   (in seconds)                          (in seconds)
   p is the packet drop probability      p is the packet drop probability
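
The simplified formula above can be evaluated directly. The Python
sketch below is only an illustration: the text leaves the constant C
unspecified, so the value sqrt(3/2) used here is an assumption, as are
the function name and the sample numbers.

```python
import math

# Assumed constant: sqrt(3/2) is a value commonly quoted for the
# simplified model; the formula in the text leaves C unspecified.
C = math.sqrt(3.0 / 2.0)


def tcp_throughput(segment_bytes, rtt_seconds, drop_probability):
    """Approximate throughput B in bytes/s: C * S / (RTT * sqrt(p))."""
    if rtt_seconds <= 0 or not (0 < drop_probability <= 1):
        raise ValueError("need RTT > 0 and 0 < p <= 1")
    return C * segment_bytes / (rtt_seconds * math.sqrt(drop_probability))


# Same path (RTT = 100 ms, p = 1%), two hypothetical segment sizes.
small = tcp_throughput(536, 0.1, 0.01)
large = tcp_throughput(1460, 0.1, 0.01)
ratio = large / small  # equals 1460 / 536: B scales linearly with S
```

Whatever value is chosen for C, the ratio between the two flows
depends only on the segment sizes, which is the linear dependence the
rest of this section is concerned with.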
Neglecting the fact that the TCP rate linearly depends on it, Neglecting the fact that the TCP rate linearly depends on it,
choosing the ideal packet size is a trade-off between high throughput choosing the ideal packet size is a trade-off between high throughput
(the larger a packet, the smaller the relative header overhead) and (the larger a packet, the smaller the relative header overhead) and
low delay (the smaller a packet, the shorter the time that is needed low delay (the smaller a packet, the shorter the time that is needed
until it is filled with data). Observing that TCP is not suited for until it is filled with data). Observing that TCP is not suited for
applications such as streaming media (since reliable in-order applications such as streaming media (since reliable in-order
delivery and congestion control can cause arbitrarily long delays), delivery and congestion control can cause arbitrarily long delays),
this trade-off has not usually been considered for TCP applications, this trade-off has not usually been considered for TCP applications,
and the influence of the packet size on the sending rate is not and the influence of the packet size on the sending rate has not
typically seen as a significant issue. typically been seen as a significant issue, given there are still few
paths through the Internet that support packets larger than the 1500B
common with Ethernet.
The situation is different for the Datagram Congestion Control The situation is already different for the Datagram Congestion
Protocol (DCCP) [RFC4340], which has been designed to enable Control Protocol (DCCP) [RFC4340], which has been designed to enable
unreliable but congestion-controlled datagram transmission, avoiding unreliable but congestion-controlled datagram transmission, avoiding
the arbitrary delays associated with TCP. DCCP is intended for the arbitrary delays associated with TCP. DCCP is intended for
applications such as streaming media that can benefit from control applications such as streaming media that can benefit from control
over the tradeoffs between delay and reliable in-order delivery. over the tradeoffs between delay and reliable in-order delivery.
DCCP provides for a choice of modular congestion control mechanisms. DCCP provides for a choice of modular congestion control mechanisms.
DCCP uses Congestion Control Identifiers (CCIDs) to specify the DCCP uses Congestion Control Identifiers (CCIDs) to specify the
congestion control mechanism. Three profiles are currently specified: congestion control mechanism. Three profiles are currently specified:
- DCCP Congestion Control ID 2 (CCID 2) [RFC4341]: - DCCP Congestion Control ID 2 (CCID 2) [RFC4341]:
TCP-like Congestion Control. CCID 2 sends data using a close TCP-like Congestion Control. CCID 2 sends data using a close
variant of TCP's congestion control mechanisms, incorporating a variant of TCP's congestion control mechanisms, incorporating a
variant of SACK [RFC2018, RFC3517]. CCID 2 is suitable for senders variant of SACK [RFC2018, RFC3517]. CCID 2 is suitable for senders
who can adapt to the abrupt changes in congestion window typical of who can adapt to the abrupt changes in congestion window typical of
TCP's AIMD congestion control, and particularly useful for senders TCP's AIMD congestion control, and particularly useful for senders
who would like to take advantage of the available bandwidth in an who would like to take advantage of the available bandwidth in an
environment with rapidly changing conditions. environment with rapidly changing conditions.
- DCCP Congestion Control ID 3 (CCID 3) [RFC4342]: - DCCP Congestion Control ID 3 (CCID 3) [RFC4342]:
TCP-Friendly Rate Control (TFRC) [RFC3448bis] is a congestion TCP-Friendly Rate Control (TFRC) [RFC3448bis] is a congestion
control mechanism designed for unicast flows operating in a best- control mechanism designed for unicast flows operating in a best-
effort Internet environment. It is reasonably fair when competing effort Internet environment. It is reasonably fair when competing
for bandwidth with TCP flows, but has a much lower variation of for bandwidth with TCP flows, but has a much lower variation of
throughput over time compared with TCP, making it more suitable for throughput over time compared with TCP, making it more suitable for
applications such as streaming media where a relatively smooth applications such as streaming media where a relatively smooth
sending rate is of importance. CCID 3 is appropriate for flows that sending rate is of importance. CCID 3 is appropriate for flows that
would prefer to minimize abrupt changes in the sending rate, would prefer to minimize abrupt changes in the sending rate,
including streaming media applications with small or moderate including streaming media applications with small or moderate
skipping to change at page 16, line 20 skipping to change at page 18, line 13
or by applications that change their sending rate by varying the or by applications that change their sending rate by varying the
segment size. Because CCID 4 is intended for applications that use segment size. Because CCID 4 is intended for applications that use
a fixed small segment size, or that vary their segment size in a fixed small segment size, or that vary their segment size in
response to congestion, the transmit rate derived from the TCP response to congestion, the transmit rate derived from the TCP
throughput equation is reduced by a factor that accounts for packet throughput equation is reduced by a factor that accounts for packet
header size, as specified in [RFC4828]. header size, as specified in [RFC4828].
The resulting open questions are: The resulting open questions are:
- How does TFRC-SP operate under various network conditions? - How does TFRC-SP operate under various network conditions?
- How to design congestion control so as to scale with packet - How to design congestion control so as to scale with packet
size (dependency of congestion algorithm on packet size)? Early size (dependency of congestion algorithm on packet size)?
assessment shows that packet size dependency should remain at
the transport layer.
Today, many network resources are designed so that packet processing Today, many network resources are designed so that packet processing
cannot be overloaded even for incoming loads at the maximum bit-rate cannot be overloaded even for incoming loads at the maximum bit-rate
of the line. If packet processing can handle sustained load r [packet of the line. If packet processing can handle sustained load r [packet
per second] and the minimum packet size is h [bit] (i.e. packet per second] and the minimum packet size is h [bit] (i.e. packet
headers with no payload), then a line rate of x [bit per second] will headers with no payload), then a line rate of x [bit per second] will
never be able to overload packet processing as long as x <= r.h. never be able to overload packet processing as long as x <= r.h.
However, realistic equipment is often designed to only cope with a However, realistic equipment is often designed to only cope with a
near-worst-case workload with a few larger packets in the mix, rather near-worst-case workload with a few larger packets in the mix, rather
than the worst-case of all minimum size packets. In this case, x = than the worst-case of all minimum size packets. In this case, x =
r.(h + e) for some small value of e. r.(h + e) for some small value of e.
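
The relation between line rate and packet processing described above
can be checked with a few lines of Python. This is a minimal sketch;
the function names and the sample values for r, h and e are
hypothetical and chosen only to make the arithmetic easy to follow.

```python
def line_rate_for(r_packets_per_s, h_bits, e_bits=0.0):
    """Line rate x [bit/s] that equipment is dimensioned for: r.(h + e)."""
    return r_packets_per_s * (h_bits + e_bits)


def can_overload(x_bits_per_s, r_packets_per_s, h_bits):
    """True if a run of minimum-size packets at line rate x exceeds r."""
    worst_case_packet_rate = x_bits_per_s / h_bits
    return worst_case_packet_rate > r_packets_per_s


r = 1_000_000   # hypothetical processing capacity [packet/s]
h = 320         # hypothetical minimum packet size [bit] (40 bytes)

strict = line_rate_for(r, h)             # x = r.h
relaxed = line_rate_for(r, h, e_bits=80)  # x = r.(h + e)

strict_overloadable = can_overload(strict, r, h)    # False
relaxed_overloadable = can_overload(relaxed, r, h)  # True
```

With x = r.h even an unbroken run of header-only packets stays within
the processing capacity, whereas the x = r.(h + e) design can be
overloaded by such a run.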
Therefore, it is likely that most congestion seen on today's Internet Therefore, it is likely that most congestion seen on today's Internet
is due to an excess of bits rather than packets, although packet- is due to an excess of bits rather than packets, although packet-
congestion is not impossible for runs of small packets (e.g. TCP ACKs congestion is not impossible for runs of small packets (e.g. TCP ACKs
or DoS attacks with small UDP datagrams). or DoS attacks with small UDP datagrams).
This observation raises additional open issues: This observation raises additional open issues:
- Will bit congestion remain prevalent? - Will bit congestion remain prevalent?
Being able to assume that congestion is generally due to excess bits Being able to assume that congestion is generally due to excess
not excess packets is a useful simplifying assumption in the design bits not excess packets is a useful simplifying assumption in the
of congestion control protocols. Can we rely on this assumption into design of congestion control protocols. Can we rely on this
the future? assumption into the future? An alternative view of the future is
that in-network processing will become commonplace, so that per-
packet processing will be as likely to be the bottleneck as per-bit
transmission [Shin08].
Over the last three decades, performance gains have mainly been Over the last three decades, performance gains have mainly been
through increased packet rates, not bigger packets. But if bigger through increased packet rates, not bigger packets. But if bigger
maximum segment sizes become more prevalent, tiny segments (e.g. maximum segment sizes do become more prevalent, tiny segments (e.g.
ACKs) will not stop being widely used, leading to a widening
ACKs) will still continue to be widely used - a widening range of range of packet sizes.
packet sizes.
The open question is thus whether or not packet processing rates (r) The open question is thus whether or not packet processing rates
will keep up with growth in transmission rates (x). A superficial (r) will keep up with growth in transmission rates (x). A
look at Moore's Law type trends would suggest that processing (r) superficial look at Moore's Law type trends would suggest that
will continue to outstrip growth in transmission (x). But predictions processing (r) will continue to outstrip growth in transmission
based on actual knowledge of technology futures would be useful. (x). But predictions based on actual knowledge of technology
Another open question is whether there are likely to be more small futures would be useful. Another open question is whether there are
packets in the average packet mix. If the answers to either of these likely to be more small packets in the average packet mix. If the
questions predict that packet congestion could become prevalent, answers to either of these questions predict that packet congestion
congestion control protocols will have to be more complicated. could become prevalent, congestion control protocols will have to
be more complicated.
- Confusable Causes of Drop - Confusable Causes of Drop
There is a considerable body of research on how to distinguish There is a considerable body of research on how to distinguish
whether packet drops are due to transmission corruption or to whether packet drops are due to transmission corruption or to
congestion. But the full list of confusable causes of drop is longer congestion. But the full list of confusable causes of drop is
and includes transmission loss, congestion loss (bit congestion and longer and includes transmission loss, congestion loss (bit
packet congestion), and policing loss. congestion and packet congestion), and policing loss.
If congestion is due to excess bits, the bit rate should be reduced. If congestion is due to excess bits, the bit rate should be
If congestion is due to excess packets, the packet rate can be reduced. If congestion is due to excess packets, the packet rate
reduced without reducing the bit rate - by using larger packets. can be reduced without reducing the bit rate - by using larger
However, if the transport cannot tell which of these causes led to a packets. However, if the transport cannot tell which of these
specific drop, its only safe response is to reduce the bit rate. This causes led to a specific drop, its only safe response is to reduce
is why the Internet would be more complicated if packet congestion the bit rate. This is why the Internet would be more complicated if
were prevalent, as reducing the bit rate normally also reduces the packet congestion were prevalent, as reducing the bit rate normally
packet rate, while reducing the packet rate doesn't necessarily also reduces the packet rate, while reducing the packet rate
reduce the bit rate. doesn't necessarily reduce the bit rate.
Given that distinguishing between transmission loss and congestion is Given that distinguishing between transmission loss and congestion is
already an open issue (Section 3.2), if that problem is ever solved, already an open issue (Section 3.2), if that problem is ever
a further open issue would be whether to standardize a solution that solved, a further open issue would be whether to standardize a
distinguishes all the above causes of drop, not just two of them. solution that distinguishes all the above causes of drop, not just
two of them.
Nonetheless, even if we find a way for network equipment to Nonetheless, even if we find a way for network equipment to
explicitly distinguish which sort of drop has occurred, we will never explicitly distinguish which sort of drop has occurred, we will
be able to assume that such a smart AQM solution is deployed at every never be able to assume that such a smart AQM solution is deployed
congestible resource throughout the Internet - at every higher layer at every congestible resource throughout the Internet - at every
device like firewalls, proxies, servers and at every lower layer higher layer device like firewalls, proxies, servers and at every
device like low-end home hubs, DSLAMs, WLAN cards, cellular base- lower layer device like low-end home hubs, DSLAMs, WLAN cards,
stations and so on. Thus, transport protocols will always have to cellular base-stations and so on. Thus, transport protocols will
cope with drops due to unguessable causes, so we should always treat always have to cope with drops due to unpredictable causes, so we
AQM smarts as an optimization, not a given. should always treat AQM smarts as an optimization, not a given.
- What does a congestion notification on a packet of a certain size - What does a congestion notification on a packet of a certain size
mean? mean?
The open issue here is whether a loss or explicit congestion mark The open issue here is whether a loss or explicit congestion mark
should be interpreted as a single congestion event irrespective of should be interpreted as a single congestion event irrespective of
the size of the packet lost or marked, or whether the strength of the the size of the packet lost or marked, or whether the strength of
congestion notification is weighted by the size of the packet. This the congestion notification is weighted by the size of the packet.
issue is discussed at length in [Bri08], along with other aspects of This issue is discussed at length in [Bri08], along with other
packet size and congestion control. aspects of packet size and congestion control.
[Bri08] makes the strong recommendation that network equipment should [Bri08] makes the strong recommendation that network equipment
drop or mark packets with a probability independent of each specific should drop or mark packets with a probability independent of each
packet's size, while congestion controls should respond to dropped or specific packet's size, while congestion controls should respond to
marked packets in proportion to the packet's size. This issue is dropped or marked packets in proportion to the packet's size. This
deferred to the Transport Area Working Group. issue is under discussion in the Transport Area Working Group.
- Packet Size and Congestion Control Protocol Design - Packet Size and Congestion Control Protocol Design
If the above recommendation is correct - that the packet size of a If the above recommendation is correct - that the packet size of a
congestion notification should be taken into account when the congestion notification should be taken into account when the
transport reads, not when the network writes the notification - it transport reads, not when the network writes the notification - it
opens up a significant program of protocol engineering and re- opens up a significant program of protocol engineering and re-
engineering. Indeed, TCP does not take packet size into account when engineering. Indeed, TCP does not take packet size into account
responding to losses or ECN. At present this is not a pressing when responding to losses or ECN. At present this is not a pressing
problem because use of 1500B data segments is very prevalent for TCP problem because use of 1500B data segments is very prevalent for
and the range of alternative segment sizes is not large. However, we TCP and the incidence of alternative maximum segment sizes is not
should design the Internet's protocols so they will scale with packet large. However, we should design the Internet's protocols so they
size, so an open issue is whether we should evolve TCP, or expect new will scale with packet size, so an open issue is whether we should
protocols to take over. evolve TCP to be sensitive to packet size, or expect new protocols
to take over.
As we continue to standardize new congestion control protocols, we As we continue to standardize new congestion control protocols, we
must then face the issue of how they should take account of packet must then face the issue of how they should take account of packet
size. If we determine that TCP was incorrect in not taking account of size. If we determine that TCP was incorrect in not taking account
packet size, even if we don't change TCP, we should not allow new of packet size, even if we don't change TCP, we should not allow
protocols to follow TCP's example in this respect. For example, as new protocols to follow TCP's example in this respect. For example,
explained here above, the small-packet variant of TCP-friendly rate as explained here above, the small-packet variant of TCP-friendly
control (TFRC-SP [RFC4828]) is an experimental protocol that aims to rate control (TFRC-SP [RFC4828]) is an experimental protocol that
take account of packet size. Whatever packet size it uses, it ensures aims to take account of packet size. Whatever packet size it uses,
its rate approximately equals that of a TCP using 1500B segments. it ensures its rate approximately equals that of a TCP using 1500B
This raises the further question of whether TCP with 1500B segments segments. This raises the further question of whether TCP with
will be a suitable long-term gold standard, or whether we need a more 1500B segments will be a suitable long-term gold standard, or
thoroughgoing review of what it means for a congestion control to whether we need a more thoroughgoing review of what it means for a
scale with packet size. congestion control to scale with packet size.
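
The difference between counting each lost or marked packet as one
congestion event and weighting it by packet size, as discussed above
with reference to [Bri08], can be illustrated with a short Python
sketch. The function name and the sample trace are hypothetical; the
sketch only compares the two ways of reading the same notifications
and does not represent any standardized algorithm.

```python
def congestion_fraction(packets, by_bytes):
    """Fraction of traffic that was dropped or marked.

    packets:  iterable of (size_bytes, marked) pairs
    by_bytes: weight each packet by its size if True, else count packets
    """
    total = 0
    hit = 0
    for size_bytes, marked in packets:
        weight = size_bytes if by_bytes else 1
        total += weight
        if marked:
            hit += weight
    return hit / total if total else 0.0


# Hypothetical trace: one marked 1500-byte segment, three 40-byte ACKs
# of which one is marked.
trace = [(1500, True), (40, False), (40, False), (40, True)]

per_packet = congestion_fraction(trace, by_bytes=False)  # 2 of 4 packets
per_byte = congestion_fraction(trace, by_bytes=True)     # 1540 of 1620 bytes
```

The same trace yields quite different congestion levels under the two
readings, which is why the choice matters once the range of packet
sizes widens.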
3.4 Challenge 4: Pseudo-Wires 3.4 Challenge 4: Flow Startup
Pseudowires (PW) may carry non-TCP data flows (e.g. TDM traffic). The beginning of data transmissions imposes some further, unique
Structure Agnostic TDM over Packet (SATOP) [RFC4553], Circuit challenges: When a connection to a new destination is established,
Emulation over Packet Switched Networks (CESoPSN), TDM over IP, are the end-systems have hardly any information about the characteristics
not responsive to congestion control in a TCP-friendly manner as of the path in between and the available bandwidth. In this flow
prescribed by [RFC2914]. Moreover, it is not possible to simply startup situation there is no obvious choice how to start to send. A
reduce the flow rate of a TDM PW when facing packet loss. similar problem also occurs after relatively long idle times, since
the congestion control state then no longer reflects current
information about the state of the network (flow restart problem).
Carrying TDM PW over an IP network poses a real problem. Indeed, Van Jacobson [Jacobson88] suggested using the slow-start mechanism
providers can rate control corresponding incoming traffic but it may both for the flow startup and the flow restart, and this is today's
not be able to detect that a PW carries TDM traffic. This can be standard solution [RFC2581]. The slow-start algorithm starts with a
illustrated with the following example. small initial congestion window, which is exponentially increased as
soon as acknowledgements arrive. However, the slow-start is not
optimal in many situations: First, it can take quite a long time
until a sender can fully utilize the available bandwidth on a path.
Second, the exponential increase may be too aggressive and cause
multiple packet loss if large congestion windows are reached (slow-
start overshooting). Finally, the slow-start does not ensure that new
flows converge quickly to a reasonable share of resources, in
particular if they compete with long-lived flows. This convergence
problem may even worsen if more aggressive congestion control
variants get widely used.
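
The first two shortcomings listed above can be seen in a simple
idealized model of slow-start. The Python sketch below assumes that
the window doubles exactly once per round trip and that nothing is
lost until the target is reached; real implementations (delayed ACKs,
pacing, limited slow-start) behave differently, and the function name
and numbers are hypothetical.

```python
def slow_start_rtts(initial_window, target_window):
    """Round trips needed until cwnd (in segments) reaches the target.

    Idealized model: cwnd doubles once per RTT, no losses.
    Returns (number_of_rtts, final_cwnd).
    """
    if initial_window < 1 or target_window < 1:
        raise ValueError("windows must be at least one segment")
    cwnd = initial_window
    rtts = 0
    while cwnd < target_window:
        cwnd *= 2
        rtts += 1
    return rtts, cwnd


# Hypothetical path that can hold 1000 segments, initial window of 2.
rtts, final_cwnd = slow_start_rtts(2, 1000)
overshoot = final_cwnd - 1000
```

In this model nine round trips pass before the path is filled, and the
last doubling already exceeds the target by 24 segments; with larger
windows the absolute overshoot grows accordingly.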
........... ............ The slow-start and its interaction with the congestion avoidance
. . . phase was largely designed by intuition [Jacobson88]. So far, little
S1 --- E1 --- . . theory has been developed to understand the flow startup problem and
. | . . its implication on congestion control stability and fairness. There
. === E5 === E7 --- is also no established methodology to evaluate whether new flow
. | . . | startup mechanisms are appropriate or not.
S2 --- E2 --- . . |
. . . | |
........... . | v
. ----- R --->
........... . | ^
. . . | |
S3 --- E3 --- . . |
. | . . |
. === E6 === E8 ---
. | . .
S4 --- E4 --- . .
. . .
........... ............
\---- P1 ---/ \---------- P2 ----- As a consequence, it is a non-trivial task to address the
shortcomings of the slow-start algorithm. Several experimental
enhancements have been proposed, such as the congestion window
validation [RFC2861] and the limited slow-start [RFC3742]. There are
also ongoing research activities, focusing e.g. on bandwidth
estimation techniques, delay-based congestion control, or rate pacing
mechanisms. However, any alternative end-to-end flow startup approach
has to cope with the inherent problem that there is little or no
information about the path at the beginning of a data transfer. This
uncertainty could be reduced by more expressive feedback signaling
(cf. Section 3.1). For instance, a source could learn the path
characteristics faster with the Quick-Start mechanism [RFC4782]. But,
even if the source knew exactly what rate it should aim for, it would
still not necessarily be safe to jump straight to that rate. The end-
system still doesn't know how much a change in its own rate will
affect the path, which also might become congested in less than one
RTT. Further research would be useful to understand the effect of
decreasing the uncertainty by explicit feedback separately from
control theoretic stability questions. Furthermore, the flow startup
also raises fairness questions. For instance, it is unclear whether
it could be reasonable to use a faster startup when an end-system
detects that a path is currently not congested.
Sources S1, S2, S3 and S4 are originating TDM over IP traffic. P1 In summary, there are several topics for further research concerning
provider edges E1, E2, E3, and E4 are rate limiting such traffic. The flow startups:
SLA of provider P1 with transit provider P2 is such that the latter
assumes a BE traffic pattern and that the distribution shows the
typical properties of common BE traffic (elastic, non-real time, non-
interactive).
The problem arises for transit provider P2 that is not able to detect - Better theoretical understanding of the design and evaluation of
that IP packets are carrying constant-bit rate service traffic that flow startup mechanisms, concerning their impact on congestion
is by definition unresponsive to any congestion control mechanisms. risk, stability, and fairness
Assuming P1 providers are rate limiting BE traffic, a transit P2 - Evaluate whether it may be appropriate to allow more
provider router R may be subject to serious congestion as all TDM PWs differentiated starting schemes, e.g., to allow higher initial
cross the same router. TCP-friendly traffic would follow TCP's AIMD rates under certain constraints; this also requires refining
algorithm of reducing the sending rate in half in response to each fairness for startup situations
packet drop. Nevertheless, the TDM PWs will take all the available
capacity, leaving no room for any other type of traffic. Note that
the situation may simply occur because S4 suddenly turns up a TDM PW.
As it is not possible to assume that edge routers will soon have the - Better theoretical models for the effects of decreasing
ability to detect the type of the carried traffic, it is important uncertainty by additional network feedback, in particular if the
for transit routers (P2 provider) to be able to apply a fair, robust, path characteristics are very dynamic.
responsive and efficient congestion control technique in order to
prevent impacting normally behaving Internet traffic. However, it is
still an open question how the corresponding mechanisms in the data
and control planes have to be designed.
3.5 Challenge 5: Multi-domain Congestion Control 3.5 Challenge 5: Multi-domain Congestion Control
Transport protocols such as TCP operate over the Internet that is Transport protocols such as TCP operate over the Internet that is
divided into autonomous systems. These systems are characterized by divided into autonomous systems. These systems are characterized by
their heterogeneity as IP networks are realized by a multitude of their heterogeneity as IP networks are realized by a multitude of
technologies. The variety of conditions and their variations leads to technologies. The variety of conditions and their variations leads to
correlation effects between policers that regulate traffic against correlation effects between policers that regulate traffic against
certain conformance criteria. certain conformance criteria.
skipping to change at page 20, line 35 skipping to change at page 22, line 38
queue management techniques - to convey congestion information trying queue management techniques - to convey congestion information trying
to prevent packet losses (packet loss and the number of packets to prevent packet losses (packet loss and the number of packets
marked gives an indication of the level of congestion). Using TCP marked gives an indication of the level of congestion). Using TCP
ACKs to feed back that information allows the hosts to realign their ACKs to feed back that information allows the hosts to realign their
transmission rate and thus encourage them to efficiently use the transmission rate and thus encourage them to efficiently use the
network. In IP, ECN uses the two unused bits of the TOS field network. In IP, ECN uses the two unused bits of the TOS field
[RFC2474]. Further, ECN in TCP uses two bits in the TCP header that [RFC2474]. Further, ECN in TCP uses two bits in the TCP header that
were previously defined as reserved [RFC793]. were previously defined as reserved [RFC793].
ECN [RFC3168] is an example of a congestion feedback mechanism from ECN [RFC3168] is an example of a congestion feedback mechanism from
the network toward hosts, while the policer must sit at every the network toward hosts. The congestion-based feedback scheme
potential point of congestion. The congestion-based feedback scheme
however has limitations when applied on an inter-domain basis. however has limitations when applied on an inter-domain basis.
Indeed, the same congestion feedback mechanism is required along the Indeed, Section 8 and 19 of RFC3168 details consequences/implication
entire path for optimal control at end-systems. of i) a network erasing CE introduced earlier on the path and ii) a
network changing Not-ECT to ECT. Both could allow an attacking
network to cause excess congestion in an upstream network, even if
the transports were behaving correctly. So far, there are two
possible solutions to problem i): the ECN nonce [RFC3540] and the
re-ECN incentive system. Nevertheless, the absence of an IPv6 header
checksum implies that corruption could have more impact than in the
IPv4 case. Fragmentation is another issue: the ECN nonce cannot
protect against misbehaving receivers that conceal marked fragments,
so some protection is lost in situations where Path MTU discovery is
disabled. So, there is still room for improvement of the ECN
mechanism when operating in multi-domain networks.
Operational/deployment experience is nevertheless required to
determine the extent of these problems. The second problem is mainly
related to deployment and usage practices and does not seem to result
in any specific research challenge.
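The two network behaviors i) and ii) can be made concrete with a
minimal Python sketch. It assumes that the ECN field occupies the two
least significant bits of the former TOS octet, with the codepoint
values assigned by [RFC3168]. The function names are hypothetical and
serve illustration only; they do not describe any deployed device.

```python
# ECN codepoints as assigned by RFC 3168 (two low-order bits of the
# former IPv4 TOS octet / IPv6 Traffic Class octet).
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11


def ecn_of(tos: int) -> int:
    """Return the ECN codepoint carried in a TOS/Traffic Class octet."""
    return tos & 0b11


def with_ecn(tos: int, codepoint: int) -> int:
    """Return the octet with its ECN field replaced, DSCP untouched."""
    return (tos & 0b11111100) | codepoint


def erase_ce(tos: int) -> int:
    """Problem i): a (hypothetical) network erases an upstream CE mark."""
    return with_ecn(tos, ECT0) if ecn_of(tos) == CE else tos


def force_ect(tos: int) -> int:
    """Problem ii): a (hypothetical) network rewrites Not-ECT to ECT."""
    return with_ecn(tos, ECT0) if ecn_of(tos) == NOT_ECT else tos
```

In both cases the receiver sees a well-formed ECT(0) packet, so
nothing in the IP header alone reveals that the field was rewritten
on the path.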
Another solution in a multi-domain environment may be the TCP rate Another solution in a multi-domain environment may be the TCP rate
controller (TRC), a traffic conditioner which regulates the TCP flow controller (TRC), a traffic conditioner which regulates the TCP flow
at the ingress node in each domain by controlling packet drops and at the ingress node in each domain by controlling packet drops and
RTT of the packets in a flow. The outgoing traffic from a TRC delays of the packets in a flow. The outgoing traffic from a TRC
controlled domain is shaped in such a way that no packets are dropped controlled domain is shaped in such a way that no packets are dropped
at the policer. However, the TRC depends on the end-to-end TCP model, at the policer. However, the TRC depends on the end-to-end TCP model,
and thus the diversity of TCP implementations is a general problem. and thus the diversity of TCP implementations is a general problem.
Security is another challenge for multi-domain operation. At some 3.5.1 Multi-domain operations
domain boundaries, an increasing number of application layer gateways
(e. g., proxies) are deployed, which split up end-to-end connections
and prevent end-to-end congestion control.
Furthermore, authentication and authorization issues can arise at Security is a challenge for multi-domain network operation. At domain
domain boundaries whenever information is exchanged, and so far the boundaries, authentication and authorization issues can arise
Internet does not have a single general security architecture that whenever congestion control information is exchanged. From this
could be used in all cases. Many autonomous systems also only perspective, the Internet does not have so far a single general
exchange some limited amount of information about their internal security architecture that could be used in all cases. Many
state (topology hiding principle), even though having more precise autonomous systems also only exchange some limited amount of
information could be highly beneficial for congestion control. The information about their internal state (topology hiding principle),
future evolution of the Internet inter-domain operation has to show even though having more precise information could be highly
whether more multi-domain information exchange can be realized. beneficial for congestion control. Indeed, revealing the internal
network structure is highly sensitive in multi-domain network
operations and thus also a concern when it comes to the deployability
of congestion control schemes. For instance, an RCP-like scheme could
reveal more information about the internal network dimensioning than
TCP does today.
The future evolution of the Internet inter-domain operation has to
show whether more multi-domain information exchange can be
effectively realized. This is of particular importance for congestion
control schemes that make use of explicit per-datagram rate feedback
(e.g. RCP or XCP), explicit rate feedback that uses in-band
congestion signaling (e.g. QuickStart), or out-of-band signaling
(e.g. CADPC/PTP). Explicit signaling exchanges at the inter-domain
level that result in local domain triggers are currently absent from
the Internet. From this perspective, security measures resulting from
limited trust between different administrative units lead to policy
enforcement that exacerbates the difficulty encountered when explicit
congestion control feedback is exchanged between domains.
3.5.2 Multi-domain Pseudowires
Extending pseudowires across multiple domains poses specific issues.
Pseudowires (PW) may carry non-TCP data flows (e.g. TDM traffic) over
multi-domain IP networks. Structure-Agnostic TDM over Packet (SAToP)
[RFC4553], Circuit Emulation over Packet Switched Networks (CESoPSN),
and TDM over IP do not respond to congestion in a TCP-friendly manner
as discussed by [RFC2914] (see also [RFC5033]). Moreover, it is not
possible to simply reduce the flow rate of a TDM PW when facing
packet loss. Indeed, providers can rate control the corresponding
incoming traffic, but they may not be able to detect that a PW
carries TDM traffic. This can be illustrated with the following
example.
        ...........         ............
       .           .       .
S1 --- E1 ---       .     .
       .     |      .     .
       .       === E5 === E7 ---
       .     |      .     .     |
S2 --- E2 ---       .     .     |
       .           .       .    |     |
        ...........         .   |     v
                            .   ----- R --->
        ...........         .   |     ^
       .           .       .    |     |
S3 --- E3 ---       .     .     |
       .     |      .     .     |
       .       === E6 === E8 ---
       .     |      .     .
S4 --- E4 ---       .     .
       .           .       .
        ...........         ............
       \---- P1 ---/       \---------- P2 -----
Sources S1, S2, S3 and S4 are originating TDM over IP traffic. P1
provider edges E1, E2, E3, and E4 are rate limiting such traffic. The
SLA of provider P1 with transit provider P2 is such that the latter
assumes a BE traffic pattern and that the distribution shows the
typical properties of common BE traffic (elastic, non-real time, non-
interactive).
The problem arises for transit provider P2 that is not able to detect
that IP packets are carrying constant-bit rate service traffic for
which the only useful congestion control mechanism would rely on
implicit or explicit admission control.
Assuming P1 providers are rate limiting BE traffic, a transit P2
provider router R may be subject to serious congestion as all TDM PWs
cross the same router. TCP-friendly traffic (e.g. each flow within
the PW) would follow TCP's AIMD algorithm of halving the sending
rate in response to each packet drop. Nevertheless, the PWs of TDM
traffic could take all the available capacity while other, more
TCP-friendly traffic reduces itself to nothing. Note that the
situation may simply occur because S4 suddenly turns on additional
TDM channels.
It is neither possible nor desirable to assume that edge routers will
soon have the ability to detect the responsiveness of the carried
traffic, but it is still important for transit providers to be able
to police a fair, robust, responsive and efficient congestion control
technique in order to avoid impacting congestion responsive Internet
traffic.
However, we must not require only certain specific responses to
congestion to be embedded within the network, which would harm
evolvability. So designing the corresponding mechanisms in the data
and control planes is still open.
3.6 Challenge 6: Precedence for Elastic Traffic 3.6 Challenge 6: Precedence for Elastic Traffic
Traffic initiated by so-called elastic applications adapt to the Traffic initiated by so-called elastic applications adapt to the
available bandwidth using feedback about the state of the network. available bandwidth using feedback about the state of the network.
For all these flows the application dynamically adjusts the data For all these flows the application dynamically adjusts the data
generation rate. Examples encompass short-lived elastic traffic generation rate. Examples encompass short-lived elastic traffic
including HTTP and instant messaging traffic as well as long file including HTTP and instant messaging traffic as well as long file
transfers with FTP. In brief, elastic data applications can show transfers with FTP. In brief, elastic data applications can show
extremely different requirements and traffic characteristics. extremely different requirements and traffic characteristics.
skipping to change at page 21, line 41 skipping to change at page 25, line 47
For instance, low precedence traffic should experience lower average For instance, low precedence traffic should experience lower average
throughput than higher precedence traffic. Several questions arise throughput than higher precedence traffic. Several questions arise
here: what is the meaning of "relative"? What is the role of the here: what is the meaning of "relative"? What is the role of the
Transport Layer? Transport Layer?
The preferential treatment of higher precedence traffic with The preferential treatment of higher precedence traffic with
appropriate congestion control mechanisms is still an open issue that appropriate congestion control mechanisms is still an open issue that
may, depending on the proposed solution, impact both the host and the may, depending on the proposed solution, impact both the host and the
network precedence awareness, and thereby congestion control. network precedence awareness, and thereby congestion control.
[RFC2990] points out that interactions between congestion control and [RFC2990] points out that the interactions between congestion control
DiffServ [RFC2475] have yet to be addressed, and this statement is and DiffServ [RFC2475] have yet to be addressed, and this statement
still valid at the time of writing. is still valid at the time of writing.
There is also still work to be performed regarding lower precedence There is also still work to be performed regarding lower precedence
traffic – data transfers which are useful, yet not important enough traffic – data transfers which are useful, yet not important enough
to significantly impair any other traffic. Examples of applications to significantly impair any other traffic. Examples of applications
that could make use of such traffic are web caches and web browsers that could make use of such traffic are web caches and web browsers
(e.g. for pre-fetching) as well as peer-to-peer applications. There (e.g. for pre-fetching) as well as peer-to-peer applications. There
are proposals for achieving low precedence on a pure end-to-end basis are proposals for achieving low precedence on a pure end-to-end basis
(e.g. TCP-LP [Kuzmanovic]), and there is a specification for (e.g. TCP-LP [Kuzmanovic03]), and there is a specification for
achieving it via router mechanisms [RFC3662]. It seems, however, that achieving it via router mechanisms [RFC3662]. It seems, however, that
such traffic is still hardly used, and sending lower precedence data such traffic is still hardly used, and sending lower precedence data
is not yet a common service on the Internet. is not yet a common service on the Internet.
3.7 Challenge 7: Misbehaving Senders and Receivers 3.7 Challenge 7: Misbehaving Senders and Receivers
In the current Internet architecture, congestion control depends on In the current Internet architecture, congestion control depends on
parties acting against their own interests. It is not in a receiver's parties acting against their own interests. It is not in a receiver's
interest to honestly return feedback about congestion on the path, interest to honestly return feedback about congestion on the path,
effectively requesting a slower transfer. It is not in the sender's effectively requesting a slower transfer. It is not in the sender's
interest to reduce its rate in response to congestion if it can rely interest to reduce its rate in response to congestion if it can rely
on others to do so. Additionally, networks may have strategic reasons on others to do so. Additionally, networks may have strategic reasons
to make other networks appear congested. to make other networks appear congested.
Numerous strategies to divert congestion control have already been Numerous strategies to improve the congestion control have already
identified. The IETF has particularly focused on misbehaving TCP been identified. The IETF has particularly focused on misbehaving TCP
receivers that could confuse a compliant sender into assigning receivers that could confuse a compliant sender into assigning
excessive network and/or server resources to that receiver (e.g. excessive network and/or server resources to that receiver (e.g.
[Sav99], [RFC3540]). But, although such strategies are worryingly [Sav99], [RFC3540]). But, although such strategies are worryingly
powerful, they do not yet seem common. powerful, they do not yet seem common (however, evidence of attack
prevalence is itself a research requirement).
A growing proportion of Internet traffic comes from applications A growing proportion of Internet traffic comes from applications
designed not to use congestion control at all, or worse, applications designed not to use congestion control at all, or worse, applications
that add more forward error correction the more losses they that add more forward error correction the more losses they
experience. Some believe the Internet was designed to allow such experience. Some believe the Internet was designed to allow such
freedom so it can hardly be called misbehavior. But others consider freedom so it can hardly be called misbehavior. But others consider
that it is misbehavior to abuse this freedom [RFC3714], given one that it is misbehavior to abuse this freedom [RFC3714], given one
person's freedom can constrain the freedom of others (congestion person's freedom can constrain the freedom of others (congestion
represents this conflict of interests). Indeed, leaving freedom represents this conflict of interests). Indeed, leaving freedom
unchecked might result in congestion collapse in parts of the unchecked might result in congestion collapse in parts of the
Internet. Proportionately, large volumes of unresponsive voice Internet. Proportionately, large volumes of unresponsive voice
traffic could represent such a threat, particularly for countries traffic could represent such a threat, particularly for countries
with less generous provisioning [RFC3714]. More recently, Internet with less generous provisioning [RFC3714]. Also, Internet video on
video on demand services are becoming popular that transfer much demand services are becoming popular that transfer much greater data
greater data rates without congestion control (e.g. the peer-to-peer rates without congestion control. In general, it is recommended that
Joost service currently streams media over UDP at about 700kbps such UDP applications use some form of congestion control [RFC5405].
downstream and 220kbps upstream).
Note that the problem is not just misbehavior driven by a selfish Note that the problem is not just misbehavior driven by a self-
desire for more bandwidth (see Section 4). interested desire for more bandwidth. Indeed, congestion control may
be attacked by someone who makes no gain for themselves, other than
the satisfaction of harming others (see Security Considerations in
Section 4).
Open research questions resulting from these considerations are: Open research questions resulting from these considerations are:
- By design, new congestion control protocols need to enable one end - By design, new congestion control protocols need to enable one end
to check the other for protocol compliance. to check the other for protocol compliance.
- Provide congestion control primitives that satisfy more demanding - We need to provide congestion control primitives that satisfy more
applications (smoother than TFRC, faster than high speed TCPs), so demanding applications (smoother than TFRC, faster than high speed
that application developers and users do not turn off congestion TCPs), so that application developers and users do not turn off
control to get the rate they expect and need. congestion control to get the rate they expect and need.
Note also that self-restraint is disappearing from the Internet. So, Note also that self-restraint is disappearing from the Internet. So,
it may no longer be sufficient to rely on developers/users it may no longer be sufficient to rely on developers/users
voluntarily submitting themselves to congestion control. As main voluntarily submitting themselves to congestion control. As main
consequence, mechanisms to enforce fairness (see Section 2.3) need to consequence, mechanisms to enforce fairness (see Sections 2.3, 3.4,
have more emphasis within the research agenda. and 3.5) need to have more emphasis within the research agenda.
3.8 Other challenges 3.8 Other challenges
This section provides additional challenges and open research issues This section provides additional challenges and open research issues
that are not (at this point in time) deemed very large or of that are not (at this point in time) deemed very large or of
different nature compared to the main challenges depicted since so different nature compared to the main challenges depicted so far.
far.
Note that this section may be complemented in future release of this Note that this section may be complemented in future release of this
document by topics discussed during the last ICCRG meeting, co- document by topics discussed during the last ICCRG meeting, co-
located with PFLDNet 2008 International Workshop. Topics of interest located with PFLDNet 2008 International Workshop. Topics of interest
include multipath congestion control, and congestion control for include multipath congestion control, and congestion control for
multimedia codecs that only support certain set of data rates. multimedia codecs that only support certain set of data rates.
3.8.1 RTT estimation 3.8.1 RTT estimation
Several congestion control schemes have to precisely know the round- Several congestion control schemes have to precisely know the round-
trip time (RTT) of a path. The RTT is a measure of the current delay trip time (RTT) of a path. The RTT is a measure of the current delay
on a network. It is defined as the delay between the sending of a on a network. It is defined as the delay between the sending of a
packet and the reception of a corresponding response, which is echoed packet and the reception of a corresponding response, if echoed back
back immediately by receiver upon receipt of the packet. This immediately by receiver upon receipt of the packet. This corresponds
corresponds to the sum of the one-way delay of the packet and the to the sum of the one-way delay of the packet and the (potentially
(potentially different) one-way delay of the response. Furthermore, different) one-way delay of the response. Furthermore, any RTT
any RTT measurement also includes some additional delay due to the measurement also includes some additional delay due to the packet
packet processing in both end-systems. processing in both end-systems.
There are various techniques to measure the RTT: Active measurements There are various techniques to measure the RTT: Active measurements
inject special probe packets to the network and then measure the inject special probe packets to the network and then measure the
response time, using e.g. ICMP. In contrast, passive measurements response time, using e.g. ICMP. In contrast, passive measurements
determine the RTT from ongoing communication processes, without determine the RTT from ongoing communication processes, without
sending additional packets. sending additional packets.
The connection endpoints of reliable transport protocols such as TCP, The connection endpoints of reliable transport protocols such as TCP,
SCTP, and DCCP, as well as several application protocols, keep track SCTP, and DCCP, as well as several application protocols, keep track
of the RTT in order to dynamically adjust protocol parameters such as of the RTT in order to dynamically adjust protocol parameters such as
skipping to change at page 24, line 9 skipping to change at page 28, line 17
procedure, in combination with Karn's algorithm that prohibits RTT procedure, in combination with Karn's algorithm that prohibits RTT
measurements from retransmitted segments [RFC2988]. Traditionally, measurements from retransmitted segments [RFC2988]. Traditionally,
TCP implementations take one RTT measurement at a time (i. e., about TCP implementations take one RTT measurement at a time (i. e., about
once per RTT). As alternative, the TCP timestamp option [RFC1323] once per RTT). As alternative, the TCP timestamp option [RFC1323]
allows more frequent explicit measurements, since a sender can safely allows more frequent explicit measurements, since a sender can safely
obtain an RTT sample from every received acknowledgment. In obtain an RTT sample from every received acknowledgment. In
principle, similar measurement mechanisms are used by protocols other principle, similar measurement mechanisms are used by protocols other
than TCP. than TCP.
Sometimes it would be beneficial to know the RTT not only at the Sometimes it would be beneficial to know the RTT not only at the
sender, but also at the receiver. A passive receiver can deduce some sender, but also at the receiver, e.g., to find the one-way variation
information about the RTT by analyzing the sequence numbers of in delay due to one-way congestion. A passive receiver can deduce
some information about the RTT by analyzing the sequence numbers of
received segments. But this method is error-prone and only works if received segments. But this method is error-prone and only works if
the sender permanently sends data. Other network entities on the path the sender permanently sends data. Other network entities on the path
can apply similar heuristics in order to approximate the RTT of a can apply similar heuristics in order to approximate the RTT of a
connection, but this mechanism is protocol-specific and requires per- connection, but this mechanism is protocol-specific and requires per-
connection state. In the current Internet, there is no simple and connection state. In the current Internet, there is no simple and
safe solution to determine the RTT of a connection in network safe solution to determine the RTT of a connection in network
entities other than the sender. entities other than the sender.
As outlined earlier in this document, the round-trip time is As outlined earlier in this document, the round-trip time is
typically not a constant value. For a given path, there is typically not a constant value. For a given path, there is
theoretical minimum value, which is given by the minimum theoretical minimum value, which is given by the minimum
transmission, processing and propagation delay on that path. However, transmission, processing and propagation delay on that path. However,
additional variable delays might be caused by congestion, cross- additional variable delays might be caused by congestion, cross-
traffic, shared mediums access control schemes, recovery procedures, traffic, shared mediums access control schemes, recovery procedures,
or other sub-IP layer mechanisms. Furthermore, a change of the path or other sub-IP layer mechanisms. Furthermore, a change of the path
(e. g., route flipping, handover in mobile networks) can result in (e. g., route flipping, hand-over in mobile networks) can result in
completely different delay characteristics. completely different delay characteristics.
Due to this variability, one single measured RTT value is hardly Due to this variability, one single measured RTT value is hardly
sufficient to characterize a path. This is why many protocols use RTT sufficient to characterize a path. This is why many protocols use RTT
estimators that derive an averaged value and keep track of a certain estimators that derive an averaged value and keep track of a certain
history of previous samples. For instance, TCP endpoints derive a history of previous samples. For instance, TCP endpoints derive a
smoothed round-trip time (SRTT) from an exponential weighted moving smoothed round-trip time (SRTT) from an exponential weighted moving
average [RFC2988]. Such a low-pass filter ensures that measurement average [RFC2988]. Such a low-pass filter ensures that measurement
noise and single outliers do not significantly affect the estimated noise and single outliers do not significantly affect the estimated
RTT. Still, a fundamental drawback of low-pass filters is that the RTT. Still, a fundamental drawback of low-pass filters is that the
skipping to change at page 25, line 15 skipping to change at page 29, line 24
evidence that such frequent sampling may not have a significant evidence that such frequent sampling may not have a significant
benefit [Allman99]. benefit [Allman99].
- Filter design: A closely related question is how to design good - Filter design: A closely related question is how to design good
filters for the measured samples. The existing algorithms are known filters for the measured samples. The existing algorithms are known
to be robust, but they are far from being perfect. The fundamental to be robust, but they are far from being perfect. The fundamental
problem is that there is no single set of RTT values that could problem is that there is no single set of RTT values that could
characterize the Internet as a whole, i.e., it is hard to define a characterize the Internet as a whole, i.e., it is hard to define a
design target. design target.
- Default values: RTT estimators can fail in certain scenarios, e. - Default values: RTT estimators can fail in certain scenarios, e.g.,
g., when any feedback is missing. In this case, default values have when any feedback is missing. In this case, default values have
to be used. Today, most default values are set to conservative to be used. Today, most default values are set to conservative
values that may not be optimal for most Internet communication. values that may not be optimal for most Internet communication.
Still, the impact of more aggressive settings is not well Still, the impact of more aggressive settings is not well
understood. understood.
- Clock granularities: RTT estimation depends on the clock - Clock granularities: RTT estimation depends on the clock
granularities of the protocol stacks. Even though there is a trend granularities of the protocol stacks. Even though there is a trend
towards higher precision timers, the limited granularity may still towards higher precision timers, the limited granularity
prevent highly accurate RTT estimations. (particularly on low cost devices) may still prevent highly
accurate RTT estimations.
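To make the roles of the filter, the default values, and the clock
granularity concrete, the following Python sketch follows the
estimator of [RFC2988] as the editor recalls it: gains of 1/8 and
1/4, an initial RTO of 3 seconds, a minimum RTO of 1 second, and
Karn's rule of ignoring samples from retransmitted segments. The
class name and the 100 ms granularity are illustrative assumptions.

```python
class RttEstimator:
    """Sketch of an RFC 2988 style SRTT/RTTVAR/RTO estimator."""

    ALPHA = 1 / 8   # gain of the SRTT low-pass filter
    BETA = 1 / 4    # gain of the RTTVAR low-pass filter
    K = 4           # variance multiplier in the RTO formula

    def __init__(self, clock_granularity=0.1, initial_rto=3.0,
                 min_rto=1.0):
        self.g = clock_granularity   # seconds; assumed value
        self.min_rto = min_rto
        self.srtt = None             # no sample yet
        self.rttvar = None
        self.rto = initial_rto       # default used without feedback

    def on_sample(self, rtt, retransmitted=False):
        """Feed one RTT sample (seconds) and return the new RTO."""
        if retransmitted:
            # Karn's algorithm: the sample is ambiguous, ignore it.
            return self.rto
        if self.srtt is None:
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # RTTVAR is updated first, using the old SRTT.
            self.rttvar = ((1 - self.BETA) * self.rttvar
                           + self.BETA * abs(self.srtt - rtt))
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        self.rto = max(self.min_rto,
                       self.srtt + max(self.g, self.K * self.rttvar))
        return self.rto
```

With these gains a single outlier moves SRTT by only one eighth of
its distance from the previous estimate, which is the low-pass
behavior discussed above. The same gains also make the estimator slow
to follow a genuine path change.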
3.8.2 Malfunctioning devices 3.8.2 Malfunctioning devices
There is a long history of malfunctioning devices harming the There is a long history of malfunctioning devices harming the
deployment of new and potentially beneficial functionality in the deployment of new and potentially beneficial functionality in the
Internet. Sometimes, such devices drop packets when a certain Internet. Sometimes, such devices drop packets or even crash
mechanism is used, causing users to opt for reliability instead of completely when a certain mechanism is used, causing users to opt for
performance and disable the mechanism, or operating system vendors to reliability instead of performance and disable the mechanism, or
disable it by default. One well-known example is ECN, whose operating system vendors to disable it by default. One well-known
deployment was long hindered by malfunctioning firewalls, but there example is ECN, whose deployment was long hindered by malfunctioning
are many other examples (e.g. the Window Scaling option of TCP). firewalls and is still hindered by malfunctioning home-hubs, but
there are many other examples (e.g. the Window Scaling option of TCP)
[Thaler07].
As new congestion control mechanisms are developed with the intention As new congestion control mechanisms are developed with the intention
of eventually seeing them deployed in the Internet, it would be of eventually seeing them deployed in the Internet, it would be
useful to collect information about failures caused by devices of useful to collect information about failures caused by devices of
this sort, analyze the reasons for these failures, and determine this sort, analyze the reasons for these failures, and determine
whether there are ways for such devices to do what they intend to do whether there are ways for such devices to do what they intend to do
without causing unintended failures. Recommendation for vendors of without causing unintended failures. Recommendation for vendors of
these devices could be derived from such an analysis. It would also these devices could be derived from such an analysis. It would also
be useful to see whether there are ways for failures caused by such be useful to see whether there are ways for failures caused by such
devices to become more visible to endpoints, or for those failures to devices to become more visible to endpoints, or for those failures to
become more visible to the maintainers of such devices. become more visible to the maintainers of such devices.
3.8.3 Dependence on RTT
AIMD window algorithms that have the goal of packet conservation end
up converging on a rate that is inversely proportional to RTT.
However, control theoretic approaches to stability have shown that
only the increase in rate (acceleration), not the target rate, needs
to be inversely proportional to RTT.
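The inverse dependence of the AIMD rate on RTT can be sketched with
the well-known square-root approximation of steady-state TCP
throughput. That formula is not taken from this document; it is
brought in here as an assumption for illustration, and the function
name and parameter values are hypothetical.

```python
import math


def aimd_rate_estimate(mss_bytes, rtt_s, loss_prob):
    """Approximate steady-state AIMD rate in bit/s.

    Uses rate ~ (MSS / RTT) * sqrt(3 / (2 * p)), which holds only
    for small, random loss probabilities p.
    """
    return (mss_bytes * 8 / rtt_s) * math.sqrt(1.5 / loss_prob)


# Same loss probability, RTT doubled: the estimated rate halves.
near = aimd_rate_estimate(mss_bytes=1460, rtt_s=0.05, loss_prob=0.01)
far = aimd_rate_estimate(mss_bytes=1460, rtt_s=0.10, loss_prob=0.01)
ratio = near / far
```

Under this approximation two flows that see the same loss
probability obtain rates in inverse proportion to their RTTs.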
It is possible to have more aggressive behaviors for some demanding
applications as long as they are part of a mix with less aggressive
transports [Key04]. This beneficial effect of transport type mixing
is probably how the Internet currently manages to remain stable even
in the presence of TCP slow start, which is more aggressive than the
theory allows for stability. Research giving deeper insight into
these aspects would be very useful.
3.8.4 Congestion Control in Multi-layered Networks
We often forget that a network of IP nodes is just as vulnerable to
congestion in the lower layers between IP-capable nodes as it is to
congestion on the IP-capable nodes themselves. As we develop
techniques for network equipment to take a greater part in congestion
control (ECN, XCP, RCP etc – see Section 3.1), we must not forget
that these techniques will either need to be deployed at lower layers
as well, or they will need to interwork with lower layer mechanisms.
[ECN-tunnel] gives guidelines on propagating ECN from lower layers
upwards, but to the authors' knowledge the layering problem has not
been addressed for explicit rate protocol proposals such as XCP &
RCP. Some issues are straightforward matters of interoperability
(e.g. how exactly to copy fields up the layers), while others are
less obvious (e.g. re-framing issues: if RCP were deployed in a lower
layer, how might multiple small RCP frames all with different rates
in their headers be assembled into a larger IP-layer datagram?).
Multi-layer considerations also confound many mechanisms that aim to
discover whether every node on the path supports the new congestion
control protocol. For instance, some proposals maintain a secondary
TTL field parallel to that in the IP header. Any nodes that support
the new behavior update both TTL fields, whereas legacy IP nodes will
only update the IP TTL field. This allows the endpoints to check
whether all IP nodes on the path support the new behavior, in which
case both TTLs will be equal at the receiver. But mechanisms like
these overlook nodes at lower layers that might not support the new
behavior.
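The parallel-TTL check and its blind spot can be sketched as follows.
This is a minimal simulation under stated assumptions: the Packet
fields, the "ip"/"l2" layer labels and the function names are all
hypothetical and do not correspond to any particular proposal.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    ip_ttl: int      # TTL field in the IP header
    shadow_ttl: int  # hypothetical secondary TTL of the new protocol


def forward(packet, path):
    """Return the packet as seen by the receiver.

    path: list of (layer, supports_new_behavior) tuples, where layer
    is "ip" for an IP node and anything else for a lower-layer node.
    """
    ip_ttl, shadow_ttl = packet.ip_ttl, packet.shadow_ttl
    for layer, supports_new in path:
        if layer != "ip":
            # Lower-layer nodes touch neither TTL field, whether or
            # not they support the new behavior.
            continue
        ip_ttl -= 1
        if supports_new:
            shadow_ttl -= 1  # only upgraded IP nodes update both
    return Packet(ip_ttl, shadow_ttl)


def path_looks_capable(received):
    """Receiver-side check: equal TTLs suggest a fully capable path."""
    return received.ip_ttl == received.shadow_ttl
```

With a path of two upgraded IP nodes separated by a non-supporting
lower-layer node, both TTLs still arrive equal, so the check passes
even though one node on the path does not support the new behavior.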
It should also be possible to include the issue of congestion control
across overlay networks of relays under the general area of multi-
layer congestion control.
3.8.5. Multipath End-to-end Congestion Control and Traffic Engineering
Recent work has shown that multipath endpoint congestion control
[Kelly05] offers considerable benefits in terms of resilience and
resource usage efficiency. By pooling the resources on all paths,
even nodes not using multiple paths benefit from those that are.
There is still considerable further research to do in this area,
particularly to understand interactions with network operator
controlled route provision and traffic engineering, and indeed
whether multipath congestion control can perform better traffic
engineering than the network itself, given the right incentives.
3.8.6. ALGs and Middleboxes
An increasing number of application layer gateways (ALGs),
middleboxes, and proxies (see Section 3.6 of [RFC2775]) are deployed
at domain boundaries to verify conformance, but also to filter
traffic and control flows, e.g. to prevent information other than
routing information from leaking between autonomous systems. These
systems split up end-to-end TCP connections and prevent end-to-end
congestion control. On the other hand, transport over encrypted
tunnels may not allow other network entities to participate in
congestion control.
Such systems disrupt the primal and dual congestion control
components, yet their effects have not so far been studied
systematically. From this perspective, two levels of interference
need to be considered:
- The "transparent" case, i.e. the end-point address from the sender
perspective is still the receiver (the destination IP address). For
instance, relay systems intercept payload but do not relay
congestion control information.
- The "non-transparent" case (back-to-back connections), which
results in a lesser problem. Indeed, although these devices
interfere with end-to-end network transparency, they correctly
terminate network, transport and application layer protocols on
both sides.
4. Security Considerations 4. Security Considerations
Misbehavior may be driven by pure malice, or malice may in turn be Misbehavior may be driven by pure malice, or malice may in turn be
driven by wider selfish interests, e.g. using distributed denial of driven by wider selfish interests, e.g. using distributed denial of
service (DDoS) attacks to gain rewards by extortion [RFC4948]. DDoS service (DDoS) attacks to gain rewards by extortion [RFC4948]. DDoS
attacks are possible both because of vulnerabilities in operating attacks are possible both because of vulnerabilities in operating
systems and because the Internet delivers packets without requiring systems and because the Internet delivers packets without requiring
congestion control. congestion control.
To date, compliance with congestion control rules and being fair To date, compliance with congestion control rules and being fair
skipping to change at page 26, line 21 skipping to change at page 32, line 26
behavior can be regarded as a security issue; its implications are behavior can be regarded as a security issue; its implications are
discussed throughout these documents in a scattered fashion. discussed throughout these documents in a scattered fashion.
Currently the focus of the research agenda against denial of service Currently the focus of the research agenda against denial of service
is about identifying attack packets, attacking machines and networks is about identifying attack packets, attacking machines and networks
hosting them, with a particular focus on mitigating source address hosting them, with a particular focus on mitigating source address
spoofing. But if mechanisms to enforce congestion control fairness spoofing. But if mechanisms to enforce congestion control fairness
were robust to both selfishness and malice [Bri06] they would also were robust to both selfishness and malice [Bri06] they would also
naturally mitigate denial of service, which can be considered (from naturally mitigate denial of service, which can be considered (from
the perspective of well-behaving Internet user) as a congestion the perspective of well-behaving Internet user) as a congestion
control enforcement problem. control enforcement problem. Even some denial of service attacks on
hosts (rather than the network) could be considered as a congestion
control enforcement issue at the higher layer. But clearly there are
also denial of service attacks that would not be solved by enforcing
congestion control.
5. Contributors 5. Contributors
This document is the result of a collective effort to which the This document is the result of a collective effort to which the
following people have contributed: following people have contributed:
Dimitri Papadimitriou <dimitri.papadimitriou@alcatel-lucent.be> Dimitri Papadimitriou <dimitri.papadimitriou@alcatel-lucent.be>
Michael Welzl <michael.welzl@uibk.ac.at> Michael Welzl <michael.welzl@uibk.ac.at>
Wesley Eddy <weddy@grc.nasa.gov> Wesley Eddy <weddy@grc.nasa.gov>
Bela Berde <bela.berde@gmx.de> Bela Berde <bela.berde@gmx.de>
skipping to change at page 27, line 5 skipping to change at page 33, line 11
[RFC793] Postel, J., "Transmission Control Protocol", STD 7, [RFC793] Postel, J., "Transmission Control Protocol", STD 7,
RFC793, September 1981. RFC793, September 1981.
[RFC896] Nagle, J., "Congestion Control in IP/TCP", RFC 896, [RFC896] Nagle, J., "Congestion Control in IP/TCP", RFC 896,
January 1984. January 1984.
[RFC1323] Jacobson, V., Braden, R., Borman, D., "TCP Extensions for [RFC1323] Jacobson, V., Braden, R., Borman, D., "TCP Extensions for
High Performance", RFC 1323, May 1992. High Performance", RFC 1323, May 1992.
[RFC1958] B. Carpenter, Ed., “Architectural Principles of the [RFC1958] Carpenter, B., Ed., “Architectural Principles of the
[RFC2309] Braden, B., et al., "Recommendations on queue management [RFC2309] Braden, B., et al., "Recommendations on queue management
and congestion avoidance in the Internet", RFC 2309, and congestion avoidance in the Internet", RFC 2309,
April 1998. April 1998.
[RFC2003] Perkins, C., "IP Encapsulation within IP", RFC 2003, [RFC2003] Perkins, C., "IP Encapsulation within IP", RFC 2003,
October 1996. October 1996.
[RFC2474] Nichols, K., Blake, S. Baker, F. and D. Black, [RFC2474] Nichols, K., Blake, S. Baker, F. and D. Black,
"Definition of the Differentiated Services Field (DS "Definition of the Differentiated Services Field (DS
Field) in the IPv4 and IPv6 Headers", RFC 2474, December Field) in the IPv4 and IPv6 Headers", RFC 2474, December
1998. 1998.
[RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z. [RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z.
and W. Weiss, "An Architecture for Differentiated and Weiss, W., "An Architecture for Differentiated
Services", RFC 2475, December 1998. Services", RFC 2475, December 1998.
[RFC2581] Allman, M., Paxson, V., and W. Stevens, "TCP Congestion [RFC2581] Allman, M., Paxson, V., and W. Stevens, "TCP Congestion
Control", RFC 2581, April 1999. Control", RFC 2581, April 1999.
[RFC2861] Handley, M., Padhye, J., and S. Floyd, "TCP
Congestion Window Validation", RFC 2861, June 2000.
[RFC2914] Floyd, S., "Congestion Control Principles", BCP 41, [RFC2914] Floyd, S., "Congestion Control Principles", BCP 41,
RFC 2914, September 2000. RFC 2914, September 2000.
[RFC2988] Paxson, V. and Allman, M., "Computing TCP's [RFC2988] Paxson, V. and Allman, M., "Computing TCP's
Retransmission Timer", RFC 2988, Nov. 2000 Retransmission Timer", RFC 2988, Nov. 2000
[RFC2990] Huston, G., "Next Steps for the IP QoS Architecture", [RFC2990] Huston, G., "Next Steps for the IP QoS Architecture",
RFC 2990, November 2000. RFC 2990, November 2000.
[RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
of Explicit Congestion Notification (ECN) to IP", of Explicit Congestion Notification (ECN) to IP",
RFC 3168, September 2001. RFC 3168, September 2001.
[RFC3448] Handley, M., Floyd, S., Padhye, J., and J. Widmer, "TCP [RFC3448] Handley, M., Floyd, S., Padhye, J., and J. Widmer, "TCP
Friendly Rate Control (TFRC): Protocol Specification", Friendly Rate Control (TFRC): Protocol Specification",
RFC 3448, January 2003. RFC 3448, January 2003.
[RFC3540] N. Spring, D. Wetherall, "Robust Explicit Congestion [RFC3540] Spring, N., and D. Wetherall, "Robust Explicit Congestion
Notification (ECN) Signaling with Nonces", RFC 3540, June Notification (ECN) Signaling with Nonces", RFC 3540, June
2003. 2003.
[RFC3662] Roland Bless, Kathleen Nichols, Klaus Wehrle, "A Lower [RFC3662] Bless, R., Nichols, K., and K. Wehrle, "A Lower Effort
Effort Per-Domain Behavior for Differentiated Services", Per-Domain Behavior for Differentiated Services", RFC
RFC 3662, December 2003. 3662, December 2003.
[RFC3714] S. Floyd, Ed., J. Kempf, Ed. "IAB Concerns Regarding [RFC3714] Floyd, S., and J. Kempf, Eds. "IAB Concerns Regarding
Congestion Control for Voice Traffic in the Internet", Congestion Control for Voice Traffic in the Internet",
RFC 3714, March 2004. RFC 3714, March 2004.
[RFC3742] Floyd, S., "Limited Slow-Start for TCP with Large
Congestion Windows", RFC 3742, March 2004.
[RFC3985] Bryant, S. and P. Pate, "Pseudo Wire Emulation Edge-to- [RFC3985] Bryant, S. and P. Pate, "Pseudo Wire Emulation Edge-to-
Edge (PWE3) Architecture", RFC 3985, March 2005. Edge (PWE3) Architecture", RFC 3985, March 2005.
[RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram [RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram
Congestion Control Protocol (DCCP)", RFC 4340, March Congestion Control Protocol (DCCP)", RFC 4340, March
2006. 2006.
[RFC4341] Floyd, S. and E. Kohler, "Profile for Datagram Congestion [RFC4341] Floyd, S. and E. Kohler, "Profile for Datagram Congestion
Control Protocol (DCCP) Congestion Control ID 2: TCP-like Control Protocol (DCCP) Congestion Control ID 2: TCP-like
Congestion Control", RFC 4341, March 2006. Congestion Control", RFC 4341, March 2006.
skipping to change at page 28, line 36 skipping to change at page 34, line 47
Division Multiplexing (TDM) over Packet (SAToP)", Division Multiplexing (TDM) over Packet (SAToP)",
RFC 4553, June 2006. RFC 4553, June 2006.
[RFC4614] Duke, M., R. Braden, R., Eddy, W., and Blanton, E., "A [RFC4614] Duke, M., R. Braden, R., Eddy, W., and Blanton, E., "A
Roadmap for Transmission Control Protocol (TCP) Roadmap for Transmission Control Protocol (TCP)
Specification Documents", RFC 4614, September 2006. Specification Documents", RFC 4614, September 2006.
[RFC4782] Floyd, S., Allman, M., Jain, A., and P. Sarolahti, [RFC4782] Floyd, S., Allman, M., Jain, A., and P. Sarolahti,
"Quick-Start for TCP and IP", RFC 4782, Jan. 2007. "Quick-Start for TCP and IP", RFC 4782, Jan. 2007.
[RFC4948] L. Andersson, E. Davies, L. Zhang, "Report from the IAB [RFC4948] Andersson, L., Davies, E., and L. Zhang, "Report from the
workshop on Unwanted Traffic March 9-10, 2006", RFC 4948, IAB workshop on Unwanted Traffic March 9-10, 2006", RFC
August 2007. 4948, August 2007.
[RFC5033] S. Floyd, M. Allman, "Specifying New Congestion Control [RFC5033] Floyd, S., and M. Allman, "Specifying New Congestion
Algorithms ", RFC 5033, Aug. 2007. Control Algorithms", RFC 5033, Aug. 2007.
[RFC5405] Eggert, L., and G. Fairhurst, "Unicast UDP Usage
Guidelines for Application Designers", RFC 5405, November
2008.
[iccrg-rfcs]Welzl, M., and W. Eddy, "Congestion Control in the RFC
Series", Internet Draft, work in Progress, October 2008.
6.2 Informative References 6.2 Informative References
[Allman99] Allman, M. and V. Paxson, "On Estimating End-to-End [Allman99] Allman, M. and V. Paxson, "On Estimating End-to-End
Network Path Properties", Proc. SIGCOMM, Sept. 99. Network Path Properties", Proceedings of ACM SIGCOMM'99,
September 1999.
[Andrew00] L. Andrew, B. Wydrowski and S. Low, "An Example of [Andrew00] L. Andrew, B. Wydrowski and S. Low, "An Example of
Instability in XCP", Manuscript available at Instability in XCP", Manuscript available at
<http://netlab.caltech.edu/maxnet/XCP_instability.pdf> <http://netlab.caltech.edu/maxnet/XCP_instability.pdf>
[Ath01] S. Athuraliya, S. Low, V. Li, and Q. Yin, "REM: Active
[Ath01] Athuraliya, S., Low, S., Li, V., and Q. Yin, "REM: Active
queue management", IEEE Network Magazine, vol.15, no.3, queue management", IEEE Network Magazine, vol.15, no.3,
pp. 48-53, May 2001. pp. 48-53, May 2001.
[BALAN01] Balan, R. K., Lee, B.P., Kumar, K.R.R., Jacob, L., Seah, [BALAN01] Balan, R. K., Lee, B.P., Kumar, K.R.R., Jacob, L., Seah,
W.K.G., Ananda, A.L., "TCP HACK: TCP Header Checksum W.K.G., and Ananda, A.L., "TCP HACK: TCP Header Checksum
Option to Improve Performance over Lossy Links", Option to Improve Performance over Lossy Links",
Proceedings of IEEE Infocom, Anchorage, Alaska, April Proceedings of IEEE INFOCOM'01, Anchorage (Alaska), USA,
2001. April 2001.
[Bonald00] T. Bonald, M. May, and J.-C. Bolot, "Analytic Evaluation [Bonald00] Bonald, T., May, M., and J.-C. Bolot, "Analytic
of RED Performance," In Proceedings of IEEE INFOCOM, Tel Evaluation of RED Performance," Proceedings of IEEE
Aviv, Israel, March 2000. INFOCOM'00, Tel Aviv, Israel, March 2000.
[Bri07] Bob Briscoe, "Flow Rate Fairness: Dismantling a Religion" [Bri08] Briscoe, B., Moncaster, T. and L. Burness, "Problem
ACM SIGCOMM Computer Communication Review 37(2) 63--74 Statement: Transport Protocols Don't Have To Do
(April 2007) Fairness", Work in progress, draft-briscoe-tsvwg-relax-
fairness-01, July 2008.
[Bri06] Bob Briscoe, "Using Self-interest to Prevent Malice; [Bri07] Briscoe, B., "Flow Rate Fairness: Dismantling a
Religion", ACM SIGCOMM Computer Communication Review,
Vol.37, No.2, pp.63-74, April 2007.
[Bri06] Briscoe, B., "Using Self-interest to Prevent Malice;
Fixing the Denial of Service Flaw of the Internet," Fixing the Denial of Service Flaw of the Internet,"
Workshop on the Economics of Securing the Information Workshop on the Economics of Securing the Information
Infrastructure (Oct 2006) Infrastructure, October 2006.
<http://wesii.econinfosec.org/draft.php?paper_id=19> <http://wesii.econinfosec.org/draft.php?paper_id=19>
[Bryant08] Bryant, S., Davie, B., Martini, L., and E. Rosen,
"Pseudowire Congestion Control Framework", Work in
Progress, draft-ietf-pwe3-congestion-frmwk-01.txt, May
2008.
[Chester04] Chesterfield, J., Chakravorty, R., Banerjee, S., [Chester04] Chesterfield, J., Chakravorty, R., Banerjee, S.,
Rodriguez, P., Pratt, I. and Crowcroft, J., "Transport Rodriguez, P., Pratt, I. and Crowcroft, J., "Transport
level optimisations for streaming media over wide-area level optimisations for streaming media over wide-area
wireless networks", WIOPT'04, March 2004. wireless networks", WIOPT'04, March 2004.
[Chiu89] D. M. Chiu and R. Jain, "Analysis of the increase and [Chiu89] Chiu, D. M., and R. Jain, "Analysis of the increase and
decrease algorithms for congestion avoidance in computer decrease algorithms for congestion avoidance in computer
networks", Computer Networks and ISDN Systems, vol. 17, networks", Computer Networks and ISDN Systems, vol. 17,
pp. 1-14, 1989. pp. 1-14, 1989.
[Clark98] D. Clark and W. Fang, "Explicit Allocation of Best-Effort [Clark98] Clark, D. and W. Fang, "Explicit Allocation of Best-
Packet Delivery Service," IEEE/ACM Transactions on Effort Packet Delivery Service," IEEE/ACM Transactions on
Networking, vol.6, no.4, pp.362-373, August 1998 Networking, vol.6, no.4, pp.362-373, August 1998.
[Dukki06] N. Dukkipati and N. McKeown, "Why Flow-Completion Time is [Dukki05] Dukkipati, N., Kobayashi, M., Zhang-Shen, R. and N.,
the Right Metric for Congestion Control" ACM SIGCOMM McKeown, "Processor Sharing Flows in the Internet",
Computer Communication Review Volume 36, issue 1, Jan. Proceedings of International Workshop on QoS (IWQoS'05),
June 2005.
[Dukki06] Dukkipati, N. and N. McKeown, "Why Flow-Completion Time
is the Right Metric for Congestion Control", ACM SIGCOMM
Computer Communication Review, Vol.36, No.1, January
2006. 2006.
[Floyd93] S. Floyd and V. Jacobson, “Random early detection [ECN-tunnel]Briscoe, B., "Layered Encapsulation of Congestion
gateways for congestion avoidance,” IEEE/ACM Trans. on Notification", draft-briscoe-tsvwg-ecn-tunnel, Work in
Networking, vol.1, no.4, pp. 397-413, Aug. 1993. progress.
[Falk07] A. Falk et al "Specification for the Explicit Control [Falk07] Falk, A., et al., "Specification for the Explicit Control
Protocol (XCP)", Work in Progress, draft-falk-xcp-spec- Protocol (XCP)", Work in Progress, draft-falk-xcp-spec-
03.txt, July 2007. 03.txt, July 2007.
[Firoiu00] V. Firoiu and M. Borden, "A Study of Active Queue [Firoiu00] Firoiu, V., and M. Borden, "A Study of Active Queue
Management for Congestion Control," In Proceedings of Management for Congestion Control," Proceedings of IEEE
IEEE INFOCOM, Tel Aviv, Israel, March 2000. INFOCOM'00, Tel Aviv, Israel, March 2000.
[Floyd94] S. Floyd, "TCP and Explicit Congestion Notification", [Floyd93] Floyd, S., and V. Jacobson, "Random early detection
ACM Computer Communication Review, vol.24, no.5, October gateways for congestion avoidance," IEEE/ACM Transactions
1994, pp. 10-23. on Networking, vol.1, no.4, pp.397-413, August 1993.
[Hollot01] C. Hollot, V. Misra, D. Towsley, and W.-B. Gong, "A [Floyd94] Floyd, S., "TCP and Explicit Congestion Notification",
Control Theoretic Analysis of RED," In Proceedings of ACM Computer Communication Review, vol.24, no.5, pp.10-
IEEE INFOCOM, Anchorage, Alaska, April 2001. 23, October 1994.
[Jacobson88] V. Jacobson, "Congestion Avoidance and Control", Proc. [Floyd08] Floyd, S., and M. Allman, "Comments on the Usefulness of
of the ACM SIGCOMM '88 Symposium, pp. 314-329, August Simple Best-Effort Traffic", RFC 5290, July 2008.
1988.
[Jain88] R. Jain and K. Ramakrishnan, "Congestion Avoidance in [Hollot01] Hollot, C., Misra, V., Towsley, D., and W.-B. Gong, "A
Control Theoretic Analysis of RED," Proceedings of IEEE
INFOCOM'01, Anchorage, Alaska, April 2001.
[Jacobson88]Jacobson, V., "Congestion Avoidance and Control",
Proceedings of ACM SIGCOMM'88 Symposium, August 1988.
[Jain88] Jain, R., and K. Ramakrishnan, "Congestion Avoidance in
Computer Networks with a Connectionless Network Layer: Computer Networks with a Connectionless Network Layer:
Concepts, Goals, and Methodology", In Proceedings of IEEE Concepts, Goals, and Methodology", Proceedings of IEEE
Computer Networking Symposium: proceedings, Sheraton Computer Networking Symposium, Washington DC, USA, April
National Hotel, Washington, DC area, April 11-13, 1988. 1988.
[Jain90] R. Jain, "Congestion Control in Computer Networks: Trends [Jain90] Jain, R., "Congestion Control in Computer Networks:
and Issues", IEEE Network, May 1990, pp. 24-30, ISSN Trends and Issues", IEEE Network, pp. 24-30, May 1990.
0890-8044.
[Jin04] Chen Jin, David X. Wei and Steven Low "FAST TCP: [Jin04] Jin, Ch., Wei, D.X., and S. Low, "FAST TCP: Motivation,
Motivation, Architecture, Algorithms, Performance," In Architecture, Algorithms, Performance," Proceedings of
Proc. IEEE Conference on Computer Communications IEEE INFOCOM'04, Hong-Kong, China, March 2004.
Infocomm'04) (March 2004)
[Katabi02] D. Katabi, M. Handley, and C. Rohr, "Internet Congestion [Katabi02] Katabi, D., M. Handley, and C. Rohr, "Internet Congestion
Control for Future High Bandwidth-Delay Product Control for Future High Bandwidth-Delay Product
Environments", Proceedings of the ACM SIGCOMM '02 Environments", Proceedings of ACM SIGCOMM'02 Symposium,
Symposium, pp. 89-102, August 2002. pp. 89-102, August 2002.
[Kelly98] F. Kelly, A. Maulloo, and D. Tan, "Rate control in [Kelly98] Kelly, F., Maulloo, A., and D. Tan, "Rate control in
communication networks: shadow prices, proportional communication networks: shadow prices, proportional
fairness, and stability," Journal of the Operational fairness, and stability," Journal of the Operational
Research Society, vol.49, pp. 237–252, 1998. Research Society, vol.49, pp. 237–252, 1998.
[Keshav] S. Keshav, "What is congestion and what is congestion [Kelly05] Kelly, F., and Th. Voice, "Stability of end-to-end
control", Presentation at IRTF ICCRG Workshop, Pfldnet algorithms for joint routing and rate control", ACM
2007, (Los Angeles), California, February 2007. SIGCOMM Computer Communication Review, Vol.35, No.2, pp.
5-12, April 2005.
[Krishnan04] R. Krishnan, J. Sterbenz, W. Eddy, C. Partridge, and M. [Keshav] Keshav, S., "What is congestion and what is congestion
Allman, "Explicit Transport Error Notification (ETEN) for control", Presentation at IRTF ICCRG Workshop, PFLDNet
Error-Prone Wireless and Satellite Networks", Computer 2007, Los Angeles (California), USA, February 2007.
Networks, vol.46, no.3, October 2004.
[Kuzmanovic] A. Kuzmanovic and E. W. Knightly, "TCP-LP: A Distributed [Key04] Key, P., Massoulié, L., Bain, A., and F. Kelly, "Fair
Algorithm for Low Priority Data Transfer", Proceedings of Internet Traffic Integration: Network Flow Models and
IEEE INFOCOM 2003, San Francisco, CA, April 2003. Analysis", Annales des Télécommunications, Vol.59, No.11-
12, pp. 1338-1352, November-December 2004.
[Low05] S. Low, L. Andrew and B. Wydrowski. "Understanding XCP: [Krishnan04] Krishnan, R., Sterbenz, J., Eddy, W., Partridge, C., and
equilibrium and fairness", Proceedings of IEEE Infocom, M. Allman, "Explicit Transport Error Notification (ETEN)
Miami, USA, March 2005. for Error-Prone Wireless and Satellite Networks",
Computer Networks, vol.46, no.3, October 2004.
[Low03.2] S. Low, F. Paganini, J. Wang, and J. Doyle, "Linear [Kuzmanovic03] Kuzmanovic, A., and E. W. Knightly, "TCP-LP: A
Distributed Algorithm for Low Priority Data Transfer",
Proceedings of IEEE INFOCOM'03, San Francisco
(California), USA, April 2003.
[Low05] Low, S., Andrew, L., and B. Wydrowski, "Understanding
XCP: equilibrium and fairness", Proceedings of IEEE
INFOCOM'05, Miami (Florida), USA, March 2005.
[Low03.2] Low, S., Paganini, F., Wang, J., and J. Doyle, "Linear
stability of TCP/RED and a scalable control", Computer stability of TCP/RED and a scalable control", Computer
Networks Journal, vol.43, no.5, pp.633-647, December Networks Journal, vol.43, no.5, pp.633-647, December
2003. 2003.
[Low03.1] S. Low, "A duality model of TCP and queue management [Low03.1] Low, S., "A duality model of TCP and queue management
algorithms", IEEE/ACM Trans. on Networking, vol.11, no.4, algorithms", IEEE/ACM Transactions on Networking, vol.11,
pp.525–536, August 2003. no.4, pp.525–536, August 2003.
[Low02] S. Low, F. Paganini, J. Wang, S. Adlakha, and J. C. [Low02] Low, S., Paganini, F., Wang, J., Adlakha, S., and J.C.
Doyle, "Dynamics of TCP/RED and a Scalable Control", Doyle, "Dynamics of TCP/RED and a Scalable Control",
Proceedings of IEEE Infocom, New York, USA, June 2002. Proceedings of IEEE INFOCOM'02, New York (New-Jersey),
USA, June 2002.
[Mascolo01] Saverio Mascolo, Claudio Casetti, Mario Gerla, M. Y. [LT-TCP] Tickoo, O., Subramanian, V., Kalyanaraman, S., and K.K.
Sanadidi, Ren Wang, "TCP westwood: Bandwidth estimation Ramakrishnan, "LT-TCP: End-to-End Framework to Improve
for enhanced transport over wireless links", Proceedings TCP Performance over Networks with Lossy Channels",
of MOBICOM 2001, pp. 287-297. Proceedings of International Workshop on QoS (IWQoS),
June 2005.
[Moors02] T. Moors, “A critical review of "End-to-end arguments in [Mascolo01] Mascolo, S., Casetti, Cl., Gerla M., Sanadidi, M.Y., and
system design"”, Proc. International Conference on R. Wang, "TCP westwood: Bandwidth estimation for enhanced
Communications (ICC), Apr./May 2002. transport over wireless links", Proceedings of MOBICOM
2001, pp.287-297, 2001.
[MKMV95] MacKie-Mason, J. and H. Varian, "Pricing Congestible [Moors02] Moors, T., "A critical review of "End-to-end arguments in
system design", Proceedings of IEEE International
Conference on Communications (ICC), Apr./May 2002.
[MKMV95] MacKie-Mason, J., and H. Varian, "Pricing Congestible
Network Resources", IEEE Journal on Selected Areas in Network Resources", IEEE Journal on Selected Areas in
Communications, `Advances in the Fundamentals of Communications, 'Advances in the Fundamentals of
Networking' 13(7)1141--1149, 1995, <http:// Networking', Vol.13, No.7, pp.1141-1149, 1995, <http://
www.sims.berkeley.edu/~hal/Papers/ www.sims.berkeley.edu/~hal/Papers/
pricing-congestible.pdf>. pricing-congestible.pdf>.
[Padhye98] Padhye, J., Firoiu, V., Towsley, D., Kurose, J., Modeling [Padhye98] Padhye, J., Firoiu, V., Towsley, D., and J. Kurose,
TCP Throughput: A Simple Model and Its Empirical "Modeling TCP Throughput: A Simple Model and Its
Validation, UMASS CMPSCI Tech Report TR98-008, Feb. 1998. Empirical Validation", University of Massachusetts
(UMass), CMPSCI Tech Report TR98-008, February 1998.
[Pan00] R. Pan, B. Prabhakar, and K. Psounis, "CHOKe: a stateless [Pan00] Pan, R., Prabhakar, B., and K. Psounis, "CHOKe: a
AQM scheme for approximating fair bandwidth allocation", stateless AQM scheme for approximating fair bandwidth
In Proceedings of IEEE Infocom, Tel Aviv, Israel, March allocation", In Proceedings of IEEE INFOCOM'00, Tel Aviv,
2000. Israel, March 2000.
[Rossi06] Rossi, M., "Evaluating TCP with Corruption Notification [Rossi06] Rossi, M., "Evaluating TCP with Corruption Notification
in an IEEE 802.11 Wireless LAN", master thesis, in an IEEE 802.11 Wireless LAN", master thesis,
University of Innsbruck, November 2006. Available from University of Innsbruck, November 2006. Available from
http://www.welzl.at/research/projects/corruption/ http://www.welzl.at/research/projects/corruption/
[Sarola02] Sarolahti, P. and Kuznetsov, A., "Congestion Control in [Sarola02] Sarolahti, P. and A. Kuznetsov, "Congestion Control in
Linux TCP", "Proc. USENIX Annual Technical Conference", Linux TCP", Proceedings of USENIX Annual Technical
June 2002. Conference, June 2002.
[Sarola07] Sarolahti, P., Floyd, S., and M. Kojo, "Transport-layer
Considerations for Explicit Cross-layer Indications",
Work in Progress, draft-sarolahti-tsvwg-crosslayer-
01.txt, March 2007.
[Savage99] Savage, S., Wetherall, D., and T. Anderson, "TCP [Savage99] Savage, S., Wetherall, D., and T. Anderson, "TCP
Congestion Control with a Misbehaving Receiver," in ACM Congestion Control with a Misbehaving Receiver," ACM
SIGCOMM Computer Communication Review (1999). SIGCOMM Computer Communication Review, 1999.
[Saltzer84] Saltzer, J., Reed, D., and Clark, D. D. [Saltzer84] Saltzer, J., Reed, D., and D. Clark, "End-to-end
End-to-end arguments in system design. ACM arguments in system design", ACM Transactions on Computer
Transactions on Computer Systems 2, 4 (Nov. 1984). Systems, Vol.2, No.4, November 1984.
[Shin08] Shin, M., Chong, S., and I. Rhee, "Dual-Resource TCP/AQM
for Processing-Constrained Networks", IEEE/ACM
Transactions on Networking, Vol.16, No.2, pp. 435-449,
April 2008.
[Thaler07] Thaler, D., Sridhara, M., and D. Bansal, "Implementation
Report on Experiences with Various TCP RFCs",
presentation to the IETF Transport Area,
<http://www.ietf.org/proceedings/07mar/slides/tsvarea-
3/>, March 2007.
[TRILOGY] "Trilogy Project", European Commission Seventh Framework [TRILOGY] "Trilogy Project", European Commission Seventh Framework
Program Contract Number: INFSO-ICT-216372 Program Contract Number: INFSO-ICT-216372
<http://www.trilogy-project.org> <http://www.trilogy-project.org>
[Welzl03] M. Welzl, "Scalable Performance Signalling and Congestion [Welzl03] Welzl, M., "Scalable Performance Signalling and
Avoidance", Springer, August 2003. ISBN 1-4020-7570-7. Congestion Avoidance", Springer (ISBN 1-4020-7570-7),
August 2003.
[Welzl08] M. Welzl, M. Rossi, A. Fumagalli, and M. Tacca, " TCP/IP [Welzl08] Welzl, M., Rossi, M., Fumagalli, A., and M. Tacca,
over IEEE 802.11b WLAN: the Challenge of Harnessing "TCP/IP over IEEE 802.11b WLAN: the Challenge of
Known-Corrupt Data", In Proceedings of IEEE ICC 2008, 19- Harnessing Known-Corrupt Data", Proceedings of IEEE ICC
23 May 2008, Beijing, China. 2008, Beijing, China, May 2008.
[Zhang03] H. Zhang, C. Hollot, D. Towsley, and V. Misra. "A Self- [Xia05] Xia, Y., Subramanian, L., Stoica, I., and S.
Tuning Structure for Adaptation in TCP/AQM Networks", Kalyanaraman, "One more bit is enough", Proceedings of
SIGMETRICS’03, June 10–14, 2003, San Diego, California, ACM SIGCOMM'05, and ACM Computer Communication Review,
USA. Vol.35, No.4, pp. 37-48, 2005.
[Zhang03] Zhang, H., Hollot, C., Towsley, D., and V. Misra. "A
Self-Tuning Structure for Adaptation in TCP/AQM
Networks", ACM SIGMETRICS’03, San Diego (California),
USA, June 2003.
Acknowledgments Acknowledgments
The authors would like to thank the following people whose feedback The authors would like to thank the following people whose feedback
and comments contributed to this document: Keith Moore, Jan and comments contributed to this document: Keith Moore, Jan
Vandenabeele. Vandenabeele.
Larry Dunn (his comments at the Manchester ICCRG and discussions with Larry Dunn (his comments at the Manchester ICCRG and discussions with
him helped with the section on packet-congestibility). Bob Briscoe's him helped with the section on packet-congestibility). Bob Briscoe's
contribution was partly funded by [TRILOGY], a research project contribution was partly funded by [TRILOGY], a research project
skipping to change at page 33, line 25 skipping to change at page 41, line 4
Phone : +32 3 240 8491 Phone : +32 3 240 8491
Email: dimitri.papadimitriou@alcatel-lucent.be Email: dimitri.papadimitriou@alcatel-lucent.be
Michael Scharf Michael Scharf
University of Stuttgart University of Stuttgart
Pfaffenwaldring 47 Pfaffenwaldring 47
D-70569 Stuttgart D-70569 Stuttgart
Germany Germany
Phone: +49 711 685 69006 Phone: +49 711 685 69006
Email: michael.scharf@ikr.uni-stuttgart.de Email: michael.scharf@ikr.uni-stuttgart.de
Bob Briscoe Bob Briscoe
BT & UCL BT & UCL
B54/77, Adastral Park B54/77, Adastral Park
Martlesham Heath Martlesham Heath
Ipswich IP5 3RE Ipswich IP5 3RE
UK UK
Email: bob.briscoe@bt.com Email: bob.briscoe@bt.com
Full Copyright Statement Full Copyright Statement
Copyright (C) The Internet Society (2008). Copyright (c) 2009 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Intellectual Property
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any This document is subject to BCP 78 and the IETF Trust's Legal
assurances of licenses to be made available, or the result of an Provisions Relating to IETF Documents in effect on the date of
attempt made to obtain a general license or permission for the use publication of this document (http://trustee.ietf.org/license-info).
of such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any Please review these documents carefully, as they describe your rights
copyrights, patents or patent applications, or other proprietary and restrictions with respect to this document.
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.
Acknowledgment Acknowledgment
Funding for the RFC Editor function is provided by the IETF Funding for the RFC Editor function is provided by the IETF
Administrative Support Activity (IASA). Administrative Support Activity (IASA).