draft-briscoe-tsvwg-byte-pkt-mark-02.txt   draft-ietf-tsvwg-byte-pkt-congest-00.txt 
Transport Area Working Group B. Briscoe Transport Area Working Group B. Briscoe
Internet-Draft BT & UCL Internet-Draft BT & UCL
Intended status: Informational February 24, 2008 Intended status: Informational August 07, 2008
Expires: August 27, 2008 Expires: February 8, 2009
Byte and Packet Congestion Notification Byte and Packet Congestion Notification
draft-briscoe-tsvwg-byte-pkt-mark-02 draft-ietf-tsvwg-byte-pkt-congest-00
Status of this Memo Status of this Memo
By submitting this Internet-Draft, each author represents that any By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79. aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
skipping to change at page 1, line 34 skipping to change at page 1, line 34
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt. http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
This Internet-Draft will expire on August 27, 2008. This Internet-Draft will expire on February 8, 2009.
Copyright Notice
Copyright (C) The IETF Trust (2008).
Abstract Abstract
This memo concerns dropping or marking packets using active queue This memo concerns dropping or marking packets using active queue
management (AQM) such as random early detection (RED) or pre- management (AQM) such as random early detection (RED) or pre-
congestion notification (PCN). The primary conclusion is that packet congestion notification (PCN). The primary conclusion is that packet
size should be taken into account when transports decode congestion size should be taken into account when transports read congestion
indications, not when network equipment writes them. Reducing drop indications, not when network equipment writes them. Reducing drop
of small packets has some tempting advantages: i) it drops fewer of small packets has some tempting advantages: i) it drops fewer
control packets, which tend to be small and ii) it makes TCP's bit- control packets, which tend to be small and ii) it makes TCP's bit-
rate less dependent on packet size. However, there are ways of rate less dependent on packet size. However, there are ways of
addressing these issues at the transport layer, rather than reverse addressing these issues at the transport layer, rather than reverse
engineering network forwarding to fix specific transport problems. engineering network forwarding to fix specific transport problems.
Network layer algorithms like the byte-mode packet drop variant of Network layer algorithms like the byte-mode packet drop variant of
RED should not be used to drop fewer small packets, because that RED should not be used to drop fewer small packets, because that
creates a perverse incentive for transports to use tiny segments, creates a perverse incentive for transports to use tiny segments,
consequently also opening up a DoS vulnerability. consequently also opening up a DoS vulnerability.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 5 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 5
2. Motivating Arguments . . . . . . . . . . . . . . . . . . . . . 9 2. Motivating Arguments . . . . . . . . . . . . . . . . . . . . . 8
2.1. Scaling Congestion Control with Packet Size . . . . . . . 9 2.1. Scaling Congestion Control with Packet Size . . . . . . . 8
2.2. Avoiding Perverse Incentives to (ab)use Smaller Packets . 10 2.2. Avoiding Perverse Incentives to (ab)use Smaller Packets . 10
2.3. Small != Control . . . . . . . . . . . . . . . . . . . . . 11 2.3. Small != Control . . . . . . . . . . . . . . . . . . . . . 11
3. Working Definition of Congestion Notification . . . . . . . . 12 3. Working Definition of Congestion Notification . . . . . . . . 11
4. Congestion Measurement . . . . . . . . . . . . . . . . . . . . 12 4. Congestion Measurement . . . . . . . . . . . . . . . . . . . . 12
4.1. Congestion Measurement by Queue Length . . . . . . . . . . 12 4.1. Congestion Measurement by Queue Length . . . . . . . . . . 12
4.1.1. Fixed Size Packet Buffers . . . . . . . . . . . . . . 13 4.1.1. Fixed Size Packet Buffers . . . . . . . . . . . . . . 12
4.2. Congestion Measurement without a Queue . . . . . . . . . . 14 4.2. Congestion Measurement without a Queue . . . . . . . . . . 13
5. Idealised Wire Protocol Coding . . . . . . . . . . . . . . . . 14 5. Idealised Wire Protocol Coding . . . . . . . . . . . . . . . . 14
6. The State of the Art . . . . . . . . . . . . . . . . . . . . . 16 6. The State of the Art . . . . . . . . . . . . . . . . . . . . . 15
6.1. Congestion Measurement: Status . . . . . . . . . . . . . . 17 6.1. Congestion Measurement: Status . . . . . . . . . . . . . . 16
6.2. Congestion Coding: Status . . . . . . . . . . . . . . . . 17 6.2. Congestion Coding: Status . . . . . . . . . . . . . . . . 17
6.2.1. Network Bias when Encoding . . . . . . . . . . . . . . 17 6.2.1. Network Bias when Encoding . . . . . . . . . . . . . . 17
6.2.2. Transport Bias when Decoding . . . . . . . . . . . . . 19 6.2.2. Transport Bias when Decoding . . . . . . . . . . . . . 19
6.2.3. Making Transports Robust against Control Packet 6.2.3. Making Transports Robust against Control Packet
Losses . . . . . . . . . . . . . . . . . . . . . . . . 20 Losses . . . . . . . . . . . . . . . . . . . . . . . . 20
6.2.4. Congestion Coding: Summary of Status . . . . . . . . . 21 6.2.4. Congestion Coding: Summary of Status . . . . . . . . . 21
7. Outstanding Issues and Next Steps . . . . . . . . . . . . . . 23 7. Outstanding Issues and Next Steps . . . . . . . . . . . . . . 23
7.1. Bit-congestible World . . . . . . . . . . . . . . . . . . 23 7.1. Bit-congestible World . . . . . . . . . . . . . . . . . . 23
7.2. Bit- & Packet-congestible World . . . . . . . . . . . . . 24 7.2. Bit- & Packet-congestible World . . . . . . . . . . . . . 23
8. Security Considerations . . . . . . . . . . . . . . . . . . . 25 8. Security Considerations . . . . . . . . . . . . . . . . . . . 24
9. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 26 9. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 25
10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 27 10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 27
11. Comments Solicited . . . . . . . . . . . . . . . . . . . . . . 27 11. Comments Solicited . . . . . . . . . . . . . . . . . . . . . . 27
12. References . . . . . . . . . . . . . . . . . . . . . . . . . . 27
12.1. Normative References . . . . . . . . . . . . . . . . . . . 27
12.2. Informative References . . . . . . . . . . . . . . . . . . 27
Editorial Comments . . . . . . . . . . . . . . . . . . . . . . . . Editorial Comments . . . . . . . . . . . . . . . . . . . . . . . .
Appendix A. Example Scenarios . . . . . . . . . . . . . . . . . . 28 Appendix A. Example Scenarios . . . . . . . . . . . . . . . . . . 31
A.1. Notation . . . . . . . . . . . . . . . . . . . . . . . . . 28 A.1. Notation . . . . . . . . . . . . . . . . . . . . . . . . . 31
A.2. Bit-congestible resource, equal bit rates (Ai) . . . . . . 28 A.2. Bit-congestible resource, equal bit rates (Ai) . . . . . . 31
A.3. Bit-congestible resource, equal packet rates (Bi) . . . . 29 A.3. Bit-congestible resource, equal packet rates (Bi) . . . . 32
A.4. Pkt-congestible resource, equal bit rates (Aii) . . . . . 30 A.4. Pkt-congestible resource, equal bit rates (Aii) . . . . . 33
A.5. Pkt-congestible resource, equal packet rates (Bii) . . . . 31 A.5. Pkt-congestible resource, equal packet rates (Bii) . . . . 34
Appendix B. Congestion Notification Definition: Further Appendix B. Congestion Notification Definition: Further
Justification . . . . . . . . . . . . . . . . . . . . 31 Justification . . . . . . . . . . . . . . . . . . . . 34
Appendix C. Byte-mode Drop Complicates Policing Congestion Appendix C. Byte-mode Drop Complicates Policing Congestion
Response . . . . . . . . . . . . . . . . . . . . . . 32 Response . . . . . . . . . . . . . . . . . . . . . . 35
12. References . . . . . . . . . . . . . . . . . . . . . . . . . . 33
12.1. Normative References . . . . . . . . . . . . . . . . . . . 33
12.2. Informative References . . . . . . . . . . . . . . . . . . 33
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 36 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 36
Intellectual Property and Copyright Statements . . . . . . . . . . 37 Intellectual Property and Copyright Statements . . . . . . . . . . 37
Relationship to existing RFCs
To be removed by the RFC Editor on publication (with appropriate
changes to the 'Updates:' header and the RFC Index as appropriate).
This memo intends to update RFC2309, which stated an interim view but
requested that further research was needed on this topic.
Changes from Previous Versions Changes from Previous Versions
To be removed by the RFC Editor on publication. To be removed by the RFC Editor on publication.
Full incremental diffs between each version are available at Full incremental diffs between each version are available at
<http://www.cs.ucl.ac.uk/staff/B.Briscoe/pubs.html#byte-pkt-mark> <http://www.cs.ucl.ac.uk/staff/B.Briscoe/pubs.html#byte-pkt-congest>
or
<http://tools.ietf.org/wg/tsvwg/draft-ietf-tsvwg-byte-pkt-congest/>
(courtesy of the rfcdiff tool): (courtesy of the rfcdiff tool):
From -01 to -02 (this version): From briscoe-byte-pkt-mark-02 to ietf-byte-pkt-congest-00 (this
version):
Added note on relationship to existing RFCs
Posed the question of whether packet-congestion could become
common and deferred it to the IRTF ICCRG. Added ref to the
dual-resource queue (DRQ) proposal.
Changed PCN references from the PCN charter & architecture to
the PCN marking behaviour draft most likely to imminently
become the standards track WG item.
From -01 to -02:
Abstract reorganised to align with clearer separation of issue Abstract reorganised to align with clearer separation of issue
in the memo. in the memo.
Introduction reorganised with motivating arguments removed to Introduction reorganised with motivating arguments removed to
new Section 2. new Section 2.
Clarified avoiding lock-out of large packets is not the main or Clarified avoiding lock-out of large packets is not the main or
only motivation for RED. only motivation for RED.
skipping to change at page 6, line 16 skipping to change at page 5, line 40
discussed. Indeed, one reason AQM was originally introduced was to discussed. Indeed, one reason AQM was originally introduced was to
reduce the lock-out effects that small packets can have on large reduce the lock-out effects that small packets can have on large
packets in drop-tail queues. This memo aims to state the principles packets in drop-tail queues. This memo aims to state the principles
we should be using and to come to conclusions on what these we should be using and to come to conclusions on what these
principles will mean for future protocol design, taking into account principles will mean for future protocol design, taking into account
the deployments we have already. the deployments we have already.
Note that the byte vs. packet dilemma concerns congestion Note that the byte vs. packet dilemma concerns congestion
notification irrespective of whether it is signalled implicitly by notification irrespective of whether it is signalled implicitly by
drop or using explicit congestion notification (ECN [RFC3168] or PCN drop or using explicit congestion notification (ECN [RFC3168] or PCN
[I-D.ietf-pcn-architecture]). Throughout this document, unless clear [I-D.eardley-pcn-marking-behaviour]). Throughout this document,
from the context, the term marking will be used to mean notifying unless clear from the context, the term marking will be used to mean
congestion explicitly, while congestion notification will be used to notifying congestion explicitly, while congestion notification will
mean notifying congestion either implicitly by drop or explicitly by be used to mean notifying congestion either implicitly by drop or
marking. explicitly by marking.
If the load on a resource depends on the rate at which packets If the load on a resource depends on the rate at which packets
arrive, it is called packet-congestible. If the load depends on the arrive, it is called packet-congestible. If the load depends on the
rate at which bits arrive it is called bit-congestible. rate at which bits arrive it is called bit-congestible.
Examples of packet-congestible resources are route look-up engines Examples of packet-congestible resources are route look-up engines
and firewalls, because load depends on how many packet headers they and firewalls, because load depends on how many packet headers they
have to process. Examples of bit-congestible resources are have to process. Examples of bit-congestible resources are
transmission links, and most buffer memory, because the load depends transmission links, and most buffer memory, because the load depends
on how many bits they have to transmit or store. Some machine on how many bits they have to transmit or store. Some machine
skipping to change at page 7, line 46 skipping to change at page 7, line 19
followed this advice. The primary purpose of this memo is to build a followed this advice. The primary purpose of this memo is to build a
definitive consensus against deliberate preferential treatment for definitive consensus against deliberate preferential treatment for
small packets in AQM algorithms and to record this advice within the small packets in AQM algorithms and to record this advice within the
RFC series. RFC series.
Now is a good time to discuss whether fairness between different Now is a good time to discuss whether fairness between different
sized packets would best be implemented in the network layer, or at sized packets would best be implemented in the network layer, or at
the transport, for a number of reasons: the transport, for a number of reasons:
1. The packet vs. byte issue requires speedy resolution because the 1. The packet vs. byte issue requires speedy resolution because the
IETF pre-congestion notification (PCN) working group has been IETF pre-congestion notification (PCN) working group is about to
chartered to produce a standards track specification of its standardise the external behaviour of a PCN congestion
congestion notification (AQM) algorithm [PCNcharter]; notification (AQM) algorithm [I-D.eardley-pcn-marking-behaviour];
2. [RFC2309] says RED may either take account of packet size or not 2. [RFC2309] says RED may either take account of packet size or not
when dropping, but gives no recommendation between the two, when dropping, but gives no recommendation between the two,
referring instead to advice on the performance implications in an referring instead to advice on the performance implications in an
email [pktByteEmail], which recommends byte-mode drop. Further, email [pktByteEmail], which recommends byte-mode drop. Further,
just before RFC2309 was issued, an addendum was added to the just before RFC2309 was issued, an addendum was added to the
archived email that revisited the issue of packet vs. byte-mode archived email that revisited the issue of packet vs. byte-mode
drop in its last para, making the recommendation less clear-cut; drop in its last para, making the recommendation less clear-cut;
3. Without the present memo, the only advice in the RFC series on 3. Without the present memo, the only advice in the RFC series on
skipping to change at page 12, line 18 skipping to change at page 11, line 39
Rather than aim to achieve what many have tried and failed, this memo Rather than aim to achieve what many have tried and failed, this memo
will not try to define congestion. It will give a working definition will not try to define congestion. It will give a working definition
of what congestion notification should be taken to mean for this of what congestion notification should be taken to mean for this
document. Congestion notification is a changing signal that aims to document. Congestion notification is a changing signal that aims to
communicate the ratio E/L, where E is the instantaneous excess load communicate the ratio E/L, where E is the instantaneous excess load
offered to a resource that it cannot (or would not) serve and L is offered to a resource that it cannot (or would not) serve and L is
the instantaneous offered load. the instantaneous offered load.
The phrase `would not serve' is added, because AQM systems (e.g. The phrase `would not serve' is added, because AQM systems (e.g.
RED, PCN [I-D.ietf-pcn-architecture]) use a virtual capacity smaller RED, PCN [I-D.eardley-pcn-marking-behaviour]) use a virtual capacity
than actual capacity, then notify congestion of this virtual capacity smaller than actual capacity, then notify congestion of this virtual
in order to avoid congestion of the actual capacity. capacity in order to avoid congestion of the actual capacity.
Note that the denominator is offered load, not capacity. Therefore Note that the denominator is offered load, not capacity. Therefore
congestion notification is a real number bounded by the range [0,1]. congestion notification is a real number bounded by the range [0,1].
This ties in with the most well-understood form of congestion This ties in with the most well-understood form of congestion
notification: drop rate. It also means that congestion has a natural notification: drop rate. It also means that congestion has a natural
interpretation as a probability; the probability of offered traffic interpretation as a probability; the probability of offered traffic
not being served (or being marked as at risk of not being served). not being served (or being marked as at risk of not being served).
Appendix B describes a further incidental benefit that arises from Appendix B describes a further incidental benefit that arises from
using load as the denominator of congestion notification. using load as the denominator of congestion notification.
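As an illustration of the working definition above, the following minimal sketch computes the ratio E/L for a single resource. It is not part of either draft version. It assumes that excess load E can be modelled as the amount by which offered load L exceeds a (possibly virtual) capacity; the function name, parameter names and units are hypothetical.

```python
def congestion_level(offered_load: float, capacity: float) -> float:
    """Return E/L for one resource.

    offered_load: instantaneous offered load L (e.g. bits per second).
    capacity: the capacity the resource can (or is willing to) serve.
              An AQM may pass a virtual capacity smaller than the real one.

    Assumption for this sketch: E = max(0, L - capacity).
    """
    if offered_load <= 0:
        # No offered load, so there is nothing to notify congestion about.
        return 0.0
    excess = max(0.0, offered_load - capacity)
    # The denominator is offered load, not capacity, so the result
    # always lies in the range [0, 1].
    return excess / offered_load
```

With an offered load of 120 units and a capacity of 100 units, the sketch yields one sixth, which can be read as the probability that offered traffic is not served. Because the denominator is load, the value stays below 1 however large the overload becomes.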
skipping to change at page 14, line 40 skipping to change at page 14, line 10
and theoretically sound way to combine congestion notification for and theoretically sound way to combine congestion notification for
different bit-congestible resources at different layers along an end different bit-congestible resources at different layers along an end
to end path, whether wireless or wired, and whether with or without to end path, whether wireless or wired, and whether with or without
queues. queues.
5. Idealised Wire Protocol Coding 5. Idealised Wire Protocol Coding
We will start by inventing an idealised congestion notification We will start by inventing an idealised congestion notification
protocol before discussing how to make it practical. The idealised protocol before discussing how to make it practical. The idealised
protocol is shown to be correct using examples in Appendix A. protocol is shown to be correct using examples in Appendix A.
Congestion notification involves the congested resource coding a Congestion notification involves the congested resource coding a
congestion notification signal into the packet stream and the congestion notification signal into the packet stream and the
transports decoding it. The idealised protocol uses two different transports decoding it. The idealised protocol uses two different
fields in each datagram to signal congestion: one for byte congestion (imaginary) fields in each datagram to signal congestion: one for
and one for packet congestion. byte congestion and one for packet congestion.
We are not saying two ECN fields will be needed (and we are not We are not saying two ECN fields will be needed (and we are not
saying that somehow a resource should be able to drop a packet in one saying that somehow a resource should be able to drop a packet in one
of two different ways so that the transport can distinguish which of two different ways so that the transport can distinguish which
sort of drop it was!). These two congestion notification channels sort of drop it was!). These two congestion notification channels
are just a conceptual device. They allow us to defer having to are just a conceptual device. They allow us to defer having to
decide whether to distinguish between byte and packet congestion when decide whether to distinguish between byte and packet congestion when
the network resource codes the signal or when the transport decodes the network resource codes the signal or when the transport decodes
it. it.
skipping to change at page 20, line 38 skipping to change at page 20, line 7
The paper originally proposing TFRC with virtual packets (VP-TFRC) The paper originally proposing TFRC with virtual packets (VP-TFRC)
[CCvarPktSize] proposed that there should perhaps be two variants to [CCvarPktSize] proposed that there should perhaps be two variants to
cater for the different variants of RED. However, as the TFRC-SP cater for the different variants of RED. However, as the TFRC-SP
authors point out, there is no way for a transport to know whether authors point out, there is no way for a transport to know whether
some queues on its path have deployed RED with byte-mode packet drop some queues on its path have deployed RED with byte-mode packet drop
(except if an exhaustive survey found that no-one has deployed it!-- (except if an exhaustive survey found that no-one has deployed it!--
see Section 6.2.4). Incidentally, VP-TFRC also proposed that byte- see Section 6.2.4). Incidentally, VP-TFRC also proposed that byte-
mode RED dropping should really square the packet size compensation mode RED dropping should really square the packet size compensation
factor (like that of RED_5, but apparently unaware of it). factor (like that of RED_5, but apparently unaware of it).
Pre-congestion notification [I-D.ietf-pcn-architecture] is a proposal Pre-congestion notification [I-D.eardley-pcn-marking-behaviour] is a
to use a virtual queue for AQM marking for packets within one proposal to use a virtual queue for AQM marking for packets within
Diffserv class in order to give early warning prior to any real one Diffserv class in order to give early warning prior to any real
queuing. The proposed PCN marking algorithms have been designed not queuing. The proposed PCN marking algorithms have been designed not
to take account of packet size when forwarding through queues. to take account of packet size when forwarding through queues.
Instead the general principle has been to take account of the sizes Instead the general principle has been to take account of the sizes
of marked packets when monitoring the fraction of marking at the edge of marked packets when monitoring the fraction of marking at the edge
of the network. of the network.
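The principle of accounting for packet size where the marking is read, rather than where it is written, can be sketched as follows. This is a hypothetical edge monitor, not the PCN specification: the function name and the (size, marked) input representation are assumptions made for illustration only. It compares the fraction of marked packets with the fraction of marked bytes over the same observation window.

```python
from typing import Iterable, Tuple


def marking_fractions(packets: Iterable[Tuple[int, bool]]) -> Tuple[float, float]:
    """Return (fraction of packets marked, fraction of bytes marked).

    packets: (size_in_bytes, is_marked) pairs observed at the network
             edge. This representation is an assumption of the sketch.
    """
    total_pkts = marked_pkts = 0
    total_bytes = marked_bytes = 0
    for size, marked in packets:
        total_pkts += 1
        total_bytes += size
        if marked:
            marked_pkts += 1
            marked_bytes += size
    if total_pkts == 0 or total_bytes == 0:
        # Nothing observed, so report no marking.
        return 0.0, 0.0
    return marked_pkts / total_pkts, marked_bytes / total_bytes
```

For example, if one 1500 byte packet is marked among two 1500 byte and two 60 byte packets, the packet fraction is 0.25 while the byte fraction is roughly 0.48. The queue marked without regard to size; only the edge calculation weights each mark by the size of the packet that carried it.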
6.2.3. Making Transports Robust against Control Packet Losses 6.2.3. Making Transports Robust against Control Packet Losses
Recently, two drafts have proposed changes to TCP that make it more Recently, two drafts have proposed changes to TCP that make it more
robust against losing small control packets [I-D.ietf-tcpm-ecnsyn] robust against losing small control packets [I-D.ietf-tcpm-ecnsyn]
skipping to change at page 23, line 44 skipping to change at page 23, line 11
still be prevalent in the Internet. As explained in Section 6.2.1, still be prevalent in the Internet. As explained in Section 6.2.1,
these also provide a marginal (but legitimate) bias towards small these also provide a marginal (but legitimate) bias towards small
packets. So even though RED byte-mode drop is not prevalent, it is packets. So even though RED byte-mode drop is not prevalent, it is
likely there is still some bias towards small packets in the Internet likely there is still some bias towards small packets in the Internet
due to tail drop and fixed buffer borrowing. due to tail drop and fixed buffer borrowing.
7. Outstanding Issues and Next Steps 7. Outstanding Issues and Next Steps
7.1. Bit-congestible World 7.1. Bit-congestible World
For a connectionless network with only bit-congestible resources we For a connectionless network with nearly all resources being bit-
believe the recommended position is now unarguably clear--that the congestible we believe the recommended position is now unarguably
network should not make allowance for packet sizes and the transport clear--that the network should not make allowance for packet sizes
should. This leaves two outstanding issues: and the transport should. This leaves two outstanding issues:
o How to handle any legacy of AQM with byte-mode drop already o How to handle any legacy of AQM with byte-mode drop already
deployed; deployed;
o The need to start a programme to update transport congestion o The need to start a programme to update transport congestion
control protocol standards to take account of packet size. control protocol standards to take account of packet size.
The sample of returns from our vendor survey (Section 6.2.4) suggests The sample of returns from our vendor survey (Section 6.2.4) suggests
that byte-mode packet drop seems not to be implemented at all let that byte-mode packet drop seems not to be implemented at all let
alone deployed, or if it is, it is likely to be very sparse. alone deployed, or if it is, it is likely to be very sparse.
Therefore, we do not really need a migration strategy from all but Therefore, we do not really need a migration strategy from all but
nothing to nothing. nothing to nothing.
A programme of standards updates to take account of packet size in A programme of standards updates to take account of packet size in
skipping to change at page 24, line 49 skipping to change at page 24, line 17
distinguishing wireless transmission losses from congestive losses. distinguishing wireless transmission losses from congestive losses.
We should also note that, strictly, packet-congestible resources are We should also note that, strictly, packet-congestible resources are
actually cycle-congestible because load also depends on the actually cycle-congestible because load also depends on the
complexity of each look-up and whether the pattern of arrivals is complexity of each look-up and whether the pattern of arrivals is
amenable to caching or not. Further, this reminds us that any amenable to caching or not. Further, this reminds us that any
solution must not require a forwarding engine to use excessive solution must not require a forwarding engine to use excessive
processor cycles in order to decide how to say it has no spare processor cycles in order to decide how to say it has no spare
processor cycles. processor cycles.
Recently, the dual resource queue (DRQ) proposal [DRQ] has been made
on the premise that, as network processors become more cost
effective, per packet operations will become more complex
(irrespective of whether more function in the network layer is
desirable). Consequently the premise is that CPU congestion will
become more common. DRQ is a proposed modification to the RED
algorithm that folds both bit congestion and packet congestion into
one signal (either loss or ECN).
The problem of signalling packet processing congestion is not The problem of signalling packet processing congestion is not
pressing, as most if not all Internet resources are designed to be pressing, as most Internet resources are designed to be bit-
bit-congestible before packet processing starts to congest. However, congestible before packet processing starts to congest. However, the
given the IRTF ICCRG has set itself the task of reaching consensus on IRTF Internet congestion control research group (ICCRG) has set
generic forwarding mechanisms that are necessary and sufficient to itself the task of reaching consensus on generic forwarding
support the Internet's future congestion control requirements mechanisms that are necessary and sufficient to support the
[I-D.irtf-iccrg-welzl-congestion-control-open-research], we must not Internet's future congestion control requirements (the first
give this problem no thought at all, just because it is hard and challenge in
currently hypothetical. [I-D.irtf-iccrg-welzl-congestion-control-open-research]). Therefore,
rather than not giving this problem any thought at all, just because
it is hard and currently hypothetical, we defer the question of
whether packet congestion might become common and what to do if it
does to the IRTF (the 'Small Packets' challenge in
[I-D.irtf-iccrg-welzl-congestion-control-open-research]).
8. Security Considerations 8. Security Considerations
This draft recommends that queues do not bias drop probability This draft recommends that queues do not bias drop probability
towards small packets as this creates a perverse incentive for towards small packets as this creates a perverse incentive for
transports to break down their flows into tiny segments. One of the transports to break down their flows into tiny segments. One of the
benefits of implementing AQM was meant to be to remove this perverse benefits of implementing AQM was meant to be to remove this perverse
incentive that drop-tail queues gave to small packets. Of course, if incentive that drop-tail queues gave to small packets. Of course, if
transports really want to make the greatest gains, they don't have to transports really want to make the greatest gains, they don't have to
respond to congestion anyway. But we don't want applications that respond to congestion anyway. But we don't want applications that
skipping to change at page 27, line 38 skipping to change at page 27, line 23
further helped survey the current status of RED implementation and further helped survey the current status of RED implementation and
deployment and, finally, thanks to the anonymous individuals who deployment and, finally, thanks to the anonymous individuals who
responded. responded.
11. Comments Solicited 11. Comments Solicited
Comments and questions are encouraged and very welcome. They can be Comments and questions are encouraged and very welcome. They can be
addressed to the IETF Transport Area working group mailing list addressed to the IETF Transport Area working group mailing list
<tsvwg@ietf.org>, and/or to the authors. <tsvwg@ietf.org>, and/or to the authors.
Editorial Comments 12. References
12.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC2309] Braden, B., Clark, D., Crowcroft, J., Davie, B., Deering,
S., Estrin, D., Floyd, S., Jacobson, V., Minshall, G.,
Partridge, C., Peterson, L., Ramakrishnan, K., Shenker,
S., Wroclawski, J., and L. Zhang, "Recommendations on
Queue Management and Congestion Avoidance in the
Internet", RFC 2309, April 1998.
[RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
of Explicit Congestion Notification (ECN) to IP",
RFC 3168, September 2001.
[RFC3426] Floyd, S., "General Architectural and Policy
Considerations", RFC 3426, November 2002.
[RFC5033] Floyd, S. and M. Allman, "Specifying New Congestion
Control Algorithms", BCP 133, RFC 5033, August 2007.
12.2. Informative References
[CCvarPktSize]
Widmer, J., Boutremans, C., and J-Y. Le Boudec,
"Congestion Control for Flows with Variable Packet Size",
ACM CCR 34(2) 137--151, 2004,
<http://doi.acm.org/10.1145/997150.997162>.
[DRQ] Shin, M., Chong, S., and I. Rhee, "Dual-Resource TCP/AQM
for Processing-Constrained Networks", IEEE/ACM
Transactions on Networking Vol 16, issue 2, April 2008,
<http://dx.doi.org/10.1109/TNET.2007.900415>.
[DupTCP] Wischik, D., "Short messages", Royal Society workshop on
networks: modelling and control , September 2007, <http://
www.cs.ucl.ac.uk/staff/ucacdjw/Research/shortmsg.html>.
[ECNFixedWireless]
Siris, V., "Resource Control for Elastic Traffic in CDMA
Networks", Proc. ACM MOBICOM'02 , September 2002, <http://
www.ics.forth.gr/netlab/publications/
resource_control_elastic_cdma.html>.
[Evol_cc] Gibbens, R. and F. Kelly, "Resource pricing and the
evolution of congestion control", Automatica 35(12)1969--
1985, December 1999,
<http://www.statslab.cam.ac.uk/~frank/evol.html>.
[I-D.eardley-pcn-marking-behaviour]
Eardley, P., "Marking behaviour of PCN-nodes",
draft-eardley-pcn-marking-behaviour-01 (work in progress),
June 2008.
[I-D.falk-xcp-spec]
Falk, A., "Specification for the Explicit Control Protocol
(XCP)", draft-falk-xcp-spec-03 (work in progress),
July 2007.
[I-D.floyd-tcpm-ackcc]
Floyd, S. and I. Property, "Adding Acknowledgement
Congestion Control to TCP", draft-floyd-tcpm-ackcc-02
(work in progress), November 2007.
[I-D.ietf-tcpm-ecnsyn]
Floyd, S., "Adding Explicit Congestion Notification (ECN)
Capability to TCP's SYN/ACK Packets",
draft-ietf-tcpm-ecnsyn-05 (work in progress),
February 2008.
[I-D.ietf-tcpm-rfc2581bis]
Allman, M., "TCP Congestion Control",
draft-ietf-tcpm-rfc2581bis-03 (work in progress),
September 2007.
[I-D.irtf-iccrg-welzl-congestion-control-open-research]
Papadimitriou, D., "Open Research Issues in Internet
Congestion Control",
draft-irtf-iccrg-welzl-congestion-control-open-research-00
(work in progress), July 2007.
[IOSArch] Bollapragada, V., White, R., and C. Murphy, "Inside Cisco
IOS Software Architecture", Cisco Press: CCIE Professional
Development ISBN13: 978-1-57870-181-0, July 2000.
[MulTCP] Crowcroft, J. and Ph. Oechslin, "Differentiated End to End
Internet Services using a Weighted Proportional Fair
Sharing TCP", CCR 28(3) 53--69, July 1998, <http://
www.cs.ucl.ac.uk/staff/J.Crowcroft/hipparch/pricing.html>.
[PktSizeEquCC]
Vasallo, P., "Variable Packet Size Equation-Based
Congestion Control", ICSI Technical Report tr-00-008,
2000, <http://http.icsi.berkeley.edu/ftp/global/pub/
techreports/2000/tr-00-008.pdf>.
[RED93] Floyd, S. and V. Jacobson, "Random Early Detection (RED)
gateways for Congestion Avoidance", IEEE/ACM Transactions
on Networking 1(4) 397--413, August 1993,
<http://www.icir.org/floyd/papers/red/red.html>.
[REDbias] Eddy, W. and M. Allman, "A Comparison of RED's Byte and
Packet Modes", Computer Networks 42(3) 261--280,
June 2003,
<http://www.ir.bbn.com/documents/articles/redbias.ps>.
[REDbyte] De Cnodder, S., Elloumi, O., and K. Pauwels, "RED behavior
with different packet sizes", Proc. 5th IEEE Symposium on
Computers and Communications (ISCC) 793--799, July 2000,
<http://www.icir.org/floyd/red/Elloumi99.pdf>.
[RFC2474] Nichols, K., Blake, S., Baker, F., and D. Black,
"Definition of the Differentiated Services Field (DS
Field) in the IPv4 and IPv6 Headers", RFC 2474,
December 1998.
[RFC2581] Allman, M., Paxson, V., and W. Stevens, "TCP Congestion
Control", RFC 2581, April 1999.
[RFC3448] Handley, M., Floyd, S., Padhye, J., and J. Widmer, "TCP
Friendly Rate Control (TFRC): Protocol Specification",
RFC 3448, January 2003.
[RFC3714] Floyd, S. and J. Kempf, "IAB Concerns Regarding Congestion
Control for Voice Traffic in the Internet", RFC 3714,
March 2004.
[RFC4782] Floyd, S., Allman, M., Jain, A., and P. Sarolahti, "Quick-
Start for TCP and IP", RFC 4782, January 2007.
[RFC4828] Floyd, S. and E. Kohler, "TCP Friendly Rate Control
(TFRC): The Small-Packet (SP) Variant", RFC 4828,
April 2007.
[Rate_fair_Dis]
Briscoe, B., "Flow Rate Fairness: Dismantling a Religion",
ACM CCR 37(2)63--74, April 2007,
<http://portal.acm.org/citation.cfm?id=1232926>.
[Re-TCP] Briscoe, B., Jacquet, A., Moncaster, T., and A. Smith,
"Re-ECN: Adding Accountability for Causing Congestion to
TCP/IP", draft-briscoe-tsvwg-re-ecn-tcp-05 (work in
progress), January 2008.
[WindowPropFair]
Siris, V., "Service Differentiation and Performance of
Weighted Window-Based Congestion Control and Packet
Marking Algorithms in ECN Networks", Computer
Communications 26(4) 314--326, 2002, <http://
www.ics.forth.gr/netgroup/publications/
weighted_window_control.html>.
[gentle_RED]
Floyd, S., "Recommendation on using the "gentle_" variant
of RED", Web page , March 2000,
<http://www.icir.org/floyd/red/gentle.html>.
[pBox] Floyd, S. and K. Fall, "Promoting the Use of End-to-End
Congestion Control in the Internet", IEEE/ACM Transactions
on Networking 7(4) 458--472, August 1999,
<http://www.aciri.org/floyd/end2end-paper.html>.
[pktByteEmail]
Floyd, S., "RED: Discussions of Byte and Packet Modes",
email , March 1997,
<http://www-nrg.ee.lbl.gov/floyd/REDaveraging.txt>.
Editorial Comments
[Note_Variation] The algorithm of the byte-mode drop variant of RED
switches off any bias towards small packets
whenever the smoothed queue length dictates that
the drop probability of large packets should be
100%. In the example in the Introduction, as the
large packet drop probability varies around 25% the
small packet drop probability will vary around 1%,
but with occasional jumps to 100% whenever the
instantaneous queue (after drop) manages to sustain
a length above the 100% drop point for longer than
skipping to change at page 32, line 49 skipping to change at page 36, line 6
packets or across different size flows [Rate_fair_Dis]. Therefore
policing would work naturally with just simple packet-mode drop in
RED.
In summary, making drop probability depend on the size of the packets
that bits happen to be divided into simply encourages the bits to be
divided into smaller packets. Byte-mode drop would therefore
irreversibly complicate any attempt to fix the Internet's incentive
structures.
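The incentive described above can be illustrated with a minimal sketch.
It assumes that byte-mode drop scales the large-packet drop probability
by the ratio of packet size to maximum packet size, and that the bias is
switched off once the large-packet drop probability reaches 100%. The
packet sizes (1500 and 60 bytes) and the function name are assumptions
chosen for illustration so that 25% scales to 1%; they are not taken
from the text shown here.

```python
def byte_mode_drop_prob(p_large: float, pkt_size: int, max_pkt_size: int) -> float:
    """Drop probability for one packet under a byte-mode drop rule.

    p_large      -- drop probability applied to a maximum-size packet
    pkt_size     -- size of this packet in bytes (assumed value)
    max_pkt_size -- maximum packet size in bytes (assumed value)
    """
    if not 0.0 <= p_large <= 1.0:
        raise ValueError("p_large must lie in [0, 1]")
    if not 0 < pkt_size <= max_pkt_size:
        raise ValueError("pkt_size must lie in (0, max_pkt_size]")
    # At 100% the bias towards small packets is switched off.
    if p_large >= 1.0:
        return 1.0
    # Otherwise the probability is scaled down in proportion to size.
    return p_large * pkt_size / max_pkt_size


def packet_mode_drop_prob(p_large: float) -> float:
    """Packet-mode drop: every packet sees the same probability."""
    return p_large


if __name__ == "__main__":
    p = 0.25                      # large-packet drop probability
    large, small = 1500, 60       # assumed packet sizes in bytes
    for name, size in (("large", large), ("small", small)):
        byte_p = byte_mode_drop_prob(p, size, large)
        pkt_p = packet_mode_drop_prob(p)
        print(f"{name:5s} byte-mode={byte_p:.4f} packet-mode={pkt_p:.4f}")
```

Under these assumptions, a sender that splits the same bits into small
packets sees a much lower drop probability per packet in byte-mode,
whereas packet-mode drop gives no such advantage.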
12. References
12.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC2309] Braden, B., Clark, D., Crowcroft, J., Davie, B., Deering,
S., Estrin, D., Floyd, S., Jacobson, V., Minshall, G.,
Partridge, C., Peterson, L., Ramakrishnan, K., Shenker,
S., Wroclawski, J., and L. Zhang, "Recommendations on
Queue Management and Congestion Avoidance in the
Internet", RFC 2309, April 1998.
[RFC2474] Nichols, K., Blake, S., Baker, F., and D. Black,
"Definition of the Differentiated Services Field (DS
Field) in the IPv4 and IPv6 Headers", RFC 2474,
December 1998.
[RFC2581] Allman, M., Paxson, V., and W. Stevens, "TCP Congestion
Control", RFC 2581, April 1999.
[RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
of Explicit Congestion Notification (ECN) to IP",
RFC 3168, September 2001.
[RFC3426] Floyd, S., "General Architectural and Policy
Considerations", RFC 3426, November 2002.
[RFC3448] Handley, M., Floyd, S., Padhye, J., and J. Widmer, "TCP
Friendly Rate Control (TFRC): Protocol Specification",
RFC 3448, January 2003.
[RFC4828] Floyd, S. and E. Kohler, "TCP Friendly Rate Control
(TFRC): The Small-Packet (SP) Variant", RFC 4828,
April 2007.
[RFC5033] Floyd, S. and M. Allman, "Specifying New Congestion
Control Algorithms", BCP 133, RFC 5033, August 2007.
12.2. Informative References
[CCvarPktSize]
Widmer, J., Boutremans, C., and J-Y. Le Boudec,
"Congestion Control for Flows with Variable Packet Size",
ACM CCR 34(2) 137--151, 2004,
<http://doi.acm.org/10.1145/997150.997162>.
[DupTCP] Wischik, D., "Short messages", Royal Society workshop on
networks: modelling and control , September 2007, <http://
www.cs.ucl.ac.uk/staff/ucacdjw/Research/shortmsg.html>.
[ECNFixedWireless]
Siris, V., "Resource Control for Elastic Traffic in CDMA
Networks", Proc. ACM MOBICOM'02 , September 2002, <http://
www.ics.forth.gr/netlab/publications/
resource_control_elastic_cdma.html>.
[Evol_cc] Gibbens, R. and F. Kelly, "Resource pricing and the
evolution of congestion control", Automatica 35(12)1969--
1985, December 1999,
<http://www.statslab.cam.ac.uk/~frank/evol.html>.
[I-D.falk-xcp-spec]
Falk, A., "Specification for the Explicit Control Protocol
(XCP)", draft-falk-xcp-spec-03 (work in progress),
July 2007.
[I-D.floyd-tcpm-ackcc]
Floyd, S. and I. Property, "Adding Acknowledgement
Congestion Control to TCP", draft-floyd-tcpm-ackcc-02
(work in progress), November 2007.
[I-D.ietf-pcn-architecture]
Eardley, P., "Pre-Congestion Notification Architecture",
draft-ietf-pcn-architecture-03 (work in progress),
February 2008.
[I-D.ietf-tcpm-ecnsyn]
Floyd, S., "Adding Explicit Congestion Notification (ECN)
Capability to TCP's SYN/ACK Packets",
draft-ietf-tcpm-ecnsyn-05 (work in progress),
February 2008.
[I-D.ietf-tcpm-rfc2581bis]
Allman, M., "TCP Congestion Control",
draft-ietf-tcpm-rfc2581bis-03 (work in progress),
September 2007.
[I-D.irtf-iccrg-welzl-congestion-control-open-research]
Papadimitriou, D., "Open Research Issues in Internet
Congestion Control",
draft-irtf-iccrg-welzl-congestion-control-open-research-00
(work in progress), July 2007.
[IOSArch] Bollapragada, V., White, R., and C. Murphy, "Inside Cisco
IOS Software Architecture", Cisco Press: CCIE Professional
Development ISBN13: 978-1-57870-181-0, July 2000.
[MulTCP] Crowcroft, J. and Ph. Oechslin, "Differentiated End to End
Internet Services using a Weighted Proportional Fair
Sharing TCP", CCR 28(3) 53--69, July 1998, <http://
www.cs.ucl.ac.uk/staff/J.Crowcroft/hipparch/pricing.html>.
[PCNcharter]
IETF, "Congestion and Pre-Congestion Notification (pcn)",
IETF w-g charter , Feb 2007,
<http://www.ietf.org/html.charters/pcn-charter.html>.
[PktSizeEquCC]
Vasallo, P., "Variable Packet Size Equation-Based
Congestion Control", ICSI Technical Report tr-00-008,
2000, <http://http.icsi.berkeley.edu/ftp/global/pub/
techreports/2000/tr-00-008.pdf>.
[RED93] Floyd, S. and V. Jacobson, "Random Early Detection (RED)
gateways for Congestion Avoidance", IEEE/ACM Transactions
on Networking 1(4) 397--413, August 1993,
<http://www.icir.org/floyd/papers/red/red.html>.
[REDbias] Eddy, W. and M. Allman, "A Comparison of RED's Byte and
Packet Modes", Computer Networks 42(3) 261--280,
June 2003,
<http://www.ir.bbn.com/documents/articles/redbias.ps>.
[REDbyte] De Cnodder, S., Elloumi, O., and K. Pauwels, "RED behavior
with different packet sizes", Proc. 5th IEEE Symposium on
Computers and Communications (ISCC) 793--799, July 2000,
<http://www.icir.org/floyd/red/Elloumi99.pdf>.
[RFC3714] Floyd, S. and J. Kempf, "IAB Concerns Regarding Congestion
Control for Voice Traffic in the Internet", RFC 3714,
March 2004.
[RFC4782] Floyd, S., Allman, M., Jain, A., and P. Sarolahti, "Quick-
Start for TCP and IP", RFC 4782, January 2007.
[Rate_fair_Dis]
Briscoe, B., "Flow Rate Fairness: Dismantling a Religion",
ACM CCR 37(2)63--74, April 2007,
<http://portal.acm.org/citation.cfm?id=1232926>.
[Re-TCP] Briscoe, B., Jacquet, A., Moncaster, T., and A. Smith,
"Re-ECN: Adding Accountability for Causing Congestion to
TCP/IP", draft-briscoe-tsvwg-re-ecn-tcp-05 (work in
progress), January 2008.
[WindowPropFair]
Siris, V., "Service Differentiation and Performance of
Weighted Window-Based Congestion Control and Packet
Marking Algorithms in ECN Networks", Computer
Communications 26(4) 314--326, 2002, <http://
www.ics.forth.gr/netgroup/publications/
weighted_window_control.html>.
[gentle_RED]
Floyd, S., "Recommendation on using the "gentle_" variant
of RED", Web page , March 2000,
<http://www.icir.org/floyd/red/gentle.html>.
[pBox] Floyd, S. and K. Fall, "Promoting the Use of End-to-End
Congestion Control in the Internet", IEEE/ACM Transactions
on Networking 7(4) 458--472, August 1999,
<http://www.aciri.org/floyd/end2end-paper.html>.
[pktByteEmail]
Floyd, S., "RED: Discussions of Byte and Packet Modes",
email , March 1997,
<http://www-nrg.ee.lbl.gov/floyd/REDaveraging.txt>.
Author's Address
Bob Briscoe
BT & UCL
B54/77, Adastral Park
Martlesham Heath
Ipswich IP5 3RE
UK
Phone: +44 1473 645196
skipping to change at page 37, line 45 skipping to change at page 37, line 45
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.
Acknowledgments Acknowledgment
Funding for the RFC Editor function is provided by the IETF This document was produced using xml2rfc v1.33 (of
Administrative Support Activity (IASA). This document was produced http://xml.resource.org/) from a source in RFC-2629 XML format.
using xml2rfc v1.32 (of http://xml.resource.org/) from a source in
RFC-2629 XML format.
 End of changes. 31 change blocks. 
232 lines changed or deleted 266 lines changed or added
