< draft-mathis-conex-abstract-mech-00b.txt   draft-mathis-conex-abstract-mech-00c.txt >
Congestion Exposure (ConEx) M. Mathis Congestion Exposure (ConEx) M. Mathis
Working Group Google Working Group Google
Internet-Draft B. Briscoe Internet-Draft B. Briscoe
Intended status: Informational BT Intended status: Informational BT
Expires: April 17, 2011 October 14, 2010 Expires: April 18, 2011 October 15, 2010
Congestion Exposure (ConEx) Concepts and Abstract Mechanism Congestion Exposure (ConEx) Concepts and Abstract Mechanism
draft-mathis-conex-abstract-mech-00b draft-mathis-conex-abstract-mech-00c
Abstract Abstract
This document describes an abstract mechanism by which senders inform This document describes an abstract mechanism by which senders inform
the network about the congestion encountered by packets earlier in the network about the congestion encountered by packets earlier in
the same flow. Today, the network may signal congestion to the the same flow. Today, the network may signal congestion to the
receiver by ECN markings or by dropping packets, and the receiver may receiver by ECN markings or by dropping packets, and the receiver may
pass this information back to the sender in transport-layer feedback. pass this information back to the sender in transport-layer feedback.
The mechanism to be developed by the ConEx WG will enable the sender The mechanism to be developed by the ConEx WG will enable the sender
to also relay this congestion information back into the network in- to also relay this congestion information back into the network in-
skipping to change at page 1, line 40 skipping to change at page 1, line 40
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on April 17, 2011. This Internet-Draft will expire on April 18, 2011.
Copyright Notice Copyright Notice
Copyright (c) 2010 IETF Trust and the persons identified as the Copyright (c) 2010 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 17 skipping to change at page 2, line 17
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . 4 1.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . 4
2. Requirements for the Congestion Exposure Signal . . . . . . . 5 2. Requirements for the Congestion Exposure Signal . . . . . . . 5
3. Representing Congestion Exposure . . . . . . . . . . . . . . . 7 3. Representing Congestion Exposure . . . . . . . . . . . . . . . 7
3.1. One Simple Encoding . . . . . . . . . . . . . . . . . . . 7 3.1. Strawman Encoding . . . . . . . . . . . . . . . . . . . . 7
3.2. ECN Based Encoding . . . . . . . . . . . . . . . . . . . . 8 3.2. ECN Based Encoding . . . . . . . . . . . . . . . . . . . . 8
3.2.1. ECN Changes . . . . . . . . . . . . . . . . . . . . . 8 3.2.1. ECN Changes . . . . . . . . . . . . . . . . . . . . . 9
3.3. Abstract Encoding . . . . . . . . . . . . . . . . . . . . 9 3.3. Abstract Encoding . . . . . . . . . . . . . . . . . . . . 9
3.3.1. Separate Bits . . . . . . . . . . . . . . . . . . . . 9 3.3.1. Independent Bits . . . . . . . . . . . . . . . . . . . 9
3.3.2. Enumerated Encoding . . . . . . . . . . . . . . . . . 9 3.3.2. Codepoint Encoding . . . . . . . . . . . . . . . . . . 10
4. Congestion Exposure Components . . . . . . . . . . . . . . . . 9 4. Congestion Exposure Components . . . . . . . . . . . . . . . . 10
4.1. Modified Senders . . . . . . . . . . . . . . . . . . . . . 9 4.1. Modified Senders . . . . . . . . . . . . . . . . . . . . . 10
4.2. Policy Devices . . . . . . . . . . . . . . . . . . . . . . 9 4.2. Receivers (Optionally Modified) . . . . . . . . . . . . . 11
4.2.1. Audit . . . . . . . . . . . . . . . . . . . . . . . . 9 4.3. Audit . . . . . . . . . . . . . . . . . . . . . . . . . . 11
4.2.2. Policers and Shapers . . . . . . . . . . . . . . . . . 10 4.4. Policy Devices . . . . . . . . . . . . . . . . . . . . . . 12
5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 10 4.4.1. Congestion Policers . . . . . . . . . . . . . . . . . 12
6. Security Considerations . . . . . . . . . . . . . . . . . . . 10 4.4.2. Other Policy Devices . . . . . . . . . . . . . . . . . 12
7. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 10 5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 13
8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 10 6. Security Considerations . . . . . . . . . . . . . . . . . . . 13
9. Comments Solicited . . . . . . . . . . . . . . . . . . . . . . 10 7. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 13
10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 10 8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 13
10.1. Normative References . . . . . . . . . . . . . . . . . . . 10 9. Comments Solicited . . . . . . . . . . . . . . . . . . . . . . 13
10.2. Informative References . . . . . . . . . . . . . . . . . . 11 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 13
10.1. Normative References . . . . . . . . . . . . . . . . . . . 13
10.2. Informative References . . . . . . . . . . . . . . . . . . 13
1. Introduction 1. Introduction
One of the required functions of a transport protocol is controlling One of the required functions of a transport protocol is controlling
congestion in the network. There are three techniques in use today congestion in the network. There are three techniques in use today
for the network to signal congestion to a transport: for the network to signal congestion to a transport:
o The most common congestion signal is packet loss. When congested, o The most common congestion signal is packet loss. When congested,
the network simply discards some packets either as part of an the network simply discards some packets either as part of an
explicit control function [RFC2309] or as the consequence of a explicit control function [RFC2309] or as the consequence of a
skipping to change at page 4, line 25 skipping to change at page 4, line 25
| Sender |>-(new)-IP layer Congestion Exposure Signal--->| Receiver| | Sender |>-(new)-IP layer Congestion Exposure Signal--->| Receiver|
| | (Carried in Data Packet Headers) | | | | (Carried in Data Packet Headers) | |
| | +-----------+ | | | | +-----------+ | |
| |>=Data=Path=>|(Congested)|>=====Data=Path=====>| | | |>=Data=Path=>|(Congested)|>=====Data=Path=====>| |
| | | Network |>-Congestion-Signal->| | | | | Network |>-Congestion-Signal->| |
| | | Device | | | | | | Device | | |
+---------+ +-----------+ +---------+ +---------+ +-----------+ +---------+
Not shown are policy devices along the data path that observe the Not shown are policy devices along the data path that observe the
Congestion Exposure Signal, and use the information to monitor or Congestion Exposure Signal, and use the information to monitor or
manage traffic. These are discussed in Section 4.2. manage traffic. These are discussed in Section 4.4.
Figure 1 Figure 1
1.1. Terminology 1.1. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119]. document are to be interpreted as described in RFC 2119 [RFC2119].
ConEx signals in IP packet headers from the sender to the network ConEx signals in IP packet headers from the sender to the network
{ToDo: These are placeholders for whatever words we decide to use}: {ToDo: These are placeholders for whatever words we decide to use}:
Re-Echo Loss (aka Black-Loss) The transport has experienced a loss. Not-ConEx (aka White) The transport is not ConEx-capable
Re-Echo ECN (aka Black-ECN) The transport has experienced an ECN ConEx (aka Grey) The transport is ConEx-capable
mark
Pre-Echo (aka Green) The transport is building up credit to allow Re-Echo-Loss (aka Purple) The transport has experienced a loss.
for any future delay in expected ConEx signals
Neutral (aka Grey) The transport is ConEx-capable Re-Echo-ECN (aka Black) The transport has experienced an ECN mark
Not-ConEx (aka White) The transport is not ConEx-capable
Credit (aka Green) The transport is building up credit to allow for
any future delay in expected ConEx signals
ConEx-Marked Any of Re-Echo-Loss, Re-Echo-ECN or Credit.
ConEx-Unmarked ConEx, but not ConEx-Marked.
2. Requirements for the Congestion Exposure Signal 2. Requirements for the Congestion Exposure Signal
Ideally, all the following requirements would be met by a Congestion
Exposure Signal. However it is already known that some compromises
will be necessary, therefore all the requirements are expressed with
the keyword 'SHOULD' rather than 'MUST'. The only mandatory
requirement is that a concrete protocol description MUST give sound
reasoning if it chooses not to meet any of these requirements:
a. The Congestion Exposure Signal SHOULD be visible to internetwork a. The Congestion Exposure Signal SHOULD be visible to internetwork
layer devices along the entire path from the transport sender to layer devices along the entire path from the transport sender to
the transport receiver. Equivalently, it SHOULD be present in the transport receiver. Equivalently, it SHOULD be present in
the IPv4 or IPv6 header, and in the outermost IP header if using the IPv4 or IPv6 header, and in the outermost IP header if using
IP in IP tunnelling. The Congestion Exposure Signal SHOULD be IP in IP tunnelling. The Congestion Exposure Signal SHOULD be
immutable once set by the transport sender. A corollary of these immutable once set by the transport sender. A corollary of these
requirements is that existing (legacy) networking gear SHOULD requirements is that existing (legacy) networking gear SHOULD
pass the Congestion Exposure Signal silently without pass the Congestion Exposure Signal silently without
modification. modification.
skipping to change at page 5, line 47 skipping to change at page 6, line 8
actually experiencing. actually experiencing.
d. The Congestion Exposure Signal SHOULD be timely. There will be a d. The Congestion Exposure Signal SHOULD be timely. There will be a
delay between the time when an auditing device sees an actual delay between the time when an auditing device sees an actual
congestion signal and when it sees the subsequent Congestion congestion signal and when it sees the subsequent Congestion
Exposure Signal from the sender. The minimum delay will be one Exposure Signal from the sender. The minimum delay will be one
round trip, but it may be much longer depending on the round trip, but it may be much longer depending on the
transport's choice of feedback delay (consider RTCP [RFC3550] for transport's choice of feedback delay (consider RTCP [RFC3550] for
example). It is not practical to expect auditing devices in the example). It is not practical to expect auditing devices in the
network to make allowance for such feedback delays. Instead, the network to make allowance for such feedback delays. Instead, the
sender MUST be able to send Congestion Exposure signals in sender SHOULD be able to send Congestion Exposure signals in
advance, as 'credit' for any audit device to hold as a balance advance, as 'credit' for any audit device to hold as a balance
against the risk of congestion during the feedback delay. This against the risk of congestion during the feedback delay. This
design choice simplifies auditing devices and correctly makes the design choice simplifies auditing devices and correctly makes the
transport responsible for both minimising feedback delay and transport responsible for both minimising feedback delay and
minimising sharp increases in packets in flight that would risk minimising sharp increases in packets in flight that would risk
causing excessive congestion to others. This issue is discussed causing excessive congestion to others. This issue is discussed
in more detail in Section 4.2.1. in more detail in Section 4.3.
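The credit idea in requirement (d) can be pictured as a running balance held by an audit device. The following is only a minimal sketch under stated assumptions: the class name CreditAuditor, the byte-based accounting, and the rule that a negative balance counts as a violation are all illustrative choices, not part of any ConEx specification.

```python
from dataclasses import dataclass


@dataclass
class CreditAuditor:
    """Hypothetical per-flow audit balance, counted in bytes (an assumption)."""
    balance: int = 0      # ConEx-marked bytes minus congestion-marked bytes
    violations: int = 0   # packets observed while the balance was negative

    def observe(self, size: int, conex_marked: bool, congestion_marked: bool) -> bool:
        """Update the balance for one packet; return True if the flow is in credit."""
        if conex_marked:
            self.balance += size      # sender exposed congestion (or sent credit)
        if congestion_marked:
            self.balance -= size      # network signalled actual congestion
        in_credit = self.balance >= 0
        if not in_credit:
            self.violations += 1
        return in_credit


# A sender that sends credit in advance stays in balance when congestion arrives.
auditor = CreditAuditor()
auditor.observe(1500, conex_marked=True, congestion_marked=False)
ok_with_credit = auditor.observe(1500, conex_marked=False, congestion_marked=True)

# A sender that sent no credit goes negative during the feedback delay.
no_credit = CreditAuditor()
ok_without_credit = no_credit.observe(1500, conex_marked=False, congestion_marked=True)
```

In this sketch the auditor needs no knowledge of the round trip time; the burden of covering the feedback delay falls on the sender, which matches the design choice described in (d).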
It is important to note that the auditing requirement implies a It is important to note that the auditing requirement implies a
number of additional constraints: The basic auditing technique is to number of additional constraints: The basic auditing technique is to
count both actual congestion signals and Congestion Exposure Signals count both actual congestion signals and Congestion Exposure Signals
someplace along the data path: someplace along the data path:
o For congestion signaled by ECN, auditing is most accurate when o For congestion signaled by ECN, auditing is most accurate when
located near the transport receiver. Within any flow or aggregate located near the transport receiver. Within any flow or aggregate
of flows, the total volume of ECN marked data seen near the of flows, the total volume of ECN marked data seen near the
receiver should always be equal to or less than the volume of data receiver should always be equal to or less than the volume of data
skipping to change at page 6, line 28 skipping to change at page 6, line 37
o For congestion signaled by loss, totally accurate auditing is not o For congestion signaled by loss, totally accurate auditing is not
believed to be possible in the general case, because it involves a believed to be possible in the general case, because it involves a
network node detecting the absence of some packets, when it cannot network node detecting the absence of some packets, when it cannot
necessarily see the transport protocol sequence numbers and when necessarily see the transport protocol sequence numbers and when
the missing packets might simply be taking a different route. But the missing packets might simply be taking a different route. But
there are common cases where sufficient audit accuracy should be there are common cases where sufficient audit accuracy should be
possible: possible:
* For non-IPsec traffic conforming to standard TCP sequence * For non-IPsec traffic conforming to standard TCP sequence
numbering on a single path, the auditor could detect losses by numbering on a single path, an auditor could detect losses by
observing both the original transmission and the retransmission observing both the original transmission and the retransmission
after the loss. Such auditing would be most accurate near the after the loss. Such auditing would be most accurate near the
sender. sender.
* For networks designed so that losses predominantly occur under * For networks designed so that losses predominantly occur under
the management of one IP-aware node on the path, the auditor the management of one IP-aware node on the path, the auditor
could be located at this bottleneck. It could simply compare could be located at this bottleneck. It could simply compare
Congestion Exposure Signals with actual local losses. Most Congestion Exposure Signals with actual local losses. This is
consumer access networks are design to this model, e.g. the a good model for most consumer access networks and audit
radio network controller (RNC) in a cellular network or the accuracy could well be sufficient even if losses occasionally
broadband remote access server (BRAS) in a digital subscriber occurred at other nodes in the network, such as border gateways
line (DSL) network. Unlike the above TCP-specific solution, (see Section 4.3 for details).
this would work for IP packets carrying any transport layer
protocol, and whether encrypted or not.
The accuracy of an auditor at one predominant bottleneck might
still be sufficient, even if losses occasionally occurred at
other nodes in the network (e.g. border gateways). Although
the auditor at the predominant bottleneck would not always be
able to detect losses at other nodes, transports would not know
where losses were occurring either. Therefore any transport
would not know which losses it could cheat on without getting
caught, and which ones it couldn't.
Given that loss-based and ECN-based Congestion Exposure might Given that loss-based and ECN-based Congestion Exposure might
sometimes be best audited at different locations, have distinct sometimes be best audited at different locations, having distinct
encodings would widen the design space for the auditing function. encodings would widen the design space for the auditing function.
{Bob: Got to here making suggested changes.}
3. Representing Congestion Exposure 3. Representing Congestion Exposure
Most protocol specifications start with a description of packet Most protocol specifications start with a description of packet
formats and code points with their associated meanings. This formats and codepoints with their associated meanings. This document
document does not: It is already known that choosing the encoding for does not: It is already known that choosing the encoding for the
the Congestion Exposure Signal is likely to entail some engineering Congestion Exposure Signal is likely to entail some engineering
compromises that have the potential to reduce the protocol's compromises that have the potential to reduce the protocol's
usefulness in some settings. Rather than making these engineering usefulness in some settings. Rather than making these engineering
choices prematurely, this document sidesteps the encoding problem by choices prematurely, this document sidesteps the encoding problem by
describing an abstract representation of Congestion Exposure Signal. describing an abstract representation of a Congestion Exposure
All of the elements of the protocol can be defined in terms of this Signal. All of the elements of the protocol can be defined in terms
abstract representation. Most important, the preliminary use cases of this abstract representation. Most important, the preliminary use
for the protocol are described in terms of the abstract cases for the protocol are described in terms of the abstract
representation in companion documents. representation in companion documents [I-D.conex-concepts-uses].
Once we have some example use cases we can evaluate different Once we have some example use cases we can evaluate different
encoding schemes. Since these schemes are likely to include some encoding schemes. Since these schemes are likely to include some
conflated code points, some information will be lost resulting in conflated code points, some information will be lost resulting in
weakening or disabling some of the algorithms and eliminating some weakening or disabling some of the algorithms and eliminating some
use cases. use cases.
The goal of this approach is to be as complete as possible for The goal of this approach is to be as complete as possible for
discovering the potential usage and capabilities of the Congestion discovering the potential usage and capabilities of the Congestion
Exposure protocol, so we have some hope of making optimal design Exposure protocol, so we have some hope of making optimal design
decisions when choosing the encoding. decisions when choosing the encoding.
3.1. One Simple Encoding 3.1. Strawman Encoding
As an aid to the reader, it might be helpful to describe one simple
encoding of the Congestion Exposure protocol: set IPv4 header bit 48
(aka the "evil bit" [RFC3514]) on all retransmissions or once per ECN
signaled window reduction. Clearly network devices along the forward
path can see this bit and act on it. For example they can count
marked and unmarked packets to estimate the congestion levels along
the path.
However this encoding has been forbidden by RFC xxxx, which seeks to As an aid to the reader, it might be helpful to describe a naive
preserve the last unallocated bit in the IPv4 header for some strawman encoding of the Congestion Exposure protocol described
unspecifed future use. solely in terms of TCP: set the Reserved bit in the IPv4 header (bit
48 counting from zero [RFC0791]--aka the "evil bit" [RFC3514]) on all
retransmissions or once per ECN signaled window reduction. Clearly
network devices along the forward path can see this bit and act on
it. For example they can count marked and unmarked packets to
estimate the congestion levels along the path.
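To make the strawman concrete, the sketch below shows how a device on the forward path might test bit 48 of an IPv4 header and estimate a congestion level from marked and unmarked traffic. The function names and the helper that builds sample headers are hypothetical, and weighting by the Total Length field rather than counting packets is an assumption made for illustration.

```python
def reserved_bit_set(ipv4_header: bytes) -> bool:
    """Return True if bit 48 (counting from zero) of the IPv4 header is set."""
    if len(ipv4_header) < 20:
        raise ValueError("IPv4 header must be at least 20 bytes")
    # Bit 48 is the most significant bit of byte 6 (the first Flags bit).
    return bool(ipv4_header[6] & 0x80)


def marked_fraction(headers) -> float:
    """Fraction of observed bytes carried in packets with the bit set."""
    marked = total = 0
    for header in headers:
        length = int.from_bytes(header[2:4], "big")  # Total Length field
        total += length
        if reserved_bit_set(header):
            marked += length
    return marked / total if total else 0.0


def make_header(total_length: int, marked: bool) -> bytes:
    """Build a minimal 20-byte IPv4 header for illustration (checksum left zero)."""
    header = bytearray(20)
    header[0] = 0x45                                # version 4, IHL 5
    header[2:4] = total_length.to_bytes(2, "big")
    if marked:
        header[6] |= 0x80
    return bytes(header)


# Three unmarked packets and one marked packet of equal size.
sample = [make_header(1000, False)] * 3 + [make_header(1000, True)]
fraction = marked_fraction(sample)
```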
Furthermore this encoding, by itself, does not sufficiently support However, the IESG has chartered the ConEx working group to establish
partial deployment or strong auditing and might motivate users and/or that there is sufficient demand for an IPv6 ConEx protocol before
applications to misrepresent the congestion that they are be causing. using the last available bit in the IPv4 header. Furthermore this
encoding, by itself, does not sufficiently support partial deployment
or strong auditing and might motivate users and/or applications to
misrepresent the congestion that they are causing.
However, this simple encoding does present a clear mental model of Nonetheless, this strawman encoding does present a clear mental model
how the Congestion Exposure protocol functions and is very useful for of how the Congestion Exposure protocol might function under various
conducting thought experiments about how the protocol might function uses.
under various uses.
3.2. ECN Based Encoding 3.2. ECN Based Encoding
Bob Briscoe's PhD thesis [Refb-dis], and many derivative works The re-ECN specification [I-D.briscoe-tsvwg-re-ecn-tcp] presents an
including RE-ECN [I-D.briscoe-tsvwg-re-ecn-tcp] present an ECN based ECN based implementation of ConEx. The central theme of this work is
implementation of ConEx. The central theme of this work includes an audit mechanism that can provide sufficient disincentives against
strong disincentives for misrepresenting congestion misrepresenting congestion [I-D.briscoe-tsvwg-re-ecn-motiv], which is
[I-D.briscoe-tsvwg-re-ecn-motiv]. However, it also pre-supposes the analysed extensively in Briscoe's PhD dissertation [Refb-dis].
full deployment of ECN, and does not adequately signal congestion
indicated by packet loss. Furthermore, given that after 10 years ECN
still has not been widely deployed, it does not seem prudent to
require its deployment as a prerequisite for deploying a Congestion
Exposure protocol.
As it currently stands, this work fails to meet the "partial The re-ECN encoding is tightly integrated with the encoding of ECN in
deployment" requirement described above in section Section 2. the IP header. However, re-ECN can be incrementally deployed on
hosts whether or not any networks support ECN marking and whether or
not any networks take note of re-ECN markings. Nonetheless, the
audit function has only been formally analysed where at least one
autonomous network has deployed ECN marking, which it uses to audit
whether the Congestion Exposure Signal matches actual congestion.
Thus, even if networks have not deployed ECN, re-ECN acts perfectly
well as a loss-based Congestion Exposure protocol. As such, a
network could potentially audit re-ECN signals against losses using
the loss-based audit techniques in Section 4.3, rather than deploying
ECN.
Although re-ECN does not require networks to support ECN, it still
embodies a major incremental deployment challenge; a sender cannot
use re-ECN unless the receiver at least supports ECN. Most operating
systems currently being supplied (late 2010) implement ECN, but it is
turned off by default at the client end, even though it is on by
default at the server end. This is primarily because one home
gateway model widely supplied in 2006 crashes if a TCP client behind
it attempts to use ECN (there are issues with some other home
gateways from that era, but they are surmountable with ECN black-hole
detection).
Given that, 10 years after standardisation, ECN has still not been
widely enabled on TCP clients, if at all possible the Congestion
Exposure protocol should not require the receiver to be ECN capable.
Therefore, as it currently stands, the re-ECN encoding would fail to
meet the "partial deployment" requirement of Section 2.
For a tutorial background on Re-Feedback techniques, see [,,] {Bob: For a tutorial background on Re-Feedback techniques, see [,,] {Bob:
Matt, What did you have in mind here? SIGCOMM'05 paper? IEEE Matt, What did you have in mind here? SIGCOMM'05 paper? IEEE
Spectrum article? Re-ECN Web page?}. Spectrum article? Re-ECN Web page?}.
3.2.1. ECN Changes 3.2.1. ECN Changes
It is important to note that Briscoe's work proposes some relatively Although the re-ECN protocol requires no changes to the network side
minor modifications to the ECN protocol specified in RFC 3168. They of the ECN protocol, it is important to note that it does propose
include: redefining the ECT(0) and ECT(1) code points (this is some relatively minor modifications to the host-to-host aspects of
consistent with RFC3168 but requires deprecating [RFC3540]); the ECN protocol specified in RFC 3168. They include: redefining the
permitting routers to send ECN signals at a different threshold than ECT(1) code point (the change is consistent with RFC3168 but requires
packet loss; modifications to the ECN negotiations carried on the SYN deprecating the experimental ECN nonce [RFC3540]); modifications to
and SYN-ACK; and using a different state machine to carry ECN signals the ECN negotiations carried on the SYN and SYN-ACK; and using a
in the transport acknowledgments from the Receiver to the Sender. different state machine to carry ECN signals in the transport
This later change permits the transport protocol to carry multiple acknowledgments from the Receiver to the Sender. This last change
congestion signals per round trip, and greatly simplifies accurate permits the transport protocol to carry multiple congestion signals
auditing. per round trip, and greatly simplifies accurate auditing.
All of these adjustments to RFC 3168 may also be needed in a future All of these adjustments to RFC 3168 may also be needed in a future
standardized Congestion Exposure protocol. There will be very standardized Congestion Exposure protocol. There will need to be
careful considerations about any proposed changes to ECN or other very careful consideration of any proposed changes to ECN or other
existing protocols, because any such changes increase the cost of existing protocols, because any such changes increase the cost of
deployment. deployment.
3.3. Abstract Encoding 3.3. Abstract Encoding
{ToDo: Not really done, extra terse} The Congestion Exposure protocol could take one of two different
encodings: independently settable bits or an enumerated set of
mutually exclusive codepoints.
Model with two different encodings: individual bits or as an In both cases, the amount of congestion is signaled by the volume of
enumerated set. Enumerated encoding is probably good enough for most marked data--just as the volume of lost data or ECN marked data
purposes, but it must not be forgotten that it does lose some small signals the amount of congestion experienced. Thus the size of each
amount of information. packet carrying a Congestion Exposure Signal is significant.
3.3.1. Separate Bits 3.3.1. Independent Bits
One bit each for This encoding involves a field of four flag bits, each of which the
sender can set independently to indicate to the network that:
o Not supported (implicit signal from legacy transport senders) ConEx (Not-ConEx) The transport is (or is not) using ConEx with this
packet (the protocol MUST be arranged so that legacy transport
senders implicitly send Not-ConEx)
o Congestion indicated by packet losses Re-Echo-Loss (Not-Re-Echo-Loss) The transport has (or has not)
experienced a loss
o ECN signaled congestion Re-Echo-ECN (Not-Re-Echo-ECN) The transport has (or has not)
experienced ECN signaled congestion
o Pre-congestion credit (AKA green). See Section 4.2.1 devices Credit (Not-Credit) The transport is (or is not) building up
below. congestion credit (see Section 4.3 on audit devices)
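As an illustration of the independent-bits encoding, the four flags could be modelled as follows. This is a sketch only: the bit positions are arbitrary, the name ConExFlags is hypothetical, and treating a packet as ConEx-Marked only when the ConEx flag is also set is an assumption rather than something the text above mandates.

```python
from enum import IntFlag


class ConExFlags(IntFlag):
    """Four independently settable flag bits; bit positions are illustrative only."""
    CONEX = 0b0001
    RE_ECHO_LOSS = 0b0010
    RE_ECHO_ECN = 0b0100
    CREDIT = 0b1000


# A legacy sender sets none of the bits, so it implicitly sends Not-ConEx.
NOT_CONEX = ConExFlags(0)


def is_conex_marked(flags: ConExFlags) -> bool:
    """True if the packet is ConEx-capable and carries at least one marking."""
    marks = ConExFlags.RE_ECHO_LOSS | ConExFlags.RE_ECHO_ECN | ConExFlags.CREDIT
    return bool(flags & ConExFlags.CONEX) and bool(flags & marks)


# Independent bits allow combinations such as Re-Echo-Loss together with Credit.
combined = ConExFlags.CONEX | ConExFlags.RE_ECHO_LOSS | ConExFlags.CREDIT
```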
3.3.2. Enumerated Encoding 3.3.2. Codepoint Encoding
For enumerated encoding some marks must be delayed such that each This encoding involves a bit-field large enough to signal one of the
packet only carries at most one mark. following five codepoints:
ENUM {Not_Supported, No_Mark, Black_ECN, Black_Loss, Green} ENUM {Not-ConEx, ConEx, Re-Echo-Loss, Re-Echo-ECN, Credit}
Each named codepoint has the same meaning as in the encoding using
independent bits (Section 3.3.1). The use of any one codepoint
implies the negative of all the others, except the last three
codepoints (Re-Echo-Loss, Re-Echo-ECN and Credit) obviously also
imply ConEx is supported.
Inherently, the semantics of most of the enumerated codepoints are
mutually exclusive. 'Credit' is the only one that might need to be
used in combination with either Re-Echo-Loss or Re-Echo-ECN, but even
that requirement is questionable. It must not be forgotten that the
enumerated encoding loses the flexibility to signal these two
combinations, whereas the encoding with four independent bits is not
so limited. Alternatively two extra codepoints could be assigned to
these two combinations of semantics.
{ToDo: Default behaviour for Currently Unused codepoints}
{ToDo: Signal from Policer to Receiver to distinguish policy-induced
drop from congestion-induced drop}
Some might prefer to refer to the codepoints by the following
colours, respectively. The same colours (except Purple) were used to
describe the re-ECN codepoints:
ENUM {White, Grey, Purple, Black, Green}.
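The following sketch models the five codepoints and one way a sender might cope with the loss of combinations: deferring extra marks so that each packet carries at most one. The function one_mark_per_packet and its queueing policy are hypothetical, and the sketch ignores packet sizes even though the signal is defined by marked volume.

```python
from collections import deque
from enum import Enum


class Codepoint(Enum):
    """The five mutually exclusive codepoints, with their colour aliases."""
    NOT_CONEX = "White"
    CONEX = "Grey"
    RE_ECHO_LOSS = "Purple"
    RE_ECHO_ECN = "Black"
    CREDIT = "Green"


def one_mark_per_packet(wanted):
    """Assign one codepoint per packet, deferring surplus marks to later packets.

    wanted: one list per outgoing packet, holding the marks the sender would
    like to place on that packet. Returns (codepoints, leftover_marks).
    """
    backlog = deque()
    sent = []
    for marks in wanted:
        backlog.extend(marks)
        # Send the oldest pending mark, or plain ConEx if nothing is pending.
        sent.append(backlog.popleft() if backlog else Codepoint.CONEX)
    return sent, list(backlog)


# The first packet wants two marks, so Credit slips to the second packet.
sent, leftover = one_mark_per_packet(
    [[Codepoint.RE_ECHO_LOSS, Codepoint.CREDIT], [], [Codepoint.RE_ECHO_ECN]]
)
```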
4. Congestion Exposure Components 4. Congestion Exposure Components
{ToDo: Picture of the components, similar to that in the last
slideset about conex-concepts-uses?}
4.1. Modified Senders 4.1. Modified Senders
Send Congestion Exposure Signals per congestion signals. The sending transport needs to be modified to send Congestion
Exposure Signals in response to congestion feedback signals.
4.2. Policy Devices 4.2. Receivers (Optionally Modified)
4.2.1. Audit The receiving transport may already feed back sufficiently useful
signals to the sender so that it does not need to be altered.
For loss: detect retransmissions by monitoring sequence numbers. However, a TCP receiver feeds back ECN congestion signals no more
Assure that #retransmissions<=#Black_Loss than once within a round trip. The sender may require more precise
feedback from the receiver, otherwise it will appear to be
understating its Congestion Exposure Signals (see Section 3.2.1).
(May need to include a fudge factor, because it would be more robust Ideally, Congestion Exposure should be added to a transport like TCP
to mark the packet after a retransmission. Otherwise network devices without mandatory modifications to the receiver. But an optional
that discard marked packets will cause connectivity failures, rather modification to the receiver could be recommended for precision.
than poor performance). This was the approach taken when adding re-ECN to TCP
[I-D.briscoe-tsvwg-re-ecn-tcp].
For ECN: count Congestion Exposure Signals and ECN. Would normally 4.3. Audit
need to delay ECN by one RTT to avoid false positives. Alternative:
use Green (pre-credits) to assure that #ECN<=#Black_ECN+#GREEN, even
though the #Black_ECN is delayed by one RTT.
4.2.2. Policers and Shapers To audit Congestion Exposure Signals against actual losses an auditor
could use one of the following techniques:
{ToDo: Beware these terms are defined differently than the TCP-specific approach: The auditor could monitor TCP flows or
conventional usage.} aggregates of flows, only holding state on a flow if it first
sends a Credit or a Re-Echo-Loss marking. The auditor could
detect retransmissions by monitoring sequence numbers. It would
assure that (volume of retransmitted data) <= (volume of data
marked Re-Echo-Loss). Traffic would only be auditable in this way
if it conformed to the standard TCP protocol and the IP payload
was not encrypted (e.g. with IPsec).
{ToDo: Abridge from existing doc?} Predominant bottleneck approach: Unlike the above TCP-specific
solution, this technique would work for IP packets carrying any
transport layer protocol, and whether encrypted or not. But it
only works well for networks designed so that losses predominantly
occur under the management of one IP-aware node on the path. The
auditor could then be located at this bottleneck. It could simply
compare Congestion Exposure Signals with actual local losses.
Most consumer access networks are designed to this model, e.g. the
radio network controller (RNC) in a cellular network or the
broadband remote access server (BRAS) in a digital subscriber line
(DSL) network.
The accuracy of an auditor at one predominant bottleneck might
still be sufficient, even if losses occasionally occurred at other
nodes in the network (e.g. border gateways). Although the auditor
at the predominant bottleneck would not always be able to detect
losses at other nodes, transports would not know where losses were
occurring either. Therefore any transport would not know which
losses it could cheat on without getting caught, and which ones it
couldn't.
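
The TCP-specific approach might be sketched as follows. This is a simplified illustration, not a definitive auditor: the class names, the flow_id key and the string markings are hypothetical, sequence-number wrap-around is ignored, and reordered packets would be miscounted as retransmissions. A real auditor would probably also need some tolerance for a Re-Echo-Loss marking that arrives shortly after the retransmission it accounts for.

```python
from dataclasses import dataclass


@dataclass
class FlowState:
    next_seq: int = 0             # end of the highest byte range seen so far
    retransmitted_bytes: int = 0  # bytes seen more than once
    re_echo_loss_bytes: int = 0   # bytes carried in Re-Echo-Loss packets


class LossAuditor:
    def __init__(self):
        self.flows = {}  # flow_id -> FlowState

    def observe(self, flow_id, seq, length, marking):
        """Return True while the flow appears compliant."""
        state = self.flows.get(flow_id)
        if state is None:
            # Only hold state if the flow first sends Credit or Re-Echo-Loss.
            if marking not in ("Credit", "Re-Echo-Loss"):
                return True
            state = FlowState(next_seq=seq)
            self.flows[flow_id] = state
        end = seq + length
        if marking == "Re-Echo-Loss":
            state.re_echo_loss_bytes += length
        # Bytes below next_seq have been seen before: count as retransmitted.
        state.retransmitted_bytes += max(0, min(end, state.next_seq) - seq)
        state.next_seq = max(state.next_seq, end)
        return state.retransmitted_bytes <= state.re_echo_loss_bytes
```

The comparison is by volume of data, matching the condition (volume of retransmitted data) <= (volume of data marked Re-Echo-Loss).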
To audit Congestion Exposure Signals against actual ECN markings or
losses, the auditor could work as follows: monitor flows or
aggregates of flows, only holding state on a flow if it first sends a
Credit or either Re-Echo marking. Count the number of bytes marked
with Credit or Re-Echo-ECN. Separately count the number of bytes
marked with ECN. Use Credits to assure that #ECN<=#Re-Echo-
ECN+#Credit, even though the Re-Echo-ECN markings are delayed by at
least one RTT.
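
The byte-counting check for ECN might be sketched as below. The class name, the flow_id key and the string markings are assumptions for illustration; the sketch only shows the bookkeeping, and says nothing about what the auditor does with a non-compliant flow.

```python
class EcnAuditor:
    def __init__(self):
        self.flows = {}  # flow_id -> byte counters

    def observe(self, flow_id, length, conex_marking, ecn_marked):
        """Return True while #ECN <= #Re-Echo-ECN + #Credit holds (in bytes)."""
        counters = self.flows.get(flow_id)
        if counters is None:
            # Only hold state once the flow sends Credit or a Re-Echo marking.
            if conex_marking not in ("Credit", "Re-Echo-ECN", "Re-Echo-Loss"):
                return True
            counters = {"ecn": 0, "re_echo_ecn": 0, "credit": 0}
            self.flows[flow_id] = counters
        if conex_marking == "Credit":
            counters["credit"] += length
        elif conex_marking == "Re-Echo-ECN":
            counters["re_echo_ecn"] += length
        if ecn_marked:
            counters["ecn"] += length
        return counters["ecn"] <= counters["re_echo_ecn"] + counters["credit"]
```

Credit sent in advance is what lets the inequality hold during the round trip before the matching Re-Echo-ECN markings can arrive.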
Note that an auditing device involves no policy configuration; it
merely enforces protocol compliance, not policy.
4.4. Policy Devices
4.4.1. Congestion Policers
Note that a congestion policer can be implemented in a very similar
way to a bit-rate policer, but its effect is focused solely on
traffic causing congestion downstream, not on all traffic just in
case it causes congestion.
It monitors all ConEx traffic entering a network, or some
identifiable subset. Using Congestion Exposure signals, it measures
the amount of congestion being caused by this traffic. If this
exceeds a policy-configured 'congestion-bit-rate' the congestion
policer will limit all the monitored ConEx traffic. A congestion
policer can be implemented by a simple token bucket. But unlike a
bit-rate policer, it only removes tokens when forwarding packets that
are ConEx marked. See [CongPol] for details.
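
A token-bucket congestion policer of this kind might look roughly like the sketch below. It is an illustration under stated assumptions, not the design in [CongPol]: the class and parameter names are hypothetical, the fill rate is expressed in bytes per second rather than bits, the caller decides which markings count as congestion_marked, and "limit" is modelled simply as refusing to forward while the bucket is empty.

```python
class CongestionPolicer:
    def __init__(self, fill_rate_bytes_per_s, bucket_depth_bytes):
        self.fill_rate = fill_rate_bytes_per_s  # policy-configured allowance
        self.depth = bucket_depth_bytes
        self.tokens = bucket_depth_bytes        # start with a full bucket
        self.last_time = 0.0

    def forward(self, now, length, congestion_marked):
        """Return True if the packet may be forwarded at time `now` (seconds)."""
        # Refill at the configured rate, capped at the bucket depth.
        elapsed = now - self.last_time
        self.last_time = now
        self.tokens = min(self.depth, self.tokens + elapsed * self.fill_rate)
        # While the bucket is empty, all monitored traffic is limited.
        if self.tokens <= 0:
            return False
        # Unlike a bit-rate policer, only ConEx-marked packets drain tokens.
        if congestion_marked:
            self.tokens -= length
        return True
```

Traffic that carries no congestion markings never drains the bucket, so it is unaffected however high its bit-rate is.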
4.4.2. Other Policy Devices
Other policy devices that use Congestion Exposure signaling might
monitor traffic based on Congestion Exposure Signals in much the same
way as the monitoring element of a Congestion Policer. But the
resulting action could be different. It might re-route traffic or
downgrade the class of service.
It might do nothing directly to the traffic, but instead report
measurements of Congestion Exposure Signals to systems designed to
control congestion indirectly. For instance the measurements might
be used to trigger penalty clauses in contracts, to levy charges
between networks based on congestion or simply to notify customers
who cause excessive congestion.
5. IANA Considerations 5. IANA Considerations
This memo includes no request to IANA. This memo includes no request to IANA.
Note to RFC Editor: this section may be removed on publication as an Note to RFC Editor: this section may be removed on publication as an
RFC. RFC.
6. Security Considerations 6. Security Considerations
{ToDo:} Significant parts of this whole document are about the auditability
of Congestion Exposure Signals, in particular Section 4.3.
7. Conclusions 7. Conclusions
{ToDo:} {ToDo:}
8. Acknowledgements 8. Acknowledgements
This document was improved by review comments from Toby Moncaster. This document was improved by review comments from Toby Moncaster.
9. Comments Solicited 9. Comments Solicited
skipping to change at page 11, line 7 skipping to change at page 13, line 42
10.1. Normative References 10.1. Normative References
[RFC2119] Bradner, S., "Key words for use in [RFC2119] Bradner, S., "Key words for use in
RFCs to Indicate Requirement RFCs to Indicate Requirement
Levels", BCP 14, RFC 2119, Levels", BCP 14, RFC 2119,
March 1997. March 1997.
10.2. Informative References 10.2. Informative References
[CongPol] Jacquet, A., Briscoe, B., and T.
Moncaster, "Policing Freedom to Use
the Internet Resource Pool", Proc
ACM Workshop on Re-Architecting the
Internet (ReArch'08) ,
December 2008, <http://
www.bobbriscoe.net/
pubs.html#polfree>.
[I-D.briscoe-tsvwg-re-ecn-motiv] Briscoe, B., Jacquet, A., [I-D.briscoe-tsvwg-re-ecn-motiv] Briscoe, B., Jacquet, A.,
Moncaster, T., and A. Smith, "Re- Moncaster, T., and A. Smith, "Re-
ECN: A Framework for adding ECN: A Framework for adding
Congestion Accountability to Congestion Accountability to
TCP/IP", draft-briscoe-tsvwg-re- TCP/IP", draft-briscoe-tsvwg-re-
ecn-tcp-motivation-01 (work in ecn-tcp-motivation-01 (work in
progress), September 2009. progress), September 2009.
[I-D.briscoe-tsvwg-re-ecn-tcp] Briscoe, B., Jacquet, A., [I-D.briscoe-tsvwg-re-ecn-tcp] Briscoe, B., Jacquet, A.,
Moncaster, T., and A. Smith, "Re- Moncaster, T., and A. Smith, "Re-
ECN: Adding Accountability for ECN: Adding Accountability for
Causing Congestion to TCP/IP", Causing Congestion to TCP/IP",
draft-briscoe-tsvwg-re-ecn-tcp-08 draft-briscoe-tsvwg-re-ecn-tcp-08
(work in progress), September 2009. (work in progress), September 2009.
[I-D.conex-concepts-uses] Briscoe, B., Woundy, R., Moncaster,
T., and J. Leslie, "ConEx Concepts
and Use Cases", draft-moncaster-
conex-concepts-uses-01 (work in
progress), July 2010.
[I-D.ietf-ledbat-congestion] Shalunov, S. and G. Hazel, "Low [I-D.ietf-ledbat-congestion] Shalunov, S. and G. Hazel, "Low
Extra Delay Background Transport Extra Delay Background Transport
(LEDBAT)", (LEDBAT)",
draft-ietf-ledbat-congestion-02 draft-ietf-ledbat-congestion-02
(work in progress), July 2010. (work in progress), July 2010.
[I-D.sridharan-tcpm-ctcp] Sridharan, M., Tan, K., Bansal, D., [I-D.sridharan-tcpm-ctcp] Sridharan, M., Tan, K., Bansal, D.,
and D. Thaler, "Compound TCP: A New and D. Thaler, "Compound TCP: A New
TCP Congestion Control for High- TCP Congestion Control for High-
Speed and Long Distance Networks", Speed and Long Distance Networks",
draft-sridharan-tcpm-ctcp-02 (work draft-sridharan-tcpm-ctcp-02 (work
in progress), November 2008. in progress), November 2008.
[RFC0791] Postel, J., "Internet Protocol",
STD 5, RFC 791, September 1981.
[RFC2309] Braden, B., Clark, D., Crowcroft, [RFC2309] Braden, B., Clark, D., Crowcroft,
J., Davie, B., Deering, S., Estrin, J., Davie, B., Deering, S., Estrin,
D., Floyd, S., Jacobson, V., D., Floyd, S., Jacobson, V.,
Minshall, G., Partridge, C., Minshall, G., Partridge, C.,
Peterson, L., Ramakrishnan, K., Peterson, L., Ramakrishnan, K.,
Shenker, S., Wroclawski, J., and L. Shenker, S., Wroclawski, J., and L.
Zhang, "Recommendations on Queue Zhang, "Recommendations on Queue
Management and Congestion Avoidance Management and Congestion Avoidance
in the Internet", RFC 2309, in the Internet", RFC 2309,
April 1998. April 1998.
 End of changes. 53 change blocks. 
136 lines changed or deleted 286 lines changed or added