Diff: draft-mathis-conex-abstract-mech-00b.txt - draft-mathis-conex-abstract-mech-00c.txt

	< draft-mathis-conex-abstract-mech-00b.txt	draft-mathis-conex-abstract-mech-00c.txt >

	Congestion Exposure (ConEx) M. Mathis	Congestion Exposure (ConEx) M. Mathis
	Working Group Google	Working Group Google
	Internet-Draft B. Briscoe	Internet-Draft B. Briscoe
	Intended status: Informational BT	Intended status: Informational BT

	Expires: April 17, 2011 October 14, 2010	Expires: April 18, 2011 October 15, 2010

	Congestion Exposure (ConEx) Concepts and Abstract Mechanism	Congestion Exposure (ConEx) Concepts and Abstract Mechanism

	draft-mathis-conex-abstract-mech-00b	draft-mathis-conex-abstract-mech-00c

	Abstract	Abstract

	This document describes an abstract mechanism by which senders inform	This document describes an abstract mechanism by which senders inform
	the network about the congestion encountered by packets earlier in	the network about the congestion encountered by packets earlier in
	the same flow. Today, the network may signal congestion to the	the same flow. Today, the network may signal congestion to the
	receiver by ECN markings or by dropping packets, and the receiver may	receiver by ECN markings or by dropping packets, and the receiver may
	pass this information back to the sender in transport-layer feedback.	pass this information back to the sender in transport-layer feedback.
	The mechanism to be developed by the ConEx WG will enable the sender	The mechanism to be developed by the ConEx WG will enable the sender
	to also relay this congestion information back into the network in-	to also relay this congestion information back into the network in-

	skipping to change at page 1, line 40	skipping to change at page 1, line 40
	Internet-Drafts are working documents of the Internet Engineering	Internet-Drafts are working documents of the Internet Engineering
	Task Force (IETF). Note that other groups may also distribute	Task Force (IETF). Note that other groups may also distribute
	working documents as Internet-Drafts. The list of current Internet-	working documents as Internet-Drafts. The list of current Internet-
	Drafts is at http://datatracker.ietf.org/drafts/current/.	Drafts is at http://datatracker.ietf.org/drafts/current/.

	Internet-Drafts are draft documents valid for a maximum of six months	Internet-Drafts are draft documents valid for a maximum of six months
	and may be updated, replaced, or obsoleted by other documents at any	and may be updated, replaced, or obsoleted by other documents at any
	time. It is inappropriate to use Internet-Drafts as reference	time. It is inappropriate to use Internet-Drafts as reference
	material or to cite them other than as "work in progress."	material or to cite them other than as "work in progress."


	This Internet-Draft will expire on April 17, 2011.	This Internet-Draft will expire on April 18, 2011.

	Copyright Notice	Copyright Notice

	Copyright (c) 2010 IETF Trust and the persons identified as the	Copyright (c) 2010 IETF Trust and the persons identified as the
	document authors. All rights reserved.	document authors. All rights reserved.

	This document is subject to BCP 78 and the IETF Trust's Legal	This document is subject to BCP 78 and the IETF Trust's Legal
	Provisions Relating to IETF Documents	Provisions Relating to IETF Documents
	(http://trustee.ietf.org/license-info) in effect on the date of	(http://trustee.ietf.org/license-info) in effect on the date of
	publication of this document. Please review these documents	publication of this document. Please review these documents

	skipping to change at page 2, line 17	skipping to change at page 2, line 17
	include Simplified BSD License text as described in Section 4.e of	include Simplified BSD License text as described in Section 4.e of
	the Trust Legal Provisions and are provided without warranty as	the Trust Legal Provisions and are provided without warranty as
	described in the Simplified BSD License.	described in the Simplified BSD License.

	Table of Contents	Table of Contents

	1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3	1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
	1.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . 4	1.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . 4
	2. Requirements for the Congestion Exposure Signal . . . . . . . 5	2. Requirements for the Congestion Exposure Signal . . . . . . . 5
	3. Representing Congestion Exposure . . . . . . . . . . . . . . . 7	3. Representing Congestion Exposure . . . . . . . . . . . . . . . 7

	3.1. One Simple Encoding . . . . . . . . . . . . . . . . . . . 7	3.1. Strawman Encoding . . . . . . . . . . . . . . . . . . . . 7
	3.2. ECN Based Encoding . . . . . . . . . . . . . . . . . . . . 8	3.2. ECN Based Encoding . . . . . . . . . . . . . . . . . . . . 8

	3.2.1. ECN Changes . . . . . . . . . . . . . . . . . . . . . 8	3.2.1. ECN Changes . . . . . . . . . . . . . . . . . . . . . 9
	3.3. Abstract Encoding . . . . . . . . . . . . . . . . . . . . 9	3.3. Abstract Encoding . . . . . . . . . . . . . . . . . . . . 9

	3.3.1. Separate Bits . . . . . . . . . . . . . . . . . . . . 9	3.3.1. Independent Bits . . . . . . . . . . . . . . . . . . . 9
	3.3.2. Enumerated Encoding . . . . . . . . . . . . . . . . . 9	3.3.2. Codepoint Encoding . . . . . . . . . . . . . . . . . . 10
	4. Congestion Exposure Components . . . . . . . . . . . . . . . . 9	4. Congestion Exposure Components . . . . . . . . . . . . . . . . 10
	4.1. Modified Senders . . . . . . . . . . . . . . . . . . . . . 9	4.1. Modified Senders . . . . . . . . . . . . . . . . . . . . . 10
	4.2. Policy Devices . . . . . . . . . . . . . . . . . . . . . . 9	4.2. Receivers (Optionally Modified) . . . . . . . . . . . . . 11
	4.2.1. Audit . . . . . . . . . . . . . . . . . . . . . . . . 9	4.3. Audit . . . . . . . . . . . . . . . . . . . . . . . . . . 11
	4.2.2. Policers and Shapers . . . . . . . . . . . . . . . . . 10	4.4. Policy Devices . . . . . . . . . . . . . . . . . . . . . . 12
	5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 10	4.4.1. Congestion Policers . . . . . . . . . . . . . . . . . 12
	6. Security Considerations . . . . . . . . . . . . . . . . . . . 10	4.4.2. Other Policy Devices . . . . . . . . . . . . . . . . . 12
	7. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 10	5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 13
	8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 10	6. Security Considerations . . . . . . . . . . . . . . . . . . . 13
	9. Comments Solicited . . . . . . . . . . . . . . . . . . . . . . 10	7. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 13
	10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 10	8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 13
	10.1. Normative References . . . . . . . . . . . . . . . . . . . 10	9. Comments Solicited . . . . . . . . . . . . . . . . . . . . . . 13
	10.2. Informative References . . . . . . . . . . . . . . . . . . 11	10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 13
		10.1. Normative References . . . . . . . . . . . . . . . . . . . 13
		10.2. Informative References . . . . . . . . . . . . . . . . . . 13

	1. Introduction	1. Introduction

	One of the required functions of a transport protocol is controlling	One of the required functions of a transport protocol is controlling
	congestion in the network. There are three techniques in use today	congestion in the network. There are three techniques in use today
	for the network to signal congestion to a transport:	for the network to signal congestion to a transport:

	o The most common congestion signal is packet loss. When congested,	o The most common congestion signal is packet loss. When congested,
	the network simply discards some packets either as part of an	the network simply discards some packets either as part of an
	explicit control function [RFC2309] or as the consequence of a	explicit control function [RFC2309] or as the consequence of a

	skipping to change at page 4, line 25	skipping to change at page 4, line 25
	| Sender |>-(new)-IP layer Congestion Exposure Signal--->| Receiver|	| Sender |>-(new)-IP layer Congestion Exposure Signal--->| Receiver|
	| | (Carried in Data Packet Headers) | |	| | (Carried in Data Packet Headers) | |
	| | +-----------+ | |	| | +-----------+ | |
	| |>=Data=Path=>|(Congested)|>=====Data=Path=====>| |	| |>=Data=Path=>|(Congested)|>=====Data=Path=====>| |
	| | | Network |>-Congestion-Signal->| |	| | | Network |>-Congestion-Signal->| |
	| | | Device | | |	| | | Device | | |
	+---------+ +-----------+ +---------+	+---------+ +-----------+ +---------+

	Not shown are policy devices along the data path that observe the	Not shown are policy devices along the data path that observe the
	Congestion Exposure Signal, and use the information to monitor or	Congestion Exposure Signal, and use the information to monitor or

	manage traffic. These are discussed in Section 4.2.	manage traffic. These are discussed in Section 4.4.

	Figure 1	Figure 1

	1.1. Terminology	1.1. Terminology

	The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",	The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
	"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this	"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
	document are to be interpreted as described in RFC 2119 [RFC2119].	document are to be interpreted as described in RFC 2119 [RFC2119].

	ConEx signals in IP packet headers from the sender to the network	ConEx signals in IP packet headers from the sender to the network
	{ToDo: These are placeholders for whatever words we decide to use}:	{ToDo: These are placeholders for whatever words we decide to use}:


	Re-Echo Loss (aka Black-Loss) The transport has experienced a loss.	Not-ConEx (aka White) The transport is not ConEx-capable


	Re-Echo ECN (aka Black-ECN) The transport has experienced an ECN	ConEx (aka Grey) The transport is ConEx-capable
	mark


	Pre-Echo (aka Green) The transport is building up credit to allow	Re-Echo-Loss (aka Purple) The transport has experienced a loss.
	for any future delay in expected ConEx signals


	Neutral (aka Grey) The transport is ConEx-capable	Re-Echo-ECN (aka Black) The transport has experienced an ECN mark
	Not-ConEx (aka White) The transport is not ConEx-capable
		Credit (aka Green) The transport is building up credit to allow for
		any future delay in expected ConEx signals

		ConEx-Marked Any of Re-Echo-Loss, Re-Echo-ECN or Credit.

		ConEx-Unmarked ConEx, but not ConEx-Marked.

	2. Requirements for the Congestion Exposure Signal	2. Requirements for the Congestion Exposure Signal


		Ideally, all the following requirements would be met by a Congestion
		Exposure Signal. However it is already known that some compromises
		will be necessary, therefore all the requirements are expressed with
		the keyword 'SHOULD' rather than 'MUST'. The only mandatory
		requirement is that a concrete protocol description MUST give sound
		reasoning if it chooses not to meet any of these requirements:

	a. The Congestion Exposure Signal SHOULD be visible to internetwork	a. The Congestion Exposure Signal SHOULD be visible to internetwork
	layer devices along the entire path from the transport sender to	layer devices along the entire path from the transport sender to
	the transport receiver. Equivalently, it SHOULD be present in	the transport receiver. Equivalently, it SHOULD be present in
	the IPv4 or IPv6 header, and in the outermost IP header if using	the IPv4 or IPv6 header, and in the outermost IP header if using
	IP in IP tunnelling. The Congestion Exposure Signal SHOULD be	IP in IP tunnelling. The Congestion Exposure Signal SHOULD be
	immutable once set by the transport sender. A corollary of these	immutable once set by the transport sender. A corollary of these
	requirements is that existing (legacy) networking gear SHOULD	requirements is that existing (legacy) networking gear SHOULD
	pass the Congestion Exposure Signal silently without	pass the Congestion Exposure Signal silently without
	modification.	modification.


	skipping to change at page 5, line 47	skipping to change at page 6, line 8
	actually experiencing.	actually experiencing.

	d. The Congestion Exposure Signal SHOULD be timely. There will be a	d. The Congestion Exposure Signal SHOULD be timely. There will be a
	delay between the time when an auditing device sees an actual	delay between the time when an auditing device sees an actual
	congestion signal and when it sees the subsequent Congestion	congestion signal and when it sees the subsequent Congestion
	Exposure Signal from the sender. The minimum delay will be one	Exposure Signal from the sender. The minimum delay will be one
	round trip, but it may be much longer depending on the	round trip, but it may be much longer depending on the
	transport's choice of feedback delay (consider RTCP [RFC3550] for	transport's choice of feedback delay (consider RTCP [RFC3550] for
	example). It is not practical to expect auditing devices in the	example). It is not practical to expect auditing devices in the
	network to make allowance for such feedback delays. Instead, the	network to make allowance for such feedback delays. Instead, the

	sender MUST be able to send Congestion Exposure signals in	sender SHOULD be able to send Congestion Exposure signals in
	advance, as 'credit' for any audit device to hold as a balance	advance, as 'credit' for any audit device to hold as a balance
	against the risk of congestion during the feedback delay. This	against the risk of congestion during the feedback delay. This
	design choice simplifies auditing devices and correctly makes the	design choice simplifies auditing devices and correctly makes the
	transport responsible for both minimising feedback delay and	transport responsible for both minimising feedback delay and
	minimising sharp increases in packets in flight that would risk	minimising sharp increases in packets in flight that would risk
	causing excessive congestion to others. This issue is discussed	causing excessive congestion to others. This issue is discussed

	in more detail in Section 4.2.1.	in more detail in Section 4.3.

	It is important to note that the auditing requirement implies a	It is important to note that the auditing requirement implies a
	number of additional constraints: The basic auditing technique is to	number of additional constraints: The basic auditing technique is to
	count both actual congestion signals and Congestion Exposure Signals	count both actual congestion signals and Congestion Exposure Signals
	someplace along the data path:	someplace along the data path:

	o For congestion signaled by ECN, auditing is most accurate when	o For congestion signaled by ECN, auditing is most accurate when
	located near the transport receiver. Within any flow or aggregate	located near the transport receiver. Within any flow or aggregate
	of flows, the total volume of ECN marked data seen near the	of flows, the total volume of ECN marked data seen near the
	receiver should always be equal to or less than the volume of data	receiver should always be equal to or less than the volume of data

	skipping to change at page 6, line 28	skipping to change at page 6, line 37

	o For congestion signaled by loss, totally accurate auditing is not	o For congestion signaled by loss, totally accurate auditing is not
	believed to be possible in the general case, because it involves a	believed to be possible in the general case, because it involves a
	network node detecting the absence of some packets, when it cannot	network node detecting the absence of some packets, when it cannot
	necessarily see the transport protocol sequence numbers and when	necessarily see the transport protocol sequence numbers and when
	the missing packets might simply be taking a different route. But	the missing packets might simply be taking a different route. But
	there are common cases where sufficient audit accuracy should be	there are common cases where sufficient audit accuracy should be
	possible:	possible:

	* For non-IPsec traffic conforming to standard TCP sequence	* For non-IPsec traffic conforming to standard TCP sequence

	numbering on a single path, the auditor could detect losses by	numbering on a single path, an auditor could detect losses by
	observing both the original transmission and the retransmission	observing both the original transmission and the retransmission
	after the loss. Such auditing would be most accurate near the	after the loss. Such auditing would be most accurate near the
	sender.	sender.

	* For networks designed so that losses predominantly occur under	* For networks designed so that losses predominantly occur under
	the management of one IP-aware node on the path, the auditor	the management of one IP-aware node on the path, the auditor
	could be located at this bottleneck. It could simply compare	could be located at this bottleneck. It could simply compare

	Congestion Exposure Signals with actual local losses. Most	Congestion Exposure Signals with actual local losses. This is
	consumer access networks are design to this model, e.g. the	a good model for most consumer access networks and audit
	radio network controller (RNC) in a cellular network or the	accuracy could well be sufficient even if losses occasionally
	broadband remote access server (BRAS) in a digital subscriber	occurred at other nodes in the network, such as border gateways
	line (DSL) network. Unlike the above TCP-specific solution,	(see Section 4.3 for details).
	this would work for IP packets carrying any transport layer
	protocol, and whether encrypted or not.

	The accuracy of an auditor at one predominant bottleneck might
	still be sufficient, even if losses occasionally occurred at
	other nodes in the network (e.g. border gateways). Although
	the auditor at the predominant bottleneck would not always be
	able to detect losses at other nodes, transports would not know
	where losses were occurring either. Therefore any transport
	would not know which losses it could cheat on without getting
	caught, and which ones it couldn't.

	Given that loss-based and ECN-based Congestion Exposure might	Given that loss-based and ECN-based Congestion Exposure might

	sometimes be best audited at different locations, have distinct	sometimes be best audited at different locations, having distinct
	encodings would widen the design space for the auditing function.	encodings would widen the design space for the auditing function.
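
The basic ECN audit described above can be sketched as a comparison of byte volumes. The following Python is a minimal illustration only and is not part of either draft revision: the class name, the field names and the use of a single running balance per flow are assumptions made for the example. It follows the inequality #ECN <= #Black_ECN + #GREEN from the -00b audit notes, counted in bytes rather than packets.

```python
from dataclasses import dataclass


@dataclass
class EcnAuditor:
    """Hypothetical per-flow auditor placed near the transport receiver."""

    ecn_marked_bytes: int = 0  # volume of data carrying an ECN congestion mark
    exposed_bytes: int = 0     # volume of data marked Re-Echo-ECN or Credit

    def observe(self, size: int, ecn_marked: bool,
                re_echo_ecn: bool, credit: bool) -> None:
        """Account for one packet of `size` bytes seen on the data path."""
        if ecn_marked:
            self.ecn_marked_bytes += size
        if re_echo_ecn or credit:
            self.exposed_bytes += size

    def compliant(self) -> bool:
        """True while the exposed volume covers the congestion actually seen."""
        return self.ecn_marked_bytes <= self.exposed_bytes
```

In this sketch a sender that has sent Credit in advance stays compliant when the first ECN mark arrives, which is the role the draft assigns to credit during the feedback delay. What an auditor does on non-compliance is left open here.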


	{Bob: Got to here making suggested changes.}

	3. Representing Congestion Exposure	3. Representing Congestion Exposure

	Most protocol specifications start with a description of packet	Most protocol specifications start with a description of packet

	formats and code points with their associated meanings. This	formats and codepoints with their associated meanings. This document
	document does not: It is already known that choosing the encoding for	does not: It is already known that choosing the encoding for the
	the Congestion Exposure Signal is likely to entail some engineering	Congestion Exposure Signal is likely to entail some engineering
	compromises that have the potential to reduce the protocol's	compromises that have the potential to reduce the protocol's
	usefulness in some settings. Rather than making these engineering	usefulness in some settings. Rather than making these engineering
	choices prematurely, this document sidesteps the encoding problem by	choices prematurely, this document sidesteps the encoding problem by

	describing an abstract representation of Congestion Exposure Signal.	describing an abstract representation of a Congestion Exposure
	All of the elements of the protocol can be defined in terms of this	Signal. All of the elements of the protocol can be defined in terms
	abstract representation. Most important, the preliminary use cases	of this abstract representation. Most important, the preliminary use
	for the protocol are described in terms of the abstract	cases for the protocol are described in terms of the abstract
	representation in companion documents.	representation in companion documents [I-D.conex-concepts-uses].

	Once we have some example use cases we can evaluate different	Once we have some example use cases we can evaluate different
	encoding schemes. Since these schemes are likely to include some	encoding schemes. Since these schemes are likely to include some
	conflated code points, some information will be lost resulting in	conflated code points, some information will be lost resulting in
	weakening or disabling some of the algorithms and eliminating some	weakening or disabling some of the algorithms and eliminating some
	use cases.	use cases.

	The goal of this approach is to be as complete as possible for	The goal of this approach is to be as complete as possible for
	discovering the potential usage and capabilities of the Congestion	discovering the potential usage and capabilities of the Congestion
	Exposure protocol, so we have some hope of making optimal design	Exposure protocol, so we have some hope of making optimal design
	decisions when choosing the encoding.	decisions when choosing the encoding.


	3.1. One Simple Encoding	3.1. Strawman Encoding

	As an aid to the reader, it might be helpful to describe one simple
	encoding of the Congestion Exposure protocol: set IPv4 header bit 48
	(aka the "evil bit" [RFC3514]) on all retransmissions or once per ECN
	signaled window reduction. Clearly network devices along the forward
	path can see this bit and act on it. For example they can count
	marked and unmarked packets to estimate the congestion levels along
	the path.


	However this encoding has been forbidden by RFC xxxx, which seeks to	As an aid to the reader, it might be helpful to describe a naive
	preserve the last unallocated bit in the IPv4 header for some	strawman encoding of the Congestion Exposure protocol described
	unspecifed future use.	solely in terms of TCP: set the Reserved bit in the IPv4 header (bit
		48 counting from zero [RFC0791]--aka the "evil bit" [RFC3514]) on all
		retransmissions or once per ECN signaled window reduction. Clearly
		network devices along the forward path can see this bit and act on
		it. For example they can count marked and unmarked packets to
		estimate the congestion levels along the path.


	Furthermore this encoding, by itself, does not sufficiently support	However, the IESG has chartered the ConEx working group to establish
	partial deployment or strong auditing and might motivate users and/or	that there is sufficient demand for an IPv6 ConEx protocol before
	applications to misrepresent the congestion that they are be causing.	using the last available bit in the IPv4 header. Furthermore this
		encoding, by itself, does not sufficiently support partial deployment
		or strong auditing and might motivate users and/or applications to
		misrepresent the congestion that they are causing.


	However, this simple encoding does present a clear mental model of	Nonetheless, this strawman encoding does present a clear mental model
	how the Congestion Exposure protocol functions and is very useful for	of how the Congestion Exposure protocol might function under various
	conducting thought experiments about how the protocol might function	uses.
	under various uses.

	3.2. ECN Based Encoding	3.2. ECN Based Encoding


	Bob Briscoe's PhD thesis [Refb-dis], and many derivative works	The re-ECN specification [I-D.briscoe-tsvwg-re-ecn-tcp] presents an
	including RE-ECN [I-D.briscoe-tsvwg-re-ecn-tcp] present an ECN based	ECN based implementation of ConEx. The central theme of this work is
	implementation of ConEx. The central theme of this work includes	an audit mechanism that can provide sufficient disincentives against
	strong disincentives for misrepresenting congestion	misrepresenting congestion [I-D.briscoe-tsvwg-re-ecn-motiv], which is
	[I-D.briscoe-tsvwg-re-ecn-motiv]. However, it also pre-supposes the	analysed extensively in Briscoe's PhD dissertation [Refb-dis].
	full deployment of ECN, and does not adequately signal congestion
	indicated by packet loss. Furthermore, given that after 10 years ECN
	still has not been widely deployed, it does not seem prudent to
	require its deployment as a prerequisite for deploying a Congestion
	Exposure protocol.


	As it currently stands, this work fails to meet the "partial	The re-ECN encoding is tightly integrated with the encoding of ECN in
	deployment" requirement described above in section Section 2.	the IP header. However, re-ECN can be incrementally deployed on
		hosts whether or not any networks support ECN marking and whether or
		not any networks take note of re-ECN markings. Nonetheless, the
		audit function has only been formally analysed where at least one
		autonomous network has deployed ECN marking, which it uses to audit
		whether the Congestion Exposure Signal matches actual congestion.

		Thus, even if networks have not deployed ECN, re-ECN acts perfectly
		well as a loss-based Congestion Exposure protocol. As such, a
		network could potentially audit re-ECN signals against losses using
		the loss-based audit techniques in Section 4.3, rather than deploying
		ECN.

		Although re-ECN does not require networks to support ECN, it still
		embodies a major incremental deployment challenge; a sender cannot
		use re-ECN unless the receiver at least supports ECN. Most operating
		systems currently being supplied (late 2010) implement ECN, but it is
		turned off by default at the client end, even though it is on by
		default at the server end. This is primarily because one home
		gateway model widely supplied in 2006 crashes if a TCP client behind
		it attempts to use ECN (there are issues with some other home
		gateways from that era, but they are surmountable with ECN black-hole
		detection).

		Given that, 10 years after standardisation, ECN has still not been
		widely enabled on TCP clients, if at all possible the Congestion
		Exposure protocol should not require the receiver to be ECN capable.
		Therefore, as it currently stands, the re-ECN encoding would fail to
		meet the "partial deployment" requirement of Section 2.

	For a tutorial background on Re-Feedback techniques, see [,,] {Bob:	For a tutorial background on Re-Feedback techniques, see [,,] {Bob:
	Matt, What did you have in mind here? SIGCOMM'05 paper? IEEE	Matt, What did you have in mind here? SIGCOMM'05 paper? IEEE
	Spectrum article? Re-ECN Web page?}.	Spectrum article? Re-ECN Web page?}.

	3.2.1. ECN Changes	3.2.1. ECN Changes


	It is important to note that Briscoe's work proposes some relatively	Although the re-ECN protocol requires no changes to the network side
	minor modifications to the ECN protocol specified in RFC 3168. They	of the ECN protocol, it is important to note that it does propose
	include: redefining the ECT(0) and ECT(1) code points (this is	some relatively minor modifications to the host-to-host aspects of
	consistent with RFC3168 but requires deprecating [RFC3540]);	the ECN protocol specified in RFC 3168. They include: redefining the
	permitting routers to send ECN signals at a different threshold than	ECT(1) code point (the change is consistent with RFC3168 but requires
	packet loss; modifications to the ECN negotiations carried on the SYN	deprecating the experimental ECN nonce [RFC3540]); modifications to
	and SYN-ACK; and using a different state machine to carry ECN signals	the ECN negotiations carried on the SYN and SYN-ACK; and using a
	in the transport acknowledgments from the Receiver to the Sender.	different state machine to carry ECN signals in the transport
	This later change permits the transport protocol to carry multiple	acknowledgments from the Receiver to the Sender. This last change
	congestion signals per round trip, and greatly simplifies accurate	permits the transport protocol to carry multiple congestion signals
	auditing.	per round trip, and greatly simplifies accurate auditing.

	All of these adjustments to RFC 3168 may also be needed in a future	All of these adjustments to RFC 3168 may also be needed in a future

	standardized Congestion Exposure protocol. There will be very	standardized Congestion Exposure protocol. There will need to be
	careful considerations about any proposed changes to ECN or other	very careful consideration of any proposed changes to ECN or other
	existing protocols, because any such changes increase the cost of	existing protocols, because any such changes increase the cost of
	deployment.	deployment.

	3.3. Abstract Encoding	3.3. Abstract Encoding


	{ToDo: Not really done, extra terse}	The Congestion Exposure protocol could take one of two different
		encodings: independently settable bits or an enumerated set of
		mutually exclusive codepoints.


	Model with two different encodings: individual bits or as an	In both cases, the amount of congestion is signaled by the volume of
	enumerated set. Enumerated encoding is probably good enough for most	marked data--just as the volume of lost data or ECN marked data
	purposes, but it must not be forgotten that it does lose some small	signals the amount of congestion experienced. Thus the size of each
	amount of information.	packet carrying a Congestion Exposure Signal is significant.
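
Because the signal is the volume of marked data, an observer would weight each packet by its size rather than count packets. A minimal sketch, assuming packets are available as (size in bytes, marked) pairs; the function name and the input shape are illustrative only and do not come from the draft:

```python
from typing import Iterable, Tuple


def marked_fraction(packets: Iterable[Tuple[int, bool]]) -> float:
    """Return the fraction of bytes carried in ConEx-Marked packets."""
    total = 0
    marked = 0
    for size, is_marked in packets:
        total += size
        if is_marked:
            marked += size
    # An empty observation window carries no congestion information.
    return marked / total if total else 0.0
```

Under this weighting a marked 1500-byte data packet contributes far more to the estimate than a marked 40-byte packet, which is why packet size matters to the encoding.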


	3.3.1. Separate Bits	3.3.1. Independent Bits


	One bit each for	This encoding involves a field of four flag bits, each of which the
		sender can set independently to indicate to the network that:


	o Not supported (implicit signal from legacy transport senders)	ConEx (Not-ConEx) The transport is (or is not) using ConEx with this
		packet (the protocol MUST be arranged so that legacy transport
		senders implicitly send Not-ConEx)


	o Congestion indicated by packet losses	Re-Echo-Loss (Not-Re-Echo-Loss) The transport has (or has not)
		experienced a loss


	o ECN signaled congestion	Re-Echo-ECN (Not-Re-Echo-ECN) The transport has (or has not)
		experienced ECN signaled congestion


	o Pre-congestion credit (AKA green). See Section 4.2.1 devices	Credit (Not-Credit) The transport is (or is not) building up
	below.	congestion credit (see Section 4.3 on audit devices)
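
The four independent flags listed above could be modelled as a bit-field. This is a sketch under stated assumptions: the draft defines no bit positions or field layout, so the values below and the helper name are hypothetical.

```python
from enum import IntFlag


class ConExFlags(IntFlag):
    """Four independently settable flag bits (bit positions are assumed)."""

    CONEX = 0b0001         # transport is using ConEx with this packet
    RE_ECHO_LOSS = 0b0010  # transport has experienced a loss
    RE_ECHO_ECN = 0b0100   # transport has experienced ECN signaled congestion
    CREDIT = 0b1000        # transport is building up congestion credit


# All bits clear: what a legacy transport sender implicitly sends.
NOT_CONEX = ConExFlags(0)


def is_conex_marked(flags: ConExFlags) -> bool:
    """ConEx-Marked: any of Re-Echo-Loss, Re-Echo-ECN or Credit is set."""
    marks = ConExFlags.RE_ECHO_LOSS | ConExFlags.RE_ECHO_ECN | ConExFlags.CREDIT
    return bool(flags & marks)
```

Choosing all-zeros for Not-ConEx is one way to satisfy the requirement that legacy senders implicitly send Not-ConEx. Unlike an enumerated field, this layout can carry Credit together with Re-Echo-ECN in a single packet.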


	3.3.2. Enumerated Encoding	3.3.2. Codepoint Encoding


	For enumerated encoding some marks must be delayed such that each	This encoding involves a bit-field large enough to signal one of the
	packet only carries at most one mark.	following five codepoints:


	ENUM {Not_Supported, No_Mark, Black_ECN, Black_Loss, Green}	ENUM {Not-ConEx, ConEx, Re-Echo-Loss, Re-Echo-ECN, Credit}

		Each named codepoint has the same meaning as in the encoding using
		independent bits (Section 3.3.1). The use of any one codepoint
		implies the negative of all the others, except that each of the last
		three codepoints (Re-Echo-Loss, Re-Echo-ECN and Credit) also implies
		that ConEx is supported.

		Inherently, the semantics of most of the enumerated codepoints are
		mutually exclusive. 'Credit' is the only one that might need to be
		used in combination with either Re-Echo-Loss or Re-Echo-ECN, but even
		that requirement is questionable. It must not be forgotten that the
		enumerated encoding loses the flexibility to signal these two
		combinations, whereas the encoding with four independent bits is not
		so limited. Alternatively two extra codepoints could be assigned to
		these two combinations of semantics.

		{ToDo: Default behaviour for Currently Unused codepoints}

		{ToDo: Signal from Policer to Receiver to distinguish policy-induced
		drop from congestion-induced drop}

		Some might prefer to refer to the codepoints by the following
		colours, respectively. The same colours (with the omission of
		Purple) were used to describe the re-ECN codepoints:

		ENUM {White, Grey, Purple, Black, Green}.
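
The enumerated encoding can be sketched in the same spirit. The numeric values, the function name and the choice to raise an error on a combined mark are assumptions for illustration; the draft leaves them open, and the -00b text suggests a sender would instead delay one of the marks to a later packet.

```python
from enum import Enum


class ConExCodepoint(Enum):
    """Five mutually exclusive codepoints (numeric values are assumed)."""

    NOT_CONEX = 0     # White
    CONEX = 1         # Grey
    RE_ECHO_LOSS = 2  # Purple
    RE_ECHO_ECN = 3   # Black
    CREDIT = 4        # Green


def encode(loss: bool, ecn: bool, credit: bool,
           conex: bool = True) -> ConExCodepoint:
    """Pick the single codepoint for a packet; combinations cannot be expressed."""
    if not conex:
        return ConExCodepoint.NOT_CONEX
    if sum((loss, ecn, credit)) > 1:
        raise ValueError("enumerated encoding carries at most one mark per packet")
    if loss:
        return ConExCodepoint.RE_ECHO_LOSS
    if ecn:
        return ConExCodepoint.RE_ECHO_ECN
    if credit:
        return ConExCodepoint.CREDIT
    return ConExCodepoint.CONEX
```

Five codepoints fit in a three-bit field, one bit fewer than the four independent flags. The cost is the lost ability to signal Credit together with Re-Echo-Loss or Re-Echo-ECN, unless two further codepoints are assigned to those combinations.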

	4. Congestion Exposure Components	4. Congestion Exposure Components


		{ToDo: Picture of the components, similar to that in the last
		slideset about conex-concepts-uses?}

	4.1. Modified Senders	4.1. Modified Senders


	Send Congestion Exposure Signals per congestion signals.	The sending transport needs to be modified to send Congestion
		Exposure Signals in response to congestion feedback signals.


	4.2. Policy Devices	4.2. Receivers (Optionally Modified)


	4.2.1. Audit	The receiving transport may already feed back sufficiently useful
		signals to the sender so that it does not need to be altered.


	For loss: detect retransmissions by monitoring sequence numbers.	However, a TCP receiver feeds back ECN congestion signals no more
	Assure that #retransmissions<=#Black_Loss	than once within a round trip. The sender may require more precise
		feedback from the receiver; otherwise it will appear to be
		understating its Congestion Exposure Signals (see Section 3.2.1).


	(May need to include a fudge factor, because it would be more robust	Ideally, Congestion Exposure should be added to a transport like TCP
	to mark the packet after a retransmission. Otherwise network devices	without mandatory modifications to the receiver. But an optional
	that discard marked packets will cause connectivity failures, rather	modification to the receiver could be recommended for precision.
	than poor performance).	This was the approach taken when adding re-ECN to TCP
		[I-D.briscoe-tsvwg-re-ecn-tcp].


	For ECN: count Congestion Exposure Signals and ECN. Would normally	4.3. Audit
	need to delay ECN by one RTT to avoid false positives. Alternative:
	use Green (pre-credits) to assure that #ECN<=#Black_ECN+#GREEN, even
	though the #Black_ECN is delayed by one RTT.


	4.2.2. Policers and Shapers	To audit Congestion Exposure Signals against actual losses an auditor
		could use one of the following techniques:


	{ToDo: Beware these terms are defined differently than the	TCP-specific approach: The auditor could monitor TCP flows or
	conventional usage.}	aggregates of flows, only holding state on a flow if it first
		sends a Credit or a Re-Echo-Loss marking. The auditor could
		detect retransmissions by monitoring sequence numbers. It would
		assure that (volume of retransmitted data) <= (volume of data
		marked Re-Echo-Loss). Traffic would only be auditable in this way
		if it conformed to the standard TCP protocol and the IP payload
		was not encrypted (e.g. with IPsec).


	{ToDo: Abridge from existing doc?}	Predominant bottleneck approach: Unlike the above TCP-specific
		solution, this technique would work for IP packets carrying any
		transport layer protocol, and whether encrypted or not. But it
		only works well for networks designed so that losses predominantly
		occur under the management of one IP-aware node on the path. The
		auditor could then be located at this bottleneck. It could simply
		compare Congestion Exposure Signals with actual local losses.
		Most consumer access networks are designed to this model, e.g. the
		radio network controller (RNC) in a cellular network or the
		broadband remote access server (BRAS) in a digital subscriber line
		(DSL) network.

		The accuracy of an auditor at one predominant bottleneck might
		still be sufficient, even if losses occasionally occurred at other
		nodes in the network (e.g. border gateways). Although the auditor
		at the predominant bottleneck would not always be able to detect
		losses at other nodes, transports would not know where losses were
		occurring either. Therefore any transport would not know which
		losses it could cheat on without getting caught, and which ones it
		couldn't.
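
The TCP-specific check can be sketched as follows. This is a minimal illustration under stated assumptions, not a specified algorithm: the class TcpLossAuditor, its observe method and the flow_id key are hypothetical names, sequence-number wraparound is ignored, and any bytes below the highest sequence number already seen are counted as retransmitted, so reordered packets would be miscounted.

```python
class TcpLossAuditor:
    """Checks (retransmitted bytes) <= (bytes marked Re-Echo-Loss) per flow."""

    def __init__(self):
        self.flows = {}  # flow_id -> per-flow counters

    def observe(self, flow_id, seq, length, re_echo_loss=False, credit=False):
        """Record one data segment; return True while the flow looks compliant."""
        state = self.flows.get(flow_id)
        if state is None:
            # Hold state only once the flow sends Credit or Re-Echo-Loss.
            if not (re_echo_loss or credit):
                return True
            state = {"highest_seq": seq, "retransmitted": 0, "marked": 0}
            self.flows[flow_id] = state

        end = seq + length
        if re_echo_loss:
            state["marked"] += length

        # Bytes below the highest sequence number seen so far are treated
        # as retransmissions (an assumption that ignores reordering).
        overlap = max(0, min(end, state["highest_seq"]) - seq)
        state["retransmitted"] += overlap
        state["highest_seq"] = max(state["highest_seq"], end)

        return state["retransmitted"] <= state["marked"]
```

An auditor at a predominant bottleneck could keep the same two counters, but would increment the first one on each local drop instead of inferring losses from sequence numbers.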

		To audit Congestion Exposure Signals against actual ECN markings or
		losses, the auditor could work as follows: monitor flows or
		aggregates of flows, only holding state on a flow if it first sends a
		Credit or either Re-Echo marking. Count the number of bytes marked
		with Credit or Re-Echo-ECN. Separately count the number of bytes
		marked with ECN. Use Credits to assure that
		#ECN <= #Re-Echo-ECN + #Credit, even though the Re-Echo-ECN markings
		are delayed by at least one RTT.
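
The byte counting described above can be sketched as below. The class EcnAuditor, its method and parameter names are assumptions for illustration only. The sketch counts bytes per flow and tests whether the ECN-marked bytes are covered by the bytes marked Re-Echo-ECN or Credit.

```python
class EcnAuditor:
    """Checks #ECN <= #Re-Echo-ECN + #Credit, counted in bytes per flow."""

    def __init__(self):
        self.flows = {}  # flow_id -> byte counters

    def observe(self, flow_id, size, ecn_marked=False, re_echo_ecn=False,
                re_echo_loss=False, credit=False):
        """Record one packet; return True while the flow looks compliant."""
        state = self.flows.get(flow_id)
        if state is None:
            # Hold state only once the flow sends Credit or a Re-Echo marking.
            if not (credit or re_echo_ecn or re_echo_loss):
                return True
            state = {"ecn": 0, "re_echo_ecn": 0, "credit": 0}
            self.flows[flow_id] = state

        if ecn_marked:
            state["ecn"] += size
        if re_echo_ecn:
            state["re_echo_ecn"] += size
        if credit:
            state["credit"] += size

        # Credit sent in advance covers ECN marks whose Re-Echo-ECN has not
        # arrived yet because it is delayed by at least one RTT.
        return state["ecn"] <= state["re_echo_ecn"] + state["credit"]
```

In this sketch a flow that sends 1500 bytes of Credit can absorb one 1500-byte ECN-marked packet before its Re-Echo-ECN marking arrives. A second marked packet would put the flow out of compliance until more Re-Echo-ECN or Credit bytes are seen.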

		Note that an auditing device involves no policy configuration; it
		merely enforces protocol compliance, not policy.

		4.4. Policy Devices

		4.4.1. Congestion Policers

		Note that a congestion policer can be implemented in a very similar
		way to a bit-rate policer, but its effect is focused solely on
		traffic causing congestion downstream, not on all traffic just in
		case it causes congestion.

		It monitors all ConEx traffic entering a network, or some
		identifiable subset. Using Congestion Exposure signals, it measures
		the amount of congestion being caused by this traffic. If this
		exceeds a policy-configured 'congestion-bit-rate' the congestion
		policer will limit all the monitored ConEx traffic. A congestion
		policer can be implemented by a simple token bucket. But unlike a
		bit-rate policer, it only removes tokens when forwarding packets that
		are ConEx marked. See [CongPol] for details.
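
A token-bucket congestion policer of this kind might look like the following sketch. It is not the design from [CongPol]: the class name, the byte-based units, the explicit timestamp argument and the choice to drop traffic while the bucket is empty are all assumptions made to keep the example short.

```python
class CongestionPolicer:
    """Token bucket drained only by packets carrying ConEx congestion marks."""

    def __init__(self, congestion_rate, depth):
        self.rate = congestion_rate  # allowed marked bytes per second (assumed unit)
        self.depth = depth           # maximum burst of marked bytes
        self.tokens = depth
        self.last = 0.0              # time of the previous packet, in seconds

    def forward(self, now, size, conex_marked):
        """Return True to forward the packet, False to limit it."""
        # Refill at the policy-configured congestion rate, up to the depth.
        self.tokens = min(self.depth, self.tokens + (now - self.last) * self.rate)
        self.last = now

        if self.tokens <= 0:
            # Allowance exhausted: all monitored traffic is limited,
            # whether or not this particular packet is marked.
            return False

        if conex_marked:
            # Unlike a bit-rate policer, only marked packets consume tokens.
            self.tokens -= size
        return True
```

Unmarked traffic passes at any rate while tokens remain, so a user who causes no congestion downstream is never limited. That is the difference from a bit-rate policer, which would remove tokens for every forwarded packet.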

		4.4.2. Other Policy Devices

		Other policy devices that use Congestion Exposure signaling might
		monitor traffic based on Congestion Exposure Signals in much the same
		way as the monitoring element of a Congestion Policer. But the
		resulting action could be different. It might re-route traffic or
		downgrade the class of service.

		It might do nothing directly to the traffic, but instead report
		measurements of Congestion Exposure Signals to systems designed to
		control congestion indirectly. For instance the measurements might
		be used to trigger penalty clauses in contracts, to levy charges
		between networks based on congestion or simply to notify customers
		who cause excessive congestion.

	5. IANA Considerations	5. IANA Considerations

	This memo includes no request to IANA.	This memo includes no request to IANA.

	Note to RFC Editor: this section may be removed on publication as an	Note to RFC Editor: this section may be removed on publication as an
	RFC.	RFC.

	6. Security Considerations	6. Security Considerations


	{ToDo:}	Significant parts of this whole document are about the auditability
		of Congestion Exposure Signals, in particular Section 4.3.

	7. Conclusions	7. Conclusions

	{ToDo:}	{ToDo:}

	8. Acknowledgements	8. Acknowledgements

	This document was improved by review comments from Toby Moncaster.	This document was improved by review comments from Toby Moncaster.

	9. Comments Solicited	9. Comments Solicited

	skipping to change at page 11, line 7	skipping to change at page 13, line 42

	10.1. Normative References	10.1. Normative References

	[RFC2119] Bradner, S., "Key words for use in	[RFC2119] Bradner, S., "Key words for use in
	RFCs to Indicate Requirement	RFCs to Indicate Requirement
	Levels", BCP 14, RFC 2119,	Levels", BCP 14, RFC 2119,
	March 1997.	March 1997.

	10.2. Informative References	10.2. Informative References


		[CongPol] Jacquet, A., Briscoe, B., and T.
		Moncaster, "Policing Freedom to Use
		the Internet Resource Pool", Proc
		ACM Workshop on Re-Architecting the
		Internet (ReArch'08) ,
		December 2008, <http://
		www.bobbriscoe.net/
		pubs.html#polfree>.

	[I-D.briscoe-tsvwg-re-ecn-motiv] Briscoe, B., Jacquet, A.,	[I-D.briscoe-tsvwg-re-ecn-motiv] Briscoe, B., Jacquet, A.,
	Moncaster, T., and A. Smith, "Re-	Moncaster, T., and A. Smith, "Re-
	ECN: A Framework for adding	ECN: A Framework for adding
	Congestion Accountability to	Congestion Accountability to
	TCP/IP", draft-briscoe-tsvwg-re-	TCP/IP", draft-briscoe-tsvwg-re-
	ecn-tcp-motivation-01 (work in	ecn-tcp-motivation-01 (work in
	progress), September 2009.	progress), September 2009.

	[I-D.briscoe-tsvwg-re-ecn-tcp] Briscoe, B., Jacquet, A.,	[I-D.briscoe-tsvwg-re-ecn-tcp] Briscoe, B., Jacquet, A.,
	Moncaster, T., and A. Smith, "Re-	Moncaster, T., and A. Smith, "Re-
	ECN: Adding Accountability for	ECN: Adding Accountability for
	Causing Congestion to TCP/IP",	Causing Congestion to TCP/IP",
	draft-briscoe-tsvwg-re-ecn-tcp-08	draft-briscoe-tsvwg-re-ecn-tcp-08
	(work in progress), September 2009.	(work in progress), September 2009.


		[I-D.conex-concepts-uses] Briscoe, B., Woundy, R., Moncaster,
		T., and J. Leslie, "ConEx Concepts
		and Use Cases", draft-moncaster-
		conex-concepts-uses-01 (work in
		progress), July 2010.

	[I-D.ietf-ledbat-congestion] Shalunov, S. and G. Hazel, "Low	[I-D.ietf-ledbat-congestion] Shalunov, S. and G. Hazel, "Low
	Extra Delay Background Transport	Extra Delay Background Transport
	(LEDBAT)",	(LEDBAT)",
	draft-ietf-ledbat-congestion-02	draft-ietf-ledbat-congestion-02
	(work in progress), July 2010.	(work in progress), July 2010.

	[I-D.sridharan-tcpm-ctcp] Sridharan, M., Tan, K., Bansal, D.,	[I-D.sridharan-tcpm-ctcp] Sridharan, M., Tan, K., Bansal, D.,
	and D. Thaler, "Compound TCP: A New	and D. Thaler, "Compound TCP: A New
	TCP Congestion Control for High-	TCP Congestion Control for High-
	Speed and Long Distance Networks",	Speed and Long Distance Networks",
	draft-sridharan-tcpm-ctcp-02 (work	draft-sridharan-tcpm-ctcp-02 (work
	in progress), November 2008.	in progress), November 2008.


		[RFC0791] Postel, J., "Internet Protocol",
		STD 5, RFC 791, September 1981.

	[RFC2309] Braden, B., Clark, D., Crowcroft,	[RFC2309] Braden, B., Clark, D., Crowcroft,
	J., Davie, B., Deering, S., Estrin,	J., Davie, B., Deering, S., Estrin,
	D., Floyd, S., Jacobson, V.,	D., Floyd, S., Jacobson, V.,
	Minshall, G., Partridge, C.,	Minshall, G., Partridge, C.,
	Peterson, L., Ramakrishnan, K.,	Peterson, L., Ramakrishnan, K.,
	Shenker, S., Wroclawski, J., and L.	Shenker, S., Wroclawski, J., and L.
	Zhang, "Recommendations on Queue	Zhang, "Recommendations on Queue
	Management and Congestion Avoidance	Management and Congestion Avoidance
	in the Internet", RFC 2309,	in the Internet", RFC 2309,
	April 1998.	April 1998.

End of changes. 53 change blocks.
	136 lines changed or deleted	286 lines changed or added
This html diff was produced by rfcdiff 1.40. The latest version is available from http://tools.ietf.org/tools/rfcdiff/