Diff: draft-briscoe-tsvwg-re-ecn-tcp-03.txt - draft-briscoe-tsvwg-re-ecn-tcp-04.txt

	draft-briscoe-tsvwg-re-ecn-tcp-03.txt		draft-briscoe-tsvwg-re-ecn-tcp-04.txt

	Transport Area Working Group B. Briscoe		Transport Area Working Group B. Briscoe
	Internet-Draft BT & UCL		Internet-Draft BT & UCL

	Intended status: Informational A. Jacquet		Intended status: Standards Track A. Jacquet
	Expires: April 26, 2007 A. Salvatori		Expires: January 10, 2008 A. Salvatori
	M. Koyabe		M. Koyabe

			T. Moncaster
	BT		BT

	October 23, 2006		July 09, 2007

	Re-ECN: Adding Accountability for Causing Congestion to TCP/IP		Re-ECN: Adding Accountability for Causing Congestion to TCP/IP

	draft-briscoe-tsvwg-re-ecn-tcp-03		draft-briscoe-tsvwg-re-ecn-tcp-04

	Status of this Memo		Status of this Memo

	By submitting this Internet-Draft, each author represents that any		By submitting this Internet-Draft, each author represents that any
	applicable patent or other IPR claims of which he or she is aware		applicable patent or other IPR claims of which he or she is aware
	have been or will be disclosed, and any of which he or she becomes		have been or will be disclosed, and any of which he or she becomes
	aware will be disclosed, in accordance with Section 6 of BCP 79.		aware will be disclosed, in accordance with Section 6 of BCP 79.

	Internet-Drafts are working documents of the Internet Engineering		Internet-Drafts are working documents of the Internet Engineering
	Task Force (IETF), its areas, and its working groups. Note that		Task Force (IETF), its areas, and its working groups. Note that

	skipping to change at page 1, line 37		skipping to change at page 1, line 38
	and may be updated, replaced, or obsoleted by other documents at any		and may be updated, replaced, or obsoleted by other documents at any
	time. It is inappropriate to use Internet-Drafts as reference		time. It is inappropriate to use Internet-Drafts as reference
	material or to cite them other than as "work in progress."		material or to cite them other than as "work in progress."

	The list of current Internet-Drafts can be accessed at		The list of current Internet-Drafts can be accessed at
	http://www.ietf.org/ietf/1id-abstracts.txt.		http://www.ietf.org/ietf/1id-abstracts.txt.

	The list of Internet-Draft Shadow Directories can be accessed at		The list of Internet-Draft Shadow Directories can be accessed at
	http://www.ietf.org/shadow.html.		http://www.ietf.org/shadow.html.


	This Internet-Draft will expire on April 26, 2007.		This Internet-Draft will expire on January 10, 2008.

	Copyright Notice		Copyright Notice


	Copyright (C) The Internet Society (2006).		Copyright (C) The IETF Trust (2007).

	Abstract		Abstract

	This document introduces a new protocol for explicit congestion		This document introduces a new protocol for explicit congestion
	notification (ECN), termed re-ECN, which can be deployed		notification (ECN), termed re-ECN, which can be deployed
	incrementally around unmodified routers. The protocol arranges an		incrementally around unmodified routers. The protocol arranges an
	extended ECN field in each packet so that, as it crosses any		extended ECN field in each packet so that, as it crosses any
	interface in an internetwork, it will carry a truthful prediction of		interface in an internetwork, it will carry a truthful prediction of
	congestion on the remainder of its path. Then the upstream party at		congestion on the remainder of its path. Then the upstream party at
	any trust boundary in the internetwork can be held responsible for		any trust boundary in the internetwork can be held responsible for

	skipping to change at page 2, line 21		skipping to change at page 2, line 22
	changes required to transport protocols. It includes the changes		changes required to transport protocols. It includes the changes
	required to TCP both as an example and as a specification. It also		required to TCP both as an example and as a specification. It also
	gives examples of mechanisms that can use the protocol to ensure data		gives examples of mechanisms that can use the protocol to ensure data
	sources respond correctly to congestion. And it describes example		sources respond correctly to congestion. And it describes example
	mechanisms that ensure the dominant selfish strategy of both network		mechanisms that ensure the dominant selfish strategy of both network
	domains and end-points will be to set the extended ECN field		domains and end-points will be to set the extended ECN field
	honestly.		honestly.

	Authors' Statement: Status (to be removed by the RFC Editor)		Authors' Statement: Status (to be removed by the RFC Editor)


	This document is posted as an Internet-Draft with the intent (at
	least that of the authors) to eventually progress to standards track.

	Although the re-ECN protocol is intended to make a simple but far-		Although the re-ECN protocol is intended to make a simple but far-
	reaching change to the Internet architecture, the most immediate		reaching change to the Internet architecture, the most immediate
	priority for the authors is to delay any move of the ECN nonce to		priority for the authors is to delay any move of the ECN nonce to
	Proposed Standard status. The argument for this position is		Proposed Standard status. The argument for this position is
	developed in Appendix I.		developed in Appendix I.

	Changes from previous drafts (to be removed by the RFC Editor)		Changes from previous drafts (to be removed by the RFC Editor)


	From -00 to -01:		Full diffs created using the rfcdiff tool are available at
			<http://www.cs.ucl.ac.uk/staff/B.Briscoe/pubs.html#retcp>


	Encoding of re-ECN wire protocol changed for reasons given in		From -03 to -04 (current version):
	Appendix B and consequently draft substantially re-written.


	Substantial text added in sections on applications, incremental		Clarified reasons for holding back ECN nonce (Section 3.2 &
	deployment, architectural rationale and security considerations.		Appendix I).

			Clarified Figure 1.

			Added Section 4.1.1.1 on equivalence of drops and ECN marks.

			Improved precision of Section 5.6 on IP in IP tunnels.

			Explained that RTT fairness is possible to enforce, but unlikely to
			be required (Section 6.1.3 & Appendix F).

			Explained that bulk per-user policing should be adequate but per-
			flow policing is also possible if desired, though it is not likely
			to be necessary (Section 6.1.5 & Appendix G).

			Reinforced need for passive policing at inter-domain borders to
			enable all-optical networking (Section 6.1.6).

			Minor editorial changes throughout.

			From -02 to -03:

			Started guidelines for re-ECN support in DCCP and SCTP.

			Added annex on limitations of nonce mechanism.

			Minor editorial changes throughout.

	From -01 to -02:		From -01 to -02:

	Explanation on informal terminology in Section 3.4 clarified.		Explanation on informal terminology in Section 3.4 clarified.

	IPv6 wire protocol encoding added (Section 5.2).		IPv6 wire protocol encoding added (Section 5.2).

	Text on (non-)issues with tunnels, encryption and link layer		Text on (non-)issues with tunnels, encryption and link layer
	congestion notification added (Section 5.6 & Section 5.7).		congestion notification added (Section 5.6 & Section 5.7).

	Section added giving evolvability arguments against encouraging		Section added giving evolvability arguments against encouraging
	bottleneck policing (Section 6.1.2). And text on re-ECN's		bottleneck policing (Section 6.1.2). And text on re-ECN's
	evolvability by design added to Section 6.1.3		evolvability by design added to Section 6.1.3

	Text on inter-domain policing (Section 6.1.6) and inter-domain		Text on inter-domain policing (Section 6.1.6) and inter-domain
	fail-safes (Section 6.1.7) added.		fail-safes (Section 6.1.7) added.


	From -02 to -03:		From -00 to -01:

	Started guidelines for re-ECN support in DCCP and SCTP.


	Added annex on limitations of nonce mechanism.		Encoding of re-ECN wire protocol changed for reasons given in
			Appendix B and consequently draft substantially re-written.


	Minor editorial changes throughout.		Substantial text added in sections on applications, incremental
			deployment, architectural rationale and security considerations.

	Table of Contents		Table of Contents

	1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 6		1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 6
	2. Requirements notation . . . . . . . . . . . . . . . . . . . . 7		2. Requirements notation . . . . . . . . . . . . . . . . . . . . 7
	3. Protocol Overview . . . . . . . . . . . . . . . . . . . . . . 8		3. Protocol Overview . . . . . . . . . . . . . . . . . . . . . . 8
	3.1. Background and Applicability . . . . . . . . . . . . . . . 8		3.1. Background and Applicability . . . . . . . . . . . . . . . 8
	3.2. Re-ECN Abstracted Network Layer Wire Protocol (IPv4 or		3.2. Re-ECN Abstracted Network Layer Wire Protocol (IPv4 or
	v6) . . . . . . . . . . . . . . . . . . . . . . . . . . . 9		v6) . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

	3.3. Re-ECN Protocol Operation . . . . . . . . . . . . . . . . 10		3.3. Re-ECN Protocol Operation . . . . . . . . . . . . . . . . 11
	3.4. Informal Terminology . . . . . . . . . . . . . . . . . . . 12		3.4. Informal Terminology . . . . . . . . . . . . . . . . . . . 13
	4. Transport Layers . . . . . . . . . . . . . . . . . . . . . . . 15		4. Transport Layers . . . . . . . . . . . . . . . . . . . . . . . 15
	4.1. TCP . . . . . . . . . . . . . . . . . . . . . . . . . . . 15		4.1. TCP . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
	4.1.1. RECN mode: Full re-ECN capable transport . . . . . . . 16		4.1.1. RECN mode: Full re-ECN capable transport . . . . . . . 16
	4.1.2. RECN-Co mode: Re-ECT Sender with a Vanilla or		4.1.2. RECN-Co mode: Re-ECT Sender with a Vanilla or

	Nonce ECT Receiver . . . . . . . . . . . . . . . . . . 18		Nonce ECT Receiver . . . . . . . . . . . . . . . . . . 20
	4.1.3. Capability Negotiation . . . . . . . . . . . . . . . . 20		4.1.3. Capability Negotiation . . . . . . . . . . . . . . . . 21
	4.1.4. Extended ECN (EECN) Field Settings during Flow		4.1.4. Extended ECN (EECN) Field Settings during Flow

	Start or after Idle Periods . . . . . . . . . . . . . 21		Start or after Idle Periods . . . . . . . . . . . . . 23
	4.1.5. Pure ACKS, Retransmissions, Window Probes and		4.1.5. Pure ACKS, Retransmissions, Window Probes and

	Partial ACKs . . . . . . . . . . . . . . . . . . . . . 25		Partial ACKs . . . . . . . . . . . . . . . . . . . . . 26
	4.2. Other Transports . . . . . . . . . . . . . . . . . . . . . 26		4.2. Other Transports . . . . . . . . . . . . . . . . . . . . . 27
	4.2.1. General Guidelines for Adding Re-ECN to Other		4.2.1. General Guidelines for Adding Re-ECN to Other

	Transports . . . . . . . . . . . . . . . . . . . . . . 26		Transports . . . . . . . . . . . . . . . . . . . . . . 27
	4.2.2. Guidelines for adding Re-ECN to RSVP or NSIS . . . . . 26		4.2.2. Guidelines for adding Re-ECN to RSVP or NSIS . . . . . 28
	4.2.3. Guidelines for adding Re-ECN to DCCP . . . . . . . . . 27		4.2.3. Guidelines for adding Re-ECN to DCCP . . . . . . . . . 28
	4.2.4. Guidelines for adding Re-ECN to SCTP . . . . . . . . . 27		4.2.4. Guidelines for adding Re-ECN to SCTP . . . . . . . . . 28
	5. Network Layer . . . . . . . . . . . . . . . . . . . . . . . . 27		5. Network Layer . . . . . . . . . . . . . . . . . . . . . . . . 28
	5.1. Re-ECN IPv4 Wire Protocol . . . . . . . . . . . . . . . . 27		5.1. Re-ECN IPv4 Wire Protocol . . . . . . . . . . . . . . . . 28
	5.2. Re-ECN IPv6 Wire Protocol . . . . . . . . . . . . . . . . 28		5.2. Re-ECN IPv6 Wire Protocol . . . . . . . . . . . . . . . . 30
	5.3. Router Forwarding Behaviour . . . . . . . . . . . . . . . 30		5.3. Router Forwarding Behaviour . . . . . . . . . . . . . . . 31
	5.4. Justification for Setting the First SYN to FNE . . . . . . 31		5.4. Justification for Setting the First SYN to FNE . . . . . . 32
	5.5. Control and Management . . . . . . . . . . . . . . . . . . 32		5.5. Control and Management . . . . . . . . . . . . . . . . . . 33
	5.5.1. Negative Balance Warning . . . . . . . . . . . . . . . 32		5.5.1. Negative Balance Warning . . . . . . . . . . . . . . . 33
	5.5.2. Rate Response Control . . . . . . . . . . . . . . . . 33		5.5.2. Rate Response Control . . . . . . . . . . . . . . . . 34
	5.6. IP in IP Tunnels . . . . . . . . . . . . . . . . . . . . . 33		5.6. IP in IP Tunnels . . . . . . . . . . . . . . . . . . . . . 34
	5.7. Non-Issues . . . . . . . . . . . . . . . . . . . . . . . . 34		5.7. Non-Issues . . . . . . . . . . . . . . . . . . . . . . . . 35
	6. Applications . . . . . . . . . . . . . . . . . . . . . . . . . 35		6. Applications . . . . . . . . . . . . . . . . . . . . . . . . . 36
	6.1. Policing Congestion Response . . . . . . . . . . . . . . . 35		6.1. Policing Congestion Response . . . . . . . . . . . . . . . 36
	6.1.1. The Policing Problem . . . . . . . . . . . . . . . . . 35		6.1.1. The Policing Problem . . . . . . . . . . . . . . . . . 36
	6.1.2. The Case Against Bottleneck Policing . . . . . . . . . 36		6.1.2. The Case Against Bottleneck Policing . . . . . . . . . 37
	6.1.3. Re-ECN Incentive Framework . . . . . . . . . . . . . . 37		6.1.3. Re-ECN Incentive Framework . . . . . . . . . . . . . . 38
	6.1.4. Egress Dropper . . . . . . . . . . . . . . . . . . . . 44		6.1.4. Egress Dropper . . . . . . . . . . . . . . . . . . . . 45
	6.1.5. Rate Policing . . . . . . . . . . . . . . . . . . . . 45		6.1.5. Policing . . . . . . . . . . . . . . . . . . . . . . . 47
	6.1.6. Inter-domain Policing . . . . . . . . . . . . . . . . 47		6.1.6. Inter-domain Policing . . . . . . . . . . . . . . . . 48
	6.1.7. Inter-domain Fail-safes . . . . . . . . . . . . . . . 51		6.1.7. Inter-domain Fail-safes . . . . . . . . . . . . . . . 52
	6.1.8. Simulations . . . . . . . . . . . . . . . . . . . . . 51		6.1.8. Simulations . . . . . . . . . . . . . . . . . . . . . 53
	6.2. Other Applications . . . . . . . . . . . . . . . . . . . . 51		6.2. Other Applications . . . . . . . . . . . . . . . . . . . . 53
	6.2.1. DDoS Mitigation . . . . . . . . . . . . . . . . . . . 52		6.2.1. DDoS Mitigation . . . . . . . . . . . . . . . . . . . 53
	6.2.2. End-to-end QoS . . . . . . . . . . . . . . . . . . . . 53		6.2.2. End-to-end QoS . . . . . . . . . . . . . . . . . . . . 54
	6.2.3. Traffic Engineering . . . . . . . . . . . . . . . . . 53		6.2.3. Traffic Engineering . . . . . . . . . . . . . . . . . 54
	6.2.4. Inter-Provider Service Monitoring . . . . . . . . . . 53		6.2.4. Inter-Provider Service Monitoring . . . . . . . . . . 54
	6.3. Limitations . . . . . . . . . . . . . . . . . . . . . . . 53		6.3. Limitations . . . . . . . . . . . . . . . . . . . . . . . 54
	7. Incremental Deployment . . . . . . . . . . . . . . . . . . . . 54		7. Incremental Deployment . . . . . . . . . . . . . . . . . . . . 55
	7.1. Incremental Deployment Features . . . . . . . . . . . . . 54		7.1. Incremental Deployment Features . . . . . . . . . . . . . 55
	7.2. Incremental Deployment Incentives . . . . . . . . . . . . 55		7.2. Incremental Deployment Incentives . . . . . . . . . . . . 57
	8. Architectural Rationale . . . . . . . . . . . . . . . . . . . 60		8. Architectural Rationale . . . . . . . . . . . . . . . . . . . 61
	9. Related Work . . . . . . . . . . . . . . . . . . . . . . . . . 63		9. Related Work . . . . . . . . . . . . . . . . . . . . . . . . . 64
	9.1. Policing Rate Response to Congestion . . . . . . . . . . . 63		9.1. Policing Rate Response to Congestion . . . . . . . . . . . 64
	9.2. Congestion Notification Integrity . . . . . . . . . . . . 63		9.2. Congestion Notification Integrity . . . . . . . . . . . . 65
	9.3. Identifying Upstream and Downstream Congestion . . . . . . 64		9.3. Identifying Upstream and Downstream Congestion . . . . . . 66
	10. Security Considerations . . . . . . . . . . . . . . . . . . . 65		10. Security Considerations . . . . . . . . . . . . . . . . . . . 66
	11. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 66		11. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 68
	12. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 67		12. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 68
	13. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 67		13. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 68
	14. Comments Solicited . . . . . . . . . . . . . . . . . . . . . . 67		14. Comments Solicited . . . . . . . . . . . . . . . . . . . . . . 69
	15. References . . . . . . . . . . . . . . . . . . . . . . . . . . 67		15. References . . . . . . . . . . . . . . . . . . . . . . . . . . 69
	15.1. Normative References . . . . . . . . . . . . . . . . . . . 67		15.1. Normative References . . . . . . . . . . . . . . . . . . . 69
	15.2. Informative References . . . . . . . . . . . . . . . . . . 68		15.2. Informative References . . . . . . . . . . . . . . . . . . 70
	Appendix A. Precise Re-ECN Protocol Operation . . . . . . . . . . 71		Appendix A. Precise Re-ECN Protocol Operation . . . . . . . . . . 73
	Appendix B. Justification for Two Codepoints Signifying Zero		Appendix B. Justification for Two Codepoints Signifying Zero

	Worth Packets . . . . . . . . . . . . . . . . . . . . 72		Worth Packets . . . . . . . . . . . . . . . . . . . . 74
	Appendix C. ECN Compatibility . . . . . . . . . . . . . . . . . . 74		Appendix C. ECN Compatibility . . . . . . . . . . . . . . . . . . 76
	Appendix D. Packet Marking During Flow Start . . . . . . . . . . 75		Appendix D. Packet Marking During Flow Start . . . . . . . . . . 77
	Appendix E. Example Egress Dropper Algorithm . . . . . . . . . . 75		Appendix E. Example Egress Dropper Algorithm . . . . . . . . . . 77
	Appendix F. Re-TTL . . . . . . . . . . . . . . . . . . . . . . . 75		Appendix F. Re-TTL . . . . . . . . . . . . . . . . . . . . . . . 77
	Appendix G. Policer Designs to ensure Congestion		Appendix G. Policer Designs to ensure Congestion

	Responsiveness . . . . . . . . . . . . . . . . . . . 76		Responsiveness . . . . . . . . . . . . . . . . . . . 78
	G.1. Per-user Policing . . . . . . . . . . . . . . . . . . . . 76		G.1. Per-user Policing . . . . . . . . . . . . . . . . . . . . 78
	G.2. Per-flow Rate Policing . . . . . . . . . . . . . . . . . . 77		G.2. Per-flow Rate Policing . . . . . . . . . . . . . . . . . . 79
	Appendix H. Downstream Congestion Metering Algorithms . . . . . . 80		Appendix H. Downstream Congestion Metering Algorithms . . . . . . 82
	H.1. Bulk Downstream Congestion Metering Algorithm . . . . . . 80		H.1. Bulk Downstream Congestion Metering Algorithm . . . . . . 82
	H.2. Inflation Factor for Persistently Negative Flows . . . . . 80		H.2. Inflation Factor for Persistently Negative Flows . . . . . 83
	Appendix I. Argument for holding back the ECN nonce . . . . . . . 81		Appendix I. Argument for holding back the ECN nonce . . . . . . . 84
	Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 83		Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 85
	Intellectual Property and Copyright Statements . . . . . . . . . . 85		Intellectual Property and Copyright Statements . . . . . . . . . . 88

	1. Introduction		1. Introduction

	This document aims:		This document aims:

	o To provide a complete specification of the addition of the re-ECN		o To provide a complete specification of the addition of the re-ECN
	protocol to IP and guidelines on how to add it to transport layer		protocol to IP and guidelines on how to add it to transport layer
	protocols, including a complete specification of re-ECN in TCP as		protocols, including a complete specification of re-ECN in TCP as
	an example;		an example;


	skipping to change at page 7, line 32		skipping to change at page 7, line 32

	This document is structured as follows. First an overview of the re-		This document is structured as follows. First an overview of the re-
	ECN protocol is given (Section 3), outlining its attributes and		ECN protocol is given (Section 3), outlining its attributes and
	explaining conceptually how it works as a whole. The two main parts		explaining conceptually how it works as a whole. The two main parts
	of the document follow, as described above. That is, the protocol		of the document follow, as described above. That is, the protocol
	specification divided into transport (Section 4) and network		specification divided into transport (Section 4) and network
	(Section 5) layers, then the applications it can be put to, such as		(Section 5) layers, then the applications it can be put to, such as
	policing DDoS, QoS and congestion control (Section 6). Although		policing DDoS, QoS and congestion control (Section 6). Although
	these applications do not require standardisation themselves, they		these applications do not require standardisation themselves, they
	are described in a fair degree of detail in order to explain how re-		are described in a fair degree of detail in order to explain how re-

	ECN can be used. Given, re-ECN proposes to use the last undefined		ECN can be used. Given re-ECN proposes to use the last undefined bit
	bit in the IPv4 header, we felt it necessary to outline the potential		in the IPv4 header, we felt it necessary to outline the potential
	that re-ECN could release in return for being given that bit.		that re-ECN could release in return for being given that bit.

	Deployment issues discussed throughout the document are brought		Deployment issues discussed throughout the document are brought
	together in Section 7, which is followed by a brief section		together in Section 7, which is followed by a brief section
	explaining the somewhat subtle rationale for the design from an		explaining the somewhat subtle rationale for the design from an
	architectural perspective (Section 8). We end by describing related		architectural perspective (Section 8). We end by describing related
	work (Section 9), listing security considerations (Section 10) and		work (Section 9), listing security considerations (Section 10) and
	finally drawing conclusions (Section 12).		finally drawing conclusions (Section 12).

	2. Requirements notation		2. Requirements notation

	skipping to change at page 8, line 49		skipping to change at page 8, line 49
	congestion feedback. But Section 9.2 explains that it still gives no		congestion feedback. But Section 9.2 explains that it still gives no
	control over how fast the sender transmits as a result of the		control over how fast the sender transmits as a result of the
	feedback. On the other hand, re-ECN is designed both to ensure that		feedback. On the other hand, re-ECN is designed both to ensure that
	congestion is declared honestly and that the sender's rate responds		congestion is declared honestly and that the sender's rate responds
	appropriately.		appropriately.

	Re-ECN is based on a feedback arrangement called `re-		Re-ECN is based on a feedback arrangement called `re-
	feedback' [Re-fb]. The word is short for either receiver-aligned,		feedback' [Re-fb]. The word is short for either receiver-aligned,
	re-inserted or re-echoed feedback. But it actually works even when		re-inserted or re-echoed feedback. But it actually works even when
	no feedback is available. In fact it has been carefully designed to		no feedback is available. In fact it has been carefully designed to

	work for single datagram flows. Indeed, it even encourages		work for single datagram flows. It also encourages aggregation of
	aggregation of single packet flows by congestion control proxies.		single packet flows by congestion control proxies. Then, even if the
			traffic mix of the Internet were to become dominated by short
	Then, even if the traffic mix of the Internet were to become		messages, it would still be possible to control congestion
	dominated by short messages, it would still be possible to control		effectively and efficiently.
	congestion effectively and efficiently.

	Changing the Internet's feedback architecture seems to imply		Changing the Internet's feedback architecture seems to imply
	considerable upheaval. But re-ECN can be deployed incrementally at		considerable upheaval. But re-ECN can be deployed incrementally at
	the transport layer around unmodified routers using existing fields		the transport layer around unmodified routers using existing fields
	in IP (v4 or v6). However it does also require the last undefined		in IP (v4 or v6). However it does also require the last undefined
	bit in the IPv4 header, which it uses in combination with the 2-bit		bit in the IPv4 header, which it uses in combination with the 2-bit
	ECN field to create four new codepoints. Nonetheless, changes to IP		ECN field to create four new codepoints. Nonetheless, changes to IP
	routers are RECOMMENDED in order to improve resilience against DoS		routers are RECOMMENDED in order to improve resilience against DoS
	attacks. Similarly, re-ECN works best if both the sender and		attacks. Similarly, re-ECN works best if both the sender and
	receiver transports are re-ECN-capable, but it can work with just		receiver transports are re-ECN-capable, but it can work with just

	skipping to change at page 10, line 13		skipping to change at page 10, line 13
	be defined in another specification (e.g. [Re-PCN]).		be defined in another specification (e.g. [Re-PCN]).

	Although the RE flag is a separate, single bit field, it can be read		Although the RE flag is a separate, single bit field, it can be read
	as an extension to the two-bit ECN field; the three concatenated bits		as an extension to the two-bit ECN field; the three concatenated bits
	in what we will call the extended ECN field (EECN) making eight		in what we will call the extended ECN field (EECN) making eight
	codepoints. We will use the RFC3168 names of the ECN codepoints to		codepoints. We will use the RFC3168 names of the ECN codepoints to
	describe settings of the ECN field when the RE flag setting is "don't		describe settings of the ECN field when the RE flag setting is "don't
	care", but we also define the following six extended ECN codepoint		care", but we also define the following six extended ECN codepoint
	names for when we need to be more specific.		names for when we need to be more specific.


			RFC3168 ECN defines uses for all four codepoints of the two-bit ECN
			field. This memo widens the codepoint space to eight, and uses six
			codepoints. One of re-ECN's codepoints is an alternative use of the
			codepoint set aside in RFC3168 for the ECN nonce (ECT(1)).
			Transports not using re-ECN can still use the ECN nonce, while those
			using re-ECN do not need to as long as the sender is also checking
			for transport protocol compliance [I-D.moncaster-tcpm-rcv-cheat].
			The case for doing this is given in Appendix I. Two re-ECN
			codepoints are given compatible uses to those defined in RFC3168
			(Not-ECT and CE). The other codepoint used by RFC3168 (ECT(0)) isn't
			used for re-ECN. Altogether this leaves one codepoint of the eight
			unused and available for future use.
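As an illustration only, the mapping from the two-bit ECN field plus the RE flag to an extended ECN codepoint name can be sketched as a lookup. This minimal sketch covers only the three rows of the codepoint table visible in this excerpt; the remaining rows are elided by the diff and are deliberately not filled in. The names EECN_NAMES and eecn_name are hypothetical and do not come from the draft.

```python
# Hypothetical lookup: (ECN field bits, RE flag) -> extended ECN codepoint name.
# Only the rows visible in this excerpt are listed; the other codepoints are
# elided by the diff and are intentionally left out rather than guessed.
EECN_NAMES = {
    ("00", 0): "Not-RECT",  # Not re-ECN-capable transport
    ("00", 1): "FNE",       # Feedback not established
    ("01", 0): "Re-Echo",   # Re-echoed congestion and RECT
}

VALID_ECN_BITS = {"00", "01", "10", "11"}


def eecn_name(ecn_bits, re_flag):
    """Return the extended ECN codepoint name, or None if not listed here."""
    if ecn_bits not in VALID_ECN_BITS:
        raise ValueError("ECN field must be two bits, e.g. '01'")
    if re_flag not in (0, 1):
        raise ValueError("RE flag must be 0 or 1")
    return EECN_NAMES.get((ecn_bits, re_flag))
```

Keying the lookup on the pair (ECN bits, RE flag) avoids assuming any particular bit order for the concatenated three-bit field, which this excerpt does not state.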

	+-------+------------+------+--------------+------------------------+		+-------+------------+------+--------------+------------------------+
	| ECN | RFC3168 | RE | Extended ECN | Re-ECN meaning |		| ECN | RFC3168 | RE | Extended ECN | Re-ECN meaning |
	| field | codepoint | flag | codepoint | |		| field | codepoint | flag | codepoint | |
	+-------+------------+------+--------------+------------------------+		+-------+------------+------+--------------+------------------------+
	| 00 | Not-ECT | 0 | Not-RECT | Not re-ECN-capable |		| 00 | Not-ECT | 0 | Not-RECT | Not re-ECN-capable |
	| | | | | transport |		| | | | | transport |
	| 00 | Not-ECT | 1 | FNE | Feedback not |		| 00 | Not-ECT | 1 | FNE | Feedback not |
	| | | | | established |		| | | | | established |
	| 01 | ECT(1) | 0 | Re-Echo | Re-echoed congestion |		| 01 | ECT(1) | 0 | Re-Echo | Re-echoed congestion |
	| | | | | and RECT |		| | | | | and RECT |

	skipping to change at page 11, line 19		skipping to change at page 12, line 5
	re-ECN sender will clear the RE flag to "0" in the next packet it		re-ECN sender will clear the RE flag to "0" in the next packet it
	sends.		sends.

	We chose to set and clear the RE flag this way round to ease		We chose to set and clear the RE flag this way round to ease
	incremental deployment (see Section 7.1). To avoid confusion we will		incremental deployment (see Section 7.1). To avoid confusion we will
	use the term `blanking' (rather than marking) when the RE flag is		use the term `blanking' (rather than marking) when the RE flag is
	cleared to "0". So, over a stream of packets, we will talk of the		cleared to "0". So, over a stream of packets, we will talk of the
	`RE blanking fraction' as the fraction of octets in packets with the		`RE blanking fraction' as the fraction of octets in packets with the
	RE flag cleared to "0".		RE flag cleared to "0".


	[Figure 1 plot: against resource index (congested resources 0 and i, with marking fractions 1.00% and 2.00%), the RE blanking fraction stays at 3% while the CE marking fraction steps from 0% to 1% after resource 0 and to 3% after resource i; observation points 0, 1 and 2 are marked along the axis.]		[Figure 1 plot: same plot as in -03, with a path diagram S -- 0 - - - - i -- D added above it and the observation points 0, 1 and 2 aligned beneath the resources.]

	Figure 1: A 2-Router Example (Imprecise)		Figure 1: A 2-Router Example (Imprecise)


	Figure 1 uses the two router example introduced earlier to illustrate		Figure 1 uses a simple network to illustrate how re-ECN allows
	why re-ECN allows routers to measure downstream congestion. The		routers to measure downstream congestion. The horizontal axis
	horizontal axis represents the index of each congestible resource		represents the index of each congestible resource (typically queues)
	(typically queues) along a path through the Internet. There may be		along a path through the Internet. There may be many routers on the
	many routers on the path, but we assume only two are currently		path, but we assume only two are currently congested (those with
	congested (those with resource index 0 and i). The two superimposed		resource index 0 and i). The two superimposed plots show the
	plots show the fraction of each extended ECN codepoint in a flow		fraction of each extended ECN codepoint in a flow observed along this
	observed along this path. Given about 3% of packets reaching the		path. Given about 3% of packets reaching the destination are marked
	destination are marked CE, in response to feedback the sender will		CE, in response to feedback the sender will blank the RE flag in
	blank the RE flag in about 3% of packets it sends. Then approximate		about 3% of packets it sends. Then approximate downstream congestion
	downstream congestion can be measured at the observation points shown		can be measured at the observation points shown along the path by
	along the path by subtracting the CE marking fraction from the RE		subtracting the CE marking fraction from the RE blanking fraction, as
	blanking fraction, as shown in the table below (Appendix A derives		shown in the table below (Appendix A derives these approximations
	these approximations from a precise analysis).		from a precise analysis).

	+-------------------+------------------------------+		+-------------------+------------------------------+
	| Observation point | Approx downstream congestion |		| Observation point | Approx downstream congestion |
	+-------------------+------------------------------+		+-------------------+------------------------------+
	| 0 | 3% - 0% = 3% |		| 0 | 3% - 0% = 3% |
	| 1 | 3% - 1% = 2% |		| 1 | 3% - 1% = 2% |
	| 2 | 3% - 3% = 0% |		| 2 | 3% - 3% = 0% |
	+-------------------+------------------------------+		+-------------------+------------------------------+

	Table 2: Downstream Congestion Measured at Example Observation Points		Table 2: Downstream Congestion Measured at Example Observation Points
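
The subtraction behind Table 2 can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not part of the protocol: the function name and the use of whole percentages are our assumptions, and the CE fractions are simply the example values from the table.

```python
def downstream_congestion(re_blank_percent, ce_percent):
    """Approximate downstream congestion seen at one observation point.

    Both arguments are percentages of packets in the flow: the share sent
    with RE blanked and the share already marked CE at this point.
    """
    return re_blank_percent - ce_percent


RE_BLANK_PERCENT = 3                 # sender blanks RE in about 3% of packets
ce_at_point = {0: 0, 1: 1, 2: 3}     # CE marking fraction at each observation point

table = {point: downstream_congestion(RE_BLANK_PERCENT, ce)
         for point, ce in ce_at_point.items()}
```

The RE blanking fraction is the same at every point because no router changes the RE flag, so only the CE fraction varies along the path.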

	skipping to change at page 16, line 12		skipping to change at page 17, line 6
	be in RECN mode, at least not until it has confirmed that the other		be in RECN mode, at least not until it has confirmed that the other
	host is Re-ECT.		host is Re-ECT.

	4.1.1. RECN mode: Full re-ECN capable transport		4.1.1. RECN mode: Full re-ECN capable transport

	In full RECN mode, for each half connection, both the sender and the		In full RECN mode, for each half connection, both the sender and the
	receiver each maintain an unsigned integer counter we will call ECC		receiver each maintain an unsigned integer counter we will call ECC
	(echo congestion counter). The receiver maintains a count, modulo 8,		(echo congestion counter). The receiver maintains a count, modulo 8,
	of how many times a CE marked packet has arrived during the half-		of how many times a CE marked packet has arrived during the half-
	connection. Once a RECN connection is established, the three TCP		connection. Once a RECN connection is established, the three TCP

	option flags (ECE, CWR & NS) used for ECN-related functions in		option flags (ECE, CWR & NS) used for ECN-related functions in other
	previous versions of ECN are used as a 3-bit field for the receiver		versions of ECN are used as a 3-bit field for the receiver to
	to repeatedly tell the sender the current value of ECC whenever it		repeatedly tell the sender the current value of ECC whenever it sends
	sends a TCP ACK. We will call this the echo congestion increment		a TCP ACK. We will call this the echo congestion increment (ECI)
	(ECI) field. This overloaded use of these 3 option flags as one		field. This overloaded use of these 3 option flags as one 3-bit ECI
	3-bit ECI field is shown in Figure 4. The actual definition of the		field is shown in Figure 4. The actual definition of the TCP header,
	TCP header, including the addition of support for the ECN nonce, is		including the addition of support for the ECN nonce, is shown for
	shown for comparison in Figure 3. This specification does not		comparison in Figure 3. This specification does not redefine the
	redefine the names of these three TCP option flags, it merely		names of these three TCP option flags, it merely overloads them with
	overloads them with another definition once a flow is established.		another definition once a flow is established.

	  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15		  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
	+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+		+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
	|               |           | N | C | E | U | A | P | R | S | F |		|               |           | N | C | E | U | A | P | R | S | F |
	| Header Length | Reserved  | S | W | C | R | C | S | S | Y | I |		| Header Length | Reserved  | S | W | C | R | C | S | S | Y | I |
	|               |           |   | R | E | G | K | H | T | N | N |		|               |           |   | R | E | G | K | H | T | N | N |
	+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+		+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

	Figure 3: The (post-ECN Nonce) definition of bytes 13 and 14 of the		Figure 3: The (post-ECN Nonce) definition of bytes 13 and 14 of the
	TCP Header		TCP Header

	skipping to change at page 17, line 20		skipping to change at page 18, line 16
	delayed-ACK, which would be necessary if ACK-withholding were		delayed-ACK, which would be necessary if ACK-withholding were
	implemented.		implemented.

	Sender Action in RECN Mode		Sender Action in RECN Mode

	On the arrival of every ACK, the sender compares the ECI field		On the arrival of every ACK, the sender compares the ECI field
	with its own ECC value, then replaces its local value with that		with its own ECC value, then replaces its local value with that
	from the ACK. The difference D is assumed to be the number of CE		from the ACK. The difference D is assumed to be the number of CE
	marked packets that arrived at the receiver since it sent the		marked packets that arrived at the receiver since it sent the
	previously received ACK (but see below for the sender's safety		previously received ACK (but see below for the sender's safety

	strategy). Whenever the ECI field increments by D (or D drops are		strategy). Whenever the ECI field increments by D (and/or d drops
	detected), the sender MUST clear the RE flag to "0" in the IP		are detected), the sender MUST clear the RE flag to "0" in the IP
	header of the next D data packets it sends, effectively re-echoing		header of the next D' data packets it sends (where D' = D + d),
	each single increment of ECI. Otherwise the data sender MUST send		effectively re-echoing each single increment of ECI. Otherwise
	all data packets with RE set to "1".		the data sender MUST send all data packets with RE set to "1".
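
A minimal Python sketch of the sender action above may help. It assumes the difference D is taken modulo 8 (ECI is a 3-bit field) and it does not model the sender's safety strategy for long ACK loss sequences. The class and method names are hypothetical, introduced only for illustration.

```python
ECI_MODULUS = 8  # ECI is a 3-bit field, so the counter is echoed modulo 8


class ReEcnSender:
    """Tracks how many data packets must still be sent with RE cleared."""

    def __init__(self):
        self.ecc = 0            # sender's local echo congestion counter
        self.blank_pending = 0  # packets still to be sent with RE = 0

    def on_ack(self, eci, drops_detected=0):
        """Process the ECI field of an arriving ACK and return D."""
        d_marks = (eci - self.ecc) % ECI_MODULUS  # assumed modulo-8 difference
        self.ecc = eci                            # replace local value with the ACK's
        self.blank_pending += d_marks + drops_detected  # D' = D + d
        return d_marks

    def next_re_flag(self):
        """Return the RE flag value for the next data packet."""
        if self.blank_pending > 0:
            self.blank_pending -= 1
            return 0  # re-echo one increment of ECI (or one drop)
        return 1      # otherwise RE is set


sender = ReEcnSender()
sender.ecc = 6                                   # pretend six marks were echoed earlier
d = sender.on_ack(eci=1, drops_detected=1)       # ECI wrapped from 6 to 1
flags = [sender.next_re_flag() for _ in range(5)]
```

Here the ECI field wraps from 6 to 1, so D is 3; with one detected drop, D' is 4 and the next four data packets carry RE cleared.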

	As a general rule, once a flow is established, as well as setting		As a general rule, once a flow is established, as well as setting
	or clearing the RE flag as above, a data sender in RECN mode MUST		or clearing the RE flag as above, a data sender in RECN mode MUST
	always set the ECN field to ECT(1). However, the settings of the		always set the ECN field to ECT(1). However, the settings of the
	extended ECN field during flow start are defined in Section 4.1.4.		extended ECN field during flow start are defined in Section 4.1.4.

	As we have already emphasised, the re-ECN protocol makes no		As we have already emphasised, the re-ECN protocol makes no
	changes and has no effect on the TCP congestion control algorithm.		changes and has no effect on the TCP congestion control algorithm.
	So, each increment of ECI (or detection of a drop) also triggers		So, each increment of ECI (or detection of a drop) also triggers
	the standard TCP congestion response, but with no more than one		the standard TCP congestion response, but with no more than one

	skipping to change at page 18, line 5		skipping to change at page 18, line 43
	A TCP sender also acts as the receiver for the other half-		A TCP sender also acts as the receiver for the other half-
	connection. The host will maintain two ECC values S.ECC and R.ECC		connection. The host will maintain two ECC values S.ECC and R.ECC
	as sender and receiver respectively. Every TCP header sent by a		as sender and receiver respectively. Every TCP header sent by a
	host in RECN mode will also repeat the prevailing value of R.ECC		host in RECN mode will also repeat the prevailing value of R.ECC
	in its ECI field. If a sender in RECN mode has to retransmit a		in its ECI field. If a sender in RECN mode has to retransmit a
	packet due to a suspected loss, the re-transmitted packet MUST		packet due to a suspected loss, the re-transmitted packet MUST
	carry the latest prevailing value of R.ECC when it is re-		carry the latest prevailing value of R.ECC when it is re-
	transmitted, which will not necessarily be the one it carried		transmitted, which will not necessarily be the one it carried
	originally.		originally.


	4.1.1.1. Safety against Long Pure ACK Loss Sequences		4.1.1.1. Drops and Marks

			Re-ECN is based on the ECN protocol [RFC3168] which in turn is
			typically based on the RED algorithm [RFC2309]. This algorithm marks
			packets as CE with a probability that increases as the size of the
			router queue increases. However, if the queue becomes too full then it
			will revert to dropping packets. Because of this it is important
			that re-ECN treats each packet drop it detects as if it were actually
			a CE mark. This ensures that it can continue to correctly echo
			congestion even through a highly congested path.

			In order to ensure that drops are correctly echoed the sender needs
			to add the number of drops detected per RTT to the difference in ECI
			value waiting to be echoed. A drop is defined as set out in
			[RFC2581] -- if the connection is in slow start then a single
			duplicate acknowledgement will be treated as an indication of a drop.
			When the system is in the congestion avoidance stage then 3 duplicate
			acknowledgements will be treated as a sign of a drop. In all cases,
			if a re-transmission time-out occurs then that will be treated as a
			drop.
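
The drop definition in the paragraph above can be written as a small predicate. This Python sketch follows the wording of that paragraph only; the function name and parameters are our assumptions and it is not a transcription of [RFC2581].

```python
def is_drop(in_slow_start, duplicate_acks, rto_expired):
    """Decide whether the sender should count a drop, per the rules above.

    in_slow_start  -- True if the connection is in slow start
    duplicate_acks -- consecutive duplicate acknowledgements seen so far
    rto_expired    -- True if a re-transmission time-out has occurred
    """
    if rto_expired:
        return True                       # a time-out always counts as a drop
    threshold = 1 if in_slow_start else 3 # 1 dup ACK in slow start, 3 otherwise
    return duplicate_acks >= threshold
```

Each drop detected this way is then added to the difference in ECI value waiting to be echoed, exactly as if it had been a CE mark.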

			4.1.1.2. Safety against Long Pure ACK Loss Sequences

	The ECI method was chosen for echoing congestion marking because a		The ECI method was chosen for echoing congestion marking because a
	re-ECN sender needs to know about every CE mark arriving at the		re-ECN sender needs to know about every CE mark arriving at the
	receiver, not just whether at least one arrives within a round trip		receiver, not just whether at least one arrives within a round trip
	time (which is all the ECE/CWR mechanism supported). And, as pure		time (which is all the ECE/CWR mechanism supported). And, as pure
	ACKs are not protected by TCP reliable delivery, we repeat the same		ACKs are not protected by TCP reliable delivery, we repeat the same
	ECI value in every ACK until it changes. Even if many ACKs in a row		ECI value in every ACK until it changes. Even if many ACKs in a row
	are lost, as soon as one gets through, the ECI field it repeats from		are lost, as soon as one gets through, the ECI field it repeats from
	previous ACKs that didn't get through will update the sender on how		previous ACKs that didn't get through will update the sender on how
	many CE marks arrived since the last ACK got through.		many CE marks arrived since the last ACK got through.

	skipping to change at page 22, line 24		skipping to change at page 23, line 36
	means that Re-ECT server B MUST set FNE on a SYN ACK whether it is		means that Re-ECT server B MUST set FNE on a SYN ACK whether it is
	responding to a SYN from a Re-ECT client or from a client that is		responding to a SYN from a Re-ECT client or from a client that is
	merely ECN-capable.		merely ECN-capable.

	The original ECN specification [RFC3168] required SYNs and SYN ACKs		The original ECN specification [RFC3168] required SYNs and SYN ACKs
	to use the Not-ECT codepoint of the ECN field. The aim was to		to use the Not-ECT codepoint of the ECN field. The aim was to
	prevent well-known DoS attacks such as SYN flooding being able to		prevent well-known DoS attacks such as SYN flooding being able to
	gain from the advantage that ECN capability afforded over drop at		gain from the advantage that ECN capability afforded over drop at
	ECN-capable routers.		ECN-capable routers.


	For a SYN ACK, Kuzmanovic [I-D.ietf-tsvwg-ecnsyn] has shown that this		For a SYN ACK, Kuzmanovic [I-D.ietf-tcpm-ecnsyn] has shown that this
	caution was unnecessary, and proposes to allow a SYN ACK to be ECN-		caution was unnecessary, and proposes to allow a SYN ACK to be ECN-
	capable to improve performance. We have gone further by proposing to		capable to improve performance. We have gone further by proposing to
	make the initial SYN ECN-capable too. By stipulating the FNE		make the initial SYN ECN-capable too. By stipulating the FNE
	codepoint for the initial SYN, we comply with RFC3168 in word but not		codepoint for the initial SYN, we comply with RFC3168 in word but not
	in spirit, because we have indeed set the ECN field to Not-ECT, but		in spirit, because we have indeed set the ECN field to Not-ECT, but
	we have extended the ECN field with another bit. And it will be seen		we have extended the ECN field with another bit. And it will be seen
	(Section 5.3) that we have defined one setting of that bit to mean an		(Section 5.3) that we have defined one setting of that bit to mean an
	ECN-capable transport. Therefore, by proposing that the FNE		ECN-capable transport. Therefore, by proposing that the FNE
	codepoint MUST be used on the initial SYN of a connection, we have		codepoint MUST be used on the initial SYN of a connection, we have
	(deliberately) made the initial SYN ECN-capable. Section 5.4		(deliberately) made the initial SYN ECN-capable. Section 5.4

	skipping to change at page 26, line 26		skipping to change at page 27, line 37
	If the sender transport does not have sufficient feedback to even		If the sender transport does not have sufficient feedback to even
	estimate the path's CE rate, it SHOULD set FNE continuously. If the		estimate the path's CE rate, it SHOULD set FNE continuously. If the
	sender transport has some, perhaps stale, feedback to estimate that		sender transport has some, perhaps stale, feedback to estimate that
	the path's CE rate is nearly definitely less than E%, the transport		the path's CE rate is nearly definitely less than E%, the transport
	MAY blank RE in packets for E% of sent octets, and set the RECT		MAY blank RE in packets for E% of sent octets, and set the RECT
	codepoint for the remainder.		codepoint for the remainder.
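
One way a transport might spread RE blanking over E% of sent octets is a simple credit counter, sketched below in Python. The class name, the integer credit scheme and the packet sizes are assumptions made for illustration. The text above only requires that RE is blanked in packets carrying E% of the octets and that the remainder use the RECT codepoint.

```python
class ReBlanker:
    """Blank RE in packets carrying roughly E% of the octets sent."""

    def __init__(self, e_percent):
        self.e_percent = e_percent  # assumed bound on the path's CE rate, in percent
        self.credit = 0             # blanking credit, in units of octets * 100

    def blank_re(self, packet_octets):
        """Return True if this packet should be sent with RE blanked."""
        self.credit += packet_octets * self.e_percent
        if self.credit >= packet_octets * 100:
            self.credit -= packet_octets * 100
            return True   # send with RE blanked
        return False      # send with the RECT codepoint


blanker = ReBlanker(e_percent=3)
decisions = [blanker.blank_re(1000) for _ in range(1000)]
blanked = sum(decisions)
```

Integer credit avoids rounding drift, so over the 1000 equal-sized packets in this example exactly 3% are sent with RE blanked.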

	The following sections give guidelines on how re-ECN support could be		The following sections give guidelines on how re-ECN support could be
	added to RSVP or NSIS, to DCCP, and to SCTP - although separate		added to RSVP or NSIS, to DCCP, and to SCTP - although separate
	Internet drafts will be necessary to document the exact mechanics of		Internet drafts will be necessary to document the exact mechanics of

	re-ECN if each of these protocols.		re-ECN in each of these protocols.

	{ToDo: Give a brief outline of what would be expected for each of the		{ToDo: Give a brief outline of what would be expected for each of the
	following:		following:

	o UDP fire and forget (e.g. DNS)		o UDP fire and forget (e.g. DNS)

	o UDP streaming with no feedback		o UDP streaming with no feedback

	o UDP streaming with feedback		o UDP streaming with feedback

	}		}

	4.2.2. Guidelines for adding Re-ECN to RSVP or NSIS		4.2.2. Guidelines for adding Re-ECN to RSVP or NSIS

	A separate I-D has been submitted [Re-PCN] describing how re-ECN can		A separate I-D has been submitted [Re-PCN] describing how re-ECN can
	be used in an edge-to-edge rather than end-to-end scenario. It can		be used in an edge-to-edge rather than end-to-end scenario. It can
	then be used by downstream networks to police whether upstream		then be used by downstream networks to police whether upstream
	networks are blocking new flow reservations when downstream		networks are blocking new flow reservations when downstream
	congestion is too high, even though the congestion is in other		congestion is too high, even though the congestion is in other

	operators' downstream networks. This relates to current work in		operators' downstream networks. This relates to current IETF work on
	progress on Admission Control over Diffserv using Pre-Congestion		Admission Control over Diffserv using Pre-Congestion Notification
	Notification, being reported to the IETF TSVWG [CL-deploy].		(PCN) [PCN-arch].

	4.2.3. Guidelines for adding Re-ECN to DCCP		4.2.3. Guidelines for adding Re-ECN to DCCP

	Besides adjusting the initial features negotiation sequence, operating		Besides adjusting the initial features negotiation sequence, operating

	re-ECN in DCCP could be achieved by defining a new option to be added		re-ECN in DCCP [RFC4340] could be achieved by defining a new option
	to acknowledgments, that would include a multibit field where the		to be added to acknowledgments, that would include a multibit field
	destination could copy its ECC.		where the destination could copy its ECC.

	4.2.4. Guidelines for adding Re-ECN to SCTP		4.2.4. Guidelines for adding Re-ECN to SCTP


	Annex 1 in RFC4340 gives the specifications for SCTP to support ECN.		Annex 1 in [RFC2960] gives the specifications for SCTP to support
	Similar steps should be taken to support re-ECN. Besides adjusting		ECN. Similar steps should be taken to support re-ECN. Besides
	the initial features negotiation sequence, operating re-ECN in SCTP		adjusting the initial features negotiation sequence, operating re-ECN
	could be achieved by defining a new control chunk, that would include		in SCTP could be achieved by defining a new control chunk, that would
	a multibit field where the destination could copy its ECC.		include a multibit field where the destination could copy its ECC.

	5. Network Layer		5. Network Layer

	5.1. Re-ECN IPv4 Wire Protocol		5.1. Re-ECN IPv4 Wire Protocol

	The wire protocol of the ECN field in the IP header remains largely		The wire protocol of the ECN field in the IP header remains largely
	unchanged from [RFC3168]. However, an extension to the ECN field we		unchanged from [RFC3168]. However, an extension to the ECN field we
	call the RE (re-ECN extension) flag (Section 3.2) is defined in this		call the RE (re-ECN extension) flag (Section 3.2) is defined in this
	document. It doubles the extended ECN codepoint space, giving 8		document. It doubles the extended ECN codepoint space, giving 8
	potential codepoints. The semantics of the extra codepoints are		potential codepoints. The semantics of the extra codepoints are

	skipping to change at page 29, line 8		skipping to change at page 30, line 14

	5.2. Re-ECN IPv6 Wire Protocol		5.2. Re-ECN IPv6 Wire Protocol

	For IPv6, this document proposes that the new RE control flag will be		For IPv6, this document proposes that the new RE control flag will be
	positioned as the first bit of the option field of a new Congestion		positioned as the first bit of the option field of a new Congestion
	hop by hop option header (Figure 6).		hop by hop option header (Figure 6).

	 0                   1                   2                   3		 0                   1                   2                   3
	 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1		 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+		+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

	|  Next Header  |  Hdr ext Len  |  Option Type  |  Option Len   |		|  Next Header  |  Hdr ext Len  |  Option Type  | Opt Length =4 |
	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+		+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
	|R|                   Reserved for future use                   |		|R|                   Reserved for future use                   |
	|E|                                                             |		|E|                                                             |
	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+		+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

	Figure 6: Definition of a New IPv6 Congestion Hop by Hop Option		Figure 6: Definition of a New IPv6 Congestion Hop by Hop Option
	Header containing the Re-ECN Extension (RE) Control Flag		Header containing the Re-ECN Extension (RE) Control Flag

	0 1 2 3 4 5 6 7 8		0 1 2 3 4 5 6 7 8
	+-+-+-+-+-+-+-+-+-		+-+-+-+-+-+-+-+-+-

	skipping to change at page 33, line 19		skipping to change at page 34, line 19
	not aware of. Otherwise, spoof messages could be sent by malicious		not aware of. Otherwise, spoof messages could be sent by malicious
	sources to slow down a sender (c.f. ICMP source quench).		sources to slow down a sender (c.f. ICMP source quench).

	However, the need for this message type is not yet confirmed, as we		However, the need for this message type is not yet confirmed, as we
	are considering how to prevent it being used by malicious senders to		are considering how to prevent it being used by malicious senders to
	scan for droppers and to test their threshold settings. {ToDo:		scan for droppers and to test their threshold settings. {ToDo:
	Complete this section.}		Complete this section.}

	5.5.2. Rate Response Control		5.5.2. Rate Response Control


	The incentive framework of Section 6.1.3 implies there may be a need		As discussed in Section 6.1.5 the sender's access operator will be
	for a sender to send a request to an ingress policer asking that it		expected to use bulk per-user policing, but they might choose to
	be allowed to apply a non-default response to congestion (where TCP-		introduce a per-flow policer. In cases where operators do introduce
	friendly is assumed to be the default). This would require the		per-flow policing, there may be a need for a sender to send a request
	sender to know what message format(s) to use and to be able to		to the ingress policer asking for permission to apply a non-default
	discover how to address the policer. The required control		response to congestion (where TCP-friendly is assumed to be the
	protocol(s) are outside the scope of this document, but will require		default). This would require the sender to know what message
	definition elsewhere.		format(s) to use and to be able to discover how to address the
			policer. The required control protocol(s) are outside the scope of
			this document, but will require definition elsewhere.

	The policer is likely to be local to the sender and inline, probably		The policer is likely to be local to the sender and inline, probably
	at the ingress interface to the internetwork. So, discovery should		at the ingress interface to the internetwork. So, discovery should
	not be hard. A variety of control protocols already exist for some		not be hard. A variety of control protocols already exist for some
	widely used rate-responses to congestion. For instance DCCP		widely used rate-responses to congestion. For instance DCCP
	congestion control identifiers (CCIDs [RFC4340]) fulfil this role and		congestion control identifiers (CCIDs [RFC4340]) fulfil this role and
	so does QoS signalling (e.g. an RSVP request for controlled load		so does QoS signalling (e.g. an RSVP request for controlled load
	service is equivalent to a request for no rate response to		service is equivalent to a request for no rate response to
	congestion, but with admission control).		congestion, but with admission control).

	5.6. IP in IP Tunnels		5.6. IP in IP Tunnels

	For re-ECN to work correctly through IP in IP tunnels, it needs		For re-ECN to work correctly through IP in IP tunnels, it needs
	slightly different tunnel handling to regular ECN [RFC3168].		slightly different tunnel handling to regular ECN [RFC3168].

	Ideally, for re-ECN to work through a tunnel, the tunnel entry should		Currently there is some inconsistency between how the handling of IP
	copy both the RE flag and the ECN field from the inner to the outer		in IP tunnels is defined in [RFC3168] and how it is defined in
	IP header. Then at the tunnel exit, any congestion marking of the		[RFC4301], but re-ECN would work fine with the IPsec behaviour. This
	outer ECN field should overwrite the inner ECN field (unless the		inconsistency is addressed in a new Internet Draft [ECN-tunnel] that
	inner field is Not-ECT in which case an alarm should be raised). The		proposes to update RFC3168 tunnel behaviour to bring it into line
	RE flag shouldn't change along a path, so the outer RE flag should be		with IPsec. Ideally, for re-ECN to work through a tunnel, the tunnel
	the same as the inner. If it isn't a management alarm should be		entry should copy both the RE flag and the ECN field from the inner
	raised. This behaviour is the same as the full-functionality variant		to the outer IP header. Then at the tunnel exit, any congestion
	of [RFC3168] at tunnel exit, but different at tunnel entry.		marking of the outer ECN field should overwrite the inner ECN field
			(unless the inner field is Not-ECT in which case an alarm should be
			raised). The RE flag shouldn't change along a path, so the outer RE
			flag should be the same as the inner. If it isn't a management alarm
			should be raised. This behaviour is the same as the full-
			functionality variant of [RFC3168] at tunnel exit, but different at
			tunnel entry.
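
The ideal tunnel behaviour described above can be sketched as follows. This Python fragment is an illustration under our own assumptions: the Header type, the function names and the alarm strings are hypothetical, and only the RE flag and the ECN field are modelled. The ECN field values are those of [RFC3168].

```python
from dataclasses import dataclass

NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11  # ECN field values from RFC 3168


@dataclass
class Header:
    ecn: int  # 2-bit ECN field
    re: int   # re-ECN extension (RE) flag


def tunnel_entry(inner):
    """Build the outer header by copying both the RE flag and the ECN field."""
    return Header(ecn=inner.ecn, re=inner.re)


def tunnel_exit(outer, inner):
    """Return the forwarded inner header and any management alarms raised."""
    alarms = []
    ecn = inner.ecn
    if outer.ecn == CE:
        if inner.ecn == NOT_ECT:
            alarms.append("outer header is CE but inner header is Not-ECT")
        else:
            ecn = CE  # congestion marking of the outer overwrites the inner
    if outer.re != inner.re:
        alarms.append("RE flag differs between outer and inner headers")
    return Header(ecn=ecn, re=inner.re), alarms


inner = Header(ecn=ECT1, re=1)
outer = tunnel_entry(inner)
outer.ecn = CE                       # a router inside the tunnel marks congestion
forwarded, alarms = tunnel_exit(outer, inner)
```

In this example the CE mark applied inside the tunnel survives decapsulation and no alarm is raised, because the RE flag was copied at entry and left unchanged along the path.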

	If tunnels are left as they are specified in [RFC3168], whether the		If tunnels are left as they are specified in [RFC3168], whether the
	limited or full-functionality variants are used, a problem arises		limited or full-functionality variants are used, a problem arises
	with re-ECN if a tunnel crosses an inter-domain boundary, because the		with re-ECN if a tunnel crosses an inter-domain boundary, because the
	difference between positive and negative markings will not be		difference between positive and negative markings will not be
	correctly accounted for. In a limited functionality ECN tunnel, the		correctly accounted for. In a limited functionality ECN tunnel, the
	flow will appear to be legacy traffic, and therefore may be wrongly		flow will appear to be legacy traffic, and therefore may be wrongly
	rate limited. In a full-functionality ECN tunnel, the result will		rate limited. In a full-functionality ECN tunnel, the result will
	depend whether the tunnel entry copies the inner RE flag to the outer		depend whether the tunnel entry copies the inner RE flag to the outer
	header or the RE flag in the outer header is always cleared. If the		header or the RE flag in the outer header is always cleared. If the
	former, the flow will tend to be too positive when accounted for at		former, the flow will tend to be too positive when accounted for at

	borders. If the latter, it will be too negative.		borders. If the latter, it will be too negative. If the rules set
			out in [ECN-tunnel] are followed then this will not be an issue.
	{ToDo: A future version of this draft will discuss the necessary
	changes to IP in IP tunnels in more depth.}

	5.7. Non-Issues		5.7. Non-Issues

	The following issues might seem to cause unfavourable interactions		The following issues might seem to cause unfavourable interactions
	with re-ECN, but we will explain why they don't:		with re-ECN, but we will explain why they don't:

	o Various link layers support explicit congestion notification, such		o Various link layers support explicit congestion notification, such
	as Frame Relay and ATM. Explicit congestion notification is		as Frame Relay and ATM. Explicit congestion notification is
	proposed to be added to other link layers, such as Ethernet		proposed to be added to other link layers, such as Ethernet
	(802.3ar Ethernet congestion management) and MPLS [ECN-MPLS];		(802.3ar Ethernet congestion management) and MPLS [ECN-MPLS];

	skipping to change at page 35, line 31		skipping to change at page 36, line 37
	6. Applications		6. Applications

	6.1. Policing Congestion Response		6.1. Policing Congestion Response

	6.1.1. The Policing Problem		6.1.1. The Policing Problem

	The current Internet architecture trusts hosts to respond voluntarily		The current Internet architecture trusts hosts to respond voluntarily
	to congestion. Limited evidence shows that the large majority of		to congestion. Limited evidence shows that the large majority of
	end-points on the Internet comply with a TCP-friendly response to		end-points on the Internet comply with a TCP-friendly response to
	congestion. But telephony (and increasingly video) services over the		congestion. But telephony (and increasingly video) services over the

	best efforts Internet are attracting the interest of major commercial		best effort Internet are attracting the interest of major commercial
	operations. Most of these applications do not respond to congestion		operations. Most of these applications do not respond to congestion
	at all. Those that can switch to lower rate codecs, still have a		at all. Those that can switch to lower rate codecs, still have a
	lower bound below which they must become unresponsive to congestion.		lower bound below which they must become unresponsive to congestion.

	Of course, the Internet is intended to support many different		Of course, the Internet is intended to support many different
	application behaviours. But the problem is that this freedom can be		application behaviours. But the problem is that this freedom can be
	exercised irresponsibly. The greater problem is that we will never		exercised irresponsibly. The greater problem is that we will never
	be able to agree on where the boundary is between responsible and		be able to agree on where the boundary is between responsible and
	irresponsible. Therefore re-ECN is designed to allow different		irresponsible. Therefore re-ECN is designed to allow different
	networks to set their own view of the limit to irresponsibility, and		networks to set their own view of the limit to irresponsibility, and

	skipping to change at page 37, line 37		skipping to change at page 38, line 44
	return address at a higher layer.		return address at a higher layer.

	6.1.3. Re-ECN Incentive Framework		6.1.3. Re-ECN Incentive Framework

	The aim is to create an incentive environment that ensures optimal		The aim is to create an incentive environment that ensures optimal
	sharing of capacity despite everyone acting selfishly (including		sharing of capacity despite everyone acting selfishly (including
	lying and cheating). Of course, the mechanisms put in place for this		lying and cheating). Of course, the mechanisms put in place for this
	can lie dormant wherever co-operation is the norm.		can lie dormant wherever co-operation is the norm.

	Throughout this document we focus on path congestion. But some forms		Throughout this document we focus on path congestion. But some forms

	of fairness, particularly TCP's, also depend on round trip time. So,		of fairness, particularly TCP's, also depend on round trip time. If
	we also propose to measure downstream path delay using re-feedback.		TCP-fairness is required, we also propose to measure downstream path
	This proposal will be published in a very simple future draft, but		delay using re-feedback. We give a simple outline of how this could
	for now we give an outline in Appendix F.		work in Appendix F. However, we do not expect this to be necessary,
			as researchers tend to agree that only congestion control dynamics
			need to depend on RTT, not the rate that the algorithm would converge
			on after a period of stability.

	Figure 8 sketches the incentive framework that we will describe piece		Figure 8 sketches the incentive framework that we will describe piece
	by piece throughout this section. We will do a first pass in		by piece throughout this section. We will do a first pass in
	overview, then return to each piece in detail. We re-use the earlier		overview, then return to each piece in detail. We re-use the earlier
	example of how downstream congestion is derived by subtracting		example of how downstream congestion is derived by subtracting
	upstream congestion from path congestion (Figure 1) but depict		upstream congestion from path congestion (Figure 1) but depict
	multiple trust boundaries to turn it into an internetwork. For		multiple trust boundaries to turn it into an internetwork. For
	clarity, only downstream congestion is shown (the difference between		clarity, only downstream congestion is shown (the difference between
	the two earlier plots). The graph displays downstream path		the two earlier plots). The graph displays downstream path
	congestion seen in a typical flow as it traverses an example path		congestion seen in a typical flow as it traverses an example path

	skipping to change at page 39, line 12		skipping to change at page 40, line 42
	enhanced QoS), to some extent it will always be against the		enhanced QoS), to some extent it will always be against the
	sender's interest to comply.		sender's interest to comply.

	Ingress policing: But it is in all the network operators' interests		Ingress policing: But it is in all the network operators' interests
	to encourage fair congestion response, so that their investments		to encourage fair congestion response, so that their investments
	are employed to satisfy the most valuable demand. The re-ECN		are employed to satisfy the most valuable demand. The re-ECN
	protocol ensures packets carry the necessary information about		protocol ensures packets carry the necessary information about
	their own expected downstream congestion so that N1 can deploy a		their own expected downstream congestion so that N1 can deploy a
	policer at its ingress to check that S1 is complying with whatever		policer at its ingress to check that S1 is complying with whatever
	congestion control it should be using (Section 6.1.5). If N1 is		congestion control it should be using (Section 6.1.5). If N1 is

	extremely conservative it may police each flow, but it can choose		extremely conservative it could police each flow, but it is likely
	to just police the bulk amount of congestion each customer causes		to just police the bulk amount of congestion each customer causes
	without regard to flows, or if it is extremely liberal it need not		without regard to flows, or if it is extremely liberal it need not
	police congestion control at all. Whatever, it is always		police congestion control at all. Whatever, it is always
	preferable to police traffic at the very first ingress into an		preferable to police traffic at the very first ingress into an
	internetwork, before non-compliant traffic can cause any damage.		internetwork, before non-compliant traffic can cause any damage.

	Edge egress dropper: If the policer ensures the source has less		Edge egress dropper: If the policer ensures the source has less
	right to a high rate the higher it declares downstream congestion,		right to a high rate the higher it declares downstream congestion,
	the source has a clear incentive to understate downstream		the source has a clear incentive to understate downstream
	congestion. But, if flows of packets are understated when they		congestion. But, if flows of packets are understated when they

	skipping to change at page 40, line 41		skipping to change at page 42, line 21
	at the egress of N2. Then N2 has an incentive either to police		at the egress of N2. Then N2 has an incentive either to police
	the congestion response of its own ingress traffic (from N1) or to		the congestion response of its own ingress traffic (from N1) or to
	emulate policing by applying penalties to N1 in turn on the basis		emulate policing by applying penalties to N1 in turn on the basis
	of congestion counted at their mutual boundary. In this recursive		of congestion counted at their mutual boundary. In this recursive
	way, the incentives for each flow to respond correctly to		way, the incentives for each flow to respond correctly to
	congestion trace back with each flow precisely to each source,		congestion trace back with each flow precisely to each source,
	despite the mechanism not recognising flows (see Section 6.2.2).		despite the mechanism not recognising flows (see Section 6.2.2).

	Inter-domain congestion charging diversity: Any two networks are		Inter-domain congestion charging diversity: Any two networks are
	free to agree any of a range of penalty regimes between themselves		free to agree any of a range of penalty regimes between themselves

			but they would only provide the right incentives if they were
	within the following reasonable constraints. N2 should expect to		within the following reasonable constraints. N2 should expect to
	have to pay penalties to N4 where penalties monotonically increase		have to pay penalties to N4 where penalties monotonically increase
	with the volume of congestion and negative penalties are not		with the volume of congestion and negative penalties are not
	allowed. For instance, they may agree an SLA with tiered		allowed. For instance, they may agree an SLA with tiered
	congestion thresholds, where higher penalties apply the higher the		congestion thresholds, where higher penalties apply the higher the
	threshold that is broken. But the most obvious (and useful) form		threshold that is broken. But the most obvious (and useful) form
	of penalty is where N4 levies a charge on N2 proportional to the		of penalty is where N4 levies a charge on N2 proportional to the
	volume of downstream congestion N2 dumps into N4. In the		volume of downstream congestion N2 dumps into N4. In the
	explanation that follows, we assume this specific variant of		explanation that follows, we assume this specific variant of
	volume charging between networks - charging proportionate to the		volume charging between networks - charging proportionate to the

	skipping to change at page 41, line 14		skipping to change at page 42, line 43

	We must make clear that we are not advocating that everyone should		We must make clear that we are not advocating that everyone should
	use this form of contract. We are well aware that the IETF tries		use this form of contract. We are well aware that the IETF tries
	to avoid standardising technology that depends on a particular		to avoid standardising technology that depends on a particular
	business model. And we strongly share this desire to encourage		business model. And we strongly share this desire to encourage
	diversity. But our aim is merely to show that border policing can		diversity. But our aim is merely to show that border policing can
	at least work with this one model, then we can assume that		at least work with this one model, then we can assume that
	operators might experiment with the metric in other models (see		operators might experiment with the metric in other models (see
	Section 6.1.6 for examples). Of course, operators are free to		Section 6.1.6 for examples). Of course, operators are free to
	complement this usage element of their charges with traditional		complement this usage element of their charges with traditional

	capacity charging, and we expect they will.		capacity charging, and we expect they will as predicted by
			economics.

	No congestion charging to users: Bulk congestion penalties at trust		No congestion charging to users: Bulk congestion penalties at trust
	boundaries are passive and extremely simple, and lose none of		boundaries are passive and extremely simple, and lose none of
	their per-packet precision from one boundary to the next (unlike		their per-packet precision from one boundary to the next (unlike
	Diffserv all-address traffic conditioning agreements, which		Diffserv all-address traffic conditioning agreements, which
	dissipate their effectiveness across long topologies). But at any		dissipate their effectiveness across long topologies). But at any
	trust boundary, there is no imperative to use congestion charging.		trust boundary, there is no imperative to use congestion charging.


	Traditional traffic policing can be used, if the complexity and		Traditional traffic policing can be used, if the complexity and
	cost is preferred. In particular, at the boundary with end		cost is preferred. In particular, at the boundary with end
	customers (e.g. between S and N1), traffic policing will most		customers (e.g. between S and N1), traffic policing will most
	likely be more appropriate. Policer complexity is less of a		likely be more appropriate. Policer complexity is less of a
	concern at the edge of the network. And end-customers are known		concern at the edge of the network. And end-customers are known
	to be highly averse to the unpredictability of congestion		to be highly averse to the unpredictability of congestion
	charging.		charging.

	NOTE WELL: This document neither advocates nor requires congestion		NOTE WELL: This document neither advocates nor requires congestion
	charging for end customers and advocates but does not require		charging for end customers and advocates but does not require

	skipping to change at page 41, line 40		skipping to change at page 43, line 23
	NOTE WELL: This document neither advocates nor requires congestion		NOTE WELL: This document neither advocates nor requires congestion
	charging for end customers and advocates but does not require		charging for end customers and advocates but does not require
	inter-domain congestion charging.		inter-domain congestion charging.

	Competitive discipline of inter-domain traffic engineering: With		Competitive discipline of inter-domain traffic engineering: With
	inter-domain congestion charging, a domain seems to have a		inter-domain congestion charging, a domain seems to have a
	perverse incentive to fake congestion; N2's profit depends on the		perverse incentive to fake congestion; N2's profit depends on the
	difference between congestion at its ingress (its revenue) and at		difference between congestion at its ingress (its revenue) and at
	its egress (its cost). So, overstating internal congestion seems		its egress (its cost). So, overstating internal congestion seems
	to increase profit. However, smart border routing [Smart_rtg] by		to increase profit. However, smart border routing [Smart_rtg] by

	N1 will bias its multipath routing towards the least cost routes.		N1 will bias its routing towards the least cost routes. So, N2
	So, N2 risks losing all its revenue to competitive routes if it		risks losing all its revenue to competitive routes if it
	overstates congestion (see Section 6.2.3). In other words, if N2		overstates congestion (see Section 6.2.3). In other words, if N2
	is the least congested route, its ability to raise excess profits		is the least congested route, its ability to raise excess profits
	is limited by the congestion on the next least congested route.		is limited by the congestion on the next least congested route.
	This pressure on N2 to remain competitive is represented by the		This pressure on N2 to remain competitive is represented by the
	dotted downward arrow at the ingress to N2 in Figure 9.		dotted downward arrow at the ingress to N2 in Figure 9.

	Closing the loop: All the above elements conspire to trap everyone		Closing the loop: All the above elements conspire to trap everyone
	between two opposing pressures (the downward and upward arrows in		between two opposing pressures (the downward and upward arrows in
	Figure 8 & Figure 9), ensuring the downstream congestion metric		Figure 8 & Figure 9), ensuring the downstream congestion metric
	arrives at the destination neither above nor below zero. So, we		arrives at the destination neither above nor below zero. So, we

	skipping to change at page 42, line 24		skipping to change at page 44, line 7
	superior to bottleneck policing or to any policing of different		superior to bottleneck policing or to any policing of different
	QoS for different flows. Even if all access networks choose to		QoS for different flows. Even if all access networks choose to
	conservatively police congestion per flow, each will want to		conservatively police congestion per flow, each will want to
	compete with the others to allow new responses to congestion for		compete with the others to allow new responses to congestion for
	new types of application. With re-ECN, each can introduce new		new types of application. With re-ECN, each can introduce new
	controls independently, without coordinating with other networks		controls independently, without coordinating with other networks
	and without having to standardise anything. But, as we have just		and without having to standardise anything. But, as we have just
	seen, by making inter-domain penalties proportionate to bulk		seen, by making inter-domain penalties proportionate to bulk
	downstream congestion, downstream networks can be agnostic to the		downstream congestion, downstream networks can be agnostic to the
	specific congestion response for each flow, but they can still		specific congestion response for each flow, but they can still

	apply more back-pressure the more liberal the ingress access		apply more penalty the more liberal the ingress access network has
	network has been in the response to congestion it allowed for each		been in the response to congestion it allowed for each flow.
	flow.

	6.1.3.1. The Case against Classic Feedback		6.1.3.1. The Case against Classic Feedback

	A system that produces an optimal outcome as a result of everyone's		A system that produces an optimal outcome as a result of everyone's
	selfish actions is extremely powerful. Especially one that enables		selfish actions is extremely powerful. Especially one that enables
	evolvability of congestion control. But why do we have to change to		evolvability of congestion control. But why do we have to change to
	re-ECN to achieve it? Can't classic congestion feedback (as used		re-ECN to achieve it? Can't classic congestion feedback (as used
	already by standard ECN) be arranged to provide similar incentives		already by standard ECN) be arranged to provide similar incentives
	and similar evolvability? Superficially it can. Kelly's seminal		and similar evolvability? Superficially it can. Kelly's seminal
	work showed how we can allow everyone the freedom to evolve whatever		work showed how we can allow everyone the freedom to evolve whatever
	congestion control behaviour is in their application's best interest		congestion control behaviour is in their application's best interest
	but still optimise the whole system of networks and users by placing		but still optimise the whole system of networks and users by placing
	a price on congestion to ensure responsible use of this		a price on congestion to ensure responsible use of this
	freedom [Evol_cc]. Kelly used ECN with its classic congestion		freedom [Evol_cc]. Kelly used ECN with its classic congestion
	feedback model as the mechanism to convey congestion price		feedback model as the mechanism to convey congestion price

	information. The mechanism was nearly identical to volume charging;		information. The mechanism could be thought of as volume charging;
	except only the volume of packets marked with congestion experienced		except only the volume of packets marked with congestion experienced
	(CE) was counted.		(CE) was counted.

	However, below we explain why relying on classic feedback /required/		However, below we explain why relying on classic feedback /required/
	congestion charging to be used, while re-ECN achieves the same		congestion charging to be used, while re-ECN achieves the same
	powerful outcome (given it is built on Kelly's foundations), but does		powerful outcome (given it is built on Kelly's foundations), but does
	not /require/ congestion charging. In brief, the problem with		not /require/ congestion charging. In brief, the problem with
	classic feedback is that the incentives have to trace the indirect		classic feedback is that the incentives have to trace the indirect
	path back to the sender---the long way round the feedback loop. For		path back to the sender---the long way round the feedback loop. For
	example, if classic feedback were used in Figure 8, N2 would have had		example, if classic feedback were used in Figure 8, N2 would have had

	skipping to change at page 45, line 22		skipping to change at page 47, line 5
	from the receiver. So, counting packets with FNE cleared would be		from the receiver. So, counting packets with FNE cleared would be
	likely to make the average unnecessarily positive, providing headroom		likely to make the average unnecessarily positive, providing headroom
	(or should we say footroom?) for dishonest (negative) traffic.		(or should we say footroom?) for dishonest (negative) traffic.

	If the dropper detects a persistently negative flow, it SHOULD drop		If the dropper detects a persistently negative flow, it SHOULD drop
	sufficient negative and neutral packets to force the flow to not be		sufficient negative and neutral packets to force the flow to not be
	negative. Drops SHOULD be focused on just sufficient packets in		negative. Drops SHOULD be focused on just sufficient packets in
	misbehaving flows to remove the negative bias while doing minimal		misbehaving flows to remove the negative bias while doing minimal
	extra harm.		extra harm.


	6.1.5. Rate Policing		6.1.5. Policing


	Access operators who wish to check that a sender is complying with a		Access operators who wish to limit the congestion that a sender is
	particular rate response to congestion can deploy rate policers at		able to cause can deploy policers at the very first ingress to the
	the very first ingress to the internetwork. Re-ECN has been designed		internetwork. Re-ECN has been designed to avoid the need for
	to avoid the need for bottleneck policing so that we can avoid a		bottleneck policing so that we can avoid a future where a single rate
	future where a single rate adaptation policy is embedded throughout		adaptation policy is embedded throughout the network. Instead, re-
	the network. Instead, re-ECN allows the particular rate adaptation		ECN allows the particular rate adaptation policy to be solely agreed
	policy to be solely agreed bilaterally between the sender and its		bilaterally between the sender and its ingress access provider
	ingress access provider (Section 5.5.2 discusses possible ways to		(Section 5.5.2 discusses possible ways to signal between them), which
	signal between them), which allows congestion control to be policed,		allows congestion control to be policed, but maintains its
	but maintains its evolvability, requiring only a single, local box to		evolvability, requiring only a single, local box to be updated.
	be updated.


	If desired, the re-ECN protocol allows these ingress policers to		Appendix G gives examples of per-user policing algorithms. But there
	perform per-flow policing according to the widely adopted TCP rate		is no implication that these algorithms are to be standardised, or
	adaptation, perhaps as a default. But it also allows new rate		that they are ideal. The ingress rate policer is the part of the re-
	adaptation policies beyond TCP to be enforced. Perhaps more		ECN incentive framework that is intended to be the most flexible.
	usefully, it also allows the flexibility for networks to choose to		Once endpoint protocol handlers for re-ECN and egress droppers are in
	police users as a whole, rather than flows.		place, operators can choose exactly which congestion response they
			want to police, and whether they want to do it per user, per flow or
			not at all.


	Appendix G gives examples of per-user and per-flow policing		The re-ECN protocol allows these ingress policers to easily perform
	algorithms. But there is no implication that these algorithms are to		bulk per-user policing (Appendix G.1). This is likely to provide
	be standardised, or that they are ideal. The ingress rate policer is		sufficient incentive to the user to correctly respond to congestion
	the part of the re-ECN incentive framework that is intended to be the		without needing the policing function to be overly complex. If an
	most flexible. Once endpoint protocol handlers for re-ECN and egress		access operator chose they could use per-flow policing according to
	droppers are in place, operators can choose exactly which congestion		the widely adopted TCP rate adaptation (Appendix G.2) or other
	response they want to police, and whether they want to do it per		alternatives, however this would introduce extra complexity to the
	user, per flow or not at all.		system.


	However, if a rate policer is used, it should use path (not		If a per-flow rate policer is used, it should use path (not
	downstream) congestion as the relevant metric, which is represented		downstream) congestion as the relevant metric, which is represented
	by the fraction of octets in packets with positive (Re-Echo and FNE)		by the fraction of octets in packets with positive (Re-Echo and FNE)
	and canceled (CE(0)) markings. Of course, re-ECN provides all the		and canceled (CE(0)) markings. Of course, re-ECN provides all the
	information a policer needs directly in the packets being policed.		information a policer needs directly in the packets being policed.
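As an illustration only, the path congestion metric described above can be sketched as the fraction of octets carried in packets with positive (Re-Echo, FNE) or canceled (CE(0)) markings. The Packet type, the function name and the string labels used for markings are assumptions made for this sketch; they are not defined by this document, and a real policer would read the markings from the packet headers.

```python
from collections import namedtuple

# Hypothetical packet record: a marking label and a size in octets.
Packet = namedtuple("Packet", ["marking", "octets"])

# Markings that count towards path congestion, as listed in the text:
# positive (Re-Echo and FNE) and canceled (CE(0)).
PATH_CONGESTION_MARKINGS = {"Re-Echo", "FNE", "CE(0)"}


def path_congestion_fraction(packets):
    """Return the fraction of octets in packets that signal path congestion."""
    packets = list(packets)  # allow any iterable, since we scan it twice
    total_octets = sum(p.octets for p in packets)
    if total_octets == 0:
        return 0.0  # no traffic seen, so no congestion can be inferred
    marked_octets = sum(
        p.octets for p in packets if p.marking in PATH_CONGESTION_MARKINGS
    )
    return marked_octets / total_octets


sample = [
    Packet("RECT", 1500),     # neutral packet (label assumed for this sketch)
    Packet("Re-Echo", 1500),  # positive
    Packet("CE(0)", 500),     # canceled
    Packet("CE(-1)", 500),    # negative, not counted as path congestion
]
fraction = path_congestion_fraction(sample)
```

The sketch counts octets rather than packets, following the wording above. How the resulting fraction is compared against an allowed rate response is left to the policer design.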

	So, even policing TCP's AIMD algorithm is relatively straightforward.		So, even policing TCP's AIMD algorithm is relatively straightforward
	Appendix G presents an example design, but the choice of preferred		(Appendix G.2).
	mechanism is up to the implementer.

	Note that we have included canceled packets in the measure of path		Note that we have included canceled packets in the measure of path
	congestion. Canceled packets arise when the sender re-echoes earlier		congestion. Canceled packets arise when the sender re-echoes earlier
	congestion, but then this Re-Echo packet just happens to be		congestion, but then this Re-Echo packet just happens to be
	congestion marked itself. One would not normally expect many		congestion marked itself. One would not normally expect many
	canceled packets at the first ingress because one would not normally		canceled packets at the first ingress because one would not normally
	expect much congestion marking to have been necessary that soon in		expect much congestion marking to have been necessary that soon in
	the path. However, a home network or campus network may well sit		the path. However, a home network or campus network may well sit
	between the sending endpoint and the ingress policer, so some		between the sending endpoint and the ingress policer, so some
	congestion may occur upstream of the policer. And if congestion does		congestion may occur upstream of the policer. And if congestion does

	skipping to change at page 47, line 5		skipping to change at page 48, line 36
	Of course, even if the sender does operate its own network, it may		Of course, even if the sender does operate its own network, it may
	arrange not to congestion mark traffic. Whether the sender does this		arrange not to congestion mark traffic. Whether the sender does this
	or not is of no concern to anyone else except the sender. Such a		or not is of no concern to anyone else except the sender. Such a
	sender will not be policed against its own network's contribution to		sender will not be policed against its own network's contribution to
	congestion, but the only resulting problem would be overload in the		congestion, but the only resulting problem would be overload in the
	sender's own network.		sender's own network.

	Finally, we must not forget that an easy way to circumvent re-ECN's		Finally, we must not forget that an easy way to circumvent re-ECN's
	defences is for the source to turn off re-ECN support, by setting the		defences is for the source to turn off re-ECN support, by setting the
	Not-RECT codepoint, implying legacy traffic. Therefore an ingress		Not-RECT codepoint, implying legacy traffic. Therefore an ingress

	policer must put a general rate-limit on Not-RECT traffic, which		policer should put a general rate-limit on Not-RECT traffic, which
	SHOULD be lax during early, patchy deployment, but will have to		SHOULD be lax during early, patchy deployment, but will have to
	become stricter as deployment widens. Similarly, flows starting		become stricter as deployment widens. Similarly, flows starting
	without an FNE packet can be confined by a strict rate-limit used for		without an FNE packet can be confined by a strict rate-limit used for
	the remainder of flows that haven't proved they are well-behaved by		the remainder of flows that haven't proved they are well-behaved by
	starting correctly (therefore they need not consume any flow state---		starting correctly (therefore they need not consume any flow state---
	they are just confined to the `misbehaving' bin if they carry an		they are just confined to the `misbehaving' bin if they carry an
	unrecognised flow ID).		unrecognised flow ID).

	6.1.6. Inter-domain Policing		6.1.6. Inter-domain Policing

	One of the main design goals of re-ECN is for border security		One of the main design goals of re-ECN is for border security
	mechanisms to be as simple as possible, otherwise they will become		mechanisms to be as simple as possible, otherwise they will become
	the pinch-points that limit scalability of the whole internetwork.		the pinch-points that limit scalability of the whole internetwork.
	We want to avoid per-flow processing at borders and to keep to		We want to avoid per-flow processing at borders and to keep to
	passive mechanisms that can monitor traffic in parallel to		passive mechanisms that can monitor traffic in parallel to
	forwarding, rather than having to filter traffic inline---in series		forwarding, rather than having to filter traffic inline---in series

	with forwarding.		with forwarding. Such passive, off-line mechanisms are essential for
			future high-speed all-optical border interconnection where packets
			cannot be buffered while they are checked for policy compliance.

	So far, we have been able to keep the border mechanisms simple,		So far, we have been able to keep the border mechanisms simple,
	despite having had to harden them against some subtle attacks on the		despite having had to harden them against some subtle attacks on the
	re-ECN design. The mechanisms are still passive and avoid per-flow		re-ECN design. The mechanisms are still passive and avoid per-flow
	processing.		processing.

	The basic accounting mechanism at each border interface simply		The basic accounting mechanism at each border interface simply
	involves accumulating the volume of packets with positive worth (Re-		involves accumulating the volume of packets with positive worth (Re-
	Echo and FNE), and subtracting the volume of those with negative		Echo and FNE), and subtracting the volume of those with negative
	worth: CE(-1). Even though this mechanism takes no regard of flows,		worth: CE(-1). Even though this mechanism takes no regard of flows,
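The bulk accounting step just described can be sketched as follows, purely as an illustration. The function name, the worth table and the (marking, octets) tuple layout are assumptions made for this sketch. Only the rule itself comes from the text: add the volume of positive packets (Re-Echo and FNE) and subtract the volume of negative packets, CE(-1).

```python
# Worth of each marking for border accounting, per the text:
# Re-Echo and FNE are positive, CE(-1) is negative.
# Any other marking is treated here as having zero worth (an assumption).
WORTH = {"Re-Echo": +1, "FNE": +1, "CE(-1)": -1}


def downstream_congestion_volume(packets):
    """Accumulate signed octet volume over (marking, octets) pairs.

    No flow identifier is consulted: the meter is a single counter per
    border interface.
    """
    return sum(WORTH.get(marking, 0) * octets for marking, octets in packets)


observed = [
    ("Re-Echo", 1500),
    ("FNE", 40),
    ("CE(-1)", 1500),
    ("RECT", 1500),  # neutral label assumed for this sketch, zero worth
]
volume = downstream_congestion_volume(observed)
```

Because the counter needs no per-flow state, it can run passively in parallel with forwarding.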

	skipping to change at page 50, line 33		skipping to change at page 52, line 18
	tend to be dropped before others if routers use the preferential drop		tend to be dropped before others if routers use the preferential drop
	rules in Section 5.3, which discriminate against non-positive		rules in Section 5.3, which discriminate against non-positive
	packets. All networks below the point where a flow goes negative		packets. All networks below the point where a flow goes negative
	(N1, N2 and N4 in this case) have an incentive to remove this flow,		(N1, N2 and N4 in this case) have an incentive to remove this flow,
	but the router where it first goes negative (in N1) can of course		but the router where it first goes negative (in N1) can of course
	remove the problem for everyone downstream.		remove the problem for everyone downstream.

	In the case of DDoS attacks, Section 6.2.1 describes how re-ECN		In the case of DDoS attacks, Section 6.2.1 describes how re-ECN
	mitigates their force.		mitigates their force.


	Note that the guiding principle behind all the above discussion is
	that any gain from subverting the protocol should be precisely
	neutralised, rather than punished. If a gain is punished to a
	greater extent than is sufficient to neutralise it, it will most
	likely open up a new vulnerability, where the amplifying effect of
	the punishment mechanism can be turned on others.

	For instance, if possible, flows should be removed as soon as they go
	negative, but we do NOT RECOMMEND any attempts to discard such flows
	further upstream while they are still positive. Such over-zealous
	push-back is unnecessary and potentially dangerous. These flows have
	paid their `fare' up to the point they go negative, so there is no
	harm in delivering them that far. If someone downstream asks for a
	flow to be dropped as near to the source as possible, because they
	say it is going to become negative later, an upstream node cannot
	test the truth of this assertion. Rather than have to authenticate
	such messages, re-ECN has been designed so that flows can be dropped
	solely based on locally measurable evidence. A message hinting that
	a flow should be watched closely to test for negativity is fine. But
	not a message that claims that a positive flow will go negative
	later, so it should be dropped.

	6.1.7. Inter-domain Fail-safes		6.1.7. Inter-domain Fail-safes

	The mechanisms described so far create incentives for rational		The mechanisms described so far create incentives for rational
	network operators to behave. That is, one operator aims to make		network operators to behave. That is, one operator aims to make
	another behave responsibly by applying penalties and expects a		another behave responsibly by applying penalties and expects a
	rational response (i.e. one that trades off costs against benefits).		rational response (i.e. one that trades off costs against benefits).
	It is usually reasonable to assume that other network operators will		It is usually reasonable to assume that other network operators will
	behave rationally (policy routing can avoid those that might not).		behave rationally (policy routing can avoid those that might not).
	But this approach does not protect against the misconfigurations and		But this approach does not protect against the misconfigurations and
	accidents of other operators.		accidents of other operators.

	skipping to change at page 56, line 36		skipping to change at page 57, line 47
	* ECN `only' gives a performance improvement. Making a product a		* ECN `only' gives a performance improvement. Making a product a
	bit faster (whether the product is a device or a network),		bit faster (whether the product is a device or a network),
	isn't usually a sufficient selling point to be worth the cost		isn't usually a sufficient selling point to be worth the cost
	of co-ordinating across the industry to deploy it. Network		of co-ordinating across the industry to deploy it. Network
	operators tend to avoid re-configuring a working network unless		operators tend to avoid re-configuring a working network unless
	launching a new product.		launching a new product.

	ECN and re-ECN for Edge-to-edge Assured QoS:		ECN and re-ECN for Edge-to-edge Assured QoS:

	We believe the proposal to provide assured QoS sessions using a		We believe the proposal to provide assured QoS sessions using a

	form of ECN called pre-congestion notification (PCN) [CL-deploy]		form of ECN called pre-congestion notification (PCN) [PCN-arch] is
	is most likely to break the deadlock in ECN deployment first. It		most likely to break the deadlock in ECN deployment first. It
	only requires edge-to-edge deployment so it does not require		only requires edge-to-edge deployment so it does not require
	endpoint support. It can be deployed in a single network, then		endpoint support. It can be deployed in a single network, then
	grow incrementally to interconnected networks. And it provides a		grow incrementally to interconnected networks. And it provides a
	different `product' (internetworked assured QoS), rather than		different `product' (internetworked assured QoS), rather than
	merely making an existing product a bit faster.		merely making an existing product a bit faster.

	Not only could this assured QoS application kick-start ECN		Not only could this assured QoS application kick-start ECN
	deployment, it could also carry re-ECN deployment with it; because		deployment, it could also carry re-ECN deployment with it; because
	re-ECN can enable the assured QoS region to expand to a large		re-ECN can enable the assured QoS region to expand to a large
	internetwork where neighbouring networks do not trust each other.		internetwork where neighbouring networks do not trust each other.

	skipping to change at page 63, line 5		skipping to change at page 64, line 15
	to the higher layer and hide how the lower layer does it. However,		to the higher layer and hide how the lower layer does it. However,
	ECN reveals the state of the network layer and below to the transport		ECN reveals the state of the network layer and below to the transport
	layer. A more positive way to describe ECN is that it is like the		layer. A more positive way to describe ECN is that it is like the
	return value of a function call to the network layer. It explicitly		return value of a function call to the network layer. It explicitly
	returns the status of the request to deliver a packet, by returning a		returns the status of the request to deliver a packet, by returning a
	value representing the current risk that a packet will not be served.		value representing the current risk that a packet will not be served.
	Re-ECN has similar semantics, except the transport layer must try to		Re-ECN has similar semantics, except the transport layer must try to
	guess the return value, then it can use the actual return value from		guess the return value, then it can use the actual return value from
	the network layer to modify the next guess.		the network layer to modify the next guess.


			The guiding principle behind all the discussion in Section 6.1.6 on
			Policing is that any gain from subverting the protocol should be
			precisely neutralised, rather than punished. If a gain is punished
			to a greater extent than is sufficient to neutralise it, it will most
			likely open up a new vulnerability, where the amplifying effect of
			the punishment mechanism can be turned on others.

			For instance, if possible, flows should be removed as soon as they go
			negative, but we do NOT RECOMMEND any attempts to discard such flows
			further upstream while they are still positive. Such over-zealous
			push-back is unnecessary and potentially dangerous. These flows have
			paid their `fare' up to the point they go negative, so there is no
			harm in delivering them that far. If someone downstream asks for a
			flow to be dropped as near to the source as possible, because they
			say it is going to become negative later, an upstream node cannot
			test the truth of this assertion. Rather than have to authenticate
			such messages, re-ECN has been designed so that flows can be dropped
			solely based on locally measurable evidence. A message hinting that
			a flow should be watched closely to test for negativity is fine. But
			not a message that claims that a positive flow will go negative
			later, so it should be dropped.

	9. Related Work		9. Related Work

	{Due to lack of time, this section is incomplete. The reader is		{Due to lack of time, this section is incomplete. The reader is
	referred to the Related Work section of [Re-fb] for a brief selection		referred to the Related Work section of [Re-fb] for a brief selection
	of related ideas.}		of related ideas.}

	9.1. Policing Rate Response to Congestion		9.1. Policing Rate Response to Congestion

	ATM network elements send congestion back-pressure		ATM network elements send congestion back-pressure
	messages [ITU-T.I.371] along each connection, duplicating any end to		messages [ITU-T.I.371] along each connection, duplicating any end to

	skipping to change at page 63, line 52		skipping to change at page 65, line 37
	9.2. Congestion Notification Integrity		9.2. Congestion Notification Integrity

	The choice of two ECT code-points in the ECN field [RFC3168]		The choice of two ECT code-points in the ECN field [RFC3168]
	permitted future flexibility, optionally allowing the sender to		permitted future flexibility, optionally allowing the sender to
	encode the experimental ECN nonce [RFC3540] in the packet stream.		encode the experimental ECN nonce [RFC3540] in the packet stream.
	This mechanism has since been included in the specifications of DCCP		This mechanism has since been included in the specifications of DCCP
	[RFC4340].		[RFC4340].

	The ECN nonce is an elegant scheme that allows the sender to detect		The ECN nonce is an elegant scheme that allows the sender to detect
	if someone in the feedback loop - the receiver especially - tries to		if someone in the feedback loop - the receiver especially - tries to

	claim no congestion was experienced when in fact congestion lead to		claim no congestion was experienced when in fact congestion led to
	packet drops or ECN marks. For each packet it sends, the sender		packet drops or ECN marks. For each packet it sends, the sender
	chooses between the two ECT codepoints in a pseudo-random sequence.		chooses between the two ECT codepoints in a pseudo-random sequence.
	Then, whenever the network marks a packet with CE, if the receiver		Then, whenever the network marks a packet with CE, if the receiver
	wants to deny congestion happened, she has to guess which ECT		wants to deny congestion happened, she has to guess which ECT
	codepoint was overwritten. She has only a 50:50 chance of being		codepoint was overwritten. She has only a 50:50 chance of being
	correct each time she denies a congestion mark or a drop, which		correct each time she denies a congestion mark or a drop, which
	ultimately will give her away.		ultimately will give her away.
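The 50:50 argument above can be illustrated with a short sketch. It is not part of either draft; the function names are hypothetical, and it assumes each denial is an independent guess at which of the two ECT codepoints was overwritten, as the paragraph describes.

```python
import random


def undetected_probability(denials: int) -> float:
    """Chance that a receiver guesses the overwritten ECT codepoint
    correctly on every one of `denials` independent denials."""
    return 0.5 ** denials


def simulate(denials: int, trials: int = 100_000, seed: int = 1) -> float:
    """Monte Carlo estimate of the same probability (illustrative only)."""
    rng = random.Random(seed)
    survived = 0
    for _ in range(trials):
        # The sender picks one of two codepoints; the receiver guesses one.
        if all(rng.randrange(2) == rng.randrange(2) for _ in range(denials)):
            survived += 1
    return survived / trials


if __name__ == "__main__":
    for n in (1, 5, 10, 20):
        print(n, undetected_probability(n))
```

Under this assumption the chance of escaping detection halves with every denied mark or drop, which is why persistent suppression is expected to be exposed.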


	The purpose of a network-layer nonce has to be the protection of the		The purpose of a network-layer nonce should primarily be protection
	network in the first place, while a transport-layer nonce had better		of the network, while a transport-layer nonce would be better used to
	be used to protect the sender from cheating receivers. Now, the		protect the sender from cheating receivers. Now, the assumption
	assumption behind the ECN nonce is that a sender will want to detect		behind the ECN nonce is that a sender will want to detect whether a
	whether a receiver is suppressing congestion feedback. This is only		receiver is suppressing congestion feedback. This is only true if
	true if the sender's interests are aligned with the network's, or		the sender's interests are aligned with the network's, or with the
	with the community of users as a whole. This may be true for certain		community of users as a whole. This may be true for certain large
	large senders, who are under close scrutiny and have a reputation to		senders, who are under close scrutiny and have a reputation to
	maintain. But we have to deal with a more hostile world, where		maintain. But we have to deal with a more hostile world, where
	traffic may be dominated by peer-to-peer transfers, rather than		traffic may be dominated by peer-to-peer transfers, rather than
	downloads from a few popular sites. Often the `natural' self-		downloads from a few popular sites. Often the `natural' self-
	interest of a sender is not aligned with the interests of other		interest of a sender is not aligned with the interests of other
	users. It often wishes to transfer data quickly to the receiver as		users. It often wishes to transfer data quickly to the receiver as
	much as the receiver wants the data quickly.		much as the receiver wants the data quickly.

	In contrast, the re-ECN protocol enables policing of an agreed rate-		In contrast, the re-ECN protocol enables policing of an agreed rate-
	response to congestion (e.g. TCP-friendliness) at the sender's		response to congestion (e.g. TCP-friendliness) at the sender's
	interface with the internetwork. It also ensures downstream networks		interface with the internetwork. It also ensures downstream networks

	skipping to change at page 66, line 16		skipping to change at page 67, line 49
	rather wastefully to encode just five states. In effect the RE flag		rather wastefully to encode just five states. In effect the RE flag
	has been used as an orthogonal single bit, using up four codepoints		has been used as an orthogonal single bit, using up four codepoints
	to encode the three states of positive, neutral and negative worth.		to encode the three states of positive, neutral and negative worth.
	The mapping of the codepoints in an earlier version of this proposal		The mapping of the codepoints in an earlier version of this proposal
	used the codepoint space more efficiently, but the scheme became		used the codepoint space more efficiently, but the scheme became
	vulnerable to network operators bypassing congestion penalties by		vulnerable to network operators bypassing congestion penalties by
	focusing congestion marking on positive packets. Appendix B explains		focusing congestion marking on positive packets. Appendix B explains
	why fixing that problem while allowing for incremental deployment,		why fixing that problem while allowing for incremental deployment,
	would have used another codepoint anyway. So it was better to use		would have used another codepoint anyway. So it was better to use
	this orthogonal encoding scheme, which greatly simplified the whole		this orthogonal encoding scheme, which greatly simplified the whole

	protocol and brought with it some subtle security benefits.		protocol and brought with it some subtle security benefits (see the
			last paragraph of Appendix B).

	With the scheme as now proposed, once the RE flag is set or cleared		With the scheme as now proposed, once the RE flag is set or cleared
	by the sender or its proxy, it should not be written by the network,		by the sender or its proxy, it should not be written by the network,

	only read. So the gateways can detect if any network maliciously		only read. So the endpoints can detect if any network maliciously
	alters the RE flag. IPSec AH integrity checking does not cover the		alters the RE flag. IPSec AH integrity checking does not cover the
	IPv4 option flags (they were considered mutable---even the one we		IPv4 option flags (they were considered mutable---even the one we
	propose using for the RE flag that was `currently unused' when IPSec		propose using for the RE flag that was `currently unused' when IPSec

	was defined). But it would be sufficient for a pair of gateways to		was defined). But it would be sufficient for a pair of endpoints to
	make random checks on whether the RE flag was the same when it		make random checks on whether the RE flag was the same when it

	reached the egress gateway as when it left the ingress. Indeed, if		reached the egress as when it left the ingress. Indeed, if IPSec AH
	IPSec AH had covered the RE flag, any network intending to alter		had covered the RE flag, any network intending to alter sufficient RE
	sufficient RE flags to make a gain would have focused its alterations		flags to make a gain would have focused its alterations on packets
	on packets without authenticating headers (AHs).		without authenticating headers (AHs).

	The security of re-ECN has been deliberately designed to not rely on		The security of re-ECN has been deliberately designed to not rely on
	cryptography.		cryptography.

	11. IANA Considerations		11. IANA Considerations

	This memo includes no request to IANA (yet).		This memo includes no request to IANA (yet).

	If this memo were to progress to standards track, it would list:		If this memo were to progress to standards track, it would list:


	skipping to change at page 68, line 42		skipping to change at page 70, line 28
	Internet to Support Real-Time Content Supply from a Large		Internet to Support Real-Time Content Supply from a Large
	Fraction of Broadband Residential Users", BT Technology		Fraction of Broadband Residential Users", BT Technology
	Journal (BTTJ) 23(2), April 2005.		Journal (BTTJ) 23(2), April 2005.

	[Bauer06] Bauer, S., Faratin, P., and R. Beverly, "Assessing the		[Bauer06] Bauer, S., Faratin, P., and R. Beverly, "Assessing the
	assumptions underlying mechanism design for the Internet",		assumptions underlying mechanism design for the Internet",
	Proc. Workshop on the Economics of Networked Systems		Proc. Workshop on the Economics of Networked Systems
	(NetEcon06), June 2006, <http://www.cs.duke.edu/nicl/		(NetEcon06), June 2006, <http://www.cs.duke.edu/nicl/
	netecon06/papers/ne06-assessing.pdf>.		netecon06/papers/ne06-assessing.pdf>.


	[CL-deploy]
	Briscoe, B., Eardley, P., Songhurst, D., Le Faucheur, F.,
	Charny, A., Babiarz, J., Chan, K., Westberg, L., Bader,
	A., and G. Karagiannis, "A Deployment Model for Admission
	Control over DiffServ using Pre-Congestion Notification",
	draft-briscoe-tsvwg-cl-architecture-03 (work in progress),
	June 2006.

	[CLoop_pol]		[CLoop_pol]
	Salvatori, A., "Closed Loop Traffic Policing", Politecnico		Salvatori, A., "Closed Loop Traffic Policing", Politecnico
	Torino and Institut Eurecom Masters Thesis,		Torino and Institut Eurecom Masters Thesis,
	September 2005.		September 2005.

	[ECN-Deploy]		[ECN-Deploy]
	Floyd, S., "ECN (Explicit Congestion Notification) in		Floyd, S., "ECN (Explicit Congestion Notification) in
	TCP/IP; Implementation and Deployment of ECN", Web-page,		TCP/IP; Implementation and Deployment of ECN", Web-page,
	May 2004,		May 2004,
	<http://www.icir.org/floyd/ecn.html#implementations>.		<http://www.icir.org/floyd/ecn.html#implementations>.

	[ECN-MPLS]		[ECN-MPLS]

	Bruce, B., Briscoe, B., and J. Tay, "Explicit Congestion		Davie, B., Briscoe, B., and J. Tay, "Explicit Congestion
	Marking in MPLS", draft-davie-ecn-mpls-00 (work in		Marking in MPLS", draft-ietf-tsvwg-ecn-mpls-01 (work in
	progress), June 2006.		progress), June 2007.

			[ECN-tunnel]
			Briscoe, B., "Layered Encapsulation of Congestion
			Notification", draft-briscoe-tsvwg-ecn-tunnel-00 (work in
			progress), July 2007.

	[Evol_cc] Gibbens, R. and F. Kelly, "Resource pricing and the		[Evol_cc] Gibbens, R. and F. Kelly, "Resource pricing and the
	evolution of congestion control", Automatica 35(12)1969--		evolution of congestion control", Automatica 35(12)1969--
	1985, December 1999,		1985, December 1999,
	<http://www.statslab.cam.ac.uk/~frank/evol.html>.		<http://www.statslab.cam.ac.uk/~frank/evol.html>.


	[I-D.ietf-tsvwg-ecnsyn]		[I-D.ietf-tcpm-ecnsyn]
	Kuzmanovic, A., "Adding Explicit Congestion Notification		Kuzmanovic, A., "Adding Explicit Congestion Notification
	(ECN) Capability to TCP's SYN/ACK Packets",		(ECN) Capability to TCP's SYN/ACK Packets",

	draft-ietf-tsvwg-ecnsyn-00 (work in progress),		draft-ietf-tcpm-ecnsyn-01 (work in progress),
	November 2005.		October 2006.

			[I-D.moncaster-tcpm-rcv-cheat]
			Moncaster, T., "A TCP Test to Allow Senders to Identify
			Receiver Non-Compliance",
			draft-moncaster-tcpm-rcv-cheat-01 (work in progress),
			June 2007.

	[ITU-T.I.371]		[ITU-T.I.371]
	ITU-T, "Traffic Control and Congestion Control in		ITU-T, "Traffic Control and Congestion Control in
	{B-ISDN}", ITU-T Rec. I.371 (03/04), March 2004.		{B-ISDN}", ITU-T Rec. I.371 (03/04), March 2004.

	[Jiang02] Jiang, H. and D. Dovrolis, "The Macroscopic Behavior of		[Jiang02] Jiang, H. and D. Dovrolis, "The Macroscopic Behavior of
	the TCP Congestion Avoidance Algorithm", ACM SIGCOMM		the TCP Congestion Avoidance Algorithm", ACM SIGCOMM
	CCR 32(3)75-88, July 2002,		CCR 32(3)75-88, July 2002,
	<http://doi.acm.org/10.1145/571697.571725>.		<http://doi.acm.org/10.1145/571697.571725>.

	[Mathis97]		[Mathis97]
	Mathis, M., Semke, J., Mahdavi, J., and T. Ott, "The		Mathis, M., Semke, J., Mahdavi, J., and T. Ott, "The
	Macroscopic Behavior of the TCP Congestion Avoidance		Macroscopic Behavior of the TCP Congestion Avoidance
	Algorithm", ACM SIGCOMM CCR 27(3)67--82, July 1997,		Algorithm", ACM SIGCOMM CCR 27(3)67--82, July 1997,
	<http://doi.acm.org/10.1145/263932.264023>.		<http://doi.acm.org/10.1145/263932.264023>.


			[PCN-arch]
			Eardley, P., Babiarz, J., Chan, K., Charny, A., Geib, R.,
			Karagiannis, G., Menth, M., and T. Tsou, "Pre-Congestion
			Notification Architecture",
			draft-eardley-pcn-architecture-00 (work in progress),
			June 2007.

	[Purple] Pletka, R., Waldvogel, M., and S. Mannal, "PURPLE:		[Purple] Pletka, R., Waldvogel, M., and S. Mannal, "PURPLE:
	Predictive Active Queue Management Utilizing Congestion		Predictive Active Queue Management Utilizing Congestion
	Information", Proc. Local Computer Networks (LCN 2003),		Information", Proc. Local Computer Networks (LCN 2003),
	October 2003.		October 2003.

	[RFC2208] Mankin, A., Baker, F., Braden, B., Bradner, S., O'Dell,		[RFC2208] Mankin, A., Baker, F., Braden, B., Bradner, S., O'Dell,
	M., Romanow, A., Weinrib, A., and L. Zhang, "Resource		M., Romanow, A., Weinrib, A., and L. Zhang, "Resource
	ReSerVation Protocol (RSVP) Version 1 Applicability		ReSerVation Protocol (RSVP) Version 1 Applicability
	Statement Some Guidelines on Deployment", RFC 2208,		Statement Some Guidelines on Deployment", RFC 2208,
	September 1997.		September 1997.

	skipping to change at page 70, line 33		skipping to change at page 72, line 30
	RFC 3514, April 2003.		RFC 3514, April 2003.

	[RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit		[RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit
	Congestion Notification (ECN) Signaling with Nonces",		Congestion Notification (ECN) Signaling with Nonces",
	RFC 3540, June 2003.		RFC 3540, June 2003.

	[RFC3714] Floyd, S. and J. Kempf, "IAB Concerns Regarding Congestion		[RFC3714] Floyd, S. and J. Kempf, "IAB Concerns Regarding Congestion
	Control for Voice Traffic in the Internet", RFC 3714,		Control for Voice Traffic in the Internet", RFC 3714,
	March 2004.		March 2004.


			[RFC4301] Kent, S. and K. Seo, "Security Architecture for the
			Internet Protocol", RFC 4301, December 2005.

	[Re-PCN] Briscoe, B., "Emulating Border Flow Policing using Re-ECN		[Re-PCN] Briscoe, B., "Emulating Border Flow Policing using Re-ECN
	on Bulk Data", draft-briscoe-tsvwg-re-ecn-border-cheat-01		on Bulk Data", draft-briscoe-tsvwg-re-ecn-border-cheat-01
	(work in progress), March 2006.		(work in progress), March 2006.

	[Re-fb] Briscoe, B., Jacquet, A., Di Cairano-Gilfedder, C.,		[Re-fb] Briscoe, B., Jacquet, A., Di Cairano-Gilfedder, C.,
	Salvatori, A., Soppera, A., and M. Koyabe, "Policing		Salvatori, A., Soppera, A., and M. Koyabe, "Policing
	Congestion Response in an Internetwork Using Re-Feedback",		Congestion Response in an Internetwork Using Re-Feedback",
	ACM SIGCOMM CCR 35(4)277--288, August 2005, <http://		ACM SIGCOMM CCR 35(4)277--288, August 2005, <http://
	www.acm.org/sigs/sigcomm/sigcomm2005/		www.acm.org/sigs/sigcomm/sigcomm2005/
	techprog.html#session8>.		techprog.html#session8>.


			[Savage99]
			Savage, S., Cardwell, N., Wetherall, D., and T. Anderson,
			"TCP congestion control with a misbehaving receiver", ACM
			SIGCOMM CCR 29(5), October 1999,
			<http://citeseer.ist.psu.edu/savage99tcp.html>.

	[Smart_rtg]		[Smart_rtg]
	Goldenberg, D., Qiu, L., Xie, H., Yang, Y., and Y. Zhang,		Goldenberg, D., Qiu, L., Xie, H., Yang, Y., and Y. Zhang,
	"Optimizing Cost and Performance for Multihoming", ACM		"Optimizing Cost and Performance for Multihoming", ACM
	SIGCOMM CCR 34(4)79--92, October 2004,		SIGCOMM CCR 34(4)79--92, October 2004,
	<http://citeseer.ist.psu.edu/698472.html>.		<http://citeseer.ist.psu.edu/698472.html>.

	[Steps_DoS]		[Steps_DoS]
	Handley, M. and A. Greenhalgh, "Steps towards a DoS-		Handley, M. and A. Greenhalgh, "Steps towards a DoS-
	resistant Internet Architecture", Proc. ACM SIGCOMM		resistant Internet Architecture", Proc. ACM SIGCOMM
	workshop on Future directions in network architecture		workshop on Future directions in network architecture

	skipping to change at page 75, line 38		skipping to change at page 77, line 43

	Appendix E. Example Egress Dropper Algorithm		Appendix E. Example Egress Dropper Algorithm

	{ToDo: Write up the basic algorithm with flow state, then the		{ToDo: Write up the basic algorithm with flow state, then the
	aggregated one.}		aggregated one.}

	Appendix F. Re-TTL		Appendix F. Re-TTL

	This Appendix gives an overview of a proposal to be able to overload		This Appendix gives an overview of a proposal to be able to overload
	the TTL field in the IP header to monitor downstream propagation		the TTL field in the IP header to monitor downstream propagation

	delay. It is planned to fully write up this proposal in a future		delay. This is included to show that it would be possible to take
	Internet Draft.		account of RTT if it was deemed desirable.

	Delay re-feedback can be achieved by overloading the TTL field,		Delay re-feedback can be achieved by overloading the TTL field,
	without changing IP or router TTL processing. A target value for TTL		without changing IP or router TTL processing. A target value for TTL
	at the destination would need standardising, say 16. If the path hop		at the destination would need standardising, say 16. If the path hop
	count increased by more than 16 during a routing change, it would		count increased by more than 16 during a routing change, it would
	temporarily be mistaken for a routing loop, so this target would need		temporarily be mistaken for a routing loop, so this target would need
	to be chosen to exceed typical hop count increases. The TCP wire		to be chosen to exceed typical hop count increases. The TCP wire
	protocol and handlers would need modifying to feed back the		protocol and handlers would need modifying to feed back the
	destination TTL and initialise it. It would be necessary to		destination TTL and initialise it. It would be necessary to
	standardise the unit of TTL in terms of real time (as was the		standardise the unit of TTL in terms of real time (as was the

	skipping to change at page 77, line 38		skipping to change at page 79, line 43
	o r = C_FNE/T_FNE		o r = C_FNE/T_FNE

	o b_max = b_0		o b_max = b_0

	T_FNE should be a much shorter period than T_user: for instance T_FNE		T_FNE should be a much shorter period than T_user: for instance T_FNE
	could be in the order of minutes while T_user could be in the order of		could be in the order of minutes while T_user could be in the order of
	weeks.		weeks.

	G.2. Per-flow Rate Policing		G.2. Per-flow Rate Policing


	Per-flow policing aims to enforce congestion responsiveness on the		Whilst we believe that simple per-user policing would be sufficient
	shortest information timescale on a network path: packet roundtrips.		to ensure senders comply with congestion control, some operators may
			wish to police the rate response of each flow to congestion as well.
			Although we do not believe this will be necessary, we include this
			section to show how one could perform per-flow policing using
			enforcement of TCP-fairness as an example. Per-flow policing aims to
			enforce congestion responsiveness on the shortest information
			timescale on a network path: packet roundtrips.

	This again requires that the appropriate terms be agreed between a		This again requires that the appropriate terms be agreed between a
	network operator and its users, where a congestion responsiveness		network operator and its users, where a congestion responsiveness
	policy might be required for the use of a given network service		policy might be required for the use of a given network service
	(perhaps unless the user specifically requests otherwise).		(perhaps unless the user specifically requests otherwise).

	As an example, we describe below how a rate adaptation policer can be		As an example, we describe below how a rate adaptation policer can be
	designed when the applicable rate adaptation policy is TCP-		designed when the applicable rate adaptation policy is TCP-
	compliance. In that context, the average throughput of a flow will		compliance. In that context, the average throughput of a flow will
	be expected to be bounded by the value of the TCP throughput during		be expected to be bounded by the value of the TCP throughput during

	congestion avoidance, given n Mathis' formula [Mathis97]		congestion avoidance, given in Mathis' formula [Mathis97]

	x_TCP = k * s / ( T * sqrt(m) )		x_TCP = k * s / ( T * sqrt(m) )

	where:		where:

	o x_TCP is the throughput of the TCP flow in packets per second,		o x_TCP is the throughput of the TCP flow in packets per second,

	o k is a constant upper-bounded by sqrt(3/2),		o k is a constant upper-bounded by sqrt(3/2),

	o s is the average packet size of the flow,		o s is the average packet size of the flow,
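As a worked illustration of the bound x_TCP = k * s / ( T * sqrt(m) ), the following sketch evaluates it numerically. It is not taken from the draft. The definitions of T and m are not visible in this excerpt, so the sketch assumes the usual reading of Mathis' formula: T is the round-trip time in seconds and m is the fraction of packets marked or dropped. The function name and default value are hypothetical.

```python
import math

# Upper bound on the constant k, as stated in the text.
K_MAX = math.sqrt(3.0 / 2.0)


def tcp_throughput_bound(s: float, T: float, m: float, k: float = K_MAX) -> float:
    """Evaluate k * s / (T * sqrt(m)).

    s: average packet size (use 1 to read the result in packets)
    T: assumed to be the round-trip time in seconds
    m: assumed to be the congestion marking/drop fraction, 0 < m <= 1
    """
    if T <= 0:
        raise ValueError("T must be positive")
    if not 0 < m <= 1:
        raise ValueError("m must be in (0, 1]")
    return k * s / (T * math.sqrt(m))


if __name__ == "__main__":
    # 100 ms round trip, 1% congestion marking, result per unit packet size.
    print(tcp_throughput_bound(1, 0.1, 0.01))
```

Under these assumptions the bound scales with 1/sqrt(m), so quadrupling the marking fraction halves the rate a compliant flow may sustain.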


	skipping to change at page 81, line 8		skipping to change at page 83, line 20

	H.2. Inflation Factor for Persistently Negative Flows		H.2. Inflation Factor for Persistently Negative Flows

	The following process is suggested to complement the simple algorithm		The following process is suggested to complement the simple algorithm
	above in order to protect against the various attacks from		above in order to protect against the various attacks from
	persistently negative flows described in Section 6.1.6. As explained		persistently negative flows described in Section 6.1.6. As explained
	in that section, the most important and first step is to estimate the		in that section, the most important and first step is to estimate the
	contribution of persistently negative flows to the bulk volume of		contribution of persistently negative flows to the bulk volume of
	downstream pre-congestion and to inflate this bulk volume as if these		downstream pre-congestion and to inflate this bulk volume as if these
	flows weren't there. The process below has been designed to give an		flows weren't there. The process below has been designed to give an

	unboased estimate, but it may be possible to define other processes		unbiased estimate, but it may be possible to define other processes
	that achieve similar ends.		that achieve similar ends.

	While the above simple metering algorithm is counting the bulk of		While the above simple metering algorithm is counting the bulk of
	traffic over an accounting period, the meter should also select a		traffic over an accounting period, the meter should also select a
	subset of the whole flow ID space that is small enough to be able to		subset of the whole flow ID space that is small enough to be able to
	realistically measure but large enough to give a realistic sample.		realistically measure but large enough to give a realistic sample.
	Many different samples of different subsets of the ID space should be		Many different samples of different subsets of the ID space should be
	taken at different times during the accounting period, preferably		taken at different times during the accounting period, preferably
	covering the whole ID space. During each sample, the meter should		covering the whole ID space. During each sample, the meter should
	count the volume of positive packets and subtract the volume of		count the volume of positive packets and subtract the volume of

	skipping to change at page 81, line 45		skipping to change at page 84, line 13
	by the effect of persistently negative flows.		by the effect of persistently negative flows.

	Appendix I. Argument for holding back the ECN nonce		Appendix I. Argument for holding back the ECN nonce

	The ECN nonce is a mechanism that allows a /sending/ transport to		The ECN nonce is a mechanism that allows a /sending/ transport to
	detect if drop or ECN marking at a congested router has been		detect if drop or ECN marking at a congested router has been
	suppressed by a node somewhere in the feedback loop---another router		suppressed by a node somewhere in the feedback loop---another router
	or the receiver.		or the receiver.

	Space for the ECN nonce was set aside in [RFC3168] (currently		Space for the ECN nonce was set aside in [RFC3168] (currently

	proposed standard) while the full nonce mechanism is specified in RFC		proposed standard) while the full nonce mechanism is specified in
	3540 (currently experimental). The specification for [RFC4340]		[RFC3540] (currently experimental). The specification for [RFC4340]
	(currently proposed standard) requires that "Each DCCP sender SHOULD		(currently proposed standard) requires that "Each DCCP sender SHOULD
	set ECN Nonces on its packets...". It also mandates as a requirement		set ECN Nonces on its packets...". It also mandates as a requirement
	for all CCID profiles that "Any newly defined acknowledgement		for all CCID profiles that "Any newly defined acknowledgement
	mechanism MUST include a way to transmit ECN Nonce Echoes back to the		mechanism MUST include a way to transmit ECN Nonce Echoes back to the
	sender.", therefore:		sender.", therefore:

	o The CCID profile for TCP-like Congestion Control [RFC4341]		o The CCID profile for TCP-like Congestion Control [RFC4341]
	(currently proposed standard) says "The sender will use the ECN		(currently proposed standard) says "The sender will use the ECN
	Nonce for data packets, and the receiver will echo those nonces in		Nonce for data packets, and the receiver will echo those nonces in
	its Ack Vectors."		its Ack Vectors."

	o The CCID profile for TCP-Friendly Rate Control (TFRC) [RFC4342]		o The CCID profile for TCP-Friendly Rate Control (TFRC) [RFC4342]
	recommends that "The sender [use] Loss Intervals options' ECN		recommends that "The sender [use] Loss Intervals options' ECN
	Nonce Echoes (and possibly any Ack Vectors' ECN Nonce Echoes) to		Nonce Echoes (and possibly any Ack Vectors' ECN Nonce Echoes) to
	probabilistically verify that the receiver is correctly reporting		probabilistically verify that the receiver is correctly reporting
	all dropped or marked packets."		all dropped or marked packets."


	The ECN nonce is used for three types of functions:		The primary function of the ECN nonce is to protect the integrity of
			the information about congestion: ECN marks and packet drops.
	o if the sender wants to ensure the integrity of the information
	about packet drops,

	o if the sending transport chooses to act in the interests of a
	congested router,

	o if the sending transport wants to allocate its own resources in
	proportion to the rates that each network path can sustain, based
	on congestion control.

	However, when the nonce is used to protect the integrity of		However, when the nonce is used to protect the integrity of
	information about packet drops, rather than ECN marks, a transport		information about packet drops, rather than ECN marks, a transport
	layer nonce will always be sufficient (because a drop loses the		layer nonce will always be sufficient (because a drop loses the
	transport header as well as the ECN field in the network header),		transport header as well as the ECN field in the network header),
	which would avoid using scarce IP header codepoint space. Similarly,		which would avoid using scarce IP header codepoint space. Similarly,
	a transport layer nonce would protect against a receiver sending		a transport layer nonce would protect against a receiver sending

	early acknowledgements.		early acknowledgements [Savage99].


	The other two functions need the ECN nonce to be in the network		If the ECN nonce reveals integrity problems with the information
	layer, but both require rather optimistic trust assumptions in order		about congestion, the sending transport can use that knowledge for
	to be useful. If the sending transport chooses to act in the		two functions:
	interests of a congested router, it can reduce its rate if it detects
	some malicious party in the feedback loop may be suppressing ECN		o to protect its own resources, by allocating them in proportion to
	feedback. But it would only be useful to a router when /all/ senders		the rates that each network path can sustain, based on congestion
	using the router are trusted to act in the router's interest.		control,

			o and to protect congested routers in the network, by drastically
			slowing down its connection to the destination with corrupt
			congestion information.

			If the sending transport chooses to act in the interests of congested
			routers, it can reduce its rate if it detects some malicious party in
			the feedback loop may be suppressing ECN feedback. But it would only
			be useful to congested routers when /all/ senders using them are
			trusted to act in the interest of the congested routers.

	In the end, the only essential use of a network layer nonce is when		In the end, the only essential use of a network layer nonce is when
	sending transports (e.g. large servers) want to allocate their /own/		sending transports (e.g. large servers) want to allocate their /own/
	resources in proportion to the rates that each network path can		resources in proportion to the rates that each network path can
	sustain, based on congestion control. In that case, the nonce allows		sustain, based on congestion control. In that case, the nonce allows
	senders to be assured that they aren't being duped into giving more		senders to be assured that they aren't being duped into giving more
	of their own resources to a particular flow. And if congestion		of their own resources to a particular flow. And if congestion
	suppression is detected, the sending transport can rate limit the		suppression is detected, the sending transport can rate limit the
	offending connection to protect its own resources. Certainly, this		offending connection to protect its own resources. Certainly, this
	is a useful function, but the IETF should carefully decide whether		is a useful function, but the IETF should carefully decide whether

	skipping to change at page 83, line 17		skipping to change at page 85, line 31

	In contrast, re-ECN allows all routers to fully protect themselves		In contrast, re-ECN allows all routers to fully protect themselves
	from such attacks, without having to trust anyone - senders,		from such attacks, without having to trust anyone - senders,
	receivers, neighbouring networks. Re-ECN is therefore proposed in		receivers, neighbouring networks. Re-ECN is therefore proposed in
	preference to the ECN nonce on the basis that it addresses the		preference to the ECN nonce on the basis that it addresses the
	generic problem of accountability for congestion of a network's		generic problem of accountability for congestion of a network's
	resources at the IP layer.		resources at the IP layer.

	Delaying the ECN nonce is justified because the applicability of the		Delaying the ECN nonce is justified because the applicability of the
	ECN nonce seems too limited for it to consume a two-bit codepoint in		ECN nonce seems too limited for it to consume a two-bit codepoint in

	the IP header.		the IP header. It therefore seems prudent to give time for an
			alternative way to be found to do the one function the nonce is
			essential for.

	Moreover, while we have re-designed the re-ECN codepoints so that		Moreover, while we have re-designed the re-ECN codepoints so that
	they do not prevent the ECN nonce progressing, the same is not true		they do not prevent the ECN nonce progressing, the same is not true
	the other way round. If the ECN nonce started to see some deployment		the other way round. If the ECN nonce started to see some deployment
	(perhaps because it was blessed with proposed standard status),		(perhaps because it was blessed with proposed standard status),
	incremental deployment of re-ECN would effectively be impossible,		incremental deployment of re-ECN would effectively be impossible,
	because re-ECN marking fractions at inter-domain borders would be		because re-ECN marking fractions at inter-domain borders would be
	polluted by unknown levels of nonce traffic.		polluted by unknown levels of nonce traffic.

	The authors are aware that re-ECN must prove it has the potential it		The authors are aware that re-ECN must prove it has the potential it

	skipping to change at page 84, line 22		skipping to change at page 86, line 36
	Email: arnaud.jacquet@bt.com		Email: arnaud.jacquet@bt.com
	URI:		URI:

	Alessandro Salvatori		Alessandro Salvatori
	BT		BT
	B54/77, Adastral Park		B54/77, Adastral Park
	Martlesham Heath		Martlesham Heath
	Ipswich IP5 3RE		Ipswich IP5 3RE
	UK		UK


	Email: sandr8@gmail.com		Email: alessandro.salvatori@gmail.com

	Martin Koyabe		Martin Koyabe
	BT		BT

	B54/69, Adastral Park		PP2a Rigel House, Adastral Park
	Martlesham Heath		Martlesham Heath
	Ipswich IP5 3RE		Ipswich IP5 3RE
	UK		UK

	Phone: +44 1473 646923		Phone: +44 1473 646923
	Email: martin.koyabe@bt.com		Email: martin.koyabe@bt.com
	URI:		URI:


			Toby Moncaster
			BT
			B54/70, Adastral Park
			Martlesham Heath
			Ipswich IP5 3RE
			UK

			Phone: +44 1473 648734
			Email: toby.moncaster@bt.com

	Full Copyright Statement		Full Copyright Statement


	Copyright (C) The Internet Society (2006).		Copyright (C) The IETF Trust (2007).

	This document is subject to the rights, licenses and restrictions		This document is subject to the rights, licenses and restrictions
	contained in BCP 78, and except as set forth therein, the authors		contained in BCP 78, and except as set forth therein, the authors
	retain all their rights.		retain all their rights.

	This document and the information contained herein are provided on an		This document and the information contained herein are provided on an
	"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS		"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS

	OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET		OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
	ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,		THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
	INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE		OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
	INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED		THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
	WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.		WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

	Intellectual Property		Intellectual Property

	The IETF takes no position regarding the validity or scope of any		The IETF takes no position regarding the validity or scope of any
	Intellectual Property Rights or other rights that might be claimed to		Intellectual Property Rights or other rights that might be claimed to
	pertain to the implementation or use of the technology described in		pertain to the implementation or use of the technology described in
	this document or the extent to which any license under such rights		this document or the extent to which any license under such rights
	might or might not be available; nor does it represent that it has		might or might not be available; nor does it represent that it has
	made any independent effort to identify any such rights. Information		made any independent effort to identify any such rights. Information

	skipping to change at page 85, line 45		skipping to change at page 88, line 45
	such proprietary rights by implementers or users of this		such proprietary rights by implementers or users of this
	specification can be obtained from the IETF on-line IPR repository at		specification can be obtained from the IETF on-line IPR repository at
	http://www.ietf.org/ipr.		http://www.ietf.org/ipr.

	The IETF invites any interested party to bring to its attention any		The IETF invites any interested party to bring to its attention any
	copyrights, patents or patent applications, or other proprietary		copyrights, patents or patent applications, or other proprietary
	rights that may cover technology that may be required to implement		rights that may cover technology that may be required to implement
	this standard. Please address the information to the IETF at		this standard. Please address the information to the IETF at
	ietf-ipr@ietf.org.		ietf-ipr@ietf.org.


	Acknowledgment		Acknowledgments

	Funding for the RFC Editor function is provided by the IETF		Funding for the RFC Editor function is provided by the IETF

	Administrative Support Activity (IASA).		Administrative Support Activity (IASA). This document was produced
			using xml2rfc v1.32 (of http://xml.resource.org/) from a source in
			RFC-2629 XML format.

End of changes. 87 change blocks.
	308 lines changed or deleted		422 lines changed or added
This html diff was produced by rfcdiff 1.33. The latest version is available from http://tools.ietf.org/tools/rfcdiff/