Diff: draft-briscoe-tsvwg-re-ecn-border-cheat-01.txt - draft-briscoe-re-pcn-border-cheat-00.txt

	draft-briscoe-tsvwg-re-ecn-border-cheat-01.txt		draft-briscoe-re-pcn-border-cheat-00.txt


	Transport Area Working Group B. Briscoe		PCN Working Group B. Briscoe
	Internet-Draft BT & UCL		Internet-Draft BT & UCL

	Expires: December 28, 2006 June 26, 2006		Intended status: Informational June 30, 2007
			Expires: January 1, 2008

	Emulating Border Flow Policing using Re-ECN on Bulk Data		Emulating Border Flow Policing using Re-ECN on Bulk Data

	draft-briscoe-tsvwg-re-ecn-border-cheat-01		draft-briscoe-re-pcn-border-cheat-00

	Status of this Memo		Status of this Memo

	By submitting this Internet-Draft, each author represents that any		By submitting this Internet-Draft, each author represents that any
	applicable patent or other IPR claims of which he or she is aware		applicable patent or other IPR claims of which he or she is aware
	have been or will be disclosed, and any of which he or she becomes		have been or will be disclosed, and any of which he or she becomes
	aware will be disclosed, in accordance with Section 6 of BCP 79.		aware will be disclosed, in accordance with Section 6 of BCP 79.

	Internet-Drafts are working documents of the Internet Engineering		Internet-Drafts are working documents of the Internet Engineering
	Task Force (IETF), its areas, and its working groups. Note that		Task Force (IETF), its areas, and its working groups. Note that

	skipping to change at page 1, line 33		skipping to change at page 1, line 34
	and may be updated, replaced, or obsoleted by other documents at any		and may be updated, replaced, or obsoleted by other documents at any
	time. It is inappropriate to use Internet-Drafts as reference		time. It is inappropriate to use Internet-Drafts as reference
	material or to cite them other than as "work in progress."		material or to cite them other than as "work in progress."

	The list of current Internet-Drafts can be accessed at		The list of current Internet-Drafts can be accessed at
	http://www.ietf.org/ietf/1id-abstracts.txt.		http://www.ietf.org/ietf/1id-abstracts.txt.

	The list of Internet-Draft Shadow Directories can be accessed at		The list of Internet-Draft Shadow Directories can be accessed at
	http://www.ietf.org/shadow.html.		http://www.ietf.org/shadow.html.


	This Internet-Draft will expire on December 28, 2006.		This Internet-Draft will expire on January 1, 2008.

	Copyright Notice		Copyright Notice


	Copyright (C) The Internet Society (2006).		Copyright (C) The IETF Trust (2007).

	Abstract		Abstract

	Scaling per flow admission control to the Internet is a hard problem.		Scaling per flow admission control to the Internet is a hard problem.
	A recently proposed approach combines Diffserv and pre-congestion		A recently proposed approach combines Diffserv and pre-congestion
	notification (PCN) to provide a service slightly better than Intserv		notification (PCN) to provide a service slightly better than Intserv
	controlled load. It scales to networks of any size, but only if		controlled load. It scales to networks of any size, but only if
	domains trust each other to comply with admission control and rate		domains trust each other to comply with admission control and rate
	policing. This memo claims to solve this trust problem without		policing. This memo claims to solve this trust problem without
	losing scalability. It describes bulk border policing that provides		losing scalability. It describes bulk border policing that provides
	a sufficient emulation of per-flow policing with the help of another		a sufficient emulation of per-flow policing with the help of another
	recently proposed extension to ECN, involving re-echoing ECN feedback		recently proposed extension to ECN, involving re-echoing ECN feedback
	(re-ECN). With only passive bulk measurements at borders, sanctions		(re-ECN). With only passive bulk measurements at borders, sanctions
	can be applied against cheating networks.		can be applied against cheating networks.

	Status (to be removed by the RFC Editor)		Status (to be removed by the RFC Editor)

	This memo is posted as an Internet-Draft with the intent to		This memo is posted as an Internet-Draft with the intent to

	eventually progress to informational status. It is envisaged that		eventually be broken down into two documents: one for the standards
	the necessary standards actions to realise the system described would		track and one for informational status. But until it becomes an item
	sit in three other documents currently being discussed (but not on		of IETF working group business the whole proposal has been kept
	the standards track) in the IETF Transport Area [Re-TCP], [RSVP-ECN]		together to aid understanding. Only the text of Section 4 of this
	& [PCN]. The authors seek comments from the Internet community on		document requires standardisation. The rest of the sections describe
	whether combining PCN and re-ECN is a sufficient solution to the		how a system might be built from these protocols by the operators of
	admission control problem.		an internetwork. Note in particular that the policing and monitoring
			functions proposed for the trust boundaries between operators would
			not need standardisation by the IETF. They simply represent one way
			that the proposed protocols could be used to extend the PCN
			architecture [PCN-arch] to span multiple domains without mutual trust
			between the operators.

			To realise the system described, this document also depends on
			standardisation of three other documents currently being discussed
			(but not on the standards track) in the IETF Transport Area: pre-
			congestion notification (PCN) marking on interior nodes [PCN];
			feedback of aggregate PCN measurements by suitably extending the
			admission control signalling protocol (e.g. RSVP) [RSVP-ECN]; and
			re-insertion of the feedback into the forward stream of IP packets by
			the PCN ingress gateway in a similar way to that proposed for a TCP
			source [Re-TCP].

			The authors seek comments from the Internet community on whether
			combining PCN and re-ECN in this way is a sufficient solution to the
			problem of scaling microflow admission control to the Internet as a
			whole, even though such scaling must take account of the increasing
			numbers of networks and users who may all have conflicting interests.

	Changes from previous drafts (to be removed by the RFC Editor)		Changes from previous drafts (to be removed by the RFC Editor)


	From -00 to -01:		Changes in this version <draft-briscoe-re-pcn-border-cheat-00>
			relative to the last <draft-briscoe-tsvwg-re-ecn-border-cheat-01>:

			Changed filename to associate it with the new IETF PCN w-g, rather
			than the TSVWG w-g.

			Introduction: Clarified that bulk policing only replaces per-flow
			policing at interior inter-domain borders, while per-flow policing
			is still needed at the access interface to the internetwork. Also
			clarified that the aim is to neutralise any gains from cheating
			using local bilateral contracts between neighbouring networks,
			rather than merely identifying remote cheaters.

			Section 3.1: Described the traditional per-flow policing problem
			with inter-domain reservations more precisely, particularly with
			respect to direction of reservations and of traffic flows.

			Clarified status of Section 5 onwards, in particular that policers
			and monitors would not need standardisation, but that the protocol
			in Section 4 would require standardisation.

			Section 5.6.2 on competitive routing: Added discussion of direct
			incentives for a receiver to switch to a different provider even
			if the provider has a termination monopoly.

			Clarified that "Designing in security from the start" merely means
			allowing codepoint space in the PCN protocol encoding. There is
			no need to actually implement inter-domain security mechanisms for
			solutions confined to a single domain.

			Updated some references and added a ref to the Security
			Considerations, as well as other minor corrections and
			improvements.

			Changes from <draft-briscoe-tsvwg-re-ecn-border-cheat-00> to
			<draft-briscoe-tsvwg-re-ecn-border-cheat-01>:

	Added subsection on Border Accounting Mechanisms (Section 5.6.1)		Added subsection on Border Accounting Mechanisms (Section 5.6.1)

	Section 4.2 on the re-ECN wire protocol clarified and re-organised		Section 4.2 on the re-ECN wire protocol clarified and re-organised
	to separately discuss re-ECN for default ECN marking and for pre-		to separately discuss re-ECN for default ECN marking and for pre-
	congestion marking (PCN).		congestion marking (PCN).

	Router Forwarding Behaviour subsection added to re-organised		Router Forwarding Behaviour subsection added to re-organised
	section on Protocol Operation (Section 4.3). Extensions section		section on Protocol Operation (Section 4.3). Extensions section
	moved within Protocol Operations.		moved within Protocol Operations.

	skipping to change at page 3, line 7		skipping to change at page 5, line 7

	Sections on Design Rationale (Section 8) and Security		Sections on Design Rationale (Section 8) and Security
	Considerations (Section 9) expanded with some new material,		Considerations (Section 9) expanded with some new material,
	including new attacks and their defences.		including new attacks and their defences.

	Suggested Border Metering Algorithms improved (Appendix A.2) for		Suggested Border Metering Algorithms improved (Appendix A.2) for
	resilience to newly identified attacks.		resilience to newly identified attacks.

	Table of Contents		Table of Contents


	1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 5		1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 7
	2. Requirements Notation . . . . . . . . . . . . . . . . . . . . 7		2. Requirements Notation . . . . . . . . . . . . . . . . . . . . 9
	3. The Problem . . . . . . . . . . . . . . . . . . . . . . . . . 7		3. The Problem . . . . . . . . . . . . . . . . . . . . . . . . . 9
	3.1. The Traditional Per-flow Policing Problem . . . . . . . . 7		3.1. The Traditional Per-flow Policing Problem . . . . . . . . 9
	3.2. Generic Scenario . . . . . . . . . . . . . . . . . . . . . 9		3.2. Generic Scenario . . . . . . . . . . . . . . . . . . . . . 11
	4. Re-ECN Protocol for an RSVP (or similar) Transport . . . . . . 11		4. Re-ECN Protocol for an RSVP (or similar) Transport . . . . . . 14
	4.1. Protocol Overview . . . . . . . . . . . . . . . . . . . . 11		4.1. Protocol Overview . . . . . . . . . . . . . . . . . . . . 14
	4.2. Re-ECN Abstracted Network Layer Wire Protocol (IPv4 or		4.2. Re-ECN Abstracted Network Layer Wire Protocol (IPv4 or

	v6) . . . . . . . . . . . . . . . . . . . . . . . . . . . 13		v6) . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
	4.2.1. Re-ECN Recap . . . . . . . . . . . . . . . . . . . . . 13		4.2.1. Re-ECN Recap . . . . . . . . . . . . . . . . . . . . . 16
	4.2.2. Re-ECN Combined with Pre-Congestion Notification		4.2.2. Re-ECN Combined with Pre-Congestion Notification

	(re-PCN) . . . . . . . . . . . . . . . . . . . . . . . 14		(re-PCN) . . . . . . . . . . . . . . . . . . . . . . . 17
	4.3. Protocol Operation . . . . . . . . . . . . . . . . . . . . 17		4.3. Protocol Operation . . . . . . . . . . . . . . . . . . . . 19
	4.3.1. Protocol Operation for an Established Flow . . . . . . 17		4.3.1. Protocol Operation for an Established Flow . . . . . . 19
	4.3.2. Aggregate Bootstrap . . . . . . . . . . . . . . . . . 18		4.3.2. Aggregate Bootstrap . . . . . . . . . . . . . . . . . 21
	4.3.3. Flow Bootstrap . . . . . . . . . . . . . . . . . . . . 19		4.3.3. Flow Bootstrap . . . . . . . . . . . . . . . . . . . . 22
	4.3.4. Router Forwarding Behaviour . . . . . . . . . . . . . 20		4.3.4. Router Forwarding Behaviour . . . . . . . . . . . . . 23
	4.3.5. Extensions . . . . . . . . . . . . . . . . . . . . . . 22		4.3.5. Extensions . . . . . . . . . . . . . . . . . . . . . . 24
	5. Emulating Border Policing with Re-ECN . . . . . . . . . . . . 22		5. Emulating Border Policing with Re-ECN . . . . . . . . . . . . 24
	5.1. Informal Terminology . . . . . . . . . . . . . . . . . . . 22		5.1. Informal Terminology . . . . . . . . . . . . . . . . . . . 25
	5.2. Policing Overview . . . . . . . . . . . . . . . . . . . . 23		5.2. Policing Overview . . . . . . . . . . . . . . . . . . . . 26
	5.3. Pre-requisite Contractual Arrangements . . . . . . . . . . 25		5.3. Pre-requisite Contractual Arrangements . . . . . . . . . . 28
	5.4. Emulation of Per-Flow Rate Policing: Rationale and		5.4. Emulation of Per-Flow Rate Policing: Rationale and

	Limits . . . . . . . . . . . . . . . . . . . . . . . . . . 28		Limits . . . . . . . . . . . . . . . . . . . . . . . . . . 31
	5.5. Sanctioning Dishonest Marking . . . . . . . . . . . . . . 29		5.5. Sanctioning Dishonest Marking . . . . . . . . . . . . . . 32
	5.6. Border Mechanisms . . . . . . . . . . . . . . . . . . . . 31		5.6. Border Mechanisms . . . . . . . . . . . . . . . . . . . . 34
	5.6.1. Border Accounting Mechanisms . . . . . . . . . . . . . 31		5.6.1. Border Accounting Mechanisms . . . . . . . . . . . . . 34
	5.6.2. Competitive Routing . . . . . . . . . . . . . . . . . 35		5.6.2. Competitive Routing . . . . . . . . . . . . . . . . . 38
	5.6.3. Fail-safes . . . . . . . . . . . . . . . . . . . . . . 35		5.6.3. Fail-safes . . . . . . . . . . . . . . . . . . . . . . 39
	6. Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 36		6. Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
	7. Incremental Deployment . . . . . . . . . . . . . . . . . . . . 39		7. Incremental Deployment . . . . . . . . . . . . . . . . . . . . 42
	8. Design Choices and Rationale . . . . . . . . . . . . . . . . . 40		8. Design Choices and Rationale . . . . . . . . . . . . . . . . . 43
	9. Security Considerations . . . . . . . . . . . . . . . . . . . 41		9. Security Considerations . . . . . . . . . . . . . . . . . . . 45
	10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 43		10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 46
	11. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 43		11. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 46
	12. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 44		12. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 47
	13. Comments Solicited . . . . . . . . . . . . . . . . . . . . . . 44		13. Comments Solicited . . . . . . . . . . . . . . . . . . . . . . 47
	14. References . . . . . . . . . . . . . . . . . . . . . . . . . . 44		14. References . . . . . . . . . . . . . . . . . . . . . . . . . . 48
	14.1. Normative References . . . . . . . . . . . . . . . . . . . 44		14.1. Normative References . . . . . . . . . . . . . . . . . . . 48
	14.2. Informative References . . . . . . . . . . . . . . . . . . 45		14.2. Informative References . . . . . . . . . . . . . . . . . . 48
	Appendix A. Implementation . . . . . . . . . . . . . . . . . . . 46		Appendix A. Implementation . . . . . . . . . . . . . . . . . . . 50
	A.1. Ingress Gateway Algorithm for Blanking the RE flag . . . . 47		A.1. Ingress Gateway Algorithm for Blanking the RE flag . . . . 50
	A.2. Downstream Congestion Metering Algorithms . . . . . . . . 47		A.2. Downstream Congestion Metering Algorithms . . . . . . . . 51
	A.2.1. Bulk Downstream Congestion Metering Algorithm . . . . 47		A.2.1. Bulk Downstream Congestion Metering Algorithm . . . . 51
	A.2.2. Inflation Factor for Persistently Negative Flows . . . 48		A.2.2. Inflation Factor for Persistently Negative Flows . . . 52
	A.3. Algorithm for Sanctioning Negative Traffic . . . . . . . . 49		A.3. Algorithm for Sanctioning Negative Traffic . . . . . . . . 52
	Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 50		Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 53
	Intellectual Property and Copyright Statements . . . . . . . . . . 51		Intellectual Property and Copyright Statements . . . . . . . . . . 54

	1. Introduction		1. Introduction

	The Internet community largely lost interest in the Intserv		The Internet community largely lost interest in the Intserv
	architecture after it was clarified that it would be unlikely to		architecture after it was clarified that it would be unlikely to
	scale to the whole Internet [RFC2208]. Although Intserv mechanisms		scale to the whole Internet [RFC2208]. Although Intserv mechanisms
	proved impractical, the bandwidth reservation service it aimed to		proved impractical, the bandwidth reservation service it aimed to
	offer is still very much required.		offer is still very much required.


	A recently proposed approach [CL-deploy] combines Diffserv and pre-		A recently proposed approach [PCN-arch] combines Diffserv and pre-
	congestion notification (PCN) to provide a service slightly better		congestion notification (PCN) to provide a service slightly better
	than Intserv controlled load [RFC2211]. It scales to any size		than Intserv controlled load [RFC2211]. It scales to any size
	network, but only if domains trust their neighbours to have checked		network, but only if domains trust their neighbours to have checked
	that upstream customers aren't taking more bandwidth than they		that upstream customers aren't taking more bandwidth than they
	reserved, either accidentally or deliberately. This memo describes		reserved, either accidentally or deliberately. This memo describes
	border policing measures so that one network can protect its		border policing measures so that one network can protect its
	interests, even if networks around it are deliberately trying to		interests, even if networks around it are deliberately trying to
	cheat. The approach provides a sufficient emulation of flow rate		cheat. The approach provides a sufficient emulation of flow rate
	policing at trust boundaries but without per-flow processing. The		policing at trust boundaries but without per-flow processing. The
	emulation is not perfect, but it is sufficient to ensure that the		emulation is not perfect, but it is sufficient to ensure that the
	punishment is at least proportionate to the severity of the cheat.		punishment is at least proportionate to the severity of the cheat.

			Per-flow rate policing for each reservation is still expected to be
			used at the access edge of the internetwork, but at the borders
			between networks bulk policing can be used to emulate per-flow
			policing.

	The aim is to be able to scale controlled load service to any number		The aim is to be able to scale controlled load service to any number
	of endpoints, even though such scaling must take account of the		of endpoints, even though such scaling must take account of the
	increasing numbers of networks and users who may all have conflicting		increasing numbers of networks and users who may all have conflicting
	interests. To achieve such scaling, this memo combines two recent		interests. To achieve such scaling, this memo combines two recent
	proposals, both of which it briefly recaps:		proposals, both of which it briefly recaps:

	o A deployment model for admission control over Diffserv using pre-		o A deployment model for admission control over Diffserv using pre-

	congestion notification [CL-deploy] describes how bulk pre-		congestion notification [PCN-arch] describes how bulk pre-
	congestion notification on routers within an edge-to-edge Diffserv		congestion notification on routers within an edge-to-edge Diffserv
	region can emulate the precision of per-flow admission control to		region can emulate the precision of per-flow admission control to
	provide controlled load service without unscalable per-flow		provide controlled load service without unscalable per-flow
	processing;		processing;

	o Re-ECN: Adding Accountability to TCP/IP [Re-TCP]. The trick that		o Re-ECN: Adding Accountability to TCP/IP [Re-TCP]. The trick that
	addresses cheating at borders is to recognise that border policing		addresses cheating at borders is to recognise that border policing
	is mainly necessary because cheating upstream networks will admit		is mainly necessary because cheating upstream networks will admit
	traffic when they shouldn't only as long as they don't directly		traffic when they shouldn't only as long as they don't directly
	experience the downstream congestion their misbehaviour can cause.		experience the downstream congestion their misbehaviour can cause.
	The re-ECN protocol requires upstream nodes to declare expected		The re-ECN protocol requires upstream nodes to declare expected
	downstream congestion in all forwarded packets and it makes it in		downstream congestion in all forwarded packets and it makes it in
	their interests to declare it honestly. Operators can then		their interests to declare it honestly. Operators can then
	monitor downstream congestion in bulk at borders to emulate		monitor downstream congestion in bulk at borders to emulate
	policing.		policing.


			The aim is not to enable a network to _identify_ some remote cheating
			party, which would rarely be useful given the victim network would be
			unlikely to be able to seek redress from a cheater in some remote
			part of the world with whom no direct contractual relationship
			exists. Rather the aim is to ensure that any gain from cheating will
			be cancelled out by penalties applied to the cheating party by its
			local network. Further, the solution ensures each of the chain of
			networks between the cheater and the victim will lose out if it
			doesn't apply penalties to its neighbour. Thus the solution builds
			on the local bilateral contractual relationships that already exist
			between neighbouring networks.

	Rather than the end-to-end arrangement used when re-ECN was specified		Rather than the end-to-end arrangement used when re-ECN was specified
	for the TCP transport [Re-TCP], this memo specifies re-ECN in an		for the TCP transport [Re-TCP], this memo specifies re-ECN in an
	edge-to-edge arrangement, making it applicable to the above		edge-to-edge arrangement, making it applicable to the above
	deployment model for admission control over Diffserv. Also, rather		deployment model for admission control over Diffserv. Also, rather
	than using a TCP transport for regular congestion feedback, this memo		than using a TCP transport for regular congestion feedback, this memo
	specifies re-ECN using RSVP as the transport for feedback [RSVP-ECN].		specifies re-ECN using RSVP as the transport for feedback [RSVP-ECN].
	A similar deployment model, but with a different transport for		A similar deployment model, but with a different transport for
	signalling congestion feedback could be used (e.g. RMD [NSIS-RMD]		signalling congestion feedback could be used (e.g. RMD [NSIS-RMD]
	uses NSIS).		uses NSIS).

	This memo aims to do two things: i) define how to apply the re-ECN		This memo aims to do two things: i) define how to apply the re-ECN
	protocol to the admission control over Diffserv scenario; and ii)		protocol to the admission control over Diffserv scenario; and ii)
	explain why re-ECN sufficiently emulates border policing in that		explain why re-ECN sufficiently emulates border policing in that
	scenario. Most of the memo is taken up with the second aim;		scenario. Most of the memo is taken up with the second aim;
	explaining why it works. Applying re-ECN to the scenario actually		explaining why it works. Applying re-ECN to the scenario actually

	involves quite a trivial modification to the ingress gateway. Our		involves quite a trivial modification to the ingress gateway. That
	immediate goal is to convince everyone to build that modification in		modification can be added to gateways later, so our immediate goal is
	to ingress gateways from the start, whether first deployments require		to convince everyone to have the foresight to define the PCN wire
	policing or not. Otherwise, when we want to add policing, we will		protocol encoding to accommodate the extended codepoints defined in
	have built ourselves a legacy problem. In other words, we aim to		this document, whether first deployments require border policing or
	convince people to "Build in security from the start."		not. Otherwise, when we want to add policing, we will have built
			ourselves a legacy problem. In other words, we aim to convince
			people to "Design in security from the start."

	The body of this memo is structured as follows:		The body of this memo is structured as follows:

	Section 3 describes the border policing problem. We recap the		Section 3 describes the border policing problem. We recap the
	traditional, unscalable view of how to solve the problem, and we		traditional, unscalable view of how to solve the problem, and we
	recap the admission control solution which has the scalability we		recap the admission control solution which has the scalability we
	do not want to lose when we add border policing;		do not want to lose when we add border policing;

	Section 4 specifies the re-ECN protocol solution in detail;		Section 4 specifies the re-ECN protocol solution in detail;


	skipping to change at page 6, line 48		skipping to change at page 9, line 17
	design decisions;		design decisions;

	Section 9 comments on the overall robustness of the security		Section 9 comments on the overall robustness of the security
	assumptions and lists specific security issues.		assumptions and lists specific security issues.

	It must be emphasised that we are not evangelical about removing per-		It must be emphasised that we are not evangelical about removing per-
	flow processing from borders. Network operators may choose to do		flow processing from borders. Network operators may choose to do
	per-flow processing at their borders for their own reasons, such as		per-flow processing at their borders for their own reasons, such as
	to support business models that require per-flow accounting. Our aim		to support business models that require per-flow accounting. Our aim
	is to show that per-flow processing at borders is no longer		is to show that per-flow processing at borders is no longer

	/necessary/ in order to provide end-to-end QoS using flow admission		_necessary_ in order to provide end-to-end QoS using flow admission
	control. Indeed, we are absolutely opposed to standardisation of		control. Indeed, we are absolutely opposed to standardisation of
	technology that embeds particular business models into the Internet.		technology that embeds particular business models into the Internet.
	Our aim is merely to provide a new useful metric (downstream		Our aim is merely to provide a new useful metric (downstream
	congestion) at trust boundaries. Given the well-known significance		congestion) at trust boundaries. Given the well-known significance
	of congestion in economics, operators can then use this new metric in		of congestion in economics, operators can then use this new metric in
	their interconnection contracts if they choose. This will enable		their interconnection contracts if they choose. This will enable
	competitive evolution of new business models (for examples		competitive evolution of new business models (for examples

	see [IXQoS]), alongside more traditional models that depend on more		see [IXQoS]), even for sets of flows running alongside another set
	costly per-flow processing at borders.		across the same border but using the more traditional model that
			depends on more costly per-flow processing at each border.

	2. Requirements Notation		2. Requirements Notation

	The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",		The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
	"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this		"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
	document are to be interpreted as described in [RFC2119].		document are to be interpreted as described in [RFC2119].

	3. The Problem		3. The Problem

	3.1. The Traditional Per-flow Policing Problem		3.1. The Traditional Per-flow Policing Problem

	If we claim to be able to emulate per-flow policing with bulk		If we claim to be able to emulate per-flow policing with bulk
	policing at trust boundaries, we need to know exactly what we are		policing at trust boundaries, we need to know exactly what we are

	emulating. So, even though we expect it to become a historic		emulating. So, we will start from the traditional scenario with per-
	practice, we will start from the traditional scenario with per-flow		flow policing at trust boundaries to explain why it has always been
	policing at trust boundaries to explain why it has always been
	considered necessary.		considered necessary.

	To be able to take advantage of a reservation-based service such as		To be able to take advantage of a reservation-based service such as

	controlled load, a source must reserve resources using a signalling		controlled load, a source-destination pair must reserve resources
	protocol such as RSVP [RFC2205]. An RSVP signalling request refers		using a signalling protocol such as RSVP [RFC2205]. An RSVP
	to a flow of packets by its flow ID tuple (filter spec [RFC2205]) (or		signalling request refers to a flow of packets by its flow ID tuple
	its security parameter index (SPI) [RFC2207] if port numbers are		(filter spec [RFC2205]) (or its security parameter index
	hidden by IPsec encryption). Other signalling protocols use similar		(SPI) [RFC2207] if port numbers are hidden by IPsec encryption).
	flow identifiers. But, it is insufficient to merely authorise and		Other signalling protocols use similar flow identifiers. But, it is
	admit a flow based on its identifiers, for instance merely opening a		insufficient to merely authorise and admit a flow based on its
	pin-hole for packets with identifiers that match an admitted flow ID.		identifiers, for instance merely opening a pin-hole for packets with
	Once a flow is admitted, it cannot necessarily be trusted to send		identifiers that match an admitted flow ID, because once a flow is
	packets within the rate profile it requested.		admitted, it cannot necessarily be trusted to send packets within the
			rate profile it requested.

	The packet rate must also be policed to keep the flow within the		The packet rate must also be policed to keep the flow within the
	requested flow spec [RFC2205]. For instance, without data rate		requested flow spec [RFC2205]. For instance, without data rate

	policing, a source could reserve resources for an 8kbps audio flow		policing, a source-destination pair could reserve resources for an
	but transmit a 6Mbps video (theft of service). More subtly, the		8kbps audio flow but the source could transmit a 6Mbps video (theft
	sender could generate bursts that were outside the profile it had		of service). More subtly, the sender could generate bursts that were
	requested.		outside the profile requested.
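
The per-flow rate policing described above is commonly realised with a token bucket. The following is only a minimal sketch under stated assumptions, not a mechanism taken from either draft: the class name TokenBucketPolicer, the helper simulate(), the packet size and the bucket depth are all hypothetical, chosen to illustrate the 8kbps reservation versus 6Mbps transmission example.

```python
class TokenBucketPolicer:
    """Illustrative per-flow policer (hypothetical, not from the draft).

    rate_bps and depth_bits stand in for the rate and burst size of the
    flow spec that was reserved for the flow.
    """

    def __init__(self, rate_bps, depth_bits):
        self.rate_bps = rate_bps
        self.depth_bits = depth_bits
        self.tokens = depth_bits   # start with a full bucket
        self.last_time = 0.0

    def conforms(self, now, packet_bits):
        """Return True if the packet fits the reserved profile."""
        elapsed = now - self.last_time
        self.last_time = now
        # Refill at the reserved rate, capped at the permitted burst size.
        self.tokens = min(self.depth_bits,
                          self.tokens + elapsed * self.rate_bps)
        if packet_bits <= self.tokens:
            self.tokens -= packet_bits
            return True
        return False               # out of profile: drop (or re-mark)


def simulate(reserved_bps, sent_bps, packet_bits, duration_s, depth_bits):
    """Send fixed-size packets at sent_bps; return (passed, dropped)."""
    policer = TokenBucketPolicer(reserved_bps, depth_bits)
    interval = packet_bits / sent_bps
    count = int(round(duration_s * sent_bps / packet_bits))
    passed = dropped = 0
    for i in range(count):
        if policer.conforms(i * interval, packet_bits):
            passed += 1
        else:
            dropped += 1
    return passed, dropped


if __name__ == "__main__":
    # Honest source: reserves 8 kbps and sends 8 kbps.
    print(simulate(8_000, 8_000, 1_600, 10.0, 3_200))
    # Cheating source: reserves 8 kbps but sends 6 Mbps.
    print(simulate(8_000, 6_000_000, 1_600, 10.0, 3_200))
```

In this sketch the honest flow passes untouched, while the cheating flow is limited to roughly the reserved rate plus one bucket depth. The cost is one such state record per flow at every policing point, which is the scalability burden the memo aims to remove from inter-domain borders.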

	In traditional architectures, per-flow packet rate-policing is		In traditional architectures, per-flow packet rate-policing is
	expensive and unscalable but, without it, a network is vulnerable to		expensive and unscalable but, without it, a network is vulnerable to
	such theft of service (whether malicious or accidental). Perhaps		such theft of service (whether malicious or accidental). Perhaps
	more importantly, if flows are allowed to send more data than they		more importantly, if flows are allowed to send more data than they
	were permitted, the ability of admission control to give assurances		were permitted, the ability of admission control to give assurances
	to other flows will break.		to other flows will break.


	Just as sources need not be trusted to keep within their requested		Just as sources need not be trusted to keep within the requested flow
	flow spec, whole networks might also try to cheat. We will now set		spec, whole networks might also try to cheat. We will now set up a
	up a concrete scenario to illustrate such cheats. Imagine		concrete scenario to illustrate such cheats. Imagine reservations
	reservations for unidirectional flows from senders, through at least		for unidirectional flows, through at least two networks, an edge
	two networks, an edge network and its downstream transit provider.		network and its downstream transit provider. Imagine the edge
	Imagine the edge network charges its retail customers per reservation		network charges its retail customers per reservation but also has to
	but also has to pay its transit provider a charge per reservation.		pay its transit provider a charge per reservation. Typically, both
	Typically, both its selling and buying charges might depend on the		its selling and buying charges might depend on the duration and rate
	duration and rate of each reservation. The level of the actual		of each reservation. The level of the actual selling and buying
	selling and buying prices is irrelevant to our discussion (most		prices is irrelevant to our discussion (most likely the network will
	likely the network will sell at a higher price than it buys, of		sell at a higher price than it buys, of course).
	course).

	A cheating ingress network could systematically reduce the size of		A cheating ingress network could systematically reduce the size of

	its retail customers' reservation signalling requests before		its retail customers' reservation signalling requests (e.g. the
	forwarding them to its transit provider (and systematically reinstate		SENDER_TSPEC object in RSVP's PATH message) before forwarding them to
	the responses on the way back). It would then receive an honest		its transit provider and systematically reinstate the responses on
	income from its upstream retail customer but only pay for		the way back (e.g. the FLOWSPEC object in RSVP's RESV message). It
	fraudulently smaller reservations downstream. Equivalently, a		would then receive an honest income from its upstream retail customer
	cheating ingress network may feed the traffic from a number of flows		but only pay for fraudulently smaller reservations downstream. A
	into an aggregate reservation over the transit that is smaller than		similar but opposite trick (increasing the TSPEC and decreasing the
	the total of all the flows. Because of these fraud possibilities, in		FLOWSPEC) could be perpetrated by the receiver's access network if
	traditional QoS reservation architectures the downstream network		the reservation was paid for by the receiver.
	polices at each border. The policer checks that the actual sent data
	rate of each flow is within the signalled reservation.		Equivalently, a cheating ingress network may feed the traffic from a
			number of flows into an aggregate reservation over the transit that
			is smaller than the total of all the flows. Because of these fraud
			possibilities, in traditional QoS reservation architectures the
			downstream network polices at each border. The policer checks that
			the actual sent data rate of each flow is within the signalled
			reservation.

	Reservation signalling could be authenticated end to end, but this		Reservation signalling could be authenticated end to end, but this
	wouldn't prevent the aggregation cheat just described. For this		wouldn't prevent the aggregation cheat just described. For this
	reason, and to avoid the need for a global PKI, signalling integrity		reason, and to avoid the need for a global PKI, signalling integrity
	is typically only protected on a hop-by-hop basis [RFC2747].		is typically only protected on a hop-by-hop basis [RFC2747].

	A variant of the above cheat is where a router in an honest		A variant of the above cheat is where a router in an honest
	downstream network denies admission to a new reservation, but a		downstream network denies admission to a new reservation, but a
	cheating upstream network still admits the flow. For instance, the		cheating upstream network still admits the flow. For instance, the
	networks may be using Diffserv internally, but Intserv admission		networks may be using Diffserv internally, but Intserv admission

	skipping to change at page 9, line 9		skipping to change at page 11, line 32
	revenue from the reservation, but it doesn't have to pay any		revenue from the reservation, but it doesn't have to pay any
	downstream wholesale charges and the congestion is in someone else's		downstream wholesale charges and the congestion is in someone else's
	network. The cheating network may calculate that most of the flows		network. The cheating network may calculate that most of the flows
	affected by congestion in the downstream network aren't likely to be		affected by congestion in the downstream network aren't likely to be
	its own. It may also calculate that the downstream router has been		its own. It may also calculate that the downstream router has been
	configured to deny admission to new flows in order to protect		configured to deny admission to new flows in order to protect
	bandwidth assigned to other network services (e.g. enterprise VPNs).		bandwidth assigned to other network services (e.g. enterprise VPNs).
	So the cheating network can steal capacity from the downstream		So the cheating network can steal capacity from the downstream
	operator's VPNs that are probably not actually congested.		operator's VPNs that are probably not actually congested.


			All the above cheats are framed in the context of RSVP's receiver
			confirmed reservation model, but similar cheats are possible with
			sender-initiated and other models.

	To summarise, in traditional reservation signalling architectures, if		To summarise, in traditional reservation signalling architectures, if
	a network cannot trust a neighbouring upstream network to rate-police		a network cannot trust a neighbouring upstream network to rate-police
	each reservation, it has to check for itself that the data rate fits		each reservation, it has to check for itself that the data rate fits
	within each of the reservations it has admitted.		within each of the reservations it has admitted.

	3.2. Generic Scenario		3.2. Generic Scenario

	We will now describe a generic internetworking scenario that we will		We will now describe a generic internetworking scenario that we will
	use to describe and to test our bulk policing proposal. It consists		use to describe and to test our bulk policing proposal. It consists
	of a number of networks and endpoints that do not fully trust each		of a number of networks and endpoints that do not fully trust each

	skipping to change at page 10, line 16		skipping to change at page 12, line 45
	Within the Diffserv region are three interior domains, A, B and C, as		Within the Diffserv region are three interior domains, A, B and C, as
	well as the inward facing interfaces of the ingress and egress		well as the inward facing interfaces of the ingress and egress
	gateways. An ingress and egress border router (BR) is shown		gateways. An ingress and egress border router (BR) is shown
	interconnecting each interior domain with the next. There may be		interconnecting each interior domain with the next. There may be
	other interior routers (not shown) within each interior domain.		other interior routers (not shown) within each interior domain.

	In two paragraphs we now briefly recap how pre-congestion		In two paragraphs we now briefly recap how pre-congestion
	notification is intended to be used to control flow admission to a		notification is intended to be used to control flow admission to a
	large Diffserv region. The first paragraph describes data plane		large Diffserv region. The first paragraph describes data plane
	functions and the second describes signalling in the control plane.		functions and the second describes signalling in the control plane.

	We omit many details from [CL-deploy] including behaviour during		We omit many details from [PCN-arch] including behaviour during
	routing changes. For brevity here we assume other flows are already		routing changes. For brevity here we assume other flows are already
	in progress across a path through the Diffserv region before a new		in progress across a path through the Diffserv region before a new
	one arrives, but how bootstrap works is described in Section 4.3.2.		one arrives, but how bootstrap works is described in Section 4.3.2.

	Figure 1 shows a single simplex reserved flow from the sending (Sx)		Figure 1 shows a single simplex reserved flow from the sending (Sx)
	end host to the receiving (Rx) end host. The ingress gateway polices		end host to the receiving (Rx) end host. The ingress gateway polices
	incoming traffic within its admitted reservation and remarks it to		incoming traffic within its admitted reservation and remarks it to
	turn on an ECN-capable codepoint [RFC3168] and the controlled load		turn on an ECN-capable codepoint [RFC3168] and the controlled load
	(CL) Diffserv codepoint. Together, these codepoints define which		(CL) Diffserv codepoint. Together, these codepoints define which
	traffic is entitled to the enhanced scheduling of the CL behaviour		traffic is entitled to the enhanced scheduling of the CL behaviour

	skipping to change at page 11, line 18		skipping to change at page 13, line 46
	otherwise it returns the original RESV signal back towards the data		otherwise it returns the original RESV signal back towards the data
	sender.		sender.

	Once a reservation is admitted, its traffic will always receive low		Once a reservation is admitted, its traffic will always receive low
	delay service for the duration of the reservation. This is because		delay service for the duration of the reservation. This is because
	ingress gateways ensure that traffic not under a reservation cannot		ingress gateways ensure that traffic not under a reservation cannot
	pass into the Diffserv region with the CL DSCP set. So non-reserved		pass into the Diffserv region with the CL DSCP set. So non-reserved
	traffic will always be treated with a lower priority PHB at each		traffic will always be treated with a lower priority PHB at each
	interior router. And even if some disaster re-routes traffic after		interior router. And even if some disaster re-routes traffic after
	it has been admitted, if the traffic through any resource tips over a		it has been admitted, if the traffic through any resource tips over a

	fail-safe threshold, pre-congestion notification will trigger flow-		fail-safe threshold, pre-congestion notification will trigger flow
	pre-emption to very quickly bring every router within the whole		pre-emption to very quickly bring every router within the whole
	Diffserv region back below its operating point.		Diffserv region back below its operating point.

	The whole admission control system just described deliberately		The whole admission control system just described deliberately
	confines per-flow processing to the access edges of the network,		confines per-flow processing to the access edges of the network,
	where it will not limit the system's scalability. But ideally we		where it will not limit the system's scalability. But ideally we
	want to extend this approach to multiple networks, to take even more		want to extend this approach to multiple networks, to take even more
	advantage of its scaling potential. We would still need per-flow		advantage of its scaling potential. We would still need per-flow
	processing at the access edges of each network, but not at the high		processing at the access edges of each network, but not at the high
	speed interfaces where they interconnect. Even though such an		speed interfaces where they interconnect. Even though such an
	admission control system would work technically, it would gain us no		admission control system would work technically, it would gain us no
	scaling advantage if each network also wanted to police the rate of		scaling advantage if each network also wanted to police the rate of

	each admitted flow for itself---border routers would still have to do		each admitted flow for itself--border routers would still have to do
	complex packet operations per-flow anyway, given they don't trust		complex packet operations per-flow anyway, given they don't trust
	upstream networks to do their policing for them.		upstream networks to do their policing for them.

	This memo describes how to emulate per-flow rate policing using bulk		This memo describes how to emulate per-flow rate policing using bulk
	mechanisms at border routers, so the full scalability potential of		mechanisms at border routers, so the full scalability potential of
	pre-congestion notification is not limited by the need for per-flow		pre-congestion notification is not limited by the need for per-flow
	policing mechanisms at borders, which would make borders the most		policing mechanisms at borders, which would make borders the most
	cost-critical pinch-points. Then we can achieve the long sought-for		cost-critical pinch-points. Then we can achieve the long sought-for
	vision of secure Internet-wide bandwidth reservations without needing		vision of secure Internet-wide bandwidth reservations without needing

	per-flow processing at all in core and border routers---where		per-flow processing at all in core and border routers--where
	scalability is most critical.		scalability is most critical.

	4. Re-ECN Protocol for an RSVP (or similar) Transport		4. Re-ECN Protocol for an RSVP (or similar) Transport

	4.1. Protocol Overview		4.1. Protocol Overview

	First we need to recap the way routers accumulate congestion marking		First we need to recap the way routers accumulate congestion marking
	along a path. Each ECN-capable router marks some packets with CE,		along a path. Each ECN-capable router marks some packets with CE,
	the marking probability increasing with the length of the queue at		the marking probability increasing with the length of the queue at
	its egress link. The only difference with pre-congestion		its egress link. The only difference with pre-congestion

	skipping to change at page 14, line 21		skipping to change at page 17, line 5
	which need only be read by border policing functions.		which need only be read by border policing functions.

	Although the RE flag is a separate, single bit field, it can be read		Although the RE flag is a separate, single bit field, it can be read
	as an extension to the two-bit ECN field; the three concatenated bits		as an extension to the two-bit ECN field; the three concatenated bits
	in what we will call the extended ECN field (EECN) make eight		in what we will call the extended ECN field (EECN) make eight
	codepoints available. When the RE flag setting is "don't care", we		codepoints available. When the RE flag setting is "don't care", we
	use the RFC3168 names of the ECN codepoints, but [Re-TCP] proposes		use the RFC3168 names of the ECN codepoints, but [Re-TCP] proposes
	the following six codepoint names for when there is a need to be more		the following six codepoint names for when there is a need to be more
	specific.		specific.


	+-------+------------+------+---------------+-----------------------+		+--------+-------------+-------+-------------+----------------------+
	| ECN   | RFC3168    | RE   | Extended ECN  | Re-ECN meaning        |		| ECN    | RFC3168     | RE    | Extended    | Re-ECN meaning       |
	| field | codepoint  | flag | codepoint     |                       |		| field  | codepoint   | flag  | ECN         |                      |
	+-------+------------+------+---------------+-----------------------+		|        |             |       | codepoint   |                      |
			+--------+-------------+-------+-------------+----------------------+
	| 00    | Not-ECT    | 0    | Not-RECT      | Not re-ECN-capable    |		| 00     | Not-ECT     | 0     | Not-RECT    | Not re-ECN-capable   |
	|       |            |      |               | transport             |		|        |             |       |             | transport            |
	| 00    | Not-ECT    | 1    | FNE           | Feedback not          |		| 00     | Not-ECT     | 1     | FNE         | Feedback not         |
	|       |            |      |               | established           |		|        |             |       |             | established          |
	| 01    | ECT(1)     | 0    | Re-Echo       | Re-echoed congestion  |		| 01     | ECT(1)      | 0     | Re-Echo     | Re-echoed congestion |
	|       |            |      |               | and RECT              |		|        |             |       |             | and RECT             |
	| 01    | ECT(1)     | 1    | RECT          | Re-ECN capable        |		| 01     | ECT(1)      | 1     | RECT        | Re-ECN capable       |
	|       |            |      |               | transport             |		|        |             |       |             | transport            |
	| 10    | ECT(0)     | 0    | ---           | Legacy ECN use        |		| 10     | ECT(0)      | 0     | ---         | Legacy ECN use       |
	|       |            |      |               | only                  |		|        |             |       |             | only                 |
	| 10    | ECT(0)     | 1    | --CU--        | Currently unused      |		| 10     | ECT(0)      | 1     | --CU--      | Currently unused     |
	|       |            |      |               |                       |		|        |             |       |             |                      |
	| 11    | CE         | 0    | CE(0)         | Congestion            |		| 11     | CE          | 0     | CE(0)       | Congestion           |
	|       |            |      |               | experienced with      |		|        |             |       |             | experienced with     |
	|       |            |      |               | Re-Echo               |		|        |             |       |             | Re-Echo              |
	| 11    | CE         | 1    | CE(-1)        | Congestion            |		| 11     | CE          | 1     | CE(-1)      | Congestion           |
	|       |            |      |               | experienced           |		|        |             |       |             | experienced          |
	+-------+------------+------+---------------+-----------------------+		+--------+-------------+-------+-------------+----------------------+

	Table 1: Re-cap of Default Extended ECN Codepoints Proposed for Re-		Table 1: Re-cap of Default Extended ECN Codepoints Proposed for Re-
	ECN		ECN
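
As an illustration only, the concatenation of the two-bit ECN field and the RE flag into an extended ECN codepoint can be sketched as a lookup keyed on both values. The names EECN_DEFAULT and eecn_name are hypothetical and appear in neither draft; the codepoint names are those listed in Table 1.

```python
# Hypothetical sketch: look up the default extended ECN (EECN) codepoint
# name from the 2-bit ECN field and the 1-bit RE flag, following Table 1.
EECN_DEFAULT = {
    (0b00, 0): "Not-RECT",  # not re-ECN-capable transport
    (0b00, 1): "FNE",       # feedback not established
    (0b01, 0): "Re-Echo",   # re-echoed congestion and RECT
    (0b01, 1): "RECT",      # re-ECN capable transport
    (0b10, 0): "---",       # legacy ECN use only
    (0b10, 1): "--CU--",    # currently unused
    (0b11, 0): "CE(0)",     # congestion experienced with Re-Echo
    (0b11, 1): "CE(-1)",    # congestion experienced
}


def eecn_name(ecn: int, re_flag: int) -> str:
    """Return the Table 1 codepoint name for an (ECN field, RE flag) pair."""
    if ecn not in (0b00, 0b01, 0b10, 0b11) or re_flag not in (0, 1):
        raise ValueError("ECN must be a 2-bit value and RE a single bit")
    return EECN_DEFAULT[(ecn, re_flag)]
```

For example, eecn_name(0b01, 0) yields "Re-Echo", whereas the same ECN field with the RE flag set yields "RECT".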

	4.2.2. Re-ECN Combined with Pre-Congestion Notification (re-PCN)		4.2.2. Re-ECN Combined with Pre-Congestion Notification (re-PCN)

	As permitted by the ECN specification [RFC3168], a proposal is		As permitted by the ECN specification [RFC3168], a proposal is
	currently being advanced in the IETF to define different semantics		currently being advanced in the IETF to define different semantics
	for how routers might mark the ECN field of certain packets. The		for how routers might mark the ECN field of certain packets. The
	idea is to be able to notify congestion when the router's load		idea is to be able to notify congestion when the router's load

	skipping to change at page 16, line 4		skipping to change at page 18, line 36
	sending node (or its proxy) to detect suppression of congestion		sending node (or its proxy) to detect suppression of congestion
	marking in the feedback loop. Thus the Nonce requires the sender or		marking in the feedback loop. Thus the Nonce requires the sender or
	its proxy to be trusted to respond correctly to congestion. But this		its proxy to be trusted to respond correctly to congestion. But this
	is precisely the main cheat we want to protect against (as well as		is precisely the main cheat we want to protect against (as well as
	many others).		many others).

	One of the compromise protocol encodings that [PCN] explores		One of the compromise protocol encodings that [PCN] explores
	("Alternative 5") leaves out support for the ECN Nonce. Therefore we		("Alternative 5") leaves out support for the ECN Nonce. Therefore we
	use that one. This encoding of PCN markings is shown on the left of		use that one. This encoding of PCN markings is shown on the left of
	Table 2. Note that these codepoints of the ECN field only take on		Table 2. Note that these codepoints of the ECN field only take on

	the semantics of pre-congestion noticiation if they are combined with		the semantics of pre-congestion notification if they are combined
	a Diffserv codepoint that the operator has configured to cause PCN		with a Diffserv codepoint that the operator has configured to cause
	marking, by mapping it to a PCN-enhanced PHB.		PCN marking, by mapping it to a PCN-enhanced PHB.

	For the rest of this memo, we will not distinguish between Admission		For the rest of this memo, we will not distinguish between Admission
	Marking and Pre-emption Marking unless we need to be specific. We		Marking and Pre-emption Marking unless we need to be specific. We
	will call both "congestion marking". With the above encoding,		will call both "congestion marking". With the above encoding,
	congestion marking can be read to mean any packet with the left-most		congestion marking can be read to mean any packet with the left-most
	bit of the ECN field set.		bit of the ECN field set.

	The re-ECN protocol can be used to control misbehaving sources		The re-ECN protocol can be used to control misbehaving sources
	whether congestion is with respect to a logical threshold (PCN) or		whether congestion is with respect to a logical threshold (PCN) or
	the physical line rate (ECN). In either case the RE flag can be used		the physical line rate (ECN). In either case the RE flag can be used
	to create an extended ECN field. For PCN-capable packets, the 8		to create an extended ECN field. For PCN-capable packets, the 8
	possible encodings of this 3-bit extended ECN (EECN) field are		possible encodings of this 3-bit extended ECN (EECN) field are
	defined on the right of Table 2 below. The purposes of these		defined on the right of Table 2 below. The purposes of these
	different codepoints will be introduced in subsequent sections.		different codepoints will be introduced in subsequent sections.


	+-------+-----------------+------+-------------+--------------------+		+-------+-----------------+------+--------------+-------------------+
	| ECN   | PCN codepoint   | RE   | Extended    | Re-ECN meaning     |		| ECN   | PCN codepoint   | RE   | Extended ECN | Re-ECN meaning    |
	| field | (Alternative 5) | flag | ECN         |                    |		| field | (Alternative 5) | flag | codepoint    |                   |
	|       |                 |      | codepoint   |                    |		+-------+-----------------+------+--------------+-------------------+
	+-------+-----------------+------+-------------+--------------------+		| 00    | Not-ECT         | 0    | Not-RECT     | Not               |
	| 00    | Not-ECT         | 0    | Not-RECT    | Not re-ECN-capable |		|       |                 |      |              | re-ECN-capable    |
	|       |                 |      |             | transport          |		|       |                 |      |              | transport         |
	| 00    | Not-ECT         | 1    | FNE         | Feedback not       |		| 00    | Not-ECT         | 1    | FNE          | Feedback not      |
	|       |                 |      |             | established        |		|       |                 |      |              | established       |
	| 01    | ECT(1)          | 0    | Re-Echo     | Re-echoed          |		| 01    | ECT(1)          | 0    | Re-Echo      | Re-echoed         |
	|       |                 |      |             | congestion and     |		|       |                 |      |              | congestion and    |
	|       |                 |      |             | RECT               |		|       |                 |      |              | RECT              |
	| 01    | ECT(1)          | 1    | RECT        | Re-ECN capable     |		| 01    | ECT(1)          | 1    | RECT         | Re-ECN capable    |
	|       |                 |      |             | transport          |		|       |                 |      |              | transport         |
	| 10    | AM              | 0    | AM(0)       | Admission Marking  |		| 10    | AM              | 0    | AM(0)        | Admission Marking |
	|       |                 |      |             | with Re-Echo       |		|       |                 |      |              | with Re-Echo      |
	| 10    | AM              | 1    | AM(-1)      | Admission Marking  |		| 10    | AM              | 1    | AM(-1)       | Admission Marking |
	|       |                 |      |             |                    |		|       |                 |      |              |                   |
	| 11    | PM              | 0    | PM(0)       | Pre-emption        |		| 11    | PM              | 0    | PM(0)        | Pre-emption       |
	|       |                 |      |             | Marking with       |		|       |                 |      |              | Marking with      |
	|       |                 |      |             | Re-Echo            |		|       |                 |      |              | Re-Echo           |
	| 11    | PM              | 1    | PM(-1)      | Pre-emption        |		| 11    | PM              | 1    | PM(-1)       | Pre-emption       |
	|       |                 |      |             | Marking            |		|       |                 |      |              | Marking           |
	+-------+-----------------+------+-------------+--------------------+		+-------+-----------------+------+--------------+-------------------+

	Table 2: Extended ECN Codepoints if the Diffserv codepoint uses Pre-		Table 2: Extended ECN Codepoints if the Diffserv codepoint uses Pre-
	congestion Notification (PCN)		congestion Notification (PCN)
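
A minimal sketch of how a node might read the Table 2 encoding follows. It assumes the packet's Diffserv codepoint has already been matched to a PCN-enhanced PHB, and it treats a packet as congestion marked when the left-most bit of its ECN field is set, as stated above. The names PCN_EECN, is_congestion_marked, classify and marked_fraction are hypothetical, not taken from either draft.

```python
# Hypothetical sketch of the Table 2 (PCN, "Alternative 5") encoding.
PCN_EECN = {
    (0b00, 0): "Not-RECT",
    (0b00, 1): "FNE",
    (0b01, 0): "Re-Echo",
    (0b01, 1): "RECT",
    (0b10, 0): "AM(0)",   # Admission Marking with Re-Echo
    (0b10, 1): "AM(-1)",  # Admission Marking
    (0b11, 0): "PM(0)",   # Pre-emption Marking with Re-Echo
    (0b11, 1): "PM(-1)",  # Pre-emption Marking
}


def is_congestion_marked(ecn: int) -> bool:
    """True if the left-most bit of the 2-bit ECN field is set (AM or PM)."""
    return bool(ecn & 0b10)


def classify(ecn: int, re_flag: int):
    """Return (codepoint name, congestion-marked?) for one packet."""
    return PCN_EECN[(ecn, re_flag)], is_congestion_marked(ecn)


def marked_fraction(packets):
    """Fraction of octets that are congestion marked.

    `packets` is an iterable of (ecn, octets) pairs.
    """
    total = marked = 0
    for ecn, octets in packets:
        total += octets
        if is_congestion_marked(ecn):
            marked += octets
    return marked / total if total else 0.0
```

Counting octets rather than packets mirrors the way the text expresses pre-congestion as a proportion of congestion marked octets.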

	4.3. Protocol Operation		4.3. Protocol Operation

	4.3.1. Protocol Operation for an Established Flow		4.3.1. Protocol Operation for an Established Flow

	The re-ECN protocol involves a simple tweak to the action of the		The re-ECN protocol involves a simple tweak to the action of the
	gateway at the ingress edge of the CL region. In the deployment		gateway at the ingress edge of the CL region. In the deployment

	model just described [CL-deploy], for each active traffic aggregate		model just described [PCN-arch], for each active traffic aggregate
	across the CL region (CL-region-aggregate) the ingress gateway will		across the CL region (CL-region-aggregate) the ingress gateway will
	hold a fairly recent Congestion-Level-Estimate that the egress		hold a fairly recent Congestion-Level-Estimate that the egress
	gateway will have fed back to it, piggybacked on the signalling that		gateway will have fed back to it, piggybacked on the signalling that
	sets up each flow. For instance, one aggregate might have been		sets up each flow. For instance, one aggregate might have been
	experiencing 3% pre-congestion (that is, congestion marked octets		experiencing 3% pre-congestion (that is, congestion marked octets
	whether Admission Marked or Pre-emption Marked). In this case, the		whether Admission Marked or Pre-emption Marked). In this case, the
	ingress gateway MUST clear the RE flag to "0" for the same percentage		ingress gateway MUST clear the RE flag to "0" for the same percentage
	of octets of CL-packets (3%) and set it to "1" in the rest (97%).		of octets of CL-packets (3%) and set it to "1" in the rest (97%).
	Appendix A.1 gives a simple pseudo-code algorithm that the ingress		Appendix A.1 gives a simple pseudo-code algorithm that the ingress
	gateway may use to do this.		gateway may use to do this.
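
The following is a sketch under stated assumptions, not the Appendix A.1 algorithm, which is not reproduced here. It shows one deterministic way to clear the RE flag on a target fraction of octets: a running credit of octets owed is kept, and the flag is cleared whenever the credit covers the current packet. The class name ReEchoMarker and its fields are hypothetical.

```python
# Hypothetical sketch: choose the RE flag per packet so that the share of
# octets sent with RE = 0 tracks the Congestion-Level-Estimate.
class ReEchoMarker:
    def __init__(self, congestion_level: float):
        # Fraction of octets reported as congestion marked, e.g. 0.03 for 3%.
        self.congestion_level = congestion_level
        # Octets still owed as Re-Echo (RE flag cleared).
        self.credit = 0.0

    def re_flag_for(self, packet_octets: int) -> int:
        """Return 0 (RE cleared) or 1 (RE set) for the next CL packet."""
        self.credit += self.congestion_level * packet_octets
        if self.credit >= packet_octets:
            self.credit -= packet_octets
            return 0
        return 1


marker = ReEchoMarker(0.03)
flags = [marker.re_flag_for(1000) for _ in range(10000)]
cleared_share = flags.count(0) / len(flags)  # close to 0.03 for equal-size packets
```

A randomised choice per packet would also meet the stated proportion on average; the credit approach is used here only because its long-run behaviour is easy to check.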

	skipping to change at page 18, line 49		skipping to change at page 21, line 30

	4.3.2. Aggregate Bootstrap		4.3.2. Aggregate Bootstrap

	When a new reservation PATH message arrives at the egress, if there		When a new reservation PATH message arrives at the egress, if there
	are currently no flows in progress from the same ingress, there will		are currently no flows in progress from the same ingress, there will
	be no state maintaining the current level of pre-congestion marking		be no state maintaining the current level of pre-congestion marking
	for the aggregate. While the reservation signalling continues onward		for the aggregate. While the reservation signalling continues onward
	towards the receiving host, the egress gateway returns an RSVP		towards the receiving host, the egress gateway returns an RSVP
	message to the ingress with a flag [RSVP-ECN] asking the ingress to		message to the ingress with a flag [RSVP-ECN] asking the ingress to
	send a specified number of data probes between them. This bootstrap		send a specified number of data probes between them. This bootstrap

	behaviour is all described in the deployment model [CL-deploy].		behaviour is all described in the deployment model [PCN-arch].

	However, with our new re-ECN scheme, the ingress does not know what		However, with our new re-ECN scheme, the ingress does not know what
	proportion of the data probes should have the RE flag blanked,		proportion of the data probes should have the RE flag blanked,
	because it has no estimate yet of pre-congestion for the path across		because it has no estimate yet of pre-congestion for the path across
	the Diffserv region.		the Diffserv region.

	To be conservative, following the guidance for specifying other re-		To be conservative, following the guidance for specifying other re-
	ECN transports in [Re-TCP], the ingress SHOULD set the FNE codepoint		ECN transports in [Re-TCP], the ingress SHOULD set the FNE codepoint
	of the extended ECN header in all probe packets (Table 2). As per		of the extended ECN header in all probe packets (Table 2). As per
	the deployment model, the egress gateway measures the fraction of		the deployment model, the egress gateway measures the fraction of

	skipping to change at page 20, line 19		skipping to change at page 22, line 48
	gateway. It will often be possible to apply sanctions at the		gateway. It will often be possible to apply sanctions at the
	granularity of aggregates rather than flows, but in an internetworked		granularity of aggregates rather than flows, but in an internetworked
	environment it cannot be guaranteed that aggregates will be		environment it cannot be guaranteed that aggregates will be
	identifiable in remote networks. So setting FNE at the start of each		identifiable in remote networks. So setting FNE at the start of each
	flow is a safe strategy. For instance, a remote network may have		flow is a safe strategy. For instance, a remote network may have
	equal cost multi-path (ECMP) routing enabled, causing different flows		equal cost multi-path (ECMP) routing enabled, causing different flows
	between the same gateways to traverse different paths.		between the same gateways to traverse different paths.

	After an idle period of more than 1 second, the ingress gateway		After an idle period of more than 1 second, the ingress gateway
	SHOULD set the EECN field of the next packet it sends to FNE. This		SHOULD set the EECN field of the next packet it sends to FNE. This

	allows the design of network policers to be deterministic (see [Re-		allows the design of network policers to be deterministic (see
	TCP]).		[Re-TCP]).

	However, if the ingress gateway can guarantee that the network(s)		However, if the ingress gateway can guarantee that the network(s)
	that will carry the flow to its egress gateway all use a common		that will carry the flow to its egress gateway all use a common
	identifier for the aggregate (e.g. a single MPLS network without ECMP		identifier for the aggregate (e.g. a single MPLS network without ECMP
	routing), it need not set FNE when it adds a new flow to an active		routing), it need not set FNE when it adds a new flow to an active
	aggregate. And an FNE packet need only be sent if a whole aggregate		aggregate. And an FNE packet need only be sent if a whole aggregate
	has been idle for more than 1 second.		has been idle for more than 1 second.
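
The two cases above can be sketched as a single decision function. This is an illustration only: the function name, its parameters and the use of None to mean "nothing sent yet" are assumptions, and treating a never-used aggregate as requiring FNE is our reading of the bootstrap behaviour rather than a rule stated here.

```python
# Hypothetical sketch of when the ingress gateway sets FNE on the next packet.
IDLE_LIMIT_S = 1.0  # idle period after which FNE is set again


def needs_fne(now, flow_last_sent, aggregate_last_sent, common_aggregate_id):
    """Decide whether the next packet should carry the FNE codepoint.

    now                  -- current time in seconds
    flow_last_sent       -- time this flow last sent a packet, or None if new
    aggregate_last_sent  -- time any flow in the aggregate last sent, or None
    common_aggregate_id  -- True only if every network on the path is known
                            to identify the aggregate in the same way
    """
    if common_aggregate_id:
        # Per-aggregate rule: FNE only if the whole aggregate has been idle.
        last = aggregate_last_sent
    else:
        # Safe default: FNE at the start of each flow and after flow idleness.
        last = flow_last_sent
    return last is None or (now - last) > IDLE_LIMIT_S
```

With common_aggregate_id false, a new flow always starts with FNE, which matches the safe strategy described for internetworked paths where ECMP may split an aggregate.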

	4.3.4. Router Forwarding Behaviour		4.3.4. Router Forwarding Behaviour


	skipping to change at page 21, line 5		skipping to change at page 23, line 25
	congestion notification:		congestion notification:

	Preferential drop: When a router cannot avoid dropping ECN-capable		Preferential drop: When a router cannot avoid dropping ECN-capable
	packets, preferential dropping of packets with different extended		packets, preferential dropping of packets with different extended
	ECN codepoints SHOULD be implemented between packets within a PHB		ECN codepoints SHOULD be implemented between packets within a PHB
	that uses PCN marking. The drop preference order to use is		that uses PCN marking. The drop preference order to use is
	defined in Table 4. Note that to reduce configuration complexity,		defined in Table 4. Note that to reduce configuration complexity,
	Re-Echo and FNE MAY be given the same drop preference, but if		Re-Echo and FNE MAY be given the same drop preference, but if
	feasible, FNE should be dropped in preference to Re-Echo.		feasible, FNE should be dropped in preference to Re-Echo.


	+--------+------+----------------+---------+------------------------+		+---------+-------+----------------+---------+----------------------+
	| ECN    | RE   | Extended ECN   | Drop    | Re-ECN meaning         |		| ECN     | RE    | Extended ECN   | Drop    | Re-ECN meaning       |
	| field  | flag | codepoint      | Pref    |                        |		| field   | flag  | codepoint      | Pref    |                      |
	+--------+------+----------------+---------+------------------------+		+---------+-------+----------------+---------+----------------------+
	| 01     | 0    | Re-Echo        | 5/4     | Re-echoed congestion   |		| 01      | 0     | Re-Echo        | 5/4     | Re-echoed congestion |
	|        |      |                |         | and RECT               |		|         |       |                |         | and RECT             |
	| 00     | 1    | FNE            | 4       | Feedback not           |		| 00      | 1     | FNE            | 4       | Feedback not         |
	|        |      |                |         | established            |		|         |       |                |         | established          |
	| 01     | 1    | RECT           | 3       | Re-ECN capable         |		| 01      | 1     | RECT           | 3       | Re-ECN capable       |
	|        |      |                |         | transport              |		|         |       |                |         | transport            |
	| 10     | 0    | AM(0)          | 3       | Admission Marking with |		| 10      | 0     | AM(0)          | 3       | Admission Marking    |
	|        |      |                |         | Re-Echo                |		|         |       |                |         | with Re-Echo         |
	| 10     | 1    | AM(-1)         | 3       | Admission Marking      |		| 10      | 1     | AM(-1)         | 3       | Admission Marking    |
	|        |      |                |         |                        |		|         |       |                |         |                      |
	| 11     | 0    | PM(0)          | 2       | Pre-emption Marking    |		| 11      | 0     | PM(0)          | 2       | Pre-emption Marking  |
	|        |      |                |         | with Re-Echo           |		|         |       |                |         | with Re-Echo         |
	| 11     | 1    | PM(-1)         | 2       | Pre-emption Marking    |		| 11      | 1     | PM(-1)         | 2       | Pre-emption Marking  |
	|        |      |                |         |                        |		|         |       |                |         |                      |
	| 00     | 0    | Not-RECT       | 1       | Not re-ECN-capable     |		| 00      | 0     | Not-RECT       | 1       | Not re-ECN-capable   |
	|        |      |                |         | transport              |		|         |       |                |         | transport            |
	+--------+------+----------------+---------+------------------------+		+---------+-------+----------------+---------+----------------------+

	Table 4: Drop Preference of Extended ECN Codepoints (1 = drop 1st)		Table 4: Drop Preference of Extended ECN Codepoints (1 = drop 1st)
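
The drop preference order in Table 4 can be pictured as a simple lookup. The following Python sketch is illustrative only and is not taken from either draft: the names DROP_PREF and pick_victim are hypothetical, Re-Echo is given preference 5 (Table 4 also allows 4), and breaking ties by earliest arrival is our assumption.

```python
# Table 4 as a lookup: (ECN field, RE flag) -> (codepoint, drop preference).
# A lower number means "dropped earlier" (1 = drop first).
DROP_PREF = {
    ("00", 0): ("Not-RECT", 1),
    ("11", 0): ("PM(0)", 2),
    ("11", 1): ("PM(-1)", 2),
    ("01", 1): ("RECT", 3),
    ("10", 0): ("AM(0)", 3),
    ("10", 1): ("AM(-1)", 3),
    ("00", 1): ("FNE", 4),
    ("01", 0): ("Re-Echo", 5),  # MAY be configured as 4, the same as FNE
}


def pick_victim(queue):
    """Return the index of the packet to discard from a full queue.

    `queue` is a list of (ecn_field, re_flag) tuples in arrival order.
    Ties are broken in favour of the earliest arrival (an assumption of
    this sketch, not something the table specifies).
    """
    return min(range(len(queue)), key=lambda i: (DROP_PREF[queue[i]][1], i))
```

For example, given a queue holding a Re-Echo, an FNE and a Not-RECT packet, the Not-RECT packet is selected first; with only Re-Echo and FNE present, the FNE packet is selected.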


	Given this proposal is being advanced at the same time as PCN		Given this proposal is being advanced at the same time as PCN
	itself, we strongly RECOMMEND that preferential drop based on		itself, we strongly RECOMMEND that preferential drop based on
	extended ECN codepoint is added to router forwarding at the same		extended ECN codepoint is added to router forwarding at the same
	time as PCN marking. Preferential dropping can be difficult to		time as PCN marking. Preferential dropping can be difficult to
	implement, but we strongly RECOMMEND this security-related re-ECN		implement, but we strongly RECOMMEND this security-related re-ECN
	improvement where feasible as it is an effective defence against		improvement where feasible as it is an effective defence against
	flooding attacks.		flooding attacks.

	Marking vs. Drop: We propose that PCN-routers SHOULD inspect the RE		Marking vs. Drop: We propose that PCN-routers SHOULD inspect the RE
	flag as well as the ECN field to decide whether to drop or mark		flag as well as the ECN field to decide whether to drop or mark

	skipping to change at page 22, line 6		skipping to change at page 24, line 30
	understand drop, not congestion marking. But a PCN-capable router		understand drop, not congestion marking. But a PCN-capable router
	can mark rather than drop an FNE packet, even though its ECN field		can mark rather than drop an FNE packet, even though its ECN field
	when looked at in isolation is '00' which appears to be a legacy		when looked at in isolation is '00' which appears to be a legacy
	Not-ECT packet. Therefore, if a packet's RE flag is '1', even if		Not-ECT packet. Therefore, if a packet's RE flag is '1', even if
	its ECN field is '00', a PCN-enabled router SHOULD use congestion		its ECN field is '00', a PCN-enabled router SHOULD use congestion
	marking. This allows the `feedback not established' (FNE)		marking. This allows the `feedback not established' (FNE)
	codepoint to be used for probe packets, in order to pick up PCN		codepoint to be used for probe packets, in order to pick up PCN
	marking when bootstrapping an aggregate.		marking when bootstrapping an aggregate.

	ECN marking rather than dropping of FNE packets MUST only be		ECN marking rather than dropping of FNE packets MUST only be
	deployed in controlled environments, such as that in [CL-deploy],		deployed in controlled environments, such as that in [PCN-arch],
	where the presence of an egress node that understands ECN marking		where the presence of an egress node that understands ECN marking
	is assured. Congestion events might otherwise be ignored if the		is assured. Congestion events might otherwise be ignored if the
	receiver only understands drop, rather than ECN marking. This is		receiver only understands drop, rather than ECN marking. This is
	because there is no guarantee that ECN capability has been		because there is no guarantee that ECN capability has been
	negotiated if feedback is not established (FNE). Also, [Re-TCP]		negotiated if feedback is not established (FNE). Also, [Re-TCP]
	places the strong condition that a router MUST apply drop rather		places the strong condition that a router MUST apply drop rather
	than marking to FNE packets unless it can guarantee that FNE		than marking to FNE packets unless it can guarantee that FNE
	packets are rate limited either locally or upstream.		packets are rate limited either locally or upstream.
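
The marking-versus-drop rules above might be summarised as the following decision function. This is a hedged Python sketch under our own assumptions: the function name and the two boolean parameters are hypothetical, and the two conditions for marking FNE packets (an egress that understands marking, and rate limiting of FNE) are combined conservatively so that both must hold.

```python
def congestion_action(ecn_field, re_flag,
                      egress_understands_marking=False,
                      fne_rate_limited=False):
    """Decide how a PCN-capable router signals congestion on one packet.

    Returns "drop" or "mark". The router looks at the RE flag as well as
    the ECN field, so '00' alone does not imply a legacy packet.
    """
    if ecn_field == "00" and re_flag == 0:
        # Not-RECT: a legacy transport that only understands drop.
        return "drop"
    if ecn_field == "00" and re_flag == 1:
        # FNE: mark only in a controlled environment where the egress is
        # known to understand marking and FNE packets are rate limited.
        if egress_understands_marking and fne_rate_limited:
            return "mark"
        return "drop"
    # All other extended ECN codepoints belong to re-ECN capable transports.
    return "mark"
```

With the default arguments an FNE packet is dropped, which is the safe behaviour outside a controlled environment.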

	4.3.5. Extensions		4.3.5. Extensions

	If a different signalling system, such as NSIS, were used, but it		If a different signalling system, such as NSIS, were used, but it
	provided admission control in a similar way, using pre-congestion		provided admission control in a similar way, using pre-congestion
	notification (e.g. with RMD [NSIS-RMD]) we believe re-ECN could be		notification (e.g. with RMD [NSIS-RMD]) we believe re-ECN could be
	used to protect against misbehaving networks in the same way as		used to protect against misbehaving networks in the same way as
	proposed above.		proposed above.

	5. Emulating Border Policing with Re-ECN		5. Emulating Border Policing with Re-ECN


			Note that the re-ECN protocol described in Section 4 above would
			require standardisation, whereas operators acting in their own
			interests would be expected to deploy policing and monitoring
			functions similar to those proposed in the sections below without any
			further need for standardisation by the IETF. Flexibility is
			expected in exactly how policing and monitoring is done.

	5.1. Informal Terminology		5.1. Informal Terminology

	In the rest of this memo, where the context makes it clear, we will		In the rest of this memo, where the context makes it clear, we will
	sometimes loosely use the term `congestion' rather than using the		sometimes loosely use the term `congestion' rather than using the
	stricter `downstream pre-congestion'. Also we will loosely talk of		stricter `downstream pre-congestion'. Also we will loosely talk of
	positive or negative flows, meaning flows where the moving average of		positive or negative flows, meaning flows where the moving average of
	the downstream pre-congestion metric is persistently positive or		the downstream pre-congestion metric is persistently positive or
	negative. The notion of a negative metric arises because it is		negative. The notion of a negative metric arises because it is
	derived by subtracting one metric from another. Of course actual		derived by subtracting one metric from another. Of course actual
	downstream congestion cannot be negative, only the metric can		downstream congestion cannot be negative, only the metric can

	skipping to change at page 23, line 7		skipping to change at page 26, line 5
	0. Blanking the RE flag increments the worth of a packet to +1.		0. Blanking the RE flag increments the worth of a packet to +1.
	Congestion marking a packet decrements its worth (whether admission		Congestion marking a packet decrements its worth (whether admission
	marking or pre-emption marking). Congestion marking a previously		marking or pre-emption marking). Congestion marking a previously
	blanked packet cancels out the positive and negative worth of each		blanked packet cancels out the positive and negative worth of each
	marking (a worth of 0). The FNE codepoint is an exception. It has		marking (a worth of 0). The FNE codepoint is an exception. It has
	the same positive worth as a packet with the Re-Echo codepoint. The		the same positive worth as a packet with the Re-Echo codepoint. The
	table below specifies unambiguously the worth of each extended ECN		table below specifies unambiguously the worth of each extended ECN
	codepoint. Note the order is different from the previous table to		codepoint. Note the order is different from the previous table to
	emphasise how congestion marking processes decrement the worth.		emphasise how congestion marking processes decrement the worth.


	+-------+------+--------------+-------+----------------------------------+		+-------+------+--------------+-------+----------------------------------+
	| ECN   | RE   | Extended ECN | Worth | Re-ECN meaning                   |		| ECN   | RE   | Extended ECN | Worth | Re-ECN meaning                   |
	| field | flag | codepoint    |       |                                  |		| field | flag | codepoint    |       |                                  |
	+-------+------+--------------+-------+----------------------------------+		+-------+------+--------------+-------+----------------------------------+
	|  00   |  0   | Not-RECT     |  n/a  | Not re-ECN-capable transport     |		|  00   |  0   | Not-RECT     |  n/a  | Not re-ECN-capable transport     |
	|  01   |  0   | Re-Echo      |  +1   | Re-echoed congestion and RECT    |		|  01   |  0   | Re-Echo      |  +1   | Re-echoed congestion and RECT    |
	|  10   |  0   | AM(0)        |   0   | Admission Marking with Re-Echo   |		|  10   |  0   | AM(0)        |   0   | Admission Marking with Re-Echo   |
	|  11   |  0   | PM(0)        |   0   | Pre-emption Marking with Re-Echo |		|  11   |  0   | PM(0)        |   0   | Pre-emption Marking with Re-Echo |
	|  00   |  1   | FNE          |  +1   | Feedback not established         |		|  00   |  1   | FNE          |  +1   | Feedback not established         |
	|  01   |  1   | RECT         |   0   | Re-ECN capable transport         |		|  01   |  1   | RECT         |   0   | Re-ECN capable transport         |
	|  10   |  1   | AM(-1)       |  -1   | Admission Marking                |		|  10   |  1   | AM(-1)       |  -1   | Admission Marking                |
	|  11   |  1   | PM(-1)       |  -1   | Pre-emption Marking              |		|  11   |  1   | PM(-1)       |  -1   | Pre-emption Marking              |
	+-------+------+--------------+-------+----------------------------------+		+-------+------+--------------+-------+----------------------------------+

	Table 5: 'Worth' of Extended ECN Codepoints		Table 5: 'Worth' of Extended ECN Codepoints
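
To make the notion of worth concrete, the sketch below encodes Table 5 and sums the worth of a batch of packets. It is an illustration only: the names WORTH and aggregate_worth are hypothetical, and weighting each packet by its size in octets is our assumption (chosen because the pricing discussion later in this memo is per octet).

```python
# Table 5 as a lookup: (ECN field, RE flag) -> worth.
# None stands for "n/a" (Not-RECT packets carry no re-ECN worth).
WORTH = {
    ("00", 0): None,  # Not-RECT
    ("01", 0): +1,    # Re-Echo
    ("10", 0): 0,     # AM(0): marked, but previously blanked
    ("11", 0): 0,     # PM(0): marked, but previously blanked
    ("00", 1): +1,    # FNE, same worth as Re-Echo
    ("01", 1): 0,     # RECT
    ("10", 1): -1,    # AM(-1)
    ("11", 1): -1,    # PM(-1)
}


def aggregate_worth(packets):
    """Sum worth over (codepoint, size_in_octets) pairs, skipping Not-RECT.

    A positive total means more blanking than marking was seen; a
    negative total means the opposite.
    """
    total = 0
    for codepoint, size in packets:
        worth = WORTH[codepoint]
        if worth is not None:
            total += worth * size
    return total
```

A batch containing one 1500-octet Re-Echo packet and one 1500-octet AM(-1) packet sums to zero, which is what a correctly marked flow should approach over time.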

	5.2. Policing Overview		5.2. Policing Overview

	It will be recalled that downstream congestion can be found by		It will be recalled that downstream congestion can be found by
	subtracting upstream congestion from path congestion. Figure 4		subtracting upstream congestion from path congestion. Figure 4
	displays the difference between the two plots in Figure 3 to show		displays the difference between the two plots in Figure 3 to show
	downstream pre-congestion across the same path through the Internet.		downstream pre-congestion across the same path through the Internet.


	skipping to change at page 24, line 41		skipping to change at page 27, line 41
	sanctions to flows if downstream congestion goes negative before the		sanctions to flows if downstream congestion goes negative before the
	egress gateway. The upward arrow at Domain C's border with the		egress gateway. The upward arrow at Domain C's border with the
	egress gateway represents the incentive the sanctions would create to		egress gateway represents the incentive the sanctions would create to
	prevent negative traffic. The same upward pressure can be applied at		prevent negative traffic. The same upward pressure can be applied at
	any domain border (arrows not shown).		any domain border (arrows not shown).

	Any flow that persistently goes negative by the time it leaves a		Any flow that persistently goes negative by the time it leaves a
	domain must not have been marked correctly in the first place. A		domain must not have been marked correctly in the first place. A
	domain that discovers such a flow can adopt a range of strategies to		domain that discovers such a flow can adopt a range of strategies to
	protect itself. Which strategy it uses will depend on policy,		protect itself. Which strategy it uses will depend on policy,
	because it cannot immediately assume malice---there may be an		because it cannot immediately assume malice--there may be an innocent
	innocent configuration error somewhere in the system.		configuration error somewhere in the system.

	This memo does not propose to standardise any particular mechanism to		This memo does not propose to standardise any particular mechanism to
	detect persistently negative flows, but Section 5.5 does give		detect persistently negative flows, but Section 5.5 does give
	examples. Note that we have used the term flow, but there will be no		examples. Note that we have used the term flow, but there will be no
	need to burrow into the transport layer for port numbers; identifiers		need to burrow into the transport layer for port numbers; identifiers
	visible in the network layer will be sufficient (IP address pair,		visible in the network layer will be sufficient (IP address pair,
	DSCP, protocol ID). The appendix also gives a mechanism to bound the		DSCP, protocol ID). The appendix also gives a mechanism to bound the
	required flow state, preventing state exhaustion attacks.		required flow state, preventing state exhaustion attacks.

	Of course, some domains may trust other domains to comply with		Of course, some domains may trust other domains to comply with

	skipping to change at page 26, line 28		skipping to change at page 29, line 28
	price to pre-congestion itself. Then the usage element of the		price to pre-congestion itself. Then the usage element of the
	interconnection contract would directly relate to the volume of pre-		interconnection contract would directly relate to the volume of pre-
	congestion caused by the upstream network.		congestion caused by the upstream network.

	The direction of penalties and charges relative to the direction of		The direction of penalties and charges relative to the direction of
	traffic flow is a constant source of confusion. Typically, where		traffic flow is a constant source of confusion. Typically, where
	capacity charges are concerned, lower tier customer networks pay		capacity charges are concerned, lower tier customer networks pay
	higher tier provider networks. So money flows from the edges to the		higher tier provider networks. So money flows from the edges to the
	middle of the internetwork, towards greater connectivity,		middle of the internetwork, towards greater connectivity,
	irrespective of the flow of data. But we advise that penalties or		irrespective of the flow of data. But we advise that penalties or
	charges for usage should follow the same direction as the data		charges for usage should follow the same direction as the data flow--
	flow---the direction of control at the network layer. Otherwise a		the direction of control at the network layer. Otherwise a network
	network lays itself open to `denial of funds' attacks. So, where a		lays itself open to `denial of funds' attacks. So, where a tier 2
	tier 2 provider sends data into a tier 3 customer network, we would		provider sends data into a tier 3 customer network, we would expect
	expect the penalty clauses for sending too much pre-congestion to be		the penalty clauses for sending too much pre-congestion to be against
	against the tier 2 network, even though it is the provider.		the tier 2 network, even though it is the provider.

	It may help to remember that data will be flowing in the other		It may help to remember that data will be flowing in the other
	direction too. So the provider network has as much opportunity to		direction too. So the provider network has as much opportunity to
	levy usage penalties as its customer, and it can set the price or		levy usage penalties as its customer, and it can set the price or
	strength of its own penalties higher if it chooses. Usage charges in		strength of its own penalties higher if it chooses. Usage charges in
	both directions tend to cancel each other out, which confirms that		both directions tend to cancel each other out, which confirms that
	usage-charging is less to do with revenue raising and more to do with		usage-charging is less to do with revenue raising and more to do with
	encouraging load control discipline in order to smooth peaks and		encouraging load control discipline in order to smooth peaks and
	troughs, improving utilisation and quality.		troughs, improving utilisation and quality.


	skipping to change at page 28, line 50		skipping to change at page 31, line 50
	cheater, because the penalties are at least proportionate to the		cheater, because the penalties are at least proportionate to the
	level of the cheat. If an edge network operator is selling		level of the cheat. If an edge network operator is selling
	reservations at a large profit over the congestion cost, these pre-		reservations at a large profit over the congestion cost, these pre-
	congestion penalties will not be sufficient to ensure networks in the		congestion penalties will not be sufficient to ensure networks in the
	middle get a share of those profits, but at least they can cover		middle get a share of those profits, but at least they can cover
	their costs.		their costs.

	We will now explain with an example. When a whole inter-network is		We will now explain with an example. When a whole inter-network is
	operating at normal (typically very low) congestion, the pre-		operating at normal (typically very low) congestion, the pre-
	congestion marking from virtual queues will be a little higher than		congestion marking from virtual queues will be a little higher than
	if the real queues had been used---still low, but more noticeable.		if the real queues had been used--still low, but more noticeable.
	But low congestion levels do not imply that usage /charges/ must also		But low congestion levels do not imply that usage _charges_ must also
	be low. Usage charges will depend on the /price/ L as well.		be low. Usage charges will depend on the _price_ L as well.

	If the metric of the usage element of an interconnection agreement		If the metric of the usage element of an interconnection agreement
	was changed from pure volume to pre-congested volume, one would		was changed from pure volume to pre-congested volume, one would
	expect the price of pre-congestion to be arranged so that the total		expect the price of pre-congestion to be arranged so that the total
	usage charge remained about the same. So, if an average pre-		usage charge remained about the same. So, if an average pre-
	congestion fraction turned out to be 1/1000, one would expect that		congestion fraction turned out to be 1/1000, one would expect that
	the price L (per octet) of pre-congestion would be about 1000 times		the price L (per octet) of pre-congestion would be about 1000 times
	the previously used (per octet) price for volume. We should add that		the previously used (per octet) price for volume. We should add that
	a switch to pre-congestion is unlikely to exactly maintain the same		a switch to pre-congestion is unlikely to exactly maintain the same
	overall level of usage charges, but this argument will be		overall level of usage charges, but this argument will be
	approximately true, because usage charge will rise to at least the		approximately true, because usage charge will rise to at least the
	level the market finds necessary to push back against usage.		level the market finds necessary to push back against usage.
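
The arithmetic in the paragraph above can be checked with a few lines of Python. This is only a sketch under stated assumptions: the function names are hypothetical, the usage charge is taken to be price L times octets sent times the pre-congestion fraction, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def equivalent_congestion_price(volume_price, congestion_fraction):
    """Price L (per pre-congested octet) that leaves the total usage
    charge unchanged when switching from pure volume charging."""
    return volume_price / congestion_fraction


def usage_charge(price_l, octets_sent, congestion_fraction):
    """Usage charge when only the pre-congested share of volume is priced."""
    return price_l * octets_sent * congestion_fraction


# Worked example: average pre-congestion fraction of 1/1000.
fraction = Fraction(1, 1000)
old_price = Fraction(1)            # arbitrary unit price per octet
price_l = equivalent_congestion_price(old_price, fraction)
```

Here price_l comes out as 1000 times old_price, and usage_charge(price_l, 10**9, fraction) equals the old volume charge of old_price * 10**9. Because the charge is linear in the pre-congestion fraction, a network that doubles the pre-congestion it causes doubles its charge.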

	From the above example it can be seen why a 1000x higher price will		From the above example it can be seen why a 1000x higher price will
	make operators become acutely sensitive to the congestion they cause		make operators become acutely sensitive to the congestion they cause
	in other networks, which is of course the desired effect; to		in other networks, which is of course the desired effect; to
	encourage networks to /control/ the congestion they allow their users		encourage networks to _control_ the congestion they allow their users
	to cause to others.		to cause to others.

	If any network sends even one flow at a higher rate, it will		If any network sends even one flow at a higher rate, it will
	immediately have to pay proportionately more usage charges. Because		immediately have to pay proportionately more usage charges. Because
	there is no knowledge of reservations within the Diffserv region, no		there is no knowledge of reservations within the Diffserv region, no
	interior router can police whether the rate of each flow is greater		interior router can police whether the rate of each flow is greater
	than each reservation. So the system doesn't truly emulate rate-		than each reservation. So the system doesn't truly emulate rate-
	policing of each flow. But there is no incentive to pack a higher		policing of each flow. But there is no incentive to pack a higher
	rate into a reservation, because the charges are directly		rate into a reservation, because the charges are directly
	proportional to rate, irrespective of the reservations.		proportional to rate, irrespective of the reservations.

	skipping to change at page 30, line 8		skipping to change at page 33, line 8
	5.5. Sanctioning Dishonest Marking		5.5. Sanctioning Dishonest Marking

	As CL traffic leaves the last network before the egress gateway		As CL traffic leaves the last network before the egress gateway
	(domain C) the RE blanking fraction should match the congestion		(domain C) the RE blanking fraction should match the congestion
	marking fraction, when averaged over a sufficiently long duration		marking fraction, when averaged over a sufficiently long duration
	(perhaps ~10s to allow a few rounds of feedback through regular		(perhaps ~10s to allow a few rounds of feedback through regular
	signalling of new and refreshed reservations).		signalling of new and refreshed reservations).

	To protect itself, domain C should install a monitor at its egress.		To protect itself, domain C should install a monitor at its egress.
	It aims to detect flows of CL packets that are persistently negative.		It aims to detect flows of CL packets that are persistently negative.
	If flows are positive, domain C need take no action---this simply		If flows are positive, domain C need take no action--this simply
	means an upstream network must be paying more penalties than it needs		means an upstream network must be paying more penalties than it needs
	to. Appendix A.3 gives a suggested algorithm for the monitor,		to. Appendix A.3 gives a suggested algorithm for the monitor,
	meeting the criteria below.		meeting the criteria below.

	o It SHOULD introduce minimal false positives for honest flows;		o It SHOULD introduce minimal false positives for honest flows;

	o It SHOULD quickly detect and sanction dishonest flows (minimal		o It SHOULD quickly detect and sanction dishonest flows (minimal
	false negatives);		false negatives);

	o It MUST be invulnerable to state exhaustion attacks from malicious		o It MUST be invulnerable to state exhaustion attacks from malicious

	skipping to change at page 31, line 49		skipping to change at page 34, line 49
	5.6. Border Mechanisms		5.6. Border Mechanisms

	5.6.1. Border Accounting Mechanisms		5.6.1. Border Accounting Mechanisms

	One of the main design goals of re-ECN was for border security		One of the main design goals of re-ECN was for border security
	mechanisms to be as simple as possible, otherwise they would become		mechanisms to be as simple as possible, otherwise they would become
	the pinch-points that limit scalability of the whole internetwork.		the pinch-points that limit scalability of the whole internetwork.
	As the title of this memo suggests, we want to avoid per-flow		As the title of this memo suggests, we want to avoid per-flow
	processing at borders. We also want to keep to passive mechanisms		processing at borders. We also want to keep to passive mechanisms
	that can monitor traffic in parallel to forwarding, rather than		that can monitor traffic in parallel to forwarding, rather than
	having to filter traffic inline---in series with forwarding. As data		having to filter traffic inline--in series with forwarding. As data
	rates continue to rise, we suspect that all-optical interconnection		rates continue to rise, we suspect that all-optical interconnection
	between networks will soon be a requirement. So we want to avoid any		between networks will soon be a requirement. So we want to avoid any
	new need for buffering (even though border filtering is current		new need for buffering (even though border filtering is current
	practice for other reasons, we don't want to make it even less likely		practice for other reasons, we don't want to make it even less likely
	that we will ever get rid of it).		that we will ever get rid of it).

	So far, we have been able to keep the border mechanisms simple,		So far, we have been able to keep the border mechanisms simple,
	despite having had to harden them against some subtle attacks on the		despite having had to harden them against some subtle attacks on the
	re-ECN design. The mechanisms are still passive and avoid per-flow		re-ECN design. The mechanisms are still passive and avoid per-flow
	processing, although we do use filtering as a fail-safe to		processing, although we do use filtering as a fail-safe to

	skipping to change at page 34, line 18		skipping to change at page 37, line 18
	negative flows may not be easy, just the single step of neutralising		negative flows may not be easy, just the single step of neutralising
	their polluting effect on congestion metrics removes all the gains		their polluting effect on congestion metrics removes all the gains
	networks could otherwise make from mounting dummy traffic attacks on		networks could otherwise make from mounting dummy traffic attacks on
	each other. This puts all networks on the same side (only with		each other. This puts all networks on the same side (only with
	respect to negative flows of course), rather than being pitched		respect to negative flows of course), rather than being pitched
	against each other. The network where this flow goes negative as		against each other. The network where this flow goes negative as
	well as all the networks downstream lose out from not being		well as all the networks downstream lose out from not being
	reimbursed for any congestion this flow causes. So they all have an		reimbursed for any congestion this flow causes. So they all have an
	interest in getting rid of these negative flows. Networks forwarding		interest in getting rid of these negative flows. Networks forwarding
	a flow before it goes negative aren't strictly on the same side, but		a flow before it goes negative aren't strictly on the same side, but
	they are disinterested bystanders---they don't care that the flow		they are disinterested bystanders--they don't care that the flow goes
	goes negative downstream, but at least they can't actively gain from		negative downstream, but at least they can't actively gain from
	making it go negative. The problem becomes localised so that once a		making it go negative. The problem becomes localised so that once a
	flow goes negative, all the networks from where it happens and beyond		flow goes negative, all the networks from where it happens and beyond
	downstream each have a small problem, each can detect it has a		downstream each have a small problem, each can detect it has a
	problem and each can get rid of the problem if it chooses to. But		problem and each can get rid of the problem if it chooses to. But
	negative flows can no longer be used for any new attacks.		negative flows can no longer be used for any new attacks.

	Once an unbiased estimate of the effect of negative flows can be		Once an unbiased estimate of the effect of negative flows can be
	made, the problem reduces to detecting and preferably removing flows		made, the problem reduces to detecting and preferably removing flows
	that have gone negative as soon as possible. But importantly,		that have gone negative as soon as possible. But importantly,
	complete eradication of negative flows is no longer critical---best		complete eradication of negative flows is no longer critical--best
	endeavours will be sufficient.		endeavours will be sufficient.

	Note that the guiding principle behind all the above discussion is		Note that the guiding principle behind all the above discussion is
	that any gain from subverting the protocol should be precisely		that any gain from subverting the protocol should be precisely
	neutralised, rather than punished. If a gain is punished to a		neutralised, rather than punished. If a gain is punished to a
	greater extent than is sufficient to neutralise it, it will most		greater extent than is sufficient to neutralise it, it will most
	likely open up a new vulnerability, where the amplifying effect of		likely open up a new vulnerability, where the amplifying effect of
	the punishment mechanism can be turned on others.		the punishment mechanism can be turned on others.

	For instance, if possible, flows should be removed as soon as they go		For instance, if possible, flows should be removed as soon as they go

	skipping to change at page 35, line 16		skipping to change at page 38, line 16
	5.6.2. Competitive Routing		5.6.2. Competitive Routing

	With the above penalty system, each domain seems to have a perverse		With the above penalty system, each domain seems to have a perverse
	incentive to fake pre-congestion. For instance domain B profits from		incentive to fake pre-congestion. For instance domain B profits from
	the difference between penalties it receives at its ingress (its		the difference between penalties it receives at its ingress (its
	revenue) and those it pays at its egress (its cost). So if B		revenue) and those it pays at its egress (its cost). So if B
	overstates internal pre-congestion it seems to increase its profit.		overstates internal pre-congestion it seems to increase its profit.
	However, we can assume that domain A could bypass B, routing through		However, we can assume that domain A could bypass B, routing through
	other domains to reach the egress. So the competitive discipline of		other domains to reach the egress. So the competitive discipline of
	least-cost routing can ensure that any domain tempted to fake pre-		least-cost routing can ensure that any domain tempted to fake pre-
	congestion for profit risks losing /all/ its incoming traffic. The		congestion for profit risks losing _all_ its incoming traffic. The
	least congested route would eventually be able to win this		least congested route would eventually be able to win this
	competitive game, only as long as it didn't declare more fake pre-		competitive game, only as long as it didn't declare more fake pre-
	congestion than the next most competitive route.		congestion than the next most competitive route.


			The competitive effect of interdomain routing might be weaker nearer
			to the egress. For instance, C may be the only route B can take to
			reach the ultimate receiver. And if C over-penalises B, the egress
			gateway and the ultimate receiver seem to have no incentive to move
			their terminating attachment to another network, because only B and
			those upstream of B suffer the higher penalties. However, we must
			remember that we are only looking at the money flows at the
			unidirectional network layer. There are likely to be all sorts of
			higher level business models constructed over the top of these low
			level 'sender-pays' penalties. For instance, we might expect a
			session layer charging model where the session originator pays for a
			pair of duplex flows, one as receiver and one as sender.
			Traditionally this has been a common model for telephony and we might
			expect it to be used, at least sometimes, for other media such as
			video. Wherever such a model is used, the data receiver will be
			directly affected if its sessions terminate through a network like C
			that fakes congestion to over-penalise B. So end-customers will
			experience a direct competitive pressure to switch to cheaper
			networks, away from networks like C that try to over-penalise B.

	This memo does not need to standardise any particular mechanism for		This memo does not need to standardise any particular mechanism for
	routing based on re-ECN. Goldenberg et al [Smart_rtg] refers to		routing based on re-ECN. Goldenberg et al [Smart_rtg] refers to
	various commercial products and presents its own algorithms for		various commercial products and presents its own algorithms for
	moving traffic between multi-homed routes based on usage charges.		moving traffic between multi-homed routes based on usage charges.
	None of these systems require any changes to standards protocols		None of these systems require any changes to standards protocols
	because the choice between the available border gateway protocol		because the choice between the available border gateway protocol
	(BGP) routes is based on a combination of local knowledge of the		(BGP) routes is based on a combination of local knowledge of the
	charging regime and local measurement of traffic levels. If, as we		charging regime and local measurement of traffic levels. If, as we
	propose, charges or penalties were based on the level of re-ECN		propose, charges or penalties were based on the level of re-ECN
	measured in passing traffic, a similar optimisation could be achieved		measured in passing traffic, a similar optimisation could be achieved

	skipping to change at page 36, line 30		skipping to change at page 39, line 50
	interface. Then subsequent packets matching the same source and		interface. Then subsequent packets matching the same source and
	destination address and DSCP should be monitored. If the RE		destination address and DSCP should be monitored. If the RE
	blanking fraction minus the congestion marking fraction is		blanking fraction minus the congestion marking fraction is
	persistently negative, a management alarm SHOULD be raised, and		persistently negative, a management alarm SHOULD be raised, and
	the flow MAY be automatically subject to focused drop.		the flow MAY be automatically subject to focused drop.

	Both these mechanisms rely on the fact that highly positive (or		Both these mechanisms rely on the fact that highly positive (or
	negative) flows will appear more quickly in the sample by selecting		negative) flows will appear more quickly in the sample by selecting
	randomly solely from positive (or negative) packets.		randomly solely from positive (or negative) packets.
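The per-flow check described above (compare the RE blanking fraction with the congestion marking fraction and react if the difference stays negative) could be sketched as follows. This is only an illustrative sketch, not part of either draft: the class name FlowMonitor, the fixed-size measurement window and the `patience` count used to decide what "persistently" means are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class FlowMonitor:
    """Hypothetical monitor for one (source, destination, DSCP) aggregate."""

    window: int = 100          # packets per measurement window (assumed)
    patience: int = 3          # consecutive negative windows before alarm (assumed)
    packets: int = 0
    re_blanked: int = 0
    congestion_marked: int = 0
    negative_windows: int = 0
    alarm: bool = False

    def observe(self, re_blanked: bool, congestion_marked: bool) -> None:
        """Account for one packet and evaluate the balance at window end."""
        self.packets += 1
        self.re_blanked += int(re_blanked)
        self.congestion_marked += int(congestion_marked)
        if self.packets < self.window:
            return
        # RE blanking fraction minus congestion marking fraction.
        balance = (self.re_blanked - self.congestion_marked) / self.window
        if balance < 0:
            self.negative_windows += 1
        else:
            self.negative_windows = 0
        if self.negative_windows >= self.patience:
            # Persistently negative: raise a management alarm. A real
            # device might additionally subject the flow to focused drop.
            self.alarm = True
        # Start a fresh window.
        self.packets = self.re_blanked = self.congestion_marked = 0
```

A flow whose packets are congestion marked without a matching share of RE-blanked packets drives the balance negative in every window and trips the alarm after `patience` windows. A flow that blanks RE as often as it is marked never does.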


	Note that there is no assumption that /users/ behave rationally. The		Note that there is no assumption that _users_ behave rationally. The
	system is protected from the vagaries of irrational user behaviour by		system is protected from the vagaries of irrational user behaviour by
	the ingress gateways, which transform internal penalties into a		the ingress gateways, which transform internal penalties into a
	deterministic, admission control mechanism that prevents users from		deterministic, admission control mechanism that prevents users from
	misbehaving, by directly engineered means.		misbehaving, by directly engineered means.

	6. Analysis		6. Analysis

	The domains in Figure 1 are not expected to be completely malicious		The domains in Figure 1 are not expected to be completely malicious
	towards each other. After all, we can assume that they are all co-		towards each other. After all, we can assume that they are all co-
	operating to provide an internetworking service to the benefit of		operating to provide an internetworking service to the benefit of
	each of them and their customers. Otherwise their routing policies		each of them and their customers. Otherwise their routing policies
	would not interconnect them in the first place. However, we assume		would not interconnect them in the first place. However, we assume
	that they are also competitors of each other. So a network may try		that they are also competitors of each other. So a network may try
	to contravene our proposed protocol if it would gain or make a		to contravene our proposed protocol if it would gain or make a
	competitor lose, or both, but only if it can do so without being		competitor lose, or both, but only if it can do so without being
	caught. Therefore we do not have to consider every possible random		caught. Therefore we do not have to consider every possible random
	attack one network could launch on the traffic of another, given		attack one network could launch on the traffic of another, given
	that one network can always drop or corrupt packets that it		that one network can always drop or corrupt packets that it
	forwards on behalf of another anyway.		forwards on behalf of another anyway.


	Therefore, we only consider new opportunities for /gainful/ attack		Therefore, we only consider new opportunities for _gainful_ attack
	that our proposal introduces. But to a certain extent we can also		that our proposal introduces. But to a certain extent we can also
	rely on the in depth defences we have described (Section 5.6.3)		rely on the in depth defences we have described (Section 5.6.3)
	intended to mitigate the potential impact if one network accidentally		intended to mitigate the potential impact if one network accidentally
	misconfigures the workings of this protocol.		misconfigures the workings of this protocol.

	The ingress and egress gateways are shown in the most generic		The ingress and egress gateways are shown in the most generic
	arrangement possible in Figure 1, without any surrounding network.		arrangement possible in Figure 1, without any surrounding network.
	This allows us to consider more specific cases where these gateways		This allows us to consider more specific cases where these gateways
	and a neighbouring network are operated by the same player. As well		and a neighbouring network are operated by the same player. As well
	as cases where the same player operates neighbouring networks, we		as cases where the same player operates neighbouring networks, we

	skipping to change at page 38, line 11		skipping to change at page 41, line 30

	o If the ingress gateway does not declare downstream pre-congestion		o If the ingress gateway does not declare downstream pre-congestion
	high enough on average, it will `hit the ground before the		high enough on average, it will `hit the ground before the
	runway', going negative and triggering sanctions, either directly		runway', going negative and triggering sanctions, either directly
	against the traffic or against the ingress gateway at a management		against the traffic or against the ingress gateway at a management
	level		level

	An executive summary of our security analysis can be stated in three		An executive summary of our security analysis can be stated in three
	parts, distinguished by the type of collusion considered.		parts, distinguished by the type of collusion considered.


	Neighbour-only Middle-Middle Collusion: Here there is no collusion or		Neighbour-only Middle-Middle Collusion: Here there is no collusion
	collusion is limited to neighbours in the feedback loop. In other		or collusion is limited to neighbours in the feedback loop. In
	words, two neighbouring networks can be assumed to act as one. Or		other words, two neighbouring networks can be assumed to act as
	the egress gateway might collude with domain C. Or the ingress		one. Or the egress gateway might collude with domain C. Or the
	gateway might collude with domain A. Or ingress and egress		ingress gateway might collude with domain A. Or ingress and egress
	gateways might collude with each other.		gateways might collude with each other.

	In these cases where only neighbours in the feedback loop collude,		In these cases where only neighbours in the feedback loop collude,
	we conclude that all parties have a positive incentive to declare		we conclude that all parties have a positive incentive to declare
	downstream pre-congestion truthfully, and the ingress gateway has		downstream pre-congestion truthfully, and the ingress gateway has
	a positive incentive to invoke admission control when congestion		a positive incentive to invoke admission control when congestion
	rises above the admission threshold in any network in the region		rises above the admission threshold in any network in the region
	(including its own). No party has an incentive to send more		(including its own). No party has an incentive to send more
	traffic than declared in reservation signalling (even though only		traffic than declared in reservation signalling (even though only
	the gateways read this signalling). In short, no party can gain		the gateways read this signalling). In short, no party can gain

	skipping to change at page 39, line 16		skipping to change at page 42, line 34
	incentive to break it have mounted a full analysis.		incentive to break it have mounted a full analysis.

	7. Incremental Deployment		7. Incremental Deployment

	We believe ECN has so far not been widely deployed because it		We believe ECN has so far not been widely deployed because it
	requires widespread end system and network deployment just to achieve		requires widespread end system and network deployment just to achieve
	a marginal improvement in performance. The ability to offer a new		a marginal improvement in performance. The ability to offer a new
	service (admission control) would be a much stronger driver for ECN		service (admission control) would be a much stronger driver for ECN
	deployment.		deployment.


	As stated in the introduction, the aim of this memo is to "build in		As stated in the introduction, the aim of this memo is to "Design in
	security from the start" when admission control is based on pre-		security from the start" when admission control is based on pre-

	congestion notification. However, the proposal has been designed so		congestion notification. The proposal has been designed so that
	that security can be added some time after first deployment. Given		security can be added some time after first deployment, but only if
	admission control based on pre-congestion notification requires few		the PCN wire protocol encoding is defined with the foresight to
	changes to standards, it should be deployable fairly soon. However,		accommodate the extended set of codepoints defined in this document.
	re-ECN requires a change to IP, which may take a little longer.		Given admission control based on pre-congestion notification requires
			few changes to standards, it should be deployable fairly soon.
			However, re-ECN requires a change to IP, which may take a little
			longer.

	We expect that initial deployments of PCN-based admission control		We expect that initial deployments of PCN-based admission control
	will be confined to single networks, or to clubs of networks that		will be confined to single networks, or to clubs of networks that
	trust each other. The proposal in this memo will only become		trust each other. The proposal in this memo will only become
	relevant once networks with conflicting interests wish to		relevant once networks with conflicting interests wish to
	interconnect their admission controlled services, but without the		interconnect their admission controlled services, but without the
	scalability constraints of per-flow border policing. It will not be		scalability constraints of per-flow border policing. It will not be
	possible to use re-ECN, even in a controlled environment between		possible to use re-ECN, even in a controlled environment between
	consenting operators, unless it is standardised into IP. Given the		consenting operators, unless it is standardised into IP. Given the
	IPv4 header has limited space for further changes, current IESG		IPv4 header has limited space for further changes, current IESG

	policy [{ToDo: ref?}] is not to allow experimental use of codepoints		policy [RFC4727] is not to allow experimental use of codepoints in
	in the IPv4 header, as whenever an experiment isn't taken up, the		the IPv4 header, as whenever an experiment isn't taken up, the space
	space it used tends to be impossible to reclaim.		it used tends to be impossible to reclaim.

	If PCN-based admission control is deployed before re-ECN is		If PCN-based admission control is deployed before re-ECN is
	standardised into IP, wherever a network (or club of networks)		standardised into IP, wherever a network (or club of networks)
	connects to another network (or club of networks) with conflicting		connects to another network (or club of networks) with conflicting
	interests, they will place a gateway between the two regions that		interests, they will place a gateway between the two regions that
	does per-flow rate policing and admission control. If re-ECN is		does per-flow rate policing and admission control. If re-ECN is
	eventually standardised into IP, it will be possible for these		eventually standardised into IP, it will be possible for these
	separate regions to upgrade all their gateways to use re-ECN before		separate regions to upgrade all their gateways to use re-ECN before
	removing the per-flow policing gateways between them. Given the		removing the per-flow policing gateways between them. Given the
	edge-to-edge deployment model of PCN-based admission control, it is		edge-to-edge deployment model of PCN-based admission control, it is

	skipping to change at page 40, line 30		skipping to change at page 44, line 4
	causes in a remote network. This is the problem that has previously		causes in a remote network. This is the problem that has previously
	made it so hard to provide scalable admission control.		made it so hard to provide scalable admission control.

	The case for using re-feedback (a generalisation of re-ECN) to police		The case for using re-feedback (a generalisation of re-ECN) to police
	congestion response and provide QoS is made in [Re-fb]. Essentially,		congestion response and provide QoS is made in [Re-fb]. Essentially,
	the insight is that congestion is a factor that crosses layers from		the insight is that congestion is a factor that crosses layers from
	the physical upwards. Therefore re-feedback polices congestion where		the physical upwards. Therefore re-feedback polices congestion where
	it emerges from a physical interface between networks. This is		it emerges from a physical interface between networks. This is
	achieved by bringing the congestion information to the interface,		achieved by bringing the congestion information to the interface,
	rather than examining packet addressing where there is congestion.		rather than examining packet addressing where there is congestion.


	Then congestion crossing the physical interface at a border can be		Then congestion crossing the physical interface at a border can be
	policed at the interface, rather than policing the congestion on		policed at the interface, rather than policing the congestion on
	packets that claim to come from an address (which may be spoofed).		packets that claim to come from an address (which may be spoofed).
	Also, re-feedback works in the network layer independently of other		Also, re-feedback works in the network layer independently of other

	layers---despite its name re-feedback does not actually require		layers--despite its name re-feedback does not actually require
	feedback. It requires a source to act conservatively before it gets		feedback. It requires a source to act conservatively before it gets
	feedback.		feedback.

	On the subject of lack of feedback, the feedback not established		On the subject of lack of feedback, the feedback not established
	(FNE) codepoint is motivated by arguments for a state set-up bit in		(FNE) codepoint is motivated by arguments for a state set-up bit in
	IP to prevent state exhaustion attacks. This idea was first put		IP to prevent state exhaustion attacks. This idea was first put
	forward informally by David Clark and documented by Handley and		forward informally by David Clark and documented by Handley and
	Greenhalgh in [Steps_DoS]. The idea is that network layer datagrams		Greenhalgh in [Steps_DoS]. The idea is that network layer datagrams
	should signal explicitly when they require state to be created in the		should signal explicitly when they require state to be created in the
	network layer or the layer above (e.g. at flow start). Then a node		network layer or the layer above (e.g. at flow start). Then a node

	skipping to change at page 41, line 49		skipping to change at page 45, line 22

	9. Security Considerations		9. Security Considerations

	This whole memo concerns the security of a scalable admission control		This whole memo concerns the security of a scalable admission control
	system, in particular the analysis section. Below, some specific		system, in particular the analysis section. Below, some specific
	security issues are mentioned that did not belong elsewhere or which		security issues are mentioned that did not belong elsewhere or which
	comment on the overall robustness of the security provided by the		comment on the overall robustness of the security provided by the
	design.		design.

	Firstly, we must repeat the statement of applicability in the		Firstly, we must repeat the statement of applicability in the

	analysis: that we only consider new opportunities for /gainful/		analysis: that we only consider new opportunities for _gainful_
	attack that our proposal introduces, particularly if the attacker can		attack that our proposal introduces, particularly if the attacker can
	avoid being identified. Despite only involving a few bits, there is		avoid being identified. Despite only involving a few bits, there is
	sufficient complexity in the whole system that there are probably		sufficient complexity in the whole system that there are probably
	numerous possibilities for other attacks. However, as far as we are		numerous possibilities for other attacks. However, as far as we are
	aware, none brings any benefit to the attacker. For instance, it would		aware, none brings any benefit to the attacker. For instance, it would
	be possible for a downstream network to remove the congestion		be possible for a downstream network to remove the congestion
	markings introduced by an upstream network, but it would only lose		markings introduced by an upstream network, but it would only lose
	out on the penalties it could apply to a downstream network.		out on the penalties it could apply to a downstream network.

	When one network forwards a neighbouring network's traffic it will		When one network forwards a neighbouring network's traffic it will

	skipping to change at page 42, line 42		skipping to change at page 46, line 14
	flow pre-emption are similar to those for admission control.		flow pre-emption are similar to those for admission control.

	Finally, it may seem that the 8 codepoints that have been made		Finally, it may seem that the 8 codepoints that have been made
	available by extending the ECN field with the RE flag have been used		available by extending the ECN field with the RE flag have been used
	rather wastefully. In effect the RE flag has been used as an		rather wastefully. In effect the RE flag has been used as an
	orthogonal single bit in nearly all cases, the only exception being		orthogonal single bit in nearly all cases, the only exception being
	when the ECN field is cleared to "00". The mapping of the codepoints		when the ECN field is cleared to "00". The mapping of the codepoints
	in an earlier version of this proposal used the codepoint space more		in an earlier version of this proposal used the codepoint space more
	efficiently, but the scheme became vulnerable to a network operator		efficiently, but the scheme became vulnerable to a network operator
	focusing its congestion marking to mark more positive than neutral		focusing its congestion marking to mark more positive than neutral

	packets in order to reduce its penalties.		packets in order to reduce its penalties (see Appendix B of
			[Re-TCP]).

	With the scheme as now proposed, once the RE flag is set or cleared		With the scheme as now proposed, once the RE flag is set or cleared
	by the sender or its proxy, it should not be written by the network,		by the sender or its proxy, it should not be written by the network,
	only read. So the gateways can detect if any network maliciously		only read. So the gateways can detect if any network maliciously
	alters the RE flag. IPSec AH integrity checking does not cover the		alters the RE flag. IPSec AH integrity checking does not cover the

	IPv4 option flags (they were considered mutable---even the one we		IPv4 option flags (they were considered mutable--even the one we
	propose using for the RE flag that was `currently unused' when IPSec		propose using for the RE flag that was `currently unused' when IPSec
	was defined). But it would be sufficient for a pair of gateways to		was defined). But it would be sufficient for a pair of gateways to
	make random checks on whether the RE flag was the same when it		make random checks on whether the RE flag was the same when it
	reached the egress gateway as when it left the ingress. Indeed, if		reached the egress gateway as when it left the ingress. Indeed, if
	IPSec AH had covered the RE flag, any network intending to alter		IPSec AH had covered the RE flag, any network intending to alter
	sufficient RE flags to make a gain would have focused its alterations		sufficient RE flags to make a gain would have focused its alterations
	on packets without authenticating headers (AHs).		on packets without authenticating headers (AHs).

	No cryptographic algorithms have been harmed in the making of this		No cryptographic algorithms have been harmed in the making of this
	proposal.		proposal.

	skipping to change at page 43, line 22		skipping to change at page 46, line 43
	10. IANA Considerations		10. IANA Considerations

	This memo includes no request to IANA.		This memo includes no request to IANA.

	11. Conclusions		11. Conclusions

	This memo builds on a promising technique to solve the classic		This memo builds on a promising technique to solve the classic
	problem of making flow admission control scale to any size network.		problem of making flow admission control scale to any size network.
	It involves the use of Diffserv in a deployment model that uses pre-		It involves the use of Diffserv in a deployment model that uses pre-
	congestion notification feedback to control admission into a network		congestion notification feedback to control admission into a network

	path [CL-deploy]. However as it stands, that deployment model		path [PCN-arch]. However as it stands, that deployment model depends
	depends on all network domains trusting each other to comply with the		on all network domains trusting each other to comply with the
	protocols, invoking admission control and flow pre-emption when		protocols, invoking admission control and flow pre-emption when
	requested.		requested.

	We propose that the congestion feedback used in that deployment model		We propose that the congestion feedback used in that deployment model
	should be re-echoed into the forward data path, by making a trivial		should be re-echoed into the forward data path, by making a trivial
	modification to the ingress gateway. We then explain how the		modification to the ingress gateway. We then explain how the
	resulting downstream pre-congestion metric in packets can be		resulting downstream pre-congestion metric in packets can be
	monitored in bulk at borders to sufficiently emulate flow rate		monitored in bulk at borders to sufficiently emulate flow rate
	policing.		policing.

	We claim the result of combining these two approaches is an admission		We claim the result of combining these two approaches is an admission

	control system that scales to any size network /and/ any number of		control system that scales to any size network _and_ any number of
	interconnected networks, even if they all act in their own interests.		interconnected networks, even if they all act in their own interests.

	This proposal aims to convince its readers to "Design in Security		This proposal aims to convince its readers to "Design in Security

	from the start," by building modified ingress gateways from day one,		from the start," by ensuring the PCN wire protocol encoding can
			accommodate the extended set of codepoints defined in this document,
	even if border policing is not needed at first. This way, we will		even if border policing is not needed at first. This way, we will
	not build ourselves tomorrow's legacy problem.		not build ourselves tomorrow's legacy problem.

	Re-echoing congestion feedback is based on a principled technique		Re-echoing congestion feedback is based on a principled technique
	called Re-ECN [Re-TCP], designed to add accountability for causing		called Re-ECN [Re-TCP], designed to add accountability for causing
	congestion to the general-purpose IP datagram service. Re-ECN		congestion to the general-purpose IP datagram service. Re-ECN
	proposes to consume the last completely unused bit in the basic IPv4		proposes to consume the last completely unused bit in the basic IPv4
	header.		header.

	12. Acknowledgements		12. Acknowledgements

	All the following have given helpful comments and some may become co-		All the following have given helpful comments and some may become co-
	authors of later drafts: Arnaud Jacquet, Alessandro Salvatori, Steve		authors of later drafts: Arnaud Jacquet, Alessandro Salvatori, Steve
	Rudkin, David Songhurst, John Davey, Ian Self, Anthony Sheppard,		Rudkin, David Songhurst, John Davey, Ian Self, Anthony Sheppard,
	Carla Di Cairano-Gilfedder (BT), Mark Handley (who identified the		Carla Di Cairano-Gilfedder (BT), Mark Handley (who identified the
	excess canceled packets attack), Stephen Hailes, Adam Greenhalgh		excess canceled packets attack), Stephen Hailes, Adam Greenhalgh
	(UCL), Francois Le Faucheur, Anna Charny (Cisco), Jozef Babiarz,		(UCL), Francois Le Faucheur, Anna Charny (Cisco), Jozef Babiarz,
	Kwok-Ho Chan, Corey Alexander (Nortel), David Clark, Bill Lehr,		Kwok-Ho Chan, Corey Alexander (Nortel), David Clark, Bill Lehr,
	Sharon Gillett, Steve Bauer (MIT) (who publicised various dummy		Sharon Gillett, Steve Bauer (MIT) (who publicised various dummy
	traffic attacks), Sally Floyd (ICIR) and comments from participants		traffic attacks), Sally Floyd (ICIR) and comments from participants

	in the CFP/CRN inter-provider QoS and broadband working groups.		in the CFP/CRN Inter-Provider QoS, Broadband and DoS-Resistant
			Internet working groups.

	13. Comments Solicited		13. Comments Solicited

	Comments and questions are encouraged and very welcome. They can be		Comments and questions are encouraged and very welcome. They can be
	addressed to the IETF Transport Area working group's mailing list		addressed to the IETF Transport Area working group's mailing list
	<tsvwg@ietf.org>, and/or to the authors.		<tsvwg@ietf.org>, and/or to the authors.

	14. References		14. References

	14.1. Normative References		14.1. Normative References

	[PCN] Briscoe, B., Eardley, P., Songhurst, D., Le Faucheur, F.,		[PCN] Briscoe, B., Eardley, P., Songhurst, D., Le Faucheur, F.,
	Charny, A., Liatsos, V., Babiarz, J., Chan, K., Dudley,		Charny, A., Liatsos, V., Babiarz, J., Chan, K., Dudley,
	S., Westberg, L., Bader, A., and G. Karagiannis, "Pre-		S., Westberg, L., Bader, A., and G. Karagiannis, "Pre-
	Congestion Notification Marking",		Congestion Notification Marking",

	draft-briscoe-tsvwg-cl-phb-02 (work in progress),		draft-briscoe-tsvwg-cl-phb-03 (work in progress),
	June 2006.		October 2006.

	[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate		[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
	Requirement Levels", BCP 14, RFC 2119, March 1997.		Requirement Levels", BCP 14, RFC 2119, March 1997.

	[RFC2211] Wroclawski, J., "Specification of the Controlled-Load		[RFC2211] Wroclawski, J., "Specification of the Controlled-Load
	Network Element Service", RFC 2211, September 1997.		Network Element Service", RFC 2211, September 1997.

	[RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition		[RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
	of Explicit Congestion Notification (ECN) to IP",		of Explicit Congestion Notification (ECN) to IP",
	RFC 3168, September 2001.		RFC 3168, September 2001.

	skipping to change at page 45, line 10		skipping to change at page 48, line 36
	Stiliadis, "An Expedited Forwarding PHB (Per-Hop		Stiliadis, "An Expedited Forwarding PHB (Per-Hop
	Behavior)", RFC 3246, March 2002.		Behavior)", RFC 3246, March 2002.

	[RSVP-ECN]		[RSVP-ECN]
	Le Faucheur, F., Charny, A., Briscoe, B., Eardley, P.,		Le Faucheur, F., Charny, A., Briscoe, B., Eardley, P.,
	Babiarz, J., and K. Chan, "RSVP Extensions for Admission		Babiarz, J., and K. Chan, "RSVP Extensions for Admission
	Control over Diffserv using Pre-congestion Notification",		Control over Diffserv using Pre-congestion Notification",
	draft-lefaucheur-rsvp-ecn-01 (work in progress),		draft-lefaucheur-rsvp-ecn-01 (work in progress),
	June 2006.		June 2006.


	[Re-TCP] Briscoe, B., Jacquet, A., and A. Salvatori, "Re-ECN:		[Re-TCP] Briscoe, B., Jacquet, A., Salvatori, A., and M. Koyabe,
	Adding Accountability for Causing Congestion to TCP/IP",		"Re-ECN: Adding Accountability for Causing Congestion to
	draft-briscoe-tsvwg-re-ecn-tcp-02 (work in progress),		TCP/IP", draft-briscoe-tsvwg-re-ecn-tcp-04 (work in
	June 2006.		progress), June 2007.

	14.2. Informative References		14.2. Informative References


	[CL-deploy]
	Briscoe, B., Eardley, P., Songhurst, D., Le Faucheur, F.,
	Charny, A., Babiarz, J., Chan, K., Westberg, L., Bader,
	A., and G. Karagiannis, "A Deployment Model for Admission
	Control over DiffServ using Pre-Congestion Notification",
	draft-briscoe-tsvwg-cl-architecture-03 (work in progress),
	June 2006.

	[CLoop_pol]		[CLoop_pol]
	Salvatori, A., "Closed Loop Traffic Policing", Politecnico		Salvatori, A., "Closed Loop Traffic Policing", Politecnico
	Torino and Institut Eurecom Masters Thesis ,		Torino and Institut Eurecom Masters Thesis ,
	September 2005.		September 2005.

	[ECN-BGP] Mortier, R. and I. Pratt, "Incentive Based Inter-Domain		[ECN-BGP] Mortier, R. and I. Pratt, "Incentive Based Inter-Domain
	Routeing", Proc Internet Charging and QoS Technology		Routeing", Proc Internet Charging and QoS Technology
	Workshop (ICQT'03) pp308--317, September 2003, <http://		Workshop (ICQT'03) pp308--317, September 2003, <http://
	research.microsoft.com/users/mort/publications.aspx>.		research.microsoft.com/users/mort/publications.aspx>.

	[ECN-MPLS]		[ECN-MPLS]

	Bruce, B., Briscoe, B., and J. Tay, "Explicit Congestion		Davie, B., Briscoe, B., and J. Tay, "Explicit Congestion
	Marking in MPLS", draft-davie-ecn-mpls-00 (work in		Marking in MPLS", draft-ietf-tsvwg-ecn-mpls-01 (work in
	progress), June 2006.		progress), June 2007.

	[IXQoS] Briscoe, B. and S. Rudkin, "Commercial Models for IP		[IXQoS] Briscoe, B. and S. Rudkin, "Commercial Models for IP
	Quality of Service Interconnect", BT Technology Journal		Quality of Service Interconnect", BT Technology Journal
	(BTTJ) 23(2)171--195, April 2005,		(BTTJ) 23(2)171--195, April 2005,
	<http://www.cs.ucl.ac.uk/staff/B.Briscoe/pubs.html#ixqos>.		<http://www.cs.ucl.ac.uk/staff/B.Briscoe/pubs.html#ixqos>.

	[NSIS-RMD]		[NSIS-RMD]
	Bader, A., Westberg, L., Karagiannis, G., Kappler, C., and		Bader, A., Westberg, L., Karagiannis, G., Kappler, C., and
	T. Phelan, "RMD-QOSM - The Resource Management in Diffserv		T. Phelan, "RMD-QOSM - The Resource Management in Diffserv

	QOS Model", draft-ietf-nsis-rmd-06 (work in progress),		QOS Model", draft-ietf-nsis-rmd-09 (work in progress),
	February 2006.		March 2007.


	[RFC2205] Braden, B., Zhang, L., Berson, S., Herzog, S., and S.		[PCN-arch]
			Eardley, P., Babiarz, J., Chan, K., Charny, A., Geib, R.,
			Karagiannis, G., Menth, M., and T. Tsou, "Pre-Congestion
			Notification Architecture",
			draft-eardley-pcn-architecture-00 (work in progress),
			June 2007.


			[RFC2205] Braden, B., Zhang, L., Berson, S., Herzog, S., and S.
	Jamin, "Resource ReSerVation Protocol (RSVP) -- Version 1		Jamin, "Resource ReSerVation Protocol (RSVP) -- Version 1
	Functional Specification", RFC 2205, September 1997.		Functional Specification", RFC 2205, September 1997.

	[RFC2207] Berger, L. and T. O'Malley, "RSVP Extensions for IPSEC		[RFC2207] Berger, L. and T. O'Malley, "RSVP Extensions for IPSEC
	Data Flows", RFC 2207, September 1997.		Data Flows", RFC 2207, September 1997.

	[RFC2208] Mankin, A., Baker, F., Braden, B., Bradner, S., O'Dell,		[RFC2208] Mankin, A., Baker, F., Braden, B., Bradner, S., O'Dell,
	M., Romanow, A., Weinrib, A., and L. Zhang, "Resource		M., Romanow, A., Weinrib, A., and L. Zhang, "Resource
	ReSerVation Protocol (RSVP) Version 1 Applicability		ReSerVation Protocol (RSVP) Version 1 Applicability
	Statement Some Guidelines on Deployment", RFC 2208,		Statement Some Guidelines on Deployment", RFC 2208,

	skipping to change at page 46, line 29		skipping to change at page 50, line 5

	[RFC2998] Bernet, Y., Ford, P., Yavatkar, R., Baker, F., Zhang, L.,		[RFC2998] Bernet, Y., Ford, P., Yavatkar, R., Baker, F., Zhang, L.,
	Speer, M., Braden, R., Davie, B., Wroclawski, J., and E.		Speer, M., Braden, R., Davie, B., Wroclawski, J., and E.
	Felstaine, "A Framework for Integrated Services Operation		Felstaine, "A Framework for Integrated Services Operation
	over Diffserv Networks", RFC 2998, November 2000.		over Diffserv Networks", RFC 2998, November 2000.

	[RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit		[RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit
	Congestion Notification (ECN) Signaling with Nonces",		Congestion Notification (ECN) Signaling with Nonces",
	RFC 3540, June 2003.		RFC 3540, June 2003.


			[RFC4727] Fenner, B., "Experimental Values In IPv4, IPv6, ICMPv4,
			ICMPv6, UDP, and TCP Headers", RFC 4727, November 2006.

	[Re-fb] Briscoe, B., Jacquet, A., Di Cairano-Gilfedder, C.,		[Re-fb] Briscoe, B., Jacquet, A., Di Cairano-Gilfedder, C.,
	Salvatori, A., Soppera, A., and M. Koyabe, "Policing		Salvatori, A., Soppera, A., and M. Koyabe, "Policing
	Congestion Response in an Internetwork Using Re-Feedback",		Congestion Response in an Internetwork Using Re-Feedback",
	ACM SIGCOMM CCR 35(4)277--288, August 2005, <http://		ACM SIGCOMM CCR 35(4)277--288, August 2005, <http://
	www.acm.org/sigs/sigcomm/sigcomm2005/		www.acm.org/sigs/sigcomm/sigcomm2005/
	techprog.html#session8>.		techprog.html#session8>.

	[Smart_rtg]		[Smart_rtg]
	Goldenberg, D., Qiu, L., Xie, H., Yang, Y., and Y. Zhang,		Goldenberg, D., Qiu, L., Xie, H., Yang, Y., and Y. Zhang,
	"Optimizing Cost and Performance for Multihoming", ACM		"Optimizing Cost and Performance for Multihoming", ACM

	skipping to change at page 47, line 24		skipping to change at page 51, line 8
	sends with the RE flag blanked. Z_0 will also take account of the		sends with the RE flag blanked. Z_0 will also take account of the
	sustainable rate reported during the flow pre-emption process, if		sustainable rate reported during the flow pre-emption process, if
	necessary.		necessary.

	A suitable pseudo-code algorithm for the ingress gateway is as		A suitable pseudo-code algorithm for the ingress gateway is as
	follows:		follows:

	====================================================================		====================================================================
	B_i = 0 /* interblank volume */		B_i = 0 /* interblank volume */
	for each PCN-capable packet {		for each PCN-capable packet {

	b = readLength() /* set b to packet size */		b = readLength(packet) /* set b to packet size */
	B_i += b /* accumulate interblank volume */		B_i += b /* accumulate interblank volume */
	if B_i < b * Z_0 { /* test whether interblank volume... */		if B_i < b * Z_0 { /* test whether interblank volume... */
	writeRE(1)		writeRE(1)
	} else { /* ...exceeds blank RE spacing * pkt size*/		} else { /* ...exceeds blank RE spacing * pkt size*/
	writeRE(0) /* ...and if so, clear RE */		writeRE(0) /* ...and if so, clear RE */
	B_i = 0 /* ...and re-set interblank volume */		B_i = 0 /* ...and re-set interblank volume */
	}		}
	}		}
	====================================================================		====================================================================
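
	The pseudo-code above can be exercised offline with a minimal Python
	sketch. The function name mark_re_flags, the list-based interface and
	passing Z_0 as a plain argument are assumptions made for illustration
	only; they are not part of the draft. A real gateway would read each
	packet's length from its header and write the RE flag in place.

```python
def mark_re_flags(packet_lengths, z0):
    """Return the RE flag (1 = set, 0 = blanked) chosen for each packet.

    packet_lengths: iterable of packet sizes in bytes (hypothetical input;
                    stands in for readLength(packet) in the pseudo-code).
    z0:             the blank RE spacing Z_0 from the text above.
    """
    interblank = 0          # B_i: volume seen since the last blanked packet
    flags = []
    for b in packet_lengths:
        interblank += b     # accumulate interblank volume
        if interblank < b * z0:
            flags.append(1)     # stands in for writeRE(1)
        else:
            flags.append(0)     # stands in for writeRE(0): blank the RE flag
            interblank = 0      # re-set interblank volume
    return flags


if __name__ == "__main__":
    # Eight equal-sized packets with Z_0 = 4.
    print(mark_re_flags([100] * 8, 4))
```

	With equal-sized packets and Z_0 = 4, this sketch blanks every fourth
	packet. Because the test compares the accumulated volume against the
	current packet's size, a small packet that follows a large one can be
	blanked earlier than a pure packet count would suggest.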


	skipping to change at page 48, line 37		skipping to change at page 52, line 17

	A.2.2. Inflation Factor for Persistently Negative Flows		A.2.2. Inflation Factor for Persistently Negative Flows

	The following process is suggested to complement the simple algorithm		The following process is suggested to complement the simple algorithm
	above in order to protect against the various attacks from		above in order to protect against the various attacks from
	persistently negative flows described in Section 5.6.1. As explained		persistently negative flows described in Section 5.6.1. As explained
	in that section, the most important and first step is to estimate the		in that section, the most important and first step is to estimate the
	contribution of persistently negative flows to the bulk volume of		contribution of persistently negative flows to the bulk volume of
	downstream pre-congestion and to inflate this bulk volume as if these		downstream pre-congestion and to inflate this bulk volume as if these
	flows weren't there. The process below has been designed to give an		flows weren't there. The process below has been designed to give an

	unboased estimate, but it may be possible to define other processes		unbiased estimate, but it may be possible to define other processes
	that achieve similar ends.		that achieve similar ends.

	While the above simple metering algorithm is counting the bulk of		While the above simple metering algorithm is counting the bulk of
	traffic over an accounting period, the meter should also select a		traffic over an accounting period, the meter should also select a
	subset of the whole flow ID space that is small enough to be able to		subset of the whole flow ID space that is small enough to be able to
	realistically measure but large enough to give a realistic sample.		realistically measure but large enough to give a realistic sample.
	Many different samples of different subsets of the ID space should be		Many different samples of different subsets of the ID space should be
	taken at different times during the accounting period, preferably		taken at different times during the accounting period, preferably
	covering the whole ID space. During each sample, the meter should		covering the whole ID space. During each sample, the meter should
	count the volume of positive packets and subtract the volume of		count the volume of positive packets and subtract the volume of

	skipping to change at page 51, line 5		skipping to change at page 54, line 5
	BT & UCL		BT & UCL
	B54/77, Adastral Park		B54/77, Adastral Park
	Martlesham Heath		Martlesham Heath
	Ipswich IP5 3RE		Ipswich IP5 3RE
	UK		UK

	Phone: +44 1473 645196		Phone: +44 1473 645196
	Email: bob.briscoe@bt.com		Email: bob.briscoe@bt.com
	URI: http://www.cs.ucl.ac.uk/staff/B.Briscoe/		URI: http://www.cs.ucl.ac.uk/staff/B.Briscoe/


	Intellectual Property Statement		Full Copyright Statement

			Copyright (C) The IETF Trust (2007).

			This document is subject to the rights, licenses and restrictions
			contained in BCP 78, and except as set forth therein, the authors
			retain all their rights.

			This document and the information contained herein are provided on an
			"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
			OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
			THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
			OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
			THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
			WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

			Intellectual Property

	The IETF takes no position regarding the validity or scope of any		The IETF takes no position regarding the validity or scope of any
	Intellectual Property Rights or other rights that might be claimed to		Intellectual Property Rights or other rights that might be claimed to
	pertain to the implementation or use of the technology described in		pertain to the implementation or use of the technology described in
	this document or the extent to which any license under such rights		this document or the extent to which any license under such rights
	might or might not be available; nor does it represent that it has		might or might not be available; nor does it represent that it has
	made any independent effort to identify any such rights. Information		made any independent effort to identify any such rights. Information
	on the procedures with respect to rights in RFC documents can be		on the procedures with respect to rights in RFC documents can be
	found in BCP 78 and BCP 79.		found in BCP 78 and BCP 79.


	skipping to change at page 51, line 29		skipping to change at page 54, line 45
	such proprietary rights by implementers or users of this		such proprietary rights by implementers or users of this
	specification can be obtained from the IETF on-line IPR repository at		specification can be obtained from the IETF on-line IPR repository at
	http://www.ietf.org/ipr.		http://www.ietf.org/ipr.

	The IETF invites any interested party to bring to its attention any		The IETF invites any interested party to bring to its attention any
	copyrights, patents or patent applications, or other proprietary		copyrights, patents or patent applications, or other proprietary
	rights that may cover technology that may be required to implement		rights that may cover technology that may be required to implement
	this standard. Please address the information to the IETF at		this standard. Please address the information to the IETF at
	ietf-ipr@ietf.org.		ietf-ipr@ietf.org.


	Disclaimer of Validity		Acknowledgments

	This document and the information contained herein are provided on an
	"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
	OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
	ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
	INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
	INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
	WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

	Copyright Statement

	Copyright (C) The Internet Society (2006). This document is subject
	to the rights, licenses and restrictions contained in BCP 78, and
	except as set forth therein, the authors retain all their rights.

	Acknowledgment


	Funding for the RFC Editor function is currently provided by the		Funding for the RFC Editor function is provided by the IETF
	Internet Society.		Administrative Support Activity (IASA). This document was produced
			using xml2rfc v1.32 (of http://xml.resource.org/) from a source in
			RFC-2629 XML format.

End of changes. 85 change blocks.
	227 lines changed or deleted		347 lines changed or added