Diff: draft-ietf-tsvwg-aqm-dualq-coupled-24.txt - draft-ietf-tsvwg-aqm-dualq-coupled-25d.txt

	< draft-ietf-tsvwg-aqm-dualq-coupled-24.txt	draft-ietf-tsvwg-aqm-dualq-coupled-25d.txt >

	Transport Area working group (tsvwg) K. De Schepper	Transport Area working group (tsvwg) K. De Schepper
	Internet-Draft Nokia Bell Labs	Internet-Draft Nokia Bell Labs
	Intended status: Experimental B. Briscoe, Ed.	Intended status: Experimental B. Briscoe, Ed.

	Expires: 8 January 2023 Independent	Expires: 25 February 2023 Independent
	G. White	G. White
	CableLabs	CableLabs

	7 July 2022	24 August 2022

	DualQ Coupled AQMs for Low Latency, Low Loss and Scalable Throughput	DualQ Coupled AQMs for Low Latency, Low Loss and Scalable Throughput
	(L4S)	(L4S)

	draft-ietf-tsvwg-aqm-dualq-coupled-24	draft-ietf-tsvwg-aqm-dualq-coupled-25

	Abstract	Abstract

	This specification defines a framework for coupling the Active Queue	This specification defines a framework for coupling the Active Queue
	Management (AQM) algorithms in two queues intended for flows with	Management (AQM) algorithms in two queues intended for flows with
	different responses to congestion. This provides a way for the	different responses to congestion. This provides a way for the
	Internet to transition from the scaling problems of standard TCP	Internet to transition from the scaling problems of standard TCP
	Reno-friendly ('Classic') congestion controls to the family of	Reno-friendly ('Classic') congestion controls to the family of
	'Scalable' congestion controls. These are designed for consistently	'Scalable' congestion controls. These are designed for consistently
	very Low queuing Latency, very Low congestion Loss and Scaling of	very Low queuing Latency, very Low congestion Loss and Scaling of
	per-flow throughput (L4S) by using Explicit Congestion Notification	per-flow throughput (L4S) by using Explicit Congestion Notification

	(ECN) in a modified way. Until the Coupled DualQ, these L4S senders	(ECN) in a modified way. Until the Coupled DualQ, these scalable L4S
	could only be deployed where a clean-slate environment could be	congestion controls could only be deployed where a clean-slate
	arranged, such as in private data centres. The coupling acts like a	environment could be arranged, such as in private data centres.
	semi-permeable membrane: isolating the sub-millisecond average
	queuing delay and zero congestion loss of L4S from Classic latency	The specification first explains how a Coupled DualQ works. It then
	and loss; but pooling the capacity between any combination of	gives the normative requirements that are necessary for it to work
	Scalable and Classic flows with roughly equivalent throughput per	well. All this is independent of which two AQMs are used, but
	flow. The DualQ achieves this indirectly, without having to inspect	pseudocode examples of specific AQMs are given in appendices.
	transport layer flow identifiers and without compromising the
	performance of the Classic traffic, relative to a single queue. The
	DualQ design has low complexity and requires no configuration for the
	public Internet.

	Status of This Memo	Status of This Memo

	This Internet-Draft is submitted in full conformance with the	This Internet-Draft is submitted in full conformance with the
	provisions of BCP 78 and BCP 79.	provisions of BCP 78 and BCP 79.

	Internet-Drafts are working documents of the Internet Engineering	Internet-Drafts are working documents of the Internet Engineering
	Task Force (IETF). Note that other groups may also distribute	Task Force (IETF). Note that other groups may also distribute
	working documents as Internet-Drafts. The list of current Internet-	working documents as Internet-Drafts. The list of current Internet-
	Drafts is at https://datatracker.ietf.org/drafts/current/.	Drafts is at https://datatracker.ietf.org/drafts/current/.

	Internet-Drafts are draft documents valid for a maximum of six months	Internet-Drafts are draft documents valid for a maximum of six months
	and may be updated, replaced, or obsoleted by other documents at any	and may be updated, replaced, or obsoleted by other documents at any
	time. It is inappropriate to use Internet-Drafts as reference	time. It is inappropriate to use Internet-Drafts as reference
	material or to cite them other than as "work in progress."	material or to cite them other than as "work in progress."


	This Internet-Draft will expire on 8 January 2023.	This Internet-Draft will expire on 25 February 2023.

	Copyright Notice	Copyright Notice

	Copyright (c) 2022 IETF Trust and the persons identified as the	Copyright (c) 2022 IETF Trust and the persons identified as the
	document authors. All rights reserved.	document authors. All rights reserved.

	This document is subject to BCP 78 and the IETF Trust's Legal	This document is subject to BCP 78 and the IETF Trust's Legal
	Provisions Relating to IETF Documents (https://trustee.ietf.org/	Provisions Relating to IETF Documents (https://trustee.ietf.org/
	license-info) in effect on the date of publication of this document.	license-info) in effect on the date of publication of this document.
	Please review these documents carefully, as they describe your rights	Please review these documents carefully, as they describe your rights
	and restrictions with respect to this document. Code Components	and restrictions with respect to this document. Code Components
	extracted from this document must include Revised BSD License text as	extracted from this document must include Revised BSD License text as
	described in Section 4.e of the Trust Legal Provisions and are	described in Section 4.e of the Trust Legal Provisions and are
	provided without warranty as described in the Revised BSD License.	provided without warranty as described in the Revised BSD License.

	Table of Contents	Table of Contents

	1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3	1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
	1.1. Outline of the Problem . . . . . . . . . . . . . . . . . 3	1.1. Outline of the Problem . . . . . . . . . . . . . . . . . 3

	1.2. Scope . . . . . . . . . . . . . . . . . . . . . . . . . . 6	1.2. Context, Scope & Applicability . . . . . . . . . . . . . 6
	1.3. Terminology . . . . . . . . . . . . . . . . . . . . . . . 7	1.3. Terminology . . . . . . . . . . . . . . . . . . . . . . . 7
	1.4. Features . . . . . . . . . . . . . . . . . . . . . . . . 9	1.4. Features . . . . . . . . . . . . . . . . . . . . . . . . 9
	2. DualQ Coupled AQM . . . . . . . . . . . . . . . . . . . . . . 11	2. DualQ Coupled AQM . . . . . . . . . . . . . . . . . . . . . . 11
	2.1. Coupled AQM . . . . . . . . . . . . . . . . . . . . . . . 11	2.1. Coupled AQM . . . . . . . . . . . . . . . . . . . . . . . 11

	2.2. Dual Queue . . . . . . . . . . . . . . . . . . . . . . . 13	2.2. Dual Queue . . . . . . . . . . . . . . . . . . . . . . . 12
	2.3. Traffic Classification . . . . . . . . . . . . . . . . . 13	2.3. Traffic Classification . . . . . . . . . . . . . . . . . 13

	2.4. Overall DualQ Coupled AQM Structure . . . . . . . . . . . 14	2.4. Overall DualQ Coupled AQM Structure . . . . . . . . . . . 13
	2.5. Normative Requirements for a DualQ Coupled AQM . . . . . 17	2.5. Normative Requirements for a DualQ Coupled AQM . . . . . 17
	2.5.1. Functional Requirements . . . . . . . . . . . . . . . 17	2.5.1. Functional Requirements . . . . . . . . . . . . . . . 17
	2.5.1.1. Requirements in Unexpected Cases . . . . . . . . 18	2.5.1.1. Requirements in Unexpected Cases . . . . . . . . 18
	2.5.2. Management Requirements . . . . . . . . . . . . . . . 19	2.5.2. Management Requirements . . . . . . . . . . . . . . . 19

	2.5.2.1. Configuration . . . . . . . . . . . . . . . . . . 20	2.5.2.1. Configuration . . . . . . . . . . . . . . . . . . 19
	2.5.2.2. Monitoring . . . . . . . . . . . . . . . . . . . 21	2.5.2.2. Monitoring . . . . . . . . . . . . . . . . . . . 21
	2.5.2.3. Anomaly Detection . . . . . . . . . . . . . . . . 22	2.5.2.3. Anomaly Detection . . . . . . . . . . . . . . . . 22
	2.5.2.4. Deployment, Coexistence and Scaling . . . . . . . 22	2.5.2.4. Deployment, Coexistence and Scaling . . . . . . . 22
	3. IANA Considerations (to be removed by RFC Editor) . . . . . . 22	3. IANA Considerations (to be removed by RFC Editor) . . . . . . 22
	4. Security Considerations . . . . . . . . . . . . . . . . . . . 22	4. Security Considerations . . . . . . . . . . . . . . . . . . . 22
	4.1. Low Delay without Requiring Per-Flow Processing . . . . . 22	4.1. Low Delay without Requiring Per-Flow Processing . . . . . 22
	4.2. Handling Unresponsive Flows and Overload . . . . . . . . 23	4.2. Handling Unresponsive Flows and Overload . . . . . . . . 23
	4.2.1. Unresponsive Traffic without Overload . . . . . . . . 24	4.2.1. Unresponsive Traffic without Overload . . . . . . . . 24
	4.2.2. Avoiding Short-Term Classic Starvation: Sacrifice L4S	4.2.2. Avoiding Short-Term Classic Starvation: Sacrifice L4S
	Throughput or Delay? . . . . . . . . . . . . . . . . 25	Throughput or Delay? . . . . . . . . . . . . . . . . 25


	4.2.3. L4S ECN Saturation: Introduce Drop or Delay? . . . . 26	4.2.3. L4S ECN Saturation: Introduce Drop or Delay? . . . . 26
	4.2.3.1. Protecting against Overload by Unresponsive	4.2.3.1. Protecting against Overload by Unresponsive
	ECN-Capable Traffic . . . . . . . . . . . . . . . . 28	ECN-Capable Traffic . . . . . . . . . . . . . . . . 28

	5. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 28	5. References . . . . . . . . . . . . . . . . . . . . . . . . . 28
	6. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 29	5.1. Normative References . . . . . . . . . . . . . . . . . . 28
	7. References . . . . . . . . . . . . . . . . . . . . . . . . . 29	5.2. Informative References . . . . . . . . . . . . . . . . . 29
	7.1. Normative References . . . . . . . . . . . . . . . . . . 29	Appendix A. Example DualQ Coupled PI2 Algorithm . . . . . . . . 34
	7.2. Informative References . . . . . . . . . . . . . . . . . 30	A.1. Pass #1: Core Concepts . . . . . . . . . . . . . . . . . 35
	Appendix A. Example DualQ Coupled PI2 Algorithm . . . . . . . . 35	A.2. Pass #2: Edge-Case Details . . . . . . . . . . . . . . . 46
	A.1. Pass #1: Core Concepts . . . . . . . . . . . . . . . . . 36	Appendix B. Example DualQ Coupled Curvy RED Algorithm . . . . . 51
	A.2. Pass #2: Edge-Case Details . . . . . . . . . . . . . . . 47	B.1. Curvy RED in Pseudocode . . . . . . . . . . . . . . . . . 51
	Appendix B. Example DualQ Coupled Curvy RED Algorithm . . . . . 52	B.2. Efficient Implementation of Curvy RED . . . . . . . . . . 57
	B.1. Curvy RED in Pseudocode . . . . . . . . . . . . . . . . . 52	Appendix C. Choice of Coupling Factor, k . . . . . . . . . . . . 59
	B.2. Efficient Implementation of Curvy RED . . . . . . . . . . 58	C.1. RTT-Dependence . . . . . . . . . . . . . . . . . . . . . 59
	Appendix C. Choice of Coupling Factor, k . . . . . . . . . . . . 60	C.2. Guidance on Controlling Throughput Equivalence . . . . . 60
	C.1. RTT-Dependence . . . . . . . . . . . . . . . . . . . . . 60	Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . 64
	C.2. Guidance on Controlling Throughput Equivalence . . . . . 61	Contributors . . . . . . . . . . . . . . . . . . . . . . . . . . 64
	Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 65	Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 65

	1. Introduction	1. Introduction


	This document specifies a framework for DualQ Coupled AQMs, which is	This document specifies a framework for DualQ Coupled AQMs, which
	the network part of the L4S architecture [I-D.ietf-tsvwg-l4s-arch].	serve as the network part of the L4S
	L4S enables both very low queuing latency (sub-millisecond on	architecture [I-D.ietf-tsvwg-l4s-arch]. A Coupled DualQ AQM consists
	average) and high throughput at the same time, for ad hoc numbers of	of two queues: L4S and Classic. The L4S queue is intended for
	capacity-seeking applications all sharing the same capacity.	Scalable congestion controls that can maintain very low queuing
		latency (sub-millisecond on average) and high throughput at the same
		time. The Coupled DualQ acts like a semi-permeable membrane: the L4S
		queue isolates the sub-millisecond average queuing delay and zero
		congestion loss of L4S from Classic latency and loss; while the
		coupling between the queues pools the capacity between both queues so
		that ad hoc numbers of capacity-seeking applications all sharing the
		same capacity can have roughly equivalent throughput per flow,
		whichever queue they use. The DualQ achieves this indirectly,
		without having to inspect transport layer flow identifiers and
		without compromising the performance of the Classic traffic, relative
		to a single queue. The DualQ design has low complexity and requires
		no configuration for the public Internet.
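
The coupling described in the new introduction text can be illustrated with a minimal Python sketch. It assumes the squared coupling relationship that the draft defines in its later sections, which are not shown in this diff excerpt: Classic probability p_C = p'^2, coupled L4S probability p_CL = k * p', and L4S marking p_L = max(p'_L, p_CL). The function name, argument names and the default k = 2 are illustrative assumptions, not text from either revision.

```python
def coupled_probabilities(p_base, p_native_l4s, k=2.0):
    """Sketch of the coupling between the Classic and L4S queues.

    p_base        -- output p' of the base AQM, in [0, 1]
    p_native_l4s  -- marking probability from the L4S queue's own AQM
    k             -- coupling factor (assumed default of 2)
    Returns (p_classic, p_l4s).
    """
    if not 0.0 <= p_base <= 1.0:
        raise ValueError("p_base must lie in [0, 1]")
    # Classic traffic sees the square of the base probability.
    p_classic = p_base ** 2
    # L4S traffic sees a linear (coupled) probability, capped at 100%.
    p_coupled = min(k * p_base, 1.0)
    # The L4S queue applies whichever signal is stronger.
    p_l4s = max(p_native_l4s, p_coupled)
    return p_classic, p_l4s
```

Under these assumptions, the Classic probability always equals (p_CL / k)^2 while the coupled probability is below its cap, which is what pools capacity between the two queues with roughly equivalent throughput per flow.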

	1.1. Outline of the Problem	1.1. Outline of the Problem

	Latency is becoming the critical performance factor for many (most?)	Latency is becoming the critical performance factor for many (most?)
	applications on the public Internet, e.g. interactive Web, Web	applications on the public Internet, e.g. interactive Web, Web
	services, voice, conversational video, interactive video, interactive	services, voice, conversational video, interactive video, interactive
	remote presence, instant messaging, online gaming, remote desktop,	remote presence, instant messaging, online gaming, remote desktop,
	cloud-based applications, and video-assisted remote control of	cloud-based applications, and video-assisted remote control of
	machinery and industrial processes. In the developed world, further	machinery and industrial processes. In the developed world, further
	increases in access network bit-rate offer diminishing returns,	increases in access network bit-rate offer diminishing returns,

	skipping to change at page 5, line 27 ¶	skipping to change at page 5, line 27 ¶
	with ECN, not drop, for the signalling:	with ECN, not drop, for the signalling:

	1. The smaller sawteeth allow an extremely shallow ECN packet-	1. The smaller sawteeth allow an extremely shallow ECN packet-
	marking threshold in the queue.	marking threshold in the queue.

	2. And no smoothing in the network means that every fluctuation of	2. And no smoothing in the network means that every fluctuation of
	the queue is signalled immediately.	the queue is signalled immediately.

	Without ECN, either of these would lead to very high loss levels.	Without ECN, either of these would lead to very high loss levels.
	But, with ECN, the resulting high marking levels are just signals,	But, with ECN, the resulting high marking levels are just signals,

	not impairments. BBRv2 combines the best of both worlds - it works	not impairments. (Note that BBRv2 [BBRv2] combines the best of both
	as a scalable congestion control when ECN is available, but also aims	worlds - it works as a scalable congestion control when ECN is
	to minimize delay when it isn't.	available, but also aims to minimize delay when it isn't.)

	However, until now, Scalable congestion controls (like DCTCP) did not	However, until now, Scalable congestion controls (like DCTCP) did not

	co-exist well in a shared ECN-capable queue with existing ECN-capable	co-exist well in a shared ECN-capable queue with existing Classic
	TCP Reno [RFC5681] or Cubic [RFC8312] congestion controls -- Scalable	(e.g. Reno [RFC5681] or Cubic [RFC8312]) congestion controls --
	controls are so aggressive that these 'Classic' algorithms would	Scalable controls are so aggressive that these 'Classic' algorithms
	drive themselves to a small capacity share. Therefore, until now,	would drive themselves to a small capacity share. Therefore, until
	L4S controls could only be deployed where a clean-slate environment	now, L4S controls could only be deployed where a clean-slate
	could be arranged, such as in private data centres (hence the name	environment could be arranged, such as in private data centres (hence
	DCTCP).	the name DCTCP).


	This document specifies a `DualQ Coupled AQM' extension that solves	One way to solve the problem of coexistence between Scalable and
	the problem of coexistence between Scalable and Classic flows,	Classic flows is to use a per-flow-queuing approach such as FQ-
	without having to inspect flow identifiers. It is not like flow-	CoDel [RFC8290]. It classifies packets by flow identifier into
	queuing approaches [RFC8290] that classify packets by flow identifier	separate queues in order to isolate sparse flows from the higher
	into separate queues in order to isolate sparse flows from the higher	latency in the queues assigned to heavier flows. However, if a flow
	latency in the queues assigned to heavier flows. If a flow needs	needs both low delay and high throughput, having a queue to itself
	both low delay and high throughput, having a queue to itself does not	does not isolate it from the harm it causes to itself. Also FQ
	isolate it from the harm it causes to itself. In contrast, DualQ	approaches need to inspect flow identifiers, which is not always
	Coupled AQMs address the root cause of the latency problem -- they	practical.
	are an enabler for the smooth low latency scalable behaviour of
	Scalable congestion controls, so that every packet in every flow can
	potentially enjoy very low latency, then there would be no need to
	isolate each flow into a separate queue.


	1.2. Scope	In summary, Scalable congestion controls address the root cause of
		the latency, loss and scaling problems with Classic congestion
		controls. Both FQ and DualQ AQMs are enablers for this smooth low
		latency scalable behaviour. But handling individual flows is not
		always applicable, whereas the DualQ approach is.

		1.2. Context, Scope & Applicability

	L4S involves complementary changes in the network and on end-systems:	L4S involves complementary changes in the network and on end-systems:

	Network: A DualQ Coupled AQM (defined in the present document) or a	Network: A DualQ Coupled AQM (defined in the present document) or a
	modification to flow-queue AQMs (described in section 4.2.b of the	modification to flow-queue AQMs (described in section 4.2.b of the
	L4S architecture [I-D.ietf-tsvwg-l4s-arch]);	L4S architecture [I-D.ietf-tsvwg-l4s-arch]);

	End-system: A Scalable congestion control (defined in section 4 of	End-system: A Scalable congestion control (defined in section 4 of
	the L4S ECN protocol [I-D.ietf-tsvwg-ecn-l4s-id]).	the L4S ECN protocol [I-D.ietf-tsvwg-ecn-l4s-id]).


	skipping to change at page 17, line 17 ¶	skipping to change at page 17, line 13 ¶
	capitals) in Section 2.5 are observed.	capitals) in Section 2.5 are observed.

	The two queues could optionally be part of a larger queuing	The two queues could optionally be part of a larger queuing
	hierarchy, such as the initial example ideas in	hierarchy, such as the initial example ideas in
	[I-D.briscoe-tsvwg-l4s-diffserv].	[I-D.briscoe-tsvwg-l4s-diffserv].

	2.5. Normative Requirements for a DualQ Coupled AQM	2.5. Normative Requirements for a DualQ Coupled AQM

	The following requirements are intended to capture only the essential	The following requirements are intended to capture only the essential
	aspects of a DualQ Coupled AQM. They are intended to be independent	aspects of a DualQ Coupled AQM. They are intended to be independent

	of the particular AQMs used for each queue.	of the particular AQMs implemented for each queue, but to still
		define the DualQ framework built around those AQMs.

	2.5.1. Functional Requirements	2.5.1. Functional Requirements

	A Dual Queue Coupled AQM implementation MUST comply with the	A Dual Queue Coupled AQM implementation MUST comply with the
	prerequisite L4S behaviours for any L4S network node (not just a	prerequisite L4S behaviours for any L4S network node (not just a
	DualQ) as specified in section 5 of [I-D.ietf-tsvwg-ecn-l4s-id].	DualQ) as specified in section 5 of [I-D.ietf-tsvwg-ecn-l4s-id].
	These primarily concern classification and remarking as briefly	These primarily concern classification and remarking as briefly
	summarized in Section 2.3 earlier. But there is also a subsection	summarized in Section 2.3 earlier. But there is also a subsection
	(5.5) giving guidance on reducing the burstiness of the link	(5.5) giving guidance on reducing the burstiness of the link
	technology underlying any L4S AQM.	technology underlying any L4S AQM.

	skipping to change at page 28, line 33 ¶	skipping to change at page 28, line 33 ¶
	addressing the saturation problem. At saturation, DualPI2 switches	addressing the saturation problem. At saturation, DualPI2 switches
	into overload mode, where the base AQM is driven by the max delay of	into overload mode, where the base AQM is driven by the max delay of
	both queues and it introduces probabilistic drop to both queues	both queues and it introduces probabilistic drop to both queues
	equally. It leaves only a small range of congestion levels just	equally. It leaves only a small range of congestion levels just
	below saturation where unresponsive traffic gains any advantage from	below saturation where unresponsive traffic gains any advantage from
	using the ECN capability (relative to being unresponsive without	using the ECN capability (relative to being unresponsive without
	ECN), and the advantage is hardly detectable (see [DualQ-Test] and	ECN), and the advantage is hardly detectable (see [DualQ-Test] and
	section IV-E of [DCttH19]). Also overload with an unresponsive ECT(1)	section IV-E of [DCttH19]). Also overload with an unresponsive ECT(1)
	flow gets no more bandwidth advantage than with ECT(0).	flow gets no more bandwidth advantage than with ECT(0).
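
The overload behaviour described in the paragraph above can be sketched in Python as follows. This is a simplified illustration, not the DualPI2 pseudocode from the appendices. The saturation test (coupled probability k * p' reaching 100%), the choice of Classic queue delay as the AQM input outside overload, and all function and parameter names are assumptions made for the example.

```python
import random


def is_saturated(p_base, k=2.0):
    """Assume L4S ECN marking saturates once k * p' reaches 100%."""
    return k * p_base >= 1.0


def aqm_input_delay(l4s_delay_s, classic_delay_s, saturated):
    """Pick the queue delay (in seconds) that drives the base AQM.

    In overload mode the base AQM is driven by the larger of the two
    queue delays. Outside overload this sketch assumes the Classic
    queue delay is used.
    """
    if saturated:
        return max(l4s_delay_s, classic_delay_s)
    return classic_delay_s


def overload_action(p_drop, rng=random.random):
    """Apply the same probabilistic drop to a packet from either queue."""
    return "drop" if rng() < p_drop else "forward"
```

Because `overload_action` does not take the queue or the ECN codepoint as an argument, an unresponsive ECT(1) flow is treated no differently from an ECT(0) flow once the AQM is in overload, which is the property the paragraph above describes.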


	5. Acknowledgements	5. References

	Thanks to Anil Agarwal, Sowmini Varadhan, Gabi Bracha, Nicolas
	Kuhn, Greg Skinner, Tom Henderson, David Pullen, Mirja Kuehlewind,
	Gorry Fairhurst, Pete Heist, Ermin Sakic and Martin Duke for detailed
	review comments particularly of the appendices and suggestions on how
	to make the explanations clearer. Thanks also to Tom Henderson for
	insights on the choice of schedulers and queue delay measurement
	techniques.

	The early contributions of Koen De Schepper, Bob Briscoe, Olga
	Bondarenko and Inton Tsang were part-funded by the European Community
	under its Seventh Framework Programme through the Reducing Internet
	Transport Latency (RITE) project (ICT-317700). Contributions of Koen
	De Schepper and Olivier Tilmans were also part-funded by the 5Growth
	and DAEMON EU H2020 projects. Bob Briscoe's contribution was also
	part-funded by the Comcast Innovation Fund and the Research Council
	of Norway through the TimeIn project. The views expressed here are
	solely those of the authors.

	6. Contributors

	The following contributed implementations and evaluations that
	validated and helped to improve this specification:

	Olga Albisser <olga@albisser.org> of Simula Research Lab, Norway
	(Olga Bondarenko during early drafts) implemented the prototype
	DualPI2 AQM for Linux with Koen De Schepper and conducted
	extensive evaluations as well as implementing the live performance
	visualization GUI [L4Sdemo16].

	Olivier Tilmans <olivier.tilmans@nokia-bell-labs.com> of Nokia
	Bell Labs, Belgium prepared and maintains the Linux implementation
	of DualPI2 for upstreaming.

	Shravya K.S. wrote a model for the ns-3 simulator based on the -01
	version of this Internet-Draft. Based on this initial work, Tom
	Henderson <tomh@tomh.org> updated that earlier model and created a
	model for the DualQ variant specified as part of the Low Latency
	DOCSIS specification, as well as conducting extensive evaluations.

	Ing Jyh (Inton) Tsang of Nokia, Belgium built the End-to-End Data
	Centre to the Home broadband testbed on which DualQ Coupled AQM
	implementations were tested.

	7. References


	7.1. Normative References	5.1. Normative References

	[I-D.ietf-tsvwg-ecn-l4s-id]	[I-D.ietf-tsvwg-ecn-l4s-id]
	De Schepper, K. and B. Briscoe, "Explicit Congestion	De Schepper, K. and B. Briscoe, "Explicit Congestion
	Notification (ECN) Protocol for Very Low Queuing Delay	Notification (ECN) Protocol for Very Low Queuing Delay
	(L4S)", Work in Progress, Internet-Draft, draft-ietf-	(L4S)", Work in Progress, Internet-Draft, draft-ietf-

	tsvwg-ecn-l4s-id-26, 7 July 2022,	tsvwg-ecn-l4s-id-28, 8 August 2022,
	<https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg-	<https://www.ietf.org/archive/id/draft-ietf-tsvwg-ecn-l4s-
	ecn-l4s-id-26>.	id-28.txt>.

	[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate	[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
	Requirement Levels", BCP 14, RFC 2119,	Requirement Levels", BCP 14, RFC 2119,
	DOI 10.17487/RFC2119, March 1997,	DOI 10.17487/RFC2119, March 1997,
	<https://www.rfc-editor.org/info/rfc2119>.	<https://www.rfc-editor.org/info/rfc2119>.

	[RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition	[RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
	of Explicit Congestion Notification (ECN) to IP",	of Explicit Congestion Notification (ECN) to IP",
	RFC 3168, DOI 10.17487/RFC3168, September 2001,	RFC 3168, DOI 10.17487/RFC3168, September 2001,
	<https://www.rfc-editor.org/info/rfc3168>.	<https://www.rfc-editor.org/info/rfc3168>.

	[RFC8311] Black, D., "Relaxing Restrictions on Explicit Congestion	[RFC8311] Black, D., "Relaxing Restrictions on Explicit Congestion
	Notification (ECN) Experimentation", RFC 8311,	Notification (ECN) Experimentation", RFC 8311,
	DOI 10.17487/RFC8311, January 2018,	DOI 10.17487/RFC8311, January 2018,
	<https://www.rfc-editor.org/info/rfc8311>.	<https://www.rfc-editor.org/info/rfc8311>.


	7.2. Informative References	5.2. Informative References

	[Alizadeh-stability]	[Alizadeh-stability]
	Alizadeh, M., Javanmard, A., and B. Prabhakar, "Analysis	Alizadeh, M., Javanmard, A., and B. Prabhakar, "Analysis
	of DCTCP: Stability, Convergence, and Fairness", ACM	of DCTCP: Stability, Convergence, and Fairness", ACM
	SIGMETRICS 2011, June 2011,	SIGMETRICS 2011, June 2011,
	<https://dl.acm.org/citation.cfm?id=1993753>.	<https://dl.acm.org/citation.cfm?id=1993753>.

	[AQMmetrics]	[AQMmetrics]
	Kwon, M. and S. Fahmy, "A Comparison of Load-based and	Kwon, M. and S. Fahmy, "A Comparison of Load-based and
	Queue-based Active Queue Management Algorithms", Proc.	Queue-based Active Queue Management Algorithms", Proc.

	skipping to change at page 31, line 45 ¶	skipping to change at page 30, line 49 ¶
	thesis-henrste.pdf?sequence=1>.	thesis-henrste.pdf?sequence=1>.

	[Heist21] Heist, P. and J. Morton, "L4S Tests", github README,	[Heist21] Heist, P. and J. Morton, "L4S Tests", github README,
	August 2021, <https://github.com/heistp/l4s-	August 2021, <https://github.com/heistp/l4s-
	tests/#underutilization-with-bursty-traffic>.	tests/#underutilization-with-bursty-traffic>.

	[I-D.briscoe-docsis-q-protection]	[I-D.briscoe-docsis-q-protection]
	Briscoe, B. and G. White, "The DOCSIS(r) Queue Protection	Briscoe, B. and G. White, "The DOCSIS(r) Queue Protection
	Algorithm to Preserve Low Latency", Work in Progress,	Algorithm to Preserve Low Latency", Work in Progress,
	Internet-Draft, draft-briscoe-docsis-q-protection-06, 13	Internet-Draft, draft-briscoe-docsis-q-protection-06, 13

	May 2022, <https://datatracker.ietf.org/doc/html/draft-	May 2022, <https://www.ietf.org/archive/id/draft-briscoe-
	briscoe-docsis-q-protection-06>.	docsis-q-protection-06.txt>.

	[I-D.briscoe-iccrg-prague-congestion-control]	[I-D.briscoe-iccrg-prague-congestion-control]
	De Schepper, K., Tilmans, O., and B. Briscoe, "Prague	De Schepper, K., Tilmans, O., and B. Briscoe, "Prague
	Congestion Control", Work in Progress, Internet-Draft,	Congestion Control", Work in Progress, Internet-Draft,

	draft-briscoe-iccrg-prague-congestion-control-00, 9 March	draft-briscoe-iccrg-prague-congestion-control-01, 11 July
	2021, <https://datatracker.ietf.org/doc/html/draft-	2022, <https://www.ietf.org/archive/id/draft-briscoe-
	briscoe-iccrg-prague-congestion-control-00>.	iccrg-prague-congestion-control-01.txt>.

	[I-D.briscoe-tsvwg-l4s-diffserv]	[I-D.briscoe-tsvwg-l4s-diffserv]
	Briscoe, B., "Interactions between Low Latency, Low Loss,	Briscoe, B., "Interactions between Low Latency, Low Loss,
	Scalable Throughput (L4S) and Differentiated Services",	Scalable Throughput (L4S) and Differentiated Services",
	Work in Progress, Internet-Draft, draft-briscoe-tsvwg-l4s-	Work in Progress, Internet-Draft, draft-briscoe-tsvwg-l4s-
	diffserv-02, 4 November 2018,	diffserv-02, 4 November 2018,

	<https://datatracker.ietf.org/doc/html/draft-briscoe-	<https://www.ietf.org/archive/id/draft-briscoe-tsvwg-l4s-
	tsvwg-l4s-diffserv-02>.	diffserv-02.txt>.

	[I-D.cardwell-iccrg-bbr-congestion-control]	[I-D.cardwell-iccrg-bbr-congestion-control]
	Cardwell, N., Cheng, Y., Yeganeh, S. H., Swett, I., and V.	Cardwell, N., Cheng, Y., Yeganeh, S. H., Swett, I., and V.
	Jacobson, "BBR Congestion Control", Work in Progress,	Jacobson, "BBR Congestion Control", Work in Progress,
	Internet-Draft, draft-cardwell-iccrg-bbr-congestion-	Internet-Draft, draft-cardwell-iccrg-bbr-congestion-
	control-02, 7 March 2022,	control-02, 7 March 2022,

	<https://datatracker.ietf.org/doc/html/draft-cardwell-	<https://www.ietf.org/archive/id/draft-cardwell-iccrg-bbr-
	iccrg-bbr-congestion-control-02>.	congestion-control-02.txt>.

	[I-D.ietf-tsvwg-l4s-arch]	[I-D.ietf-tsvwg-l4s-arch]
	Briscoe, B., De Schepper, K., Bagnulo, M., and G. White,	Briscoe, B., De Schepper, K., Bagnulo, M., and G. White,
	"Low Latency, Low Loss, Scalable Throughput (L4S) Internet	"Low Latency, Low Loss, Scalable Throughput (L4S) Internet
	Service: Architecture", Work in Progress, Internet-Draft,	Service: Architecture", Work in Progress, Internet-Draft,

	draft-ietf-tsvwg-l4s-arch-18, 7 July 2022,	draft-ietf-tsvwg-l4s-arch-19, 27 July 2022,
	<https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg-	<https://www.ietf.org/archive/id/draft-ietf-tsvwg-l4s-
	l4s-arch-18>.	arch-19.txt>.

	[L4Sdemo16]	[L4Sdemo16]
	Bondarenko, O., De Schepper, K., Tsang, I., and B.	Bondarenko, O., De Schepper, K., Tsang, I., and B.
	Briscoe, "Ultra-Low Delay for All: Live Experience, Live	Briscoe, "Ultra-Low Delay for All: Live Experience, Live
	Analysis", Proc. MMSYS'16 pp33:1--33:4, May 2016,	Analysis", Proc. MMSYS'16 pp33:1--33:4, May 2016,
	<http://dl.acm.org/citation.cfm?doid=2910017.2910633	<http://dl.acm.org/citation.cfm?doid=2910017.2910633
	(videos of demos:	(videos of demos:
	https://riteproject.eu/dctth/#1511dispatchwg )>.	https://riteproject.eu/dctth/#1511dispatchwg )>.

	[L4S_5G] Willars, P., Wittenmark, E., Ronkainen, H., Östberg, C.,	[L4S_5G] Willars, P., Wittenmark, E., Ronkainen, H., Östberg, C.,

	skipping to change at page 34, line 12 ¶	skipping to change at page 33, line 12 ¶
	Partridge, C., Peterson, L., Ramakrishnan, K., Shenker,	Partridge, C., Peterson, L., Ramakrishnan, K., Shenker,
	S., Wroclawski, J., and L. Zhang, "Recommendations on	S., Wroclawski, J., and L. Zhang, "Recommendations on
	Queue Management and Congestion Avoidance in the	Queue Management and Congestion Avoidance in the
	Internet", RFC 2309, DOI 10.17487/RFC2309, April 1998,	Internet", RFC 2309, DOI 10.17487/RFC2309, April 1998,
	<https://www.rfc-editor.org/info/rfc2309>.	<https://www.rfc-editor.org/info/rfc2309>.

	[RFC2914] Floyd, S., "Congestion Control Principles", BCP 41,	[RFC2914] Floyd, S., "Congestion Control Principles", BCP 41,
	RFC 2914, DOI 10.17487/RFC2914, September 2000,	RFC 2914, DOI 10.17487/RFC2914, September 2000,
	<https://www.rfc-editor.org/info/rfc2914>.	<https://www.rfc-editor.org/info/rfc2914>.


	[RFC3246] Davie, B., Charny, A., Bennet, J.C.R., Benson, K., Le	[RFC3246] Davie, B., Charny, A., Bennet, J C R., Benson, K., Le
	Boudec, J.Y., Courtney, W., Davari, S., Firoiu, V., and D.	Boudec, J Y., Courtney, W., Davari, S., Firoiu, V., and D.
	Stiliadis, "An Expedited Forwarding PHB (Per-Hop	Stiliadis, "An Expedited Forwarding PHB (Per-Hop
	Behavior)", RFC 3246, DOI 10.17487/RFC3246, March 2002,	Behavior)", RFC 3246, DOI 10.17487/RFC3246, March 2002,
	<https://www.rfc-editor.org/info/rfc3246>.	<https://www.rfc-editor.org/info/rfc3246>.

	[RFC3649] Floyd, S., "HighSpeed TCP for Large Congestion Windows",	[RFC3649] Floyd, S., "HighSpeed TCP for Large Congestion Windows",
	RFC 3649, DOI 10.17487/RFC3649, December 2003,	RFC 3649, DOI 10.17487/RFC3649, December 2003,
	<https://www.rfc-editor.org/info/rfc3649>.	<https://www.rfc-editor.org/info/rfc3649>.

	[RFC5033] Floyd, S. and M. Allman, "Specifying New Congestion	[RFC5033] Floyd, S. and M. Allman, "Specifying New Congestion
	Control Algorithms", BCP 133, RFC 5033,	Control Algorithms", BCP 133, RFC 5033,

	skipping to change at page 46, line 32 ¶	skipping to change at page 45, line 32 ¶
	a burst arrives at an empty queue, the sojourn time only fully	a burst arrives at an empty queue, the sojourn time only fully
	measures the burst's delay when its last packet is dequeued, even	measures the burst's delay when its last packet is dequeued, even
	though the queue has known the size of the burst since its last	though the queue has known the size of the burst since its last
	packet was enqueued - so it could have signalled congestion	packet was enqueued - so it could have signalled congestion
	earlier. To remedy this, each head packet can be marked when it	earlier. To remedy this, each head packet can be marked when it
	is dequeued based on the expected delay of the tail packet behind	is dequeued based on the expected delay of the tail packet behind
	it, as explained below, rather than based on the head packet's	it, as explained below, rather than based on the head packet's
	own delay due to the packets in front of it. [Heist21] identifies	own delay due to the packets in front of it. [Heist21] identifies
	a specific scenario where bursty traffic significantly hits	a specific scenario where bursty traffic significantly hits
	utilization of the L queue. If this effect proves to be more	utilization of the L queue. If this effect proves to be more

	widely applicable, it is believed that using the delay behind the	widely applicable, using the delay behind the head could improve
	head would improve performance.	performance.

	The delay behind the head can be implemented by dividing the	The delay behind the head can be implemented by dividing the
	backlog at dequeue by the link rate or equivalently multiplying	backlog at dequeue by the link rate or equivalently multiplying
	the backlog by the delay per unit of backlog. The implementation	the backlog by the delay per unit of backlog. The implementation
	details will depend on whether the link rate is known; if it is	details will depend on whether the link rate is known; if it is
	not, a moving average of the delay per unit backlog can be	not, a moving average of the delay per unit backlog can be
	maintained. This delay consists of serialization as well as	maintained. This delay consists of serialization as well as
	media acquisition for shared media. So the details will depend	media acquisition for shared media. So the details will depend
	strongly on the specific link technology. This approach should be	strongly on the specific link technology. This approach should be
	less sensitive to timing errors and cost less in operations and	less sensitive to timing errors and cost less in operations and

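The delay-behind-the-head calculation described in this hunk can be sketched roughly as follows. This is only an illustration, not code from the draft: the class name, the EWMA gain of 0.125 and the way samples are taken (a packet's sojourn time divided by the backlog it waited behind) are assumptions made here.

```python
from typing import Optional


class DelayBehindHead:
    """Estimate the delay of the tail packet from the backlog seen at dequeue."""

    def __init__(self, link_rate_bps: Optional[float] = None, gain: float = 0.125):
        self.link_rate_bps = link_rate_bps  # None means the link rate is unknown
        self.gain = gain                    # EWMA weight (hypothetical default)
        self.delay_per_byte: Optional[float] = None  # seconds per byte of backlog

    def observe(self, sojourn_s: float, backlog_bytes: int) -> None:
        """Fold one measured sample into the moving average.

        Assumption: a sample is a packet's sojourn time divided by the
        backlog (in bytes) it had to wait behind.
        """
        if backlog_bytes <= 0:
            return
        sample = sojourn_s / backlog_bytes
        if self.delay_per_byte is None:
            self.delay_per_byte = sample  # first sample seeds the average
        else:
            self.delay_per_byte += self.gain * (sample - self.delay_per_byte)

    def estimate(self, backlog_bytes: int) -> float:
        """Return the expected delay (seconds) of the packet at the tail."""
        if self.link_rate_bps is not None:
            # Known link rate: divide the backlog by the rate.
            return backlog_bytes * 8 / self.link_rate_bps
        if self.delay_per_byte is None:
            return 0.0  # no samples yet
        # Unknown link rate: multiply backlog by delay per unit of backlog.
        return backlog_bytes * self.delay_per_byte
```

With a known 10 Mb/s link, a 12500-byte backlog gives an estimate of 10 ms. With an unknown rate, the estimate follows the moving average, which would also absorb media acquisition delay on shared media.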
	skipping to change at page 59, line 45 ¶	skipping to change at page 58, line 45 ¶
	13: continue % continue to the top of the while loop	13: continue % continue to the top of the while loop
	14: }	14: }
	15: mark(pkt)	15: mark(pkt)
	16: }	16: }
	17: }	17: }
	18: return(pkt) % return the packet and stop here	18: return(pkt) % return the packet and stop here
	19: }	19: }
	20: return(NULL) % no packet to dequeue	20: return(NULL) % no packet to dequeue
	21: }	21: }


	Figure 11: Optimised Example Dequeue Pseudocode for Coupled DualQ	Figure 11: Optimised Example Dequeue Pseudocode for DualQ Coupled
	AQM using Integer Arithmetic	AQM using Integer Arithmetic

	The two ranges, range_L and range_C are expressed as powers of 2 so	The two ranges, range_L and range_C are expressed as powers of 2 so
	that division can be implemented as a right bit-shift (>>) in lines 5	that division can be implemented as a right bit-shift (>>) in lines 5
	and 10 of the integer variant of the pseudocode (Figure 11).	and 10 of the integer variant of the pseudocode (Figure 11).
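As a small illustration of why the ranges are chosen as powers of 2, the sketch below shows that dividing a non-negative integer by 2^n gives the same result as shifting it right by n bits. The function name and the exponent used in the example are hypothetical and are not taken from Figure 11.

```python
def divide_by_range(value: int, range_log2: int) -> int:
    """Divide a non-negative integer by 2**range_log2 using a right bit-shift."""
    if value < 0 or range_log2 < 0:
        raise ValueError("value and range_log2 must be non-negative")
    return value >> range_log2


# Illustrative exponent only: a range of 2**3 = 8.
EXAMPLE_RANGE_LOG2 = 3
quotient = divide_by_range(1000, EXAMPLE_RANGE_LOG2)
```

The shift avoids an integer division instruction, which may matter on hardware where division is costly.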

	For the integer variant of the pseudocode, an integer version of the	For the integer variant of the pseudocode, an integer version of the
	rand() function used at line 25 of the maxrand() function in Figure 10	rand() function used at line 25 of the maxrand() function in Figure 10
	would be arranged to return an integer in the range 0 <= maxrand() <	would be arranged to return an integer in the range 0 <= maxrand() <
	2^32 (not shown). This would scale up all the floating point	2^32 (not shown). This would scale up all the floating point

	skipping to change at page 65, line 8 ¶	skipping to change at page 64, line 8 ¶
	derived from a typical RTT for the Internet.	derived from a typical RTT for the Internet.

	As a non-Internet example, for localized traffic from a particular	As a non-Internet example, for localized traffic from a particular
	ISP's data centre, using the measured RTTs, it was calculated that a	ISP's data centre, using the measured RTTs, it was calculated that a
	value of k = 8 would achieve throughput equivalence, and experiments	value of k = 8 would achieve throughput equivalence, and experiments
	verified the formula very closely.	verified the formula very closely.

	But, for a typical mix of RTTs across the general Internet, a value	But, for a typical mix of RTTs across the general Internet, a value
	of k=2 is recommended as a good workable compromise.	of k=2 is recommended as a good workable compromise.
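To make the role of k concrete, here is a minimal sketch of the coupling. It assumes the relation defined elsewhere in this specification (not shown in this excerpt): the Classic probability is the square of the base probability p', and the coupled L4S probability is k times p', so that p_C = (p_CL / k)^2. The function and variable names are chosen here for illustration.

```python
from typing import Tuple


def coupled_probabilities(p_base: float, k: float = 2.0) -> Tuple[float, float]:
    """Return (p_CL, p_C) for a base probability p' and coupling factor k."""
    if not 0.0 <= p_base <= 1.0:
        raise ValueError("p_base must be within [0, 1]")
    p_cl = min(1.0, k * p_base)  # coupled marking probability for L4S traffic
    p_c = p_base ** 2            # drop/mark probability for Classic traffic
    return p_cl, p_c


# Internet compromise (k = 2) versus the data-centre example (k = 8).
p_cl_internet, p_c_internet = coupled_probabilities(0.1, k=2.0)
p_cl_dc, p_c_dc = coupled_probabilities(0.1, k=8.0)
```

For the same base probability, a larger k raises the L4S marking level relative to the Classic level, which is how k shifts the balance of throughput between the two types of flow.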


		Acknowledgements

		Thanks to Anil Agarwal, Sowmini Varadhan, Gabi Bracha, Nicolas Kuhn,
		Greg Skinner, Tom Henderson, David Pullen, Mirja Kuehlewind, Gorry
		Fairhurst, Pete Heist, Ermin Sakic and Martin Duke for detailed
		review comments particularly of the appendices and suggestions on how
		to make the explanations clearer. Thanks also to Tom Henderson for
		insights on the choice of schedulers and queue delay measurement
		techniques. And thanks to the area reviewer, Christer Holmberg.

		The early contributions of Koen De Schepper, Bob Briscoe, Olga
		Bondarenko and Inton Tsang were part-funded by the European Community
		under its Seventh Framework Programme through the Reducing Internet
		Transport Latency (RITE) project (ICT-317700). Contributions of Koen
		De Schepper and Olivier Tilmans were also part-funded by the 5Growth
		and DAEMON EU H2020 projects. Bob Briscoe's contribution was also
		part-funded by the Comcast Innovation Fund and the Research Council
		of Norway through the TimeIn project. The views expressed here are
		solely those of the authors.

		Contributors

		The following contributed implementations and evaluations that
		validated and helped to improve this specification:

		Olga Albisser <olga@albisser.org> of Simula Research Lab, Norway
		(Olga Bondarenko during early drafts) implemented the prototype
		DualPI2 AQM for Linux with Koen De Schepper and conducted
		extensive evaluations as well as implementing the live performance
		visualization GUI [L4Sdemo16].

		Olivier Tilmans <olivier.tilmans@nokia-bell-labs.com> of Nokia
		Bell Labs, Belgium prepared and maintains the Linux implementation
		of DualPI2 for upstreaming.

		Shravya K.S. wrote a model for the ns-3 simulator based on the -01
		version of this Internet-Draft. Based on this initial work, Tom
		Henderson <tomh@tomh.org> updated that earlier model and created a
		model for the DualQ variant specified as part of the Low Latency
		DOCSIS specification, as well as conducting extensive evaluations.

		Ing Jyh (Inton) Tsang of Nokia, Belgium built the End-to-End Data
		Centre to the Home broadband testbed on which DualQ Coupled AQM
		implementations were tested.

	Authors' Addresses	Authors' Addresses

	Koen De Schepper	Koen De Schepper
	Nokia Bell Labs	Nokia Bell Labs
	Antwerp	Antwerp
	Belgium	Belgium
	Email: koen.de_schepper@nokia.com	Email: koen.de_schepper@nokia.com
	URI: https://www.bell-labs.com/usr/koen.de_schepper	URI: https://www.bell-labs.com/usr/koen.de_schepper

	Bob Briscoe (editor)	Bob Briscoe (editor)

End of changes. 30 change blocks.
	133 lines changed or deleted	143 lines changed or added