< draft-ietf-tsvwg-ecn-l4s-id-27.txt   draft-ietf-tsvwg-ecn-l4s-id-28a.txt >
Transport Services (tsv) K. De Schepper Transport Services (tsv) K. De Schepper
Internet-Draft Nokia Bell Labs Internet-Draft Nokia Bell Labs
Intended status: Experimental B. Briscoe, Ed. Intended status: Experimental B. Briscoe, Ed.
Expires: 28 January 2023 Independent Expires: 5 February 2023 Independent
27 July 2022 4 August 2022
Explicit Congestion Notification (ECN) Protocol for Very Low Queuing Explicit Congestion Notification (ECN) Protocol for Very Low Queuing
Delay (L4S) Delay (L4S)
draft-ietf-tsvwg-ecn-l4s-id-27 draft-ietf-tsvwg-ecn-l4s-id-28
Abstract Abstract
This specification defines the protocol to be used for a new network This specification defines the protocol to be used for a new network
service called low latency, low loss and scalable throughput (L4S). service called low latency, low loss and scalable throughput (L4S).
L4S uses an Explicit Congestion Notification (ECN) scheme at the IP L4S uses an Explicit Congestion Notification (ECN) scheme at the IP
layer that is similar to the original (or 'Classic') ECN approach, layer that is similar to the original (or 'Classic') ECN approach,
except as specified within. L4S uses 'scalable' congestion control, except as specified within. L4S uses 'scalable' congestion control,
which induces much more frequent control signals from the network and which induces much more frequent control signals from the network and
it responds to them with much more fine-grained adjustments, so that it responds to them with much more fine-grained adjustments, so that
very low (typically sub-millisecond on average) and consistently low very low (typically sub-millisecond on average) and consistently low
queuing delay becomes possible for L4S traffic without compromising queuing delay becomes possible for L4S traffic without compromising
link utilization. Thus even capacity-seeking (TCP-like) traffic can link utilization. Thus even capacity-seeking (TCP-like) traffic can
have high bandwidth and very low delay at the same time, even during have high bandwidth and very low delay at the same time, even during
periods of high traffic load. periods of high traffic load. The L4S identifier defined in this
document distinguishes L4S from 'Classic' (e.g. TCP-Reno-friendly)
The L4S identifier defined in this document distinguishes L4S from traffic. Then, network bottlenecks can be incrementally modified to
'Classic' (e.g. TCP-Reno-friendly) traffic. It gives an incremental
migration path so that suitably modified network bottlenecks can
distinguish and isolate existing traffic that still follows the distinguish and isolate existing traffic that still follows the
Classic behaviour, to prevent it degrading the low queuing delay and Classic behaviour, to prevent it degrading the low queuing delay and
low loss of L4S traffic. This specification defines the rules that low loss of L4S traffic. This experimental track specification
L4S transports and network elements need to follow with the intention defines the rules that L4S transports and network elements need to
that L4S flows neither harm each other's performance nor that of follow, with the intention that L4S flows neither harm each other's
Classic traffic. Examples of new active queue management (AQM) performance nor that of Classic traffic. It also suggests open
marking algorithms and examples of new transports (whether TCP-like questions to be investigated during experimentation. Examples of new
or real-time) are specified separately. active queue management (AQM) marking algorithms and examples of new
transports (whether TCP-like or real-time) are specified separately.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on 28 January 2023. This Internet-Draft will expire on 5 February 2023.
Copyright Notice Copyright Notice
Copyright (c) 2022 IETF Trust and the persons identified as the Copyright (c) 2022 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/ Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document. license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights Please review these documents carefully, as they describe your rights
skipping to change at page 2, line 32 skipping to change at page 2, line 32
extracted from this document must include Revised BSD License text as extracted from this document must include Revised BSD License text as
described in Section 4.e of the Trust Legal Provisions and are described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Revised BSD License. provided without warranty as described in the Revised BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 4 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 4
1.1. Latency, Loss and Scaling Problems . . . . . . . . . . . 5 1.1. Latency, Loss and Scaling Problems . . . . . . . . . . . 5
1.2. Terminology . . . . . . . . . . . . . . . . . . . . . . . 7 1.2. Terminology . . . . . . . . . . . . . . . . . . . . . . . 7
1.3. Scope . . . . . . . . . . . . . . . . . . . . . . . . . . 9 1.3. Scope . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2. Choice of L4S Packet Identifier: Requirements . . . . . . . . 10 2. L4S Packet Identification: Document Roadmap . . . . . . . . . 10
3. L4S Packet Identification . . . . . . . . . . . . . . . . . . 11 3. Choice of L4S Packet Identifier: Requirements . . . . . . . . 11
4. Transport Layer Behaviour (the 'Prague Requirements') . . . . 11 4. Transport Layer Behaviour (the 'Prague Requirements') . . . . 12
4.1. Codepoint Setting . . . . . . . . . . . . . . . . . . . . 12 4.1. Codepoint Setting . . . . . . . . . . . . . . . . . . . . 12
4.2. Prerequisite Transport Feedback . . . . . . . . . . . . . 12 4.2. Prerequisite Transport Feedback . . . . . . . . . . . . . 13
4.3. Prerequisite Congestion Response . . . . . . . . . . . . 13 4.3. Prerequisite Congestion Response . . . . . . . . . . . . 14
4.3.1. Guidance on Congestion Response in the RFC Series . . 16 4.3.1. Guidance on Congestion Response in the RFC Series . . 17
4.4. Filtering or Smoothing of ECN Feedback . . . . . . . . . 19 4.4. Filtering or Smoothing of ECN Feedback . . . . . . . . . 20
5. Network Node Behaviour . . . . . . . . . . . . . . . . . . . 19 5. Network Node Behaviour . . . . . . . . . . . . . . . . . . . 20
5.1. Classification and Re-Marking Behaviour . . . . . . . . . 19 5.1. Classification and Re-Marking Behaviour . . . . . . . . . 20
5.2. The Strength of L4S CE Marking Relative to Drop . . . . . 21 5.2. The Strength of L4S CE Marking Relative to Drop . . . . . 22
5.3. Exception for L4S Packet Identification by Network Nodes 5.3. Exception for L4S Packet Identification by Network Nodes
with Transport-Layer Awareness . . . . . . . . . . . . . 22 with Transport-Layer Awareness . . . . . . . . . . . . . 23
5.4. Interaction of the L4S Identifier with other 5.4. Interaction of the L4S Identifier with other
Identifiers . . . . . . . . . . . . . . . . . . . . . . . 22 Identifiers . . . . . . . . . . . . . . . . . . . . . . . 23
5.4.1. DualQ Examples of Other Identifiers Complementing L4S 5.4.1. DualQ Examples of Other Identifiers Complementing L4S
Identifiers . . . . . . . . . . . . . . . . . . . . . 22 Identifiers . . . . . . . . . . . . . . . . . . . . . 23
5.4.1.1. Inclusion of Additional Traffic with L4S . . . . 22 5.4.1.1. Inclusion of Additional Traffic with L4S . . . . 23
5.4.1.2. Exclusion of Traffic From L4S Treatment . . . . . 24 5.4.1.2. Exclusion of Traffic From L4S Treatment . . . . . 25
5.4.1.3. Generalized Combination of L4S and Other 5.4.1.3. Generalized Combination of L4S and Other
Identifiers . . . . . . . . . . . . . . . . . . . . 25 Identifiers . . . . . . . . . . . . . . . . . . . . 26
5.4.2. Per-Flow Queuing Examples of Other Identifiers 5.4.2. Per-Flow Queuing Examples of Other Identifiers
Complementing L4S Identifiers . . . . . . . . . . . . 27 Complementing L4S Identifiers . . . . . . . . . . . . 28
5.5. Limiting Packet Bursts from Links . . . . . . . . . . . . 27 5.5. Limiting Packet Bursts from Links . . . . . . . . . . . . 28
5.5.1. Limiting Packet Bursts from Links Fed by an L4S 5.5.1. Limiting Packet Bursts from Links Fed by an L4S
AQM . . . . . . . . . . . . . . . . . . . . . . . . . 27
5.5.2. Limiting Packet Bursts from Links Upstream of an L4S
AQM . . . . . . . . . . . . . . . . . . . . . . . . . 28 AQM . . . . . . . . . . . . . . . . . . . . . . . . . 28
6. Behaviour of Tunnels and Encapsulations . . . . . . . . . . . 28 5.5.2. Limiting Packet Bursts from Links Upstream of an L4S
6.1. No Change to ECN Tunnels and Encapsulations in General . 28 AQM . . . . . . . . . . . . . . . . . . . . . . . . . 29
6.2. VPN Behaviour to Avoid Limitations of Anti-Replay . . . . 29 6. Behaviour of Tunnels and Encapsulations . . . . . . . . . . . 29
7. L4S Experiments . . . . . . . . . . . . . . . . . . . . . . . 30 6.1. No Change to ECN Tunnels and Encapsulations in General . 29
7.1. Open Questions . . . . . . . . . . . . . . . . . . . . . 30 6.2. VPN Behaviour to Avoid Limitations of Anti-Replay . . . . 30
7.2. Open Issues . . . . . . . . . . . . . . . . . . . . . . . 32 7. L4S Experiments . . . . . . . . . . . . . . . . . . . . . . . 31
7.3. Future Potential . . . . . . . . . . . . . . . . . . . . 32 7.1. Open Questions . . . . . . . . . . . . . . . . . . . . . 31
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 33 7.2. Open Issues . . . . . . . . . . . . . . . . . . . . . . . 33
9. Security Considerations . . . . . . . . . . . . . . . . . . . 33 7.3. Future Potential . . . . . . . . . . . . . . . . . . . . 33
10. References . . . . . . . . . . . . . . . . . . . . . . . . . 34 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 34
10.1. Normative References . . . . . . . . . . . . . . . . . . 34 9. Security Considerations . . . . . . . . . . . . . . . . . . . 34
10.2. Informative References . . . . . . . . . . . . . . . . . 35 10. References . . . . . . . . . . . . . . . . . . . . . . . . . 35
Appendix A. Rationale for the 'Prague L4S Requirements' . . . . 44 10.1. Normative References . . . . . . . . . . . . . . . . . . 35
10.2. Informative References . . . . . . . . . . . . . . . . . 36
Appendix A. Rationale for the 'Prague L4S Requirements' . . . . 45
A.1. Rationale for the Requirements for Scalable Transport A.1. Rationale for the Requirements for Scalable Transport
Protocols . . . . . . . . . . . . . . . . . . . . . . . . 45 Protocols . . . . . . . . . . . . . . . . . . . . . . . . 46
A.1.1. Use of L4S Packet Identifier . . . . . . . . . . . . 45 A.1.1. Use of L4S Packet Identifier . . . . . . . . . . . . 46
A.1.2. Accurate ECN Feedback . . . . . . . . . . . . . . . . 45 A.1.2. Accurate ECN Feedback . . . . . . . . . . . . . . . . 46
A.1.3. Capable of Replacement by Classic Congestion A.1.3. Capable of Replacement by Classic Congestion
Control . . . . . . . . . . . . . . . . . . . . . . . 46 Control . . . . . . . . . . . . . . . . . . . . . . . 47
A.1.4. Fall back to Classic Congestion Control on Packet A.1.4. Fall back to Classic Congestion Control on Packet
Loss . . . . . . . . . . . . . . . . . . . . . . . . 46 Loss . . . . . . . . . . . . . . . . . . . . . . . . 47
A.1.5. Coexistence with Classic Congestion Control at Classic A.1.5. Coexistence with Classic Congestion Control at Classic
ECN bottlenecks . . . . . . . . . . . . . . . . . . . 47 ECN bottlenecks . . . . . . . . . . . . . . . . . . . 48
A.1.6. Reduce RTT dependence . . . . . . . . . . . . . . . . 51 A.1.6. Reduce RTT dependence . . . . . . . . . . . . . . . . 51
A.1.7. Scaling down to fractional congestion windows . . . . 52 A.1.7. Scaling down to fractional congestion windows . . . . 53
A.1.8. Measuring Reordering Tolerance in Time Units . . . . 53 A.1.8. Measuring Reordering Tolerance in Time Units . . . . 54
A.2. Scalable Transport Protocol Optimizations . . . . . . . . 56 A.2. Scalable Transport Protocol Optimizations . . . . . . . . 57
A.2.1. Setting ECT in Control Packets and Retransmissions . 56 A.2.1. Setting ECT in Control Packets and Retransmissions . 57
A.2.2. Faster than Additive Increase . . . . . . . . . . . . 57 A.2.2. Faster than Additive Increase . . . . . . . . . . . . 57
A.2.3. Faster Convergence at Flow Start . . . . . . . . . . 57 A.2.3. Faster Convergence at Flow Start . . . . . . . . . . 58
Appendix B. Compromises in the Choice of L4S Identifier . . . . 58 Appendix B. Compromises in the Choice of L4S Identifier . . . . 58
Appendix C. Potential Competing Uses for the ECT(1) Codepoint . 63 Appendix C. Potential Competing Uses for the ECT(1) Codepoint . 63
C.1. Integrity of Congestion Feedback . . . . . . . . . . . . 63 C.1. Integrity of Congestion Feedback . . . . . . . . . . . . 63
C.2. Notification of Less Severe Congestion than CE . . . . . 64 C.2. Notification of Less Severe Congestion than CE . . . . . 65
Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . 65 Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . 65
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 65 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 66
1. Introduction 1. Introduction
This specification defines the protocol to be used for a new network This specification defines the protocol to be used for a new network
service called low latency, low loss and scalable throughput (L4S). service called low latency, low loss and scalable throughput (L4S).
L4S uses an Explicit Congestion Notification (ECN) scheme at the IP L4S uses an Explicit Congestion Notification (ECN) scheme at the IP
layer with the same set of codepoint transitions as the original (or layer with the same set of codepoint transitions as the original (or
'Classic') Explicit Congestion Notification (ECN [RFC3168]). 'Classic') Explicit Congestion Notification (ECN [RFC3168]).
RFC 3168 required an ECN mark to be equivalent to a drop, both when RFC 3168 required an ECN mark to be equivalent to a drop, both when
applied in the network and when responded to by a transport. Unlike applied in the network and when responded to by a transport. Unlike
Classic ECN marking, the network applies L4S marking more immediately Classic ECN marking: i) the network applies L4S marking more
and more aggressively than drop, and the transport response to each immediately and more frequently than drop; and ii) the transport
mark is reduced and smoothed relative to that for drop. The two response to each mark is reduced and smoothed relative to that for
changes counterbalance each other so that the throughput of an L4S drop. The two changes counterbalance each other so that the
flow will be roughly the same as a comparable non-L4S flow under the throughput of an L4S flow will be roughly the same as a comparable
same conditions. Nonetheless, the much more frequent ECN control non-L4S flow under the same conditions. Nonetheless, the much more
signals and the finer responses to these signals result in very low frequent ECN control signals and the finer responses to these signals
queuing delay without compromising link utilization, and this low result in very low queuing delay without compromising link
delay can be maintained during high load. For instance, queuing utilization, and this low delay can be maintained during high load.
delay under heavy and highly varying load with the example DCTCP/ For instance, queuing delay under heavy and highly varying load with
DualQ solution cited below on a DSL or Ethernet link is sub- the example DCTCP/DualQ solution described below on a DSL or Ethernet
millisecond on average and roughly 1 to 2 milliseconds at the 99th link is sub-millisecond on average and roughly 1 to 2 milliseconds at
percentile without losing link utilization [DualPI2Linux], [DCttH19]. the 99th percentile without losing link utilization [DualPI2Linux],
Note that the inherent queuing delay while waiting to acquire a [DCttH19]. Note that the queuing delay while waiting to acquire a
discontinuous medium such as WiFi has to be minimized in its own shared medium such as wireless has to be added to the above. It is a
right, so it would be additional to the above (see section 6.3 of the different issue that needs to be addressed, but separately (see
L4S architecture [I-D.ietf-tsvwg-l4s-arch]). section 6.3 of the L4S architecture [I-D.ietf-tsvwg-l4s-arch]).
L4S relies on 'scalable' congestion controls for these delay L4S relies on 'scalable' congestion controls for these delay
properties and for preserving low delay as flow rate scales, hence properties and for preserving low delay as flow rate scales, hence
the name. The congestion control used in Data Center TCP (DCTCP) is the name. The congestion control used in Data Center TCP (DCTCP) is
an example of a scalable congestion control, but DCTCP is applicable an example of a scalable congestion control, but DCTCP is applicable
solely to controlled environments like data centres [RFC8257], solely to controlled environments like data centres [RFC8257],
because it is too aggressive to co-exist with existing TCP-Reno- because it is too aggressive to co-exist with existing TCP-Reno-
friendly traffic. The DualQ Coupled AQM, which is defined in a friendly traffic. The DualQ Coupled AQM, which is defined in a
complementary experimental complementary experimental
specification [I-D.ietf-tsvwg-aqm-dualq-coupled], is an AQM framework specification [I-D.ietf-tsvwg-aqm-dualq-coupled], is an AQM framework
that enables scalable congestion controls derived from DCTCP to co- that enables scalable congestion controls derived from DCTCP to co-
exist with existing traffic, each getting roughly the same flow rate exist with existing traffic, each getting roughly the same flow rate
when they compete under similar conditions. Note that a scalable when they compete under similar conditions. Note that a scalable
congestion control is still not safe to deploy on the Internet unless congestion control is still not safe to deploy on the Internet unless
it satisfies the requirements listed in Section 4. it satisfies the requirements listed in Section 4.
L4S is not only for elastic (TCP-like) traffic - there are scalable L4S is not only for elastic (TCP-like) traffic - there are scalable
congestion controls for real-time media, such as the L4S variant of congestion controls for real-time media, such as the L4S variant
the SCReAM [RFC8298] real-time media congestion avoidance technique [SCReAM-L4S] of the SCReAM [RFC8298] real-time media congestion
(RMCAT). The factor that distinguishes L4S from Classic traffic is avoidance technique (RMCAT). The factor that distinguishes L4S from
its behaviour in response to congestion. The transport wire Classic traffic is its behaviour in response to congestion. The
protocol, e.g. TCP, QUIC, SCTP, DCCP, RTP/RTCP, is orthogonal (and transport wire protocol, e.g. TCP, QUIC, SCTP, DCCP, RTP/RTCP, is
therefore not suitable for distinguishing L4S from Classic packets). orthogonal (and therefore not suitable for distinguishing L4S from
Classic packets).
The L4S identifier defined in this document is the key piece that The L4S identifier defined in this document is the key piece that
distinguishes L4S from 'Classic' (e.g. Reno-friendly) traffic. It distinguishes L4S from 'Classic' (e.g. Reno-friendly) traffic. Then,
gives an incremental migration path so that suitably modified network network bottlenecks can be incrementally modified to distinguish and
bottlenecks can distinguish and isolate existing Classic traffic from isolate existing Classic traffic from L4S traffic, to prevent the
L4S traffic to prevent the former from degrading the very low delay former from degrading the very low queuing delay and loss of the new
and loss of the new scalable transports, without harming Classic scalable transports, without harming Classic performance at these
performance at these bottlenecks. Initial implementation of the bottlenecks. Although both sender and network deployment are
separate parts of the system has been motivated by the performance required before any benefit, initial implementations of the separate
parts of the system have been motivated by the potential performance
benefits. benefits.
1.1. Latency, Loss and Scaling Problems 1.1. Latency, Loss and Scaling Problems
Latency is becoming the critical performance factor for many (most?) Latency is becoming the critical performance factor for many (most?)
applications on the public Internet, e.g. interactive Web, Web Internet applications, e.g. interactive web, web services, voice,
services, voice, conversational video, interactive video, interactive conversational video, interactive video, interactive remote presence,
remote presence, instant messaging, online gaming, remote desktop, instant messaging, online gaming, remote desktop, cloud-based
cloud-based applications, and video-assisted remote control of applications & services, and remote control of machinery and
machinery and industrial processes. In the 'developed' world, industrial processes. In many parts of the world, further increases
further increases in access network bit-rate offer diminishing in access network bit rate offer diminishing returns [Dukkipati06],
returns, whereas latency is still a multi-faceted problem. In the whereas latency is still a multi-faceted problem. As a result, much
last decade or so, much has been done to reduce propagation time by has been done to reduce propagation time by placing caches or servers
placing caches or servers closer to users. However, queuing remains closer to users. However, queuing remains a major, albeit
a major intermittent component of latency. intermittent, component of latency.
The Diffserv architecture provides Expedited Forwarding [RFC3246], so The Diffserv architecture provides Expedited Forwarding [RFC3246], so
that low latency traffic can jump the queue of other traffic. If that low latency traffic can jump the queue of other traffic. If
growth in high-throughput latency-sensitive applications continues, growth in latency-sensitive applications continues, periods with
periods with solely latency-sensitive traffic will become solely latency-sensitive traffic will become increasingly common on
increasingly common on links where traffic aggregation is low. For links where traffic aggregation is low. During these periods, if all
instance, on the access links dedicated to individual sites (homes, the traffic were marked for the same treatment, Diffserv would make
small enterprises or mobile devices). These links also tend to no difference. The links with low aggregation also tend to become
become the path bottleneck under load. During these periods, if all the path bottleneck under load, for instance, the access links
the traffic were marked for the same treatment, at these bottlenecks dedicated to individual sites (homes, small enterprises or mobile
Diffserv would make no difference. Instead, it becomes imperative to devices). So, instead of differentiation, it becomes imperative to
remove the underlying causes of any unnecessary delay. remove the underlying causes of any unnecessary delay.
The bufferbloat project has shown that excessively large buffering The bufferbloat project has shown that excessively large buffering
('bufferbloat') has been introducing significantly more delay than ('bufferbloat') has been introducing significantly more delay than
the underlying propagation time. These delays appear only the underlying propagation time. These delays appear only
intermittently -- only when a capacity-seeking (e.g. TCP) flow is intermittently -- only when a capacity-seeking (e.g. TCP) flow is
long enough for the queue to fill the buffer, making every packet in long enough for the queue to fill the buffer, causing every packet in
other flows sharing the buffer sit through the queue. other flows sharing the buffer to have to work its way through the
queue.
Active queue management (AQM) was originally developed to solve this Active queue management (AQM) was originally developed to solve this
problem (and others). Unlike Diffserv, which gives low latency to problem (and others). Unlike Diffserv, which gives low latency to
some traffic at the expense of others, AQM controls latency for _all_ some traffic at the expense of others, AQM controls latency for _all_
traffic in a class. In general, AQM methods introduce an increasing traffic in a class. In general, AQM methods introduce an increasing
level of discard from the buffer the longer the queue persists above level of discard from the buffer the longer the queue persists above
a shallow threshold. This gives sufficient signals to capacity- a shallow threshold. This gives sufficient signals to capacity-
seeking (a.k.a. greedy) flows to keep the buffer empty for its intended seeking (a.k.a. greedy) flows to keep the buffer empty for its intended
purpose: absorbing bursts. However, RED [RFC2309] and other purpose: absorbing bursts. However, RED [RFC2309] and other
algorithms from the 1990s were sensitive to their configuration and algorithms from the 1990s were sensitive to their configuration and
hard to set correctly. So, this form of AQM was not widely deployed. hard to set correctly. So, this form of AQM was not widely deployed.
More recent state-of-the-art AQM methods, e.g. FQ-CoDel [RFC8290], More recent state-of-the-art AQM methods, such as FQ-CoDel [RFC8290],
PIE [RFC8033], Adaptive RED [ARED01], are easier to configure, PIE [RFC8033] or Adaptive RED [ARED01], are easier to configure,
because they define the queuing threshold in time not bytes, so it is because they define the queuing threshold in time not bytes, so
invariant for different link rates. However, no matter how good the configuration is invariant whatever the link rate. However, the
AQM, the sawtoothing sending window of a Classic congestion control sawtoothing window of a Classic congestion control creates a dilemma
will either cause queuing delay to vary or cause the link to be for the operator: i) either configure a shallow AQM operating point,
underutilized. Even with a perfectly tuned AQM, the additional so the tips of the sawteeth cause minimal queue delay but the troughs
queuing delay will be of the same order as the underlying speed-of- underutilize the link, or ii) configure the operating point deeper
light delay across the network, thereby roughly doubling the total into the buffer, so the troughs utilize the link better but then the
round-trip time. tips cause more delay variation. Even with a perfectly tuned AQM,
the additional queuing delay at the tips of the sawteeth will be of
the same order as the underlying speed-of-light delay across the
network, thereby roughly doubling the total round-trip time.
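The sawtooth trade-off described above can be illustrated with a small fluid-model sketch. It is not taken from either draft version. It assumes a single Reno-like flow at one bottleneck, an AQM that signals as soon as queuing delay reaches its target, and a window that halves on each signal. The function name, parameter names and the 20 ms base round-trip time are hypothetical.

```python
def reno_sawtooth(base_rtt_ms: float, aqm_target_ms: float) -> dict:
    """Fluid-model sketch of one Reno-like flow at a single bottleneck.

    Assumptions (illustrative only, not from the draft): the AQM signals
    as soon as queuing delay reaches aqm_target_ms, the flow halves its
    window once per signal, and the window is expressed as the time the
    bottleneck link needs to serialize it.
    """
    # At the tip of the sawtooth the window fills the pipe plus the queue.
    peak_window_ms = base_rtt_ms + aqm_target_ms
    # Multiplicative decrease by half, as in Reno.
    trough_window_ms = peak_window_ms / 2.0
    return {
        "queue_at_tip_ms": aqm_target_ms,
        # Any window beyond the base RTT sits in the queue.
        "queue_at_trough_ms": max(0.0, trough_window_ms - base_rtt_ms),
        # A window smaller than the base RTT leaves the link partly idle.
        "utilization_at_trough": min(1.0, trough_window_ms / base_rtt_ms),
    }


if __name__ == "__main__":
    base_rtt_ms = 20.0  # hypothetical speed-of-light round trip
    for target_ms in (1.0, 5.0, 20.0):
        print(target_ms, reno_sawtooth(base_rtt_ms, target_ms))
```

Under these assumptions, a shallow 5 ms target leaves the link only 62.5% utilized at the trough, whereas the link stays full only once the target reaches the base round-trip time, at which point the queuing delay at the tip equals the base delay and the total round-trip time roughly doubles.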
If a sender's own behaviour is introducing queuing delay variation, If a sender's own behaviour is introducing queuing delay variation,
no AQM in the network can 'un-vary' the delay without significantly no AQM in the network can 'un-vary' the delay without significantly
compromising link utilization. Even flow-queuing (e.g. [RFC8290]), compromising link utilization. Even flow-queuing (e.g. [RFC8290]),
which isolates one flow from another, cannot isolate a flow from the which isolates one flow from another, cannot isolate a flow from the
delay variations it inflicts on itself. Therefore those applications delay variations it inflicts on itself. Therefore those applications
that need to seek out high bandwidth but also need low latency will that need to seek out high bandwidth but also need low latency will
have to migrate to scalable congestion control. have to migrate to scalable congestion control, which uses much
smaller sawtooth variations.
Altering host behaviour is not enough on its own though. Even if Altering host behaviour is not enough on its own though. Even if
hosts adopt low latency behaviour (scalable congestion controls), hosts adopt low latency scalable congestion controls, they need to be
they need to be isolated from the behaviour of existing Classic isolated from the large queue variations induced by existing Classic
congestion controls that induce large queue variations. L4S enables congestion controls. L4S AQMs provide that latency isolation in the
that migration by providing latency isolation in the network and network and the L4S identifier enables the AQMs to distinguish the
distinguishing the two types of packets that need to be isolated: L4S two types of packet that need to be isolated: L4S and Classic. L4S
and Classic. L4S isolation can be achieved with a queue per flow isolation can be achieved with a queue per flow (e.g. [RFC8290]) but
(e.g. [RFC8290]) but a DualQ [I-D.ietf-tsvwg-aqm-dualq-coupled] is a DualQ [I-D.ietf-tsvwg-aqm-dualq-coupled] is sufficient, and
sufficient, and actually gives better tail latency. Both approaches actually gives better tail latency. Both approaches are addressed in
are addressed in this document. this document.
The DualQ solution was developed to make very low latency available The DualQ solution was developed to make very low latency available
without requiring per-flow queues at every bottleneck. This was without requiring per-flow queues at every bottleneck. This was
because per-flow-queuing (FQ) has well-known downsides - not least useful because per-flow-queuing (FQ) has well-known downsides - not
the need to inspect transport layer headers in the network, which least the need to inspect transport layer headers in the network,
makes it incompatible with privacy approaches such as IPsec VPN which makes it incompatible with privacy approaches such as IPsec VPN
tunnels, and incompatible with link layer queue management, where tunnels, and incompatible with link layer queue management, where
transport layer headers can be hidden, e.g. 5G. transport layer headers can be hidden, e.g. 5G.
Latency is not the only concern addressed by L4S: It was known when Latency is not the only concern addressed by L4S. It was known when
TCP congestion avoidance was first developed that it would not scale TCP congestion avoidance was first developed that it would not scale
to high bandwidth-delay products (footnote 6 of Jacobson and to high bandwidth-delay products (footnote 6 of Jacobson and
Karels [TCP-CA]). Given regular broadband bit-rates over WAN Karels [TCP-CA]). Given regular broadband bit-rates over WAN
distances are already [RFC3649] beyond the scaling range of Reno distances are already [RFC3649] beyond the scaling range of Reno
congestion control, 'less unscalable' Cubic [RFC8312] and congestion control, 'less unscalable' Cubic [RFC8312] and
Compound [I-D.sridharan-tcpm-ctcp] variants of TCP have been Compound [I-D.sridharan-tcpm-ctcp] variants of TCP have been
successfully deployed. However, these are now approaching their successfully deployed. However, these are now approaching their
scaling limits. Unfortunately, fully scalable congestion controls scaling limits. Unfortunately, fully scalable congestion controls
such as DCTCP [RFC8257] outcompete Classic ECN congestion controls such as DCTCP [RFC8257] outcompete Classic ECN congestion controls
sharing the same queue, which is why they have been confined to sharing the same queue, which is why they have been confined to
skipping to change at page 7, line 36 skipping to change at page 7, line 40
It turns out that these scalable congestion control algorithms that It turns out that these scalable congestion control algorithms that
solve the latency problem can also solve the scalability problem of solve the latency problem can also solve the scalability problem of
Classic congestion controls. The finer sawteeth in the congestion Classic congestion controls. The finer sawteeth in the congestion
window have low amplitude, so they cause very little queuing delay window have low amplitude, so they cause very little queuing delay
variation and the average time to recover from one congestion signal variation and the average time to recover from one congestion signal
to the next (the average duration of each sawtooth) remains to the next (the average duration of each sawtooth) remains
invariant, which maintains constant tight control as flow-rate invariant, which maintains constant tight control as flow-rate
scales. A background paper [DCttH19] gives the full explanation of scales. A background paper [DCttH19] gives the full explanation of
why the design solves both the latency and the scaling problems, both why the design solves both the latency and the scaling problems, both
in plain English and in more precise mathematical form. The in plain English and in more precise mathematical form. The
explanation is summarised without the maths in Section 4 of the L4S explanation is summarised without the mathematics in Section 4 of the
architecture [I-D.ietf-tsvwg-l4s-arch]. L4S architecture [I-D.ietf-tsvwg-l4s-arch].
1.2. Terminology 1.2. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in "OPTIONAL" in this document are to be interpreted as described in
[RFC2119]. In this document, these words will appear with that [RFC2119]. In this document, these words will appear with that
interpretation only when in ALL CAPS. Lower case uses of these words interpretation only when in ALL CAPS. Lower case uses of these words
are not to be interpreted as carrying RFC-2119 significance. are not to be interpreted as carrying RFC-2119 significance.
skipping to change at page 8, line 27 skipping to change at page 8, line 33
time from one congestion signal to the next (the recovery time) time from one congestion signal to the next (the recovery time)
remains invariant as the flow rate scales, all other factors being remains invariant as the flow rate scales, all other factors being
equal. This maintains the same degree of control over queueing equal. This maintains the same degree of control over queueing
and utilization whatever the flow rate, as well as ensuring that and utilization whatever the flow rate, as well as ensuring that
high throughput is robust to disturbances. For instance, DCTCP high throughput is robust to disturbances. For instance, DCTCP
averages 2 congestion signals per round-trip whatever the flow averages 2 congestion signals per round-trip whatever the flow
rate, as do other recently developed scalable congestion controls, rate, as do other recently developed scalable congestion controls,
e.g. Relentless TCP [Mathis09], TCP Prague e.g. Relentless TCP [Mathis09], TCP Prague
[I-D.briscoe-iccrg-prague-congestion-control], [PragueLinux], [I-D.briscoe-iccrg-prague-congestion-control], [PragueLinux],
BBRv2 [BBRv2], [I-D.cardwell-iccrg-bbr-congestion-control] and the BBRv2 [BBRv2], [I-D.cardwell-iccrg-bbr-congestion-control] and the
L4S variant of SCREAM for real-time media [SCReAM], [RFC8298]). L4S variant of SCREAM for real-time media [SCReAM-L4S], [RFC8298].
See Section 4.3 for more explanation. See Section 4.3 for more explanation.
Classic service: The Classic service is intended for all the Classic service: The Classic service is intended for all the
congestion control behaviours that co-exist with Reno [RFC5681] congestion control behaviours that co-exist with Reno [RFC5681]
(e.g. Reno itself, Cubic [RFC8312], (e.g. Reno itself, Cubic [RFC8312],
Compound [I-D.sridharan-tcpm-ctcp], TFRC [RFC5348]). The term Compound [I-D.sridharan-tcpm-ctcp], TFRC [RFC5348]). The term
'Classic queue' means a queue providing the Classic service. 'Classic queue' means a queue providing the Classic service.
Low-Latency, Low-Loss Scalable throughput (L4S) service: The 'L4S' Low-Latency, Low-Loss Scalable throughput (L4S) service: The 'L4S'
service is intended for traffic from scalable congestion control service is intended for traffic from scalable congestion control
algorithms, such as TCP Prague algorithms, such as TCP Prague
[I-D.briscoe-iccrg-prague-congestion-control], which was derived [I-D.briscoe-iccrg-prague-congestion-control], which was derived
from DCTCP [RFC8257]. The L4S service is for more general traffic from DCTCP [RFC8257]. The L4S service is for more general traffic
than just TCP Prague -- it allows the set of congestion controls than just TCP Prague -- it allows the set of congestion controls
with similar scaling properties to Prague to evolve, such as the with similar scaling properties to Prague to evolve, such as the
examples listed above (Relentless, SCReAM). The term 'L4S queue' examples listed above (Relentless, SCReAM, etc.). The term 'L4S
means a queue providing the L4S service. queue' means a queue providing the L4S service.
The terms Classic or L4S can also qualify other nouns, such as The terms Classic or L4S can also qualify other nouns, such as
'queue', 'codepoint', 'identifier', 'classification', 'packet', 'queue', 'codepoint', 'identifier', 'classification', 'packet',
'flow'. For example: an L4S packet means a packet with an L4S 'flow'. For example: an L4S packet means a packet with an L4S
identifier sent from an L4S congestion control. identifier sent from an L4S congestion control.
Both Classic and L4S services can cope with a proportion of Both Classic and L4S services can cope with a proportion of
unresponsive or less-responsive traffic as well, but in the L4S unresponsive or less-responsive traffic as well, but in the L4S
case its rate has to be smooth enough or low enough not to build a case its rate has to be smooth enough or low enough not to build a
queue (e.g. DNS, VoIP, game sync datagrams, etc). queue (e.g. DNS, VoIP, game sync datagrams, etc).
Reno-friendly: The subset of Classic traffic that is friendly to the Reno-friendly: The subset of Classic traffic that is friendly to the
standard Reno congestion control defined for TCP in [RFC5681]. standard Reno congestion control defined for TCP in [RFC5681].
The TFRC spec. [RFC5348] indirectly implies that 'friendly' is The TFRC spec [RFC5348] indirectly implies that 'friendly' is
defined as "generally within a factor of two of the sending rate defined as "generally within a factor of two of the sending rate
of a TCP flow under the same conditions". Reno-friendly is used of a TCP flow under the same conditions". Reno-friendly is used
here in place of 'TCP-friendly', given the latter has become here in place of 'TCP-friendly', given the latter has become
imprecise, because the TCP protocol is now used with so many imprecise, because the TCP protocol is now used with so many
different congestion control behaviours, and Reno is used in non- different congestion control behaviours, and Reno can be used in
TCP transports such as QUIC [RFC9000]. non-TCP transports such as QUIC [RFC9000].
Classic ECN: The original Explicit Congestion Notification (ECN) Classic ECN: The original Explicit Congestion Notification (ECN)
protocol [RFC3168], which requires ECN signals to be treated the protocol [RFC3168], which requires ECN signals to be treated the
same as drops, both when generated in the network and when same as drops, both when generated in the network and when
responded to by the sender. For L4S, the names used for the four responded to by the sender. For L4S, the names used for the four
codepoints of the 2-bit IP-ECN field are unchanged from those codepoints of the 2-bit IP-ECN field are unchanged from those
defined in [RFC3168]: Not ECT, ECT(0), ECT(1) and CE, where ECT defined in [RFC3168]: Not ECT, ECT(0), ECT(1) and CE, where ECT
stands for ECN-Capable Transport and CE stands for Congestion stands for ECN-Capable Transport and CE stands for Congestion
Experienced. A packet marked with the CE codepoint is termed Experienced. A packet marked with the CE codepoint is termed
'ECN-marked' or sometimes just 'marked' where the context makes 'ECN-marked' or sometimes just 'marked' where the context makes
skipping to change at page 10, line 26 skipping to change at page 10, line 34
- the congestion control specifications of various DCCP - the congestion control specifications of various DCCP
congestion control identifier (CCID) profiles [RFC4341], congestion control identifier (CCID) profiles [RFC4341],
[RFC4342], [RFC5622]. [RFC4342], [RFC5622].
This document is about identifiers that are used for interoperation This document is about identifiers that are used for interoperation
between hosts and networks. So the audience is broad, covering between hosts and networks. So the audience is broad, covering
developers of host transports and network AQMs, as well as covering developers of host transports and network AQMs, as well as covering
how operators might wish to combine various identifiers, which would how operators might wish to combine various identifiers, which would
require flexibility from equipment developers. require flexibility from equipment developers.
2. Choice of L4S Packet Identifier: Requirements 2. L4S Packet Identification: Document Roadmap
The L4S treatment is an experimental track alternative packet marking
treatment to the Classic ECN treatment in [RFC3168], which has been
updated by [RFC8311] to allow experiments such as the one defined in
the present specification. [RFC4774] discusses some of the issues
and evaluation criteria when defining alternative ECN semantics,
which are further discussed in Section 4.3.1.
The L4S architecture [I-D.ietf-tsvwg-l4s-arch] describes the three
main components of L4S: the sending host behaviour, the marking
behaviour in the network and the L4S ECN protocol that identifies L4S
packets as they flow between the two.
The next section of the present document (Section 3) records the
requirements that informed the choice of L4S identifier. Then
subsequent sections specify the L4S ECN protocol, which i) identifies
packets that have been sent from hosts that are expected to comply
with a broad type of sending behaviour; and ii) identifies the
marking treatment that network nodes are expected to apply to L4S
packets.
For a packet to receive L4S treatment as it is forwarded, the sender
sets the ECN field in the IP header to the ECT(1) codepoint. See
Section 4 for full transport layer behaviour requirements, including
feedback and congestion response.
A network node that implements the L4S service always classifies
arriving ECT(1) packets for L4S treatment and by default classifies
CE packets for L4S treatment unless the heuristics described in
Section 5.3 are employed. See Section 5 for full network element
behaviour requirements, including classification, ECN-marking and
interaction of the L4S identifier with other identifiers and per-hop
behaviours.
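As an illustration only, the default classification rule described above can be sketched as follows. The function name, the return strings and the ce_heuristic_says_classic flag are hypothetical names introduced for this sketch. The flag merely stands in for the heuristics of Section 5.3, which are not reproduced here. Sending Not-ECT and ECT(0) packets to the Classic treatment is an assumption based on the two-way L4S/Classic split described earlier.

```python
# ECN codepoints of the 2-bit IP-ECN field (names as in RFC 3168).
NOT_ECT = 0b00
ECT_1 = 0b01
ECT_0 = 0b10
CE = 0b11


def classify(ecn: int, ce_heuristic_says_classic: bool = False) -> str:
    """Return "L4S" or "Classic" for a packet's 2-bit ECN field.

    ce_heuristic_says_classic is a hypothetical stand-in for the
    Section 5.3 heuristics; it only affects CE packets.
    """
    if ecn not in (NOT_ECT, ECT_1, ECT_0, CE):
        raise ValueError("ECN field is a 2-bit value")
    if ecn == ECT_1:
        # ECT(1) is always classified for L4S treatment.
        return "L4S"
    if ecn == CE:
        # CE defaults to L4S unless a heuristic overrides the default.
        return "Classic" if ce_heuristic_says_classic else "L4S"
    # Assumption: everything else (Not-ECT, ECT(0)) is Classic.
    return "Classic"
```

Note that only the CE case depends on the flag; ECT(1) packets are classified as L4S regardless.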
L4S ECN works with ECN tunnelling and encapsulation behaviour as is,
except there is one known case where careful attention to
configuration is required, which is detailed in Section 6.
L4S ECN is currently on the experimental track. So Section 7
collects together the general questions and issues that remain open
for investigation during L4S experimentation. Open issues or
questions specific to particular components are called out in the
specifications of each component part, such as the DualQ
[I-D.ietf-tsvwg-aqm-dualq-coupled].
The IANA assignment of the L4S identifier is specified in Section 8.
And Section 9 covers security considerations specific to the L4S
identifier. System security aspects, such as policing and privacy,
are covered in the L4S architecture [I-D.ietf-tsvwg-l4s-arch].
3. Choice of L4S Packet Identifier: Requirements
This subsection briefly records the process that led to the chosen This subsection briefly records the process that led to the chosen
L4S identifier. L4S identifier.
The identifier for packets using the Low Latency, Low Loss, Scalable The identifier for packets using the Low Latency, Low Loss, Scalable
throughput (L4S) service needs to meet the following requirements: throughput (L4S) service needs to meet the following requirements:
* it SHOULD survive end-to-end between source and destination end- * it SHOULD survive end-to-end between source and destination end-
points: across the boundary between host and network, between points: across the boundary between host and network, between
interconnected networks, and through middleboxes; interconnected networks, and through middleboxes;
skipping to change at page 11, line 21 skipping to change at page 12, line 36
all these requirements, particularly given the limited space left in all these requirements, particularly given the limited space left in
the IP header. Therefore a compromise will always be necessary, the IP header. Therefore a compromise will always be necessary,
which is why all the above requirements are expressed with the word which is why all the above requirements are expressed with the word
'SHOULD' not 'MUST'. 'SHOULD' not 'MUST'.
After extensive assessment of alternative schemes, "ECT(1) and CE After extensive assessment of alternative schemes, "ECT(1) and CE
codepoints" was chosen as the best compromise. Therefore this scheme codepoints" was chosen as the best compromise. Therefore this scheme
is defined in detail in the following sections, while Appendix B is defined in detail in the following sections, while Appendix B
records its pros and cons against the above requirements. records its pros and cons against the above requirements.
3. L4S Packet Identification 4. Transport Layer Behaviour (the 'Prague Requirements')
The L4S treatment is an experimental track alternative packet marking
treatment to the Classic ECN treatment in [RFC3168], which has been
updated by [RFC8311] to allow experiments such as the one defined in
the present specification. [RFC4774] discusses some of the issues
and evaluation criteria when defining alternative ECN semantics.
Like Classic ECN, L4S ECN identifies both network and host behaviour:
it identifies the marking treatment that network nodes are expected
to apply to L4S packets, and it identifies packets that have been
sent from hosts that are expected to comply with a broad type of
sending behaviour.
For a packet to receive L4S treatment as it is forwarded, the sender
sets the ECN field in the IP header to the ECT(1) codepoint. See
Section 4 for full transport layer behaviour requirements, including
feedback and congestion response.
A network node that implements the L4S service always classifies This section defines L4S behaviour at the transport layer, also known
arriving ECT(1) packets for L4S treatment and by default classifies as the Prague L4S Requirements (see Appendix A for the origin of the
CE packets for L4S treatment unless the heuristics described in name).
Section 5.3 are employed. See Section 5 for full network element
behaviour requirements, including classification, ECN-marking and
interaction of the L4S identifier with other identifiers and per-hop
behaviours.
4. Transport Layer Behaviour (the 'Prague Requirements')
4.1. Codepoint Setting 4.1. Codepoint Setting
A sender that wishes a packet to receive L4S treatment as it is A sender that wishes a packet to receive L4S treatment as it is
forwarded, MUST set the ECN field in the IP header (v4 or v6) to the forwarded, MUST set the ECN field in the IP header (v4 or v6) to the
ECT(1) codepoint. ECT(1) codepoint.
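A minimal sketch of the codepoint setting, assuming the sender has access to the 8-bit IPv4 TOS (or IPv6 Traffic Class) octet, in which the ECN field occupies the two least significant bits and the Diffserv field the upper six. The helper names set_ecn and mark_for_l4s are hypothetical. How the octet reaches the wire depends on the platform and is outside the scope of this sketch.

```python
# ECN codepoints of the 2-bit IP-ECN field (names as in RFC 3168).
NOT_ECT = 0b00
ECT_1 = 0b01
ECT_0 = 0b10
CE = 0b11

ECN_MASK = 0b11  # the two least significant bits of the octet


def set_ecn(tos_octet: int, codepoint: int) -> int:
    """Return tos_octet with its ECN bits replaced by codepoint.

    The six Diffserv bits are left unchanged.
    """
    if not 0 <= tos_octet <= 0xFF:
        raise ValueError("TOS / Traffic Class is a single octet")
    if codepoint not in (NOT_ECT, ECT_1, ECT_0, CE):
        raise ValueError("ECN codepoint is a 2-bit value")
    return (tos_octet & ~ECN_MASK & 0xFF) | codepoint


def mark_for_l4s(tos_octet: int) -> int:
    """Request L4S treatment by setting the ECN field to ECT(1)."""
    return set_ecn(tos_octet, ECT_1)
```

For example, mark_for_l4s(0xB8) yields 0xB9: the Diffserv bits are preserved and only the ECN field changes.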
4.2. Prerequisite Transport Feedback 4.2. Prerequisite Transport Feedback
For a transport protocol to provide scalable congestion control For a transport protocol to provide scalable congestion control
(Section 4.3) it MUST provide feedback of the extent of CE marking on (Section 4.3) it MUST provide feedback of the extent of CE marking on
skipping to change at page 13, line 27 skipping to change at page 14, line 25
without having to sacrifice utilization. without having to sacrifice utilization.
With a congestion control that sawtooths to probe capacity, this With a congestion control that sawtooths to probe capacity, this
duration is called the recovery time, because each time the sawtooth duration is called the recovery time, because each time the sawtooth
yields, on average it takes this time to recover to its previous high yields, on average it takes this time to recover to its previous high
point. A scalable congestion control does not have to sawtooth, but point. A scalable congestion control does not have to sawtooth, but
it has to coexist with scalable congestion controls that do. it has to coexist with scalable congestion controls that do.
For instance, for DCTCP [RFC8257], TCP Prague For instance, for DCTCP [RFC8257], TCP Prague
[I-D.briscoe-iccrg-prague-congestion-control], [PragueLinux] and the [I-D.briscoe-iccrg-prague-congestion-control], [PragueLinux] and the
L4S variant of SCReAM [RFC8298], the average recovery time is always L4S variant of SCReAM [SCReAM-L4S], [RFC8298], the average recovery
half a round trip (or half a reference round trip), whatever the flow time is always half a round trip (or half a reference round trip),
rate. whatever the flow rate.
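To make the contrast concrete, the following rough sketch compares the recovery time of a Reno-like sawtooth with the invariant half round trip stated above. The Reno figure uses a textbook approximation that is an assumption of this sketch, not part of this specification: the window halves on a congestion signal and then grows by one segment per round trip, and the peak window is taken to be the bandwidth-delay product. The function names and the 1500-byte segment size are likewise hypothetical.

```python
def reno_recovery_round_trips(rate_bps: float, rtt_s: float,
                              segment_bytes: int = 1500) -> float:
    """Approximate round trips for a Reno-like flow to regain its peak.

    Assumption: peak window ~ bandwidth-delay product in segments,
    halved on a signal, then increased by one segment per round trip.
    """
    window_segments = rate_bps * rtt_s / (segment_bytes * 8)
    return window_segments / 2


def scalable_recovery_round_trips(rate_bps: float, rtt_s: float) -> float:
    """Average recovery time quoted above: half a round trip,
    whatever the flow rate (both arguments are deliberately unused)."""
    return 0.5


if __name__ == "__main__":
    rtt = 0.02  # 20 ms round trip
    for rate in (120e6, 1.2e9):
        reno = reno_recovery_round_trips(rate, rtt)
        scal = scalable_recovery_round_trips(rate, rtt)
        print(f"{rate / 1e6:.0f} Mb/s: Reno ~{reno:.0f} round trips "
              f"(~{reno * rtt:.1f} s), scalable {scal} round trips")
```

Under these assumptions, a tenfold increase in flow rate makes the Reno-like recovery time ten times longer, whereas the scalable figure stays at half a round trip.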
As with all transport behaviours, a detailed specification (probably As with all transport behaviours, a detailed specification (probably
an experimental RFC) is expected for each congestion control, an experimental RFC) is expected for each congestion control,
following the guidelines for specifying new congestion control following the guidelines for specifying new congestion control
algorithms in [RFC5033]. In addition it is expected to document algorithms in [RFC5033]. In addition it is expected to document
these L4S-specific matters, specifically the timescale over which the these L4S-specific matters, specifically the timescale over which the
proportionality is averaged, and control of burstiness. The recovery proportionality is averaged, and control of burstiness. The recovery
time requirement above is worded as a 'SHOULD' rather than a 'MUST' time requirement above is worded as a 'SHOULD' rather than a 'MUST'
to allow reasonable flexibility for such implementations. to allow reasonable flexibility for such implementations.
skipping to change at page 18, line 10 skipping to change at page 19, line 5
To summarize, the coexistence problem is confined to cases of To summarize, the coexistence problem is confined to cases of
imperfect flow isolation in an FQ, or in potential cases where a imperfect flow isolation in an FQ, or in potential cases where a
Classic ECN AQM has been deployed in a shared queue (see the L4S Classic ECN AQM has been deployed in a shared queue (see the L4S
operational guidance [I-D.ietf-tsvwg-l4sops] for further details operational guidance [I-D.ietf-tsvwg-l4sops] for further details
including recent surveys attempting to quantify prevalence). including recent surveys attempting to quantify prevalence).
Further, if one of these cases does occur, the coexistence problem Further, if one of these cases does occur, the coexistence problem
does not arise unless sources of Classic and L4S flows are does not arise unless sources of Classic and L4S flows are
simultaneously sharing the same bottleneck queue (e.g. different simultaneously sharing the same bottleneck queue (e.g. different
applications in the same household) and flows of each type have to applications in the same household) and flows of each type have to
be large enough to coincide for long enough for any throughput be large enough to coincide for long enough for any throughput
imbalance to have developed. imbalance to have developed. Therefore, how often the coexistence
problem arises in practice is listed in Section 7 as an open
question that L4S experiments will need to answer.
Severity: Where long-running L4S and Classic flows coincide in a Severity: Where long-running L4S and Classic flows coincide in a
shared queue, testing of one L4S congestion control (TCP Prague) shared queue, testing of one L4S congestion control (TCP Prague)
has found that the imbalance in average throughput between an L4S has found that the imbalance in average throughput between an L4S
and a Classic flow can reach 25:1 in favour of L4S in the worst and a Classic flow can reach 25:1 in favour of L4S in the worst
case [ecn-fallback]. However, when capacity is most scarce, the case [ecn-fallback]. However, when capacity is most scarce, the
Classic flow gets a higher proportion of the link, for instance Classic flow gets a higher proportion of the link, for instance
over a 4 Mb/s link the throughput ratio is below ~10:1 over paths over a 4 Mb/s link the throughput ratio is below ~10:1 over paths
with a base RTT below 100 ms, and falls below ~5:1 for base RTTs with a base RTT below 100 ms, and falls below ~5:1 for base RTTs
below 20ms. below 20ms.
skipping to change at page 36, line 8 skipping to change at page 37, line 8
All", Updated RITE project Technical Report , July 2019, All", Updated RITE project Technical Report , July 2019,
<https://bobbriscoe.net/pubs.html#DCttH_TR>. <https://bobbriscoe.net/pubs.html#DCttH_TR>.
[DualPI2Linux] [DualPI2Linux]
Albisser, O., De Schepper, K., Briscoe, B., Tilmans, O., Albisser, O., De Schepper, K., Briscoe, B., Tilmans, O.,
and H. Steen, "DUALPI2 - Low Latency, Low Loss and and H. Steen, "DUALPI2 - Low Latency, Low Loss and
Scalable (L4S) AQM", Proc. Linux Netdev 0x13 , March 2019, Scalable (L4S) AQM", Proc. Linux Netdev 0x13 , March 2019,
<https://www.netdevconf.org/0x13/session.html?talk- <https://www.netdevconf.org/0x13/session.html?talk-
DUALPI2-AQM>. DUALPI2-AQM>.
[Dukkipati06]
Dukkipati, N. and N. McKeown, "Why Flow-Completion Time is
the Right Metric for Congestion Control", ACM CCR
36(1):59--62, January 2006,
<https://dl.acm.org/doi/10.1145/1111322.1111336>.
[ecn-fallback] [ecn-fallback]
Briscoe, B. and A.S. Ahmed, "TCP Prague Fall-back on Briscoe, B. and A.S. Ahmed, "TCP Prague Fall-back on
Detection of a Classic ECN AQM", bobbriscoe.net Technical Detection of a Classic ECN AQM", bobbriscoe.net Technical
Report TR-BB-2019-002, April 2020, Report TR-BB-2019-002, April 2020,
<https://arxiv.org/abs/1911.00710>. <https://arxiv.org/abs/1911.00710>.
[Heist21] Heist, P. and J. Morton, "L4S Tests", github README, May [Heist21] Heist, P. and J. Morton, "L4S Tests", github README, May
2021, <https://github.com/heistp/l4s-tests/>. 2021, <https://github.com/heistp/l4s-tests/>.
[I-D.briscoe-docsis-q-protection] [I-D.briscoe-docsis-q-protection]
Briscoe, B. and G. White, "The DOCSIS(r) Queue Protection Briscoe, B. and G. White, "The DOCSIS(r) Queue Protection
Algorithm to Preserve Low Latency", Work in Progress, Algorithm to Preserve Low Latency", Work in Progress,
Internet-Draft, draft-briscoe-docsis-q-protection-06, 13 Internet-Draft, draft-briscoe-docsis-q-protection-06, 13
May 2022, <https://datatracker.ietf.org/doc/html/draft- May 2022, <https://www.ietf.org/archive/id/draft-briscoe-
briscoe-docsis-q-protection-06>. docsis-q-protection-06.txt>.
[I-D.briscoe-iccrg-prague-congestion-control] [I-D.briscoe-iccrg-prague-congestion-control]
Schepper, K. D., Tilmans, O., and B. Briscoe, "Prague Schepper, K. D., Tilmans, O., and B. Briscoe, "Prague
Congestion Control", Work in Progress, Internet-Draft, Congestion Control", Work in Progress, Internet-Draft,
draft-briscoe-iccrg-prague-congestion-control-01, 11 July draft-briscoe-iccrg-prague-congestion-control-01, 11 July
2022, <https://datatracker.ietf.org/doc/html/draft- 2022, <https://www.ietf.org/archive/id/draft-briscoe-
briscoe-iccrg-prague-congestion-control-01>. iccrg-prague-congestion-control-01.txt>.
[I-D.briscoe-tsvwg-l4s-diffserv] [I-D.briscoe-tsvwg-l4s-diffserv]
Briscoe, B., "Interactions between Low Latency, Low Loss, Briscoe, B., "Interactions between Low Latency, Low Loss,
Scalable Throughput (L4S) and Differentiated Services", Scalable Throughput (L4S) and Differentiated Services",
Work in Progress, Internet-Draft, draft-briscoe-tsvwg-l4s- Work in Progress, Internet-Draft, draft-briscoe-tsvwg-l4s-
diffserv-02, 4 November 2018, diffserv-02, November 2018,
<https://datatracker.ietf.org/doc/html/draft-briscoe- <https://www.ietf.org/archive/id/draft-briscoe-tsvwg-l4s-
tsvwg-l4s-diffserv-02>. diffserv-02.txt>.
[I-D.cardwell-iccrg-bbr-congestion-control] [I-D.cardwell-iccrg-bbr-congestion-control]
Cardwell, N., Cheng, Y., Yeganeh, S. H., Swett, I., and V. Cardwell, N., Cheng, Y., Yeganeh, S. H., Swett, I., and V.
Jacobson, "BBR Congestion Control", Work in Progress, Jacobson, "BBR Congestion Control", Work in Progress,
Internet-Draft, draft-cardwell-iccrg-bbr-congestion- Internet-Draft, draft-cardwell-iccrg-bbr-congestion-
control-02, 7 March 2022, control-02, March 2022, <https://www.ietf.org/archive/id/
<https://datatracker.ietf.org/doc/html/draft-cardwell- draft-cardwell-iccrg-bbr-congestion-control-02.txt>.
iccrg-bbr-congestion-control-02>.
[I-D.ietf-tcpm-accurate-ecn] [I-D.ietf-tcpm-accurate-ecn]
Briscoe, B., Kühlewind, M., and R. Scheffenegger, "More Briscoe, B., Kühlewind, M., and R. Scheffenegger, "More
Accurate ECN Feedback in TCP", Work in Progress, Internet- Accurate ECN Feedback in TCP", Work in Progress, Internet-
Draft, draft-ietf-tcpm-accurate-ecn-20, 25 July 2022, Draft, draft-ietf-tcpm-accurate-ecn-20, 25 July 2022,
<https://datatracker.ietf.org/doc/html/draft-ietf-tcpm- <https://www.ietf.org/archive/id/draft-ietf-tcpm-accurate-
accurate-ecn-20>. ecn-20.txt>.
[I-D.ietf-tcpm-generalized-ecn] [I-D.ietf-tcpm-generalized-ecn]
Bagnulo, M. and B. Briscoe, "ECN++: Adding Explicit Bagnulo, M. and B. Briscoe, "ECN++: Adding Explicit
Congestion Notification (ECN) to TCP Control Packets", Congestion Notification (ECN) to TCP Control Packets",
Work in Progress, Internet-Draft, draft-ietf-tcpm- Work in Progress, Internet-Draft, draft-ietf-tcpm-
generalized-ecn-10, 27 July 2022, generalized-ecn-10, 27 July 2022,
<https://datatracker.ietf.org/doc/html/draft-ietf-tcpm- <https://www.ietf.org/archive/id/draft-ietf-tcpm-
generalized-ecn-10>. generalized-ecn-10.txt>.
[I-D.ietf-tls-dtls13] [I-D.ietf-tls-dtls13]
Rescorla, E., Tschofenig, H., and N. Modadugu, "The Rescorla, E., Tschofenig, H., and N. Modadugu, "The
Datagram Transport Layer Security (DTLS) Protocol Version Datagram Transport Layer Security (DTLS) Protocol Version
1.3", Work in Progress, Internet-Draft, draft-ietf-tls- 1.3", Work in Progress, Internet-Draft, draft-ietf-tls-
dtls13-43, 30 April 2021, dtls13-43, 30 April 2021,
<https://datatracker.ietf.org/doc/html/draft-ietf-tls- <https://www.ietf.org/archive/id/draft-ietf-tls-
dtls13-43>. dtls13-43.txt>.
[I-D.ietf-trill-ecn-support] [I-D.ietf-trill-ecn-support]
Eastlake, D. E. and B. Briscoe, "TRILL (TRansparent Eastlake, D. E. and B. Briscoe, "TRILL (TRansparent
Interconnection of Lots of Links): ECN (Explicit Interconnection of Lots of Links): ECN (Explicit
Congestion Notification) Support", Work in Progress, Congestion Notification) Support", Work in Progress,
Internet-Draft, draft-ietf-trill-ecn-support-07, 25 Internet-Draft, draft-ietf-trill-ecn-support-07, 25
February 2018, <https://datatracker.ietf.org/doc/html/ February 2018, <https://www.ietf.org/archive/id/draft-
draft-ietf-trill-ecn-support-07>. ietf-trill-ecn-support-07.txt>.
[I-D.ietf-tsvwg-aqm-dualq-coupled] [I-D.ietf-tsvwg-aqm-dualq-coupled]
Schepper, K. D., Briscoe, B., and G. White, "DualQ Coupled Schepper, K. D., Briscoe, B., and G. White, "DualQ Coupled
AQMs for Low Latency, Low Loss and Scalable Throughput AQMs for Low Latency, Low Loss and Scalable Throughput
(L4S)", Work in Progress, Internet-Draft, draft-ietf- (L4S)", Work in Progress, Internet-Draft, draft-ietf-
tsvwg-aqm-dualq-coupled-24, 7 July 2022, tsvwg-aqm-dualq-coupled-24, July 2022,
<https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg- <https://www.ietf.org/archive/id/draft-ietf-tsvwg-aqm-
aqm-dualq-coupled-24>. dualq-coupled-24.txt>.
[I-D.ietf-tsvwg-ecn-encap-guidelines] [I-D.ietf-tsvwg-ecn-encap-guidelines]
Briscoe, B. and J. Kaippallimalil, "Guidelines for Adding Briscoe, B. and J. Kaippallimalil, "Guidelines for Adding
Congestion Notification to Protocols that Encapsulate IP", Congestion Notification to Protocols that Encapsulate IP",
Work in Progress, Internet-Draft, draft-ietf-tsvwg-ecn- Work in Progress, Internet-Draft, draft-ietf-tsvwg-ecn-
encap-guidelines-17, 11 July 2022, encap-guidelines-17, 11 July 2022,
<https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg- <https://www.ietf.org/archive/id/draft-ietf-tsvwg-ecn-
ecn-encap-guidelines-17>. encap-guidelines-17.txt>.
[I-D.ietf-tsvwg-l4s-arch] [I-D.ietf-tsvwg-l4s-arch]
Briscoe, B., Schepper, K. D., Bagnulo, M., and G. White, Briscoe, B., Schepper, K. D., Bagnulo, M., and G. White,
"Low Latency, Low Loss, Scalable Throughput (L4S) Internet "Low Latency, Low Loss, Scalable Throughput (L4S) Internet
Service: Architecture", Work in Progress, Internet-Draft, Service: Architecture", Work in Progress, Internet-Draft,
draft-ietf-tsvwg-l4s-arch-18, 7 July 2022, draft-ietf-tsvwg-l4s-arch-19, 27 July 2022,
<https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg- <https://www.ietf.org/archive/id/draft-ietf-tsvwg-l4s-
l4s-arch-18>. arch-19.txt>.
[I-D.ietf-tsvwg-l4sops] [I-D.ietf-tsvwg-l4sops]
White, G., "Operational Guidance for Deployment of L4S in White, G., "Operational Guidance for Deployment of L4S in
the Internet", Work in Progress, Internet-Draft, draft- the Internet", Work in Progress, Internet-Draft, draft-
ietf-tsvwg-l4sops-03, 28 April 2022, ietf-tsvwg-l4sops-03, 28 April 2022,
<https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg- <https://www.ietf.org/archive/id/draft-ietf-tsvwg-l4sops-
l4sops-03>. 03.txt>.
[I-D.ietf-tsvwg-nqb] [I-D.ietf-tsvwg-nqb]
White, G. and T. Fossati, "A Non-Queue-Building Per-Hop White, G. and T. Fossati, "A Non-Queue-Building Per-Hop
Behavior (NQB PHB) for Differentiated Services", Work in Behavior (NQB PHB) for Differentiated Services", Work in
Progress, Internet-Draft, draft-ietf-tsvwg-nqb-10, 4 March Progress, Internet-Draft, draft-ietf-tsvwg-nqb-10, March
2022, <https://datatracker.ietf.org/doc/html/draft-ietf- 2022, <https://www.ietf.org/archive/id/draft-ietf-tsvwg-
tsvwg-nqb-10>. nqb-10.txt>.
[I-D.ietf-tsvwg-rfc6040update-shim] [I-D.ietf-tsvwg-rfc6040update-shim]
Briscoe, B., "Propagating Explicit Congestion Notification Briscoe, B., "Propagating Explicit Congestion Notification
Across IP Tunnel Headers Separated by a Shim", Work in Across IP Tunnel Headers Separated by a Shim", Work in
Progress, Internet-Draft, draft-ietf-tsvwg-rfc6040update- Progress, Internet-Draft, draft-ietf-tsvwg-rfc6040update-
shim-15, 11 July 2022, shim-15, 11 July 2022, <https://www.ietf.org/archive/id/
<https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg- draft-ietf-tsvwg-rfc6040update-shim-15.txt>.
rfc6040update-shim-15>.
[I-D.sridharan-tcpm-ctcp] [I-D.sridharan-tcpm-ctcp]
Sridharan, M., Tan, K., Bansal, D., and D. Thaler, Sridharan, M., Tan, K., Bansal, D., and D. Thaler,
"Compound TCP: A New TCP Congestion Control for High-Speed "Compound TCP: A New TCP Congestion Control for High-Speed
and Long Distance Networks", Work in Progress, Internet- and Long Distance Networks", Work in Progress, Internet-
Draft, draft-sridharan-tcpm-ctcp-02, 11 November 2008, Draft, draft-sridharan-tcpm-ctcp-02, 11 November 2008,
<https://datatracker.ietf.org/doc/html/draft-sridharan- <https://www.ietf.org/archive/id/draft-sridharan-tcpm-
tcpm-ctcp-02>. ctcp-02.txt>.
[I-D.stewart-tsvwg-sctpecn] [I-D.stewart-tsvwg-sctpecn]
Stewart, R. R., Tuexen, M., and X. Dong, "ECN for Stream Stewart, R. R., Tuexen, M., and X. Dong, "ECN for Stream
Control Transmission Protocol (SCTP)", Work in Progress, Control Transmission Protocol (SCTP)", Work in Progress,
Internet-Draft, draft-stewart-tsvwg-sctpecn-05, 15 January Internet-Draft, draft-stewart-tsvwg-sctpecn-05, 15 January
2014, <https://datatracker.ietf.org/doc/html/draft- 2014, <https://www.ietf.org/archive/id/draft-stewart-
stewart-tsvwg-sctpecn-05>. tsvwg-sctpecn-05.txt>.
[LinuxPacedChirping] [LinuxPacedChirping]
Misund, J. and B. Briscoe, "Paced Chirping - Rethinking Misund, J. and B. Briscoe, "Paced Chirping - Rethinking
TCP start-up", Proc. Linux Netdev 0x13 , March 2019, TCP start-up", Proc. Linux Netdev 0x13 , March 2019,
<https://www.netdevconf.org/0x13/session.html?talk-chirp>. <https://www.netdevconf.org/0x13/session.html?talk-chirp>.
[Mathis09] Mathis, M., "Relentless Congestion Control", PFLDNeT'09 , [Mathis09] Mathis, M., "Relentless Congestion Control", PFLDNeT'09 ,
May 2009, <http://www.hpcc.jp/pfldnet2009/ May 2009, <http://www.hpcc.jp/pfldnet2009/
Program_files/1569198525.pdf>. Program_files/1569198525.pdf>.
skipping to change at page 40, line 5 skipping to change at page 40, line 48
Queue Management and Congestion Avoidance in the Queue Management and Congestion Avoidance in the
Internet", RFC 2309, DOI 10.17487/RFC2309, April 1998, Internet", RFC 2309, DOI 10.17487/RFC2309, April 1998,
<https://www.rfc-editor.org/info/rfc2309>. <https://www.rfc-editor.org/info/rfc2309>.
[RFC2474] Nichols, K., Blake, S., Baker, F., and D. Black, [RFC2474] Nichols, K., Blake, S., Baker, F., and D. Black,
"Definition of the Differentiated Services Field (DS "Definition of the Differentiated Services Field (DS
Field) in the IPv4 and IPv6 Headers", RFC 2474, Field) in the IPv4 and IPv6 Headers", RFC 2474,
DOI 10.17487/RFC2474, December 1998, DOI 10.17487/RFC2474, December 1998,
<https://www.rfc-editor.org/info/rfc2474>. <https://www.rfc-editor.org/info/rfc2474>.
[RFC3246] Davie, B., Charny, A., Bennet, J.C.R., Benson, K., Le [RFC3246] Davie, B., Charny, A., Bennet, J C R., Benson, K., Le
Boudec, J.Y., Courtney, W., Davari, S., Firoiu, V., and D. Boudec, J Y., Courtney, W., Davari, S., Firoiu, V., and D.
Stiliadis, "An Expedited Forwarding PHB (Per-Hop Stiliadis, "An Expedited Forwarding PHB (Per-Hop
Behavior)", RFC 3246, DOI 10.17487/RFC3246, March 2002, Behavior)", RFC 3246, DOI 10.17487/RFC3246, March 2002,
<https://www.rfc-editor.org/info/rfc3246>. <https://www.rfc-editor.org/info/rfc3246>.
[RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit [RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit
Congestion Notification (ECN) Signaling with Nonces", Congestion Notification (ECN) Signaling with Nonces",
RFC 3540, DOI 10.17487/RFC3540, June 2003, RFC 3540, DOI 10.17487/RFC3540, June 2003,
<https://www.rfc-editor.org/info/rfc3540>. <https://www.rfc-editor.org/info/rfc3540>.
[RFC3649] Floyd, S., "HighSpeed TCP for Large Congestion Windows", [RFC3649] Floyd, S., "HighSpeed TCP for Large Congestion Windows",
skipping to change at page 44, line 15 skipping to change at page 45, line 11
[RFC9001] Thomson, M., Ed. and S. Turner, Ed., "Using TLS to Secure [RFC9001] Thomson, M., Ed. and S. Turner, Ed., "Using TLS to Secure
QUIC", RFC 9001, DOI 10.17487/RFC9001, May 2021, QUIC", RFC 9001, DOI 10.17487/RFC9001, May 2021,
<https://www.rfc-editor.org/info/rfc9001>. <https://www.rfc-editor.org/info/rfc9001>.
[Savage-TCP] [Savage-TCP]
Savage, S., Cardwell, N., Wetherall, D., and T. Anderson, Savage, S., Cardwell, N., Wetherall, D., and T. Anderson,
"TCP Congestion Control with a Misbehaving Receiver", ACM "TCP Congestion Control with a Misbehaving Receiver", ACM
SIGCOMM Computer Communication Review 29(5):71--78, SIGCOMM Computer Communication Review 29(5):71--78,
October 1999. October 1999.
[SCReAM] Johansson, I., "SCReAM", github repository; , [SCReAM-L4S]
Johansson, I., "SCReAM", github repository; ,
<https://github.com/EricssonResearch/scream/blob/master/ <https://github.com/EricssonResearch/scream/blob/master/
README.md>. README.md>.
[sub-mss-prob] [sub-mss-prob]
Briscoe, B. and K. De Schepper, "Scaling TCP's Congestion Briscoe, B. and K. De Schepper, "Scaling TCP's Congestion
Window for Small Round Trip Times", BT Technical Report Window for Small Round Trip Times", BT Technical Report
TR-TUB8-2015-002, May 2015, TR-TUB8-2015-002, May 2015,
<https://arxiv.org/abs/1904.07598>. <https://arxiv.org/abs/1904.07598>.
[TCP-CA] Jacobson, V. and M.J. Karels, "Congestion Avoidance and [TCP-CA] Jacobson, V. and M.J. Karels, "Congestion Avoidance and
skipping to change at page 58, line 14 skipping to change at page 58, line 45
favourable [Paced-Chirping] due to the shallow ECN marking threshold favourable [Paced-Chirping] due to the shallow ECN marking threshold
needed for L4S. It is exacerbated by the typically greater mismatch needed for L4S. It is exacerbated by the typically greater mismatch
between the link rate of the sending host and typical Internet access between the link rate of the sending host and typical Internet access
bottlenecks. This problem is detrimental in general, but would bottlenecks. This problem is detrimental in general, but would
particularly harm the performance of short flows relative to Classic particularly harm the performance of short flows relative to Classic
congestion controls. congestion controls.
Appendix B. Compromises in the Choice of L4S Identifier Appendix B. Compromises in the Choice of L4S Identifier
This appendix is informative, not normative. As explained in This appendix is informative, not normative. As explained in
Section 2, there is insufficient space in the IP header (v4 or v6) to Section 3, there is insufficient space in the IP header (v4 or v6) to
fully accommodate every requirement. So the choice of L4S identifier fully accommodate every requirement. So the choice of L4S identifier
involves tradeoffs. This appendix records the pros and cons of the involves tradeoffs. This appendix records the pros and cons of the
choice that was made. choice that was made.
Non-normative recap of the chosen codepoint scheme: Non-normative recap of the chosen codepoint scheme:
Packets with ECT(1) and conditionally packets with CE signify L4S Packets with ECT(1) and conditionally packets with CE signify L4S
semantics as an alternative to the semantics of Classic semantics as an alternative to the semantics of Classic
ECN [RFC3168], specifically: ECN [RFC3168], specifically:
skipping to change at page 64, line 47 skipping to change at page 65, line 30
Pre-Congestion Notification (PCN) is another scheme that assigns Pre-Congestion Notification (PCN) is another scheme that assigns
alternative semantics to the ECN field. It uses ECT(1) to signify a alternative semantics to the ECN field. It uses ECT(1) to signify a
less severe level of pre-congestion notification than CE [RFC6660]. less severe level of pre-congestion notification than CE [RFC6660].
However, the ECN field only takes on the PCN semantics if packets However, the ECN field only takes on the PCN semantics if packets
carry a Diffserv codepoint defined to indicate PCN marking within a carry a Diffserv codepoint defined to indicate PCN marking within a
controlled environment. PCN is required to be applied solely to the controlled environment. PCN is required to be applied solely to the
outer header of a tunnel across the controlled region in order not to outer header of a tunnel across the controlled region in order not to
interfere with any end-to-end use of the ECN field. Therefore a PCN interfere with any end-to-end use of the ECN field. Therefore a PCN
region on the path would not interfere with the L4S service region on the path would not interfere with the L4S service
identifier defined in Section 3. identifier defined in Section 2.
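As a non-normative illustration of the recap in this appendix, the sketch below decodes the 2-bit ECN field from the low-order bits of the IPv4 TOS or IPv6 Traffic Class octet and reports whether the codepoint signifies L4S. The codepoint values are those of RFC 3168. The names Ecn, ecn_field and l4s_hint are hypothetical and do not come from the specification. The sketch deliberately does not resolve the conditions under which a CE packet is treated as L4S, and it ignores any Diffserv codepoint that would select PCN semantics within a controlled region.

```python
from enum import IntEnum


class Ecn(IntEnum):
    """The four codepoints of the 2-bit ECN field (values per RFC 3168)."""
    NOT_ECT = 0b00
    ECT_1 = 0b01
    ECT_0 = 0b10
    CE = 0b11


def ecn_field(tos_octet: int) -> Ecn:
    """Extract the ECN field from the two low-order bits of the octet."""
    return Ecn(tos_octet & 0b11)


def l4s_hint(tos_octet: int) -> str:
    """Hypothetical helper: coarse reading of the L4S identifier.

    ECT(1) signifies L4S. CE signifies L4S only conditionally, so it is
    reported separately rather than decided here. Everything else is
    reported as not L4S.
    """
    ecn = ecn_field(tos_octet)
    if ecn == Ecn.ECT_1:
        return "l4s"
    if ecn == Ecn.CE:
        return "conditional"
    return "not-l4s"
```

A caller would pass the whole octet, for example l4s_hint(0b10111001), and handle the "conditional" result according to the rules of the specification itself.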
Acknowledgements Acknowledgements
Thanks to Richard Scheffenegger, John Leslie, David Taeht, Jonathan Thanks to Richard Scheffenegger, John Leslie, David Taeht, Jonathan
Morton, Gorry Fairhurst, Michael Welzl, Mikael Abrahamsson and Andrew Morton, Gorry Fairhurst, Michael Welzl, Mikael Abrahamsson and Andrew
McGregor for the discussions that led to this specification. Ing-jyh McGregor for the discussions that led to this specification. Ing-jyh
(Inton) Tsang was a contributor to the early drafts of this document. (Inton) Tsang was a contributor to the early drafts of this document.
And thanks to Mikael Abrahamsson, Lloyd Wood, Nicolas Kuhn, Greg And thanks to Mikael Abrahamsson, Lloyd Wood, Nicolas Kuhn, Greg
White, Tom Henderson, David Black, Gorry Fairhurst, Brian Carpenter, White, Tom Henderson, David Black, Gorry Fairhurst, Brian Carpenter,
Jake Holland, Rod Grimes, Richard Scheffenegger, Sebastian Moeller, Jake Holland, Rod Grimes, Richard Scheffenegger, Sebastian Moeller,
 End of changes. 66 change blocks. 
217 lines changed or deleted 260 lines changed or added
