< draft-briscoe-tsvwg-ecn-encap-guidelines-02.txt   draft-briscoe-tsvwg-ecn-encap-guidelines-03a.txt >
Transport Area Working Group B. Briscoe Transport Area Working Group B. Briscoe
Internet-Draft BT Internet-Draft BT
Updates: 3819 (if approved) J. Kaippallimalil Updates: 3819 (if approved) J. Kaippallimalil
Intended status: BCP Huawei Intended status: BCP Huawei
Expires: August 28, 2013 P. Thaler Expires: March 9, 2014 P. Thaler
Broadcom Corporation Broadcom Corporation
February 24, 2013 September 05, 2013
Guidelines for Adding Congestion Notification to Protocols that Guidelines for Adding Congestion Notification to Protocols that
Encapsulate IP Encapsulate IP
draft-briscoe-tsvwg-ecn-encap-guidelines-02 draft-briscoe-tsvwg-ecn-encap-guidelines-03
Abstract Abstract
The purpose of this document is to guide the design of congestion The purpose of this document is to guide the design of congestion
notification in any lower layer or tunnelling protocol that notification in any lower layer or tunnelling protocol that
encapsulates IP. The aim is for explicit congestion signals to encapsulates IP. The aim is for explicit congestion signals to
propagate consistently from lower layer protocols into IP. Then the propagate consistently from lower layer protocols into IP. Then the
IP internetwork layer can act as a portability layer to carry IP internetwork layer can act as a portability layer to carry
congestion notification from non-IP-aware congested nodes up to the congestion notification from non-IP-aware congested nodes up to the
transport layer (L4). Following these guidelines should assure transport layer (L4). Following these guidelines should assure
skipping to change at page 1, line 42 skipping to change at page 1, line 42
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on August 28, 2013. This Internet-Draft will expire on March 9, 2014.
Copyright Notice Copyright Notice
Copyright (c) 2013 IETF Trust and the persons identified as the Copyright (c) 2013 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 19 skipping to change at page 2, line 19
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1. Scope . . . . . . . . . . . . . . . . . . . . . . . . . . 5 1.1. Scope . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5
3. Modes of Operation . . . . . . . . . . . . . . . . . . . . . . 7 3. Modes of Operation . . . . . . . . . . . . . . . . . . . . . . 7
3.1. Feed-Forward-and-Up Mode . . . . . . . . . . . . . . . . . 7 3.1. Feed-Forward-and-Up Mode . . . . . . . . . . . . . . . . . 8
3.2. Feed-Up-and-Forward Mode . . . . . . . . . . . . . . . . . 9 3.2. Feed-Up-and-Forward Mode . . . . . . . . . . . . . . . . . 9
3.3. Feed-Backward Mode . . . . . . . . . . . . . . . . . . . . 10 3.3. Feed-Backward Mode . . . . . . . . . . . . . . . . . . . . 10
3.4. Null Mode . . . . . . . . . . . . . . . . . . . . . . . . 12 3.4. Null Mode . . . . . . . . . . . . . . . . . . . . . . . . 12
4. Feed-Forward-and-Up Mode: Guidelines for Adding Congestion 4. Feed-Forward-and-Up Mode: Guidelines for Adding Congestion
Notification . . . . . . . . . . . . . . . . . . . . . . . . . 12 Notification . . . . . . . . . . . . . . . . . . . . . . . . . 12
4.1. IP-in-IP Tunnels with Tightly Coupled Shim Headers . . . . 13 4.1. IP-in-IP Tunnels with Tightly Coupled Shim Headers . . . . 13
4.2. Wire Protocol Design: Indication of ECN Support . . . . . 13 4.2. Wire Protocol Design: Indication of ECN Support . . . . . 13
4.3. Encapsulation Guidelines . . . . . . . . . . . . . . . . . 15 4.3. Encapsulation Guidelines . . . . . . . . . . . . . . . . . 15
4.4. Decapsulation Guidelines . . . . . . . . . . . . . . . . . 16 4.4. Decapsulation Guidelines . . . . . . . . . . . . . . . . . 17
4.5. Sequences of Similar Tunnels or Subnets . . . . . . . . . 18 4.5. Sequences of Similar Tunnels or Subnets . . . . . . . . . 18
4.6. Reframing and Congestion Markings . . . . . . . . . . . . 18 4.6. Reframing and Congestion Markings . . . . . . . . . . . . 19
5. Feed-Up-and-Forward Mode: Guidelines for Adding Congestion 5. Feed-Up-and-Forward Mode: Guidelines for Adding Congestion
Notification . . . . . . . . . . . . . . . . . . . . . . . . . 19 Notification . . . . . . . . . . . . . . . . . . . . . . . . . 19
6. Feed-Backward Mode: Guidelines for Adding Congestion 6. Feed-Backward Mode: Guidelines for Adding Congestion
Notification . . . . . . . . . . . . . . . . . . . . . . . . . 20 Notification . . . . . . . . . . . . . . . . . . . . . . . . . 20
7. IANA Considerations (to be removed by RFC Editor) . . . . . . 21 7. IANA Considerations (to be removed by RFC Editor) . . . . . . 21
8. Security Considerations . . . . . . . . . . . . . . . . . . . 21 8. Security Considerations . . . . . . . . . . . . . . . . . . . 21
9. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 21 9. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 22
10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 21 10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 23
11. Comments Solicited . . . . . . . . . . . . . . . . . . . . . . 22 11. Comments Solicited . . . . . . . . . . . . . . . . . . . . . . 23
12. References . . . . . . . . . . . . . . . . . . . . . . . . . . 22 12. References . . . . . . . . . . . . . . . . . . . . . . . . . . 23
12.1. Normative References . . . . . . . . . . . . . . . . . . . 22 12.1. Normative References . . . . . . . . . . . . . . . . . . . 23
12.2. Informative References . . . . . . . . . . . . . . . . . . 22 12.2. Informative References . . . . . . . . . . . . . . . . . . 24
Appendix A. Outstanding Document Issues . . . . . . . . . . . . . 25 Appendix A. Outstanding Document Issues . . . . . . . . . . . . . 27
Appendix B. Changes in This Version (to be removed by RFC Appendix B. Changes in This Version (to be removed by RFC
Editor) . . . . . . . . . . . . . . . . . . . . . . . 25 Editor) . . . . . . . . . . . . . . . . . . . . . . . 27
1. Introduction 1. Introduction
Explicit Congestion Notification (ECN [RFC3168]) is defined in the IP Explicit Congestion Notification (ECN [RFC3168]) is defined in the IP
header (v4 & v6) to allow a resource to notify the onset of queue header (v4 & v6) to allow a resource to notify the onset of queue
build-up without having to drop packets, by explicitly marking a build-up without having to drop packets, by explicitly marking a
proportion of packets with the congestion experienced (CE) codepoint. proportion of packets with the congestion experienced (CE) codepoint.
ECN removes nearly all congestion loss and it cuts delays for two ECN removes nearly all congestion loss and it cuts delays for two
main reasons: i) it avoids the delay when recovering from congestion main reasons: i) it avoids the delay when recovering from congestion
skipping to change at page 5, line 9 skipping to change at page 5, line 9
then in the following sections separate guidelines are given for each then in the following sections separate guidelines are given for each
mode. mode.
This document updates the advice to subnetwork designers about ECN in This document updates the advice to subnetwork designers about ECN in
Section 13 of [RFC3819]. Section 13 of [RFC3819].
1.1. Scope 1.1. Scope
This document only concerns wire protocol processing of explicit This document only concerns wire protocol processing of explicit
notification of congestion and makes no changes or recommendations notification of congestion and makes no changes or recommendations
concerning algorithms for congestion marking or congestion response concerning algorithms for congestion marking or for congestion
(algorithm issues should be independent of the layer the algorithm response (algorithm issues should be independent of the layer the
operates in). algorithm operates in).
The question of congestion notification signals with different The question of congestion notification signals with different
semantics to those of ECN in IP is touched on in a couple of specific semantics to those of ECN in IP is touched on in a couple of specific
cases (e.g. QCN [IEEE802.1Qau] and schemes with multiple cases (e.g. QCN [IEEE802.1Qau] and schemes with multiple
severity levels such as PCN [RFC6660]). However, no attempt is made severity levels such as PCN [RFC6660]). However, no attempt is made
to give guidelines about schemes with different semantics that are to give guidelines about schemes with different semantics that are
yet to be invented. yet to be invented.
The semantics of congestion signals can be relative to the traffic
class. Therefore correct propagation of congestion signals could
depend on correct propagation of any traffic class field between the
layers. In this document, correct propagation of traffic class
information is assumed, while what 'correct' means and how it is
achieved is covered elsewhere (e.g. [RFC2983]) and is outside the
scope of the present document.
Note that these guidelines do not require the subnet wire protocol to Note that these guidelines do not require the subnet wire protocol to
be changed to accommodate congestion notification. Another way to be changed to accommodate congestion notification. Another way to
add congestion notification without consuming header space in the add congestion notification without consuming header space in the
subnet protocol might be to use a parallel control plane protocol. subnet protocol might be to use a parallel control plane protocol.
This document focuses on the congestion notification interface This document focuses on the congestion notification interface
between IP and lower layer protocols that can encapsulate IP, where between IP and lower layer protocols that can encapsulate IP, where
the term 'IP' includes v4 or v6, unicast, multicast or anycast. the term 'IP' includes v4 or v6, unicast, multicast or anycast.
However, it is likely that the guidelines will also be useful when a However, it is likely that the guidelines will also be useful when a
lower layer protocol or tunnel encapsulates itself (e.g. Ethernet lower layer protocol or tunnel encapsulates itself (e.g. Ethernet
MAC in MAC [IEEE802.1Qah]) or when it encapsulates other protocols. MAC in MAC [IEEE802.1Qah]) or when it encapsulates other protocols.
In the feed-backward mode, propagation of congestion signals for
multicast and anycast packets is out-of-scope (because it would be so
complicated that it is hoped no-one would attempt such an
abomination).
2. Terminology 2. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119]. document are to be interpreted as described in RFC 2119 [RFC2119].
Further terminology used within this document: Further terminology used within this document:
Protocol data unit (PDU): Information that is delivered as a unit Protocol data unit (PDU): Information that is delivered as a unit
skipping to change at page 13, line 36 skipping to change at page 13, line 36
o GRE [RFC1701, RFC2784] o GRE [RFC1701, RFC2784]
o PPTP [RFC2637] o PPTP [RFC2637]
o GTP [GTPv1, GTPv1-U, GTPv2-C] o GTP [GTPv1, GTPv1-U, GTPv2-C]
o VXLAN [vxlan]. o VXLAN [vxlan].
4.2. Wire Protocol Design: Indication of ECN Support 4.2. Wire Protocol Design: Indication of ECN Support
This section is intended to guide the redesign of any lower layer
protocol that encapsulates IP to add native ECN support at the lower
layer. It reflects the approaches used in [RFC6040] and in
[RFC5129]. Therefore IP-in-IP tunnels or IP-in-MPLS or MPLS-in-MPLS
encapsulations that already comply with [RFC6040] or [RFC5129] will
already satisfy this guidance.
A lower layer (or subnet) congestion notification system: A lower layer (or subnet) congestion notification system:
1. SHOULD NOT apply explicit congestion notifications to PDUs that 1. SHOULD NOT apply explicit congestion notifications to PDUs that
are destined for legacy layer-4 transport implementations that are destined for legacy layer-4 transport implementations that
will not understand ECN, and will not understand ECN, and
2. SHOULD NOT apply explicit congestion notifications to PDUs if the 2. SHOULD NOT apply explicit congestion notifications to PDUs if the
egress of the subnet might not propagate congestion notifications egress of the subnet might not propagate congestion notifications
onward into the higher layer. onward into the higher layer.
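The two conditions above can be read as a single predicate that a lower layer marking function consults before applying an explicit congestion mark. The following is only an illustrative sketch: the function name and both parameter names are hypothetical, and how a real protocol learns either fact is a design question that this section discusses.

```python
def may_apply_congestion_mark(pdu_is_ecn_capable: bool,
                              egress_propagates: bool) -> bool:
    """Return True only if explicit marking is appropriate for this PDU.

    pdu_is_ecn_capable: hypothetical flag, True if the PDU is destined for
        an L4 transport that understands ECN (condition 1).
    egress_propagates: hypothetical flag, True if the subnet egress is known
        to propagate congestion notifications into the higher layer
        (condition 2).
    """
    return pdu_is_ecn_capable and egress_propagates


# A PDU for a legacy transport is never explicitly marked.
print(may_apply_congestion_mark(False, True))
# Neither is any PDU if the egress might not propagate the marking.
print(may_apply_congestion_mark(True, False))
```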
skipping to change at page 15, line 22 skipping to change at page 15, line 29
(PBB). (PBB).
QCN [IEEE802.1Qau] provides another example of how to indicate to QCN [IEEE802.1Qau] provides another example of how to indicate to
lower layer devices that the end-points will not understand ECN. An lower layer devices that the end-points will not understand ECN. An
operator can define certain 802.1p classes of service to indicate operator can define certain 802.1p classes of service to indicate
non-QCN frames and an ingress bridge is required to map arriving not- non-QCN frames and an ingress bridge is required to map arriving not-
QCN-capable IP packets to one of these non-QCN 802.1p classes. QCN-capable IP packets to one of these non-QCN 802.1p classes.
4.3. Encapsulation Guidelines 4.3. Encapsulation Guidelines
This section is intended to guide the redesign of any node that
encapsulates IP with a lower layer header when adding native ECN
support to the lower layer protocol. It reflects the approaches used
in [RFC6040] and in [RFC5129]. Therefore IP-in-IP tunnels or IP-in-
MPLS or MPLS-in-MPLS encapsulations that already comply with
[RFC6040] or [RFC5129] will already satisfy this guidance.
1. Egress Capability Check: A subnet ingress needs to be sure that 1. Egress Capability Check: A subnet ingress needs to be sure that
the corresponding egress of a subnet will propagate any the corresponding egress of a subnet will propagate any
congestion notification added to the outer header across the congestion notification added to the outer header across the
subnet. This is necessary in addition to checking that an subnet. This is necessary in addition to checking that an
incoming PDU indicates an ECN-capable (L4) transport. Examples incoming PDU indicates an ECN-capable (L4) transport. Examples
of how this guarantee might be provided include: of how this guarantee might be provided include:
* by configuration (e.g. if any label switches in a domain * by configuration (e.g. if any label switches in a domain
support ECN marking, [RFC5129] requires all egress nodes to support ECN marking, [RFC5129] requires all egress nodes to
have been configured to propagate ECN) have been configured to propagate ECN)
skipping to change at page 16, line 38 skipping to change at page 17, line 7
Most information can be extracted if the Congestion Baseline is Most information can be extracted if the Congestion Baseline is
standardised at the node that is regulating the load (the Load standardised at the node that is regulating the load (the Load
Regulator--typically the data source). Then the operator can Regulator--typically the data source). Then the operator can
measure both congestion since the Load Regulator, and congestion measure both congestion since the Load Regulator, and congestion
since the subnet ingress. The latter might be measurable by since the subnet ingress. The latter might be measurable by
subtracting the level of CE markings on inner headers from that subtracting the level of CE markings on inner headers from that
on outer headers (see Appendix C of [RFC6040]). on outer headers (see Appendix C of [RFC6040]).
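As a rough illustration of the subtraction described above, an operator's measurement point at the subnet egress could keep simple packet counters. This sketch is not the method of Appendix C of [RFC6040], which should be consulted for the precise calculation; the function and parameter names are hypothetical.

```python
def congestion_since_ingress(outer_ce_packets: int,
                             inner_ce_packets: int,
                             total_packets: int) -> float:
    """Estimate the fraction of packets marked within the subnet.

    outer_ce_packets: packets arriving at the egress with CE in the outer
        header (congestion since the Load Regulator).
    inner_ce_packets: packets arriving with CE in the inner header
        (congestion already present at the subnet ingress).
    total_packets: all packets observed in the measurement interval.
    """
    if total_packets == 0:
        return 0.0
    return (outer_ce_packets - inner_ce_packets) / total_packets


# 30 of 1000 packets carry CE on the outer, 10 of them already had CE inside.
estimate = congestion_since_ingress(30, 10, 1000)
print(f"{estimate:.3f}")
```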
4.4. Decapsulation Guidelines 4.4. Decapsulation Guidelines
This section is intended to guide the redesign of any node that
decapsulates IP from within a lower layer header when adding native
ECN support to the lower layer protocol. It reflects the approaches
used in [RFC6040] and in [RFC5129]. Therefore IP-in-IP tunnels or
IP-in-MPLS or MPLS-in-MPLS encapsulations that already comply with
[RFC6040] or [RFC5129] will already satisfy this guidance.
A subnet egress SHOULD NOT simply copy congestion notification from A subnet egress SHOULD NOT simply copy congestion notification from
outer headers to the forwarded header. It SHOULD calculate the outer headers to the forwarded header. It SHOULD calculate the
outgoing congestion notification field from the inner and outer outgoing congestion notification field from the inner and outer
headers using the following guidelines. If there is any conflict, headers using the following guidelines. If there is any conflict,
rules earlier in the list take precedence over rules later in the rules earlier in the list take precedence over rules later in the
list: list:
1. If the arriving inner header is a Not-ECN-PDU it implies the L4 1. If the arriving inner header is a Not-ECN-PDU it implies the L4
transport will not understand explicit congestion markings. transport will not understand explicit congestion markings.
Then: Then:
skipping to change at page 18, line 12 skipping to change at page 18, line 31
currently unused combinations are not precluded from future use currently unused combinations are not precluded from future use
through new standards actions. through new standards actions.
4.5. Sequences of Similar Tunnels or Subnets 4.5. Sequences of Similar Tunnels or Subnets
In some deployments, particularly in 3GPP networks, an IP packet may In some deployments, particularly in 3GPP networks, an IP packet may
traverse two or more IP-in-IP tunnels in sequence that all use traverse two or more IP-in-IP tunnels in sequence that all use
identical technology (e.g. GTP). identical technology (e.g. GTP).
In such cases, it would be sufficient for every encapsulation and In such cases, it would be sufficient for every encapsulation and
decapsulation in the chain to comply with RFC6040. Alternatively, as decapsulation in the chain to comply with RFC 6040. Alternatively,
an optimisation, a node that decapsulates a packet and immediately as an optimisation, a node that decapsulates a packet and immediately
re-encapsulates it for the next tunnel MAY copy the incoming outer re-encapsulates it for the next tunnel MAY copy the incoming outer
ECN field directly to the outgoing outer and the incoming inner ECN ECN field directly to the outgoing outer and the incoming inner ECN
field directly to the outgoing inner. Then the overall behavior field directly to the outgoing inner. Then the overall behavior
across the sequence of tunnel segments would still be consistent with across the sequence of tunnel segments would still be consistent with
RFC 6040. RFC 6040.
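The optimisation above amounts to a field-for-field copy at the node that joins two tunnel segments. A minimal sketch follows, assuming a hypothetical TunnelledPacket type that exposes only the two ECN fields; the codepoint values are those of the IP ECN field defined in [RFC3168].

```python
from dataclasses import dataclass

# Two-bit ECN field codepoints from RFC 3168.
NOT_ECT = 0b00
ECT_1 = 0b01
ECT_0 = 0b10
CE = 0b11


@dataclass
class TunnelledPacket:
    """Hypothetical view of a tunnelled packet: just the two ECN fields."""
    outer_ecn: int
    inner_ecn: int


def relay_to_next_tunnel(arriving: TunnelledPacket) -> TunnelledPacket:
    """Decapsulate and immediately re-encapsulate using the optimisation.

    The incoming outer ECN field is copied to the outgoing outer, and the
    incoming inner ECN field is copied to the outgoing inner, so neither
    field is recalculated at the intermediate node.
    """
    return TunnelledPacket(outer_ecn=arriving.outer_ecn,
                           inner_ecn=arriving.inner_ecn)


# A packet marked CE inside the first tunnel keeps outer=CE, inner=ECT(0).
departing = relay_to_next_tunnel(TunnelledPacket(outer_ecn=CE, inner_ecn=ECT_0))
print(departing)
```

Because both fields pass through unchanged, the final egress sees the same inner and outer values it would have seen had the sequence been a single tunnel.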
Appendix C of RFC6040 describes how a tunnel egress can monitor how Appendix C of RFC6040 describes how a tunnel egress can monitor how
much congestion has been introduced within a tunnel. A network much congestion has been introduced within a tunnel. A network
operator might want to monitor how much congestion had been operator might want to monitor how much congestion had been
introduced within a whole sequence of tunnels. Using the technique introduced within a whole sequence of tunnels. Using the technique
in Appendix C of RFC6040 at the final egress, the operator could in Appendix C of RFC6040 at the final egress, the operator could
monitor the whole sequence of tunnels, but only if the above monitor the whole sequence of tunnels, but only if the above
optimisation were used consistently along the sequence of tunnels, in optimisation were used consistently along the sequence of tunnels, in
order to make it appear as a single tunnel. Therefore, tunnel order to make it appear as a single tunnel. Therefore, tunnel
endpoint implementations SHOULD allow the operator to configure endpoint implementations SHOULD allow the operator to configure
whether this optimisation is enabled. whether this optimisation is enabled.
When ECN support is added to a subnet technology, consideration When ECN support is added to a subnet technology, consideration
SHOULD be given to a similar optimisation between subnets in sequnce SHOULD be given to a similar optimisation between subnets in sequence
if they all use the same technology. if they all use the same technology.
4.6. Reframing and Congestion Markings 4.6. Reframing and Congestion Markings
The guidance in this section is worded in terms of framing
boundaries, but it applies equally whether the protocol data units
are frames, cells or packets.
Where framing boundaries are different between two layers, congestion Where framing boundaries are different between two layers, congestion
indications SHOULD be propagated on the basis that a congestion indications SHOULD be propagated on the basis that a congestion
indication on a PDU applies to all the octets in the PDU. On indication on a PDU applies to all the octets in the PDU. On
average, an encapsulator or decapsulator SHOULD approximately average, an encapsulator or decapsulator SHOULD approximately
preserve the number of marked octets arriving and leaving (counting preserve the number of marked octets arriving and leaving (counting
the size of inner headers, but not added encapsulating headers). the size of inner headers, but not added encapsulating headers).
The next departing frame SHOULD be immediately marked even if only The next departing frame SHOULD be immediately marked even if only
enough incoming marked octets have arrived for part of the departing enough incoming marked octets have arrived for part of the departing
frame. This ensures that any outstanding congestion marked octets frame. This ensures that any outstanding congestion marked octets
skipping to change at page 19, line 12 skipping to change at page 19, line 36
For instance, an algorithm for marking departing frames could For instance, an algorithm for marking departing frames could
maintain a counter representing the balance of arriving marked octets maintain a counter representing the balance of arriving marked octets
minus departing marked octets. It adds the size of every marked minus departing marked octets. It adds the size of every marked
frame that arrives and if the counter is positive it marks the next frame that arrives and if the counter is positive it marks the next
frame to depart and subtracts its size from the counter. This will frame to depart and subtracts its size from the counter. This will
often leave a negative remainder in the counter, which is deliberate. often leave a negative remainder in the counter, which is deliberate.
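The counter algorithm just described could be sketched as follows. This is only one possible realisation; the class and method names are hypothetical, and sizes are assumed to be counted in octets including inner headers but excluding any added encapsulating headers.

```python
class ReframingMarker:
    """Approximately preserves the number of marked octets across reframing."""

    def __init__(self) -> None:
        # Balance of arriving marked octets minus departing marked octets.
        self.balance = 0

    def on_arrival(self, size_octets: int, marked: bool) -> None:
        """Add the size of every marked frame that arrives."""
        if marked:
            self.balance += size_octets

    def on_departure(self, size_octets: int) -> bool:
        """Return True if the departing frame is to be marked.

        If the counter is positive, the frame is marked immediately and its
        full size is subtracted, even if fewer marked octets are outstanding.
        The resulting negative remainder is deliberate.
        """
        if self.balance > 0:
            self.balance -= size_octets
            return True
        return False


marker = ReframingMarker()
marker.on_arrival(1500, marked=True)   # one marked 1500-octet frame arrives
for _ in range(3):                     # it is re-framed into 1000-octet frames
    print(marker.on_departure(1000), marker.balance)
# prints:
# True 500
# True -500
# False -500
```

In this run 1500 marked octets arrive and 2000 marked octets depart, and the counter carries the 500-octet deficit forward so that later arrivals are under-marked by the same amount.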
5. Feed-Up-and-Forward Mode: Guidelines for Adding Congestion 5. Feed-Up-and-Forward Mode: Guidelines for Adding Congestion
Notification Notification
The guidance in this section is primarily applicable to encapsulation
of IP packets in Ethernet headers. However, it generalises to
encapsulation by other subnet technologies with no native support for
explicit congestion notification. It is unlikely to be applicable or
necessary for IP-in-IP encapsulation, where feed-forward-and-up mode
based on [RFC6040] would be more appropriate.
Marking the IP header while switching at layer-2 (by using a layer-3 Marking the IP header while switching at layer-2 (by using a layer-3
switch) seems to represent a layering violation. However, it can be switch) seems to represent a layering violation. However, it can be
considered as a benign optimisation if the guidelines below are considered as a benign optimisation if the guidelines below are
followed. Feed-up-and-forward is certainly not a general alternative followed. Feed-up-and-forward is certainly not a general alternative
to implementing feed-forward congestion notification in the lower to implementing feed-forward congestion notification in the lower
layer, because: layer, because:
o IPv4 and IPv6 are not the only layer-3 protocols that might be o IPv4 and IPv6 are not the only layer-3 protocols that might be
encapsulated by lower layer protocols encapsulated by lower layer protocols
skipping to change at page 20, line 30 skipping to change at page 21, line 11
layer congestion notification. Therefore no detailed protocol design layer congestion notification. Therefore no detailed protocol design
guidelines are appropriate. Nonetheless, a more general guideline is guidelines are appropriate. Nonetheless, a more general guideline is
appropriate: appropriate:
1. A subnetwork technology intended to eventually interface to IP 1. A subnetwork technology intended to eventually interface to IP
SHOULD NOT be designed using only the feed-backward mode, which SHOULD NOT be designed using only the feed-backward mode, which
is certainly best for a stand-alone subnet, but would need to be is certainly best for a stand-alone subnet, but would need to be
modified to work efficiently as part of the wider Internet, modified to work efficiently as part of the wider Internet,
because IP uses feed-forward-and-up mode. because IP uses feed-forward-and-up mode.
The feed-backward approach does at least work beneath IP, but it can The feed-backward approach at least works beneath IP, where the term
result in very inefficient and sluggish congestion control--except if 'works' is used only in a narrow functional sense because feed-
it is confined to the subnet directly connected to the original data backward can result in very inefficient and sluggish congestion
source, when it is faster than feed-forward. It would be valid to control--except if it is confined to the subnet directly connected to
design a protocol that could work in feed-backward mode for paths the original data source, when it is faster than feed-forward. It
that only cross one subnet, and in feed-forward-and-up mode for paths would be valid to design a protocol that could work in feed-backward
that cross subnets. mode for paths that only cross one subnet, and in feed-forward-and-up
mode for paths that cross subnets.
In the early days of TCP/IP, a similar feed-backward approach was In the early days of TCP/IP, a similar feed-backward approach was
tried for explicit congestion signalling, using source-quench (SQ) tried for explicit congestion signalling, using source-quench (SQ)
ICMP control packets. However, SQ fell out of favour and is now ICMP control packets. However, SQ fell out of favour and is now
formally deprecated [RFC6633]. The main problem was that it is hard formally deprecated [RFC6633]. The main problem was that it is hard
for a data source to tell the difference between a spoofed SQ message for a data source to tell the difference between a spoofed SQ message
and a quench request from a genuine buffer on the path. It is also and a quench request from a genuine buffer on the path. It is also
hard for a lower layer buffer to address an SQ message to the hard for a lower layer buffer to address an SQ message to the
original source port number, which may be buried within many layers original source port number, which may be buried within many layers
of headers, and possibly encrypted. of headers, and possibly encrypted.
skipping to change at page 21, line 14 skipping to change at page 21, line 45
technology. If a QCN subnet were later connected into a wider IP- technology. If a QCN subnet were later connected into a wider IP-
based internetwork (e.g. when attempting to interconnect multiple based internetwork (e.g. when attempting to interconnect multiple
data centres) it would suffer the inefficiency shown in Figure 3. data centres) it would suffer the inefficiency shown in Figure 3.
7. IANA Considerations (to be removed by RFC Editor) 7. IANA Considerations (to be removed by RFC Editor)
This memo includes no request to IANA. This memo includes no request to IANA.
8. Security Considerations 8. Security Considerations
{ToDo} If a lower layer wire protocol is redesigned to include explicit
congestion signalling in-band in the protocol header, care SHOULD be
taken to ensure that the field used is specified as mutable during
transit. Otherwise interior nodes signalling congestion would
invalidate any authentication protocol applied to the lower layer
header--by altering a header field that had been assumed as
immutable.
The redesign of protocols that encapsulate IP in order to propagate
congestion signals between layers raises potential signal integrity
concerns. Experimental or proposed approaches exist for assuring the
end-to-end integrity of in-band congestion signals, e.g.:
o Congestion exposure (ConEx) for networks to audit that their
congestion signals are not being suppressed by other networks or
by receivers, and for networks to police that senders are
responding sufficiently to the signals, irrespective of the
transport protocol used [I-D.ietf-conex-abstract-mech].
o The ECN nonce [RFC3540] for a TCP sender to detect whether a
network or the receiver is suppressing congestion signals.
o A test with the same goals as the ECN nonce, but without the need
for the receiver to co-operate with the protocol
[I-D.moncaster-tcpm-rcv-cheat].
Given these end-to-end approaches are already being specified, it
would make little sense to attempt to design hop-by-hop congestion
signal integrity into a new lower layer protocol, because end-to-end
integrity inherently achieves hop-by-hop integrity.
9. Conclusions 9. Conclusions
Following the guidance in the document enables ECN support to be Following the guidance in the document enables ECN support to be
extended to numerous protocols that encapsulate IP (v4 & v6) in a extended to numerous protocols that encapsulate IP (v4 & v6) in a
consistent way, so that IP continues to fulfil its role as an end-to- consistent way, so that IP continues to fulfil its role as an end-to-
end interoperability layer. This includes: end interoperability layer. This includes:
o A wide range of tunnelling protocols with various forms of shim o A wide range of tunnelling protocols with various forms of shim
header between two IP headers; header between two IP headers;
skipping to change at page 21, line 50 skipping to change at page 23, line 15
10. Acknowledgements 10. Acknowledgements
Thanks to Gorry Fairhurst for extensive initial reviews. Michael Thanks to Gorry Fairhurst for extensive initial reviews. Michael
Welzl pointed out that lower layer congestion notification signals Welzl pointed out that lower layer congestion notification signals
may have different semantics to those in IP. may have different semantics to those in IP.
Bob Briscoe was part-funded by the European Community under its Bob Briscoe was part-funded by the European Community under its
Seventh Framework Programme through the Trilogy project (ICT-216372) Seventh Framework Programme through the Trilogy project (ICT-216372)
for initial drafts and through the Reducing Internet Transport for initial drafts and through the Reducing Internet Transport
Latency (RITE) project (ICT-317700) subsequently. The views Latency (RITE) project (ICT-317700) subsequently. The views
expressed here are solely those of the author. expressed here are solely those of the authors.
11. Comments Solicited 11. Comments Solicited
Comments and questions are encouraged and very welcome. They can be Comments and questions are encouraged and very welcome. They can be
addressed to the IETF Transport Area working group mailing list addressed to the IETF Transport Area working group mailing list
<tsvwg@ietf.org>, and/or to the authors. <tsvwg@ietf.org>, and/or to the authors.
12. References 12. References
12.1. Normative References 12.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to [RFC2119] Bradner, S., "Key words for use in
Indicate Requirement Levels", BCP 14, RFCs to Indicate Requirement Levels",
RFC 2119, March 1997. BCP 14, RFC 2119, March 1997.
[RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, [RFC3168] Ramakrishnan, K., Floyd, S., and D.
"The Addition of Explicit Congestion Black, "The Addition of Explicit
Notification (ECN) to IP", RFC 3168, Congestion Notification (ECN) to IP",
September 2001. RFC 3168, September 2001.
[RFC3819] Karn, P., Bormann, C., Fairhurst, G., [RFC3819] Karn, P., Bormann, C., Fairhurst, G.,
Grossman, D., Ludwig, R., Mahdavi, J., Grossman, D., Ludwig, R., Mahdavi,
Montenegro, G., Touch, J., and L. Wood, J., Montenegro, G., Touch, J., and L.
"Advice for Internet Subnetwork Designers", Wood, "Advice for Internet Subnetwork
BCP 89, RFC 3819, July 2004. Designers", BCP 89, RFC 3819,
July 2004.
[RFC4774] Floyd, S., "Specifying Alternate Semantics [RFC4774] Floyd, S., "Specifying Alternate
for the Explicit Congestion Notification Semantics for the Explicit Congestion
(ECN) Field", BCP 124, RFC 4774, Notification (ECN) Field", BCP 124,
November 2006. RFC 4774, November 2006.
[RFC5129] Davie, B., Briscoe, B., and J. Tay,
"Explicit Congestion Marking in
MPLS", RFC 5129, January 2008.
[RFC6040] Briscoe, B., "Tunnelling of Explicit
Congestion Notification", RFC 6040,
November 2010.
12.2. Informative References 12.2. Informative References
[ATM-TM-ABR] Cisco, "Understanding the Available Bit Rate [ATM-TM-ABR] Cisco, "Understanding the Available
(ABR) Service Category for ATM VCs", Design Bit Rate (ABR) Service Category for
Technote 10415, June 2005. ATM VCs", Design Technote 10415,
June 2005.
[Buck00] Buckwalter, J., "Frame Relay: Technology and [Buck00] Buckwalter, J., "Frame Relay:
Practice", Pub. Addison Wesley ISBN-13: 978- Technology and Practice", Pub.
Addison Wesley ISBN-13: 978-
0201485240, 2000. 0201485240, 2000.
[DCTCP] Alizadeh, M., Greenberg, A., Maltz, D., [DCTCP] Alizadeh, M., Greenberg, A., Maltz,
Padhye, J., Patel, P., Prabhakar, B., D., Padhye, J., Patel, P., Prabhakar,
Sengupta, S., and M. Sridharan, "Data Center B., Sengupta, S., and M. Sridharan,
TCP (DCTCP)", ACM SIGCOMM CCR 40(4)63--74, "Data Center TCP (DCTCP)", ACM
SIGCOMM CCR 40(4)63--74,
October 2010, <http://portal.acm.org/ October 2010, <http://portal.acm.org/
citation.cfm?id=1851192>. citation.cfm?id=1851192>.
[GTPv1] 3GPP, "GPRS Tunnelling Protocol (GTP) across [GTPv1] 3GPP, "GPRS Tunnelling Protocol (GTP)
the Gn and Gp interface", Technical across the Gn and Gp interface",
Specification TS 29.060. Technical Specification TS 29.060.
[GTPv1-U] 3GPP, "General Packet Radio System (GPRS) [GTPv1-U] 3GPP, "General Packet Radio System
Tunnelling Protocol User Plane (GTPv1-U)", (GPRS) Tunnelling Protocol User Plane
Technical Specification TS 29.281. (GTPv1-U)", Technical
Specification TS 29.281.
[GTPv2-C] 3GPP, "Evolved General Packet Radio Service [GTPv2-C] 3GPP, "Evolved General Packet Radio
(GPRS) Tunnelling Protocol for Control plane Service (GPRS) Tunnelling Protocol
(GTPv2-C)", Technical Specification TS for Control plane (GTPv2-C)",
29.274. Technical Specification TS 29.274.
[I-D.ietf-conex-abstract-mech] Mathis, M. and B. Briscoe,
"Congestion Exposure (ConEx) Concepts
and Abstract Mechanism",
draft-ietf-conex-abstract-mech-07
(work in progress), July 2013.
[I-D.moncaster-tcpm-rcv-cheat] Moncaster, T., "A TCP Test to Allow
Senders to Identify Receiver Non-
Compliance",
draft-moncaster-tcpm-rcv-cheat-01
(work in progress), June 2007.
[IEEE802.1Qah] IEEE, "IEEE Standard for Local and [IEEE802.1Qah] IEEE, "IEEE Standard for Local and
Metropolitan Area Networks--Virtual Bridged Metropolitan Area Networks--Virtual
Local Area Networks--Amendment 6: Provider Bridged Local Area Networks--
Backbone Bridges", IEEE Std 802.1Qah-2008, Amendment 6: Provider Backbone
August 2008, <http://www.ieee802.org/1/ Bridges", IEEE Std 802.1Qah-2008,
pages/802.1ah.html>. August 2008, <http://www.ieee802.org/
1/pages/802.1ah.html>.
(Access Controlled link within page) (Access Controlled link within page)
[IEEE802.1Qau] Finn, N., Ed., "IEEE Standard for Local and [IEEE802.1Qau] Finn, N., Ed., "IEEE Standard for
Metropolitan Area Networks--Virtual Bridged Local and Metropolitan Area
Local Area Networks - Amendment 13: Networks--Virtual Bridged Local Area
Congestion Notification", IEEE Std 802.1Qau- Networks - Amendment 13: Congestion
Notification", IEEE Std 802.1Qau-
2010, March 2010, <http:// 2010, March 2010, <http://
ieeexplore.ieee.org/xpl/ ieeexplore.ieee.org/xpl/
mostRecentIssue.jsp?punumber=5454061>. mostRecentIssue.jsp?punumber=5454061>
.
(Access Controlled link within page) (Access Controlled link within page)
[ITU-T.I.371] ITU-T, "Traffic Control and Congestion [ITU-T.I.371] ITU-T, "Traffic Control and
Control in B-ISDN", ITU-T Rec. I.371 Congestion Control in B-ISDN", ITU-T
(03/04), March 2004. Rec. I.371 (03/04), March 2004.
[RFC1323] Jacobson, V., Braden, B., and D. Borman, [RFC1323] Jacobson, V., Braden, B., and D.
"TCP Extensions for High Performance", Borman, "TCP Extensions for High
RFC 1323, May 1992. Performance", RFC 1323, May 1992.
[RFC1701] Hanks, S., Li, T., Farinacci, D., and P. [RFC1701] Hanks, S., Li, T., Farinacci, D., and
Traina, "Generic Routing Encapsulation P. Traina, "Generic Routing
(GRE)", RFC 1701, October 1994. Encapsulation (GRE)", RFC 1701,
October 1994.
[RFC2003] Perkins, C., "IP Encapsulation within IP", [RFC2003] Perkins, C., "IP Encapsulation within
RFC 2003, October 1996. IP", RFC 2003, October 1996.
[RFC2637] Hamzeh, K., Pall, G., Verthein, W., Taarud, [RFC2637] Hamzeh, K., Pall, G., Verthein, W.,
J., Little, W., and G. Zorn, "Point-to-Point Taarud, J., Little, W., and G. Zorn,
Tunneling Protocol", RFC 2637, July 1999. "Point-to-Point Tunneling Protocol",
RFC 2637, July 1999.
[RFC2661] Townsley, W., Valencia, A., Rubens, A., [RFC2661] Townsley, W., Valencia, A., Rubens,
Pall, G., Zorn, G., and B. Palter, "Layer A., Pall, G., Zorn, G., and B.
Two Tunneling Protocol "L2TP"", RFC 2661, Palter, "Layer Two Tunneling Protocol
August 1999. "L2TP"", RFC 2661, August 1999.
[RFC2784] Farinacci, D., Li, T., Hanks, S., Meyer, D., [RFC2784] Farinacci, D., Li, T., Hanks, S.,
and P. Traina, "Generic Routing Meyer, D., and P. Traina, "Generic
Encapsulation (GRE)", RFC 2784, March 2000. Routing Encapsulation (GRE)",
RFC 2784, March 2000.
[RFC2884] Hadi Salim, J. and U. Ahmed, "Performance [RFC2884] Hadi Salim, J. and U. Ahmed,
Evaluation of Explicit Congestion "Performance Evaluation of Explicit
Notification (ECN) in IP Networks", Congestion Notification (ECN) in IP
RFC 2884, July 2000. Networks", RFC 2884, July 2000.
[RFC4301] Kent, S. and K. Seo, "Security Architecture [RFC2983] Black, D., "Differentiated Services
for the Internet Protocol", RFC 4301, and Tunnels", RFC 2983, October 2000.
December 2005.
[RFC5129] Davie, B., Briscoe, B., and J. Tay, [RFC3540] Spring, N., Wetherall, D., and D.
"Explicit Congestion Marking in MPLS", Ely, "Robust Explicit Congestion
RFC 5129, January 2008. Notification (ECN) Signaling with
Nonces", RFC 3540, June 2003.
[RFC6040] Briscoe, B., "Tunnelling of Explicit [RFC4301] Kent, S. and K. Seo, "Security
Congestion Notification", RFC 6040, Architecture for the Internet
November 2010. Protocol", RFC 4301, December 2005.
[RFC6633] Gont, F., "Deprecation of ICMP Source Quench [RFC6633] Gont, F., "Deprecation of ICMP Source
Messages", RFC 6633, May 2012. Quench Messages", RFC 6633, May 2012.
[RFC6660] Briscoe, B., Moncaster, T., and M. Menth, [RFC6660] Briscoe, B., Moncaster, T., and M.
"Encoding Three Pre-Congestion Notification Menth, "Encoding Three Pre-Congestion
(PCN) States in the IP Header Using a Single Notification (PCN) States in the IP
Diffserv Codepoint (DSCP)", RFC 6660, Header Using a Single Diffserv
Codepoint (DSCP)", RFC 6660,
July 2012. July 2012.
[trill-rbridge-options] Eastlake, D., Ghanwani, A., Manral, V., and [trill-rbridge-options] Eastlake, D., Ghanwani, A., Manral,
C. Bestler, "RBridges: Further TRILL Header V., and C. Bestler, "RBridges:
Extensions", Further TRILL Header Extensions",
draft-ietf-trill-rbridge-options-07 (work in draft-ietf-trill-rbridge-options-07
progress), June 2012. (work in progress), June 2012.
[vxlan] Mahalingam, M., Dutt, D., Duda, K., Agarwal, [vxlan] Mahalingam, M., Dutt, D., Duda, K.,
P., Kreeger, L., Sridhar, T., Bursell, M., Agarwal, P., Kreeger, L., Sridhar,
and C. Wright, "VXLAN: A Framework for T., Bursell, M., and C. Wright,
Overlaying Virtualized Layer 2 Networks over "VXLAN: A Framework for Overlaying
Virtualized Layer 2 Networks over
Layer 3 Networks", Layer 3 Networks",
draft-mahalingam-dutt-dcops-vxlan-03 (work draft-mahalingam-dutt-dcops-vxlan-04
in progress), February 2013. (work in progress), May 2013.
Appendix A. Outstanding Document Issues Appendix A. Outstanding Document Issues
1. [GF] Concern that certain guidelines warrant a MUST (NOT) rather 1. [GF] Concern that certain guidelines warrant a MUST (NOT) rather
than a SHOULD (NOT). Given the guidelines say that if any SHOULD than a SHOULD (NOT). Given the guidelines say that if any SHOULD
(NOT)s are not followed, a strong justification will be needed, (NOT)s are not followed, a strong justification will be needed,
they have been left as SHOULD (NOT) pending further list they have been left as SHOULD (NOT) pending further list
discussion. In particular: discussion. In particular:
* If inner is a Not-ECN-PDU and Outer is CE (or highest severity * If inner is a Not-ECN-PDU and Outer is CE (or highest severity
congestion level), MUST (not SHOULD) drop? congestion level), MUST (not SHOULD) drop?
2. [GF] Impact of Diffserv on alternate marking schemes (referring 2. Consider whether an IETF Standard Track doc will be needed to
to RFC3168, RFC4774 & RFC2983)
3. Consider whether an IETF Standard Track doc will be needed to
Update the IP-in-IP protocols listed in Section 4.1--at least Update the IP-in-IP protocols listed in Section 4.1--at least
those that the IETF controls--and which Area it should sit under. those that the IETF controls--and which Area it should sit under.
4. Guidelines referring to subnet technologies should also refer to Appendix B. Changes in This Version (to be removed by RFC Editor)
tunnels and vice versa.
5. Check that guidelines allow for multicast as well as unicast. From briscoe-02 to 03:
6. Security Considerations * Scope section:
Appendix B. Changes in This Version (to be removed by RFC Editor) + Added dependence on correct propagation of traffic class
information
+ For the feed-backward mode, deemed multicast and anycast out
of scope
* Ensured all guidelines referring to subnet technologies also
refer to tunnels and vice versa by adding applicability
sentences at the start of sections 4.1, 4.2, 4.3, 4.4, 4.6 and
5.
* Added Security Considerations on ensuring congestion signal
fields are classed as mutable and on using end-to-end
congestion signal integrity technologies rather than hop-by-
hop.
From briscoe-01 to 02: From briscoe-01 to 02:
* Added authors: JK & PT * Added authors: JK & PT
* Added * Added
+ Section 4.1 "IP-in-IP Tunnels with Tightly Coupled Shim + Section 4.1 "IP-in-IP Tunnels with Tightly Coupled Shim
Headers" Headers"
 End of changes. 56 change blocks. 
134 lines changed or deleted 251 lines changed or added
