Transport Area Working Group                                  B. Briscoe
Internet-Draft                                                  BT & UCL
Expires: December 28, 2006                                    A. Jacquet
                                                            A. Salvatori
                                                               M. Koyabe
                                                                      BT
                                                           June 26, 2006

     Re-ECN: Adding Accountability for Causing Congestion to TCP/IP
                   draft-briscoe-tsvwg-re-ecn-tcp-02

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on December 28, 2006.

Copyright Notice

   Copyright (C) The Internet Society (2006).

Abstract

   This document introduces a new protocol for explicit congestion
   notification (ECN), termed re-ECN, which can be deployed
   incrementally around unmodified routers.  The protocol arranges an
   extended ECN field in each packet so that, as it crosses any
   interface in an internetwork, it will carry a truthful prediction of
   congestion on the remainder of its path.  Then the upstream party at
   any trust boundary in the internetwork can be held responsible for
   the congestion they cause, or allow to be caused.  So, networks can
   introduce straightforward accountability and policing mechanisms for
   incoming traffic from end-customers or from neighbouring network
   domains.  The purpose of this document is to specify the re-ECN
   protocol at the IP layer and to give guidelines on any consequent
   changes required to transport protocols.  It includes the changes
   required to TCP both as an example and as a specification.  It also
   gives examples of mechanisms that can use the protocol to ensure data
   sources respond correctly to congestion.  And it describes example
   mechanisms that ensure the dominant selfish strategy of both network
   domains and end-points will be to set the extended ECN field
   honestly.

Authors' Statement: Status (to be removed by the RFC Editor)

   This document is posted as an Internet-Draft with the intent (at
   least that of the authors) to eventually progress to standards track.

   Although the re-ECN protocol is intended to make a simple but far-
   reaching change to the Internet architecture, the most immediate
   priority for the authors is to delay any move of the ECN nonce to
   Proposed Standard status.

   The ECN nonce is an experimental RFC that allows /senders/ to check
   the integrity of congestion feedback from /networks/.  Therefore the
   nonce only helps in scenarios where the sender is trusted to control
   network congestion.  On the other hand, the re-ECN protocol aims to
   allow networks themselves to be able to police cheating senders and
   receivers and to police neighbouring networks.  Re-ECN is therefore
   proposed in preference to the ECN nonce on the basis that it
   addresses the generic problem of accountability for congestion of a
   network's resources at the IP layer.

   Delaying the ECN nonce is justified by two factors:

   o  The ECN nonce would permanently consume a two-bit codepoint in
      the IP header for a purpose specific to a limited trust model.
      Although the nonce is a neat idea, its applicability seems too
      limited to warrant space in the IP header;

   o  Although we have re-designed the re-ECN codepoints so that they do
      not prevent the ECN nonce progressing, the same is not true the
      other way round.  If the ECN nonce started to see some deployment
      (perhaps because it was blessed with proposed standard status),
      incremental deployment of re-ECN would effectively be impossible,
      because re-ECN marking fractions at inter-domain borders would be
      polluted by unknown levels of nonce traffic.

   The authors are aware that re-ECN must prove it has the potential it
   claims if it is to displace the nonce.  Therefore, every effort has
   been made to complete a comprehensive specification of re-ECN so that
   its potential can be assessed.  We therefore seek the opinion of the
   Internet community on whether the re-ECN protocol is sufficiently
   useful to warrant standards action.

Changes from previous drafts (to be removed by the RFC Editor)

   From -00 to -01:

      Encoding of re-ECN wire protocol changed for reasons given in
      Appendix B and consequently draft substantially re-written.

      Substantial text added in sections on applications, incremental
      deployment, architectural rationale and security considerations.

   From -01 to -02:

      Explanation on informal terminology in Section 3.4 clarified.

      IPv6 wire protocol encoding added (Section 5.2).

      Text on (non-)issues with tunnels, encryption and link layer
      congestion notification added (Section 5.6 & Section 5.7).

      Section added giving evolvability arguments against encouraging
      bottleneck policing (Section 6.1.2).  And text on re-ECN's
      evolvability by design added to Section 6.1.3.

      Text on inter-domain policing (Section 6.1.6) and inter-domain
      fail-safes (Section 6.1.7) added.

      Minor editorial changes throughout.

Table of Contents

   1.  Introduction
   2.  Requirements notation
   3.  Protocol Overview
     3.1.  Background and Applicability
     3.2.  Re-ECN Abstracted Network Layer Wire Protocol (IPv4 or
           v6)
     3.3.  Re-ECN Protocol Operation
     3.4.  Informal Terminology
   4.  Transport Layers
     4.1.  TCP
       4.1.1.  RECN mode: Full re-ECN capable transport
       4.1.2.  RECN-Co mode: Re-ECT Sender with a Vanilla or
               Nonce ECT Receiver
       4.1.3.  Capability Negotiation
       4.1.4.  Extended ECN (EECN) Field Settings during Flow
               Start or after Idle Periods
       4.1.5.  Pure ACKS, Retransmissions, Window Probes and
               Partial ACKs
     4.2.  Other Transports
       4.2.1.  Guidelines for Adding Re-ECN to Other Transports
   5.  Network Layer
     5.1.  Re-ECN IPv4 Wire Protocol
     5.2.  Re-ECN IPv6 Wire Protocol
     5.3.  Router Forwarding Behaviour
     5.4.  Justification for Setting the First SYN to FNE
     5.5.  Control and Management
       5.5.1.  Negative Balance Warning
       5.5.2.  Rate Response Control
     5.6.  IP in IP Tunnels
     5.7.  Non-Issues
   6.  Applications
     6.1.  Policing Congestion Response
       6.1.1.  The Policing Problem
       6.1.2.  The Case Against Bottleneck Policing
       6.1.3.  Re-ECN Incentive Framework
       6.1.4.  Egress Dropper
       6.1.5.  Rate Policing
       6.1.6.  Inter-domain Policing
       6.1.7.  Inter-domain Fail-safes
       6.1.8.  Simulations
     6.2.  Other Applications
       6.2.1.  DDoS Mitigation
       6.2.2.  End-to-end QoS
       6.2.3.  Traffic Engineering
       6.2.4.  Inter-Provider Service Monitoring
     6.3.  Limitations
   7.  Incremental Deployment
     7.1.  Incremental Deployment Features
     7.2.  Incremental Deployment Incentives
   8.  Architectural Rationale
   9.  Related Work
     9.1.  Policing Rate Response to Congestion
     9.2.  Congestion Notification Integrity
     9.3.  Identifying Upstream and Downstream Congestion
   10. Security Considerations
   11. IANA Considerations
   12. Conclusions
   13. Acknowledgements
   14. Comments Solicited
   15. References
     15.1. Normative References
     15.2. Informative References
   Appendix A.  Precise Re-ECN Protocol Operation
   Appendix B.  Justification for Two Codepoints Signifying Zero
                Worth Packets
   Appendix C.  ECN Compatibility
   Appendix D.  Packet Marking During Flow Start
   Appendix E.  Example Egress Dropper Algorithm
   Appendix F.  Re-TTL
   Appendix G.  Policer Designs to ensure Congestion
                Responsiveness
     G.1.  Per-user Policing
     G.2.  Per-flow Rate Policing
   Appendix H.  Downstream Congestion Metering Algorithms
     H.1.  Bulk Downstream Congestion Metering Algorithm
     H.2.  Inflation Factor for Persistently Negative Flows
   Authors' Addresses
   Intellectual Property and Copyright Statements

1.  Introduction

   This document aims:

   o  To provide a complete specification of the addition of the re-ECN
      protocol to IP and guidelines on how to add it to transport layer
      protocols, including a complete specification of re-ECN in TCP as
      an example;

   o  To show how a number of hard problems become much easier to solve
      once re-ECN is available in IP.

   A general statement of the problem solved by re-ECN is to provide
   sufficient information in each IP datagram to be able to hold senders
   and whole networks accountable for the congestion they cause
   downstream, before they cause it.  But the every-day problems that
   re-ECN can solve are much more recognisable than this rather generic
   statement: mitigating distributed denial of service (DDoS);
   simplifying differentiation of quality of service (QoS); policing
   compliance to congestion control; and so on.

   Uniquely, re-ECN manages to enable solutions to these problems
   without unduly stifling innovative new ways to use the Internet.
   This was a hard balance to strike, given it could be argued that DDoS
   is an innovative way to use the Internet.  The most valuable insight
   was to allow each network to choose the level of constraint it wishes
   to impose.  Also re-ECN has been carefully designed so that networks
   that choose to use it conservatively can protect themselves against
   the congestion caused in their network by users on other networks
   with more liberal policies.

   For instance, some network owners want to block applications like
   voice and video unless their network is compensated for the extra
   share of bottleneck bandwidth taken.  These real-time applications
   tend to be unresponsive when congestion arises.  Whereas elastic TCP-
   based applications back away quickly, ending up taking a much smaller
   share of congested capacity for themselves.  Other network owners
   want to invest in large amounts of capacity and make their gains from
   simplicity of operation and economies of scale.

   Re-ECN allows the more conservative networks to police out flows that
   have not asked to be unresponsive to congestion---not because they
   are voice or video---just because they don't respond to congestion.
   But it also allows other networks to choose not to police.
   Crucially, when flows from liberal networks cross into a conservative
   network, re-ECN enables the conservative network to apply penalties
   to its neighbouring networks for the congestion they allow to be
   caused.  And these penalties can be applied to bulk data, without
   regard to flows.

   Then, if unresponsive applications become so dominant that some of
   the more liberal networks experience congestion collapse [RFC3714],
   they can change their minds and use re-ECN to apply tighter controls
   in order to bring congestion back under control.

   Re-ECN works by arranging that each packet arrives at each network
   element carrying a view of expected congestion on its own downstream
   path, albeit averaged over multiple packets.  Most usefully,
   congestion on the remainder of the path becomes visible in the IP
   header at the first ingress.  Many of the applications of re-ECN
   involve a policer at this ingress using the view of downstream
   congestion arriving in packets to police or control the packet rate.

   Importantly, the scheme is recursive: a whole network harbouring
   users causing congestion in downstream networks can be held
   responsible or policed by its downstream neighbour.

   This document is structured as follows.  First an overview of the re-
   ECN protocol is given (Section 3), outlining its attributes and
   explaining conceptually how it works as a whole.  The two main parts
   of the document follow, as described above.  That is, the protocol
   specification divided into transport (Section 4) and network
   (Section 5) layers, then the applications it can be put to, such as
   policing DDoS, QoS and congestion control (Section 6).  Although
   these applications do not require standardisation themselves, they
   are described in a fair degree of detail in order to explain how re-
   ECN can be used.  Given that re-ECN proposes to use the last
   undefined bit in the IPv4 header, we felt it necessary to outline the
   potential that re-ECN could release in return for being given that
   bit.

   Deployment issues discussed throughout the document are brought
   together in Section 7, which is followed by a brief section
   explaining the somewhat subtle rationale for the design, from an
   architectural perspective (Section 8).  We end by describing related
   work (Section 9), listing security considerations (Section 10) and
   finally drawing conclusions (Section 12).

2.  Requirements notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   This document first specifies a protocol, then describes a framework
   that creates the right incentives to ensure compliance to the
   protocol.  This could cause confusion because the second part of the
   document considers many cases where malicious nodes may not comply
   with the protocol.  When such contingencies are described, if any of
   the above keywords are not capitalised, that is deliberate.  So, for
   instance, the following two apparently contradictory sentences would
   be perfectly consistent: i) x MUST do this; ii) x may not do this.

3.  Protocol Overview

3.1.  Background and Applicability

   First we briefly recap the essentials of the ECN protocol [RFC3168].
   Two bits in the IP header (v4 or v6) are assigned to the ECN field.
   The sender clears the field to "00" (Not-ECT) if either end-point
   transport is not ECN-capable.  Otherwise it indicates an ECN-capable
   transport (ECT) using either of the two code-points "10" or "01"
   (ECT(0) and ECT(1) resp.).

   ECN-capable routers probabilistically set "11" if congestion is
   experienced (CE), the marking probability increasing with the length
   of the queue at its egress link (typically using the RED algorithm
   [RFC2309]).  However, they still drop rather than mark Not-ECT
   packets.  With multiple ECN-capable routers on a path, a flow of
   packets accumulates the fraction of CE marking that each router adds.
   The combined effect of the packet marking of all the routers along
   the path signals congestion of the whole path to the receiver.  So,
   for example, if one router early in a path is marking 1% of packets
   and another later in a path is marking 2%, flows that pass through
   both routers will experience approximately 3% marking (see Appendix A
   for a precise treatment).
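
   As a rough illustration of why the two marking fractions combine to
   "approximately" 3%, the sketch below assumes each router marks
   packets independently of the others.  That independence assumption
   and the function name combined_marking are ours, not part of this
   specification; Appendix A remains the authoritative treatment.

```python
def combined_marking(fractions):
    """Fraction of packets CE-marked by at least one router on the path.

    Assumes (for illustration only) that routers mark independently.
    """
    unmarked = 1.0
    for p in fractions:
        unmarked *= 1.0 - p   # packet survives this router unmarked
    return 1.0 - unmarked


path = [0.01, 0.02]           # the two-router example from the text
print(f"sum approximation: {sum(path):.4f}")
print(f"combined marking:  {combined_marking(path):.4f}")
```

   Under that assumption the combined fraction is slightly below the
   simple sum, because a packet already marked by the first router
   cannot be marked a second time.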

   The choice of two ECT code-points in the ECN field [RFC3168]
   permitted future flexibility, optionally allowing the sender to
   encode the experimental ECN nonce [RFC3540] in the packet stream.
   The nonce is designed to allow a sender to check the integrity of
   congestion feedback.  But Section 9.2 explains that it still gives no
   control over how fast the sender transmits as a result of the
   feedback.  On the other hand, re-ECN is designed both to ensure that
   congestion is declared honestly and that the sender's rate responds
   appropriately.

   Re-ECN is based on a feedback arrangement called
   `re-feedback' [Re-fb].  The word is short for either receiver-
   aligned, re-inserted or re-echoed feedback.  But it actually works
   even when no feedback is available.  In fact it has been carefully
   designed to work for single datagram flows.  Indeed, it even
   encourages aggregation of single packet flows by congestion control
   proxies.  Then, even if the traffic mix of the Internet were to
   become dominated by short messages, it would still be possible to
   control congestion effectively and efficiently.

   Changing the Internet's feedback architecture seems to imply
   considerable upheaval.  But re-ECN can be deployed incrementally at
   the transport layer around unmodified routers using existing fields
   in IP (v4 or v6).  However it does also require the last undefined
   bit in the IPv4 header, which it uses in combination with the 2-bit
   ECN field to create four new codepoints.  Nonetheless, changes to IP
   routers are RECOMMENDED in order to improve resilience against DoS
   attacks.  Similarly, re-ECN works best if both the sender and
   receiver transports are re-ECN-capable, but it can work with just
   sender support.  Section 7.1 summarises the incremental deployment
   strategy.

   The re-ECN protocol makes no changes to, and has no effect on, the TCP
   congestion control algorithm or on other rate responses to
   congestion.  Re-ECN is only concerned with enabling the ingress
   network to police that a source is complying with a congestion
   control algorithm, which is orthogonal to congestion control itself.

   Before re-ECN can be considered worthy of using up the last bit in
   the IP header, we must be sure that all our claims are robust.  We
   have gradually been reducing the list of outstanding issues, but the
   few that still remain are listed in Section 6.3.  We expect new
   attacks may still be found, but we offer the re-ECN protocol on the
   basis that it is built on fairly solid theoretical foundations and,
   so far, it has proved possible to keep it relatively robust.

3.2.  Re-ECN Abstracted Network Layer Wire Protocol (IPv4 or v6)

   The re-ECN wire protocol uses the two bit ECN field broadly as in
   [RFC3168] as described above, but with five differences of detail
   (brought together in a list in Section 7.1).  This
   specification defines a new re-ECN extension (RE) flag.  We will
   defer the definition of the actual position of the RE flag in the
   IPv4 & v6 headers until Section 5.  Until then it will suffice to use
   an abstraction of the IPv4 and v6 wire protocols by just calling it
   the RE flag.

   Unlike the ECN field, the RE flag is intended to be set by the sender
   and remain unchanged along the path, although it can be read by
   network elements that understand the re-ECN protocol.  It is feasible
   that a network element MAY change the setting of the RE flag, perhaps
   acting as a proxy for an end-point, but such a protocol would have to
   be defined in another specification (e.g. [Re-PCN]).

   Although the RE flag is a separate, single bit field, it can be read
   as an extension to the two-bit ECN field; the three concatenated bits
   in what we will call the extended ECN field (EECN) making eight
   codepoints.  We will use the RFC3168 names of the ECN codepoints to
   describe settings of the ECN field when the RE flag setting is "don't
   care", but we also define the following six extended ECN codepoint
   names for when we need to be more specific.

   +-------+-----------+------+--------------+-------------------------+
   |  ECN  | RFC3168   |  RE  | Extended ECN |      Re-ECN meaning     |
   | field | codepoint | flag | codepoint    |                         |
   +-------+-----------+------+--------------+-------------------------+
   |   00  | Not-ECT   |   0  | Not-RECT     |    Not re-ECN-capable   |
   |       |           |      |              |        transport        |
   |   00  | Not-ECT   |   1  | FNE          |       Feedback not      |
   |       |           |      |              |       established       |
   |   01  | ECT(1)    |   0  | Re-Echo      |   Re-echoed congestion  |
   |       |           |      |              |         and RECT        |
   |   01  | ECT(1)    |   1  | RECT         |      Re-ECN capable     |
   |       |           |      |              |        transport        |
   |   10  | ECT(0)    |   0  | ---          |   Legacy ECN use only   |
   |   10  | ECT(0)    |   1  | --CU--       |     Currently unused    |
   |       |           |      |              |                         |
   |   11  | CE        |   0  | CE(0)        |   Re-Echo canceled by   |
   |       |           |      |              |  congestion experienced |
   |   11  | CE        |   1  | CE(-1)       |  Congestion experienced |
   +-------+-----------+------+--------------+-------------------------+

                     Table 1: Extended ECN Codepoints
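
   The codepoint table can be written down as a small lookup, which
   may help when reading later sections.  This is only a sketch: the
   type EECN and the helpers eecn_name and congestion_mark are
   hypothetical names of ours, and congestion_mark models an unmodified
   RFC3168 router that sets the ECN field to "11" and leaves the RE
   flag alone.

```python
from typing import NamedTuple, Optional


class EECN(NamedTuple):
    ecn: int  # two-bit ECN field (0b00 to 0b11)
    re: int   # RE flag (0 or 1)


# Extended ECN codepoint names from Table 1.  None marks the two
# combinations that have no re-ECN codepoint name.
EECN_NAMES = {
    EECN(0b00, 0): "Not-RECT",
    EECN(0b00, 1): "FNE",
    EECN(0b01, 0): "Re-Echo",
    EECN(0b01, 1): "RECT",
    EECN(0b10, 0): None,      # legacy ECN use only
    EECN(0b10, 1): None,      # currently unused (--CU--)
    EECN(0b11, 0): "CE(0)",
    EECN(0b11, 1): "CE(-1)",
}


def eecn_name(cp: EECN) -> Optional[str]:
    """Look up the extended ECN codepoint name for (ECN field, RE flag)."""
    return EECN_NAMES[cp]


def congestion_mark(cp: EECN) -> EECN:
    """Model a router setting CE: ECN becomes 11, the RE flag is untouched.

    Assumption of this sketch: packets with ECN field 00 are dropped
    rather than marked, so marking them is treated as an error.
    """
    if cp.ecn == 0b00:
        raise ValueError("ECN field 00 is dropped rather than marked")
    return EECN(0b11, cp.re)


print(eecn_name(congestion_mark(EECN(0b01, 1))))  # a marked RECT packet
print(eecn_name(congestion_mark(EECN(0b01, 0))))  # a marked Re-Echo packet
```

   Because the router only touches the ECN field, a marked RECT packet
   arrives as CE(-1) and a marked Re-Echo packet arrives as CE(0).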

3.3.  Re-ECN Protocol Operation

   In this section we will give an overview of the operation of the re-
   ECN protocol for TCP/IP, leaving a detailed specification to the
   following sections.  Other transports will be discussed later.

   In summary, the protocol adds a third `re-echo' stage to the existing
   TCP/IP ECN protocol.  Whenever the network adds CE congestion
   signalling to the IP header on the forward data path, the receiver
   feeds it back to the ingress using TCP, then the sender re-echoes it
   into the forward data path using the RE flag in the next packet.

   Prior to receiving any feedback a sender will not know which setting
   of the RE flag to use, so it sets the feedback not established (FNE)
   codepoint.  The network reads the FNE codepoint conservatively as
   equivalent to re-echoed congestion.

   Specifically, once a flow is established, a re-ECN sender always
   initialises the ECN field to ECT(1).  And it usually sets the RE flag
   to "1".  Whenever a router re-marks a packet to CE, the receiver
   feeds back this event to the sender.  On receiving this feedback, the
   re-ECN sender will clear the RE flag to "0" in the next packet it
   sends.

   We chose to set and clear the RE flag this way round to ease
   incremental deployment (see Section 7.1).  To avoid confusion we will
   use the term `blanking' (rather than marking) when the RE flag is
   cleared to "0".  So, over a stream of packets, we will talk of the
   `RE blanking fraction' as the fraction of octets in packets with the
   RE flag cleared to "0".

       ^
       |
       |       RE blanking fraction
    3% |--------------------------------+=====
       |                                |
    2% |                                |
       |            CE marking fraction |
    1% |        +-----------------------+
       |        |
    0% +---------------------------------------->
          ^     0     ^                 i    ^    resource index
          |     ^     |                 ^    |
          0     |     1                 |    2     observation points
              1.00%                  2.00%         marking fraction

   Figure 1: A 2-Router Example (Imprecise)

   Figure 1 uses the two router example introduced earlier to illustrate
   why re-ECN allows routers to measure downstream congestion.  The
   horizontal axis represents the index of each congestible resource
   (typically queues) along a path through the Internet.  There may be
   many routers on the path, but we assume only two are currently
   congested (those with resource index 0 and i).  The two superimposed
   plots show the fraction of each extended ECN codepoint in a flow
   observed along this path.  Given about 3% of packets reaching the
   destination are marked CE, in response to feedback the sender will
   blank the RE flag in about 3% of packets it sends.  Then approximate
   downstream congestion can be measured at the observation points shown
   along the path by subtracting the CE marking fraction from the RE
   blanking fraction, as shown in the table below (Appendix A derives
   these approximations from a precise analysis).

           +-------------------+------------------------------+
           | Observation point | Approx downstream congestion |
           +-------------------+------------------------------+
           |         0         |         3% - 0% = 3%         |
           |         1         |         3% - 1% = 2%         |
           |         2         |         3% - 3% = 0%         |
           +-------------------+------------------------------+

   Table 2: Downstream Congestion Measured at Example Observation Points
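
   The subtraction in Table 2 can be reproduced with a few lines of
   code.  The sketch below is illustrative only: the function name, the
   tuple layout and the choice to count only ECT(1) and CE packets are
   our assumptions, and the packet mixes are made up to match the
   fractions in Figure 1.

```python
def downstream_congestion(packets):
    """Estimate downstream congestion at one observation point.

    packets: iterable of (ecn, re, size) tuples, where ecn is the
    two-bit ECN field, re is the RE flag and size is in octets.
    Only ECT(1) and CE packets are counted (an assumption of this
    sketch, to keep it to established re-ECN flows).
    """
    total = blanked = marked = 0
    for ecn, re, size in packets:
        if ecn not in (0b01, 0b11):
            continue              # skip Not-ECT and ECT(0) packets
        total += size
        if re == 0:
            blanked += size       # RE flag blanked by the sender
        if ecn == 0b11:
            marked += size        # CE marked somewhere upstream
    if total == 0:
        return 0.0
    return (blanked - marked) / total


SIZE = 1000                       # octets, arbitrary
RECT, RE_ECHO, CE_NEG = (0b01, 1, SIZE), (0b01, 0, SIZE), (0b11, 1, SIZE)

# 100 packets seen at each observation point of Figure 1.
observed = {
    0: [RECT] * 97 + [RE_ECHO] * 3,                 # nothing marked yet
    1: [RECT] * 96 + [RE_ECHO] * 3 + [CE_NEG] * 1,  # 1% marked
    2: [RECT] * 94 + [RE_ECHO] * 3 + [CE_NEG] * 3,  # 3% marked
}
for point, packets in observed.items():
    print(f"point {point}: {downstream_congestion(packets):.0%}")
```

   The fractions are taken over octets rather than packets, in line
   with the definition of the RE blanking fraction above.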

   All along the path, whole-path congestion remains unchanged so it can
   be used as a reference against which to compare upstream congestion.
   The difference predicts downstream congestion for the rest of the
   path.  Therefore, measuring the fractions of each codepoint at any
   point in the Internet will reveal upstream, downstream and whole path
   congestion.

   Note that we have introduced discussion of marking and blanking
   fractions solely for illustration.  To be absolutely clear, these
   fractions are averages that would result from the behaviour of a TCP
   protocol handler mechanically blanking outgoing packets in direct
   response to incoming feedback---we are not saying any protocol
   handler works with these average fractions directly.

3.4.  Informal Terminology

   In the rest of this memo we will loosely talk of positive or negative
   flows, meaning flows where the moving average of the downstream
   congestion metric is persistently positive or negative.  The notion
   of a negative metric arises because it is derived by subtracting one
   metric from another.  Of course actual downstream congestion cannot
   be negative, only the metric can (whether due to time lags or
   deliberate malice).

   Just as we will loosely talk of positive and negative flows, we will
   also talk of positive or negative packets, meaning packets that
   contribute positively or negatively to the downstream congestion
   metric.

   Therefore we will talk of packets having `worth' of +1, 0 or -1,
   which, when multiplied by their size, indicates their contribution to
   the downstream congestion metric.

   Figure 2 shows the main state transitions of the system once a flow
   is established, showing the worth of packets in each state.  When the
   network congestion marks a packet it decrements its worth (moving
   from the left of the main square to the right).  When the sender
   blanks the RE flag in order to re-echo congestion it increments the
   worth of a packet (moving from the bottom of the main square to the
   top).

   Sender state         Sent     Worth  Network   Received  Worth
                        packet         Congestion packet
            +----------------------------------------------------+
            |                                                    ^
            V                                                    |
   Congestion echoed -->Re-Echo  +1  --+--->      CE(0)      0 --+
                        (positive)     |            (canceled)   |
                                       V                         |
                                       |                         |
   Flow established --> RECT      0  ----+->      CE(-1)    -1 --+
            ^           (neutral)      | |          (negative)
            |                          | |
            |                      no  V V
            |               congestion | |
            +-----------<--------------+-+

   Figure 2: Re-ECN System State Diagram (bootstrap not shown)

   The idea is that every time the network decrements the worth of a
   packet, the sender increments the worth of a later packet.  Then,
   over time, as many positive octets should arrive at the receiver as
   negative.  Note we have said octets not packets, so if packets are of
   different sizes, the worth should be incremented on enough octets to
   balance the octets in negative packets arriving at the receiver.  It
   is this balance that will allow the network to hold the sender
   accountable for the congestion it causes, as we shall see.
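
   The octet balance described above can be sketched as follows.  The
   names WORTH and balance are ours, and the packet sizes are invented
   for illustration; the worth values are those used in this section
   (the +1 worth of FNE is explained with the flow bootstrap process).

```python
# Worth of each extended ECN codepoint, as used in this section.
WORTH = {
    "Re-Echo": +1,
    "FNE": +1,
    "RECT": 0,
    "CE(0)": 0,
    "CE(-1)": -1,
}


def balance(packets):
    """Sum of worth x size (in octets) over (codepoint, size) pairs."""
    return sum(WORTH[codepoint] * size for codepoint, size in packets)


# One large negative packet balanced by three smaller positive ones.
arrivals = [
    ("CE(-1)", 1500),
    ("RECT", 1500),
    ("Re-Echo", 500),
    ("Re-Echo", 500),
    ("Re-Echo", 500),
]
print(balance(arrivals))
```

   Counting packets instead of octets would show three positive
   packets against one negative packet, which is why the balance is
   defined over octets.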

   The informal outline below uses TCP as an example transport, but the
   idea would be broadly similar for any transport that adapts its rate
   to congestion.

   We will start with the sender in `flow established' state.  Normally,
   as acknowledgements of earlier packets arrive that don't feed back
   any congestion, the congestion window can be opened, so the sender
   goes round the smaller sub-loop, sending RECT packets (worth 0) and
   returning to the flow established state to send another one.  If a
   router congestion marks one of the packets, it decrements the
   packet's worth.  The sender will have been continuing to traverse
   the smaller feedback loop every time acknowledgements arrive.  But
   when congestion feedback returns from the packet that was marked
   with -1 worth (the largest loop in the figure), the sender jumps to
   the congestion echoed state in order to re-echo the congestion,
   incrementing the worth of the next packet to +1 by blanking its RE
   flag.  The sender then returns to the flow established state and
   continues round the smaller loop, sending packets worth 0.  Note that
   the size of the loops is just an artefact of the figure; it is not
   meant to imply that one loop is slower than the other - they are both
   the same end to end feedback loop.

   If a packet carrying re-echoed congestion happens to also be
   congestion marked, the +1 worth added by the sender will be cancelled
   out by the -1 network congestion marking.  Although the two worth
   values correctly cancel out, neither the congestion marking nor the
   re-echoed congestion is lost, because the RE flag and the ECN field
   are orthogonal.  So, whenever this happens, the receiver will
   correctly detect and re-echo the new congestion event as well (the
   top sub-loop).  When we need to distinguish, we will sometimes call a
   packet marked RECT neutral (0 worth), while we will call the CE(0)
   marking canceled (also 0 worth).  If a re-echoed packet isn't unlucky
   enough to be further congestion marked, the sender will return to the
   flow established state and continue to send RECT packets (worth 0).

   The table below specifies unambiguously the worth of each extended
   ECN codepoint.  Note the order is different from the previous table
   to better show how the worth increments and decrements.  The FNE
   codepoint is an exception.  It is used in the flow bootstrap process
   (explained later) and has the same positive (+1) worth as a packet
   with the Re-Echo codepoint.

   +-------+-----+----------------+-------+----------------------------+
   |  ECN  |  RE | Extended ECN   | Worth |       Re-ECN meaning       |
   | field | bit | codepoint      |       |                            |
   +-------+-----+----------------+-------+----------------------------+
   |   00  |  0  | Not-RECT       | ...   |     Not re-ECN-capable     |
   |       |     |                |       |          transport         |
   |   01  |  0  | Re-Echo        | +1    |  Re-echoed congestion and  |
   |       |     |                |       |            RECT            |
   |   10  |  0  | ---            | ...   |   Legacy ECN use only      |
   |   11  |  0  | CE(0)          |  0    |    Re-Echo canceled by     |
   |       |     |                |       |   congestion experienced   |
   |   00  |  1  | FNE            | +1    |  Feedback not established  |
   |   01  |  1  | RECT           |  0    |  Re-ECN capable transport  |
   |   10  |  1  | --CU--         | ...   |      Currently unused      |
   |       |     |                |       |                            |
   |   11  |  1  | CE(-1)         | -1    |   Congestion experienced   |
   +-------+-----+----------------+-------+----------------------------+

                Table 3: 'Worth' of Extended ECN Codepoints
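
   As an informal illustration of Table 3, the mapping can be sketched
   as a lookup keyed by the ECN field and the RE bit.  The names EECN,
   worth and congestion_mark are hypothetical and not part of this
   specification.  The sketch assumes a router marks a packet by
   setting the ECN field to 11 while leaving the RE bit untouched, and
   it only models marking of RECT and Re-Echo packets.

```python
# Extended ECN codepoints keyed by (ECN field, RE bit), following Table 3.
# A worth of None stands for the "..." entries, which the table leaves
# unspecified.
EECN = {
    ("00", 0): ("Not-RECT", None),
    ("01", 0): ("Re-Echo", +1),
    ("10", 0): ("---", None),
    ("11", 0): ("CE(0)", 0),
    ("00", 1): ("FNE", +1),
    ("01", 1): ("RECT", 0),
    ("10", 1): ("--CU--", None),
    ("11", 1): ("CE(-1)", -1),
}


def worth(ecn, re_bit):
    """Return the worth of the codepoint, or None where Table 3 gives '...'."""
    return EECN[(ecn, re_bit)][1]


def congestion_mark(ecn, re_bit):
    """Assumed router behaviour: set the ECN field to CE, keep the RE bit."""
    return ("11", re_bit)


# Marking a RECT packet (worth 0) yields CE(-1); marking a Re-Echo
# packet (worth +1) yields CE(0), so each mark lowers the worth by one.
rect_marked = congestion_mark("01", 1)
re_echo_marked = congestion_mark("01", 0)
```

   Because the RE bit survives marking, the receiver can still tell
   CE(0) from CE(-1), which is the orthogonality the text relies on.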

4.  Transport Layers

4.1.  TCP

   Re-ECN capability at the sender is essential.  At the receiver it is
   optional, as long as the receiver has a basic (`vanilla flavour')
   RFC3168-compliant ECN-capable transport (ECT) [RFC3168].  Given re-
   ECN is not the first attempt to define the semantics of the ECN
   field, we give a table below summarising what happens for various
   combinations of capabilities of the sender S and receiver R, as
   indicated in the first four columns below.  The last column gives the
   mode a half-connection should be in after the first two of the three
   TCP handshakes.

   +--------+---------------+-----------+---------+--------------------+
   | Re-ECT |   ECT-Nonce   |    ECT    | Not-ECT |         S-R        |
   |        |   (RFC3540)   | (RFC3168) |         |   Half-connection  |
   |        |               |           |         |        Mode        |
   +--------+---------------+-----------+---------+--------------------+
   |   SR   |               |           |         |        RECN        |
   |    S   |       R       |           |         |       RECN-Co      |
   |    S   |               |     R     |         |       RECN-Co      |
   |    S   |               |           |    R    |       Not-ECT      |
   +--------+---------------+-----------+---------+--------------------+

       Table 4: Modes of TCP Half-connection for Combinations of ECN
                  Capabilities of Sender S and Receiver R

   We will describe what happens in each mode, then describe how they
   are negotiated.  The abbreviations for the modes in the above table
   mean:

   RECN: Full re-ECN capable transport

   RECN-Co: Re-ECN sender in compatibility mode with a vanilla [RFC3168]
      ECN receiver or an [RFC3540] ECN nonce-capable receiver.
      Implementation of this mode is OPTIONAL.

   Not-ECT: Not ECN-capable transport, as defined in [RFC3168] for when
      at least one of the transports does not understand even basic ECN
      marking.

   Note that we use the term Re-ECT for a host transport that is re-ECN-
   capable but RECN for the modes of the half connections between hosts
   when they are both Re-ECT.  If a host transport is Re-ECT, this fact
   alone does NOT imply either of its half connections will necessarily
   be in RECN mode, at least not until it has confirmed that the other
   host is Re-ECT.

4.1.1.  RECN mode: Full re-ECN capable transport

   In full RECN mode, for each half connection, both the sender and the
   receiver each maintain an unsigned integer counter we will call ECC
   (echo congestion counter).  The receiver maintains a count, modulo 8,
   of how many times a CE marked packet has arrived during the half-
   connection.  Once a RECN connection is established, the three TCP
   option flags (ECE, CWR & NS) used for ECN-related functions in
   previous versions of ECN are used as a 3-bit field for the receiver
   to repeatedly tell the sender the current value of ECC whenever it
   sends a TCP ACK.  We will call this the echo congestion increment
   (ECI) field.  This overloaded use of these 3 option flags as one
   3-bit ECI field is shown in Figure 4.  The actual definition of the
   TCP header, including the addition of support for the ECN nonce, is
   shown for comparison in Figure 3.  This specification does not
   redefine the names of these three TCP option flags, it merely
   overloads them with another definition once a flow is established.

        0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
      |               |           | N | C | E | U | A | P | R | S | F |
      | Header Length | Reserved  | S | W | C | R | C | S | S | Y | I |
      |               |           |   | R | E | G | K | H | T | N | N |
      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

   Figure 3: The (post-ECN Nonce) definition of bytes 13 and 14 of the
   TCP Header

        0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
      |               |           |           | U | A | P | R | S | F |
      | Header Length | Reserved  |    ECI    | R | C | S | S | Y | I |
      |               |           |           | G | K | H | T | N | N |
      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

   Figure 4: Definition of the ECI field within bytes 13 and 14 of the
   TCP Header, overloading the current definitions above for established
   RECN flows.

   Receiver Action in RECN Mode

      Every time a CE marked packet arrives at a receiver in RECN mode,
      the receiver transport increments its local value of ECC modulo 8
      and MUST echo its value to the sender in the ECI field of the next
      ACK.  It MUST repeat the same value of ECI in every subsequent ACK
      until the next CE event, when it increments ECI again.

      The increment of the local ECC values is modulo 8 so the field
      value simply wraps round back to zero when it overflows.  The
      least significant bit is to the right (labelled bit 9).

      A receiver in RECN mode MAY delay the echo of a CE to the next
      delayed-ACK, which would be necessary if ACK-withholding were
      implemented.
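
      The receiver behaviour above might be sketched as follows.  The
      class RecnReceiver and the helper eci_to_flags are hypothetical
      names used only for illustration.  The bit order assumes, as
      stated above, that the least significant bit of ECI is bit 9,
      the position of the ECE flag.

```python
class RecnReceiver:
    """Hypothetical receiver-side state for one half-connection."""

    def __init__(self):
        self.ecc = 0  # echo congestion counter, starts at zero

    def on_data_packet(self, ce_marked):
        """Update ECC for an arriving packet; return the ECI value to echo."""
        if ce_marked:
            self.ecc = (self.ecc + 1) % 8  # 3-bit counter wraps to zero
        return self.ecc  # the same value is repeated until the next CE mark


def eci_to_flags(eci):
    """Split a 3-bit ECI value into (NS, CWR, ECE), with ECE least significant."""
    return ((eci >> 2) & 1, (eci >> 1) & 1, eci & 1)
```

      Nine consecutive CE marks would produce the ECI sequence 1, 2,
      3, 4, 5, 6, 7, 0, 1 in this sketch, showing the wrap at 8.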

   Sender Action in RECN Mode

      On the arrival of every ACK, the sender compares the ECI field
      with its own ECC value, then replaces its local value with that
      from the ACK.  The difference D is assumed to be the number of CE
      marked packets that arrived at the receiver since it sent the
      previously received ACK (but see below for the sender's safety
      strategy).  Whenever the ECI field increments by D (or D drops are
      detected), the sender MUST clear the RE flag to "0" in the IP
      header of the next D data packets it sends, effectively re-echoing
      each single increment of ECI.  Otherwise the data sender MUST send
      all data packets with RE set to "1".

      As a general rule, once a flow is established, as well as setting
      or clearing the RE flag as above, a data sender in RECN mode MUST
      always set the ECN field to ECT(1).  However, the settings of the
      extended ECN field during flow start are defined in Section 4.1.4.

      As we have already emphasised, the re-ECN protocol makes no
      changes and has no effect on the TCP congestion control algorithm.
      So, each increment of ECI (or detection of a drop) also triggers
      the standard TCP congestion response, but with no more than one
      congestion response per round trip, as usual.

      A TCP sender also acts as the receiver for the other half-
      connection.  The host will maintain two ECC values S.ECC and R.ECC
      as sender and receiver respectively.  Every TCP header sent by a
      host in RECN mode will also repeat the prevailing value of R.ECC
      in its ECI field.  If a sender in RECN mode has to retransmit a
      packet due to a suspected loss, the re-transmitted packet MUST
      carry the latest prevailing value of R.ECC when it is re-
      transmitted, which will not necessarily be the one it carried
      originally.

4.1.1.1.  Safety against Long Pure ACK Loss Sequences

   The ECI method was chosen for echoing congestion marking because a
   re-ECN sender needs to know about every CE mark arriving at the
   receiver, not just whether at least one arrives within a round trip
   time (which is all the ECE/CWR mechanism supported).  And, as pure
   ACKs are not protected by TCP reliable delivery, we repeat the same
   ECI value in every ACK until it changes.  Even if many ACKs in a row
   are lost, as soon as one gets through, the ECI field it repeats from
   previous ACKs that didn't get through will update the sender on how
   many CE marks arrived since the last ACK got through.

   The sender will only lose a record of the arrival of a CE mark if all
   the ACKS are lost (and all of them were pure ACKs) for a stream of
   data long enough to contain 8 or more CE marks.  So, if the marking
   fraction was p, at least 8/p pure ACKs would have to be lost.  For
   example, if p was 5%, a sequence of 160 pure ACKs would all have to
   be lost.  To protect against such extremely unlikely events, if a re-
   ECN sender detects a sequence of pure ACKs has been lost it SHOULD
   assume the ECI field wrapped as many times as possible within the
   sequence.

   Specifically, if a re-ECN sender receives an ACK with an
   acknowledgement number that acknowledges L segments since the
   previous ACK but with a sequence number unchanged from the previously
   received ACK, it SHOULD conservatively assume that the ECI field
   incremented by D' = L - ((L-D) mod 8), where D is the apparent
   increase in the ECI field.  For example if the ACK arriving after 9
   pure ACK losses apparently increased ECI by 2, the assumed increment
   of ECI would still be 2.  But if ECI apparently increased by 2 after
   11 pure ACK losses, ECI should be assumed to have increased by 10.
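
   The conservative rule can be checked with a short sketch.  The
   function names are hypothetical.  Taking the apparent increase D
   modulo 8 is an assumption that follows from ECI being a 3-bit
   counter, and the sketch assumes L is at least D.

```python
def apparent_increment(eci, sender_ecc):
    """Apparent increase D of the 3-bit ECI field since the last ACK seen."""
    return (eci - sender_ecc) % 8


def safe_increment(l, d):
    """Conservative increment D' = L - ((L - D) mod 8), assuming L >= D.

    l is the number of segments newly acknowledged and d is the apparent
    increase in ECI.  The result is the largest value not exceeding l
    that is congruent to d modulo 8.
    """
    return l - ((l - d) % 8)


# The two worked examples from the text.
nine_losses = safe_increment(9, 2)     # ECI cannot have wrapped
eleven_losses = safe_increment(11, 2)  # ECI may have wrapped once
```

   With L = 9 there is no room for a full wrap on top of D = 2, so D'
   stays at 2.  With L = 11 one wrap fits, so D' becomes 10.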

   A re-ECN sender MAY implement a heuristic algorithm to predict beyond
   reasonable doubt that the ECI field probably did not wrap within a
   sequence of lost pure ACKs.  But such an algorithm is NOT REQUIRED.
   Such an algorithm MUST NOT be used unless it is proven to work even
   in the presence of correlation between high ACK loss rate on the back
   channel and high CE marking rate on the forward channel.

   Whatever assumption a re-ECN sender makes about potentially lost CE
   marks, both its congestion control and its re-echoing behaviour
   SHOULD be consistent with the assumption it makes.

4.1.2.  RECN-Co mode: Re-ECT Sender with a Vanilla or Nonce ECT Receiver

   If the half-connection is in RECN-Co mode, ECN feedback proceeds no
   differently to that of vanilla ECN.  In other words, the receiver
   sets the ECE flag repeatedly in the TCP header and the sender
   responds by setting the CWR flag.  Although RECN-Co mode is used when
   the receiver has not implemented the re-ECN protocol, the sender can
   infer enough from its vanilla ECN feedback to set or clear the RE
   flag reasonably well.  Specifically, every time the receiver toggles
   the ECE field from "0" to "1" (or a loss is detected), as well as
   setting CWR in the TCP flags, the re-ECN sender MUST blank the RE
   flag of the next packet to "0" as it would do in full RECN mode.
   Otherwise, the data sender SHOULD send all other packets with RE set
   to "1".  Once a flow is established, a re-ECN data sender in RECN-Co
   mode MUST always set the ECN field to ECT(1).
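
   A minimal sketch of this sender rule follows.  The class
   RecnCoSender and its methods are hypothetical.  Counting the owed
   blanked packets in a variable is an implementation assumption, made
   so that two triggers arriving before the next send are not merged
   into one.

```python
class RecnCoSender:
    """Hypothetical RE-flag logic for a sender in RECN-Co mode."""

    def __init__(self):
        self.last_ece = 0  # ECE value seen in the previous ACK
        self.owed = 0      # packets that still need the RE flag cleared

    def on_ack(self, ece, loss_detected=False):
        """Process feedback; return True if it triggers a re-echo."""
        triggered = (self.last_ece == 0 and ece == 1) or loss_detected
        self.last_ece = ece
        if triggered:
            self.owed += 1
        return triggered

    def next_re_flag(self):
        """Return the RE flag (0 or 1) for the next data packet."""
        if self.owed > 0:
            self.owed -= 1
            return 0  # blanked RE: the Re-Echo codepoint
        return 1      # RE set: the RECT codepoint
```

   Only the transition of ECE from 0 to 1 counts in this sketch.
   Repeated ACKs with ECE still set do not trigger another re-echo,
   which is why a second CE mark within the same round trip goes
   unnoticed in this mode.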

   If a CE marked packet arrives at the receiver within a round trip
   time of a previous mark, the receiver will still be echoing ECE for
   the last CE mark.  Therefore, such a mark will be missed by the
   sender.  Of course, this isn't of concern for congestion control, but
   it does mean that very occasionally the RE blanking fraction will be
   understated.  Therefore flows in RECN-Co mode may occasionally be
   mistaken for very lightly cheating flows and consequently might
   suffer a small number of packet drops through an egress dropper
   (Section 6.1.4).  We expect re-ECN would be deployed for some time
   before policers and droppers start to enforce it.  So, given there is
   not much ECN deployment yet anyway, this minor problem may affect
   only a very small proportion of flows, reducing to nothing over the
   years as vanilla ECN hosts upgrade.  The use of RECN-Co mode would
   need to be reviewed in the light of experience at the time of re-ECN
   deployment.

   RECN-Co mode is OPTIONAL.  Re-ECN implementers who want to keep their
   code simple, MAY choose not to implement this mode.  If they do not,
   a re-ECN sender SHOULD fall back to vanilla ECT mode in the presence
   of an ECN-capable receiver.  It MAY choose to fall back to the ECT-
   Nonce mode, but if re-ECN implementers don't want to be bothered with
   RECN-Co mode, they probably won't want to add an ECT-Nonce mode
   either.

4.1.2.1.  Re-ECN support for the ECN Nonce

   A TCP half-connection in RECN-Co mode MUST NOT support the ECN
   Nonce [RFC3540].  This means that the sending code of a re-ECN
   implementation will never need to include ECN Nonce support.  Re-ECN
   is intended to provide wider protection than the ECN nonce against
   congestion control misbehaviour, and re-ECN only requires support
   from the sender, therefore it is preferable to specifically rule out
   the need for dual sender implementations.  As a consequence, a re-ECN
   capable sender will never set ECT(0), so it will be easier for
   network elements to discriminate re-ECN traffic flows from other ECN
   traffic, which will always contain some ECT(0) packets.

   However, a re-ECN implementation MAY OPTIONALLY include receiving
   code that complies with the ECN Nonce protocol when interacting with
   a sender that supports the ECN nonce (rather than re-ECN), but this
   support is NOT REQUIRED.

   RFC3540 allows an ECN nonce sender to choose whether to sanction a
   receiver that does not ever set the nonce sum.  Given re-ECN is
   intended to provide wider protection than the ECN nonce against
   congestion control misbehaviour, implementers of re-ECN receivers MAY
   choose not to implement backwards compatibility with the ECN nonce
   capability.  This may be because they deem that the risk of sanctions
   is low, perhaps because significant deployment of the ECN nonce seems
   unlikely at implementation time.

4.1.3.  Capability Negotiation

   During the TCP hand-shake at the start of a connection, an originator
   of the connection (host A) with a re-ECN-capable transport MUST
   indicate it is Re-ECT by setting the TCP options NS=1, CWR=1 and
   ECE=1 in the initial SYN.

   A responding Re-ECT host (host B) MUST return a SYN ACK with flags
   CWR=1 and ECE=0.  The responding host MUST NOT set this combination
   of flags unless the preceding SYN has already indicated Re-ECT
   support as above.  A Re-ECT server (B) can use either setting of the
   NS flag combined with this type of SYN ACK in response to a SYN from
   a Re-ECT client (A).  Normally a Re-ECT server will reply to a Re-ECT
   client with NS=0, but in the special circumstance below it can return
   a SYN ACK with NS=1.

   If the initial SYN from Re-ECT client A is marked CE(-1), a Re-ECT
   server B MUST increment its local value of ECC.  But B cannot reflect
   the value of ECC in the SYN ACK, because it is still using the 3 bits
   to negotiate connection capabilities.  So, server B MUST set the
   alternative TCP header flags in its SYN ACK: NS=1, CWR=1 and ECE=0.

   These handshakes are summarised in Table 5 below, with X meaning
   `don't care'.  The handshakes used for the other flavours of ECN are
   also shown for comparison.  To compress the width of the table, the
   headings of the first four columns have been severely abbreviated, as
   follows:

      R: *R*e-ECT

      N: ECT-*N*once (RFC3540)

      E: *E*CT (RFC3168)

      I: Not-ECT (*I*mplicit congestion notification).

   These correspond with the same headings used in Table 4.  Indeed, the
   resulting modes in the last two columns of the table below are a more
   comprehensive way of saying the same thing as Table 4.

   +----+---+---+---+------------+-------------+-----------+-----------+
   | R  | N | E | I |   SYN A-B  | SYN ACK B-A |  A-B Mode |  B-A Mode |
   +----+---+---+---+------------+-------------+-----------+-----------+
   |    |   |   |   | NS CWR ECE |  NS CWR ECE |           |           |
   | AB |   |   |   |  1   1   1 |  X   1   0  |    RECN   |    RECN   |
   | A  | B |   |   |  1   1   1 |  1   0   1  |  RECN-Co  | ECT-Nonce |
   | A  |   | B |   |  1   1   1 |  0   0   1  |  RECN-Co  |    ECT    |
   | A  |   |   | B |  1   1   1 |  0   0   0  |  Not-ECT  |  Not-ECT  |
   | B  | A |   |   |  0   1   1 |  0   0   1  | ECT-Nonce |  RECN-Co  |
   | B  |   | A |   |  0   1   1 |  0   0   1  |    ECT    |  RECN-Co  |
   | B  |   |   | A |  0   0   0 |  0   0   0  |  Not-ECT  |  Not-ECT  |
   +----+---+---+---+------------+-------------+-----------+-----------+

      Table 5: TCP Capability Negotiation between Originator (A) and
                               Responder (B)

   As soon as a re-ECN capable TCP server receives a SYN, it MUST set
   its two half-connections into the modes given in Table 5.  As soon as
   a re-ECN capable TCP client receives a SYN ACK, it MUST set its two
   half-connections into the modes given in Table 5.  The half-
   connections will remain in these modes for the rest of the
   connection, including for the third segment of TCP's three-way hand-
   shake (the ACK).

   {ToDo: Consider SYNs within a connection.}

   Recall that, if the SYN ACK reflects the same flag settings as the
   preceding SYN (because there is a broken legacy implementation that
   behaves this way), RFC3168 specifies that the whole connection MUST
   revert to Not-ECT.

   Also note that, whenever the SYN flag of a TCP segment is set
   (including when the ACK flag is also set), the NS, CWR and ECE flags
   MUST NOT be interpreted as the 3-bit ECI value, which is only set as
   a copy of the local ECC value in non-SYN packets.
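
   The originator's side of Table 5 can be sketched as a small
   decision function, together with the SYN ACK flags a Re-ECT
   responder would choose.  Both function names are hypothetical.
   Treating any flag combination not listed in Table 5 as Not-ECT is
   an assumption of this sketch, not a rule stated above.

```python
def client_modes(ns, cwr, ece):
    """Return (A-B mode, B-A mode) for a Re-ECT originator A.

    A is assumed to have sent its SYN with NS=1, CWR=1, ECE=1.  The
    arguments are the NS, CWR and ECE flags of the SYN ACK from B.
    """
    if (ns, cwr, ece) == (1, 1, 1):
        # SYN ACK reflects the SYN flags: broken legacy implementation,
        # so the whole connection reverts to Not-ECT.
        return ("Not-ECT", "Not-ECT")
    if (cwr, ece) == (1, 0):
        return ("RECN", "RECN")        # NS is don't care here
    if (ns, cwr, ece) == (1, 0, 1):
        return ("RECN-Co", "ECT-Nonce")
    if (ns, cwr, ece) == (0, 0, 1):
        return ("RECN-Co", "ECT")
    return ("Not-ECT", "Not-ECT")      # includes 0,0,0 and unlisted cases


def server_synack_flags(syn_ce_marked):
    """(NS, CWR, ECE) a Re-ECT server B returns to a Re-ECT client's SYN."""
    return (1, 1, 0) if syn_ce_marked else (0, 1, 0)
```

   The reflected-flags check comes first because NS=1, CWR=1, ECE=1 in
   a SYN ACK would otherwise fall through to the Not-ECT default only
   by accident.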

4.1.4.  Extended ECN (EECN) Field Settings during Flow Start or after
        Idle Periods

   If the originator (A) of a TCP connection supports re-ECN it MUST set
   the extended ECN (EECN) field in the IP header of the initial SYN
   packet to the feedback not established (FNE) codepoint.

   FNE is a new extended ECN codepoint defined by this specification
   (Section 3.2).  The feedback not established (FNE) codepoint is used
   when the transport does not have the benefit of ECN feedback so it
   cannot decide whether to set or clear the RE flag.

   If after receiving a SYN the server B has set its sending half-
   connection into RECN mode or RECN-Co mode, it MUST set the extended
   ECN field in the IP header of its SYN ACK to the feedback not
   established (FNE) codepoint.  Note the careful wording here, which
   means that Re-ECT server B MUST set FNE on a SYN ACK whether it is
   responding to a SYN from a Re-ECT client or from a client that is
   merely ECN-capable.

   The original ECN specification [RFC3168] required SYNs and SYN ACKs
   to use the Not-ECT codepoint of the ECN field.  The aim was to
   prevent well-known DoS attacks such as SYN flooding being able to
   gain from the advantage that ECN capability afforded over drop at
   ECN-capable routers.

   For a SYN ACK, Kuzmanovic [I-D.ietf-tsvwg-ecnsyn] has shown that this
   caution was unnecessary, and proposes to allow a SYN ACK to be ECN-
   capable to improve performance.  We have gone further by proposing to
   make the initial SYN ECN-capable too.  By stipulating the FNE
   codepoint for the initial SYN, we comply with RFC3168 in word but not
   in spirit, because we have indeed set the ECN field to Not-ECT, but
   we have extended the ECN field with another bit.  And it will be seen
   (Section 5.3) that we have defined one setting of that bit to mean an
   ECN-capable transport.  Therefore, by proposing that the FNE
   codepoint MUST be used on the initial SYN of a connection, we have
   (deliberately) made the initial SYN ECN-capable.  Section 5.4
   justifies deciding to make the initial SYN ECN-capable.

   Once a TCP half connection is in RECN mode or RECN-Co mode, FNE will
   have already been set on the initial SYN and possibly the SYN ACK as
   above.  But each re-ECN sender will have to set FNE cautiously on a
   few data packets as well, given a number of packets will usually have
   to be sent before sufficient congestion feedback is received.  The
   behaviour will be different depending on the mode of the half-
   connection:

   RECN mode: Given the constraints on TCP's initial window [RFC3390]
      and its exponential window increase during slow start
      phase [RFC2581], it turns out that the sender SHOULD set FNE on
      the first and third data packets in its flow, assuming equal sized
      data packets once a flow is established.  Appendix D presents the
      calculation that led to this conclusion.  Below, after running
      through the start of an example TCP session, we give the intuition
      learned from that calculation.

   RECN-Co mode: A re-ECT sender that switches into re-ECN compatibility
      mode or into Not-ECT mode (because it has detected the
      corresponding host is not re-ECN capable) MUST limit its initial
      window to 1 segment.  The reasoning behind this constraint is
      given in Section 5.4.  Having set this initial window, a re-ECN
      sender in RECN-Co mode SHOULD set FNE on the first and third data
      packets in a flow, as for RECN mode.

   +----+------+----------------+-------+-------+---------------+------+
   |    | Data | TCP A(Re-ECT)  | IP A  | IP B  | TCP B(Re-ECT) | Data |
   +----+------+----------------+-------+-------+---------------+------+
   |    | Byte |  SEQ  ACK CTL  | EECN  | EECN  |  SEQ  ACK CTL | Byte |
   | -- | ---- | -------------  | ----- | ----- | ------------- | ---- |
   |  1 |      | 0100      SYN  | FNE   | -->   |      R.ECC=0  |      |
   |    |      |    CWR,ECE,NS  |       |       |               |      |
   |  2 |      |      R.ECC=0   | <--   | FNE   | 0300 0101     |      |
   |    |      |                |       |       |   SYN,ACK,CWR |      |
   |  3 |      | 0101 0301 ACK  | RECT  | -->   |      R.ECC=0  |      |
   |  4 | 1000 | 0101 0301 ACK  | FNE   | -->   |      R.ECC=0  |      |
   |  5 |      |      R.ECC=0   | <--   | FNE   | 0301 1102 ACK | 1460 |
   |  6 |      |      R.ECC=0   | <--   | RECT  | 1762 1102 ACK | 1460 |
   |  7 |      |      R.ECC=0   | <--   | FNE   | 3222 1102 ACK | 1460 |
   |  8 |      | 1102 1762 ACK  | RECT  | -->   |      R.ECC=0  |      |
   |  9 |      |      R.ECC=0   | <--   | RECT  | 4682 1102 ACK | 1460 |
   | 10 |      |      R.ECC=0   | <--   | RECT  | 6142 1102 ACK | 1460 |
   | 11 |      | 1102 3222 ACK  | RECT  | -->   |      R.ECC=0  |      |
   | 12 |      |      R.ECC=0   | <--   | RECT  | 7602 1102 ACK | 1460 |
   | 13 |      |      R.ECC=1   | <*-   | RECT  | 9062 1102 ACK | 1460 |
   |    |      | ...            |       |       |               |      |
   +----+------+----------------+-------+-------+---------------+------+

                      Table 6: TCP Session Example #1

   Table 6 shows an example TCP session, where the server B sets FNE on
   its first and third data packets (lines 5 & 7) as well as on the
   initial SYN ACK as previously described.  The left hand half of the
   table shows the relevant settings of headers sent by client A in
   three layers: the TCP payload size; TCP settings; then IP settings.
   The right hand half gives equivalent columns for server B. The only
   TCP settings shown are the sequence number (SEQ), acknowledgement
   number (ACK) and the relevant control (CTL) flags that A sets in the
   TCP header.  The IP columns show the setting of the extended ECN
   (EECN) field.

   Also shown on the receiving side of the table is the value of the
   receiver's echo congestion counter (R.ECC) after processing the
   incoming EECN header.  Note that, once a host sets a half-connection
   into RECN mode, it MUST initialise its local value of ECC to zero.

   The intuition that Appendix D gives for why a sender should set FNE
   on the first and third data packets is as follows.  At line 13, a
   packet sent by B is shown with an '*', which means it has been
   congestion marked by an intermediate router from RECT to CE(-1).  On
   receiving this CE marked packet, client A increments its ECC counter
   to 1 as shown.  This was the 7th data packet B sent, but before
   feedback about this event returns to B, it might well have sent many
   more packets.  Indeed, during exponential slow start, about as many
   packets will be in flight (unacknowledged) as have been acknowledged.
   So, when the feedback from the congestion event on B's 7th segment
   returns, B will have sent about 7 further packets that will still be
   in flight.  At that stage, B's best estimate of the network's packet
   marking fraction will be 1/7.  So, as B will have sent about 14
   packets, it should have already marked 2 of them as FNE in order to
   have marked 1/7; hence the need to have set the first and third data
   packets to FNE.
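
   The arithmetic in this intuition can be written out as a short
   sketch.  The function names are hypothetical.  The sketch assumes,
   as the text does, that during slow start about as many packets are
   in flight as have been acknowledged.  It is not a substitute for
   the calculation in Appendix D.

```python
from fractions import Fraction


def fne_needed(first_marked_packet):
    """FNE packets needed by the time feedback on the first mark returns."""
    n = first_marked_packet
    sent = 2 * n               # about n acknowledged plus n still in flight
    estimate = Fraction(1, n)  # one mark seen among the first n packets
    return sent * estimate     # packets that should already carry +1 worth


def eecn_at_flow_start(data_packet_number):
    """Codepoint for the Nth data packet (1-based) before feedback arrives."""
    return "FNE" if data_packet_number in (1, 3) else "RECT"
```

   For the example in Table 6 the first mark hits the 7th data packet,
   so fne_needed(7) gives 2, matching the two FNE packets that server B
   sent on lines 5 and 7.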

   Client A's behaviour in Table 6 also shows FNE being set on the first
   SYN and the first data packet (lines 1 & 4), but in this case it
   sends no more data packets, so of course, it cannot, and does not
   need to, set FNE again.  Note that in the A-B direction there is no
   need to set FNE on the third part of the three-way hand-shake (line
   3---the ACK).

   Note that in this section we have used the word SHOULD rather than
   MUST when specifying how to set FNE on data segments before positive
   congestion feedback arrives (but note that the word MUST was used for
   FNE on the SYN and SYN ACK).  FNE is only RECOMMENDED for the first
   and third data segments to entertain the possibility that the TCP
   transport has the benefit of other knowledge of the path, which it
   re-uses from one flow for the benefit of a newly starting flow.  For
   instance, one flow can re-use knowledge of other flows between the
   same hosts if using a Congestion Manager [RFC3124] or when a proxy
   host aggregates congestion information for large numbers of flows.

   After an idle period of more than 1 second, a re-ECN sender transport
   MUST set the EECN field of the next packet it sends to FNE.  Note
   that this next packet may be sent a very long time later; a packet
   does NOT have to be sent after 1 second of idling.  In order that the
   design of network policers can be deterministic, this specification
   deliberately puts an absolute lower limit on how long a connection
   can be idle before the packet that resumes the connection must be set
   to FNE, rather than relating it to the connection round trip time.
   We use the lower bound of the retransmission timeout (RTO) [RFC2988],
   which is commonly used as the idle period before TCP must reduce to
   the restart window [RFC2581].  Note our specification of re-ECN's
   idle period is NOT intended to change the idle period for TCP's
   restart, nor indeed for any other purposes.
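
   The idle rule amounts to a fixed threshold test, as in the sketch
   below.  The names IDLE_LIMIT and eecn_after_idle are hypothetical,
   and times are assumed to be in seconds from any monotonic clock.

```python
IDLE_LIMIT = 1.0  # seconds; an absolute limit, not a function of the RTT


def eecn_after_idle(now, last_send_time, otherwise="RECT"):
    """Choose the codepoint for the packet that resumes a connection.

    Returns "FNE" if the sender has been idle for more than IDLE_LIMIT
    seconds, else the codepoint the normal rules would have chosen.
    """
    if now - last_send_time > IDLE_LIMIT:
        return "FNE"
    return otherwise
```

   No timer needs to fire at the 1 second point in this sketch.  The
   test is made only when the next packet is actually sent.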

   {ToDo: Describe how the sender falls back to legacy modes if packets
   don't appear to be getting through (to work round firewalls
   discarding packets they consider unusual).}

4.1.5.  Pure ACKS, Retransmissions, Window Probes and Partial ACKs

   A re-ECN sender MUST clear the RE flag to "0" and set the ECN field
   to Not-ECT in pure ACKs, retransmissions and window probes, as
   specified in [RFC3168].  Our eventual goal is for all packets to be
   sent with re-ECN enabled, and we believe the semantics of the ECI
   field go a long way towards being able to achieve this.  However, we
   have not completed a full security analysis for these cases,
   therefore, currently we merely re-state current practice.

   We must also reconcile the facts that congestion marking is applied
   to packets but acknowledgements cover octet ranges and acknowledged
   octet boundaries need not match the transmitted boundaries.  The
   general principle we work to is to remain compatible with TCP's
   congestion control which is driven by congestion events at packet
   granularity while at the same time aiming to blank the RE flag on at
   least as many octets in a flow as have been marked CE.

   Therefore, a re-ECN TCP receiver MUST increment its ECC value as many
   times as CE marked packets have been received.  And that value MUST
   be echoed to the sender in the first available ACK using the ECI
   field.  This ensures the TCP sender's congestion control receives
   timely feedback on congestion events at the same packet granularity
   that they were generated on congested routers.

   Then, a re-ECN sender stores the difference D between its own ECC
   value and the incoming ECI field by incrementing a counter R.  Then,
   R is decremented by 1 for each subsequent packet that is sent with
   the RE flag blanked, until R is no longer positive.  Using this
   technique, whenever a re-ECN transport sends a not re-ECN capable
   (NRECN) packet (e.g. a retransmission), the remaining packets
   required to have the RE flag blanked will be automatically carried
   over to subsequent packets, through the variable R.

   This does not ensure precisely the same number of octets have RE
   blanked as were CE marked.  But we believe positive errors will
   cancel negative over a long enough period. {ToDo: However, more
   research is needed to prove whether this is so.  If it is not, it may
   be necessary to increment and decrement R in octets rather than
   packets, by incrementing R as the product of D and the size in octets
   of packets being sent (typically the MSS).}

4.2.  Other Transports

4.2.1.  Guidelines for Adding Re-ECN to Other Transports

   Re-ECT sender transports that have established that the receiver
   transport is at least ECN-capable (not necessarily re-ECN capable)
   MUST blank the RE codepoint in packets carrying at least as many
   octets as arrive at the receiver with the CE codepoint set.  Re-ECN-
   capable sender transports should always initialise the ECN field to
   the ECT(1) codepoint once a flow is established.

   If the sender transport does not have sufficient feedback to even
   estimate the path's CE rate, it SHOULD set FNE continuously.  If the
   sender transport has some, perhaps stale, feedback to estimate that
   the path's CE rate is almost certainly less than E%, the transport
   MAY blank RE in packets for E% of sent octets, and set the RECT
   codepoint for the remainder.
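
   One possible way for such a transport to choose the codepoint of each
   outgoing packet is sketched below.  The function name, its parameters
   and the octet-accounting approach are assumptions made for
   illustration; any scheme that blanks RE on at least E% of sent octets
   would serve.

```python
def choose_codepoint(e_percent, sent_octets, blanked_octets, pkt_len):
    """Pick the extended ECN codepoint for the next packet.

    e_percent is the estimated upper bound on the path's CE rate, or
    None when the transport has no feedback at all.
    """
    if e_percent is None:
        return "FNE"  # no feedback: set FNE continuously
    # Octets that should have been sent with RE blanked once this
    # packet has gone out.
    target = (sent_octets + pkt_len) * e_percent / 100.0
    if blanked_octets < target:
        return "Re-Echo"  # ECT(1) with RE blanked
    return "RECT"         # ECT(1) with RE set
```

   The caller is expected to add pkt_len to sent_octets after every
   packet and to blanked_octets whenever "Re-Echo" is returned.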

   {ToDo: Give a brief outline of what would be expected for each of the
   following:

   o  UDP fire and forget (e.g.  DNS)

   o  UDP streaming with no feedback

   o  UDP streaming with feedback

   o  DCCP [RFC4340] }

   o  RSVP and/or NSIS: A separate I-D has been submitted [Re-PCN]
      describing how re-ECN can be used in an edge-to-edge rather than
      end-to-end scenario.  It can then be used by downstream networks
      to police whether upstream networks are blocking new flow
      reservations when downstream congestion is too high, even though
      the congestion is in other operators' downstream networks.  This
      relates to current work in progress on Admission Control over
      Diffserv using Pre-Congestion Notification, being reported to the
      IETF TSVWG [CL-deploy].

5.  Network Layer

5.1.  Re-ECN IPv4 Wire Protocol

   The wire protocol of the ECN field in the IP header remains largely
   unchanged from [RFC3168].  However, an extension to the ECN field we
   call the RE (re-ECN extension) flag (Section 3.2) is defined in this
   document.  It doubles the extended ECN codepoint space, giving 8
   potential codepoints.  The semantics of the extra codepoints are
   backward compatible with the semantics of the 4 original codepoints
   [RFC3168] (Section 7.1 collects together and summarises all the
   changes defined in this document).

   For IPv4, this document proposes that the new RE control flag will be
   positioned where the `reserved' control flag was at bit 48 of the
   IPv4 header (counting from 0).  Alternatively, some would call this
   bit 0 (counting from 0) of byte 7 (counting from 1) of the IPv4
   header (Figure 5).

             0   1   2
           +---+---+---+
           | R | D | M |
           | E | F | F |
           +---+---+---+

   Figure 5: New Definition of the Re-ECN Extension (RE) Control Flag at
   the Start of Byte 7 of the IPv4 Header

   The semantics of the RE flag are described in outline in Section 3
   and specified fully in Section 4.  The RE flag is always considered
   in conjunction with the 2-bit ECN field, as if they were concatenated
   together to form a 3-bit extended ECN field.  If the ECN field is set
   to either the ECT(1) or CE codepoint, when the RE flag is blanked
   (cleared to "0") it represents a re-echo of congestion experienced by
   an earlier packet.  If the ECN field is set to the Not-ECT codepoint,
   when the RE flag is set to "1" it represents the feedback not
   established (FNE) codepoint, which signals that the packet was sent
   without the benefit of congestion feedback.
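
   As an illustration of how the 3-bit extended ECN field could be read
   from a raw IPv4 header, consider the sketch below.  The function and
   table names are hypothetical; the bit positions follow the text above
   (ECN field in the two low-order bits of the second octet, RE flag at
   bit 48) and the codepoint names follow Table 7.

```python
# (ECN field, RE flag) -> extended ECN codepoint name, as in Table 7
EECN_NAMES = {
    (0b01, 0): "Re-Echo",
    (0b00, 1): "FNE",
    (0b11, 0): "CE(0)",
    (0b01, 1): "RECT",
    (0b11, 1): "CE(-1)",
    (0b10, 1): "CU",
    (0b10, 0): "Legacy ECN",
    (0b00, 0): "Not-RECT",
}


def extended_ecn(ipv4_header):
    """Return (ecn, re_flag, name) for a raw IPv4 header given as bytes."""
    if len(ipv4_header) < 20:
        raise ValueError("IPv4 header must be at least 20 octets")
    ecn = ipv4_header[1] & 0b11          # low two bits of the second octet
    re_flag = (ipv4_header[6] >> 7) & 1  # bit 48: first bit of the 7th octet
    return ecn, re_flag, EECN_NAMES[(ecn, re_flag)]
```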

   It is believed that the FNE codepoint can simultaneously serve other
   purposes, particularly where the start of a flow needs distinguishing
   from packets later in the flow.  For instance it would have been
   useful to identify new flows for tag switching and might enable
   similar developments in the future if it were adopted.  It is similar
   to the state set-up bit idea designed to protect against memory
   exhaustion attacks.  This idea was proposed informally by David Clark
   and documented by Handley and Greenhalgh [Steps_DoS].  The FNE
   codepoint can be thought of as a `soft-state set-up flag', because it
   is idempotent (i.e. one occurrence of the flag is sufficient but
   further occurrences achieve the same effect if previous ones were
   lost).

   We expect there will be other claims pending on the use of bit 48.
   We know of at least two [ARI05], [RFC3514], but neither has been
   pursued in the IETF so far, although the present proposal would meet
   the needs of the former.

   The security flag proposal (commonly known as the evil bit) was
   published on 1 April 2003 as Informational RFC 3514, but it was not
   adopted due to confusion over whether evil-doers might set it
   inappropriately.  The present proposal is backward compatible with
   RFC3514 because if re-ECN compliant senders were benign they would
   correctly clear the evil bit to honestly declare that they had just
   received congestion feedback.  Whereas evil-doers would hide
   congestion feedback by setting the evil bit continuously, or at least
   more often than they should.  So, evil senders can be identified,
   because they declare that they are good less often than they should.

5.2.  Re-ECN IPv6 Wire Protocol

   For IPv6, this document proposes that the new RE control flag will be
   positioned as the first bit of the option field of a new Congestion
   hop by hop option header (Figure 6).

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |  Next Header  |  Hdr ext Len  |  Option Type  |  Option Len   |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |R|                     Reserved for future use                 |
       |E|                                                             |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 6: Definition of a New IPv6 Congestion Hop by Hop Option
   Header containing the Re-ECN Extension (RE) Control Flag

                0 1 2 3 4 5 6 7
               +-+-+-+-+-+-+-+-+
               |AIU|C|Option ID|
               +-+-+-+-+-+-+-+-+

   Figure 7: Congestion Hop by Hop Option Type Encoding

   The Hop-by-Hop Options header enables packets to carry information to
   be examined and processed by routers or nodes along the packet's
   delivery path, including the source and destination nodes.  For re-
   ECN, the two bits of the Action If Unrecognized (AIU) flag of the
   Congestion extension header MUST be set to "00" meaning if
   unrecognized `skip over option and continue processing the header'.
   Then, any router or receiver not upgraded with the optional re-ECN
   features described in this memo will simply ignore this header.  But
   routers with these optional re-ECN features or a re-ECN policing
   function will process this Congestion extension header.

   The `C' flag MUST be set to "1" to specify that the Option Data
   (currently only the RE control flag) can change en-route to the
   packet's final destination.  This ensures that, when an
   Authentication header (AH [RFC2402]) is present in the packet, for
   any option whose data may change en-route, its entire Option Data
   field will be treated as zero-valued octets when computing or
   verifying the packet's authenticating value.

   Although the RE control flag should not be changed along the path, we
   expect that the rest of this option field that is currently `Reserved
   for future use' could be used for a multi-bit congestion notification
   field which we would expect to change en route.  As the RE flag does
   not need end-to-end authentication, we set the C flag to '1'.

   {ToDo: A Congestion Hop by Hop Option ID will need to be registered
   with IANA.}
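
   The encoding of the Option Type octet and of the option data could be
   sketched as below.  This is illustrative only: the function names are
   hypothetical, no Option ID has been registered (so any value passed
   in is a placeholder), and the sketch assumes the reserved bits are
   sent as zero.

```python
def congestion_option_type(option_id):
    """Build the Option Type octet: AIU (2 bits), C (1 bit), ID (5 bits)."""
    if not 0 <= option_id < 32:
        raise ValueError("Option ID must fit in 5 bits")
    aiu = 0b00  # if unrecognized, skip over the option and continue
    c = 1       # Option Data may change en route
    return (aiu << 6) | (c << 5) | option_id


def congestion_option_data(re_flag):
    """Build the 4 octets of option data carrying the RE flag."""
    # RE is the first (most significant) bit; the remaining 31 bits are
    # reserved and are sent as zero here (an assumption of this sketch).
    return ((re_flag & 1) << 31).to_bytes(4, "big")
```

   With a placeholder Option ID of 7, congestion_option_type(7) yields
   0x27, and congestion_option_data(1) yields the octets 80 00 00 00.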

5.3.  Router Forwarding Behaviour

   Re-ECN works well without modifying the forwarding behaviour of any
   routers.  However, below, two OPTIONAL changes to forwarding
   behaviour are defined, which respectively enhance performance and
   improve a router's discrimination against flooding attacks.  They are
   both OPTIONAL additions that we propose MAY apply by default to all
   Diffserv per-hop scheduling behaviours (PHBs) [RFC2475] and ECN
   marking behaviours [RFC3168].  Specifications for PHBs MAY define
   different forwarding behaviours from this default, but this is NOT
   REQUIRED.  [Re-PCN] is one example.

   FNE indicates ECT:

      The FNE codepoint tells a router to assume that the packet was
      sent by an ECN-capable transport (see Section 5.4).  Therefore an
      FNE packet MAY be marked rather than dropped.  Note that the FNE
      codepoint has been intentionally chosen so that, to legacy routers
      (which do not inspect the RE flag), an FNE packet appears to be
      Not-ECT, so it will be dropped by legacy AQM algorithms.

      A network operator MUST NOT configure a router to ECN mark rather
      than drop FNE packets unless it can guarantee that FNE packets
      will be rate limited, either locally or upstream.  The ingress
      policers discussed in Section 6.1.5 would count as rate limiters
      for this purpose.

   Preferential Drop: If a re-ECN capable router experiences very high
      load so that it has to drop arriving packets (e.g. a DoS attack),
      it MAY preferentially drop packets within the same Diffserv PHB
      using the preference order for extended ECN codepoints given in
      Table 7.  Preferential dropping can be difficult to implement on
      some hardware, but if feasible it would discriminate against
      attack traffic if done as part of the overall policing framework
      of Section 6.1.3.  If nowhere else, routers at the egress of a
      network SHOULD implement preferential drop (stronger than the MAY
      above).  For simplicity, preferences 4 & 5 MAY be merged into one
      preference level.

   +-------+-----+-----------+-------+------------+--------------------+
   |  ECN  |  RE | Extended  | Worth | Drop Pref  |   Re-ECN meaning   |
   | field | bit | ECN       |       | (1 = drop  |                    |
   |       |     | codepoint |       | 1st)       |                    |
   +-------+-----+-----------+-------+------------+--------------------+
   |   01  |  0  | Re-Echo   | +1    | 5/4        |      Re-echoed     |
   |       |     |           |       |            |   congestion and   |
   |       |     |           |       |            |        RECT        |
   |   00  |  1  | FNE       | +1    | 4          |    Feedback not    |
   |       |     |           |       |            |     established    |
   |   11  |  0  | CE(0)     | 0     | 3          |  Re-Echo canceled  |
   |       |     |           |       |            |    by congestion   |
   |       |     |           |       |            |     experienced    |
   |   01  |  1  | RECT      | 0     | 3          |   Re-ECN capable   |
   |       |     |           |       |            |      transport     |
   |   11  |  1  | CE(-1)    | -1    | 3          |     Congestion     |
   |       |     |           |       |            |     experienced    |
   |   10  |  1  | --CU--    | n/a   | 2          |  Currently Unused  |
   |   10  |  0  | ---       | n/a   | 2          |   Legacy ECN use   |
   |       |     |           |       |            |        only        |
   |   00  |  0  | Not-RECT  | n/a   | 1          | Not re-ECN-capable |
   |       |     |           |       |            |      transport     |
   +-------+-----+-----------+-------+------------+--------------------+

       Table 7: Drop Preference of EECN Codepoints (Sorted by `Worth')

      The above drop preferences are arranged to preserve packets with
      more positive worth (Section 3.4), given senders of positive
      packets must have honestly declared downstream congestion.  This
      is explained fully in Section 6 on applications, particularly when
      the application of re-ECN to protect against DDoS attacks is
      described.
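
   A minimal sketch of how a router might select which queued packet to
   drop under this preference order is given below.  The names are
   hypothetical, the queue is simplified to a list of codepoint names,
   and the "5/4" entry for Re-Echo in Table 7 is taken as 5; breaking
   ties in favour of dropping the earliest queued packet is also an
   assumption of the sketch.

```python
# Drop preference per extended ECN codepoint (1 = drop first)
DROP_PREF = {
    "Re-Echo": 5,
    "FNE": 4,
    "CE(0)": 3,
    "RECT": 3,
    "CE(-1)": 3,
    "CU": 2,
    "Legacy ECN": 2,
    "Not-RECT": 1,
}


def pick_victim(queue):
    """Return the index of the packet to drop from a non-empty queue.

    queue is a list of extended ECN codepoint names for packets that
    all belong to the same Diffserv PHB.
    """
    # min() returns the first index with the lowest preference number
    return min(range(len(queue)), key=lambda i: DROP_PREF[queue[i]])
```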

5.4.  Justification for Setting the First SYN to FNE

   Congested routers may mark an FNE packet to CE(-1) (Section 5.3), and
   the initial SYN MUST be set to FNE by Re-ECT client A
   (Section 4.1.4).  So an initial SYN may be marked CE(-1) rather than
   dropped.  This seems dangerous, because the sender has not yet
   established whether the receiver is a legacy one that does not
   understand congestion marking.  It also seems to allow malicious
   senders to take advantage of ECN marking to avoid so much drop when
   launching SYN flooding attacks.  Below we explain the features of the
   protocol design that remove both these dangers.

   ECN-capable initial SYN with a Not-ECT server: If the TCP server B is
      re-ECN capable, provision is made for it to feedback a possible
      congestion marked SYN in the SYN ACK (Section 4.1.4).  But if the
      TCP client A finds out from the SYN ACK that the server was not
      ECN-capable, the TCP client MUST consider the first SYN as
      congestion marked before setting itself into Not-ECT mode.
      Section 4.1.4 mandates that such a TCP client MUST also set its
      initial window to 1 segment.  In this way we remove the need to
      cautiously avoid setting the first SYN to Not-RECT.  This will
      give worse performance while deployment is patchy, but better
      performance once deployment is widespread.

   SYN flooding attacks can't exploit ECN-capability: Malicious hosts
      may think they can use the advantage that ECN-marking gives over
      drop in launching classic SYN-flood attacks.  But Section 5.3
      mandates that a router MUST only be configured to treat packets
      with the FNE codepoint as ECN-capable if FNE packets are rate
      limited.  Introduction of the FNE codepoint was a deliberate move
      to enable transport-neutral handling of flow-start and flow state
      set-up in the IP layer where it belongs.  It then becomes possible
      to protect against flooding attacks of all forms (not just SYN
      flooding) without transport-specific inspection for things like
      the SYN flag in the TCP headers.  Then, for instance, SYN flooding
      attacks using IPSec ESP encryption can also be rate limited at the
      IP layer.

   It might seem pedantic going to all this trouble to enable ECN on the
   initial packet of a flow, but it is motivated by a much wider concern
   to ensure safe congestion control will still be possible even if the
   application mix on the Internet evolves to the point where the
   majority of flows consist of a single window or even a single packet.
   It also allows denial of service attacks to be more easily isolated
   and prevented.

5.5.  Control and Management

5.5.1.  Negative Balance Warning

   A new ICMP message type is being considered so that a dropper can
   warn the apparent sender of a flow that it has started to sanction
   the flow.  The message would have similar semantics to the `Time
   exceeded' ICMP message type.  To ensure the sender has to invest some
   work before the network will generate such a message, a dropper
   SHOULD only send such a message for flows that have demonstrated that
   they have started correctly by establishing a positive record, but
   have later gone negative.  The threshold is up to the implementation.
   The purpose of the message is to distinguish drops due to sanctions
   from drops due to other causes, such as congestion or transmission
   losses.  The dropper would send the message to the sender of the
   flow, not the receiver.
   If we did define this message type, it would be REQUIRED for all re-
   ECT senders to parse and understand it.  Note that a sender MUST only
   use this message to explain why losses are occurring.  A sender MUST
   NOT take this message to mean that losses have occurred that it was
   not aware of.  Otherwise, spoof messages could be sent by malicious
   sources to slow down a sender (cf. ICMP source quench).

   However, the need for this message type is not yet confirmed, as we
   are considering how to prevent it being used by malicious senders to
   scan for droppers and to test their threshold settings. {ToDo:
   Complete this section.}
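
   The rule for when a dropper would generate a warning can be sketched
   as below.  This is a sketch under stated assumptions, not a
   definition: the class name, the default threshold of -3 and the
   choice to warn at most once per flow are all hypothetical, and the
   worth of each packet (+1, 0 or -1) is taken from Table 7.

```python
class FlowRecord:
    """Per-flow state at a dropper deciding when to send a warning."""

    def __init__(self, threshold=-3):
        self.threshold = threshold  # implementation-defined (assumed)
        self.balance = 0            # running sum of packet worth
        self.was_positive = False   # has the flow ever been in credit?
        self.warned = False         # assumption: warn at most once

    def on_packet(self, worth):
        """Account for one packet; return True if a warning is due."""
        self.balance += worth
        if self.balance > 0:
            self.was_positive = True
        if (self.was_positive and not self.warned
                and self.balance <= self.threshold):
            self.warned = True
            return True
        return False
```

   A flow that never establishes a positive record is never warned, so a
   sender has to invest some work before the network will generate a
   message on its behalf.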

5.5.2.  Rate Response Control

   The incentive framework of Section 6.1.3 implies there may be a need
   for a sender to send a request to an ingress policer asking that it
   be allowed to apply a non-default response to congestion (where TCP-
   friendly is assumed to be the default).  This would require the
   sender to know what message format(s) to use and to be able to
   discover how to address the policer.  The required control
   protocol(s) are outside the scope of this document, but will require
   definition elsewhere.

   The policer is likely to be local to the sender and inline, probably
   at the ingress interface to the internetwork.  So, discovery should
   not be hard.  A variety of control protocols already exist for some
   widely used rate-responses to congestion.  For instance DCCP
   congestion control identifiers (CCIDs [RFC4340]) fulfil this role and
   so does QoS signalling (e.g. an RSVP request for controlled load
   service is equivalent to a request for no rate response to
   congestion, but with admission control).

5.6.  IP in IP Tunnels

   For re-ECN to work correctly through IP in IP tunnels, it needs
   slightly different tunnel handling to regular ECN [RFC3168].
   Ideally, for re-ECN to work through a tunnel, the tunnel entry should
   copy both the RE flag and the ECN field from the inner to the outer
   IP header.  Then at the tunnel exit, any congestion marking of the
   outer ECN field should overwrite the inner ECN field (unless the
   inner field is Not-ECT in which case an alarm should be raised).  The
   RE flag shouldn't change along a path, so the outer RE flag should be
   the same as the inner.  If it isn't, a management alarm should be
   raised.  This behaviour is the same as the full-functionality variant
   of [RFC3168] at tunnel exit, but different at tunnel entry.

   If tunnels are left as they are specified in [RFC3168], whether the
   limited or full-functionality variants are used, a problem arises
   with re-ECN if a tunnel crosses an inter-domain boundary, because the
   difference between positive and negative markings will not be
   correctly accounted for.  In a limited functionality ECN tunnel, the
   flow will appear to be legacy traffic, and therefore may be wrongly
   rate limited.  In a full-functionality ECN tunnel, the result will
   depend on whether the tunnel entry copies the inner RE flag to the
   outer header or the RE flag in the outer header is always cleared.
   If the former, the flow will tend to be too positive when accounted
   for at borders.  If the latter, it will be too negative.

   {ToDo: A future version of this draft will discuss the necessary
   changes to IP in IP tunnels in more depth.}

5.7.  Non-Issues

   The following issues might seem to cause unfavourable interactions
   with re-ECN, but we will explain why they don't:

   o  Various link layers support explicit congestion notification, such
      as Frame Relay and ATM.  Explicit congestion notification is
      proposed to be added to other link layers, such as Ethernet
      (802.3ar Ethernet congestion management) and MPLS [ECN-MPLS];

   o  Encryption and IPSec.

   In the case of congestion notification at the link layer, each
   particular link layer scheme either manages congestion on the link
   with its own link-level feedback (the usual arrangement in the cases
   of ATM and Frame Relay), or congestion notification from the link
   layer is merged into congestion notification at the IP level when the
   frame headers are decapsulated at the end of the link (the
   recommended arrangement in the Ethernet and MPLS cases).  Given the
   RE flag is not intended to change along the path, this means that
   downstream congestion will still be measurable at any point where IP
   is processed on the path by subtracting positive from negative
   markings.

   In the case of encryption, as long as the tunnel issues described in
   Section 5.6 are dealt with, payload encryption itself will not be a
   problem.  The design goal of re-ECN is to include downstream
   congestion in the IP header so that it is not necessary to bury into
   inner headers.  Obfuscation of flow identifiers is not a problem for
   re-ECN policing elements.  Re-ECN doesn't ever require flow
   identifiers to be valid; it only requires them to be unique.  So if
   an IPSec encapsulating security payload (ESP [RFC2406]) or an
   authentication header (AH [RFC2402]) is used, the security parameters
   index (SPI) will be a sufficient flow identifier, as it is intended
   to be unique to a flow without revealing actual port numbers.

   In general, even if endpoints use some locally agreed scheme to hide
   port numbers, re-ECN policing elements can just consider the pair of
   source and destination IP addresses as the flow identifier.  Re-ECN
   encourages endpoints to at least tell the network layer that a
   sequence of packets are all part of the same flow, if indeed they
   are.  The alternative would be for the sender to make each packet
   appear to be a new flow, which would require them all to be marked
   FNE in order to avoid being treated with the bulk of malicious flows
   at the egress dropper.  Given the FNE marking is worth +1 and
   networks are likely to rate limit FNE packets, endpoints are given an
   incentive not to set FNE on each packet.  But if the sender really
   does want to hide the flow relationship between packets it can choose
   to pay the cost of multiple FNE packets, which in the long run will
   compensate for the extra memory required on network policing elements
   to process each flow.

6.  Applications

6.1.  Policing Congestion Response

6.1.1.  The Policing Problem

   The current Internet architecture trusts hosts to respond voluntarily
   to congestion.  Limited evidence shows that the large majority of
   end-points on the Internet comply with a TCP-friendly response to
   congestion.  But telephony (and increasingly video) services over the
   best efforts Internet are attracting the interest of major commercial
   operations.  Most of these applications do not respond to congestion
   at all.  Those that can switch to lower rate codecs still have a
   lower bound below which they must become unresponsive to congestion.

   Of course, the Internet is intended to support many different
   application behaviours.  But the problem is that this freedom can be
   exercised irresponsibly.  The greater problem is that we will never
   be able to agree on where the boundary is between responsible and
   irresponsible.  Therefore re-ECN is designed to allow different
   networks to set their own view of the limit to irresponsibility, and
   to allow networks that choose a more conservative limit to push back
   against congestion caused in more liberal networks.

   As an example of the impossibility of setting a standard for
   fairness, mandating TCP-friendliness would set the bar too high for
   unresponsive streaming media, but still some would say the bar was
   too low.  Even though all known peer-to-peer filesharing applications
   are TCP-compatible, they can cause a disproportionate amount of
   congestion, simply by using multiple flows and by transferring data
   continuously relative to other short-lived sessions.  On the other
   hand, if we swung the other way and set the bar low enough to allow
   streaming media to be unresponsive, we would also allow denial of
   service attacks, which are typically unresponsive to congestion and
   consist of multiple continuous flows.

   Applications that need (or choose) to be unresponsive to congestion
   can effectively take (some would say steal) whatever share of
   bottleneck resources they want from responsive flows.  Whether or not
   such free-riding is common, inability to prevent it increases the
   risk of poor returns for investors in network infrastructure, leading
   to under-investment.  An increasing proportion of unresponsive or
   free-riding demand coupled with persistent under-supply is a broken
   economic cycle.  Therefore, if the current, largely co-operative
   consensus continues to erode, congestion collapse could become more
   common in more areas of the Internet [RFC3714].

   While we have designed re-ECN so that networks can choose to deploy
   stringent policing, this does not imply we advocate that every
   network should introduce tight controls on those that cause
   congestion.  Re-ECN has been specifically designed to allow different
   networks to choose how conservative or liberal they wish to be with
   respect to policing congestion.  But those that choose to be
   conservative can protect themselves from the excesses that liberal
   networks allow their users.

6.1.2.  The Case Against Bottleneck Policing

   The state of the art in rate policing is the bottleneck policer,
   which is intended to be deployed at any forwarding resource that may
   become congested.  Its aim is to detect flows that cause
   significantly more local congestion than others.  Although operators
   might solve their immediate problems by deploying bottleneck
   policers, we are concerned that widespread deployment would make it
   extremely hard to evolve new application behaviours.  We believe the
   IETF should offer re-ECN as the preferred protocol on which to base
   solutions to the policing problems of operators, because it would not
   harm evolvability and, frankly, it would be far more effective (see
   later for why).

   Approaches like [XCHOKe] & [pBox] are nice solutions for rate
   policing traffic without the benefit of whole path information (such
   as could be provided by re-ECN).  But they must be deployed at
   bottlenecks in order to work.  Unfortunately, a large proportion of
   traffic traverses at least two bottlenecks (in two access networks),
   particularly with the current traffic mix where peer-to-peer
   file-sharing is prevalent.  If ECN were deployed, we believe it would
   be likely that these bottleneck policers would be adapted to combine
   ECN congestion marking from the upstream path with local congestion
   knowledge.  But then the only useful placement for such policers
   would be close to the egress of the internetwork.

   But then, if these bottleneck policers were widely deployed (which
   would require them to be far more effective than they are now), the
   Internet would find itself with one universal rate adaptation policy
   (probably TCP-friendliness) embedded throughout the network.  Given
   TCP's congestion control algorithm is already known to be hitting its
   scalability limits and new algorithms are being developed for
   high-speed congestion control, embedding TCP policing into the
   Internet would make evolution to new algorithms extremely painful.
   If a source wanted to use a different algorithm, it would have to
   first discover then negotiate with all the policers on its path,
   particularly those in the far access network.  The IETF has already
   traveled that path with the Intserv architecture and found it
   constrains scalability [RFC2208].

   Anyway, if bottleneck policers were ever widely deployed, they would
   be likely to be bypassed by determined attackers.  They inherently
   have to police fairness per flow or per source-destination pair.
   Therefore they can easily be circumvented either by opening multiple
   flows (by varying the end-point port number); or by spoofing the
   source address but arranging with the receiver to hide the true
   return address at a higher layer.

6.1.3.  Re-ECN Incentive Framework

   The aim is to create an incentive environment that ensures optimal
   sharing of capacity despite everyone acting selfishly (including
   lying and cheating).  Of course, the mechanisms put in place for this
   can lie dormant wherever co-operation is the norm.

   Throughout this document we focus on path congestion.  But some forms
   of fairness, particularly TCP's, also depend on round trip time.  So,
   we also propose to measure downstream path delay using re-feedback.
   This proposal will be published in a future draft, but for now we
   give an outline in Appendix F.

   Figure 8 sketches the incentive framework that we will describe piece
   by piece throughout this section.  We will do a first pass in
   overview, then return to each piece in detail.  We re-use the earlier
   example of how downstream congestion is derived by subtracting
   upstream congestion from path congestion (Figure 1) but depict
   multiple trust boundaries to turn it into an internetwork.  For
   clarity, only downstream congestion is shown (the difference between
   the two earlier plots).  The graph displays downstream path
   congestion seen in a typical flow as it traverses an example path
   from sender S to receiver R, across networks N1, N2 & N4.  Everyone
   is shown using re-ECN correctly, but we intend to show why everyone
   would /choose/ to use it correctly, and honestly.
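
   As a worked illustration of the subtraction described above, the
   following minimal sketch reproduces the downstream congestion plot
   for a path with two congested resources marking 1.00% and 2.00% of
   packets.  The function name and the use of basis points (hundredths
   of a percent) are our own choices for illustration, and the sketch
   assumes congestion simply adds along the path, which only
   approximates how marking fractions combine when they are small.

```python
def downstream_congestion(declared_path_bp, marking_fractions_bp):
    """Downstream congestion (basis points) at the sender and after each resource.

    Assumption: marking fractions of successive resources simply add,
    which is a reasonable approximation only when they are small.
    """
    remaining = declared_path_bp
    profile = [remaining]
    for marking in marking_fractions_bp:
        remaining -= marking        # upstream congestion accumulates
        profile.append(remaining)   # downstream = path - upstream
    return profile


# A sender that honestly declares 3% path congestion arrives at zero.
honest = downstream_congestion(300, [100, 200])
# A sender that understates path congestion as 2% goes negative.
understated = downstream_congestion(200, [100, 200])
```

   In this sketch the honest declaration ends at exactly zero at the
   egress, while the understated declaration ends below zero, which is
   the condition the egress dropper is intended to detect.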

   Three main types of self-interest can be identified:

   o  Users want to transmit data across the network as fast as
      possible, paying as little as possible for the privilege.  In this
      respect, there is no distinction between senders and receivers,
      but we must be wary of potential malice by one on the other;

   o  Network operators want to maximise revenues from the resources
      they invest in.  They compete amongst themselves for the custom of
      users;

   o  Attackers (whether users or networks) want to use any opportunity
      to subvert the new re-ECN system for their own gain or to damage
      the service of their victims, whether targeted or random.

          policer
           |
           |
         S <-----N1----> <---N2---> <---N4--> R         domain
         | :                                :
       A\|/:                                :
       | V :                                :
    3% |---------+                          :
       |   :     |                          :
    2% |   :     +-----------------------+  :
       |   :    downstream congestion    |  :
    1% |   :                             |  :
       |   :                             |  :
    0% +---------------------------------+=====-->
                 0                       i  ^      resource index
                 |                       | /|\
               1.00%                  2.00% |       marking fraction
                                            |
                                         dropper

   Figure 8: Incentive Framework, showing creation of opposing pressures
   to under-declare and over-declare downstream congestion, using a
   policer and a dropper

   Source congestion control: We want to ensure that the sender will
      throttle its rate as downstream congestion increases.  Whatever
      the agreed congestion response (whether TCP-compatible or some
      enhanced QoS), to some extent it will always be against the
      sender's interest to comply.

   Ingress policing: But it is in all the network operators' interests
      to encourage fair congestion response, so that their investments
      are employed to satisfy the most valuable demand.  The re-ECN
      protocol ensures packets carry the necessary information about
      their own expected downstream congestion so that N1 can deploy a
      policer at its ingress to check that S1 is complying with whatever
      congestion control it should be using (Section 6.1.5).  If N1 is
      extremely conservative it may police each flow, but it can choose
      to just police the bulk amount of congestion each customer causes
      without regard to flows, or if it is extremely liberal it need not
      police congestion control at all.  In any case, it is always
      preferable to police traffic at the very first ingress into an
      internetwork, before non-compliant traffic can cause any damage.

   Edge egress dropper: If the policer ensures the source has less right
      to a high rate the higher it declares downstream congestion, the
      source has a clear incentive to understate downstream congestion.
      But, if flows of packets are understated when they enter the
      internetwork, they will have become negative by the time they
      leave.  So, we introduce a dropper at the last network egress,
      which drops packets in flows that persistently declare negative
      downstream congestion (see Section 6.1.4 for details).
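
   To make the idea of dropping in persistently negative flows concrete,
   the following is a deliberately naive sketch, not the algorithm this
   document suggests elsewhere.  It assumes each packet can be
   summarised by a `worth' of +1, 0 or -1 (positive, neutral or
   negative) multiplied by its size; the class name, the per-flow
   balance and the tolerance parameter are all hypothetical.  It also
   keeps state per flow, so a real dropper would need additional
   protection against state exhaustion.

```python
from collections import defaultdict


class EgressDropperSketch:
    """Hypothetical dropper: refuses packets that push a flow too far negative."""

    def __init__(self, tolerance_bytes=3000):
        # How far below zero a flow's balance may dip before packets are
        # dropped; allows for short-term variation in honest flows.
        self.tolerance = tolerance_bytes
        self.balance = defaultdict(int)  # flow_id -> bytes of net worth

    def forward(self, flow_id, size_bytes, worth):
        """Return True to forward the packet, False to drop it.

        worth is assumed to be +1 (positive), 0 (neutral) or -1 (negative).
        """
        new_balance = self.balance[flow_id] + worth * size_bytes
        if new_balance < -self.tolerance:
            return False  # dropped packets do not change the balance
        self.balance[flow_id] = new_balance
        return True


dropper = EgressDropperSketch(tolerance_bytes=3000)
# A flow that balances negative packets with positive ones is untouched.
honest = [dropper.forward("honest", 1500, w) for w in (+1, -1, 0, +1, -1)]
# A flow that only ever presents negative packets is soon dropped.
cheat = [dropper.forward("cheat", 1500, -1) for _ in range(4)]
```

   The tolerance is the main design choice in such a sketch: too small
   and honest flows suffer false positives, too large and a dishonest
   flow gains headroom before it is sanctioned.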

               ..competitive routing
             .'         :      '.
           .'  p e n a l:t i e s '.
          :           | :       \  :
       A  :           | :        | :
       |S <-----N1----> <---N2---> <---N4--> R         domain
       |  :           | :        | :
       |  V           | :        | :
    3% |--------+     | :        | :
       |        |     V V        V V
    2% |        +-----------------------+
       |       downstream congestion    |
    1% |          :                     |
       |          :                     |
    0% +--------------------------------+=====-->
                0                ^      i         resource index
                |               /|\     |
              1.00%              |   2.00%         marking fraction
                                 |
                             sanctions

      Figure 9: Incentives at Inter-domain Borders

   Inter-domain traffic policing: But next we must ask, if congestion
      arises downstream (say in N4), what is the ingress network's
      (N1's) incentive to police its customers' response?  If N1 turns a
      blind eye, its own customers benefit while other networks suffer.
      This is why all inter-domain QoS architectures (e.g. Intserv,
      Diffserv) police traffic each time it crosses a trust boundary.
      We have already shown that re-ECN gives a trustworthy measure of
      the expected downstream congestion that a flow will cause by
      subtracting negative volume from positive at any intermediate
      point on a path.  N4 (say) can use this measure to police all the
      responses to congestion of all the sources beyond its upstream
      neighbour (N2), but in bulk with one very simple passive
      mechanism, rather than per flow, as we will now explain using
      Figure 9.

   Emulating policing with inter-domain congestion penalties: Between
      high-speed networks, we would rather avoid per-flow policing, and
      we would rather avoid holding back traffic while it is policed.
      Instead, once re-ECN has arranged headers to carry downstream
      congestion honestly, N2 can contract to pay N4 penalties in
      proportion to a single bulk count of the congestion metrics
      crossing their mutual trust boundary (Section 6.1.6).  In this
      way, N4 puts pressure on N2 to suppress downstream congestion, for
      every flow passing through the border interface, even though they
      will all start and end in different places, and even though they
      may all be allowed different responses to congestion.  The figure
      depicts this downward pressure on N2 by the solid downward arrow
      at the egress of N2.  Then N2 has an incentive either to police
      the congestion response of its own ingress traffic (from N1) or to
      emulate policing by applying penalties to N1 in turn on the basis
      of congestion counted at their mutual boundary.  In this recursive
      way, the incentives for each flow to respond correctly to
      congestion trace back with each flow precisely to each source,
      despite the mechanism not recognising flows (see Section 6.2.2).

   Inter-domain congestion charging diversity: Any two networks are free
      to agree any of a range of penalty regimes between themselves
      within the following reasonable constraints.  N2 should expect to
      have to pay penalties to N4 where penalties monotonically increase
      with the volume of congestion and negative penalties are not
      allowed.  For instance, they may agree an SLA with tiered
      congestion thresholds, where higher penalties apply the higher the
      threshold that is broken.  But the most obvious (and useful) form
      of penalty is where N4 levies a charge on N2 proportional to the
      volume of downstream congestion N2 dumps into N4.  In the
      explanation that follows, we assume this specific variant of
      volume charging between networks - charging proportionate to the
      volume of congestion.

      We must make clear that we are not advocating that everyone should
      use this form of contract.  We are well aware that the IETF tries
      to avoid standardising technology that depends on a particular
      business model.  And we strongly share this desire to encourage
      diversity.  But our aim is merely to show that border policing can
      at least work with this one model.  Given that, we can assume that
      operators might experiment with the metric in other models (see
      Section 6.1.6 for examples).  Of course, operators are free to
      complement this usage element of their charges with traditional
      capacity charging, and we expect they will.
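
      The proportional variant assumed above can be sketched as a single
      bulk count at the border followed by a penalty that never goes
      negative.  This is only an illustration under our own
      assumptions: each packet is summarised as a (size, worth) pair
      with worth +1, 0 or -1, and the function names and the price
      parameter are hypothetical rather than part of any contract
      defined here.

```python
def border_congestion_volume(packets):
    """Bulk count over an accounting period, with no per-flow state.

    packets: iterable of (size_bytes, worth) pairs, where worth is assumed
    to be +1 (positive), 0 (neutral) or -1 (negative).  The result is the
    positive volume minus the negative volume crossing the border.
    """
    return sum(size * worth for size, worth in packets)


def proportional_penalty(congestion_volume, price_per_byte):
    """Penalty proportional to congestion volume; never negative."""
    return max(0, congestion_volume) * price_per_byte


period = [(1500, +1), (1500, 0), (1500, -1), (1000, +1)]
volume = border_congestion_volume(period)   # 1500 - 1500 + 1000
penalty = proportional_penalty(volume, price_per_byte=2)
```

      Clamping at zero reflects the constraint that negative penalties
      are not allowed, and the penalty is monotonically non-decreasing
      in the congestion volume, as the constraints above require.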

   No congestion charging to users: Bulk congestion penalties at trust
      boundaries are passive and extremely simple, and lose none of
      their per-packet precision from one boundary to the next (unlike
      Diffserv all-address traffic conditioning agreements, which
      dissipate their effectiveness across long topologies).  But at any
      trust boundary, there is no imperative to use congestion charging.
      Traditional traffic policing can be used, if the complexity and
      cost are preferred.  In particular, at the boundary with end
      customers (e.g. between S and N1), traffic policing will most
      likely be more appropriate.  Policer complexity is less of a
      concern at the edge of the network.  And end-customers are known
      to be highly averse to the unpredictability of congestion
      charging.

   NOTE WELL: This document neither advocates nor requires congestion
      charging for end customers, and advocates but does not require
      inter-domain congestion charging.

   Competitive discipline of inter-domain traffic engineering: With
      inter-domain congestion charging, a domain seems to have a
      perverse incentive to fake congestion; N2's profit depends on the
      difference between congestion at its ingress (its revenue) and at
      its egress (its cost).  So, overstating internal congestion seems
      to increase profit.  However, smart border routing [Smart_rtg] by
      N1 will bias its multipath routing towards the least cost routes.
      So, N2 risks losing all its revenue to competitive routes if it
      overstates congestion (see Section 6.2.3).  In other words, if N2
      is the least congested route, its ability to raise excess profits
      is limited by the congestion on the next least congested route.
      This pressure on N2 to remain competitive is represented by the
      dotted downward arrow at the ingress to N2 in Figure 9.

   Closing the loop: All the above elements conspire to trap everyone
      between two opposing pressures (the downward and upward arrows in
      Figure 8 & Figure 9), ensuring the downstream congestion metric
      arrives at the destination neither above nor below zero.  So, we
      have arrived back where we started in our argument.  The ingress
      edge network can rely on downstream congestion declared in the
      packet headers presented by the sender.  So it can police the
      sender's congestion response accordingly.

   Evolvability of congestion control: We have seen that re-ECN enables
      policing at the very first ingress.  We have also seen that, as
      flows continue on their path through further networks downstream,
      re-ECN removes the need for further per-domain ingress policing of
      all the different congestion responses allowed to each different
      flow.  This is why the evolvability of re-ECN policing is so
      superior to bottleneck policing or to any policing of different
      QoS for different flows.  Even if all access networks choose to
      conservatively police congestion per flow, each will want to
      compete with the others to allow new responses to congestion for
      new types of application.  With re-ECN, each can introduce new
      controls independently, without coordinating with other networks
      and without having to standardise anything.  But, as we have just
      seen, by making inter-domain penalties proportionate to bulk
      downstream congestion, downstream networks can be agnostic to the
      specific congestion response for each flow, but they can still
      apply more back-pressure the more liberal the ingress access
      network has been in the response to congestion it allowed for each
      flow.

6.1.3.1.  The Case against Classic Feedback

   A system that produces an optimal outcome as a result of everyone's
   selfish actions is extremely powerful, especially one that enables
   evolvability of congestion control.  But why do we have to change to
   re-ECN to achieve it?  Can't classic congestion feedback (as used
   already by standard ECN) be arranged to provide similar incentives
   and similar evolvability?  Superficially it can.  Kelly's seminal
   work showed how we can allow everyone the freedom to evolve whatever
   congestion control behaviour is in their application's best interest
   but still optimise the whole system of networks and users by placing
   a price on congestion to ensure responsible use of this freedom
   [Evol_cc].  Kelly used ECN with its classic congestion feedback model
   as the mechanism to convey congestion price information.  The
   mechanism was nearly identical to volume charging; except only the
   volume of packets marked with congestion experienced (CE) was
   counted.

   However, below we explain why relying on classic feedback /requires/
   congestion charging to be used, while re-ECN achieves the same
   powerful outcome (given it is built on Kelly's foundations), but does
   not /require/ congestion charging.  In brief, the problem with
   classic feedback is that the incentives have to trace the indirect
   path back to the sender---the long way round the feedback loop.  For
   example, if classic feedback were used in Figure 8, N2 would have had
   to influence N1 via all of N4, R & S rather than directly.

   Inability to agree what is happening downstream: In order to police
      its upstream neighbour's congestion response, the neighbours
      should be able to agree on the congestion to be responded to.
      Whatever the feedback regime, as packets change hands at each
      trust boundary, any path metrics they carry are verifiable by both
      neighbours.  But, with a classic path metric, they can only agree
      on the /upstream/ path congestion.

   Inaccessible back-channel: The network needs a whole-path congestion
      metric if it wants to control the source.  Classically, whole path
      congestion emerges at the destination, to be fed back from
      receiver to sender in a back-channel.  But, in any data network,
      back-channels need not be visible to relays, as they are
      essentially communications between the end-points.  They may be
      encrypted, asymmetrically routed or simply omitted, so no network
      element can reliably intercept them.  The congestion charging
      literature solves this problem by charging the receiver and
      assuming this will cause the receiver to refer the charges to the
      sender.  But, of course, this creates unintended side-effects...

   `Receiver pays' unacceptable: In connectionless datagram networks,
      receivers and receiving networks cannot prevent reception from
      malicious senders, so `receiver pays' opens them to `denial of
      funds' attacks.

   End-user congestion charging unacceptable: Even if `denial of funds'
      were not a problem, we know that end-users are highly averse to
      the unpredictability of congestion charging and anyway, we want to
      avoid restricting network operators to just one retail tariff.
      But with classic feedback only an upstream metric is available, so
      we cannot avoid having to wrap the `receiver pays' money flow
      around the feedback loop, necessarily forcing end-users to be
      subjected to congestion charging.

   To summarise so far, with classic feedback, policing congestion
   response without losing evolvability /requires/ congestion charging
   of end-users and a `receiver pays' model, whereas, with re-ECN, it is
   still possible to influence incentives using congestion charging but
   using the safer `sender pays' model.  However, congestion charging is
   only likely to be appropriate between domains.  So, without losing
   evolvability, re-ECN enables technical policing mechanisms that are
   more appropriate for end users than congestion pricing.

   We now take a second pass over the incentive framework, filling in
   the detail.

6.1.4.  Egress Dropper

   As traffic leaves the last network before the receiver (domain N4 in
   Figure 8), the fraction of positive octets in a flow should match the
   fraction of negative octets introduced by congestion marking, leaving
   a balance of zero.  If it is less (a negative flow), it implies that
   the source is understating path congestion (which will reduce the
   penalties that N2 owes N4).

   If flows are positive, N4 need take no action---this simply means its
   upstream neighbour is paying more penalties than it needs to, and the
   source is going slower than it needs to.  But, to protect itself
   against persistently negative flows, N4 will need to install a
   dropper at its egress.  Appendix E gives a suggested algorithm for
   this dropper.  There is no intention that the dropper algorithm
   needs to be standardised; it is merely provided to show that an
   efficient, robust algorithm is possible.  But whatever algorithm is
   used must meet the criteria below:

   o  It SHOULD introduce minimal false positives for honest flows;

   o  It SHOULD quickly detect and sanction dishonest flows (minimal
      false negatives);

   o  It MUST be invulnerable to state exhaustion attacks from malicious
      sources.  For instance, if the dropper uses flow-state, it should
      not be possible for a source to send numerous packets, each with a
      different flow ID, to force the dropper to exhaust its memory
      capacity;

   o  It MUST introduce sufficient loss in goodput so that malicious
      sources cannot play off losses in the egress dropper against
      higher allowed throughput.  Salvatori [CLoop_pol] describes this
      attack, which involves the source understating path congestion
      then inserting forward error correction (FEC) packets to
      compensate for expected losses.

   Note that the dropper operates on flows but we would like it not to
   require per-flow state.  This is why we have been careful to ensure
   that all flows MUST start with a packet marked with the FNE
   codepoint.  If a flow does not start with the FNE codepoint, a
   dropper is likely to treat it unfavourably.  This risk makes it worth
   setting the FNE codepoint at the start of a flow, even though there
   is a cost to the sender of setting FNE (positive `worth').  Indeed,
   with the FNE codepoint, the rate at which a sender can generate new
   flows can be limited (Appendix G).  In this respect, the FNE
   codepoint works like Handley's state set-up bit [Steps_DoS].

   Appendix E also gives an example dropper implementation that
   aggregates flow state.  Dropper algorithms will often maintain a
   moving average across flows of the fraction of RE blanked packets.
   When maintaining an average across flows, a dropper SHOULD only allow
   flows into the average if they start with FNE, but it SHOULD NOT
   include packets with the FNE codepoint set in the average.  A sender
   sets the FNE codepoint when it does not have the benefit of feedback
   from the receiver.  So, counting packets with FNE cleared would be
   likely to make the average unnecessarily positive, providing headroom
   (or should we say footroom?) for dishonest (negative) traffic.

   If the dropper detects a persistently negative flow, it SHOULD drop
   sufficient negative and neutral packets to force the flow to not be
   negative.  Drops SHOULD be focused on just sufficient packets in
   misbehaving flows to remove the negative bias while doing minimal
   extra harm.
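
   The balance test that an egress dropper applies to a flow can be
   illustrated with a minimal sketch.  It is not the Appendix E
   algorithm.  It assumes each packet is summarised as a (marking,
   size) pair, that the marking names and the WORTH table below are
   labels chosen for this illustration, and that the whole flow is
   available at once, whereas a real dropper would keep a running or
   aggregated measure and act only on persistently negative flows.

```python
# Assumed worth of each extended ECN marking, per octet of the packet.
# The marking names are labels for this sketch only.
WORTH = {
    "Re-Echo": +1,  # positive: sender re-echoes congestion
    "FNE": +1,      # positive: feedback not established
    "RECT": 0,      # neutral
    "CE(0)": 0,     # canceled: positive packet that was congestion marked
    "CE(-1)": -1,   # negative: neutral packet that was congestion marked
}


def flow_balance(packets):
    """Return positive octets minus negative octets for one flow.

    `packets` is an iterable of (marking, size_in_octets) pairs.
    """
    return sum(WORTH[marking] * size for marking, size in packets)


def is_negative_flow(packets):
    """True if the flow's balance is below zero, i.e. the source appears
    to be understating path congestion."""
    return flow_balance(packets) < 0


# Hypothetical flows: the second one never re-echoes the congestion mark.
honest = [("FNE", 100), ("RECT", 1500), ("Re-Echo", 1500), ("CE(-1)", 1500)]
cheat = [("FNE", 100), ("RECT", 1500), ("RECT", 1500), ("CE(-1)", 1500)]
print(flow_balance(honest), is_negative_flow(honest))
print(flow_balance(cheat), is_negative_flow(cheat))
```

   Counting octets rather than packets follows the wording of this
   section, which compares fractions of positive and negative octets.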

6.1.5.  Rate Policing

   Access operators who wish to check that a sender is complying with a
   particular rate response to congestion can deploy rate policers at
   the very first ingress to the internetwork.  Re-ECN has been designed
   to avoid the need for bottleneck policing so that we can avoid a
   future where a single rate adaptation policy is embedded throughout
   the network.  Instead, re-ECN allows the particular rate adaptation
   policy to be solely agreed bilaterally between the sender and its
   ingress access provider (Section 5.5.2 discusses possible ways to
   signal between them), which allows congestion control to be policed,
   but maintains its evolvability, requiring only a single, local box to
   be updated.

   If desired, the re-ECN protocol allows these ingress policers to
   perform per-flow policing according to the widely adopted TCP rate
   adaptation, perhaps as a default.  But it also allows new rate
   adaptation policies beyond TCP to be enforced.  Perhaps more
   usefully, it also allows the flexibility for networks to choose to
   police users as a whole, rather than flows.

   Appendix G gives examples of per-user and per-flow policing
   algorithms.  But there is no implication that these algorithms are to
   be standardised, or that they are ideal.  The ingress rate policer is
   the part of the re-ECN incentive framework that is intended to be
   most flexible.  Once endpoint protocol handlers for re-ECN and egress
   droppers are in place, operators can choose exactly which congestion
   response they want to police, and whether they want to do it per
   user, per flow or not at all.

   However, if a rate policer is used, it should use path (not
   downstream) congestion as the relevant metric, which is represented
   by the fraction of octets in packets with positive (Re-Echo and FNE)
   and canceled (CE(0)) markings.  Of course, re-ECN provides all the
   information a policer needs directly in the packets being policed.
   So, even policing TCP's AIMD algorithm is relatively straightforward.
   Appendix G presents an example design, but the choice of preferred
   mechanism is up to the implementer.
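
   The path congestion metric that an ingress policer would use can be
   sketched as follows.  This is an illustration only, not the
   Appendix G design.  It assumes packets are summarised as (marking,
   size) pairs and that the marking names are plain strings chosen for
   this sketch.

```python
# Markings counted towards path congestion at an ingress policer:
# positive (Re-Echo and FNE) and canceled (CE(0)).
POSITIVE = {"Re-Echo", "FNE"}
CANCELED = {"CE(0)"}


def path_congestion_fraction(packets):
    """Fraction of octets carried in positive or canceled packets.

    `packets` is an iterable of (marking, size_in_octets) pairs.
    Returns 0.0 when no octets have been seen.
    """
    total = 0
    counted = 0
    for marking, size in packets:
        total += size
        if marking in POSITIVE or marking in CANCELED:
            counted += size
    return counted / total if total else 0.0


# Hypothetical sample: 1000 octets, of which 100 are positive or canceled.
sample = [("RECT", 900), ("Re-Echo", 50), ("CE(0)", 50)]
print(path_congestion_fraction(sample))
```

   A policer would then compare the sender's rate against whatever
   response to this fraction has been agreed, for example TCP's AIMD
   behaviour.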

   Note that we have included canceled packets in the measure of path
   congestion.  Canceled packets arise when the sender re-echoes earlier
   congestion, but then this Re-Echo packet just happens to be
   congestion marked itself.  One would not normally expect many
   canceled packets at the first ingress because one would not normally
   expect much congestion marking to have been necessary that soon in
   the path.  However, a home network or campus network may well sit
   between the sending endpoint and the ingress policer, so some
   congestion may occur upstream of the policer.  And if congestion does
   occur upstream, some canceled packets should be visible, and should
   be taken into account in the measure of path congestion.

   But a much more important reason for including canceled packets in
   the measure of path congestion at an ingress policer is that a sender
   might otherwise subvert the protocol by sending canceled packets
   instead of neutral (RECT) packets.  Like neutral, canceled packets
   are worth zero, so the sender knows they won't be counted against any
   quota it might have been allowed.  But unlike neutral packets,
   canceled packets are immune to congestion marking, because they have
   already been congestion marked.  So, it is both correct and useful
   that canceled packets should be included in a policer's measure of
   path congestion, as this removes the incentive the sender would
   otherwise have to mark more packets as canceled than it should.

   An ingress policer should also ensure that flows are not already
   negative when they enter the access network.  As with canceled
   packets, the presence of negative packets will typically be unusual.
   Therefore it will be easy to detect negative flows at the ingress by
   just detecting negative packets then monitoring the flow they belong
   to.

   Of course, even if the sender does operate its own network, it may
   arrange not to congestion mark traffic.  Whether the sender does this
   or not is of no concern to anyone else except the sender.  Such a
   sender will not be policed against its own network's contribution to
   congestion, but the only resulting problem would be overload in the
   sender's own network.

   Finally, we must not forget that an easy way to circumvent re-ECN's
   defences is for the source to turn off re-ECN support, by setting the
   Not-RECT codepoint, implying legacy traffic.  Therefore an ingress
   policer must put a general rate-limit on Not-RECT traffic, which
   SHOULD be lax during early, patchy deployment, but will have to
   become stricter as deployment widens.  Similarly, flows starting
   without an FNE packet can be confined by a strict rate-limit used for
   the remainder of flows that haven't proved they are well-behaved by
   starting correctly (therefore they need not consume any flow state---
   they are just confined to the `misbehaving' bin if they carry an
   unrecognised flow ID).

6.1.6.  Inter-domain Policing

   One of the main design goals of re-ECN is for border security
   mechanisms to be as simple as possible, otherwise they will become
   the pinch-points that limit scalability of the whole internetwork.
   We want to avoid per-flow processing at borders and to keep to
   passive mechanisms that can monitor traffic in parallel to
   forwarding, rather than having to filter traffic inline---in series
   with forwarding.

   So far, we have been able to keep the border mechanisms simple,
   despite having had to harden them against some subtle attacks on the
   re-ECN design.  The mechanisms are still passive and avoid per-flow
   processing.

   The basic accounting mechanism at each border interface simply
   involves accumulating the volume of packets with positive worth (Re-
   Echo and FNE), and subtracting the volume of those with negative
   worth: CE(-1).  Even though this mechanism takes no regard of flows,
   over an accounting period (say a month) this subtraction will account
   for the downstream congestion caused by all flows traversing the
   interface, wherever they come from, and wherever they go to.  The two
   networks can agree to use this metric however they wish to determine
   some congestion-related penalty against the upstream network.
   Although the algorithm could hardly be simpler, it is spelled out
   using pseudo-code in Appendix H.1.
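
   As an illustration only (the pseudo-code in Appendix H.1 remains the
   reference), the bulk accounting at one border interface might be
   sketched in Python as follows.  The class name BorderMeter, the
   string labels used for markings and the use of packet size in bytes
   as "volume" are assumptions made for this sketch, not definitions
   taken from this memo.

```python
# Hypothetical sketch of per-interface bulk accounting.
# Names and labels are illustrative assumptions, not the Appendix H.1 text.
POSITIVE = {"Re-Echo", "FNE"}   # markings with positive worth
NEGATIVE = {"CE(-1)"}           # markings with negative worth


class BorderMeter:
    """Accumulates congestion volume over one accounting period."""

    def __init__(self):
        self.positive_bytes = 0
        self.negative_bytes = 0

    def observe(self, marking, size_bytes):
        # No flow state is kept: only two counters per interface.
        if marking in POSITIVE:
            self.positive_bytes += size_bytes
        elif marking in NEGATIVE:
            self.negative_bytes += size_bytes
        # Any other marking is treated as neutral and changes nothing.

    def downstream_congestion(self):
        # Positive volume minus negative volume for the period.
        return self.positive_bytes - self.negative_bytes


meter = BorderMeter()
for marking, size in [("Re-Echo", 1500), ("RECT", 1500),
                      ("CE(-1)", 1500), ("FNE", 40)]:
    meter.observe(marking, size)
print(meter.downstream_congestion())
```

   Because the meter only reads markings and never touches the
   forwarding path, it can run passively alongside forwarding, which
   is the property the text above asks of border mechanisms.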

   Various attempts to subvert the re-ECN design have been made.  In all
   cases their root cause is persistently negative flows.  But, after
   describing these attacks we will show that we don't actually have to
   get rid of all persistently negative flows in order to thwart the
   attacks.

   In honest flows, downstream congestion is measured as positive minus
   negative volume.  So if all flows are honest (i.e. not persistently
   negative), adding all positive volume and all negative volume without
   regard to flows will give an aggregate measure of downstream
   congestion.  But such simple aggregation is only possible if no flows
   are persistently negative.  Unless persistently negative flows are
   completely removed, they will reduce the aggregate measure of
   congestion.  The aggregate may still be positive overall, but not as
   positive as it would have been had the negative flows been removed.
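
   A small worked example may make the effect concrete.  The flow
   names and byte counts below are invented purely for illustration;
   only the arithmetic (positive minus negative volume) comes from the
   text above.

```python
# Invented per-flow volumes (bytes) seen at one border: (positive, negative).
flows = {
    "honest_a": (3000, 1000),
    "honest_b": (2000, 500),
    "negative_c": (0, 1500),   # sends no positive markings at all
}


def aggregate(selected):
    """Positive minus negative volume, summed without regard to flows."""
    pos = sum(flows[name][0] for name in selected)
    neg = sum(flows[name][1] for name in selected)
    return pos - neg


with_negative = aggregate(flows)                        # all three flows
without_negative = aggregate(["honest_a", "honest_b"])  # honest flows only
print(with_negative, without_negative)
```

   With these numbers the aggregate stays positive when the negative
   flow is included, but it understates downstream congestion by
   exactly the negative volume that flow contributes.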

   In Section 6.1.4 we discussed how to sanction traffic to remove, or
   at least to identify, persistently negative flows.  But, even if the
   sanction for negative traffic is to discard it, unless it is
   discarded at the exact point it goes negative, it will wrongly
   subtract from aggregate downstream congestion, at least at any
   borders it crosses after it has gone negative but before it is
   discarded.

   We rely on sanctions to deter dishonest understatement of congestion.
   But even the ultimate sanction of discard can only be effective if
   the sender is bothered about the data getting through to its
   destination.  A number of attacks have been identified where a sender
   gains from sending dummy traffic or it can attack someone or
   something using dummy traffic even though it isn't communicating any
   information to anyone:

   o  A host can send traffic with no positive markings towards its
      intended destination, aiming to transmit as much traffic as any
      dropper will allow [Bauer06].  It may add forward error correction
      (FEC) to repair as much drop as it experiences.

   o  A host can send dummy traffic into the network with no positive
      markings and with no intention of communicating with anyone, but
      merely to cause higher levels of congestion for others who do want
      to communicate (DoS).  So, to ride over the extra congestion,
      everyone else has to spend more of whatever rights to cause
      congestion they have been allowed.

   o  A network can simply create its own dummy traffic to congest
      another network, perhaps causing it to lose business at no cost to
      the attacking network.  This is a form of denial of service
      perpetrated by one network on another.  The preferential drop
      measures in Section 5.3 provide crude protection against such
      attacks, but we are not overly worried about more accurate
      prevention measures, because it is already possible for networks
      to DoS other networks on the general Internet, but they generally
      don't because of the grave consequences of being found out.  We
      are only concerned if re-ECN increases the motivation for such an
      attack, as in the next example.

   o  A network can just generate negative traffic and send it over its
      border with a neighbour to reduce the overall penalties that it
      should pay to that neighbour.  It could even initialise the TTL so
      it expired shortly after entering the neighbouring network,
      reducing the chance of detection further downstream.  This attack
      need not be motivated by a desire to deny service and indeed need
      not cause denial of service.  A network's main motivator would
      most likely be to reduce the penalties it pays to a neighbour.
      But, the prospect of financial gain might tempt the network into
      mounting a DoS attack on the other network as well, given the gain
      would offset some of the risk of being detected.

   The first step towards a solution to all these problems with negative
   flows is to be able to estimate the contribution they make to
   downstream congestion at a border and to correct the measure
   accordingly.  Although ideally we want to remove negative flows
   themselves, perhaps surprisingly, the most effective first step is to
   cancel out the polluting effect negative flows have on the measure of
   downstream congestion at a border.  It is more important to get an
   unbiased estimate of their effect, than to try to remove them all.  A
   suggested algorithm to give an unbiased estimate of the contribution
   from negative flows to the downstream congestion measure is given in
   Appendix H.2.

   Although making an accurate assessment of the contribution from
   negative flows may not be easy, just the single step of neutralising
   their polluting effect on congestion metrics removes all the gains
   networks could otherwise make from mounting dummy traffic attacks on
   each other.  This puts all networks on the same side (only with
   respect to negative flows of course), rather than being pitched
   against each other.  The network where this flow goes negative as
   well as all the networks downstream lose out from not being
   reimbursed for any congestion this flow causes.  So they all have an
   interest in getting rid of these negative flows.  Networks forwarding
   a flow before it goes negative aren't strictly on the same side, but
   they are disinterested bystanders---they don't care that the flow
   goes negative downstream, but at least they can't actively gain from
   making it go negative.  The problem becomes localised so that once a
   flow goes negative, all the networks from where it happens and beyond
   downstream each have a small problem, each can detect it has a
   problem and each can get rid of the problem if it chooses to.  But
   negative flows can no longer be used for any new attacks.

   Once an unbiased estimate of the effect of negative flows can be
   made, the problem reduces to detecting and preferably removing flows
   that have gone negative as soon as possible.  But importantly,
   complete eradication of negative flows is no longer critical---best
   endeavours will be sufficient.

   For instance, let us consider the case where a source sends traffic
   with no positive markings at all, hoping to at least get as much
   traffic delivered as network-based droppers will allow.  The flow is
   likely to go at least slightly negative in the first network on the
   path (N1 if we use the example network layout in Figure 9).  If all
   networks use the algorithm in Appendix H.2 to inflate penalties at
   their border with an upstream network, they will remove the effect of
   negative flows.  So, for instance, N2 will not be paying a penalty to
   N1 for this flow.  Further, because the flow contributes no positive
   markings at all, a dropper at the egress will completely remove it.

   The remaining problem is that every network is carrying a flow that
   is causing congestion to others but not being held to account for the
   congestion it is causing.  Whenever the fail-safe border algorithm
   (Section 6.1.7) or the border algorithm to compensate for negative
   flows (Appendix H.2) detects a negative flow, it can instantiate a
   focused dropper for that flow locally.  It may be some time before
   the flow is detected, but the more strongly negative the flow is, the
   more quickly it will be detected by the fail-safe algorithm.  But, in
   the meantime, it will not be distorting border incentives.  Until it
   is detected, if it contributes to drop anywhere, its packets will
   tend to be dropped before others if routers use the preferential drop
   rules in Section 5.3, which discriminate against non-positive
   packets.  All networks below the point where a flow goes negative
   (N1, N2 and N4 in this case) have an incentive to remove this flow,
   but the router where it first goes negative (in N1) can of course
   remove the problem for everyone downstream.

   In the status case of DDoS attacks, Section 6.2.1 describes how re-ECN
   mitigates their force.

   Note that the guiding principle behind all the above discussion is
   that any gain from subverting the protocol should be precisely
   neutralised, rather than punished.  If a gain is punished to a
   greater extent than is sufficient to neutralise it, it will most
   likely open up a new vulnerability, where the amplifying effect of
   the punishment mechanism can be turned on others.

   For instance, if possible, flows should be removed as soon as they go
   negative, but we do NOT RECOMMEND any attempts to discard such flows
   further upstream while they are still positive.  Such over-zealous
   push-back is unnecessary and potentially dangerous.  These flows have
   paid their `fare' up to the point they go negative, so there is no
   harm in delivering them that far.  If someone downstream asks for a
   flow to be dropped as near to the source as possible, because they
   say it is going to become negative later, an upstream node cannot
   test the truth of this assertion.  Rather than have to authenticate
   such messages, re-ECN has been designed so that flows can be dropped
   solely based on locally measurable evidence.  A message hinting that
   a flow should be watched closely to test for negativity is fine.  But
   not a message that claims that a positive flow will go negative
   later, so it should be dropped.

6.1.7.  Inter-domain Fail-safes

   The mechanisms described so far create incentives for rational
   network operators to behave.  That is, one operator aims to make
   another behave responsibly by applying penalties and expects a
   rational response (i.e. one that trades off costs against benefits).
   It is usually reasonable to assume that other network operators will
   behave rationally (policy routing can avoid those that might not).
   But this approach does not protect against the misconfigurations and
   accidents of other operators.

   Therefore, we propose the following two mechanisms at a network's
   borders to provide "defence in depth".  Both are similar:

   Highly positive flows: A small sample of positive packets should be
      picked randomly as they cross a border interface.  Then subsequent
      packets matching the same source and destination address and DSCP
      should be monitored.  If the fraction of positive marking is well
      above a threshold (to be determined by operational practice), a
      management alarm SHOULD be raised, and the flow MAY be
      automatically subject to focused drop.

   Persistently negative flows: A small sample of congestion marked
      (negative) packets should be picked randomly as they cross a
      border interface.  Then subsequent packets matching the same
      source and destination address and DSCP should be monitored.  If
      the balance of positive minus negative markings is persistently
      negative, a management alarm SHOULD be raised, and the flow MAY be
      automatically subject to focused drop.

   Both these mechanisms rely on the fact that highly positive (or
   negative) flows will appear more quickly in the sample by selecting
   randomly solely from positive (or negative) packets.
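
   A minimal sketch of the second fail-safe (persistently negative
   flows) is given below in Python.  Everything named here is an
   assumption for illustration: the class NegativeFlowFailSafe, the
   sampling probability, and in particular the rule that a flow counts
   as "persistently negative" once min_bytes of marked volume has been
   seen with more negative than positive volume.  The memo itself
   leaves such thresholds to operational practice.

```python
import random
from dataclasses import dataclass


@dataclass
class FlowWatch:
    positive: int = 0   # bytes of positive packets seen since sampling
    negative: int = 0   # bytes of negative packets seen since sampling


class NegativeFlowFailSafe:
    """Hypothetical border monitor for persistently negative flows."""

    def __init__(self, sample_prob=0.01, min_bytes=10_000, rng=random.random):
        self.sample_prob = sample_prob   # assumed sampling rate
        self.min_bytes = min_bytes       # assumed evidence before judging
        self.rng = rng
        self.watched = {}                # (src, dst, dscp) -> FlowWatch
        self.alarms = []

    def observe(self, src, dst, dscp, worth, size):
        key = (src, dst, dscp)
        watch = self.watched.get(key)
        if watch is None:
            # Only negative packets are candidates for sampling, so flows
            # that send many of them are picked up sooner than others.
            if worth < 0 and self.rng() < self.sample_prob:
                self.watched[key] = FlowWatch()
            return  # monitoring starts with the subsequent packets
        if worth > 0:
            watch.positive += size
        elif worth < 0:
            watch.negative += size
        seen = watch.positive + watch.negative
        if seen >= self.min_bytes and watch.positive < watch.negative:
            if key not in self.alarms:
                self.alarms.append(key)  # raise alarm; focused drop is optional


monitor = NegativeFlowFailSafe(sample_prob=1.0, min_bytes=3000)
for _ in range(4):
    monitor.observe("192.0.2.1", "198.51.100.7", 0, -1, 1500)
print(monitor.alarms)
```

   The highly positive case would follow the same pattern, sampling
   from positive packets instead and comparing the fraction of positive
   marking against an operator-chosen threshold.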

6.1.8.  Simulations

   Simulations of policer and dropper performance done for the multi-bit
   version of re-feedback have been included in section 5 "Dropper
   Performance" of [Re-fb].  Simulations of policer and dropper for the
   re-ECN version described in this document are work in progress.

6.2.  Other Applications

6.2.1.  DDoS Mitigation

   A flooding attack is inherently about congestion of a resource.
   Because re-ECN ensures the sources causing network congestion
   experience the cost of their own actions, it acts as a first line of
   defence against DDoS.  As load focuses on a victim, upstream queues
   grow, requiring honest sources to pre-load packets with a higher
   fraction of positive packets.  Once downstream routers are so
   congested that they are dropping traffic, they will be CE marking the
   traffic they do forward 100%.  Honest sources will therefore be
   sending Re-Echo 100% (and therefore being severely rate-limited at
   the ingress).

   Senders under malicious control can either do the same as honest
   sources, and be rate-limited at ingress, or they can understate
   congestion by sending more neutral RECT packets than they should.  If
   sources understate congestion (i.e. do not re-echo sufficient
   positive packets) and the preferential drop ranking is implemented on
   routers (Section 5.3), these routers will preserve positive traffic
   until last.  So, the neutral traffic from malicious sources will all
   be automatically dropped first.  Either way, the malicious sources
   cannot send more than honest sources.
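
   The effect of the drop ranking on understated traffic can be
   sketched as follows.  This Python fragment assumes a simplified
   two-level ranking (positive packets versus everything else) and a
   hypothetical shed() helper; the actual ranking is the one defined in
   Section 5.3, which may distinguish more classes.

```python
def shed(queue, excess):
    """Drop `excess` packets, taking non-positive packets first."""
    # False sorts before True, so non-positive packets come first.
    # sorted() is stable, so arrival order is kept within each class.
    order = sorted(range(len(queue)), key=lambda i: queue[i]["worth"] > 0)
    doomed = set(order[:excess])
    return [pkt for i, pkt in enumerate(queue) if i not in doomed]


# Under heavy congestion honest sources re-echo every packet (positive),
# while the attacker understates congestion and sends neutral packets.
queue = [
    {"src": "honest", "worth": +1},
    {"src": "attacker", "worth": 0},
    {"src": "honest", "worth": +1},
    {"src": "attacker", "worth": 0},
]
survivors = shed(queue, 2)
print([pkt["src"] for pkt in survivors])
```

   In this toy queue both dropped packets belong to the source that
   understated congestion, which is the outcome the paragraph above
   relies on.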

   Further, hosts under malicious control will tend to be re-used for
   many different attacks.  They will therefore build up a long term
   history of causing congestion.  Therefore, as long as the population
   of potentially compromisable hosts around the Internet is limited,
   the per-user policing algorithms in Appendix G.1 will gradually
   throttle down zombies and other launchpads for attacks.  Therefore,
   widespread deployment of re-ECN could considerably dampen the force
   of DDoS.  Certainly, zombie armies could hold their fire for long
   enough to be able to build up enough credit in the per-user policers
   to launch an attack.  But they would then still be limited to no more
   throughput than other, honest users.

   Inter-domain traffic policing (see Section 6.1.6) ensures that any
   network that harbours compromised `zombie' hosts will have to bear
   the cost of the congestion caused by traffic from zombies in
   downstream networks.  Such networks will be incentivised to deploy
   per-user policers that rate-limit hosts that are unresponsive to
   congestion so they can only send very slowly into congested paths.
   As well as protecting other networks, the extremely poor performance
   at any sign of congestion will incentivise the zombie's owner to
   clean it up.  However, the host should behave normally when using
   uncongested paths.

   Uniquely, re-ECN handles DDoS traffic without relying on the validity
   of identifiers in packets.  Certainly the egress dropper relies on
   uniqueness of flow identifiers, but not their validity.  So if a
   source spoofs another address, re-ECN works just as well, as long as
   the attacker cannot imitate all the flow identifiers of another
   active flow passing through the same dropper (see Section 6.3).
   Similarly, the ingress policer relies on uniqueness of flow IDs, not
   their validity.  Because a new flow will only be allowed any rate at
   all if it starts with FNE, and the more FNE packets there are
   starting new flows, the more they will be limited.  Essentially a re-
   ECN policer limits the bulk of all congestion entering the network
   through a physical interface; limiting the congestion caused by each
   flow is merely an optional extra.

6.2.2.  End-to-end QoS

   {ToDo: (Section 3.3.2 of [Re-fb] entitled `Edge QoS' gives an outline
   of the text that will be added here).}

6.2.3.  Traffic Engineering

   {ToDo: }

6.2.4.  Inter-Provider Service Monitoring

   {ToDo: }

6.3.  Limitations

   The known limitations of the re-ECN approach are:

   o  We still cannot defend against the attack described in Section 10
      where a malicious source sends negative traffic through the same
      egress dropper as another flow and imitates its flow identifiers,
      allowing a malicious source to cause an innocent flow to
      experience heavy drop.

   o  Re-feedback for TTL (re-TTL) would also be desirable at the same
      time as re-ECN.  Unfortunately this requires a further standards
      action for the mechanisms briefly described in Appendix F.

   o  Traffic must be ECN-capable for re-ECN to be effective.  The only
      defence against malicious users who turn off ECN capability is
      that networks are expected to rate limit Not-ECT traffic and to
      apply higher drop preference to it during congestion.  Although
      these are blunt instruments, they at least represent a feasible
      scenario for the future Internet where Not-ECT traffic co-exists
      with re-ECN traffic, but as a severely hobbled under-class.  We
      recommend (Section 7.1) that while accommodating a smooth initial
      transition to re-ECN, policing policies should gradually be
      tightened to rate limit Not-ECT traffic more strictly in the
      longer term.

   o  The new ICMP message type (Section 5.5.1).

12.  Conclusions

   {ToDo:}

13.  Acknowledgements

   Sebastien Cazalet and Andrea Soppera contributed  When checking whether a flow is balancing positive markings with
      congestion marking, re-ECN can only account for congestion
      marking, not drops.  So, whenever a sender experiences drop, it
      does not have to re-echo the idea of re-
   feedback.  All congestion event.  Nonetheless, it is
      hardly any advantage to be able to send faster than other flows
      only if your traffic is dropped and the following have given helpful comments: Andrea
   Soppera, David Songhurst, Peter Hovell, Louise Burness, Phil Eardley,
   Steve Rudkin, Marc Wennink, Fabrice Saffre, Cefn Hoile, Steve Wright,
   John Davey, Martin Koyabe, Carla Di Cairano-Gilfedder, Alexandru
   Murgu, Nigel Geffen, Pete Willis (BT), Sally Floyd (ICIR), Stephen
   Hailes, Mark Handley, Adam Greenhalgh (UCL), Jon Crowcroft (Uni Cam),
   David Clark, Bill Lehr, Sharon Gillett, Steve Bauer, Liz Maida (MIT),
   and comments other traffic isn't.

   o  We are considering the issue of whether it would be useful to
      truncate rather than drop packets that appear to be malicious, so
      that the feedback loop is not broken but useful data can be
      removed.

7.  Incremental Deployment

7.1.  Incremental Deployment Features

   The design of the re-ECN protocol started from participants in the CRN/CFP Broadband and DoS-
   resistant Internet working groups.

14.  Comments Solicited

   Comments and questions are encouraged fact that the
   current ECN marking behaviour of routers was sufficient and very welcome.  They can that re-
   feedback could be
   addressed to introduced around these routers by changing the IETF Transport Area working group's mailing list
   <tsvwg@ietf.org>, and/or to
   sender behaviour but not the authors.

15.  References

15.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs routers.  Otherwise, if had required
   routers to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2309]  Braden, B., Clark, D., Crowcroft, J., Davie, B., Deering,
              S., Estrin, D., Floyd, S., Jacobson, V., Minshall, G.,
              Partridge, C., Peterson, L., Ramakrishnan, K., Shenker,
              S., Wroclawski, J., and L. Zhang, "Recommendations on
              Queue Management and Congestion Avoidance in be changed, the
              Internet", RFC 2309, April 1998.

   [RFC2581]  Allman, M., Paxson, V., and W. Stevens, "TCP Congestion
              Control", RFC 2581, April 1999.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition chance of Explicit Congestion encountering a path that had
   every router upgraded would be vanishly small during early
   deployment, giving no incentive to start deployment.  Also, as there
   is no new forwarding behaviour, routers and hosts do not have to
   signal or negotiate anything.

   However, networks that choose to protect themselves using re-ECN do
   have to add new security functions at their trust boundaries with
   others.  They distinguish legacy traffic by its ECN field.  Traffic
   from Not-ECT transports is distinguishable by its Not-RECT marking.
   Traffic from legacy ECN transports is distinguished from re-ECN by
   which of ECT(0) or ECT(1) is used.  We chose to use ECT(1) for re-ECN
   traffic deliberately.  Existing ECN sources set ECT(0) on either 50%
   (the nonce) or 100% (the default) of packets, whereas re-ECN does not
   use ECT(0) at all.  We can use this distinguishing feature of legacy
   ECN traffic to separate it out for different treatment at the various
   border security functions: egress dropping, ingress policing and
   border policing.
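
   As an illustration only, the separation of traffic by ECN field
   described above might be sketched as follows.  The two-bit values
   are those of RFC3168; the names `TrafficClass', `classify' and
   `ecn_field_from_tos' are hypothetical and not part of this memo.
   The sketch looks at the ECN field alone and ignores the RE flag, so
   it cannot separate Not-RECT from FNE (both carry Not-ECT), nor say
   which kind of transport sent a CE-marked packet.

```python
from enum import Enum

# Two-bit ECN field values as defined in RFC3168.
NOT_ECT = 0b00
ECT_1 = 0b01
ECT_0 = 0b10
CE = 0b11


class TrafficClass(Enum):
    """Hypothetical labels for the treatment classes at a border function."""
    NOT_ECT = "ECN field is Not-ECT (Not-RECT or FNE; RE flag not examined)"
    LEGACY_ECN = "legacy ECN transport (uses ECT(0))"
    RE_ECN = "re-ECN transport (uses ECT(1))"
    UNDETERMINED = "CE-marked; the ECN field alone does not identify the sender"


def ecn_field_from_tos(tos_octet: int) -> int:
    """Extract the ECN field: the two least significant bits of the octet."""
    return tos_octet & 0b11


def classify(ecn_field: int) -> TrafficClass:
    """Classify a packet using only its two-bit ECN field."""
    if ecn_field == ECT_0:
        return TrafficClass.LEGACY_ECN
    if ecn_field == ECT_1:
        return TrafficClass.RE_ECN
    if ecn_field == NOT_ECT:
        return TrafficClass.NOT_ECT
    if ecn_field == CE:
        return TrafficClass.UNDETERMINED
    raise ValueError("the ECN field is two bits wide")
```

   A border function following the text would then exempt the
   LEGACY_ECN and NOT_ECT classes from egress dropping and subject
   them to bulk rate limits instead.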

   The general principle we adopt is that an egress dropper will not
   drop any legacy traffic, but ingress and border policers will limit
   the bulk rate of legacy traffic that can enter each network.  Then,
   during early re-ECN deployment, operators can set very permissive (or
   non-existent) rate-limits on legacy traffic, but once re-ECN
   implementations are generally available, legacy traffic can be rate-
   limited increasingly harshly.  Ultimately, an operator might choose
   to block all legacy traffic entering its network, or at least only
   allow through a trickle.

   Then, as the limits are set more strictly, the more legacy ECN
   sources will gain by upgrading to re-ECN.  Thus, towards the end of
   the voluntary incremental deployment period, legacy transports can be
   given progressively stronger encouragement to upgrade.
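
   The progressive tightening described above could be realised in
   many ways.  The following is a minimal sketch that assumes a token
   bucket applied to the bulk of legacy traffic at an ingress or
   border policer.  The memo does not prescribe a token bucket; the
   class name, its parameters and the choice of bytes per second as
   the unit are assumptions made for illustration.

```python
class LegacyRateLimiter:
    """Hypothetical bulk rate limiter for legacy (non-re-ECN) traffic."""

    def __init__(self, rate_bytes_per_s: float, burst_bytes: float):
        self.rate_bytes_per_s = rate_bytes_per_s
        self.burst_bytes = burst_bytes
        self.tokens = burst_bytes   # the bucket starts full
        self.last_time = 0.0        # time of the previous packet, in seconds

    def allow(self, size_bytes: int, now: float) -> bool:
        """Return True if a legacy packet of this size may be forwarded."""
        elapsed = max(0.0, now - self.last_time)
        self.last_time = now
        # Refill at the configured rate, never beyond the burst size.
        self.tokens = min(self.burst_bytes,
                          self.tokens + elapsed * self.rate_bytes_per_s)
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return True
        return False

    def tighten(self, new_rate_bytes_per_s: float) -> None:
        """Lower the limit; the operator only ever moves it downwards."""
        self.rate_bytes_per_s = min(self.rate_bytes_per_s,
                                    new_rate_bytes_per_s)
```

   An operator would start with a very permissive rate and call
   `tighten' as re-ECN implementations become generally available,
   while re-ECN traffic bypasses this limiter altogether.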

   The following list of minor changes brings together all the points
   where re-ECN semantics for use of the two-bit ECN field differ from
   RFC3168:

   o  A re-ECN sender sets ECT(1) by default, whereas an RFC3168 sender
      sets ECT(0) by default (Section 3.3);

   o  No provision is necessary for a re-ECN capable source transport to
      use the ECN nonce (Section 4.1.2.1);

   o  Routers MAY preferentially drop different extended ECN codepoints
      (Section 5.3);

   o  Packets carrying the feedback not established (FNE) codepoint MAY
      optionally be marked rather than dropped by routers, even though
      their ECN field is Not-ECT (with the important caveat in
      Section 5.3);

   o  Packets may be dropped by policing nodes because of apparent
      misbehaviour, not just because of congestion (Section 6);

   o  Tunnel entry behaviour is still to be defined, but may have to be
      different from RFC3168 (Section 5.6).

   None of these changes REQUIRE any modifications to routers.  Also
   none of these changes affect anything about end to end congestion
   control; they are all to do with allowing networks to police that end
   to end congestion control is well-behaved.

7.2.  Incremental Deployment Incentives

   It would only be worth standardising the re-ECN protocol if there
   existed a coherent story for how it might be incrementally deployed.
   In order for it to have a chance of deployment, everyone who needs to
   act, must have a strong incentive to act, and the incentives must
   arise in the order that deployment would have to happen.  Re-ECN
   works around unmodified ECN routers, but we can't just discuss why
   and how re-ECN deployment might build on ECN deployment, because
   there is precious little to build on in the first place.  Instead, we
   aim to show that re-ECN deployment could carry ECN with it.  We focus
   on commercial deployment incentives, although some of the arguments
   apply equally to academic or government sectors.

   ECN deployment:

      ECN is largely implemented in commercial routers, but generally
      not as a supported feature, and it has largely not been deployed
      by commercial network operators.  It has been released in many
      Unix-based operating systems, but not in proprietary OSs like
      Windows or those in many mobile devices.  For detailed deployment
      status, see [ECN-Deploy].  We believe the reason ECN deployment
      has not happened is twofold:

      *  ECN requires changes to both routers and hosts.  If someone
         wanted to sell the improvement that ECN offers, they would have
         to co-ordinate deployment of their product with others.  An ECN
         server only gives any improvement on an ECN network.  An ECN
         network only gives any improvement if used by ECN devices.
         Deployment that requires co-ordination adds cost and delay and
         tends to dilute any competitive advantage that might be gained.

      *  ECN `only' gives a performance improvement.  Making a product a
         bit faster (whether the product is a device or a network),
         isn't usually a sufficient selling point to be worth the cost
         of co-ordinating across the industry to deploy it.  Network
         operators tend to avoid re-configuring a working network unless
         launching a new product.

   ECN and re-ECN for Edge-to-edge Assured QoS:

      We believe the proposal to provide assured QoS sessions using a
      form of ECN called pre-congestion notification (PCN) [CL-deploy]
      is most likely to break the deadlock in ECN deployment first.  It
      only requires edge-to-edge deployment so it does not require
      endpoint support.  It can be deployed in a single network, then
      grow incrementally to interconnected networks.  And it provides a
      different `product' (internetworked assured QoS), rather than
      merely making an existing product a bit faster.

      Not only could this assured QoS application kick-start ECN
      deployment, it could also carry re-ECN deployment with it; because
      re-ECN can enable the assured QoS region to expand to a large
      internetwork where neighbouring networks do not trust each other.
      [Re-PCN] argues that re-ECN security should be built in to the QoS
      system from the start, explaining why and how.

      If ECN and re-ECN were deployed edge-to-edge for assured QoS,
      operators would gain valuable experience.  They would also clear
      away many technical obstacles such as firewall configurations that
      block all but the legacy settings of the ECN field and the RE
      flag.

   ECN in Access Networks:

      The next obstacle to ECN deployment would be extension to access
      and backhaul networks, where considerable link layer differences
      make implementation non-trivial, particularly on congested
      wireless links.  ECN and re-ECN work fine during partial
      deployment, but they will not be very useful if the most congested
      elements in networks are the last to support them.  Access network
      support is one of the weakest parts of this deployment story.  All
      we can hope is that, once the benefits of ECN are better
      understood by operators, they will push for the necessary link
      layer implementations as deployment proceeds.

   Policing Unresponsive Flows:

      Re-ECN allows a network to offer differentiated quality of service
      as explained in Section 6.2.2.  But we do not believe this will
      motivate initial deployment of re-ECN, because the industry is
      already set on alternative ways of doing QoS.  Despite being much
      more complicated and expensive, the alternative approaches are
      here and now.

      But re-ECN is critical to QoS deployment in another respect.  It
      can be used to prevent applications from taking whatever bandwidth
      they choose without asking.

      Currently, applications that remain resolute in their lack of
      response to congestion are rewarded by other TCP applications.  In
      other words, TCP is naively friendly, in that it reduces its rate
      in response to congestion whether it is competing with friends
      (other TCPs) or with enemies (unresponsive applications).

      Therefore, those network owners that want to sell QoS will be keen
      to ensure that their users can't help themselves to QoS for free.
      Given the very large revenues at stake, we believe effective
      policing of congestion response will become highly sought after by
      network owners.

      But this does not necessarily argue for re-ECN deployment.
      Network owners might choose to deploy bottleneck policers rather
      than re-ECN-based policing.  However, under Related Work
      (Section 9) we argue that bottleneck policers are inherently
      vulnerable to circumvention.

      Therefore we believe there will be a strong demand from network
      owners for re-ECN deployment so they can police flows that do not
      ask to be unresponsive to congestion, in order to protect their
      revenues from flows that do ask (QoS).  In particular, we suspect
      that the operators of cellular networks will want to prevent VoIP
      and video applications being used freely on their networks as a
      more open market develops in GPRS and 3G devices.

      Initial deployments are likely to be isolated to single cellular
      networks.  Cellular operators would first place requirements on
      device manufacturers to include re-ECN in the standards for mobile
      devices.  In parallel, they would put out tenders for ingress and
      egress policers.  Then, after a while they would start to tighten
      rate limits on Not-ECT traffic from non-standard devices and they
      would start policing whatever non-accredited applications people
      might install on mobile devices with re-ECN support in the
      operating system.  This would force even independent mobile device
      manufacturers to provide re-ECN support.  Early standardisation
      across the cellular operators is likely, including interconnection
      agreements with penalties for excess downstream congestion.

      We suspect some fixed broadband networks (whether cable or DSL)
      would follow a similar path.  However, we also believe that larger
      parts of the fixed Internet would not choose to police on a per-
      flow basis.  Some might choose to police congestion on a per-user
      basis in order to manage heavy peer-to-peer file-sharing, but it
      seems likely that a sizeable majority would not deploy any form of
      policing.

      This hybrid situation begs the question, "How does re-ECN work for
      networks that choose to use policing if they connect with others
      that don't?"  Traffic from non-ECN capable sources will arrive
      from other networks and cause congestion within the policed, ECN-
      capable networks.  So networks that chose to police congestion
      would rate-limit Not-ECT traffic throughout their network,
      particularly at their borders.  They would probably also set
      higher usage prices in their interconnection contracts for
      incoming Not-ECT and Not-RECT traffic.  We assume that
      interconnection contracts between networks in the same tier will
      include congestion penalties before contracts with provider
      backbones do.

      A hybrid situation could remain for all time.  As was explained in
      the introduction, we believe in healthy competition between
      policing and not policing, with no imperative to convert the whole
      world to the religion of policing.  Networks that chose not to
      deploy egress droppers would leave themselves open to being
      congested by senders in other networks.  But that would be their
      choice.

      The important aspect of the egress dropper though is that it most
      protects the network that deploys it.  If a network does not
      deploy an egress dropper, sources sending into it from other
      networks will be able to understate the congestion they are
      causing.  Whereas, if a network deploys an egress dropper, it can
      know how much congestion other networks are dumping into it, and
      apply penalties or charges accordingly.  So, whether or not a
      network polices its own sources at ingress, it is in its interests
      to deploy an egress dropper.

   Host support:

      In the above deployment scenario, host operating system support
      for re-ECN came about through the cellular operators demanding it
      in device standards (i.e. 3GPP).  Of course, increasingly, mobile
      devices are being built to support multiple wireless technologies.
      So, if re-ECN were stipulated for cellular devices, it would
      automatically appear in those devices connected to the wireless
      fringes of fixed networks if they coupled cellular with WiFi or
      Bluetooth technology, for instance.  Also, once implemented in the
      operating system of one mobile device, it would tend to be found
      in other devices using the same family of operating system.

      Therefore, whether or not a fixed network deployed ECN, or
      deployed re-ECN policers and droppers, many of its hosts might
      well be using re-ECN over it.  Indeed, they would be at an
      advantage when communicating with hosts across re-ECN policed
      networks that rate limited Not-RECT traffic.

   Other possible scenarios:

      The above is thankfully not the only plausible scenario we can
      think of.  One of the many clubs of operators that meet regularly
      around the world might decide to act together to persuade a major
      operating system manufacturer to implement re-ECN.  And they may
      agree between them on an interconnection model that includes
      congestion penalties.

      Re-ECN provides an interesting opportunity for device
      manufacturers as well as network operators.  Policers can be
      configured loosely when first deployed.  Then as re-ECN take-up
      increases, they can be tightened up, so that a network with re-ECN
      deployed can gradually squeeze down the service provided to legacy
      devices that have not upgraded to re-ECN.  Many device vendors
      rely on replacement sales.  And operating system companies rely
      heavily on new release sales.  Also support services would like to
      be able to force stragglers to upgrade.  So, the ability to
      throttle service to legacy operating systems is quite valuable.

      Also, policing unresponsive sources may not be the only or even
      the first application that drives deployment.  It may be policing
      causes of heavy congestion (e.g. peer-to-peer file-sharing).  Or
      it may be mitigation of denial of service.  Or we may be wrong in
      thinking simpler QoS will not be the initial motivation for re-ECN
      deployment.  Indeed, the combined pressure for all these may be
      the motivator, but it seems optimistic to expect such a level of
      joined-up thinking from today's communications industry.  We
      believe a single application alone must be a sufficient motivator.

      In short, everyone gains from adding accountability to TCP/IP,
      except the selfish or malicious.  So, deployment incentives tend
      to be strong.

8.  Architectural Rationale

   In the Internet's technical community the danger of not responding to
   congestion is well-understood, with its attendant risk of congestion
   collapse [RFC3714].  However, many of the Internet's commercial
   community consider that the very essence of IP is to provide open
   access to the internetwork for all applications.  Congestion is seen
   as a symptom of over-conservative investment.  And the goal of
   application design is to find novel ways to continue working despite
   congestion.  They argue that the Internet was never intended to be
   solely for TCP-friendly applications.  Another side of the Internet's
   commercial community believes that it is no use providing a network
   for novel applications if it has insufficient capacity.  And it will
   always have insufficient capacity unless a greater share of
   application revenues can be /assured/ for the infrastructure
   provider.  Otherwise the major investments required will carry too
   much risk and won't happen.

   The lesson articulated in [Tussle] is that we shouldn't embed our
   view on these arguments into the Internet at design time.  Instead we
   should design the Internet so that the outcome of these arguments can
   get decided at run-time.  Re-ECN is designed in that spirit.  Once
   the protocol is available, different network operators can choose how
   liberal they want to be in holding people accountable for the
   congestion they cause.  Some might boldly invest in capacity and not
   police its use at all, hoping that novel applications will result.
   Others might use re-ECN for fine-grained flow policing, expecting to
   make money selling vertically integrated services.  Yet others might
   sit somewhere half-way, perhaps doing coarse, per-user policing.  All
   might change their minds later.  But re-ECN always allows them to
   interconnect so that the careful ones can protect themselves from the
   liberal ones.

   The incentive-based approach used for re-ECN is based on Gibbens and
   Kelly's arguments [Evol_cc] on allowing endpoints the freedom to
   evolve new congestion control algorithms for new applications.  They
   ensured responsible behaviour despite everyone's self-interest by
   applying pricing to ECN marking, and Kelly had proved stability and
   optimality in an earlier paper.

   Re-ECN keeps all the underlying economic incentives, but rearranges
   the feedback.  The idea is to allow a network operator (if it
   chooses) to deploy engineering mechanisms like policers at the front
   of the network which can be designed to behave /as if/ they are
   responding to congestion prices.  Rather than having to subject users
   to congestion pricing, networks can then use more traditional
   charging regimes (or novel ones).  But the engineering can constrain
   the overall amount of congestion a user can cause.  This provides a
   buffer against completely outrageous congestion control, but still
   makes it easy for novel applications to evolve if they need different
   congestion control to the norms.  It also allows novel charging
   regimes to evolve.

   Despite being achieved with a relatively minor protocol change, re-
   ECN is an architectural change.  Previously, Internet congestion
   could only be controlled by the data sender, because it was the only
   one both in a position to control the load and in a position to see
   information on congestion.  Re-ECN levels the playing field.  It
   recognises that the network also has a role to play in moderating
   (policing) congestion control.  But policing is only truly effective
   at the first ingress into an internetwork, whereas path congestion
   was previously only visible at the last egress.  So, re-ECN
   democratises congestion information.  Then the choice over who
   actually controls congestion can be made at run-time, not design
   time---a bit like an aircraft with dual controls.  And different
   operators can make different choices.  We believe non-architectural
   approaches to this problem are unlikely to offer more than partial
   solutions (see Section 9).

   Importantly, re-ECN does NOT REQUIRE assumptions about specific
   congestion responses to be embedded in any network elements, except
   at the first ingress to the internetwork if that level of control is
   desired by the ingress operator.  But such tight policing will be a
   matter of agreement between the source and its access network
   operator.  The ingress operator need not police congestion response
   at flow granularity; it can simply hold a source responsible for the
   aggregate congestion it causes, perhaps keeping it within a monthly
   congestion quota.  Or if the ingress network trusts the source, it
   can do nothing.
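
   The aggregate form of accountability mentioned above, a monthly
   congestion quota, might be sketched as follows.  This is an
   illustration under stated assumptions, not a specified mechanism:
   it assumes the ingress operator counts the bytes of packets in
   which the source declares congestion, and the class and method
   names are hypothetical.

```python
class CongestionQuota:
    """Hypothetical per-source account of declared congestion."""

    def __init__(self, monthly_quota_bytes: int):
        if monthly_quota_bytes < 0:
            raise ValueError("quota must not be negative")
        self.monthly_quota_bytes = monthly_quota_bytes
        self.used_bytes = 0

    def record(self, packet_bytes: int, declares_congestion: bool) -> None:
        """Charge the packet size only if the packet declares congestion."""
        if declares_congestion:
            self.used_bytes += packet_bytes

    def remaining_bytes(self) -> int:
        """Quota left this month, never reported as negative."""
        return max(0, self.monthly_quota_bytes - self.used_bytes)

    def within_quota(self) -> bool:
        """True while the source has not exceeded its monthly quota."""
        return self.used_bytes <= self.monthly_quota_bytes

    def start_new_month(self) -> None:
        """Reset the account at the start of each accounting period."""
        self.used_bytes = 0
```

   Note that no per-flow state and no assumption about the congestion
   response of any flow is needed here; what the operator does once a
   source exceeds its quota is a matter of local policy.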

   Therefore, the aim of the re-ECN protocol is NOT solely to police
   TCP-friendliness.  Re-ECN preserves IP as a generic network layer for
   all sorts of responses to congestion, for all sorts of transports.
   Re-ECN merely ensures truthful downstream congestion information is
   available in the network layer for all sorts of accountability
   applications.

   The end to end design principle does not say that all functions
   should be moved out of the lower layers---only those functions that
   are not generic to all higher layers.  Re-ECN adds a function to the
   network layer that is generic, but was omitted: accountability for
   causing congestion.  Accountability is not something that an end-user
   can provide to themselves.  We believe re-ECN adds no more than is
   sufficient to hold each flow accountable, even if it consists of a
   single datagram.

   "Accountability" implies being able to identify who is responsible
   for causing congestion.  However, at the network layer it would NOT
   be useful to identify the cause of congestion by adding individual or
   organisational identity information, NOR by using source IP
   addresses.  Rather than bringing identity information to the point of
   congestion, we bring downstream congestion information to the point
   where the cause can be most easily identified and dealt with.  That
   is, at any trust boundary congestion can be associated with the
   physically connected upstream neighbour that is directly responsible
   for causing it (whether intentionally or not).  A trust boundary
   interface is exactly the place to police or throttle in order to
   directly mitigate congestion, rather than having to trace the
   (ir)responsible party in order to shut them down.

   Some considered that ECN itself was a layering violation.  The
   reasoning went that the interface to a layer should provide a service
   to the higher layer and hide how the lower layer does it.  However,
   ECN reveals the state of the network layer and below to the transport
   layer.  A more positive way to describe ECN is that it is like the
   return value of a function call to the network layer.  It explicitly
   returns the status of the request to deliver a packet, by returning a
   value representing the current risk that a packet will not be served.
   Re-ECN has similar semantics, except the transport layer must try to
   guess the return value, then it can use the actual return value from
   the network layer to modify the next guess.

9.  Related Work

   {Due to lack of time, this section is incomplete.  The reader is
   referred to the Related Work section of [Re-fb] for a brief selection
   of related ideas.}

9.1.  Policing Rate Response to Congestion

   ATM network elements send congestion back-pressure messages [ITU-
   T.I.371] along each connection, duplicating any end to end feedback
   because they don't trust it.  On the other hand, re-ECN ensures
   information in forwarded packets can be used for congestion
   management without requiring a connection-oriented architecture and
   re-using the overhead of fields that are already set aside for end to
   end congestion control (and routing loop detection in the case of re-
   TTL in Appendix F).

   We borrowed ideas from policers in the literature [pBox],[XCHOKe],
   AFD etc. for our rate equation policer.  However, without the benefit
   of re-ECN they don't police the correct rate for the condition of
   their path.  They detect unusually high /absolute/ rates, but only
   while the policer itself is congested, because they work by detecting
   prevalent flows in the discards from the local RED queue.  These
   policers must sit at every potential bottleneck, whereas our policer
   need only be located at each ingress to the internetwork.  As Floyd &
   Fall explain [pBox], the limitation of their approach is that a high
   sending rate might be perfectly legitimate, if the rest of the path
   is uncongested or the round trip time is short.  Commercially
   available rate policers cap the rate of any one flow.  Or they
   enforce monthly volume caps in an attempt to control high volume
   file-sharing.  They limit the value a customer derives.  They might
   also limit the congestion customers can cause, but only as an
   accidental side-effect.  They actually punish traffic that fills
   troughs as much as traffic that causes peaks in utilisation.  In
   practice network operators need to be able to allocate service by
   cost during congestion, and by value at other times.

9.2.  Congestion Notification Integrity

   The choice of two ECT code-points in the ECN field [RFC3168]
   permitted future flexibility, optionally allowing the sender to
   encode the experimental ECN nonce [RFC3540] in the packet stream.

   The ECN nonce is an elegant scheme that allows the sender to detect
   if someone in the feedback loop tries to claim no congestion was
   experienced when in fact it was (whether drop or ECN marking).  The
   sender chooses between the two ECT codepoints in a pseudo-random
   sequence.  Then, whenever the network marks a packet with CE, to deny
   the congestion happened, the cheater would have to guess which ECT
   codepoint was overwritten, with only a 50:50 chance of being correct
   each time.
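
   To make the arithmetic concrete, the following sketch computes the
   chance that a cheater conceals a given number of congestion marks
   without being detected.  It assumes each guess is independent and
   succeeds with probability one half, as the 50:50 chance above
   suggests; the function name is hypothetical.

```python
from fractions import Fraction


def undetected_probability(concealed_marks: int) -> Fraction:
    """Chance of guessing every overwritten ECT codepoint correctly."""
    if concealed_marks < 0:
        raise ValueError("number of concealed marks must not be negative")
    # Each independent guess is right with probability 1/2.
    return Fraction(1, 2) ** concealed_marks
```

   Under these assumptions, concealing ten marks succeeds with
   probability 1/1024, so sustained cheating is very likely to be
   detected.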

   The assumption behind the ECN nonce is that a sender will want to
   detect whether a receiver is suppressing congestion feedback.  This
   is only true if the sender's interests are aligned with the
   network's, or with the community of users as a whole.  This may be
   true for certain large senders, who are under close scrutiny and have
   a reputation to maintain.  But we have to deal with a more hostile
   world, where traffic may be dominated by peer-to-peer transfers,
   rather than downloads from a few popular sites.  Often the `natural'
   self-interest of a sender is not aligned with the interests of other
   users.  It often wishes to transfer data quickly to the receiver as
   much as the receiver wants the data quickly.

   In contrast, the re-ECN protocol enables policing of an agreed rate-
   response to congestion (e.g. TCP-friendliness) at the sender's
   interface with the internetwork.  It also ensures downstream networks
   can police their upstream neighbours, to encourage them to police
   their users in turn.  But most importantly, it requires the sender to
   declare path congestion to the network and it can remove traffic at
   the egress if this declaration is dishonest.  So it can police
   correctly, irrespective of whether the receiver tries to suppress
   congestion feedback or whether the sender ignores genuine congestion
   feedback.  Therefore the re-ECN protocol addresses a much wider range
   of cheating problems, which includes the one addressed by the ECN
   nonce.

9.3.  Identifying Upstream and Downstream Congestion

   Purple [Purple] proposes that routers should use the CWR flag in the
   TCP header of ECN-capable flows to work out path congestion and
   therefore downstream congestion in a similar way to re-ECN.  However,
   because CWR is in the transport layer, it is not always visible to
   network layer routers and policers.  Purple's motivation was to
   improve AQM, not policing.  But, of course, nodes trying to avoid a
   policer would not be expected to allow CWR to be visible.

10.  Security Considerations

   This whole memo concerns the deployment of a secure congestion
   control framework.  However, below we list some specific security
   issues that we are still working on:

   o  Malicious users have the ability to launch dynamically changing
      attacks, exploiting the time it takes to detect an attack, given
      ECN marking is binary.  We are concentrating on subtle
      interactions between the ingress policer and the egress dropper in
      an effort to make it impossible to game the system.

   o  There is an inherent need for at least some flow state at the
      egress dropper given the binary marking environment, which leads
      to an apparent vulnerability to state exhaustion attacks.  An
      egress dropper design with bounded flow state is in write-up.

   o  A malicious source can spoof another user's address and send
      negative traffic to the same destination in order to fool the
      dropper into sanctioning the other user's flow.  To prevent or
      mitigate these two different kinds of DoS attack, against the
      dropper and against given flows, we are considering various
      protection mechanisms.  Section 5.5.1 discusses one of these.

   o  A malicious client can send requests using a spoofed source
      address to a server (such as a DNS server) that tends to respond
      with single packet responses.  This server will then be tricked
      into having to set FNE on the first (and only) packet of all these
      wasted responses.  Given packets marked FNE are worth +1, this
      will cause such servers to consume more of their allowance to
      cause congestion than they would wish to.  In general, re-ECN is
      deliberately designed so that single packet flows have to bear the
      cost of not discovering the congestion state of their path.  One
      of the reasons for introducing re-ECN is to encourage short flows
      to make use of previous path knowledge by moving the cost of this
      lack of knowledge to sources that create short flows.  Therefore,
      in the long run we might expect services like DNS to aggregate
      single packet flows into connections where it brings benefits.
      However, this attack where DNS requests are made from spoofed
      addresses genuinely forces the server to waste its resources.  The
      only mitigating feature is that the attacker has to set FNE on
      each of its requests if they are to get through an egress dropper
      to a DNS server.  The attacker therefore has to consume as many
      resources as the victim, which at least implies re-ECN does not
      unwittingly amplify this attack.

   Having highlighted outstanding security issues, we now explain the
   design decisions that were taken based on a security-related
   rationale.  It may seem that the six codepoints of the eight made
   available by extending the ECN field with the RE flag have been used
   rather wastefully to encode just five states.  In effect the RE flag
   has been used as an orthogonal single bit, using up four codepoints
   to encode the three states of positive, neutral and negative worth.
   The mapping of the codepoints in an earlier version of this proposal
   used the codepoint space more efficiently, but the scheme became
   vulnerable to network operators bypassing congestion penalties by
   focusing congestion marking on positive packets.  Appendix B explains
   why fixing that problem, while allowing for incremental deployment,
   would have used another codepoint anyway.  So it was better to use
   this orthogonal encoding scheme, which greatly simplified the whole
   protocol and brought with it some subtle security benefits.

   With the scheme as now proposed, once the RE flag is set or cleared
   by the sender or its proxy, it should not be written by the network,
   only read.  So the gateways can detect if any network maliciously
   alters the RE flag.  IPSec AH integrity checking does not cover the
   IPv4 option flags (they were considered mutable---even the one we
   propose using for the RE flag that was `currently unused' when IPSec
   was defined).  But it would be sufficient for a pair of gateways to
   make random checks on whether the RE flag was the same when it
   reached the egress gateway as when it left the ingress.  Indeed, if
   IPSec AH had covered the RE flag, any network intending to alter
   sufficient RE flags to make a gain would have focused its alterations
   on packets without authenticating headers (AHs).

   The security of re-ECN has been deliberately designed to not rely on
   cryptography.

11.  IANA Considerations

   This memo includes no request to IANA (yet).

   If this memo were to progress to the standards track, it would list:

   o  The new RE flag in IPv4 (Section 5.1) and its extension with the
      ECN field to create a new set of extended ECN (EECN) codepoints;

   o  The definition of the EECN codepoints for default Diffserv PHBs
      (Section 3.2);

   o  The new extension header for IPv6 (Section 5.2);

   o  The new combinations of flags in the TCP header for capability
      negotiation (Section 4.1.3);

   o  The new ICMP message type (Section 5.5.1).

12.  Conclusions

   {ToDo:}

13.  Acknowledgements

   Sebastien Cazalet and Andrea Soppera contributed to the idea of re-
   feedback.  All the following have given helpful comments: Andrea
   Soppera, David Songhurst, Peter Hovell, Louise Burness, Phil Eardley,
   Steve Rudkin, Marc Wennink, Fabrice Saffre, Cefn Hoile, Steve Wright,
   John Davey, Martin Koyabe, Carla Di Cairano-Gilfedder, Alexandru
   Murgu, Nigel Geffen, Pete Willis, John Adams (BT), Sally Floyd
   (ICIR), Joe Babiarz, Kwok Ho-Chan (Nortel), Stephen Hailes, Mark
   Handley (who developed the attack with canceled packets), Adam
   Greenhalgh (who developed the attack on DNS) (UCL), Jon Crowcroft
   (Uni Cam), David Clark, Bill Lehr, Sharon Gillett, Steve Bauer (who
   complemented our own dummy traffic attacks with others), Liz Maida
   (MIT), and comments from participants in the CRN/CFP Broadband and
   DoS-resistant Internet working groups.

14.  Comments Solicited

   Comments and questions are encouraged and very welcome.  They can be
   addressed to the IETF Transport Area working group's mailing list
   <tsvwg@ietf.org>, and/or to the authors.

15.  References

15.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2309]  Braden, B., Clark, D., Crowcroft, J., Davie, B., Deering,
              S., Estrin, D., Floyd, S., Jacobson, V., Minshall, G.,
              Partridge, C., Peterson, L., Ramakrishnan, K., Shenker,
              S., Wroclawski, J., and L. Zhang, "Recommendations on
              Queue Management and Congestion Avoidance in the
              Internet", RFC 2309, April 1998.

   [RFC2581]  Allman, M., Paxson, V., and W. Stevens, "TCP Congestion
              Control", RFC 2581, April 1999.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP",
              RFC 3168, September 2001.

   [RFC3390]  Allman, M., Floyd, S., and C. Partridge, "Increasing TCP's
              Initial Window", RFC 3390, October 2002.

   [RFC3540]  Spring, N., Wetherall, D., and D. Ely, "Robust Explicit
              Congestion Notification (ECN) Signaling with Nonces",
              RFC 3540, June 2003.

15.2.  Informative References

   [ARI05]    Adams, J., Roberts, L., and A. IJsselmuiden, "Changing the
              Internet to Support Real-Time Content Supply from a Large
              Fraction of Broadband Residential Users", BT Technology
              Journal (BTTJ) 23(2), April 2005.

   [Bauer06]  Bauer, S., Faratin, P., and R. Beverly, "Assessing the
              assumptions underlying mechanism design for the Internet",
              Proc. Workshop on the Economics of Networked Systems
              (NetEcon06) , June 2006, <http://www.cs.duke.edu/nicl/
              netecon06/papers/ne06-assessing.pdf>.

   [CL-deploy]
              Briscoe, B., Eardley, P., Songhurst, D., Le Faucheur, F.,
              Charny, A., Babiarz, J., Chan, K., Westberg, L., Bader,
              A., and G. Karagiannis, "A Deployment Model for Admission
              Control over DiffServ using Pre-Congestion Notification",
              draft-briscoe-tsvwg-cl-architecture-03 (work in progress),
              June 2006.

   [CLoop_pol]
              Salvatori, A., "Closed Loop Traffic Policing", Politecnico
              Torino and Institut Eurecom Masters Thesis ,
              September 2005.

   [ECN-Deploy]
              Floyd, S., "ECN (Explicit Congestion Notification) in
              TCP/IP; Implementation and Deployment of ECN", Web-page ,
              May 2004,
              <http://www.icir.org/floyd/ecn.html#implementations>.

   [ECN-MPLS]
              Davie, B., Briscoe, B., and J. Tay, "Explicit Congestion
              Marking in MPLS", draft-davie-ecn-mpls-00 (work in
              progress), June 2006.

   [Evol_cc]  Gibbens, R. and F. Kelly, "Resource pricing and the
              evolution of congestion control", Automatica 35(12)1969--
              1985, December 1999,
              <http://www.statslab.cam.ac.uk/~frank/evol.html>.

   [I-D.ietf-tsvwg-ecnsyn]
              Kuzmanovic, A., "Adding Explicit Congestion Notification
              (ECN) Capability to TCP's SYN/ACK Packets",
              draft-ietf-tsvwg-ecnsyn-00 (work in progress),
              November 2005.

   [ITU-T.I.371]
              ITU-T, "Traffic Control and Congestion Control in
              {B-ISDN}", ITU-T Rec. I.371 (03/04), March 2004.

   [Jiang02]  Jiang, H. and C. Dovrolis, "Passive Estimation of TCP
              Round-Trip Times", ACM SIGCOMM CCR 32(3)75-88, July 2002,
              <http://doi.acm.org/10.1145/571697.571725>.

   [Mathis97]
              Mathis, M., Semke, J., Mahdavi, J., and T. Ott, "The
              Macroscopic Behavior of the TCP Congestion Avoidance
              Algorithm", ACM SIGCOMM CCR 27(3)67--82, July 1997,
              <http://doi.acm.org/10.1145/263932.264023>.

   [Purple]   Pletka, R., Waldvogel, M., and S. Mannal, "PURPLE:
              Predictive Active Queue Management Utilizing Congestion
              Information", Proc. Local Computer Networks (LCN 2003) ,
              October 2003.

   [RFC2208]  Mankin, A., Baker, F., Braden, B., Bradner, S., O'Dell,
              M., Romanow, A., Weinrib, A., and L. Zhang, "Resource
              ReSerVation Protocol (RSVP) Version 1 Applicability
              Statement Some Guidelines on Deployment", RFC 2208,
              September 1997.

   [RFC2402]  Kent, S. and R. Atkinson, "IP Authentication Header",
              RFC 2402, November 1998.

   [RFC2406]  Kent, S. and R. Atkinson, "IP Encapsulating Security
              Payload (ESP)", RFC 2406, November 1998.

   [RFC2475]  Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z.,
              and W. Weiss, "An Architecture for Differentiated
              Services", RFC 2475, December 1998.

   [RFC2988]  Paxson, V. and M. Allman, "Computing TCP's Retransmission
              Timer", RFC 2988, November 2000.

   [RFC3124]  Balakrishnan, H. and S. Seshan, "The Congestion Manager",
              RFC 3124, June 2001.

   [RFC3514]  Bellovin, S., "The Security Flag in the IPv4 Header",
              RFC 3514, April 2003.

   [RFC3714]  Floyd, S. and J. Kempf, "IAB Concerns Regarding Congestion
              Control for Voice Traffic in the Internet", RFC 3714,
              March 2004.

   [RFC4340]  Kohler, E., Handley, M., and S. Floyd, "Datagram
              Congestion Control Protocol (DCCP)", RFC 4340, March 2006.

   [Re-PCN]   Briscoe, B., "Emulating Border Flow Policing using Re-ECN
              on Bulk Data", draft-briscoe-tsvwg-re-ecn-border-cheat-01
              (work in progress), March 2006.

   [Re-fb]    Briscoe, B., Jacquet, A., Di Cairano-Gilfedder, C.,
              Salvatori, A., Soppera, A., and M. Koyabe, "Policing
              Congestion Response in an Internetwork Using Re-Feedback",
              ACM SIGCOMM CCR 35(4)277--288, August 2005, <http://
              www.acm.org/sigs/sigcomm/sigcomm2005/
              techprog.html#session8>.

   [Smart_rtg]
              Goldenberg, D., Qiu, L., Xie, H., Yang, Y., and Y. Zhang,
              "Optimizing Cost and Performance for Multihoming", ACM
              SIGCOMM CCR 34(4)79--92, October 2004,
              <http://citeseer.ist.psu.edu/698472.html>.

   [Steps_DoS]
              Handley, M. and A. Greenhalgh, "Steps towards a DoS-
              resistant Internet Architecture", Proc. ACM SIGCOMM
              workshop on Future directions in network architecture
              (FDNA'04) pp 49--56, August 2004.

   [Tussle]   Clark, D., Sollins, K., Wroclawski, J., and R. Braden,
              "Tussle in Cyberspace: Defining Tomorrow's Internet", ACM
              SIGCOMM CCR 32(4)347--356, October 2002,
              <http://www.acm.org/sigcomm/sigcomm2002/papers/
              tussle.pdf>.

   [XCHOKe]   Chhabra, P., Chuig, S., Goel, A., John, A., Kumar, A.,
              Saran, H., and R. Shorey, "XCHOKe: Malicious Source
              Control for Congestion Avoidance at Internet Gateways",
              Proceedings of IEEE International Conference on Network
              Protocols (ICNP-02) , November 2002,
              <http://www.cc.gatech.edu/~akumar/xchoke.pdf>.

   [pBox]     Floyd, S. and K. Fall, "Promoting the Use of End-to-End
              Congestion Control in the Internet", IEEE/ACM Transactions
              on Networking 7(4) 458--472, August 1999,
              <http://www.aciri.org/floyd/end2end-paper.html>.

Appendix A.  Precise Re-ECN Protocol Operation

   The protocol operation described in Section 3.3 was an approximation.
   In fact, standard ECN router marking combines 1% and 2% marking into
   slightly less than 3% whole-path marking, because routers
   deliberately mark CE whether or not it has already been marked by
   another router upstream.  So the combined marking fraction would
   actually be 100% - (100% - 1%)(100% - 2%) = 2.98%.

   To generalise this we will need some notation.

   o  j represents the index of each resource (typically queues) along a
      path, ranging from 0 at the first router to n-1 at the last.

   o  m_j represents the fraction of octets *m*arked CE by a particular
      router (whether or not they are already marked) because of
      congestion of resource j.

   o  u_j represents congestion *u*pstream of resource j, being the
      fraction of CE marking in arriving packet headers (before
      marking).

   o  p_j represents *p*ath congestion, being the fraction of packets
      arriving at resource j with the RE flag blanked (excluding Not-
      RECT packets).

   o  v_j denotes expected congestion downstream of resource j, which
      can be thought of as a *v*irtual marking fraction, being derived
      from two other marking fractions.

   Observed fractions of each particular codepoint (u, p and v) and
   router marking rate m are dimensionless fractions, being the ratio of
   two data volumes (marked and total) over a monitoring period.  All
   measurements are in terms of octets, not packets, assuming that line
   resources are more congestible than packet processing.

   The path congestion (RE blanking fraction) set by the sender should
   reflect the upstream congestion (CE marking fraction) fed back from
   the destination.  Therefore in the steady state

      p_0  = u_n
           = 1 - (1 - m_0)(1 - m_1)...(1 - m_{n-1})

   Similarly, at some point j in the middle of the network, if p = 1 -
   (1 - u_j)(1 - v_j), then

      v_j  = 1 - (1 - p)/(1 - u_j)

          ~= p - u_j;                      if u_j << 100%

   So, between the two routers in the example in Section 3.3, congestion
   downstream is

      v_1  = 100.00% - (100% - 2.98%) / (100% - 1.00%)
           = 2.00%,

   or a useful approximation of downstream congestion is

      v_1 ~= 2.98% - 1.00%
          ~= 1.98%.
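
   As an illustration only, the arithmetic above can be reproduced with
   a short Python sketch.  The function names are hypothetical and are
   not part of the protocol; marking fractions are expressed as values
   between 0 and 1 rather than as percentages.

```python
def path_congestion(marking_fractions):
    """Whole-path CE fraction: 1 minus the product of (1 - m_j)."""
    unmarked = 1.0
    for m in marking_fractions:
        unmarked *= (1.0 - m)
    return 1.0 - unmarked


def downstream_congestion(p, u_j):
    """Exact downstream congestion: v_j = 1 - (1 - p)/(1 - u_j)."""
    return 1.0 - (1.0 - p) / (1.0 - u_j)


def downstream_congestion_approx(p, u_j):
    """Approximation v_j ~= p - u_j, reasonable when u_j << 100%."""
    return p - u_j


# Two routers marking 1% and 2%, as in the example of Section 3.3.
p = path_congestion([0.01, 0.02])
v_exact = downstream_congestion(p, 0.01)
v_approx = downstream_congestion_approx(p, 0.01)
print(f"{p:.2%} {v_exact:.2%} {v_approx:.2%}")  # prints "2.98% 2.00% 1.98%"
```

   The exact and approximate results differ by only 0.02 percentage
   points here because upstream congestion is small.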

Appendix B.  Justification for Two Codepoints Signifying Zero Worth
             Packets

   It may seem a waste of a codepoint to set aside two codepoints of the
   Extended ECN field to signify zero worth (RECT and CE(0) are both
   worth zero).  The justification is subtle, but worth recording.

   The original version of re-ECN ([Re-fb] and draft-00 of this memo)
   used three codepoints for neutral (ECT(1)), positive (ECT(0)) and
   negative (CE) packets.  The sender set packets to neutral unless re-
   echoing congestion, when it set them positive, in much the same way
   that it blanks the RE flag in the current protocol.  However, routers
   were meant to mark congestion by setting packets negative (CE)
   irrespective of whether they had previously been neutral or positive.

   However, we did not arrange for senders to remember which packet had
   been sent with which codepoint, or for feedback to say exactly which
   packets arrived with which codepoints.  The transport was meant to
   inflate the number of positive packets it sent to allow for a few
   being wiped out by congestion marking.  We (wrongly) assumed that
   routers would congestion mark packets indiscriminately, so the
   transport could infer how many positive packets had been marked and
   compensate accordingly by re-echoing.  But this created a perverse
   incentive for routers to preferentially congestion mark positive
   packets rather than neutral ones.

   We could have removed this perverse incentive by requiring re-ECN
   senders to remember which packets they had sent with which codepoint,
   and by requiring feedback from the receiver to identify which packets
   arrived as which.  Then, if a positive packet was congestion marked
   to negative, the sender could have re-echoed twice to maintain the
   balance between positive and negative at the receiver.

   Instead, we chose to make re-echoing congestion (blanking RE)
   orthogonal to congestion notification (marking CE), which required a
   second neutral codepoint (the orthogonal scheme forms the main square
   of four codepoints in Figure 2).  Then the receiver would be able to
   detect and echo a congestion event even if it arrived on a packet
   that had originally been positive.

   If we had added extra complexity to the sender and receiver
   transports to track changes to individual packets, we could have made
   it work, but then routers would have had an incentive to mark
   positive packets with half the probability of neutral packets.  That
   in turn would have led router algorithms to become more complex.
   Then senders wouldn't know whether a mark had been introduced by a
   simple or a complex router algorithm.  That in turn would have
   required another codepoint to distinguish between legacy ECN and new
   re-ECN router marking.

   Once the cost of IP header codepoint real-estate was the same for
   both schemes, there was no doubt that the simpler option for
   endpoints and for routers should be chosen.  The resulting protocol
   also no longer needed the tricky inflation/deflation complexity of
   the original (broken) scheme.  It was also much simpler to understand
   conceptually.

   A further advantage of the new orthogonal four-codepoint scheme was
   that senders owned sole rights to change the RE flag and routers
   owned sole rights to change the ECN field.  Although we still arrange
   the incentives so neither party strays outside their dominion, these
   clear lines of authority simplify the matter.

   Finally, a little redundancy can be very powerful in a scheme such as
   this.  In one flow, the proportion of packets changed to CE should be
   the same as the proportion of RECT packets changed to CE(-1) and the
   proportion of Re-Echo packets changed to CE(0).  Double checking
   using such redundant relationships can improve the security of a
   scheme (cf. double-entry book-keeping or the ECN Nonce).
   Alternatively, it might be necessary to exploit the redundancy in the
   future to encode an extra information channel.
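
   The redundancy check described above might be sketched as follows.
   This is a minimal illustration under stated assumptions, not a
   specified algorithm: the function names, the per-flow counters and
   the tolerance value are all hypothetical, and the sketch assumes the
   checker knows both how much traffic was sent with each codepoint and
   how much of it arrived marked.

```python
def marking_fractions(rect_sent, reecho_sent, ce_minus1_rcvd, ce_zero_rcvd):
    """Return (overall, RECT, Re-Echo) marking fractions for one flow.

    rect_sent / reecho_sent: volume sent as RECT / Re-Echo (assumed > 0).
    ce_minus1_rcvd: volume that arrived as CE(-1), i.e. marked RECT.
    ce_zero_rcvd:   volume that arrived as CE(0), i.e. marked Re-Echo.
    """
    overall = (ce_minus1_rcvd + ce_zero_rcvd) / (rect_sent + reecho_sent)
    rect = ce_minus1_rcvd / rect_sent
    reecho = ce_zero_rcvd / reecho_sent
    return overall, rect, reecho


def marking_is_consistent(rect_sent, reecho_sent, ce_minus1_rcvd,
                          ce_zero_rcvd, tolerance=0.01):
    """True if both per-codepoint fractions match the overall fraction.

    The tolerance is an arbitrary illustrative value (an assumption).
    """
    overall, rect, reecho = marking_fractions(
        rect_sent, reecho_sent, ce_minus1_rcvd, ce_zero_rcvd)
    return (abs(rect - overall) <= tolerance
            and abs(reecho - overall) <= tolerance)


# Indiscriminate 3% marking: all three fractions agree.
fair = marking_is_consistent(9700, 300, 291, 9)
# Same overall 3% marking, but focused on Re-Echo (positive) packets.
biased = marking_is_consistent(9700, 300, 200, 100)
```

   In the second case the overall marking fraction is unchanged, but a
   third of the Re-Echo traffic arrives marked, so the check fails.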

Appendix C.  ECN Compatibility

   The rationale for choosing the particular combinations of SYN and SYN
   ACK flags in Section 4.1.3 is as follows.

   Choice of SYN flags: A re-ECN sender can work with vanilla ECN
      receivers so we wanted to use the same flags as would be used in
      an ECN-setup SYN [RFC3168] (CWR=1, ECE=1).  But at the same time,
      we wanted a server (host B) that is Re-ECT to be able to recognise
      that the client (A) is also Re-ECT.  We believe also setting NS=1
      in the initial SYN achieves both these objectives, as it should be
      ignored by vanilla ECT receivers and by ECT-Nonce receivers.  But
      senders that are not Re-ECT should not set NS=1.  At the time ECN
      was defined, the NS flag was not defined, so setting NS=1 should
      be ignored by existing ECT receivers (but testing against
      implementations may yet prove otherwise).  The ECN Nonce
      RFC [RFC3540] is silent on what the NS field might be set to in
      the TCP SYN, but we believe the intent was for a nonce client to
      set NS=0 in the initial SYN (again only testing will tell).
      Therefore we define a Re-ECN-setup SYN as one with NS=1, CWR=1 &
      ECE=1.

   Choice of SYN ACK flags: The client (A) needs to be able to determine
      whether the server (B) is Re-ECT.  The original ECN specification
      required an ECT server to respond to an ECN-setup SYN with an
      ECN-setup SYN ACK of CWR=0 and ECE=1.  There is no room to modify
      this by setting the NS flag, as that is already set in the SYN ACK
      of an ECT-Nonce server.  So we used the only combination of CWR
      and ECE that would not be used by existing TCP receivers: CWR=1
      and ECE=0.  The original ECN specification defines this
      combination as a non-ECN-setup SYN ACK, which remains true for
      vanilla and Nonce ECTs.  But for re-ECN we define it as a
      Re-ECN-setup SYN ACK.  We didn't use a SYN ACK with both CWR and
      ECE cleared to 0 because that would be the likely response from
      most Not-ECT receivers.  And we didn't use a SYN ACK with both CWR
      and ECE set to 1 either, as at least one broken receiver
      implementation echoes whatever flags were in the SYN into its SYN
      ACK.  Therefore we define a Re-ECN-setup SYN ACK as one with CWR=1
      & ECE=0.

   Choice of two alternative SYN ACKs: the NS flag may take either value
      in a Re-ECN-setup SYN ACK.  Section 5.4 specifies that a Re-ECT
      server MUST set the NS flag to 1 in a Re-ECN-setup SYN ACK to echo
      congestion experienced (CE) on the initial SYN.  Otherwise a Re-
      ECN-setup SYN ACK MUST be returned with NS=0.  The only currently
      known use of the NS flag in a SYN ACK is to indicate support for
      the ECN nonce, which will be negotiated by setting CWR=0 & ECE=1.
      Given the ECN nonce MUST NOT be used for a RECN mode connection, a
      Re-ECN-setup SYN ACK can use either setting of the NS flag without
      any risk of confusion, because the CWR & ECE flags will be
      reversed relative to those used by an ECN nonce SYN ACK.
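
   The flag combinations discussed in this appendix can be summarised
   in a short Python sketch.  It is illustrative only: the function
   names and the returned labels are hypothetical, and, as noted above,
   the behaviour of existing implementations still needs to be
   confirmed by testing.

```python
def classify_syn(ns, cwr, ece):
    """Classify the NS, CWR and ECE flags (each 0 or 1) of a TCP SYN."""
    if cwr == 1 and ece == 1:
        # NS=1 distinguishes a Re-ECT client; vanilla ECN servers are
        # expected to ignore NS and treat this as an ECN-setup SYN.
        return "Re-ECN-setup SYN" if ns == 1 else "ECN-setup SYN"
    return "non-ECN-setup SYN"


def classify_syn_ack(ns, cwr, ece):
    """Classify a SYN ACK as seen by a Re-ECT client."""
    if cwr == 1 and ece == 0:
        # NS may take either value in a Re-ECN-setup SYN ACK.
        return "Re-ECN-setup SYN ACK"
    if cwr == 0 and ece == 1:
        # NS=1 here indicates support for the ECN nonce.
        return "ECN-nonce SYN ACK" if ns == 1 else "ECN-setup SYN ACK"
    # CWR=0, ECE=0 is the likely Not-ECT response; CWR=1, ECE=1 may be
    # a broken receiver echoing the flags of the SYN.
    return "non-ECN-setup SYN ACK"


def syn_ack_echoes_ce(ns, cwr, ece):
    """True if a Re-ECN-setup SYN ACK echoes CE seen on the SYN (NS=1)."""
    return classify_syn_ack(ns, cwr, ece) == "Re-ECN-setup SYN ACK" and ns == 1
```

   For example, classify_syn(1, 1, 1) yields "Re-ECN-setup SYN", and a
   reply with NS=1, CWR=1 and ECE=0 is a Re-ECN-setup SYN ACK that also
   echoes congestion experienced on the initial SYN.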

Appendix D.  Packet Marking During Flow Start

   {ToDo: Write up proof that sender should mark FNE on first and third
   data packets, even with the largest allowed initial window.}

Appendix E.  Example Egress Dropper Algorithm

   {ToDo: Write up the basic algorithm with flow state, then the
   aggregated one.}

Appendix F.  Re-TTL

   This Appendix gives an overview of a proposal to be able to overload
   the TTL field in the IP header to monitor downstream propagation
   delay.  It is planned to fully write up this proposal in a future
   Internet Draft.

   Delay re-feedback can be achieved by overloading the TTL field,
   without changing IP or router TTL processing.  A target value for TTL
   at the destination would need standardising, say 16.  If the path hop
   count increased by more than 16 during a routing change, it would
   temporarily be mistaken for a routing loop, so this target would need
   to be chosen to exceed typical hop count increases.  The TCP wire
   protocol and handlers would need modifying to feed back the
   destination TTL and initialise it.  It would be necessary to
   standardise the unit of TTL in terms of real time (as was the
   original intent in the early days of the Internet).

   In the longer term, precision could be improved if routers
   decremented TTL to represent exact propagation delay to the next
   router.  That is, for a router to decrement TTL by, say, 1.8 time
   units it would alternate the decrement of every packet between 1 & 2
   at a ratio of 1:4.  Although this might sometimes require a seemingly
   dangerous null decrement, a packet in a loop would still decrement to
   zero after 255 time units on average.  As more routers were upgraded
   to this more accurate TTL decrement, path delay estimates would
   become increasingly accurate despite the presence of some legacy
   routers that continued to always decrement the TTL by 1.
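
   As a rough illustration of the fractional decrement described above,
   the following Python sketch alternates integer decrements so that
   their average matches a target such as 1.8 time units.  The function
   name dithered_decrements and the error-carrying schedule are
   assumptions made for illustration only; this appendix does not
   specify how a router would interleave the two values.

```python
import math


def dithered_decrements(target, count):
    """Yield `count` integer TTL decrements whose mean tracks `target`.

    Hypothetical helper: the fractional remainder is carried forward so
    that the sum of the decrements stays within one unit of
    count * target.
    """
    owed = 0.0  # time units owed but not yet decremented
    for _ in range(count):
        owed += target
        step = math.floor(owed + 1e-9)  # whole units that can be taken now
        owed -= step
        yield step


# Ten packets crossing a hop that is 1.8 time units long.
steps = list(dithered_decrements(1.8, 10))
```

   With a target of 1.8 the decrements of 1 and 2 occur in a ratio of
   1:4.  A target below 1 would make some decrements zero, which is the
   null decrement mentioned above.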

Appendix G.  Policer Designs to ensure Congestion Responsiveness

G.1.  Per-user Policing

   User policing requires a policer on the ingress interface of the
   access router associated with the user.  At that point, the traffic
   of the user hasn't diverged on different routes yet; nor has it mixed
   with traffic from other sources.

   In order to ensure that a user doesn't generate more congestion in
   the network than her due share, a modified bulk token-bucket is
   maintained with the following parameters:

   o  b_0 the initial token level

   o  r the filling rate

   o  b_max the bucket depth

   The same token bucket algorithm is used as in many areas of
   networking, but how it is used is very different:

   o  all traffic from a user over the lifetime of their subscription is
      policed in the same token bucket.

   o  only positive and canceled packets (Re-Echo, FNE and CE(0))
      consume tokens

   Such a policer will allow network operators to throttle the
   contribution of their users to network congestion.  This will require
   the appropriate contractual terms to be in place between operators
   and users.  For instance: a condition for a user to subscribe to a
   given network service may be that she should not cause more than a
   volume C_user of congestion over a reference period T_user, although
   she may carry forward up to N_user times her allowance at the end of
   each period.  These terms directly set the parameters of the user
   policer:

   o  b_0 = C_user

   o  r = C_user/T_user

   o  b_max = b_0 * (N_user +1)

   Besides the congestion budget policer above, another user policer may
   be necessary to further rate-limit FNE packets, if they are to be
   marked rather than dropped (see discussion in Section 5.3).  Rate-
   limiting FNE packets will prevent high bursts of new flow arrivals,
   which is a very useful feature in DoS prevention.  A condition to
   subscribe to a given network service would have to be that a user
   should not generate more than C_FNE FNE packets, over a reference
   period T_FNE, with no option to carry forward any of the allowance at
   the end of each period.  These terms directly set the parameters of
   the FNE policer:

   o  b_0 = C_FNE

   o  r = C_FNE/T_FNE

   o  b_max = b_0

   T_FNE should be a much shorter period than T_user: for instance T_FNE
   could be in the order of minutes while T_user could be in the order
   of weeks.
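
   The two per-user policers above can be sketched with a single token
   bucket class, as in the following Python fragment.  The class name
   TokenBucketPolicer, the string labels used for packet markings, the
   explicit time argument, and the choice to charge the congestion
   budget in bytes and the FNE budget in packets are all assumptions
   made for illustration.  Dropping a packet that would drive the level
   negative follows the enforcement rule given at the end of this
   appendix.

```python
class TokenBucketPolicer:
    """Token bucket drained only by the packet markings in `charged`."""

    def __init__(self, budget, period, carry_over, charged, now=0.0):
        self.rate = budget / period              # r = C / T
        self.depth = budget * (carry_over + 1)   # b_max = b_0 * (N + 1)
        self.level = float(budget)               # b_0 = C
        self.charged = frozenset(charged)
        self.last = now

    def forward(self, marking, cost, now):
        """Return True to forward the packet, False to drop it."""
        # Refill for the elapsed time, capped at the bucket depth.
        elapsed = now - self.last
        self.level = min(self.depth, self.level + self.rate * elapsed)
        self.last = now
        if marking not in self.charged:
            return True        # other packets consume no tokens
        if cost > self.level:
            return False       # would make the token level negative
        self.level -= cost
        return True


# Congestion budget: C_user = 1000 bytes per T_user = 100 s, N_user = 2.
user = TokenBucketPolicer(1000, 100.0, 2, {"Re-Echo", "FNE", "CE(0)"})
# FNE budget: C_FNE = 3 packets per T_FNE = 60 s, nothing carried forward.
fne = TokenBucketPolicer(3, 60.0, 0, {"FNE"})
```

   The values are deliberately tiny so that the behaviour is easy to
   follow; a real deployment would use the contractual values discussed
   above.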

G.2.  Per-flow Rate Policing

   Per-flow policing aims to enforce congestion responsiveness on the
   shortest information timescale on a network path: packet roundtrips.

   This again requires that the appropriate terms be agreed between a
   network operator and its users, where a congestion responsiveness
   policy might be required for the use of a given network service
   (perhaps unless the user specifically requests otherwise).

   As an example, we describe below how a rate adaptation policer can be
   designed when the applicable rate adaptation policy is TCP-
   compliance.  In that context, the average throughput of a flow will
   be expected to be bounded by the value of the TCP throughput during
   congestion avoidance, given by Mathis' formula [Mathis97]

      x_TCP = k * s / ( T * sqrt(m) )

   where:

   o  x_TCP is the throughput of the TCP flow in packets per second,

   o  k is a constant upper-bounded by sqrt(3/2),

   o  s is the average packet size of the flow,

   o  T is the roundtrip time of the flow,

   o  m is the congestion level experienced by the flow.

   We define the marking period N=1/m which represents the average
   number of packets between two positive or canceled packets.  Mathis'
   formula can be re-written as:

      x_TCP = k*s*sqrt(N)/T

   We can then get the average inter-mark time in a compliant TCP flow,
   dt_TCP, by solving (x_TCP/s)*dt_TCP = N which gives

      dt_TCP = sqrt(N)*T/k
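
   The derivation above can be checked numerically with the following
   Python sketch.  The function names and the sample values (1%
   congestion, a 100 ms roundtrip and 1500 byte packets) are assumptions
   chosen for illustration, and k is set to its upper bound sqrt(3/2).

```python
import math

K = math.sqrt(3 / 2)  # upper bound on the constant k


def tcp_rate(s, T, m, k=K):
    """Mathis' formula: x_TCP = k * s / (T * sqrt(m))."""
    return k * s / (T * math.sqrt(m))


def inter_mark_time(N, T, k=K):
    """dt_TCP = sqrt(N) * T / k, where N = 1/m is the marking period."""
    return math.sqrt(N) * T / k


# Illustrative values only.
m, T, s = 0.01, 0.1, 1500
N = 1 / m
dt = inter_mark_time(N, T)

# Packets sent during dt at the TCP rate; this should equal N.
packets_per_mark = (tcp_rate(s, T, m) / s) * dt
```

   For these values dt_TCP is roughly 0.82 seconds, during which a
   compliant flow sends N = 100 packets.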

   We rely on this equation for the design of a rate-adaptation policer
   as a variation of a token bucket.  In that case a policer has to be
   set up for each policed flow.  This may be triggered by FNE packets,
   with the remainder of flows being all rate limited together if they
   do not start with an FNE packet.

   Where maintaining per flow state is not a problem, for instance on
   some access routers, systematic per-flow policing may be considered.
   Should per-flow state be more constrained, rate adaptation policing
   could be limited to a random sample of flows exhibiting positive or
   canceled packets.

   As in the case of user policing, only positive or canceled packets
   will consume tokens, however the amount of tokens consumed will
   depend on the congestion signal.

   When a new rate adaptation policer is set up for flow j, the
   following state is created:

   o  a token bucket b_j of depth b_max starting at level b_0

   o  a timestamp t_j = timenow()

   o  a counter N_j = 0

   o  a roundtrip estimate T_j

   o  a filling rate r

   When the policing node forwards a packet of flow j with no Re-Echo:

   o  the counter is incremented: N_j += 1

   When the policing node forwards a packet of flow j carrying a
   congestion mark (CE):

   o  the counter is incremented: N_j += 1

   o  the token level is adjusted: b_j += r*(timenow()-t_j) - sqrt(N_j)*
      T_j/k

   o  the counter is reset: N_j = 0

   o  the timer is reset: t_j = timenow()

   An implementation example will be given in a later draft that avoids
   having to extract the square root.
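
   Pending that optimised version, the per-flow state and update rules
   listed above can be sketched directly in Python, square root
   included.  The class name FlowRatePolicer, the explicit time
   argument, the single boolean standing for "this packet triggers a
   token adjustment", and the cap of the level at b_max are assumptions
   made for illustration rather than part of the design above.

```python
import math

K = math.sqrt(3 / 2)  # constant k from Mathis' formula (upper bound)


class FlowRatePolicer:
    """Per-flow state b_j, t_j, N_j, T_j and r for one policed flow."""

    def __init__(self, b_0, b_max, rtt, r=1.0, now=0.0):
        self.level = b_0   # b_j, the token level
        self.depth = b_max
        self.t = now       # t_j, time of the last adjustment
        self.n = 0         # N_j, packets since the last adjustment
        self.rtt = rtt     # T_j, roundtrip estimate
        self.r = r         # filling rate

    def on_packet(self, marked, now):
        """Update the state for one forwarded packet; return b_j."""
        self.n += 1                                   # N_j += 1
        if marked:
            earned = self.r * (now - self.t)          # tokens filled
            owed = math.sqrt(self.n) * self.rtt / K   # dt_TCP for N_j
            self.level = min(self.depth, self.level + earned - owed)
            self.n = 0                                # N_j = 0
            self.t = now                              # t_j = timenow()
        return self.level


def run(speedup, marks=3, period=100, rtt=0.1):
    """Final token level of a flow sending `speedup` times the TCP rate."""
    dt = math.sqrt(period) * rtt / K   # compliant time between marks
    policer = FlowRatePolicer(b_0=5.0, b_max=10.0, rtt=rtt)
    now, level = 0.0, policer.level
    for i in range(1, marks * period + 1):
        now += dt / (period * speedup)
        level = policer.on_packet(marked=(i % period == 0), now=now)
    return level
```

   In this sketch run(1) leaves the bucket at its initial level, while
   run(4), a flow sending four times faster than a compliant TCP under
   the same marking period, drains it.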

   Analysis: For a TCP flow, for r= 1 token/sec, on average,

      r*(timenow()-t_j)-sqrt(N_j)* T_j/k = dt_TCP - sqrt(N)*T/k = 0

   This means that the token level will fluctuate around its initial
   level.  The depth b_max of the bucket sets the timescale on which the
   rate adaptation policy is performed while the filling rate r sets the
   trade-off between responsiveness and robustness:

   o  the higher b_max, the longer it will take to catch greedy flows

   o  the applicable higher r, the fewer false positives (greedy verdict on
      compliant flows) but the more false negatives (compliant verdict
      on greedy flows)

   This rate adaptation policer requires the availability of a roundtrip
   estimate which may be obtained for instance from the application of
   re-feedback to the downstream delay (Appendix F) or passive
   estimation [Jiang02].

   When the throughput bucket of a policer located at the TCP flow in packets per second,

   o  k access router (whether it
   is a constant upper-bounded by sqrt(3/2),

   o  s is per-user policer or a per-flow policer) becomes empty, the average packet size of
   access router SHOULD drop at least all packets causing the flow,

   o  T is token
   level to become negative.  The network operator MAY take further
   sanctions if the roundtrip time token level of the flow,

   o  m is per-flow policers associated with
   a user becomes negative.

Appendix H.  Downstream Congestion Metering Algorithms

H.1.  Bulk Downstream Congestion Metering Algorithm

   To meter the bulk amount of downstream congestion in traffic crossing
   an inter-domain border an algorithm is needed that accumulates the
   size of positive packets and subtracts the size of negative packets.
   We maintain two counters:

      V_b: accumulated congestion volume

      B: total data volume (in case it is needed)

   A suitable pseudo-code algorithm for a border router is as follows:

   ====================================================================
   V_b = 0
   B   = 0
   for each re-ECN-capable packet {
       b = readLength(packet)      /* set b to packet size          */
       B += b                      /* accumulate total volume       */
       if readEECN(packet) == (Re-Echo || FNE) {
           V_b += b                /* increment...                  */
       } elseif readEECN(packet) == CE(-1) {
           V_b -= b                /* ...or decrement V_b...        */
       }                           /*...depending on EECN field     */
   }
   ====================================================================

   At the end of an accounting period this counter V_b represents the
   congestion volume that penalties could be applied to, as described in
   Section 6.1.6.

   For instance, accumulated volume of congestion through a border
   interface over a month might be V_b = 5PB (petabyte = 10^15 byte).
   This might have resulted from an average downstream congestion level
   of 1% on an accumulated total data volume of B = 500PB.
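
   For readers who prefer something executable, the bulk metering
   pseudo-code above can be rendered in Python roughly as follows.
   Representing each re-ECN-capable packet as a (length, marking) pair
   and naming the markings with strings are assumptions made for
   illustration.

```python
def meter(packets):
    """Return (V_b, B) for an iterable of (length, eecn) pairs.

    Hypothetical representation: `eecn` is a string naming the extended
    ECN codepoint carried by a re-ECN-capable packet.
    """
    v_b = 0      # accumulated congestion volume
    b_total = 0  # total data volume
    for length, eecn in packets:
        b_total += length
        if eecn in ("Re-Echo", "FNE"):
            v_b += length   # positive packet
        elif eecn == "CE(-1)":
            v_b -= length   # negative packet
    return v_b, b_total


v_b, b_total = meter([
    (1500, "Re-Echo"),
    (1500, "CE(-1)"),
    (500, "FNE"),
    (1000, "other"),
])
```

   The ratio V_b/B then gives the average downstream congestion level,
   which is 5PB/500PB = 1% in the example above.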

H.2.  Inflation Factor for Persistently Negative Flows

   The following process is suggested to complement the simple algorithm
   above in order to protect against the various attacks from
   persistently negative flows described in Section 6.1.6.  As explained
   in that section, the most important and first step is to estimate the
   contribution of persistently negative flows to the bulk volume of
   downstream pre-congestion and to inflate this bulk volume as if these
   flows weren't there.  The process below has been designed to give an
   unbiased estimate, but it may be possible to define other processes
   that achieve similar ends.

   While the above simple metering algorithm is counting the bulk of
   traffic over an accounting period, the meter should also select a
   subset of the whole flow ID space that is small enough to be able to
   realistically measure but large enough to give a realistic sample.
   Many different samples of different subsets of the ID space should be
   taken at different times during the accounting period, preferably
   covering the whole ID space.  During each sample, the meter should
   count the volume of positive packets and subtract the volume of
   negative, maintaining a separate account for each flow in the sample.
   It should run a lot longer than the large majority of flows, to avoid
   a bias from missing the starts and ends of flows, which tend to be
   positive and negative respectively.

   Once the accounting period finishes, the meter should calculate the
   total of the accounts V_{bI} for the subset of flows I in the sample,
   and the total of the accounts V_{fI} excluding flows with a negative
   account from the subset I.  Then the weighted mean of all these
   samples should be taken a_S = sum_{forall I} V_{fI} / sum_{forall I}
   V_{bI}.

   If V_b is the result of the bulk accounting algorithm over the
   accounting period (Appendix H.1) it can be inflated by this factor
   a_S to get a good unbiased estimate of the volume of downstream
   congestion over the accounting period a_S.V_b, without being polluted
   by the effect of persistently negative flows.
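
   The calculation of a_S described above can be sketched in Python as
   follows.  Holding each sample as a dictionary from flow ID to that
   flow's account, and the flow IDs and volumes themselves, are
   assumptions made for illustration.

```python
def inflation_factor(samples):
    """a_S = sum_I V_fI / sum_I V_bI over all samples I.

    `samples` is a list of dicts mapping a flow ID to that flow's
    account (positive volume minus negative volume) in one sample.
    The sum of V_bI is assumed to be non-zero.
    """
    total_all = 0       # sum of V_bI: every flow in each sample
    total_filtered = 0  # sum of V_fI: negative accounts excluded
    for accounts in samples:
        total_all += sum(accounts.values())
        total_filtered += sum(v for v in accounts.values() if v >= 0)
    return total_filtered / total_all


# Two samples; flow "b" is persistently negative.
samples = [{"a": 300, "b": -100}, {"c": 200, "d": 0}]
a_s = inflation_factor(samples)

# Inflate a bulk measurement V_b by the factor.
inflated = a_s * 4000
```

   In this toy case a_S is 1.25, so a bulk measurement of 4000 would be
   inflated to 5000.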

Authors' Addresses

   Bob Briscoe
   BT & UCL
   B54/77, Adastral Park
   Martlesham Heath
   Ipswich  IP5 3RE
   UK

   Phone: +44 1473 645196
   Email: bob.briscoe@bt.com
   URI:   http://www.cs.ucl.ac.uk/staff/B.Briscoe/

   Arnaud Jacquet
   BT
   B54/70, Adastral Park
   Martlesham Heath
   Ipswich  IP5 3RE
   UK

   Phone: +44 1473 647284
   Email: arnaud.jacquet@bt.com
   URI:

   Alessandro Salvatori
   BT
   B54/77, Adastral Park
   Martlesham Heath
   Ipswich  IP5 3RE
   UK

   Email: sandr8@gmail.com

   Martin Koyabe
   BT
   B54/69, Adastral Park
   Martlesham Heath
   Ipswich  IP5 3RE
   UK

   Phone: +44 1473 646923
   Email: martin.koyabe@bt.com
   URI:

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

   Copyright (C) The Internet Society (2006).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.