draft-briscoe-re-pcn-border-cheat-00.txt   draft-briscoe-re-pcn-border-cheat-01.txt 
PCN Working Group B. Briscoe PCN Working Group B. Briscoe
Internet-Draft BT & UCL Internet-Draft BT & UCL
Intended status: Informational June 30, 2007 Intended status: Informational February 25, 2008
Expires: January 1, 2008 Expires: August 28, 2008
Emulating Border Flow Policing using Re-ECN on Bulk Data Emulating Border Flow Policing using Re-ECN on Bulk Data
draft-briscoe-re-pcn-border-cheat-00 draft-briscoe-re-pcn-border-cheat-01
Status of this Memo Status of this Memo
By submitting this Internet-Draft, each author represents that any By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79. aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
skipping to change at page 1, line 34 skipping to change at page 1, line 34
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt. http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
This Internet-Draft will expire on January 1, 2008. This Internet-Draft will expire on August 28, 2008.
Copyright Notice Copyright Notice
Copyright (C) The IETF Trust (2007). Copyright (C) The IETF Trust (2008).
Abstract Abstract
Scaling per flow admission control to the Internet is a hard problem. Scaling per flow admission control to the Internet is a hard problem.
A recently proposed approach combines Diffserv and pre-congestion A recently proposed approach combines Diffserv and pre-congestion
notification (PCN) to provide a service slightly better than Intserv notification (PCN) to provide a service slightly better than Intserv
controlled load. It scales to networks of any size, but only if controlled load. It scales to networks of any size, but only if
domains trust each other to comply with admission control and rate domains trust each other to comply with admission control and rate
policing. This memo claims to solve this trust problem without policing. This memo claims to solve this trust problem without
losing scalability. It describes bulk border policing that provides losing scalability. It describes bulk border policing that provides
a sufficient emulation of per-flow policing with the help of another a sufficient emulation of per-flow policing with the help of another
recently proposed extension to ECN, involving re-echoing ECN feedback recently proposed extension to ECN, involving re-echoing ECN feedback
(re-ECN). With only passive bulk measurements at borders, sanctions (re-ECN). With only passive bulk measurements at borders, sanctions
can be applied against cheating networks. can be applied against cheating networks.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 7
2. Requirements Notation . . . . . . . . . . . . . . . . . . . . 9
3. The Problem . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.1. The Traditional Per-flow Policing Problem . . . . . . . . 10
3.2. Generic Scenario . . . . . . . . . . . . . . . . . . . . . 12
4. Re-ECN Protocol for an RSVP (or similar) Transport . . . . . . 14
4.1. Protocol Overview . . . . . . . . . . . . . . . . . . . . 14
4.2. Re-ECN Abstracted Network Layer Wire Protocol (IPv4 or
v6) . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
4.2.1. Re-ECN Recap . . . . . . . . . . . . . . . . . . . . . 16
4.2.2. Re-ECN Combined with Pre-Congestion Notification
(re-PCN) . . . . . . . . . . . . . . . . . . . . . . . 18
4.3. Protocol Operation . . . . . . . . . . . . . . . . . . . . 20
4.3.1. Protocol Operation for an Established Flow . . . . . . 20
4.3.2. Aggregate Bootstrap . . . . . . . . . . . . . . . . . 21
4.3.3. Flow Bootstrap . . . . . . . . . . . . . . . . . . . . 22
4.3.4. Router Forwarding Behaviour . . . . . . . . . . . . . 23
4.3.5. Extensions . . . . . . . . . . . . . . . . . . . . . . 25
5. Emulating Border Policing with Re-ECN . . . . . . . . . . . . 25
5.1. Informal Terminology . . . . . . . . . . . . . . . . . . . 25
5.2. Policing Overview . . . . . . . . . . . . . . . . . . . . 26
5.3. Pre-requisite Contractual Arrangements . . . . . . . . . . 28
5.4. Emulation of Per-Flow Rate Policing: Rationale and
Limits . . . . . . . . . . . . . . . . . . . . . . . . . . 31
5.5. Sanctioning Dishonest Marking . . . . . . . . . . . . . . 32
5.6. Border Mechanisms . . . . . . . . . . . . . . . . . . . . 34
5.6.1. Border Accounting Mechanisms . . . . . . . . . . . . . 34
5.6.2. Competitive Routing . . . . . . . . . . . . . . . . . 38
5.6.3. Fail-safes . . . . . . . . . . . . . . . . . . . . . . 39
6. Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
7. Incremental Deployment . . . . . . . . . . . . . . . . . . . . 42
8. Design Choices and Rationale . . . . . . . . . . . . . . . . . 43
9. Security Considerations . . . . . . . . . . . . . . . . . . . 45
10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 46
11. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 46
12. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 47
13. Comments Solicited . . . . . . . . . . . . . . . . . . . . . . 47
14. References . . . . . . . . . . . . . . . . . . . . . . . . . . 48
14.1. Normative References . . . . . . . . . . . . . . . . . . . 48
14.2. Informative References . . . . . . . . . . . . . . . . . . 48
Appendix A. Implementation . . . . . . . . . . . . . . . . . . . 50
A.1. Ingress Gateway Algorithm for Blanking the RE flag . . . . 50
A.2. Downstream Congestion Metering Algorithms . . . . . . . . 51
A.2.1. Bulk Downstream Congestion Metering Algorithm . . . . 51
A.2.2. Inflation Factor for Persistently Negative Flows . . . 52
A.3. Algorithm for Sanctioning Negative Traffic . . . . . . . . 52
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 53
Intellectual Property and Copyright Statements . . . . . . . . . . 54
Status (to be removed by the RFC Editor) Status (to be removed by the RFC Editor)
This memo is posted as an Internet-Draft with the intent to This memo is posted as an Internet-Draft with the intent to
eventually be broken down in two documents; one for the standards eventually be broken down in two documents; one for the standards
track and one for informational status. But until it becomes an item track and one for informational status. But until it becomes an item
of IETF working group business the whole proposal has been kept of IETF working group business the whole proposal has been kept
together to aid understanding. Only the text of Section 4 of this together to aid understanding. Only the text of Section 4 of this
document requires standardisation. The rest of the sections describe document requires standardisation. The rest of the sections describe
how a system might be built from these protocols by the operators of how a system might be built from these protocols by the operators of
an internetwork. Note in particular that the policing and monitoring an internetwork. Note in particular that the policing and monitoring
functions proposed for the trust boundaries between operators would functions proposed for the trust boundaries between operators would
not need standardisation by the IETF. They simply represent one way not need standardisation by the IETF. They simply represent one way
that the proposed protocols could be used to extend the PCN that the proposed protocols could be used to extend the PCN
architecture [PCN-arch] to span multiple domains without mutual trust architecture [I-D.ietf-pcn-architecture] to span multiple domains
between the operators. without mutual trust between the operators.
To realise the system described, this document also depends on To realise the system described, this document also depends on
standardisation of three other documents currently being discussed standardisation of three other documents currently being discussed
(but not on the standards track) in the IETF Transport Area: pre- (but not on the standards track) in the IETF Transport Area: pre-
congestion notification (PCN) marking on interior nodes [PCN]; congestion notification (PCN) marking on interior nodes [PCN];
feedback of aggregate PCN measurements by suitably extending the feedback of aggregate PCN measurements by suitably extending the
admission control signalling protocol (e.g. RSVP) [RSVP-ECN]; and admission control signalling protocol (e.g. RSVP) [RSVP-ECN]; and
re-insertion of the feedback into the forward stream of IP packets by re-insertion of the feedback into the forward stream of IP packets by
the PCN ingress gateway in a similar way to that proposed for a TCP the PCN ingress gateway in a similar way to that proposed for a TCP
source [Re-TCP]. source [Re-TCP].
The authors seek comments from the Internet community on whether The authors seek comments from the Internet community on whether
combining PCN and re-ECN in this way is a sufficient solution to the combining PCN and re-ECN in this way is a sufficient solution to the
problem of scaling microflow admission control to the Internet as a problem of scaling microflow admission control to the Internet as a
whole, even though such scaling must take account of the increasing whole, even though such scaling must take account of the increasing
numbers of networks and users who may all have conflicting interests. numbers of networks and users who may all have conflicting interests.
Changes from previous drafts (to be removed by the RFC Editor) Changes from previous drafts (to be removed by the RFC Editor)
Changes in this version <draft-briscoe-re-pcn-border-cheat-00> Full diffs of incremental changes between drafts are available at
relative to the last <draft-briscoe-tsvwg-re-ecn-border-cheat-01>: URL: <http://www.cs.ucl.ac.uk/staff/B.Briscoe/pubs.html#repcn>
Changes from <draft-briscoe-re-pcn-border-cheat-00> to
<draft-briscoe-re-pcn-border-cheat-01> (current version):
Updated references.
Changes from <draft-briscoe-tsvwg-re-ecn-border-cheat-01>
to <draft-briscoe-re-pcn-border-cheat-00>:
Changed filename to associate it with the new IETF PCN w-g, rather Changed filename to associate it with the new IETF PCN w-g, rather
than the TSVWG w-g. than the TSVWG w-g.
Introduction: Clarified that bulk policing only replaces per-flow Introduction: Clarified that bulk policing only replaces per-flow
policing at interior inter-domain borders, while per-flow policing policing at interior inter-domain borders, while per-flow policing
is still needed at the access interface to the internetwork. Also is still needed at the access interface to the internetwork. Also
clarified that the aim is to neutralise any gains from cheating clarified that the aim is to neutralise any gains from cheating
using local bilateral contracts between neighbouring networks, using local bilateral contracts between neighbouring networks,
rather than merely identifying remote cheaters. rather than merely identifying remote cheaters.
skipping to change at page 3, line 28 skipping to change at page 6, line 36
Clarified that "Designing in security from the start" merely means Clarified that "Designing in security from the start" merely means
allowing codepoint space in the PCN protocol encoding. There is allowing codepoint space in the PCN protocol encoding. There is
no need to actually implement inter-domain security mechanisms for no need to actually implement inter-domain security mechanisms for
solutions confined to a single domain. solutions confined to a single domain.
Updated some references and added a ref to the Security Updated some references and added a ref to the Security
Considerations, as well as other minor corrections and Considerations, as well as other minor corrections and
improvements. improvements.
Changes from <draft-briscoe-tsvwg-re-ecn-border-cheat-00 to Changes from <draft-briscoe-tsvwg-re-ecn-border-cheat-00> to
<draft-briscoe-tsvwg-re-ecn-border-cheat-01>: <draft-briscoe-tsvwg-re-ecn-border-cheat-01>:
Added subsection on Border Accounting Mechanisms (Section 5.6.1) Added subsection on Border Accounting Mechanisms (Section 5.6.1)
Section 4.2 on the re-ECN wire protocol clarified and re-organised Section 4.2 on the re-ECN wire protocol clarified and re-organised
to separately discuss re-ECN for default ECN marking and for pre- to separately discuss re-ECN for default ECN marking and for pre-
congestion marking (PCN). congestion marking (PCN).
Router Forwarding Behaviour subsection added to re-organised Router Forwarding Behaviour subsection added to re-organised
section on Protocol Operation (Section 4.3). Extensions section section on Protocol Operation (Section 4.3). Extensions section
skipping to change at page 5, line 5 skipping to change at page 7, line 17
Added section on Incremental Deployment (Section 7), drawing Added section on Incremental Deployment (Section 7), drawing
together relevant points about deployment made throughout. together relevant points about deployment made throughout.
Sections on Design Rationale (Section 8) and Security Sections on Design Rationale (Section 8) and Security
Considerations (Section 9) expanded with some new material, Considerations (Section 9) expanded with some new material,
including new attacks and their defences. including new attacks and their defences.
Suggested Border Metering Algorithms improved (Appendix A.2) for Suggested Border Metering Algorithms improved (Appendix A.2) for
resilience to newly identified attacks. resilience to newly identified attacks.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 7
2. Requirements Notation . . . . . . . . . . . . . . . . . . . . 9
3. The Problem . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.1. The Traditional Per-flow Policing Problem . . . . . . . . 9
3.2. Generic Scenario . . . . . . . . . . . . . . . . . . . . . 11
4. Re-ECN Protocol for an RSVP (or similar) Transport . . . . . . 14
4.1. Protocol Overview . . . . . . . . . . . . . . . . . . . . 14
4.2. Re-ECN Abstracted Network Layer Wire Protocol (IPv4 or
v6) . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
4.2.1. Re-ECN Recap . . . . . . . . . . . . . . . . . . . . . 16
4.2.2. Re-ECN Combined with Pre-Congestion Notification
(re-PCN) . . . . . . . . . . . . . . . . . . . . . . . 17
4.3. Protocol Operation . . . . . . . . . . . . . . . . . . . . 19
4.3.1. Protocol Operation for an Established Flow . . . . . . 19
4.3.2. Aggregate Bootstrap . . . . . . . . . . . . . . . . . 21
4.3.3. Flow Bootstrap . . . . . . . . . . . . . . . . . . . . 22
4.3.4. Router Forwarding Behaviour . . . . . . . . . . . . . 23
4.3.5. Extensions . . . . . . . . . . . . . . . . . . . . . . 24
5. Emulating Border Policing with Re-ECN . . . . . . . . . . . . 24
5.1. Informal Terminology . . . . . . . . . . . . . . . . . . . 25
5.2. Policing Overview . . . . . . . . . . . . . . . . . . . . 26
5.3. Pre-requisite Contractual Arrangements . . . . . . . . . . 28
5.4. Emulation of Per-Flow Rate Policing: Rationale and
Limits . . . . . . . . . . . . . . . . . . . . . . . . . . 31
5.5. Sanctioning Dishonest Marking . . . . . . . . . . . . . . 32
5.6. Border Mechanisms . . . . . . . . . . . . . . . . . . . . 34
5.6.1. Border Accounting Mechanisms . . . . . . . . . . . . . 34
5.6.2. Competitive Routing . . . . . . . . . . . . . . . . . 38
5.6.3. Fail-safes . . . . . . . . . . . . . . . . . . . . . . 39
6. Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
7. Incremental Deployment . . . . . . . . . . . . . . . . . . . . 42
8. Design Choices and Rationale . . . . . . . . . . . . . . . . . 43
9. Security Considerations . . . . . . . . . . . . . . . . . . . 45
10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 46
11. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 46
12. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 47
13. Comments Solicited . . . . . . . . . . . . . . . . . . . . . . 47
14. References . . . . . . . . . . . . . . . . . . . . . . . . . . 48
14.1. Normative References . . . . . . . . . . . . . . . . . . . 48
14.2. Informative References . . . . . . . . . . . . . . . . . . 48
Appendix A. Implementation . . . . . . . . . . . . . . . . . . . 50
A.1. Ingress Gateway Algorithm for Blanking the RE flag . . . . 50
A.2. Downstream Congestion Metering Algorithms . . . . . . . . 51
A.2.1. Bulk Downstream Congestion Metering Algorithm . . . . 51
A.2.2. Inflation Factor for Persistently Negative Flows . . . 52
A.3. Algorithm for Sanctioning Negative Traffic . . . . . . . . 52
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 53
Intellectual Property and Copyright Statements . . . . . . . . . . 54
1. Introduction 1. Introduction
The Internet community largely lost interest in the Intserv The Internet community largely lost interest in the Intserv
architecture after it was clarified that it would be unlikely to architecture after it was clarified that it would be unlikely to
scale to the whole Internet [RFC2208]. Although Intserv mechanisms scale to the whole Internet [RFC2208]. Although Intserv mechanisms
proved impractical, the bandwidth reservation service it aimed to proved impractical, the bandwidth reservation service it aimed to
offer is still very much required. offer is still very much required.
A recently proposed approach [PCN-arch] combines Diffserv and pre- A recently proposed approach [I-D.ietf-pcn-architecture] combines
congestion notification (PCN) to provide a service slightly better Diffserv and pre-congestion notification (PCN) to provide a service
than Intserv controlled load [RFC2211]. It scales to any size slightly better than Intserv controlled load [RFC2211]. It scales to
network, but only if domains trust their neighbours to have checked any size network, but only if domains trust their neighbours to have
that upstream customers aren't taking more bandwidth than they checked that upstream customers aren't taking more bandwidth than
reserved, either accidentally or deliberately. This memo describes they reserved, either accidentally or deliberately. This memo
border policing measures so that one network can protect its describes border policing measures so that one network can protect
interests, even if networks around it are deliberately trying to its interests, even if networks around it are deliberately trying to
cheat. The approach provides a sufficient emulation of flow rate cheat. The approach provides a sufficient emulation of flow rate
policing at trust boundaries but without per-flow processing. The policing at trust boundaries but without per-flow processing. The
emulation is not perfect, but it is sufficient to ensure that the emulation is not perfect, but it is sufficient to ensure that the
punishment is at least proportionate to the severity of the cheat. punishment is at least proportionate to the severity of the cheat.
Per-flow rate policing for each reservation is still expected to be Per-flow rate policing for each reservation is still expected to be
used at the access edge of the internetwork, but at the borders used at the access edge of the internetwork, but at the borders
between networks bulk policing can be used to emulate per-flow between networks bulk policing can be used to emulate per-flow
policing. policing.
The aim is to be able to scale controlled load service to any number The aim is to be able to scale controlled load service to any number
of endpoints, even though such scaling must take account of the of endpoints, even though such scaling must take account of the
increasing numbers of networks and users who may all have conflicting increasing numbers of networks and users who may all have conflicting
interests. To achieve such scaling, this memo combines two recent interests. To achieve such scaling, this memo combines two recent
proposals, both of which it briefly recaps: proposals, both of which it briefly recaps:
o A deployment model for admission control over Diffserv using pre- o A deployment model for admission control over Diffserv using pre-
congestion notification [PCN-arch] describes how bulk pre- congestion notification [I-D.ietf-pcn-architecture] describes how
congestion notification on routers within an edge-to-edge Diffserv bulk pre-congestion notification on routers within an edge-to-edge
region can emulate the precision of per-flow admission control to Diffserv region can emulate the precision of per-flow admission
provide controlled load service without unscalable per-flow control to provide controlled load service without unscalable per-
processing; flow processing;
o Re-ECN: Adding Accountability to TCP/IP [Re-TCP]. The trick that o Re-ECN: Adding Accountability to TCP/IP [Re-TCP]. The trick that
addresses cheating at borders is to recognise that border policing addresses cheating at borders is to recognise that border policing
is mainly necessary because cheating upstream networks will admit is mainly necessary because cheating upstream networks will admit
traffic when they shouldn't only as long as they don't directly traffic when they shouldn't only as long as they don't directly
experience the downstream congestion their misbehaviour can cause. experience the downstream congestion their misbehaviour can cause.
The re-ECN protocol requires upstream nodes to declare expected The re-ECN protocol requires upstream nodes to declare expected
downstream congestion in all forwarded packets and it makes it in downstream congestion in all forwarded packets and it makes it in
their interests to declare it honestly. Operators can then their interests to declare it honestly. Operators can then
monitor downstream congestion in bulk at borders to emulate monitor downstream congestion in bulk at borders to emulate
skipping to change at page 8, line 24 skipping to change at page 8, line 38
on the local bilateral contractual relationships that already exist on the local bilateral contractual relationships that already exist
between neighbouring networks. between neighbouring networks.
Rather than the end-to-end arrangement used when re-ECN was specified Rather than the end-to-end arrangement used when re-ECN was specified
for the TCP transport [Re-TCP], this memo specifies re-ECN in an for the TCP transport [Re-TCP], this memo specifies re-ECN in an
edge-to-edge arrangement, making it applicable to the above edge-to-edge arrangement, making it applicable to the above
deployment model for admission control over Diffserv. Also, rather deployment model for admission control over Diffserv. Also, rather
than using a TCP transport for regular congestion feedback, this memo than using a TCP transport for regular congestion feedback, this memo
specifies re-ECN using RSVP as the transport for feedback [RSVP-ECN]. specifies re-ECN using RSVP as the transport for feedback [RSVP-ECN].
A similar deployment model, but with a different transport for A similar deployment model, but with a different transport for
signalling congestion feedback could be used (e.g. RMD [NSIS-RMD] signalling congestion feedback could be used (e.g. Arumaithurai
uses NSIS). [I-D.arumaithurai-nsis-pcn] and RMD [I-D.ietf-nsis-rmd] use NSIS).
This memo aims to do two things: i) define how to apply the re-ECN This memo aims to do two things: i) define how to apply the re-ECN
protocol to the admission control over Diffserv scenario; and ii) protocol to the admission control over Diffserv scenario; and ii)
explain why re-ECN sufficiently emulates border policing in that explain why re-ECN sufficiently emulates border policing in that
scenario. Most of the memo is taken up with the second aim; scenario. Most of the memo is taken up with the second aim;
explaining why it works. Applying re-ECN to the scenario actually explaining why it works. Applying re-ECN to the scenario actually
involves quite a trivial modification to the ingress gateway. That involves quite a trivial modification to the ingress gateway. That
modification can be added to gateways later, so our immediate goal is modification can be added to gateways later, so our immediate goal is
to convince everyone to have the foresight to define the PCN wire to convince everyone to have the foresight to define the PCN wire
protocol encoding to accommodate the extended codepoints defined in protocol encoding to accommodate the extended codepoints defined in
skipping to change at page 12, line 45 skipping to change at page 13, line 9
Within the Diffserv region are three interior domains, A, B and C, as Within the Diffserv region are three interior domains, A, B and C, as
well as the inward facing interfaces of the ingress and egress well as the inward facing interfaces of the ingress and egress
gateways. An ingress and egress border router (BR) is shown gateways. An ingress and egress border router (BR) is shown
interconnecting each interior domain with the next. There may be interconnecting each interior domain with the next. There may be
other interior routers (not shown) within each interior domain. other interior routers (not shown) within each interior domain.
In two paragraphs we now briefly recap how pre-congestion In two paragraphs we now briefly recap how pre-congestion
notification is intended to be used to control flow admission to a notification is intended to be used to control flow admission to a
large Diffserv region. The first paragraph describes data plane large Diffserv region. The first paragraph describes data plane
functions and the second describes signalling in the control plane. functions and the second describes signalling in the control plane.
We omit many details from [PCN-arch] including behaviour during We omit many details from [I-D.ietf-pcn-architecture] including
routing changes. For brevity here we assume other flows are already behaviour during routing changes. For brevity here we assume other
in progress across a path through the Diffserv region before a new flows are already in progress across a path through the Diffserv
one arrives, but how bootstrap works is described in Section 4.3.2. region before a new one arrives, but how bootstrap works is described
in Section 4.3.2.
Figure 1 shows a single simplex reserved flow from the sending (Sx) Figure 1 shows a single simplex reserved flow from the sending (Sx)
end host to the receiving (Rx) end host. The ingress gateway polices end host to the receiving (Rx) end host. The ingress gateway polices
incoming traffic within its admitted reservation and remarks it to incoming traffic within its admitted reservation and remarks it to
turn on an ECN-capable codepoint [RFC3168] and the controlled load turn on an ECN-capable codepoint [RFC3168] and the controlled load
(CL) Diffserv codepoint. Together, these codepoints define which (CL) Diffserv codepoint. Together, these codepoints define which
traffic is entitled to the enhanced scheduling of the CL behaviour traffic is entitled to the enhanced scheduling of the CL behaviour
aggregate on routers within the Diffserv region. The CL PHB of aggregate on routers within the Diffserv region. The CL PHB of
interior routers consists of a scheduling behaviour and a new ECN interior routers consists of a scheduling behaviour and a new ECN
marking behaviour that we call `pre-congestion notification' [PCN]. marking behaviour that we call `pre-congestion notification' [PCN].
skipping to change at page 16, line 30 skipping to change at page 17, line 9
re-ECN. re-ECN.
Note that general-purpose routers do not have to read the RE flag, Note that general-purpose routers do not have to read the RE flag,
only special policing elements at borders do. And no general-purpose only special policing elements at borders do. And no general-purpose
routers have to change the RE flag, although the ingress and egress routers have to change the RE flag, although the ingress and egress
gateways do because in the edge-to-edge deployment model we are gateways do because in the edge-to-edge deployment model we are
using, they act as proxies for the endpoints. Therefore the RE flag using, they act as proxies for the endpoints. Therefore the RE flag
does not even have to be visible to interior routers. So the RE flag does not even have to be visible to interior routers. So the RE flag
has no implications on protocols like MPLS. Congested label has no implications on protocols like MPLS. Congested label
switching routers (LSRs) would have to be able to notify their switching routers (LSRs) would have to be able to notify their
congestion with an ECN/PCN codepoint in the MPLS shim [ECN-MPLS], but congestion with an ECN/PCN codepoint in the MPLS shim [RFC5129], but
like any interior IP router, they can be oblivious to the RE flag, like any interior IP router, they can be oblivious to the RE flag,
which need only be read by border policing functions. which need only be read by border policing functions.
Although the RE flag is a separate, single bit field, it can be read Although the RE flag is a separate, single bit field, it can be read
as an extension to the two-bit ECN field; the three concatenated bits as an extension to the two-bit ECN field; the three concatenated bits
in what we will call the extended ECN field (EECN) make eight in what we will call the extended ECN field (EECN) make eight
codepoints available. When the RE flag setting is "don't care", we codepoints available. When the RE flag setting is "don't care", we
use the RFC3168 names of the ECN codepoints, but [Re-TCP] proposes use the RFC3168 names of the ECN codepoints, but [Re-TCP] proposes
the following six codepoint names for when there is a need to be more the following six codepoint names for when there is a need to be more
specific. specific.
skipping to change at page 19, line 40 skipping to change at page 20, line 12
Table 2: Extended ECN Codepoints if the Diffserv codepoint uses Pre- Table 2: Extended ECN Codepoints if the Diffserv codepoint uses Pre-
congestion Notification (PCN) congestion Notification (PCN)
4.3. Protocol Operation 4.3. Protocol Operation
4.3.1. Protocol Operation for an Established Flow 4.3.1. Protocol Operation for an Established Flow
The re-ECN protocol involves a simple tweak to the action of the The re-ECN protocol involves a simple tweak to the action of the
gateway at the ingress edge of the CL region. In the deployment gateway at the ingress edge of the CL region. In the deployment
model just described [PCN-arch], for each active traffic aggregate model just described [I-D.ietf-pcn-architecture], for each active
across the CL region (CL-region-aggregate) the ingress gateway will traffic aggregate across the CL region (CL-region-aggregate) the
hold a fairly recent Congestion-Level-Estimate that the egress ingress gateway will hold a fairly recent Congestion-Level-Estimate
gateway will have fed back to it, piggybacked on the signalling that that the egress gateway will have fed back to it, piggybacked on the
sets up each flow. For instance, one aggregate might have been signalling that sets up each flow. For instance, one aggregate might
experiencing 3% pre-congestion (that is, congestion marked octets have been experiencing 3% pre-congestion (that is, congestion marked
whether Admission Marked or Pre-emption Marked). In this case, the octets whether Admission Marked or Pre-emption Marked). In this
ingress gateway MUST clear the RE flag to "0" for the same percentage case, the ingress gateway MUST clear the RE flag to "0" for the same
of octets of CL-packets (3%) and set it to "1" in the rest (97%). percentage of octets of CL-packets (3%) and set it to "1" in the rest
Appendix A.1 gives a simple pseudo-code algorithm that the ingress (97%). Appendix A.1 gives a simple pseudo-code algorithm that the
gateway may use to do this. ingress gateway may use to do this.
The RE flag is set and cleared this way round for incremental The RE flag is set and cleared this way round for incremental
deployment reasons (see [Re-TCP]). To avoid confusion we will use deployment reasons (see [Re-TCP]). To avoid confusion we will use
the term `blanking' (rather than marking) when the RE flag is cleared the term `blanking' (rather than marking) when the RE flag is cleared
to "0", so we will talk of the `RE blanking fraction' as the fraction to "0", so we will talk of the `RE blanking fraction' as the fraction
of octets with the RE flag cleared to "0". of octets with the RE flag cleared to "0".
^ ^
| |
| RE blanking fraction | RE blanking fraction
skipping to change at page 21, line 30 skipping to change at page 21, line 51
4.3.2. Aggregate Bootstrap 4.3.2. Aggregate Bootstrap
When a new reservation PATH message arrives at the egress, if there When a new reservation PATH message arrives at the egress, if there
are currently no flows in progress from the same ingress, there will are currently no flows in progress from the same ingress, there will
be no state maintaining the current level of pre-congestion marking be no state maintaining the current level of pre-congestion marking
for the aggregate. While the reservation signalling continues onward for the aggregate. While the reservation signalling continues onward
towards the receiving host, the egress gateway returns an RSVP towards the receiving host, the egress gateway returns an RSVP
message to the ingress with a flag [RSVP-ECN] asking the ingress to message to the ingress with a flag [RSVP-ECN] asking the ingress to
send a specified number of data probes between them. This bootstrap send a specified number of data probes between them. This bootstrap
behaviour is all described in the deployment model [PCN-arch]. behaviour is all described in the deployment
model [I-D.ietf-pcn-architecture].
However, with our new re-ECN scheme, the ingress does not know what However, with our new re-ECN scheme, the ingress does not know what
proportion of the data probes should have the RE flag blanked, proportion of the data probes should have the RE flag blanked,
because it has no estimate yet of pre-congestion for the path across because it has no estimate yet of pre-congestion for the path across
the Diffserv region. the Diffserv region.
To be conservative, following the guidance for specifying other re- To be conservative, following the guidance for specifying other re-
ECN transports in [Re-TCP], the ingress SHOULD set the FNE codepoint ECN transports in [Re-TCP], the ingress SHOULD set the FNE codepoint
of the extended ECN header in all probe packets (Table 2). As per of the extended ECN header in all probe packets (Table 2). As per
the deployment model, the egress gateway measures the fraction of the deployment model, the egress gateway measures the fraction of
skipping to change at page 24, line 30 skipping to change at page 25, line 6
understand drop, not congestion marking. But a PCN-capable router understand drop, not congestion marking. But a PCN-capable router
can mark rather than drop an FNE packet, even though its ECN field can mark rather than drop an FNE packet, even though its ECN field
when looked at in isolation is '00' which appears to be a legacy when looked at in isolation is '00' which appears to be a legacy
Not-ECT packet. Therefore, if a packet's RE flag is '1', even if Not-ECT packet. Therefore, if a packet's RE flag is '1', even if
its ECN field is '00', a PCN-enabled router SHOULD use congestion its ECN field is '00', a PCN-enabled router SHOULD use congestion
marking. This allows the `feedback not established' (FNE) marking. This allows the `feedback not established' (FNE)
codepoint to be used for probe packets, in order to pick up PCN codepoint to be used for probe packets, in order to pick up PCN
marking when bootstrapping an aggregate. marking when bootstrapping an aggregate.
ECN marking rather than dropping of FNE packets MUST only be ECN marking rather than dropping of FNE packets MUST only be
deployed in controlled environments, such as that in [PCN-arch], deployed in controlled environments, such as that in
where the presence of an egress node that understands ECN marking [I-D.ietf-pcn-architecture], where the presence of an egress node
is assured. Congestion events might otherwise be ignored if the that understands ECN marking is assured. Congestion events might
receiver only understands drop, rather than ECN marking. This is otherwise be ignored if the receiver only understands drop, rather
because there is no guarantee that ECN capability has been than ECN marking. This is because there is no guarantee that ECN
negotiated if feedback is not established (FNE). Also, [Re-TCP] capability has been negotiated if feedback is not established
places the strong condition that a router MUST apply drop rather (FNE). Also, [Re-TCP] places the strong condition that a router
than marking to FNE packets unless it can guarantee that FNE MUST apply drop rather than marking to FNE packets unless it can
packets are rate limited either locally or upstream. guarantee that FNE packets are rate limited either locally or
upstream.
4.3.5. Extensions 4.3.5. Extensions
If a different signalling system, such as NSIS, were used, but it If a different signalling system, such as NSIS, were used, but it
provided admission control in a similar way, using pre-congestion provided admission control in a similar way, using pre-congestion
notification (e.g. with RMD [NSIS-RMD]) we believe re-ECN could be notification (e.g. Arumaithurai [I-D.arumaithurai-nsis-pcn] or
used to protect against misbehaving networks in the same way as RMD [I-D.ietf-nsis-rmd]) we believe re-ECN could be used to protect
proposed above. against misbehaving networks in the same way as proposed above.
5. Emulating Border Policing with Re-ECN 5. Emulating Border Policing with Re-ECN
Note that the re-ECN protocol described in Section 4 above would Note that the re-ECN protocol described in Section 4 above would
require standardisation, whereas operators acting in their own require standardisation, whereas operators acting in their own
interests would be expected to deploy policing and monitoring interests would be expected to deploy policing and monitoring
functions similar to those proposed in the sections below without any functions similar to those proposed in the sections below without any
further need for standardisation by the IETF. Flexibility is further need for standardisation by the IETF. Flexibility is
expected in exactly how policing and monitoring is done. expected in exactly how policing and monitoring is done.
skipping to change at page 46, line 43 skipping to change at page 46, line 43
10. IANA Considerations 10. IANA Considerations
This memo includes no request to IANA. This memo includes no request to IANA.
11. Conclusions 11. Conclusions
This memo builds on a promising technique to solve the classic This memo builds on a promising technique to solve the classic
problem of making flow admission control scale to any size network. problem of making flow admission control scale to any size network.
It involves the use of Diffserv in a deployment model that uses pre- It involves the use of Diffserv in a deployment model that uses pre-
congestion notification feedback to control admission into a network congestion notification feedback to control admission into a network
path [PCN-arch]. However as it stands, that deployment model depends path [I-D.ietf-pcn-architecture]. However as it stands, that
on all network domains trusting each other to comply with the deployment model depends on all network domains trusting each other
protocols, invoking admission control and flow pre-emption when to comply with the protocols, invoking admission control and flow
requested. pre-emption when requested.
We propose that the congestion feedback used in that deployment model We propose that the congestion feedback used in that deployment model
should be re-echoed into the forward data path, by making a trivial should be re-echoed into the forward data path, by making a trivial
modification to the ingress gateway. We then explain how the modification to the ingress gateway. We then explain how the
resulting downstream pre-congestion metric in packets can be resulting downstream pre-congestion metric in packets can be
monitored in bulk at borders to sufficiently emulate flow rate monitored in bulk at borders to sufficiently emulate flow rate
policing. policing.
We claim the result of combining these two approaches is an admission We claim the result of combining these two approaches is an admission
control system that scales to any size network _and_ any number of control system that scales to any size network _and_ any number of
skipping to change at page 47, line 42 skipping to change at page 47, line 42
(UCL), Francois Le Faucheur, Anna Charny (Cisco), Jozef Babiarz, (UCL), Francois Le Faucheur, Anna Charny (Cisco), Jozef Babiarz,
Kwok-Ho Chan, Corey Alexander (Nortel), David Clark, Bill Lehr, Kwok-Ho Chan, Corey Alexander (Nortel), David Clark, Bill Lehr,
Sharon Gillett, Steve Bauer (MIT) (who publicised various dummy Sharon Gillett, Steve Bauer (MIT) (who publicised various dummy
traffic attacks), Sally Floyd (ICIR) and comments from participants traffic attacks), Sally Floyd (ICIR) and comments from participants
in the CFP/CRN Inter-Provider QoS, Broadband and DoS-Resistant in the CFP/CRN Inter-Provider QoS, Broadband and DoS-Resistant
Internet working groups. Internet working groups.
13. Comments Solicited 13. Comments Solicited
Comments and questions are encouraged and very welcome. They can be Comments and questions are encouraged and very welcome. They can be
addressed to the IETF Transport Area working group's mailing list addressed to the IETF Congestion and Pre-Congestion Notification
<tsvwg@ietf.org>, and/or to the authors. working group's mailing list <pcn@ietf.org>, and/or to the author(s).
14. References 14. References
14.1. Normative References 14.1. Normative References
[PCN] Briscoe, B., Eardley, P., Songhurst, D., Le Faucheur, F., [PCN] Briscoe, B., Eardley, P., Songhurst, D., Le Faucheur, F.,
Charny, A., Liatsos, V., Babiarz, J., Chan, K., Dudley, Charny, A., Liatsos, V., Babiarz, J., Chan, K., Dudley,
S., Westberg, L., Bader, A., and G. Karagiannis, "Pre- S., Westberg, L., Bader, A., and G. Karagiannis, "Pre-
Congestion Notification Marking", Congestion Notification Marking",
draft-briscoe-tsvwg-cl-phb-03 (work in progress), draft-briscoe-tsvwg-cl-phb-03 (work in progress),
skipping to change at page 48, line 36 skipping to change at page 48, line 36
Stiliadis, "An Expedited Forwarding PHB (Per-Hop Stiliadis, "An Expedited Forwarding PHB (Per-Hop
Behavior)", RFC 3246, March 2002. Behavior)", RFC 3246, March 2002.
[RSVP-ECN] [RSVP-ECN]
Le Faucheur, F., Charny, A., Briscoe, B., Eardley, P., Le Faucheur, F., Charny, A., Briscoe, B., Eardley, P.,
Babiarz, J., and K. Chan, "RSVP Extensions for Admission Babiarz, J., and K. Chan, "RSVP Extensions for Admission
Control over Diffserv using Pre-congestion Notification", Control over Diffserv using Pre-congestion Notification",
draft-lefaucheur-rsvp-ecn-01 (work in progress), draft-lefaucheur-rsvp-ecn-01 (work in progress),
June 2006. June 2006.
[Re-TCP] Briscoe, B., Jacquet, A., Salvatori, A., and M. Koyabi, [Re-TCP] Briscoe, B., Jacquet, A., Moncaster, T., and A. Smith,
"Re-ECN: Adding Accountability for Causing Congestion to "Re-ECN: Adding Accountability for Causing Congestion to
TCP/IP", draft-briscoe-tsvwg-re-ecn-tcp-04 (work in TCP/IP", draft-briscoe-tsvwg-re-ecn-tcp-05 (work in
progress), June 2007. progress), January 2008.
14.2. Informative References 14.2. Informative References
[CLoop_pol] [CLoop_pol]
Salvatori, A., "Closed Loop Traffic Policing", Politecnico Salvatori, A., "Closed Loop Traffic Policing", Politecnico
Torino and Institut Eurecom Masters Thesis , Torino and Institut Eurecom Masters Thesis ,
September 2005. September 2005.
[ECN-BGP] Mortier, R. and I. Pratt, "Incentive Based Inter-Domain [ECN-BGP] Mortier, R. and I. Pratt, "Incentive Based Inter-Domain
Routeing", Proc Internet Charging and QoS Technology Routeing", Proc Internet Charging and QoS Technology
Workshop (ICQT'03) pp308--317, September 2003, <http:// Workshop (ICQT'03) pp308--317, September 2003, <http://
research.microsoft.com/users/mort/publications.aspx>. research.microsoft.com/users/mort/publications.aspx>.
[ECN-MPLS] [I-D.arumaithurai-nsis-pcn]
Davie, B., Briscoe, B., and J. Tay, "Explicit Congestion Arumaithurai, M., "NSIS PCN-QoSM: A Quality of Service
Marking in MPLS", draft-ietf-tsvwg-ecn-mpls-01 (work in Model for Pre-Congestion Notification (PCN)",
progress), June 2007. draft-arumaithurai-nsis-pcn-00 (work in progress),
September 2007.
[I-D.ietf-nsis-rmd]
Bader, A., "RMD-QOSM - The Resource Management in Diffserv
QOS Model", draft-ietf-nsis-rmd-12 (work in progress),
November 2007.
[I-D.ietf-pcn-architecture]
Eardley, P., "Pre-Congestion Notification Architecture",
draft-ietf-pcn-architecture-03 (work in progress),
February 2008.
[IXQoS] Briscoe, B. and S. Rudkin, "Commercial Models for IP [IXQoS] Briscoe, B. and S. Rudkin, "Commercial Models for IP
Quality of Service Interconnect", BT Technology Journal Quality of Service Interconnect", BT Technology Journal
(BTTJ) 23(2)171--195, April 2005, (BTTJ) 23(2)171--195, April 2005,
<http://www.cs.ucl.ac.uk/staff/B.Briscoe/pubs.html#ixqos>. <http://www.cs.ucl.ac.uk/staff/B.Briscoe/pubs.html#ixqos>.
[NSIS-RMD]
Bader, A., Westberg, L., Karagiannis, G., Kappler, C., and
T. Phelan, "RMD-QOSM - The Resource Management in Diffserv
QOS Model", draft-ietf-nsis-rmd-09 (work in progress),
March 2007.
[PCN-arch]
Eardley, P., Babiarz, J., Chan, K., Charny, A., Geib, R.,
Karagiannis, G., Menth, M., and T. Tsou, "Pre-Congestion
Notification Architecture",
draft-eardley-pcn-architecture-00 (work in progress),
June 2007.
[RFC2205] Braden, B., Zhang, L., Berson, S., Herzog, S., and S. [RFC2205] Braden, B., Zhang, L., Berson, S., Herzog, S., and S.
Jamin, "Resource ReSerVation Protocol (RSVP) -- Version 1 Jamin, "Resource ReSerVation Protocol (RSVP) -- Version 1
Functional Specification", RFC 2205, September 1997. Functional Specification", RFC 2205, September 1997.
[RFC2207] Berger, L. and T. O'Malley, "RSVP Extensions for IPSEC [RFC2207] Berger, L. and T. O'Malley, "RSVP Extensions for IPSEC
Data Flows", RFC 2207, September 1997. Data Flows", RFC 2207, September 1997.
[RFC2208] Mankin, A., Baker, F., Braden, B., Bradner, S., O'Dell, [RFC2208] Mankin, A., Baker, F., Braden, B., Bradner, S., O'Dell,
M., Romanow, A., Weinrib, A., and L. Zhang, "Resource M., Romanow, A., Weinrib, A., and L. Zhang, "Resource
ReSerVation Protocol (RSVP) Version 1 Applicability ReSerVation Protocol (RSVP) Version 1 Applicability
skipping to change at page 50, line 8 skipping to change at page 50, line 5
Felstaine, "A Framework for Integrated Services Operation Felstaine, "A Framework for Integrated Services Operation
over Diffserv Networks", RFC 2998, November 2000. over Diffserv Networks", RFC 2998, November 2000.
[RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit [RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit
Congestion Notification (ECN) Signaling with Nonces", Congestion Notification (ECN) Signaling with Nonces",
RFC 3540, June 2003. RFC 3540, June 2003.
[RFC4727] Fenner, B., "Experimental Values In IPv4, IPv6, ICMPv4, [RFC4727] Fenner, B., "Experimental Values In IPv4, IPv6, ICMPv4,
ICMPv6, UDP, and TCP Headers", RFC 4727, November 2006. ICMPv6, UDP, and TCP Headers", RFC 4727, November 2006.
[RFC5129] Davie, B., Briscoe, B., and J. Tay, "Explicit Congestion
Marking in MPLS", RFC 5129, January 2008.
[Re-fb] Briscoe, B., Jacquet, A., Di Cairano-Gilfedder, C., [Re-fb] Briscoe, B., Jacquet, A., Di Cairano-Gilfedder, C.,
Salvatori, A., Soppera, A., and M. Koyabe, "Policing Salvatori, A., Soppera, A., and M. Koyabe, "Policing
Congestion Response in an Internetwork Using Re-Feedback", Congestion Response in an Internetwork Using Re-Feedback",
ACM SIGCOMM CCR 35(4)277--288, August 2005, <http:// ACM SIGCOMM CCR 35(4)277--288, August 2005, <http://
www.acm.org/sigs/sigcomm/sigcomm2005/ www.acm.org/sigs/sigcomm/sigcomm2005/
techprog.html#session8>. techprog.html#session8>.
[Smart_rtg] [Smart_rtg]
Goldenberg, D., Qiu, L., Xie, H., Yang, Y., and Y. Zhang, Goldenberg, D., Qiu, L., Xie, H., Yang, Y., and Y. Zhang,
"Optimizing Cost and Performance for Multihoming", ACM "Optimizing Cost and Performance for Multihoming", ACM
skipping to change at page 54, line 7 skipping to change at page 54, line 7
Martlesham Heath Martlesham Heath
Ipswich IP5 3RE Ipswich IP5 3RE
UK UK
Phone: +44 1473 645196 Phone: +44 1473 645196
Email: bob.briscoe@bt.com Email: bob.briscoe@bt.com
URI: http://www.cs.ucl.ac.uk/staff/B.Briscoe/ URI: http://www.cs.ucl.ac.uk/staff/B.Briscoe/
Full Copyright Statement Full Copyright Statement
Copyright (C) The IETF Trust (2007). Copyright (C) The IETF Trust (2008).
This document is subject to the rights, licenses and restrictions This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors contained in BCP 78, and except as set forth therein, the authors
retain all their rights. retain all their rights.
This document and the information contained herein are provided on an This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
 End of changes. 26 change blocks. 
132 lines changed or deleted 144 lines changed or added
