draft-ietf-quic-recovery-14.txt   draft-ietf-quic-recovery-latest.txt 
QUIC Working Group J. Iyengar, Ed. QUIC Working Group J. Iyengar, Ed.
Internet-Draft Fastly Internet-Draft Fastly
Intended status: Standards Track I. Swett, Ed. Intended status: Standards Track I. Swett, Ed.
Expires: February 16, 2019 Google Expires: March 23, 2019 Google
August 15, 2018 September 19, 2018
QUIC Loss Detection and Congestion Control QUIC Loss Detection and Congestion Control
draft-ietf-quic-recovery-14 draft-ietf-quic-recovery-latest
Abstract Abstract
This document describes loss detection and congestion control This document describes loss detection and congestion control
mechanisms for QUIC. mechanisms for QUIC.
Note to Readers Note to Readers
Discussion of this draft takes place on the QUIC working group Discussion of this draft takes place on the QUIC working group
mailing list (quic@ietf.org), which is archived at mailing list (quic@ietf.org), which is archived at
skipping to change at page 1, line 42 skipping to change at page 1, line 42
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on February 16, 2019. This Internet-Draft will expire on March 23, 2019.
Copyright Notice Copyright Notice
Copyright (c) 2018 IETF Trust and the persons identified as the Copyright (c) 2018 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of (https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 9, line 16 skipping to change at page 9, line 16
reordering where packet whose ack triggered the Early Retransmit reordering where packet whose ack triggered the Early Retransmit
process encountered a shorter path; process encountered a shorter path;
o the latest RTT sample is higher than the SRTT, perhaps due to a o the latest RTT sample is higher than the SRTT, perhaps due to a
sustained increase in the actual RTT, but the smoothed SRTT has sustained increase in the actual RTT, but the smoothed SRTT has
not yet caught up. not yet caught up.
The 1.125 multiplier increases reordering resilience. Implementers The 1.125 multiplier increases reordering resilience. Implementers
MAY experiment with using other multipliers, bearing in mind that a MAY experiment with using other multipliers, bearing in mind that a
lower multiplier reduces reordering resilience and increases spurious lower multiplier reduces reordering resilience and increases spurious
retransmissions, and a higher multipler increases loss recovery retransmissions, and a higher multiplier increases loss recovery
delay. delay.
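
The trade-off described above can be sketched numerically. This is a minimal illustration, not the draft's pseudocode: it assumes the timer is the multiplier applied to the larger of the smoothed and latest RTT, which the two bullets suggest but this excerpt does not state. The function and parameter names are hypothetical.

```python
def early_retransmit_delay_ms(smoothed_rtt_ms, latest_rtt_ms, multiplier=1.125):
    """Return a candidate early retransmit timer duration in milliseconds.

    Assumption: the base is max(SRTT, latest RTT), so that a recent RTT
    increase the smoothed value has not yet absorbed is still covered.
    """
    base_rtt_ms = max(smoothed_rtt_ms, latest_rtt_ms)
    # A lower multiplier fires sooner (more spurious retransmissions);
    # a higher multiplier fires later (slower loss recovery).
    return multiplier * base_rtt_ms
```

With a 100 ms smoothed RTT and an 80 ms latest sample, the default multiplier gives 112.5 ms; raising the multiplier to 1.25 gives 125 ms, which tolerates more reordering at the cost of later recovery.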
This mechanism is based on Early Retransmit for TCP [RFC5827]. This mechanism is based on Early Retransmit for TCP [RFC5827].
However, [RFC5827] does not include the timer described above. Early However, [RFC5827] does not include the timer described above. Early
Retransmit is prone to spurious retransmissions due to its reduced Retransmit is prone to spurious retransmissions due to its reduced
reordering resilience without the timer. This observation led Linux reordering resilience without the timer. This observation led Linux
TCP implementers to implement a timer for TCP as well, and this TCP implementers to implement a timer for TCP as well, and this
document incorporates this advancement. document incorporates this advancement.
4.3. Timer-based Detection 4.3. Timer-based Detection
skipping to change at page 10, line 11 skipping to change at page 10, line 11
When CRYPTO frames are sent, the sender SHOULD set a timer for the When CRYPTO frames are sent, the sender SHOULD set a timer for the
handshake timeout period. Upon timeout, the sender MUST retransmit handshake timeout period. Upon timeout, the sender MUST retransmit
all unacknowledged CRYPTO data by calling all unacknowledged CRYPTO data by calling
RetransmitAllUnackedHandshakeData(). On each consecutive expiration RetransmitAllUnackedHandshakeData(). On each consecutive expiration
of the handshake timer without receiving an acknowledgement for a new of the handshake timer without receiving an acknowledgement for a new
packet, the sender SHOULD double the handshake timeout and set a packet, the sender SHOULD double the handshake timeout and set a
timer for this period. timer for this period.
When CRYPTO frames are outstanding, the TLP and RTO timers are not When CRYPTO frames are outstanding, the TLP and RTO timers are not
active unless the CRYPTO frames were sent at 1RTT encryption. active unless the CRYPTO frames were sent at 1-RTT encryption.
When an acknowledgement is received for a handshake packet, the new When an acknowledgement is received for a handshake packet, the new
RTT is computed and the timer SHOULD be set for twice the newly RTT is computed and the timer SHOULD be set for twice the newly
computed smoothed RTT. computed smoothed RTT.
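
The handshake timer behaviour described above can be sketched as follows. This is an illustrative reading, not the draft's pseudocode: it assumes kInitialRtt stands in for the smoothed RTT until a sample exists, and it omits any minimum timeout the full algorithm may apply. The function name and millisecond units are hypothetical.

```python
K_INITIAL_RTT_MS = 100  # kInitialRtt: the draft's RECOMMENDED value


def handshake_timeout_ms(smoothed_rtt_ms, handshake_count):
    """Return the handshake timer duration in milliseconds.

    smoothed_rtt_ms is None until the first RTT sample is taken.
    handshake_count is the number of consecutive handshake timer
    expirations without an acknowledgement for a new packet.
    """
    # Assumption: fall back to kInitialRtt before any RTT sample exists.
    base_rtt_ms = K_INITIAL_RTT_MS if smoothed_rtt_ms is None else smoothed_rtt_ms
    # Twice the smoothed RTT, doubled again on each consecutive expiration.
    return 2 * base_rtt_ms * (2 ** handshake_count)
```

For example, with a 50 ms smoothed RTT the timer starts at 100 ms and reaches 800 ms after three consecutive expirations.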
4.3.1.1. Retry and Version Negotiation 4.3.1.1. Retry and Version Negotiation
A Retry or Version Negotiation packet causes a client to send another A Retry or Version Negotiation packet causes a client to send another
Initial packet, effectively restarting the connection process. Initial packet, effectively restarting the connection process.
skipping to change at page 12, line 38 skipping to change at page 12, line 38
A packet sent on an RTO timer MUST NOT be blocked by the sender's A packet sent on an RTO timer MUST NOT be blocked by the sender's
congestion controller. A sender MUST however count these bytes as congestion controller. A sender MUST however count these bytes as
additional bytes in flight, since this packet adds network load additional bytes in flight, since this packet adds network load
without establishing packet loss. without establishing packet loss.
4.4. Generating Acknowledgements 4.4. Generating Acknowledgements
QUIC SHOULD delay sending acknowledgements in response to packets, QUIC SHOULD delay sending acknowledgements in response to packets,
but MUST NOT excessively delay acknowledgements of packets containing but MUST NOT excessively delay acknowledgements of packets containing
frames other than ACK or ACK_ECN. Specifically, implementaions MUST frames other than ACK or ACK_ECN. Specifically, implementations MUST
attempt to enforce a maximum ack delay to avoid causing the peer attempt to enforce a maximum ack delay to avoid causing the peer
spurious timeouts. The RECOMMENDED maximum ack delay in QUIC is spurious timeouts. The RECOMMENDED maximum ack delay in QUIC is
25ms. 25ms.
An acknowledgement MAY be sent for every second full-sized packet, as An acknowledgement MAY be sent for every second full-sized packet, as
TCP does [RFC5681], or may be sent less frequently, as long as the TCP does [RFC5681], or may be sent less frequently, as long as the
delay does not exceed the maximum ack delay. QUIC recovery delay does not exceed the maximum ack delay. QUIC recovery
algorithms do not assume the peer generates an acknowledgement algorithms do not assume the peer generates an acknowledgement
immediately when receiving a second full-sized packet. immediately when receiving a second full-sized packet.
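
One way a receiver might combine the two rules above (acknowledge every second packet, but never wait longer than the maximum ack delay) is sketched below. The class, its method names, and the use of millisecond timestamps are hypothetical; for brevity the sketch counts every packet that needs acknowledging rather than only full-sized ones.

```python
MAX_ACK_DELAY_MS = 25  # RECOMMENDED maximum ack delay


class AckScheduler:
    """Decide when a delayed acknowledgement should be sent."""

    def __init__(self, max_ack_delay_ms=MAX_ACK_DELAY_MS, packets_per_ack=2):
        self.max_ack_delay_ms = max_ack_delay_ms
        self.packets_per_ack = packets_per_ack
        self.pending = 0               # packets received but not yet acked
        self.oldest_pending_ms = None  # arrival time of the oldest such packet

    def on_packet_received(self, now_ms):
        """Record a packet containing frames other than ACK or ACK_ECN."""
        if self.pending == 0:
            self.oldest_pending_ms = now_ms
        self.pending += 1
        return self.should_ack(now_ms)

    def should_ack(self, now_ms):
        """Return True if an ACK frame should be sent at time now_ms."""
        if self.pending == 0:
            return False
        if self.pending >= self.packets_per_ack:
            return True
        # Never hold an acknowledgement longer than the maximum ack delay.
        return now_ms - self.oldest_pending_ms >= self.max_ack_delay_ms

    def on_ack_sent(self):
        """Reset state once an ACK frame has been sent."""
        self.pending = 0
        self.oldest_pending_ms = None
```

A caller would poll should_ack() from its timer loop and call on_ack_sent() after transmitting the ACK frame.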
skipping to change at page 13, line 34 skipping to change at page 13, line 34
4.4.2. ACK Ranges 4.4.2. ACK Ranges
When an ACK frame is sent, one or more ranges of acknowledged packets When an ACK frame is sent, one or more ranges of acknowledged packets
are included. Including older packets reduces the chance of spurious are included. Including older packets reduces the chance of spurious
retransmits caused by losing previously sent ACK frames, at the cost retransmits caused by losing previously sent ACK frames, at the cost
of larger ACK frames. of larger ACK frames.
ACK frames SHOULD always acknowledge the most recently received ACK frames SHOULD always acknowledge the most recently received
packets, and the more out-of-order the packets are, the more packets, and the more out-of-order the packets are, the more
important it is to send an updated ACK frame quickly, to prevent the important it is to send an updated ACK frame quickly, to prevent the
peer from declaring a packet as lost and spuriusly retransmitting the peer from declaring a packet as lost and spuriously retransmitting
frames it contains. the frames it contains.
Below is one recommended approach for determining what packets to Below is one recommended approach for determining what packets to
include in an ACK frame. include in an ACK frame.
4.4.3. Receiver Tracking of ACK Frames 4.4.3. Receiver Tracking of ACK Frames
When a packet containing an ACK frame is sent, the largest When a packet containing an ACK frame is sent, the largest
acknowledged in that frame may be saved. When a packet containing an acknowledged in that frame may be saved. When a packet containing an
ACK frame is acknowledged, the receiver can stop acknowledging ACK frame is acknowledged, the receiver can stop acknowledging
packets less than or equal to the largest acknowledged in the sent packets less than or equal to the largest acknowledged in the sent
skipping to change at page 14, line 17 skipping to change at page 14, line 17
progress. progress.
4.5. Pseudocode 4.5. Pseudocode
4.5.1. Constants of interest 4.5.1. Constants of interest
Constants used in loss recovery are based on a combination of RFCs, Constants used in loss recovery are based on a combination of RFCs,
papers, and common practice. Some may need to be changed or papers, and common practice. Some may need to be changed or
negotiated in order to better suit a variety of environments. negotiated in order to better suit a variety of environments.
kMaxTLPs (RECOMMENDED 2): Maximum number of tail loss probes before kMaxTLPs: Maximum number of tail loss probes before an RTO expires.
an RTO expires. The RECOMMENDED value is 2.
kReorderingThreshold (RECOMMENDED 3): Maximum reordering in packet kReorderingThreshold: Maximum reordering in packet number space
number space before FACK style loss detection considers a packet before FACK style loss detection considers a packet lost. The
lost. RECOMMENDED value is 3.
kTimeReorderingFraction (RECOMMENDED 1/8): Maximum reordering in kTimeReorderingFraction: Maximum reordering in time space before
time space before time based loss detection considers a packet time based loss detection considers a packet lost. In fraction of
lost. In fraction of an RTT. an RTT. The RECOMMENDED value is 1/8.
kUsingTimeLossDetection (RECOMMENDED false): Whether time based loss kUsingTimeLossDetection: Whether time based loss detection is in
detection is in use. If false, uses FACK style loss detection. use. If false, uses FACK style loss detection. The RECOMMENDED
value is false.
kMinTLPTimeout (RECOMMENDED 10ms): Minimum time in the future a tail kMinTLPTimeout: Minimum time in the future a tail loss probe timer
loss probe timer may be set for. may be set for. The RECOMMENDED value is 10ms.
kMinRTOTimeout (RECOMMENDED 200ms): Minimum time in the future an kMinRTOTimeout: Minimum time in the future an RTO timer may be set
RTO timer may be set for. for. The RECOMMENDED value is 200ms.
kDelayedAckTimeout (RECOMMENDED 25ms): The length of the peer's kDelayedAckTimeout: The length of the peer's delayed ack timer. The
delayed ack timer. RECOMMENDED value is 25ms.
kInitialRtt (RECOMMENDED 100ms): The RTT used before an RTT sample kInitialRtt: The RTT used before an RTT sample is taken. The
is taken. RECOMMENDED value is 100ms.
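
For reference, the RECOMMENDED values listed above could be collected in one place as shown below. The Python names and the choice of milliseconds as the unit are assumptions of this sketch; only the values come from the list.

```python
from fractions import Fraction

# Loss recovery constants, using the RECOMMENDED values from the list above.
K_MAX_TLPS = 2                               # tail loss probes before an RTO
K_REORDERING_THRESHOLD = 3                   # packets, FACK style detection
K_TIME_REORDERING_FRACTION = Fraction(1, 8)  # fraction of an RTT
K_USING_TIME_LOSS_DETECTION = False          # False selects FACK style detection
K_MIN_TLP_TIMEOUT_MS = 10                    # minimum tail loss probe timer
K_MIN_RTO_TIMEOUT_MS = 200                   # minimum RTO timer
K_DELAYED_ACK_TIMEOUT_MS = 25                # peer's delayed ack timer
K_INITIAL_RTT_MS = 100                       # RTT used before any sample
```

Using Fraction for kTimeReorderingFraction keeps 1/8 exact when it is later multiplied by an integer RTT.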
4.5.2. Variables of interest 4.5.2. Variables of interest
Variables required to implement the congestion control mechanisms are Variables required to implement the congestion control mechanisms are
described in this section. described in this section.
loss_detection_timer: Multi-modal timer used for loss detection. loss_detection_timer: Multi-modal timer used for loss detection.
handshake_count: The number of times all unacknowledged handshake handshake_count: The number of times all unacknowledged handshake
data has been retransmitted without receiving an ack. data has been retransmitted without receiving an ack.
tlp_count: The number of times a tail loss probe has been sent tlp_count: The number of times a tail loss probe has been sent
without receiving an ack. without receiving an ack.
rto_count: The number of times an rto has been sent without rto_count: The number of times an RTO has been sent without
receiving an ack. receiving an ack.
largest_sent_before_rto: The last packet number sent prior to the largest_sent_before_rto: The last packet number sent prior to the
first retransmission timeout. first retransmission timeout.
time_of_last_sent_retransmittable_packet: The time the most recent time_of_last_sent_retransmittable_packet: The time the most recent
retransmittable packet was sent. retransmittable packet was sent.
time_of_last_sent_handshake_packet: The time the most recent packet time_of_last_sent_handshake_packet: The time the most recent packet
containing a CRYPTO frame was sent. containing a CRYPTO frame was sent.
skipping to change at page 19, line 20 skipping to change at page 19, line 20
Pseudocode for OnPacketAcked follows: Pseudocode for OnPacketAcked follows:
OnPacketAcked(acked_packet): OnPacketAcked(acked_packet):
if (!acked_packet.is_ack_only): if (!acked_packet.is_ack_only):
OnPacketAckedCC(acked_packet) OnPacketAckedCC(acked_packet)
// If a packet sent prior to RTO was acked, then the RTO // If a packet sent prior to RTO was acked, then the RTO
// was spurious. Otherwise, inform congestion control. // was spurious. Otherwise, inform congestion control.
if (rto_count > 0 && if (rto_count > 0 &&
acked_packet.packet_number > largest_sent_before_rto): acked_packet.packet_number > largest_sent_before_rto):
OnRetransmissionTimeoutVerified( OnRetransmissionTimeoutVerified(
acket_packet.packet_number) acked_packet.packet_number)
handshake_count = 0 handshake_count = 0
tlp_count = 0 tlp_count = 0
rto_count = 0 rto_count = 0
sent_packets.remove(acked_packet.packet_number) sent_packets.remove(acked_packet.packet_number)
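
The updated pseudocode above can be exercised with a small runnable sketch. The dataclasses and the cc_events list are hypothetical stand-ins: the list records the calls that the pseudocode makes to OnPacketAckedCC and OnRetransmissionTimeoutVerified instead of invoking a real congestion controller.

```python
from dataclasses import dataclass, field


@dataclass
class SentPacket:
    packet_number: int
    is_ack_only: bool = False


@dataclass
class LossDetectionState:
    handshake_count: int = 0
    tlp_count: int = 0
    rto_count: int = 0
    largest_sent_before_rto: int = 0
    sent_packets: dict = field(default_factory=dict)
    # Hypothetical log of congestion controller notifications.
    cc_events: list = field(default_factory=list)

    def on_packet_acked(self, acked_packet):
        if not acked_packet.is_ack_only:
            self.cc_events.append(("acked", acked_packet.packet_number))
        # Only an ack for a packet sent after the RTO fired is reported
        # to congestion control as a verified retransmission timeout.
        if (self.rto_count > 0 and
                acked_packet.packet_number > self.largest_sent_before_rto):
            self.cc_events.append(("rto_verified", acked_packet.packet_number))
        self.handshake_count = 0
        self.tlp_count = 0
        self.rto_count = 0
        self.sent_packets.pop(acked_packet.packet_number, None)
```

Acknowledging packet 6 after an RTO that fired when packet 5 was the largest sent records both events, while acknowledging packet 4 records only the plain ack.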
4.5.7. Setting the Loss Detection Timer 4.5.7. Setting the Loss Detection Timer
QUIC loss detection uses a single timer for all timer-based loss QUIC loss detection uses a single timer for all timer-based loss
detection. The duration of the timer is based on the timer's mode, detection. The duration of the timer is based on the timer's mode,
which is set in the packet and timer events further below. The which is set in the packet and timer events further below. The
skipping to change at page 19, line 50 skipping to change at page 19, line 50
When stateless rejects are in use, the connection is considered When stateless rejects are in use, the connection is considered
immediately closed once a reject is sent, so no timer is set to immediately closed once a reject is sent, so no timer is set to
retransmit the reject. retransmit the reject.
Version negotiation packets are always stateless, and MUST be sent Version negotiation packets are always stateless, and MUST be sent
once per handshake packet that uses an unsupported QUIC version, and once per handshake packet that uses an unsupported QUIC version, and
MAY be sent in response to 0-RTT packets. MAY be sent in response to 0-RTT packets.
4.5.7.2. Tail Loss Probe and Retransmission Timer 4.5.7.2. Tail Loss Probe and Retransmission Timer
Tail loss probes [TLP] and retransmission timeouts [RFC6298] are a Tail loss probes [TLP] and retransmission timeouts [RFC6298] are
timer based mechanism to recover from cases when there are timer based mechanisms to recover from cases when there are
outstanding retransmittable packets, but an acknowledgement has not outstanding retransmittable packets, but an acknowledgement has not
been received in a timely manner. been received in a timely manner.
The TLP and RTO timers are armed when there is not unacknowledged The TLP and RTO timers are armed when there is no unacknowledged
handshake data. The TLP timer is set until the max number of TLP handshake data. The TLP timer is set until the max number of TLP
packets have been sent, and then the RTO timer is set. packets have been sent, and then the RTO timer is set.
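
The arming rules above amount to a simple mode selection, sketched below. The function name, the string labels, and the retransmittable_outstanding flag are hypothetical; the sketch leaves out the early retransmit timer and the exception for CRYPTO frames sent at 1-RTT encryption.

```python
K_MAX_TLPS = 2  # kMaxTLPs: the draft's RECOMMENDED value


def select_timer_mode(handshake_data_outstanding, retransmittable_outstanding,
                      tlp_count, max_tlps=K_MAX_TLPS):
    """Return which loss detection timer mode applies, or None if unarmed."""
    if handshake_data_outstanding:
        # TLP and RTO timers are not armed while handshake data is unacked.
        return "handshake"
    if not retransmittable_outstanding:
        return None
    if tlp_count < max_tlps:
        return "tlp"
    # All tail loss probes have been sent, so fall back to the RTO timer.
    return "rto"
```

With the default of two probes, the mode moves from "tlp" to "rto" once tlp_count reaches 2.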
4.5.7.3. Early Retransmit Timer 4.5.7.3. Early Retransmit Timer
Early retransmit [RFC5827] is implemented with a 1/4 RTT timer. It Early retransmit [RFC5827] is implemented with a 1/4 RTT timer. It
is part of QUIC's time based loss detection, but is always enabled, is part of QUIC's time based loss detection, but is always enabled,
even when only packet reordering loss detection is enabled. even when only packet reordering loss detection is enabled.
4.5.7.4. Pseudocode 4.5.7.4. Pseudocode
skipping to change at page 26, line 13 skipping to change at page 26, line 13
scheduler (fq qdisc) in Linux (3.11 onwards). scheduler (fq qdisc) in Linux (3.11 onwards).
5.8. Pseudocode 5.8. Pseudocode
5.8.1. Constants of interest 5.8.1. Constants of interest
Constants used in congestion control are based on a combination of Constants used in congestion control are based on a combination of
RFCs, papers, and common practice. Some may need to be changed or RFCs, papers, and common practice. Some may need to be changed or
negotiated in order to better suit a variety of environments. negotiated in order to better suit a variety of environments.
kMaxDatagramSize (RECOMMENDED 1200 bytes): The sender's maximum kMaxDatagramSize: The sender's maximum payload size. Does not
payload size. Does not include UDP or IP overhead. The max include UDP or IP overhead. The max packet size is used for
packet size is used for calculating initial and minimum congestion calculating initial and minimum congestion windows. The
windows. RECOMMENDED value is 1200 bytes.
kInitialWindow (RECOMMENDED min(10 * kMaxDatagramSize,
max(2* kMaxDatagramSize, 14600))): kInitialWindow: Default limit on the initial amount of outstanding
Default limit on the initial amount of outstanding data in bytes. data in bytes. Taken from [RFC6928]. The RECOMMENDED value is
Taken from [RFC6928]. the minimum of 10 * kMaxDatagramSize and max(2* kMaxDatagramSize,
14600).
kMinimumWindow (RECOMMENDED 2 * kMaxDatagramSize): Minimum kMinimumWindow: Minimum congestion window in bytes. The RECOMMENDED
congestion window in bytes. value is 2 * kMaxDatagramSize.
kLossReductionFactor (RECOMMENDED 0.5): Reduction in congestion kLossReductionFactor: Reduction in congestion window when a new loss
window when a new loss event is detected. event is detected. The RECOMMENDED value is 0.5.
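
As a worked example of the constants above, the sketch below evaluates the window formulas for a given datagram size. The function names are hypothetical, and clamping the reduced window to kMinimumWindow is an assumption about how the two constants interact rather than something this excerpt states.

```python
K_MAX_DATAGRAM_SIZE = 1200     # bytes, RECOMMENDED value
K_LOSS_REDUCTION_FACTOR = 0.5  # RECOMMENDED value


def initial_window(max_datagram_size=K_MAX_DATAGRAM_SIZE):
    """kInitialWindow in bytes, following the formula taken from RFC 6928."""
    return min(10 * max_datagram_size, max(2 * max_datagram_size, 14600))


def minimum_window(max_datagram_size=K_MAX_DATAGRAM_SIZE):
    """kMinimumWindow in bytes."""
    return 2 * max_datagram_size


def window_after_loss(congestion_window, max_datagram_size=K_MAX_DATAGRAM_SIZE):
    """Reduce the window on a new loss event.

    Assumption: the result never drops below kMinimumWindow.
    """
    reduced = int(congestion_window * K_LOSS_REDUCTION_FACTOR)
    return max(reduced, minimum_window(max_datagram_size))
```

With 1200-byte datagrams the initial window is 12000 bytes, because 10 * 1200 is smaller than 14600; with 1500-byte datagrams the 14600-byte cap applies instead.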
5.8.2. Variables of interest 5.8.2. Variables of interest
Variables required to implement the congestion control mechanisms are Variables required to implement the congestion control mechanisms are
described in this section. described in this section.
ecn_ce_counter: The highest value reported for the ECN-CE counter by ecn_ce_counter: The highest value reported for the ECN-CE counter by
the peer in an ACK_ECN frame. This variable is used to detect the peer in an ACK_ECN frame. This variable is used to detect
increases in the reported ECN-CE counter. increases in the reported ECN-CE counter.
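
Detecting an increase in the reported counter could look like the sketch below. The class and method names are hypothetical, and what the sender does once an increase is seen is outside this excerpt.

```python
class EcnCeTracker:
    """Track the highest ECN-CE count reported by the peer."""

    def __init__(self):
        self.ecn_ce_counter = 0

    def on_ack_ecn(self, reported_ce_count):
        """Return True if the peer reported a higher ECN-CE count than before."""
        if reported_ce_count > self.ecn_ce_counter:
            self.ecn_ce_counter = reported_ce_count
            return True
        # Equal or lower values, such as those from reordered ACK_ECN
        # frames, leave the stored maximum unchanged.
        return False
```

Keeping only the highest value means a late ACK_ECN frame carrying an older count is not mistaken for a new increase.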
skipping to change at page 30, line 16 skipping to change at page 30, line 16
This document has no IANA actions. Yet. This document has no IANA actions. Yet.
8. References 8. References
8.1. Normative References 8.1. Normative References
[QUIC-TRANSPORT] [QUIC-TRANSPORT]
Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based
Multiplexed and Secure Transport", draft-ietf-quic- Multiplexed and Secure Transport", draft-ietf-quic-
transport-14 (work in progress). transport-latest (work in progress).
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997, DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>. <https://www.rfc-editor.org/info/rfc2119>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>. May 2017, <https://www.rfc-editor.org/info/rfc8174>.
 End of changes. 24 change blocks. 
46 lines changed or deleted 46 lines changed or added
