draft-ietf-quic-recovery-28.txt   draft-ietf-quic-recovery-latest.txt 
QUIC Working Group J. Iyengar, Ed. QUIC Working Group J. Iyengar, Ed.
Internet-Draft Fastly Internet-Draft Fastly
Intended status: Standards Track I. Swett, Ed. Intended status: Standards Track I. Swett, Ed.
Expires: November 20, 2020 Google Expires: December 7, 2020 Google
May 19, 2020 June 5, 2020
QUIC Loss Detection and Congestion Control QUIC Loss Detection and Congestion Control
draft-ietf-quic-recovery-28 draft-ietf-quic-recovery-latest
Abstract Abstract
This document describes loss detection and congestion control This document describes loss detection and congestion control
mechanisms for QUIC. mechanisms for QUIC.
Note to Readers Note to Readers
Discussion of this draft takes place on the QUIC working group Discussion of this draft takes place on the QUIC working group
mailing list (quic@ietf.org [1]), which is archived at mailing list (quic@ietf.org [1]), which is archived at
skipping to change at page 1, line 42 skipping to change at page 1, line 42
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on November 20, 2020. This Internet-Draft will expire on December 7, 2020.
Copyright Notice Copyright Notice
Copyright (c) 2020 IETF Trust and the persons identified as the Copyright (c) 2020 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of (https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 22, line 21 skipping to change at page 22, line 21
window MUST be reduced to the minimum congestion window window MUST be reduced to the minimum congestion window
(kMinimumWindow). This response of collapsing the congestion window (kMinimumWindow). This response of collapsing the congestion window
on persistent congestion is functionally similar to a sender's on persistent congestion is functionally similar to a sender's
response on a Retransmission Timeout (RTO) in TCP [RFC5681] after response on a Retransmission Timeout (RTO) in TCP [RFC5681] after
Tail Loss Probes (TLP) [RACK]. Tail Loss Probes (TLP) [RACK].
6.9. Pacing 6.9. Pacing
This document does not specify a pacer, but it is RECOMMENDED that a This document does not specify a pacer, but it is RECOMMENDED that a
sender pace sending of all in-flight packets based on input from the sender pace sending of all in-flight packets based on input from the
congestion controller. For example, a pacer might distribute the congestion controller. Sending multiple packets into the network
congestion window over the smoothed RTT when used with a window-based without any delay between them creates a packet burst that might
controller, or a pacer might use the rate estimate of a rate-based cause short-term congestion and losses. Implementations MUST either
controller. use pacing or another method to limit such bursts to the initial
congestion window; see Section 6.2.
An implementation should take care to architect its congestion An implementation should take care to architect its congestion
controller to work well with a pacer. For instance, a pacer might controller to work well with a pacer. For instance, a pacer might
wrap the congestion controller and control the availability of the wrap the congestion controller and control the availability of the
congestion window, or a pacer might pace out packets handed to it by congestion window, or a pacer might pace out packets handed to it by
the congestion controller. the congestion controller.
Timely delivery of ACK frames is important for efficient loss Timely delivery of ACK frames is important for efficient loss
recovery. Packets containing only ACK frames SHOULD therefore not be recovery. Packets containing only ACK frames SHOULD therefore not be
paced, to avoid delaying their delivery to the peer. paced, to avoid delaying their delivery to the peer.
skipping to change at page 23, line 10 skipping to change at page 23, line 11
Using a value for "N" that is small, but at least 1 (for example, Using a value for "N" that is small, but at least 1 (for example,
1.25) ensures that variations in round-trip time don't result in 1.25) ensures that variations in round-trip time don't result in
under-utilization of the congestion window. Values of 'N' larger under-utilization of the congestion window. Values of 'N' larger
than 1 ultimately result in sending packets as acknowledgments are than 1 ultimately result in sending packets as acknowledgments are
received rather than when timers fire, provided the congestion window received rather than when timers fire, provided the congestion window
is fully utilized and acknowledgments arrive at regular intervals. is fully utilized and acknowledgments arrive at regular intervals.
Practical considerations, such as packetization, scheduling delays, Practical considerations, such as packetization, scheduling delays,
and computational efficiency, can cause a sender to deviate from this and computational efficiency, can cause a sender to deviate from this
rate over time periods that are much shorter than a round-trip time. rate over time periods that are much shorter than a round-trip time.
Sending multiple packets into the network without any delay between
them creates a packet burst that might cause short-term congestion
and losses. Implementations MUST either use pacing or limit such
bursts to the initial congestion window; see Section 6.2.
One possible implementation strategy for pacing uses a leaky bucket One possible implementation strategy for pacing uses a leaky bucket
algorithm, where the capacity of the "bucket" is limited to the algorithm, where the capacity of the "bucket" is limited to the
maximum burst size and the rate the "bucket" fills is determined by maximum burst size and the rate the "bucket" fills is determined by
the above function. the above function.
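
The leaky bucket strategy described above can be sketched as follows. This is a minimal illustration and is not part of either draft version: the names LeakyBucketPacer, max_burst_bytes, and tokens are hypothetical, and the fill rate assumes that the pacing function elided from this diff has the form N * congestion_window / smoothed_rtt.

```python
from dataclasses import dataclass


@dataclass
class LeakyBucketPacer:
    """Hypothetical leaky-bucket pacer; all names are illustrative."""
    max_burst_bytes: int       # bucket capacity, e.g. the initial window
    n: float = 1.25            # the "N" multiplier discussed in the text
    tokens: float = 0.0        # bytes that may be sent right now
    last_update: float = 0.0   # time of the last refill, in seconds

    def __post_init__(self):
        # Start with a full bucket so one burst can go out immediately.
        self.tokens = float(self.max_burst_bytes)

    def rate(self, congestion_window, smoothed_rtt):
        # Assumed pacing function: N * congestion_window / smoothed_rtt.
        return self.n * congestion_window / smoothed_rtt

    def refill(self, now, congestion_window, smoothed_rtt):
        # Add credit for the elapsed time, capped at the bucket capacity.
        elapsed = max(0.0, now - self.last_update)
        credit = elapsed * self.rate(congestion_window, smoothed_rtt)
        self.tokens = min(float(self.max_burst_bytes), self.tokens + credit)
        self.last_update = now

    def try_send(self, now, size, congestion_window, smoothed_rtt):
        # Return True and spend credit if a packet of `size` bytes fits.
        self.refill(now, congestion_window, smoothed_rtt)
        if self.tokens >= size:
            self.tokens -= size
            return True
        return False
```

In this sketch a caller would invoke try_send() before each in-flight packet and defer the packet when it returns False. Packets containing only ACK frames would bypass the pacer, in line with the SHOULD stated earlier in this section.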
6.10. Under-utilizing the Congestion Window 6.10. Under-utilizing the Congestion Window
When bytes in flight is smaller than the congestion window and When bytes in flight is smaller than the congestion window and
sending is not pacing limited, the congestion window is under- sending is not pacing limited, the congestion window is under-
skipping to change at page 24, line 46 skipping to change at page 24, line 46
8. IANA Considerations 8. IANA Considerations
This document has no IANA actions. This document has no IANA actions.
9. References 9. References
9.1. Normative References 9.1. Normative References
[QUIC-TLS] [QUIC-TLS]
Thomson, M., Ed. and S. Turner, Ed., "Using TLS to Secure Thomson, M., Ed. and S. Turner, Ed., "Using TLS to Secure
QUIC", draft-ietf-quic-tls-28 (work in progress). QUIC", draft-ietf-quic-tls-latest (work in progress).
[QUIC-TRANSPORT] [QUIC-TRANSPORT]
Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based
Multiplexed and Secure Transport", draft-ietf-quic- Multiplexed and Secure Transport", draft-ietf-quic-
transport-28 (work in progress). transport-latest (work in progress).
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997, DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>. <https://www.rfc-editor.org/info/rfc2119>.
[RFC8085] Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage [RFC8085] Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage
Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085, Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085,
March 2017, <https://www.rfc-editor.org/info/rfc8085>. March 2017, <https://www.rfc-editor.org/info/rfc8085>.
skipping to change at page 28, line 44 skipping to change at page 28, line 44
intends to delay acknowledgments for packets in the intends to delay acknowledgments for packets in the
ApplicationData packet number space. The actual ack_delay in a ApplicationData packet number space. The actual ack_delay in a
received ACK frame may be larger due to late timers, reordering, received ACK frame may be larger due to late timers, reordering,
or lost ACK frames. or lost ACK frames.
loss_detection_timer: Multi-modal timer used for loss detection. loss_detection_timer: Multi-modal timer used for loss detection.
pto_count: The number of times a PTO has been sent without receiving pto_count: The number of times a PTO has been sent without receiving
an ack. an ack.
time_of_last_sent_ack_eliciting_packet[kPacketNumberSpace]: The time time_of_last_ack_eliciting_packet[kPacketNumberSpace]: The time the
the most recent ack-eliciting packet was sent. most recent ack-eliciting packet was sent.
largest_acked_packet[kPacketNumberSpace]: The largest packet number largest_acked_packet[kPacketNumberSpace]: The largest packet number
acknowledged in the packet number space so far. acknowledged in the packet number space so far.
loss_time[kPacketNumberSpace]: The time at which the next packet in loss_time[kPacketNumberSpace]: The time at which the next packet in
that packet number space will be considered lost based on that packet number space will be considered lost based on
exceeding the reordering window in time. exceeding the reordering window in time.
sent_packets[kPacketNumberSpace]: An association of packet numbers sent_packets[kPacketNumberSpace]: An association of packet numbers
in a packet number space to information about them. Described in in a packet number space to information about them. Described in
skipping to change at page 29, line 23 skipping to change at page 29, line 23
loss_detection_timer.reset() loss_detection_timer.reset()
pto_count = 0 pto_count = 0
latest_rtt = 0 latest_rtt = 0
smoothed_rtt = initial_rtt smoothed_rtt = initial_rtt
rttvar = initial_rtt / 2 rttvar = initial_rtt / 2
min_rtt = 0 min_rtt = 0
max_ack_delay = 0 max_ack_delay = 0
for pn_space in [ Initial, Handshake, ApplicationData ]: for pn_space in [ Initial, Handshake, ApplicationData ]:
largest_acked_packet[pn_space] = infinite largest_acked_packet[pn_space] = infinite
time_of_last_sent_ack_eliciting_packet[pn_space] = 0 time_of_last_ack_eliciting_packet[pn_space] = 0
loss_time[pn_space] = 0 loss_time[pn_space] = 0
A.5. On Sending a Packet A.5. On Sending a Packet
After a packet is sent, information about the packet is stored. The After a packet is sent, information about the packet is stored. The
parameters to OnPacketSent are described in detail above in parameters to OnPacketSent are described in detail above in
Appendix A.1.1. Appendix A.1.1.
Pseudocode for OnPacketSent follows: Pseudocode for OnPacketSent follows:
OnPacketSent(packet_number, pn_space, ack_eliciting, OnPacketSent(packet_number, pn_space, ack_eliciting,
in_flight, sent_bytes): in_flight, sent_bytes):
sent_packets[pn_space][packet_number].packet_number = sent_packets[pn_space][packet_number].packet_number =
packet_number packet_number
sent_packets[pn_space][packet_number].time_sent = now() sent_packets[pn_space][packet_number].time_sent = now()
sent_packets[pn_space][packet_number].ack_eliciting = sent_packets[pn_space][packet_number].ack_eliciting =
ack_eliciting ack_eliciting
sent_packets[pn_space][packet_number].in_flight = in_flight sent_packets[pn_space][packet_number].in_flight = in_flight
if (in_flight): if (in_flight):
if (ack_eliciting): if (ack_eliciting):
time_of_last_sent_ack_eliciting_packet[pn_space] = now() time_of_last_ack_eliciting_packet[pn_space] = now()
OnPacketSentCC(sent_bytes) OnPacketSentCC(sent_bytes)
sent_packets[pn_space][packet_number].size = sent_bytes sent_packets[pn_space][packet_number].size = sent_bytes
SetLossDetectionTimer() SetLossDetectionTimer()
A.6. On Receiving a Datagram A.6. On Receiving a Datagram
When a server is blocked by anti-amplification limits, receiving a When a server is blocked by anti-amplification limits, receiving a
datagram unblocks it, even if none of the packets in the datagram are datagram unblocks it, even if none of the packets in the datagram are
successfully processed. In such a case, the PTO timer will need to successfully processed. In such a case, the PTO timer will need to
be re-armed. be re-armed.
skipping to change at page 32, line 5 skipping to change at page 32, line 5
which is set in the packet and timer events further below. The which is set in the packet and timer events further below. The
function SetLossDetectionTimer defined below shows how the single function SetLossDetectionTimer defined below shows how the single
timer is set. timer is set.
This algorithm may result in the timer being set in the past, This algorithm may result in the timer being set in the past,
particularly if timers wake up late. Timers set in the past fire particularly if timers wake up late. Timers set in the past fire
immediately. immediately.
Pseudocode for SetLossDetectionTimer follows: Pseudocode for SetLossDetectionTimer follows:
GetEarliestTimeAndSpace(times): GetLossTimeAndSpace():
time = times[Initial] time = loss_time[Initial]
space = Initial space = Initial
for pn_space in [ Handshake, ApplicationData ]: for pn_space in [ Handshake, ApplicationData ]:
if (times[pn_space] != 0 && if (time == 0 || loss_time[pn_space] < time):
(time == 0 || times[pn_space] < time) && time = loss_time[pn_space];
# Skip ApplicationData until handshake completion.
(pn_space != ApplicationData ||
IsHandshakeComplete()):
time = times[pn_space];
space = pn_space space = pn_space
return time, space return time, space
GetPtoTimeAndSpace():
duration = (smoothed_rtt + max(4 * rttvar, kGranularity))
* (2 ^ pto_count)
// Arm PTO from now when there are no inflight packets.
if (no in-flight packets):
assert(!PeerCompletedAddressValidation())
if (has handshake keys):
return (now() + duration), Handshake
else:
return (now() + duration), Initial
pto_timeout = infinite
pto_space = Initial
for space in [ Initial, Handshake, ApplicationData ]:
if (no in-flight packets in space):
continue;
if (space == ApplicationData):
// Skip ApplicationData until handshake complete.
if (handshake is not complete):
return pto_timeout, pto_space
// Include max_ack_delay and backoff for ApplicationData.
duration += max_ack_delay * (2 ^ pto_count)
t = time_of_last_ack_eliciting_packet[space] + duration
if (t < pto_timeout):
pto_timeout = t
pto_space = space
return pto_timeout, pto_space
PeerCompletedAddressValidation(): PeerCompletedAddressValidation():
# Assume clients validate the server's address implicitly. # Assume clients validate the server's address implicitly.
if (endpoint is server): if (endpoint is server):
return true return true
# Servers complete address validation when a # Servers complete address validation when a
# protected packet is received. # protected packet is received.
return has received Handshake ACK || return has received Handshake ACK ||
has received 1-RTT ACK || has received 1-RTT ACK ||
has received HANDSHAKE_DONE has received HANDSHAKE_DONE
SetLossDetectionTimer(): SetLossDetectionTimer():
earliest_loss_time, _ = GetEarliestTimeAndSpace(loss_time)
earliest_loss_time, _ = GetLossTimeAndSpace()
if (earliest_loss_time != 0): if (earliest_loss_time != 0):
// Time threshold loss detection. // Time threshold loss detection.
loss_detection_timer.update(earliest_loss_time) loss_detection_timer.update(earliest_loss_time)
return return
if (server is at anti-amplification limit): if (server is at anti-amplification limit):
// The server's timer is not set if nothing can be sent. // The server's timer is not set if nothing can be sent.
loss_detection_timer.cancel() loss_detection_timer.cancel()
return return
if (no ack-eliciting packets in flight && if (no ack-eliciting packets in flight &&
PeerCompletedAddressValidation()): PeerCompletedAddressValidation()):
// There is nothing to detect lost, so no timer is set. // There is nothing to detect lost, so no timer is set.
// However, the client needs to arm the timer if the // However, the client needs to arm the timer if the
// server might be blocked by the anti-amplification limit. // server might be blocked by the anti-amplification limit.
loss_detection_timer.cancel() loss_detection_timer.cancel()
return return
// Determine which PN space to arm PTO for. // Determine which PN space to arm PTO for.
sent_time, pn_space = GetEarliestTimeAndSpace( timeout, _ = GetPtoTimeAndSpace()
time_of_last_sent_ack_eliciting_packet) loss_detection_timer.update(timeout)
// Don't arm PTO for ApplicationData until handshake complete.
if (pn_space == ApplicationData &&
handshake is not confirmed):
loss_detection_timer.cancel()
return
if (sent_time == 0):
assert(!PeerCompletedAddressValidation())
sent_time = now()
// Calculate PTO duration
timeout = smoothed_rtt + max(4 * rttvar, kGranularity) +
max_ack_delay
timeout = timeout * (2 ^ pto_count)
loss_detection_timer.update(sent_time + timeout)
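
The right-hand (newer) GetPtoTimeAndSpace can be exercised with a short Python sketch. This is an illustration under stated assumptions, not text from either draft: the RecoveryState container, its field names, and the K_GRANULARITY value of 1 ms are hypothetical, and the assert on PeerCompletedAddressValidation() is omitted. Note that "2 ^ pto_count" in the pseudocode denotes exponentiation, which is written "2 ** pto_count" in Python.

```python
import math
from dataclasses import dataclass, field

SPACES = ("Initial", "Handshake", "ApplicationData")
K_GRANULARITY = 0.001  # assumed timer granularity, in seconds


@dataclass
class RecoveryState:
    """Hypothetical container for the variables the pseudocode reads."""
    smoothed_rtt: float
    rttvar: float
    max_ack_delay: float
    pto_count: int = 0
    handshake_complete: bool = False
    has_handshake_keys: bool = False
    in_flight: dict = field(
        default_factory=lambda: {s: False for s in SPACES})
    time_of_last_ack_eliciting_packet: dict = field(
        default_factory=lambda: {s: 0.0 for s in SPACES})


def get_pto_time_and_space(st, now):
    backoff = 2 ** st.pto_count
    duration = (st.smoothed_rtt + max(4 * st.rttvar, K_GRANULARITY)) * backoff
    # Arm PTO from now when there are no in-flight packets at all.
    if not any(st.in_flight.values()):
        space = "Handshake" if st.has_handshake_keys else "Initial"
        return now + duration, space
    pto_timeout, pto_space = math.inf, "Initial"
    for space in SPACES:
        if not st.in_flight[space]:
            continue
        if space == "ApplicationData":
            # Skip ApplicationData until the handshake is complete.
            if not st.handshake_complete:
                return pto_timeout, pto_space
            # Include max_ack_delay, with backoff, for ApplicationData only.
            duration += st.max_ack_delay * backoff
        t = st.time_of_last_ack_eliciting_packet[space] + duration
        if t < pto_timeout:
            pto_timeout, pto_space = t, space
    return pto_timeout, pto_space
```

Compared with the left-hand SetLossDetectionTimer, this version adds max_ack_delay only for the ApplicationData space and returns the packet number space together with the timeout, so OnLossDetectionTimeout can reuse the same helper.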
A.9. On Timeout A.9. On Timeout
When the loss detection timer expires, the timer's mode determines When the loss detection timer expires, the timer's mode determines
the action to be performed. the action to be performed.
Pseudocode for OnLossDetectionTimeout follows: Pseudocode for OnLossDetectionTimeout follows:
OnLossDetectionTimeout(): OnLossDetectionTimeout():
earliest_loss_time, pn_space = earliest_loss_time, pn_space = GetLossTimeAndSpace()
GetEarliestTimeAndSpace(loss_time)
if (earliest_loss_time != 0): if (earliest_loss_time != 0):
// Time threshold loss Detection // Time threshold loss Detection
lost_packets = DetectLostPackets(pn_space) lost_packets = DetectLostPackets(pn_space)
assert(!lost_packets.empty()) assert(!lost_packets.empty())
OnPacketsLost(lost_packets) OnPacketsLost(lost_packets)
SetLossDetectionTimer() SetLossDetectionTimer()
return return
if (bytes_in_flight > 0): if (bytes_in_flight > 0):
// PTO. Send new data if available, else retransmit old data. // PTO. Send new data if available, else retransmit old data.
// If neither is available, send a single PING frame. // If neither is available, send a single PING frame.
_, pn_space = GetEarliestTimeAndSpace( _, pn_space = GetPtoTimeAndSpace()
time_of_last_sent_ack_eliciting_packet)
SendOneOrTwoAckElicitingPackets(pn_space) SendOneOrTwoAckElicitingPackets(pn_space)
else: else:
assert(endpoint is client without 1-RTT keys) assert(endpoint is client without 1-RTT keys)
// Client sends an anti-deadlock packet: Initial is padded // Client sends an anti-deadlock packet: Initial is padded
// to earn more anti-amplification credit, // to earn more anti-amplification credit,
// a Handshake packet proves address ownership. // a Handshake packet proves address ownership.
if (has Handshake keys): if (has Handshake keys):
SendOneAckElicitingHandshakePacket() SendOneAckElicitingHandshakePacket()
else: else:
SendOneAckElicitingPaddedInitialPacket() SendOneAckElicitingPaddedInitialPacket()
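
OnLossDetectionTimeout relies on GetLossTimeAndSpace to choose the packet number space with the earliest loss time. The sketch below shows one possible reading of that helper in Python. It is an assumption rather than draft text: it keeps the explicit "!= 0" check that the left-hand GetEarliestTimeAndSpace had, so that an unset (zero) loss time in a later space does not replace a set loss time in an earlier one. The function and parameter names are hypothetical.

```python
def get_loss_time_and_space(loss_time):
    """Return the earliest non-zero loss time and its packet number space.

    `loss_time` maps each packet number space name to a time, with 0
    meaning "not set", as in the pseudocode's initialization.
    """
    time, space = loss_time["Initial"], "Initial"
    for pn_space in ("Handshake", "ApplicationData"):
        candidate = loss_time[pn_space]
        # Assumption: skip unset entries, as the draft-28 helper did.
        if candidate != 0 and (time == 0 or candidate < time):
            time, space = candidate, pn_space
    return time, space
```

When every entry is zero the function returns a time of 0, which matches the "earliest_loss_time != 0" test used by the callers to fall through to the PTO path.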
skipping to change at page 39, line 39 skipping to change at page 39, line 39
Pseudocode for OnPacketNumberSpaceDiscarded follows: Pseudocode for OnPacketNumberSpaceDiscarded follows:
OnPacketNumberSpaceDiscarded(pn_space): OnPacketNumberSpaceDiscarded(pn_space):
assert(pn_space != ApplicationData) assert(pn_space != ApplicationData)
// Remove any unacknowledged packets from flight. // Remove any unacknowledged packets from flight.
foreach packet in sent_packets[pn_space]: foreach packet in sent_packets[pn_space]:
if packet.in_flight if packet.in_flight
bytes_in_flight -= size bytes_in_flight -= size
sent_packets[pn_space].clear() sent_packets[pn_space].clear()
// Reset the loss detection and PTO timer // Reset the loss detection and PTO timer
time_of_last_sent_ack_eliciting_packet[kPacketNumberSpace] = 0 time_of_last_ack_eliciting_packet[pn_space] = 0
loss_time[pn_space] = 0 loss_time[pn_space] = 0
pto_count = 0 pto_count = 0
SetLossDetectionTimer() SetLossDetectionTimer()
Appendix C. Change Log Appendix C. Change Log
*RFC Editor's Note:* Please remove this section prior to *RFC Editor's Note:* Please remove this section prior to
publication of a final version of this document. publication of a final version of this document.
Issue and pull request numbers are listed with a leading octothorp. Issue and pull request numbers are listed with a leading octothorp.
 End of changes. 18 change blocks. 
49 lines changed or deleted 54 lines changed or added
