# Flow Control #

The assumptions of TCP are wrong, so TCP's flow control is broken - that's
one of my rationales for creating a new protocol. So what are my assumptions,
and what do I propose instead?

Let's first look at TCP: What does TCP assume?

TCP defines a window size. This equals the amount of data which is in
flight, assuming there is no packet drop.  The tail is the last acknowledged
packet, and the head is the last sent packet.  TCP has a "slow start", so it
starts with one packet, and increases the number of segments in the window
depending on how many segments are acknowledged.  This gives an exponential
growth phase, until there's one packet drop.  The assumption here is that
when the sender sends too fast, packets are dropped.  If a packet is
dropped, TCP will halve the window size, and slowly grow it by one segment
per round-trip - until again a packet is lost.

This means the number of packets in flight oscillates by a factor of two.

So what about optimal buffer size?  The optimal buffer size for a TCP
connection is one that can hold packets for 0.5 RTT - because that's the
same amount of packets that fit onto the wire (they take 0.5 RTT from source
to destination).  This is something so far nobody has implemented, because
it requires that the buffering router measures the RTT for each TCP
connection.  This measurement is possible in theory: it can be taken from
the first SYN to the first ACK.  In practice, there's usually no scientific
method applied to choosing the right buffer size; if you are lucky, there
have been a few experiments selecting some buffer size on an educated guess,
and the router has a global FIFO.

The worst problem with this algorithm is on networks with poor quality, such
as wireless networks, where packet drops are relatively frequent and have
nothing to do with the sender's rate.  The next problem is that a filled-up
buffer in the router delays all other connections, including low-latency,
low-bandwidth real-time protocols.

So that's not the way to go.  We must reevaluate the assumptions and find a
solution.
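To see the oscillation in numbers, here is a minimal sketch (my
illustration, not code from net2o or any TCP stack) of the behaviour
described above: exponential slow start, then additive increase and
multiplicative decrease once packets start dropping.

```python
def tcp_window_trace(capacity=64, rounds=40):
    """Yield the congestion window per round-trip; a drop happens whenever
    the window exceeds the path capacity (the stated TCP assumption)."""
    cwnd, slow_start = 1, True
    for _ in range(rounds):
        yield cwnd
        if cwnd > capacity:    # sender was too fast -> packet drop
            cwnd //= 2         # multiplicative decrease: halve the window
            slow_start = False
        elif slow_start:
            cwnd *= 2          # exponential growth phase
        else:
            cwnd += 1          # additive increase: one segment per round-trip

print(list(tcp_window_trace()))
# After the first drop, the window saw-tooths between ~capacity/2 and capacity.
```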
## The assumptions ##

+ Network devices do have buffers, most of them "too large" for TCP to work
  reasonably
+ The buffers are your friend, not your enemy: they avoid retransmissions
+ Buffers should usually stay almost empty
+ Packet drops are not related to flow control problems
+ Intermediate network hops can help fairness, by providing "fair queuing"

## The solution ##

Since network hops which may help with flow control are not likely to be
available soon (and are probably also not the right solution), the solution
has to do end-to-end flow control (like TCP/IP), working with single
(unfair) FIFO queuing. The flow control should be fair (_n_ competing
connections should get 1/_n_ of the data rate each), and it should not
completely yield to TCP, even in a buffer-bloat configuration.

The approach is the following: The sender sends short bursts of packets
(default: 8 packets per burst), and the receiver measures the timing when
these packets arrive - from earliest to latest - and calculates the
achievable data rate. The receiver sends this data rate back to the sender,
which adjusts its sending rate (to make sure the rate is not faked, the
receiver must prove it has received at least most of the packets). The data
rate calculation accumulates rates over several bursts (default: 4 bursts
per block), and sends only a final result, i.e. one acknowledge per 32
packets. This is the P part of a PID controller: the receiver constantly
provides measurements of achievable rates, and the sender adjusts its rate
on every ack received.

The sender tracks two things: slack and slack-growth (the I and D parts of
the PID controller). Slack, i.e. accumulated buffer space, provides an
exponential slowdown, where a factor of two equates to either half the
difference of maximum and minimum observed slack or 20ms (whichever is
larger).

Slack-growth is observed by comparing the timing of the last burst with the
first burst in the four-burst sequence. This tells us how excessive our data
rate is. To compensate, we need to multiply that time by the number of
bursts in flight, and add that as extra delay after the next burst we send.
This allows the buffer to recover.

The whole algorithm, sender and receiver side, fits into about 100 lines at
the moment, which includes the rate control and burst generation on the
sender side, but does not include all the debugging and statistics code to
observe what happens.
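A hedged sketch of these mechanics (all names here are mine; net2o itself is
written in Forth, and this is not its actual code): the receiver derives an
achievable rate from how far one burst is spread out in time, and the sender
turns the acked rate into an inter-packet delay (the P term) plus a one-off
recovery pause derived from the slack-growth (the D term).

```python
PACKETS_PER_BURST = 8   # net2o default
BURSTS_PER_BLOCK = 4    # net2o default: one ack per 4 * 8 = 32 packets

def achievable_rate(arrivals, packet_bytes):
    """Receiver side: bytes delivered between the earliest and latest
    packet of one burst, divided by the time they were spread over."""
    spread = max(arrivals) - min(arrivals)
    delivered = (len(arrivals) - 1) * packet_bytes
    return delivered / spread if spread > 0 else float("inf")

def block_rate(burst_rates):
    """Accumulate a block of bursts into the single rate that is acked."""
    return sum(burst_rates) / len(burst_rates)

class Sender:
    def __init__(self, packet_bytes, bursts_in_flight):
        self.packet_bytes = packet_bytes
        self.bursts_in_flight = bursts_in_flight
        self.packet_delay = 0.0  # seconds between packets (P term)
        self.extra_delay = 0.0   # pause after the next burst (D term)

    def on_ack(self, acked_rate, slack_growth):
        # P: pace packets at exactly the rate the receiver measured.
        self.packet_delay = self.packet_bytes / acked_rate
        # D: the last burst of the block arrived slack_growth seconds later
        # than the first, relative to the sending schedule; scaled by the
        # bursts in flight, pausing that long lets the buffer recover.
        self.extra_delay = slack_growth * self.bursts_in_flight
```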
To get fast long-distance connections up to speed quickly, the first rate
adjustment will also increase the number of packets in flight. Later, each
ack allows for further packets in flight (default: a maximum of two blocks,
i.e. 64 packets) before the next ack is expected. To achieve this, the
sender measures the round-trip delay.

This also helps to detect broken connections - if the receiver goes offline
or has been suspended temporarily, the sender stops. It can't recall the
packets already in flight, of course; these will get lost, and might
temporarily fill up buffers.

Measurements show the algorithm is fair and sufficiently stable for several
parallel connections to the same sender, and it works together with parallel
TCP and LEDBAT traffic.

## Fair Queuing ##

Instead of using a single-FIFO buffer policy, routers (or, in net2o
terminology, switches) can help fairness: Under congestion, each connection
is given its own FIFO, and all filled FIFOs of the same QoS level are served
in round-robin fashion.  This allows the receiver to accurately determine
the actual achievable bandwidth, and does not trigger the more heuristic
delay-optimizing strategies.  Also, this buffer policy allows different flow
control algorithms to be used by different protocols while still maintaining
fairness.
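Such a per-connection round-robin is simple to express; here is a minimal
sketch (a generic fair queue of my own, not taken from any actual switch or
net2o implementation) of the policy described above.

```python
from collections import OrderedDict, deque

class FairQueue:
    """One FIFO per connection; non-empty FIFOs of the same QoS level are
    served round-robin, so no single flow can monopolize the buffer."""

    def __init__(self):
        self.fifos = OrderedDict()  # connection id -> FIFO of packets

    def enqueue(self, conn, packet):
        self.fifos.setdefault(conn, deque()).append(packet)

    def dequeue(self):
        for conn, fifo in list(self.fifos.items()):
            if fifo:
                self.fifos.move_to_end(conn)  # served flow goes to the back
                return fifo.popleft()
        return None  # all FIFOs empty
```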