From: Brent Draney (brdraney@nersc.gov)
Date: 03/22/04
Message-Id: <200403221829.i2MITf8t062139@exit.nersc.gov>
Subject: Re: tcptrace Congestion Control in TCP
Date: Mon, 22 Mar 2004 10:29:41 -0800
From: Brent Draney <brdraney@nersc.gov>
Hi Dennis,
I'm not an expert in the specific implementations under Linux, and it's also
very version dependent, but I'll try to help with the general principles
of TCP congestion control.
First of all, there are 2 sides to the congestion control: sender and receiver.
The sender uses what is generally called "congestion control" and the receiver
uses the "advertised window". This allows either host to throttle back the
connection. Don't confuse sender and receiver with connection initiator and
responder. Who sends the first SYN isn't important; it's the direction of the
data flow that matters. Also, there are 2 receive windows and 2 sender
congestion controls to cover both directions of data.
At socket startup you have the window scaling factor (it's a bit shift, so consider
it an exponential factor, *2^winscale, not a multiplier *winscale) that cannot be
changed after the initial handshake. I have seen instances where a winscale factor
of 2 or 3 was negotiated and, due to an out-of-order packet or missing SYNs, tcptrace
will not factor in the winscale. A missed winscale of 2 would turn 17520 into 70080.
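To make the arithmetic concrete, here's a minimal sketch (the function name is my own) of what the scale factor does:

```python
# Effective TCP window = raw 16-bit window field shifted left by the
# negotiated window-scale factor (RFC 1323). A bit shift, not a multiply.

def effective_window(raw_window: int, winscale: int) -> int:
    """Return the true window when a scale factor was negotiated."""
    return raw_window << winscale  # same as raw_window * 2**winscale

# If a trace tool misses a winscale of 2 (e.g. the SYNs weren't captured),
# it reports the raw 17520 instead of the true window:
print(effective_window(17520, 2))  # 70080
```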
TCP goes through a phase called slow start. In this phase the socket will send 1 to 3
packets and then wait for an ack. The next send will be roughly double, and it waits
for an ack again. This repeats until one of two things happens: you reach the advertised
window (not necessarily the receive buffer), or you drop a packet, which most
versions of TCP attribute to congestion.
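The doubling-until-capped behavior can be sketched roughly like this (a toy model in units of packets, ignoring loss; the function name is my own):

```python
def slow_start(initial_cwnd: int, advertised_window: int) -> list[int]:
    """Roughly double the congestion window each round trip until the
    advertised window caps it (packet loss would also end slow start)."""
    cwnd = initial_cwnd
    history = [cwnd]
    while cwnd < advertised_window:
        cwnd = min(cwnd * 2, advertised_window)
        history.append(cwnd)
    return history

# Starting at 2 packets with a 32-packet advertised window:
print(slow_start(2, 32))  # [2, 4, 8, 16, 32]
```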
You can never send more data than the advertised window, per the RFCs, and you
shouldn't be able to send more data than your send buffer, because the sender must hold
onto it until it's ack'ed (it might need to retransmit). The receiver can shrink the
advertised window to slow or stall the sender. This often happens when the receiver
is overloaded and is having problems processing packets.
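In other words, the advertised window is essentially the free space left in the receive buffer; a sketch of that relationship (names are my own, and real stacks are more subtle about when they advertise updates):

```python
def advertised_window(buffer_size: int, unread_bytes: int) -> int:
    """The receiver advertises the free space in its receive buffer.
    An overloaded receiver that can't drain the buffer shrinks the
    window toward zero, which stalls the sender."""
    return max(buffer_size - unread_bytes, 0)

print(advertised_window(65536, 0))      # 65536 - empty buffer, full window
print(advertised_window(65536, 65536))  # 0     - full buffer, sender stalls
```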
If a packet is dropped at any time, the sender is supposed to go into congestion
control (except under fast retransmit, where the sender can allow 1 dropped packet
per window). The sender should create (or reduce, if already in congestion control)
a congestion window that is 1/2 of the current sender window, set the congestion
flag, and then add one additional packet per window sent (linear recovery).
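The halve-on-loss, grow-by-one-per-window pattern can be sketched like so (a toy model in units of packets; function names are my own):

```python
def on_loss(cwnd: int) -> int:
    """On a drop, cut the congestion window to half the current window."""
    return max(cwnd // 2, 1)

def on_window_acked(cwnd: int) -> int:
    """During linear recovery, grow by one packet per window sent."""
    return cwnd + 1

cwnd = 40
cwnd = on_loss(cwnd)            # halved to 20
for _ in range(3):              # three windows ack'ed after the drop
    cwnd = on_window_acked(cwnd)
print(cwnd)                     # 23
```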
The combination of slow start, backoff, and linear recovery is what you're describing
as "it creates artifically packet loss to detect the maximum bandwidth." It's
not artificial loss; it's just that the receive window supports more bandwidth than the
network does, and current versions of TCP are too stupid to figure that out before
packet loss.
Different versions of Linux deal with buffer requests differently. Early versions
had a bug which reduced the advertised window to half the requested buffer.
This was fixed later with a hack to double the adv. window. When the real
error was fixed, we had double-sized windows for a while. Then came the "buffer tax",
where between 10% and 20% of the buffer request was reserved for the kernel
to deal with as it saw fit. VERY version dependent.
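You can see this for yourself by asking for a buffer and reading back what the kernel actually granted; a small sketch (the exact granted value is version dependent, and on many Linux kernels the readback is double the request because the kernel reserves headroom for its own bookkeeping):

```python
import socket

# Request a 64 KB receive buffer, then read back what the kernel granted.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
requested = 65536
s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
granted = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
print("requested:", requested, "granted:", granted)  # typically won't match
s.close()
```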
I have no experience with Bluetooth, but any setsockopt options may be no-ops
at some level in the stack by the time they make it to the hardware.
Hope this helps. If I've misspoken, I'm sure the community can help by correcting me.
Thanks,
Brent
> Hi,
>
> I have a general question on TCPs congestion control algorithm. I searched
> for the answer in different RFCs as well as I read different papers about
> the general implementation of TCP and in Linux. I failed to find an answer.
> Also a look into the Linux TCP source code did not help.
> So I hope to get an answer from the mailing list. If the question is
> offtopic, please referr me to a point, where I can find the answer.
>
> As what I understand about TCP, the Tahoe, Reno and NewReno implementation
> of TCP use a reactive approach to adapt dynamically to the maximum avialable
> bandwidth. Thus, it creates artifically packet loss to detect the maximum
> bandwidth. This is done by a linear increase of cwnd per RTT (in congestion
> avoidance phase).
> Secondly, the receiver can advertise a window (rwnd), that reports the
> number of bytes he is willing to accept. The rwnd basically reports the
> number of bytes currently free in his receiving buffer.
> As I made different measurements with Ethereal and tcptrace, I cannot find
> any packet loss over a time of 30 seconds. So it can be concluded, that
> rwnd limits the sending rate.
> I made measurements over Bluetooth and Wireless LAN, each without any packet
> loss, but different throughputs. Each time, different maximum rwnd have been
> advertised (BT 17520, WLAN 62780). A TCP receiving buffer of 64kByte was
> used for both measurements.
> So I have two questions:
>
> - Did I made any errors in my considerations?
> - What is the scheme to control the advertised window, since it seems not
> only to be related to the free buffer space of the receiver?
>
> It would be kind, if I could get an answer.
>
> Thanks in advance,
>
> Dennis
>
>
> ----------------------------------------------------------------------------
> To unsubscribe, send a message with body containing "unsubscribe tcptrace" to
> majordomo@tcptrace.org.
>
This archive was generated by hypermail 2.1.7 : 03/23/04 EST