From: "Dennis Dungs" <Dennis.Dungs@freenet.de>
Subject: RE: tcptrace Congestion Control in TCP
Date: Tue, 23 Mar 2004 10:03:21 +0100
Message-ID: <003b01c410b5$b25534b0$78030a0a@kom.auc.dk>
Hi Brent,
thank you very much for your help. I really appreciate that.
I understood what you said, but I am still uncertain about some things.
To clarify what I did and how I interpreted my measurements:
I am currently investigating TCP's performance over WLAN and Bluetooth, so
I made measurements in different scenarios (one or more connections,
different types of cross-traffic, handovers, ...).
In nearly all scenarios I cannot see any packet loss, and I wondered why
(WLAN and Bluetooth use link-layer retransmissions to overcome packet loss
due to bit errors). Here I will describe my considerations only for a
single TCP connection without any cross-traffic, run over a period of 10 s.
So I concluded that TCP's cwnd must have hit the rwnd before running into
congestion. In WLAN, I measured a maximum bandwidth of 700 kbytes/s with
RTTs of 20 ms, resulting in a bandwidth-delay product of 14 kbytes, which
equals 9.5 fully utilized packets. In Bluetooth, a bandwidth of 75 kbytes/s
and RTTs of around 300 ms result in a BDP of 22.5 kbytes, equal to 15.4
packets.
I further concluded that with an observed rwnd of 62780 (the window scaling
option set to 0, resulting in a scaling factor of 1), the receiver is
always able to process enough packets, even if the network is fully
utilized. So the rwnd is not throttling the TCP sender. I also saw no
reductions of the rwnd from the first SYN to the last FIN/ACK.
Secondly, even if ssthresh were set to 1 packet, the CA phase would need 10
RTTs (WLAN) = 200 ms or 16 RTTs (Bluetooth) = 4.8 s to run into congestion.
But I did not see any packet loss after 10 s, even when I extended the
connection duration to 30 s or more. Even using more than one connection,
which should cause "more" congestion, does not really cause frequent
retransmissions.
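Those RTT counts can be reproduced with the same kind of sketch (again
assuming the BDP packet counts from above):

    import math

    def ca_duration(bdp_packets, rtt_s):
        # congestion avoidance grows cwnd by roughly one packet per RTT,
        # starting from 1 packet, until the BDP is reached
        return math.ceil(bdp_packets) * rtt_s

    print(ca_duration(9.6, 0.020))   # WLAN:      10 RTTs ~= 0.2 s
    print(ca_duration(15.4, 0.300))  # Bluetooth: 16 RTTs ~= 4.8 s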
On the other hand, with the 17520 window, I get a Bluetooth throughput of
58 kbytes/s, which matches 17520/RTT (300 ms). The same applies to WLAN: a
throughput of 876 kbytes/s matches 17520/(20 ms). I found out that
Microsoft Windows running on the mobile node is advertising a window of
62780, whereas the wired communication partner running Red Hat 7.3 is
advertising 17520. It is not dependent on Bluetooth or WLAN, as I wrongly
stated in my first message. Sorry if this mistake caused confusion.
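This match is just the classic window-limited throughput bound, throughput
= rwnd/RTT; as a sketch:

    def window_limited_throughput(rwnd_bytes, rtt_s):
        # at most one full advertised window of data per round trip
        return rwnd_bytes / rtt_s

    print(window_limited_throughput(17520, 0.300))  # ~58400 bytes/s (BT)
    print(window_limited_throughput(17520, 0.020))  # ~876000 bytes/s (WLAN)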
So I can explain the situation with the low rwnd, but not with the high
rwnd. Do you have any hint where I should look, or what else I should take
into account, so that I can also explain the high-rwnd situation?
Thanks in advance,
Dennis
-----Original Message-----
From: owner-tcptrace@tcptrace.org [mailto:owner-tcptrace@tcptrace.org] On
Behalf Of Brent Draney
Sent: Monday, 22 March 2004 19:30
To: Dennis Dungs
Cc: tcptrace@tcptrace.org
Subject: Re: tcptrace Congestion Control in TCP
Hi Dennis,
I'm not an expert in the specific implementations under Linux, and it's
also very version-dependent, but I'll try to help with the general
principles of TCP congestion control.
First of all, there are two sides to the congestion control: sender and
receiver. The sender uses what is generally called "congestion control"
and the receiver uses the "advertised window". This allows either host to
throttle back the connection. Don't confuse sender and receiver with
connection initiator and responder: who sends the first SYN isn't
important; it's the direction of the data flow. Also, there are two
receive windows and two sender congestion controls, to cover both
directions of data.
At socket startup you have the window scaling factor (it's a bit shift, so
consider it an exponential factor, *2^winscale, not a multiplier
*winscale) that cannot be changed after the initial handshake. I have seen
instances where a winscale factor of 2 or 3 was negotiated and, due to an
out-of-order packet or missing SYNs, tcptrace would not factor in the
winscale. A missed winscale of 2 would turn 17520 into 70080.
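In other words (a minimal sketch; the numbers are the ones from this
thread):

    def effective_window(raw_window, winscale):
        # window scaling is a bit shift: raw_window * 2**winscale
        return raw_window << winscale

    print(effective_window(17520, 0))  # scale 0 -> factor 1: 17520
    print(effective_window(17520, 2))  # a missed scale of 2: 70080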
TCP goes through a phase called slow start. In this phase the socket will
send 1 to 3 packets and then wait for an ACK. The next send will be
roughly double, and then it waits for an ACK again. This repeats until one
of two things happens: you reach the advertised window (not necessarily
the receive buffer), or you drop a packet, which most versions of TCP
attribute to congestion.
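Roughly, as a sketch (packet counts only; ssthresh and loss handling are
left out):

    def slow_start_rtts(initial_cwnd_packets, rwnd_packets):
        # cwnd roughly doubles each RTT until it reaches the
        # advertised window
        cwnd, rtts = initial_cwnd_packets, 0
        while cwnd < rwnd_packets:
            cwnd *= 2
            rtts += 1
        return rtts

    # e.g. rwnd 62780 / MSS 1460 = 43 packets -> 5 RTTs from 2 packets
    print(slow_start_rtts(2, 43))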
You can never send more data than the advertised window, based on the
RFCs, and you shouldn't be able to send more data than your send buffer,
because the sender must hold onto it until it's ACKed (it might need to
retransmit). The receiver can shrink the advertised window to slow or
stall the sender. This often happens when the receiver is overloaded and
is having problems processing packets.
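So the amount a sender may have in flight is bounded by all three limits
at once; a sketch of the bound (the example numbers are made up):

    def usable_window(cwnd, rwnd, send_buffer, bytes_in_flight):
        # the sender is limited by cwnd, the advertised window, and its
        # own send buffer; unacked data already counts against the limit
        return max(0, min(cwnd, rwnd, send_buffer) - bytes_in_flight)

    print(usable_window(cwnd=30000, rwnd=17520, send_buffer=65536,
                        bytes_in_flight=10000))  # rwnd-limited: 7520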
If a packet is dropped at any time, the sender is supposed to go into
congestion control (except under fast retransmit, where the sender can
allow 1 dropped packet per window). The sender should create (or reduce,
if already in congestion control) a congestion window that is 1/2 of the
current sender window, set the congestion flag, and then add one
additional packet per window sent (linear recovery).
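A minimal sketch of that Reno-style response (the 2-packet floor and the
1460-byte MSS are my assumptions; real stacks differ in detail):

    MSS = 1460  # assumed segment size

    def on_loss(cwnd):
        # halve the current window on loss (never below 2 packets)
        return max(cwnd // 2, 2 * MSS)

    def on_window_acked(cwnd):
        # linear recovery: one additional packet per window acknowledged
        return cwnd + MSS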
The combination of slow start, backoff, and linear recovery is what you're
describing by "it creates artificially packet loss to detect the maximum
bandwidth". It's not artificial loss; it's just that the receive window
supports more bandwidth than the network does, and current versions of TCP
are too stupid to figure that out before packet loss.
Different versions of Linux deal with buffer requests differently. Early
versions had a bug which reduced the advertised window to half the
requested buffer. This was fixed later with a hack to double the
advertised window; when the real error was fixed, we had double-sized
windows for a while. Then came the "buffer tax", where between 10% and 20%
of the buffer request was reserved for the kernel to deal with as it saw
fit. VERY version-dependent.
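For illustration only, a sketch of how such a reservation could look;
Linux exposes a knob of this shape as the tcp_adv_win_scale sysctl, but
the default value and exact fraction are version-dependent, so treat the
numbers below as an assumption:

    def adv_window_from_buffer(buffer_bytes, tcp_adv_win_scale=2):
        # reserve 1/2**scale of the buffer for kernel overhead
        if tcp_adv_win_scale <= 0:
            return buffer_bytes >> -tcp_adv_win_scale
        return buffer_bytes - (buffer_bytes >> tcp_adv_win_scale)

    print(adv_window_from_buffer(65536))  # scale 2: 49152 (25% reserved)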
I have no experience with Bluetooth, but any setsockopt options may be
no-ops at some level in the stack by the time they make it to the HW.
Hope this helps. If I've misspoken I'm sure the community can help by
correcting me.
Thanks,
Brent
> Hi,
>
> I have a general question on TCP's congestion control algorithm. I
> searched for the answer in different RFCs, and I read different papers
> about the general implementation of TCP and in Linux, but I failed to
> find an answer. A look into the Linux TCP source code did not help
> either.
> So I hope to get an answer from the mailing list. If the question is
> off-topic, please refer me to a place where I can find the answer.
>
> As far as I understand TCP, the Tahoe, Reno and NewReno implementations
> of TCP use a reactive approach to adapt dynamically to the maximum
> available bandwidth. Thus, it artificially creates packet loss to detect
> the maximum bandwidth. This is done by a linear increase of cwnd per RTT
> (in the congestion avoidance phase).
> Secondly, the receiver can advertise a window (rwnd) that reports the
> number of bytes it is willing to accept. The rwnd basically reports the
> number of bytes currently free in its receive buffer.
> As I made different measurements with Ethereal and tcptrace, I cannot
> find any packet loss over a time of 30 seconds. So it can be concluded
> that rwnd limits the sending rate.
> I made measurements over Bluetooth and Wireless LAN, each without any
> packet loss, but with different throughputs. Each time, different
> maximum rwnds were advertised (BT 17520, WLAN 62780). A TCP receive
> buffer of 64 kbytes was used for both measurements.
> So I have two questions:
>
> - Did I make any errors in my considerations?
> - What is the scheme that controls the advertised window, since it
> seems not only to be related to the free buffer space of the receiver?
>
> It would be kind if I could get an answer.
>
> Thanks in advance,
>
> Dennis
----------------------------------------------------------------------------
To unsubscribe, send a message with body containing "unsubscribe tcptrace" to
majordomo@tcptrace.org.
This archive was generated by hypermail 2.1.7 : 03/23/04 EST