slow TCP connections due to very different speed of segments?

Hello group/list,

I've checked the FAQ but I couldn't find any reference to this issue.

Our campus LAN is mostly Gigabit Ethernet fiber and 100 Mb/s UTP distribution, but some remote parts of the campus are still served over LRE (Long-Reach Ethernet), which is much like a "local DSL". It's supposed to give 10 Mb/s under the best conditions AFAIK.

People from these remote locations complain that traffic to servers on the core network is very slow. I've ruled out problems at the client or server side. I've tested file transfers across the LRE segment and across the Gb Eth. segment separately. Their speeds were close to the expected max, so this (I guess) rules out problems on the segments themselves (esp. the LRE). I'm starting to wonder whether the big drop in speed between the two segments isn't the root cause (I mean, a 1 Gb/s and a 10 Mb/s segment). Would Ethernet gurus be so kind as to comment? (I'm a poor system admin, acting as a network admin!)

The exact topology is as follows:

(Core LAN)
    |
[Cisco Catalyst 4006 L3 switch]
    |  1Gb/s Eth fiber
[SMC L2 switch]
    |  100 Mb/s UTP
[Cisco LRE 29xx switch]
    |  phone cable
[LRE end equipment]
    |  100 Mb/s UTP
(client PC)

Thanks in advance for any comment, tip etc.

NOTE: the From: e-mail address is a dead one. Please post.

Greets, _Alain_

alainjean

There is no issue with the speed mismatch that you describe. TCP was designed to operate in that environment and, as the Internet and other WANs demonstrate, it actually does so.

Performance problems are always tricky. If you don't know quite a lot about how this stuff works, and can't use tools like packet sniffers and interpret the results, it could be quite difficult.

  1. The absolute first thing is to check the counters on all of the equipment to see if there are errors accumulating.

If there are, fix them. Any errors on Ethernet will most likely be caused by duplex mismatch.

The error rate on this kit is basically zero these days, and anything less than 1 in a million is OK. TCP is very good at recovery and can hide much higher rates. On LANs most ports have zero errors *ever*.

  2. Make sure that the performance is not actually as designed. This is much harder. For example, some users may complain that Windows file copies are very slow when they drag a directory across the link. It can take many network transactions to get even one file across, so the performance degrades quite quickly with extra hops, other latency, or slow links.

E.g. to copy a file, Windows might do something like the following (I made the details up, but it is pretty close, I think):

  Hello is filex there - wait on reply
  Hello are you still there - wait on reply
  Open file - wait on reply
  Close file - wait on reply
  Open file - wait on reply
  Read first block - wait on reply
  there was only one block so:
  Close file - wait on reply.

That's 7 transactions just to copy one tiny file. Windows also keeps its Explorer window up to date, so that can be another load of transfers.
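To put rough numbers on that, here is a minimal Python sketch of the arithmetic: each transaction costs one round trip of waiting, on top of the time to put the file on the wire. The 1000-byte file, the round-trip times and the function names are assumptions picked for illustration, not measurements from this network.

```python
def copy_time_seconds(file_bytes, round_trips, rtt_s, link_bps):
    """Estimate the time to copy one file over a request/reply protocol."""
    waiting = round_trips * rtt_s          # each transaction waits one round trip
    on_wire = file_bytes * 8 / link_bps    # time to serialize the payload itself
    return waiting + on_wire


def effective_bps(file_bytes, round_trips, rtt_s, link_bps):
    """Payload bits delivered per second, including the waiting time."""
    return file_bytes * 8 / copy_time_seconds(file_bytes, round_trips, rtt_s, link_bps)


if __name__ == "__main__":
    # Assumed values: 1000-byte file, 7 transactions, 10 Mb/s link.
    for rtt_ms in (1, 5, 20):
        bps = effective_bps(1000, 7, rtt_ms / 1000, 10e6)
        print(f"RTT {rtt_ms:>2} ms -> about {bps / 1e6:.2f} Mb/s effective")
```

With many small files the waiting term dominates, so the link speed hardly matters and the copy looks slow even on a healthy path.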

You can see this with a sniffer^h^h^h^h^h^h^h Ethereal.

I am now a convert. FAB.

anybody43

Gee ... people complain? :)

Both directions? When users were complaining? I typically run ttcp and ping

AFAIK, ethernet itself isn't the issue, but that Crisco bridge may be. TCP/IP has a discovery mechanism that can work well, but hasn't always been well implemented (especially MS-Win9*). I also believe some of the newer p2p apps use UDP. You may need to sniff the link.

It gets into Quality-of-Service issues, but if someone is hogging that 10 Mb/s line, everyone else will suffer. Paradoxically worse (latency) if the Crisco or LRE has big buffers.

-- Robert

Robert Redelmeier

Thanks to you Robert, FAB, Rick for all the input.

I'll do another pass of investigation based on your comments, but here are a few more words on this, though:

- duplex mismatch: I've checked this all along the path, no duplex mismatch (esp. the UTP link between the SMC and the Cisco LRE switch)

- this problem seems to happen all the time. When I've done my tests, I've used ftp transfers (from Unix boxes, so that I know at least that the ftp server is decent). Yes, the slowness was in both directions, although, as expected, download was always significantly faster than upload. Ping times were normal. I'll try ttcp too, thanks.

- I've concentrated on one complaint from one user who's making ftp transfers only... we don't allow P2P on the University campus anyway :-)

- the LRE links are not shared, so no one can be hogging them

My gut feeling is that somehow data gets pushed too fast over the fast link and that either the SMC or the Cisco LRE switch drops frames. I understand that normal TCP window mechanisms should take care of that, but I suspect that they don't, somehow, and I'm not sure how to check this (where can I find something about this in netstat output BTW?).

Any new hint is welcome, I'll follow up after a new round of checking counters and such.

Greets, _Alain_

alainjean

Of course, if the buffers are too small - say smaller than the typical TCP windows being used - a burst of traffic from the fast side may fill the buffers and overflow. Checks of the statistics might be in order - both netstat on the end systems and link-level on each side of that 10 Mbit/s pipe (or anywhere there is a speed change I suppose).
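A back-of-the-envelope version of that overflow check, as a small Python sketch. It assumes the buffer starts empty and a whole window arrives back to back at the fast line rate while the slow side drains continuously; the window and buffer sizes are made-up illustrations, not figures for the SMC or Cisco LRE switches.

```python
def peak_queue_bytes(burst_bytes, in_bps, out_bps):
    """Largest backlog a back-to-back burst builds up at a speed step-down."""
    if in_bps <= out_bps:
        return 0.0                      # egress keeps up, nothing queues
    # While the burst arrives, a fraction out/in of it has already left.
    return burst_bytes * (1.0 - out_bps / in_bps)


def dropped_bytes(burst_bytes, in_bps, out_bps, buffer_bytes):
    """Bytes that do not fit if the buffer is smaller than the peak backlog."""
    return max(0.0, peak_queue_bytes(burst_bytes, in_bps, out_bps) - buffer_bytes)


if __name__ == "__main__":
    window = 65535                      # assumed TCP window sent as one burst
    for buf in (32768, 131072):         # assumed per-port buffer sizes
        lost = dropped_bytes(window, 100e6, 10e6, buf)
        print(f"buffer {buf} bytes: about {lost:.0f} bytes dropped")
```

Real switches share buffers between ports and real senders pace their bursts to some degree, so treat this only as a sanity check on whether overflow is plausible.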

rick jones

Rick Jones

Also check cable termination quality. A split pair (homemade) on a cable will ruin performance on that collision domain.

Is LRE asymmetric? I wouldn't expect different performances, unless there are slow disks limiting.

If it's low ftp throughput, you might try adjusting that user's TCPRecvWindow.
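The reason the receive window matters: a single TCP connection cannot have more than one window of data in flight per round trip, so throughput is capped at roughly window / RTT. A minimal Python sketch of that relation follows; the 8760-byte window (six 1460-byte segments) and the round-trip times are assumptions for illustration only.

```python
def max_throughput_bps(window_bytes, rtt_s):
    """Upper bound on one connection's throughput: one window per round trip."""
    return window_bytes * 8 / rtt_s


def window_needed_bytes(link_bps, rtt_s):
    """Bandwidth-delay product: the window needed to keep the link full."""
    return link_bps * rtt_s / 8


if __name__ == "__main__":
    for rtt_ms in (1, 10, 50):          # assumed round-trip times
        rtt = rtt_ms / 1000
        cap = max_throughput_bps(8760, rtt)
        need = window_needed_bytes(10e6, rtt)
        print(f"RTT {rtt_ms:>2} ms: cap {cap / 1e6:.2f} Mb/s, "
              f"window to fill 10 Mb/s is about {need:.0f} bytes")
```

If ping times across the path really are LAN-like, a small window alone is unlikely to explain a big shortfall, which is a useful thing to rule out before changing settings.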

-- Robert

Robert Redelmeier

On the sending side, look for lines relating to retransmissions. Compare with total data segments sent. Also compare retransmissions to data segments retransmitted. On the recv side, look for out-of-order segments. While it may not match your platform entirely, the following:

ftp://ftp.cup.hp.com/dist/networking/briefs/annotated_netstat.txt

may be of some help. You may also want to compile:

ftp://ftp.cup.hp.com/dist/networking/tools/beforeafter.tar.gz

on your system so you can "subtract" one netstat from the other (check it carefully though, as that code was written and tested only on HP-UX netstat and lanadmin - it is simple, but perhaps not simple enough)
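For a platform where that tool does not build, the same "subtract two snapshots" idea can be sketched in a few lines of Python. This is not the beforeafter code, only an illustration: it assumes every counter line looks like "<number> <description>", and the sample text below is invented. Output of netstat -s differs between systems, and lines with extra numbers embedded in the description will not line up between snapshots.

```python
import re

# Assumed line shape: optional indent, a count, then a description.
_COUNTER = re.compile(r"^\s*(\d+)\s+(\S.*?)\s*$")


def parse_counters(text):
    """Map each counter description to its integer value."""
    counters = {}
    for line in text.splitlines():
        match = _COUNTER.match(line)
        if match:
            counters[match.group(2)] = int(match.group(1))
    return counters


def subtract(before, after):
    """Per-counter difference between two parsed snapshots."""
    return {name: after[name] - before.get(name, 0) for name in after}


if __name__ == "__main__":
    # Hypothetical snapshots taken before and after one test transfer.
    BEFORE = "Tcp:\n    1000 segments sent out\n    10 segments retransmitted\n"
    AFTER = "Tcp:\n    6000 segments sent out\n    260 segments retransmitted\n"
    delta = subtract(parse_counters(BEFORE), parse_counters(AFTER))
    sent = delta["segments sent out"]
    retrans = delta["segments retransmitted"]
    print(f"{retrans} of {sent} segments retransmitted ({100 * retrans / sent:.1f}%)")
```

Taking the snapshots immediately before and after a single ftp transfer keeps unrelated traffic out of the difference.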

TCP has mechanisms to attempt to adapt to congestion. How well it works depends on the nature of the congestion and the flavor of TCP being used - seems one sure way to get one's degree these days is to describe yet another tweak to congestion control :)

Someone else asked about asymmetry - that could indeed be an issue, particularly if the receiver is an "ack-every-other" and the asymmetry is great - basically, if the ratio is worse than 2*MSS/60, where MSS is the MSS for TCP over the asymmetric link and 60 is a wag for the size of an ACK segment, then the ACKs may be saturating the slow return link and limiting the flow of TCP segments the other way.
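That rule of thumb is easy to plug numbers into. A minimal Python sketch, using the 2*MSS/60 ratio exactly as stated above; the MSS of 1460 bytes and the link speeds in the example are assumptions, not known figures for the LRE link.

```python
def critical_ratio(mss_bytes, ack_bytes=60, segments_per_ack=2):
    """Forward/return speed ratio beyond which ACKs fill the return link."""
    return segments_per_ack * mss_bytes / ack_bytes


def ack_limited(forward_bps, return_bps, mss_bytes, ack_bytes=60, segments_per_ack=2):
    """True if the return link is too slow to carry the ACK stream."""
    limit = critical_ratio(mss_bytes, ack_bytes, segments_per_ack)
    return forward_bps / return_bps > limit


if __name__ == "__main__":
    mss = 1460                               # assumed MSS on plain Ethernet
    print(f"critical ratio: {critical_ratio(mss):.1f}")
    for back in (1e6, 100e3):                # assumed return-path speeds
        verdict = ack_limited(10e6, back, mss)
        print(f"10 Mb/s down, {back / 1e3:.0f} kb/s back: ACK-limited = {verdict}")
```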

rick jones

Rick Jones
