X-over accidentally connected from server to switch. But it worked!

WeeJockMacFeegle 20 years ago

I've been investigating a problem recently where we were experiencing intermittent packet loss and errors on the switch to a particular server.

In the end we traced the problem to the fact that the server had been connected to the switch using a crossover cable.

In my view this shouldn't have worked at all, but it did, albeit at a reduced rate and with significant errors.

Can anyone explain why this would actually work? Was the switch or NIC compensating in some way for the cable being incorrect?

Harald Andersen 20 years ago

Many, if not most, new gigabit NICs and switches support "Auto MDI/MDI-X" and will work with both straight-through and crossover cables.

/HC

Barry Margolin 20 years ago

In that case, should he have been getting errors at all because of the incorrect cable type?

Does this interfere with auto-sensing speed and duplex? Mismatches there often result in lots of packet loss.

googlegroups 20 years ago

No. The cable type does not explain the errors. If at least one of his transceivers supports this mechanism, then both are the correct cable type.

Auto MDI/MDI-X does not make any difference to the detection of speed and duplex settings. He still has to get this right.

Could it have been a *bad* crossover cable?

/chris

WeeJockMacFeegle 20 years ago

I don't think it was a bad crossover cable. I put the cable tester on it and it reported all fine. I suspect it was probably a mismatch between the NIC and switch speed/duplex settings. The switch was set to 100Mb/Full; the NICs auto-set themselves to 100Mb/Half.

Strapping the NICs to 100Mb/Full killed the connection. I suspect this is because the NIC could handle having the wrong cable at half duplex, but not at full duplex.

Thanks everyone.

Gary

Rick Jones 20 years ago

Some old boilerplate I trot-out from time to time:

How 100Base-T Autoneg is supposed to work:

When both sides of the link are set to autoneg, they will "negotiate" the duplex setting and select full duplex if both sides can do full-duplex.

If one side is hardcoded and not using autoneg, the autoneg process will "fail" and the side trying to autoneg is required by spec to use half-duplex mode.

If one side is using half-duplex, and the other is using full-duplex, sorrow and woe is the usual result.

So, the following table shows what will happen given various settings on each side:

        Auto        Half        Full
Auto    Happiness   Lucky       Sorrow
Half    Lucky       Happiness   Sorrow
Full    Sorrow      Sorrow      Happiness

Happiness means that there is a good shot of everything going well. Lucky means that things will likely go well, but not because you did anything correctly :) Sorrow means that there _will_ be a duplex mis-match.
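The rules above can be sketched in a few lines of Python. This is only an illustration of the table, not anything from the 802.3 spec itself; the names `resolved_duplex` and `outcome` are made up for the example, and it assumes both ends are full-duplex capable when they autonegotiate.

```python
AUTO, HALF, FULL = "auto", "half", "full"


def resolved_duplex(local, peer):
    """Duplex a port ends up using, given its own setting and the peer's."""
    if local != AUTO:
        return local  # a hardcoded port simply uses its configured duplex
    if peer == AUTO:
        return FULL   # both sides negotiate (assumes both can do full duplex)
    return HALF       # autoneg "fails" against a hardcoded peer: fall back to half


def outcome(a, b):
    """Classify a pair of port settings the way the table does."""
    if resolved_duplex(a, b) != resolved_duplex(b, a):
        return "Sorrow"      # duplex mismatch
    if a == b:
        return "Happiness"   # both ends configured the same way
    return "Lucky"           # it matches, but only thanks to the half-duplex fallback


# Print the full table, one row per local setting.
for a in (AUTO, HALF, FULL):
    print(a.ljust(6), [outcome(a, b) for b in (AUTO, HALF, FULL)])
```

The only non-obvious rule is the fallback: a port left at auto cannot learn the duplex of a hardcoded partner, so it uses half duplex, which is why Auto against Full ends in a mismatch.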

When there is a duplex mismatch, on the side running half-duplex you will see various errors and probably a number of _LATE_ collisions ("normal" collisions don't count here). On the side running full-duplex you will see things like FCS errors. Note that those errors are not necessarily conclusive; they are simply indicators.

Further, it is important to keep in mind that a "clean" ping (or the like - eg "linkloop" or default netperf TCP_RR) test result is inconclusive here - a duplex mismatch causes lost traffic _only_ when both sides of the link try to speak at the same time. A typical ping test, being synchronous, one at a time request/response, never tries to have both sides talking at the same time.
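A toy model makes the ping point concrete. The sketch below is a deliberate simplification and not a real MAC simulation: transmit times are given as (start, end) intervals in bit times, the half-duplex port is assumed to log a collision whenever its full-duplex partner starts sending while it is mid-frame, and a collision is called late once 512 bit times of the frame have gone out. The function name and the sample intervals are invented for the illustration.

```python
LATE_THRESHOLD_BITS = 512  # slot time for 10/100 Mb/s half-duplex Ethernet


def collisions_on_half_duplex_side(half_tx, full_tx):
    """Return (collisions, late_collisions) as seen by the half-duplex port.

    half_tx and full_tx are lists of (start, end) transmit intervals in bit
    times. Deferral, backoff and retransmission are NOT modelled.
    """
    collisions = late = 0
    for h_start, h_end in half_tx:
        for f_start, _f_end in full_tx:
            # The full-duplex side never defers, so it may start mid-frame.
            if h_start <= f_start < h_end:
                collisions += 1
                if f_start - h_start >= LATE_THRESHOLD_BITS:
                    late += 1
    return collisions, late


# Ping-like traffic: the reply starts only after the request has finished.
ping_like = collisions_on_half_duplex_side(
    half_tx=[(0, 800)], full_tx=[(1000, 1800)])

# Bulk traffic in both directions: the two frames overlap on the wire.
bulk_like = collisions_on_half_duplex_side(
    half_tx=[(0, 12000)], full_tx=[(600, 12600)])

print(ping_like)  # (0, 0)
print(bulk_like)  # (1, 1)
```

The request/response exchange never overlaps, so it looks clean, while the bidirectional transfer collides well into the frame and is counted as late.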

Finally, when/if you migrate to 1000Base-T, everything has to be set to auto-neg anyway.

rick jones

googlegroups 20 years ago

I know Rick's words to be the truth, and yet I regularly see NIC setup interfaces that purport to be able to "force" gigabit operation. Like this one:

formatting link

What's the deal with these control panels?

My guess is that the NICs are still autonegotiating, but restrict the bits advertised in the TAF to just the gigabit modes.

Is that correct?

Secondarily, can anyone offer perspective on why-oh-why systems/network people everywhere seem to think autonegotiation is totally unreliable?

I can count the number of times I've seen a genuine negotiation failure on my fingers, and those invariably involved pre-802.3u transceivers.

OTOH, the number of links I've seen screwed up due to force-happy administrative error are myriad.

/chris

BernieM 20 years ago

I've seen many a case of NetWare reporting a 'negotiating' NIC as operating in FD mode and we see the Cisco switchport having negotiated FD but still showing collisions. As a general rule we've now hard coded all core switch ports (200 servers connected) for the last two years. Once the 'general rule of thumb' becomes policy it's easy to manage.

It's followed over to our migration to Gbit and we've never looked back. It's not so much that we think autonegotiation is totally unreliable but hard coding both ends significantly reduces the possibility of problems.

Having the option to hard code Gbit switchports (which generally means you can also hard code them down) allowed us to replace all 10/100 cards with 10/100/1000 and progressively migrate non-Gbit devices to Gbit.

BernieM

googlegroups 20 years ago

Your Cisco switchport claims to be running full-duplex mode, but is also logging collisions?

This is very surprising behavior. Did you log a bug with Cisco? What did they say?

Okay, but you might be introducing a problem. It's very easy to force all of your switch ports, especially since they're likely all under the control of one group, and the ports live on a small subset of equipment (the switches)... You didn't say that you positively know the operating mode of every last host transceiver in your environment.

Perhaps in your experience it reduces the possibility of problems. I explained in the OP that in my experience it greatly increases the possibility of problems. I'm guessing from Rick's post that he's seen the same thing:

rj> c) people (network administrators among them) who didn't fully
rj> understand how autoneg was supposed to work and ass-u-me-d that
rj> they could leave one side at auto and hardcode the other to
rj> "force" the mode they wanted.

In my work, I regularly deploy equipment in other people's datacenters. I can't tell you how many times I've set up systems where the customer's switchports were running auto (often the switch had just come out of the carton), only to be called back a few months later to discover that some admin had blindly forced all port operating modes without knowing/caring what was on the other end of each cable.

When I deployed the equipment, I checked the switchports and set up a good match. When the admin changed the port settings without coordination, he broke things. He was likely applying a similar "rule of thumb".

I don't understand what you're trying to say. You can plug 10/100 cards into 10/100/1000 ports (and vice versa) regardless of your ability to administratively restrict operating modes. It seems like you're just creating work for yourself. "Having the option to hard code" certainly did not "allow you to replace".

And you missed the point of my original post: Autonegotiation is mandatory with gigabit. The silent force mode Cisco employs on its FastEthernet switchports cannot work in a gigabit environment.

The option you describe doesn't exist. You're still autonegotiating, just perhaps with a limited subset of operational modes. This is very different from the FastEthernet model where autonegotiation can be truly eliminated.

/chris

BernieM 20 years ago

That's completely normal behaviour when there is a duplex mismatch. The switchport was operating in FD but collisions were occurring on the Ethernet ... hence we discovered the other end was not operating in FD although the OS said it was.

We only hard code 'server' switchports. Interesting how people take things out of context. I specifically mentioned 'core' switch ports and never stated 'every single host'.

Yes, and I'm presenting my experience, which indicates that migrating to Gbit doesn't mean 'everything has to be set to autonegotiation' and that the hard coding of speed and duplex settings can be managed. It's horses for courses, as they say.

Communication during the initial setup? Wouldn't you discuss the state of switchports with the site's support person/s? Were you made aware of the rules? Did you ask? Were the details of your installation documented?

I'm not having a go at you; I'm just conveying the fact that mismatches can be avoided when all parties 'communicate'.

Sorry for missing the 'point' of your post but I was making comments related to my experience with Gbit and 'negotiation'.

I don't understand what you're saying. Our server ports do not autosense. They are hard coded at both ends. How can autonegotiation be 'mandatory' when you can still hard code the NIC/switchport? What am I not understanding?

BernieM

Rich Seifert 20 years ago

If your switch is operating in full-duplex mode, how can it possibly detect collisions? A collision (on twisted pair or fiber media) in half-duplex mode is defined as frame-reception-while-transmitting. However, in full-duplex mode, frame reception while transmitting is perfectly normal and expected. The question is, what "event" is being counted by the collision counter for a switch port in full-duplex mode?

Now, the collision counter on the *half-duplex* end of a mismatched pair may very well see collisions; in fact, it may see more than usual, since its full-duplex link partner considers it perfectly acceptable to transmit while receiving frames.

[snip]

You cannot disable Auto-Negotiation for 1000BASE-T. What you *can* do is limit the Auto-Negotiation advertisements to reflect a desire to operate as "gigabit or nothing," but the device will still execute the Auto-Negotiation algorithm. At minimum, it needs to select a clock master for the link, and place the other device in slave status.
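To illustrate the "gigabit or nothing" idea, here is a rough sketch of how restricting the advertisement differs from switching negotiation off. It is a simplification under stated assumptions: the priority list leaves out 100BASE-T2 and 100BASE-T4, master/slave clock selection is not modelled at all, and `negotiate` is a made-up name rather than any driver or vendor API.

```python
# Highest priority first. Simplified: 100BASE-T2 and 100BASE-T4 are omitted.
PRIORITY = [
    ("1000BASE-T", "full"),
    ("1000BASE-T", "half"),
    ("100BASE-TX", "full"),
    ("100BASE-TX", "half"),
    ("10BASE-T", "full"),
    ("10BASE-T", "half"),
]


def negotiate(local_adv, peer_adv):
    """Return the best mode both ends advertise, or None if there is none."""
    common = set(local_adv) & set(peer_adv)
    for mode in PRIORITY:
        if mode in common:
            return mode
    return None  # no common mode: the link does not come up


everything = set(PRIORITY)
gigabit_only = {("1000BASE-T", "full")}           # the "forced gigabit" setting
fast_ethernet_only = {m for m in PRIORITY if m[0] != "1000BASE-T"}

print(negotiate(gigabit_only, everything))          # ('1000BASE-T', 'full')
print(negotiate(gigabit_only, fast_ethernet_only))  # None
```

In this model a port "forced" to gigabit still runs the same selection as any other port. It just offers a single choice, so a 10/100-only partner gets no link rather than a silent mismatch.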

googlegroups 20 years ago

Thanks, Rich.

It's a little disheartening to consider that administrative interfaces set to 1000BASE-T truly are negotiating (where they might not be for 100-BASE modes) and that they make no distinction on this point.

I'm very curious if you've got opinions to share on the state of mistrust of the algorithm.

/chris

Rich Seifert 20 years ago

While I have heard numerous anecdotal accounts of Auto-Negotiation failures (in this newsgroup, and similar fora), I have never personally seen or experienced a single instance of such on any production (non-prototype) equipment.

Stephen Sprunk 20 years ago

If you're dying to see it for yourself, find a switch with a late-90s-era Broadcom PHY (e.g. Cisco 5000) and a PC with a 3com NIC using _original_ Windows drivers. It'll fail every time.

Those of us who dealt with these things in the wild saw it all the time back when it first came out. Particularly problematic is that folks tend to buy the same server/PC models in bulk, so if one fails, odds are you're looking at several hundred/thousand simultaneous failures. Or you might luck out and never see any, depending on what IT was buying.

It's definitely not as bad now as it was then, but the problem isn't gone yet. However, I'd agree that the rate is likely much lower than admins believe it is.

S
