Filtering hardware addresses within a NIC

In 'Interconnections', Radia Perlman describes a scheme in which NICs hash multicast addresses into several buckets. When the upstream driver needs to subscribe to a multicast hardware address, the NIC removes the filtering for the appropriate bucket. The result is that the desired multicast frames, /plus/ any others that happen to hash into that bucket, get forwarded up the stack.
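As a concrete illustration of the scheme Perlman describes, here's a toy model of an imperfect 64-bucket hash filter. The bucket-selection convention here (Ethernet CRC-32 over the destination MAC, low six bits) is an assumption for the sketch; real chipsets pick different bits of the CRC.

```python
import binascii

def hash_bucket(mac: str, num_buckets: int = 64) -> int:
    """Map a MAC address to one of num_buckets hash buckets.

    Which CRC bits get used varies by chipset; taking the low-order
    bits of the Ethernet CRC-32 is one common convention (an
    assumption in this sketch).
    """
    mac_bytes = bytes(int(b, 16) for b in mac.split(":"))
    crc = binascii.crc32(mac_bytes) & 0xFFFFFFFF
    return crc % num_buckets

class HashFilter:
    """Toy model of a NIC's imperfect multicast hash filter."""
    def __init__(self, num_buckets: int = 64):
        self.num_buckets = num_buckets
        self.bitmap = 0  # one bit per bucket

    def subscribe(self, mac: str) -> None:
        # Driver asks for one address; NIC opens the whole bucket.
        self.bitmap |= 1 << hash_bucket(mac, self.num_buckets)

    def accepts(self, mac: str) -> bool:
        # True for subscribed addresses *and* any colliding ones --
        # the driver must still discard false positives in software.
        return bool(self.bitmap & (1 << hash_bucket(mac, self.num_buckets)))
```

Any unsubscribed address that collides into an open bucket is passed up anyway, which is exactly the "plus any others" behavior described above.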

What's the state of the art here?

Hash buckets? How many? Individual address registers? How many?

I understand that answers can only be generalizations or chipset-specific. Either sort of reply would be helpful.

With the proliferation of virtualization, NICs are forwarding lots of non-manufacturer-assigned addresses these days.

I'm curious about filtering of both unicast and multicast frames.

Thanks for any pointers!

Reply to
Chris Marget

As far as I know, 64, but that was a while ago.

Even so, IPv4 uses broadcast (all ones), not multicast, so it really doesn't help for the largest fraction of the traffic. AppleTalk uses multicast, but that is pretty dead by now.

I believe that IPv6 knows about multicast, but that is catching on pretty slowly. Otherwise, as far as I know the usual system is to use some of the bits of the CRC shift register as a hash.

As far as I understand it, each NIC allows for one unicast address, some number of hash buckets for multicast, and the broadcast address.

I suppose with enough buckets one might use that for unicast, but it isn't hard to compare a 48-bit register one bit at a time.

-- glen

Reply to
glen herrmannsfeldt

Huh? Something sent to an IPv4 broadcast address will be sent to the broadcast MAC, but IPv4 multicast uses multicast MACs. Are you thinking perhaps of ARP, which initially uses a broadcast MAC to get the mapping from destination IPv4 address to MAC address?
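The IPv4 multicast-to-MAC mapping Rick alludes to is defined in RFC 1112: the group address's low 23 bits are copied into a MAC under the 01:00:5e prefix, so 32 IPv4 groups share each multicast MAC. A sketch:

```python
import ipaddress

def ipv4_multicast_mac(group: str) -> str:
    """Map an IPv4 multicast group to its Ethernet multicast MAC
    (RFC 1112). Only 23 of the group's 28 significant bits fit in
    the MAC, so 32 groups share each address -- another layer of
    imperfect filtering on top of the NIC's own.
    """
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    low23 = int(addr) & 0x7FFFFF
    return "01:00:5e:%02x:%02x:%02x" % (
        (low23 >> 16) & 0x7F, (low23 >> 8) & 0xFF, low23 & 0xFF)
```

So even a perfect hardware filter can't fully isolate one IPv4 group; the IP stack still has to check the destination group address.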

rick jones

Reply to
Rick Jones

Both ARP and IP broadcast use, as far as I know, MAC broadcast.

Now, with no other protocol on the wire, I suppose it doesn't matter much, but DECnet, AppleTalk, and IPX hosts have to filter out all the ARP and broadcast IP in software.

It would be nice to see more use of IP multicast, but as far as I know it isn't used much.

-- glen

Reply to
glen herrmannsfeldt

For clarity, I was thinking of a strictly IP environment.

Heck, even in a homogeneous IP environment, most stations on the wire need to filter out the majority of IP's Ethernet broadcasts. Broadcast frames should be a very small percentage of overall traffic in an IPv4 broadcast domain. If it's not, then your application developers are broken :-)

Glen, I too am confused by your comment:

Offline, someone has pointed me to this product page:

formatting link
"Unicast/Multicast Rx frame filtering for up to 256 address/mask pairs"

So, I guess that's one data point. Sounds like no hash buckets and 256 configurable registers. This sounds like it would probably be adequate for most multicast and virtualization applications, but I have doubts about whether this very expensive (I assume) NIC is similar to the typical gigabit Broadcom and Intel NICs being delivered with most servers and desktops.
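My reading of that datasheet's "address/mask pairs" wording (an interpretation, not a statement about that specific chip) is that each entry accepts any destination MAC matching the stored address on the bits the mask selects:

```python
def mask_match(dst_mac: int, addr: int, mask: int) -> bool:
    """Accept a frame when the destination MAC agrees with `addr`
    on every bit set in `mask` (all values 48-bit integers)."""
    return (dst_mac ^ addr) & mask == 0

def nic_accepts(dst_mac: int, entries) -> bool:
    """A frame passes the filter if any programmed entry matches."""
    return any(mask_match(dst_mac, a, m) for a, m in entries)
```

With mask = 0xFFFFFFFFFFFF an entry behaves as an exact-match register; a shorter mask could cover a whole OUI block of virtual-machine MACs with a single entry.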

I'm still interested to collect a few more samples of how different NICs handle the filtering.

Thanks!

/chris

Reply to
Chris Marget

I found another couple of examples:

formatting link
"Advanced packet filtering - 16 exact matched (unicast or multicast)"

formatting link
"16 exact-matched packets (unicast or multicast)" "4096-bit hash filter for multicast frames"

The first example is a single-interface device marketed for embedded and on-board applications. The second example is a quad-interface unit made for servers.

Sixteen addresses before falling back to promiscuous mode seems like a potentially cripplingly low number for today's data centers. ...Or, it would be, if not for the filtering done by switches upstream. But, taking that line of thinking to its conclusion -- why ever do any hardware filtering in NICs? Just run promiscuous! The switch will protect me!

Am I misunderstanding something here? Are there millions of VMWare servers running all of their (physical) NICs in promiscuous mode?

Reply to
Chris Marget

So, it seems, were the designers of IP.

Note that AppleTalk AARP uses multicast, so that (except for hash collisions) IP-only hosts won't see the packets at all.

A small percentage of packets on the wire, but every host has to process them. Of the frames received by a host (that is, stored in the buffer and passed up the line) the fraction might not be so small.

There is an IP multicast address system, such that hosts can register to receive multicast data.

I could be wrong on this, but as far as I know it is rarely used for the services that could use it. Streaming audio from a radio station, and live tracking systems like MLB Gameday, send the same data to many hosts. If each source only had to send it once, with multicast routers routing the data as needed, the total network load might be greatly reduced. Just yesterday I read about some companies blocking access to sites like YouTube as it becomes a large fraction of the corporate bandwidth.
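For reference, registering to receive a group is a two-line affair from the host's perspective; the kernel handles programming the NIC filter and the IGMP signaling. A minimal sketch (the group and port in the usage comment are arbitrary examples):

```python
import socket
import struct

def join_group(sock: socket.socket, group: str,
               ifaddr: str = "0.0.0.0") -> None:
    """Join an IPv4 multicast group on this socket.

    Behind the scenes the kernel programs the NIC's filter (hash
    bucket or exact-match register, whatever the chip offers) and
    sends an IGMP membership report on our behalf.
    """
    mreq = struct.pack("4s4s",
                       socket.inet_aton(group),
                       socket.inet_aton(ifaddr))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

# Typical receiver setup (example group/port):
#   s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
#   s.bind(("", 5353))
#   join_group(s, "224.0.0.251")
```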

I haven't followed CAM technology recently. 256 by 48 isn't so large I suppose. If you put the bits into a hash system, though, and assume that RAM is smaller than CAM, you should get at least a 32768 bucket hash table, though likely even larger.

Hard to say.

If you look at the open source device drivers, it should be there.

-- glen

Reply to
glen herrmannsfeldt

Chris Marget wrote: (snip)

A host, large or small, normally only has one IP address per interface. And even with more, ARP will take care of that, so there isn't much reason for more than one unicast address.

Now, a four port chip should have at least four unicast addresses.

It seems like the limit might not be chip area, but the time required to do the comparison. Hash is convenient, as the time is fairly independent of the hash table size. (It might go as log(N), though.)

I thought that they usually ran NAT for virtual machine hosts.

If you don't/can't do that, you can implement an IP router and route to virtual hosts.

But even with multiple IP addresses, there is still no need for more than one MAC address, as ARP will still work fine.

-- glen

Reply to
glen herrmannsfeldt

Neither of those is how enterprise virtualization (VMware ESX or Xen) works. Both of those hypervisor platforms run a software L2 switch behind the NIC. Virtual hosts get their own MAC addresses, which (obviously) must be passed up the stack by the NIC. Given that VMware eats a whole bunch of MAC addresses all on its own, and virtualization density commonly goes beyond 16/server, we've blown this unicast MAC budget very quickly.
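That software L2 switch behaves like an ordinary learning bridge. This toy version (not VMware's or Xen's actual code) shows why every VM's MAC has to make it past the physical NIC in the first place:

```python
class VSwitch:
    """Toy model of a hypervisor's software L2 switch: learns source
    MACs per port and floods unknown/broadcast destinations."""
    BROADCAST = "ff:ff:ff:ff:ff:ff"

    def __init__(self, ports):
        self.ports = set(ports)
        self.fdb = {}  # learned MAC -> port

    def forward(self, in_port, src_mac, dst_mac):
        """Return the list of ports a frame should be sent out."""
        self.fdb[src_mac] = in_port  # learn the sender's location
        if dst_mac != self.BROADCAST and dst_mac in self.fdb:
            out = self.fdb[dst_mac]
            return [] if out == in_port else [out]
        # Unknown unicast or broadcast: flood everywhere but the ingress.
        return sorted(self.ports - {in_port})
```

If the physical NIC dropped frames addressed to VM MACs, the "uplink" port of this switch would never see them, so the NIC must either know every VM MAC or run promiscuous.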

Again, not in a VM environment. VMs run their own IP stacks and must do ARP all by themselves.

Reply to
Chris Marget

Bonsoir Glen,

For that point, I don't know what the difference (or perhaps the relationship) is between the MMRP protocol (formerly GMRP) and IGMP snooping.

It seems that both protocols are able to modify the multicast address table.

Best regards, Michelot

Reply to
Michelot

Both mechanisms update the L2 forwarding table on a bridge/switch.

GMRP is the mechanism by which stations can directly register their interest in group frames with a bridge/switch. It requires deliberate participation/initiation by the end stations.

IGMP snooping is a mechanism by which a bridge/switch can determine Ethernet stations' interest in a subset of Ethernet group traffic (those frames associated with IPv4 multicast). It does this by intercepting IPv4 IGMP traffic originated by hosts / destined for IPv4 routers, and by breaking IGMP's host report suppression mechanism. Stations don't know this is going on. Rather, the switch snookers them into expressing interest.
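The state a snooping switch builds from those intercepted reports boils down to a per-group set of interested ports. This is a deliberate simplification of what real switches keep (no membership timers, no fast-leave, no querier election):

```python
class SnoopingTable:
    """Toy IGMP-snooping state: which ports want which group MAC."""
    def __init__(self):
        self.groups = {}  # group MAC -> set of interested ports

    def report(self, port, group_mac):
        # A host on `port` sent an IGMP membership report for the group.
        self.groups.setdefault(group_mac, set()).add(port)

    def leave(self, port, group_mac):
        # IGMP leave (or membership timeout) prunes the port.
        self.groups.get(group_mac, set()).discard(port)

    def egress_ports(self, group_mac, router_ports):
        # Group frames go only to interested hosts plus router ports,
        # instead of being flooded out every port.
        return self.groups.get(group_mac, set()) | set(router_ports)
```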

Neither of these is related to hardware filtering within a NIC.

/chris

Reply to
Chris Marget

Bonsoir Chris,

Thanks for your time.

I need to read again... after the night.

I note this.

Best regards, Michelot

Reply to
Michelot

Bonsoir Chris,

Thanks for giving me, with your description, the curiosity to look into that.

I also read:

(1) In RFC 5110: GMRP is not really used, but IGMP is widely supported.

(2) In Cisco documentation about the Catalyst 4500: hosts can send the IGMP join message together with the GMRP join message. We can suppose that it is for non-snooping-aware switches.

Best regards, Michelot

Reply to
Michelot

I guess the answer to this part of the question really is "Run promiscuous, the switch will protect me":

formatting link
:

"Each physical NIC is put in promiscuous mode
- Need to receive frames destined to all VMs
- Not a issue at all on a modern Ethernet network"

Huh.

Reply to
Chris Marget

You mean forwarding tables in switches aren't of infinite capacity?-)

rick jones

Reply to
Rick Jones

I agree, but I suspect at the root of it is something to which glen referred in another thread when he said "Ethernet works!" It has worked well enough and long enough now that people have forgotten there is the prospect of it not working. Or, to use another phrase, "good enough"... and it will probably remain that way so long as the switch vendors keep increasing the size of their forwarding tables so that the probability of them filling is the inverse of some reasonably large number of nines. (And/or the NIC vendors keep increasing the number of MACs they can track.)

rick jones

Reply to
Rick Jones

Heh. Well, there's a wide gulf between 16 and infinity. It's almost... nevermind.

Am I the only one surprised by this promiscuity feature? I'd think that VM folks would be stressing the need for things like:

- stable STP topologies (TCN results in unicast flooding, could be a problem)

- IGMP snooping (prune back unwanted multicast from software processing)

- routing symmetry / ARP timer tweaks (prevent unicast frames flooding 3 hours 55 minutes out of every 4 hours)

"the switch will protect me"^H^H^H^H^H^H^H^H^H^H^H^H^H"not a issue at all on a modern ethernet network" seems like it falls somewhere between naive and optimistic.

Reply to
Chris Marget

Rick Jones wrote: (snip, someone wrote)

Well, I wrote that as a reason for the small number of posts to the group, but it does seem also to apply that way.

-- glen

Reply to
glen herrmannsfeldt

Cabling-Design.com Forums website is not affiliated with any of the manufacturers or service providers discussed here. All logos and trade names are the property of their respective owners.