Grouping NIC interrupt requests

Hello everyone,

(I may not understand all of this very well, thus my questions might be somewhat unclear.)

As far as I understand, an Ethernet NIC throws an interrupt request every time it has received a complete frame. Sometime later, the OS will copy the frame from the NIC's memory (?) to some kernel buffer in RAM.

If I receive a high rate of small packets, my system will be busy servicing interrupts, and won't get any real work done.

Would it be possible to configure the NIC to throw interrupt requests only when certain thresholds are reached?

e.g.

50 packets, 5000 bytes, or even max(50 packets, 5000 bytes)

(This might be called "batching IRQ".)

formatting link

By the way, how big are consumer-grade NIC buffers these days? I would think a NIC can buffer at least 50 frames (75,000 bytes).

Regards.

Reply to
Spoon

IRQ mitigation perhaps?

formatting link

Reply to
Spoon

It will depend entirely on the _implementation_ of the NIC; it matters not whether the NIC is "Ethernet" or some other link type.

Most if not all _Gigabit_ Ethernet NICs offer interrupt coalescing settings of one form or another. Generally they all involve tradeoffs between CPU overhead and minimum latency.

I suspect some of the "higher-end" 100BT's might have it as well.

Keep in mind it is not a panacea - you can still have a traffic rate that will keep your system very very busy. Interrupt coalescing will simply change where the cliff happens to be.
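On Linux, these coalescing tradeoffs are typically adjusted through ethtool; the interface name, the supported parameters, and the useful values all depend on the driver, so the following is only a sketch:

```shell
# show the driver's current coalescing settings (if supported)
ethtool -c eth0

# fire an interrupt after at most 50 received frames or 100 microseconds,
# whichever comes first (illustrative values; check your driver's limits)
ethtool -C eth0 rx-frames 50 rx-usecs 100
```

Raising rx-frames/rx-usecs cuts interrupts per second at the cost of added receive latency, which is exactly the tradeoff described above.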

rick jones

Reply to
Rick Jones

(a) Worst case, this would be around 200,000 interrupts/sec. If that brings your server to its knees, you need more server.

(b) Most NICs use memory-mapped I/O: my laptop has 64K of memory on the NIC mapped into the system address space, while one of my servers has 4 meg.

Reply to
J. Clarke

I've tested only two setups so far:

7,600 1344-byte IPv4 packets per second (81.7 Mbit/s)

Approximately 12-15% CPU is spent in the interrupt handler, and 2.5% CPU in the receiving process.

20,000 36-byte IPv4 packets per second (5.8 Mbit/s)

Approximately 30-35% CPU is spent in the interrupt handler, and 3-5% CPU in the receiving process.

I'm trying to minimize CPU usage, in order to have the OS run the idle task and call HALT as often as possible, to minimize power consumption.

4 megabytes? Really?

The Intel PRO/1000 MT NIC (a high-end part) only has 64 KB:

formatting link

Regards.

Reply to
Spoon

Not usually an issue with servers.

That's the size of the assigned address space.

formatting link

Reply to
J. Clarke

Spoon wrote in part:

Isn't there a 64 byte minimum?

The throughputs seem low and the overhead numbers seem high, at least for x86 Linux. You should be able to do a lot better with a real-mode x86 OS, or if you use APIC instead of the dog-slow XT-PIC.

-- Robert

Reply to
Robert Redelmeier

64 bytes is the Ethernet min frame size.

If IP packets would be smaller than the Ethernet minimum once the Ethernet wrapper is added (another 18 bytes of header and trailer), then they get padded by the Ethernet driver. And the hardware adds the interframe gap (96 bits) and preamble on the wire. So the 20k packets are actually using up more than 12 Mbit/s of bandwidth on the wire.

Mind you, the test is still worst case - there is not much you can do with 36 bytes in IP, since AFAIR there isn't room for IP + TCP.
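To make that concrete, here is the worst-case arithmetic for the 36-byte test above (frame sizes per the Ethernet spec; the 20,000 packets/sec rate is from the earlier post):

```shell
# Each tiny IP packet is padded to the 64-byte Ethernet minimum (incl. FCS);
# the wire also carries an 8-byte preamble and a 12-byte (96-bit) interframe gap.
per_frame=$((64 + 8 + 12))        # 84 bytes on the wire per frame
bps=$((per_frame * 8 * 20000))    # at 20,000 packets/sec
echo "$bps"                       # 13440000 bits/s, i.e. about 13.4 Mbit/s
```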

Reply to
stephen

OK, but I'm dealing with an embedded device ^_^

Is that the maximum amount of memory the card can address?

How much memory is really on-board?

Reply to
Spoon

The IP header alone is 20 bytes and the TCP header is another 20 bytes (both without options). Depending on the network, between 25% and 50% of all real-world traffic will be TCP ACKs or probes at exactly 40 bytes. The remainder will typically be PMTU-sized, though one will often see spikes at 576 bytes and a few other intermediate points.

The only way to get a semi-useful IP packet under 40 bytes (e.g. 36) is with UDP or ICMP, and neither is particularly interesting when discussing high-load servers.

S
Reply to
Stephen Sprunk

Google came up with:

formatting link

The original Ethernet standards defined the minimum frame size as 64 bytes and the maximum as 1518 bytes. These numbers include all bytes from the Destination MAC Address field through the Frame Check Sequence field. The Preamble and Start Frame Delimiter fields are not included when quoting the size of a frame.

What do you mean "throughputs seem low"? In my two experiments I did not try to send faster. They were just the arbitrary rates I used.

The OS I use is x86/Linux indeed.

I don't see how a real-mode x86 OS could make the overhead of interrupt handling smaller?

I use a stock 2.6.16 kernel. How can I tell if I use APIC or XT-PIC?

Regards.

Reply to
Spoon

Spoon wrote in part:

Oh. I'm used to testing with `ttcp` at wirespeed.

Something I understand.

Because there are no ring0-ring3 transitions, interrupt handling is faster.

`cat /proc/interrupts` will say right away.

-- Robert

Reply to
Robert Redelmeier

UDP packets with an 8-byte payload (a sequence number).

I wanted to detect which packets were lost.
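For what it's worth, the 36-byte figure is exactly what a UDP probe with an 8-byte payload comes to:

```shell
# 20-byte IPv4 header + 8-byte UDP header + 8-byte sequence-number payload
echo $((20 + 8 + 8))   # 36
```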

Reply to
Spoon

$ cat /proc/interrupts
           CPU0
  0:   26618    XT-PIC  timer
  1:      98    XT-PIC  i8042
  2:       0    XT-PIC  cascade
  8:       1    XT-PIC  rtc
  9:     202    XT-PIC  acpi, eth0, ehci_hcd:usb1
 10:    1518    XT-PIC  uhci_hcd:usb3, Dta1xx
 11:       1    XT-PIC  uhci_hcd:usb2, uhci_hcd:usb4, i915@pci:0000:00:02.0
 12:    6053    XT-PIC  i8042
 14:    6196    XT-PIC  ide0
 15:      11    XT-PIC  ide1
NMI:       0
ERR:       0

$ uname -r
2.6.14.6

(My kernel is 2.6.14.6, not 2.6.16.)

You wrote:

"You should be able to do a lot better with a real-mode x86 OS, or if you use APIC instead of the dog-slow XT-PIC."

Can you tell me more about the difference between APIC and XT-PIC? (I don't even know what they refer to, off to Google.)

Regards.

Reply to
Spoon

Spoon wrote in part:

Sure. On the original IBM PC/AT and clones (to this day), interrupts are prioritized by a [virtual] hardware chip, the 8259 Programmable Interrupt Controller. Some details are at:

formatting link

The problem is that it needs to be programmed by IO instructions (at least one and often two per interrupt), and IO has been getting _slower_ since it has to go through the Northbridge and Southbridge. About 1 us is the best you can expect.

Intel realized this problem in the mid 1990s and also had to deal with SMP interrupts, so it designed a complete substitute: the APIC controller built into P6 and later CPU chips. This controller is much faster because it isn't accessed through slow IO instructions. You can find details in the Intel "Systems Programming" manual.

Fortunately, current Linux kernels can be compiled to use APIC on uniprocessor systems.
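As a sketch (option names as found in x86 2.6-series kernel configs; verify against your own .config), the relevant build options and a quick check after rebooting:

```shell
# uniprocessor local-APIC and IO-APIC support in the kernel build
grep -E 'CONFIG_X86_UP_APIC|CONFIG_X86_UP_IOAPIC' .config

# after booting the new kernel, interrupt lines should be listed as
# "IO-APIC" rather than "XT-PIC"
grep -o 'IO-APIC\|XT-PIC' /proc/interrupts | sort -u
```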

-- Robert

Reply to
Robert Redelmeier

FreeBSD offers the option of polling NICs; it's a kernel compilation option (not supported for all NICs, however).

Polling will spend fewer instructions per frame than interrupts when enough load is present.
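A minimal sketch of enabling it, assuming an older FreeBSD release with the DEVICE_POLLING kernel option (sysctl names vary across releases):

```shell
# kernel config must include:  options DEVICE_POLLING
# then enable polling at runtime
sysctl kern.polling.enable=1

# fraction of CPU time reserved for userland while polling
sysctl kern.polling.user_frac=50
```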

Reply to
tsar.peter
