There is an OS overhead for sure (although you can mitigate that with things lik...

justincormack · on July 9, 2014

It is not just a few percent for applications where userspace needs to look at each packet. There are some performance comparisons on the netmap page, for example: http://info.iet.unipi.it/~luigi/netmap/

personZ · on July 9, 2014

That comparison is of questionable merit. It compares bulk, unmetered packet generation (running inside an OS, worth noting, just as a kernel module) against netsend, which is intentionally a rate-limited generator. As an aside, netsend uses busy waits, which means the CPU was probably at 100% while doing nothing useful, with the significant overhead that entails.

> where userspace needs to look at each packet

That benchmark is like those naive "look how fast my web server is when it returns just status code 200" comments. Even if we accepted that the overhead was anywhere close to the linked figures, which it isn't, the moment you actually do something with the packets those savings disappear into rounding errors.

lrizzo · on July 9, 2014

netsend and the other apps used in that comparison were not rate limited. Granted, that test only measures system overheads, but I put that disclaimer very clearly in all papers and talks (btw, "which it isn't" suggests that you have different numbers, so please let us know). Granted, sometimes these per-packet savings are irrelevant, but there are a number of use cases where savings of this kind matter a lot. This is true for netmap, dpdk, DNA and all other network-stack-bypass frameworks.
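To make the "irrelevant in some cases, decisive in others" point concrete, here is a rough back-of-the-envelope sketch. The nanosecond figures are made up for illustration and are not measurements from netmap or any other framework; `speedup` is a hypothetical helper that assumes throughput is limited only by a fixed per-packet overhead plus the application's per-packet work.

```python
def speedup(old_overhead_ns, new_overhead_ns, work_ns):
    """Throughput ratio when only the fixed per-packet overhead changes.

    Assumes each packet costs (overhead + work) and nothing else.
    """
    return (old_overhead_ns + work_ns) / (new_overhead_ns + work_ns)


# Hypothetical per-packet overheads, chosen only to show the shape of the curve.
OLD_NS, NEW_NS = 1000.0, 100.0

for work_ns in (0.0, 1_000.0, 100_000.0):
    ratio = speedup(OLD_NS, NEW_NS, work_ns)
    print(f"app work = {work_ns:>9.0f} ns/packet -> speedup {ratio:.2f}x")
```

Under these assumed numbers the gain is large when the application does almost nothing per packet and shrinks toward 1x as per-packet work grows, which is the case both sides of this thread are describing.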

In passing, people typically use the term 'os-bypass', but netmap relies on the OS for protection, synchronization, memory management etc -- all things that the OS does well and that I find no reason to reinvent.

personZ · on July 9, 2014

Unless you used a specialized, customized version of netsend, yes, it was rate limited: by default it waits on the clock interval, and it warns you if you try to call it in defiance of that. Further, it is endlessly calculating the time and calling system functions to get the time.

As a test of max throughput, it is a horrible test. I don't have the motivation to prove it, but I would be surprised if more than 5% of the CPU load actually went towards networking, with the rest going to time calculations and interval tests.
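For readers unfamiliar with the pattern being argued about, the following is a minimal sketch of a busy-wait paced sender. It is not netsend's code; `paced_send`, the interval, and the stand-in `send` callback are all assumptions made for illustration. It counts how many clock reads happen per packet, which is one way the claim above could be checked rather than guessed at.

```python
import time


def paced_send(n_packets, interval_s, send):
    """Call send() n_packets times, spinning on the clock between calls.

    Returns the total number of clock reads performed while pacing.
    """
    clock_reads = 1
    deadline = time.perf_counter()
    for _ in range(n_packets):
        # Busy wait: keep reading the clock until the next deadline passes.
        while True:
            now = time.perf_counter()
            clock_reads += 1
            if now >= deadline:
                break
        send()
        deadline += interval_s
    return clock_reads


sent = []  # stand-in for a socket; each "send" just records one packet
reads = paced_send(100, 0.0001, lambda: sent.append(1))
print(f"{len(sent)} packets sent, {reads / len(sent):.1f} clock reads per packet")
```

How many clock reads occur per packet depends on the interval and the machine, so the ratio printed here says nothing about netsend itself; it only shows the kind of measurement that would settle the disagreement.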