50

I have a minimal 64-bit CentOS 6.3 box acting as a gateway and performing NAT. It has four 1 Gbps NICs bonded into two interfaces, one for public traffic and one for private, plus 6 GB of RAM and 4 logical cores. We have been using it for the past two years without any problems.

I don't have any experience with hardware routers, but I have heard that they have less RAM and CPU and use flash disks. How can a box with lower hardware specifications perform better (as in, handle more concurrent connections) than a machine with more RAM and CPU?

What are the limiting factors, other than IOS using different methods to handle this?

TRiG
Blue Gene

6 Answers

67

ASICs.

Instead of using a general purpose CPU and task-specific software, you can skip the software and just make the silicon handle the task directly.

High performance networking hardware uses ASICs instead of software for the computationally heavy (but relatively logically simple) tasks of something like comparing an IP address to an enormous internet routing table, checking a CAM table for a switching decision, or checking a packet against an ACL. This makes an enormous difference in the speed of those time-sensitive operations, providing a significant advantage over a general-purpose CPU.
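To give a feel for why the routing lookup is "logically simple but computationally heavy", here is a rough software sketch of longest-prefix matching. The route table, next hops, and function name are made up for illustration, and the linear scan is only there for clarity; it is not how any particular router or kernel implements the lookup.

```python
import ipaddress

def longest_prefix_match(routes, destination):
    """Return the next hop of the most specific route containing destination, or None."""
    addr = ipaddress.ip_address(destination)
    best_net, best_hop = None, None
    for prefix, next_hop in routes:
        net = ipaddress.ip_network(prefix)
        # Keep the matching route with the longest prefix seen so far.
        if addr in net and (best_net is None or net.prefixlen > best_net.prefixlen):
            best_net, best_hop = net, next_hop
    return best_hop

# Hypothetical routing table: (prefix, next hop).
ROUTES = [
    ("0.0.0.0/0", "203.0.113.1"),   # default route
    ("10.0.0.0/8", "10.0.0.1"),
    ("10.1.0.0/16", "10.1.0.1"),
]

print(longest_prefix_match(ROUTES, "10.1.2.3"))  # 10.1.0.1
print(longest_prefix_match(ROUTES, "8.8.8.8"))   # 203.0.113.1
```

The logic is trivial, but a software router has to do something equivalent for every single packet against a table that can hold hundreds of thousands of prefixes, which is exactly the kind of work that benefits from dedicated silicon.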

Shane Madden
12

A high-end, dedicated router can outperform a PC with a faster CPU and more RAM because it can do more of the routing in hardware.

It's the same reason a $60 Gigabit Ethernet switch can outperform a $2,000 PC with 4 two-port GigE cards acting as an Ethernet switch. The switch is built from the ground up to be a switch.

10

"Other than IOS" ?

IOS makes almost all the difference. CentOS is a general-purpose operating system. It's designed to perform well enough under a very wide range of scenarios, using a vast array of different hardware configurations. IOS on the other hand is extremely fine tuned to handle only the kind of workloads you would expect from a piece of network equipment, using the very specific types of hardware you would find in Cisco gear.

Knowing exactly what pieces of hardware you're programming for will take you a very long way in terms of performance vs. compatibility.

Ryan Ries
4

Both software and hardware play a part. I have compared Intel and TP-Link NICs (the latter built around a Realtek chip) on generic server hardware, as well as purpose-built and general-purpose routing software.

On the hardware side, if the ASIC on board can do some of the IP traffic handling, the processor load is lower and forwarding is faster. I have noticed the two onboard Intel NIC chips communicating directly by DMA, bypassing the main CPU when forwarding packets, whereas the Realtek chip raises an interrupt whenever a packet arrives.

On the software side, software designed specifically for routing can be made more efficient. I have used both pfSense+PF (a modified FreeBSD intended to be used as a router) and general-purpose Ubuntu 12.04+iptables as routing software, and the former clearly switches traffic a lot faster. (Ubuntu 14.04 is now almost as fast, thanks to the new nftables in the Linux 3.13 kernel.)

However, a dedicated router does have one major drawback: it cannot do much other than switch traffic, and it cannot be virtualized. My current edge router is a virtual machine inside my ESXi cluster running Ubuntu 14.04, and it also acts as an intrusion detection system and load balancer.

3

AFAIK, it's the overhead of a general-purpose operating system: regardless of how fast your connections are, packets are handled one at a time within the kernel's context, which increases latency and strain on the system. I believe the other answers have already explained this better than I could.

Having said that, there are promising new-ish technologies, growing in popularity and feasibility, that might make Linux systems a more formidable competitor in this and other regards, such as InfiniBand.
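A quick back-of-the-envelope calculation shows how little time there is per packet. This sketch assumes standard Ethernet framing, where each frame costs an extra 20 bytes on the wire (7-byte preamble, 1-byte start-of-frame delimiter, 12-byte inter-frame gap); the function names are my own.

```python
# Per-frame overhead on the wire: 7 preamble + 1 SFD + 12 inter-frame gap bytes.
ETHERNET_OVERHEAD_BYTES = 20

def max_packets_per_second(link_bps, frame_bytes):
    """Upper bound on frames per second for a given link speed and frame size."""
    bits_on_wire = (frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    return link_bps / bits_on_wire

def per_packet_budget_ns(link_bps, frame_bytes):
    """Time available to process one frame at line rate, in nanoseconds."""
    return 1e9 / max_packets_per_second(link_bps, frame_bytes)

GIGABIT = 1_000_000_000

for size in (64, 1518):
    pps = max_packets_per_second(GIGABIT, size)
    budget = per_packet_budget_ns(GIGABIT, size)
    print(f"{size:>4}-byte frames: {pps:,.0f} pps, {budget:,.0f} ns per packet")
```

With minimum-size 64-byte frames, a single 1 Gbps link can deliver roughly 1.49 million packets per second, which leaves about 672 ns to handle each one. Any per-packet cost in the kernel eats directly into that budget.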

Take a look at the following Q&A on StackOverflow: How is TCP Kernel-bypass Implemented


3

It's usually because the network stack and devices are not well configured out of the box in Linux. In roughly 90% of cases, all of your network traffic is processed by CPU0 while the other cores sit idle. If you fix this, the difference from hardware routers is not as drastic as you might think. You should set up at least RSS or RPS (driver-based and stack-based distribution of packet processing across CPUs, respectively).

If you really care about your Linux router's performance and have enough time, I recommend reading this article on the packagecloud blog (there is also an article about transmitting packets).

If you need to look at how the load is distributed, and you find watching while sleep 1; do cat $some_file_in_procfs; done, evaluating CPU masks, and writing smp_affinity by hand tedious, you will probably find my pet project netutils-linux extremely useful.
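As a small illustration of the RPS side, the kernel takes a hexadecimal CPU bitmask per receive queue via /sys/class/net/<dev>/queues/rx-<n>/rps_cpus. The sketch below only computes the mask and prints where it would go; the helper name, the interface name eth0, and the queue rx-0 are assumptions for the example, and it deliberately handles only CPU indices below 32 so the mask fits in a single group.

```python
def rps_cpu_mask(cpus):
    """Build the hex bitmask string for a collection of CPU indices (0..31)."""
    mask = 0
    for cpu in cpus:
        if not 0 <= cpu < 32:
            raise ValueError(f"CPU index out of range for this sketch: {cpu}")
        mask |= 1 << cpu  # one bit per CPU
    return format(mask, "x")

# Hypothetical device and queue; adjust to your own system.
RPS_PATH = "/sys/class/net/eth0/queues/rx-0/rps_cpus"

all_four = rps_cpu_mask(range(4))      # CPUs 0-3 -> "f"
skip_cpu0 = rps_cpu_mask([1, 2, 3])    # CPUs 1-3 -> "e"

print(f"echo {all_four} > {RPS_PATH}   # spread over CPUs 0-3")
print(f"echo {skip_cpu0} > {RPS_PATH}   # leave CPU0 free")
```

On a 4-core box like the one in the question, writing f to each receive queue's rps_cpus would allow all four cores to take part in receive processing. Writing the value needs root and is best tested on a non-production machine first.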
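For a rough idea of what checking the distribution by hand involves, here is a minimal sketch that pulls the per-CPU NET_RX counters out of text in the format of /proc/softirqs. The sample text and its numbers are invented for the example, and the function name is my own; it is not taken from netutils-linux.

```python
def net_rx_per_cpu(softirqs_text):
    """Map each CPU column to its NET_RX softirq count."""
    lines = softirqs_text.splitlines()
    cpus = lines[0].split()  # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        name, _, counts = line.partition(":")
        if name.strip() == "NET_RX":
            return dict(zip(cpus, (int(c) for c in counts.split())))
    return {}

# Invented sample in the layout of /proc/softirqs.
# On a Linux box you could pass open("/proc/softirqs").read() instead.
SAMPLE = """\
                    CPU0       CPU1       CPU2       CPU3
          HI:          0          0          0          0
       TIMER:     912345     801234     799876     805432
      NET_TX:       1204          3          2          1
      NET_RX:    9876543        120         98        101
"""

counts = net_rx_per_cpu(SAMPLE)
total = sum(counts.values())
for cpu, count in counts.items():
    print(f"{cpu}: {count:>10} ({100 * count / total:.1f}%)")
```

In the invented sample nearly all receive work lands on CPU0, which is the lopsided pattern to look for before turning on RSS or RPS.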