Technical Features

Linux mainline

What is Linux?

Linux is a clone of the operating system Unix, written from scratch by Linus Torvalds with assistance from a loosely-knit team of hackers across the Net. It aims towards POSIX and Single UNIX Specification compliance.

It has all the features you would expect in a modern system, including true multitasking, virtual memory, shared libraries, demand loading, shared copy-on-write executables, proper memory management, and multistack networking including IPv4 and IPv6.

BFS CPU scheduler

The Brain Fuck Scheduler (BFS) is a process scheduler created by kernel programmer Con Kolivas in August 2009 for the Linux kernel, as an alternative to the Completely Fair Scheduler (CFS) and the O(1) scheduler.

The objective of BFS, compared to other schedulers, is a simpler algorithm that does not require adjusting heuristics or tuning parameters to tailor performance to a specific type of computational workload. The BFS author argued that such tunable parameters are difficult for the average user to understand, especially where multiple parameters interact with each other, and that tuning them often improves performance for one targeted type of computation at the cost of worse performance in the general case. BFS has been reported to improve responsiveness on light-NUMA (non-uniform memory access) Linux mobile devices and desktop computers with fewer than 16 cores.
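
The tunable-free design can be illustrated with a toy sketch of BFS-style earliest-virtual-deadline selection over a single global run queue. This is a simplification, not BFS source code: the task representation, the 6 ms timeslice (BFS's default rr_interval), and the roughly 10%-per-nice-level deadline stretch are assumptions made for illustration.

```python
# Illustrative sketch of BFS-style scheduling, not kernel source.
# Assumptions: dict-based tasks, 6 ms timeslice, ~10% deadline
# growth per nice level.

RR_INTERVAL_MS = 6

def prio_ratio(nice):
    # Each nice level stretches the deadline by ~10%; nice -20 -> 1.0.
    return 1.1 ** (nice + 20)

def virtual_deadline(now_ms, nice):
    # When a task's timeslice starts, it receives a virtual deadline.
    return now_ms + prio_ratio(nice) * RR_INTERVAL_MS

def pick_next(run_queue):
    # O(n) scan of one global queue: earliest virtual deadline wins.
    # No per-workload tunables enter the decision.
    return min(run_queue, key=lambda t: t["deadline"])

tasks = [
    {"pid": 1, "deadline": virtual_deadline(0, 0)},    # nice 0
    {"pid": 2, "deadline": virtual_deadline(0, -10)},  # higher priority
]
assert pick_next(tasks)["pid"] == 2  # higher priority -> earlier deadline
```

The single ordering rule replaces the interacting heuristics the text describes: priority only changes how far in the future a task's deadline lands.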

Shortly after its introduction, the new scheduler made headlines within the Linux community, appearing on Slashdot, with reviews in Linux Magazine and Linux Pro Magazine. Although reviews of its performance and responsiveness improvements have varied, Con Kolivas does not intend for BFS to be integrated into the mainline kernel.

BFQ I/O scheduler

BFQ is a proportional-share storage-I/O scheduler that also supports hierarchical scheduling through a cgroups interface. Its main features are the following.

Low latency for interactive applications. According to our results, whatever the background load is, for interactive tasks the storage device is virtually as responsive as if it were idle. For example, starting a command/application or loading a file from within an application takes about the same time as if the storage device were idle, even if one or more of the following background workloads are being served in parallel:

  • one or more large files are being read or written,
  • a tree of source files is being compiled,
  • one or more virtual machines are performing I/O,
  • a software update is in progress,
  • indexing daemons are scanning the filesystems and updating their databases.

As a comparison, with CFQ, NOOP, DEADLINE or SIO under the same conditions, applications experience high latencies, or even become unresponsive, until the background workload terminates (especially on SSDs).

Low latency for soft real-time applications

Soft real-time applications, such as audio and video players or audio- and video-streaming applications, also enjoy about the same latencies regardless of the device load. As a consequence, these applications suffer almost no glitches due to the background workload.

High throughput

BFQ achieves up to 30% higher throughput than CFQ on hard disks with most parallel workloads, and about the same throughput with the rest of the workloads we have considered. BFQ achieves the same throughput as CFQ, NOOP, DEADLINE and SIO on SSDs.

Strong fairness guarantees

As for long-term guarantees, BFQ distributes the throughput (and not just the device time) as desired among I/O-bound applications, with any workload and independently of the device parameters.
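
The proportional-share idea behind these guarantees can be sketched with a toy weighted-fair-queueing loop. This is a simplification under assumed names, not BFQ's actual B-WF2Q+ implementation: each queue's virtual finish time grows inversely to its weight, and the device always serves the queue with the smallest one, so long-run throughput follows the weights.

```python
# Toy proportional-share dispatcher (illustrative, not BFQ code).
# Each dispatch hands a fixed-size budget to the queue whose
# virtual finish time is smallest.

def simulate(weights, budget=1, rounds=900):
    vfinish = {q: 0.0 for q in weights}  # per-queue virtual finish time
    served = {q: 0 for q in weights}     # budgets actually dispatched
    for _ in range(rounds):
        q = min(vfinish, key=vfinish.get)   # earliest virtual finish wins
        vfinish[q] += budget / weights[q]   # heavier weight -> slower growth
        served[q] += budget
    return served

# Two apps with weights 2:1 end up with ~2/3 and ~1/3 of the throughput,
# regardless of how fast the underlying device happens to be.
out = simulate({"app_a": 2, "app_b": 1})
assert abs(out["app_a"] - 600) <= 2 and abs(out["app_b"] - 300) <= 2
```

Because the split is driven purely by the weights, the share each application receives is independent of the device parameters, which is the long-term guarantee described above.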


Graysky's kernel GCC patch

Three different machines running a generic x86-64 kernel, and otherwise identical kernels built with the optimized GCC options, were tested using a make-based endpoint.

There are small but real speed increases from running with this patch, as judged by the make endpoint. The increases are on par with the speed increase that the upstream-sanctioned core2 option gives users, so not including the additional options seems somewhat arbitrary to me.

Vegas / YeAH TCP congestion control

Provides proper QoS over TCP, maximizing throughput without compromising network latency.

TCP Vegas is a TCP congestion avoidance algorithm that emphasizes packet delay, rather than packet loss, as a signal to help determine the rate at which to send packets. It was developed at the University of Arizona by Lawrence Brakmo and Larry L. Peterson.

TCP Vegas detects congestion at an incipient stage, based on increasing round-trip time (RTT) values of the packets in the connection, unlike other flavors such as Reno and New Reno, which detect congestion only after it has actually happened, via packet loss. The algorithm depends heavily on an accurate estimate of the base RTT value: if it is too small, the throughput of the connection will be less than the available bandwidth, while if it is too large, the sender will overrun the connection.
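
The delay-based window update described above can be sketched as a once-per-RTT step. This is an illustrative simplification with assumed names, not kernel code; the alpha/beta thresholds of 2 and 4 packets are the classic values from the Vegas literature.

```python
# Sketch of Vegas's once-per-RTT congestion-avoidance decision
# (illustrative names, not the Linux tcp_vegas implementation).

ALPHA, BETA = 2, 4  # thresholds, in packets queued in the network

def vegas_update(cwnd, base_rtt, rtt):
    expected = cwnd / base_rtt             # rate if the path were empty
    actual = cwnd / rtt                    # rate actually measured this RTT
    diff = (expected - actual) * base_rtt  # estimated packets queued
    if diff < ALPHA:
        return cwnd + 1                    # too little queued: probe upward
    elif diff > BETA:
        return cwnd - 1                    # queue building: back off early
    return cwnd                            # in the sweet spot: hold steady

# RTT at the base value means no queueing, so the window grows.
assert vegas_update(10, 0.1, 0.1) == 11
# RTT doubling signals a standing queue, so the window shrinks
# before any packet is lost.
assert vegas_update(10, 0.1, 0.2) == 9
```

Note how an inflated base_rtt estimate would make diff look smaller than it is, causing the sender to keep growing the window, which is the overrun failure mode mentioned above.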

A lot of research is ongoing regarding the fairness of the linear increase/decrease mechanism for congestion control in Vegas. One interesting caveat arises when Vegas coexists with other versions such as Reno: Vegas's performance degrades because it detects congestion early and therefore reduces its sending rate before Reno does, ceding bandwidth to co-existing TCP Reno flows.

TCP Vegas is one of several “flavors” of TCP congestion avoidance algorithms. It is one of a series of efforts at TCP tuning that adapt congestion control and system behaviors to new challenges faced by increases in available bandwidth in Internet components on networks like Internet2.

x86_64 advanced instructions