EyeQ Documentation
Table of Contents
1 What is EyeQ?
2 Talk at NSDI 2013
3 Getting started
4 Trying things out
5 Rate limiter
1 What is EyeQ?
EyeQ is a distributed transport layer for your datacenter. It provides predictable transmit and receive bandwidth guarantees (over a timescale of a few milliseconds) to VMs/services on a server, without requiring you to configure CoS queues in your network for every VM/service.
For more details, check:
- A short README.
- Comprehensive system design and evaluation in NSDI 2013: http://www.stanford.edu/~jvimal/EyeQ-NSDI13.pdf
- NSDI 2013 talk slides and video: https://www.usenix.org/conference/nsdi13/eyeq-practical-network-performance-isolation-edge
- A position workshop paper, talk and slides at HotCloud 2012.
- Poster at Eurosys 2013.
2 Talk at NSDI 2013
The slides and a video recording of the talk are available at the USENIX page linked above.
3 Getting started
3.1 Repositories
I maintain two repositories for EyeQ: one for bleeding-edge development, and the other for releases. The bleeding-edge repository contains all branches, with untested and unfinished code; the release repository is more stable.
The stable repository is here: http://github.com/jvimal/eyeq. The bleeding-edge repository is under an older project name: http://github.com/jvimal/perfiso10g.
3.2 Downloading
You can clone the GitHub repository, which contains the source code for the Linux kernel module:
$ git clone https://github.com/jvimal/eyeq.git
3.3 Compiling
I have tested EyeQ with Linux 3.0 and above. EyeQ is a kernel module that implements a queueing discipline, so you need the kernel headers to compile it. You can find the headers available on your system in /lib/modules.
$ cd eyeq
$ ls /lib/modules/
3.2.0-23-generic
$ make
This will produce the performance isolation kernel module, perfiso.ko, for your running kernel. If you want to compile for a different kernel, edit the Makefile and change the version number accordingly.
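If you would rather not edit the Makefile, the standard out-of-tree kbuild invocation sketched below should also work, assuming the repository's Makefile defines obj-m in the usual way; the header version is just an example from the listing above.
# Build the module against a specific set of installed headers (example version).
$ make -C /lib/modules/3.2.0-23-generic/build M=$(pwd) modules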
3.4 Installing
EyeQ hooks into the transmit path by acting as a queueing discipline. To avoid a dependency on a custom iproute build, EyeQ replaces the hierarchical token bucket (htb) module, so you can use your existing tc tool to install EyeQ. This means you need to remove htb before you can use EyeQ. EyeQ hooks into the receive path using the netdev_rx_handler_register callback.
To install EyeQ on a device:
$ rmmod sch_htb
$ insmod ./perfiso.ko
$ tc qdisc show
qdisc pfifo_fast 0: dev eth0 root refcnt 2 bands 3 priomap...
$ tc qdisc add dev eth0 root handle 1: htb
$ dmesg | tail
perfiso: Init TX path for eth0
perfiso: Init RX path for eth0
$ tc qdisc show
qdisc htb 1: dev eth0 root
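If rmmod complains that sch_htb is in use, some device still has an htb qdisc installed; a quick way to find and clear it before unloading (the device name is an example):
# Find any existing htb qdiscs and delete them before removing sch_htb.
$ tc qdisc show | grep htb
$ tc qdisc del dev eth0 root
$ rmmod sch_htb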
3.5 Usage
We provide a separate Python configuration tool to work with EyeQ, as EyeQ does not use the standard tc interface for configuration. The kernel module exports sysfs files, which it uses for all configuration.
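If you want to poke at the raw interface directly, the module's knobs live in the standard per-module sysfs directory; the exact set of files depends on the EyeQ version, so listing the directory is the reliable way to discover them.
# List the sysfs configuration files exported by the perfiso module.
$ ls /sys/module/perfiso/parameters/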
3.5.1 Creating tenants
EyeQ lets you identify tenants using IP-address filters. Filters are needed to correctly classify packets to their corresponding groups for proper resource accounting. We use the source IP on the transmit path, and the destination IP on the receive path. Implementing new classifiers is easy; contact me if you need a particular classifier in EyeQ and I can make it available.
Each tenant has a classifier and a "weight" that denotes the relative importance of one tenant over another. If you have two tenants A and B with weights 1 and 2 respectively, then tenant B will get twice the bandwidth of A whenever there is contention for bandwidth (transmit or receive). It is possible to set unequal weights for transmit and receive by writing directly to EyeQ's sysfs files.
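Concretely, a tenant's share under contention is its weight divided by the sum of active weights. A quick arithmetic sketch for the example above on a 10Gb/s link (bc is standard):
# Weighted shares for tenants A (weight 1) and B (weight 2) on a 10Gb/s link.
$ echo "scale=2; 10 * 1 / (1 + 2)" | bc    # tenant A's share in Gb/s
3.33
$ echo "scale=2; 10 * 2 / (1 + 2)" | bc    # tenant B's share in Gb/s
6.66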
For example, to create a tenant with IP address 10.0.1.1, do:
$ tools/pi.py --create 10.0.1.1 --dev eth0
INFO:root:Creating txc 10.0.1.1 on dev eth0
INFO:root:Creating vq 10.0.1.1 on dev eth0
INFO:root:Setting weight of vq 10.0.1.1 on dev eth0 to 1
$ tools/pi.py --list
dev: eth0
listing TXCs
10.0.1.1 weight 1 assoc vq 10.0.1.1
listing VQs
10.0.1.1 weight 1
To set a different weight for tenant 10.0.1.1 on the receive path, do:
$ echo dev eth0 10.0.1.1 weight 10 > /sys/module/perfiso/parameters
Note: the --create option assumes that your network/kernel already knows how to route packets; the tool does not set up routes on your behalf. There is a sample script, tests/tenant.py, that automatically creates IP addresses and routing tables in a consistent fashion. Check the Mininet section below.
3.5.2 Removing tenants
$ tools/pi.py --delete 10.0.1.1 --dev eth0
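To confirm the tenant is gone, list the tenants again. The output below is a sketch assuming no other tenants remain on eth0:
$ tools/pi.py --list
dev: eth0
listing TXCs
listing VQs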
3.5.3 Parameters
EyeQ exposes a number of knobs to fine-tune its operation. In many cases the defaults should suffice. The default parameters are tuned for 10GbE.
$ tools/pi.py --get
1 ISO_RFAIR_INCREASE_INTERVAL_US 120
2 IsoAutoGenerateFeedback 1
3 ISO_TOKENBUCKET_TIMEOUT_NS 50000
4 ISO_TXC_UPDATE_INTERVAL_US 200
... a lot more.
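Individual parameters are set by their index in this listing, using the index,value form that also appears in the Mininet walkthrough below. For example, assuming index 3 is ISO_TOKENBUCKET_TIMEOUT_NS as shown above:
# Set the token bucket timeout to 100us (100000 ns).
$ tools/pi.py --set 3,100000
Setting ISO_TOKENBUCKET_TIMEOUT_NS = 100000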
The userspace tool has default parameters for 1GbE networks as well.
$ tools/pi.py --one-gbe
Setting ISO_TOKENBUCKET_TIMEOUT_NS = 100000
Setting ISO_VQ_DRAIN_RATE_MBPS = 920
Setting ISO_RL_UPDATE_INTERVAL_US = 50
Setting ISO_RFAIR_INITIAL = 500
Setting ISO_MAX_TX_RATE = 980
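You can verify that the new values took effect with --get; the grep is just a convenience to pick out two of them:
$ tools/pi.py --get | grep -E 'ISO_MAX_TX_RATE|ISO_VQ_DRAIN_RATE_MBPS'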
3.5.4 Saving and restoring configurations
You can save the configuration as a JSON file and restore it later.
$ tools/pi.py --save /tmp/config
$ cat /tmp/config
{
  "params": {
    "ISO_RFAIR_INCREASE_INTERVAL_US": "120",
    "IsoAutoGenerateFeedback": "1",
    "ISO_TOKENBUCKET_TIMEOUT_NS": "50000",
    ...
  },
  "config": {
    "eth0": {
      "vqs": [ ],
      "txcs": [ { ... } ]
    }
  }
}
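Since the saved file is plain JSON, standard tools can inspect it; for example, with jq (assuming it is installed):
# Show only the per-device tenant configuration from the saved file.
$ jq '.config.eth0' /tmp/config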
You can restore an earlier configuration by running the following as root:
$ tools/pi.py --load /tmp/config
3.5.5 Unloading EyeQ
You can unload EyeQ by first removing each qdisc you created on network devices.
$ tc qdisc del dev eth0 root
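Once the qdiscs are removed, unload the module itself. The sketch below assumes EyeQ was installed only on eth0; restoring the stock htb module afterwards is optional:
$ rmmod perfiso
$ modprobe sch_htb    # bring back the stock htb qdisc if you still need it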
4 Trying things out
The scripts used in all our experiments in the NSDI paper are available online in the test repository.
4.1 Mininet
Mininet is a collection of useful scripts that uses Linux features such as network namespaces, containers, and traffic control to create lightweight virtual networks on a single machine. I am one of the authors of the second release of Mininet, along with Bob Lantz, Brandon Heller and Nikhil Handigol.
Mininet uses tc to configure htb in order to emulate links of a given capacity, so you can use Mininet to test EyeQ as well. Unfortunately, one of Mininet's limitations (as of 2.0) is that you are constrained to the resources one server can offer, especially CPU. One CPU core amounts to about 2–3Gb/s of aggregate switching capacity, so you can safely try EyeQ with a few (say 10) links operating at slower link speeds of 10–100Mb/s; for instance, 10 links at 100Mb/s is 1Gb/s of aggregate traffic, comfortably within one core's budget.
The test repository has scripts to create a single-switch topology in Mininet. To test it on Mininet, just do the following steps:
- Download and install Mininet-2.0 on Ubuntu 12.10+ by following Option 2 on the Mininet Setup Page. Alternatively, you can boot the AMI ami-7eab204e in the US-Oregon region on Amazon AWS, which has Mininet-2.0 preinstalled.
- Install the paramiko and termcolor dependencies. On Ubuntu, just run:
$ sudo apt-get install python-paramiko
$ sudo easy_install termcolor
Disable DNS resolution in ssh: add "UseDNS no" to /etc/ssh/sshd_config, then run sudo reload ssh for the change to take effect.
- Clone the tests repository.
$ cd ~/eyeq
$ git clone https://github.com/jvimal/eyeq-tests.git tests
$ cd tests
- Ensure correct configuration:
  - SSH settings in config/mininet.py
  - Edit host.py to import config.mininet instead of config.packard.
- Run the Mininet script sudo python mininet/simple.py and wait for the CLI.
- Ensure you can log in from h1 to h2 without passwords.
  - In Mininet, type xterm h1.
  - You should be able to type ssh 10.0.0.2 and log in without any hassles. If you get "permission denied (public key)" or some such error, please append /root/.ssh/id_rsa.pub to /root/.ssh/authorized_keys.
- Create tenants on all hosts. By default, the Mininet script does not create tenants. You can create them by doing the following:
  - In Mininet, type xterm h1.
  - Inside the terminal, type the following, which creates 3 tenants on each of the 3 virtual hosts (h1, h2 and h3):
python tenant.py -m 3 -T 3
  - By default, the script will create tenants with IP addresses 11.0.T.M, where T is the tenant ID and M is the machine number (1 for h1, etc.); for example, tenant 2 on h3 gets 11.0.2.3. Routing tables are set up correctly so that packets destined to a tenant pick the corresponding source IP address.
  - Alternatively, you can have the Mininet script create the tenants at startup by passing --conf, as shown below:
$ python mininet/simple.py --conf --num-hosts 3 --num-tenants 3
- Make a few parameter changes so things work at 100Mb/s instead of 10Gb/s.
$ cd ~/eyeq/
# Set the interface speed to 100Mb/s
$ tools/pi.py --set 18,100
Setting ISO_MAX_TX_RATE = 100
# Set the receive speed to 100Mb/s
$ tools/pi.py --set 5,100
Setting ISO_VQ_DRAIN_RATE_MBPS = 100
# Set the tokenbucket timeout value to 1ms
$ tools/pi.py --set 3,1000000
Setting ISO_TOKENBUCKET_TIMEOUT_NS = 1000000
# Set the rate metering interval to 10ms
$ tools/pi.py --set 13,10000
Setting ISO_VQ_UPDATE_INTERVAL_US = 10000
The above commands are available in tests/mininet/100mbps.sh. If you write to the sysfs files directly to change ISO_MAX_TX_RATE and ISO_VQ_DRAIN_RATE_MBPS, you need to force EyeQ to recompute tenant rx and tx rate guarantees. You can do that by running:
$ echo dev eth0 > /sys/module/perfiso/parameters/recompute_dev
- Check TX fairness.
  - Run iperf servers (iperf -s) on all hosts.
  - Run an iperf client from h1:1 (tenant 1) to h2:1 (tenant 1).
    - h1 terminal: iperf -c 11.0.1.2 -t 100 -i 1
  - Run 4 flows from h1:2 to h3:2.
    - h1 terminal: iperf -c 11.0.2.3 -t 100 -i 1 -P4
  - With equal tenant weights, both tenants should see roughly equal aggregate throughput, even though tenant 2 runs four flows.
- The stats commands all work in Mininet as well.
$ tools/pi.py --stats --dev h1-eth0
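To eyeball convergence while the iperf flows run, you can poll the stats periodically; watch is the standard procps tool:
$ watch -n1 'tools/pi.py --stats --dev h1-eth0'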
5 Rate limiter
We have also implemented EyeQ's rate limiter, optimized for multiqueue network devices, as a drop-in replacement for Linux's Token Bucket Filter (tbf). You can download it from the ptb repository: https://github.com/jvimal/ptb.
We have a rate limiter supporting most of htb's features (hierarchical rate limiting) in the works. We are working with the Linux kernel developers to release it upstream, but if you want to try an early release, let me know!
5.1 Obtaining, Compiling and Installing
Since ptb is a drop-in replacement for tbf, you will have to remove tbf from the running kernel before you can use ptb. The module is for Linux kernels 3.7 onwards, as the qdisc API data structures changed a bit. However, the API change is simple, so you can easily port it to older kernels.
$ rmmod sch_tbf
$ git clone https://github.com/jvimal/ptb.git
$ cd ptb
$ make
$ insmod ./sch_ptb.ko
There is a sample script with default options so you can test ptb out. Just like Linux's default mq or mqprio qdiscs, ptb must be the root qdisc; it cannot be the child of any other qdisc. Note that because ptb takes over tbf's name (the same trick EyeQ plays with htb), the tc command in the script below still says tbf.
$ cat tc.sh
#!/bin/bash
dev=eth2
tc qdisc del dev $dev root
rmmod sch_ptb
make
insmod ./sch_ptb.ko
tc qdisc add dev $dev root handle 1: tbf limit 100000 burst 1000 rate 3Gbit
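After running the script, you can confirm the qdisc is installed and push traffic through the limiter; the device and server address below are examples:
# Show the installed qdisc and its byte/packet counters.
$ tc -s qdisc show dev eth2
# Drive traffic through the limiter from an iperf client.
$ iperf -c 192.168.1.2 -t 10 -i 1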