EyeQ Documentation

Table of Contents

1 What is EyeQ?
2 Talk at NSDI 2013
3 Getting started
4 Trying things out
5 Rate limiter

EyeQ is a distributed transport layer for your datacenter. It provides predictable transmit and receive bandwidth guarantees (over a timescale of a few milliseconds) to the VMs/services on a server, without requiring you to configure CoS queues in your network for every VM/service.

For more details, check:

2 Talk at NSDI 2013

3 Getting started

3.1 Repositories

I maintain two repositories for EyeQ: one for bleeding-edge development and one for releases. The bleeding-edge repository contains branches with untested and unfinished code; the release repository is more stable.

The stable repository is here: http://github.com/jvimal/eyeq. The bleeding edge repository is under an older project name: http://github.com/jvimal/perfiso10g.

3.2 Downloading

You can clone the github repository, which contains the source code for the Linux kernel module:

$ git clone https://github.com/jvimal/eyeq.git

3.3 Compiling

I have tested EyeQ with Linux 3.0 and above. EyeQ is a kernel module that implements a queueing discipline, so you need the kernel headers to compile it. You can find the headers available on your system under /lib/modules.

$ cd eyeq

$ ls /lib/modules/
3.2.0-23-generic

$ make

This will produce the performance isolation kernel module, perfiso.ko, for your running kernel. To compile for a different kernel, edit the Makefile and change the version number accordingly.

3.4 Installing

EyeQ hooks into the transmit path by acting as a queueing discipline. To avoid a dependency on iproute, EyeQ replaces the hierarchical token bucket (htb) module, so you can use your existing tc tool to install it; this means you must remove htb before you can use EyeQ. On the receive path, EyeQ hooks in via the netdev_rx_handler_register callback.

To install EyeQ on a device:

$ rmmod sch_htb

$ insmod ./perfiso.ko

$ tc qdisc show
qdisc pfifo_fast 0: dev eth0 root refcnt 2 bands 3 priomap...

$ tc qdisc add dev eth0 root handle 1: htb

$ dmesg | tail
perfiso: Init TX path for eth0
perfiso: Init RX path for eth0

$ tc qdisc show
qdisc htb 1: dev eth0 root

3.5 Usage

Because EyeQ does not use the standard tc interface, we provide a separate Python configuration tool. The EyeQ kernel module exports sysfs files that it uses for all configuration.

3.5.1 Creating tenants

EyeQ classifies tenants by IP address. Filters are needed to map packets to their corresponding groups for proper resource accounting: we use the source IP on the transmit path and the destination IP on the receive path. Implementing new classifiers is easy; contact me if you need a particular classifier in EyeQ and I can make it available.

Each tenant has a classifier and a "weight" that denotes the relative importance of one tenant over another. If you have two tenants A and B with weights 1 and 2 respectively, then tenant B will get twice the bandwidth of A whenever there is contention for bandwidth (transmit/receive). It is possible to set unequal weights for transmit and receive by directly writing to EyeQ's sysfs files.
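As an illustration of the weight semantics (a toy model, not EyeQ code): under contention, each tenant's share is its weight divided by the sum of the contending tenants' weights.

```python
def weighted_shares(capacity_mbps, weights):
    """Split link capacity among contending tenants in proportion
    to their weights -- the allocation EyeQ converges to under
    persistent contention."""
    total = sum(weights.values())
    return {t: capacity_mbps * w / total for t, w in weights.items()}

# Tenants A and B with weights 1 and 2 on a 9000 Mb/s link:
# A gets 3000 Mb/s, B gets twice as much, 6000 Mb/s.
shares = weighted_shares(9000, {"A": 1, "B": 2})
```

When only one tenant is active, `total` equals its own weight and it receives the full capacity, matching the work-conserving behavior described above.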

For example, to create a tenant with IP address 10.0.1.1, do:

$ tools/pi.py --create 10.0.1.1 --dev eth0
INFO:root:Creating txc 10.0.1.1 on dev eth0
INFO:root:Creating vq 10.0.1.1 on dev eth0
INFO:root:Setting weight of vq 10.0.1.1 on dev eth0 to 1

$ tools/pi.py --list
dev: eth0
listing TXCs
        10.0.1.1 weight 1 assoc vq 10.0.1.1
listing VQs
        10.0.1.1 weight 1

To set a different weight for tenant 10.0.1.1 on the receive path, do:

$ echo dev eth0 10.0.1.1 weight 10 > /sys/module/perfiso/parameters

Note: the --create option assumes that your network/kernel already knows how to route packets; the tool does not configure routing on your behalf. The sample script tests/tenant.py automatically creates IP addresses and routing tables in a consistent fashion. Check the Mininet section below.

3.5.2 Removing tenants

$ tools/pi.py --delete 10.0.1.1 --dev eth0

3.5.3 Parameters

EyeQ exposes a number of knobs to fine-tune its operation. In most cases the defaults should suffice; they are tuned for 10GbE.

$ tools/pi.py --get
 1           ISO_RFAIR_INCREASE_INTERVAL_US        120
 2                  IsoAutoGenerateFeedback          1
 3               ISO_TOKENBUCKET_TIMEOUT_NS      50000
 4               ISO_TXC_UPDATE_INTERVAL_US        200
... a lot more.
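If you want to consume this listing from a script, the three-column output (index, name, value) can be parsed with a few lines of Python (a sketch, assuming the format shown above):

```python
def parse_params(text):
    """Parse `pi.py --get` style output, one 'index NAME value' per
    line, into {name: (index, value)}."""
    params = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[0].isdigit():
            idx, name, value = fields
            params[name] = (int(idx), int(value))
        # lines that don't match the three-column shape are skipped
    return params

out = """ 1  ISO_RFAIR_INCREASE_INTERVAL_US  120
 2  IsoAutoGenerateFeedback  1"""
params = parse_params(out)
# params["IsoAutoGenerateFeedback"] == (2, 1)
```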

The userspace tool has default parameters for 1GbE networks as well.

$ tools/pi.py --one-gbe
Setting ISO_TOKENBUCKET_TIMEOUT_NS = 100000
Setting ISO_VQ_DRAIN_RATE_MBPS = 920
Setting ISO_RL_UPDATE_INTERVAL_US = 50
Setting ISO_RFAIR_INITIAL = 500
Setting ISO_MAX_TX_RATE = 980

3.5.4 Saving and restoring configurations

You can save the configuration as a JSON file and restore it later.

$ tools/pi.py --save /tmp/config
$ cat /tmp/config
{
    "params": {
        "ISO_RFAIR_INCREASE_INTERVAL_US": "120",
        "IsoAutoGenerateFeedback": "1",
        "ISO_TOKENBUCKET_TIMEOUT_NS": "50000",
...
    },
    "config": {
        "eth0": {
            "vqs": [
            ],
            "txcs": [
                {
...
                }
            ]
        }
    }
}
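Conceptually, a restore amounts to walking the saved JSON and writing each parameter back to EyeQ's sysfs files. A minimal sketch of that logic (illustrative, not the actual code of pi.py --load; the write target is injectable so it runs without the module loaded):

```python
import json

def restore_params(config_json, write):
    """Walk a saved EyeQ config and emit one 'NAME value' write per
    parameter. In the real tool, `write` would write to sysfs."""
    cfg = json.loads(config_json)
    for name, value in cfg["params"].items():
        write(f"{name} {value}")

saved = '{"params": {"ISO_TOKENBUCKET_TIMEOUT_NS": "50000"}, "config": {}}'
lines = []
restore_params(saved, write=lines.append)
# lines == ["ISO_TOKENBUCKET_TIMEOUT_NS 50000"]
```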

3.5.5 Unloading EyeQ

You can unload EyeQ by first removing each qdisc you created on network devices, then removing the module.

$ tc qdisc del dev eth0 root

Later, after reloading the module, you can restore an earlier configuration by running the following as root:

$ tools/pi.py --load /tmp/config

4 Trying things out

The scripts used in all our experiments in the NSDI paper are available online in the test repository.

4.1 Mininet

Mininet is a collection of useful scripts that configures features such as network namespaces, containers and Linux Traffic Control to create lightweight virtual networks on a single machine. I am one of the authors of the second release of Mininet, with Bob Lantz, Brandon Heller and Nikhil Handigol.

Mininet uses tc to configure htb in order to emulate links of a given capacity, so you can use Mininet to test EyeQ as well. Unfortunately, one of Mininet's limitations (as of 2.0) is that you are constrained by the resources a single server can offer, especially CPU: one CPU core amounts to about 2–3Gb/s of aggregate switching capacity. So you can safely try EyeQ with a few (say 10) links operating at slower link speeds, say 10–100Mb/s.

The test repository has scripts to create a single-switch topology in Mininet. To test it on Mininet, just do the following steps:

  1. Download and install Mininet-2.0 on Ubuntu 12.10+ by following Option 2 on the Mininet Setup Page.

    Alternatively, you can boot the AMI id ami-7eab204e in US-Oregon region in Amazon AWS which has Mininet-2.0 preinstalled.

  2. Install paramiko and termcolor dependencies. On Ubuntu, just run:
    $ sudo apt-get install python-paramiko
    $ sudo easy_install termcolor
    

    Disable DNS resolution in ssh. Add "UseDNS no" to /etc/ssh/sshd_config. Then, run sudo reload ssh for changes to take effect.

  3. Clone the tests repository.
    $ cd ~/eyeq
    $ git clone https://github.com/jvimal/eyeq-tests.git tests
    $ cd tests
    
  4. Ensure correct configuration:
    • SSH settings in config/mininet.py
    • Edit host.py to import config.mininet instead of config.packard.
  5. Run the Mininet script sudo python mininet/simple.py and wait for the CLI.
  6. Ensure you can log in from h1 to h2 without passwords.
    • In Mininet, type xterm h1
    • You should be able to type ssh 10.0.0.2 and log in without any hassles. If you get "permission denied (public key)" or some such error, please add /root/.ssh/id_rsa.pub to /root/.ssh/authorized_keys.
  7. Create tenants on all hosts.

    By default, the Mininet script does not create tenants. You can create tenants by doing the following:

    • In Mininet, type xterm h1
    • Inside the terminal, type the following, which creates 3 tenants on each of the 3 virtual hosts (h1, h2 and h3): python tenant.py -m 3 -T 3
    • By default, the script will create tenants with IP addresses 11.0.T.M where T is the tenant ID and M is the machine number (1 for h1, etc.). Routing tables are set correctly so that packets destined to a tenant pick the corresponding source IP address.
    Alternatively, once you've set up ssh successfully, you can create tenants on all machines simply by passing --conf as shown below:
    $ python mininet/simple.py --conf --num-hosts 3 --num-tenants 3
    
  8. Make a few parameter changes so things work at 100Mb/s instead of 10Gb/s.
    $ cd ~/eyeq/
    
    # Set the interface speed to 100Mb/s
    $ tools/pi.py --set 18,100
    Setting ISO_MAX_TX_RATE = 100
    
    # Set the receive speed to 100Mb/s
    $ tools/pi.py --set 5,100
    Setting ISO_VQ_DRAIN_RATE_MBPS = 100
    
    # Set the tokenbucket timeout value to 1ms
    $ tools/pi.py --set 3,1000000
    Setting ISO_TOKENBUCKET_TIMEOUT_NS = 1000000
    
    # Set the VQ update (rate metering) interval to 10ms
    $ tools/pi.py --set 13,10000
    Setting ISO_VQ_UPDATE_INTERVAL_US = 10000
    

    The above scripts are available in tests/mininet/100mbps.sh.

    If you write to the sysfs files directly to change ISO_MAX_TX_RATE and ISO_VQ_DRAIN_RATE_MBPS, you need to force EyeQ to recompute tenant rx and tx rate guarantees. You can do it by running:

    $ echo dev eth0 > /sys/module/perfiso/parameters/recompute_dev
    
  9. Check TX fairness.
    • Run iperf servers (iperf -s) on all hosts.
    • Run iperf client from h1:1 (tenant 1) to h2:1 (tenant 1)
      • h1 terminal: iperf -c 11.0.1.2 -t 100 -i 1
    • Run 4 flows from h1:2 to h3:2
      • h1 terminal: iperf -c 11.0.2.3 -t 100 -i 1 -P4
    You should see each iperf get 50Mb/s.
  10. The stats commands all work in Mininet as well.
    $ tools/pi.py --stats --dev h1-eth0
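    The 11.0.T.M addressing scheme used in step 7 can be sketched as follows (illustrative; the real logic lives in tenant.py):

```python
def tenant_ips(num_tenants, num_machines):
    """Enumerate tenant addresses 11.0.T.M, where T is the tenant ID
    and M is the machine number (1 for h1, 2 for h2, ...)."""
    return {(t, m): f"11.0.{t}.{m}"
            for t in range(1, num_tenants + 1)
            for m in range(1, num_machines + 1)}

ips = tenant_ips(3, 3)
# Tenant 1 on h2 is 11.0.1.2 -- the iperf target used in step 9.
```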
    

5 Rate limiter

We have also implemented EyeQ's rate limiter, optimized for multiqueue networking devices, as a drop-in replacement to Linux's Token Bucket Filter (tbf). You can download it from the ptb repository: https://github.com/jvimal/ptb.

We have a rate limiter supporting most of HTB's features (hierarchical rate limiting) in the works. We are working with Linux kernel developers to release it upstream, but if you want to try an early release, let me know!
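For intuition about what a token bucket filter such as tbf/ptb enforces, here is a toy single-queue model (a sketch of the classic algorithm, not ptb's multiqueue implementation):

```python
def token_bucket(rate_bps, burst_bytes, arrivals):
    """Simulate a token bucket: tokens accrue at rate_bps/8 bytes per
    second, capped at burst_bytes; a packet passes only if enough
    tokens are available. `arrivals` is a list of (time_s, size_bytes);
    packets without tokens are dropped in this toy model (a real qdisc
    would queue them)."""
    tokens, last_t, passed = burst_bytes, 0.0, []
    for t, size in arrivals:
        tokens = min(burst_bytes, tokens + (t - last_t) * rate_bps / 8)
        last_t = t
        if size <= tokens:
            tokens -= size
            passed.append((t, size))
    return passed

# At 8000 b/s (1000 B/s) with a 1000 B burst, two back-to-back 1000 B
# packets at t=0 cannot both pass; a refill allows another at t=1.0.
sent = token_bucket(8000, 1000, [(0.0, 1000), (0.0, 1000), (1.0, 1000)])
# sent == [(0.0, 1000), (1.0, 1000)]
```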

5.1 Obtaining, Compiling and Installing

Since ptb is a drop-in replacement for tbf, you will have to remove tbf from the running kernel before you can use ptb. The module targets Linux kernels 3.7 onwards, as the qdisc API data structures changed slightly; the change is simple, so you can easily port it to older kernels.

$ rmmod sch_tbf
$ git clone https://github.com/jvimal/ptb.git
$ cd ptb
$ make
$ insmod ./sch_ptb.ko

There is a sample script with default options so you can try PTB out. Just like Linux's default mq or mqprio qdisc, ptb must be the root qdisc; it cannot be the child of any other qdisc.

$ cat tc.sh
#!/bin/bash

dev=eth2
tc qdisc del dev $dev root
rmmod sch_ptb
make
insmod ./sch_ptb.ko
tc qdisc add dev $dev root handle 1: tbf limit 100000 burst 1000 rate 3Gbit


Author: Vimalkumar (j.vimal@gmail.com)

Date: 2013-04-15 17:43:44 PDT

