From the Arch Linux Chinese Wiki


The Linux kernel's networking stack has traffic control capabilities. The iproute2 package provides the tc command to control them from the command line. This article shows how to use queueing disciplines (qdiscs) to manage traffic. For example, if users on the network you administer abuse its bandwidth with downloads or torrents, qdiscs let you throttle that kind of traffic and keep the whole network responsive. This article assumes the reader has some knowledge of network devices and iptables.

Queueing

Queueing controls how data is sent; receiving data is much more reactive, with fewer network-oriented controls. However, since TCP/IP packets are sent using slow start, the sending system starts slowly and keeps increasing the rate until packets begin to be rejected. This means the traffic received on a LAN can be controlled by dropping packets arriving at the router before they are forwarded. There is more detail to it, but it does not directly concern queueing logic.

To fully control the rate of traffic, we need to be the slowest link in the chain. That is, if the connection has a maximum download speed of 500 kbit/s, then unless we limit our own output to 450 kbit/s or below, it will be the modem, not us, that shapes the traffic.

Every network device has a root where a qdisc can be set. By default, this root holds an fq_codel qdisc (more on this below).
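The "slowest link" rule can be sketched with a little shell arithmetic; `LINK_KBIT` below is a hypothetical measured speed, so substitute your own measurement:

```shell
#!/bin/sh
# LINK_KBIT: hypothetical measured link speed in kbit/s; measure your own.
LINK_KBIT=500
# Shape to ~90% of the link so that this machine, not the modem,
# becomes the slowest link and therefore the place where queueing happens.
SHAPE_KBIT=$(( LINK_KBIT * 90 / 100 ))
echo "shape to ${SHAPE_KBIT}kbit"
```

The resulting value is what you would pass as the rate to a shaping qdisc such as TBF or HTB.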

There are two types of qdiscs: classful and classless.

Classful qdiscs allow you to create classes, which work like branches on a tree. You can then set rules to filter packets into each class. Each class can itself have other classful or classless qdiscs assigned to it.

Classless qdiscs do not allow more qdiscs to be added to them.

Before starting to configure qdiscs, we first need to remove any existing qdisc from the root. This removes any qdisc from the eth0 device:

# tc qdisc del root dev eth0
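After deletion the device falls back to the kernel default. The qdisc currently attached to a device can be checked at any time with:

```shell
tc qdisc show dev eth0
```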

Classless Qdiscs

These queues do basic management of traffic by reordering, slowing down or dropping packets. These qdiscs do not allow the creation of classes.

pfifo_fast

This was the default qdisc before systemd 217. On every network device without a custom qdisc configuration applied, pfifo_fast is the qdisc set on the root. FIFO means first in, first out: the first packet to arrive is the first packet to be sent, so no packet gets special treatment.

Token Bucket Filter (TBF)

This qdisc allows bytes to pass as long as a certain rate limit is not exceeded. It works by creating a virtual bucket that is filled with tokens dropping in at a certain rate. Each packet takes one token from the bucket and uses it as permission to pass. If too many packets arrive, the bucket runs out of tokens and the remaining packets wait some time for new tokens. If tokens do not arrive fast enough, packets are dropped. In the opposite case (too few packets being sent), tokens can accumulate and allow some bursts (upload spikes) to occur.

This means this qdisc is useful for slowing down an interface.

Example:

Uploads can fill the modem's queue, so interactivity is destroyed whenever you upload a huge file.

# tc qdisc add dev ppp0 root tbf rate 220kbit latency 50ms burst 1540

Note that the rate above should be changed to your upload speed minus a few percent (so that you are the slowest link in the chain). This configuration sets a TBF on the ppp0 device, limiting uploads to 220 kbit/s, with 50 ms of latency before packets are dropped and a burst of 1540. It works by keeping the queue on the Linux machine, where it can be shaped, instead of on the modem.
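The token-bucket mechanism can be illustrated with a toy simulation; the bucket size, token rate and packet counts below are invented for the example and are not tc defaults:

```shell
#!/bin/sh
# Toy token-bucket simulation. Tokens drip in at TOKENS_PER_TICK per tick,
# capped at BUCKET_SIZE. Each arriving packet spends one token or, if none
# are left, is dropped.
BUCKET_SIZE=5
TOKENS_PER_TICK=2
tokens=$BUCKET_SIZE
sent=0
dropped=0
for tick in 1 2 3 4; do
    tokens=$(( tokens + TOKENS_PER_TICK ))
    if [ "$tokens" -gt "$BUCKET_SIZE" ]; then tokens=$BUCKET_SIZE; fi
    # Three packets arrive per tick, faster than the steady token rate
    for pkt in 1 2 3; do
        if [ "$tokens" -gt 0 ]; then
            tokens=$(( tokens - 1 ))
            sent=$(( sent + 1 ))
        else
            dropped=$(( dropped + 1 ))
        fi
    done
done
echo "sent=$sent dropped=$dropped"
```

The initially full bucket lets the first tick send a burst of three packets even though the steady rate is only two per tick, mirroring how TBF permits short upload spikes.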

Stochastic Fairness Queueing (SFQ)

This is a round-robin qdisc. Each conversation is set on a fifo queue, and on each round, each conversation has the possibility to send data. That is why it is called "Fairness". It is also called "Stochastic" because it does not really create a queue for each conversation, instead it uses a hashing algorithm. For the hash, there is a chance for multiple sessions on the same bucket. To solve this, SFQ changes its hashing algorithm often to prevent that this becomes noticeable.

Example:

This configuration sets SFQ on the root on the eth0 device, configuring it to perturb (alter) its hashing algorithm every 10 seconds.

# tc qdisc add dev eth0 root sfq perturb 10
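A toy stand-in for SFQ's hashing (the real algorithm is different; the modulo "hash", port numbers and bucket count here are invented for illustration) shows why perturbation matters:

```shell
#!/bin/sh
# Map four flows (identified by source port) into 3 buckets with a trivial
# modulo "hash", under two different perturbation values.
for perturb in 0 1; do
    for port in 5001 5002 5004 5007; do
        bucket=$(( (port + perturb) % 3 ))
        echo "perturb=$perturb port=$port -> bucket $bucket"
    done
done
```

With the first perturbation value, three of the four flows collide in one bucket; after the value changes, the collisions move elsewhere. This is why SFQ re-perturbs its real hash periodically, so no set of conversations stays unlucky for long.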

CoDel and Fair Queueing CoDel

Since systemd 217, fq_codel is the default. CoDel (Controlled Delay) is an attempt to limit buffer bloating and minimize latency in saturated network links by distinguishing good queues (that empty quickly) from bad queues that stay saturated and slow. The fair queueing Codel utilizes fair queues to more readily distribute available bandwidth between Codel flows. The configuration options are limited intentionally, since the algorithm is designed to work with dynamic networks, and there are some corner cases to consider that are discussed on the bufferbloat wiki concerning Codel, including issues on very large switches and sub megabit connections.

Additional information is available in the tc-codel(8) and tc-fq_codel(8) man pages.

Warning: Make sure your ethernet driver supports Byte Queue Limits before using CoDel. Here is a list of drivers supported as of kernel 3.6.
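To set fq_codel explicitly on a device and watch its statistics, the standard tc invocations can be used (shown here for eth0):

```shell
# Replace whatever qdisc is on the root of eth0 with fq_codel
tc qdisc replace dev eth0 root fq_codel
# Show per-qdisc statistics (sent bytes, drops, backlog)
tc -s qdisc show dev eth0
```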

Classful Qdiscs

Classful qdiscs are very useful if you have different kinds of traffic which should have differing treatment. A classful qdisc allows you to have branches. The branches are called classes.

Setting a classful qdisc requires that you name each class. To name a class, the classid parameter is used. The parent parameter, as the name indicates, points to the parent of the class.

All the names should be set as x:y, where x is the name of the root and y is the name of the class. Normally, the root is called 1: and its children are named things like 1:10.

Hierarchical Token Bucket (HTB)

HTB is well suited for setups where you have a fixed amount of bandwidth which you want to divide for different purposes, giving each purpose a guaranteed bandwidth, with the possibility of specifying how much bandwidth can be borrowed. Here is an example with comments explaining what each line does:

# This line sets a HTB qdisc on the root of eth0, and it specifies that the class 1:30 is used by default. It sets the name of the root as 1:, for future reference.
tc qdisc add dev eth0 root handle 1: htb default 30

# This creates a class called 1:1, a direct descendant of the root (the parent is 1:). The class is given a maximum rate of 6mbit, with a burst of 15k
tc class add dev eth0 parent 1: classid 1:1 htb rate 6mbit burst 15k

# The previous class has these branches:

# Class 1:10, which has a rate of 5mbit
tc class add dev eth0 parent 1:1 classid 1:10 htb rate 5mbit burst 15k

# Class 1:20, which has a rate of 3mbit
tc class add dev eth0 parent 1:1 classid 1:20 htb rate 3mbit ceil 6mbit burst 15k

# Class 1:30, which has a rate of 1kbit. This one is the default class.
tc class add dev eth0 parent 1:1 classid 1:30 htb rate 1kbit ceil 6mbit burst 15k

# Martin Devera, author of HTB, recommends SFQ to be added beneath these classes:
tc qdisc add dev eth0 parent 1:10 handle 10: sfq perturb 10
tc qdisc add dev eth0 parent 1:20 handle 20: sfq perturb 10
tc qdisc add dev eth0 parent 1:30 handle 30: sfq perturb 10
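Once the hierarchy is loaded, the classes and their traffic counters can be inspected with:

```shell
# -s adds statistics: bytes and packets per class, drops, and rate usage,
# which makes it easy to see whether traffic lands in the intended class
tc -s class show dev eth0
tc -s qdisc show dev eth0
```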

Filters

Once a classful qdisc is set on the root (which may contain classes with more classful qdiscs), it is necessary to use filters to indicate which packets should be processed by which class.

In a classless-only environment, filters are not necessary.

You can filter packets by using tc, or a combination of tc + iptables.

Using tc only

Here is an example explaining a filter:

# This command adds a filter to the qdisc 1: of dev eth0, sets the
# priority of the filter to 1, matches packets with a
# destination port of 22, and makes class 1:10 process the
# packets that match.
tc filter add dev eth0 protocol ip parent 1: prio 1 u32 match ip dport 22 0xffff flowid 1:10

# This filter is attached to the qdisc 1: of dev eth0, has a
# priority of 2, matches the IP address 4.3.2.1 exactly together
# with a source port of 80, and makes class 1:11 process the
# packets that match
tc filter add dev eth0 parent 1: protocol ip prio 2 u32 match ip src 4.3.2.1/32 match ip sport 80 0xffff flowid 1:11
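The filters currently attached to the root qdisc, along with their priorities, can be listed with:

```shell
# List the filters attached to qdisc 1: on eth0
tc filter show dev eth0 parent 1:
```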

Using tc + iptables

iptables has a feature called fwmark that can be used to mark packets across interfaces.

First, this makes packets marked with 6 be processed by the 1:30 class:

# tc filter add dev eth0 protocol ip parent 1: prio 1 handle 6 fw flowid 1:30

This sets that mark (6) using iptables:

# iptables -A PREROUTING -t mangle -i eth0 -j MARK --set-mark 6

You can then use iptables normally to match packets and then mark them with fwmark.
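For example, to mark all TCP traffic arriving on eth0 for destination port 6881 (a port commonly associated with BitTorrent; the port choice here is purely illustrative) so that a fw filter can steer it into a throttled class:

```shell
# Mark matching packets with 6 in the mangle table, before routing
iptables -t mangle -A PREROUTING -i eth0 -p tcp --dport 6881 -j MARK --set-mark 6
```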

Example of ingress traffic shaping with SNAT

Qdiscs on ingress traffic provide only policing, with no shaping. In order to shape ingress, the IFB (Intermediate Functional Block) device has to be used. However, another problem arises if SNAT or MASQUERADE is in use, as all incoming traffic has the same destination address. The qdisc intercepts the incoming traffic on the external interface before reverse NAT translation, so it can only see the router's IP as the destination of the packets.

The following solution is implemented on OpenWRT and can be applied to Arch Linux: first, the outgoing packets are marked with MARK and the corresponding connections (and related connections) with CONNMARK. On the incoming packets, an ingress u32 filter redirects the traffic to IFB (action mirred) and also retrieves the mark of the packet from CONNTRACK (action connmark), thus providing information as to which IP behind the NAT initiated the traffic.

This functionality has been integrated into the kernel since linux-3.19 and into iproute2 since version 4.1.

The following is a small script with only two HTB classes on ingress to demonstrate this. Traffic defaults to class 3:30. Outgoing traffic from 192.168.1.50 (behind NAT) to the Internet is marked with "3", so incoming packets from the Internet going to 192.168.1.50 are also marked with "3" and are classified into 3:33.

#!/bin/sh -x

# Maximum allowed downlink. Set to 90% of the achievable downlink in kbits/s
DOWNLINK=1800

# Interface facing the Internet
EXTDEV=enp0s3

# Load IFB; all other modules are loaded automatically
modprobe ifb
ip link set dev ifb0 down

# Clear old queuing disciplines (qdisc) on the interfaces and the MANGLE table
tc qdisc del dev $EXTDEV root    2> /dev/null > /dev/null
tc qdisc del dev $EXTDEV ingress 2> /dev/null > /dev/null
tc qdisc del dev ifb0 root       2> /dev/null > /dev/null
tc qdisc del dev ifb0 ingress    2> /dev/null > /dev/null
iptables -t mangle -F
iptables -t mangle -X QOS

# Passing "stop" (without quotes) as the first argument removes the shaping and stops here.
if [ "$1" = "stop" ]
then
        echo "Shaping removed on $EXTDEV."
        exit
fi

ip link set dev ifb0 up

# HTB classes on IFB with rate limiting
tc qdisc add dev ifb0 root handle 3: htb default 30
tc class add dev ifb0 parent 3: classid 3:3 htb rate ${DOWNLINK}kbit
tc class add dev ifb0 parent 3:3 classid 3:30 htb rate 400kbit ceil ${DOWNLINK}kbit
tc class add dev ifb0 parent 3:3 classid 3:33 htb rate 1400kbit ceil ${DOWNLINK}kbit

# Packets marked with "3" on IFB flow through class 3:33
tc filter add dev ifb0 parent 3:0 protocol ip handle 3 fw flowid 3:33

# Outgoing traffic from 192.168.1.50 is marked with "3"
iptables -t mangle -N QOS
iptables -t mangle -A FORWARD -o $EXTDEV -j QOS
iptables -t mangle -A OUTPUT -o $EXTDEV -j QOS
iptables -t mangle -A QOS -j CONNMARK --restore-mark
iptables -t mangle -A QOS -s 192.168.1.50 -m mark --mark 0 -j MARK --set-mark 3
iptables -t mangle -A QOS -j CONNMARK --save-mark

# Forward all ingress traffic on internet interface to the IFB device
tc qdisc add dev $EXTDEV ingress handle ffff:
tc filter add dev $EXTDEV parent ffff: protocol ip \
        u32 match u32 0 0 \
        action connmark \
        action mirred egress redirect dev ifb0 \
        flowid ffff:1

exit 0
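Assuming the script is saved as, say, shaper.sh (a hypothetical name) and made executable, it can be applied, inspected and removed like so:

```shell
# Apply the shaping rules (requires root)
./shaper.sh
# Watch traffic being classified into 3:30 and 3:33 on the IFB device
tc -s class show dev ifb0
# Remove the shaping again
./shaper.sh stop
```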

See also