* Re: [PATCHv4] virtio-spec: virtio network device multiqueue support
From: Michael S. Tsirkin @ 2012-09-10 6:27 UTC
To: Rusty Russell
Cc: kvm, netdev, rick.jones2, virtualization, levinsasha928, pbonzini,
Tom Herbert
In-Reply-To: <20120910061629.GC16819@redhat.com>
On Mon, Sep 10, 2012 at 09:16:29AM +0300, Michael S. Tsirkin wrote:
> On Mon, Sep 10, 2012 at 11:42:25AM +0930, Rusty Russell wrote:
> > OK, I read the spec (pasted below for ease of reading), but I'm still
> > confused over how this will work.
> >
> > I thought normal net drivers have the hardware provide an rxhash for
> > each packet, and we map that to CPU to queue the packet on[1]. We hope
> > that the receiving process migrates to that CPU, so xmit queue
> > matches.
>
> This only works sometimes. For example, it's common to pin netperf to a
> cpu to get consistent performance. Proper hardware must obey what
> applications want it to do, not the other way around.
>
> > For virtio this would mean a new per-packet rxhash value, right?
> >
> > Why are we doing something different? What am I missing?
> >
> > Thanks,
> > Rusty.
> > [1] Everything I Know About Networking I Learned From LWN:
> > https://lwn.net/Articles/362339/
>
> I think you missed this:
>
> Some network interfaces can help with the distribution of incoming
> packets; they have multiple receive queues and multiple interrupt lines.
> Others, though, are equipped with a single queue, meaning that the
> driver for that hardware must deal with all incoming packets in a
> single, serialized stream. Parallelizing such a stream requires some
> intelligence on the part of the host operating system.
>
> In other words, RPS is a hack to speed up networking on cheapo
> hardware; this is one of the reasons it is off by default.
> Good hardware has multiple receive queues.
> We can implement a good one so we do not need RPS.
>
> Also not all guest OS-es support RPS.
>
> Does this clarify?
I would like to add that on many processors, sending
IPIs between guest CPUs requires exits on both the sending *and* the
receiving paths, making it very expensive.
> > ---
> > Transmit Packet Steering
> >
> > When VIRTIO_NET_F_MULTIQUEUE feature bit is negotiated, guest can use any of multiple configured transmit queues to transmit a given packet. To avoid packet reordering by device (which generally leads to performance degradation) driver should attempt to utilize the same transmit virtqueue for all packets of a given transmit flow. For bi-directional protocols (in practice, TCP), a given network connection can utilize both transmit and receive queues. For best performance, packets from a single connection should utilize the paired transmit and receive queues from the same virtqueue pair; for example both transmitqN and receiveqN. This rule makes it possible to optimize processing on the device side, but this is not a hard requirement: devices should function correctly even when this rule
> > is not followed.
> >
> > Driver selects an active steering rule using VIRTIO_NET_CTRL_STEERING command (this controls both which virtqueue is selected for a given packet for receive and notifies the device which virtqueues are about to be used for transmit).
> >
> > This command accepts a single out argument in the following format:
> >
> > #define VIRTIO_NET_CTRL_STEERING 4
> >
> > The field rule specifies the function used to select transmit virtqueue for a given packet; the field param makes it possible to pass an extra parameter if appropriate. When rule is set to VIRTIO_NET_CTRL_STEERING_SINGLE (this is the default) all packets are steered to the default virtqueue transmitq (1); param is unused. When rule is set to VIRTIO_NET_CTRL_STEERING_RX_FOLLOWS_TX packets are steered by driver to the first N=(param+1) multiqueue virtqueues transmitq1...transmitqN; the default transmitq is unused. Driver must have configured all these (param+1) virtqueues beforehand.
> >
> > Supported steering rules can be added and removed in the future. Driver should check that the request to change the steering rule was successful by checking ack values of the command. As selecting a specific steering is an optimization feature, drivers should avoid hard failure and fall back on using a supported steering rule if this command fails. The default steering rule is VIRTIO_NET_CTRL_STEERING_SINGLE. It will not be removed.
> >
> > When the steering rule is modified, some packets can still be outstanding in one or more of the transmit virtqueues. Since drivers might choose to modify the current steering rule at a high rate (e.g. adaptively in response to changes in the workload) to avoid reordering packets, device is recommended to complete processing of the transmit queue(s) utilized by the original steering before processing any packets delivered by the modified steering rule.
> >
> > For debugging, the current steering rule can also be read from the configuration space.
> >
> > Receive Packet Steering
> >
> > When VIRTIO_NET_F_MULTIQUEUE feature bit is negotiated, device can use any of multiple configured receive queues to pass a given packet to driver. Driver controls which virtqueue is selected in practice by configuring packet steering rule using VIRTIO_NET_CTRL_STEERING command, as described above[sub:Transmit-Packet-Steering].
> >
> > The field rule specifies the function used to select receive virtqueue for a given packet; the field param makes it possible to pass an extra parameter if appropriate. When rule is set to VIRTIO_NET_CTRL_STEERING_SINGLE all packets are steered to the default virtqueue receiveq (0); param is unused; this is the default. When rule is set to VIRTIO_NET_CTRL_STEERING_RX_FOLLOWS_TX packets are steered by host to the first N=(param+1) multiqueue virtqueues receiveq1...receiveqN; the default receiveq is unused. Driver must have configured all these (param+1) virtqueues beforehand. For best performance for bi-directional flows (such as TCP) device should detect the flow to virtqueue pair mapping on transmit and select the receive virtqueue from the same virtqueue pair. For uni-directional flows, or when this mapping information is missing, a device-specific steering function is used.
> >
> > Supported steering rules can be added and removed in the future. Driver should probe for supported rules by checking ack values of the command.
> >
> > When the steering rule is modified, some packets can still be outstanding in one or more of the virtqueues. Device is not required to wait for these packets to be consumed before delivering packets using the new steering rule. Drivers modifying the steering rule at a high rate (e.g. adaptively in response to changes in the workload) are recommended to complete processing of the receive queue(s) utilized by the original steering before processing any packets delivered by the modified steering rule.
* Re: [PATCHv4] virtio-spec: virtio network device multiqueue support
From: Michael S. Tsirkin @ 2012-09-10 6:16 UTC
To: Rusty Russell
Cc: kvm, netdev, rick.jones2, virtualization, levinsasha928, pbonzini,
Tom Herbert
In-Reply-To: <878vcifwxi.fsf@rustcorp.com.au>
On Mon, Sep 10, 2012 at 11:42:25AM +0930, Rusty Russell wrote:
> OK, I read the spec (pasted below for ease of reading), but I'm still
> confused over how this will work.
>
> I thought normal net drivers have the hardware provide an rxhash for
> each packet, and we map that to CPU to queue the packet on[1]. We hope
> that the receiving process migrates to that CPU, so xmit queue
> matches.
This only works sometimes. For example, it's common to pin netperf to a
cpu to get consistent performance. Proper hardware must obey what
applications want it to do, not the other way around.
> For virtio this would mean a new per-packet rxhash value, right?
>
> Why are we doing something different? What am I missing?
>
> Thanks,
> Rusty.
> [1] Everything I Know About Networking I Learned From LWN:
> https://lwn.net/Articles/362339/
I think you missed this:
Some network interfaces can help with the distribution of incoming
packets; they have multiple receive queues and multiple interrupt lines.
Others, though, are equipped with a single queue, meaning that the
driver for that hardware must deal with all incoming packets in a
single, serialized stream. Parallelizing such a stream requires some
intelligence on the part of the host operating system.
In other words, RPS is a hack to speed up networking on cheapo
hardware; this is one of the reasons it is off by default.
Good hardware has multiple receive queues.
We can implement a good one so we do not need RPS.
Also not all guest OS-es support RPS.
Does this clarify?
> ---
> Transmit Packet Steering
>
> When VIRTIO_NET_F_MULTIQUEUE feature bit is negotiated, guest can use any of multiple configured transmit queues to transmit a given packet. To avoid packet reordering by device (which generally leads to performance degradation) driver should attempt to utilize the same transmit virtqueue for all packets of a given transmit flow. For bi-directional protocols (in practice, TCP), a given network connection can utilize both transmit and receive queues. For best performance, packets from a single connection should utilize the paired transmit and receive queues from the same virtqueue pair; for example both transmitqN and receiveqN. This rule makes it possible to optimize processing on the device side, but this is not a hard requirement: devices should function correctly even when this rule is not followed.
>
> Driver selects an active steering rule using VIRTIO_NET_CTRL_STEERING command (this controls both which virtqueue is selected for a given packet for receive and notifies the device which virtqueues are about to be used for transmit).
>
> This command accepts a single out argument in the following format:
>
> #define VIRTIO_NET_CTRL_STEERING 4
>
> The field rule specifies the function used to select transmit virtqueue for a given packet; the field param makes it possible to pass an extra parameter if appropriate. When rule is set to VIRTIO_NET_CTRL_STEERING_SINGLE (this is the default) all packets are steered to the default virtqueue transmitq (1); param is unused. When rule is set to VIRTIO_NET_CTRL_STEERING_RX_FOLLOWS_TX packets are steered by driver to the first N=(param+1) multiqueue virtqueues transmitq1...transmitqN; the default transmitq is unused. Driver must have configured all these (param+1) virtqueues beforehand.
>
> Supported steering rules can be added and removed in the future. Driver should check that the request to change the steering rule was successful by checking ack values of the command. As selecting a specific steering is an optimization feature, drivers should avoid hard failure and fall back on using a supported steering rule if this command fails. The default steering rule is VIRTIO_NET_CTRL_STEERING_SINGLE. It will not be removed.
>
> When the steering rule is modified, some packets can still be outstanding in one or more of the transmit virtqueues. Since drivers might choose to modify the current steering rule at a high rate (e.g. adaptively in response to changes in the workload) to avoid reordering packets, device is recommended to complete processing of the transmit queue(s) utilized by the original steering before processing any packets delivered by the modified steering rule.
>
> For debugging, the current steering rule can also be read from the configuration space.
>
> Receive Packet Steering
>
> When VIRTIO_NET_F_MULTIQUEUE feature bit is negotiated, device can use any of multiple configured receive queues to pass a given packet to driver. Driver controls which virtqueue is selected in practice by configuring packet steering rule using VIRTIO_NET_CTRL_STEERING command, as described above[sub:Transmit-Packet-Steering].
>
> The field rule specifies the function used to select receive virtqueue for a given packet; the field param makes it possible to pass an extra parameter if appropriate. When rule is set to VIRTIO_NET_CTRL_STEERING_SINGLE all packets are steered to the default virtqueue receiveq (0); param is unused; this is the default. When rule is set to VIRTIO_NET_CTRL_STEERING_RX_FOLLOWS_TX packets are steered by host to the first N=(param+1) multiqueue virtqueues receiveq1...receiveqN; the default receiveq is unused. Driver must have configured all these (param+1) virtqueues beforehand. For best performance for bi-directional flows (such as TCP) device should detect the flow to virtqueue pair mapping on transmit and select the receive virtqueue from the same virtqueue pair. For uni-directional flows,
> or when this mapping information is missing, a device-specific steering function is used.
>
> Supported steering rules can be added and removed in the future. Driver should probe for supported rules by checking ack values of the command.
>
> When the steering rule is modified, some packets can still be outstanding in one or more of the virtqueues. Device is not required to wait for these packets to be consumed before delivering packets using the new steering rule. Drivers modifying the steering rule at a high rate (e.g. adaptively in response to changes in the workload) are recommended to complete processing of the receive queue(s) utilized by the original steering before processing any packets delivered by the modified steering rule.
* Re: RFC: mac802154 Packet Queueing and Slave Devices
From: Eric Dumazet @ 2012-09-10 6:12 UTC
To: Alan Ott
Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
linux-zigbee-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
In-Reply-To: <504D37A7.60109-yzvJWuRpmD1zbRFIqnYvSA@public.gmane.org>
On Sun, 2012-09-09 at 20:43 -0400, Alan Ott wrote:
> Hi,
>
> Tony and I were recently talking about packet queueing on 802.15.4. What
> currently happens (in net/mac802154/tx.c) is that each tx packet (skb)
> is stuck on a work queue, and the worker function then sends each packet
> to the hardware driver in order.
>
> The problem with this is that it defeats the netif flow control.
And qdisc ability to better control bufferbloat...
By the way, mac802154_tx() looks buggy :
	if (!(priv->phy->channels_supported[page] & (1 << chan))) {
		WARN_ON(1);
		// Here, a kfree_skb(skb) is missing.
		return NETDEV_TX_OK;
	}
	if (skb_cow_head(skb, priv->hw.extra_tx_headroom)) {
		dev_kfree_skb(skb); // should be kfree_skb(skb)
		return NETDEV_TX_OK;
	}
	work = kzalloc(sizeof(struct xmit_work), GFP_ATOMIC);
	if (!work)
		return NETDEV_TX_BUSY;
Returning NETDEV_TX_BUSY makes the stack retry in a loop. So if there is
really no more memory, it's a deadlock. You should instead
kfree_skb(skb) and return NETDEV_TX_OK.
Also mac802154_wpan_xmit() returns NETDEV_TX_OK without kfree_skb(skb)
here :
	if (chan == MAC802154_CHAN_NONE ||
	    page >= WPAN_NUM_PAGES ||
	    chan >= WPAN_NUM_CHANNELS)
		return NETDEV_TX_OK;
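
To make the intended error handling concrete, here is a minimal user-space sketch, not the actual mac802154 code. Everything in it (struct fake_skb, fake_kfree_skb(), sketch_tx() and the three boolean inputs) is a hypothetical stand-in; it only models the rule that every failure path frees the skb and returns "OK" so the caller never retries.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for kernel return codes; not the real definitions. */
enum tx_status { TX_OK, TX_BUSY };

/* Hypothetical model of an skb: we only track whether it was freed. */
struct fake_skb {
	bool freed;
};

static void fake_kfree_skb(struct fake_skb *skb)
{
	assert(!skb->freed);	/* a double free would be a bug */
	skb->freed = true;
}

/*
 * Model of a transmit entry point in which every failure path consumes
 * the skb and reports TX_OK, so the caller never loops on TX_BUSY.
 */
static enum tx_status sketch_tx(struct fake_skb *skb, bool chan_ok,
				bool headroom_ok, bool alloc_ok)
{
	if (!chan_ok || !headroom_ok || !alloc_ok) {
		fake_kfree_skb(skb);	/* drop the packet */
		return TX_OK;
	}
	/* Success: ownership passes to the worker, which frees it later. */
	return TX_OK;
}
```

The point of the model is that TX_BUSY is never returned from an out-of-memory path, and no path returns TX_OK while still leaking the buffer.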
* Re: (ipt_log_packet, sb_add) 3.6.0-rc2 kernel panic - not syncing; Fatal exception in interrupt
From: Eric Dumazet @ 2012-09-10 6:00 UTC
To: Jan Engelhardt; +Cc: Sami Farin, netdev, Florian Westphal, e1000-devel
In-Reply-To: <alpine.LNX.2.01.1209100659220.22738@frira.zrqbmnf.qr>
On Mon, 2012-09-10 at 07:02 +0200, Jan Engelhardt wrote:
> On Monday 2012-09-03 00:53, Eric Dumazet wrote:
> >[PATCH] xt_LOG: take care of timewait sockets
> >
> >Sami Farin reported crashes in xt_LOG because it assumes skb->sk is a
> >full blown socket.
> >
> >But with TCP early demux, we can have skb->sk pointing to a timewait
> >socket.
> >
> >+static void dump_sk_uid_gid(struct sbuff *m, struct sock *sk)
> >+{
> >+	if (!sk || sk->sk_state == TCP_TIME_WAIT)
> >+		return;
> >+
> >+	read_lock_bh(&sk->sk_callback_lock);
> >+	if (sk->sk_socket && sk->sk_socket->file)
> >+		sb_add(m, "UID=%u GID=%u ",
> >+			sk->sk_socket->file->f_cred->fsuid,
> >+			sk->sk_socket->file->f_cred->fsgid);
>
> xt_owner.c is also using f_cred, so it might need the same,
> does it not?
Right.
AFAIK, xt_owner would make little sense in input path, no ?
static struct xt_match owner_mt_reg __read_mostly = {
	.name       = "owner",
	.revision   = 1,
	.family     = NFPROTO_UNSPEC,
	.checkentry = owner_check,
	.match      = owner_mt,
	.matchsize  = sizeof(struct xt_owner_match_info),
	.hooks      = (1 << NF_INET_LOCAL_OUT) |
	              (1 << NF_INET_POST_ROUTING),
	.me         = THIS_MODULE,
};
So it seems we have nothing to do at this moment.
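
As a small illustration of why the input path is excluded, the sketch below tests a hook number against the same bitmask. The hook numbers follow the kernel's enum nf_inet_hooks; owner_hooks and hook_allowed() are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hook numbers as in the kernel's enum nf_inet_hooks. */
enum {
	NF_INET_PRE_ROUTING,
	NF_INET_LOCAL_IN,
	NF_INET_FORWARD,
	NF_INET_LOCAL_OUT,
	NF_INET_POST_ROUTING,
	NF_INET_NUMHOOKS
};

/* Same mask as the .hooks member of owner_mt_reg above. */
static const unsigned int owner_hooks =
	(1 << NF_INET_LOCAL_OUT) | (1 << NF_INET_POST_ROUTING);

/* Hypothetical helper: is a match with this mask permitted at this hook? */
static bool hook_allowed(unsigned int mask, unsigned int hook)
{
	assert(hook < NF_INET_NUMHOOKS);
	return (mask & (1u << hook)) != 0;
}
```

With this mask, hook_allowed() is true only for LOCAL_OUT and POST_ROUTING, so the match is never evaluated on PRE_ROUTING or LOCAL_IN.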
* Re: (ipt_log_packet, sb_add) 3.6.0-rc2 kernel panic - not syncing; Fatal exception in interrupt
From: Jan Engelhardt @ 2012-09-10 5:02 UTC
To: Eric Dumazet; +Cc: Florian Westphal, Sami Farin, netdev, e1000-devel
In-Reply-To: <1346626385.2563.44.camel@edumazet-glaptop>
On Monday 2012-09-03 00:53, Eric Dumazet wrote:
>[PATCH] xt_LOG: take care of timewait sockets
>
>Sami Farin reported crashes in xt_LOG because it assumes skb->sk is a
>full blown socket.
>
>But with TCP early demux, we can have skb->sk pointing to a timewait
>socket.
>
>+static void dump_sk_uid_gid(struct sbuff *m, struct sock *sk)
>+{
>+	if (!sk || sk->sk_state == TCP_TIME_WAIT)
>+		return;
>+
>+	read_lock_bh(&sk->sk_callback_lock);
>+	if (sk->sk_socket && sk->sk_socket->file)
>+		sb_add(m, "UID=%u GID=%u ",
>+			sk->sk_socket->file->f_cred->fsuid,
>+			sk->sk_socket->file->f_cred->fsgid);
xt_owner.c is also using f_cred, so it might need the same,
does it not?
* [PATCH] rndis_wlan: move the dereference below the NULL test
From: Wei Yongjun @ 2012-09-10 4:46 UTC
To: jussi.kivilinna, linville; +Cc: yongjun_wei, linux-wireless, netdev
From: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
The dereference should be moved below the NULL test.
spatch with a semantic match was used to find this.
(http://coccinelle.lip6.fr/)
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
---
drivers/net/wireless/rndis_wlan.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/drivers/net/wireless/rndis_wlan.c b/drivers/net/wireless/rndis_wlan.c
index 7a4ae9e..de2a673 100644
--- a/drivers/net/wireless/rndis_wlan.c
+++ b/drivers/net/wireless/rndis_wlan.c
@@ -1946,12 +1946,19 @@ static int rndis_get_tx_power(struct wiphy *wiphy, int *dbm)
 static int rndis_scan(struct wiphy *wiphy,
 			struct cfg80211_scan_request *request)
 {
-	struct net_device *dev = request->wdev->netdev;
-	struct usbnet *usbdev = netdev_priv(dev);
-	struct rndis_wlan_private *priv = get_rndis_wlan_priv(usbdev);
+	struct net_device *dev;
+	struct usbnet *usbdev;
+	struct rndis_wlan_private *priv;
 	int ret;
 	int delay = SCAN_DELAY_JIFFIES;

+	if (!request)
+		return -EINVAL;
+
+	dev = request->wdev->netdev;
+	usbdev = netdev_priv(dev);
+	priv = get_rndis_wlan_priv(usbdev);
+
 	netdev_dbg(usbdev->net, "cfg80211.scan\n");

 	/* Get current bssid list from device before new scan, as new scan
@@ -1959,9 +1966,6 @@ static int rndis_scan(struct wiphy *wiphy,
 	 */
 	rndis_check_bssid_list(usbdev, NULL, NULL);

-	if (!request)
-		return -EINVAL;
-
 	if (priv->scan_request && priv->scan_request != request)
 		return -EBUSY;
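
The pattern the patch applies can be shown in isolation. The following is a minimal user-space sketch with hypothetical types (struct sketch_dev, struct sketch_request) and a hypothetical function sketch_scan(); it is not the driver code, only the "declare first, test, then dereference" ordering.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the request and device structures. */
struct sketch_dev { int id; };
struct sketch_request { struct sketch_dev *dev; };

static int sketch_scan(struct sketch_request *request)
{
	struct sketch_dev *dev;	/* declared, but not yet initialised */

	if (!request)
		return -EINVAL;	/* reject before touching request-> */

	dev = request->dev;	/* safe: request is known to be non-NULL */
	assert(dev != NULL);
	return dev->id;
}
```

Initialising dev in its declaration, as the old code did, would dereference request before the NULL test had a chance to run.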
* [PATCH] caif: move the dereference below the NULL test
From: Wei Yongjun @ 2012-09-10 4:38 UTC
To: sjur.brandeland, davem; +Cc: yongjun_wei, netdev
From: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
The dereference should be moved below the NULL test.
spatch with a semantic match was used to find this.
(http://coccinelle.lip6.fr/)
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
---
net/caif/cfsrvl.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/net/caif/cfsrvl.c b/net/caif/cfsrvl.c
index dd485f6..ba217e9 100644
--- a/net/caif/cfsrvl.c
+++ b/net/caif/cfsrvl.c
@@ -211,9 +211,10 @@ void caif_client_register_refcnt(struct cflayer *adapt_layer,
 					void (*put)(struct cflayer *lyr))
 {
 	struct cfsrvl *service;
-	service = container_of(adapt_layer->dn, struct cfsrvl, layer);

-	WARN_ON(adapt_layer == NULL || adapt_layer->dn == NULL);
+	if (WARN_ON(adapt_layer == NULL || adapt_layer->dn == NULL))
+		return;
+	service = container_of(adapt_layer->dn, struct cfsrvl, layer);
 	service->hold = hold;
 	service->put = put;
 }
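
The change relies on WARN_ON() evaluating to the truth value of its condition, so it can guard an early return. A rough user-space imitation of that idiom is sketched below; SKETCH_WARN_ON, sketch_warn(), struct sketch_layer and sketch_register() are all hypothetical names, and the real kernel macro does considerably more (stack trace, taint).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* User-space imitation: report the condition and hand its value back. */
static bool sketch_warn(bool cond, const char *expr)
{
	if (cond)
		fprintf(stderr, "WARNING: %s\n", expr);
	return cond;
}
#define SKETCH_WARN_ON(cond) sketch_warn((cond), #cond)

/* Hypothetical layer with a link to the layer below it. */
struct sketch_layer {
	struct sketch_layer *dn;
	int hold;
};

/* Returns false (and changes nothing) when the layer chain is incomplete. */
static bool sketch_register(struct sketch_layer *adapt, int hold)
{
	if (SKETCH_WARN_ON(adapt == NULL || adapt->dn == NULL))
		return false;
	adapt->dn->hold = hold;	/* only reached with valid pointers */
	return true;
}
```

Because the check and the early return are one statement, the dereference of adapt->dn cannot be reached with a NULL pointer.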
* Re: [PATCHv4] virtio-spec: virtio network device multiqueue support
From: Rusty Russell @ 2012-09-10 2:12 UTC
To: Michael S. Tsirkin, kvm, virtualization, netdev
Cc: pbonzini, rick.jones2, levinsasha928, Tom Herbert
In-Reply-To: <20120909130308.GA3471@redhat.com>
OK, I read the spec (pasted below for ease of reading), but I'm still
confused over how this will work.
I thought normal net drivers have the hardware provide an rxhash for
each packet, and we map that to CPU to queue the packet on[1]. We hope
that the receiving process migrates to that CPU, so xmit queue
matches.
For virtio this would mean a new per-packet rxhash value, right?
Why are we doing something different? What am I missing?
Thanks,
Rusty.
[1] Everything I Know About Networking I Learned From LWN:
https://lwn.net/Articles/362339/
---
Transmit Packet Steering
When VIRTIO_NET_F_MULTIQUEUE feature bit is negotiated, guest can use any of multiple configured transmit queues to transmit a given packet. To avoid packet reordering by device (which generally leads to performance degradation) driver should attempt to utilize the same transmit virtqueue for all packets of a given transmit flow. For bi-directional protocols (in practice, TCP), a given network connection can utilize both transmit and receive queues. For best performance, packets from a single connection should utilize the paired transmit and receive queues from the same virtqueue pair; for example both transmitqN and receiveqN. This rule makes it possible to optimize processing on the device side, but this is not a hard requirement: devices should function correctly even when this rule is
not followed.
Driver selects an active steering rule using VIRTIO_NET_CTRL_STEERING command (this controls both which virtqueue is selected for a given packet for receive and notifies the device which virtqueues are about to be used for transmit).
This command accepts a single out argument in the following format:
#define VIRTIO_NET_CTRL_STEERING 4
The field rule specifies the function used to select transmit virtqueue for a given packet; the field param makes it possible to pass an extra parameter if appropriate. When rule is set to VIRTIO_NET_CTRL_STEERING_SINGLE (this is the default) all packets are steered to the default virtqueue transmitq (1); param is unused. When rule is set to VIRTIO_NET_CTRL_STEERING_RX_FOLLOWS_TX packets are steered by driver to the first N=(param+1) multiqueue virtqueues transmitq1...transmitqN; the default transmitq is unused. Driver must have configured all these (param+1) virtqueues beforehand.
Supported steering rules can be added and removed in the future. Driver should check that the request to change the steering rule was successful by checking ack values of the command. As selecting a specific steering is an optimization feature, drivers should avoid hard failure and fall back on using a supported steering rule if this command fails. The default steering rule is VIRTIO_NET_CTRL_STEERING_SINGLE. It will not be removed.
When the steering rule is modified, some packets can still be outstanding in one or more of the transmit virtqueues. Since drivers might choose to modify the current steering rule at a high rate (e.g. adaptively in response to changes in the workload) to avoid reordering packets, device is recommended to complete processing of the transmit queue(s) utilized by the original steering before processing any packets delivered by the modified steering rule.
For debugging, the current steering rule can also be read from the configuration space.
Receive Packet Steering
When VIRTIO_NET_F_MULTIQUEUE feature bit is negotiated, device can use any of multiple configured receive queues to pass a given packet to driver. Driver controls which virtqueue is selected in practice by configuring packet steering rule using VIRTIO_NET_CTRL_STEERING command, as described above[sub:Transmit-Packet-Steering].
The field rule specifies the function used to select receive virtqueue for a given packet; the field param makes it possible to pass an extra parameter if appropriate. When rule is set to VIRTIO_NET_CTRL_STEERING_SINGLE all packets are steered to the default virtqueue receiveq (0); param is unused; this is the default. When rule is set to VIRTIO_NET_CTRL_STEERING_RX_FOLLOWS_TX packets are steered by host to the first N=(param+1) multiqueue virtqueues receiveq1...receiveqN; the default receiveq is unused. Driver must have configured all these (param+1) virtqueues beforehand. For best performance for bi-directional flows (such as TCP) device should detect the flow to virtqueue pair mapping on transmit and select the receive virtqueue from the same virtqueue pair. For uni-directional flows, or when this mapping information is missing, a device-specific steering function is used.
Supported steering rules can be added and removed in the future. Driver should probe for supported rules by checking ack values of the command.
When the steering rule is modified, some packets can still be outstanding in one or more of the virtqueues. Device is not required to wait for these packets to be consumed before delivering packets using the new steering rule. Drivers modifying the steering rule at a high rate (e.g. adaptively in response to changes in the workload) are recommended to complete processing of the receive queue(s) utilized by the original steering before processing any packets delivered by the modified steering rule.
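
For readers trying to picture the driver side of the draft, the following is a rough sketch of queue selection under the two rules it names. It rests on several assumptions: the pasted draft does not show the command structure or the numeric rule values, so struct sketch_steering, its field types, the SKETCH_* constants, and the use of a flow hash modulo N are all illustrative guesses rather than anything the spec defines.

```c
#include <assert.h>
#include <stdint.h>

#define VIRTIO_NET_CTRL_STEERING 4	/* from the draft above */

/* Rule values are placeholders; the draft text does not list them. */
#define SKETCH_STEERING_SINGLE		0
#define SKETCH_STEERING_RX_FOLLOWS_TX	1

/*
 * Assumed layout of the command argument: only the field names
 * "rule" and "param" come from the draft, the types are guesses.
 */
struct sketch_steering {
	uint8_t rule;
	uint8_t param;	/* N - 1, where N multiqueue pairs are in use */
};

/*
 * Pick a transmit queue for a flow.  Returns 0 for the default transmitq
 * and 1..N for transmitq1..transmitqN, so a flow always maps to the same
 * queue while the rule is unchanged.
 */
static unsigned int sketch_select_txq(const struct sketch_steering *s,
				      uint32_t flow_hash)
{
	unsigned int n;

	if (s->rule == SKETCH_STEERING_SINGLE)
		return 0;
	assert(s->rule == SKETCH_STEERING_RX_FOLLOWS_TX);
	n = (unsigned int)s->param + 1;	/* N = param + 1 */
	return 1 + flow_hash % n;
}
```

Keeping the mapping a pure function of the flow hash is one simple way to satisfy the "same transmit virtqueue for all packets of a given transmit flow" recommendation; a real driver could equally well key on the sending CPU.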
* Re: RFC - document network device carrier management
From: Jan Engelhardt @ 2012-09-10 2:00 UTC
To: Stephen Hemminger; +Cc: David Miller, netdev
In-Reply-To: <20120815085827.2b252094@nehalam.linuxnetplumber.net>
On Wednesday 2012-08-15 17:58, Stephen Hemminger wrote:
>--- a/Documentation/networking/netdevices.txt 2012-06-22 08:27:46.729168196 -0700
>+++ b/Documentation/networking/netdevices.txt 2012-08-15 08:56:31.120429994 -0700
>@@ -45,6 +45,36 @@ drop, truncate, or pass up oversize pack
> packets is preferred.
>
>
>+CARRIER
>+=======
>+Most network devices have an operational state that the device
>+monitors. The Linux kernel uses the name "carrier" for this flag which
>+is a historical reference to old modems. Carrier is reported to
>+userspace via the IFF_RUNNING flag from SIOCGIFFLAGS ioctl.
I think Netlink should be mentioned instead:
Carrier is reported to userspace via the IFF_RUNNING flag in
struct ifinfomsg.ifi_flags returned by RTM_GETLINK (see rtnetlink(7)).
* Re: [PATCH v2] iproute2: tc.8: update UNITS section.
From: Li Wei @ 2012-09-10 1:28 UTC
To: Stephen Hemminger; +Cc: netdev
In-Reply-To: <20120829111956.77d504ce@s6510.linuxnetplumber.net>
On 08/30/2012 02:19 AM, Stephen Hemminger wrote:
> On Wed, 29 Aug 2012 14:41:56 +0800
> Li Wei <lw@cn.fujitsu.com> wrote:
>
>> - rename section UNITS to PARAMETERS.
>> - break section PARAMETERS down to four subsections to cover the
>> commonly used parameter types (RATES, TIMES, SIZES, VALUES).
>> - add some explanation for IEC units in RATES.
>> - point out the max value we can set for RATES, TIMES and SIZES.
>>
>> Signed-off-by: Li Wei <lw@cn.fujitsu.com>
>
> Plan to merge this when I get back next week.
ping ...
>
>
* RFC: mac802154 Packet Queueing and Slave Devices
From: Alan Ott @ 2012-09-10 0:43 UTC
To: Alexander Smirnov, Dmitry Eremin-Solenikov, slapin, Tony Cheneau
Cc: linux-zigbee-devel, netdev
Hi,
Tony and I were recently talking about packet queueing on 802.15.4. What
currently happens (in net/mac802154/tx.c) is that each tx packet (skb)
is stuck on a work queue, and the worker function then sends each packet
to the hardware driver in order.
The problem with this is that it defeats the netif flow control. The
networking layer thinks the packet is sent as soon as it's put on the
workqueue (because the function that queues it returns NETDEV_TX_OK to
the networking layer), and the workqueue can then get arbitrarily large
if an application tries to send a lot of data. (Tony has shown this with
iperf)
The way the 802.15.4 drivers are currently written, their xmit function
blocks until the hardware confirms the packet has been sent. Any
hardware queueing is either not done (at86rf230 and mrf24j40 (and other
non-mainline (yet) drivers)[1]), or is done completely in the firmware
side (as in serial.c (for Econotag), not in mainline yet).
Solution 1:
If we want to keep the driver interface this way (no queueing on the
driver side and each driver's .xmit() function blocks), then we should
call netif_stop_queue()/netif_wake_queue() on the mac802154-subsystem
side[2].
Solution 2:
If we instead want to move to a non-blocking .xmit() function, like
ethernet and wifi currently have, we should then push the
netif_*_queue() functions to the drivers. This has the added benefit of
increased efficiency for devices which have a hardware queue (like the
Econotag, which is managed by the serial.c driver), as netif_*_queue()
functions won't have to be turned on and off repeatedly.
Solution 2 is more invasive. Note that right now we can't add
netif_*_queue() functions to the drivers, because the drivers have no
way to get to the net_device pointer. That is a different, but related
problem, which we might as well get to now. Right now there is the idea
of hardware devices each having multiple virtual slave devices
(represented by mac802154_sub_if_data, net/mac802154/mac802154.h). These
slave devices each have a net_device pointer. The drivers only get a
pointer to the ieee802154_dev, which represents the physical hardware.
They get no net_device (and there's no way for them to get to a
net_device pointer because there are multiple net_devices (one for each
slave interface) which could be sending them data). One of the problems
here is that each of these slave interfaces can potentially (and by
design) be on a different channel, which seems to cause a major problem
since the hardware radio can only receive on a single channel at a time.
I propose implementation of solution #1 in the short term, in parallel
with discussion about the intent of slave devices: what their design
goals were and how they can be made to work as designed (if
possible).
Alan.
[1] While I can't speak for all devices, the mrf24j40 has no hardware
queue (or, it has a single packet queue).
[2] netif_stop_queue() would go in mac802154_tx() and netif_wake_queue()
would go in mac802154_xmit_worker() once xmit() returns.
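
A toy model of solution #1 may help show the intended effect. It is a user-space sketch under stated assumptions: struct sketch_queue, sketch_queue_tx() and sketch_xmit_done() are hypothetical, and the "stopped" flag merely stands in for what netif_stop_queue()/netif_wake_queue() would do in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-device transmit state. */
struct sketch_queue {
	bool stopped;	/* models netif_stop_queue()/netif_wake_queue() */
	int  in_flight;	/* packets handed to the worker, not yet sent */
};

/* Models mac802154_tx(): accept one packet, then stop the queue. */
static bool sketch_queue_tx(struct sketch_queue *q)
{
	if (q->stopped)
		return false;	/* the stack holds the packet back */
	q->in_flight++;
	q->stopped = true;	/* netif_stop_queue() would go here */
	return true;
}

/* Models the worker once the driver's blocking xmit() has returned. */
static void sketch_xmit_done(struct sketch_queue *q)
{
	assert(q->in_flight == 1);
	q->in_flight--;
	q->stopped = false;	/* netif_wake_queue() would go here */
}
```

In this model at most one packet is ever in flight, so the backlog stays in the networking layer's own queue, where flow control can see it, instead of growing without bound on the workqueue.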
* [PATCH net-next] r8169: use unlimited DMA burst for TX
From: Michal Schmidt @ 2012-09-09 23:55 UTC
To: netdev
Cc: Francois Romieu, Hayes Wang, Realtek linux nic maintainers,
Ivan Vecera
The r8169 driver currently limits the DMA burst for TX to 1024 bytes. I have
a box where this prevents the interface from using the gigabit line to its full
potential. This patch solves the problem by setting TX_DMA_BURST to unlimited.
The box has an ASRock B75M motherboard with on-board RTL8168evl/8111evl
(XID 0c900880). TSO is enabled.
I used netperf (TCP_STREAM test) to measure the dependency of TX throughput
on MTU. I did it for three different values of TX_DMA_BURST ('5'=512, '6'=1024,
'7'=unlimited). This chart shows the results:
http://michich.fedorapeople.org/r8169/r8169-effects-of-TX_DMA_BURST.png
Interesting points:
- With the current DMA burst limit (1024):
  - at the default MTU=1500 I get only 842 Mbit/s.
  - when going up from a small MTU, the performance rises monotonically with
    increasing MTU only up to a peak at MTU=1076 (908 Mbit/s). Then there's
    a sudden drop to 762 Mbit/s from which the throughput rises monotonically
    again with further MTU increases.
- With a smaller DMA burst limit (512):
  - there's a similar peak at MTU=1076 and another one at MTU=564.
- With unlimited DMA burst:
  - at the default MTU=1500 I get a nice 940 Mbit/s.
  - the throughput rises monotonically with increasing MTU with no strange
    peaks.
Notice that the peaks occur at MTU sizes that are multiples of the DMA burst
limit plus 52. Why 52? Because:
20 (IP header) + 20 (TCP header) + 12 (TCP options) = 52
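As a quick check of that arithmetic, the expected peak positions can be
computed from the burst limit. The helper name peak_mtu() is made up for
this illustration; the only inputs are the 52-byte overhead derived above
and the burst sizes from the measurements.

```c
/* IP header + TCP header + TCP options, as derived in the text. */
enum { TCP_IP_OVERHEAD = 20 + 20 + 12 };

/*
 * Hypothetical helper: MTU at which the k-th peak (k >= 1) is expected
 * for a given TX DMA burst limit in bytes, i.e. k * burst + 52.
 */
unsigned int peak_mtu(unsigned int burst_bytes, unsigned int k)
{
    return k * burst_bytes + TCP_IP_OVERHEAD;
}
```

peak_mtu(1024, 1) gives 1076, and for the 512-byte limit peak_mtu(512, 1)
and peak_mtu(512, 2) give 564 and 1076, matching the peaks listed above.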
The Realtek-provided r8168 driver (v8.032.00) uses unlimited TX DMA burst too,
except for CFG_METHOD_1 where the TX DMA burst is set to 512 bytes.
CFG_METHOD_1 appears to be the oldest MAC version of "RTL8168B/8111B",
i.e. RTL_GIGA_MAC_VER_11 in r8169. Not sure if this MAC version really needs
the smaller burst limit, or if any other versions have similar requirements.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
---
drivers/net/ethernet/realtek/r8169.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index b47d5b3..549314f 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -77,7 +77,7 @@
static const int multicast_filter_limit = 32;
#define MAX_READ_REQUEST_SHIFT 12
-#define TX_DMA_BURST 6 /* Maximum PCI burst, '6' is 1024 */
+#define TX_DMA_BURST 7 /* Maximum PCI burst, '7' is unlimited */
#define SafeMtu 0x1c20 /* ... actually life sucks beyond ~7k */
#define InterFrameGap 0x03 /* 3 means InterFrameGap = the shortest one */
--
1.7.1
^ permalink raw reply related
* Re: netlink: hide struct module parameter in netlink_kernel_create
From: Stephen Rothwell @ 2012-09-09 23:37 UTC (permalink / raw)
To: Pablo Neira Ayuso; +Cc: David S. Miller, netdev
[-- Attachment #1: Type: text/plain, Size: 322 bytes --]
Hi all,
I didn't see the original patch until it reached linux-next this morning,
but just a comment:
THIS_MODULE is defined in linux/export.h, so that should be included in
linux/netlink.h instead of linux/module.h as it is much smaller.
--
Cheers,
Stephen Rothwell sfr@canb.auug.org.au
[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]
^ permalink raw reply
* Re: [PATCH v2] net-tcp: TCP/IP stack bypass for loopback connections
From: David Miller @ 2012-09-09 21:39 UTC (permalink / raw)
To: jengelh; +Cc: eric.dumazet, P, brutus, edumazet, netdev
In-Reply-To: <alpine.LNX.2.01.1209091954120.15847@frira.zrqbmnf.qr>
From: Jan Engelhardt <jengelh@inai.de>
Date: Sun, 9 Sep 2012 19:54:42 +0200 (CEST)
>
> On Thursday 2012-08-23 13:40, Eric Dumazet wrote:
>>On Thu, 2012-08-23 at 11:57 +0100, Pádraig Brady wrote:
>>
>>> Just to quantify the loopback testing compat issue.
>>> I often do stuff like the following to test latency.
>>> Will that be impacted?
>>>
>>> tc qdisc add dev lo root handle 1:0 netem delay 20msec
>>>
>>
>>Yes this will. At least for tcp traffic this wont "work".
>>
>>TCP friends bypass layers, by directly queuing skbs to sockets.
>>
>>-> no iptables,
>
> If it amounts to that, you will have upset users rather soon.
This is over "loopback", you're just being ridiculous. 99.9999% of
people simply do not care.
^ permalink raw reply
* Re: [PATCH] configure: Add search path for 64bit library.
From: Jan Engelhardt @ 2012-09-09 19:42 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: Ben Hutchings, Li Wei, netdev
In-Reply-To: <20120813093323.1506aa83@nehalam.linuxnetplumber.net>
On Monday 2012-08-13 18:33, Stephen Hemminger wrote:
>> > > > IPT_LIB_DIR=""
>> > > > - for dir in /lib /usr/lib /usr/local/lib
>> > > > + for dir in /lib /usr/lib /usr/local/lib /lib64 /usr/lib64 /usr/local/lib64
>> > > > do
>> > > > for file in $dir/{xtables,iptables}/lib*t_*so ; do
>> > > > if [ -f $file ]; then
>> > >
>> > > I think this should be done with pkg-config:
>> > >
>> > > pkg-config --variable=xtlibdir xtables
>> >
>> > Does every distro have pkg-config or does more logic need to be done here?
>>
>> Every distro has pkg-config; the question is whether you want to support
>> library versions that don't include a pkg-config file (xtables.pc), if
>> they exist.
>
>Let's do pkg-config first, and as a fallback keep the old code and only
>look in the same old places.
Every distro that has libxtables.so also has the .pc file.
The obvious reason to have the .pc file is to render such
error-prone static path searching redundant.
^ permalink raw reply
* Re: [PATCH v2] net-tcp: TCP/IP stack bypass for loopback connections
From: Jan Engelhardt @ 2012-09-09 17:54 UTC (permalink / raw)
To: Eric Dumazet
Cc: Pádraig Brady, Bruce "Brutus" Curtis,
David S. Miller, Eric Dumazet, netdev
In-Reply-To: <1345722015.5904.675.camel@edumazet-glaptop>
On Thursday 2012-08-23 13:40, Eric Dumazet wrote:
>On Thu, 2012-08-23 at 11:57 +0100, Pádraig Brady wrote:
>
>> Just to quantify the loopback testing compat issue.
>> I often do stuff like the following to test latency.
>> Will that be impacted?
>>
>> tc qdisc add dev lo root handle 1:0 netem delay 20msec
>>
>
>Yes this will. At least for tcp traffic this wont "work".
>
>TCP friends bypass layers, by directly queuing skbs to sockets.
>
>-> no iptables,
If it amounts to that, you will have upset users rather soon.
^ permalink raw reply
* ndo_get_stats and rtnl_netlink
From: Shlomo Pongartz @ 2012-09-09 15:23 UTC (permalink / raw)
To: netdev
Hi,
I just realized that dev_get_stats, which calls into a netdevice's
ndo_get_stats64/ndo_get_stats, can be called with or without RTNL lock
protection. If it is called from rtnl_fill_ifinfo, e.g. on invocation of
"ip link show <interface>", there IS locking. However, if it is called
from dev_seq_printf_stats, e.g. when reading the
/sys/class/net/<interface>/statistics/ entries, and in some other cases,
there is no locking.
This turned out to be problematic when implementing the ethtool
"set_channels" directive, which changes the number of **rings**: we hit a
bug where the rings data structure was changed by the ethtool flow at the
same time as a statistics call was made into the driver.
What would be the way to continue here? A per-driver lock sounds
non-generic...
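For illustration only, the per-driver lock mentioned above could look
like the following user-space model. All names here (model_priv,
model_set_channels, model_get_stats) are hypothetical and a pthread mutex
stands in for whatever primitive a real driver would use; this is the
driver-local approach the message itself calls non-generic, not a
proposed fix.

```c
#include <pthread.h>
#include <stdlib.h>

struct model_ring { unsigned long packets; };

/* Hypothetical driver-private state guarded by one lock. */
struct model_priv {
    pthread_mutex_t lock;
    unsigned int num_rings;
    struct model_ring *rings;
};

int model_init(struct model_priv *p, unsigned int n)
{
    if (n == 0)
        return -1;
    p->rings = calloc(n, sizeof(*p->rings));
    if (!p->rings)
        return -1;
    p->num_rings = n;
    return pthread_mutex_init(&p->lock, NULL) ? -1 : 0;
}

/* Models the ethtool set_channels path: replace the ring array. */
int model_set_channels(struct model_priv *p, unsigned int n)
{
    struct model_ring *fresh, *old;

    if (n == 0)
        return -1;
    fresh = calloc(n, sizeof(*fresh));
    if (!fresh)
        return -1;
    pthread_mutex_lock(&p->lock);
    old = p->rings;
    p->rings = fresh;       /* counters restart at zero in this model */
    p->num_rings = n;
    pthread_mutex_unlock(&p->lock);
    free(old);
    return 0;
}

/* Models the statistics path: never sees a half-updated ring array. */
unsigned long model_get_stats(struct model_priv *p)
{
    unsigned long total = 0;
    unsigned int i;

    pthread_mutex_lock(&p->lock);
    for (i = 0; i < p->num_rings; i++)
        total += p->rings[i].packets;
    pthread_mutex_unlock(&p->lock);
    return total;
}
```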
Regards
Shlomo Pongratz
^ permalink raw reply
* [PATCHv4] virtio-spec: virtio network device multiqueue support
From: Michael S. Tsirkin @ 2012-09-09 13:03 UTC (permalink / raw)
To: kvm, virtualization, netdev; +Cc: rick.jones2, pbonzini, levinsasha928
Add multiqueue support to virtio network device. Add a new feature flag
VIRTIO_NET_F_MULTIQUEUE for this feature, a new configuration field
max_virtqueue_pairs to detect supported number of virtqueues as well as
a new command VIRTIO_NET_CTRL_STEERING to program packet steering.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Changes from v3:
Address Sasha's comments
- drop debug fields - less fields less to debug :)
- clarify max_virtqueue_pairs field and steering param field
- misc typos
Address Paolo's comments
- Fixed old rule name left over from v2
Address Rick's comment
- Tweaked wording
Changes from v2:
Address Jason's comments on v2:
- Changed STEERING_HOST to STEERING_RX_FOLLOWS_TX:
this is both clearer and easier to support.
It does not look like we need a separate steering command
since host can just watch tx packets as they go.
- Moved RX and TX steering sections near each other.
- Add motivation for other changes in v2
Changes from Jason's rfc:
- reserved vq 3: this makes all rx vqs even and tx vqs odd, which
looks nicer to me.
- documented packet steering, added a generalized steering programming
command. Current modes are single queue and host driven multiqueue,
but I envision support for guest driven multiqueue in the future.
- make default vqs unused when in mq mode - this wastes some memory
but makes it more efficient to switch between modes as
we can avoid this causing packet reordering.
If this looks OK to everyone, we can proceed with finalizing the
implementation. This patch is against
eb9fc84d0d3c46438aaab190e2401a9e5409a052 in virtio-spec git tree.
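To make the virtqueue numbering concrete: with vq 3 reserved, the patch
below lists receiveqN at index 2N+2 and transmitqN at index 2N+3. A
minimal sketch of that mapping follows; the helper names are invented
for this example and are not part of the spec.

```c
/*
 * Hypothetical helpers mapping a multiqueue pair number n (n >= 1) to
 * its virtqueue index, following the 2N+2 / 2N+3 layout in the patch.
 * Indices 0..3 are the default receiveq, default transmitq, the control
 * vq and the reserved slot.
 */
unsigned int mq_rx_vq_index(unsigned int n)
{
    return 2 * n + 2;
}

unsigned int mq_tx_vq_index(unsigned int n)
{
    return 2 * n + 3;
}
```

For n = 1 this yields 4 and 5, for n = 2 it yields 6 and 7, so every rx
index is even and every tx index is odd, as noted in the changelog above.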
---
virtio-spec.lyx | 453 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 446 insertions(+), 7 deletions(-)
diff --git a/virtio-spec.lyx b/virtio-spec.lyx
index fb6a4e3..2c2490e 100644
--- a/virtio-spec.lyx
+++ b/virtio-spec.lyx
@@ -58,6 +58,7 @@
\html_be_strict false
\author -608949062 "Rusty Russell,,,"
\author 1531152142 "Paolo Bonzini,,,"
+\author 1986246365 "Michael S. Tsirkin"
\end_header
\begin_body
@@ -3896,6 +3897,61 @@ Only if VIRTIO_NET_F_CTRL_VQ set
\end_inset
+\change_inserted 1986246365 1346663522
+ 3: reserved
+\end_layout
+
+\begin_layout Description
+
+\change_inserted 1986246365 1346663550
+4: receiveq1.
+ 5: transmitq1.
+ 6: receiveq2.
+ 7.
+ transmitq2.
+ ...
+ 2
+\emph on
+N
+\emph default
++2:receivq
+\emph on
+N
+\emph default
+, 2
+\emph on
+N
+\emph default
++3:transmitq
+\emph on
+N
+\emph default
+
+\begin_inset Foot
+status open
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1346663558
+Only if VIRTIO_NET_F_CTRL_VQ set.
+
+\emph on
+N
+\emph default
+ is indicated by
+\emph on
+max_virtqueue_pairs
+\emph default
+ field.
+\change_unchanged
+
+\end_layout
+
+\end_inset
+
+
+\change_unchanged
+
\end_layout
\begin_layout Description
@@ -4056,6 +4112,17 @@ VIRTIO_NET_F_CTRL_VLAN
\begin_layout Description
VIRTIO_NET_F_GUEST_ANNOUNCE(21) Guest can send gratuitous packets.
+\change_inserted 1986246365 1346617842
+
+\end_layout
+
+\begin_layout Description
+
+\change_inserted 1986246365 1346618103
+VIRTIO_NET_F_MULTIQUEUE(22) Device has multiple receive and transmission
+ queues.
+\change_unchanged
+
\end_layout
\end_deeper
@@ -4068,11 +4135,45 @@ configuration
\begin_inset space ~
\end_inset
-layout Two configuration fields are currently defined.
+layout
+\change_deleted 1986246365 1346671560
+Two
+\change_inserted 1986246365 1346671647
+Six
+\change_unchanged
+ configuration fields are currently defined.
The mac address field always exists (though is only valid if VIRTIO_NET_F_MAC
is set), and the status field only exists if VIRTIO_NET_F_STATUS is set.
Two read-only bits are currently defined for the status field: VIRTIO_NET_S_LIN
K_UP and VIRTIO_NET_S_ANNOUNCE.
+
+\change_inserted 1986246365 1347194909
+ The following read-only field,
+\emph on
+max_virtqueue_pairs
+\emph default
+ only exists if VIRTIO_NET_F_MULTIQUEUE is set.
+ This field specifies the maximum number of each of transmit and receive
+ virtqueues (receiveq1..receiveq
+\emph on
+N
+\emph default
+ and transmitq1..transmitq
+\emph on
+N
+\emph default
+ respectively;
+\emph on
+N
+\emph default
+=
+\emph on
+max_virtqueue_pairs
+\emph default
+) that can be used for multiqueue operation, excluding the default receiveq(0)
+ and transmitq(1) virtqueues.
+
+\change_unchanged
\begin_inset listings
inline false
@@ -4105,6 +4206,15 @@ struct virtio_net_config {
\begin_layout Plain Layout
u16 status;
+\change_inserted 1986246365 1346671221
+
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1346671532
+
+ u16 max_virtqueue_pairs;
\end_layout
\begin_layout Plain Layout
@@ -4151,6 +4261,18 @@ physical
\begin_layout Enumerate
If the VIRTIO_NET_F_CTRL_VQ feature bit is negotiated, identify the control
virtqueue.
+\change_inserted 1986246365 1346618052
+
+\end_layout
+
+\begin_layout Enumerate
+
+\change_inserted 1986246365 1346618175
+If VIRTIO_NET_F_MULTIQUEUE feature bit is negotiated, identify the receive
+ and transmission queues that are going to be used in multiqueue mode.
+ Only queues that are going to be used need to be initialized.
+\change_unchanged
+
\end_layout
\begin_layout Enumerate
@@ -4168,7 +4290,11 @@ status
\end_layout
\begin_layout Enumerate
-The receive virtqueue should be filled with receive buffers.
+The receive virtqueue
+\change_inserted 1986246365 1346618180
+(s)
+\change_unchanged
+ should be filled with receive buffers.
This is described in detail below in
\begin_inset Quotes eld
\end_inset
@@ -4513,6 +4639,8 @@ Note that the header will be two bytes longer for the VIRTIO_NET_F_MRG_RXBUF
\end_inset
+\change_deleted 1986246365 1346932640
+
\end_layout
\begin_layout Subsection*
@@ -4988,8 +5116,24 @@ status open
The Guest needs to check VIRTIO_NET_S_ANNOUNCE bit in status field when
it notices the changes of device configuration.
The command VIRTIO_NET_CTRL_ANNOUNCE_ACK is used to indicate that driver
- has recevied the notification and device would clear the VIRTIO_NET_S_ANNOUNCE
- bit in the status filed after it received this command.
+ has rece
+\change_inserted 1986246365 1346663932
+i
+\change_unchanged
+v
+\change_deleted 1986246365 1346663934
+i
+\change_unchanged
+ed the notification and device would clear the VIRTIO_NET_S_ANNOUNCE bit
+ in the status fi
+\change_inserted 1986246365 1346663942
+e
+\change_unchanged
+l
+\change_deleted 1986246365 1346663943
+e
+\change_unchanged
+d after it received this command.
\end_layout
\begin_layout Standard
@@ -5004,10 +5148,306 @@ Sending the gratuitous packets or marking there are pending gratuitous packets
\begin_layout Enumerate
Sending VIRTIO_NET_CTRL_ANNOUNCE_ACK command through control vq.
+\change_deleted 1986246365 1346662247
+
\end_layout
-\begin_layout Enumerate
+\begin_layout Subsection*
+
+\change_inserted 1986246365 1346932658
+\begin_inset CommandInset label
+LatexCommand label
+name "sub:Transmit-Packet-Steering"
+
+\end_inset
+
+Transmit Packet Steering
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1346932658
+When VIRTIO_NET_F_MULTIQUEUE feature bit is negotiated, guest can use any
+ of multiple configured transmit queues to transmit a given packet.
+ To avoid packet reordering by device (which generally leads to performance
+ degradation) driver should attempt to utilize the same transmit virtqueue
+ for all packets of a given transmit flow.
+ For bi-directional protocols (in practice, TCP), a given network connection
+ can utilize both transmit and receive queues.
+ For best performance, packets from a single connection should utilize the
+ paired transmit and receive queues from the same virtqueue pair; for example
+ both transmitqN and receiveqN.
+ This rule makes it possible to optimize processing on the device side,
+ but this is not a hard requirement: devices should function correctly even
+ when this rule is not followed.
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1346932658
+Driver selects an active steering rule using VIRTIO_NET_CTRL_STEERING command
+ (this controls both which virtqueue is selected for a given packet for
+ receive and notifies the device which virtqueues are about to be used for
+ transmit).
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1346932658
+This command accepts a single out argument in the following format:
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1346932658
+\begin_inset listings
+inline false
+status open
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1347192845
+
+#define VIRTIO_NET_CTRL_STEERING 4
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1346932658
+
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1346932658
+
+struct virtio_net_ctrl_steering {
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1346932658
+
+ u8 current_steering_rule;
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1346932658
+
+ u8 reserved;
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1346932658
+
+ u16 current_steering_param;
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1346932658
+
+};
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1347192841
+
+#define VIRTIO_NET_CTRL_STEERING_SINGLE 0
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1347192840
+
+#define VIRTIO_NET_CTRL_STEERING_RX_FOLLOWS_TX 1
+\end_layout
+
+\end_inset
+
+
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1347193028
+The field
+\emph on
+rule
+\emph default
+ specifies the function used to select transmit virtqueue for a given packet;
+ the field
+\emph on
+param
+\emph default
+ makes it possible to pass an extra parameter if appropriate.
+ When
+\emph on
+rule
+\emph default
+ is set to VIRTIO_NET_CTRL_STEERING_SINGLE (this is the default) all packets
+ are steered to the default virtqueue transmitq (1); param is unused; this
+ is the default.
+ With any other rule, When
+\emph on
+rule
+\emph default
+ is set to VIRTIO_NET_CTRL_STEERING_RX_FOLLOWS_TX packets are steered by
+ driver to the first
+\emph on
+N
+\emph default
+=(
+\emph on
+param
+\emph default
++1) multiqueue virtqueues transmitq1...transmitqN; the default transmitq is
+ unused.
+ Driver must have configured all these (
+\emph on
+param
+\emph default
++1) virtqueues beforehand.
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1347193114
+Supported steering rules can be added and removed in the future.
+ Driver should check that the request to change the steering rule was successful
+ by checking ack values of the command.
+ As selecting a specific steering is an optimization feature, drivers should
+ avoid hard failure and fall back on using a supported steering rule if
+ this command fails.
+ The default steering rule is VIRTIO_NET_CTRL_STEERING_SINGLE.
+ It will not be removed.
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1346932658
+When the steering rule is modified, some packets can still be outstanding
+ in one or more of the transmit virtqueues.
+ Since drivers might choose to modify the current steering rule at a high
+ rate (e.g.
+ adaptively in response to changes in the workload) to avoid reordering
+ packets, device is recommended to complete processing of the transmit queue(s)
+ utilized by the original steering before processing any packets delivered
+ by the modified steering rule.
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1346932658
+For debugging, the current steering rule can also be read from the configuration
+ space.
+\end_layout
+
+\begin_layout Subsection*
+
+\change_inserted 1986246365 1346670357
+\begin_inset CommandInset label
+LatexCommand label
+name "sub:Receive-Packet-Steering"
+
+\end_inset
+
+Receive Packet Steering
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1346671046
+When VIRTIO_NET_F_MULTIQUEUE feature bit is negotiated, device can use any
+ of multiple configured receive queues to pass a given packet to driver.
+ Driver controls which virtqueue is selected in practice by configuring
+ packet steering rule using VIRTIO_NET_CTRL_STEERING command, as described
+ above
+\begin_inset CommandInset ref
+LatexCommand ref
+reference "sub:Transmit-Packet-Steering"
+
+\end_inset
+
.
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1347193175
+The field
+\emph on
+rule
+\emph default
+ specifies the function used to select receive virtqueue for a given packet;
+ the field
+\emph on
+param
+\emph default
+ makes it possible to pass an extra parameter if appropriate.
+ When
+\emph on
+rule
+\emph default
+ is set to VIRTIO_NET_CTRL_STEERING_SINGLE all packets are steered to the
+ default virtqueue receiveq (0); param is unused; this is the default.
+ When
+\emph on
+rule
+\emph default
+ is set to VIRTIO_NET_CTRL_STEERING_RX_FOLLOWS_TX packets are steered by
+ host to the first
+\emph on
+N
+\emph default
+=(
+\emph on
+param
+\emph default
++1) multiqueue virtqueues receiveq1...receiveqN; the default receiveq is unused.
+ Driver must have configured all these (
+\emph on
+param
+\emph default
++1) virtqueues beforehand.
+ For best performance for bi-directional flows (such as TCP) device should
+ detect the flow to virtqueue pair mapping on transmit and select the receive
+ virtqueue from the same virtqueue pair.
+ For uni-directional flows, or when this mapping information is missing,
+ a device-specific steering function is used.
+\change_unchanged
+
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1346669564
+Supported steering rules can be added and removed in the future.
+ Driver should probe for supported rules by checking ack values of the command.
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1346932135
+When the steering rule is modified, some packets can still be outstanding
+ in one or more of the virtqueues.
+ Device is not required to wait for these packets to be consumed before
+ delivering packets using the new streering rule.
+ Drivers modifying the steering rule at a high rate (e.g.
+ adaptively in response to changes in the workload) are recommended to complete
+ processing of the receive queue(s) utilized by the original steering before
+ processing any packets delivered by the modified steering rule.
+\end_layout
+
+\begin_layout Standard
+
+\change_deleted 1986246365 1346664095
+.
+
+\change_unchanged
\end_layout
@@ -5973,8 +6413,7 @@ If the VIRTIO_CONSOLE_F_MULTIPORT feature is negotiated, the driver can
spawn multiple ports, not all of which may be attached to a console.
Some could be generic ports.
In this case, the control virtqueues are enabled and according to the max_nr_po
-rts configuration-space value, an appropriate number of virtqueues are
- created.
+rts configuration-space value, an appropriate number of virtqueues are created.
A control message indicating the driver is ready is sent to the host.
The host can then send control messages for adding new ports to the device.
After creating and initializing each port, a VIRTIO_CONSOLE_PORT_READY
--
MST
^ permalink raw reply related
* Re: [PATCHv3] virtio-spec: virtio network device multiqueue support
From: Michael S. Tsirkin @ 2012-09-09 12:40 UTC (permalink / raw)
To: Sasha Levin; +Cc: netdev, kvm, virtualization
In-Reply-To: <50493D04.1090408@gmail.com>
On Fri, Sep 07, 2012 at 02:17:08AM +0200, Sasha Levin wrote:
> Hi Michael,
>
> On 09/06/2012 02:08 PM, Michael S. Tsirkin wrote:
> > Add multiqueue support to virtio network device.
> > Add a new feature flag VIRTIO_NET_F_MULTIQUEUE for this feature, a new
> > configuration field max_virtqueue_pairs to detect supported number of
> > virtqueues as well as a new command VIRTIO_NET_CTRL_STEERING to program
> > packet steering.
> >
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>
> Some comments about the change:
>
> - "The following four read-only fields only exists if VIRTIO_NET_F_MULTIQUEUE
> is set." => Should be "exist" (I think).
>
> - "When rule is set to VIRTIO_NET_CTRL_STEERING_RX_FOLLOWS_TX packets are
> steered by driver to the first (param+1) multiqueue virtqueues
> transmitq1...transmitqN;" - Why param+1? I thought we ignore the default
> transmit/receive in this case.
>
> - "As selecting a specific steering ais n optimization feature" - "is an".
>
> - It's mentioned several times that the ability to read the steering rule from
> the virtio-net config is there for debug reasons. Is it really necessary? I
> think it's the first time I see debug features go in as part of the spec.
Yes, less features -> less stuff to debug. I'll drop it.
> - I'm slightly confused, why are there both receive and transmit steering? I
> can't find a difference in the way to configure the rule for transmit and
> receive.
This paragraph is there to address this:
Driver selects an active steering rule using VIRTIO_NET_CTRL_STEERING
command (this controls both which virtqueue is selected for a given
packet for receive and notifies the device which virtqueues are about to
be used for transmit).
How can I clarify this better?
> Is it a plan for the future to allow different rules for tx and rx? If
> so, shouldn't we use different ctrl commands (
> VIRTIO_NET_CTRL_TX_STEERING/VIRTIO_NET_CTRL_RX_STEERING)?
I don't see separate steering as very useful:
it does not work for RX follows TX or for TX follows
RX, and separate commands immediately create lots of
options with behaviour which is hard to define.
For example, if you configure SINGLE on TX but RX_FOLLOWS_TX
on RX, what does it mean?
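A rough sketch of how a driver might fill the single steering command
for N virtqueue pairs, given that N = param + 1. The struct layout and
constants are copied from the v4 spec patch posted in this thread (with
u8/u16 written as uint8_t/uint16_t); the helper
steering_rx_follows_tx() and its error convention are assumptions made
for this example, and byte order and the control-vq transport are not
modelled.

```c
#include <stdint.h>

#define VIRTIO_NET_CTRL_STEERING               4
#define VIRTIO_NET_CTRL_STEERING_SINGLE        0
#define VIRTIO_NET_CTRL_STEERING_RX_FOLLOWS_TX 1

struct virtio_net_ctrl_steering {
    uint8_t  current_steering_rule;
    uint8_t  reserved;
    uint16_t current_steering_param;
};

/*
 * Hypothetical helper: request RX_FOLLOWS_TX over the first `pairs`
 * multiqueue pairs. Returns 0 on success, -1 if `pairs` is zero or
 * exceeds the device's max_virtqueue_pairs.
 */
int steering_rx_follows_tx(struct virtio_net_ctrl_steering *cmd,
                           uint16_t pairs, uint16_t max_virtqueue_pairs)
{
    if (pairs == 0 || pairs > max_virtqueue_pairs)
        return -1;
    cmd->current_steering_rule = VIRTIO_NET_CTRL_STEERING_RX_FOLLOWS_TX;
    cmd->reserved = 0;
    cmd->current_steering_param = (uint16_t)(pairs - 1); /* N = param + 1 */
    return 0;
}
```

One command thus covers both directions: the same rule and param tell
the device which receive queues to use and which transmit queues the
driver is about to use.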
> - "When rule is set to VIRTIO_NET_CTRL_STEERING_SINGLE all packets are steered
> to the default virtqueue receveq (0);" - "receiveq (0)"
>
>
>
> Thanks,
> Sasha
^ permalink raw reply
* [PATCH net-next] etherdevice: introduce help function eth_zero_addr()
From: Duan Jiong @ 2012-09-09 2:32 UTC (permalink / raw)
To: davem; +Cc: netdev
A lot of code has either the memset or an inefficient copy
from a static array that contains the all-zeros Ethernet address.
Introduce helper function eth_zero_addr() to fill an address with
all zeros, making the code clearer and allowing us to get rid of
some constant arrays.
Signed-off-by: Duan Jiong <djduanjiong@gmail.com>
---
include/linux/etherdevice.h | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/include/linux/etherdevice.h b/include/linux/etherdevice.h
index d426336..b006ba0 100644
--- a/include/linux/etherdevice.h
+++ b/include/linux/etherdevice.h
@@ -151,6 +151,17 @@ static inline void eth_broadcast_addr(u8 *addr)
}
/**
+ * eth_zero_addr - Assign zero address
+ * @addr: Pointer to a six-byte array containing the Ethernet address
+ *
+ * Assign the zero address to the given address array.
+ */
+static inline void eth_zero_addr(u8 *addr)
+{
+ memset(addr, 0x00, ETH_ALEN);
+}
+
+/**
* eth_hw_addr_random - Generate software assigned random Ethernet and
* set device flag
* @dev: pointer to net_device structure
--
1.7.11.4
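As an illustration of what callers gain, here is a user-space replica of
the helper next to the static-array pattern the commit message mentions.
The names clear_by_copy() and is_all_zero() are made up for this sketch,
and uint8_t stands in for the kernel's u8.

```c
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6 /* octets in an Ethernet address, as in the kernel */

/* The older pattern: copy from a static all-zeros array. */
void clear_by_copy(uint8_t *addr)
{
    static const uint8_t zero_addr[ETH_ALEN]; /* zero-initialized */

    memcpy(addr, zero_addr, ETH_ALEN);
}

/* User-space replica of the helper added by the patch. */
void eth_zero_addr(uint8_t *addr)
{
    memset(addr, 0x00, ETH_ALEN);
}

/* Hypothetical check used only in this illustration. */
int is_all_zero(const uint8_t *addr)
{
    int i;

    for (i = 0; i < ETH_ALEN; i++)
        if (addr[i] != 0)
            return 0;
    return 1;
}
```

Both functions leave the same six zero bytes behind; the helper simply
removes the need for each caller to keep its own constant array.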
^ permalink raw reply related
* Re: [PATCH 0/2] [v3] netlink_kernel_create updates
From: David Miller @ 2012-09-08 23:16 UTC (permalink / raw)
To: pablo; +Cc: netdev
In-Reply-To: <1347108834-15429-1-git-send-email-pablo@netfilter.org>
From: pablo@netfilter.org
Date: Sat, 8 Sep 2012 14:53:52 +0200
> Fixed the infiniband issue. New round of these patches.
...
> Pablo Neira Ayuso (2):
> netlink: kill netlink_set_nonroot
> netlink: hide struct module parameter in netlink_kernel_create
All applied to net-next, thanks.
^ permalink raw reply
* Re: [PATCH] scsi_netlink: Remove dead and buggy code
From: David Miller @ 2012-09-08 22:51 UTC (permalink / raw)
To: ebiederm; +Cc: netdev, James.Bottomley, James.Smart
In-Reply-To: <87pq5xjw4m.fsf@xmission.com>
From: ebiederm@xmission.com (Eric W. Biederman)
Date: Fri, 07 Sep 2012 15:39:21 -0700
>
> The scsi netlink code confuses the netlink port id with a process id,
> going so far as to read NETLINK_CREDS(skb)->pid instead of the correct
> NETLINK_CB(skb).pid. Fortunately it does not matter because nothing
> registers to respond to scsi netlink requests.
>
> The only interesting use of the scsi_netlink interface is
> fc_host_post_vendor_event which sends a netlink multicast message.
>
> Since nothing registers to handle scsi netlink messages kill all of the
> registration logic, while retaining the same error handling behavior
> preserving the userspace visible behavior and removing all of the
> confused code that thought a netlink port id was a process id.
>
> This was tested with a kernel allyesconfig build which had no problems.
>
> Cc: James Bottomley <James.Bottomley@parallels.com>
> Cc: James Smart <James.Smart@Emulex.Com>
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Yeah I can't see anyone, anywhere, using these scsi_send_nl_*()
interfaces at all.
When I get an ACK from the scsi folks I'll add this to net-next,
thanks Eric.
^ permalink raw reply
* Re: [PATCH] net: small bug on rxhash calculation
From: David Miller @ 2012-09-08 22:43 UTC (permalink / raw)
To: eric.dumazet; +Cc: chema, edumazet, netdev, chema
In-Reply-To: <1347092320.1234.335.camel@edumazet-glaptop>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Sat, 08 Sep 2012 10:18:40 +0200
> On Fri, 2012-09-07 at 16:40 -0700, Chema Gonzalez wrote:
>> In the current rxhash calculation function, while the
>> sorting of the ports/addrs is coherent (you get the
>> same rxhash for packets sharing the same 4-tuple, in
>> both directions), ports and addrs are sorted
>> independently. This implies packets from a connection
>> between the same addresses but crossed ports hash to
>> the same rxhash.
>>
>> For example, traffic between A=S:l and B=L:s is hashed
>> (in both directions) from {L, S, {s, l}}. The same
>> rxhash is obtained for packets between C=S:s and D=L:l.
>>
>> This patch ensures that you either swap both addrs and ports,
>> or you swap none. Traffic between A and B, and traffic
>> between C and D, get their rxhash from different sources
>> ({L, S, {l, s}} for A<->B, and {L, S, {s, l}} for C<->D)
>>
>> The patch is co-written with Eric Dumazet <edumazet@google.com>
>>
>> Signed-off-by: Chema Gonzalez <chema@google.com>
>> ---
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
Applied and queued up for -stable, thanks.
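The collision described in the quoted patch can be reproduced with a
small model. This is a sketch, not the kernel code: struct flow, the two
canon_*() functions and the ordering convention (smaller value first)
are assumptions chosen for the example.

```c
#include <stdint.h>

struct flow {
    uint32_t src, dst;     /* addresses */
    uint16_t sport, dport; /* ports */
};

/* Old behaviour in this model: addresses and ports sorted independently. */
struct flow canon_independent(struct flow f)
{
    if (f.dst < f.src) {
        uint32_t t = f.src; f.src = f.dst; f.dst = t;
    }
    if (f.dport < f.sport) {
        uint16_t t = f.sport; f.sport = f.dport; f.dport = t;
    }
    return f;
}

/* Fixed behaviour in this model: swap addresses and ports together or not at all. */
struct flow canon_together(struct flow f)
{
    if (f.dst < f.src || (f.dst == f.src && f.dport < f.sport)) {
        uint32_t ta = f.src; f.src = f.dst; f.dst = ta;
        uint16_t tp = f.sport; f.sport = f.dport; f.dport = tp;
    }
    return f;
}

int flow_equal(struct flow a, struct flow b)
{
    return a.src == b.src && a.dst == b.dst &&
           a.sport == b.sport && a.dport == b.dport;
}
```

Taking addresses S=1, L=2 and ports s=10, l=20, the flows S:l -> L:s and
S:s -> L:l reduce to the same tuple under canon_independent() but to
different tuples under canon_together(), while both directions of one
connection still reduce to the same tuple.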
^ permalink raw reply
* Re: [PATCH net-next] filter: add MOD operation
From: George Bakos @ 2012-09-08 20:31 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev, Jay Schulist, Andi Kleen, tcpdump-workers
In-Reply-To: <1347091415.1234.317.camel@edumazet-glaptop>
[-- Attachment #1: Type: text/plain, Size: 4516 bytes --]
Here's a patch to libpcap-1.3 to test against. I still need to
include changes to man pages.
g
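For anyone testing, the semantics of the two MOD variants in the quoted
kernel patch below can be modelled in user space roughly as follows. The
function names are hypothetical; only the zero checks and the %=
operation mirror the quoted code.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Models BPF_S_ALU_MOD_X at run time: A %= X, except that X == 0 ends
 * the filter (the interpreter returns 0). Here that case is reported
 * as -1 and A is left untouched.
 */
int model_alu_mod_x(uint32_t *a, uint32_t x)
{
    if (x == 0)
        return -1;
    *a %= x;
    return 0;
}

/*
 * Models the load-time check for BPF_S_ALU_MOD_K: a constant divisor
 * of zero is rejected with -EINVAL before the filter ever runs.
 */
int model_check_mod_k(uint32_t k)
{
    return k == 0 ? -EINVAL : 0;
}
```

The asymmetry is deliberate in the quoted patch: K is known when the
filter is attached, so it can be validated once, while X is only known
per packet and has to be checked in the interpreter loop.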
On Sat, 08 Sep 2012 10:03:35 +0200
Eric Dumazet <eric.dumazet@gmail.com> wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> On Fri, 2012-09-07 at 20:03 -0700, Andi Kleen wrote:
> > On Fri, Sep 07, 2012 at 07:49:10AM +0000, George Bakos wrote:
> > > Gents,
> > > Any fundamental reason why the following (, etc.) shouldn't be
> > > included in net/core/filter.c?
> > >
> > > case BPF_S_ALU_MOD_X:
> > > if (X == 0)
> > > return 0;
> > > A %= X;
> > > continue;
> >
> > Copying netdev.
> >
> > In principle no reason against it, but you may need to update
> > the various BPF JITs too that Linux now has too.
>
> Hi Andi, thanks for the forward
>
> In recent commit ffe06c17afbb was added ALU_XOR_X,
> so we could add ALU_MOD_X as well.
>
> ALU_MOD_K is a bit more complex as we cant use an ancillary, and must
> instead use a new BPF_OP code :
>
> /* alu/jmp fields */
> #define BPF_OP(code) ((code) & 0xf0)
> #define BPF_ADD 0x00
> #define BPF_SUB 0x10
> #define BPF_MUL 0x20
> #define BPF_DIV 0x30
> #define BPF_OR 0x40
> #define BPF_AND 0x50
> #define BPF_LSH 0x60
> #define BPF_RSH 0x70
> #define BPF_NEG 0x80
>
> So I guess we could use
>
> #define BPF_MOD 0x90
>
> About the various arches JIT, there is no hurry :
> We can update them later.
>
> At JIT 'compile' time, if we find a not yet handled instruction, we fall
> back to the net/core/filter.c interpreter.
>
> If the following patch is accepted, I'll do the x86 part as a followup.
>
> Thanks !
>
> [PATCH net-next] filter: add MOD operation
>
> Add a new ALU opcode, to compute a modulus.
>
> Commit ffe06c17afbbb used an ancillary to implement XOR_X,
> but here we reserve one of the available ALU opcode to implement both
> MOD_X and MOD_K
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Suggested-by: George Bakos <gbakos@alpinista.org>
> Cc: Jay Schulist <jschlst@samba.org>
> Cc: Jiri Pirko <jpirko@redhat.com>
> Cc: Andi Kleen <ak@linux.intel.com>
> ---
> include/linux/filter.h | 4 ++++
> net/core/filter.c | 15 +++++++++++++++
> 2 files changed, 19 insertions(+)
>
> diff --git a/include/linux/filter.h b/include/linux/filter.h
> index 82b0135..3cf5fd5 100644
> --- a/include/linux/filter.h
> +++ b/include/linux/filter.h
> @@ -74,6 +74,8 @@ struct sock_fprog { /* Required for SO_ATTACH_FILTER. */
> #define BPF_LSH 0x60
> #define BPF_RSH 0x70
> #define BPF_NEG 0x80
> +#define BPF_MOD 0x90
> +
> #define BPF_JA 0x00
> #define BPF_JEQ 0x10
> #define BPF_JGT 0x20
> @@ -196,6 +198,8 @@ enum {
> BPF_S_ALU_MUL_K,
> BPF_S_ALU_MUL_X,
> BPF_S_ALU_DIV_X,
> + BPF_S_ALU_MOD_K,
> + BPF_S_ALU_MOD_X,
> BPF_S_ALU_AND_K,
> BPF_S_ALU_AND_X,
> BPF_S_ALU_OR_K,
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 907efd2..fbe3a8d 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -167,6 +167,14 @@ unsigned int sk_run_filter(const struct sk_buff *skb,
> case BPF_S_ALU_DIV_K:
> A = reciprocal_divide(A, K);
> continue;
> + case BPF_S_ALU_MOD_X:
> + if (X == 0)
> + return 0;
> + A %= X;
> + continue;
> + case BPF_S_ALU_MOD_K:
> + A %= K;
> + continue;
> case BPF_S_ALU_AND_X:
> A &= X;
> continue;
> @@ -469,6 +477,8 @@ int sk_chk_filter(struct sock_filter *filter, unsigned int flen)
> [BPF_ALU|BPF_MUL|BPF_K] = BPF_S_ALU_MUL_K,
> [BPF_ALU|BPF_MUL|BPF_X] = BPF_S_ALU_MUL_X,
> [BPF_ALU|BPF_DIV|BPF_X] = BPF_S_ALU_DIV_X,
> + [BPF_ALU|BPF_MOD|BPF_K] = BPF_S_ALU_MOD_K,
> + [BPF_ALU|BPF_MOD|BPF_X] = BPF_S_ALU_MOD_X,
> [BPF_ALU|BPF_AND|BPF_K] = BPF_S_ALU_AND_K,
> [BPF_ALU|BPF_AND|BPF_X] = BPF_S_ALU_AND_X,
> [BPF_ALU|BPF_OR|BPF_K] = BPF_S_ALU_OR_K,
> @@ -531,6 +541,11 @@ int sk_chk_filter(struct sock_filter *filter, unsigned int flen)
> return -EINVAL;
> ftest->k = reciprocal_value(ftest->k);
> break;
> + case BPF_S_ALU_MOD_K:
> + /* check for division by zero */
> + if (ftest->k == 0)
> + return -EINVAL;
> + break;
> case BPF_S_LD_MEM:
> case BPF_S_LDX_MEM:
> case BPF_S_ST:
>
>
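For anyone testing against this, here is a rough standalone sketch of the
semantics described in the quoted patch. It is not the kernel code. The
helper names (mod_insn_is_valid, mod_step, mod_demo) are made up for
illustration, and the BPF_CLASS, BPF_ALU, BPF_SRC, BPF_K and BPF_X values
are assumed from the existing filter headers rather than taken from this
thread. Only BPF_OP and BPF_MOD are as quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed from the existing classic BPF headers, not from this thread. */
#define BPF_CLASS(code) ((code) & 0x07)
#define BPF_ALU         0x04
#define BPF_SRC(code)   ((code) & 0x08)
#define BPF_K           0x00
#define BPF_X           0x08

/* As quoted in the thread; BPF_MOD is the newly proposed opcode. */
#define BPF_OP(code)    ((code) & 0xf0)
#define BPF_MOD         0x90

/*
 * Hypothetical load-time check: a MOD by the constant 0 is rejected
 * before the filter ever runs. Returns 1 if acceptable, 0 if not.
 */
static int mod_insn_is_valid(uint16_t code, uint32_t k)
{
	if (BPF_CLASS(code) != BPF_ALU || BPF_OP(code) != BPF_MOD)
		return 1;	/* not a MOD instruction, nothing to check */
	return !(BPF_SRC(code) == BPF_K && k == 0);
}

/*
 * Hypothetical run-time step: X is only known while the filter runs,
 * so X == 0 has to be caught here. Returns 0 when the filter should
 * bail out (drop the packet), 1 when A was updated.
 */
static int mod_step(uint16_t code, uint32_t *a, uint32_t x, uint32_t k)
{
	if (BPF_SRC(code) == BPF_X) {
		if (x == 0)
			return 0;
		*a %= x;
	} else {
		*a %= k;	/* k != 0 was enforced at load time */
	}
	return 1;
}

/* Small usage example exercising both operand forms. */
void mod_demo(void)
{
	uint32_t a = 17;

	assert(!mod_insn_is_valid(BPF_ALU | BPF_MOD | BPF_K, 0));
	assert(mod_step(BPF_ALU | BPF_MOD | BPF_K, &a, 0, 5) && a == 2);
	assert(!mod_step(BPF_ALU | BPF_MOD | BPF_X, &a, 0, 5));
}
```

The split matters because a constant divisor can be validated once when
the filter is attached, while a divisor held in X can only be checked per
packet.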
--
[-- Attachment #2: libpcap-1.3.0-with-modulus.patch --]
[-- Type: text/x-patch, Size: 4452 bytes --]
diff -Naur libpcap-1.3.0/bpf/net/bpf_filter.c libpcap-1.3.0-with-modulus/bpf/net/bpf_filter.c
--- libpcap-1.3.0/bpf/net/bpf_filter.c 2012-03-29 12:57:32.000000000 +0000
+++ libpcap-1.3.0-with-modulus/bpf/net/bpf_filter.c 2012-08-31 01:36:53.206825554 +0000
@@ -469,6 +469,12 @@
A /= X;
continue;
+ case BPF_ALU|BPF_MOD|BPF_X:
+ if (X == 0)
+ return 0;
+ A %= X;
+ continue;
+
case BPF_ALU|BPF_AND|BPF_X:
A &= X;
continue;
@@ -501,6 +507,10 @@
A /= pc->k;
continue;
+ case BPF_ALU|BPF_MOD|BPF_K:
+ A %= pc->k;
+ continue;
+
case BPF_ALU|BPF_AND|BPF_K:
A &= pc->k;
continue;
@@ -621,6 +631,13 @@
*/
if (BPF_SRC(p->code) == BPF_K && p->k == 0)
return 0;
+ break;
+ case BPF_MOD:
+ /*
+ * Check for illegal modulus 0.
+ */
+ if (BPF_SRC(p->code) == BPF_K && p->k == 0)
+ return 0;
break;
default:
return 0;
diff -Naur libpcap-1.3.0/bpf_image.c libpcap-1.3.0-with-modulus/bpf_image.c
--- libpcap-1.3.0/bpf_image.c 2012-03-29 12:57:32.000000000 +0000
+++ libpcap-1.3.0-with-modulus/bpf_image.c 2012-08-31 01:36:53.225825770 +0000
@@ -216,6 +216,11 @@
fmt = "x";
break;
+ case BPF_ALU|BPF_MOD|BPF_X:
+ op = "mod";
+ fmt = "x";
+ break;
+
case BPF_ALU|BPF_AND|BPF_X:
op = "and";
fmt = "x";
@@ -256,6 +261,11 @@
fmt = "#%d";
break;
+ case BPF_ALU|BPF_MOD|BPF_K:
+ op = "mod";
+ fmt = "#%d";
+ break;
+
case BPF_ALU|BPF_AND|BPF_K:
op = "and";
fmt = "#0x%x";
diff -Naur libpcap-1.3.0/grammar.y libpcap-1.3.0-with-modulus/grammar.y
--- libpcap-1.3.0/grammar.y 2012-03-29 12:57:32.000000000 +0000
+++ libpcap-1.3.0-with-modulus/grammar.y 2012-08-31 01:36:53.196825439 +0000
@@ -617,6 +617,7 @@
| arth '*' arth { $$ = gen_arth(BPF_MUL, $1, $3); }
| arth '/' arth { $$ = gen_arth(BPF_DIV, $1, $3); }
| arth '&' arth { $$ = gen_arth(BPF_AND, $1, $3); }
+ | arth '%' arth { $$ = gen_arth(BPF_MOD, $1, $3); }
| arth '|' arth { $$ = gen_arth(BPF_OR, $1, $3); }
| arth LSH arth { $$ = gen_arth(BPF_LSH, $1, $3); }
| arth RSH arth { $$ = gen_arth(BPF_RSH, $1, $3); }
diff -Naur libpcap-1.3.0/optimize.c libpcap-1.3.0-with-modulus/optimize.c
--- libpcap-1.3.0/optimize.c 2012-03-29 12:57:32.000000000 +0000
+++ libpcap-1.3.0-with-modulus/optimize.c 2012-08-31 01:36:53.188825347 +0000
@@ -666,6 +666,12 @@
a /= b;
break;
+ case BPF_MOD:
+ if (b == 0)
+ bpf_error("illegal modulus 0");
+ a %= b;
+ break;
+
case BPF_AND:
a &= b;
break;
@@ -1044,6 +1050,7 @@
case BPF_ALU|BPF_SUB|BPF_K:
case BPF_ALU|BPF_MUL|BPF_K:
case BPF_ALU|BPF_DIV|BPF_K:
+ case BPF_ALU|BPF_MOD|BPF_K:
case BPF_ALU|BPF_AND|BPF_K:
case BPF_ALU|BPF_OR|BPF_K:
case BPF_ALU|BPF_LSH|BPF_K:
@@ -1079,6 +1086,7 @@
case BPF_ALU|BPF_SUB|BPF_X:
case BPF_ALU|BPF_MUL|BPF_X:
case BPF_ALU|BPF_DIV|BPF_X:
+ case BPF_ALU|BPF_MOD|BPF_X:
case BPF_ALU|BPF_AND|BPF_X:
case BPF_ALU|BPF_OR|BPF_X:
case BPF_ALU|BPF_LSH|BPF_X:
@@ -1112,7 +1120,7 @@
vstore(s, &val[A_ATOM], val[X_ATOM], alter);
break;
}
- else if (op == BPF_MUL || op == BPF_DIV ||
+ else if (op == BPF_MUL || op == BPF_DIV || op == BPF_MOD ||
op == BPF_AND || op == BPF_LSH || op == BPF_RSH) {
s->code = BPF_LD|BPF_IMM;
s->k = 0;
diff -Naur libpcap-1.3.0/pcap/bpf.h libpcap-1.3.0-with-modulus/pcap/bpf.h
--- libpcap-1.3.0/pcap/bpf.h 2012-06-12 16:55:36.000000000 +0000
+++ libpcap-1.3.0-with-modulus/pcap/bpf.h 2012-08-31 01:36:53.199825471 +0000
@@ -1235,6 +1235,7 @@
#define BPF_LSH 0x60
#define BPF_RSH 0x70
#define BPF_NEG 0x80
+#define BPF_MOD 0x90
#define BPF_JA 0x00
#define BPF_JEQ 0x10
#define BPF_JGT 0x20
diff -Naur libpcap-1.3.0/scanner.l libpcap-1.3.0-with-modulus/scanner.l
--- libpcap-1.3.0/scanner.l 2012-03-29 12:57:32.000000000 +0000
+++ libpcap-1.3.0-with-modulus/scanner.l 2012-08-31 01:36:53.225825770 +0000
@@ -329,7 +329,7 @@
sls return SLS;
[ \r\n\t] ;
-[+\-*/:\[\]!<>()&|=] return yytext[0];
+[+\-*/:\[\]!<>()&|=%] return yytext[0];
">=" return GEQ;
"<=" return LEQ;
"!=" return NEQ;
@@ -387,7 +387,7 @@
[A-Za-z0-9]([-_.A-Za-z0-9]*[.A-Za-z0-9])? {
yylval.s = sdup((char *)yytext); return ID; }
"\\"[^ !()\n\t]+ { yylval.s = sdup((char *)yytext + 1); return ID; }
-[^ \[\]\t\n\-_.A-Za-z0-9!<>()&|=]+ {
+[^ \[\]\t\n\-_.A-Za-z0-9!<>()&|=%]+ {
bpf_error("illegal token: %s", yytext); }
. { bpf_error("illegal char '%c'", *yytext); }
%%
* Re: [PATCH net-next] netfilter: x_tables: xt_init() should run earlier
From: Eric Dumazet @ 2012-09-08 19:50 UTC (permalink / raw)
To: Patrick McHardy
Cc: Cong Wang, Pablo Neira Ayuso, netfilter-devel,
Linux Kernel Network Developers
In-Reply-To: <Pine.GSO.4.63.1209081949500.2030@stinky-local.trash.net>
On Sat, 2012-09-08 at 19:50 +0200, Patrick McHardy wrote:
> Shouldn't we simply change the Makefile order?
Yes, this is what Pablo did.