* Re: [PATCH] virtio_net: free transmit skbs in a timer
[not found] <1209565906-9019-1-git-send-email-markmc@redhat.com>
@ 2008-05-02 10:55 ` Rusty Russell
[not found] ` <200805022055.25870.rusty@rustcorp.com.au>
1 sibling, 0 replies; 16+ messages in thread
From: Rusty Russell @ 2008-05-02 10:55 UTC (permalink / raw)
To: Mark McLoughlin; +Cc: Anthony Liguori, linux-kernel, virtualization
On Thursday 01 May 2008 00:31:46 Mark McLoughlin wrote:
> virtio_net currently only frees old transmit skbs just
> before queueing new ones. If the queue is full, it then
> enables interrupts and waits for notification that more
> work has been performed.
Hi Mark,
This patch is fine, but it's better to do it from skb_xmit_done(). Of
course, this is usually called from an interrupt handler, so it's not
entirely trivial: we can't free the skbs there.
A softirq is probably the answer here, but AFAICT that's old fashioned.
Not sure what the right way of doing this is now...
Rusty.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] virtio_net: free transmit skbs in a timer
[not found] ` <200805022055.25870.rusty@rustcorp.com.au>
@ 2008-05-12 20:37 ` Mark McLoughlin
[not found] ` <1210624663.14409.20.camel@muff>
1 sibling, 0 replies; 16+ messages in thread
From: Mark McLoughlin @ 2008-05-12 20:37 UTC (permalink / raw)
To: Rusty Russell; +Cc: Anthony Liguori, linux-kernel, virtualization
Hi Rusty,
Sorry 'bout the lag ...
On Fri, 2008-05-02 at 20:55 +1000, Rusty Russell wrote:
> On Thursday 01 May 2008 00:31:46 Mark McLoughlin wrote:
> > virtio_net currently only frees old transmit skbs just
> > before queueing new ones. If the queue is full, it then
> > enables interrupts and waits for notification that more
> > work has been performed.
>
> Hi Mark,
>
> This patch is fine, but it's better to do it from skb_xmit_done().
Unless I'm missing something, we only get this callback when we've
stopped the queue and we're waiting for buffers to be freed up.
In the normal case, where the callback is disabled, we don't get any
notification that the host has finished with the buffer ... hence the
need for a timer.
2.6.25-rc2 rebase below.
> Of
> course, this is usually called from an interrupt handler, so it's not
> entirely trivial: we can't free the skbs there.
>
> A softirq is probably the answer here, but AFAICT that's old fashioned.
> Not sure what the right way of doing this is now...
Thanks,
Mark.
Subject: [PATCH] virtio_net: free transmit skbs in a timer
virtio_net currently only frees old transmit skbs just
before queueing new ones. If the queue is full, it then
enables interrupts and waits for notification that more
work has been performed.
However, a side-effect of this scheme is that there are
always xmit skbs left dangling when no new packets are
sent, against the Documentation/networking/driver.txt
guideline:
"... it is not allowed for your TX mitigation scheme
to let TX packets "hang out" in the TX ring unreclaimed
forever if no new TX packets are sent."
Add a timer to ensure that any time we queue new TX
skbs, we will shortly free them again.
This fixes an easily reproduced hang at shutdown where
iptables attempts to unload nf_conntrack and nf_conntrack
waits for an skb it is tracking to be freed, but virtio_net
never frees it.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
---
drivers/net/virtio_net.c | 30 ++++++++++++++++++++++++++++--
1 files changed, 28 insertions(+), 2 deletions(-)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index f926b5a..69b308a 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -44,6 +44,8 @@ struct virtnet_info
/* The skb we couldn't send because buffers were full. */
struct sk_buff *last_xmit_skb;
+ struct timer_list xmit_free_timer;
+
/* Number of input buffers, and max we've ever had. */
unsigned int num, max;
@@ -230,9 +232,23 @@ static void free_old_xmit_skbs(struct virtnet_info *vi)
}
}
+static void xmit_free(unsigned long data)
+{
+ struct virtnet_info *vi = (void *)data;
+
+ netif_tx_lock(vi->dev);
+
+ free_old_xmit_skbs(vi);
+
+ if (!skb_queue_empty(&vi->send))
+ mod_timer(&vi->xmit_free_timer, jiffies + (HZ/10));
+
+ netif_tx_unlock(vi->dev);
+}
+
static int xmit_skb(struct virtnet_info *vi, struct sk_buff *skb)
{
- int num;
+ int num, err;
struct scatterlist sg[2+MAX_SKB_FRAGS];
struct virtio_net_hdr *hdr;
const unsigned char *dest = ((struct ethhdr *)skb->data)->h_dest;
@@ -275,7 +291,11 @@ static int xmit_skb(struct virtnet_info *vi, struct sk_buff *skb)
vnet_hdr_to_sg(sg, skb);
num = skb_to_sgvec(skb, sg+1, 0, skb->len) + 1;
- return vi->svq->vq_ops->add_buf(vi->svq, sg, num, 0, skb);
+ err = vi->svq->vq_ops->add_buf(vi->svq, sg, num, 0, skb);
+ if (!err)
+ mod_timer(&vi->xmit_free_timer, jiffies + (HZ/10));
+
+ return err;
}
static int start_xmit(struct sk_buff *skb, struct net_device *dev)
@@ -428,6 +448,10 @@ static int virtnet_probe(struct virtio_device *vdev)
skb_queue_head_init(&vi->recv);
skb_queue_head_init(&vi->send);
+ init_timer(&vi->xmit_free_timer);
+ vi->xmit_free_timer.data = (unsigned long)vi;
+ vi->xmit_free_timer.function = xmit_free;
+
err = register_netdev(dev);
if (err) {
pr_debug("virtio_net: registering device failed\n");
@@ -465,6 +489,8 @@ static void virtnet_remove(struct virtio_device *vdev)
/* Stop all the virtqueues. */
vdev->config->reset(vdev);
+ del_timer_sync(&vi->xmit_free_timer);
+
/* Free our skbs in send and recv queues, if any. */
while ((skb = __skb_dequeue(&vi->recv)) != NULL) {
kfree_skb(skb);
--
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH] virtio_net: free transmit skbs in a timer
[not found] ` <1210624663.14409.20.camel@muff>
@ 2008-05-13 7:47 ` Avi Kivity
[not found] ` <48294776.3010504@qumranet.com>
1 sibling, 0 replies; 16+ messages in thread
From: Avi Kivity @ 2008-05-13 7:47 UTC (permalink / raw)
To: Mark McLoughlin; +Cc: Anthony Liguori, linux-kernel, virtualization
Mark McLoughlin wrote:
> virtio_net currently only frees old transmit skbs just
> before queueing new ones. If the queue is full, it then
> enables interrupts and waits for notification that more
> work has been performed.
>
> However, a side-effect of this scheme is that there are
> always xmit skbs left dangling when no new packets are
> sent, against the Documentation/networking/driver.txt
> guideline:
>
> "... it is not allowed for your TX mitigation scheme
> to let TX packets "hang out" in the TX ring unreclaimed
> forever if no new TX packets are sent."
>
> Add a timer to ensure that any time we queue new TX
> skbs, we will shortly free them again.
>
> This fixes an easily reproduced hang at shutdown where
> iptables attempts to unload nf_conntrack and nf_conntrack
> waits for an skb it is tracking to be freed, but virtio_net
> never frees it.
>
Sorry to barge in late, but IMO the timer should be on the host, which
is cheaper than on the guest (well, a 100ms timer is likely zero cost,
but I still don't like it).
the host should fire a tx completion interrupt whenever the completion
queue has "enough" entries, where we can define "enough" now as the
halfway mark or a timer expiry, whichever comes earlier.
We can later improve "enough" to be "just enough so the timer never
triggers" and adjust it dynamically. It probably doesn't matter for
Linux, but I don't want to punish guests that can do true async
networking and depend on timely completion notification.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] virtio_net: free transmit skbs in a timer
[not found] ` <48294776.3010504@qumranet.com>
@ 2008-05-14 6:07 ` Rusty Russell
[not found] ` <200805141607.26306.rusty@rustcorp.com.au>
1 sibling, 0 replies; 16+ messages in thread
From: Rusty Russell @ 2008-05-14 6:07 UTC (permalink / raw)
To: Avi Kivity; +Cc: Anthony Liguori, linux-kernel, virtualization
On Tuesday 13 May 2008 17:47:02 Avi Kivity wrote:
> Mark McLoughlin wrote:
> > virtio_net currently only frees old transmit skbs just
> > before queueing new ones. If the queue is full, it then
> > enables interrupts and waits for notification that more
> > work has been performed.
> >
> > However, a side-effect of this scheme is that there are
> > always xmit skbs left dangling when no new packets are
> > sent, against the Documentation/networking/driver.txt
> > guideline:
> >
> > "... it is not allowed for your TX mitigation scheme
> > to let TX packets "hang out" in the TX ring unreclaimed
> > forever if no new TX packets are sent."
> >
> > Add a timer to ensure that any time we queue new TX
> > skbs, we will shortly free them again.
> >
> > This fixes an easily reproduced hang at shutdown where
> > iptables attempts to unload nf_conntrack and nf_conntrack
> > waits for an skb it is tracking to be freed, but virtio_net
> > never frees it.
>
> Sorry to barge in late, but IMO the timer should be on the host, which
> is cheaper than on the guest (well, a 100ms timer is likely zero cost,
> but I still don't like it).
>
> the host should fire a tx completion interrupt whenever the completion
> queue has "enough" entries, where we can define "enough" now as the
> halfway mark or a timer expiry, whichever comes earlier.
>
> We can later improve "enough" to be "just enough so the timer never
> triggers" and adjust it dynamically. It probably doesn't matter for
> Linux, but I don't want to punish guests that can do true async
> networking and depend on timely completion notification.
This implies that we should not be supressing notifications in the guest at
all (unless we're sure there are more packets to come, which currently we
never are: that needs new net infrastructure).
But that means we'd get a notification on every xmit at the moment.
Benchmarks anyone?
Rusty.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] virtio_net: free transmit skbs in a timer
[not found] ` <200805141607.26306.rusty@rustcorp.com.au>
@ 2008-05-14 8:59 ` Avi Kivity
[not found] ` <482AAA0F.2020607@qumranet.com>
1 sibling, 0 replies; 16+ messages in thread
From: Avi Kivity @ 2008-05-14 8:59 UTC (permalink / raw)
To: Rusty Russell; +Cc: Anthony Liguori, linux-kernel, virtualization
Rusty Russell wrote:
>> Sorry to barge in late, but IMO the timer should be on the host, which
>> is cheaper than on the guest (well, a 100ms timer is likely zero cost,
>> but I still don't like it).
>>
>> the host should fire a tx completion interrupt whenever the completion
>> queue has "enough" entries, where we can define "enough" now as the
>> halfway mark or a timer expiry, whichever comes earlier.
>>
>> We can later improve "enough" to be "just enough so the timer never
>> triggers" and adjust it dynamically. It probably doesn't matter for
>> Linux, but I don't want to punish guests that can do true async
>> networking and depend on timely completion notification.
>>
>
> This implies that we should not be supressing notifications in the guest at
> all (unless we're sure there are more packets to come, which currently we
> never are: that needs new net infrastructure).
>
We don't have to be sure, just reasonably confident. If we see a stream
of packets, we open the window, but set a timer in case we're wrong.
The expectation is that the timer will only fire when tx rate drops (or
tx stops completely).
> But that means we'd get a notification on every xmit at the moment.
> Benchmarks anyone?
>
Notification on every xmit will surely kill performance. I'm trying to
get batching to work but also good latency when the link is not saturated.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] virtio_net: free transmit skbs in a timer
[not found] ` <482AAA0F.2020607@qumranet.com>
@ 2008-05-15 15:29 ` Mark McLoughlin
2008-05-15 15:32 ` Avi Kivity
[not found] ` <482C579F.4030509@qumranet.com>
0 siblings, 2 replies; 16+ messages in thread
From: Mark McLoughlin @ 2008-05-15 15:29 UTC (permalink / raw)
To: Avi Kivity; +Cc: Anthony Liguori, linux-kernel, virtualization
On Wed, 2008-05-14 at 11:59 +0300, Avi Kivity wrote:
> Rusty Russell wrote:
> >> Sorry to barge in late, but IMO the timer should be on the host, which
> >> is cheaper than on the guest (well, a 100ms timer is likely zero cost,
> >> but I still don't like it).
> >>
> >> the host should fire a tx completion interrupt whenever the completion
> >> queue has "enough" entries, where we can define "enough" now as the
> >> halfway mark or a timer expiry, whichever comes earlier.
> >>
> >> We can later improve "enough" to be "just enough so the timer never
> >> triggers" and adjust it dynamically. It probably doesn't matter for
> >> Linux, but I don't want to punish guests that can do true async
> >> networking and depend on timely completion notification.
> >>
> >
> > This implies that we should not be supressing notifications in the guest at
> > all (unless we're sure there are more packets to come, which currently we
> > never are: that needs new net infrastructure).
> >
>
> We don't have to be sure, just reasonably confident. If we see a stream
> of packets, we open the window, but set a timer in case we're wrong.
> The expectation is that the timer will only fire when tx rate drops (or
> tx stops completely).
>
> > But that means we'd get a notification on every xmit at the moment.
> > Benchmarks anyone?
> >
>
> Notification on every xmit will surely kill performance. I'm trying to
> get batching to work but also good latency when the link is not saturated.
I think Rusty is speaking from the POV of the guest driver - i.e. that
virtio_net should never disable notifications on the xmit queue using
disable_cb()?
Sounds like you think agree, but that the host side should throttle the
rate of xmit notifications?
Cheers,
Mark.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] virtio_net: free transmit skbs in a timer
2008-05-15 15:29 ` Mark McLoughlin
@ 2008-05-15 15:32 ` Avi Kivity
[not found] ` <482C579F.4030509@qumranet.com>
1 sibling, 0 replies; 16+ messages in thread
From: Avi Kivity @ 2008-05-15 15:32 UTC (permalink / raw)
To: Mark McLoughlin; +Cc: Anthony Liguori, linux-kernel, virtualization
Mark McLoughlin wrote:
> I think Rusty is speaking from the POV of the guest driver - i.e. that
> virtio_net should never disable notifications on the xmit queue using
> disable_cb()?
>
> Sounds like you think agree, but that the host side should throttle the
> rate of xmit notifications?
>
Yes.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] virtio_net: free transmit skbs in a timer
[not found] ` <482C579F.4030509@qumranet.com>
@ 2008-05-15 23:25 ` Rusty Russell
[not found] ` <200805160925.18087.rusty@rustcorp.com.au>
1 sibling, 0 replies; 16+ messages in thread
From: Rusty Russell @ 2008-05-15 23:25 UTC (permalink / raw)
To: Avi Kivity; +Cc: Anthony Liguori, linux-kernel, virtualization
On Friday 16 May 2008 01:32:47 Avi Kivity wrote:
> Mark McLoughlin wrote:
> > I think Rusty is speaking from the POV of the guest driver - i.e. that
> > virtio_net should never disable notifications on the xmit queue using
> > disable_cb()?
> >
> > Sounds like you think agree, but that the host side should throttle the
> > rate of xmit notifications?
>
> Yes.
But performance is going to suck in the meantime, as currently our host
doesn't do this.
Rusty.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] virtio_net: free transmit skbs in a timer
[not found] ` <200805160925.18087.rusty@rustcorp.com.au>
@ 2008-05-18 6:40 ` Avi Kivity
[not found] ` <482FCF45.1050307@qumranet.com>
1 sibling, 0 replies; 16+ messages in thread
From: Avi Kivity @ 2008-05-18 6:40 UTC (permalink / raw)
To: Rusty Russell; +Cc: Anthony Liguori, linux-kernel, virtualization
Rusty Russell wrote:
> On Friday 16 May 2008 01:32:47 Avi Kivity wrote:
>
>> Mark McLoughlin wrote:
>>
>>> I think Rusty is speaking from the POV of the guest driver - i.e. that
>>> virtio_net should never disable notifications on the xmit queue using
>>> disable_cb()?
>>>
>>> Sounds like you think agree, but that the host side should throttle the
>>> rate of xmit notifications?
>>>
>> Yes.
>>
>
> But performance is going to suck in the meantime, as currently our host
> doesn't do this.
>
That's what the feature bits are for.
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] virtio_net: free transmit skbs in a timer
[not found] ` <482FCF45.1050307@qumranet.com>
@ 2008-05-18 14:16 ` Rusty Russell
[not found] ` <200805190016.10823.rusty@rustcorp.com.au>
1 sibling, 0 replies; 16+ messages in thread
From: Rusty Russell @ 2008-05-18 14:16 UTC (permalink / raw)
To: Avi Kivity; +Cc: Anthony Liguori, linux-kernel, virtualization
On Sunday 18 May 2008 16:40:05 Avi Kivity wrote:
> Rusty Russell wrote:
> > On Friday 16 May 2008 01:32:47 Avi Kivity wrote:
> >> Mark McLoughlin wrote:
> >>> I think Rusty is speaking from the POV of the guest driver - i.e. that
> >>> virtio_net should never disable notifications on the xmit queue using
> >>> disable_cb()?
> >>>
> >>> Sounds like you think agree, but that the host side should throttle the
> >>> rate of xmit notifications?
> >>
> >> Yes.
> >
> > But performance is going to suck in the meantime, as currently our host
> > doesn't do this.
>
> That's what the feature bits are for.
OK. And since the current situation is that the host doesn't throttle, the
feature bit should be "don't throttle, host is doing it for you", and Mark's
patch should go in...
Cheers,
Rusty.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] virtio_net: free transmit skbs in a timer
[not found] ` <200805190016.10823.rusty@rustcorp.com.au>
@ 2008-05-18 14:27 ` Avi Kivity
[not found] ` <48303CB9.7070406@qumranet.com>
1 sibling, 0 replies; 16+ messages in thread
From: Avi Kivity @ 2008-05-18 14:27 UTC (permalink / raw)
To: Rusty Russell; +Cc: Anthony Liguori, linux-kernel, virtualization
Rusty Russell wrote:
> OK. And since the current situation is that the host doesn't throttle, the
> feature bit should be "don't throttle, host is doing it for you", and Mark's
> patch should go in...
>
Yes.
We should have thought of this before, though, especially as Xen does
this or something very similar:
> /* Shared ring page */ \
> struct __name##_sring { \
> RING_IDX req_prod, req_event; \
> RING_IDX rsp_prod, rsp_event; \
> uint8_t pad[48]; \
> union __name##_sring_entry ring[1]; /* variable-length */ \
> }; \
req_event and rsp_event allow the other side to indicate when it wants a
notification.
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] virtio_net: free transmit skbs in a timer
[not found] ` <48303CB9.7070406@qumranet.com>
@ 2008-05-19 1:52 ` Rusty Russell
[not found] ` <200805191152.28409.rusty@rustcorp.com.au>
1 sibling, 0 replies; 16+ messages in thread
From: Rusty Russell @ 2008-05-19 1:52 UTC (permalink / raw)
To: Avi Kivity; +Cc: Anthony Liguori, linux-kernel, virtualization
On Monday 19 May 2008 00:27:05 Avi Kivity wrote:
> Rusty Russell wrote:
> > OK. And since the current situation is that the host doesn't throttle,
> > the feature bit should be "don't throttle, host is doing it for you", and
> > Mark's patch should go in...
>
> Yes.
>
> We should have thought of this before, though, especially as Xen does
>
> this or something very similar:
> > /* Shared ring page */ \
> > struct __name##_sring { \
> > RING_IDX req_prod, req_event; \
> > RING_IDX rsp_prod, rsp_event; \
> > uint8_t pad[48]; \
> > union __name##_sring_entry ring[1]; /* variable-length */ \
> > }; \
>
> req_event and rsp_event allow the other side to indicate when it wants a
> notification.
Well, we do have such a thing, in the ring suppression flags. Note that DaveM
is talking about moving network tx queue into the net drivers themselves,
which will make them much more efficient (ie. drain entire queue before
kick), which may again change the balance of what the Right Thing is.
I'll merge Mark's patch, and look at hacking up a feature to change behaviour
to never suppress tx-done interrupts.
Thanks,
Rusty.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] virtio_net: free transmit skbs in a timer
[not found] ` <200805191152.28409.rusty@rustcorp.com.au>
@ 2008-05-19 10:26 ` Avi Kivity
[not found] ` <483155D4.2090700@qumranet.com>
1 sibling, 0 replies; 16+ messages in thread
From: Avi Kivity @ 2008-05-19 10:26 UTC (permalink / raw)
To: Rusty Russell; +Cc: Anthony Liguori, linux-kernel, virtualization
Rusty Russell wrote:
>> We should have thought of this before, though, especially as Xen does
>>
>> this or something very similar:
>>
>>> /* Shared ring page */ \
>>> struct __name##_sring { \
>>> RING_IDX req_prod, req_event; \
>>> RING_IDX rsp_prod, rsp_event; \
>>> uint8_t pad[48]; \
>>> union __name##_sring_entry ring[1]; /* variable-length */ \
>>> }; \
>>>
>> req_event and rsp_event allow the other side to indicate when it wants a
>> notification.
>>
>
> Well, we do have such a thing, in the ring suppression flags.
Can you point me at this?
> Note that DaveM
> is talking about moving network tx queue into the net drivers themselves,
> which will make them much more efficient (ie. drain entire queue before
> kick), which may again change the balance of what the Right Thing is.
>
That depends on whether Linux knows whether more packets are coming.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] virtio_net: free transmit skbs in a timer
[not found] ` <483155D4.2090700@qumranet.com>
@ 2008-05-19 12:21 ` Rusty Russell
[not found] ` <200805192221.52072.rusty@rustcorp.com.au>
1 sibling, 0 replies; 16+ messages in thread
From: Rusty Russell @ 2008-05-19 12:21 UTC (permalink / raw)
To: Avi Kivity; +Cc: Anthony Liguori, linux-kernel, virtualization
On Monday 19 May 2008 20:26:28 Avi Kivity wrote:
> Rusty Russell wrote:
> >> We should have thought of this before, though, especially as Xen does
> >>
> >> this or something very similar:
> >>> /* Shared ring page */ \
> >>> struct __name##_sring { \
> >>> RING_IDX req_prod, req_event; \
> >>> RING_IDX rsp_prod, rsp_event; \
> >>> uint8_t pad[48]; \
> >>> union __name##_sring_entry ring[1]; /* variable-length */ \
> >>> }; \
> >>
> >> req_event and rsp_event allow the other side to indicate when it wants a
> >> notification.
> >
> > Well, we do have such a thing, in the ring suppression flags.
>
> Can you point me at this?
Ah, sorry, I misunderstood. No, we don't have a threshold like this, we
have an all-or-nothing flag in each direction.
We have the ability to add new fields to the rings. I've put it on my
TODO to benchmark what this does. It may or may not help. In this case,
notification when there are no more packets in xmit ring would be
sufficient. We already kick the host when we fill a ring even if
it says it doesn't need it, perhaps this would be symmetry.
> > Note that DaveM
> > is talking about moving network tx queue into the net drivers themselves,
> > which will make them much more efficient (ie. drain entire queue before
> > kick), which may again change the balance of what the Right Thing is.
>
> That depends on whether Linux knows whether more packets are coming.
The current problem is that hard_start_xmit gets called with a packet, and
has no idea if there are more in the tx queue. By having the driver control
the queue, it can at least tell that.
The question remains whether this will be sufficient for tx mitigation. A
program may be about to write more to the socket, and we can't know that.
But it can hardly hurt.
In turn, sending out such packet bursts will have an effect on packet reuse.
Cheers,
Rusty.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] virtio_net: free transmit skbs in a timer
[not found] ` <200805192221.52072.rusty@rustcorp.com.au>
@ 2008-05-19 13:26 ` Avi Kivity
2008-05-20 1:37 ` Rusty Russell
0 siblings, 1 reply; 16+ messages in thread
From: Avi Kivity @ 2008-05-19 13:26 UTC (permalink / raw)
To: Rusty Russell; +Cc: Anthony Liguori, linux-kernel, virtualization
Rusty Russell wrote:
>>>
>>> Well, we do have such a thing, in the ring suppression flags.
>>>
>> Can you point me at this?
>>
>
> Ah, sorry, I misunderstood. No, we don't have a threshold like this, we
> have an all-or-nothing flag in each direction.
>
>
Please point me anyway?
> We have the ability to add new fields to the rings. I've put it on my
> TODO to benchmark what this does. It may or may not help. In this case,
> notification when there are no more packets in xmit ring would be
> sufficient. We already kick the host when we fill a ring even if
> it says it doesn't need it, perhaps this would be symmetry.
>
Notification on ring full is too late with smp. You need to warn the
other side in advance.
The reason I'm interested in adjustable thresholds is that you can then
to tx mitigation without (usually) suffering the worst-case latency when
you aren't streaming or streaming slower than what you tuned for.
>
>>> Note that DaveM
>>> is talking about moving network tx queue into the net drivers themselves,
>>> which will make them much more efficient (ie. drain entire queue before
>>> kick), which may again change the balance of what the Right Thing is.
>>>
>> That depends on whether Linux knows whether more packets are coming.
>>
>
> The current problem is that hard_start_xmit gets called with a packet, and
> has no idea if there are more in the tx queue. By having the driver control
> the queue, it can at least tell that.
>
Yes, definitely an improvement, esp. for sendfile().
> The question remains whether this will be sufficient for tx mitigation. A
> program may be about to write more to the socket, and we can't know that.
> But it can hardly hurt.
>
>
Right.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] virtio_net: free transmit skbs in a timer
2008-05-19 13:26 ` Avi Kivity
@ 2008-05-20 1:37 ` Rusty Russell
0 siblings, 0 replies; 16+ messages in thread
From: Rusty Russell @ 2008-05-20 1:37 UTC (permalink / raw)
To: Avi Kivity; +Cc: Anthony Liguori, linux-kernel, virtualization
On Monday 19 May 2008 23:26:15 Avi Kivity wrote:
> Rusty Russell wrote:
> >>> Well, we do have such a thing, in the ring suppression flags.
> >>
> >> Can you point me at this?
> >
> > Ah, sorry, I misunderstood. No, we don't have a threshold like this, we
> > have an all-or-nothing flag in each direction.
>
> Please point me anyway?
Sure, linux/virtio_ring.h:
/* The Host uses this in used->flags to advise the Guest: don't kick me when
* you add a buffer. It's unreliable, so it's simply an optimization. Guest
* will still kick if it's out of buffers. */
#define VRING_USED_F_NO_NOTIFY 1
/* The Guest uses this in avail->flags to advise the Host: don't interrupt me
* when you consume a buffer. It's unreliable, so it's simply an
* optimization. */
#define VRING_AVAIL_F_NO_INTERRUPT 1
> > We have the ability to add new fields to the rings. I've put it on my
> > TODO to benchmark what this does. It may or may not help. In this case,
> > notification when there are no more packets in xmit ring would be
> > sufficient. We already kick the host when we fill a ring even if
> > it says it doesn't need it, perhaps this would be symmetry.
>
> Notification on ring full is too late with smp. You need to warn the
> other side in advance.
No, you misunderstand; this is not a performance issue. On xmit, the driver
cleans up any old used packets before trying to send anyway. So it doesn't
care when xmit packets are used, and suppresses the 'used' interrupt on the
xmit virtqueue. Only if the xmit ring is full does it enable the xmit-used
notification. This is optimal.
The issue is that we *do* actually care when xmit packets are used: we're
supposed to free them in a timely manner and if the packet flow stops, we
don't. By always sending a used interrupt when *all* packets are used, we
would cover this case quite nicely without impacting the normal case of
packet flow.
> The reason I'm interested in adjustable thresholds is that you can then
> to tx mitigation without (usually) suffering the worst-case latency when
> you aren't streaming or streaming slower than what you tuned for.
Right, this would be a threshold that the host would set, approx. "when you've
put this many packets in the xmit ring, tell me" (the opposite direction of
the discussion above). Currently we will kick the host on the first packet,
and qemu will suppress the notifications based on some timer and we'll notify
it anyway if the ring fills (which is suboptimal). With this technique the
host could double the threshold up to some maximum percentage of the ring.
While I like the Xen scheme, we can do the same thing from within the guest
with the existing scheme using an internal threshold. We are always allowed
to send "spurious" notifications to the host, so it can't break anything.
Added to TODO.
Thanks,
Rusty.
^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2008-05-20 1:37 UTC | newest]
Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <1209565906-9019-1-git-send-email-markmc@redhat.com>
2008-05-02 10:55 ` [PATCH] virtio_net: free transmit skbs in a timer Rusty Russell
[not found] ` <200805022055.25870.rusty@rustcorp.com.au>
2008-05-12 20:37 ` Mark McLoughlin
[not found] ` <1210624663.14409.20.camel@muff>
2008-05-13 7:47 ` Avi Kivity
[not found] ` <48294776.3010504@qumranet.com>
2008-05-14 6:07 ` Rusty Russell
[not found] ` <200805141607.26306.rusty@rustcorp.com.au>
2008-05-14 8:59 ` Avi Kivity
[not found] ` <482AAA0F.2020607@qumranet.com>
2008-05-15 15:29 ` Mark McLoughlin
2008-05-15 15:32 ` Avi Kivity
[not found] ` <482C579F.4030509@qumranet.com>
2008-05-15 23:25 ` Rusty Russell
[not found] ` <200805160925.18087.rusty@rustcorp.com.au>
2008-05-18 6:40 ` Avi Kivity
[not found] ` <482FCF45.1050307@qumranet.com>
2008-05-18 14:16 ` Rusty Russell
[not found] ` <200805190016.10823.rusty@rustcorp.com.au>
2008-05-18 14:27 ` Avi Kivity
[not found] ` <48303CB9.7070406@qumranet.com>
2008-05-19 1:52 ` Rusty Russell
[not found] ` <200805191152.28409.rusty@rustcorp.com.au>
2008-05-19 10:26 ` Avi Kivity
[not found] ` <483155D4.2090700@qumranet.com>
2008-05-19 12:21 ` Rusty Russell
[not found] ` <200805192221.52072.rusty@rustcorp.com.au>
2008-05-19 13:26 ` Avi Kivity
2008-05-20 1:37 ` Rusty Russell
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).