* [PATCH] Fix e1000 can_receive handler
@ 2008-05-07 21:40 Anthony Liguori
2008-05-08 13:26 ` Aurelien Jarno
0 siblings, 1 reply; 7+ messages in thread
From: Anthony Liguori @ 2008-05-07 21:40 UTC (permalink / raw)
To: kvm-devel; +Cc: Anthony Liguori, Avi Kivity
The current logic of the can_receive handler is to allow packets whenever the
receiver is disabled or when there are descriptors available in the ring.
I think the logic ought to be to allow packets whenever the receiver is enabled
and there are descriptors available in the ring.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
diff --git a/qemu/hw/e1000.c b/qemu/hw/e1000.c
index 0728539..01f8983 100644
--- a/qemu/hw/e1000.c
+++ b/qemu/hw/e1000.c
@@ -520,8 +520,8 @@ e1000_can_receive(void *opaque)
{
E1000State *s = opaque;
- return (!(s->mac_reg[RCTL] & E1000_RCTL_EN) ||
- s->mac_reg[RDH] != s->mac_reg[RDT]);
+ return ((s->mac_reg[RCTL] & E1000_RCTL_EN) &&
+ s->mac_reg[RDH] != s->mac_reg[RDT]);
}
static void
-------------------------------------------------------------------------
This SF.net email is sponsored by the 2008 JavaOne(SM) Conference
Don't miss this year's exciting event. There's still time to save $100.
Use priority code J8TL2D2.
http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] Fix e1000 can_receive handler
2008-05-07 21:40 [PATCH] Fix e1000 can_receive handler Anthony Liguori
@ 2008-05-08 13:26 ` Aurelien Jarno
2008-05-08 14:03 ` Anthony Liguori
0 siblings, 1 reply; 7+ messages in thread
From: Aurelien Jarno @ 2008-05-08 13:26 UTC (permalink / raw)
To: Anthony Liguori; +Cc: kvm-devel, Avi Kivity
On Wed, May 07, 2008 at 04:40:58PM -0500, Anthony Liguori wrote:
> The current logic of the can_receive handler is to allow packets whenever the
> receiver is disabled or when there are descriptors available in the ring.
>
> I think the logic ought to be to allow packets whenever the receiver is enabled
> and there are descriptors available in the ring.
The current behaviour is actually correct, this is the way QEMU works:
when the card is stopped, it should always accept packets, and then
discard them.
--
.''`. Aurelien Jarno | GPG: 1024D/F1BCDB73
: :' : Debian developer | Electrical Engineer
`. `' aurel32@debian.org | aurelien@aurel32.net
`- people.debian.org/~aurel32 | www.aurel32.net
-------------------------------------------------------------------------
This SF.net email is sponsored by the 2008 JavaOne(SM) Conference
Don't miss this year's exciting event. There's still time to save $100.
Use priority code J8TL2D2.
http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Fix e1000 can_receive handler
2008-05-08 13:26 ` Aurelien Jarno
@ 2008-05-08 14:03 ` Anthony Liguori
2008-05-08 17:02 ` Avi Kivity
0 siblings, 1 reply; 7+ messages in thread
From: Anthony Liguori @ 2008-05-08 14:03 UTC (permalink / raw)
To: Aurelien Jarno; +Cc: kvm-devel, Avi Kivity
Aurelien Jarno wrote:
> On Wed, May 07, 2008 at 04:40:58PM -0500, Anthony Liguori wrote:
>
>> The current logic of the can_receive handler is to allow packets whenever the
>> receiver is disabled or when there are descriptors available in the ring.
>>
>> I think the logic ought to be to allow packets whenever the receiver is enabled
>> and there are descriptors available in the ring.
>>
>
> The current behaviour is actually correct, this is the way QEMU works:
> when the card is stopped, it should always accept packets, and then
> discard them.
>
The previous patches in my virtio series change that behavior. Before
delivering a packet to a VLAN, it checks to see if any of the VLAN
clients are able to receive a packet.
This is very important for achieving good performance with tap. With
virtio-net, we were dropping a ton of packets with tap because there
weren't descriptors available on the RX ring.
I plan to submit that behavioral change to QEMU upstream along with the
virtio drivers. I'm still optimizing phys_page_find() though. The
performance impact of switching the ring manipulation to using the
stl_phys accessors is unacceptable for KVM.
Regards,
Anthony Liguori
-------------------------------------------------------------------------
This SF.net email is sponsored by the 2008 JavaOne(SM) Conference
Don't miss this year's exciting event. There's still time to save $100.
Use priority code J8TL2D2.
http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Fix e1000 can_receive handler
2008-05-08 14:03 ` Anthony Liguori
@ 2008-05-08 17:02 ` Avi Kivity
2008-05-08 17:21 ` Avi Kivity
2008-05-08 18:41 ` Anthony Liguori
0 siblings, 2 replies; 7+ messages in thread
From: Avi Kivity @ 2008-05-08 17:02 UTC (permalink / raw)
To: Anthony Liguori; +Cc: kvm-devel, Aurelien Jarno
Anthony Liguori wrote:
> Aurelien Jarno wrote:
>> On Wed, May 07, 2008 at 04:40:58PM -0500, Anthony Liguori wrote:
>>
>>> The current logic of the can_receive handler is to allow packets
>>> whenever the
>>> receiver is disabled or when there are descriptors available in the
>>> ring.
>>>
>>> I think the logic ought to be to allow packets whenever the receiver
>>> is enabled
>>> and there are descriptors available in the ring.
>>>
>>
>> The current behaviour is actually correct, this is the way QEMU works:
>> when the card is stopped, it should always accept packets, and then
>> discard them.
>>
>
> The previous patches in my virtio series change that behavior. Before
> delivering a packet to a VLAN, it checks to see if any of the VLAN
> clients are able to receive a packet.
>
> This is very important for achieving good performance with tap. With
> virtio-net, we were dropping a ton of packets with tap because there
> weren't descriptors available on the RX ring.
>
> I plan to submit that behavioral change to QEMU upstream along with
> the virtio drivers. I'm still optimizing phys_page_find() though.
> The performance impact of switching the ring manipulation to using the
> stl_phys accessors is unacceptable for KVM.
I think this indicates a virtio tuning problem. The rx queue should
never be empty on normal operation, ever. Signalling on rx queue empty
invites the overrun since it takes time for the interrupt to be
delivered and for the guest to refill the queue.
Long term we need to do this dynamically but we can start with
signalling on rx queue half empty (or half full if you're an optimist).
We further need to tune the queue size so that this results in an
acceptable number of interrupts:
1 Gbps = 83K pps = 83 entries per half queue @ 1 KHz interrupt rate
So we need 166 entries on the queue to keep a moderate interrupt rate,
if we change it to signal at the halfway mark.
Note that flow control still makes sense since it allows us to buffer
some packets if the guest is scheduled out. But we can't use it as the
primary mechanism since it won't exist with multiqueue NICs (where the
virtio descriptors are fed to driver).
Similar reasoning probably applied to tx.
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
-------------------------------------------------------------------------
This SF.net email is sponsored by the 2008 JavaOne(SM) Conference
Don't miss this year's exciting event. There's still time to save $100.
Use priority code J8TL2D2.
http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Fix e1000 can_receive handler
2008-05-08 17:02 ` Avi Kivity
@ 2008-05-08 17:21 ` Avi Kivity
2008-05-08 18:41 ` Anthony Liguori
1 sibling, 0 replies; 7+ messages in thread
From: Avi Kivity @ 2008-05-08 17:21 UTC (permalink / raw)
To: Anthony Liguori; +Cc: kvm-devel, Aurelien Jarno
Avi Kivity wrote:
>
> Note that flow control still makes sense since it allows us to buffer
> some packets if the guest is scheduled out. But we can't use it as
> the primary mechanism since it won't exist with multiqueue NICs (where
> the virtio descriptors are fed to driver).
>
... are fed to the hardware, I meant.
And to clarify further, I do think the patch is correct, but that virtio
should be able to work well without it under ordinary circumstances.
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
-------------------------------------------------------------------------
This SF.net email is sponsored by the 2008 JavaOne(SM) Conference
Don't miss this year's exciting event. There's still time to save $100.
Use priority code J8TL2D2.
http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Fix e1000 can_receive handler
2008-05-08 17:02 ` Avi Kivity
2008-05-08 17:21 ` Avi Kivity
@ 2008-05-08 18:41 ` Anthony Liguori
2008-05-09 7:47 ` Avi Kivity
1 sibling, 1 reply; 7+ messages in thread
From: Anthony Liguori @ 2008-05-08 18:41 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm-devel, Aurelien Jarno
Avi Kivity wrote:
> Anthony Liguori wrote:
>> Aurelien Jarno wrote:
>>> On Wed, May 07, 2008 at 04:40:58PM -0500, Anthony Liguori wrote:
>>>
>>>> The current logic of the can_receive handler is to allow packets
>>>> whenever the
>>>> receiver is disabled or when there are descriptors available in the
>>>> ring.
>>>>
>>>> I think the logic ought to be to allow packets whenever the
>>>> receiver is enabled
>>>> and there are descriptors available in the ring.
>>>>
>>>
>>> The current behaviour is actually correct, this is the way QEMU works:
>>> when the card is stopped, it should always accept packets, and then
>>> discard them.
>>>
>>
>> The previous patches in my virtio series change that behavior.
>> Before delivering a packet to a VLAN, it checks to see if any of the
>> VLAN clients are able to receive a packet.
>>
>> This is very important for achieving good performance with tap. With
>> virtio-net, we were dropping a ton of packets with tap because there
>> weren't descriptors available on the RX ring.
>>
>> I plan to submit that behavioral change to QEMU upstream along with
>> the virtio drivers. I'm still optimizing phys_page_find() though.
>> The performance impact of switching the ring manipulation to using
>> the stl_phys accessors is unacceptable for KVM.
>
> I think this indicates a virtio tuning problem. The rx queue should
> never be empty on normal operation, ever. Signalling on rx queue
> empty invites the overrun since it takes time for the interrupt to be
> delivered and for the guest to refill the queue.
Well, to start with, the e1000_can_receive handler is just plan wrong.
The logic is broken. This hasn't caused an issue because the code isn't
used.
That said, it is possible to tune virtio to get back the performance
lose of dropping packets. We lose about 20% of iperf from dropped
packets ATM. If we bump the ring size up to 512 we get it back. If we
bump it to 1024, we start loosing again. It's much less reliably than
doing flow control though.
> Long term we need to do this dynamically but we can start with
> signalling on rx queue half empty (or half full if you're an
> optimist). We further need to tune the queue size so that this
> results in an acceptable number of interrupts:
Part of the trouble with the embedded scatter gather list is that it's
not at all clear what "half empty" is unless you count all of the
descriptors. There may be one giant descriptor, or many small ones.
> 1 Gbps = 83K pps = 83 entries per half queue @ 1 KHz interrupt rate
>
> So we need 166 entries on the queue to keep a moderate interrupt rate,
> if we change it to signal at the halfway mark.
>
> Note that flow control still makes sense since it allows us to buffer
> some packets if the guest is scheduled out. But we can't use it as
> the primary mechanism since it won't exist with multiqueue NICs (where
> the virtio descriptors are fed to driver).
Yes, we also need a better mechanism with vringfd(). I've been thinking
about how to structure this API within QEMU but it's still not clear to
me. Flow control seems to make sense though with the given API.
Regards,
Anthony Liguori
> Similar reasoning probably applied to tx.
>
-------------------------------------------------------------------------
This SF.net email is sponsored by the 2008 JavaOne(SM) Conference
Don't miss this year's exciting event. There's still time to save $100.
Use priority code J8TL2D2.
http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Fix e1000 can_receive handler
2008-05-08 18:41 ` Anthony Liguori
@ 2008-05-09 7:47 ` Avi Kivity
0 siblings, 0 replies; 7+ messages in thread
From: Avi Kivity @ 2008-05-09 7:47 UTC (permalink / raw)
To: Anthony Liguori; +Cc: kvm-devel, Aurelien Jarno
Anthony Liguori wrote:
> Avi Kivity wrote:
>
>> Anthony Liguori wrote:
>>
>>> Aurelien Jarno wrote:
>>>
>>>> On Wed, May 07, 2008 at 04:40:58PM -0500, Anthony Liguori wrote:
>>>>
>>>>
>>>>> The current logic of the can_receive handler is to allow packets
>>>>> whenever the
>>>>> receiver is disabled or when there are descriptors available in the
>>>>> ring.
>>>>>
>>>>> I think the logic ought to be to allow packets whenever the
>>>>> receiver is enabled
>>>>> and there are descriptors available in the ring.
>>>>>
>>>>>
>>>> The current behaviour is actually correct, this is the way QEMU works:
>>>> when the card is stopped, it should always accept packets, and then
>>>> discard them.
>>>>
>>>>
>>> The previous patches in my virtio series change that behavior.
>>> Before delivering a packet to a VLAN, it checks to see if any of the
>>> VLAN clients are able to receive a packet.
>>>
>>> This is very important for achieving good performance with tap. With
>>> virtio-net, we were dropping a ton of packets with tap because there
>>> weren't descriptors available on the RX ring.
>>>
>>> I plan to submit that behavioral change to QEMU upstream along with
>>> the virtio drivers. I'm still optimizing phys_page_find() though.
>>> The performance impact of switching the ring manipulation to using
>>> the stl_phys accessors is unacceptable for KVM.
>>>
>> I think this indicates a virtio tuning problem. The rx queue should
>> never be empty on normal operation, ever. Signalling on rx queue
>> empty invites the overrun since it takes time for the interrupt to be
>> delivered and for the guest to refill the queue.
>>
>
> Well, to start with, the e1000_can_receive handler is just plan wrong.
> The logic is broken. This hasn't caused an issue because the code isn't
> used.
>
> That said, it is possible to tune virtio to get back the performance
> lose of dropping packets. We lose about 20% of iperf from dropped
> packets ATM. If we bump the ring size up to 512 we get it back. If we
> bump it to 1024, we start loosing again. It's much less reliably than
> doing flow control though.
>
What about setting the interrupt to fire at the ring midpoint?
>> Long term we need to do this dynamically but we can start with
>> signalling on rx queue half empty (or half full if you're an
>> optimist). We further need to tune the queue size so that this
>> results in an acceptable number of interrupts:
>>
>
> Part of the trouble with the embedded scatter gather list is that it's
> not at all clear what "half empty" is unless you count all of the
> descriptors. There may be one giant descriptor, or many small ones.
>
>
For networking, chaining is actually better since we want to measure
bandwidth, not descriptor count. Each ring entry will usually be
between 1500 and 4096 bytes, and so an entry count is an approximation
of the time it will take to send that data.
With small packets this breaks down, but these are not the typical high
bandwidth workloads.
Block is differently since there we want to count seeks, not data.
>> 1 Gbps = 83K pps = 83 entries per half queue @ 1 KHz interrupt rate
>>
>> So we need 166 entries on the queue to keep a moderate interrupt rate,
>> if we change it to signal at the halfway mark.
>>
>> Note that flow control still makes sense since it allows us to buffer
>> some packets if the guest is scheduled out. But we can't use it as
>> the primary mechanism since it won't exist with multiqueue NICs (where
>> the virtio descriptors are fed to driver).
>>
>
> Yes, we also need a better mechanism with vringfd(). I've been thinking
> about how to structure this API within QEMU but it's still not clear to
> me. Flow control seems to make sense though with the given API.
>
I don't disagree with flow control; I just don't think it's enough.
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
-------------------------------------------------------------------------
This SF.net email is sponsored by the 2008 JavaOne(SM) Conference
Don't miss this year's exciting event. There's still time to save $100.
Use priority code J8TL2D2.
http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2008-05-09 7:47 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-05-07 21:40 [PATCH] Fix e1000 can_receive handler Anthony Liguori
2008-05-08 13:26 ` Aurelien Jarno
2008-05-08 14:03 ` Anthony Liguori
2008-05-08 17:02 ` Avi Kivity
2008-05-08 17:21 ` Avi Kivity
2008-05-08 18:41 ` Anthony Liguori
2008-05-09 7:47 ` Avi Kivity
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox