* [PATCH] virtio: use shadow_avail_idx while checking number of heads
@ 2023-08-25 17:04 Ilya Maximets
2023-09-25 11:20 ` Ilya Maximets
2023-09-25 14:23 ` Stefan Hajnoczi
0 siblings, 2 replies; 12+ messages in thread
From: Ilya Maximets @ 2023-08-25 17:04 UTC (permalink / raw)
To: qemu-devel; +Cc: Jason Wang, Stefan Hajnoczi, Michael S. Tsirkin, Ilya Maximets
We do not need the most up to date number of heads, we only want to
know if there is at least one.
Use shadow variable as long as it is not equal to the last available
index checked. This avoids expensive qatomic dereference of the
RCU-protected memory region cache as well as the memory access itself
and the subsequent memory barrier.
The change improves performance of the af-xdp network backend by 2-3%.
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
---
hw/virtio/virtio.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 309038fd46..04bf7cc977 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -999,7 +999,15 @@ void virtqueue_push(VirtQueue *vq, const VirtQueueElement *elem,
/* Called within rcu_read_lock(). */
static int virtqueue_num_heads(VirtQueue *vq, unsigned int idx)
{
- uint16_t num_heads = vring_avail_idx(vq) - idx;
+ uint16_t num_heads;
+
+ if (vq->shadow_avail_idx != idx) {
+ num_heads = vq->shadow_avail_idx - idx;
+
+ return num_heads;
+ }
+
+ num_heads = vring_avail_idx(vq) - idx;
/* Check it isn't doing very strange things with descriptor numbers. */
if (num_heads > vq->vring.num) {
--
2.40.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] virtio: use shadow_avail_idx while checking number of heads
2023-08-25 17:04 [PATCH] virtio: use shadow_avail_idx while checking number of heads Ilya Maximets
@ 2023-09-25 11:20 ` Ilya Maximets
2023-09-25 14:23 ` Stefan Hajnoczi
1 sibling, 0 replies; 12+ messages in thread
From: Ilya Maximets @ 2023-09-25 11:20 UTC (permalink / raw)
To: qemu-devel; +Cc: i.maximets, Jason Wang, Stefan Hajnoczi, Michael S. Tsirkin
On 8/25/23 19:04, Ilya Maximets wrote:
> We do not need the most up to date number of heads, we only want to
> know if there is at least one.
>
> Use shadow variable as long as it is not equal to the last available
> index checked. This avoids expensive qatomic dereference of the
> RCU-protected memory region cache as well as the memory access itself
> and the subsequent memory barrier.
>
> The change improves performance of the af-xdp network backend by 2-3%.
>
> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
> ---
Kind reminder.
Best regards, Ilya Maximets.
> hw/virtio/virtio.c | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index 309038fd46..04bf7cc977 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -999,7 +999,15 @@ void virtqueue_push(VirtQueue *vq, const VirtQueueElement *elem,
> /* Called within rcu_read_lock(). */
> static int virtqueue_num_heads(VirtQueue *vq, unsigned int idx)
> {
> - uint16_t num_heads = vring_avail_idx(vq) - idx;
> + uint16_t num_heads;
> +
> + if (vq->shadow_avail_idx != idx) {
> + num_heads = vq->shadow_avail_idx - idx;
> +
> + return num_heads;
> + }
> +
> + num_heads = vring_avail_idx(vq) - idx;
>
> /* Check it isn't doing very strange things with descriptor numbers. */
> if (num_heads > vq->vring.num) {
* Re: [PATCH] virtio: use shadow_avail_idx while checking number of heads
2023-08-25 17:04 [PATCH] virtio: use shadow_avail_idx while checking number of heads Ilya Maximets
2023-09-25 11:20 ` Ilya Maximets
@ 2023-09-25 14:23 ` Stefan Hajnoczi
2023-09-25 15:02 ` Ilya Maximets
1 sibling, 1 reply; 12+ messages in thread
From: Stefan Hajnoczi @ 2023-09-25 14:23 UTC (permalink / raw)
To: Ilya Maximets; +Cc: qemu-devel, Jason Wang, Stefan Hajnoczi, Michael S. Tsirkin
On Fri, 25 Aug 2023 at 13:04, Ilya Maximets <i.maximets@ovn.org> wrote:
>
> We do not need the most up to date number of heads, we only want to
> know if there is at least one.
>
> Use shadow variable as long as it is not equal to the last available
> index checked. This avoids expensive qatomic dereference of the
> RCU-protected memory region cache as well as the memory access itself
> and the subsequent memory barrier.
>
> The change improves performance of the af-xdp network backend by 2-3%.
>
> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
> ---
> hw/virtio/virtio.c | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index 309038fd46..04bf7cc977 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -999,7 +999,15 @@ void virtqueue_push(VirtQueue *vq, const VirtQueueElement *elem,
> /* Called within rcu_read_lock(). */
> static int virtqueue_num_heads(VirtQueue *vq, unsigned int idx)
> {
> - uint16_t num_heads = vring_avail_idx(vq) - idx;
> + uint16_t num_heads;
> +
> + if (vq->shadow_avail_idx != idx) {
> + num_heads = vq->shadow_avail_idx - idx;
> +
> + return num_heads;
This still needs to check num_heads > vq->vring.num and return -EINVAL
as is done below.
> + }
> +
> + num_heads = vring_avail_idx(vq) - idx;
>
> /* Check it isn't doing very strange things with descriptor numbers. */
> if (num_heads > vq->vring.num) {
> --
> 2.40.1
>
>
* Re: [PATCH] virtio: use shadow_avail_idx while checking number of heads
2023-09-25 14:23 ` Stefan Hajnoczi
@ 2023-09-25 15:02 ` Ilya Maximets
2023-09-25 15:12 ` Stefan Hajnoczi
0 siblings, 1 reply; 12+ messages in thread
From: Ilya Maximets @ 2023-09-25 15:02 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: i.maximets, qemu-devel, Jason Wang, Stefan Hajnoczi,
Michael S. Tsirkin
On 9/25/23 16:23, Stefan Hajnoczi wrote:
> On Fri, 25 Aug 2023 at 13:04, Ilya Maximets <i.maximets@ovn.org> wrote:
>>
>> We do not need the most up to date number of heads, we only want to
>> know if there is at least one.
>>
>> Use shadow variable as long as it is not equal to the last available
>> index checked. This avoids expensive qatomic dereference of the
>> RCU-protected memory region cache as well as the memory access itself
>> and the subsequent memory barrier.
>>
>> The change improves performance of the af-xdp network backend by 2-3%.
>>
>> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
>> ---
>> hw/virtio/virtio.c | 10 +++++++++-
>> 1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
>> index 309038fd46..04bf7cc977 100644
>> --- a/hw/virtio/virtio.c
>> +++ b/hw/virtio/virtio.c
>> @@ -999,7 +999,15 @@ void virtqueue_push(VirtQueue *vq, const VirtQueueElement *elem,
>> /* Called within rcu_read_lock(). */
>> static int virtqueue_num_heads(VirtQueue *vq, unsigned int idx)
>> {
>> - uint16_t num_heads = vring_avail_idx(vq) - idx;
>> + uint16_t num_heads;
>> +
>> + if (vq->shadow_avail_idx != idx) {
>> + num_heads = vq->shadow_avail_idx - idx;
>> +
>> + return num_heads;
>
> This still needs to check num_heads > vq->vring.num and return -EINVAL
> as is done below.
Hmm, yes, you're right. If the value was incorrect initially, the shadow
will be incorrect. However, I think we should just not return here in this
case and let vring_avail_idx() grab an actual new value below. Otherwise
we may never break out of this error.
Does that make sense?
>
>> + }
>> +
>> + num_heads = vring_avail_idx(vq) - idx;
>>
>> /* Check it isn't doing very strange things with descriptor numbers. */
>> if (num_heads > vq->vring.num) {
>> --
>> 2.40.1
>>
>>
* Re: [PATCH] virtio: use shadow_avail_idx while checking number of heads
2023-09-25 15:02 ` Ilya Maximets
@ 2023-09-25 15:12 ` Stefan Hajnoczi
2023-09-25 15:36 ` Ilya Maximets
0 siblings, 1 reply; 12+ messages in thread
From: Stefan Hajnoczi @ 2023-09-25 15:12 UTC (permalink / raw)
To: Ilya Maximets; +Cc: qemu-devel, Jason Wang, Stefan Hajnoczi, Michael S. Tsirkin
On Mon, 25 Sept 2023 at 11:02, Ilya Maximets <i.maximets@ovn.org> wrote:
>
> On 9/25/23 16:23, Stefan Hajnoczi wrote:
> > On Fri, 25 Aug 2023 at 13:04, Ilya Maximets <i.maximets@ovn.org> wrote:
> >>
> >> We do not need the most up to date number of heads, we only want to
> >> know if there is at least one.
> >>
> >> Use shadow variable as long as it is not equal to the last available
> >> index checked. This avoids expensive qatomic dereference of the
> >> RCU-protected memory region cache as well as the memory access itself
> >> and the subsequent memory barrier.
> >>
> >> The change improves performance of the af-xdp network backend by 2-3%.
> >>
> >> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
> >> ---
> >> hw/virtio/virtio.c | 10 +++++++++-
> >> 1 file changed, 9 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> >> index 309038fd46..04bf7cc977 100644
> >> --- a/hw/virtio/virtio.c
> >> +++ b/hw/virtio/virtio.c
> >> @@ -999,7 +999,15 @@ void virtqueue_push(VirtQueue *vq, const VirtQueueElement *elem,
> >> /* Called within rcu_read_lock(). */
> >> static int virtqueue_num_heads(VirtQueue *vq, unsigned int idx)
> >> {
> >> - uint16_t num_heads = vring_avail_idx(vq) - idx;
> >> + uint16_t num_heads;
> >> +
> >> + if (vq->shadow_avail_idx != idx) {
> >> + num_heads = vq->shadow_avail_idx - idx;
> >> +
> >> + return num_heads;
> >
> > This still needs to check num_heads > vq->vring.num and return -EINVAL
> > as is done below.
>
> Hmm, yes, you're right. If the value was incorrect initially, the shadow
> will be incorrect. However, I think we should just not return here in this
> case and let vring_avail_idx() grab an actual new value below. Otherwise
> we may never break out of this error.
>
> Does that make sense?
No, because virtio_error() marks the device as broken. The device
requires a reset in order to function again. Fetching
vring_avail_idx() again won't help.
>
> >
> >> + }
> >> +
> >> + num_heads = vring_avail_idx(vq) - idx;
> >>
> >> /* Check it isn't doing very strange things with descriptor numbers. */
> >> if (num_heads > vq->vring.num) {
> >> --
> >> 2.40.1
> >>
> >>
>
* Re: [PATCH] virtio: use shadow_avail_idx while checking number of heads
2023-09-25 15:12 ` Stefan Hajnoczi
@ 2023-09-25 15:36 ` Ilya Maximets
2023-09-25 15:38 ` Stefan Hajnoczi
0 siblings, 1 reply; 12+ messages in thread
From: Ilya Maximets @ 2023-09-25 15:36 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: i.maximets, qemu-devel, Jason Wang, Stefan Hajnoczi,
Michael S. Tsirkin
On 9/25/23 17:12, Stefan Hajnoczi wrote:
> On Mon, 25 Sept 2023 at 11:02, Ilya Maximets <i.maximets@ovn.org> wrote:
>>
>> On 9/25/23 16:23, Stefan Hajnoczi wrote:
>>> On Fri, 25 Aug 2023 at 13:04, Ilya Maximets <i.maximets@ovn.org> wrote:
>>>>
>>>> We do not need the most up to date number of heads, we only want to
>>>> know if there is at least one.
>>>>
>>>> Use shadow variable as long as it is not equal to the last available
>>>> index checked. This avoids expensive qatomic dereference of the
>>>> RCU-protected memory region cache as well as the memory access itself
>>>> and the subsequent memory barrier.
>>>>
>>>> The change improves performance of the af-xdp network backend by 2-3%.
>>>>
>>>> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
>>>> ---
>>>> hw/virtio/virtio.c | 10 +++++++++-
>>>> 1 file changed, 9 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
>>>> index 309038fd46..04bf7cc977 100644
>>>> --- a/hw/virtio/virtio.c
>>>> +++ b/hw/virtio/virtio.c
>>>> @@ -999,7 +999,15 @@ void virtqueue_push(VirtQueue *vq, const VirtQueueElement *elem,
>>>> /* Called within rcu_read_lock(). */
>>>> static int virtqueue_num_heads(VirtQueue *vq, unsigned int idx)
>>>> {
>>>> - uint16_t num_heads = vring_avail_idx(vq) - idx;
>>>> + uint16_t num_heads;
>>>> +
>>>> + if (vq->shadow_avail_idx != idx) {
>>>> + num_heads = vq->shadow_avail_idx - idx;
>>>> +
>>>> + return num_heads;
>>>
>>> This still needs to check num_heads > vq->vring.num and return -EINVAL
>>> as is done below.
>>
>> Hmm, yes, you're right. If the value was incorrect initially, the shadow
>> will be incorrect. However, I think we should just not return here in this
>> case and let vring_avail_idx() grab an actual new value below. Otherwise
>> we may never break out of this error.
>>
>> Does that make sense?
>
> No, because virtio_error() marks the device as broken. The device
> requires a reset in order to function again. Fetching
> vring_avail_idx() again won't help.
OK, I see. In this case we're talking about a situation where
vring_avail_idx() was called in some other place and stored a bad value
in the shadow variable, then virtqueue_num_heads() got called. Right?
AFAIU, we can still just fall through here and let vring_avail_idx()
read the index again and fail the existing check. That would happen
today without this patch applied.
I'm just trying to avoid duplication of the virtio_error call, i.e.:
if (vq->shadow_avail_idx != idx) {
num_heads = vq->shadow_avail_idx - idx;
/* Check it isn't doing very strange things with descriptor numbers. */
if (num_heads > vq->vring.num) {
virtio_error(vq->vdev, "Guest moved used index from %u to %u",
idx, vq->shadow_avail_idx);
return -EINVAL;
}
return num_heads;
}
vs
if (vq->shadow_avail_idx != idx) {
num_heads = vq->shadow_avail_idx - idx;
/* Only use the shadow value if it was good initially. */
if (num_heads <= vq->vring.num) {
return num_heads;
}
}
What do you think?
Best regards, Ilya Maximets.
* Re: [PATCH] virtio: use shadow_avail_idx while checking number of heads
2023-09-25 15:36 ` Ilya Maximets
@ 2023-09-25 15:38 ` Stefan Hajnoczi
2023-09-25 20:58 ` Ilya Maximets
0 siblings, 1 reply; 12+ messages in thread
From: Stefan Hajnoczi @ 2023-09-25 15:38 UTC (permalink / raw)
To: Ilya Maximets; +Cc: qemu-devel, Jason Wang, Stefan Hajnoczi, Michael S. Tsirkin
On Mon, 25 Sept 2023 at 11:36, Ilya Maximets <i.maximets@ovn.org> wrote:
>
> On 9/25/23 17:12, Stefan Hajnoczi wrote:
> > On Mon, 25 Sept 2023 at 11:02, Ilya Maximets <i.maximets@ovn.org> wrote:
> >>
> >> On 9/25/23 16:23, Stefan Hajnoczi wrote:
> >>> On Fri, 25 Aug 2023 at 13:04, Ilya Maximets <i.maximets@ovn.org> wrote:
> >>>>
> >>>> We do not need the most up to date number of heads, we only want to
> >>>> know if there is at least one.
> >>>>
> >>>> Use shadow variable as long as it is not equal to the last available
> >>>> index checked. This avoids expensive qatomic dereference of the
> >>>> RCU-protected memory region cache as well as the memory access itself
> >>>> and the subsequent memory barrier.
> >>>>
> >>>> The change improves performance of the af-xdp network backend by 2-3%.
> >>>>
> >>>> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
> >>>> ---
> >>>> hw/virtio/virtio.c | 10 +++++++++-
> >>>> 1 file changed, 9 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> >>>> index 309038fd46..04bf7cc977 100644
> >>>> --- a/hw/virtio/virtio.c
> >>>> +++ b/hw/virtio/virtio.c
> >>>> @@ -999,7 +999,15 @@ void virtqueue_push(VirtQueue *vq, const VirtQueueElement *elem,
> >>>> /* Called within rcu_read_lock(). */
> >>>> static int virtqueue_num_heads(VirtQueue *vq, unsigned int idx)
> >>>> {
> >>>> - uint16_t num_heads = vring_avail_idx(vq) - idx;
> >>>> + uint16_t num_heads;
> >>>> +
> >>>> + if (vq->shadow_avail_idx != idx) {
> >>>> + num_heads = vq->shadow_avail_idx - idx;
> >>>> +
> >>>> + return num_heads;
> >>>
> >>> This still needs to check num_heads > vq->vring.num and return -EINVAL
> >>> as is done below.
> >>
> >> Hmm, yes, you're right. If the value was incorrect initially, the shadow
> >> will be incorrect. However, I think we should just not return here in this
> >> case and let vring_avail_idx() grab an actual new value below. Otherwise
> >> we may never break out of this error.
> >>
> >> Does that make sense?
> >
> > No, because virtio_error() marks the device as broken. The device
> > requires a reset in order to function again. Fetching
> > vring_avail_idx() again won't help.
>
> OK, I see. In this case we're talking about a situation where
> vring_avail_idx() was called in some other place and stored a bad value
> in the shadow variable, then virtqueue_num_heads() got called. Right?
>
> AFAIU, we can still just fall through here and let vring_avail_idx()
> read the index again and fail the existing check. That would happen
> today without this patch applied.
Yes, that is fine.
>
> I'm just trying to avoid duplication of the virtio_error call, i.e.:
>
> if (vq->shadow_avail_idx != idx) {
> num_heads = vq->shadow_avail_idx - idx;
>
> /* Check it isn't doing very strange things with descriptor numbers. */
> if (num_heads > vq->vring.num) {
> virtio_error(vq->vdev, "Guest moved used index from %u to %u",
> idx, vq->shadow_avail_idx);
> return -EINVAL;
> }
> return num_heads;
> }
>
> vs
>
> if (vq->shadow_avail_idx != idx) {
> num_heads = vq->shadow_avail_idx - idx;
>
> /* Only use the shadow value if it was good initially. */
> if (num_heads <= vq->vring.num) {
> return num_heads;
> }
> }
>
>
> What do you think?
Sounds good.
>
> Best regards, Ilya Maximets.
* Re: [PATCH] virtio: use shadow_avail_idx while checking number of heads
2023-09-25 15:38 ` Stefan Hajnoczi
@ 2023-09-25 20:58 ` Ilya Maximets
2023-09-25 21:24 ` Michael S. Tsirkin
0 siblings, 1 reply; 12+ messages in thread
From: Ilya Maximets @ 2023-09-25 20:58 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: i.maximets, qemu-devel, Jason Wang, Stefan Hajnoczi,
Michael S. Tsirkin
On 9/25/23 17:38, Stefan Hajnoczi wrote:
> On Mon, 25 Sept 2023 at 11:36, Ilya Maximets <i.maximets@ovn.org> wrote:
>>
>> On 9/25/23 17:12, Stefan Hajnoczi wrote:
>>> On Mon, 25 Sept 2023 at 11:02, Ilya Maximets <i.maximets@ovn.org> wrote:
>>>>
>>>> On 9/25/23 16:23, Stefan Hajnoczi wrote:
>>>>> On Fri, 25 Aug 2023 at 13:04, Ilya Maximets <i.maximets@ovn.org> wrote:
>>>>>>
>>>>>> We do not need the most up to date number of heads, we only want to
>>>>>> know if there is at least one.
>>>>>>
>>>>>> Use shadow variable as long as it is not equal to the last available
>>>>>> index checked. This avoids expensive qatomic dereference of the
>>>>>> RCU-protected memory region cache as well as the memory access itself
>>>>>> and the subsequent memory barrier.
>>>>>>
>>>>>> The change improves performance of the af-xdp network backend by 2-3%.
>>>>>>
>>>>>> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
>>>>>> ---
>>>>>> hw/virtio/virtio.c | 10 +++++++++-
>>>>>> 1 file changed, 9 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
>>>>>> index 309038fd46..04bf7cc977 100644
>>>>>> --- a/hw/virtio/virtio.c
>>>>>> +++ b/hw/virtio/virtio.c
>>>>>> @@ -999,7 +999,15 @@ void virtqueue_push(VirtQueue *vq, const VirtQueueElement *elem,
>>>>>> /* Called within rcu_read_lock(). */
>>>>>> static int virtqueue_num_heads(VirtQueue *vq, unsigned int idx)
>>>>>> {
>>>>>> - uint16_t num_heads = vring_avail_idx(vq) - idx;
>>>>>> + uint16_t num_heads;
>>>>>> +
>>>>>> + if (vq->shadow_avail_idx != idx) {
>>>>>> + num_heads = vq->shadow_avail_idx - idx;
>>>>>> +
>>>>>> + return num_heads;
>>>>>
>>>>> This still needs to check num_heads > vq->vring.num and return -EINVAL
>>>>> as is done below.
>>>>
>>>> Hmm, yes, you're right. If the value was incorrect initially, the shadow
>>>> will be incorrect. However, I think we should just not return here in this
>>>> case and let vring_avail_idx() grab an actual new value below. Otherwise
>>>> we may never break out of this error.
>>>>
>>>> Does that make sense?
>>>
>>> No, because virtio_error() marks the device as broken. The device
>>> requires a reset in order to function again. Fetching
>>> vring_avail_idx() again won't help.
>>
>> OK, I see. In this case we're talking about a situation where
>> vring_avail_idx() was called in some other place and stored a bad value
>> in the shadow variable, then virtqueue_num_heads() got called. Right?
Hmm, I suppose we also need a read barrier after all even if we use
a shadow index. Assuming the index is correct, but the shadow variable
was updated by a call outside of this function, then we may miss a
barrier and read the descriptor out of order, in theory. Read barrier
is going to be a compiler barrier on x86, so the performance gain from
this patch should still be mostly there. I'll test that.
>>
>> AFAIU, we can still just fall through here and let vring_avail_idx()
>> read the index again and fail the existing check. That would happen
>> today without this patch applied.
>
> Yes, that is fine.
>
>>
>> I'm just trying to avoid duplication of the virtio_error call, i.e.:
>>
>> if (vq->shadow_avail_idx != idx) {
>> num_heads = vq->shadow_avail_idx - idx;
>>
>> /* Check it isn't doing very strange things with descriptor numbers. */
>> if (num_heads > vq->vring.num) {
>> virtio_error(vq->vdev, "Guest moved used index from %u to %u",
>> idx, vq->shadow_avail_idx);
>> return -EINVAL;
>> }
>> return num_heads;
>> }
>>
>> vs
>>
>> if (vq->shadow_avail_idx != idx) {
>> num_heads = vq->shadow_avail_idx - idx;
>>
>> /* Only use the shadow value if it was good initially. */
>> if (num_heads <= vq->vring.num) {
>> return num_heads;
>> }
>> }
>>
>>
>> What do you think?
>
> Sounds good.
>
>>
>> Best regards, Ilya Maximets.
* Re: [PATCH] virtio: use shadow_avail_idx while checking number of heads
2023-09-25 20:58 ` Ilya Maximets
@ 2023-09-25 21:24 ` Michael S. Tsirkin
2023-09-25 22:13 ` Ilya Maximets
0 siblings, 1 reply; 12+ messages in thread
From: Michael S. Tsirkin @ 2023-09-25 21:24 UTC (permalink / raw)
To: Ilya Maximets; +Cc: Stefan Hajnoczi, qemu-devel, Jason Wang, Stefan Hajnoczi
On Mon, Sep 25, 2023 at 10:58:05PM +0200, Ilya Maximets wrote:
> On 9/25/23 17:38, Stefan Hajnoczi wrote:
> > On Mon, 25 Sept 2023 at 11:36, Ilya Maximets <i.maximets@ovn.org> wrote:
> >>
> >> On 9/25/23 17:12, Stefan Hajnoczi wrote:
> >>> On Mon, 25 Sept 2023 at 11:02, Ilya Maximets <i.maximets@ovn.org> wrote:
> >>>>
> >>>> On 9/25/23 16:23, Stefan Hajnoczi wrote:
> >>>>> On Fri, 25 Aug 2023 at 13:04, Ilya Maximets <i.maximets@ovn.org> wrote:
> >>>>>>
> >>>>>> We do not need the most up to date number of heads, we only want to
> >>>>>> know if there is at least one.
> >>>>>>
> >>>>>> Use shadow variable as long as it is not equal to the last available
> >>>>>> index checked. This avoids expensive qatomic dereference of the
> >>>>>> RCU-protected memory region cache as well as the memory access itself
> >>>>>> and the subsequent memory barrier.
> >>>>>>
> >>>>>> The change improves performance of the af-xdp network backend by 2-3%.
> >>>>>>
> >>>>>> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
> >>>>>> ---
> >>>>>> hw/virtio/virtio.c | 10 +++++++++-
> >>>>>> 1 file changed, 9 insertions(+), 1 deletion(-)
> >>>>>>
> >>>>>> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> >>>>>> index 309038fd46..04bf7cc977 100644
> >>>>>> --- a/hw/virtio/virtio.c
> >>>>>> +++ b/hw/virtio/virtio.c
> >>>>>> @@ -999,7 +999,15 @@ void virtqueue_push(VirtQueue *vq, const VirtQueueElement *elem,
> >>>>>> /* Called within rcu_read_lock(). */
> >>>>>> static int virtqueue_num_heads(VirtQueue *vq, unsigned int idx)
> >>>>>> {
> >>>>>> - uint16_t num_heads = vring_avail_idx(vq) - idx;
> >>>>>> + uint16_t num_heads;
> >>>>>> +
> >>>>>> + if (vq->shadow_avail_idx != idx) {
> >>>>>> + num_heads = vq->shadow_avail_idx - idx;
> >>>>>> +
> >>>>>> + return num_heads;
> >>>>>
> >>>>> This still needs to check num_heads > vq->vring.num and return -EINVAL
> >>>>> as is done below.
> >>>>
> >>>> Hmm, yes, you're right. If the value was incorrect initially, the shadow
> >>>> will be incorrect. However, I think we should just not return here in this
> >>>> case and let vring_avail_idx() grab an actual new value below. Otherwise
> >>>> we may never break out of this error.
> >>>>
> >>>> Does that make sense?
> >>>
> >>> No, because virtio_error() marks the device as broken. The device
> >>> requires a reset in order to function again. Fetching
> >>> vring_avail_idx() again won't help.
> >>
> >> OK, I see. In this case we're talking about a situation where
> >> vring_avail_idx() was called in some other place and stored a bad value
> >> in the shadow variable, then virtqueue_num_heads() got called. Right?
>
> Hmm, I suppose we also need a read barrier after all even if we use
> a shadow index. Assuming the index is correct, but the shadow variable
> was updated by a call outside of this function, then we may miss a
> barrier and read the descriptor out of order, in theory. Read barrier
> is going to be a compiler barrier on x86, so the performance gain from
> this patch should still be mostly there. I'll test that.
I can't say I understand generally. The shadow is under QEMU control;
I don't think it can be updated concurrently by multiple CPUs.
> >>
> >> AFAIU, we can still just fall through here and let vring_avail_idx()
> >> read the index again and fail the existing check. That would happen
> >> today without this patch applied.
> >
> > Yes, that is fine.
> >
> >>
> >> I'm just trying to avoid duplication of the virtio_error call, i.e.:
> >>
> >> if (vq->shadow_avail_idx != idx) {
> >> num_heads = vq->shadow_avail_idx - idx;
> >>
> >> /* Check it isn't doing very strange things with descriptor numbers. */
> >> if (num_heads > vq->vring.num) {
> >> virtio_error(vq->vdev, "Guest moved used index from %u to %u",
> >> idx, vq->shadow_avail_idx);
> >> return -EINVAL;
> >> }
> >> return num_heads;
> >> }
> >>
> >> vs
> >>
> >> if (vq->shadow_avail_idx != idx) {
> >> num_heads = vq->shadow_avail_idx - idx;
> >>
> >> /* Only use the shadow value if it was good initially. */
> >> if (num_heads <= vq->vring.num) {
> >> return num_heads;
> >> }
> >> }
> >>
> >>
> >> What do you think?
> >
> > Sounds good.
> >
> >>
> >> Best regards, Ilya Maximets.
* Re: [PATCH] virtio: use shadow_avail_idx while checking number of heads
2023-09-25 21:24 ` Michael S. Tsirkin
@ 2023-09-25 22:13 ` Ilya Maximets
2023-09-25 22:24 ` Michael S. Tsirkin
0 siblings, 1 reply; 12+ messages in thread
From: Ilya Maximets @ 2023-09-25 22:13 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: i.maximets, Stefan Hajnoczi, qemu-devel, Jason Wang,
Stefan Hajnoczi
On 9/25/23 23:24, Michael S. Tsirkin wrote:
> On Mon, Sep 25, 2023 at 10:58:05PM +0200, Ilya Maximets wrote:
>> On 9/25/23 17:38, Stefan Hajnoczi wrote:
>>> On Mon, 25 Sept 2023 at 11:36, Ilya Maximets <i.maximets@ovn.org> wrote:
>>>>
>>>> On 9/25/23 17:12, Stefan Hajnoczi wrote:
>>>>> On Mon, 25 Sept 2023 at 11:02, Ilya Maximets <i.maximets@ovn.org> wrote:
>>>>>>
>>>>>> On 9/25/23 16:23, Stefan Hajnoczi wrote:
>>>>>>> On Fri, 25 Aug 2023 at 13:04, Ilya Maximets <i.maximets@ovn.org> wrote:
>>>>>>>>
>>>>>>>> We do not need the most up to date number of heads, we only want to
>>>>>>>> know if there is at least one.
>>>>>>>>
>>>>>>>> Use shadow variable as long as it is not equal to the last available
>>>>>>>> index checked. This avoids expensive qatomic dereference of the
>>>>>>>> RCU-protected memory region cache as well as the memory access itself
>>>>>>>> and the subsequent memory barrier.
>>>>>>>>
>>>>>>>> The change improves performance of the af-xdp network backend by 2-3%.
>>>>>>>>
>>>>>>>> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
>>>>>>>> ---
>>>>>>>> hw/virtio/virtio.c | 10 +++++++++-
>>>>>>>> 1 file changed, 9 insertions(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
>>>>>>>> index 309038fd46..04bf7cc977 100644
>>>>>>>> --- a/hw/virtio/virtio.c
>>>>>>>> +++ b/hw/virtio/virtio.c
>>>>>>>> @@ -999,7 +999,15 @@ void virtqueue_push(VirtQueue *vq, const VirtQueueElement *elem,
>>>>>>>> /* Called within rcu_read_lock(). */
>>>>>>>> static int virtqueue_num_heads(VirtQueue *vq, unsigned int idx)
>>>>>>>> {
>>>>>>>> - uint16_t num_heads = vring_avail_idx(vq) - idx;
>>>>>>>> + uint16_t num_heads;
>>>>>>>> +
>>>>>>>> + if (vq->shadow_avail_idx != idx) {
>>>>>>>> + num_heads = vq->shadow_avail_idx - idx;
>>>>>>>> +
>>>>>>>> + return num_heads;
>>>>>>>
>>>>>>> This still needs to check num_heads > vq->vring.num and return -EINVAL
>>>>>>> as is done below.
>>>>>>
>>>>>> Hmm, yes, you're right. If the value was incorrect initially, the shadow
>>>>>> will be incorrect. However, I think we should just not return here in this
>>>>>> case and let vring_avail_idx() grab an actual new value below. Otherwise
>>>>>> we may never break out of this error.
>>>>>>
>>>>>> Does that make sense?
>>>>>
>>>>> No, because virtio_error() marks the device as broken. The device
>>>>> requires a reset in order to function again. Fetching
>>>>> vring_avail_idx() again won't help.
>>>>
>>>> OK, I see. In this case we're talking about a situation where
>>>> vring_avail_idx() was called in some other place and stored a bad value
>>>> in the shadow variable, then virtqueue_num_heads() got called. Right?
>>
>> Hmm, I suppose we also need a read barrier after all even if we use
>> a shadow index. Assuming the index is correct, but the shadow variable
>> was updated by a call outside of this function, then we may miss a
>> barrier and read the descriptor out of order, in theory. Read barrier
>> is going to be a compiler barrier on x86, so the performance gain from
>> this patch should still be mostly there. I'll test that.
>
> I can't say I understand generally. The shadow is under QEMU control;
> I don't think it can be updated concurrently by multiple CPUs.
It can't, I agree. The scenario I'm thinking about is the following:
1. vring_avail_idx() is called from one of the places other than
virtqueue_num_heads(). Shadow is updated with the current value.
Some users of vring_avail_idx() do not use barriers after the call.
2. virtqueue_split_get_avail_bytes() is called.
3. virtqueue_split_get_avail_bytes() calls virtqueue_num_heads().
4. virtqueue_num_heads() checks the shadow and returns early.
5. virtqueue_split_get_avail_bytes() calls vring_split_desc_read() and
reads the descriptor.
If between steps 1 and 5 we do not have a read barrier, we potentially
risk reading descriptor data that is not yet fully written, because
there is no guarantee that reading the last_avail_idx on step 1 wasn't
reordered with the descriptor read.
In the current code we always have smp_rmb() in virtqueue_num_heads().
But if we return from this function without a barrier, we may have an
issue, IIUC.
I agree that it's kind of a very unlikely scenario and we will probably
have a control dependency between steps 1 and 5 that will prevent the
issue, but it might be safer to just have an explicit barrier in
virtqueue_num_heads().
Does that make sense? Or am I missing something else here?
>
>
>>>>
>>>> AFAIU, we can still just fall through here and let vring_avail_idx()
>>>> read the index again and fail the existing check. That would happen
>>>> today without this patch applied.
>>>
>>> Yes, that is fine.
>>>
>>>>
>>>> I'm just trying to avoid duplication of the virtio_error call, i.e.:
>>>>
>>>> if (vq->shadow_avail_idx != idx) {
>>>> num_heads = vq->shadow_avail_idx - idx;
>>>>
>>>> /* Check it isn't doing very strange things with descriptor numbers. */
>>>> if (num_heads > vq->vring.num) {
>>>> virtio_error(vq->vdev, "Guest moved used index from %u to %u",
>>>> idx, vq->shadow_avail_idx);
>>>> return -EINVAL;
>>>> }
>>>> return num_heads;
>>>> }
>>>>
>>>> vs
>>>>
>>>> if (vq->shadow_avail_idx != idx) {
>>>> num_heads = vq->shadow_avail_idx - idx;
>>>>
>>>> /* Only use the shadow value if it was good initially. */
>>>> if (num_heads <= vq->vring.num) {
>>>> return num_heads;
>>>> }
>>>> }
>>>>
>>>>
>>>> What do you think?
>>>
>>> Sounds good.
>>>
>>>>
>>>> Best regards, Ilya Maximets.
>
* Re: [PATCH] virtio: use shadow_avail_idx while checking number of heads
2023-09-25 22:13 ` Ilya Maximets
@ 2023-09-25 22:24 ` Michael S. Tsirkin
2023-09-27 14:01 ` Ilya Maximets
0 siblings, 1 reply; 12+ messages in thread
From: Michael S. Tsirkin @ 2023-09-25 22:24 UTC (permalink / raw)
To: Ilya Maximets; +Cc: Stefan Hajnoczi, qemu-devel, Jason Wang, Stefan Hajnoczi
On Tue, Sep 26, 2023 at 12:13:11AM +0200, Ilya Maximets wrote:
> On 9/25/23 23:24, Michael S. Tsirkin wrote:
> > On Mon, Sep 25, 2023 at 10:58:05PM +0200, Ilya Maximets wrote:
> >> On 9/25/23 17:38, Stefan Hajnoczi wrote:
> >>> On Mon, 25 Sept 2023 at 11:36, Ilya Maximets <i.maximets@ovn.org> wrote:
> >>>>
> >>>> On 9/25/23 17:12, Stefan Hajnoczi wrote:
> >>>>> On Mon, 25 Sept 2023 at 11:02, Ilya Maximets <i.maximets@ovn.org> wrote:
> >>>>>>
> >>>>>> On 9/25/23 16:23, Stefan Hajnoczi wrote:
> >>>>>>> On Fri, 25 Aug 2023 at 13:04, Ilya Maximets <i.maximets@ovn.org> wrote:
> >>>>>>>>
> >>>>>>>> We do not need the most up to date number of heads, we only want to
> >>>>>>>> know if there is at least one.
> >>>>>>>>
> >>>>>>>> Use shadow variable as long as it is not equal to the last available
> >>>>>>>> index checked. This avoids expensive qatomic dereference of the
> >>>>>>>> RCU-protected memory region cache as well as the memory access itself
> >>>>>>>> and the subsequent memory barrier.
> >>>>>>>>
> >>>>>>>> The change improves performance of the af-xdp network backend by 2-3%.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
> >>>>>>>> ---
> >>>>>>>> hw/virtio/virtio.c | 10 +++++++++-
> >>>>>>>> 1 file changed, 9 insertions(+), 1 deletion(-)
> >>>>>>>>
> >>>>>>>> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> >>>>>>>> index 309038fd46..04bf7cc977 100644
> >>>>>>>> --- a/hw/virtio/virtio.c
> >>>>>>>> +++ b/hw/virtio/virtio.c
> >>>>>>>> @@ -999,7 +999,15 @@ void virtqueue_push(VirtQueue *vq, const VirtQueueElement *elem,
> >>>>>>>> /* Called within rcu_read_lock(). */
> >>>>>>>> static int virtqueue_num_heads(VirtQueue *vq, unsigned int idx)
> >>>>>>>> {
> >>>>>>>> - uint16_t num_heads = vring_avail_idx(vq) - idx;
> >>>>>>>> + uint16_t num_heads;
> >>>>>>>> +
> >>>>>>>> + if (vq->shadow_avail_idx != idx) {
> >>>>>>>> + num_heads = vq->shadow_avail_idx - idx;
> >>>>>>>> +
> >>>>>>>> + return num_heads;
> >>>>>>>
> >>>>>>> This still needs to check num_heads > vq->vring.num and return -EINVAL
> >>>>>>> as is done below.
> >>>>>>
> >>>>>> Hmm, yes, you're right. If the value was incorrect initially, the shadow
> >>>>>> will be incorrect. However, I think we should just not return here in this
> >>>>>> case and let vring_avail_idx() grab an actual new value below. Otherwise
> >>>>>> we may never break out of this error.
> >>>>>>
> >>>>>> Does that make sense?
> >>>>>
> >>>>> No, because virtio_error() marks the device as broken. The device
> >>>>> requires a reset in order to function again. Fetching
> >>>>> vring_avail_idx() again won't help.
> >>>>
> >>>> OK, I see. In this case we're talking about situation where
> >>>> vring_avail_idx() was called in some other place and stored a bad value
> >>>> in the shadow variable, then virtqueue_num_heads() got called. Right?
> >>
> >> Hmm, I suppose we also need a read barrier after all even if we use
> >> a shadow index. Assuming the index is correct, but the shadow variable
> >> was updated by a call outside of this function, then we may miss a
> >> barrier and read the descriptor out of order, in theory. Read barrier
> >> is going to be a compiler barrier on x86, so the performance gain from
> >> this patch should still be mostly there. I'll test that.
> >
> > I can't say I understand generally. shadow is under qemu control,
> > I don't think it can be updated concurrently by multiple CPUs.
>
> It can't, I agree. Scenario I'm thinking about is the following:
>
> 1. vring_avail_idx() is called from one of the places other than
> virtqueue_num_heads(). Shadow is updated with the current value.
> Some users of vring_avail_idx() do not use barriers after the call.
>
> 2. virtqueue_split_get_avail_bytes() is called.
>
> 3. virtqueue_split_get_avail_bytes() calls virtqueue_num_heads().
>
> 4. virtqueue_num_heads() checks the shadow and returns early.
>
> 5. virtqueue_split_get_avail_bytes() calls vring_split_desc_read() and
> reads the descriptor.
>
> If between steps 1 and 5 we do not have a read barrier, we potentially
> risk reading descriptor data that is not yet fully written, because
> there is no guarantee that reading the last_avail_idx on step 1 wasn't
> reordered with the descriptor read.
>
> In current code we always have smp_rmb() in virtqueue_num_heads().
> But if we return from this function without a barrier, we may have an
> issue, IIUC.
>
> I agree that it's kind of a very unlikely scenario and we will probably
> have a control dependency between steps 1 and 5 that will prevent the
> issue, but it might be safer to just have an explicit barrier in
> virtqueue_num_heads().
>
> Does that make sense? Or am I missing something else here?
Aha, got it. Good point, yes. Pls document in a code comment.
> >
> >
> >>>>
> >>>> AFAIU, we can still just fall through here and let vring_avail_idx()
> >>>> read the index again and fail the existing check. That would happen
> >>>> today without this patch applied.
> >>>
> >>> Yes, that is fine.
> >>>
> >>>>
> >>>> I'm just trying to avoid duplication of the virtio_error call, i.e.:
> >>>>
> >>>> if (vq->shadow_avail_idx != idx) {
> >>>> num_heads = vq->shadow_avail_idx - idx;
> >>>>
> >>>> /* Check it isn't doing very strange things with descriptor numbers. */
> >>>> if (num_heads > vq->vring.num) {
> >>>> virtio_error(vq->vdev, "Guest moved used index from %u to %u",
> >>>> idx, vq->shadow_avail_idx);
> >>>> return -EINVAL;
> >>>> }
> >>>> return num_heads;
> >>>> }
> >>>>
> >>>> vs
> >>>>
> >>>> if (vq->shadow_avail_idx != idx) {
> >>>> num_heads = vq->shadow_avail_idx - idx;
> >>>>
> >>>> /* Only use the shadow value if it was good initially. */
> >>>> if (num_heads <= vq->vring.num) {
> >>>> return num_heads;
> >>>> }
> >>>> }
> >>>>
> >>>>
> >>>> What do you think?
> >>>
> >>> Sounds good.
> >>>
> >>>>
> >>>> Best regards, Ilya Maximets.
> >
* Re: [PATCH] virtio: use shadow_avail_idx while checking number of heads
2023-09-25 22:24 ` Michael S. Tsirkin
@ 2023-09-27 14:01 ` Ilya Maximets
0 siblings, 0 replies; 12+ messages in thread
From: Ilya Maximets @ 2023-09-27 14:01 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: i.maximets, Stefan Hajnoczi, qemu-devel, Jason Wang,
Stefan Hajnoczi
On 9/26/23 00:24, Michael S. Tsirkin wrote:
> On Tue, Sep 26, 2023 at 12:13:11AM +0200, Ilya Maximets wrote:
>> On 9/25/23 23:24, Michael S. Tsirkin wrote:
>>> On Mon, Sep 25, 2023 at 10:58:05PM +0200, Ilya Maximets wrote:
>>>> On 9/25/23 17:38, Stefan Hajnoczi wrote:
>>>>> On Mon, 25 Sept 2023 at 11:36, Ilya Maximets <i.maximets@ovn.org> wrote:
>>>>>>
>>>>>> On 9/25/23 17:12, Stefan Hajnoczi wrote:
>>>>>>> On Mon, 25 Sept 2023 at 11:02, Ilya Maximets <i.maximets@ovn.org> wrote:
>>>>>>>>
>>>>>>>> On 9/25/23 16:23, Stefan Hajnoczi wrote:
>>>>>>>>> On Fri, 25 Aug 2023 at 13:04, Ilya Maximets <i.maximets@ovn.org> wrote:
>>>>>>>>>>
>>>>>>>>>> We do not need the most up to date number of heads, we only want to
>>>>>>>>>> know if there is at least one.
>>>>>>>>>>
>>>>>>>>>> Use shadow variable as long as it is not equal to the last available
>>>>>>>>>> index checked. This avoids expensive qatomic dereference of the
>>>>>>>>>> RCU-protected memory region cache as well as the memory access itself
>>>>>>>>>> and the subsequent memory barrier.
>>>>>>>>>>
>>>>>>>>>> The change improves performance of the af-xdp network backend by 2-3%.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
>>>>>>>>>> ---
>>>>>>>>>> hw/virtio/virtio.c | 10 +++++++++-
>>>>>>>>>> 1 file changed, 9 insertions(+), 1 deletion(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
>>>>>>>>>> index 309038fd46..04bf7cc977 100644
>>>>>>>>>> --- a/hw/virtio/virtio.c
>>>>>>>>>> +++ b/hw/virtio/virtio.c
>>>>>>>>>> @@ -999,7 +999,15 @@ void virtqueue_push(VirtQueue *vq, const VirtQueueElement *elem,
>>>>>>>>>> /* Called within rcu_read_lock(). */
>>>>>>>>>> static int virtqueue_num_heads(VirtQueue *vq, unsigned int idx)
>>>>>>>>>> {
>>>>>>>>>> - uint16_t num_heads = vring_avail_idx(vq) - idx;
>>>>>>>>>> + uint16_t num_heads;
>>>>>>>>>> +
>>>>>>>>>> + if (vq->shadow_avail_idx != idx) {
>>>>>>>>>> + num_heads = vq->shadow_avail_idx - idx;
>>>>>>>>>> +
>>>>>>>>>> + return num_heads;
>>>>>>>>>
>>>>>>>>> This still needs to check num_heads > vq->vring.num and return -EINVAL
>>>>>>>>> as is done below.
>>>>>>>>
> >>>>>>>> Hmm, yes, you're right. If the value was incorrect initially, the shadow
> >>>>>>>> will be incorrect. However, I think we should just not return here in this
> >>>>>>>> case and let vring_avail_idx() grab an actual new value below. Otherwise
>>>>>>>> we may never break out of this error.
>>>>>>>>
>>>>>>>> Does that make sense?
>>>>>>>
>>>>>>> No, because virtio_error() marks the device as broken. The device
>>>>>>> requires a reset in order to function again. Fetching
>>>>>>> vring_avail_idx() again won't help.
>>>>>>
>>>>>> OK, I see. In this case we're talking about situation where
>>>>>> vring_avail_idx() was called in some other place and stored a bad value
>>>>>> in the shadow variable, then virtqueue_num_heads() got called. Right?
>>>>
>>>> Hmm, I suppose we also need a read barrier after all even if we use
>>>> a shadow index. Assuming the index is correct, but the shadow variable
>>>> was updated by a call outside of this function, then we may miss a
>>>> barrier and read the descriptor out of order, in theory. Read barrier
>>>> is going to be a compiler barrier on x86, so the performance gain from
>>>> this patch should still be mostly there. I'll test that.
>>>
>>> I can't say I understand generally. shadow is under qemu control,
>>> I don't think it can be updated concurrently by multiple CPUs.
>>
>> It can't, I agree. Scenario I'm thinking about is the following:
>>
>> 1. vring_avail_idx() is called from one of the places other than
>> virtqueue_num_heads(). Shadow is updated with the current value.
>> Some users of vring_avail_idx() do not use barriers after the call.
>>
>> 2. virtqueue_split_get_avail_bytes() is called.
>>
>> 3. virtqueue_split_get_avail_bytes() calls virtqueue_num_heads().
>>
>> 4. virtqueue_num_heads() checks the shadow and returns early.
>>
>> 5. virtqueue_split_get_avail_bytes() calls vring_split_desc_read() and
>> reads the descriptor.
>>
>> If between steps 1 and 5 we do not have a read barrier, we potentially
>> risk reading descriptor data that is not yet fully written, because
>> there is no guarantee that reading the last_avail_idx on step 1 wasn't
>> reordered with the descriptor read.
>>
>> In current code we always have smp_rmb() in virtqueue_num_heads().
>> But if we return from this function without a barrier, we may have an
>> issue, IIUC.
>>
>> I agree that it's kind of a very unlikely scenario and we will probably
>> have a control dependency between steps 1 and 5 that will prevent the
>> issue, but it might be safer to just have an explicit barrier in
>> virtqueue_num_heads().
>>
>> Does that make sense? Or am I missing something else here?
>
> Aha, got it. Good point, yes. Pls document in a code comment.
Done. Posted v2:
https://lore.kernel.org/qemu-devel/20230927135157.2316982-1-i.maximets@ovn.org/
>
>
>>>
>>>
>>>>>>
> >>>>>> AFAIU, we can still just fall through here and let vring_avail_idx()
> >>>>>> read the index again and fail the existing check. That would happen
>>>>>> today without this patch applied.
>>>>>
>>>>> Yes, that is fine.
>>>>>
>>>>>>
> >>>>>> I'm just trying to avoid duplication of the virtio_error call, i.e.:
>>>>>>
>>>>>> if (vq->shadow_avail_idx != idx) {
>>>>>> num_heads = vq->shadow_avail_idx - idx;
>>>>>>
>>>>>> /* Check it isn't doing very strange things with descriptor numbers. */
>>>>>> if (num_heads > vq->vring.num) {
>>>>>> virtio_error(vq->vdev, "Guest moved used index from %u to %u",
>>>>>> idx, vq->shadow_avail_idx);
>>>>>> return -EINVAL;
>>>>>> }
>>>>>> return num_heads;
>>>>>> }
>>>>>>
>>>>>> vs
>>>>>>
>>>>>> if (vq->shadow_avail_idx != idx) {
>>>>>> num_heads = vq->shadow_avail_idx - idx;
>>>>>>
>>>>>> /* Only use the shadow value if it was good initially. */
>>>>>> if (num_heads <= vq->vring.num) {
>>>>>> return num_heads;
>>>>>> }
>>>>>> }
>>>>>>
>>>>>>
>>>>>> What do you think?
>>>>>
>>>>> Sounds good.
>>>>>
>>>>>>
>>>>>> Best regards, Ilya Maximets.
>>>
>
Thread overview: 12+ messages
2023-08-25 17:04 [PATCH] virtio: use shadow_avail_idx while checking number of heads Ilya Maximets
2023-09-25 11:20 ` Ilya Maximets
2023-09-25 14:23 ` Stefan Hajnoczi
2023-09-25 15:02 ` Ilya Maximets
2023-09-25 15:12 ` Stefan Hajnoczi
2023-09-25 15:36 ` Ilya Maximets
2023-09-25 15:38 ` Stefan Hajnoczi
2023-09-25 20:58 ` Ilya Maximets
2023-09-25 21:24 ` Michael S. Tsirkin
2023-09-25 22:13 ` Ilya Maximets
2023-09-25 22:24 ` Michael S. Tsirkin
2023-09-27 14:01 ` Ilya Maximets