From: Zoltan Kiss <zoltan.kiss@citrix.com>
To: Ian Campbell <Ian.Campbell@citrix.com>
Cc: "xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	Tim Deegan <tim@xen.org>, Wei Liu <wei.liu2@citrix.com>
Subject: Re: RING_HAS_UNCONSUMED_REQUESTS oddness
Date: Tue, 11 Mar 2014 23:34:13 +0000
Message-ID: <531F9D75.5030600@citrix.com>
In-Reply-To: <1394553355.30915.72.camel@kazak.uk.xensource.com>

On 11/03/14 15:55, Ian Campbell wrote:
> On Thu, 2014-03-06 at 21:39 +0000, Zoltan Kiss wrote:
>> On 06/03/14 17:30, Tim Deegan wrote:
>>> At 16:31 +0000 on 06 Mar (1394119880), Zoltan Kiss wrote:
>>>> On 06/03/14 15:53, Ian Campbell wrote:
>>>>> On Thu, 2014-03-06 at 15:47 +0000, Zoltan Kiss wrote:
>>>>>> By my understanding, there is no way rsp could be smaller than req, so
>>>>>> there is no point having this. Am I missing something?
>>>>>
>>>>> It happens during wraparound, i.e. after req has wrapped but rsp hasn't
>>>>> yet.
>>>>
>>>> The name of the macro suggests we are interested in whether the ring
>>>> has unconsumed requests, and netback uses it that way. The answer to
>>>> that question is req_prod - req_cons. And it works if prod wrapped but
>>>> cons didn't.
>>>
>>> Yes.
>>>
>>>> rsp calculates the number of "consumed but not responded" requests (it
>>>> also works well if req_cons wrapped but rsp_prod_pvt didn't), then
>>>> subtract it from the ring size.
>>>
>>> That is indeed an odd thing to check, since it seems like it could only
>>> be relevant if the request producer overran the response producer.
>>> It's been there in one form or another since the original ring.h,
>>> and RING_REQUEST_CONS_OVERFLOW does something similar.
>>>
>>> I can't remember the original reasoning, and so I'm reluctant to
>>> suggest removing it without some more eyes on the code...
>>
>> I've added the following printk before the "req < rsp" part:
>>
>> 	if (rsp < req)							\
>> 		pr_err("req %u rsp %u req_prod %u req_cons %u rsp_prod_pvt %u\n", \
>> 		       req, rsp, (_r)->sring->req_prod, (_r)->req_cons,	\
>> 		       (_r)->rsp_prod_pvt);				\
>>
>> And it gave me such results:
>>
>> xen_netback:xenvif_zerocopy_callback: req 4294967279 rsp 52 req_prod
>> 1770663942 req_cons 1770663959 rsp_prod_pvt 1770663755
>>
>> So it can happen that req_prod is behind req_cons, sometimes by as much
>> as 17! But it always happens in this callback of my new grant mapping
>> series, which runs outside the NAPI instance. My theory why this can happen:
>> - callback reads req_prod
>> - frontend writes it
>> - backend picks it up, and consumes those slots
>> - callback reads req_cons
>
> I'm a bit confused by what you mean by "it" in your theory, so perhaps I
> misunderstand. Can you use the actual variable names for clarity.
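I meant req_prod throughout: the callback reads req_prod, the frontend
then advances req_prod, the backend picks up the new req_prod and
consumes those slots, and only then does the callback read req_cons.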

> Are you sure this is a problem with the actual code and not just with
> your debug print? I would expect the real code to be snapshotting things
> as appropriate etc, and also for the public/private state to not
> necessarily be totally in sync when RING_HAS_UNCONSUMED_REQUESTS is
> being called.
I think my above code does the right thing. To double-check, I printed
out req, rsp, and the values they are calculated from. I guess req_prod
is cached in the printk, which is why it still shows the same value that
we read when calculating req. I'll test that with a memory barrier.
>
>> So req can be near UINT_MAX if you call this macro outside the backend.
>> The only place where the actual return value of this macro matters is
>> xenvif_tx_build_gops, and it should be correct there. At other places we
>> are only looking for the fact whether the ring has unconsumed requests
>> or not. If prod is smaller than cons, we clearly read a wrong value. I
>> think what we can do:
>> 1. try again until it's correct
>> 2. just return a non-zero value, it shouldn't cause too much trouble if
>> we say yes here
>> 3. we can't see rsp_cons, so try to figure out if the ring is full of
>> consumed but not responded requests, and return zero then, otherwise a
>> positive value. That's what we do know.
>
> s/know/now/? Is #3 here the status quo?
Yes, that's the status quo. It's a best-effort calculation; rsp only
takes effect if req ~ UINT_MAX.
>
>> Does this make sense? Should we rather go option 1? Should I post a
>> comment patch to document this, and spare a few hours for future
>> generations? :)
>
> Docs are always good IMHO.
>
> Ian.
>
