All of lore.kernel.org
 help / color / mirror / Atom feed
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: andrew.cooper3@citrix.com, malcolm.crossley@citrix.com,
	xen-devel@lists.xenproject.org, linux@eikelenboom.it
Subject: Re: [RFC PATCH] dpci: Put the dpci back on the list if running on another CPU.
Date: Mon, 2 Feb 2015 12:44:46 -0500	[thread overview]
Message-ID: <20150202174446.GA8348@l.oracle.com> (raw)
In-Reply-To: <54CFAA53020000780005C103@mail.emea.novell.com>

On Mon, Feb 02, 2015 at 03:48:19PM +0000, Jan Beulich wrote:
> >>> On 02.02.15 at 16:31, <konrad.wilk@oracle.com> wrote:
> > On Mon, Feb 02, 2015 at 03:19:33PM +0000, Jan Beulich wrote:
> >> >>> On 02.02.15 at 15:29, <konrad.wilk@oracle.com> wrote:
> >> > --- a/xen/drivers/passthrough/io.c
> >> > +++ b/xen/drivers/passthrough/io.c
> >> > @@ -63,10 +63,37 @@ enum {
> >> >  static void raise_softirq_for(struct hvm_pirq_dpci *pirq_dpci)
> >> >  {
> >> >      unsigned long flags;
> >> > +    unsigned long old, new, val = pirq_dpci->state;
> >> >  
> >> > -    if ( test_and_set_bit(STATE_SCHED, &pirq_dpci->state) )
> >> > -        return;
> >> > +    /*
> >> > +     * This cmpxch is a more clear version of:
> >> > +     * if ( test_and_set_bit(STATE_SCHED, &pirq_dpci->state) )
> >> > +     *         return;
> >> > +     * since it also checks for STATE_RUN conditions.
> >> > +     */
> >> > +    for ( ;; )
> >> > +    {
> >> > +        new = 1 << STATE_SCHED;
> >> > +        if ( val )
> >> > +            new |= val;
> >> 
> >> Why the if()?
> > 
> > To 'or' the variable new with '1 << STATE_RUN' in case 'val' changed from
> > the first read ('val = pirq_dpci->state') to the moment when
> > we do the cmpxchg.
> 
> But if "val" is zero, the | simply will do nothing.

Correct. Keep in mind that 'new' is set to '1 << STATE_SCHED' at every
loop iteration - so it ends up old = cmpxchg(.., 0, 1 << STATE_SCHED)
(and old == 0, val == 0, so we end up breaking out of the loop).


> 
> >> > +        old = cmpxchg(&pirq_dpci->state, val, new);
> >> > +        switch ( old )
> >> > +        {
> >> > +        case (1 << STATE_SCHED):
> >> > +        case (1 << STATE_RUN) | (1 << STATE_SCHED):
> >> > +            /*
> >> > +             * Whenever STATE_SCHED is set we MUST not schedule it.
> >> > +             */
> >> > +            return;
> >> > +        }
> >> > +        /*
> >> > +         * If the 'state' is 0 or (1 << STATE_RUN) we can schedule it.
> >> > +         */
> >> 
> >> Really? Wasn't the intention to _not_ schedule when STATE_RUN is
> >> set? (Also the above is a only line comment, i.e. wants different style.)
> > 
> > I must confess I must have misread your last review. You said:
> > 
> > 	> Here is a patch that does this. I don't yet have an setup to test
> > 	> the failing scenario (working on that). I removed the part in
> > 	> the softirq because with this patch I cannot see a way it would
> > 	> ever get there (in the softirq hitting the BUG).
> > 
> > 	Hmm, but how do you now schedule the second instance that needed ...
> > 
> > 	> +    case (1 << STATE_RUN):
> > 
> > 	... in this case?
> > 
> > which to me implied you still want to schedule an 'dpci' when STATE_RUN is 
> > set?
> > 
> > My thinking is that we should still schedule an 'dpci' when STATE_RUN is set
> > as that is inline with what the old tasklet code did. It would
> > schedule the tasklet the moment said tasklet was off the global list (but
> > it would not add it in the global list - that would be the job of the
> > tasklet function wrapper to detect and insert said tasklet back on
> > the global list).
> 
> Didn't the original discussion (and issue) revolve around scheduling
> while STATE_RUN was set? Hence the intention to wait for the flag

Yes.
> to clear - but preferably in an explicit rather than implicit (as your
> current and previous patch do) manner.

If we do explicitly we run risk of dead-lock. See below of an draft
(not even compiled tested) of what I think you mean.

The issue is that 'raise_softirq_for' ends up being called from do_IRQ.
And we might have an IRQ coming in just as we are in the dpci_softirq
having set the 1 << STATE_SCHED. Our spin-and-wait in raise_softirq_for
(in this code below) will spin forever.

One way to not get in that quagmire is to well, do what the previous
patches did.

diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
index ae050df..706a636 100644
--- a/xen/drivers/passthrough/io.c
+++ b/xen/drivers/passthrough/io.c
@@ -63,10 +63,35 @@ enum {
 static void raise_softirq_for(struct hvm_pirq_dpci *pirq_dpci)
 {
     unsigned long flags;
+    unsigned long old, new, val = pirq_dpci->state;
 
-    if ( test_and_set_bit(STATE_SCHED, &pirq_dpci->state) )
-        return;
-
+    for ( ;; )
+    {
+        old = cmpxchg(&pirq_dpci->state, 0, 1 << STATE_SCHED);
+        switch ( old )
+        {
+        case (1 << STATE_SCHED):
+            /*
+             * Whenever STATE_SCHED is set we MUST not schedule it.
+             */
+            return;
+        case (1 << STATE_RUN) | (1 << STATE_SCHED):
+        case (1 << STATE_RUN):
+            /* Getting close to finish. Spin. */
+            continue;
+        }
+        /*
+         * If the 'state' is 0 we can schedule it.
+         */
+        if ( old == 0 )
+            break;
+    }
     get_knownalive_domain(pirq_dpci->dom);
 
     local_irq_save(flags);
> 
> Jan
> 

  reply	other threads:[~2015-02-02 17:44 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-01-12 16:45 [RFC PATCH] dpci: Put the dpci back on the list if running on another CPU Konrad Rzeszutek Wilk
2015-01-12 17:27 ` Sander Eikelenboom
2015-01-12 17:35 ` Konrad Rzeszutek Wilk
2015-01-13 10:26   ` Jan Beulich
2015-01-13 10:20 ` Jan Beulich
2015-01-23  1:44   ` Konrad Rzeszutek Wilk
2015-01-23  9:37     ` Jan Beulich
2015-01-23 14:54       ` Konrad Rzeszutek Wilk
2015-03-17 17:44       ` Konrad Rzeszutek Wilk
2015-03-17 22:16         ` Sander Eikelenboom
2015-03-18  7:41         ` Jan Beulich
2015-03-18 14:06           ` Konrad Rzeszutek Wilk
2015-03-18 16:43             ` Jan Beulich
2015-03-18 17:00               ` Konrad Rzeszutek Wilk
2015-03-19  7:15                 ` Jan Beulich
2015-02-02 14:29   ` Konrad Rzeszutek Wilk
2015-02-02 15:19     ` Jan Beulich
2015-02-02 15:31       ` Konrad Rzeszutek Wilk
2015-02-02 15:48         ` Jan Beulich
2015-02-02 17:44           ` Konrad Rzeszutek Wilk [this message]
2015-02-03  8:58             ` Jan Beulich
2015-03-16 17:59               ` Konrad Rzeszutek Wilk
2015-03-17  8:18                 ` Jan Beulich
2015-03-17  8:42                   ` Sander Eikelenboom
2015-03-17 14:54                     ` Konrad Rzeszutek Wilk
2015-03-17 16:01                       ` Jan Beulich
2015-03-17 16:09                         ` Konrad Rzeszutek Wilk

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150202174446.GA8348@l.oracle.com \
    --to=konrad.wilk@oracle.com \
    --cc=JBeulich@suse.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=linux@eikelenboom.it \
    --cc=malcolm.crossley@citrix.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.