From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: jbeulich@suse.com, andrew.cooper3@citrix.com,
xen-devel@lists.xenproject.org, linux@eikelenboom.it,
malcolm.crossley@citrix.com
Subject: Re: [RFC PATCH] dpci: Put the dpci back on the list if running on another CPU.
Date: Mon, 12 Jan 2015 12:35:36 -0500 [thread overview]
Message-ID: <20150112173536.GA24188@l.oracle.com> (raw)
In-Reply-To: <1421081130-22959-1-git-send-email-konrad.wilk@oracle.com>
On Mon, Jan 12, 2015 at 11:45:30AM -0500, Konrad Rzeszutek Wilk wrote:
> There is race when we clear the STATE_SCHED in the softirq
> - which allows the 'raise_softirq_for' to progress and
> schedule another dpci. During that time the other CPU could
> receive an interrupt and calls 'raise_softirq_for' and put
> the dpci on its per-cpu list. There would be two 'dpci_softirq'
> running at the same time (on different CPUs) where the
> dpci state is STATE_RUN (and STATE_SCHED is cleared). This
> ends up hitting:
>
> if ( test_and_set_bit(STATE_RUN, &pirq_dpci->state) )
> BUG()
>
> Instead of that put the dpci back on the per-cpu list to deal
> with later.
>
> The reason we can get his with this is when an interrupt
> affinity is set over multiple CPUs.
>
> Another potential fix would be to add a guard in the raise_softirq_for
> to check for 'STATE_RUN' bit being set and not schedule the
> dpci until that bit has been cleared.
>
> Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
> Reported-by: Malcolm Crossley <malcolm.crossley@citrix.com>
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
> xen/drivers/passthrough/io.c | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
> index ae050df..9b77334 100644
> --- a/xen/drivers/passthrough/io.c
> +++ b/xen/drivers/passthrough/io.c
> @@ -804,7 +804,17 @@ static void dpci_softirq(void)
> d = pirq_dpci->dom;
> smp_mb(); /* 'd' MUST be saved before we set/clear the bits. */
> if ( test_and_set_bit(STATE_RUN, &pirq_dpci->state) )
> - BUG();
> + {
> + unsigned long flags;
> +
> + /* Put back on the list and retry. */
> + local_irq_save(flags);
> + list_add_tail(&pirq_dpci->softirq_list, &this_cpu(dpci_list));
I chatted with Jan on IRC this, and one worry is that if we add
on our per-cpu list an 'dpci' that is running on another CPU - if
the other CPU runs 'list_del' it will go BOOM.
However, I am not sure if I can come up with a scenario in which
this will be triggered - as by the time we get to checking
the STATE_RUN condition the list removal has already been done.
So adding in on the per-cpu list is OK - and since the STATE_SCHED
is set, it guards against double-list addition.
CPU1 CPU2 CPU3:
softirq_dpci:
list_del() from per-cpu-list
test-and-set(STATE_RUN)
test-and-clear(STATE_SCHED)
..
raise_softirq
test-and-set(STATE_SCHED)
softirq_dpci
list_del from per-cpu-list
if-test-and-set(STATE_RUN) fails:
list_add_tail(..)
continue
raise_softirq
test-and-set(STATE_SCHED)
[fails, so no adding]
..
softirq_dpci
list_del
if-test-and-set(STATE_RUN) fails:
list_add_tail(..) [OK, we did the list_del]
continue..
clear_bit(STATE_RUN)
softrqi_dpci
list_del()
test-and-set(STATE_RUN)
test-and-clear(STATE_SCHED)
> + local_irq_restore(flags);
> +
> + raise_softirq(HVM_DPCI_SOFTIRQ);
> + continue;
> + }
> /*
> * The one who clears STATE_SCHED MUST refcount the domain.
> */
> --
> 2.1.0
>
next prev parent reply other threads:[~2015-01-12 17:35 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-01-12 16:45 [RFC PATCH] dpci: Put the dpci back on the list if running on another CPU Konrad Rzeszutek Wilk
2015-01-12 17:27 ` Sander Eikelenboom
2015-01-12 17:35 ` Konrad Rzeszutek Wilk [this message]
2015-01-13 10:26 ` Jan Beulich
2015-01-13 10:20 ` Jan Beulich
2015-01-23 1:44 ` Konrad Rzeszutek Wilk
2015-01-23 9:37 ` Jan Beulich
2015-01-23 14:54 ` Konrad Rzeszutek Wilk
2015-03-17 17:44 ` Konrad Rzeszutek Wilk
2015-03-17 22:16 ` Sander Eikelenboom
2015-03-18 7:41 ` Jan Beulich
2015-03-18 14:06 ` Konrad Rzeszutek Wilk
2015-03-18 16:43 ` Jan Beulich
2015-03-18 17:00 ` Konrad Rzeszutek Wilk
2015-03-19 7:15 ` Jan Beulich
2015-02-02 14:29 ` Konrad Rzeszutek Wilk
2015-02-02 15:19 ` Jan Beulich
2015-02-02 15:31 ` Konrad Rzeszutek Wilk
2015-02-02 15:48 ` Jan Beulich
2015-02-02 17:44 ` Konrad Rzeszutek Wilk
2015-02-03 8:58 ` Jan Beulich
2015-03-16 17:59 ` Konrad Rzeszutek Wilk
2015-03-17 8:18 ` Jan Beulich
2015-03-17 8:42 ` Sander Eikelenboom
2015-03-17 14:54 ` Konrad Rzeszutek Wilk
2015-03-17 16:01 ` Jan Beulich
2015-03-17 16:09 ` Konrad Rzeszutek Wilk
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150112173536.GA24188@l.oracle.com \
--to=konrad.wilk@oracle.com \
--cc=andrew.cooper3@citrix.com \
--cc=jbeulich@suse.com \
--cc=linux@eikelenboom.it \
--cc=malcolm.crossley@citrix.com \
--cc=xen-devel@lists.xenproject.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.