From: "Jürgen Groß" <jgross@suse.com>
To: Julien Grall <julien@xen.org>,
	xen-devel@lists.xenproject.org, linux-kernel@vger.kernel.org,
	linux-block@vger.kernel.org, netdev@vger.kernel.org,
	linux-scsi@vger.kernel.org
Cc: "Boris Ostrovsky" <boris.ostrovsky@oracle.com>,
	"Stefano Stabellini" <sstabellini@kernel.org>,
	stable@vger.kernel.org,
	"Konrad Rzeszutek Wilk" <konrad.wilk@oracle.com>,
	"Roger Pau Monné" <roger.pau@citrix.com>,
	"Jens Axboe" <axboe@kernel.dk>, "Wei Liu" <wei.liu@kernel.org>,
	"Paul Durrant" <paul@xen.org>,
	"David S. Miller" <davem@davemloft.net>,
	"Jakub Kicinski" <kuba@kernel.org>
Subject: Re: [PATCH 0/7] xen/events: bug fixes and some diagnostic aids
Date: Sun, 7 Feb 2021 13:58:20 +0100
Message-ID: <eeb62129-d9fc-2155-0e0f-aff1fbb33fbc@suse.com>
In-Reply-To: <bd63694e-ac0c-7954-ec00-edad05f8da1c@xen.org>


On 06.02.21 19:46, Julien Grall wrote:
> Hi Juergen,
> 
> On 06/02/2021 10:49, Juergen Gross wrote:
>> The first three patches are fixes for XSA-332. They avoid WARN splats
>> and a performance issue with interdomain events.
> 
> Thanks for helping to figure out the problem. Unfortunately, I still 
> reliably see the WARN splat with the latest Linux master (1e0d27fce010) + 
> your first 3 patches.
> 
> I am using Xen 4.11 (1c7d984645f9) and dom0 is forced to use the 2L 
> events ABI.
> 
> After some debugging, I think I have an idea what went wrong. The 
> problem happens when the event is initially bound from vCPU0 to a 
> different vCPU.
> 
> From the comment in xen_rebind_evtchn_to_cpu(), we are masking the 
> event to prevent it being delivered on an unexpected vCPU. However, I 
> believe the following can happen:
> 
> vCPU0                   | vCPU1
>                         |
>                         | Call xen_rebind_evtchn_to_cpu()
> receive event X         |
>                         | mask event X
>                         | bind to vCPU1
> <vCPU descheduled>      | unmask event X
>                         |
>                         | receive event X
>                         |
>                         | handle_edge_irq(X)
> handle_edge_irq(X)      |  -> handle_irq_event()
>                         |   -> set IRQD_IN_PROGRESS
>   -> set IRQS_PENDING   |
>                         |   -> evtchn_interrupt()
>                         |   -> clear IRQD_IN_PROGRESS
>                         |  -> IRQS_PENDING is set
>                         |  -> handle_irq_event()
>                         |   -> evtchn_interrupt()
>                         |     -> WARN()
>                         |
> 
> All the lateeoi handlers expect ONESHOT semantics, and 
> evtchn_interrupt() doesn't tolerate any deviation.
> 
> I think the problem was introduced by 7f874a0447a9 ("xen/events: fix 
> lateeoi irq acknowledgment") because the interrupt was disabled 
> previously. Therefore we wouldn't do another iteration in 
> handle_edge_irq().

I think you picked the wrong commit for blaming, as this is just
the last patch of the three patches you were testing.

> Aside from the handlers, I think it may impact the deferred EOI 
> mitigation, because in theory a 3rd vCPU could join the party (let's say 
> vCPU A migrates the event from vCPU B to vCPU C). So info->{eoi_cpu, 
> irq_epoch, eoi_time} could possibly get mangled?
> 
> For a fix, we may want to consider holding evtchn_rwlock for writing. 
> Although, I am not 100% sure this is going to prevent 
> everything.

It will make things worse, as it would violate the locking hierarchy
(xen_rebind_evtchn_to_cpu() is called with the IRQ-desc lock held).

At first glance I think we'll need a 3rd masking state ("temporarily
masked") in the second patch in order to avoid a race with lateeoi.
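
To illustrate the idea only: one way to picture a third masking state is
to track why an event is masked and to unmask only when no reason is
left. The following is a user-space sketch, not the patch; all names
(mask_reason, evt_mask, evt_mask_set, evt_mask_clear) are hypothetical,
and real code would have to do this under the appropriate lock.

```c
#include <stdbool.h>

/* Hypothetical mask reasons; illustrative names, not kernel identifiers. */
enum mask_reason {
    MASK_EXPLICIT  = 1u << 0,  /* masked on request */
    MASK_TEMPORARY = 1u << 1,  /* masked briefly, e.g. while rebinding */
};

/* Hypothetical per-event mask bookkeeping. */
struct evt_mask {
    unsigned int reasons;      /* OR of enum mask_reason bits */
};

/* Record one more reason for keeping the event masked. */
static void evt_mask_set(struct evt_mask *m, enum mask_reason r)
{
    m->reasons |= (unsigned int)r;
}

/*
 * Drop one reason. Returns true only when no reason remains, i.e. when
 * the caller would actually be allowed to unmask the event.
 */
static bool evt_mask_clear(struct evt_mask *m, enum mask_reason r)
{
    m->reasons &= ~(unsigned int)r;
    return m->reasons == 0;
}
```

With this shape, ending a temporary mask cannot accidentally undo a mask
that was set for another reason.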

In order to avoid the race you outlined above we need an "event is being
handled" indicator checked via test_and_set() semantics in
handle_irq_for_port() and reset only when calling clear_evtchn().
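
As a rough illustration of the test_and_set() idea, here is a user-space
sketch using C11 atomics. It is not kernel code: struct evt_state,
handle_event() and finish_event() are hypothetical stand-ins for the
per-event info, handle_irq_for_port() and the point where clear_evtchn()
is called.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-event state; not the kernel's per-IRQ info struct. */
struct evt_state {
    atomic_flag is_active;  /* set while some CPU is handling the event */
    int handled;            /* how often the handler body actually ran */
};

/*
 * Stand-in for handle_irq_for_port(): run the handler only if nobody
 * else is handling the event. Returns false if the caller backed off.
 */
static bool handle_event(struct evt_state *e)
{
    if (atomic_flag_test_and_set(&e->is_active))
        return false;       /* already being handled, do not re-enter */
    e->handled++;           /* the real handler would run here */
    return true;
}

/* Stand-in for the place where clear_evtchn() is called. */
static void finish_event(struct evt_state *e)
{
    atomic_flag_clear(&e->is_active);
}
```

A second delivery of the same event between handle_event() and
finish_event() is dropped instead of re-entering the handler, which is
the behaviour needed to avoid the second evtchn_interrupt() call in the
sequence quoted above.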

> Does my write-up make sense to you?

Yes. What about my reply? ;-)


Juergen
