From: Farhan Ali <alifm@linux.ibm.com>
To: Eric Farman <farman@linux.ibm.com>, cohuck@redhat.com
Cc: pasic@linux.ibm.com, linux-s390@vger.kernel.org, kvm@vger.kernel.org
Subject: Re: [RFC v1 1/1] vfio-ccw: Don't call cp_free if we are processing a channel program
Date: Fri, 21 Jun 2019 10:17:09 -0400
Message-ID: <581d756d-7418-cd67-e0e8-f9e4fe10b22d@linux.ibm.com>
In-Reply-To: <638804dc-53c0-ff2f-d123-13c257ad593f@linux.ibm.com>

On 06/20/2019 04:27 PM, Eric Farman wrote:
>
>
> On 6/20/19 3:40 PM, Farhan Ali wrote:
>> There is a small window where it's possible that an interrupt can
>> arrive and can call cp_free, while we are still processing a channel
>> program (i.e allocating memory, pinnging pages, translating
>
> s/pinnging/pinning/
>
>> addresses etc). This can lead to allocating and freeing at the same
>> time and can cause memory corruption.
>>
>> Let's not call cp_free if we are currently processing a channel program.
>
> The check around this cp_free() call is for a solicited interrupt, so
> it's presumably in response to a SSCH we issued. But if we're still
> processing a CP, then we hadn't issued the SSCH to the hardware yet. So
> what is this interrupt for? Do the contents of irb.cpa provide any
> clues, perhaps if it's in the current cp or for someone else?
>
I don't think the interrupt is in response to an ssch but rather due to
a csch/hsch.
>>
>> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
>> ---
>>
>> I have been running my test overnight with this patch and I haven't
>> seen the stack traces that I mentioned about earlier. I would like
>> to get some reviews on this and also if this is the right thing to
>> do?
>>
>> Thanks
>> Farhan
>>
>> drivers/s390/cio/vfio_ccw_drv.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
>> index 66a66ac..61ece3f 100644
>> --- a/drivers/s390/cio/vfio_ccw_drv.c
>> +++ b/drivers/s390/cio/vfio_ccw_drv.c
>> @@ -88,7 +88,7 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
>> (SCSW_ACTL_DEVACT | SCSW_ACTL_SCHACT));
>> if (scsw_is_solicited(&irb->scsw)) {
>> cp_update_scsw(&private->cp, &irb->scsw);
>
> As I alluded earlier, do we know this irb is for this cp? If no, what
> does this function end up putting in the scsw?
>
>> - if (is_final)
>> + if (is_final && private->state != VFIO_CCW_STATE_CP_PROCESSING)
>
> In looking at how we set this state, and how we exit it, I see we do:
>
> if SSCH got CC0, CP_PROCESSING -> CP_PENDING
> if SSCH got !CC0, CP_PROCESSING -> IDLE
>
> While the first scenario happens immediately after the SSCH instruction,
> I guess it could be just tiny enough, like the io_trigger FSM patch I
> sent a few weeks ago.
>
> Meanwhile, the latter happens way after we return from the jump table.
> So that scenario leaves considerable time for such an interrupt to
> occur, though I don't understand why it would if we got a CC(1-3) on the
> SSCH.
>
> And anyway, the return from fsm_io_helper() in that case will also call
> cp_free(). So why does the cp->initialized check provide protection
> from a double-free in that direction, but not here? I'm confused.

My theory is that it's possible to have 2 different threads
executing cp_free.

If we start with private->state == IDLE and the guest issues a
clear/halt and then an ssch:

- the clear/halt will be issued to hardware, and if it succeeds we
  will return cc=0 to the guest
- the guest can then issue an ssch
- we get an interrupt for the csch/hsch and we queue the interrupt in
  the workqueue
- we start processing the ssch, and at the same time another cpu
  could be working on the interrupt

Thread 1                Thread 2
--------                --------
fsm_io_request          vfio_ccw_sch_io_todo
  cp_init                 cp_free
  cp_prefetch
  fsm_io_helper
  cp_free

The test that I am trying is with a guest running an fio workload, while
at the same time stressing the error recovery path in the guest. So
there are a lot of ssch and a lot of csch.

Of course I don't think my patch completely solves the problem, I think
it just makes the window narrower. I just wanted to get a discussion
started :)

Now that I am thinking more about it, I think we might have to protect
cp with its own mutex.

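
A rough user-space sketch of the "own mutex" idea, using pthreads. This is
only a model under stated assumptions: fake_cp, fake_cp_init, fake_cp_free,
irq_path, run_demo and ROUNDS are made-up names for illustration, none of it
is vfio-ccw code, and a kernel version would use the kernel mutex API
instead of pthreads. It only makes the check-and-free step atomic; it does
not settle how long such a lock would have to be held across cp_prefetch
and fsm_io_helper in the real driver.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

#define ROUNDS 10000 /* arbitrary iteration count for the demo */

/* Hypothetical stand-in for the channel program; NOT struct channel_program. */
struct fake_cp {
	pthread_mutex_t lock; /* the dedicated mutex: guards the fields below */
	bool initialized;
	int *buf;             /* stands in for allocated/pinned resources */
};

struct demo_counts {
	long inits; /* successful fake_cp_init() calls */
	long frees; /* fake_cp_free() calls that actually released buf */
};

static struct fake_cp demo_cp = { .lock = PTHREAD_MUTEX_INITIALIZER };

/* Request path: allocate under the lock. Returns 0 on success, -1 otherwise. */
static int fake_cp_init(struct fake_cp *cp)
{
	int ret = -1;

	pthread_mutex_lock(&cp->lock);
	if (!cp->initialized) {
		cp->buf = calloc(16, sizeof(*cp->buf));
		if (cp->buf) {
			cp->initialized = true;
			ret = 0;
		}
	}
	pthread_mutex_unlock(&cp->lock);
	return ret;
}

/* Free path, callable from either thread. Returns 1 if it freed, else 0. */
static int fake_cp_free(struct fake_cp *cp)
{
	int freed = 0;

	pthread_mutex_lock(&cp->lock);
	if (cp->initialized) {
		free(cp->buf);
		cp->buf = NULL;
		cp->initialized = false;
		freed = 1;
	}
	pthread_mutex_unlock(&cp->lock);
	return freed;
}

/* Models the interrupt work item (Thread 2): it only ever tries to free. */
static void *irq_path(void *arg)
{
	long *frees = arg;

	for (int i = 0; i < ROUNDS; i++)
		*frees += fake_cp_free(&demo_cp);
	return NULL;
}

/* Models the request path (Thread 1) racing against irq_path(). */
struct demo_counts run_demo(void)
{
	struct demo_counts c = { 0, 0 };
	long irq_frees = 0;
	pthread_t t;
	int have_thread = (pthread_create(&t, NULL, irq_path, &irq_frees) == 0);

	for (int i = 0; i < ROUNDS; i++) {
		if (fake_cp_init(&demo_cp) == 0)
			c.inits++;
		c.frees += fake_cp_free(&demo_cp);
	}
	if (have_thread)
		pthread_join(t, NULL);
	c.frees += irq_frees;
	assert(!demo_cp.initialized); /* nothing is left allocated */
	return c;
}
```

Calling run_demo() from a main() and comparing the two counters is enough
to try it (build with something like cc -pthread). With the lock in place,
every successful init is matched by exactly one free, no matter which
thread wins the race.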
Thanks
Farhan
>
>> cp_free(&private->cp);
>> }
>> mutex_lock(&private->io_mutex);
>>
>