From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:22000 "EHLO
        mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL)
        by vger.kernel.org with ESMTP id S1726642AbfGIOWE (ORCPT );
        Tue, 9 Jul 2019 10:22:04 -0400
Received: from pps.filterd (m0098417.ppops.net [127.0.0.1])
        by mx0a-001b2d01.pphosted.com (8.16.0.27/8.16.0.27) with SMTP id x69EExwS113881
        for ; Tue, 9 Jul 2019 10:22:02 -0400
Received: from e06smtp04.uk.ibm.com (e06smtp04.uk.ibm.com [195.75.94.100])
        by mx0a-001b2d01.pphosted.com with ESMTP id 2tmsv9frjc-1
        (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT)
        for ; Tue, 09 Jul 2019 10:21:51 -0400
Received: from localhost
        by e06smtp04.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted
        for from ; Tue, 9 Jul 2019 15:21:48 +0100
Date: Tue, 9 Jul 2019 16:21:42 +0200
From: Halil Pasic
Subject: Re: [RFC v2 4/5] vfio-ccw: Don't call cp_free if we are processing a channel program
In-Reply-To: <45ad7230-3674-2601-af5b-d9beef9312be@linux.ibm.com>
References: <1405df8415d3bff446c22753d0e9b91ff246eb0f.1562616169.git.alifm@linux.ibm.com>
        <20190709121613.6a3554fa.cohuck@redhat.com>
        <45ad7230-3674-2601-af5b-d9beef9312be@linux.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 8bit
Message-Id: <20190709162142.789dd605.pasic@linux.ibm.com>
Sender: linux-s390-owner@vger.kernel.org
List-ID:
To: Farhan Ali
Cc: Cornelia Huck , farman@linux.ibm.com, linux-s390@vger.kernel.org, kvm@vger.kernel.org

On Tue, 9 Jul 2019 09:46:51 -0400
Farhan Ali wrote:

>
>
> On 07/09/2019 06:16 AM, Cornelia Huck wrote:
> > On Mon, 8 Jul 2019 16:10:37 -0400
> > Farhan Ali wrote:
> >
> >> There is a small window where it's possible that we could be working
> >> on an interrupt (queued in the workqueue) and setting up a channel
> >> program (i.e allocating memory, pinning pages, translating address).
> >> This can lead to allocating and freeing the channel program at the
> >> same time and can cause memory corruption.
> >>
> >> Let's not call cp_free if we are currently processing a channel program.
> >> The only way we know for sure that we don't have a thread setting
> >> up a channel program is when the state is set to VFIO_CCW_STATE_CP_PENDING.
> >
> > Can we pinpoint a commit that introduced this bug, or has it been there
> > since the beginning?
> >
>
> I think the problem was always there.
>

I think it became relevant with the async stuff. After the async stuff
was added we started getting solicited interrupts that are not about a
channel program being done. At least this is how I remember the
discussion.

> >>
> >> Signed-off-by: Farhan Ali
> >> ---
> >>   drivers/s390/cio/vfio_ccw_drv.c | 2 +-
> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
> >> index 4e3a903..0357165 100644
> >> --- a/drivers/s390/cio/vfio_ccw_drv.c
> >> +++ b/drivers/s390/cio/vfio_ccw_drv.c
> >> @@ -92,7 +92,7 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
> >>                    (SCSW_ACTL_DEVACT | SCSW_ACTL_SCHACT));
> >>       if (scsw_is_solicited(&irb->scsw)) {
> >>               cp_update_scsw(&private->cp, &irb->scsw);
> >> -             if (is_final)
> >> +             if (is_final && private->state == VFIO_CCW_STATE_CP_PENDING)

Ain't private->state potentially used by multiple threads of execution?

Do we need to use atomic operations or external synchronization to avoid
this being another gamble? Or am I missing something?

> >>                       cp_free(&private->cp);
> >>       }
> >>       mutex_lock(&private->io_mutex);
> >
> > Reviewed-by: Cornelia Huck
> >
> >
> Thanks for reviewing.
>
> Thanks
> Farhan
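
To make the question about private->state more concrete, here is a
minimal userspace sketch in C11 of what "use atomic operations" could
look like: the state check and the state change happen as one
compare-and-exchange, so only the caller that wins the transition goes
on to free. Everything in it (struct sketch_private, the SKETCH_*
states, sketch_io_todo) is a hypothetical stand-in and not the vfio-ccw
code, and whether such a transition fits the driver's state machine is
not something this thread establishes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical states; these only mirror the names used in the patch. */
enum { SKETCH_IDLE, SKETCH_CP_PROCESSING, SKETCH_CP_PENDING };

/* Hypothetical stand-in for the driver's private data. */
struct sketch_private {
	atomic_int state;   /* read and written by several threads */
	bool cp_allocated;  /* stands in for the channel program's memory */
};

/*
 * Interrupt-side path of the sketch. Instead of a plain
 * "if (state == CP_PENDING) free()", which leaves a gap between the
 * check and the free, atomically move CP_PENDING -> IDLE. Only the
 * caller whose compare-and-exchange succeeds may free.
 * Returns true if this call freed the (pretend) channel program.
 */
bool sketch_io_todo(struct sketch_private *p, bool is_final)
{
	int expected = SKETCH_CP_PENDING;

	if (!is_final)
		return false;

	/* Fails, leaving state untouched, if state is not CP_PENDING. */
	if (!atomic_compare_exchange_strong(&p->state, &expected, SKETCH_IDLE))
		return false;

	/* Assumption of the sketch: CP_PENDING implies a channel program exists. */
	assert(p->cp_allocated);
	p->cp_allocated = false;  /* stands in for cp_free() */
	return true;
}
```

With the state at SKETCH_CP_PROCESSING the call returns false and leaves
both fields alone; with SKETCH_CP_PENDING and is_final set it frees
once, and a second call returns false because the state is already
SKETCH_IDLE. External synchronization (holding one lock around both the
check and the free) would be the other option raised above.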