Date: Tue, 9 Apr 2019 08:27:56 -0600
From: Alex Williamson
To: Farhan Ali
Cc: kvm@vger.kernel.org
Subject: Re: [PATCH v2 1/1] vfio: Fix WARNING "do not call blocking ops when !TASK_RUNNING"
Message-ID: <20190409082756.13eda601@x1.home>
References: <159c2af6cc2fc8ca838cfa7ab9a54e8a1b7507b9.1554315372.git.alifm@linux.ibm.com>
Organization: Red Hat

On Tue, 9 Apr 2019 09:13:48 -0400
Farhan Ali wrote:

> On 04/03/2019 02:22 PM, Farhan Ali wrote:
> > vfio_dev_present(), which is the condition to
> > wait_event_interruptible_timeout(), will call vfio_group_get_device
> > and try to acquire the mutex group->device_lock.
> >
> > wait_event_interruptible_timeout() will set the state of the current
> > task to TASK_INTERRUPTIBLE, before doing the condition check. This
> > means that we will try to acquire the mutex while already in a
> > sleeping state. The scheduler warns us by giving the following
> > warning:
> >
> > [ 4050.264464] ------------[ cut here ]------------
> > [ 4050.264508] do not call blocking ops when !TASK_RUNNING; state=1 set at [<00000000b33c00e2>] prepare_to_wait_event+0x14a/0x188
> > [ 4050.264529] WARNING: CPU: 12 PID: 35924 at kernel/sched/core.c:6112 __might_sleep+0x76/0x90
> > ....
> >
> > [ 4050.264756] Call Trace:
> > [ 4050.264765] ([<000000000017bbaa>] __might_sleep+0x72/0x90)
> > [ 4050.264774]  [<0000000000b97edc>] __mutex_lock+0x44/0x8c0
> > [ 4050.264782]  [<0000000000b9878a>] mutex_lock_nested+0x32/0x40
> > [ 4050.264793]  [<000003ff800d7abe>] vfio_group_get_device+0x36/0xa8 [vfio]
> > [ 4050.264803]  [<000003ff800d87c0>] vfio_del_group_dev+0x238/0x378 [vfio]
> > [ 4050.264813]  [<000003ff8015f67c>] mdev_remove+0x3c/0x68 [mdev]
> > [ 4050.264825]  [<00000000008e01b0>] device_release_driver_internal+0x168/0x268
> > [ 4050.264834]  [<00000000008de692>] bus_remove_device+0x162/0x190
> > [ 4050.264843]  [<00000000008daf42>] device_del+0x1e2/0x368
> > [ 4050.264851]  [<00000000008db12c>] device_unregister+0x64/0x88
> > [ 4050.264862]  [<000003ff8015ed84>] mdev_device_remove+0xec/0x130 [mdev]
> > [ 4050.264872]  [<000003ff8015f074>] remove_store+0x6c/0xa8 [mdev]
> > [ 4050.264881]  [<000000000046f494>] kernfs_fop_write+0x14c/0x1f8
> > [ 4050.264890]  [<00000000003c1530>] __vfs_write+0x38/0x1a8
> > [ 4050.264899]  [<00000000003c187c>] vfs_write+0xb4/0x198
> > [ 4050.264908]  [<00000000003c1af2>] ksys_write+0x5a/0xb0
> > [ 4050.264916]  [<0000000000b9e270>] system_call+0xdc/0x2d8
> > [ 4050.264925] 4 locks held by sh/35924:
> > [ 4050.264933]  #0: 000000001ef90325 (sb_writers#4){.+.+}, at: vfs_write+0x9e/0x198
> > [ 4050.264948]  #1: 000000005c1ab0b3 (&of->mutex){+.+.}, at: kernfs_fop_write+0x1cc/0x1f8
> > [ 4050.264963]  #2: 0000000034831ab8 (kn->count#297){++++}, at: kernfs_remove_self+0x12e/0x150
> > [ 4050.264979]  #3: 00000000e152484f (&dev->mutex){....}, at: device_release_driver_internal+0x5c/0x268
> > [ 4050.264993] Last Breaking-Event-Address:
> > [ 4050.265002]  [<000000000017bbaa>] __might_sleep+0x72/0x90
> > [ 4050.265010] irq event stamp: 7039
> > [ 4050.265020] hardirqs last  enabled at (7047): [<00000000001cee7a>] console_unlock+0x6d2/0x740
> > [ 4050.265029] hardirqs last disabled at (7054): [<00000000001ce87e>] console_unlock+0xd6/0x740
> > [ 4050.265040] softirqs last  enabled at (6416): [<0000000000b8fe26>] __udelay+0xb6/0x100
> > [ 4050.265049] softirqs last disabled at (6415): [<0000000000b8fe06>] __udelay+0x96/0x100
> > [ 4050.265057] ---[ end trace d04a07d39d99a9f9 ]---
> >
> > Let's fix this as described in the article https://lwn.net/Articles/628628/.
> >
> > Signed-off-by: Farhan Ali
> > ---
> >
> > I have already tested in my environment and the warning goes
> > away for me with this patch. But I would appreciate further testing
> > and review feedback on the patch.
> >
> > Thanks
> > Farhan
> >
> >
> > ChangeLog
> > ---------
> > v1 -> v2
> >    - Keep the same behavior as before, so the task goes into
> >      TASK_UNINTERRUPTIBLE state after being interrupted once
> >
> > ---
> >
>
> Hi Alex,
>
> Polite ping :)
>
> Do you have any other feedback regarding the patch?

Hi Farhan,

It looks fine to me; it's pending integration and further testing on my
end.  I'll target this for v5.2.  Thanks,

Alex
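
For readers following the thread, the task-state problem described in the
quoted commit message can be modelled in plain userspace C. This is only a
sketch, not the patch and not kernel code: every model_* name below is
hypothetical, and it assumes the fix follows the wait_woken()-style pattern
of checking the condition while the task is still runnable and changing the
task state only around the sleep itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for kernel task states (hypothetical, userspace only). */
enum model_state { MODEL_TASK_RUNNING = 0, MODEL_TASK_INTERRUPTIBLE = 1 };

static enum model_state model_current_state = MODEL_TASK_RUNNING;
static int model_warnings;   /* blocking ops attempted while not RUNNING */
static int model_polls_left; /* condition checks until the device is gone */

/* Hypothetical analogue of __might_sleep(): count misuse instead of WARNing. */
static void model_might_sleep(void)
{
	if (model_current_state != MODEL_TASK_RUNNING)
		model_warnings++;
}

/* Hypothetical analogue of vfio_dev_present(): it takes a (modelled) mutex. */
static bool model_dev_present(void)
{
	model_might_sleep(); /* a mutex lock may block, so the state is checked */
	model_polls_left--;
	return model_polls_left > 0;
}

static void model_reset(int polls)
{
	assert(polls > 0);
	model_current_state = MODEL_TASK_RUNNING;
	model_warnings = 0;
	model_polls_left = polls;
}

/* Old shape: the state is set first, then the condition takes the mutex. */
int model_wait_event_style(int polls)
{
	model_reset(polls);
	for (;;) {
		model_current_state = MODEL_TASK_INTERRUPTIBLE; /* prepare to wait */
		if (!model_dev_present())
			break;
		model_current_state = MODEL_TASK_RUNNING; /* woken after sleeping */
	}
	model_current_state = MODEL_TASK_RUNNING; /* finish the wait */
	return model_warnings;
}

/* Assumed fixed shape: the condition runs while RUNNING, then we sleep. */
int model_wait_woken_style(int polls)
{
	model_reset(polls);
	for (;;) {
		if (!model_dev_present())
			break;
		model_current_state = MODEL_TASK_INTERRUPTIBLE; /* only for the sleep */
		model_current_state = MODEL_TASK_RUNNING;       /* restored on wakeup */
	}
	return model_warnings;
}
```

With three condition checks before the device disappears,
model_wait_event_style(3) returns 3 (one modelled warning per check) and
model_wait_woken_style(3) returns 0, which is the difference the warning in
the trace above is pointing at.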