public inbox for linux-s390@vger.kernel.org
 help / color / mirror / Atom feed
From: Tony Krowiak <akrowiak@linux.ibm.com>
To: Alex Williamson <alex.williamson@redhat.com>,
	Jason Gunthorpe <jgg@nvidia.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>,
	Harald Freudenberger <freude@linux.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>,
	Heiko Carstens <hca@linux.ibm.com>,
	Jason Herne <jjherne@linux.ibm.com>,
	linux-s390@vger.kernel.org, Halil Pasic <pasic@linux.ibm.com>,
	Cornelia Huck <cohuck@redhat.com>, Christoph Hellwig <hch@lst.de>,
	kvm@vger.kernel.org
Subject: Re: [PATCH v2] vfio/ap_ops: Add missed vfio_uninit_group_dev()
Date: Mon, 20 Sep 2021 17:26:25 -0400	[thread overview]
Message-ID: <ee2a0623-84d5-8c21-cc40-de5991ff94b1@linux.ibm.com> (raw)
In-Reply-To: <20210916125130.2db0961e.alex.williamson@redhat.com>



On 9/16/21 2:51 PM, Alex Williamson wrote:
> On Fri, 10 Sep 2021 20:06:30 -0300
> Jason Gunthorpe <jgg@nvidia.com> wrote:
>
>> Without this call an xarray entry is leaked when the vfio_ap device is
>> unprobed. It was missed when the below patch was rebased across the
>> dev_set patch.
>>
>> Fixes: eb0feefd4c02 ("vfio/ap_ops: Convert to use vfio_register_group_dev()")
>> Reviewed-by: Christoph Hellwig <hch@lst.de>
>> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
>> ---
>>   drivers/s390/crypto/vfio_ap_ops.c | 2 ++
>>   1 file changed, 2 insertions(+)
> Hi Tony, Halil, Jason (H),
>
> Any acks for this one?  Thanks,
>
> Alex

I installed this on a test system running the latest linux
code from our library and ran our test suite. I got the
following running a simple test case that assigns some
adapters and domains to a mediated device then
starts a guest using the mdev.

[Sep20 23:19] ======================================================
[  +0.000001] WARNING: possible circular locking dependency detected
[  +0.000002] 5.15.0-rc1-10318-gac08b1c68d1b-dirty #15 Not tainted
[  +0.000002] ------------------------------------------------------
[  +0.000001] nose2/1993 is trying to acquire lock:
[  +0.000002] 00000000cf2a5520 (&new_dev_set->lock){+.+.}-{3:3}, at: 
vfio_uninit_group_dev+0x3c/0xf0 [vfio]
[  +0.000012]
               but task is already holding lock:
[  +0.000001] 00000000c84424d8 (&matrix_dev->lock){+.+.}-{3:3}, at: 
vfio_ap_mdev_remove+0x48/0xc8 [vfio_ap]
[  +0.000006]
               which lock already depends on the new lock.

[  +0.000002]
               the existing dependency chain (in reverse order) is:
[  +0.000001]
               -> #1 (&matrix_dev->lock){+.+.}-{3:3}:
[  +0.000004]        __lock_acquire+0x64c/0xc40
[  +0.000005]        reacquire_held_locks+0x134/0x228
[  +0.000002]        __lock_release+0xc6/0x338
[  +0.000002]        lock_release+0x1b0/0x298
[  +0.000001]        __mutex_unlock_slowpath+0x3a/0x2d8
[  +0.000005]        vfio_ap_mdev_unset_kvm.part.0+0xa6/0xc0 [vfio_ap]
[  +0.000003]        vfio_device_fops_release+0x64/0xe8 [vfio]
[  +0.000003]        __fput+0xae/0x288
[  +0.000005]        task_work_run+0x76/0xc0
[  +0.000005]        do_exit+0x268/0x5c0
[  +0.000005]        do_group_exit+0x48/0xc0
[  +0.000050]        __s390x_sys_exit_group+0x2e/0x30
[  +0.000002]        __do_syscall+0x1c2/0x1f0
[  +0.000004]        system_call+0x78/0xa0
[  +0.000003]
               -> #0 (&new_dev_set->lock){+.+.}-{3:3}:
[  +0.000004]        check_prev_add+0xe0/0xed8
[  +0.000002]        validate_chain+0x9ca/0xde8
[  +0.000001]        __lock_acquire+0x64c/0xc40
[  +0.000002]        lock_acquire.part.0+0xec/0x258
[  +0.000002]        lock_acquire+0xb0/0x200
[  +0.000001]        __mutex_lock+0xa2/0x8f8
[  +0.000002]        mutex_lock_nested+0x32/0x40
[  +0.000002]        vfio_uninit_group_dev+0x3c/0xf0 [vfio]
[  +0.000004]        vfio_ap_mdev_remove+0x90/0xc8 [vfio_ap]
[  +0.000002]        mdev_remove+0x32/0x58 [mdev]
[  +0.000004]        __device_release_driver+0x18a/0x238
[  +0.000005]        device_release_driver+0x40/0x50
[  +0.000002]        bus_remove_device+0x110/0x1a0
[  +0.000002]        device_del+0x19c/0x3e8
[  +0.000002]        mdev_device_remove_common+0x3c/0xb0 [mdev]
[  +0.000003]        mdev_device_remove+0xac/0x100 [mdev]
[  +0.000002]        remove_store+0x78/0x90 [mdev]
[  +0.000003]        kernfs_fop_write_iter+0x13e/0x1e0
[  +0.000004]        new_sync_write+0x100/0x190
[  +0.000003]        vfs_write+0x230/0x2e0
[  +0.000002]        ksys_write+0x6c/0xf8
[  +0.000002]        __do_syscall+0x1c2/0x1f0
[  +0.000002]        system_call+0x78/0xa0
[  +0.000002]
               other info that might help us debug this:

[  +0.000002]  Possible unsafe locking scenario:

[  +0.000001]        CPU0                    CPU1
[  +0.000001]        ----                    ----
[  +0.000001]   lock(&matrix_dev->lock);
[  +0.000002] lock(&new_dev_set->lock);
[  +0.000002] lock(&matrix_dev->lock);
[  +0.000002]   lock(&new_dev_set->lock);
[  +0.000002]
                *** DEADLOCK ***

[  +0.000001] 7 locks held by nose2/1993:
[  +0.000002]  #0: 00000000b373af00 (&f->f_pos_lock){+.+.}-{3:3}, at: 
__fdget_pos+0x6a/0x80
[  +0.000007]  #1: 00000000978e4498 (sb_writers#4){.+.+}-{0:0}, at: 
ksys_write+0x6c/0xf8
[  +0.000005]  #2: 00000000b4b89490 (&of->mutex){+.+.}-{3:3}, at: 
kernfs_fop_write_iter+0x102/0x1e0
[  +0.000006]  #3: 00000000ca215210 (kn->active#215){++++}-{0:0}, at: 
sysfs_remove_file_self+0x42/0x78
[  +0.000006]  #4: 00000000cb7d85b8 (&parent->unreg_sem){++++}-{3:3}, 
at: mdev_device_remove+0x9c/0x100 [mdev]
[  +0.000005]  #5: 00000000cb188990 (&dev->mutex){....}-{3:3}, at: 
device_release_driver+0x32/0x50
[  +0.000006]  #6: 00000000c84424d8 (&matrix_dev->lock){+.+.}-{3:3}, at: 
vfio_ap_mdev_remove+0x48/0xc8 [vfio_ap]
[  +0.000005]
               stack backtrace:
[  +0.000002] CPU: 6 PID: 1993 Comm: nose2 Not tainted 
5.15.0-rc1-10318-gac08b1c68d1b-dirty #15
[  +0.000003] Hardware name: IBM 8561 T01 701 (LPAR)
[  +0.000002] Call Trace:
[  +0.000001]  [<0000000328775056>] dump_stack_lvl+0x8e/0xc8
[  +0.000003]  [<0000000327b4b648>] check_noncircular+0x168/0x188
[  +0.000003]  [<0000000327b4c708>] check_prev_add+0xe0/0xed8
[  +0.000002]  [<0000000327b4deca>] validate_chain+0x9ca/0xde8
[  +0.000002]  [<0000000327b50ec4>] __lock_acquire+0x64c/0xc40
[  +0.000002]  [<0000000327b4facc>] lock_acquire.part.0+0xec/0x258
[  +0.000002]  [<0000000327b4fce8>] lock_acquire+0xb0/0x200
[  +0.000002]  [<00000003287839d2>] __mutex_lock+0xa2/0x8f8
[  +0.000002]  [<000000032878425a>] mutex_lock_nested+0x32/0x40
[  +0.000002]  [<000003ff801c3ce4>] vfio_uninit_group_dev+0x3c/0xf0 [vfio]
[  +0.000004]  [<000003ff805e0000>] vfio_ap_mdev_remove+0x90/0xc8 [vfio_ap]
[  +0.000003]  [<000003ff802339fa>] mdev_remove+0x32/0x58 [mdev]
[  +0.000003]  [<00000003283e9c92>] __device_release_driver+0x18a/0x238
[  +0.000003]  [<00000003283e9d80>] device_release_driver+0x40/0x50
[  +0.000003]  [<00000003283e8e28>] bus_remove_device+0x110/0x1a0
[  +0.000002]  [<00000003283e2cb4>] device_del+0x19c/0x3e8
[  +0.000002]  [<000003ff8023270c>] mdev_device_remove_common+0x3c/0xb0 
[mdev]
[  +0.000003]  [<000003ff80232fcc>] mdev_device_remove+0xac/0x100 [mdev]
[  +0.000003]  [<000003ff80233220>] remove_store+0x78/0x90 [mdev]
[  +0.000003]  [<0000000327eefaae>] kernfs_fop_write_iter+0x13e/0x1e0
[  +0.000003]  [<0000000327dfa160>] new_sync_write+0x100/0x190
[  +0.000002]  [<0000000327dfd038>] vfs_write+0x230/0x2e0
[  +0.000003]  [<0000000327dfd354>] ksys_write+0x6c/0xf8
[  +0.000002]  [<00000003287787b2>] __do_syscall+0x1c2/0x1f0
[  +0.000002]  [<000000032878b178>] system_call+0x78/0xa0
[  +0.000003] INFO: lockdep is turned off.

  reply	other threads:[~2021-09-20 21:28 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-10 23:06 [PATCH v2] vfio/ap_ops: Add missed vfio_uninit_group_dev() Jason Gunthorpe
2021-09-16 18:51 ` Alex Williamson
2021-09-20 21:26   ` Tony Krowiak [this message]
2021-09-20 23:19     ` Jason Gunthorpe
2021-09-21  0:40       ` Tony Krowiak

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ee2a0623-84d5-8c21-cc40-de5991ff94b1@linux.ibm.com \
    --to=akrowiak@linux.ibm.com \
    --cc=alex.williamson@redhat.com \
    --cc=borntraeger@de.ibm.com \
    --cc=cohuck@redhat.com \
    --cc=freude@linux.ibm.com \
    --cc=gor@linux.ibm.com \
    --cc=hca@linux.ibm.com \
    --cc=hch@lst.de \
    --cc=jgg@nvidia.com \
    --cc=jjherne@linux.ibm.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=pasic@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox