Re: [PATCH rc 3/3] iommufd: Set end correctly when doing batch carry

Linux kernel -stable discussions

From: Nicolin Chen <nicolinc@nvidia.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: <iommu@lists.linux.dev>,
	Alex Williamson <alex.williamson@redhat.com>,
	"Lu Baolu" <baolu.lu@linux.intel.com>,
	Eric Auger <eric.auger@redhat.com>,
	"Kevin Tian" <kevin.tian@intel.com>,
	Lixiao Yang <lixiao.yang@intel.com>,
	"Matthew Rosato" <mjrosato@linux.ibm.com>,
	<stable@vger.kernel.org>,
	<syzbot+7574ebfe589049630608@syzkaller.appspotmail.com>,
	Terrence Xu <terrence.xu@intel.com>, Yi Liu <yi.l.liu@intel.com>
Subject: Re: [PATCH rc 3/3] iommufd: Set end correctly when doing batch carry
Date: Tue, 25 Jul 2023 12:55:11 -0700
Message-ID: <ZMAonwbzOgm6IY7/@Asurada-Nvidia>
In-Reply-To: <3-v1-85aacb2af554+bc-iommufd_syz3_jgg@nvidia.com>

On Tue, Jul 25, 2023 at 04:05:50PM -0300, Jason Gunthorpe wrote:
> Even though the test suite covers this it somehow became obscured that
> this wasn't working.
> 
> The test iommufd_ioas.mock_domain.access_domain_destory would blow up
> rarely.
> 
> end should be set to 1 because this just pushed an item, the carry, to the
> pfns list.
> 
> Sometimes the test would blow up with:
> 
>   BUG: kernel NULL pointer dereference, address: 0000000000000000
>   #PF: supervisor read access in kernel mode
>   #PF: error_code(0x0000) - not-present page
>   PGD 0 P4D 0
>   Oops: 0000 [#1] SMP
>   CPU: 5 PID: 584 Comm: iommufd Not tainted 6.5.0-rc1-dirty #1236
>   Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
>   RIP: 0010:batch_unpin+0xa2/0x100 [iommufd]
>   Code: 17 48 81 fe ff ff 07 00 77 70 48 8b 15 b7 be 97 e2 48 85 d2 74 14 48 8b 14 fa 48 85 d2 74 0b 40 0f b6 f6 48 c1 e6 04 48 01 f2 <48> 8b 3a 48 c1 e0 06 89 ca 48 89 de 48 83 e7 f0 48 01 c7 e8 96 dc
>   RSP: 0018:ffffc90001677a58 EFLAGS: 00010246
>   RAX: 00007f7e2646f000 RBX: 0000000000000000 RCX: 0000000000000001
>   RDX: 0000000000000000 RSI: 00000000fefc4c8d RDI: 0000000000fefc4c
>   RBP: ffffc90001677a80 R08: 0000000000000048 R09: 0000000000000200
>   R10: 0000000000030b98 R11: ffffffff81f3bb40 R12: 0000000000000001
>   R13: ffff888101f75800 R14: ffffc90001677ad0 R15: 00000000000001fe
>   FS:  00007f9323679740(0000) GS:ffff8881ba540000(0000) knlGS:0000000000000000
>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   CR2: 0000000000000000 CR3: 0000000105ede003 CR4: 00000000003706a0
>   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>   DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>   Call Trace:
>    <TASK>
>    ? show_regs+0x5c/0x70
>    ? __die+0x1f/0x60
>    ? page_fault_oops+0x15d/0x440
>    ? lock_release+0xbc/0x240
>    ? exc_page_fault+0x4a4/0x970
>    ? asm_exc_page_fault+0x27/0x30
>    ? batch_unpin+0xa2/0x100 [iommufd]
>    ? batch_unpin+0xba/0x100 [iommufd]
>    __iopt_area_unfill_domain+0x198/0x430 [iommufd]
>    ? __mutex_lock+0x8c/0xb80
>    ? __mutex_lock+0x6aa/0xb80
>    ? xa_erase+0x28/0x30
>    ? iopt_table_remove_domain+0x162/0x320 [iommufd]
>    ? lock_release+0xbc/0x240
>    iopt_area_unfill_domain+0xd/0x10 [iommufd]
>    iopt_table_remove_domain+0x195/0x320 [iommufd]
>    iommufd_hw_pagetable_destroy+0xb3/0x110 [iommufd]
>    iommufd_object_destroy_user+0x8e/0xf0 [iommufd]
>    iommufd_device_detach+0xc5/0x140 [iommufd]
>    iommufd_selftest_destroy+0x1f/0x70 [iommufd]
>    iommufd_object_destroy_user+0x8e/0xf0 [iommufd]
>    iommufd_destroy+0x3a/0x50 [iommufd]
>    iommufd_fops_ioctl+0xfb/0x170 [iommufd]
>    __x64_sys_ioctl+0x40d/0x9a0
>    do_syscall_64+0x3c/0x80
>    entry_SYSCALL_64_after_hwframe+0x46/0xb0
> 
> Cc: <stable@vger.kernel.org>
> Fixes: f394576eb11d ("iommufd: PFN handling for iopt_pages")
> Reported-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

This fixes the HugePages memory leak, and likely the rarely triggered
BUG as well, since I can no longer reproduce it with this patch applied.
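
For readers following along, the "end should be set to 1" point in the
quoted commit message can be illustrated with a small userspace model.
This is only a sketch under stated assumptions: the struct and the
demo_* helpers below are hypothetical simplifications written for this
illustration, not the code in the iommufd driver. The model keeps runs
of contiguous PFNs in slots [0, end); when a batch is cleared with a
carry, the carried run is written into slot 0, so end has to become 1
for that slot to count as valid.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical, simplified model of a PFN batch (not the kernel struct).
 * Slot i describes npfns[i] contiguous PFNs starting at pfns[i];
 * only slots [0, end) are valid.
 */
#define DEMO_BATCH_SLOTS 8

struct demo_batch {
	unsigned long pfns[DEMO_BATCH_SLOTS];
	unsigned int npfns[DEMO_BATCH_SLOTS];
	unsigned int end;        /* number of valid slots */
	unsigned int total_pfns; /* sum of npfns over the valid slots */
};

/* Append one PFN, extending the last run when contiguous. Returns 0 if full. */
static int demo_batch_add(struct demo_batch *b, unsigned long pfn)
{
	if (b->end && pfn == b->pfns[b->end - 1] + b->npfns[b->end - 1]) {
		b->npfns[b->end - 1]++;
		b->total_pfns++;
		return 1;
	}
	if (b->end == DEMO_BATCH_SLOTS)
		return 0;
	b->pfns[b->end] = pfn;
	b->npfns[b->end] = 1;
	b->end++;
	b->total_pfns++;
	return 1;
}

/* Reset the batch, carrying the last keep_pfns PFNs of the final run. */
static void demo_batch_clear_carry(struct demo_batch *b, unsigned int keep_pfns)
{
	if (!keep_pfns) {
		b->end = 0;
		b->total_pfns = 0;
		return;
	}
	assert(b->end && b->npfns[b->end - 1] >= keep_pfns);

	b->pfns[0] = b->pfns[b->end - 1] + (b->npfns[b->end - 1] - keep_pfns);
	b->npfns[0] = keep_pfns;
	b->total_pfns = keep_pfns;
	/* Slot 0 now holds the carry, so one slot is valid. Leaving this at 0
	 * would make total_pfns disagree with what a walk over the slots sees. */
	b->end = 1;
}

/* Count the PFNs a consumer sees when walking the valid slots. */
static unsigned int demo_batch_walk(const struct demo_batch *b)
{
	unsigned int i, seen = 0;

	for (i = 0; i < b->end; i++)
		seen += b->npfns[i];
	return seen;
}
```

In this model, if end were left at 0 after a carry, demo_batch_walk()
would report zero PFNs while total_pfns still claimed keep_pfns, and the
next demo_batch_add() would overwrite slot 0 and drop the carried run.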

Tested-by: Nicolin Chen <nicolinc@nvidia.com>

Thanks!

Thread overview: 10+ messages
2023-07-25 19:05 [PATCH rc 0/3] Several iommufd bug fixes Jason Gunthorpe
2023-07-25 19:05 ` [PATCH rc 1/3] iommufd/selftest: Do not try to destroy an access once it is attached Jason Gunthorpe
2023-07-25 21:45   ` Nicolin Chen
2023-07-26  3:53   ` Greg KH
2023-07-25 19:05 ` [PATCH rc 2/3] iommufd: IOMMUFD_DESTROY should not increase the refcount Jason Gunthorpe
2023-07-27  5:25   ` Tian, Kevin
2023-07-27 14:10     ` Jason Gunthorpe
2023-07-25 19:05 ` [PATCH rc 3/3] iommufd: Set end correctly when doing batch carry Jason Gunthorpe
2023-07-25 19:55   ` Nicolin Chen [this message]
2023-07-27  5:26   ` Tian, Kevin
