From: Peter Xu <peterx@redhat.com>
To: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, Alex Williamson <alex.williamson@redhat.com>
Subject: Re: [PATCH 1/5] mm: Free non-hugetlb large folios in a batch
Date: Wed, 24 Apr 2024 11:20:28 -0400
Message-ID: <ZikjPB0Dt5HA8-uL@x1n>
In-Reply-To: <20240405153228.2563754-2-willy@infradead.org>

On Fri, Apr 05, 2024 at 04:32:23PM +0100, Matthew Wilcox (Oracle) wrote:
> free_unref_folios() can now handle non-hugetlb large folios, so keep
> normal large folios in the batch. hugetlb folios still need to be
> handled specially. I believe that folios freed using put_pages_list()
> cannot be accounted to a memcg (or the small folios would trip the "page
> still charged to cgroup" warning), but put an assertion in to check that.

There is such a user: the IOMMU code uses put_pages_list() to free IOMMU
page tables, and those can be memcg accounted; since 2023, iommu_map has
been switched to use GFP_KERNEL_ACCOUNT.

I hit the panic below while testing my local branch on top of
mm-everything, running some VFIO workloads.

For this specific vfio use case, see 160912fc3d4a ("vfio/type1: account
iommu allocations").

I think we should remove the VM_BUG_ON_FOLIO() line, as the memcg will then
be properly taken care of later in free_pages_prepare(). A fixup that fixes
this crash for me is attached at the end.

Thanks,
[ 10.092411] kernel BUG at mm/swap.c:152!
[ 10.092686] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ 10.093034] CPU: 3 PID: 634 Comm: vfio-pci-mmap-t Tainted: G W 6.9.0-rc4-peterx+ #2
[ 10.093628] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 10.094361] RIP: 0010:put_pages_list+0x12b/0x150
[ 10.094675] Code: 6d 08 48 81 c4 00 01 00 00 5b 5d c3 cc cc cc cc 48 c7 c6 f0 fd 9f 82 e8 63 e8 03 00 0f 0b 48 c7 c6 48 00 a0 82 e8 55 e8 03 00 <0f> 0b 48 c7 c6 28 fe 9f 82 e8 47f
[ 10.095896] RSP: 0018:ffffc9000221bc50 EFLAGS: 00010282
[ 10.096242] RAX: 0000000000000038 RBX: ffffea00042695c0 RCX: 0000000000000000
[ 10.096707] RDX: 0000000000000001 RSI: 0000000000000027 RDI: 00000000ffffffff
[ 10.097177] RBP: ffffc9000221bd68 R08: 0000000000000000 R09: 0000000000000003
[ 10.097642] R10: ffffc9000221bb08 R11: ffffffff8335db48 R12: ffff8881070172c0
[ 10.098113] R13: ffff888102fd0000 R14: ffff888107017210 R15: ffff888110a6c7c0
[ 10.098586] FS: 0000000000000000(0000) GS:ffff888276a00000(0000) knlGS:0000000000000000
[ 10.099117] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 10.099494] CR2: 00007f1910000000 CR3: 000000000323c006 CR4: 0000000000770ef0
[ 10.099972] PKRU: 55555554
[ 10.100154] Call Trace:
[ 10.100321] <TASK>
[ 10.100466] ? die+0x32/0x80
[ 10.100666] ? do_trap+0xd9/0x100
[ 10.100897] ? put_pages_list+0x12b/0x150
[ 10.101168] ? put_pages_list+0x12b/0x150
[ 10.101434] ? do_error_trap+0x81/0x110
[ 10.101688] ? put_pages_list+0x12b/0x150
[ 10.101957] ? exc_invalid_op+0x4c/0x60
[ 10.102216] ? put_pages_list+0x12b/0x150
[ 10.102484] ? asm_exc_invalid_op+0x16/0x20
[ 10.102771] ? put_pages_list+0x12b/0x150
[ 10.103026] ? 0xffffffff81000000
[ 10.103246] ? dma_pte_list_pagetables.isra.0+0x38/0xa0
[ 10.103592] ? dma_pte_list_pagetables.isra.0+0x9b/0xa0
[ 10.103933] ? dma_pte_clear_level+0x18c/0x1a0
[ 10.104228] ? domain_unmap+0x65/0x130
[ 10.104481] ? domain_unmap+0xe6/0x130
[ 10.104735] domain_exit+0x47/0x80
[ 10.104968] vfio_iommu_type1_detach_group+0x3f1/0x5f0
[ 10.105308] ? vfio_group_detach_container+0x3c/0x1a0
[ 10.105644] vfio_group_detach_container+0x60/0x1a0
[ 10.105977] vfio_group_fops_release+0x46/0x80
[ 10.106274] __fput+0x9a/0x2d0
[ 10.106479] task_work_run+0x55/0x90
[ 10.106717] do_exit+0x32f/0xb70
[ 10.106945] ? _raw_spin_unlock_irq+0x24/0x50
[ 10.107237] do_group_exit+0x32/0xa0
[ 10.107481] __x64_sys_exit_group+0x14/0x20
[ 10.107760] do_syscall_64+0x75/0x190
[ 10.108007] entry_SYSCALL_64_after_hwframe+0x76/0x7e
==================================
diff --git a/mm/swap.c b/mm/swap.c
index f0d478eee292..8ae5cd4ed180 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -149,7 +149,6 @@ void put_pages_list(struct list_head *pages)
 			free_huge_folio(folio);
 			continue;
 		}
-		VM_BUG_ON_FOLIO(folio_memcg(folio), folio);
 		/* LRU flag must be clear because it's passed using the lru */
 		if (folio_batch_add(&fbatch, folio) > 0)
 			continue;
--
Peter Xu