linux-mm.kvack.org archive mirror
From: Oleksandr Natalenko <oleksandr@natalenko.name>
To: Brian Foster <bfoster@redhat.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Seth Jennings <sjenning@redhat.com>,
	Dan Streetman <ddstreet@ieee.org>,
	Vitaly Wool <vitaly.wool@konsulko.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Miaohe Lin <linmiaohe@huawei.com>
Subject: Re: Panic/lockup in z3fold_zpool_free
Date: Fri, 23 Sep 2022 10:33:14 +0200
Message-ID: <2650562.mvXUDI8C0e@natalenko.name>
In-Reply-To: <YyxJAObMV8tFVLkM@bfoster>

Hello.

On Thursday, 22 September 2022 13:37:36 CEST Brian Foster wrote:
> On Thu, Sep 22, 2022 at 08:53:09AM +0200, Oleksandr Natalenko wrote:
> > Since 5.19 series, zswap went unstable for me under memory pressure, and
> > occasionally I get the following:
> > 
> > ```
> > watchdog: BUG: soft lockup - CPU#0 stuck for 10195s! [mariadbd:478]
> > Modules linked in: netconsole joydev mousedev intel_agp psmouse pcspkr
> > intel_gtt cfg80211 cirrus i2c_piix4 tun rfkill mac_hid nft_ct tcp_bbr2
> > nft_chain_nat nf_tables nfnetlink nf_nat nf_conntrack nf_defrag_ipv6
> > nf_defrag_ipv4 fuse qemu_fw_cfg ip_tables x_tables xfs libcrc32c
> > crc32c_generic dm_crypt cbc encrypted_keys trusted asn1_encoder tee tpm
> > rng_core dm_mod crct10dif_pclmul crc32_pclmul crc32c_intel
> > ghash_clmulni_intel virtio_net aesni_intel serio_raw net_failover
> > ata_generic virtio_balloon failover pata_acpi crypto_simd virtio_blk atkbd
> > libps2 vivaldi_fmap virtio_pci cryptd virtio_pci_legacy_dev ata_piix
> > virtio_pci_modern_dev i8042 floppy serio usbhid
> > Unloaded tainted modules: intel_cstate():1 intel_uncore():1 pcc_cpufreq():1
> > acpi_cpufreq():1
> > CPU: 0 PID: 478 Comm: mariadbd Tainted: G             L    5.19.0-pf5 #1
> > 12baccda8e49539e158b9dd97cbda6c7317d73af
> > Hardware name: Red Hat KVM, BIOS 1.11.0-2.el7 04/01/2014
> > RIP: 0010:z3fold_zpool_free+0x4c/0x5e0
> > Code: 7c 24 08 48 89 04 24 0f 85 e0 00 00 00 48 89 f5 41 bd 00 00 00 80 48
> > 83 e5 c0 48 83 c5 28 eb 0a 48 89 df e8 b6 8d 9f 00 f3 90 <48> 89 ef e8 bc 8b
> > 9f 00 4d 8b 34 24 49 81 e6 00 f0 ff ff 49 8d 5e
> > RSP: 0000:ffffbeadc0e87b68 EFLAGS: 00000202
> > RAX: 0000000000000030 RBX: ffff99ac73d2c010 RCX: ffff99ac4e4ba380
> > RDX: 0000665340000000 RSI: ffffe3b540000000 RDI: ffff99ac73d2c010
> > RBP: ffff99ac55ef3a68 R08: ffff99ac422f0bf0 R09: 000000000000c60b
> > R10: ffffffffffffffc0 R11: 0000000000000000 R12: ffff99ac55ef3a50
> > R13: 0000000080000000 R14: ffff99ac73d2c000 R15: ffff99acf3d2c000
> > FS:  00007f587fcd66c0(0000) GS:ffff99ac7ec00000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00007f587ce8bec8 CR3: 0000000005b48006 CR4: 00000000000206f0
> > Call Trace:
> >  <TASK>
> >  zswap_free_entry+0xb5/0x110
> >  zswap_frontswap_invalidate_page+0x72/0xa0
> >  __frontswap_invalidate_page+0x3a/0x60
> >  swap_range_free+0xb5/0xd0
> >  swapcache_free_entries+0x16e/0x2e0
> >  free_swap_slot+0xb4/0xc0
> >  put_swap_page+0x259/0x420
> >  delete_from_swap_cache+0x63/0xb0
> >  try_to_free_swap+0x1b5/0x2a0
> >  do_swap_page+0x24c/0xb80
> >  __handle_mm_fault+0xa59/0xf70
> >  handle_mm_fault+0x100/0x2f0
> >  do_user_addr_fault+0x1c7/0x6a0
> >  exc_page_fault+0x74/0x170
> >  asm_exc_page_fault+0x26/0x30
> > RIP: 0033:0x556e96280428
> > Code: a0 03 00 00 67 e8 28 64 ff ff 48 8b 83 b0 00 00 00 48 8b 0d da 18 72
> > 00 48 8b 10 66 48 0f 6e c1 48 85 d2 74 27 0f 1f 44 00 00 <48> c7 82 98 00 00
> > 00 00 00 00 00 48 8b 10 48 83 c0 08 f2 0f 11 82
> > RSP: 002b:00007f587fcd3980 EFLAGS: 00010206
> > RAX: 00007f587d028468 RBX: 00007f587cb1a818 RCX: 3ff0000000000000
> > RDX: 00007f587ce8be30 RSI: 0000000000000000 RDI: 00007f587cedd030
> > RBP: 00007f587fcd39c0 R08: 0000000000000016 R09: 0000000000000000
> > R10: 0000000000000008 R11: 0000556e970961a0 R12: 00007f587d1f17b8
> > R13: 00007f5883595598 R14: 00007f587d1f17a8 R15: 00007f587cb1a928
> >  </TASK>
> > ```
> > 
> > This happens on the latest v5.19.10 kernel as well.
> > 
> > Sometimes it's not a soft lockup but GPF, although the stack trace is the
> > same. So, to me it looks like a memory corruption, UAF, double free or
> > something like that.
> > 
> > Have you got any idea regarding what's going on?
> > 
> 
> It might be unrelated, but this looks somewhat similar to a problem I
> hit recently that is caused by swap entry data stored in page->private
> being clobbered when splitting a huge page. That problem was introduced
> in v5.19, so that potentially lines up as well.
> 
> More details in the links below. [1] includes a VM_BUG_ON() splat with
> DEBUG_VM enabled, but the problem originally manifested as a soft lockup
> without the debug checks enabled. [2] includes a properly formatted
> patch. Any chance you could give that a try?

Thanks for your reply.

I'll give it a try. The only problem is that I cannot reproduce the issue at will: it can take anywhere from 1 day to 2 weeks before the panic is hit.

> [1] https://lore.kernel.org/linux-mm/YxDyZLfBdFHK1Y1P@bfoster/
> [2] https://lore.kernel.org/linux-mm/20220906190602.1626037-1-bfoster@redhat.com/

-- 
Oleksandr Natalenko (post-factum)



