From: Sasha Levin <sashal@kernel.org>
To: Yu Zhao <yuzhao@google.com>
Cc: Kairui Song <ryncsn@gmail.com>,
	syzkaller-bugs@googlegroups.com,
	syzbot <syzbot+38a0cbd267eff2d286ff@syzkaller.appspotmail.com>,
	akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [syzbot] [mm?] WARNING in lock_list_lru_of_memcg
Date: Mon, 16 Dec 2024 13:39:50 -0500
Message-ID: <Z2Bz9t92Be9l1xqj@lappy>
In-Reply-To: <CAOUHufYXqj5QqZ5Kv4CNn2HyeUGT6RidKGJ6Jp17NUGjqgKAXA@mail.gmail.com>

On Sun, Dec 15, 2024 at 07:45:38PM -0700, Yu Zhao wrote:
>Hi Kairui,
>
>On Sun, Dec 15, 2024 at 10:45 AM Kairui Song <ryncsn@gmail.com> wrote:
>>
>> On Sun, Dec 15, 2024 at 3:43 AM Kairui Song <ryncsn@gmail.com> wrote:
>> >
>> > On Sat, Dec 14, 2024 at 2:06 PM Yu Zhao <yuzhao@google.com> wrote:
>> > >
>> > > On Fri, Dec 13, 2024 at 8:56 PM syzbot
>> > > <syzbot+38a0cbd267eff2d286ff@syzkaller.appspotmail.com> wrote:
>> > > >
>> > > > Hello,
>> > > >
>> > > > syzbot found the following issue on:
>> > > >
>> > > > HEAD commit:    7cb1b4663150 Merge tag 'locking_urgent_for_v6.13_rc3' of g..
>> > > > git tree:       upstream
>> > > > console output: https://syzkaller.appspot.com/x/log.txt?x=16e96b30580000
>> > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=fee25f93665c89ac
>> > > > dashboard link: https://syzkaller.appspot.com/bug?extid=38a0cbd267eff2d286ff
>> > > > compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
>> > > >
>> > > > Unfortunately, I don't have any reproducer for this issue yet.
>> > > >
>> > > > Downloadable assets:
>> > > > disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-7cb1b466.raw.xz
>> > > > vmlinux: https://storage.googleapis.com/syzbot-assets/13e083329dab/vmlinux-7cb1b466.xz
>> > > > kernel image: https://storage.googleapis.com/syzbot-assets/fe3847d08513/bzImage-7cb1b466.xz
>> > > >
>> > > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
>> > > > Reported-by: syzbot+38a0cbd267eff2d286ff@syzkaller.appspotmail.com
>> > > >
>> > > > ------------[ cut here ]------------
>> > > > WARNING: CPU: 0 PID: 80 at mm/list_lru.c:97 lock_list_lru_of_memcg+0x395/0x4e0 mm/list_lru.c:97
>> > > > Modules linked in:
>> > > > CPU: 0 UID: 0 PID: 80 Comm: kswapd0 Not tainted 6.13.0-rc2-syzkaller-00018-g7cb1b4663150 #0
>> > > > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
>> > > > RIP: 0010:lock_list_lru_of_memcg+0x395/0x4e0 mm/list_lru.c:97
>> > > > Code: e9 22 fe ff ff e8 9b cc b6 ff 4c 8b 7c 24 10 45 84 f6 0f 84 40 ff ff ff e9 37 01 00 00 e8 83 cc b6 ff eb 05 e8 7c cc b6 ff 90 <0f> 0b 90 eb 97 89 e9 80 e1 07 80 c1 03 38 c1 0f 8c 7a fd ff ff 48
>> > > > RSP: 0018:ffffc9000105e798 EFLAGS: 00010093
>> > > > RAX: ffffffff81e891c4 RBX: 0000000000000000 RCX: ffff88801f53a440
>> > > > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
>> > > > RBP: ffff888042e70054 R08: ffffffff81e89156 R09: 1ffffffff2032cae
>> > > > R10: dffffc0000000000 R11: fffffbfff2032caf R12: ffffffff81e88e5e
>> > > > R13: ffffffff9a3feb20 R14: 0000000000000000 R15: ffff888042e70000
>> > > > FS:  0000000000000000(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000
>> > > > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> > > > CR2: 0000000020161000 CR3: 0000000032d12000 CR4: 0000000000352ef0
>> > > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> > > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> > > > Call Trace:
>> > > >  <TASK>
>> > > >  list_lru_add+0x59/0x270 mm/list_lru.c:164
>> > > >  list_lru_add_obj+0x17b/0x250 mm/list_lru.c:187
>> > > >  workingset_update_node+0x1af/0x230 mm/workingset.c:634
>> > > >  xas_update lib/xarray.c:355 [inline]
>> > > >  update_node lib/xarray.c:758 [inline]
>> > > >  xas_store+0xb8f/0x1890 lib/xarray.c:845
>> > > >  page_cache_delete mm/filemap.c:149 [inline]
>> > > >  __filemap_remove_folio+0x4e9/0x670 mm/filemap.c:232
>> > > >  __remove_mapping+0x86f/0xad0 mm/vmscan.c:791
>> > > >  shrink_folio_list+0x30a6/0x5ca0 mm/vmscan.c:1467
>> > > >  evict_folios+0x3c86/0x5800 mm/vmscan.c:4593
>> > > >  try_to_shrink_lruvec+0x9a6/0xc70 mm/vmscan.c:4789
>> > > >  shrink_one+0x3b9/0x850 mm/vmscan.c:4834
>> > > >  shrink_many mm/vmscan.c:4897 [inline]
>> > > >  lru_gen_shrink_node mm/vmscan.c:4975 [inline]
>> > > >  shrink_node+0x37c5/0x3e50 mm/vmscan.c:5956
>> > > >  kswapd_shrink_node mm/vmscan.c:6785 [inline]
>> > > >  balance_pgdat mm/vmscan.c:6977 [inline]
>> > > >  kswapd+0x1ca9/0x36f0 mm/vmscan.c:7246
>> > > >  kthread+0x2f0/0x390 kernel/kthread.c:389
>> > > >  ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
>> > > >  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
>> > > >  </TASK>
>> > >
>> > > This one seems to be related to "mm/list_lru: split the lock to
>> > > per-cgroup scope".
>> > >
>> > > Kairui, can you please take a look? Thanks.
>> >
>> > Thanks for the ping. Yes, that's a new sanity check I added.
>> >
>> > It is supposed to mean that a list_lru is being reparented while the
>> > memcg it belongs to isn't dying.
>> >
>> > More concretely, a list_lru is marked dead by memcg_offline_kmem ->
>> > memcg_reparent_list_lrus. If that function has been called for a memcg
>> > but the memcg is not dying, this WARN triggers. I'm not sure how this
>> > is caused. One possibility: if alloc_shrinker_info() in
>> > mem_cgroup_css_online failed, memcg_offline_kmem might be called early?
>> > That doesn't seem to fit this case, though. Or maybe it is just a sync
>> > issue with the memcg dying flag, so the user saw the list_lru dying
>> > before seeing the memcg dying? The object might be leaked to the parent
>> > cgroup, which doesn't seem too terrible though.
>> >
>> > I'm not sure how to reproduce this. I will keep looking.
>>
>> I managed to boot the image using the kernel config provided by the bot;
>> so far, local tests haven't triggered any issue. Is there any way I can
>> reproduce what the bot actually did?
>
>If syzbot doesn't have a repro, it might not be productive for you to
>try to find one. Personally, I would analyze stacktraces and double
>check the code, and move on if I can't find something obviously wrong.
>
>> Or provide some patch for the bot
>> to test?
>
>syzbot can only try patches after it finds a repro. So in this case,
>no, it can't try your patches.
>
>Hope the above clarifies things for you.

Chiming in here as LKFT seems to be able to hit a nearby warning on
boot.

The link below contains the full log as well as additional information
on the run.

https://qa-reports.linaro.org/lkft/linux-mainline-master/build/v6.13-rc2-232-g4800575d8c0b/testrun/26323524/suite/log-parser-test/test/exception-warning-cpu-pid-at-mmlist_lruc-list_lru_del/details/

-- 
Thanks,
Sasha
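
For readers following the thread, the sanity check Kairui describes in the
quoted text can be modeled roughly as below. This is only a userspace sketch
under stated assumptions, not the code in mm/list_lru.c: struct memcg_model,
its fields, warn_count, find_live_lru_owner() and demo_reparent_walk() are
all hypothetical names made up for illustration. The idea it shows is the one
from the explanation: a per-cgroup list that is already dead (reparented)
sends the caller up to the parent, and seeing a dead list under a cgroup that
is not dying is the unexpected state that would warn.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in; not the kernel's real structures. */
struct memcg_model {
	struct memcg_model *parent; /* NULL for the root cgroup */
	bool dying;                 /* models the memcg "dying" flag */
	bool lru_dead;              /* models a list_lru already reparented */
};

/* Counts how often the modeled sanity check fires (stands in for a WARN). */
static int warn_count;

/*
 * Walk up from memcg until a cgroup with a live list is found.
 * A dead list under a cgroup that is not dying is the unexpected
 * combination discussed in the thread, so it bumps warn_count.
 */
static struct memcg_model *find_live_lru_owner(struct memcg_model *memcg)
{
	while (memcg && memcg->lru_dead) {
		if (!memcg->dying)
			warn_count++;
		memcg = memcg->parent;
	}
	return memcg;
}

/* Usage sketch: one expected case and one case that trips the check. */
static void demo_reparent_walk(void)
{
	struct memcg_model root = { NULL, false, false };
	struct memcg_model offlined = { &root, true, true };  /* dying, list dead */
	struct memcg_model odd = { &root, false, true };      /* NOT dying, list dead */

	warn_count = 0;
	assert(find_live_lru_owner(&offlined) == &root);
	assert(warn_count == 0); /* normal reparenting: no warning */

	assert(find_live_lru_owner(&odd) == &root);
	assert(warn_count == 1); /* dead list, live cgroup: warning fires */
}
```

In this model the object still ends up on the parent's list in the "odd"
case, which matches the remark in the quoted text that the object might be
leaked to the parent cgroup. The sketch has no locking or RCU, so it says
nothing about which of the suspected races is the real cause.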
