* [PATCH v3 1/4] xfs: stop reclaim before pushing AIL during unmount [not found] <20260308182804.33127-6-ytohnuki@amazon.com> @ 2026-03-08 18:28 ` Yuto Ohnuki 2026-03-09 16:02 ` Darrick J. Wong 2026-03-08 18:28 ` [PATCH v3 2/4] xfs: refactor xfsaild_push loop into helper Yuto Ohnuki ` (2 subsequent siblings) 3 siblings, 1 reply; 17+ messages in thread From: Yuto Ohnuki @ 2026-03-08 18:28 UTC (permalink / raw) To: Carlos Maiolino, Dave Chinner Cc: Darrick J . Wong, Brian Foster, linux-xfs, linux-kernel, Yuto Ohnuki, syzbot+652af2b3c5569c4ab63c, stable The unmount sequence in xfs_unmount_flush_inodes() pushed the AIL while background reclaim and inodegc are still running. This creates a race where reclaim can free inodes and their log items while the AIL push is still referencing them. Reorder xfs_unmount_flush_inodes() to cancel background reclaim and stop inodegc before pushing the AIL, so that background reclaim and inodegc are no longer running while the AIL is pushed. Reported-by: syzbot+652af2b3c5569c4ab63c@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=652af2b3c5569c4ab63c Fixes: 90c60e164012 ("xfs: xfs_iflush() is no longer necessary") Cc: <stable@vger.kernel.org> # v5.9 Signed-off-by: Yuto Ohnuki <ytohnuki@amazon.com> --- fs/xfs/xfs_mount.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c index 9c295abd0a0a..786e1fc720e5 100644 --- a/fs/xfs/xfs_mount.c +++ b/fs/xfs/xfs_mount.c @@ -621,9 +621,9 @@ xfs_unmount_flush_inodes( xfs_set_unmounting(mp); - xfs_ail_push_all_sync(mp->m_ail); - xfs_inodegc_stop(mp); cancel_delayed_work_sync(&mp->m_reclaim_work); + xfs_inodegc_stop(mp); + xfs_ail_push_all_sync(mp->m_ail); xfs_reclaim_inodes(mp); xfs_health_unmount(mp); xfs_healthmon_unmount(mp); -- 2.50.1 Amazon Web Services EMEA SARL, 38 avenue John F. Kennedy, L-1855 Luxembourg, R.C.S. Luxembourg B186284 Amazon Web Services EMEA SARL, Irish Branch, One Burlington Plaza, Burlington Road, Dublin 4, Ireland, branch registration number 908705 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH v3 1/4] xfs: stop reclaim before pushing AIL during unmount 2026-03-08 18:28 ` [PATCH v3 1/4] xfs: stop reclaim before pushing AIL during unmount Yuto Ohnuki @ 2026-03-09 16:02 ` Darrick J. Wong 2026-03-10 17:33 ` Yuto Ohnuki 0 siblings, 1 reply; 17+ messages in thread From: Darrick J. Wong @ 2026-03-09 16:02 UTC (permalink / raw) To: Yuto Ohnuki Cc: Carlos Maiolino, Dave Chinner, Darrick J . Wong, Brian Foster, linux-xfs, linux-kernel, syzbot+652af2b3c5569c4ab63c, stable On Sun, Mar 08, 2026 at 06:28:06PM +0000, Yuto Ohnuki wrote: > The unmount sequence in xfs_unmount_flush_inodes() pushed the AIL while > background reclaim and inodegc are still running. This creates a race > where reclaim can free inodes and their log items while the AIL push is > still referencing them. Is this a general race between background inode reclaim and AIL pushes? Or is the race between an AIL push and the explicit call to xfs_reclaim_inodes below? I ask because there's a call to xfs_ail_push_all_sync from various places in the codebase: - Log covering/quiescing activities - xchk_checkpoint_log in the online fsck code if the inode btree scrubber thinks it's racing with inode reclaim. If inode reclaim happens to be running at the same time as these AIL pushes, won't the same race condition manifest there? But maybe you meant the race is with the explicit xfs_reclaim_inodes below? > Reorder xfs_unmount_flush_inodes() to cancel background reclaim and stop > inodegc before pushing the AIL, so that background reclaim and inodegc > are no longer running while the AIL is pushed. > > Reported-by: syzbot+652af2b3c5569c4ab63c@syzkaller.appspotmail.com > Closes: https://syzkaller.appspot.com/bug?extid=652af2b3c5569c4ab63c > Fixes: 90c60e164012 ("xfs: xfs_iflush() is no longer necessary") > Cc: <stable@vger.kernel.org> # v5.9 > Signed-off-by: Yuto Ohnuki <ytohnuki@amazon.com> > --- > fs/xfs/xfs_mount.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c > index 9c295abd0a0a..786e1fc720e5 100644 > --- a/fs/xfs/xfs_mount.c > +++ b/fs/xfs/xfs_mount.c > @@ -621,9 +621,9 @@ xfs_unmount_flush_inodes( > > xfs_set_unmounting(mp); > > - xfs_ail_push_all_sync(mp->m_ail); > - xfs_inodegc_stop(mp); > cancel_delayed_work_sync(&mp->m_reclaim_work); > + xfs_inodegc_stop(mp); xfs_inodegc_inactivate (aka the inodegc worker) can call xfs_inodegc_set_reclaimable, which in turn calls xfs_reclaim_work_queue. That will re-queue m_reclaim_work, which we just cancelled. I think inodegc_stop has to come before cancelling m_reclaim_work. --D > + xfs_ail_push_all_sync(mp->m_ail); > xfs_reclaim_inodes(mp); > xfs_health_unmount(mp); > xfs_healthmon_unmount(mp); > -- > 2.50.1 > > > > > Amazon Web Services EMEA SARL, 38 avenue John F. Kennedy, L-1855 Luxembourg, R.C.S. Luxembourg B186284 > > Amazon Web Services EMEA SARL, Irish Branch, One Burlington Plaza, Burlington Road, Dublin 4, Ireland, branch registration number 908705 > > > > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v3 1/4] xfs: stop reclaim before pushing AIL during unmount 2026-03-09 16:02 ` Darrick J. Wong @ 2026-03-10 17:33 ` Yuto Ohnuki 0 siblings, 0 replies; 17+ messages in thread From: Yuto Ohnuki @ 2026-03-10 17:33 UTC (permalink / raw) To: djwong Cc: bfoster, cem, darrick.wong, dchinner, linux-kernel, linux-xfs, stable, syzbot+652af2b3c5569c4ab63c, ytohnuki > Is this a general race between background inode reclaim and AIL pushes? > Or is the race between an AIL push and the explicit call to > xfs_reclaim_inodes below? > > I ask because there's a call to xfs_ail_push_all_sync from various > places in the codebase: > > - Log covering/quiescing activities > > - xchk_checkpoint_log in the online fsck code if the inode btree > scrubber thinks it's racing with inode reclaim. > > If inode reclaim happens to be running at the same time as these AIL > pushes, won't the same race condition manifest there? But maybe you > meant the race is with the explicit xfs_reclaim_inodes below? The UAF itself is a general race between background inode reclaim (and the dquot shrinker) and AIL pushes, not a race between the AIL push and the explicit xfs_reclaim_inodes call below. The syzbot report triggered it during shutdown because aborting dirty inodes makes them reclaimable while still referenced by the AIL, but the unsafe post-push dereferences fixed in patches 2/4 and 3/4 in v4 are not shutdown-specific. Those patches address the general race by capturing log item fields before push callbacks and saving the ailp pointer before dropping the AIL lock. This patch (patch 1/4) is a separate correctness fix for the unmount path. As Dave analysed in his v1 review [1], the unmount sequence is broken independently of the UAF - background reclaim and inodegc should not be running while the AIL is being pushed during unmount. This patch eliminates the conditions that make the general race particularly likely to trigger during unmount. [1] https://lore.kernel.org/all/aai66aCvGC66P8cN@dread/ > xfs_inodegc_inactivate (aka the inodegc worker) can call > xfs_inodegc_set_reclaimable, which in turn calls xfs_reclaim_work_queue. > That will re-queue m_reclaim_work, which we just cancelled. I think > inodegc_stop has to come before cancelling m_reclaim_work. > > --D Thank you for your valuable feedback. Fixed in v4 - xfs_inodegc_stop is now called before cancel_delayed_work_sync, and the function comment is updated to reflect the new ordering. Yuto Amazon Web Services EMEA SARL, 38 avenue John F. Kennedy, L-1855 Luxembourg, R.C.S. Luxembourg B186284 Amazon Web Services EMEA SARL, Irish Branch, One Burlington Plaza, Burlington Road, Dublin 4, Ireland, branch registration number 908705 ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH v3 2/4] xfs: refactor xfsaild_push loop into helper [not found] <20260308182804.33127-6-ytohnuki@amazon.com> 2026-03-08 18:28 ` [PATCH v3 1/4] xfs: stop reclaim before pushing AIL during unmount Yuto Ohnuki @ 2026-03-08 18:28 ` Yuto Ohnuki 2026-03-09 16:14 ` Darrick J. Wong 2026-03-10 5:26 ` Dave Chinner 2026-03-08 18:28 ` [PATCH v3 3/4] xfs: avoid dereferencing log items after push callbacks Yuto Ohnuki 2026-03-08 18:28 ` [PATCH v3 4/4] xfs: save ailp before dropping the AIL lock in " Yuto Ohnuki 3 siblings, 2 replies; 17+ messages in thread From: Yuto Ohnuki @ 2026-03-08 18:28 UTC (permalink / raw) To: Carlos Maiolino, Dave Chinner Cc: Darrick J . Wong, Brian Foster, linux-xfs, linux-kernel, Yuto Ohnuki, stable Factor the loop body of xfsaild_push() into a separate xfsaild_process_logitem() helper to improve readability. This is a pure code movement with no functional change. The subsequent patch to fix a use-after-free in the AIL push path depends on this refactoring. Cc: <stable@vger.kernel.org> # v5.9 Signed-off-by: Yuto Ohnuki <ytohnuki@amazon.com> --- fs/xfs/xfs_trans_ail.c | 116 +++++++++++++++++++++++------------------ 1 file changed, 64 insertions(+), 52 deletions(-) diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c index 923729af4206..ac747804e1d6 100644 --- a/fs/xfs/xfs_trans_ail.c +++ b/fs/xfs/xfs_trans_ail.c @@ -458,6 +458,69 @@ xfs_ail_calc_push_target( return target_lsn; } +static void +xfsaild_process_logitem( + struct xfs_ail *ailp, + struct xfs_log_item *lip, + xfs_lsn_t lsn, + int *stuck, + int *flushing) +{ + struct xfs_mount *mp = ailp->ail_log->l_mp; + int lock_result; + + /* + * Note that iop_push may unlock and reacquire the AIL lock. We + * rely on the AIL cursor implementation to be able to deal with + * the dropped lock. + */ + lock_result = xfsaild_push_item(ailp, lip); + switch (lock_result) { + case XFS_ITEM_SUCCESS: + XFS_STATS_INC(mp, xs_push_ail_success); + trace_xfs_ail_push(lip); + + ailp->ail_last_pushed_lsn = lsn; + break; + + case XFS_ITEM_FLUSHING: + /* + * The item or its backing buffer is already being + * flushed. The typical reason for that is that an + * inode buffer is locked because we already pushed the + * updates to it as part of inode clustering. + * + * We do not want to stop flushing just because lots + * of items are already being flushed, but we need to + * re-try the flushing relatively soon if most of the + * AIL is being flushed. + */ + XFS_STATS_INC(mp, xs_push_ail_flushing); + trace_xfs_ail_flushing(lip); + + (*flushing)++; + ailp->ail_last_pushed_lsn = lsn; + break; + + case XFS_ITEM_PINNED: + XFS_STATS_INC(mp, xs_push_ail_pinned); + trace_xfs_ail_pinned(lip); + + (*stuck)++; + ailp->ail_log_flush++; + break; + case XFS_ITEM_LOCKED: + XFS_STATS_INC(mp, xs_push_ail_locked); + trace_xfs_ail_locked(lip); + + (*stuck)++; + break; + default: + ASSERT(0); + break; + } +} + static long xfsaild_push( struct xfs_ail *ailp) @@ -505,62 +568,11 @@ xfsaild_push( lsn = lip->li_lsn; while ((XFS_LSN_CMP(lip->li_lsn, ailp->ail_target) <= 0)) { - int lock_result; if (test_bit(XFS_LI_FLUSHING, &lip->li_flags)) goto next_item; - /* - * Note that iop_push may unlock and reacquire the AIL lock. We - * rely on the AIL cursor implementation to be able to deal with - * the dropped lock. - */ - lock_result = xfsaild_push_item(ailp, lip); - switch (lock_result) { - case XFS_ITEM_SUCCESS: - XFS_STATS_INC(mp, xs_push_ail_success); - trace_xfs_ail_push(lip); - - ailp->ail_last_pushed_lsn = lsn; - break; - - case XFS_ITEM_FLUSHING: - /* - * The item or its backing buffer is already being - * flushed. The typical reason for that is that an - * inode buffer is locked because we already pushed the - * updates to it as part of inode clustering. - * - * We do not want to stop flushing just because lots - * of items are already being flushed, but we need to - * re-try the flushing relatively soon if most of the - * AIL is being flushed. - */ - XFS_STATS_INC(mp, xs_push_ail_flushing); - trace_xfs_ail_flushing(lip); - - flushing++; - ailp->ail_last_pushed_lsn = lsn; - break; - - case XFS_ITEM_PINNED: - XFS_STATS_INC(mp, xs_push_ail_pinned); - trace_xfs_ail_pinned(lip); - - stuck++; - ailp->ail_log_flush++; - break; - case XFS_ITEM_LOCKED: - XFS_STATS_INC(mp, xs_push_ail_locked); - trace_xfs_ail_locked(lip); - - stuck++; - break; - default: - ASSERT(0); - break; - } - + xfsaild_process_logitem(ailp, lip, lsn, &stuck, &flushing); count++; /* -- 2.50.1 Amazon Web Services EMEA SARL, 38 avenue John F. Kennedy, L-1855 Luxembourg, R.C.S. Luxembourg B186284 Amazon Web Services EMEA SARL, Irish Branch, One Burlington Plaza, Burlington Road, Dublin 4, Ireland, branch registration number 908705 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH v3 2/4] xfs: refactor xfsaild_push loop into helper 2026-03-08 18:28 ` [PATCH v3 2/4] xfs: refactor xfsaild_push loop into helper Yuto Ohnuki @ 2026-03-09 16:14 ` Darrick J. Wong 2026-03-10 17:38 ` Yuto Ohnuki 2026-03-10 5:26 ` Dave Chinner 1 sibling, 1 reply; 17+ messages in thread From: Darrick J. Wong @ 2026-03-09 16:14 UTC (permalink / raw) To: Yuto Ohnuki Cc: Carlos Maiolino, Dave Chinner, Darrick J . Wong, Brian Foster, linux-xfs, linux-kernel, stable On Sun, Mar 08, 2026 at 06:28:07PM +0000, Yuto Ohnuki wrote: > Factor the loop body of xfsaild_push() into a separate > xfsaild_process_logitem() helper to improve readability. > > This is a pure code movement with no functional change. The > subsequent patch to fix a use-after-free in the AIL push path > depends on this refactoring. > > Cc: <stable@vger.kernel.org> # v5.9 > Signed-off-by: Yuto Ohnuki <ytohnuki@amazon.com> Seems like a reasonable hoist to reduce the length of the function, but in ordering the patches this way (cleanup, then bugfixes) the hoist also has to be backported to 5.10/5.15/6.1/6.6/6.12/6.18/6.19. --D > --- > fs/xfs/xfs_trans_ail.c | 116 +++++++++++++++++++++++------------------ > 1 file changed, 64 insertions(+), 52 deletions(-) > > diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c > index 923729af4206..ac747804e1d6 100644 > --- a/fs/xfs/xfs_trans_ail.c > +++ b/fs/xfs/xfs_trans_ail.c > @@ -458,6 +458,69 @@ xfs_ail_calc_push_target( > return target_lsn; > } > > +static void > +xfsaild_process_logitem( > + struct xfs_ail *ailp, > + struct xfs_log_item *lip, > + xfs_lsn_t lsn, > + int *stuck, > + int *flushing) > +{ > + struct xfs_mount *mp = ailp->ail_log->l_mp; > + int lock_result; > + > + /* > + * Note that iop_push may unlock and reacquire the AIL lock. We > + * rely on the AIL cursor implementation to be able to deal with > + * the dropped lock. > + */ > + lock_result = xfsaild_push_item(ailp, lip); > + switch (lock_result) { > + case XFS_ITEM_SUCCESS: > + XFS_STATS_INC(mp, xs_push_ail_success); > + trace_xfs_ail_push(lip); > + > + ailp->ail_last_pushed_lsn = lsn; > + break; > + > + case XFS_ITEM_FLUSHING: > + /* > + * The item or its backing buffer is already being > + * flushed. The typical reason for that is that an > + * inode buffer is locked because we already pushed the > + * updates to it as part of inode clustering. > + * > + * We do not want to stop flushing just because lots > + * of items are already being flushed, but we need to > + * re-try the flushing relatively soon if most of the > + * AIL is being flushed. > + */ > + XFS_STATS_INC(mp, xs_push_ail_flushing); > + trace_xfs_ail_flushing(lip); > + > + (*flushing)++; > + ailp->ail_last_pushed_lsn = lsn; > + break; > + > + case XFS_ITEM_PINNED: > + XFS_STATS_INC(mp, xs_push_ail_pinned); > + trace_xfs_ail_pinned(lip); > + > + (*stuck)++; > + ailp->ail_log_flush++; > + break; > + case XFS_ITEM_LOCKED: > + XFS_STATS_INC(mp, xs_push_ail_locked); > + trace_xfs_ail_locked(lip); > + > + (*stuck)++; > + break; > + default: > + ASSERT(0); > + break; > + } > +} > + > static long > xfsaild_push( > struct xfs_ail *ailp) > @@ -505,62 +568,11 @@ xfsaild_push( > > lsn = lip->li_lsn; > while ((XFS_LSN_CMP(lip->li_lsn, ailp->ail_target) <= 0)) { > - int lock_result; > > if (test_bit(XFS_LI_FLUSHING, &lip->li_flags)) > goto next_item; > > - /* > - * Note that iop_push may unlock and reacquire the AIL lock. We > - * rely on the AIL cursor implementation to be able to deal with > - * the dropped lock. > - */ > - lock_result = xfsaild_push_item(ailp, lip); > - switch (lock_result) { > - case XFS_ITEM_SUCCESS: > - XFS_STATS_INC(mp, xs_push_ail_success); > - trace_xfs_ail_push(lip); > - > - ailp->ail_last_pushed_lsn = lsn; > - break; > - > - case XFS_ITEM_FLUSHING: > - /* > - * The item or its backing buffer is already being > - * flushed. The typical reason for that is that an > - * inode buffer is locked because we already pushed the > - * updates to it as part of inode clustering. > - * > - * We do not want to stop flushing just because lots > - * of items are already being flushed, but we need to > - * re-try the flushing relatively soon if most of the > - * AIL is being flushed. > - */ > - XFS_STATS_INC(mp, xs_push_ail_flushing); > - trace_xfs_ail_flushing(lip); > - > - flushing++; > - ailp->ail_last_pushed_lsn = lsn; > - break; > - > - case XFS_ITEM_PINNED: > - XFS_STATS_INC(mp, xs_push_ail_pinned); > - trace_xfs_ail_pinned(lip); > - > - stuck++; > - ailp->ail_log_flush++; > - break; > - case XFS_ITEM_LOCKED: > - XFS_STATS_INC(mp, xs_push_ail_locked); > - trace_xfs_ail_locked(lip); > - > - stuck++; > - break; > - default: > - ASSERT(0); > - break; > - } > - > + xfsaild_process_logitem(ailp, lip, lsn, &stuck, &flushing); > count++; > > /* > -- > 2.50.1 > > > > > Amazon Web Services EMEA SARL, 38 avenue John F. Kennedy, L-1855 Luxembourg, R.C.S. Luxembourg B186284 > > Amazon Web Services EMEA SARL, Irish Branch, One Burlington Plaza, Burlington Road, Dublin 4, Ireland, branch registration number 908705 > > > > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v3 2/4] xfs: refactor xfsaild_push loop into helper 2026-03-09 16:14 ` Darrick J. Wong @ 2026-03-10 17:38 ` Yuto Ohnuki 0 siblings, 0 replies; 17+ messages in thread From: Yuto Ohnuki @ 2026-03-10 17:38 UTC (permalink / raw) To: djwong Cc: bfoster, cem, darrick.wong, dchinner, linux-kernel, linux-xfs, stable, ytohnuki > Seems like a reasonable hoist to reduce the length of the function, but > in ordering the patches this way (cleanup, then bugfixes) the hoist also > has to be backported to 5.10/5.15/6.1/6.6/6.12/6.18/6.19. > > --D Thank you. In v4, I moved the refactoring to patch 4/4 after the bugfix patches and dropped the Cc: stable tag, so it no longer needs to be backported. Yuto Amazon Web Services EMEA SARL, 38 avenue John F. Kennedy, L-1855 Luxembourg, R.C.S. Luxembourg B186284 Amazon Web Services EMEA SARL, Irish Branch, One Burlington Plaza, Burlington Road, Dublin 4, Ireland, branch registration number 908705 ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v3 2/4] xfs: refactor xfsaild_push loop into helper 2026-03-08 18:28 ` [PATCH v3 2/4] xfs: refactor xfsaild_push loop into helper Yuto Ohnuki 2026-03-09 16:14 ` Darrick J. Wong @ 2026-03-10 5:26 ` Dave Chinner 2026-03-10 17:46 ` Yuto Ohnuki 1 sibling, 1 reply; 17+ messages in thread From: Dave Chinner @ 2026-03-10 5:26 UTC (permalink / raw) To: Yuto Ohnuki Cc: Carlos Maiolino, Dave Chinner, Darrick J . Wong, Brian Foster, linux-xfs, linux-kernel, stable On Sun, Mar 08, 2026 at 06:28:07PM +0000, Yuto Ohnuki wrote: > Factor the loop body of xfsaild_push() into a separate > xfsaild_process_logitem() helper to improve readability. > > This is a pure code movement with no functional change. The > subsequent patch to fix a use-after-free in the AIL push path > depends on this refactoring. > > Cc: <stable@vger.kernel.org> # v5.9 > Signed-off-by: Yuto Ohnuki <ytohnuki@amazon.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> -- Dave Chinner dgc@kernel.org ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v3 2/4] xfs: refactor xfsaild_push loop into helper 2026-03-10 5:26 ` Dave Chinner @ 2026-03-10 17:46 ` Yuto Ohnuki 0 siblings, 0 replies; 17+ messages in thread From: Yuto Ohnuki @ 2026-03-10 17:46 UTC (permalink / raw) To: dgc Cc: bfoster, cem, darrick.wong, dchinner, linux-kernel, linux-xfs, stable, ytohnuki > > Factor the loop body of xfsaild_push() into a separate > > xfsaild_process_logitem() helper to improve readability. > > > > This is a pure code movement with no functional change. The > > subsequent patch to fix a use-after-free in the AIL push path > > depends on this refactoring. > > > > Cc: <stable@vger.kernel.org> # v5.9 > > Signed-off-by: Yuto Ohnuki <ytohnuki@amazon.com> > > Reviewed-by: Dave Chinner <dchinner@redhat.com> Thanks for the review, Dave. In v4, I reworked the patch ordering so that the bugfix patches don't depend on the refactoring patch, reducing the stable backport burden. Since the context has changed slightly, I've dropped your Reviewed-by from this patch in v4 just to be safe. I would appreciate it if you could take another look when you get a chance. Yuto Amazon Web Services EMEA SARL, 38 avenue John F. Kennedy, L-1855 Luxembourg, R.C.S. Luxembourg B186284 Amazon Web Services EMEA SARL, Irish Branch, One Burlington Plaza, Burlington Road, Dublin 4, Ireland, branch registration number 908705 ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH v3 3/4] xfs: avoid dereferencing log items after push callbacks [not found] <20260308182804.33127-6-ytohnuki@amazon.com> 2026-03-08 18:28 ` [PATCH v3 1/4] xfs: stop reclaim before pushing AIL during unmount Yuto Ohnuki 2026-03-08 18:28 ` [PATCH v3 2/4] xfs: refactor xfsaild_push loop into helper Yuto Ohnuki @ 2026-03-08 18:28 ` Yuto Ohnuki 2026-03-09 16:27 ` Darrick J. Wong 2026-03-10 5:27 ` Dave Chinner 2026-03-08 18:28 ` [PATCH v3 4/4] xfs: save ailp before dropping the AIL lock in " Yuto Ohnuki 3 siblings, 2 replies; 17+ messages in thread From: Yuto Ohnuki @ 2026-03-08 18:28 UTC (permalink / raw) To: Carlos Maiolino, Dave Chinner Cc: Darrick J . Wong, Brian Foster, linux-xfs, linux-kernel, Yuto Ohnuki, syzbot+652af2b3c5569c4ab63c, stable After xfsaild_push_item() calls iop_push(), the log item may have been freed if the AIL lock was dropped during the push. The tracepoints in the switch statement dereference the log item after iop_push() returns, which can result in a use-after-free. Fix this by capturing the log item type, flags, and LSN before calling xfsaild_push_item(), and introducing a new xfs_ail_push_class trace event class that takes these pre-captured values and the ailp pointer instead of the log item pointer. Reported-by: syzbot+652af2b3c5569c4ab63c@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=652af2b3c5569c4ab63c Fixes: 90c60e164012 ("xfs: xfs_iflush() is no longer necessary") Cc: <stable@vger.kernel.org> # v5.9 Signed-off-by: Yuto Ohnuki <ytohnuki@amazon.com> --- fs/xfs/xfs_trace.h | 36 ++++++++++++++++++++++++++++++++---- fs/xfs/xfs_trans_ail.c | 24 ++++++++++++++++-------- 2 files changed, 48 insertions(+), 12 deletions(-) diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index 813e5a9f57eb..0e994b3f768f 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -56,6 +56,7 @@ #include <linux/tracepoint.h> struct xfs_agf; +struct xfs_ail; struct xfs_alloc_arg; struct xfs_attr_list_context; struct xfs_buf_log_item; @@ -1650,16 +1651,43 @@ TRACE_EVENT(xfs_log_force, DEFINE_EVENT(xfs_log_item_class, name, \ TP_PROTO(struct xfs_log_item *lip), \ TP_ARGS(lip)) -DEFINE_LOG_ITEM_EVENT(xfs_ail_push); -DEFINE_LOG_ITEM_EVENT(xfs_ail_pinned); -DEFINE_LOG_ITEM_EVENT(xfs_ail_locked); -DEFINE_LOG_ITEM_EVENT(xfs_ail_flushing); DEFINE_LOG_ITEM_EVENT(xfs_cil_whiteout_mark); DEFINE_LOG_ITEM_EVENT(xfs_cil_whiteout_skip); DEFINE_LOG_ITEM_EVENT(xfs_cil_whiteout_unpin); DEFINE_LOG_ITEM_EVENT(xlog_ail_insert_abort); DEFINE_LOG_ITEM_EVENT(xfs_trans_free_abort); +DECLARE_EVENT_CLASS(xfs_ail_push_class, + TP_PROTO(struct xfs_ail *ailp, uint type, unsigned long flags, xfs_lsn_t lsn), + TP_ARGS(ailp, type, flags, lsn), + TP_STRUCT__entry( + __field(dev_t, dev) + __field(uint, type) + __field(unsigned long, flags) + __field(xfs_lsn_t, lsn) + ), + TP_fast_assign( + __entry->dev = ailp->ail_log->l_mp->m_super->s_dev; + __entry->type = type; + __entry->flags = flags; + __entry->lsn = lsn; + ), + TP_printk("dev %d:%d lsn %d/%d type %s flags %s", + MAJOR(__entry->dev), MINOR(__entry->dev), + CYCLE_LSN(__entry->lsn), BLOCK_LSN(__entry->lsn), + __print_symbolic(__entry->type, XFS_LI_TYPE_DESC), + __print_flags(__entry->flags, "|", XFS_LI_FLAGS)) +) + +#define DEFINE_AIL_PUSH_EVENT(name) \ +DEFINE_EVENT(xfs_ail_push_class, name, \ + TP_PROTO(struct xfs_ail *ailp, uint type, unsigned long flags, xfs_lsn_t lsn), \ + TP_ARGS(ailp, type, flags, lsn)) +DEFINE_AIL_PUSH_EVENT(xfs_ail_push); +DEFINE_AIL_PUSH_EVENT(xfs_ail_pinned); +DEFINE_AIL_PUSH_EVENT(xfs_ail_locked); +DEFINE_AIL_PUSH_EVENT(xfs_ail_flushing); + DECLARE_EVENT_CLASS(xfs_ail_class, TP_PROTO(struct xfs_log_item *lip, xfs_lsn_t old_lsn, xfs_lsn_t new_lsn), TP_ARGS(lip, old_lsn, new_lsn), diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c index ac747804e1d6..14ffb77b12ea 100644 --- a/fs/xfs/xfs_trans_ail.c +++ b/fs/xfs/xfs_trans_ail.c @@ -365,6 +365,12 @@ xfsaild_resubmit_item( return XFS_ITEM_SUCCESS; } +/* + * Push a single log item from the AIL. + * + * @lip may have been released and freed by the time this function returns, + * so callers must not dereference the log item afterwards. + */ static inline uint xfsaild_push_item( struct xfs_ail *ailp, @@ -462,11 +468,13 @@ static void xfsaild_process_logitem( struct xfs_ail *ailp, struct xfs_log_item *lip, - xfs_lsn_t lsn, int *stuck, int *flushing) { struct xfs_mount *mp = ailp->ail_log->l_mp; + uint type = lip->li_type; + unsigned long flags = lip->li_flags; + xfs_lsn_t item_lsn = lip->li_lsn; int lock_result; /* @@ -478,9 +486,9 @@ xfsaild_process_logitem( switch (lock_result) { case XFS_ITEM_SUCCESS: XFS_STATS_INC(mp, xs_push_ail_success); - trace_xfs_ail_push(lip); + trace_xfs_ail_push(ailp, type, flags, item_lsn); - ailp->ail_last_pushed_lsn = lsn; + ailp->ail_last_pushed_lsn = item_lsn; break; case XFS_ITEM_FLUSHING: @@ -496,22 +504,22 @@ xfsaild_process_logitem( * AIL is being flushed. */ XFS_STATS_INC(mp, xs_push_ail_flushing); - trace_xfs_ail_flushing(lip); + trace_xfs_ail_flushing(ailp, type, flags, item_lsn); (*flushing)++; - ailp->ail_last_pushed_lsn = lsn; + ailp->ail_last_pushed_lsn = item_lsn; break; case XFS_ITEM_PINNED: XFS_STATS_INC(mp, xs_push_ail_pinned); - trace_xfs_ail_pinned(lip); + trace_xfs_ail_pinned(ailp, type, flags, item_lsn); (*stuck)++; ailp->ail_log_flush++; break; case XFS_ITEM_LOCKED: XFS_STATS_INC(mp, xs_push_ail_locked); - trace_xfs_ail_locked(lip); + trace_xfs_ail_locked(ailp, type, flags, item_lsn); (*stuck)++; break; @@ -572,7 +580,7 @@ xfsaild_push( if (test_bit(XFS_LI_FLUSHING, &lip->li_flags)) goto next_item; - xfsaild_process_logitem(ailp, lip, lsn, &stuck, &flushing); + xfsaild_process_logitem(ailp, lip, &stuck, &flushing); count++; /* -- 2.50.1 Amazon Web Services EMEA SARL, 38 avenue John F. Kennedy, L-1855 Luxembourg, R.C.S. Luxembourg B186284 Amazon Web Services EMEA SARL, Irish Branch, One Burlington Plaza, Burlington Road, Dublin 4, Ireland, branch registration number 908705 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH v3 3/4] xfs: avoid dereferencing log items after push callbacks 2026-03-08 18:28 ` [PATCH v3 3/4] xfs: avoid dereferencing log items after push callbacks Yuto Ohnuki @ 2026-03-09 16:27 ` Darrick J. Wong 2026-03-10 5:25 ` Dave Chinner 2026-03-10 17:51 ` Yuto Ohnuki 2026-03-10 5:27 ` Dave Chinner 1 sibling, 2 replies; 17+ messages in thread From: Darrick J. Wong @ 2026-03-09 16:27 UTC (permalink / raw) To: Yuto Ohnuki Cc: Carlos Maiolino, Dave Chinner, Darrick J . Wong, Brian Foster, linux-xfs, linux-kernel, syzbot+652af2b3c5569c4ab63c, stable On Sun, Mar 08, 2026 at 06:28:08PM +0000, Yuto Ohnuki wrote: > After xfsaild_push_item() calls iop_push(), the log item may have been > freed if the AIL lock was dropped during the push. The tracepoints in > the switch statement dereference the log item after iop_push() returns, > which can result in a use-after-free. How difficult would it be to add a refcount to xfs_log_item so that any other code walking through the AIL's log item list can't accidentally suffer from this UAF? I keep seeing periodic log item UAF bugfixes on the list, which (to me anyway) suggests we ought to think about a struct(ural) fix to this problem. I /think/ the answer to that is "sort of nasty" because of things like xfs_dquot embedding its own log item. The other log item types might not be so bad because at least they're allocated separately. However, refcount_t accesses also aren't free. > Fix this by capturing the log item type, flags, and LSN before calling > xfsaild_push_item(), and introducing a new xfs_ail_push_class trace > event class that takes these pre-captured values and the ailp pointer > instead of the log item pointer. > > Reported-by: syzbot+652af2b3c5569c4ab63c@syzkaller.appspotmail.com > Closes: https://syzkaller.appspot.com/bug?extid=652af2b3c5569c4ab63c > Fixes: 90c60e164012 ("xfs: xfs_iflush() is no longer necessary") > Cc: <stable@vger.kernel.org> # v5.9 > Signed-off-by: Yuto Ohnuki <ytohnuki@amazon.com> > --- > fs/xfs/xfs_trace.h | 36 ++++++++++++++++++++++++++++++++---- > fs/xfs/xfs_trans_ail.c | 24 ++++++++++++++++-------- > 2 files changed, 48 insertions(+), 12 deletions(-) > > diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h > index 813e5a9f57eb..0e994b3f768f 100644 > --- a/fs/xfs/xfs_trace.h > +++ b/fs/xfs/xfs_trace.h > @@ -56,6 +56,7 @@ > #include <linux/tracepoint.h> > > struct xfs_agf; > +struct xfs_ail; > struct xfs_alloc_arg; > struct xfs_attr_list_context; > struct xfs_buf_log_item; > @@ -1650,16 +1651,43 @@ TRACE_EVENT(xfs_log_force, > DEFINE_EVENT(xfs_log_item_class, name, \ > TP_PROTO(struct xfs_log_item *lip), \ > TP_ARGS(lip)) > -DEFINE_LOG_ITEM_EVENT(xfs_ail_push); > -DEFINE_LOG_ITEM_EVENT(xfs_ail_pinned); > -DEFINE_LOG_ITEM_EVENT(xfs_ail_locked); > -DEFINE_LOG_ITEM_EVENT(xfs_ail_flushing); > DEFINE_LOG_ITEM_EVENT(xfs_cil_whiteout_mark); > DEFINE_LOG_ITEM_EVENT(xfs_cil_whiteout_skip); > DEFINE_LOG_ITEM_EVENT(xfs_cil_whiteout_unpin); > DEFINE_LOG_ITEM_EVENT(xlog_ail_insert_abort); > DEFINE_LOG_ITEM_EVENT(xfs_trans_free_abort); > > +DECLARE_EVENT_CLASS(xfs_ail_push_class, > + TP_PROTO(struct xfs_ail *ailp, uint type, unsigned long flags, xfs_lsn_t lsn), > + TP_ARGS(ailp, type, flags, lsn), > + TP_STRUCT__entry( > + __field(dev_t, dev) > + __field(uint, type) > + __field(unsigned long, flags) > + __field(xfs_lsn_t, lsn) > + ), > + TP_fast_assign( > + __entry->dev = ailp->ail_log->l_mp->m_super->s_dev; > + __entry->type = type; > + __entry->flags = flags; > + __entry->lsn = lsn; > + ), > + TP_printk("dev %d:%d lsn %d/%d type %s flags %s", > + MAJOR(__entry->dev), MINOR(__entry->dev), > + CYCLE_LSN(__entry->lsn), BLOCK_LSN(__entry->lsn), > + __print_symbolic(__entry->type, XFS_LI_TYPE_DESC), > + __print_flags(__entry->flags, "|", XFS_LI_FLAGS)) > +) > + > +#define DEFINE_AIL_PUSH_EVENT(name) \ > +DEFINE_EVENT(xfs_ail_push_class, name, \ > + TP_PROTO(struct xfs_ail *ailp, uint type, unsigned long flags, xfs_lsn_t lsn), \ > + TP_ARGS(ailp, type, flags, lsn)) > +DEFINE_AIL_PUSH_EVENT(xfs_ail_push); > +DEFINE_AIL_PUSH_EVENT(xfs_ail_pinned); > +DEFINE_AIL_PUSH_EVENT(xfs_ail_locked); > +DEFINE_AIL_PUSH_EVENT(xfs_ail_flushing); > + > DECLARE_EVENT_CLASS(xfs_ail_class, > TP_PROTO(struct xfs_log_item *lip, xfs_lsn_t old_lsn, xfs_lsn_t new_lsn), > TP_ARGS(lip, old_lsn, new_lsn), > diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c > index ac747804e1d6..14ffb77b12ea 100644 > --- a/fs/xfs/xfs_trans_ail.c > +++ b/fs/xfs/xfs_trans_ail.c > @@ -365,6 +365,12 @@ xfsaild_resubmit_item( > return XFS_ITEM_SUCCESS; > } > > +/* > + * Push a single log item from the AIL. > + * > + * @lip may have been released and freed by the time this function returns, > + * so callers must not dereference the log item afterwards. This is true after the xfsaild_push_item call, correct? If so then I think the comment for the call needs updating too: /* * Note that iop_push may unlock and reacquire the AIL lock. We * rely on the AIL cursor implementation to be able to deal with * the dropped lock. * * The log item may have been freed by the push, so it must not * be accessed or dereferenced below this line. */ lock_result = xfsaild_push_item(ailp, lip); Otherwise this looks ok to me. --D > + */ > static inline uint > xfsaild_push_item( > struct xfs_ail *ailp, > @@ -462,11 +468,13 @@ static void > xfsaild_process_logitem( > struct xfs_ail *ailp, > struct xfs_log_item *lip, > - xfs_lsn_t lsn, > int *stuck, > int *flushing) > { > struct xfs_mount *mp = ailp->ail_log->l_mp; > + uint type = lip->li_type; > + unsigned long flags = lip->li_flags; > + xfs_lsn_t item_lsn = lip->li_lsn; > int lock_result; > > /* > @@ -478,9 +486,9 @@ xfsaild_process_logitem( > switch (lock_result) { > case XFS_ITEM_SUCCESS: > XFS_STATS_INC(mp, xs_push_ail_success); > - trace_xfs_ail_push(lip); > + trace_xfs_ail_push(ailp, type, flags, item_lsn); > > - ailp->ail_last_pushed_lsn = lsn; > + ailp->ail_last_pushed_lsn = item_lsn; > break; > > case XFS_ITEM_FLUSHING: > @@ -496,22 +504,22 @@ xfsaild_process_logitem( > * AIL is being flushed. > */ > XFS_STATS_INC(mp, xs_push_ail_flushing); > - trace_xfs_ail_flushing(lip); > + trace_xfs_ail_flushing(ailp, type, flags, item_lsn); > > (*flushing)++; > - ailp->ail_last_pushed_lsn = lsn; > + ailp->ail_last_pushed_lsn = item_lsn; > break; > > case XFS_ITEM_PINNED: > XFS_STATS_INC(mp, xs_push_ail_pinned); > - trace_xfs_ail_pinned(lip); > + trace_xfs_ail_pinned(ailp, type, flags, item_lsn); > > (*stuck)++; > ailp->ail_log_flush++; > break; > case XFS_ITEM_LOCKED: > XFS_STATS_INC(mp, xs_push_ail_locked); > - trace_xfs_ail_locked(lip); > + trace_xfs_ail_locked(ailp, type, flags, item_lsn); > > (*stuck)++; > break; > @@ -572,7 +580,7 @@ xfsaild_push( > if (test_bit(XFS_LI_FLUSHING, &lip->li_flags)) > goto next_item; > > - xfsaild_process_logitem(ailp, lip, lsn, &stuck, &flushing); > + xfsaild_process_logitem(ailp, lip, &stuck, &flushing); > count++; > > /* > -- > 2.50.1 > > > > > Amazon Web Services EMEA SARL, 38 avenue John F. Kennedy, L-1855 Luxembourg, R.C.S. Luxembourg B186284 > > Amazon Web Services EMEA SARL, Irish Branch, One Burlington Plaza, Burlington Road, Dublin 4, Ireland, branch registration number 908705 > > > > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v3 3/4] xfs: avoid dereferencing log items after push callbacks 2026-03-09 16:27 ` Darrick J. Wong @ 2026-03-10 5:25 ` Dave Chinner 2026-03-10 17:51 ` Yuto Ohnuki 1 sibling, 0 replies; 17+ messages in thread From: Dave Chinner @ 2026-03-10 5:25 UTC (permalink / raw) To: Darrick J. Wong Cc: Yuto Ohnuki, Carlos Maiolino, Dave Chinner, Darrick J . Wong, Brian Foster, linux-xfs, linux-kernel, syzbot+652af2b3c5569c4ab63c, stable On Mon, Mar 09, 2026 at 09:27:10AM -0700, Darrick J. Wong wrote: > On Sun, Mar 08, 2026 at 06:28:08PM +0000, Yuto Ohnuki wrote: > > After xfsaild_push_item() calls iop_push(), the log item may have been > > freed if the AIL lock was dropped during the push. The tracepoints in > > the switch statement dereference the log item after iop_push() returns, > > which can result in a use-after-free. > > How difficult would it be to add a refcount to xfs_log_item so that any > other code walking through the AIL's log item list can't accidentally > suffer from this UAF? I keep seeing periodic log item UAF bugfixes on > the list, which (to me anyway) suggests we ought to think about a > struct(ural) fix to this problem. > > I /think/ the answer to that is "sort of nasty" because of things like > xfs_dquot embedding its own log item. The other log item types might > not be so bad because at least they're allocated separately. However, > refcount_t accesses also aren't free. It's nasty for many reasons. The biggest one is that the log item isn't valid as a stand-alone object. Once the owner item has been freed, any attempt to use the log item requires dereferencing back to the owner object, and now we have a different set of UAF problems. For example, we can't leave log items in the AIL after freeing the owner object because we have to write the owner object to disk to remove the log item from the AIL. The log item has to be removed from the AIL before we free the high level item the log item belongs to. Hence the life time of a log item must always be a subset of the owner object. That is where log item reference counting becomes an issue - for it to work the log item has to hold a reference to the owner object. We already have log items that do this: the BLI is one example. However, other UAF issues on log items come from using reference counts and the needing references and (potentially) locks on the owner object. Those complexities end up causing - you guessed it - UAF problems... For example: the BLI keeps a reference count for all accesses to the BLI *except* for the AIL reference, because the AIL can't keep active references to dirty high level objects. For example: releasing the last reference to some high level objects (e.g. inodes) can result in them being journalled, and hence the journalling subsystem now has to be able to track and process those dirty high level items without holding an active reference to them. For example: The BLI reference count/buffer locking model is all the complexity in freeing metadata extents (stale buffers) comes from. At transaction completion, the transaction reference to the BLI and the buffer lock is transferred to the CIL (the journal) and is only then released on completion of the journal IO. This is how we prevent a buffer from being reused whilst the transaction freeing underlying storage is in flight - the buffer needs to remain locked until the freeing transaction(s) is stable in the journal. This complexity is where all the UAF in the BLI unpinning operations come from. Normally, the transaction reference and buffer lock are released when the transaction context is torn down after the commit completes. The problems with UAFs in this BLI code comes from the fact that stale, pinned buffers have been transferred to the CIL and the transaction no longer owns the BLI reference... And then, of course, is the fact that the AIL cannot rely on log items with referenced owner objects. Hence the high level items tracked in the AIL are, at times, tracking otherwise unreferenced items. IOWs, we have problems with UAF w.r.t. buffers and BLIs because of the mess of the BLI reference counting model. And we have problems with ILI/inode life times because the ILI does not take references to the inode and it is assumed it is never freed until the inode itself is torn down. And neither buffers, inodes, BLIs nor ILIs are reference counted when they are on the AIL. The impact of this is two-fold: 1. it requires high level object reclaim to be aware of dirty items and to be able to skip over them; and 2. unmount requires explicitly AIL pushing because the AIL might be the only remaining subsystem that tracks the unreferenced object that we need to reclaim before unmount can progress. This is especially true for shutdown filesystems. Ideally xfs_reclaim_inode() would not be trying to abort dirty inodes on shutdown. Historically speaking, this functionality has been necessary because there were times without other mechanisms to abort and clean dirty, unreferenced inodes and this would result in unount on shutdown filesystems hanging. I suspect those times are long since passed - all dirty inodes are tracked in the journal and unmount pushes all dirty objects - so maybe the lesson here is that we could be carrying historic code that worked around shutdown bugs that ino longer occur and so we no longer need... So, yeah, I agree that it would be be great to untangle all this mess, but my experience qwith trying to untangle it over the years is that the a can of worms it opens gets all tangled up in the ball of string I'm trying to untangle.... -Dave. -- Dave Chinner dgc@kernel.org ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v3 3/4] xfs: avoid dereferencing log items after push callbacks 2026-03-09 16:27 ` Darrick J. Wong 2026-03-10 5:25 ` Dave Chinner @ 2026-03-10 17:51 ` Yuto Ohnuki 1 sibling, 0 replies; 17+ messages in thread From: Yuto Ohnuki @ 2026-03-10 17:51 UTC (permalink / raw) To: djwong Cc: bfoster, cem, darrick.wong, dchinner, linux-kernel, linux-xfs, stable, syzbot+652af2b3c5569c4ab63c, ytohnuki > How difficult would it be to add a refcount to xfs_log_item so that any > other code walking through the AIL's log item list can't accidentally > suffer from this UAF? I keep seeing periodic log item UAF bugfixes on > the list, which (to me anyway) suggests we ought to think about a > struct(ural) fix to this problem. > > I /think/ the answer to that is "sort of nasty" because of things like > xfs_dquot embedding its own log item. The other log item types might > not be so bad because at least they're allocated separately. However, > refcount_t accesses also aren't free. Agreed that a structural fix would be the right long-term approach. As you noted, the dquot embedding makes it non-trivial. I'd like to keep this series focused on the immediate syzbot fix and explore a refcount-based approach as a separate effort. > This is true after the xfsaild_push_item call, correct? If so then I > think the comment for the call needs updating too: > > /* > * Note that iop_push may unlock and reacquire the AIL lock. We > * rely on the AIL cursor implementation to be able to deal with > * the dropped lock. > * > * The log item may have been freed by the push, so it must not > * be accessed or dereferenced below this line. > */ > lock_result = xfsaild_push_item(ailp, lip); > > Otherwise this looks ok to me. > > --D Thank you. In v4, I have added the comments you suggested. Yuto Amazon Web Services EMEA SARL, 38 avenue John F. Kennedy, L-1855 Luxembourg, R.C.S. Luxembourg B186284 Amazon Web Services EMEA SARL, Irish Branch, One Burlington Plaza, Burlington Road, Dublin 4, Ireland, branch registration number 908705 ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v3 3/4] xfs: avoid dereferencing log items after push callbacks 2026-03-08 18:28 ` [PATCH v3 3/4] xfs: avoid dereferencing log items after push callbacks Yuto Ohnuki 2026-03-09 16:27 ` Darrick J. Wong @ 2026-03-10 5:27 ` Dave Chinner 2026-03-10 17:56 ` Yuto Ohnuki 1 sibling, 1 reply; 17+ messages in thread From: Dave Chinner @ 2026-03-10 5:27 UTC (permalink / raw) To: Yuto Ohnuki Cc: Carlos Maiolino, Dave Chinner, Darrick J . Wong, Brian Foster, linux-xfs, linux-kernel, syzbot+652af2b3c5569c4ab63c, stable On Sun, Mar 08, 2026 at 06:28:08PM +0000, Yuto Ohnuki wrote: > After xfsaild_push_item() calls iop_push(), the log item may have been > freed if the AIL lock was dropped during the push. The tracepoints in > the switch statement dereference the log item after iop_push() returns, > which can result in a use-after-free. > > Fix this by capturing the log item type, flags, and LSN before calling > xfsaild_push_item(), and introducing a new xfs_ail_push_class trace > event class that takes these pre-captured values and the ailp pointer > instead of the log item pointer. > > Reported-by: syzbot+652af2b3c5569c4ab63c@syzkaller.appspotmail.com > Closes: https://syzkaller.appspot.com/bug?extid=652af2b3c5569c4ab63c > Fixes: 90c60e164012 ("xfs: xfs_iflush() is no longer necessary") > Cc: <stable@vger.kernel.org> # v5.9 > Signed-off-by: Yuto Ohnuki <ytohnuki@amazon.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> -- Dave Chinner dgc@kernel.org ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v3 3/4] xfs: avoid dereferencing log items after push callbacks 2026-03-10 5:27 ` Dave Chinner @ 2026-03-10 17:56 ` Yuto Ohnuki 0 siblings, 0 replies; 17+ messages in thread From: Yuto Ohnuki @ 2026-03-10 17:56 UTC (permalink / raw) To: dgc Cc: bfoster, cem, darrick.wong, dchinner, linux-kernel, linux-xfs, stable, syzbot+652af2b3c5569c4ab63c, ytohnuki > > After xfsaild_push_item() calls iop_push(), the log item may have been > > freed if the AIL lock was dropped during the push. The tracepoints in > > the switch statement dereference the log item after iop_push() returns, > > which can result in a use-after-free. > > > > Fix this by capturing the log item type, flags, and LSN before calling > > xfsaild_push_item(), and introducing a new xfs_ail_push_class trace > > event class that takes these pre-captured values and the ailp pointer > > instead of the log item pointer. > > > > Reported-by: syzbot+652af2b3c5569c4ab63c@syzkaller.appspotmail.com > > Closes: https://syzkaller.appspot.com/bug?extid=652af2b3c5569c4ab63c > > Fixes: 90c60e164012 ("xfs: xfs_iflush() is no longer necessary") > > Cc: <stable@vger.kernel.org> # v5.9 > > Signed-off-by: Yuto Ohnuki <ytohnuki@amazon.com> > > Reviewed-by: Dave Chinner <dchinner@redhat.com> > -- > Dave Chinner > dgc@kernel.org Thanks for the review, Dave. In v4, I reworked the patch ordering so that the bugfix patches come before the refactoring. Since the context has changed, I've dropped your Reviewed-by from this patch in v4 just to be safe. I would appreciate another look when you get a chance. Yuto Amazon Web Services EMEA SARL, 38 avenue John F. Kennedy, L-1855 Luxembourg, R.C.S. Luxembourg B186284 Amazon Web Services EMEA SARL, Irish Branch, One Burlington Plaza, Burlington Road, Dublin 4, Ireland, branch registration number 908705 ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH v3 4/4] xfs: save ailp before dropping the AIL lock in push callbacks [not found] <20260308182804.33127-6-ytohnuki@amazon.com> ` (2 preceding siblings ...) 2026-03-08 18:28 ` [PATCH v3 3/4] xfs: avoid dereferencing log items after push callbacks Yuto Ohnuki @ 2026-03-08 18:28 ` Yuto Ohnuki 2026-03-09 16:28 ` Darrick J. Wong 2026-03-10 5:27 ` Dave Chinner 3 siblings, 2 replies; 17+ messages in thread From: Yuto Ohnuki @ 2026-03-08 18:28 UTC (permalink / raw) To: Carlos Maiolino, Dave Chinner Cc: Darrick J . Wong, Brian Foster, linux-xfs, linux-kernel, Yuto Ohnuki, syzbot+652af2b3c5569c4ab63c, stable In xfs_inode_item_push() and xfs_qm_dquot_logitem_push(), the AIL lock is dropped to perform buffer IO. Once the cluster buffer no longer protects the log item from reclaim, the log item may be freed by background reclaim or the dquot shrinker. The subsequent spin_lock() call dereferences lip->li_ailp, which is a use-after-free. Fix this by saving the ailp pointer in a local variable while the AIL lock is held and the log item is guaranteed to be valid. Reported-by: syzbot+652af2b3c5569c4ab63c@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=652af2b3c5569c4ab63c Fixes: 90c60e164012 ("xfs: xfs_iflush() is no longer necessary") Cc: <stable@vger.kernel.org> # v5.9 Signed-off-by: Yuto Ohnuki <ytohnuki@amazon.com> --- fs/xfs/xfs_dquot_item.c | 9 +++++++-- fs/xfs/xfs_inode_item.c | 9 +++++++-- 2 files changed, 14 insertions(+), 4 deletions(-) diff --git a/fs/xfs/xfs_dquot_item.c b/fs/xfs/xfs_dquot_item.c index 491e2a7053a3..65a0e69c3d08 100644 --- a/fs/xfs/xfs_dquot_item.c +++ b/fs/xfs/xfs_dquot_item.c @@ -125,6 +125,7 @@ xfs_qm_dquot_logitem_push( struct xfs_dq_logitem *qlip = DQUOT_ITEM(lip); struct xfs_dquot *dqp = qlip->qli_dquot; struct xfs_buf *bp; + struct xfs_ail *ailp = lip->li_ailp; uint rval = XFS_ITEM_SUCCESS; int error; @@ -153,7 +154,7 @@ xfs_qm_dquot_logitem_push( goto out_unlock; } - spin_unlock(&lip->li_ailp->ail_lock); + spin_unlock(&ailp->ail_lock); error = xfs_dquot_use_attached_buf(dqp, &bp); if (error == -EAGAIN) { @@ -172,9 +173,13 @@ xfs_qm_dquot_logitem_push( rval = XFS_ITEM_FLUSHING; } xfs_buf_relse(bp); + /* + * The buffer no longer protects the log item from reclaim, so + * do not reference lip after this point. + */ out_relock_ail: - spin_lock(&lip->li_ailp->ail_lock); + spin_lock(&ailp->ail_lock); out_unlock: mutex_unlock(&dqp->q_qlock); return rval; diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c index 8913036b8024..4ae81eed0442 100644 --- a/fs/xfs/xfs_inode_item.c +++ b/fs/xfs/xfs_inode_item.c @@ -746,6 +746,7 @@ xfs_inode_item_push( struct xfs_inode_log_item *iip = INODE_ITEM(lip); struct xfs_inode *ip = iip->ili_inode; struct xfs_buf *bp = lip->li_buf; + struct xfs_ail *ailp = lip->li_ailp; uint rval = XFS_ITEM_SUCCESS; int error; @@ -771,7 +772,7 @@ xfs_inode_item_push( if (!xfs_buf_trylock(bp)) return XFS_ITEM_LOCKED; - spin_unlock(&lip->li_ailp->ail_lock); + spin_unlock(&ailp->ail_lock); /* * We need to hold a reference for flushing the cluster buffer as it may @@ -795,7 +796,11 @@ xfs_inode_item_push( rval = XFS_ITEM_LOCKED; } - spin_lock(&lip->li_ailp->ail_lock); + /* + * The buffer no longer protects the log item from reclaim, so + * do not reference lip after this point. + */ + spin_lock(&ailp->ail_lock); return rval; } -- 2.50.1 Amazon Web Services EMEA SARL, 38 avenue John F. Kennedy, L-1855 Luxembourg, R.C.S. Luxembourg B186284 Amazon Web Services EMEA SARL, Irish Branch, One Burlington Plaza, Burlington Road, Dublin 4, Ireland, branch registration number 908705 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH v3 4/4] xfs: save ailp before dropping the AIL lock in push callbacks 2026-03-08 18:28 ` [PATCH v3 4/4] xfs: save ailp before dropping the AIL lock in " Yuto Ohnuki @ 2026-03-09 16:28 ` Darrick J. Wong 2026-03-10 5:27 ` Dave Chinner 1 sibling, 0 replies; 17+ messages in thread From: Darrick J. Wong @ 2026-03-09 16:28 UTC (permalink / raw) To: Yuto Ohnuki Cc: Carlos Maiolino, Dave Chinner, Darrick J . Wong, Brian Foster, linux-xfs, linux-kernel, syzbot+652af2b3c5569c4ab63c, stable On Sun, Mar 08, 2026 at 06:28:09PM +0000, Yuto Ohnuki wrote: > In xfs_inode_item_push() and xfs_qm_dquot_logitem_push(), the AIL lock > is dropped to perform buffer IO. Once the cluster buffer no longer > protects the log item from reclaim, the log item may be freed by > background reclaim or the dquot shrinker. The subsequent spin_lock() > call dereferences lip->li_ailp, which is a use-after-free. > > Fix this by saving the ailp pointer in a local variable while the AIL > lock is held and the log item is guaranteed to be valid. > > Reported-by: syzbot+652af2b3c5569c4ab63c@syzkaller.appspotmail.com > Closes: https://syzkaller.appspot.com/bug?extid=652af2b3c5569c4ab63c > Fixes: 90c60e164012 ("xfs: xfs_iflush() is no longer necessary") > Cc: <stable@vger.kernel.org> # v5.9 > Signed-off-by: Yuto Ohnuki <ytohnuki@amazon.com> Looks fine to me, Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> --D > --- > fs/xfs/xfs_dquot_item.c | 9 +++++++-- > fs/xfs/xfs_inode_item.c | 9 +++++++-- > 2 files changed, 14 insertions(+), 4 deletions(-) > > diff --git a/fs/xfs/xfs_dquot_item.c b/fs/xfs/xfs_dquot_item.c > index 491e2a7053a3..65a0e69c3d08 100644 > --- a/fs/xfs/xfs_dquot_item.c > +++ b/fs/xfs/xfs_dquot_item.c > @@ -125,6 +125,7 @@ xfs_qm_dquot_logitem_push( > struct xfs_dq_logitem *qlip = DQUOT_ITEM(lip); > struct xfs_dquot *dqp = qlip->qli_dquot; > struct xfs_buf *bp; > + struct xfs_ail *ailp = lip->li_ailp; > uint rval = XFS_ITEM_SUCCESS; > int error; > > @@ -153,7 +154,7 @@ xfs_qm_dquot_logitem_push( > goto out_unlock; > } > > - spin_unlock(&lip->li_ailp->ail_lock); > + spin_unlock(&ailp->ail_lock); > > error = xfs_dquot_use_attached_buf(dqp, &bp); > if (error == -EAGAIN) { > @@ -172,9 +173,13 @@ xfs_qm_dquot_logitem_push( > rval = XFS_ITEM_FLUSHING; > } > xfs_buf_relse(bp); > + /* > + * The buffer no longer protects the log item from reclaim, so > + * do not reference lip after this point. > + */ > > out_relock_ail: > - spin_lock(&lip->li_ailp->ail_lock); > + spin_lock(&ailp->ail_lock); > out_unlock: > mutex_unlock(&dqp->q_qlock); > return rval; > diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c > index 8913036b8024..4ae81eed0442 100644 > --- a/fs/xfs/xfs_inode_item.c > +++ b/fs/xfs/xfs_inode_item.c > @@ -746,6 +746,7 @@ xfs_inode_item_push( > struct xfs_inode_log_item *iip = INODE_ITEM(lip); > struct xfs_inode *ip = iip->ili_inode; > struct xfs_buf *bp = lip->li_buf; > + struct xfs_ail *ailp = lip->li_ailp; > uint rval = XFS_ITEM_SUCCESS; > int error; > > @@ -771,7 +772,7 @@ xfs_inode_item_push( > if (!xfs_buf_trylock(bp)) > return XFS_ITEM_LOCKED; > > - spin_unlock(&lip->li_ailp->ail_lock); > + spin_unlock(&ailp->ail_lock); > > /* > * We need to hold a reference for flushing the cluster buffer as it may > @@ -795,7 +796,11 @@ xfs_inode_item_push( > rval = XFS_ITEM_LOCKED; > } > > - spin_lock(&lip->li_ailp->ail_lock); > + /* > + * The buffer no longer protects the log item from reclaim, so > + * do not reference lip after this point. > + */ > + spin_lock(&ailp->ail_lock); > return rval; > } > > -- > 2.50.1 > > > > > Amazon Web Services EMEA SARL, 38 avenue John F. Kennedy, L-1855 Luxembourg, R.C.S. Luxembourg B186284 > > Amazon Web Services EMEA SARL, Irish Branch, One Burlington Plaza, Burlington Road, Dublin 4, Ireland, branch registration number 908705 > > > > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v3 4/4] xfs: save ailp before dropping the AIL lock in push callbacks 2026-03-08 18:28 ` [PATCH v3 4/4] xfs: save ailp before dropping the AIL lock in " Yuto Ohnuki 2026-03-09 16:28 ` Darrick J. Wong @ 2026-03-10 5:27 ` Dave Chinner 1 sibling, 0 replies; 17+ messages in thread From: Dave Chinner @ 2026-03-10 5:27 UTC (permalink / raw) To: Yuto Ohnuki Cc: Carlos Maiolino, Dave Chinner, Darrick J . Wong, Brian Foster, linux-xfs, linux-kernel, syzbot+652af2b3c5569c4ab63c, stable On Sun, Mar 08, 2026 at 06:28:09PM +0000, Yuto Ohnuki wrote: > In xfs_inode_item_push() and xfs_qm_dquot_logitem_push(), the AIL lock > is dropped to perform buffer IO. Once the cluster buffer no longer > protects the log item from reclaim, the log item may be freed by > background reclaim or the dquot shrinker. The subsequent spin_lock() > call dereferences lip->li_ailp, which is a use-after-free. > > Fix this by saving the ailp pointer in a local variable while the AIL > lock is held and the log item is guaranteed to be valid. > > Reported-by: syzbot+652af2b3c5569c4ab63c@syzkaller.appspotmail.com > Closes: https://syzkaller.appspot.com/bug?extid=652af2b3c5569c4ab63c > Fixes: 90c60e164012 ("xfs: xfs_iflush() is no longer necessary") > Cc: <stable@vger.kernel.org> # v5.9 > Signed-off-by: Yuto Ohnuki <ytohnuki@amazon.com> looks good to me. Reviewed-by: Dave Chinner <dchinner@redhat.com> -- Dave Chinner dgc@kernel.org ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2026-03-10 17:57 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20260308182804.33127-6-ytohnuki@amazon.com>
2026-03-08 18:28 ` [PATCH v3 1/4] xfs: stop reclaim before pushing AIL during unmount Yuto Ohnuki
2026-03-09 16:02 ` Darrick J. Wong
2026-03-10 17:33 ` Yuto Ohnuki
2026-03-08 18:28 ` [PATCH v3 2/4] xfs: refactor xfsaild_push loop into helper Yuto Ohnuki
2026-03-09 16:14 ` Darrick J. Wong
2026-03-10 17:38 ` Yuto Ohnuki
2026-03-10 5:26 ` Dave Chinner
2026-03-10 17:46 ` Yuto Ohnuki
2026-03-08 18:28 ` [PATCH v3 3/4] xfs: avoid dereferencing log items after push callbacks Yuto Ohnuki
2026-03-09 16:27 ` Darrick J. Wong
2026-03-10 5:25 ` Dave Chinner
2026-03-10 17:51 ` Yuto Ohnuki
2026-03-10 5:27 ` Dave Chinner
2026-03-10 17:56 ` Yuto Ohnuki
2026-03-08 18:28 ` [PATCH v3 4/4] xfs: save ailp before dropping the AIL lock in " Yuto Ohnuki
2026-03-09 16:28 ` Darrick J. Wong
2026-03-10 5:27 ` Dave Chinner
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox