* [PATCH v2 1/3] xfs: add lock protection when remove perag from radix tree @ 2023-12-09 12:21 Long Li 2023-12-09 12:21 ` [PATCH v2 2/3] xfs: don't assert perag when free perag Long Li 2023-12-09 12:21 ` [PATCH v2 3/3] xfs: fix perag leak when growfs fails Long Li 0 siblings, 2 replies; 8+ messages in thread From: Long Li @ 2023-12-09 12:21 UTC (permalink / raw) To: djwong, chandanbabu; +Cc: linux-xfs, yi.zhang, houtao1, leo.lilong, yangerkun Take mp->m_perag_lock for deletions from the perag radix tree in xfs_initialize_perag to prevent racing with tagging operations. Lookups are fine - they are RCU protected so already deal with the tree changing shape underneath the lookup - but tagging operations require the tree to be stable while the tags are propagated back up to the root. Right now there's nothing stopping radix tree tagging from operating while a growfs operation is progress and adding/removing new entries into the radix tree. Hence we can have traversals that require a stable tree occurring at the same time we are removing unused entries from the radix tree which causes the shape of the tree to change. Likely this hasn't caused a problem in the past because we are only doing append addition and removal so the active AG part of the tree is not changing shape, but that doesn't mean it is safe. Just making the radix tree modifications serialise against each other is obviously correct. Signed-off-by: Long Li <leo.lilong@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> --- v2: - Refer to Dave's explanation and rewrite the commit message. fs/xfs/libxfs/xfs_ag.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/fs/xfs/libxfs/xfs_ag.c b/fs/xfs/libxfs/xfs_ag.c index f9f4d694640d..cc10a3ca052f 100644 --- a/fs/xfs/libxfs/xfs_ag.c +++ b/fs/xfs/libxfs/xfs_ag.c @@ -424,13 +424,17 @@ xfs_initialize_perag( out_remove_pag: xfs_defer_drain_free(&pag->pag_intents_drain); + spin_lock(&mp->m_perag_lock); radix_tree_delete(&mp->m_perag_tree, index); + spin_unlock(&mp->m_perag_lock); out_free_pag: kmem_free(pag); out_unwind_new_pags: /* unwind any prior newly initialized pags */ for (index = first_initialised; index < agcount; index++) { + spin_lock(&mp->m_perag_lock); pag = radix_tree_delete(&mp->m_perag_tree, index); + spin_unlock(&mp->m_perag_lock); if (!pag) break; xfs_buf_hash_destroy(pag); -- 2.31.1 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH v2 2/3] xfs: don't assert perag when free perag 2023-12-09 12:21 [PATCH v2 1/3] xfs: add lock protection when remove perag from radix tree Long Li @ 2023-12-09 12:21 ` Long Li 2023-12-11 21:42 ` Darrick J. Wong 2023-12-11 22:00 ` Dave Chinner 2023-12-09 12:21 ` [PATCH v2 3/3] xfs: fix perag leak when growfs fails Long Li 1 sibling, 2 replies; 8+ messages in thread From: Long Li @ 2023-12-09 12:21 UTC (permalink / raw) To: djwong, chandanbabu; +Cc: linux-xfs, yi.zhang, houtao1, leo.lilong, yangerkun When releasing the perag in xfs_free_perag(), the assertion that the perag in readix tree is correct in most cases. However, there is one corner case where the assertion is not true. During log recovery, the AGs become visible(that is included in mp->m_sb.sb_agcount) first, and then the perag is initialized. If the initialization of the perag fails, the assertion will be triggered. Worse yet, null pointer dereferencing can occur. Signed-off-by: Long Li <leo.lilong@huawei.com> --- fs/xfs/libxfs/xfs_ag.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/xfs/libxfs/xfs_ag.c b/fs/xfs/libxfs/xfs_ag.c index cc10a3ca052f..11ed048c350c 100644 --- a/fs/xfs/libxfs/xfs_ag.c +++ b/fs/xfs/libxfs/xfs_ag.c @@ -258,7 +258,8 @@ xfs_free_perag( spin_lock(&mp->m_perag_lock); pag = radix_tree_delete(&mp->m_perag_tree, agno); spin_unlock(&mp->m_perag_lock); - ASSERT(pag); + if (!pag) + break; XFS_IS_CORRUPT(pag->pag_mount, atomic_read(&pag->pag_ref) != 0); xfs_defer_drain_free(&pag->pag_intents_drain); -- 2.31.1 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v2 2/3] xfs: don't assert perag when free perag 2023-12-09 12:21 ` [PATCH v2 2/3] xfs: don't assert perag when free perag Long Li @ 2023-12-11 21:42 ` Darrick J. Wong 2023-12-11 22:00 ` Dave Chinner 1 sibling, 0 replies; 8+ messages in thread From: Darrick J. Wong @ 2023-12-11 21:42 UTC (permalink / raw) To: Long Li; +Cc: chandanbabu, linux-xfs, yi.zhang, houtao1, yangerkun On Sat, Dec 09, 2023 at 08:21:06PM +0800, Long Li wrote: > When releasing the perag in xfs_free_perag(), the assertion that the > perag in readix tree is correct in most cases. However, there is one > corner case where the assertion is not true. During log recovery, the > AGs become visible(that is included in mp->m_sb.sb_agcount) first, and > then the perag is initialized. If the initialization of the perag fails, > the assertion will be triggered. Worse yet, null pointer dereferencing > can occur. > > Signed-off-by: Long Li <leo.lilong@huawei.com> > --- > fs/xfs/libxfs/xfs_ag.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/fs/xfs/libxfs/xfs_ag.c b/fs/xfs/libxfs/xfs_ag.c > index cc10a3ca052f..11ed048c350c 100644 > --- a/fs/xfs/libxfs/xfs_ag.c > +++ b/fs/xfs/libxfs/xfs_ag.c > @@ -258,7 +258,8 @@ xfs_free_perag( > spin_lock(&mp->m_perag_lock); > pag = radix_tree_delete(&mp->m_perag_tree, agno); > spin_unlock(&mp->m_perag_lock); > - ASSERT(pag); > + if (!pag) > + break; Why wouldn't you continue to the next agnumber? --D > XFS_IS_CORRUPT(pag->pag_mount, atomic_read(&pag->pag_ref) != 0); > xfs_defer_drain_free(&pag->pag_intents_drain); > > -- > 2.31.1 > > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 2/3] xfs: don't assert perag when free perag 2023-12-09 12:21 ` [PATCH v2 2/3] xfs: don't assert perag when free perag Long Li 2023-12-11 21:42 ` Darrick J. Wong @ 2023-12-11 22:00 ` Dave Chinner 2023-12-12 13:28 ` Long Li 1 sibling, 1 reply; 8+ messages in thread From: Dave Chinner @ 2023-12-11 22:00 UTC (permalink / raw) To: Long Li; +Cc: djwong, chandanbabu, linux-xfs, yi.zhang, houtao1, yangerkun On Sat, Dec 09, 2023 at 08:21:06PM +0800, Long Li wrote: > When releasing the perag in xfs_free_perag(), the assertion that the > perag in readix tree is correct in most cases. However, there is one > corner case where the assertion is not true. During log recovery, the > AGs become visible(that is included in mp->m_sb.sb_agcount) first, and > then the perag is initialized. If the initialization of the perag fails, > the assertion will be triggered. Worse yet, null pointer dereferencing > can occur. I'm going to assume that you are talking about xlog_do_recover() because the commit message doesn't actually tell us how this situation occurs. That code re-reads the superblock, then copies it to mp->m_sb, then calls xfs_initialize_perag() with the values from mp->m_sb. If log recovery replayed a growfs transaction, the mp->m_sb has a larger sb_agcount and so then xfs_initialize_perag() is called and if that fails we end up back in xfs_mountfs and the error stack calls xfs_free_perag(). Is that correct? If so, then the fix is to change how xlog_do_recover() works. It needs to initialise the new perags before it updates the in-memory superblock. If xfs_initialize_perag() fails, it undoes all the changes it has made, so if we haven't updated the in-memory superblock when the init of the new perags fails then the error unwinding code works exactly as it should right now. i.e. the bug is that xlog_do_recover() is leaving the in-memory state inconsistent on init failure, and we need to fix that rather than remove the assert that is telling us that in-memory state is inconsistent.... -Dave. -- Dave Chinner david@fromorbit.com ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 2/3] xfs: don't assert perag when free perag 2023-12-11 22:00 ` Dave Chinner @ 2023-12-12 13:28 ` Long Li 0 siblings, 0 replies; 8+ messages in thread From: Long Li @ 2023-12-12 13:28 UTC (permalink / raw) To: Dave Chinner; +Cc: djwong, chandanbabu, linux-xfs, yi.zhang, houtao1, yangerkun On Tue, Dec 12, 2023 at 09:00:50AM +1100, Dave Chinner wrote: > On Sat, Dec 09, 2023 at 08:21:06PM +0800, Long Li wrote: > > When releasing the perag in xfs_free_perag(), the assertion that the > > perag in readix tree is correct in most cases. However, there is one > > corner case where the assertion is not true. During log recovery, the > > AGs become visible(that is included in mp->m_sb.sb_agcount) first, and > > then the perag is initialized. If the initialization of the perag fails, > > the assertion will be triggered. Worse yet, null pointer dereferencing > > can occur. > > I'm going to assume that you are talking about xlog_do_recover() > because the commit message doesn't actually tell us how this > situation occurs. > > That code re-reads the superblock, then copies it to mp->m_sb, > then calls xfs_initialize_perag() with the values from mp->m_sb. > > If log recovery replayed a growfs transaction, the mp->m_sb has a > larger sb_agcount and so then xfs_initialize_perag() is called > and if that fails we end up back in xfs_mountfs and the error > stack calls xfs_free_perag(). > > Is that correct? Yes, you are right. When I tried to fix the perag leak issue in patch 3, I found this problem. > > If so, then the fix is to change how xlog_do_recover() works. It > needs to initialise the new perags before it updates the in-memory > superblock. If xfs_initialize_perag() fails, it undoes all the > changes it has made, so if we haven't updated the in-memory > superblock when the init of the new perags fails then the error > unwinding code works exactly as it should right now. > > i.e. the bug is that xlog_do_recover() is leaving the in-memory > state inconsistent on init failure, and we need to fix that rather > than remove the assert that is telling us that in-memory state is > inconsistent.... > Yes, agree with you, I used to think that removing the assertion would solve the problem, but now it seems a bit lazy, the problem should be solved at the source. Right now, I haven't figured out how to fix this problem comprehensively, so I'll fix perag leak issue first. Thanks, Long Li ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH v2 3/3] xfs: fix perag leak when growfs fails 2023-12-09 12:21 [PATCH v2 1/3] xfs: add lock protection when remove perag from radix tree Long Li 2023-12-09 12:21 ` [PATCH v2 2/3] xfs: don't assert perag when free perag Long Li @ 2023-12-09 12:21 ` Long Li 2023-12-11 22:03 ` Darrick J. Wong 1 sibling, 1 reply; 8+ messages in thread From: Long Li @ 2023-12-09 12:21 UTC (permalink / raw) To: djwong, chandanbabu; +Cc: linux-xfs, yi.zhang, houtao1, leo.lilong, yangerkun During growfs, if new ag in memory has been initialized, however sb_agcount has not been updated, if an error occurs at this time it will cause perag leaks as follows, these new AGs will not been freed during umount , because of these new AGs are not visible(that is included in mp->m_sb.sb_agcount). unreferenced object 0xffff88810be40200 (size 512): comm "xfs_growfs", pid 857, jiffies 4294909093 hex dump (first 32 bytes): 00 c0 c1 05 81 88 ff ff 04 00 00 00 00 00 00 00 ................ 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace (crc 381741e2): [<ffffffff8191aef6>] __kmalloc+0x386/0x4f0 [<ffffffff82553e65>] kmem_alloc+0xb5/0x2f0 [<ffffffff8238dac5>] xfs_initialize_perag+0xc5/0x810 [<ffffffff824f679c>] xfs_growfs_data+0x9bc/0xbc0 [<ffffffff8250b90e>] xfs_file_ioctl+0x5fe/0x14d0 [<ffffffff81aa5194>] __x64_sys_ioctl+0x144/0x1c0 [<ffffffff83c3d81f>] do_syscall_64+0x3f/0xe0 [<ffffffff83e00087>] entry_SYSCALL_64_after_hwframe+0x62/0x6a unreferenced object 0xffff88810be40800 (size 512): comm "xfs_growfs", pid 857, jiffies 4294909093 hex dump (first 32 bytes): 20 00 00 00 00 00 00 00 57 ef be dc 00 00 00 00 .......W....... 10 08 e4 0b 81 88 ff ff 10 08 e4 0b 81 88 ff ff ................ backtrace (crc bde50e2d): [<ffffffff8191b43a>] __kmalloc_node+0x3da/0x540 [<ffffffff81814489>] kvmalloc_node+0x99/0x160 [<ffffffff8286acff>] bucket_table_alloc.isra.0+0x5f/0x400 [<ffffffff8286bdc5>] rhashtable_init+0x405/0x760 [<ffffffff8238dda3>] xfs_initialize_perag+0x3a3/0x810 [<ffffffff824f679c>] xfs_growfs_data+0x9bc/0xbc0 [<ffffffff8250b90e>] xfs_file_ioctl+0x5fe/0x14d0 [<ffffffff81aa5194>] __x64_sys_ioctl+0x144/0x1c0 [<ffffffff83c3d81f>] do_syscall_64+0x3f/0xe0 [<ffffffff83e00087>] entry_SYSCALL_64_after_hwframe+0x62/0x6a Now, the logic for freeing perag in xfs_free_perag() and xfs_initialize_perag() error handling is essentially the same. Factor out xfs_free_perag_range() from xfs_free_perag(), used for freeing unused perag within a specified range, inclued when growfs fails. Signed-off-by: Long Li <leo.lilong@huawei.com> --- fs/xfs/libxfs/xfs_ag.c | 35 ++++++++++++++++++++--------------- fs/xfs/libxfs/xfs_ag.h | 2 ++ fs/xfs/xfs_fsops.c | 5 ++++- 3 files changed, 26 insertions(+), 16 deletions(-) diff --git a/fs/xfs/libxfs/xfs_ag.c b/fs/xfs/libxfs/xfs_ag.c index 11ed048c350c..edec03ab09aa 100644 --- a/fs/xfs/libxfs/xfs_ag.c +++ b/fs/xfs/libxfs/xfs_ag.c @@ -245,16 +245,20 @@ __xfs_free_perag( } /* - * Free up the per-ag resources associated with the mount structure. + * Free per-ag within the specified range, if agno is not found in the + * radix tree, then it means that agno and subsequent AGs have not been + * initialized. */ void -xfs_free_perag( - struct xfs_mount *mp) +xfs_free_perag_range( + xfs_mount_t *mp, + xfs_agnumber_t agstart, + xfs_agnumber_t agend) { - struct xfs_perag *pag; xfs_agnumber_t agno; + struct xfs_perag *pag; - for (agno = 0; agno < mp->m_sb.sb_agcount; agno++) { + for (agno = agstart; agno < agend; agno++) { spin_lock(&mp->m_perag_lock); pag = radix_tree_delete(&mp->m_perag_tree, agno); spin_unlock(&mp->m_perag_lock); @@ -274,6 +278,16 @@ xfs_free_perag( } } +/* + * Free up the per-ag resources associated with the mount structure. + */ +void +xfs_free_perag( + struct xfs_mount *mp) +{ + xfs_free_perag_range(mp, 0, mp->m_sb.sb_agcount); +} + /* Find the size of the AG, in blocks. */ static xfs_agblock_t __xfs_ag_block_count( @@ -432,16 +446,7 @@ xfs_initialize_perag( kmem_free(pag); out_unwind_new_pags: /* unwind any prior newly initialized pags */ - for (index = first_initialised; index < agcount; index++) { - spin_lock(&mp->m_perag_lock); - pag = radix_tree_delete(&mp->m_perag_tree, index); - spin_unlock(&mp->m_perag_lock); - if (!pag) - break; - xfs_buf_hash_destroy(pag); - xfs_defer_drain_free(&pag->pag_intents_drain); - kmem_free(pag); - } + xfs_free_perag_range(mp, first_initialised, agcount); return error; } diff --git a/fs/xfs/libxfs/xfs_ag.h b/fs/xfs/libxfs/xfs_ag.h index 2e0aef87d633..927f737f84ec 100644 --- a/fs/xfs/libxfs/xfs_ag.h +++ b/fs/xfs/libxfs/xfs_ag.h @@ -136,6 +136,8 @@ __XFS_AG_OPSTATE(agfl_needs_reset, AGFL_NEEDS_RESET) int xfs_initialize_perag(struct xfs_mount *mp, xfs_agnumber_t agcount, xfs_rfsblock_t dcount, xfs_agnumber_t *maxagi); int xfs_initialize_perag_data(struct xfs_mount *mp, xfs_agnumber_t agno); +void xfs_free_perag_range(xfs_mount_t *mp, xfs_agnumber_t agstart, + xfs_agnumber_t agend); void xfs_free_perag(struct xfs_mount *mp); /* Passive AG references */ diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c index 57076a25f17d..144fea4374af 100644 --- a/fs/xfs/xfs_fsops.c +++ b/fs/xfs/xfs_fsops.c @@ -153,7 +153,7 @@ xfs_growfs_data_private( error = xfs_trans_alloc(mp, &M_RES(mp)->tr_growdata, -delta, 0, 0, &tp); if (error) - return error; + goto out_free_unused_perag; last_pag = xfs_perag_get(mp, oagcount - 1); if (delta > 0) { @@ -227,6 +227,9 @@ xfs_growfs_data_private( out_trans_cancel: xfs_trans_cancel(tp); +out_free_unused_perag: + if (nagcount > oagcount) + xfs_free_perag_range(mp, oagcount, nagcount); return error; } -- 2.31.1 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v2 3/3] xfs: fix perag leak when growfs fails 2023-12-09 12:21 ` [PATCH v2 3/3] xfs: fix perag leak when growfs fails Long Li @ 2023-12-11 22:03 ` Darrick J. Wong 2023-12-12 13:40 ` Long Li 0 siblings, 1 reply; 8+ messages in thread From: Darrick J. Wong @ 2023-12-11 22:03 UTC (permalink / raw) To: Long Li; +Cc: chandanbabu, linux-xfs, yi.zhang, houtao1, yangerkun On Sat, Dec 09, 2023 at 08:21:07PM +0800, Long Li wrote: > During growfs, if new ag in memory has been initialized, however > sb_agcount has not been updated, if an error occurs at this time it > will cause perag leaks as follows, these new AGs will not been freed > during umount , because of these new AGs are not visible(that is > included in mp->m_sb.sb_agcount). > > unreferenced object 0xffff88810be40200 (size 512): > comm "xfs_growfs", pid 857, jiffies 4294909093 > hex dump (first 32 bytes): > 00 c0 c1 05 81 88 ff ff 04 00 00 00 00 00 00 00 ................ > 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > backtrace (crc 381741e2): > [<ffffffff8191aef6>] __kmalloc+0x386/0x4f0 > [<ffffffff82553e65>] kmem_alloc+0xb5/0x2f0 > [<ffffffff8238dac5>] xfs_initialize_perag+0xc5/0x810 > [<ffffffff824f679c>] xfs_growfs_data+0x9bc/0xbc0 > [<ffffffff8250b90e>] xfs_file_ioctl+0x5fe/0x14d0 > [<ffffffff81aa5194>] __x64_sys_ioctl+0x144/0x1c0 > [<ffffffff83c3d81f>] do_syscall_64+0x3f/0xe0 > [<ffffffff83e00087>] entry_SYSCALL_64_after_hwframe+0x62/0x6a > unreferenced object 0xffff88810be40800 (size 512): > comm "xfs_growfs", pid 857, jiffies 4294909093 > hex dump (first 32 bytes): > 20 00 00 00 00 00 00 00 57 ef be dc 00 00 00 00 .......W....... > 10 08 e4 0b 81 88 ff ff 10 08 e4 0b 81 88 ff ff ................ > backtrace (crc bde50e2d): > [<ffffffff8191b43a>] __kmalloc_node+0x3da/0x540 > [<ffffffff81814489>] kvmalloc_node+0x99/0x160 > [<ffffffff8286acff>] bucket_table_alloc.isra.0+0x5f/0x400 > [<ffffffff8286bdc5>] rhashtable_init+0x405/0x760 > [<ffffffff8238dda3>] xfs_initialize_perag+0x3a3/0x810 > [<ffffffff824f679c>] xfs_growfs_data+0x9bc/0xbc0 > [<ffffffff8250b90e>] xfs_file_ioctl+0x5fe/0x14d0 > [<ffffffff81aa5194>] __x64_sys_ioctl+0x144/0x1c0 > [<ffffffff83c3d81f>] do_syscall_64+0x3f/0xe0 > [<ffffffff83e00087>] entry_SYSCALL_64_after_hwframe+0x62/0x6a > > Now, the logic for freeing perag in xfs_free_perag() and > xfs_initialize_perag() error handling is essentially the same. Factor > out xfs_free_perag_range() from xfs_free_perag(), used for freeing > unused perag within a specified range, inclued when growfs fails. > > Signed-off-by: Long Li <leo.lilong@huawei.com> > --- > fs/xfs/libxfs/xfs_ag.c | 35 ++++++++++++++++++++--------------- > fs/xfs/libxfs/xfs_ag.h | 2 ++ > fs/xfs/xfs_fsops.c | 5 ++++- > 3 files changed, 26 insertions(+), 16 deletions(-) > > diff --git a/fs/xfs/libxfs/xfs_ag.c b/fs/xfs/libxfs/xfs_ag.c > index 11ed048c350c..edec03ab09aa 100644 > --- a/fs/xfs/libxfs/xfs_ag.c > +++ b/fs/xfs/libxfs/xfs_ag.c > @@ -245,16 +245,20 @@ __xfs_free_perag( > } > > /* > - * Free up the per-ag resources associated with the mount structure. > + * Free per-ag within the specified range, if agno is not found in the > + * radix tree, then it means that agno and subsequent AGs have not been > + * initialized. > */ > void > -xfs_free_perag( > - struct xfs_mount *mp) > +xfs_free_perag_range( > + xfs_mount_t *mp, Please stop reverting the codebase's use of typedefs for struct pointers. > + xfs_agnumber_t agstart, > + xfs_agnumber_t agend) This is also ^^ unnecessary indentation. > { > - struct xfs_perag *pag; > xfs_agnumber_t agno; > + struct xfs_perag *pag; ...and unnecessary rearranging of variables... > > - for (agno = 0; agno < mp->m_sb.sb_agcount; agno++) { > + for (agno = agstart; agno < agend; agno++) { > spin_lock(&mp->m_perag_lock); > pag = radix_tree_delete(&mp->m_perag_tree, agno); > spin_unlock(&mp->m_perag_lock); > @@ -274,6 +278,16 @@ xfs_free_perag( > } > } > > +/* > + * Free up the per-ag resources associated with the mount structure. > + */ > +void > +xfs_free_perag( > + struct xfs_mount *mp) > +{ > + xfs_free_perag_range(mp, 0, mp->m_sb.sb_agcount); > +} > + > /* Find the size of the AG, in blocks. */ > static xfs_agblock_t > __xfs_ag_block_count( > @@ -432,16 +446,7 @@ xfs_initialize_perag( > kmem_free(pag); > out_unwind_new_pags: > /* unwind any prior newly initialized pags */ > - for (index = first_initialised; index < agcount; index++) { > - spin_lock(&mp->m_perag_lock); > - pag = radix_tree_delete(&mp->m_perag_tree, index); > - spin_unlock(&mp->m_perag_lock); > - if (!pag) > - break; > - xfs_buf_hash_destroy(pag); > - xfs_defer_drain_free(&pag->pag_intents_drain); > - kmem_free(pag); > - } > + xfs_free_perag_range(mp, first_initialised, agcount); > return error; > } > > diff --git a/fs/xfs/libxfs/xfs_ag.h b/fs/xfs/libxfs/xfs_ag.h > index 2e0aef87d633..927f737f84ec 100644 > --- a/fs/xfs/libxfs/xfs_ag.h > +++ b/fs/xfs/libxfs/xfs_ag.h > @@ -136,6 +136,8 @@ __XFS_AG_OPSTATE(agfl_needs_reset, AGFL_NEEDS_RESET) > int xfs_initialize_perag(struct xfs_mount *mp, xfs_agnumber_t agcount, > xfs_rfsblock_t dcount, xfs_agnumber_t *maxagi); > int xfs_initialize_perag_data(struct xfs_mount *mp, xfs_agnumber_t agno); > +void xfs_free_perag_range(xfs_mount_t *mp, xfs_agnumber_t agstart, > + xfs_agnumber_t agend); > void xfs_free_perag(struct xfs_mount *mp); > > /* Passive AG references */ > diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c > index 57076a25f17d..144fea4374af 100644 > --- a/fs/xfs/xfs_fsops.c > +++ b/fs/xfs/xfs_fsops.c > @@ -153,7 +153,7 @@ xfs_growfs_data_private( > error = xfs_trans_alloc(mp, &M_RES(mp)->tr_growdata, -delta, 0, > 0, &tp); > if (error) > - return error; > + goto out_free_unused_perag; > > last_pag = xfs_perag_get(mp, oagcount - 1); > if (delta > 0) { > @@ -227,6 +227,9 @@ xfs_growfs_data_private( > > out_trans_cancel: > xfs_trans_cancel(tp); > +out_free_unused_perag: > + if (nagcount > oagcount) > + xfs_free_perag_range(mp, oagcount, nagcount); Complaints aside, this addition looks correct. --D > return error; > } > > -- > 2.31.1 > > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 3/3] xfs: fix perag leak when growfs fails 2023-12-11 22:03 ` Darrick J. Wong @ 2023-12-12 13:40 ` Long Li 0 siblings, 0 replies; 8+ messages in thread From: Long Li @ 2023-12-12 13:40 UTC (permalink / raw) To: Darrick J. Wong; +Cc: chandanbabu, linux-xfs, yi.zhang, houtao1, yangerkun On Mon, Dec 11, 2023 at 02:03:05PM -0800, Darrick J. Wong wrote: > On Sat, Dec 09, 2023 at 08:21:07PM +0800, Long Li wrote: > > During growfs, if new ag in memory has been initialized, however > > sb_agcount has not been updated, if an error occurs at this time it > > will cause perag leaks as follows, these new AGs will not been freed > > during umount , because of these new AGs are not visible(that is > > included in mp->m_sb.sb_agcount). > > > > unreferenced object 0xffff88810be40200 (size 512): > > comm "xfs_growfs", pid 857, jiffies 4294909093 > > hex dump (first 32 bytes): > > 00 c0 c1 05 81 88 ff ff 04 00 00 00 00 00 00 00 ................ > > 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > > backtrace (crc 381741e2): > > [<ffffffff8191aef6>] __kmalloc+0x386/0x4f0 > > [<ffffffff82553e65>] kmem_alloc+0xb5/0x2f0 > > [<ffffffff8238dac5>] xfs_initialize_perag+0xc5/0x810 > > [<ffffffff824f679c>] xfs_growfs_data+0x9bc/0xbc0 > > [<ffffffff8250b90e>] xfs_file_ioctl+0x5fe/0x14d0 > > [<ffffffff81aa5194>] __x64_sys_ioctl+0x144/0x1c0 > > [<ffffffff83c3d81f>] do_syscall_64+0x3f/0xe0 > > [<ffffffff83e00087>] entry_SYSCALL_64_after_hwframe+0x62/0x6a > > unreferenced object 0xffff88810be40800 (size 512): > > comm "xfs_growfs", pid 857, jiffies 4294909093 > > hex dump (first 32 bytes): > > 20 00 00 00 00 00 00 00 57 ef be dc 00 00 00 00 .......W....... > > 10 08 e4 0b 81 88 ff ff 10 08 e4 0b 81 88 ff ff ................ > > backtrace (crc bde50e2d): > > [<ffffffff8191b43a>] __kmalloc_node+0x3da/0x540 > > [<ffffffff81814489>] kvmalloc_node+0x99/0x160 > > [<ffffffff8286acff>] bucket_table_alloc.isra.0+0x5f/0x400 > > [<ffffffff8286bdc5>] rhashtable_init+0x405/0x760 > > [<ffffffff8238dda3>] xfs_initialize_perag+0x3a3/0x810 > > [<ffffffff824f679c>] xfs_growfs_data+0x9bc/0xbc0 > > [<ffffffff8250b90e>] xfs_file_ioctl+0x5fe/0x14d0 > > [<ffffffff81aa5194>] __x64_sys_ioctl+0x144/0x1c0 > > [<ffffffff83c3d81f>] do_syscall_64+0x3f/0xe0 > > [<ffffffff83e00087>] entry_SYSCALL_64_after_hwframe+0x62/0x6a > > > > Now, the logic for freeing perag in xfs_free_perag() and > > xfs_initialize_perag() error handling is essentially the same. Factor > > out xfs_free_perag_range() from xfs_free_perag(), used for freeing > > unused perag within a specified range, inclued when growfs fails. > > > > Signed-off-by: Long Li <leo.lilong@huawei.com> > > --- > > fs/xfs/libxfs/xfs_ag.c | 35 ++++++++++++++++++++--------------- > > fs/xfs/libxfs/xfs_ag.h | 2 ++ > > fs/xfs/xfs_fsops.c | 5 ++++- > > 3 files changed, 26 insertions(+), 16 deletions(-) > > > > diff --git a/fs/xfs/libxfs/xfs_ag.c b/fs/xfs/libxfs/xfs_ag.c > > index 11ed048c350c..edec03ab09aa 100644 > > --- a/fs/xfs/libxfs/xfs_ag.c > > +++ b/fs/xfs/libxfs/xfs_ag.c > > @@ -245,16 +245,20 @@ __xfs_free_perag( > > } > > > > /* > > - * Free up the per-ag resources associated with the mount structure. > > + * Free per-ag within the specified range, if agno is not found in the > > + * radix tree, then it means that agno and subsequent AGs have not been > > + * initialized. > > */ > > void > > -xfs_free_perag( > > - struct xfs_mount *mp) > > +xfs_free_perag_range( > > + xfs_mount_t *mp, > > Please stop reverting the codebase's use of typedefs for struct > pointers. > > > + xfs_agnumber_t agstart, > > + xfs_agnumber_t agend) > > This is also ^^ unnecessary indentation. > > > { > > - struct xfs_perag *pag; > > xfs_agnumber_t agno; > > + struct xfs_perag *pag; > > ...and unnecessary rearranging of variables... I ignored these minor issues, thanks for pointing them out, and it will be changed in the next version. Thanks, Long Li ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2023-12-12 13:36 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-12-09 12:21 [PATCH v2 1/3] xfs: add lock protection when remove perag from radix tree Long Li 2023-12-09 12:21 ` [PATCH v2 2/3] xfs: don't assert perag when free perag Long Li 2023-12-11 21:42 ` Darrick J. Wong 2023-12-11 22:00 ` Dave Chinner 2023-12-12 13:28 ` Long Li 2023-12-09 12:21 ` [PATCH v2 3/3] xfs: fix perag leak when growfs fails Long Li 2023-12-11 22:03 ` Darrick J. Wong 2023-12-12 13:40 ` Long Li
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox