* [PATCH 0/2] ocfs2: Fix deadlock in extent_map and handle zero
From: Mohammed Anees @ 2024-09-18 17:20 UTC
To: ocfs2-devel, linux-kernel
Cc: Mark Fasheh, Joel Becker, Joseph Qi, Mohammed Anees

This patch series addresses two distinct issues within the OCFS2
file system:

1. A potential deadlock occurring in ocfs2_read_virt_blocks due to
   contention on ip_alloc_sem.
2. A kernel BUG triggered when the cluster-to-block conversion in
   ocfs2_add_clusters_in_btree yields 0.

Mohammed Anees (2):
  ocfs2: Fix deadlock in ocfs2_read_virt_blocks
  osfs2: Fix kernel BUG in ocfs2_write_cluster

 fs/ocfs2/alloc.c      |  7 +++++++
 fs/ocfs2/extent_map.c | 16 +++++++++++++++-
 2 files changed, 22 insertions(+), 1 deletion(-)

--
2.46.0

* [PATCH 1/2] ocfs2: Fix deadlock in ocfs2_read_virt_blocks
From: Mohammed Anees @ 2024-09-18 17:20 UTC
To: ocfs2-devel, linux-kernel
Cc: Mark Fasheh, Joel Becker, Joseph Qi, Mohammed Anees, syzbot+18a87160c7d64ba2e2f6

syzbot has found a kernel BUG in ocfs2_write_cluster_by_desc,
while the next patch in the series resolves this, another
bug has been detected due to a potential deadlock [1].

The scenario is depicted here,

        CPU0                    CPU1
  lock(&ocfs2_file_ip_alloc_sem_key);
                               lock(&osb->system_file_mutex);
                               lock(&ocfs2_file_ip_alloc_sem_key);
  lock(&osb->system_file_mutex);

The function calls which could lead to this are:

CPU0
ocfs2_write_begin - lock(&ocfs2_file_ip_alloc_sem_key);
.
.
.
ocfs2_get_system_file_inode - lock(&osb->system_file_mutex);

CPU1 -
ocfs2_get_system_file_inode - lock(&osb->system_file_mutex);
.
.
.
ocfs2_read_virt_blocks - lock(&ocfs2_file_ip_alloc_sem_key);

This issue can be resolved by replacing down_read with
down_read_trylock in ocfs2_read_virt_blocks.

[1] https://syzkaller.appspot.com/bug?extid=18a87160c7d64ba2e2f6

Reported-and-tested-by: syzbot+18a87160c7d64ba2e2f6@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=18a87160c7d64ba2e2f6
Signed-off-by: Mohammed Anees <pvmohammedanees2003@gmail.com>
---
 fs/ocfs2/extent_map.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/fs/ocfs2/extent_map.c b/fs/ocfs2/extent_map.c
index 70a768b62..f83d0a3b6 100644
--- a/fs/ocfs2/extent_map.c
+++ b/fs/ocfs2/extent_map.c
@@ -12,6 +12,7 @@
 #include <linux/slab.h>
 #include <linux/types.h>
 #include <linux/fiemap.h>
+#include <linux/delay.h>
 
 #include <cluster/masklog.h>
 
@@ -961,6 +962,8 @@ int ocfs2_read_virt_blocks(struct inode *inode, u64 v_block, int nr,
 	int rc = 0;
 	u64 p_block, p_count;
 	int i, count, done = 0;
+	int retries, max_retries = 5;
+	int retry_delay_ms = 30;
 
 	trace_ocfs2_read_virt_blocks(
 	     inode, (unsigned long long)v_block, nr, bhs, flags,
@@ -973,7 +976,18 @@ int ocfs2_read_virt_blocks(struct inode *inode, u64 v_block, int nr,
 	}
 
 	while (done < nr) {
-		down_read(&OCFS2_I(inode)->ip_alloc_sem);
+		retries = 0;
+		while (retries < max_retries) {
+			if (down_read_trylock(&OCFS2_I(inode)->ip_alloc_sem))
+				break;  // Lock acquired
+			msleep(retry_delay_ms);
+			retries++;
+		}
+		if (retries == max_retries) {
+			rc = -EAGAIN;
+			mlog(ML_ERROR, "Cannot acquire lock\n");
+			break;
+		}
 		rc = ocfs2_extent_map_get_blocks(inode, v_block + done,
 						 &p_block, &p_count, NULL);
 		up_read(&OCFS2_I(inode)->ip_alloc_sem);
--
2.46.0

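The retry loop in the hunk above can be followed more easily outside the kernel. Below is a minimal userspace sketch of the same bounded-retry pattern, not the submitted code. Everything in it is an assumption made for illustration: lock_with_bounded_retry, trylock_fn and toy_trylock are hypothetical names, nanosleep() stands in for msleep(), and the callback stands in for down_read_trylock(), which returns nonzero when the lock was acquired.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical callback type: returns true when the lock was acquired,
 * mirroring the nonzero-on-success convention of down_read_trylock().
 */
typedef bool (*trylock_fn)(void *lock);

/*
 * Try to take a lock at most max_retries times, sleeping delay_ms
 * between failed attempts. Returns 0 on success and -EAGAIN once the
 * attempts are exhausted.
 */
static int lock_with_bounded_retry(trylock_fn trylock, void *lock,
				   int max_retries, long delay_ms)
{
	struct timespec delay = {
		.tv_sec = delay_ms / 1000,
		.tv_nsec = (delay_ms % 1000) * 1000000L,
	};
	int retries;

	assert(trylock != NULL && max_retries > 0 && delay_ms >= 0);

	for (retries = 0; retries < max_retries; retries++) {
		if (trylock(lock))
			return 0;		/* lock acquired */
		nanosleep(&delay, NULL);	/* stand-in for msleep() */
	}
	return -EAGAIN;				/* gave up, caller must cope */
}

/*
 * Toy lock for demonstration only: it reports "busy" until its counter
 * of remaining busy attempts has dropped to zero.
 */
static bool toy_trylock(void *lock)
{
	int *busy_attempts = lock;

	if (*busy_attempts > 0) {
		(*busy_attempts)--;
		return false;
	}
	return true;
}
```

With the values used in the patch (5 attempts, 30 ms apart), a reader that never gets the lock gives up after roughly 150 ms and returns -EAGAIN. The pattern bounds the wait rather than changing the lock order, so every caller has to be prepared to handle that error.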
* Re: [PATCH 1/2] ocfs2: Fix deadlock in ocfs2_read_virt_blocks
From: heming.zhao @ 2024-09-19 8:13 UTC
To: Mohammed Anees, ocfs2-devel, linux-kernel
Cc: Mark Fasheh, Joel Becker, Joseph Qi, syzbot+18a87160c7d64ba2e2f6

On 9/19/24 01:20, Mohammed Anees wrote:
> syzbot has found a kernel BUG in ocfs2_write_cluster_by_desc,
> while the next patch in the series resolves this, another
> bug has been detected due to a potential deadlock [1].
>
> The scenario is depicted here,
>
>         CPU0                    CPU1
>   lock(&ocfs2_file_ip_alloc_sem_key);
>                                lock(&osb->system_file_mutex);
>                                lock(&ocfs2_file_ip_alloc_sem_key);
>   lock(&osb->system_file_mutex);
>
> The function calls which could lead to this are:
>
> CPU0
> ocfs2_write_begin - lock(&ocfs2_file_ip_alloc_sem_key);
> .
> .
> .
> ocfs2_get_system_file_inode - lock(&osb->system_file_mutex);
>
> CPU1 -
> ocfs2_get_system_file_inode - lock(&osb->system_file_mutex);
> .
> .
> .
> ocfs2_read_virt_blocks - lock(&ocfs2_file_ip_alloc_sem_key);
>
> This issue can be resolved by replacing down_read with
> down_read_trylock in ocfs2_read_virt_blocks.
>
> [1] https://syzkaller.appspot.com/bug?extid=18a87160c7d64ba2e2f6

I haven't checked this patch, but in my view, the following URL is the
correct one:

https://syzkaller.appspot.com/bug?extid=e0055ea09f1f5e6fabdd

Heming

>
> Reported-and-tested-by: syzbot+18a87160c7d64ba2e2f6@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=18a87160c7d64ba2e2f6
> Signed-off-by: Mohammed Anees <pvmohammedanees2003@gmail.com>
> ---
>  fs/ocfs2/extent_map.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ocfs2/extent_map.c b/fs/ocfs2/extent_map.c
> index 70a768b62..f83d0a3b6 100644
> --- a/fs/ocfs2/extent_map.c
> +++ b/fs/ocfs2/extent_map.c
> @@ -12,6 +12,7 @@
>  #include <linux/slab.h>
>  #include <linux/types.h>
>  #include <linux/fiemap.h>
> +#include <linux/delay.h>
>
>  #include <cluster/masklog.h>
>
> @@ -961,6 +962,8 @@ int ocfs2_read_virt_blocks(struct inode *inode, u64 v_block, int nr,
>  	int rc = 0;
>  	u64 p_block, p_count;
>  	int i, count, done = 0;
> +	int retries, max_retries = 5;
> +	int retry_delay_ms = 30;
>
>  	trace_ocfs2_read_virt_blocks(
>  	     inode, (unsigned long long)v_block, nr, bhs, flags,
> @@ -973,7 +976,18 @@ int ocfs2_read_virt_blocks(struct inode *inode, u64 v_block, int nr,
>  	}
>
>  	while (done < nr) {
> -		down_read(&OCFS2_I(inode)->ip_alloc_sem);
> +		retries = 0;
> +		while (retries < max_retries) {
> +			if (down_read_trylock(&OCFS2_I(inode)->ip_alloc_sem))
> +				break;  // Lock acquired
> +			msleep(retry_delay_ms);
> +			retries++;
> +		}
> +		if (retries == max_retries) {
> +			rc = -EAGAIN;
> +			mlog(ML_ERROR, "Cannot acquire lock\n");
> +			break;
> +		}
>  		rc = ocfs2_extent_map_get_blocks(inode, v_block + done,
>  						 &p_block, &p_count, NULL);
>  		up_read(&OCFS2_I(inode)->ip_alloc_sem);

* Re: [PATCH 1/2] ocfs2: Fix deadlock in ocfs2_read_virt_blocks
From: Mohammed Anees @ 2024-09-21 14:22 UTC
To: heming.zhao
Cc: jlbec, joseph.qi, linux-kernel, mark, ocfs2-devel, pvmohammedanees2003, syzbot+18a87160c7d64ba2e2f6

Hi,

I tested the patch against the bug at the URL you provided. It seems
to fix that bug, so that link is the appropriate one. I will link the
patch to that URL and send a new version.

Thanks!

* [PATCH 2/2] osfs2: Fix kernel BUG in ocfs2_write_cluster
From: Mohammed Anees @ 2024-09-18 17:20 UTC
To: ocfs2-devel, linux-kernel
Cc: Mark Fasheh, Joel Becker, Joseph Qi, Mohammed Anees, syzbot+18a87160c7d64ba2e2f6

syzbot has found a kernel BUG in ocfs2_write_cluster_by_desc [1].

The issue arises because ocfs2_insert_extent receives start_blk
as 0, which incorrectly maps to a physical address of 0. This
occurs when block is 0 after the call to ocfs2_clusters_to_blocks
which is invoked inside the ocfs2_add_clusters_in_btree. The block
value is then passed to ocfs2_insert_extent, leading to the problem.

[1] https://syzkaller.appspot.com/bug?extid=18a87160c7d64ba2e2f6

Reported-and-tested-by: syzbot+18a87160c7d64ba2e2f6@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=18a87160c7d64ba2e2f6
Signed-off-by: Mohammed Anees <pvmohammedanees2003@gmail.com>
---
 fs/ocfs2/alloc.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/fs/ocfs2/alloc.c b/fs/ocfs2/alloc.c
index 395e23920..926ffeed8 100644
--- a/fs/ocfs2/alloc.c
+++ b/fs/ocfs2/alloc.c
@@ -4843,6 +4843,13 @@ int ocfs2_add_clusters_in_btree(handle_t *handle,
 	}
 
 	block = ocfs2_clusters_to_blocks(osb->sb, bit_off);
+	if (block == 0) {
+		mlog(ML_ERROR, "Conversion resulted in zero block number");
+		status = -EIO;
+		need_free = 1;
+		goto bail;
+	}
+
 	trace_ocfs2_add_clusters_in_btree(
 	     (unsigned long long)ocfs2_metadata_cache_owner(et->et_ci),
 	     bit_off, num_bits);
--
2.46.0

* Re: [PATCH 2/2] osfs2: Fix kernel BUG in ocfs2_write_cluster
From: heming.zhao @ 2024-09-19 7:44 UTC
To: Mohammed Anees, ocfs2-devel, linux-kernel
Cc: Mark Fasheh, Joel Becker, Joseph Qi, syzbot+18a87160c7d64ba2e2f6

On 9/19/24 01:20, Mohammed Anees wrote:
> syzbot has found a kernel BUG in ocfs2_write_cluster_by_desc [1].
>
> The issue arises because ocfs2_insert_extent receives start_blk
> as 0, which incorrectly maps to a physical address of 0. This
> occurs when block is 0 after the call to ocfs2_clusters_to_blocks
> which is invoked inside the ocfs2_add_clusters_in_btree. The block
> value is then passed to ocfs2_insert_extent, leading to the problem.
>
> [1] https://syzkaller.appspot.com/bug?extid=18a87160c7d64ba2e2f6
>
> Reported-and-tested-by: syzbot+18a87160c7d64ba2e2f6@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=18a87160c7d64ba2e2f6
> Signed-off-by: Mohammed Anees <pvmohammedanees2003@gmail.com>
> ---
>  fs/ocfs2/alloc.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/fs/ocfs2/alloc.c b/fs/ocfs2/alloc.c
> index 395e23920..926ffeed8 100644
> --- a/fs/ocfs2/alloc.c
> +++ b/fs/ocfs2/alloc.c
> @@ -4843,6 +4843,13 @@ int ocfs2_add_clusters_in_btree(handle_t *handle,
>  	}
>
>  	block = ocfs2_clusters_to_blocks(osb->sb, bit_off);
> +	if (block == 0) {
> +		mlog(ML_ERROR, "Conversion resulted in zero block number");
> +		status = -EIO;
> +		need_free = 1;
> +		goto bail;
> +	}
> +

If you check this function, there is no IO operation, so -EIO is not
suitable.

In the ocfs2_clusters_to_blocks() code, there are two possible cases
where the result is zero: bit_off is 0, or bit_off is out of range for
a u64 after the bit shift.

It seems that the root cause is that __ocfs2_claim_clusters allocates
an incorrect bit_off.

-Heming

>  	trace_ocfs2_add_clusters_in_btree(
>  	     (unsigned long long)ocfs2_metadata_cache_owner(et->et_ci),
>  	     bit_off, num_bits);

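The two zero-result cases named in the review can be modeled in userspace. This is a sketch under stated assumptions, not the kernel helper: it assumes the conversion is a left shift of a 32-bit cluster number by the difference between the cluster-size and block-size bit counts, and clusters_to_blocks with its explicit bit-count parameters is a hypothetical stand-in for ocfs2_clusters_to_blocks().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Simplified model of a cluster-to-block conversion (assumption: the
 * result is the cluster number shifted left by the number of blocks
 * per cluster, expressed in bits).
 */
static uint64_t clusters_to_blocks(uint32_t clusters,
				   unsigned int clustersize_bits,
				   unsigned int blocksize_bits)
{
	unsigned int shift;

	assert(clustersize_bits >= blocksize_bits);
	shift = clustersize_bits - blocksize_bits;
	assert(shift < 64);	/* larger shifts are undefined in C */

	/* Bits shifted past bit 63 are silently discarded. */
	return (uint64_t)clusters << shift;
}
```

In this model, a cluster number of 0 always converts to block 0. A nonzero cluster number only converts to 0 when the shift pushes every set bit past bit 63, for example cluster 1 << 24 with a hypothetical shift of 40. With a 1 MiB cluster (20 bits) and 4 KiB block (12 bits), the shift is 8 and no 32-bit cluster number overflows, which is consistent with the review's suggestion to look at how bit_off was allocated.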
* Re: [PATCH 2/2] osfs2: Fix kernel BUG in ocfs2_write_cluster
From: Mohammed Anees @ 2024-09-21 17:30 UTC
To: heming.zhao
Cc: jlbec, joseph.qi, linux-kernel, mark, ocfs2-devel, pvmohammedanees2003, syzbot+18a87160c7d64ba2e2f6

Yes, you are absolutely right: __ocfs2_claim_clusters does indeed
allocate 0 as the bit_off. Looking into this, I believe the problem is
triggered by ocfs2_search_chain, which is called by
ocfs2_claim_suballoc_bits.

What do you think would be the best approach to solve this issue, and
which function should I be looking at? Any insights will be highly
appreciated.

Thanks!
