From mboxrd@z Thu Jan 1 00:00:00 1970 From: piaojun Date: Mon, 25 Mar 2019 09:32:54 +0800 Subject: [Ocfs2-devel] [PATCH] ocfs2: fix ocfs2 read inode data panic in ocfs2_iget In-Reply-To: <01433584-44d5-5b27-a0c9-fff4bff74864@gmail.com> References: <1553477217-2736-1-git-send-email-sunny.s.zhang@oracle.com> <01433584-44d5-5b27-a0c9-fff4bff74864@gmail.com> Message-ID: <5C982FC6.6050900@huawei.com> List-Id: MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit To: ocfs2-devel@oss.oracle.com On 2019/3/25 9:31, Joseph Qi wrote: > > > On 19/3/25 09:26, Shuning Zhang wrote: >> In some cases, the function ocfs2_iget read the data of inode, which has >> been deleted for some reason. That will make the system panic. So We >> should judge whether this inode has been deleted, and tell the caller >> that the inode is a bad inode. >> >> [283465.542049] OCFS2: ERROR (device sdp): ocfs2_validate_inode_block: >> Invalid dinode #580640: OCFS2_VALID_FL not set >> > > Looks like it is a filesystem corruption bug. > So we'd better find where it is introduced and try to fix it there. Agreed, sounds like read-only problem. > > Thanks, > Joseph > >> [283465.545490] Kernel panic - not syncing: OCFS2: (device sdp): panic forced >> after error >> >> [283465.546889] CPU: 5 PID: 12416 Comm: nfsd Tainted: G W >> 4.1.12-124.18.6.el6uek.bug28762940v3.x86_64 #2 >> [283465.548382] Hardware name: VMware, Inc. VMware Virtual Platform/440BX >> Desktop Reference Platform, BIOS 6.00 09/21/2015 >> [283465.549657] 0000000000000000 ffff8800a56fb7b8 ffffffff816e839c >> ffffffffa0514758 >> [283465.550392] 000000000008dc20 ffff8800a56fb838 ffffffff816e62d3 >> 0000000000000008 >> [283465.551056] ffff880000000010 ffff8800a56fb848 ffff8800a56fb7e8 >> ffff88005df9f000 >> [283465.551710] Call Trace: >> [283465.552516] [] dump_stack+0x63/0x81 >> [283465.553291] [] panic+0xcb/0x21b >> [283465.554037] [] ocfs2_handle_error+0xf0/0xf0 [ocfs2] >> [283465.554882] [] __ocfs2_error+0x67/0x70 [ocfs2] >> [283465.555768] [] ocfs2_validate_inode_block+0x229/0x230 >> [ocfs2] >> [283465.556683] [] ocfs2_read_blocks+0x46c/0x7b0 [ocfs2] >> [283465.557408] [] ? ocfs2_inode_cache_io_unlock+0x20/0x20 >> [ocfs2] >> [283465.557973] [] ocfs2_read_inode_block_full+0x3b/0x60 >> [ocfs2] >> [283465.558525] [] ocfs2_iget+0x4aa/0x880 [ocfs2] >> [283465.559082] [] ocfs2_get_parent+0x9e/0x220 [ocfs2] >> [283465.559622] [] reconnect_path+0xb5/0x300 >> [283465.560156] [] exportfs_decode_fh+0xf6/0x2b0 >> [283465.560708] [] ? nfsd_proc_getattr+0xa0/0xa0 [nfsd] >> [283465.561262] [] ? prepare_creds+0x26/0x110 >> [283465.561932] [] fh_verify+0x350/0x660 [nfsd] >> [283465.562862] [] ? nfsd_cache_lookup+0x44/0x630 [nfsd] >> [283465.563697] [] nfsd3_proc_getattr+0x69/0xf0 [nfsd] >> [283465.564510] [] nfsd_dispatch+0xe0/0x290 [nfsd] >> [283465.565358] [] ? svc_tcp_adjust_wspace+0x12/0x30 >> [sunrpc] >> [283465.566272] [] svc_process_common+0x412/0x6a0 [sunrpc] >> [283465.567155] [] svc_process+0x123/0x210 [sunrpc] >> [283465.568020] [] nfsd+0xff/0x170 [nfsd] >> [283465.568962] [] ? nfsd_destroy+0x80/0x80 [nfsd] >> [283465.570112] [] kthread+0xcb/0xf0 >> [283465.571099] [] ? kthread_create_on_node+0x180/0x180 >> [283465.572114] [] ret_from_fork+0x58/0x90 >> [283465.573156] [] ? kthread_create_on_node+0x180/0x180 >> --- >> fs/ocfs2/inode.c | 22 ++++++++++++++++++++++ >> 1 files changed, 22 insertions(+), 0 deletions(-) >> >> diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c >> index b254416..a7f8160 100644 >> --- a/fs/ocfs2/inode.c >> +++ b/fs/ocfs2/inode.c >> @@ -513,6 +513,16 @@ static int ocfs2_read_locked_inode(struct inode *inode, >> status = -EINVAL; >> fe = (struct ocfs2_dinode *) bh->b_data; >> >> + /* >> + * In some cases, this inode has been deleted, but the data of >> + * inode is read directly. So We should judge whether this inode >> + * has been deleted. >> + * */ >> + if (unlikely(fe->i_dtime != 0)) { >> + mlog(ML_NOTICE, "The inode %llu has been deleted\n", >> + le64_to_cpu(fe->i_blkno)); >> + goto bail; >> + } >> /* >> * This is a code bug. Right now the caller needs to >> * understand whether it is asking for a system file inode or >> @@ -1331,6 +1341,18 @@ int ocfs2_validate_inode_block(struct super_block *sb, >> >> BUG_ON(!buffer_uptodate(bh)); >> >> + /* >> + * In some cases, this inode has been deleted, but the data of >> + * inode is read directly. This is not a data corruption.So >> + * We should judge whether this inode has been deleted. >> + * */ >> + if (unlikely(di->i_dtime != 0)) { >> + rc = -EINVAL; >> + mlog(ML_NOTICE, "The inode %llu has been deleted\n", >> + le64_to_cpu(di->i_blkno)); >> + goto bail; >> + } >> + >> /* >> * If the ecc fails, we return the error but otherwise >> * leave the filesystem running. We know any error is >> > > _______________________________________________ > Ocfs2-devel mailing list > Ocfs2-devel at oss.oracle.com > https://oss.oracle.com/mailman/listinfo/ocfs2-devel >