From: Joel Becker <Joel.Becker@oracle.com>
To: ocfs2-devel@oss.oracle.com
Subject: [Ocfs2-devel] [PATCH 1/1] OCFS2: anti stale inode for nfs (V3)
Date: Tue, 17 Feb 2009 17:37:01 -0800 [thread overview]
Message-ID: <20090218013700.GA26431@mail.oracle.com> (raw)
In-Reply-To: <200902171255.n1HCsxeo017544@acsinet13.oracle.com>
On Tue, Feb 17, 2009 at 08:53:47PM +0800, wengang wang wrote:
> For nfs exporting, ocfs2_get_dentry() returns the dentry for fh.
> ocfs2_get_dentry() may read from disk(when inode not in memory) without
> any cross cluster lock. this leads to load a stale inode.
>
> this patch fixes above problem.
This patch is almost there. Excellent!
> this patch is based on 1.4 git.
Going forward, fixes really need to be against mainline. Let's
finish out this patch against 1.4 and then you can port it to mainline.
But for the future, we fix against mainline and backport.
> + status = ocfs2_get_suballoc_slot_bit(osb, blkno, &suballoc_slot,
> + &suballoc_bit);
> + if (status < 0) {
> + if (status == -EINVAL) {
> + /* meta block never be re-allocated as data block.
> + * nfsd gives us wrong blkno */
> + status = -EEXIST;
> + } else {
> + mlog(ML_ERROR, "get alloc slot and bit failed %d\n",
> + status);
> + }
> + goto unlock_nfs_sync;
> + }
> + inode_alloc_inode =
> + ocfs2_get_system_file_inode(osb, INODE_ALLOC_SYSTEM_INODE,
> + suballoc_slot);
> + if (!inode_alloc_inode) {
> + status = -EEXIST;
> + mlog(ML_ERROR, "unable to get alloc inode in slot %u\n",
> + (u32)suballoc_slot);
> + goto unlock_nfs_sync;
> + }
> +
> + mutex_lock(&inode_alloc_inode->i_mutex);
> + status = ocfs2_inode_lock(inode_alloc_inode, &alloc_bh, 0);
> + if (status < 0) {
> + mlog(ML_ERROR, "lock on alloc inode on slot %u failed %d\n",
> + (u32)suballoc_slot, status);
> + goto unlock_mutex;
> + }
> + status = ocfs2_test_suballoc_bit(osb, inode_alloc_inode, alloc_bh,
> + blkno, suballoc_bit, &set);
> + if (status < 0) {
> + mlog(ML_ERROR, "test suballoc bit failed %d\n", status);
> + goto inode_unlock;
> + }
> + /* allocate bit is clear, inode is a stale inode */
> + if (!set) {
> + status = -ESTALE;
> + goto inode_unlock;
> + }
You can drop the suballocator lock here. Taking the lock has
made sure that other nodes flushed their journals. You have just
validated that the bit is set, and other nodes cannot clear the bit
until they get the nfs_sync lock, which you already hold. So it is safe
to call ocfs2_inode_unlock(inode_alloc_inode, 0) and
mutex_unlock(&inode_alloc_inode->i_mutex) before calling ocfs2_iget().
This has two benefits. Number 1, we don't take the suballoc
lock and the inode lock (in iget()) at the same time. The fewer locks
we take at the same time, the better. Number 2, this means the entire
suballocator lookup code above can be made into a subfunction. This
improves the readability of the code.
> +/* reads(hit disk) the inode specified by blkno to get suballoc_slot
> + * and suballoc_bit
> + * */
> +int ocfs2_get_suballoc_slot_bit(struct ocfs2_super *osb, u64 blkno,
> + u16 *suballoc_slot, u16 *suballoc_bit)
> +{
> + int status;
> + struct buffer_head *inode_bh = NULL;
> + struct ocfs2_dinode *inode_fe;
> +
> + mlog_entry("blkno: %llu\n", blkno);
> +
> + /* dirty read disk */
> + status = ocfs2_read_block(osb, blkno, &inode_bh, 0, NULL);
> + if (status < 0)
> + goto bail;
> +
> + inode_fe = (struct ocfs2_dinode *) inode_bh->b_data;
> + if (!OCFS2_IS_VALID_DINODE(inode_fe)) {
> + status = -EINVAL;
> + goto bail;
> + }
> +
> + if (suballoc_slot)
> + *suballoc_slot = le16_to_cpu(inode_fe->i_suballoc_slot);
Probably want to validate that suballoc_slot is within the range
of valid slot numbers. Just in case.
Otherwise, everything looks good. The nfs_sync_lock is good.
It will need to be added to debugfs.ocfs2's lock displays.
Joel
--
"Baby, even the losers
Get luck sometimes.
Even the losers
Keep a little bit of pride."
Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker at oracle.com
Phone: (650) 506-8127
next prev parent reply other threads:[~2009-02-18 1:37 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-02-17 12:53 [Ocfs2-devel] [PATCH 1/1] OCFS2: anti stale inode for nfs (V3) wengang wang
2009-02-18 1:37 ` Joel Becker [this message]
2009-02-18 8:30 ` Wengang Wang
2009-02-18 17:43 ` Joel Becker
2009-02-19 13:10 ` Wengang Wang
2009-02-19 17:35 ` Joel Becker
2009-02-19 19:29 ` Sunil Mushran
2009-04-10 12:50 ` Wengang Wang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20090218013700.GA26431@mail.oracle.com \
--to=joel.becker@oracle.com \
--cc=ocfs2-devel@oss.oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.