From: Jeff Liu <jeff.liu@oracle.com>
To: Brian Foster <bfoster@redhat.com>, "xfs@oss.sgi.com" <xfs@oss.sgi.com>
Subject: Re: [PATCH 3/10] xfs: consolidate xfs_inumbers
Date: Tue, 07 Jan 2014 14:58:19 +0800 [thread overview]
Message-ID: <52CBA58B.30504@oracle.com> (raw)
In-Reply-To: <52CAC622.2020805@redhat.com>
On 01/06 2014 23:05 PM, Brian Foster wrote:
> On 01/06/2014 01:23 AM, Jeff Liu wrote:
>> On 01/04 2014 04:53 AM, Brian Foster wrote:
>>> On 12/28/2013 06:20 AM, Jeff Liu wrote:
>>>> From: Jie Liu <jeff.liu@oracle.com>
>>>>
>>>> To fetch the file system number tables, we currently just ignore the
>>>> errors and proceed to loop over the next AG or bump agino to the next
>>>> chunk in case of btree operations failed, that is not properly because
>>>> those errors might hint us potential file system problems.
>>>>
>>>> This patch rework xfs_inumbers() to handle the btree operation errors
>>>> as well as the loop conditions. Also, add pre-checkups for the given
>>>> inode, we can save alloc/free the format buffer once against an invalid
>>>> inode number.
>>>>
>>>> Signed-off-by: Jie Liu <jeff.liu@oracle.com>
>>>> ---
>>>> fs/xfs/xfs_itable.c | 163 +++++++++++++++++++++++-----------------------------
>>>> 1 file changed, 72 insertions(+), 91 deletions(-)
>>>>
>>>> diff --git a/fs/xfs/xfs_itable.c b/fs/xfs/xfs_itable.c
>>>> index 692671c..4d262f6 100644
>>>> --- a/fs/xfs/xfs_itable.c
>>>> +++ b/fs/xfs/xfs_itable.c
> ...
>>>> - error = xfs_inobt_get_rec(cur, &r, &i);
>>>> - if (error || i == 0) {
>>>> - xfs_buf_relse(agbp);
>>>> - agbp = NULL;
>>>> - xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
>>>> - cur = NULL;
>>>> - agno++;
>>>> - agino = 0;
>>>> - continue;
>>>> + error = xfs_inobt_lookup(cur, agino, XFS_LOOKUP_GE, &stat);
>>>> + if (error)
>>>> + break;
>>>
>>> Isn't this lookup only needed after cursor initialization? i.e., we
>>> lookup once and increment through the records via xfs_btree_increment()
>>> below.
>>
>> Yes, but please see my comments below.
>>>
> ...
>>>> + if (!--left)
>>>> + break;
>>>> +
>>>> + error = xfs_btree_increment(cur, 0, &stat);
>>>> + if (error)
>>>> + break;
>>>> + if (stat) {
>>>> + /*
>>>> + * The agino value has already been bumped, just try
>>>> + * to skip up to it.
>>>> + */
>>>> + agino += XFS_INODES_PER_CHUNK;
>>>> + continue;
>>>> }
>>>
>>> Maybe it's just me, but this reads a little funny to me. In particular
>>> because we only get here if stat == 1. I wonder if this would look a bit
>>> cleaner if we pulled the next_ag labeled block below up into the goto,
>>> since that appears to be the only reference. Then just let the loop fall
>>> through.
>>
>> Actually, we would never hint xfs_btree_increment() at all no matter the default
>> or with my current change, because the left variable value is 1 as per the user
>> space imap implementation, thus, it should be decreased to 0 in this point.
>>
>> The problem is, how we define the user call interface originally, why it does not
>> support a 2nd argument which can be used to specified a desired icount to return a
>> limited number of inode mapping tables? i.e,
>>
>> imap_f() {
>>
>> if (argc != 2)
>> nent = 1;
>> else
>> nent = atoi(argv[1]);
>>
>> ....
>> }
>>
>> The imap command does not support another options but initialize bulkreq.icount = nent = 1;
>> In kernel, the bcount is evaluated to 1 and if (bufidx == bcount) should be true anyway as
>> below code logic:
>>
>> bcount = MIN(left, (int)(PAGE_SIZE / sizeof(*buffer)));
>>
>> if (bufidx == bcount) {
>> ....
>> formatter();
>> ....
>> }
>
> Hi Jeff,
Hi Brian,
>
> I see, but this is just a characteristic of the imap command in xfs_io.
> I can use an argument just by changing argmax from 0 to 1. Perhaps this
> is just a bug, since 'help imap' provides syntax that allows a parameter:
>
> xfs_io> help imap
> imap [nentries] -- inode map for filesystem of current file
I also thought we should fix the command usage in terms of the help menu,
though I missed a long history of XFS development... > >>
>> But, if the left value could be specified from the user progs, maybe we cannot simply
>> assuming "stat == 1" is always be true, in particular, in next path for implementing
>> per AG inumber mechanism, xfs_btree_increment() would probably succeed but "stat == 0"
>> if there is no right neighbors, and if the user could specified the 2nd imap option.
>>
>
> Hmm, that looked odd to me as well once you pointed that out. A quick
> printk check shows that xfs_btree_increment() does not fail in this
> scenario, but the subsequent xfs_inobt_get_rec() in the following
> iteration sets i == 0 and skips to the next ag appropriately. I agree
> that a tmp == 0 check in this scenario would be slightly more intuitive
> though.
>
> But either way, we should probably maintain the general algorithm of
> walking the btree explicitly rather than incrementing agino and issuing
> the lookup each iteration of the loop.
>
>> Hence, in this patch, I only intended to fix the btree error handling mechanism as a
>> preparation step to implement per AG inumber.
>>
>
> Right... I see that the following patches clean this up further (though
> I haven't looked in detail yet).
>
>>>
>>> Also, I think the agino addition here becomes unnecessary when the
>>> lookup issue above is addressed.
>>>
>>>> - }
>>>> +
>>>> +next_ag:
>>>> + xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
>>>> + cur = NULL;
>>>> + xfs_buf_relse(agbp);
>>>> + agbp = NULL;
>>>> + agino = 0;
>>>> + } while (++agno < mp->m_sb.sb_agcount);
>>>> +
>>>
>>> ... and just thinking about the logic that way highlights the bug here,
>>> where we bump agno due to the continue above (where IIUC, we intend to
>>> only move forward within the ag). Perhaps the ++agno should be part of
>>> the broken off 'next_ag' logic as well.
>>
>> Can you please point out a bit more about the bug? We bump agno due to
>> current run out of the current AG but the user want to show more imaps.
>>
>
> agno is bumped in the while condition at the end of the loop. This will
> also be executed in the following block if xfs_btree_increment()
> succeeds and finds a record to the right:
>
> error = xfs_btree_increment(cur, 0, &stat);
> if (error)
> break;
> if (stat) {
> ...
> agino += XFS_INODES_PER_CHUNK;
> ------> continue;
> }
>
> The continue statement will execute the loop condition to determine
> whether it should iterate the loop again or break. In this case, you're
> still moving along the original AG, but agno has been bumped
> erroneously. This error looks to be hidden by the non-looping scenario
> where nentries is 1, but make the xfsprogs fix I described above and
> you'll probably see what I'm describing. I only see a subset of the
> original imap output using a record count of 10 on an fs with a couple
> thousand inodes spread across a couple AGs.
>
> FWIW, it does look like the following patch fixes this particular
> problem via the introduction of xfs_perag_inumbers(), but we shouldn't
> introduce transient errors into the patch stream if we can help it.
Thanks for your so much detailed clarifications, now I can see what you
mean. :) and Yep, that is really a problem should be fixed here.
> It looks like the comments about the repeated inobt lookups still apply
> though (and the implementation as of patch 4 is still skipping inodes
> with nentries > 1, perhaps due to advancing agino multiple times per
> iteration..?).
Hmm...,now I think it does not needed, and the forth patch would be changed
in next round of post just as you pointed out, i.e, the xfs_inobt_lookup()
only need to call once, and then it could be moved out of the loop, and the
agino don't need to bump again so, an increased patch against [patch 4] would
looks like below:
diff --git a/fs/xfs/xfs_itable.c b/fs/xfs/xfs_itable.c
index 764e169..ff7efaf 100644
--- a/fs/xfs/xfs_itable.c
+++ b/fs/xfs/xfs_itable.c
@@ -595,6 +595,7 @@ xfs_perag_inumbers(
struct xfs_agi *agi;
struct xfs_buf *agbp;
struct xfs_btree_cur *cur;
+ int stat;
int error;
error = xfs_ialloc_read_agi(mp, NULL, agno, &agbp);
@@ -602,14 +603,13 @@ xfs_perag_inumbers(
return error;
agi = XFS_BUF_TO_AGI(agbp);
cur = xfs_inobt_init_cursor(mp, NULL, agbp, agno);
+ /* Done if failed to lookup or no inode chuck is found */
+ error = xfs_inobt_lookup(cur, agino, XFS_LOOKUP_GE, &stat);
+ if (error || stat == 0)
+ goto error0;
+
for (;;) {
struct xfs_inobt_rec_incore r;
- int stat;
-
- /* Done if failed to lookup or no inode chuck is found */
- error = xfs_inobt_lookup(cur, agino, XFS_LOOKUP_GE, &stat);
- if (error || stat == 0)
- break;
error = xfs_inobt_get_rec(cur, &r, &stat);
if (error)
@@ -629,9 +629,9 @@ xfs_perag_inumbers(
error = formatter(ubuffer, buffer, bufidx, &written);
if (error)
break;
- ubuffer += written;
count += bufidx;
bufidx = 0;
+ break;
}
if (!--ubleft)
break;
@@ -641,12 +641,6 @@ xfs_perag_inumbers(
/* Done if failed or there are no rightward entries */
break;
}
-
- /*
- * The agino value has already been bumped. Just try to skip
- * up to it.
- */
- agino += XFS_INODES_PER_CHUNK;
}
if (!error) {
Above temporary patch is just for demonstration purpose, I'll run extra tests
combine with your xfs_io/imap fix.
Thanks,
-Jeff
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
prev parent reply other threads:[~2014-01-07 6:58 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-12-28 11:20 [PATCH 3/10] xfs: consolidate xfs_inumbers Jeff Liu
2014-01-03 20:53 ` Brian Foster
2014-01-06 6:23 ` Jeff Liu
2014-01-06 15:05 ` Brian Foster
2014-01-07 6:58 ` Jeff Liu [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=52CBA58B.30504@oracle.com \
--to=jeff.liu@oracle.com \
--cc=bfoster@redhat.com \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).