public inbox for linux-xfs@vger.kernel.org
From: Chandan Babu R <chandan.babu@oracle.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org, djwong@kernel.org
Subject: Re: [PATCH V7 15/17] xfs: Enable bulkstat ioctl to support 64-bit per-inode extent counters
Date: Tue, 08 Mar 2022 08:22:48 +0530
Message-ID: <87mti1qhof.fsf@debian-BULLSEYE-live-builder-AMD64>
In-Reply-To: <20220307214127.GQ59715@dread.disaster.area>

On 08 Mar 2022 at 03:11, Dave Chinner wrote:
> On Mon, Mar 07, 2022 at 07:16:57PM +0530, Chandan Babu R wrote:
>> On 07 Mar 2022 at 10:43, Dave Chinner wrote:
>> > On Sat, Mar 05, 2022 at 06:15:37PM +0530, Chandan Babu R wrote:
>> >> On 04 Mar 2022 at 13:39, Dave Chinner wrote:
>> >> > On Tue, Mar 01, 2022 at 04:09:36PM +0530, Chandan Babu R wrote:
>> >> >> @@ -102,7 +104,27 @@ xfs_bulkstat_one_int(
>> >> >>  
>> >> >>  	buf->bs_xflags = xfs_ip2xflags(ip);
>> >> >>  	buf->bs_extsize_blks = ip->i_extsize;
>> >> >> -	buf->bs_extents = xfs_ifork_nextents(&ip->i_df);
>> >> >> +
>> >> >> +	nextents = xfs_ifork_nextents(&ip->i_df);
>> >> >> +	if (!(bc->breq->flags & XFS_IBULK_NREXT64)) {
>> >> >> +		xfs_extnum_t	max_nextents = XFS_MAX_EXTCNT_DATA_FORK_OLD;
>> >> >> +
>> >> >> +		if (unlikely(XFS_TEST_ERROR(false, mp,
>> >> >> +				XFS_ERRTAG_REDUCE_MAX_IEXTENTS)))
>> >> >> +			max_nextents = 10;
>> >> >> +
>> >> >> +		if (nextents > max_nextents) {
>> >> >> +			xfs_iunlock(ip, XFS_ILOCK_SHARED);
>> >> >> +			xfs_irele(ip);
>> >> >> +			error = -EOVERFLOW;
>> >> >> +			goto out;
>> >> >> +		}
>> >> >
>> >> > This just seems wrong. This will cause a total abort of the bulkstat
>> >> > pass which will just be completely unexpected by any application
>> >> > that does not know about 64-bit extent counts. Most of them likely
>> >> > don't even care about the extent count in the data being returned.
>> >> >
>> >> > Really, I think this should just set the extent count to the MAX
>> >> > number and just continue onwards, otherwise existing applications
>> >> > will not be able to bulkstat a filesystem with large extent counts
>> >> > in it at all.
>> >> >
>> >> 
>> >> Actually, I don't know much about how applications use bulkstat. I am
>> >> dependent on guidance from other developers who are well versed in this
>> >> topic. I will change the code to return maximum extent count if the value
>> >> overflows older extent count limits.
>> >
>> > They tend to just run in a loop until either no more inodes are to
>> > be found or an error occurs. bulkstat loops don't expect errors to
>> > be reported - it's hard to do something based on all inodes if you
>> > get errors reading the inodes part way through. There's no way for
>> > the application to tell where it should restart scanning - the
>> > bulkstat iteration cookie is controlled by the kernel, and I don't
>> > think we update it on error.
>> 
>> xfs_bulkstat() has the following,
>> 
>>         kmem_free(bc.buf);
>> 
>>         /*
>>          * We found some inodes, so clear the error status and return them.
>>          * The lastino pointer will point directly at the inode that triggered
>>          * any error that occurred, so on the next call the error will be
>>          * triggered again and propagated to userspace as there will be no
>>          * formatted inodes in the buffer.
>>          */
>>         if (breq->ocount > 0)
>>                 error = 0;
>> 
>>         return error;
>> 
>> The above will help the userspace process to issue another bulkstat call which
>> begins at the inode that caused the error.
>
> And then it returns with a cookie pointing at the overflowed inode,
> and we try that one first on the next loop, triggering -EOVERFLOW
> with breq->ocount == 0.
>
> Or maybe we have two inodes in a row that trigger EOVERFLOW, so even
> if we skip the first and return to userspace, we trip the second on
> the next call and boom...
>
>> > e.g. see fstests src/bstat.c and src/bulkstat_unlink_test*.c - they
>> > simply abort if bulkstat fails. Same goes for xfsdump common/util.c
>> > and dump/content.c - they just error out and return and don't try to
>> > continue further.
>> 
>> I made the following changes to src/bstat.c,
>> 
>> diff --git a/src/bstat.c b/src/bstat.c
>> index 3f3dc2c6..0e72190e 100644
>> --- a/src/bstat.c
>> +++ b/src/bstat.c
>> @@ -143,7 +143,19 @@ main(int argc, char **argv)
>>  	bulkreq.ubuffer = t;
>>  	bulkreq.ocount  = &count;
>>  
>> -	while ((ret = xfsctl(name, fsfd, XFS_IOC_FSBULKSTAT, &bulkreq)) == 0) {
>> +	while (1) {
>> +		ret = xfsctl(name, fsfd, XFS_IOC_FSBULKSTAT, &bulkreq);
>> +		if (ret == -1) {
>> +			if (errno == EOVERFLOW) {
>> +				printf("Skipping inode %llu.\n",  last+1);
>> +				++last;
>> +				continue;
>> +			}
>> +
>> +			perror("xfsctl");
>> +			exit(1);
>> +		}
>> +
>>  		total += count;
>>  
>> 
>> Executing the script at
>> https://gist.github.com/chandanr/f2d147fa20a681e1508e182b5b7cdb00 provides the
>> following output,
>> 
>> ...
>> 
>> ino 128 mode 040755 nlink 3 uid 0 gid 0 rdev 0
>> blksize 4096 size 37 blocks 0 xflags 0 extsize 0
>> atime Thu Jan  1 00:00:00.000000000 1970
>> mtime Mon Mar  7 13:06:30.051339892 2022
>> ctime Mon Mar  7 13:06:30.051339892 2022
>> extents 0 0 gen 0
>> DMI: event mask 0x00000000 state 0x0000
>> 
>> Skipping inode 131.
>> 
>> ino 132 mode 040755 nlink 2 uid 0 gid 0 rdev 0
>> blksize 4096 size 97 blocks 0 xflags 0 extsize 0
>> atime Mon Mar  7 13:06:30.051339892 2022
>> mtime Mon Mar  7 13:06:30.083339892 2022
>> ctime Mon Mar  7 13:06:30.083339892 2022
>> extents 0 0 gen 548703887
>> DMI: event mask 0x00000000 state 0x0000
>> 
>> ...
>> 
>> The above illustrates that userspace programs can be modified to use lastip to
>> skip inodes which cause bulkstat ioctl to return with an error.
>
> Yes, I know they can be modified to handle it - that is not the
> concern here. The concern is that this new error can potentially
> break the *unmodified* applications already out there. e.g. xfsdump
> may just stop dumping a filesystem half way through because it
> doesn't handle unexpected errors like this sanely. But we can't tie
> a version of xfsdump to a specific kernel feature, so we have to
> make sure that bulkstat from older builds of xfsdump will still
> iterate through the entire filesystem without explicit EOVERFLOW
> support...
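
To make the failure mode concrete, here is a small userspace model of the
behaviour discussed above. It is only a sketch and not kernel code:
sim_bulkstat(), sim_scan(), struct sim_inode and SIM_MAX_EXTCNT_OLD are
hypothetical names invented for this illustration, and the old limit of
2^31 - 1 extents is an assumption. The model clears the error when at least
one inode was formatted, as the xfs_bulkstat() snippet quoted earlier does,
and the caller is an unmodified loop that stops on the first error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: old (pre-NREXT64) data fork extent count limit. */
#define SIM_MAX_EXTCNT_OLD ((uint64_t)INT32_MAX)

/* Hypothetical stand-in for an on-disk inode; only what the model needs. */
struct sim_inode {
	uint64_t ino;
	uint64_t nextents;
};

/*
 * Model of one bulkstat call. *cookie is the index of the next inode to
 * format. If clamp is zero, an inode above the old limit raises -EOVERFLOW;
 * otherwise it is formatted like any other (its count would be clamped).
 */
static int sim_bulkstat(const struct sim_inode *tbl, size_t n, size_t *cookie,
			size_t batch, size_t *ocount, int clamp)
{
	size_t out = 0;
	int error = 0;

	assert(batch > 0);
	while (*cookie < n && out < batch) {
		if (!clamp && tbl[*cookie].nextents > SIM_MAX_EXTCNT_OLD) {
			error = -EOVERFLOW;	/* cookie stays on this inode */
			break;
		}
		out++;
		(*cookie)++;
	}
	*ocount = out;

	/* We found some inodes, so clear the error and return them. */
	if (out > 0)
		error = 0;
	return error;
}

/* Unmodified caller: loop until no more inodes or the first error. */
static size_t sim_scan(const struct sim_inode *tbl, size_t n, size_t batch,
		       int clamp, int *err)
{
	size_t cookie = 0, total = 0, count = 0;

	*err = 0;
	for (;;) {
		int ret = sim_bulkstat(tbl, n, &cookie, batch, &count, clamp);

		if (ret) {
			*err = ret;	/* scan aborts part way through */
			break;
		}
		if (count == 0)
			break;		/* end of the inode table */
		total += count;
	}
	return total;
}
```

With one oversized inode in the middle of the table, the non-clamping
variant returns the inodes before it and then fails on the following call
with ocount == 0, so the unmodified loop never reaches the remaining
inodes. The clamping variant walks the whole table.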

Ok. Thanks for the clarification.
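
For reference, the clamping behaviour suggested earlier in the thread could
look roughly like the following. This is a standalone sketch rather than the
actual patch: sketch_legacy_extent_count() and
SKETCH_MAX_EXTCNT_DATA_FORK_OLD are hypothetical names, and the limit of
2^31 - 1 is an assumption about the old data fork maximum. The second
argument lets a caller pass a reduced limit, similar in spirit to the
error-injection path in the hunk quoted at the top.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the pre-NREXT64 data fork limit is 2^31 - 1 extents. */
#define SKETCH_MAX_EXTCNT_DATA_FORK_OLD ((uint64_t)INT32_MAX)

/*
 * Return the extent count to report to a caller that did not ask for
 * 64-bit extent counters: the real count if it fits under max_old,
 * otherwise max_old itself. No error is raised, so the scan continues.
 */
static uint32_t sketch_legacy_extent_count(uint64_t nextents, uint64_t max_old)
{
	/* The reported value must fit in the legacy 32-bit field. */
	assert(max_old <= UINT32_MAX);

	return (uint32_t)(nextents > max_old ? max_old : nextents);
}
```

A caller that does not know about the new counters would then see a
saturated value for oversized inodes instead of an aborted scan.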

-- 
chandan
