* [PATCHBOMB 6.12] xfs: a ton of bugfixes and cleanups
@ 2024-09-02 18:16 Darrick J. Wong
2024-09-02 18:21 ` [PATCHSET v31.1 1/8] xfs: atomic file content commits Darrick J. Wong
` (7 more replies)
0 siblings, 8 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:16 UTC (permalink / raw)
To: Chandan Babu R; +Cc: linux-xfs, Christoph Hellwig
Hi everyone,
6.12 is (allegedly) an LTS release, and as it's end of summer vacation
time in the northern hemisphere, the most that I'm going to get done for
this cycle is bug fixes and cleanups in preparation for metadata
directories and realtime allocation groups.
Christoph and I have finished reviewing this big batch of changes and I
think they're ready to be merged. I'm resending the entire series so
that the patches are recorded in the list archives, and will follow it
with a pile of pull requests for actual merging.
--D
* [PATCHSET v31.1 1/8] xfs: atomic file content commits
2024-09-02 18:16 [PATCHBOMB 6.12] xfs: a ton of bugfixes and cleanups Darrick J. Wong
@ 2024-09-02 18:21 ` Darrick J. Wong
2024-09-02 18:23 ` [PATCH 1/1] xfs: introduce new file range commit ioctls Darrick J. Wong
2024-09-02 18:21 ` [PATCHSET v4.2 2/8] xfs: cleanups before adding metadata directories Darrick J. Wong
` (6 subsequent siblings)
7 siblings, 1 reply; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:21 UTC (permalink / raw)
To: chandanbabu, djwong
Cc: Jeff Layton, Christoph Hellwig, linux-fsdevel, linux-xfs
Hi all,
This series creates XFS_IOC_START_COMMIT and XFS_IOC_COMMIT_RANGE ioctls
to perform the exchange only if the target file has not been changed
since a given sampling point.
This new functionality uses the mechanism underlying EXCHANGE_RANGE to
stage and commit file updates such that reader programs will see either
the old contents or the new contents in their entirety, with no chance
of torn writes. A successful call completion guarantees that the new
contents will be seen even if the system fails. The pair of ioctls
allows userspace to perform what amounts to a compare and exchange
operation on entire file contents.
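The compare-and-exchange behaviour described above can be modelled in plain C without touching a filesystem. This is only a toy sketch, not the kernel implementation: `toy_file`, `toy_start_commit`, `toy_write`, and `toy_commit` are hypothetical names, and a single version counter stands in for the real freshness data.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Toy stand-in for a file: contents plus a change counter. */
struct toy_file {
	char		data[32];
	unsigned long	version;	/* bumped on every modification */
};

/* Sample the freshness data, as START_COMMIT does for file2. */
static unsigned long toy_start_commit(const struct toy_file *f)
{
	return f->version;
}

/* Modify a file directly, as an unrelated writer would. */
static void toy_write(struct toy_file *f, const char *s)
{
	assert(strlen(s) < sizeof(f->data));
	strcpy(f->data, s);
	f->version++;
}

/*
 * Exchange the contents of staged and target only if target still
 * matches the sample; otherwise return -EBUSY and change nothing.
 */
static int toy_commit(struct toy_file *staged, struct toy_file *target,
		      unsigned long sample)
{
	char tmp[sizeof(target->data)];

	if (target->version != sample)
		return -EBUSY;

	memcpy(tmp, target->data, sizeof(tmp));
	memcpy(target->data, staged->data, sizeof(tmp));
	memcpy(staged->data, tmp, sizeof(tmp));
	target->version++;
	staged->version++;
	return 0;
}
```

A caller samples the target with `toy_start_commit`, stages new contents in a second `toy_file`, and then calls `toy_commit`. If anything called `toy_write` on the target in between, the commit is refused and the caller has to sample and stage again.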
Note that there are ongoing arguments in the community about how best to
implement some sort of file data write counter that nfsd could also use
to signal invalidations to clients. Until such a thing is implemented,
this patch will rely on ctime/mtime updates.
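Relying on timestamps means that their granularity matters. The sketch below is a userspace illustration only; `stamp` and `change_is_visible` are hypothetical helpers, and the granularity is whatever value the caller passes in, not a claim about any particular kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Timestamp that a clock of the given granularity records at now_ns. */
static uint64_t stamp(uint64_t now_ns, uint64_t granularity_ns)
{
	assert(granularity_ns != 0);
	return now_ns - now_ns % granularity_ns;
}

/*
 * Would comparing timestamps notice a write at write_ns if the
 * freshness data was sampled at sample_ns?
 */
static bool change_is_visible(uint64_t sample_ns, uint64_t write_ns,
			      uint64_t granularity_ns)
{
	return stamp(sample_ns, granularity_ns) !=
	       stamp(write_ns, granularity_ns);
}
```

With a coarse granularity, a write that lands in the same tick as the sample leaves the recorded timestamp unchanged, so a comparison cannot detect it. With nanosecond granularity the same write is visible.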
Here are the proposed manual pages:
IOCTL-XFS-COMMIT-RANGE(2) System Calls Manual IOCTL-XFS-COMMIT-RANGE(2)
NAME
ioctl_xfs_start_commit - prepare to exchange the contents of
two files
ioctl_xfs_commit_range - conditionally exchange the
contents of parts of two files
SYNOPSIS
#include <sys/ioctl.h>
#include <xfs/xfs_fs.h>
int ioctl(int file2_fd, XFS_IOC_START_COMMIT,
struct xfs_commit_range *arg);
int ioctl(int file2_fd, XFS_IOC_COMMIT_RANGE,
struct xfs_commit_range *arg);
DESCRIPTION
Given a range of bytes in a first file file1_fd and a second
range of bytes in a second file file2_fd, this ioctl(2) ex‐
changes the contents of the two ranges if file2_fd passes cer‐
tain freshness criteria.
Before exchanging the contents, the program must call the
XFS_IOC_START_COMMIT ioctl to sample freshness data for
file2_fd. If the sampled metadata does not match the file
metadata at commit time, XFS_IOC_COMMIT_RANGE will return
EBUSY.
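The comparison made at commit time can be sketched in userspace as follows. This is a hedged model, not the kernel code: `fresh_sample`, `ts_equal`, and `check_freshness` are hypothetical names, and the set of compared attributes (inode number, generation, ctime, mtime) is taken from the patch later in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical userspace model of the attributes sampled for file2. */
struct fresh_sample {
	uint64_t	ino;
	uint32_t	gen;
	struct timespec	ctime;
	struct timespec	mtime;
};

static bool ts_equal(const struct timespec *a, const struct timespec *b)
{
	return a->tv_sec == b->tv_sec && a->tv_nsec == b->tv_nsec;
}

/* Return 0 if nothing changed since the sample, -EBUSY otherwise. */
static int check_freshness(const struct fresh_sample *sampled,
			   const struct fresh_sample *current)
{
	assert(sampled && current);

	if (sampled->ino != current->ino ||
	    sampled->gen != current->gen ||
	    !ts_equal(&sampled->ctime, &current->ctime) ||
	    !ts_equal(&sampled->mtime, &current->mtime))
		return -EBUSY;
	return 0;
}
```

Any single mismatch is enough to reject the commit; there is no notion of a partial match.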
Exchanges are atomic with regard to concurrent file operations.
Implementations must guarantee that readers see either
the old contents or the new contents in their entirety, even if
the system fails.
The system call parameters are conveyed in structures of the
following form:
struct xfs_commit_range {
__s32 file1_fd;
__u32 pad;
__u64 file1_offset;
__u64 file2_offset;
__u64 length;
__u64 flags;
__u64 file2_freshness[6];
};
The field pad must be zero.
The fields file1_fd, file1_offset, and length define the first
range of bytes to be exchanged.
The file descriptor file2_fd passed to ioctl(2) and the fields
file2_offset and length define the second range of bytes to be
exchanged.
The field file2_freshness is an opaque field whose contents are
determined by the kernel. These file attributes are used to
confirm that file2_fd has not been changed by another thread since
the current thread began staging its own update.
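Userspace must treat file2_freshness as opaque, but the sizes involved can be sanity-checked. The sketch below mirrors the private blob from the patch later in this thread; `fresh_blob_mirror`, `freshness_field_t`, and `fresh_blob_size` are hypothetical names, and modelling xfs_fsid_t as two 32-bit words is an assumption. The kernel header in that patch declares the public field as six 64-bit words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the kernel's private blob; illustration only. */
struct fresh_blob_mirror {
	uint32_t	fsid[2];	/* assumed shape of xfs_fsid_t */
	uint64_t	file2_ino;
	int64_t		file2_mtime;
	int64_t		file2_ctime;
	int32_t		file2_mtime_nsec;
	int32_t		file2_ctime_nsec;
	uint32_t	file2_gen;
	uint32_t	magic;
};

/* Shape of the public field that carries the blob. */
typedef uint64_t freshness_field_t[6];

/* The blob has to fill the public field exactly. */
static_assert(sizeof(struct fresh_blob_mirror) == sizeof(freshness_field_t),
	      "freshness blob must fill file2_freshness exactly");

static inline size_t fresh_blob_size(void)
{
	return sizeof(struct fresh_blob_mirror);
}
```

Applications should not depend on this layout. They only need to pass the same struct xfs_commit_range, unmodified in this field, from the start-commit call to the commit call.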
Both files must be from the same filesystem mount. If the two
file descriptors represent the same file, the byte ranges must
not overlap. Most disk-based filesystems require that the
starts of both ranges must be aligned to the file block size.
If this is the case, the ends of the ranges must also be so
aligned unless the XFS_EXCHANGE_RANGE_TO_EOF flag is set.
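A caller could pre-check these alignment rules before issuing the ioctl. The helper below is a minimal sketch under the assumption that the caller already knows the file block size; `range_is_aligned` is a hypothetical name, and the kernel remains the final authority on what it accepts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: check range alignment before calling the ioctl. */
static bool range_is_aligned(uint64_t file1_offset, uint64_t file2_offset,
			     uint64_t length, uint64_t blocksize,
			     bool to_eof)
{
	assert(blocksize != 0);

	/* The starts of both ranges must be block aligned. */
	if (file1_offset % blocksize || file2_offset % blocksize)
		return false;

	/* With TO_EOF the length is ignored, so the ends may be unaligned. */
	if (to_eof)
		return true;

	/* Aligned starts plus an aligned length give aligned ends. */
	return length % blocksize == 0;
}
```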
The field flags controls the behavior of the exchange operation.
XFS_EXCHANGE_RANGE_TO_EOF
Ignore the length parameter. All bytes in file1_fd
from file1_offset to EOF are moved to file2_fd, and
file2's size is set to (file2_offset+(file1_length-
file1_offset)). Meanwhile, all bytes in file2 from
file2_offset to EOF are moved to file1 and file1's
size is set to (file1_offset+(file2_length-
file2_offset)).
XFS_EXCHANGE_RANGE_DSYNC
Ensure that all modified in-core data in both file
ranges and all metadata updates pertaining to the
exchange operation are flushed to persistent storage
before the call returns. Opening either file de‐
scriptor with O_SYNC or O_DSYNC will have the same
effect.
XFS_EXCHANGE_RANGE_FILE1_WRITTEN
Only exchange sub-ranges of file1_fd that are known
to contain data written by application software.
Each sub-range may be expanded (both upwards and
downwards) to align with the file allocation unit.
For files on the data device, this is one filesystem
block. For files on the realtime device, this is
the realtime extent size. This facility can be used
to implement fast atomic scatter-gather writes of
any complexity for software-defined storage targets
if all writes are aligned to the file allocation
unit.
XFS_EXCHANGE_RANGE_DRY_RUN
Check the parameters and the feasibility of the op‐
eration, but do not change anything.
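The arithmetic in the XFS_EXCHANGE_RANGE_TO_EOF and XFS_EXCHANGE_RANGE_FILE1_WRITTEN descriptions can be worked through in a few lines of C. This is a sketch only: `eof_sizes`, `sizes_after_to_eof`, and `round_out` are hypothetical names, and the helpers just evaluate the formulas given above.

```c
#include <assert.h>
#include <stdint.h>

struct eof_sizes {
	uint64_t file1_size;
	uint64_t file2_size;
};

/* File sizes after a TO_EOF exchange, per the formulas in the flag text. */
static struct eof_sizes sizes_after_to_eof(uint64_t file1_length,
					   uint64_t file1_offset,
					   uint64_t file2_length,
					   uint64_t file2_offset)
{
	struct eof_sizes out;

	assert(file1_offset <= file1_length);
	assert(file2_offset <= file2_length);

	out.file2_size = file2_offset + (file1_length - file1_offset);
	out.file1_size = file1_offset + (file2_length - file2_offset);
	return out;
}

/* Expand [*start, *end) outwards to multiples of the allocation unit. */
static void round_out(uint64_t *start, uint64_t *end, uint64_t unit)
{
	assert(unit != 0);

	*start -= *start % unit;
	if (*end % unit)
		*end += unit - *end % unit;
}
```

For example, exchanging from offset 0 in both files simply swaps the two file sizes, which is the whole-file replacement case used in the examples further down.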
RETURN VALUE
On success, 0 is returned. On error, -1 is returned, and errno is
set to indicate the error.
ERRORS
Error codes can be one of, but are not limited to, the follow‐
ing:
EBADF file1_fd is not open for reading and writing or is open
for append-only writes; or file2_fd is not open for
reading and writing or is open for append-only writes.
EBUSY The file2 inode number and timestamps supplied do not
match file2_fd.
EINVAL The parameters are not correct for these files. This
error can also appear if either file descriptor repre‐
sents a device, FIFO, or socket. Disk filesystems gen‐
erally require the offset and length arguments to be
aligned to the fundamental block sizes of both files.
EIO An I/O error occurred.
EISDIR One of the files is a directory.
ENOMEM The kernel was unable to allocate sufficient memory to
perform the operation.
ENOSPC There is not enough free space in the filesystem to
exchange the contents safely.
EOPNOTSUPP
The filesystem does not support exchanging bytes between
the two files.
EPERM file1_fd or file2_fd is immutable.
ETXTBSY
One of the files is a swap file.
EUCLEAN
The filesystem is corrupt.
EXDEV file1_fd and file2_fd are not on the same mounted
filesystem.
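One possible way for an application to act on these errors is sketched below. The policy shown is an assumption for illustration, not something this manual page mandates: `commit_action` and `classify_commit_result` are hypothetical names, and only EBUSY is treated as a signal to stage the update again.

```c
#include <assert.h>
#include <errno.h>

enum commit_action {
	COMMIT_DONE,	/* the exchange succeeded */
	COMMIT_RESTAGE,	/* file2 changed: resample and stage again */
	COMMIT_FATAL,	/* caller bug or environmental failure */
};

/* Hypothetical policy: map an ioctl() result and errno to a next step. */
static enum commit_action classify_commit_result(int ret, int err)
{
	if (ret == 0)
		return COMMIT_DONE;

	assert(ret == -1);
	if (err == EBUSY)
		return COMMIT_RESTAGE;
	return COMMIT_FATAL;
}
```

A caller would pass the return value of the XFS_IOC_COMMIT_RANGE call together with the errno value saved immediately afterwards.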
CONFORMING TO
This API is XFS-specific.
USE CASES
Several use cases are imagined for this system call. Coordina‐
tion between multiple threads is performed by the kernel.
The first is a filesystem defragmenter, which copies the con‐
tents of a file into another file and wishes to exchange the
space mappings of the two files, provided that the original
file has not changed.
An example program might look like this:
int fd = open("/some/file", O_RDWR);
int temp_fd = open("/some", O_TMPFILE | O_RDWR);
struct stat sb;
struct xfs_commit_range args = {
.flags = XFS_EXCHANGE_RANGE_TO_EOF,
};
/* gather file2's freshness information */
ioctl(fd, XFS_IOC_START_COMMIT, &args);
fstat(fd, &sb);
/* make a fresh copy of the file with terrible alignment to avoid reflink */
copy_file_range(fd, NULL, temp_fd, NULL, 1, 0);
copy_file_range(fd, NULL, temp_fd, NULL, sb.st_size - 1, 0);
/* commit the entire update */
args.file1_fd = temp_fd;
int ret = ioctl(fd, XFS_IOC_COMMIT_RANGE, &args);
if (ret && errno == EBUSY)
	printf("file changed while defrag was underway\n");
The second is a data storage program that wants to commit
non-contiguous updates to a file atomically. This program cannot
coordinate updates to the file and therefore relies on the kernel
to reject the COMMIT_RANGE command if the file has been updated
by someone else. This can be done by creating a temporary file,
calling FICLONE(2) to share the contents, and staging the updates
into the temporary file. The XFS_EXCHANGE_RANGE_TO_EOF flag
is recommended for this purpose. The temporary file can be
deleted or punched out afterwards.
An example program might look like this:
int fd = open("/some/file", O_RDWR);
int temp_fd = open("/some", O_TMPFILE | O_RDWR);
struct xfs_commit_range args = {
.flags = XFS_EXCHANGE_RANGE_TO_EOF,
};
/* gather file2's freshness information */
ioctl(fd, XFS_IOC_START_COMMIT, &args);
ioctl(temp_fd, FICLONE, fd);
/* append 1MB of records */
lseek(temp_fd, 0, SEEK_END);
write(temp_fd, data1, 1000000);
/* update record index */
pwrite(temp_fd, data1, 600, 98765);
pwrite(temp_fd, data2, 320, 54321);
pwrite(temp_fd, data2, 15, 0);
/* commit the entire update */
args.file1_fd = temp_fd;
int ret = ioctl(fd, XFS_IOC_COMMIT_RANGE, &args);
if (ret && errno == EBUSY)
	printf("file changed before commit; will roll back\n");
NOTES
Some filesystems may limit the amount of data or the number of
extents that can be exchanged in a single call.
SEE ALSO
ioctl(2)
XFS 2024-02-18 IOCTL-XFS-COMMIT-RANGE(2)
If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.
With a bit of luck, this should all go splendidly.
Comments and questions are, as always, welcome.
--D
kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=atomic-file-commits-6.12
---
Commits in this patchset:
* xfs: introduce new file range commit ioctls
---
fs/xfs/libxfs/xfs_fs.h | 26 +++++++++
fs/xfs/xfs_exchrange.c | 143 ++++++++++++++++++++++++++++++++++++++++++++++++
fs/xfs/xfs_exchrange.h | 16 +++++
fs/xfs/xfs_ioctl.c | 4 +
fs/xfs/xfs_trace.h | 57 +++++++++++++++++++
5 files changed, 243 insertions(+), 3 deletions(-)
* [PATCHSET v4.2 2/8] xfs: cleanups before adding metadata directories
2024-09-02 18:16 [PATCHBOMB 6.12] xfs: a ton of bugfixes and cleanups Darrick J. Wong
2024-09-02 18:21 ` [PATCHSET v31.1 1/8] xfs: atomic file content commits Darrick J. Wong
@ 2024-09-02 18:21 ` Darrick J. Wong
2024-09-02 18:23 ` [PATCH 1/3] xfs: validate inumber in xfs_iget Darrick J. Wong
` (2 more replies)
2024-09-02 18:21 ` [PATCHSET v4.2 3/8] xfs: clean up the rtbitmap code Darrick J. Wong
` (5 subsequent siblings)
7 siblings, 3 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:21 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Dave Chinner, Christoph Hellwig, linux-xfs
Hi all,
Before we start adding code for metadata directory trees, let's clean up
some warts in the realtime bitmap code and the inode allocator code.
If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.
With a bit of luck, this should all go splendidly.
Comments and questions are, as always, welcome.
--D
kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=metadir-cleanups-6.12
---
Commits in this patchset:
* xfs: validate inumber in xfs_iget
* xfs: match on the global RT inode numbers in xfs_is_metadata_inode
* xfs: pass the icreate args object to xfs_dialloc
---
fs/xfs/libxfs/xfs_ialloc.c | 5 +++--
fs/xfs/libxfs/xfs_ialloc.h | 4 +++-
fs/xfs/scrub/tempfile.c | 2 +-
fs/xfs/xfs_icache.c | 2 +-
fs/xfs/xfs_inode.c | 4 ++--
fs/xfs/xfs_inode.h | 7 ++++---
fs/xfs/xfs_qm.c | 2 +-
fs/xfs/xfs_symlink.c | 2 +-
8 files changed, 16 insertions(+), 12 deletions(-)
* [PATCHSET v4.2 3/8] xfs: clean up the rtbitmap code
2024-09-02 18:16 [PATCHBOMB 6.12] xfs: a ton of bugfixes and cleanups Darrick J. Wong
2024-09-02 18:21 ` [PATCHSET v31.1 1/8] xfs: atomic file content commits Darrick J. Wong
2024-09-02 18:21 ` [PATCHSET v4.2 2/8] xfs: cleanups before adding metadata directories Darrick J. Wong
@ 2024-09-02 18:21 ` Darrick J. Wong
2024-09-02 18:24 ` [PATCH 01/12] xfs: remove xfs_validate_rtextents Darrick J. Wong
` (11 more replies)
2024-09-02 18:21 ` [PATCHSET v4.2 4/8] xfs: fixes for the realtime allocator Darrick J. Wong
` (4 subsequent siblings)
7 siblings, 12 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:21 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
Hi all,
Here are some cleanups and reorganization of the realtime bitmap code to share
more of that code between userspace and the kernel.
If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.
With a bit of luck, this should all go splendidly.
Comments and questions are, as always, welcome.
--D
kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=rtbitmap-cleanups-6.12
---
Commits in this patchset:
* xfs: remove xfs_validate_rtextents
* xfs: factor out a xfs_validate_rt_geometry helper
* xfs: make the RT rsum_cache mandatory
* xfs: remove the limit argument to xfs_rtfind_back
* xfs: assert a valid limit in xfs_rtfind_forw
* xfs: add bounds checking to xfs_rt{bitmap,summary}_read_buf
* xfs: cleanup the calling convention for xfs_rtpick_extent
* xfs: push the calls to xfs_rtallocate_range out to xfs_bmap_rtalloc
* xfs: factor out a xfs_growfs_rt_bmblock helper
* xfs: factor out a xfs_last_rt_bmblock helper
* xfs: factor out rtbitmap/summary initialization helpers
* xfs: push transaction join out of xfs_rtbitmap_lock and xfs_rtgroup_lock
---
fs/xfs/libxfs/xfs_bmap.c | 3
fs/xfs/libxfs/xfs_rtbitmap.c | 192 ++++++++++++++-
fs/xfs/libxfs/xfs_rtbitmap.h | 33 +--
fs/xfs/libxfs/xfs_sb.c | 64 +++--
fs/xfs/libxfs/xfs_sb.h | 1
fs/xfs/libxfs/xfs_types.h | 12 -
fs/xfs/xfs_rtalloc.c | 535 +++++++++++++++++-------------------------
7 files changed, 438 insertions(+), 402 deletions(-)
* [PATCHSET v4.2 4/8] xfs: fixes for the realtime allocator
2024-09-02 18:16 [PATCHBOMB 6.12] xfs: a ton of bugfixes and cleanups Darrick J. Wong
` (2 preceding siblings ...)
2024-09-02 18:21 ` [PATCHSET v4.2 3/8] xfs: clean up the rtbitmap code Darrick J. Wong
@ 2024-09-02 18:21 ` Darrick J. Wong
2024-09-02 18:27 ` [PATCH 01/10] xfs: use the recalculated transaction reservation in xfs_growfs_rt_bmblock Darrick J. Wong
` (9 more replies)
2024-09-02 18:22 ` [PATCHSET v4.2 5/8] xfs: cleanups for the realtime allocator Darrick J. Wong
` (3 subsequent siblings)
7 siblings, 10 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:21 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
Hi all,
While I was reviewing how to integrate realtime allocation groups with
the rt allocator, I noticed several bugs in the existing allocation code
with regards to calculating the maximum range of rtx to scan for free
space. This series fixes those range bugs and cleans up a few things
too.
I also added a few cleanups from Christoph.
If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.
With a bit of luck, this should all go splendidly.
Comments and questions are, as always, welcome.
--D
kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=rtalloc-fixes-6.12
---
Commits in this patchset:
* xfs: use the recalculated transaction reservation in xfs_growfs_rt_bmblock
* xfs: ensure rtx mask/shift are correct after growfs
* xfs: don't return too-short extents from xfs_rtallocate_extent_block
* xfs: don't scan off the end of the rt volume in xfs_rtallocate_extent_block
* xfs: refactor aligning bestlen to prod
* xfs: clean up xfs_rtallocate_extent_exact a bit
* xfs: reduce excessive clamping of maxlen in xfs_rtallocate_extent_near
* xfs: fix broken variable-sized allocation detection in xfs_rtallocate_extent_block
* xfs: remove xfs_rtb_to_rtxrem
* xfs: simplify xfs_rtalloc_query_range
---
fs/xfs/libxfs/xfs_rtbitmap.c | 51 ++++++---------
fs/xfs/libxfs/xfs_rtbitmap.h | 21 ------
fs/xfs/libxfs/xfs_sb.c | 12 +++
fs/xfs/libxfs/xfs_sb.h | 2 +
fs/xfs/xfs_discard.c | 15 ++--
fs/xfs/xfs_fsmap.c | 11 +--
fs/xfs/xfs_rtalloc.c | 145 +++++++++++++++++++++++-------------------
7 files changed, 124 insertions(+), 133 deletions(-)
* [PATCHSET v4.2 5/8] xfs: cleanups for the realtime allocator
2024-09-02 18:16 [PATCHBOMB 6.12] xfs: a ton of bugfixes and cleanups Darrick J. Wong
` (3 preceding siblings ...)
2024-09-02 18:21 ` [PATCHSET v4.2 4/8] xfs: fixes for the realtime allocator Darrick J. Wong
@ 2024-09-02 18:22 ` Darrick J. Wong
2024-09-02 18:29 ` [PATCH 01/10] xfs: clean up the ISVALID macro in xfs_bmap_adjacent Darrick J. Wong
` (9 more replies)
2024-09-02 18:22 ` [PATCHSET v4.2 6/8] xfs: cleanups for quota mount Darrick J. Wong
` (2 subsequent siblings)
7 siblings, 10 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:22 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
Hi all,
This third series cleans up the realtime allocator code so that it'll be
somewhat less difficult to figure out what on earth it's doing. We also
rearrange the fsmap code a bit.
If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.
With a bit of luck, this should all go splendidly.
Comments and questions are, as always, welcome.
--D
kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=rtalloc-cleanups-6.12
---
Commits in this patchset:
* xfs: clean up the ISVALID macro in xfs_bmap_adjacent
* xfs: factor out a xfs_rtallocate helper
* xfs: rework the rtalloc fallback handling
* xfs: factor out a xfs_rtallocate_align helper
* xfs: make the rtalloc start hint a xfs_rtblock_t
* xfs: add xchk_setup_nothing and xchk_nothing helpers
* xfs: remove xfs_{rtbitmap,rtsummary}_wordcount
* xfs: replace m_rsumsize with m_rsumblocks
* xfs: rearrange xfs_fsmap.c a little bit
* xfs: move xfs_ioc_getfsmap out of xfs_ioctl.c
---
fs/xfs/libxfs/xfs_bmap.c | 55 +++--
fs/xfs/libxfs/xfs_rtbitmap.c | 33 ---
fs/xfs/libxfs/xfs_rtbitmap.h | 7 -
fs/xfs/libxfs/xfs_trans_resv.c | 2
fs/xfs/scrub/common.h | 29 +--
fs/xfs/scrub/rtsummary.c | 11 -
fs/xfs/scrub/rtsummary.h | 2
fs/xfs/scrub/rtsummary_repair.c | 12 -
fs/xfs/scrub/scrub.h | 29 +--
fs/xfs/xfs_fsmap.c | 402 ++++++++++++++++++++++++++-------------
fs/xfs/xfs_fsmap.h | 6 -
fs/xfs/xfs_ioctl.c | 130 -------------
fs/xfs/xfs_mount.h | 2
fs/xfs/xfs_rtalloc.c | 246 ++++++++++++++----------
14 files changed, 477 insertions(+), 489 deletions(-)
* [PATCHSET v4.2 6/8] xfs: cleanups for quota mount
2024-09-02 18:16 [PATCHBOMB 6.12] xfs: a ton of bugfixes and cleanups Darrick J. Wong
` (4 preceding siblings ...)
2024-09-02 18:22 ` [PATCHSET v4.2 5/8] xfs: cleanups for the realtime allocator Darrick J. Wong
@ 2024-09-02 18:22 ` Darrick J. Wong
2024-09-02 18:32 ` [PATCH 1/1] xfs: refactor loading quota inodes in the regular case Darrick J. Wong
2024-09-02 18:22 ` [PATCHSET 7/8] xfs: various bug fixes for 6.12 Darrick J. Wong
2024-09-02 18:22 ` [PATCHSET v4.2 8/8] xfs: cleanups for inode rooted btree code Darrick J. Wong
7 siblings, 1 reply; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:22 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
Hi all,
Refactor the quota file loading code in preparation for adding metadata
directory trees. Did you know that quotarm works even when quota isn't active?
If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.
With a bit of luck, this should all go splendidly.
Comments and questions are, as always, welcome.
--D
kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=quota-cleanups-6.12
---
Commits in this patchset:
* xfs: refactor loading quota inodes in the regular case
---
fs/xfs/xfs_qm.c | 46 +++++++++++++++++++++++++++++++++++-----
fs/xfs/xfs_qm.h | 3 +++
fs/xfs/xfs_qm_syscalls.c | 13 +++++------
fs/xfs/xfs_quotaops.c | 53 +++++++++++++++++++++++++++-------------------
4 files changed, 80 insertions(+), 35 deletions(-)
* [PATCHSET 7/8] xfs: various bug fixes for 6.12
2024-09-02 18:16 [PATCHBOMB 6.12] xfs: a ton of bugfixes and cleanups Darrick J. Wong
` (5 preceding siblings ...)
2024-09-02 18:22 ` [PATCHSET v4.2 6/8] xfs: cleanups for quota mount Darrick J. Wong
@ 2024-09-02 18:22 ` Darrick J. Wong
2024-09-02 18:32 ` [PATCH 1/3] xfs: fix C++ compilation errors in xfs_fs.h Darrick J. Wong
` (2 more replies)
2024-09-02 18:22 ` [PATCHSET v4.2 8/8] xfs: cleanups for inode rooted btree code Darrick J. Wong
7 siblings, 3 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:22 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, kernel, sam, linux-xfs
Hi all,
Various bug fixes for 6.12.
If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.
With a bit of luck, this should all go splendidly.
Comments and questions are, as always, welcome.
--D
kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=xfs-fixes-6.12
---
Commits in this patchset:
* xfs: fix C++ compilation errors in xfs_fs.h
* xfs: fix FITRIM reporting again
* xfs: fix a sloppy memory handling bug in xfs_iroot_realloc
---
fs/xfs/libxfs/xfs_fs.h | 5 +++--
fs/xfs/libxfs/xfs_inode_fork.c | 10 +++++-----
fs/xfs/xfs_discard.c | 2 +-
3 files changed, 9 insertions(+), 8 deletions(-)
* [PATCHSET v4.2 8/8] xfs: cleanups for inode rooted btree code
2024-09-02 18:16 [PATCHBOMB 6.12] xfs: a ton of bugfixes and cleanups Darrick J. Wong
` (6 preceding siblings ...)
2024-09-02 18:22 ` [PATCHSET 7/8] xfs: various bug fixes for 6.12 Darrick J. Wong
@ 2024-09-02 18:22 ` Darrick J. Wong
2024-09-02 18:33 ` [PATCH 1/2] xfs: replace shouty XFS_BM{BT,DR} macros Darrick J. Wong
2024-09-02 18:33 ` [PATCH 2/2] xfs: standardize the btree maxrecs function parameters Darrick J. Wong
7 siblings, 2 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:22 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
Hi all,
This series prepares the btree code to support realtime reverse mapping btrees
by refactoring xfs_ifork_realloc to be fed a per-btree ops structure so that it
can handle multiple types of inode-rooted btrees. It moves on to refactoring
the btree code to use the new realloc routines.
If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.
With a bit of luck, this should all go splendidly.
Comments and questions are, as always, welcome.
--D
kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=btree-cleanups-6.12
---
Commits in this patchset:
* xfs: replace shouty XFS_BM{BT,DR} macros
* xfs: standardize the btree maxrecs function parameters
---
fs/xfs/libxfs/xfs_alloc_btree.c | 6 +
fs/xfs/libxfs/xfs_alloc_btree.h | 3 -
fs/xfs/libxfs/xfs_attr_leaf.c | 8 +
fs/xfs/libxfs/xfs_bmap.c | 42 ++++---
fs/xfs/libxfs/xfs_bmap_btree.c | 24 ++--
fs/xfs/libxfs/xfs_bmap_btree.h | 207 +++++++++++++++++++++++++-----------
fs/xfs/libxfs/xfs_ialloc.c | 4 -
fs/xfs/libxfs/xfs_ialloc_btree.c | 6 +
fs/xfs/libxfs/xfs_ialloc_btree.h | 3 -
fs/xfs/libxfs/xfs_inode_fork.c | 34 +++---
fs/xfs/libxfs/xfs_refcount_btree.c | 5 +
fs/xfs/libxfs/xfs_refcount_btree.h | 3 -
fs/xfs/libxfs/xfs_rmap_btree.c | 7 +
fs/xfs/libxfs/xfs_rmap_btree.h | 3 -
fs/xfs/libxfs/xfs_sb.c | 16 +--
fs/xfs/libxfs/xfs_trans_resv.c | 2
fs/xfs/scrub/bmap_repair.c | 2
fs/xfs/scrub/inode_repair.c | 12 +-
fs/xfs/xfs_bmap_util.c | 4 -
19 files changed, 237 insertions(+), 154 deletions(-)
* [PATCH 1/1] xfs: introduce new file range commit ioctls
2024-09-02 18:21 ` [PATCHSET v31.1 1/8] xfs: atomic file content commits Darrick J. Wong
@ 2024-09-02 18:23 ` Darrick J. Wong
2024-09-03 7:52 ` Christian Brauner
0 siblings, 1 reply; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:23 UTC (permalink / raw)
To: chandanbabu, djwong
Cc: Jeff Layton, Christoph Hellwig, linux-fsdevel, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
This patch introduces two more new ioctls to manage atomic updates to
file contents -- XFS_IOC_START_COMMIT and XFS_IOC_COMMIT_RANGE. The
commit mechanism here is exactly the same as what XFS_IOC_EXCHANGE_RANGE
does, but with the additional requirement that file2 cannot have changed
since some sampling point. The start-commit ioctl performs the sampling
of file attributes.
Note: This patch currently samples i_ctime during START_COMMIT and
checks that it hasn't changed during COMMIT_RANGE. This isn't entirely
safe in kernels prior to 6.12 because ctime only had coarse
granularity, so very fast updates could collide with a COMMIT_RANGE.
With the multi-granularity ctime introduced by Jeff Layton, it's now
possible to update ctime such that this does not happen.
It is critical, then, that this patch not be backported to any
kernel that does not support fine-grained file change timestamps.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Acked-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/libxfs/xfs_fs.h | 26 +++++++++
fs/xfs/xfs_exchrange.c | 143 ++++++++++++++++++++++++++++++++++++++++++++++++
fs/xfs/xfs_exchrange.h | 16 +++++
fs/xfs/xfs_ioctl.c | 4 +
fs/xfs/xfs_trace.h | 57 +++++++++++++++++++
5 files changed, 243 insertions(+), 3 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_fs.h b/fs/xfs/libxfs/xfs_fs.h
index 454b63ef7201..c85c8077fac3 100644
--- a/fs/xfs/libxfs/xfs_fs.h
+++ b/fs/xfs/libxfs/xfs_fs.h
@@ -825,6 +825,30 @@ struct xfs_exchange_range {
__u64 flags; /* see XFS_EXCHANGE_RANGE_* below */
};
+/*
+ * Using the same definition of file2 as struct xfs_exchange_range, commit the
+ * contents of file1 into file2 if file2 has the same inode number, mtime, and
+ * ctime as the arguments provided to the call. The old contents of file2 will
+ * be moved to file1.
+ *
+ * Returns -EBUSY if there isn't an exact match for the file2 fields.
+ *
+ * Filesystems must be able to restart and complete the operation even after
+ * the system goes down.
+ */
+struct xfs_commit_range {
+ __s32 file1_fd;
+ __u32 pad; /* must be zeroes */
+ __u64 file1_offset; /* file1 offset, bytes */
+ __u64 file2_offset; /* file2 offset, bytes */
+ __u64 length; /* bytes to exchange */
+
+ __u64 flags; /* see XFS_EXCHANGE_RANGE_* below */
+
+ /* opaque file2 metadata for freshness checks */
+ __u64 file2_freshness[6];
+};
+
/*
* Exchange file data all the way to the ends of both files, and then exchange
* the file sizes. This flag can be used to replace a file's contents with a
@@ -997,6 +1021,8 @@ struct xfs_getparents_by_handle {
#define XFS_IOC_BULKSTAT _IOR ('X', 127, struct xfs_bulkstat_req)
#define XFS_IOC_INUMBERS _IOR ('X', 128, struct xfs_inumbers_req)
#define XFS_IOC_EXCHANGE_RANGE _IOW ('X', 129, struct xfs_exchange_range)
+#define XFS_IOC_START_COMMIT _IOR ('X', 130, struct xfs_commit_range)
+#define XFS_IOC_COMMIT_RANGE _IOW ('X', 131, struct xfs_commit_range)
/* XFS_IOC_GETFSUUID ---------- deprecated 140 */
diff --git a/fs/xfs/xfs_exchrange.c b/fs/xfs/xfs_exchrange.c
index c8a655c92c92..d0889190ab7f 100644
--- a/fs/xfs/xfs_exchrange.c
+++ b/fs/xfs/xfs_exchrange.c
@@ -72,6 +72,34 @@ xfs_exchrange_estimate(
return error;
}
+/*
+ * Check that file2's metadata agree with the snapshot that we took for the
+ * range commit request.
+ *
+ * This should be called after the filesystem has locked /all/ inode metadata
+ * against modification.
+ */
+STATIC int
+xfs_exchrange_check_freshness(
+ const struct xfs_exchrange *fxr,
+ struct xfs_inode *ip2)
+{
+ struct inode *inode2 = VFS_I(ip2);
+ struct timespec64 ctime = inode_get_ctime(inode2);
+ struct timespec64 mtime = inode_get_mtime(inode2);
+
+ trace_xfs_exchrange_freshness(fxr, ip2);
+
+ /* Check that file2 hasn't otherwise been modified. */
+ if (fxr->file2_ino != ip2->i_ino ||
+ fxr->file2_gen != inode2->i_generation ||
+ !timespec64_equal(&fxr->file2_ctime, &ctime) ||
+ !timespec64_equal(&fxr->file2_mtime, &mtime))
+ return -EBUSY;
+
+ return 0;
+}
+
#define QRETRY_IP1 (0x1)
#define QRETRY_IP2 (0x2)
@@ -607,6 +635,12 @@ xfs_exchrange_prep(
if (error || fxr->length == 0)
return error;
+ if (fxr->flags & __XFS_EXCHANGE_RANGE_CHECK_FRESH2) {
+ error = xfs_exchrange_check_freshness(fxr, ip2);
+ if (error)
+ return error;
+ }
+
/* Attach dquots to both inodes before changing block maps. */
error = xfs_qm_dqattach(ip2);
if (error)
@@ -719,7 +753,8 @@ xfs_exchange_range(
if (fxr->file1->f_path.mnt != fxr->file2->f_path.mnt)
return -EXDEV;
- if (fxr->flags & ~XFS_EXCHANGE_RANGE_ALL_FLAGS)
+ if (fxr->flags & ~(XFS_EXCHANGE_RANGE_ALL_FLAGS |
+ __XFS_EXCHANGE_RANGE_CHECK_FRESH2))
return -EINVAL;
/* Userspace requests only honored for regular files. */
@@ -802,3 +837,109 @@ xfs_ioc_exchange_range(
fdput(file1);
return error;
}
+
+/* Opaque freshness blob for XFS_IOC_COMMIT_RANGE */
+struct xfs_commit_range_fresh {
+ xfs_fsid_t fsid; /* m_fixedfsid */
+ __u64 file2_ino; /* inode number */
+ __s64 file2_mtime; /* modification time */
+ __s64 file2_ctime; /* change time */
+ __s32 file2_mtime_nsec; /* mod time, nsec */
+ __s32 file2_ctime_nsec; /* change time, nsec */
+ __u32 file2_gen; /* inode generation */
+ __u32 magic; /* zero */
+};
+#define XCR_FRESH_MAGIC 0x444F524B /* DORK */
+
+/* Set up a commitrange operation by sampling file2's write-related attrs */
+long
+xfs_ioc_start_commit(
+ struct file *file,
+ struct xfs_commit_range __user *argp)
+{
+ struct xfs_commit_range args = { };
+ struct timespec64 ts;
+ struct xfs_commit_range_fresh *kern_f;
+ struct xfs_commit_range_fresh __user *user_f;
+ struct inode *inode2 = file_inode(file);
+ struct xfs_inode *ip2 = XFS_I(inode2);
+ const unsigned int lockflags = XFS_IOLOCK_SHARED |
+ XFS_MMAPLOCK_SHARED |
+ XFS_ILOCK_SHARED;
+
+ BUILD_BUG_ON(sizeof(struct xfs_commit_range_fresh) !=
+ sizeof(args.file2_freshness));
+
+ kern_f = (struct xfs_commit_range_fresh *)&args.file2_freshness;
+
+ memcpy(&kern_f->fsid, ip2->i_mount->m_fixedfsid, sizeof(xfs_fsid_t));
+
+ xfs_ilock(ip2, lockflags);
+ ts = inode_get_ctime(inode2);
+ kern_f->file2_ctime = ts.tv_sec;
+ kern_f->file2_ctime_nsec = ts.tv_nsec;
+ ts = inode_get_mtime(inode2);
+ kern_f->file2_mtime = ts.tv_sec;
+ kern_f->file2_mtime_nsec = ts.tv_nsec;
+ kern_f->file2_ino = ip2->i_ino;
+ kern_f->file2_gen = inode2->i_generation;
+ kern_f->magic = XCR_FRESH_MAGIC;
+ xfs_iunlock(ip2, lockflags);
+
+ user_f = (struct xfs_commit_range_fresh __user *)&argp->file2_freshness;
+ if (copy_to_user(user_f, kern_f, sizeof(*kern_f)))
+ return -EFAULT;
+
+ return 0;
+}
+
+/*
+ * Exchange file1 and file2 contents if file2 has not been written since the
+ * start commit operation.
+ */
+long
+xfs_ioc_commit_range(
+ struct file *file,
+ struct xfs_commit_range __user *argp)
+{
+ struct xfs_exchrange fxr = {
+ .file2 = file,
+ };
+ struct xfs_commit_range args;
+ struct xfs_commit_range_fresh *kern_f;
+ struct xfs_inode *ip2 = XFS_I(file_inode(file));
+ struct xfs_mount *mp = ip2->i_mount;
+ struct fd file1;
+ int error;
+
+ kern_f = (struct xfs_commit_range_fresh *)&args.file2_freshness;
+
+ if (copy_from_user(&args, argp, sizeof(args)))
+ return -EFAULT;
+ if (args.flags & ~XFS_EXCHANGE_RANGE_ALL_FLAGS)
+ return -EINVAL;
+ if (kern_f->magic != XCR_FRESH_MAGIC)
+ return -EBUSY;
+ if (memcmp(&kern_f->fsid, mp->m_fixedfsid, sizeof(xfs_fsid_t)))
+ return -EBUSY;
+
+ fxr.file1_offset = args.file1_offset;
+ fxr.file2_offset = args.file2_offset;
+ fxr.length = args.length;
+ fxr.flags = args.flags | __XFS_EXCHANGE_RANGE_CHECK_FRESH2;
+ fxr.file2_ino = kern_f->file2_ino;
+ fxr.file2_gen = kern_f->file2_gen;
+ fxr.file2_mtime.tv_sec = kern_f->file2_mtime;
+ fxr.file2_mtime.tv_nsec = kern_f->file2_mtime_nsec;
+ fxr.file2_ctime.tv_sec = kern_f->file2_ctime;
+ fxr.file2_ctime.tv_nsec = kern_f->file2_ctime_nsec;
+
+ file1 = fdget(args.file1_fd);
+ if (!file1.file)
+ return -EBADF;
+ fxr.file1 = file1.file;
+
+ error = xfs_exchange_range(&fxr);
+ fdput(file1);
+ return error;
+}
diff --git a/fs/xfs/xfs_exchrange.h b/fs/xfs/xfs_exchrange.h
index 039abcca546e..bc1298aba806 100644
--- a/fs/xfs/xfs_exchrange.h
+++ b/fs/xfs/xfs_exchrange.h
@@ -10,8 +10,12 @@
#define __XFS_EXCHANGE_RANGE_UPD_CMTIME1 (1ULL << 63)
#define __XFS_EXCHANGE_RANGE_UPD_CMTIME2 (1ULL << 62)
+/* Freshness check required */
+#define __XFS_EXCHANGE_RANGE_CHECK_FRESH2 (1ULL << 61)
+
#define XFS_EXCHANGE_RANGE_PRIV_FLAGS (__XFS_EXCHANGE_RANGE_UPD_CMTIME1 | \
- __XFS_EXCHANGE_RANGE_UPD_CMTIME2)
+ __XFS_EXCHANGE_RANGE_UPD_CMTIME2 | \
+ __XFS_EXCHANGE_RANGE_CHECK_FRESH2)
struct xfs_exchrange {
struct file *file1;
@@ -22,10 +26,20 @@ struct xfs_exchrange {
u64 length;
u64 flags; /* XFS_EXCHANGE_RANGE flags */
+
+ /* file2 metadata for freshness checks */
+ u64 file2_ino;
+ struct timespec64 file2_mtime;
+ struct timespec64 file2_ctime;
+ u32 file2_gen;
};
long xfs_ioc_exchange_range(struct file *file,
struct xfs_exchange_range __user *argp);
+long xfs_ioc_start_commit(struct file *file,
+ struct xfs_commit_range __user *argp);
+long xfs_ioc_commit_range(struct file *file,
+ struct xfs_commit_range __user *argp);
struct xfs_exchmaps_req;
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 6b13666d4e96..90b3ee21e7fe 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1518,6 +1518,10 @@ xfs_file_ioctl(
case XFS_IOC_EXCHANGE_RANGE:
return xfs_ioc_exchange_range(filp, arg);
+ case XFS_IOC_START_COMMIT:
+ return xfs_ioc_start_commit(filp, arg);
+ case XFS_IOC_COMMIT_RANGE:
+ return xfs_ioc_commit_range(filp, arg);
default:
return -ENOTTY;
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index 180ce697305a..4cf0fa71ba9c 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -4926,7 +4926,8 @@ DEFINE_INODE_ERROR_EVENT(xfs_exchrange_error);
{ XFS_EXCHANGE_RANGE_DRY_RUN, "DRY_RUN" }, \
{ XFS_EXCHANGE_RANGE_FILE1_WRITTEN, "F1_WRITTEN" }, \
{ __XFS_EXCHANGE_RANGE_UPD_CMTIME1, "CMTIME1" }, \
- { __XFS_EXCHANGE_RANGE_UPD_CMTIME2, "CMTIME2" }
+ { __XFS_EXCHANGE_RANGE_UPD_CMTIME2, "CMTIME2" }, \
+ { __XFS_EXCHANGE_RANGE_CHECK_FRESH2, "FRESH2" }
/* file exchange-range tracepoint class */
DECLARE_EVENT_CLASS(xfs_exchrange_class,
@@ -4986,6 +4987,60 @@ DEFINE_EXCHRANGE_EVENT(xfs_exchrange_prep);
DEFINE_EXCHRANGE_EVENT(xfs_exchrange_flush);
DEFINE_EXCHRANGE_EVENT(xfs_exchrange_mappings);
+TRACE_EVENT(xfs_exchrange_freshness,
+ TP_PROTO(const struct xfs_exchrange *fxr, struct xfs_inode *ip2),
+ TP_ARGS(fxr, ip2),
+ TP_STRUCT__entry(
+ __field(dev_t, dev)
+ __field(xfs_ino_t, ip2_ino)
+ __field(long long, ip2_mtime)
+ __field(long long, ip2_ctime)
+ __field(int, ip2_mtime_nsec)
+ __field(int, ip2_ctime_nsec)
+
+ __field(xfs_ino_t, file2_ino)
+ __field(long long, file2_mtime)
+ __field(long long, file2_ctime)
+ __field(int, file2_mtime_nsec)
+ __field(int, file2_ctime_nsec)
+ ),
+ TP_fast_assign(
+ struct timespec64 ts64;
+ struct inode *inode2 = VFS_I(ip2);
+
+ __entry->dev = inode2->i_sb->s_dev;
+ __entry->ip2_ino = ip2->i_ino;
+
+ ts64 = inode_get_ctime(inode2);
+ __entry->ip2_ctime = ts64.tv_sec;
+ __entry->ip2_ctime_nsec = ts64.tv_nsec;
+
+ ts64 = inode_get_mtime(inode2);
+ __entry->ip2_mtime = ts64.tv_sec;
+ __entry->ip2_mtime_nsec = ts64.tv_nsec;
+
+ __entry->file2_ino = fxr->file2_ino;
+ __entry->file2_mtime = fxr->file2_mtime.tv_sec;
+ __entry->file2_ctime = fxr->file2_ctime.tv_sec;
+ __entry->file2_mtime_nsec = fxr->file2_mtime.tv_nsec;
+ __entry->file2_ctime_nsec = fxr->file2_ctime.tv_nsec;
+ ),
+ TP_printk("dev %d:%d "
+ "ino 0x%llx mtime %lld:%d ctime %lld:%d -> "
+ "file 0x%llx mtime %lld:%d ctime %lld:%d",
+ MAJOR(__entry->dev), MINOR(__entry->dev),
+ __entry->ip2_ino,
+ __entry->ip2_mtime,
+ __entry->ip2_mtime_nsec,
+ __entry->ip2_ctime,
+ __entry->ip2_ctime_nsec,
+ __entry->file2_ino,
+ __entry->file2_mtime,
+ __entry->file2_mtime_nsec,
+ __entry->file2_ctime,
+ __entry->file2_ctime_nsec)
+);
+
TRACE_EVENT(xfs_exchmaps_overhead,
TP_PROTO(struct xfs_mount *mp, unsigned long long bmbt_blocks,
unsigned long long rmapbt_blocks),
* [PATCH 1/3] xfs: validate inumber in xfs_iget
2024-09-02 18:21 ` [PATCHSET v4.2 2/8] xfs: cleanups before adding metadata directories Darrick J. Wong
@ 2024-09-02 18:23 ` Darrick J. Wong
2024-09-02 18:23 ` [PATCH 2/3] xfs: match on the global RT inode numbers in xfs_is_metadata_inode Darrick J. Wong
2024-09-02 18:23 ` [PATCH 3/3] xfs: pass the icreate args object to xfs_dialloc Darrick J. Wong
2 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:23 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, Dave Chinner, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Actually use the inumber validator to check the argument passed in here.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
---
fs/xfs/xfs_icache.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index cf629302d48e..887d2a01161e 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -755,7 +755,7 @@ xfs_iget(
ASSERT((lock_flags & (XFS_IOLOCK_EXCL | XFS_IOLOCK_SHARED)) == 0);
/* reject inode numbers outside existing AGs */
- if (!ino || XFS_INO_TO_AGNO(mp, ino) >= mp->m_sb.sb_agcount)
+ if (!xfs_verify_ino(mp, ino))
return -EINVAL;
XFS_STATS_INC(mp, xs_ig_attempts);
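To illustrate why a full validator is stricter than the open-coded test it replaces, here is a toy userspace model. xfs_verify_ino itself is not shown in this patch; the sketch assumes that, beyond the AG number check, it also bounds the per-AG part of the inode number. The struct `toy_geom` and all `toy_*` helpers are hypothetical simplifications, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified geometry; the real values live in xfs_mount. */
struct toy_geom {
	uint32_t     agcount;   /* number of allocation groups */
	unsigned int agino_log; /* bits used by the per-AG inode number */
	uint32_t     agino_min; /* first valid per-AG inode number */
	uint32_t     agino_max; /* last valid per-AG inode number */
};

/* Upper bits of the inode number select the allocation group. */
uint64_t toy_ino_to_agno(const struct toy_geom *g, uint64_t ino)
{
	return ino >> g->agino_log;
}

/* Lower bits are the inode number within that allocation group. */
uint64_t toy_ino_to_agino(const struct toy_geom *g, uint64_t ino)
{
	return ino & ((1ULL << g->agino_log) - 1);
}

/* Model of the old check: nonzero and inside an existing AG. */
bool toy_old_check(const struct toy_geom *g, uint64_t ino)
{
	return ino != 0 && toy_ino_to_agno(g, ino) < g->agcount;
}

/* Model of a full validator: also bound the per-AG inode number. */
bool toy_verify_ino(const struct toy_geom *g, uint64_t ino)
{
	uint64_t agino;

	assert(g->agino_log < 64);
	if (toy_ino_to_agno(g, ino) >= g->agcount)
		return false;
	agino = toy_ino_to_agino(g, ino);
	return agino >= g->agino_min && agino <= g->agino_max;
}
```

In this model an inode number that lands in a valid AG but below the first usable per-AG inode passes the old test and fails the stricter one.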
* [PATCH 2/3] xfs: match on the global RT inode numbers in xfs_is_metadata_inode
2024-09-02 18:21 ` [PATCHSET v4.2 2/8] xfs: cleanups before adding metadata directories Darrick J. Wong
2024-09-02 18:23 ` [PATCH 1/3] xfs: validate inumber in xfs_iget Darrick J. Wong
@ 2024-09-02 18:23 ` Darrick J. Wong
2024-09-02 18:23 ` [PATCH 3/3] xfs: pass the icreate args object to xfs_dialloc Darrick J. Wong
2 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:23 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Christoph Hellwig <hch@lst.de>
Match the inode number instead of the inode pointers, as the inode
pointers in the superblock will go away soon.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: port to my tree, make the parameter a const pointer]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
fs/xfs/xfs_inode.h | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index 51defdebef30..1908409968db 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -276,12 +276,13 @@ static inline bool xfs_is_reflink_inode(struct xfs_inode *ip)
return ip->i_diflags2 & XFS_DIFLAG2_REFLINK;
}
-static inline bool xfs_is_metadata_inode(struct xfs_inode *ip)
+static inline bool xfs_is_metadata_inode(const struct xfs_inode *ip)
{
struct xfs_mount *mp = ip->i_mount;
- return ip == mp->m_rbmip || ip == mp->m_rsumip ||
- xfs_is_quota_inode(&mp->m_sb, ip->i_ino);
+ return ip->i_ino == mp->m_sb.sb_rbmino ||
+ ip->i_ino == mp->m_sb.sb_rsumino ||
+ xfs_is_quota_inode(&mp->m_sb, ip->i_ino);
}
bool xfs_is_always_cow_inode(struct xfs_inode *ip);
* [PATCH 3/3] xfs: pass the icreate args object to xfs_dialloc
2024-09-02 18:21 ` [PATCHSET v4.2 2/8] xfs: cleanups before adding metadata directories Darrick J. Wong
2024-09-02 18:23 ` [PATCH 1/3] xfs: validate inumber in xfs_iget Darrick J. Wong
2024-09-02 18:23 ` [PATCH 2/3] xfs: match on the global RT inode numbers in xfs_is_metadata_inode Darrick J. Wong
@ 2024-09-02 18:23 ` Darrick J. Wong
2 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:23 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Pass the xfs_icreate_args object to xfs_dialloc since we can extract the
relevant mode (really just the file type) and parent inumber from there.
This simplifies the calling convention in preparation for the next
patch.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/libxfs/xfs_ialloc.c | 5 +++--
fs/xfs/libxfs/xfs_ialloc.h | 4 +++-
fs/xfs/scrub/tempfile.c | 2 +-
fs/xfs/xfs_inode.c | 4 ++--
fs/xfs/xfs_qm.c | 2 +-
fs/xfs/xfs_symlink.c | 2 +-
6 files changed, 11 insertions(+), 8 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c
index 0af5b7a33d05..fc70601e8d8e 100644
--- a/fs/xfs/libxfs/xfs_ialloc.c
+++ b/fs/xfs/libxfs/xfs_ialloc.c
@@ -1855,11 +1855,12 @@ xfs_dialloc_try_ag(
int
xfs_dialloc(
struct xfs_trans **tpp,
- xfs_ino_t parent,
- umode_t mode,
+ const struct xfs_icreate_args *args,
xfs_ino_t *new_ino)
{
struct xfs_mount *mp = (*tpp)->t_mountp;
+ xfs_ino_t parent = args->pip ? args->pip->i_ino : 0;
+ umode_t mode = args->mode & S_IFMT;
xfs_agnumber_t agno;
int error = 0;
xfs_agnumber_t start_agno;
diff --git a/fs/xfs/libxfs/xfs_ialloc.h b/fs/xfs/libxfs/xfs_ialloc.h
index b549627e3a61..3a1323155a45 100644
--- a/fs/xfs/libxfs/xfs_ialloc.h
+++ b/fs/xfs/libxfs/xfs_ialloc.h
@@ -33,11 +33,13 @@ xfs_make_iptr(struct xfs_mount *mp, struct xfs_buf *b, int o)
return xfs_buf_offset(b, o << (mp)->m_sb.sb_inodelog);
}
+struct xfs_icreate_args;
+
/*
* Allocate an inode on disk. Mode is used to tell whether the new inode will
* need space, and whether it is a directory.
*/
-int xfs_dialloc(struct xfs_trans **tpp, xfs_ino_t parent, umode_t mode,
+int xfs_dialloc(struct xfs_trans **tpp, const struct xfs_icreate_args *args,
xfs_ino_t *new_ino);
int xfs_difree(struct xfs_trans *tp, struct xfs_perag *pag,
diff --git a/fs/xfs/scrub/tempfile.c b/fs/xfs/scrub/tempfile.c
index d390d56cd875..177f922acfaf 100644
--- a/fs/xfs/scrub/tempfile.c
+++ b/fs/xfs/scrub/tempfile.c
@@ -88,7 +88,7 @@ xrep_tempfile_create(
goto out_release_dquots;
/* Allocate inode, set up directory. */
- error = xfs_dialloc(&tp, dp->i_ino, mode, &ino);
+ error = xfs_dialloc(&tp, &args, &ino);
if (error)
goto out_trans_cancel;
error = xfs_icreate(tp, ino, &args, &sc->tempip);
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 7dc6f326936c..9ea7a18f5da1 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -704,7 +704,7 @@ xfs_create(
* entry pointing to them, but a directory also the "." entry
* pointing to itself.
*/
- error = xfs_dialloc(&tp, dp->i_ino, args->mode, &ino);
+ error = xfs_dialloc(&tp, args, &ino);
if (!error)
error = xfs_icreate(tp, ino, args, &du.ip);
if (error)
@@ -812,7 +812,7 @@ xfs_create_tmpfile(
if (error)
goto out_release_dquots;
- error = xfs_dialloc(&tp, dp->i_ino, args->mode, &ino);
+ error = xfs_dialloc(&tp, args, &ino);
if (!error)
error = xfs_icreate(tp, ino, args, &ip);
if (error)
diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index 9490b913a4ab..63f6ca2db251 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -799,7 +799,7 @@ xfs_qm_qino_alloc(
};
xfs_ino_t ino;
- error = xfs_dialloc(&tp, 0, S_IFREG, &ino);
+ error = xfs_dialloc(&tp, &args, &ino);
if (!error)
error = xfs_icreate(tp, ino, &args, ipp);
if (error) {
diff --git a/fs/xfs/xfs_symlink.c b/fs/xfs/xfs_symlink.c
index 77f19e2f66e0..4252b07cd251 100644
--- a/fs/xfs/xfs_symlink.c
+++ b/fs/xfs/xfs_symlink.c
@@ -165,7 +165,7 @@ xfs_symlink(
/*
* Allocate an inode for the symlink.
*/
- error = xfs_dialloc(&tp, dp->i_ino, S_IFLNK, &ino);
+ error = xfs_dialloc(&tp, &args, &ino);
if (!error)
error = xfs_icreate(tp, ino, &args, &du.ip);
if (error)
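The two values that xfs_dialloc now derives from the args object can be modelled in userspace as below. This is only a sketch: `toy_inode`, `toy_icreate_args` and the `TOY_S_*` constants are hypothetical stand-ins (the constants mirror the usual Linux file type bits), and only the two derivations are taken from the hunk above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the usual Linux file type bits (assumed values). */
#define TOY_S_IFMT  0170000u
#define TOY_S_IFDIR 0040000u
#define TOY_S_IFREG 0100000u
#define TOY_S_IFLNK 0120000u

/* Hypothetical, cut-down versions of the kernel structures. */
struct toy_inode {
	uint64_t i_ino;
};

struct toy_icreate_args {
	struct toy_inode *pip;  /* parent directory, NULL if there is none */
	unsigned int      mode; /* file type and permission bits */
};

/* Parent inode number, or 0 when the new inode has no parent. */
uint64_t toy_parent_ino(const struct toy_icreate_args *args)
{
	assert(args != NULL);
	return args->pip ? args->pip->i_ino : 0;
}

/* Only the file type matters to the allocator; drop permission bits. */
unsigned int toy_file_type(const struct toy_icreate_args *args)
{
	assert(args != NULL);
	return args->mode & TOY_S_IFMT;
}
```

A caller such as the quota inode allocation, which used to pass 0 and S_IFREG explicitly, gets the same result from an args object with no parent and a regular-file mode.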
* [PATCH 01/12] xfs: remove xfs_validate_rtextents
2024-09-02 18:21 ` [PATCHSET v4.2 3/8] xfs: clean up the rtbitmap code Darrick J. Wong
@ 2024-09-02 18:24 ` Darrick J. Wong
2024-09-02 18:24 ` [PATCH 02/12] xfs: factor out a xfs_validate_rt_geometry helper Darrick J. Wong
` (10 subsequent siblings)
11 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:24 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Christoph Hellwig <hch@lst.de>
Replace xfs_validate_rtextents with an open coded check for 0
rtextents. The function's name implies that it does a lot more than a
zero check, and the check itself is more obvious when open coded.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
fs/xfs/libxfs/xfs_sb.c | 2 +-
fs/xfs/libxfs/xfs_types.h | 12 ------------
fs/xfs/xfs_rtalloc.c | 2 +-
3 files changed, 2 insertions(+), 14 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
index 6b56f0f6d4c1..f2fb6035fd21 100644
--- a/fs/xfs/libxfs/xfs_sb.c
+++ b/fs/xfs/libxfs/xfs_sb.c
@@ -514,7 +514,7 @@ xfs_validate_sb_common(
rbmblocks = howmany_64(sbp->sb_rextents,
NBBY * sbp->sb_blocksize);
- if (!xfs_validate_rtextents(rexts) ||
+ if (sbp->sb_rextents == 0 ||
sbp->sb_rextents != rexts ||
sbp->sb_rextslog != xfs_compute_rextslog(rexts) ||
sbp->sb_rbmblocks != rbmblocks) {
diff --git a/fs/xfs/libxfs/xfs_types.h b/fs/xfs/libxfs/xfs_types.h
index 76eb9e328835..a8cd44d03ef6 100644
--- a/fs/xfs/libxfs/xfs_types.h
+++ b/fs/xfs/libxfs/xfs_types.h
@@ -235,16 +235,4 @@ bool xfs_verify_fileoff(struct xfs_mount *mp, xfs_fileoff_t off);
bool xfs_verify_fileext(struct xfs_mount *mp, xfs_fileoff_t off,
xfs_fileoff_t len);
-/* Do we support an rt volume having this number of rtextents? */
-static inline bool
-xfs_validate_rtextents(
- xfs_rtbxlen_t rtextents)
-{
- /* No runt rt volumes */
- if (rtextents == 0)
- return false;
-
- return true;
-}
-
#endif /* __XFS_TYPES_H__ */
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index ebeab8e4dab1..d28395abdd02 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -903,7 +903,7 @@ xfs_growfs_rt(
*/
nrextents = nrblocks;
do_div(nrextents, in->extsize);
- if (!xfs_validate_rtextents(nrextents)) {
+ if (nrextents == 0) {
error = -EINVAL;
goto out_unlock;
}
* [PATCH 02/12] xfs: factor out a xfs_validate_rt_geometry helper
2024-09-02 18:21 ` [PATCHSET v4.2 3/8] xfs: clean up the rtbitmap code Darrick J. Wong
2024-09-02 18:24 ` [PATCH 01/12] xfs: remove xfs_validate_rtextents Darrick J. Wong
@ 2024-09-02 18:24 ` Darrick J. Wong
2024-09-02 18:24 ` [PATCH 03/12] xfs: make the RT rsum_cache mandatory Darrick J. Wong
` (9 subsequent siblings)
11 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:24 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Christoph Hellwig <hch@lst.de>
Split the RT geometry validation in the early mount code into a
helper that can be reused by repair (from which this code was
apparently originally stolen anyway).
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: u64 return value for calc_rbmblocks]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
fs/xfs/libxfs/xfs_sb.c | 64 ++++++++++++++++++++++++++----------------------
fs/xfs/libxfs/xfs_sb.h | 1 +
2 files changed, 36 insertions(+), 29 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
index f2fb6035fd21..f9c3045f71e0 100644
--- a/fs/xfs/libxfs/xfs_sb.c
+++ b/fs/xfs/libxfs/xfs_sb.c
@@ -232,6 +232,38 @@ xfs_validate_sb_read(
return 0;
}
+static uint64_t
+xfs_sb_calc_rbmblocks(
+ struct xfs_sb *sbp)
+{
+ return howmany_64(sbp->sb_rextents, NBBY * sbp->sb_blocksize);
+}
+
+/* Validate the realtime geometry */
+bool
+xfs_validate_rt_geometry(
+ struct xfs_sb *sbp)
+{
+ if (sbp->sb_rextsize * sbp->sb_blocksize > XFS_MAX_RTEXTSIZE ||
+ sbp->sb_rextsize * sbp->sb_blocksize < XFS_MIN_RTEXTSIZE)
+ return false;
+
+ if (sbp->sb_rblocks == 0) {
+ if (sbp->sb_rextents != 0 || sbp->sb_rbmblocks != 0 ||
+ sbp->sb_rextslog != 0 || sbp->sb_frextents != 0)
+ return false;
+ return true;
+ }
+
+ if (sbp->sb_rextents == 0 ||
+ sbp->sb_rextents != div_u64(sbp->sb_rblocks, sbp->sb_rextsize) ||
+ sbp->sb_rextslog != xfs_compute_rextslog(sbp->sb_rextents) ||
+ sbp->sb_rbmblocks != xfs_sb_calc_rbmblocks(sbp))
+ return false;
+
+ return true;
+}
+
/* Check all the superblock fields we care about when writing one out. */
STATIC int
xfs_validate_sb_write(
@@ -491,39 +523,13 @@ xfs_validate_sb_common(
}
}
- /* Validate the realtime geometry; stolen from xfs_repair */
- if (sbp->sb_rextsize * sbp->sb_blocksize > XFS_MAX_RTEXTSIZE ||
- sbp->sb_rextsize * sbp->sb_blocksize < XFS_MIN_RTEXTSIZE) {
+ if (!xfs_validate_rt_geometry(sbp)) {
xfs_notice(mp,
- "realtime extent sanity check failed");
+ "realtime %sgeometry check failed",
+ sbp->sb_rblocks ? "" : "zeroed ");
return -EFSCORRUPTED;
}
- if (sbp->sb_rblocks == 0) {
- if (sbp->sb_rextents != 0 || sbp->sb_rbmblocks != 0 ||
- sbp->sb_rextslog != 0 || sbp->sb_frextents != 0) {
- xfs_notice(mp,
- "realtime zeroed geometry check failed");
- return -EFSCORRUPTED;
- }
- } else {
- uint64_t rexts;
- uint64_t rbmblocks;
-
- rexts = div_u64(sbp->sb_rblocks, sbp->sb_rextsize);
- rbmblocks = howmany_64(sbp->sb_rextents,
- NBBY * sbp->sb_blocksize);
-
- if (sbp->sb_rextents == 0 ||
- sbp->sb_rextents != rexts ||
- sbp->sb_rextslog != xfs_compute_rextslog(rexts) ||
- sbp->sb_rbmblocks != rbmblocks) {
- xfs_notice(mp,
- "realtime geometry sanity check failed");
- return -EFSCORRUPTED;
- }
- }
-
/*
* Either (sb_unit and !hasdalign) or (!sb_unit and hasdalign)
* would imply the image is corrupted.
diff --git a/fs/xfs/libxfs/xfs_sb.h b/fs/xfs/libxfs/xfs_sb.h
index 37b1ed1bc209..796f02191dfd 100644
--- a/fs/xfs/libxfs/xfs_sb.h
+++ b/fs/xfs/libxfs/xfs_sb.h
@@ -38,6 +38,7 @@ extern int xfs_sb_get_secondary(struct xfs_mount *mp,
bool xfs_validate_stripe_geometry(struct xfs_mount *mp,
__s64 sunit, __s64 swidth, int sectorsize, bool may_repair,
bool silent);
+bool xfs_validate_rt_geometry(struct xfs_sb *sbp);
uint8_t xfs_compute_rextslog(xfs_rtbxlen_t rtextents);
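As a worked example of the arithmetic in the new helper, the following userspace sketch checks the same relationships between the realtime superblock fields. Everything prefixed `toy_`/`TOY_` is hypothetical. The extent size limits (4 KiB and 1 GiB) and the floor(log2) behaviour of the rextslog computation are assumptions, since neither definition appears in this patch. The sketch also widens the size product to 64 bits, which the kernel code does not do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NBBY          8                       /* bits per byte */
#define TOY_MIN_RTEXTSIZE (4ULL * 1024)           /* assumed limit */
#define TOY_MAX_RTEXTSIZE (1024ULL * 1024 * 1024) /* assumed limit */

/* Hypothetical subset of the superblock's realtime fields. */
struct toy_sb {
	uint32_t blocksize; /* bytes per fs block */
	uint32_t rextsize;  /* fs blocks per rt extent */
	uint64_t rblocks;   /* rt volume size in fs blocks */
	uint64_t rextents;  /* rt volume size in rt extents */
	uint64_t frextents; /* free rt extents */
	uint32_t rbmblocks; /* rt bitmap size in fs blocks */
	uint8_t  rextslog;  /* log2 of rextents */
};

/* floor(log2(rextents)), with 0 mapping to 0 (assumed behaviour). */
uint8_t toy_rextslog(uint64_t rextents)
{
	uint8_t log = 0;

	while (rextents >>= 1)
		log++;
	return log;
}

/* One bitmap bit per rt extent, rounded up to whole fs blocks. */
uint64_t toy_calc_rbmblocks(const struct toy_sb *sb)
{
	uint64_t bits_per_block = (uint64_t)TOY_NBBY * sb->blocksize;

	return (sb->rextents + bits_per_block - 1) / bits_per_block;
}

bool toy_validate_rt_geometry(const struct toy_sb *sb)
{
	uint64_t extbytes = (uint64_t)sb->rextsize * sb->blocksize;

	if (extbytes > TOY_MAX_RTEXTSIZE || extbytes < TOY_MIN_RTEXTSIZE)
		return false;

	/* No rt volume: every derived field must be zero as well. */
	if (sb->rblocks == 0)
		return sb->rextents == 0 && sb->rbmblocks == 0 &&
		       sb->rextslog == 0 && sb->frextents == 0;

	/* rextsize is nonzero here because extbytes passed the minimum. */
	return sb->rextents != 0 &&
	       sb->rextents == sb->rblocks / sb->rextsize &&
	       sb->rextslog == toy_rextslog(sb->rextents) &&
	       sb->rbmblocks == toy_calc_rbmblocks(sb);
}
```

With 4096-byte blocks and single-block extents, a 1024-block rt volume needs rextents = 1024, rextslog = 10 and one bitmap block; any other combination is rejected.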
* [PATCH 03/12] xfs: make the RT rsum_cache mandatory
2024-09-02 18:21 ` [PATCHSET v4.2 3/8] xfs: clean up the rtbitmap code Darrick J. Wong
2024-09-02 18:24 ` [PATCH 01/12] xfs: remove xfs_validate_rtextents Darrick J. Wong
2024-09-02 18:24 ` [PATCH 02/12] xfs: factor out a xfs_validate_rt_geometry helper Darrick J. Wong
@ 2024-09-02 18:24 ` Darrick J. Wong
2024-09-02 18:24 ` [PATCH 04/12] xfs: remove the limit argument to xfs_rtfind_back Darrick J. Wong
` (8 subsequent siblings)
11 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:24 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Christoph Hellwig <hch@lst.de>
Currently the RT mount code simply ignores an allocation failure for the
rsum_cache. The code mostly works fine with it, but not having it leads
to nasty corner cases in the growfs code that we don't really handle
well. Switch to failing the mount if we can't allocate the memory; the
file system would not exactly be useful in such a constrained environment
to start with.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
fs/xfs/xfs_rtalloc.c | 26 +++++++++++++++-----------
1 file changed, 15 insertions(+), 11 deletions(-)
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index d28395abdd02..26eab1b408c8 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -767,21 +767,20 @@ xfs_growfs_rt_alloc(
return error;
}
-static void
+static int
xfs_alloc_rsum_cache(
- xfs_mount_t *mp, /* file system mount structure */
- xfs_extlen_t rbmblocks) /* number of rt bitmap blocks */
+ struct xfs_mount *mp,
+ xfs_extlen_t rbmblocks)
{
/*
* The rsum cache is initialized to the maximum value, which is
* trivially an upper bound on the maximum level with any free extents.
- * We can continue without the cache if it couldn't be allocated.
*/
mp->m_rsum_cache = kvmalloc(rbmblocks, GFP_KERNEL);
- if (mp->m_rsum_cache)
- memset(mp->m_rsum_cache, -1, rbmblocks);
- else
- xfs_warn(mp, "could not allocate realtime summary cache");
+ if (!mp->m_rsum_cache)
+ return -ENOMEM;
+ memset(mp->m_rsum_cache, -1, rbmblocks);
+ return 0;
}
/*
@@ -939,8 +938,11 @@ xfs_growfs_rt(
goto out_unlock;
rsum_cache = mp->m_rsum_cache;
- if (nrbmblocks != sbp->sb_rbmblocks)
- xfs_alloc_rsum_cache(mp, nrbmblocks);
+ if (nrbmblocks != sbp->sb_rbmblocks) {
+ error = xfs_alloc_rsum_cache(mp, nrbmblocks);
+ if (error)
+ goto out_unlock;
+ }
/*
* Allocate a new (fake) mount/sb.
@@ -1268,7 +1270,9 @@ xfs_rtmount_inodes(
if (error)
goto out_rele_summary;
- xfs_alloc_rsum_cache(mp, sbp->sb_rbmblocks);
+ error = xfs_alloc_rsum_cache(mp, sbp->sb_rbmblocks);
+ if (error)
+ goto out_rele_summary;
return 0;
out_rele_summary:
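The change in calling convention can be sketched in userspace as follows, with malloc standing in for kvmalloc. The `toy_mount` structure and function name are hypothetical. The point is only that the helper now reports -ENOMEM so callers can bail out instead of running without a cache.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the one mount field that matters here. */
struct toy_mount {
	uint8_t *rsum_cache; /* one byte per rt bitmap block */
};

/*
 * Allocate the summary cache and set every entry to the maximum value
 * (0xFF), which is trivially an upper bound on any summary level.
 * Returns 0 on success or -ENOMEM if the allocation failed.
 */
int toy_alloc_rsum_cache(struct toy_mount *mp, size_t rbmblocks)
{
	assert(mp != NULL && rbmblocks > 0);

	mp->rsum_cache = malloc(rbmblocks);
	if (!mp->rsum_cache)
		return -ENOMEM;
	memset(mp->rsum_cache, -1, rbmblocks);
	return 0;
}
```

Callers follow the pattern in the hunks above: check the return value and take the existing error path.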
* [PATCH 04/12] xfs: remove the limit argument to xfs_rtfind_back
2024-09-02 18:21 ` [PATCHSET v4.2 3/8] xfs: clean up the rtbitmap code Darrick J. Wong
` (2 preceding siblings ...)
2024-09-02 18:24 ` [PATCH 03/12] xfs: make the RT rsum_cache mandatory Darrick J. Wong
@ 2024-09-02 18:24 ` Darrick J. Wong
2024-09-02 18:25 ` [PATCH 05/12] xfs: assert a valid limit in xfs_rtfind_forw Darrick J. Wong
` (7 subsequent siblings)
11 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:24 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Christoph Hellwig <hch@lst.de>
All callers pass a 0 limit to xfs_rtfind_back, so remove the argument
and hard code it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
fs/xfs/libxfs/xfs_rtbitmap.c | 9 ++++-----
fs/xfs/libxfs/xfs_rtbitmap.h | 2 +-
fs/xfs/xfs_rtalloc.c | 2 +-
3 files changed, 6 insertions(+), 7 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_rtbitmap.c b/fs/xfs/libxfs/xfs_rtbitmap.c
index 386b672c5058..9feeefe53948 100644
--- a/fs/xfs/libxfs/xfs_rtbitmap.c
+++ b/fs/xfs/libxfs/xfs_rtbitmap.c
@@ -139,14 +139,13 @@ xfs_rtbuf_get(
}
/*
- * Searching backward from start to limit, find the first block whose
- * allocated/free state is different from start's.
+ * Searching backward from start find the first block whose allocated/free state
+ * is different from start's.
*/
int
xfs_rtfind_back(
struct xfs_rtalloc_args *args,
xfs_rtxnum_t start, /* starting rtext to look at */
- xfs_rtxnum_t limit, /* last rtext to look at */
xfs_rtxnum_t *rtx) /* out: start rtext found */
{
struct xfs_mount *mp = args->mp;
@@ -175,7 +174,7 @@ xfs_rtfind_back(
*/
word = xfs_rtx_to_rbmword(mp, start);
bit = (int)(start & (XFS_NBWORD - 1));
- len = start - limit + 1;
+ len = start + 1;
/*
* Compute match value, based on the bit at start: if 1 (free)
* then all-ones, else all-zeroes.
@@ -698,7 +697,7 @@ xfs_rtfree_range(
* We need to find the beginning and end of the extent so we can
* properly update the summary.
*/
- error = xfs_rtfind_back(args, start, 0, &preblock);
+ error = xfs_rtfind_back(args, start, &preblock);
if (error) {
return error;
}
diff --git a/fs/xfs/libxfs/xfs_rtbitmap.h b/fs/xfs/libxfs/xfs_rtbitmap.h
index 6186585f2c37..1e04f0954a0f 100644
--- a/fs/xfs/libxfs/xfs_rtbitmap.h
+++ b/fs/xfs/libxfs/xfs_rtbitmap.h
@@ -316,7 +316,7 @@ xfs_rtsummary_read_buf(
int xfs_rtcheck_range(struct xfs_rtalloc_args *args, xfs_rtxnum_t start,
xfs_rtxlen_t len, int val, xfs_rtxnum_t *new, int *stat);
int xfs_rtfind_back(struct xfs_rtalloc_args *args, xfs_rtxnum_t start,
- xfs_rtxnum_t limit, xfs_rtxnum_t *rtblock);
+ xfs_rtxnum_t *rtblock);
int xfs_rtfind_forw(struct xfs_rtalloc_args *args, xfs_rtxnum_t start,
xfs_rtxnum_t limit, xfs_rtxnum_t *rtblock);
int xfs_rtmodify_range(struct xfs_rtalloc_args *args, xfs_rtxnum_t start,
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index 26eab1b408c8..3728445b0b1c 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -142,7 +142,7 @@ xfs_rtallocate_range(
* We need to find the beginning and end of the extent so we can
* properly update the summary.
*/
- error = xfs_rtfind_back(args, start, 0, &preblock);
+ error = xfs_rtfind_back(args, start, &preblock);
if (error)
return error;
* [PATCH 05/12] xfs: assert a valid limit in xfs_rtfind_forw
2024-09-02 18:21 ` [PATCHSET v4.2 3/8] xfs: clean up the rtbitmap code Darrick J. Wong
` (3 preceding siblings ...)
2024-09-02 18:24 ` [PATCH 04/12] xfs: remove the limit argument to xfs_rtfind_back Darrick J. Wong
@ 2024-09-02 18:25 ` Darrick J. Wong
2024-09-02 18:25 ` [PATCH 06/12] xfs: add bounds checking to xfs_rt{bitmap,summary}_read_buf Darrick J. Wong
` (6 subsequent siblings)
11 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:25 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Christoph Hellwig <hch@lst.de>
Protect against developers passing stupid limits when refactoring the
RT code once again.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
fs/xfs/libxfs/xfs_rtbitmap.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/xfs/libxfs/xfs_rtbitmap.c b/fs/xfs/libxfs/xfs_rtbitmap.c
index 9feeefe53948..4de97c4e8ebd 100644
--- a/fs/xfs/libxfs/xfs_rtbitmap.c
+++ b/fs/xfs/libxfs/xfs_rtbitmap.c
@@ -315,6 +315,8 @@ xfs_rtfind_forw(
xfs_rtword_t incore;
unsigned int word; /* word number in the buffer */
+ ASSERT(start <= limit);
+
/*
* Compute and read in starting bitmap block for starting block.
*/
* [PATCH 06/12] xfs: add bounds checking to xfs_rt{bitmap,summary}_read_buf
2024-09-02 18:21 ` [PATCHSET v4.2 3/8] xfs: clean up the rtbitmap code Darrick J. Wong
` (4 preceding siblings ...)
2024-09-02 18:25 ` [PATCH 05/12] xfs: assert a valid limit in xfs_rtfind_forw Darrick J. Wong
@ 2024-09-02 18:25 ` Darrick J. Wong
2024-09-02 18:25 ` [PATCH 07/12] xfs: cleanup the calling convention for xfs_rtpick_extent Darrick J. Wong
` (5 subsequent siblings)
11 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:25 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Christoph Hellwig <hch@lst.de>
Add a corruption check for passing an invalid block number, which is a
lot easier to understand than the xfs_bmapi_read failure later on.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
fs/xfs/libxfs/xfs_rtbitmap.c | 31 ++++++++++++++++++++++++++++++-
fs/xfs/libxfs/xfs_rtbitmap.h | 22 ++--------------------
2 files changed, 32 insertions(+), 21 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_rtbitmap.c b/fs/xfs/libxfs/xfs_rtbitmap.c
index 4de97c4e8ebd..02d6668d860f 100644
--- a/fs/xfs/libxfs/xfs_rtbitmap.c
+++ b/fs/xfs/libxfs/xfs_rtbitmap.c
@@ -69,7 +69,7 @@ xfs_rtbuf_cache_relse(
* Get a buffer for the bitmap or summary file block specified.
* The buffer is returned read and locked.
*/
-int
+static int
xfs_rtbuf_get(
struct xfs_rtalloc_args *args,
xfs_fileoff_t block, /* block number in bitmap or summary */
@@ -138,6 +138,35 @@ xfs_rtbuf_get(
return 0;
}
+int
+xfs_rtbitmap_read_buf(
+ struct xfs_rtalloc_args *args,
+ xfs_fileoff_t block)
+{
+ struct xfs_mount *mp = args->mp;
+
+ if (XFS_IS_CORRUPT(mp, block >= mp->m_sb.sb_rbmblocks)) {
+ xfs_rt_mark_sick(mp, XFS_SICK_RT_BITMAP);
+ return -EFSCORRUPTED;
+ }
+
+ return xfs_rtbuf_get(args, block, 0);
+}
+
+int
+xfs_rtsummary_read_buf(
+ struct xfs_rtalloc_args *args,
+ xfs_fileoff_t block)
+{
+ struct xfs_mount *mp = args->mp;
+
+ if (XFS_IS_CORRUPT(mp, block >= XFS_B_TO_FSB(mp, mp->m_rsumsize))) {
+ xfs_rt_mark_sick(args->mp, XFS_SICK_RT_SUMMARY);
+ return -EFSCORRUPTED;
+ }
+ return xfs_rtbuf_get(args, block, 1);
+}
+
/*
* Searching backward from start find the first block whose allocated/free state
* is different from start's.
diff --git a/fs/xfs/libxfs/xfs_rtbitmap.h b/fs/xfs/libxfs/xfs_rtbitmap.h
index 1e04f0954a0f..e87e2099cff5 100644
--- a/fs/xfs/libxfs/xfs_rtbitmap.h
+++ b/fs/xfs/libxfs/xfs_rtbitmap.h
@@ -293,26 +293,8 @@ typedef int (*xfs_rtalloc_query_range_fn)(
#ifdef CONFIG_XFS_RT
void xfs_rtbuf_cache_relse(struct xfs_rtalloc_args *args);
-
-int xfs_rtbuf_get(struct xfs_rtalloc_args *args, xfs_fileoff_t block,
- int issum);
-
-static inline int
-xfs_rtbitmap_read_buf(
- struct xfs_rtalloc_args *args,
- xfs_fileoff_t block)
-{
- return xfs_rtbuf_get(args, block, 0);
-}
-
-static inline int
-xfs_rtsummary_read_buf(
- struct xfs_rtalloc_args *args,
- xfs_fileoff_t block)
-{
- return xfs_rtbuf_get(args, block, 1);
-}
-
+int xfs_rtbitmap_read_buf(struct xfs_rtalloc_args *args, xfs_fileoff_t block);
+int xfs_rtsummary_read_buf(struct xfs_rtalloc_args *args, xfs_fileoff_t block);
int xfs_rtcheck_range(struct xfs_rtalloc_args *args, xfs_rtxnum_t start,
xfs_rtxlen_t len, int val, xfs_rtxnum_t *new, int *stat);
int xfs_rtfind_back(struct xfs_rtalloc_args *args, xfs_rtxnum_t start,
* [PATCH 07/12] xfs: cleanup the calling convention for xfs_rtpick_extent
2024-09-02 18:21 ` [PATCHSET v4.2 3/8] xfs: clean up the rtbitmap code Darrick J. Wong
` (5 preceding siblings ...)
2024-09-02 18:25 ` [PATCH 06/12] xfs: add bounds checking to xfs_rt{bitmap,summary}_read_buf Darrick J. Wong
@ 2024-09-02 18:25 ` Darrick J. Wong
2024-09-02 18:25 ` [PATCH 08/12] xfs: push the calls to xfs_rtallocate_range out to xfs_bmap_rtalloc Darrick J. Wong
` (4 subsequent siblings)
11 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:25 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Christoph Hellwig <hch@lst.de>
xfs_rtpick_extent never returns an error. Do away with the error return
and directly return the picked extent instead of doing that through a
call by reference argument.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
fs/xfs/xfs_rtalloc.c | 12 ++++--------
1 file changed, 4 insertions(+), 8 deletions(-)
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index 3728445b0b1c..64ba4bcf6e29 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -1300,12 +1300,11 @@ xfs_rtunmount_inodes(
* of rtextents and the fraction.
* The fraction sequence is 0, 1/2, 1/4, 3/4, 1/8, ..., 7/8, 1/16, ...
*/
-static int
+static xfs_rtxnum_t
xfs_rtpick_extent(
xfs_mount_t *mp, /* file system mount point */
xfs_trans_t *tp, /* transaction pointer */
- xfs_rtxlen_t len, /* allocation length (rtextents) */
- xfs_rtxnum_t *pick) /* result rt extent */
+ xfs_rtxlen_t len) /* allocation length (rtextents) */
{
xfs_rtxnum_t b; /* result rtext */
int log2; /* log of sequence number */
@@ -1336,8 +1335,7 @@ xfs_rtpick_extent(
ts.tv_sec = seq + 1;
inode_set_atime_to_ts(VFS_I(mp->m_rbmip), ts);
xfs_trans_log_inode(tp, mp->m_rbmip, XFS_ILOG_CORE);
- *pick = b;
- return 0;
+ return b;
}
static void
@@ -1444,9 +1442,7 @@ xfs_bmap_rtalloc(
* If it's an allocation to an empty file at offset 0, pick an
* extent that will space things out in the rt area.
*/
- error = xfs_rtpick_extent(mp, ap->tp, ralen, &start);
- if (error)
- return error;
+ start = xfs_rtpick_extent(mp, ap->tp, ralen);
} else {
start = 0;
}
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH 08/12] xfs: push the calls to xfs_rtallocate_range out to xfs_bmap_rtalloc
2024-09-02 18:21 ` [PATCHSET v4.2 3/8] xfs: clean up the rtbitmap code Darrick J. Wong
` (6 preceding siblings ...)
2024-09-02 18:25 ` [PATCH 07/12] xfs: cleanup the calling convention for xfs_rtpick_extent Darrick J. Wong
@ 2024-09-02 18:25 ` Darrick J. Wong
2024-09-02 18:26 ` [PATCH 09/12] xfs: factor out a xfs_growfs_rt_bmblock helper Darrick J. Wong
` (3 subsequent siblings)
11 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:25 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Christoph Hellwig <hch@lst.de>
Currently the various low-level RT allocator functions call into
xfs_rtallocate_range directly, which ties them into the locking protocol
for the RT bitmap. As these helpers already return the allocated range,
lift the call to xfs_rtallocate_range into xfs_bmap_rtalloc so that it
happens as high as possible in the stack, which will simplify future
changes to the locking protocol.
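The split between "find a range" and "commit a range" can be illustrated
with a toy user-space allocator. Everything here (struct toy_rt,
toy_find_free(), toy_mark_allocated(), toy_alloc()) is hypothetical and
far simpler than the RT allocator; it is only a sketch of the layering,
assuming a plain array in place of the rt bitmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NEXT 16		/* extents tracked by the toy bitmap */

struct toy_rt {
	bool	used[TOY_NEXT];	/* true = allocated */
};

/* Low-level search: report a free run of @len extents, modify nothing. */
static int toy_find_free(const struct toy_rt *rt, size_t len, size_t *start)
{
	size_t run = 0;

	for (size_t i = 0; i < TOY_NEXT; i++) {
		run = rt->used[i] ? 0 : run + 1;
		if (len && run == len) {
			*start = i + 1 - len;
			return 0;
		}
	}
	return -1;		/* stands in for -ENOSPC */
}

/* The only function that changes the bitmap. */
static void toy_mark_allocated(struct toy_rt *rt, size_t start, size_t len)
{
	assert(start + len <= TOY_NEXT);
	for (size_t i = start; i < start + len; i++)
		rt->used[i] = true;
}

/* Top-level caller: search first, then commit the result in one place. */
static int toy_alloc(struct toy_rt *rt, size_t len, size_t *start)
{
	int error = toy_find_free(rt, len, start);

	if (error)
		return error;
	toy_mark_allocated(rt, *start, len);
	return 0;
}
```

Because only the top-level function modifies state, any locking around
the modification has a single call site to change.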
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
fs/xfs/xfs_rtalloc.c | 38 ++++++++++++++++++--------------------
1 file changed, 18 insertions(+), 20 deletions(-)
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index 64ba4bcf6e29..59e599af74f4 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -259,9 +259,9 @@ xfs_rtallocate_extent_block(
/*
* i for maxlen is all free, allocate and return that.
*/
- bestlen = maxlen;
- besti = i;
- goto allocate;
+ *len = maxlen;
+ *rtx = i;
+ return 0;
}
/*
@@ -312,12 +312,8 @@ xfs_rtallocate_extent_block(
}
/*
- * Allocate besti for bestlen & return that.
+ * Pick besti for bestlen & return that.
*/
-allocate:
- error = xfs_rtallocate_range(args, besti, bestlen);
- if (error)
- return error;
*len = bestlen;
*rtx = besti;
return 0;
@@ -371,12 +367,6 @@ xfs_rtallocate_extent_exact(
}
}
- /*
- * Allocate what we can and return it.
- */
- error = xfs_rtallocate_range(args, start, maxlen);
- if (error)
- return error;
*len = maxlen;
*rtx = start;
return 0;
@@ -429,7 +419,6 @@ xfs_rtallocate_extent_near(
if (error != -ENOSPC)
return error;
-
bbno = xfs_rtx_to_rbmblock(mp, start);
i = 0;
j = -1;
@@ -552,11 +541,11 @@ xfs_rtalloc_sumlevel(
xfs_rtxnum_t *rtx) /* out: start rtext allocated */
{
xfs_fileoff_t i; /* bitmap block number */
+ int error;
for (i = 0; i < args->mp->m_sb.sb_rbmblocks; i++) {
xfs_suminfo_t sum; /* summary information for extents */
xfs_rtxnum_t n; /* next rtext to be tried */
- int error;
error = xfs_rtget_summary(args, l, i, &sum);
if (error)
@@ -1467,9 +1456,12 @@ xfs_bmap_rtalloc(
error = xfs_rtallocate_extent_size(&args, raminlen,
ralen, &ralen, prod, &rtx);
}
- xfs_rtbuf_cache_relse(&args);
- if (error == -ENOSPC) {
+ if (error) {
+ xfs_rtbuf_cache_relse(&args);
+ if (error != -ENOSPC)
+ return error;
+
if (align > mp->m_sb.sb_rextsize) {
/*
* We previously enlarged the request length to try to
@@ -1497,14 +1489,20 @@ xfs_bmap_rtalloc(
ap->length = 0;
return 0;
}
+
+ error = xfs_rtallocate_range(&args, rtx, ralen);
if (error)
- return error;
+ goto out_release;
xfs_trans_mod_sb(ap->tp, ap->wasdel ?
XFS_TRANS_SB_RES_FREXTENTS : XFS_TRANS_SB_FREXTENTS,
-(long)ralen);
+
ap->blkno = xfs_rtx_to_rtb(mp, rtx);
ap->length = xfs_rtxlen_to_extlen(mp, ralen);
xfs_bmap_alloc_account(ap);
- return 0;
+
+out_release:
+ xfs_rtbuf_cache_relse(&args);
+ return error;
}
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH 09/12] xfs: factor out a xfs_growfs_rt_bmblock helper
2024-09-02 18:21 ` [PATCHSET v4.2 3/8] xfs: clean up the rtbitmap code Darrick J. Wong
` (7 preceding siblings ...)
2024-09-02 18:25 ` [PATCH 08/12] xfs: push the calls to xfs_rtallocate_range out to xfs_bmap_rtalloc Darrick J. Wong
@ 2024-09-02 18:26 ` Darrick J. Wong
2024-09-02 18:26 ` [PATCH 10/12] xfs: factor out a xfs_last_rt_bmblock helper Darrick J. Wong
` (2 subsequent siblings)
11 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:26 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Christoph Hellwig <hch@lst.de>
Add a helper to contain the per-rtbitmap block logic in xfs_growfs_rt.
Note that this helper now allocates a new fake mount structure for
each rtbitmap block iteration instead of reusing the memory for an
entire growfs call. Compared to all the other work done when freeing
the blocks, the overhead for this is in the noise, and it keeps the code
nicely modular.
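As a worked example of what "per-rtbitmap block" means for the sizes
involved, the step computation in the helper (nrblocks_step, then the
min() against the target size) can be reproduced in user space. The
function name rblocks_for_step() and the BITS_PER_BYTE macro are made up
for this sketch; the formula is the one visible in the diff.

```c
#include <assert.h>
#include <stdint.h>

#define BITS_PER_BYTE 8

/*
 * Number of rt blocks the filesystem should claim once bitmap blocks
 * 0..bmbno are in use: each bitmap block holds blocksize * 8 bits, each
 * bit covers one rt extent of @rextsize blocks, and the result is capped
 * at the final size requested by the caller.
 */
static uint64_t rblocks_for_step(uint64_t nrblocks, uint32_t blocksize,
				 uint32_t rextsize, uint64_t bmbno)
{
	uint64_t step = (bmbno + 1) * BITS_PER_BYTE * blocksize * rextsize;

	return nrblocks < step ? nrblocks : step;
}
```

With 4096-byte blocks and a one-block rt extent, growing to 100000 rt
blocks takes four rounds: 32768, 65536, 98304, and finally 100000.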
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
fs/xfs/xfs_rtalloc.c | 317 +++++++++++++++++++++++++-------------------------
1 file changed, 158 insertions(+), 159 deletions(-)
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index 59e599af74f4..febd039718ee 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -805,9 +805,148 @@ xfs_growfs_rt_fixup_extsize(
return error;
}
-/*
- * Visible (exported) functions.
- */
+static int
+xfs_growfs_rt_bmblock(
+ struct xfs_mount *mp,
+ xfs_rfsblock_t nrblocks,
+ xfs_agblock_t rextsize,
+ xfs_fileoff_t bmbno)
+{
+ struct xfs_inode *rbmip = mp->m_rbmip;
+ struct xfs_inode *rsumip = mp->m_rsumip;
+ struct xfs_rtalloc_args args = {
+ .mp = mp,
+ };
+ struct xfs_rtalloc_args nargs = {
+ };
+ struct xfs_mount *nmp;
+ xfs_rfsblock_t nrblocks_step;
+ xfs_rtbxlen_t freed_rtx;
+ int error;
+
+
+ nrblocks_step = (bmbno + 1) * NBBY * mp->m_sb.sb_blocksize * rextsize;
+
+ nmp = nargs.mp = kmemdup(mp, sizeof(*mp), GFP_KERNEL);
+ if (!nmp)
+ return -ENOMEM;
+
+ /*
+ * Calculate new sb and mount fields for this round.
+ */
+ nmp->m_rtxblklog = -1; /* don't use shift or masking */
+ nmp->m_sb.sb_rextsize = rextsize;
+ nmp->m_sb.sb_rbmblocks = bmbno + 1;
+ nmp->m_sb.sb_rblocks = min(nrblocks, nrblocks_step);
+ nmp->m_sb.sb_rextents = xfs_rtb_to_rtx(nmp, nmp->m_sb.sb_rblocks);
+ nmp->m_sb.sb_rextslog = xfs_compute_rextslog(nmp->m_sb.sb_rextents);
+ nmp->m_rsumlevels = nmp->m_sb.sb_rextslog + 1;
+ nmp->m_rsumsize = XFS_FSB_TO_B(mp,
+ xfs_rtsummary_blockcount(mp, nmp->m_rsumlevels,
+ nmp->m_sb.sb_rbmblocks));
+
+ /* recompute growfsrt reservation from new rsumsize */
+ xfs_trans_resv_calc(nmp, &nmp->m_resv);
+
+ error = xfs_trans_alloc(mp, &M_RES(mp)->tr_growrtfree, 0, 0, 0,
+ &args.tp);
+ if (error)
+ goto out_free;
+ nargs.tp = args.tp;
+
+ xfs_rtbitmap_lock(args.tp, mp);
+
+ /*
+ * Update the bitmap inode's size ondisk and incore. We need to update
+ * the incore size so that inode inactivation won't punch what it thinks
+ * are "posteof" blocks.
+ */
+ rbmip->i_disk_size = nmp->m_sb.sb_rbmblocks * nmp->m_sb.sb_blocksize;
+ i_size_write(VFS_I(rbmip), rbmip->i_disk_size);
+ xfs_trans_log_inode(args.tp, rbmip, XFS_ILOG_CORE);
+
+ /*
+ * Update the summary inode's size. We need to update the incore size
+ * so that inode inactivation won't punch what it thinks are "posteof"
+ * blocks.
+ */
+ rsumip->i_disk_size = nmp->m_rsumsize;
+ i_size_write(VFS_I(rsumip), rsumip->i_disk_size);
+ xfs_trans_log_inode(args.tp, rsumip, XFS_ILOG_CORE);
+
+ /*
+ * Copy summary data from old to new sizes when the real size (not
+ * block-aligned) changes.
+ */
+ if (mp->m_sb.sb_rbmblocks != nmp->m_sb.sb_rbmblocks ||
+ mp->m_rsumlevels != nmp->m_rsumlevels) {
+ error = xfs_rtcopy_summary(&args, &nargs);
+ if (error)
+ goto out_cancel;
+ }
+
+ /*
+ * Update superblock fields.
+ */
+ if (nmp->m_sb.sb_rextsize != mp->m_sb.sb_rextsize)
+ xfs_trans_mod_sb(args.tp, XFS_TRANS_SB_REXTSIZE,
+ nmp->m_sb.sb_rextsize - mp->m_sb.sb_rextsize);
+ if (nmp->m_sb.sb_rbmblocks != mp->m_sb.sb_rbmblocks)
+ xfs_trans_mod_sb(args.tp, XFS_TRANS_SB_RBMBLOCKS,
+ nmp->m_sb.sb_rbmblocks - mp->m_sb.sb_rbmblocks);
+ if (nmp->m_sb.sb_rblocks != mp->m_sb.sb_rblocks)
+ xfs_trans_mod_sb(args.tp, XFS_TRANS_SB_RBLOCKS,
+ nmp->m_sb.sb_rblocks - mp->m_sb.sb_rblocks);
+ if (nmp->m_sb.sb_rextents != mp->m_sb.sb_rextents)
+ xfs_trans_mod_sb(args.tp, XFS_TRANS_SB_REXTENTS,
+ nmp->m_sb.sb_rextents - mp->m_sb.sb_rextents);
+ if (nmp->m_sb.sb_rextslog != mp->m_sb.sb_rextslog)
+ xfs_trans_mod_sb(args.tp, XFS_TRANS_SB_REXTSLOG,
+ nmp->m_sb.sb_rextslog - mp->m_sb.sb_rextslog);
+
+ /*
+ * Free the new extent.
+ */
+ freed_rtx = nmp->m_sb.sb_rextents - mp->m_sb.sb_rextents;
+ error = xfs_rtfree_range(&nargs, mp->m_sb.sb_rextents, freed_rtx);
+ xfs_rtbuf_cache_relse(&nargs);
+ if (error)
+ goto out_cancel;
+
+ /*
+ * Mark more blocks free in the superblock.
+ */
+ xfs_trans_mod_sb(args.tp, XFS_TRANS_SB_FREXTENTS, freed_rtx);
+
+ /*
+ * Update mp values into the real mp structure.
+ */
+ mp->m_rsumlevels = nmp->m_rsumlevels;
+ mp->m_rsumsize = nmp->m_rsumsize;
+
+ /*
+ * Recompute the growfsrt reservation from the new rsumsize.
+ */
+ xfs_trans_resv_calc(mp, &mp->m_resv);
+
+ error = xfs_trans_commit(args.tp);
+ if (error)
+ goto out_free;
+
+ /*
+ * Ensure the mount RT feature flag is now set.
+ */
+ mp->m_features |= XFS_FEAT_REALTIME;
+
+ kfree(nmp);
+ return 0;
+
+out_cancel:
+ xfs_trans_cancel(args.tp);
+out_free:
+ kfree(nmp);
+ return error;
+}
/*
* Grow the realtime area of the filesystem.
@@ -820,23 +959,14 @@ xfs_growfs_rt(
xfs_fileoff_t bmbno; /* bitmap block number */
struct xfs_buf *bp; /* temporary buffer */
int error; /* error return value */
- xfs_mount_t *nmp; /* new (fake) mount structure */
- xfs_rfsblock_t nrblocks; /* new number of realtime blocks */
xfs_extlen_t nrbmblocks; /* new number of rt bitmap blocks */
xfs_rtxnum_t nrextents; /* new number of realtime extents */
- uint8_t nrextslog; /* new log2 of sb_rextents */
xfs_extlen_t nrsumblocks; /* new number of summary blocks */
- uint nrsumlevels; /* new rt summary levels */
- uint nrsumsize; /* new size of rt summary, bytes */
- xfs_sb_t *nsbp; /* new superblock */
xfs_extlen_t rbmblocks; /* current number of rt bitmap blocks */
xfs_extlen_t rsumblocks; /* current number of rt summary blks */
- xfs_sb_t *sbp; /* old superblock */
uint8_t *rsum_cache; /* old summary cache */
xfs_agblock_t old_rextsize = mp->m_sb.sb_rextsize;
- sbp = &mp->m_sb;
-
if (!capable(CAP_SYS_ADMIN))
return -EPERM;
@@ -855,11 +985,10 @@ xfs_growfs_rt(
goto out_unlock;
/* Shrink not supported. */
- if (in->newblocks <= sbp->sb_rblocks)
+ if (in->newblocks <= mp->m_sb.sb_rblocks)
goto out_unlock;
-
/* Can only change rt extent size when adding rt volume. */
- if (sbp->sb_rblocks > 0 && in->extsize != sbp->sb_rextsize)
+ if (mp->m_sb.sb_rblocks > 0 && in->extsize != mp->m_sb.sb_rextsize)
goto out_unlock;
/* Range check the extent size. */
@@ -872,15 +1001,14 @@ xfs_growfs_rt(
if (xfs_has_rmapbt(mp) || xfs_has_reflink(mp) || xfs_has_quota(mp))
goto out_unlock;
- nrblocks = in->newblocks;
- error = xfs_sb_validate_fsb_count(sbp, nrblocks);
+ error = xfs_sb_validate_fsb_count(&mp->m_sb, in->newblocks);
if (error)
goto out_unlock;
/*
* Read in the last block of the device, make sure it exists.
*/
error = xfs_buf_read_uncached(mp->m_rtdev_targp,
- XFS_FSB_TO_BB(mp, nrblocks - 1),
+ XFS_FSB_TO_BB(mp, in->newblocks - 1),
XFS_FSB_TO_BB(mp, 1), 0, &bp, NULL);
if (error)
goto out_unlock;
@@ -889,17 +1017,15 @@ xfs_growfs_rt(
/*
* Calculate new parameters. These are the final values to be reached.
*/
- nrextents = nrblocks;
- do_div(nrextents, in->extsize);
+ nrextents = div_u64(in->newblocks, in->extsize);
if (nrextents == 0) {
error = -EINVAL;
goto out_unlock;
}
nrbmblocks = xfs_rtbitmap_blockcount(mp, nrextents);
- nrextslog = xfs_compute_rextslog(nrextents);
- nrsumlevels = nrextslog + 1;
- nrsumblocks = xfs_rtsummary_blockcount(mp, nrsumlevels, nrbmblocks);
- nrsumsize = XFS_FSB_TO_B(mp, nrsumblocks);
+ nrsumblocks = xfs_rtsummary_blockcount(mp,
+ xfs_compute_rextslog(nrextents) + 1, nrbmblocks);
+
/*
* New summary size can't be more than half the size of
* the log. This prevents us from getting a log overflow,
@@ -927,149 +1053,27 @@ xfs_growfs_rt(
goto out_unlock;
rsum_cache = mp->m_rsum_cache;
- if (nrbmblocks != sbp->sb_rbmblocks) {
+ if (nrbmblocks != mp->m_sb.sb_rbmblocks) {
error = xfs_alloc_rsum_cache(mp, nrbmblocks);
if (error)
goto out_unlock;
}
- /*
- * Allocate a new (fake) mount/sb.
- */
- nmp = kmalloc(sizeof(*nmp), GFP_KERNEL | __GFP_NOFAIL);
/*
* Loop over the bitmap blocks.
* We will do everything one bitmap block at a time.
* Skip the current block if it is exactly full.
* This also deals with the case where there were no rtextents before.
*/
- for (bmbno = sbp->sb_rbmblocks -
- ((sbp->sb_rextents & ((1 << mp->m_blkbit_log) - 1)) != 0);
- bmbno < nrbmblocks;
- bmbno++) {
- struct xfs_rtalloc_args args = {
- .mp = mp,
- };
- struct xfs_rtalloc_args nargs = {
- .mp = nmp,
- };
- struct xfs_trans *tp;
- xfs_rfsblock_t nrblocks_step;
-
- *nmp = *mp;
- nsbp = &nmp->m_sb;
- /*
- * Calculate new sb and mount fields for this round.
- */
- nsbp->sb_rextsize = in->extsize;
- nmp->m_rtxblklog = -1; /* don't use shift or masking */
- nsbp->sb_rbmblocks = bmbno + 1;
- nrblocks_step = (bmbno + 1) * NBBY * nsbp->sb_blocksize *
- nsbp->sb_rextsize;
- nsbp->sb_rblocks = min(nrblocks, nrblocks_step);
- nsbp->sb_rextents = xfs_rtb_to_rtx(nmp, nsbp->sb_rblocks);
- ASSERT(nsbp->sb_rextents != 0);
- nsbp->sb_rextslog = xfs_compute_rextslog(nsbp->sb_rextents);
- nrsumlevels = nmp->m_rsumlevels = nsbp->sb_rextslog + 1;
- nrsumblocks = xfs_rtsummary_blockcount(mp, nrsumlevels,
- nsbp->sb_rbmblocks);
- nmp->m_rsumsize = nrsumsize = XFS_FSB_TO_B(mp, nrsumblocks);
- /* recompute growfsrt reservation from new rsumsize */
- xfs_trans_resv_calc(nmp, &nmp->m_resv);
-
- /*
- * Start a transaction, get the log reservation.
- */
- error = xfs_trans_alloc(mp, &M_RES(mp)->tr_growrtfree, 0, 0, 0,
- &tp);
+ bmbno = mp->m_sb.sb_rbmblocks;
+ if (xfs_rtx_to_rbmword(mp, mp->m_sb.sb_rextents) != 0)
+ bmbno--;
+ for (; bmbno < nrbmblocks; bmbno++) {
+ error = xfs_growfs_rt_bmblock(mp, in->newblocks, in->extsize,
+ bmbno);
if (error)
- break;
- args.tp = tp;
- nargs.tp = tp;
-
- /*
- * Lock out other callers by grabbing the bitmap and summary
- * inode locks and joining them to the transaction.
- */
- xfs_rtbitmap_lock(tp, mp);
- /*
- * Update the bitmap inode's size ondisk and incore. We need
- * to update the incore size so that inode inactivation won't
- * punch what it thinks are "posteof" blocks.
- */
- mp->m_rbmip->i_disk_size =
- nsbp->sb_rbmblocks * nsbp->sb_blocksize;
- i_size_write(VFS_I(mp->m_rbmip), mp->m_rbmip->i_disk_size);
- xfs_trans_log_inode(tp, mp->m_rbmip, XFS_ILOG_CORE);
- /*
- * Update the summary inode's size. We need to update the
- * incore size so that inode inactivation won't punch what it
- * thinks are "posteof" blocks.
- */
- mp->m_rsumip->i_disk_size = nmp->m_rsumsize;
- i_size_write(VFS_I(mp->m_rsumip), mp->m_rsumip->i_disk_size);
- xfs_trans_log_inode(tp, mp->m_rsumip, XFS_ILOG_CORE);
- /*
- * Copy summary data from old to new sizes.
- * Do this when the real size (not block-aligned) changes.
- */
- if (sbp->sb_rbmblocks != nsbp->sb_rbmblocks ||
- mp->m_rsumlevels != nmp->m_rsumlevels) {
- error = xfs_rtcopy_summary(&args, &nargs);
- if (error)
- goto error_cancel;
- }
- /*
- * Update superblock fields.
- */
- if (nsbp->sb_rextsize != sbp->sb_rextsize)
- xfs_trans_mod_sb(tp, XFS_TRANS_SB_REXTSIZE,
- nsbp->sb_rextsize - sbp->sb_rextsize);
- if (nsbp->sb_rbmblocks != sbp->sb_rbmblocks)
- xfs_trans_mod_sb(tp, XFS_TRANS_SB_RBMBLOCKS,
- nsbp->sb_rbmblocks - sbp->sb_rbmblocks);
- if (nsbp->sb_rblocks != sbp->sb_rblocks)
- xfs_trans_mod_sb(tp, XFS_TRANS_SB_RBLOCKS,
- nsbp->sb_rblocks - sbp->sb_rblocks);
- if (nsbp->sb_rextents != sbp->sb_rextents)
- xfs_trans_mod_sb(tp, XFS_TRANS_SB_REXTENTS,
- nsbp->sb_rextents - sbp->sb_rextents);
- if (nsbp->sb_rextslog != sbp->sb_rextslog)
- xfs_trans_mod_sb(tp, XFS_TRANS_SB_REXTSLOG,
- nsbp->sb_rextslog - sbp->sb_rextslog);
- /*
- * Free new extent.
- */
- error = xfs_rtfree_range(&nargs, sbp->sb_rextents,
- nsbp->sb_rextents - sbp->sb_rextents);
- xfs_rtbuf_cache_relse(&nargs);
- if (error) {
-error_cancel:
- xfs_trans_cancel(tp);
- break;
- }
- /*
- * Mark more blocks free in the superblock.
- */
- xfs_trans_mod_sb(tp, XFS_TRANS_SB_FREXTENTS,
- nsbp->sb_rextents - sbp->sb_rextents);
- /*
- * Update mp values into the real mp structure.
- */
- mp->m_rsumlevels = nrsumlevels;
- mp->m_rsumsize = nrsumsize;
- /* recompute growfsrt reservation from new rsumsize */
- xfs_trans_resv_calc(mp, &mp->m_resv);
-
- error = xfs_trans_commit(tp);
- if (error)
- break;
-
- /* Ensure the mount RT feature flag is now set. */
- mp->m_features |= XFS_FEAT_REALTIME;
+ goto out_free;
}
- if (error)
- goto out_free;
if (old_rextsize != in->extsize) {
error = xfs_growfs_rt_fixup_extsize(mp);
@@ -1081,11 +1085,6 @@ xfs_growfs_rt(
error = xfs_update_secondary_sbs(mp);
out_free:
- /*
- * Free the fake mp structure.
- */
- kfree(nmp);
-
/*
* If we had to allocate a new rsum_cache, we either need to free the
* old one (if we succeeded) or free the new one and restore the old one
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH 10/12] xfs: factor out a xfs_last_rt_bmblock helper
2024-09-02 18:21 ` [PATCHSET v4.2 3/8] xfs: clean up the rtbitmap code Darrick J. Wong
` (8 preceding siblings ...)
2024-09-02 18:26 ` [PATCH 09/12] xfs: factor out a xfs_growfs_rt_bmblock helper Darrick J. Wong
@ 2024-09-02 18:26 ` Darrick J. Wong
2024-09-02 18:26 ` [PATCH 11/12] xfs: factor out rtbitmap/summary initialization helpers Darrick J. Wong
2024-09-02 18:27 ` [PATCH 12/12] xfs: push transaction join out of xfs_rtbitmap_lock and xfs_rtgroup_lock Darrick J. Wong
11 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:26 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Christoph Hellwig <hch@lst.de>
Add a helper to calculate the last currently used rt bitmap block to
better structure the growfs code and prepare for future changes to it.
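A user-space sketch of the idea, for illustration only: growfs restarts
at the last bitmap block if that block still has unused bits, and at the
next block otherwise. The function last_rt_bmblock() and its flat
parameters are hypothetical. The sketch tests the bit offset within the
block, as the open-coded loop header did before this series (a mask built
from m_blkbit_log); the kernel helper calls xfs_rtx_to_rbmword() instead,
which is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * @rbmblocks:  bitmap blocks currently in use
 * @rextents:   rt extents currently tracked
 * @blkbit_log: log2 of the number of bits in one bitmap block
 */
static uint64_t last_rt_bmblock(uint64_t rbmblocks, uint64_t rextents,
				unsigned int blkbit_log)
{
	uint64_t mask = (1ULL << blkbit_log) - 1;
	uint64_t bmbno = rbmblocks;

	/* A partially filled last block still has bits to initialize. */
	if (rextents & mask) {
		assert(rbmblocks > 0);
		bmbno--;
	}
	return bmbno;
}
```

With 4096-byte blocks (blkbit_log of 15), an empty rt volume starts at
block 0, and a volume whose single bitmap block is exactly full starts at
block 1.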
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
fs/xfs/xfs_rtalloc.c | 29 +++++++++++++++++++----------
1 file changed, 19 insertions(+), 10 deletions(-)
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index febd039718ee..45a0d29949ea 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -948,6 +948,23 @@ xfs_growfs_rt_bmblock(
return error;
}
+/*
+ * Calculate the last rbmblock currently used.
+ *
+ * This also deals with the case where there were no rtextents before.
+ */
+static xfs_fileoff_t
+xfs_last_rt_bmblock(
+ struct xfs_mount *mp)
+{
+ xfs_fileoff_t bmbno = mp->m_sb.sb_rbmblocks;
+
+ /* Skip the current block if it is exactly full. */
+ if (xfs_rtx_to_rbmword(mp, mp->m_sb.sb_rextents) != 0)
+ bmbno--;
+ return bmbno;
+}
+
/*
* Grow the realtime area of the filesystem.
*/
@@ -1059,16 +1076,8 @@ xfs_growfs_rt(
goto out_unlock;
}
- /*
- * Loop over the bitmap blocks.
- * We will do everything one bitmap block at a time.
- * Skip the current block if it is exactly full.
- * This also deals with the case where there were no rtextents before.
- */
- bmbno = mp->m_sb.sb_rbmblocks;
- if (xfs_rtx_to_rbmword(mp, mp->m_sb.sb_rextents) != 0)
- bmbno--;
- for (; bmbno < nrbmblocks; bmbno++) {
+ /* Initialize the free space bitmap one bitmap block at a time. */
+ for (bmbno = xfs_last_rt_bmblock(mp); bmbno < nrbmblocks; bmbno++) {
error = xfs_growfs_rt_bmblock(mp, in->newblocks, in->extsize,
bmbno);
if (error)
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH 11/12] xfs: factor out rtbitmap/summary initialization helpers
2024-09-02 18:21 ` [PATCHSET v4.2 3/8] xfs: clean up the rtbitmap code Darrick J. Wong
` (9 preceding siblings ...)
2024-09-02 18:26 ` [PATCH 10/12] xfs: factor out a xfs_last_rt_bmblock helper Darrick J. Wong
@ 2024-09-02 18:26 ` Darrick J. Wong
2024-09-02 18:27 ` [PATCH 12/12] xfs: push transaction join out of xfs_rtbitmap_lock and xfs_rtgroup_lock Darrick J. Wong
11 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:26 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Christoph Hellwig <hch@lst.de>
Add helpers to libxfs that growfs and mkfs can share for initializing
the rtbitmap and summary. By passing the optional data pointer, repair
can also use them for rebuilding both files. This will become even more
useful when the rtgroups feature adds a metadata header to each block,
which means even more shared code.
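The role of the optional data pointer can be sketched in user space. The
types and names below (struct toy_file, toy_init_block(),
toy_init_blocks()) are hypothetical, the block size is tiny on purpose,
and the extent allocation and transaction handling of the real helpers
are left out entirely; the sketch only shows the "copy from data if
given, otherwise zero, and advance data one block at a time" behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_BLOCKSIZE	8	/* bytes per block in the toy file */
#define TOY_NBLOCKS	6

struct toy_file {
	unsigned char	blocks[TOY_NBLOCKS][TOY_BLOCKSIZE];
};

/* Fill one block from @data, or with zeroes when @data is NULL. */
static void toy_init_block(struct toy_file *f, size_t bno, const void *data)
{
	if (data)
		memcpy(f->blocks[bno], data, TOY_BLOCKSIZE);
	else
		memset(f->blocks[bno], 0, TOY_BLOCKSIZE);
}

/*
 * Initialize blocks [offset, end).  @data, if set, must be a contiguous
 * buffer covering every block in the range.
 */
static int toy_init_blocks(struct toy_file *f, size_t offset, size_t end,
			   const void *data)
{
	const unsigned char *p = data;

	if (offset > end || end > TOY_NBLOCKS)
		return -1;
	for (; offset < end; offset++) {
		toy_init_block(f, offset, p);
		if (p)
			p += TOY_BLOCKSIZE;	/* next block's contents */
	}
	return 0;
}
```

A growfs-style caller passes NULL to get zeroed blocks, while a
repair-style caller passes the rebuilt contents.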
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: minor documentation and data advance tweaks]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
fs/xfs/libxfs/xfs_rtbitmap.c | 126 ++++++++++++++++++++++++++++++++++++++++++
fs/xfs/libxfs/xfs_rtbitmap.h | 3 +
fs/xfs/xfs_rtalloc.c | 121 +---------------------------------------
3 files changed, 133 insertions(+), 117 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_rtbitmap.c b/fs/xfs/libxfs/xfs_rtbitmap.c
index 02d6668d860f..715d2c54ce02 100644
--- a/fs/xfs/libxfs/xfs_rtbitmap.c
+++ b/fs/xfs/libxfs/xfs_rtbitmap.c
@@ -13,6 +13,8 @@
#include "xfs_mount.h"
#include "xfs_inode.h"
#include "xfs_bmap.h"
+#include "xfs_bmap_btree.h"
+#include "xfs_trans_space.h"
#include "xfs_trans.h"
#include "xfs_rtalloc.h"
#include "xfs_error.h"
@@ -1255,3 +1257,127 @@ xfs_rtbitmap_unlock_shared(
if (rbmlock_flags & XFS_RBMLOCK_BITMAP)
xfs_iunlock(mp->m_rbmip, XFS_ILOCK_SHARED | XFS_ILOCK_RTBITMAP);
}
+
+static int
+xfs_rtfile_alloc_blocks(
+ struct xfs_inode *ip,
+ xfs_fileoff_t offset_fsb,
+ xfs_filblks_t count_fsb,
+ struct xfs_bmbt_irec *map)
+{
+ struct xfs_mount *mp = ip->i_mount;
+ struct xfs_trans *tp;
+ int nmap = 1;
+ int error;
+
+ error = xfs_trans_alloc(mp, &M_RES(mp)->tr_growrtalloc,
+ XFS_GROWFSRT_SPACE_RES(mp, count_fsb), 0, 0, &tp);
+ if (error)
+ return error;
+
+ xfs_ilock(ip, XFS_ILOCK_EXCL);
+ xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL);
+
+ error = xfs_iext_count_extend(tp, ip, XFS_DATA_FORK,
+ XFS_IEXT_ADD_NOSPLIT_CNT);
+ if (error)
+ goto out_trans_cancel;
+
+ error = xfs_bmapi_write(tp, ip, offset_fsb, count_fsb,
+ XFS_BMAPI_METADATA, 0, map, &nmap);
+ if (error)
+ goto out_trans_cancel;
+
+ return xfs_trans_commit(tp);
+
+out_trans_cancel:
+ xfs_trans_cancel(tp);
+ return error;
+}
+
+/* Get a buffer for the block. */
+static int
+xfs_rtfile_initialize_block(
+ struct xfs_inode *ip,
+ xfs_fsblock_t fsbno,
+ void *data)
+{
+ struct xfs_mount *mp = ip->i_mount;
+ struct xfs_trans *tp;
+ struct xfs_buf *bp;
+ const size_t copylen = mp->m_blockwsize << XFS_WORDLOG;
+ enum xfs_blft buf_type;
+ int error;
+
+ if (ip == mp->m_rsumip)
+ buf_type = XFS_BLFT_RTSUMMARY_BUF;
+ else
+ buf_type = XFS_BLFT_RTBITMAP_BUF;
+
+ error = xfs_trans_alloc(mp, &M_RES(mp)->tr_growrtzero, 0, 0, 0, &tp);
+ if (error)
+ return error;
+ xfs_ilock(ip, XFS_ILOCK_EXCL);
+ xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL);
+
+ error = xfs_trans_get_buf(tp, mp->m_ddev_targp,
+ XFS_FSB_TO_DADDR(mp, fsbno), mp->m_bsize, 0, &bp);
+ if (error) {
+ xfs_trans_cancel(tp);
+ return error;
+ }
+
+ xfs_trans_buf_set_type(tp, bp, buf_type);
+ bp->b_ops = &xfs_rtbuf_ops;
+ if (data)
+ memcpy(bp->b_addr, data, copylen);
+ else
+ memset(bp->b_addr, 0, copylen);
+ xfs_trans_log_buf(tp, bp, 0, mp->m_sb.sb_blocksize - 1);
+ return xfs_trans_commit(tp);
+}
+
+/*
+ * Allocate space to the bitmap or summary file, and zero it, for growfs.
+ * @data must be a contiguous buffer large enough to fill all blocks in the
+ * file; or NULL to initialize the contents to zeroes.
+ */
+int
+xfs_rtfile_initialize_blocks(
+ struct xfs_inode *ip, /* inode (bitmap/summary) */
+ xfs_fileoff_t offset_fsb, /* offset to start from */
+ xfs_fileoff_t end_fsb, /* offset to allocate to */
+ void *data) /* data to fill the blocks */
+{
+ struct xfs_mount *mp = ip->i_mount;
+ const size_t copylen = mp->m_blockwsize << XFS_WORDLOG;
+
+ while (offset_fsb < end_fsb) {
+ struct xfs_bmbt_irec map;
+ xfs_filblks_t i;
+ int error;
+
+ error = xfs_rtfile_alloc_blocks(ip, offset_fsb,
+ end_fsb - offset_fsb, &map);
+ if (error)
+ return error;
+
+ /*
+ * Now we need to clear the allocated blocks.
+ *
+ * Do this one block per transaction, to keep it simple.
+ */
+ for (i = 0; i < map.br_blockcount; i++) {
+ error = xfs_rtfile_initialize_block(ip,
+ map.br_startblock + i, data);
+ if (error)
+ return error;
+ if (data)
+ data += copylen;
+ }
+
+ offset_fsb = map.br_startoff + map.br_blockcount;
+ }
+
+ return 0;
+}
diff --git a/fs/xfs/libxfs/xfs_rtbitmap.h b/fs/xfs/libxfs/xfs_rtbitmap.h
index e87e2099cff5..0d5ab5e2cb6a 100644
--- a/fs/xfs/libxfs/xfs_rtbitmap.h
+++ b/fs/xfs/libxfs/xfs_rtbitmap.h
@@ -343,6 +343,9 @@ xfs_filblks_t xfs_rtsummary_blockcount(struct xfs_mount *mp,
unsigned long long xfs_rtsummary_wordcount(struct xfs_mount *mp,
unsigned int rsumlevels, xfs_extlen_t rbmblocks);
+int xfs_rtfile_initialize_blocks(struct xfs_inode *ip,
+ xfs_fileoff_t offset_fsb, xfs_fileoff_t end_fsb, void *data);
+
void xfs_rtbitmap_lock(struct xfs_trans *tp, struct xfs_mount *mp);
void xfs_rtbitmap_unlock(struct xfs_mount *mp);
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index 45a0d29949ea..114807cd80ba 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -641,121 +641,6 @@ xfs_rtallocate_extent_size(
return -ENOSPC;
}
-/*
- * Allocate space to the bitmap or summary file, and zero it, for growfs.
- */
-STATIC int
-xfs_growfs_rt_alloc(
- struct xfs_mount *mp, /* file system mount point */
- xfs_extlen_t oblocks, /* old count of blocks */
- xfs_extlen_t nblocks, /* new count of blocks */
- struct xfs_inode *ip) /* inode (bitmap/summary) */
-{
- xfs_fileoff_t bno; /* block number in file */
- struct xfs_buf *bp; /* temporary buffer for zeroing */
- xfs_daddr_t d; /* disk block address */
- int error; /* error return value */
- xfs_fsblock_t fsbno; /* filesystem block for bno */
- struct xfs_bmbt_irec map; /* block map output */
- int nmap; /* number of block maps */
- int resblks; /* space reservation */
- enum xfs_blft buf_type;
- struct xfs_trans *tp;
-
- if (ip == mp->m_rsumip)
- buf_type = XFS_BLFT_RTSUMMARY_BUF;
- else
- buf_type = XFS_BLFT_RTBITMAP_BUF;
-
- /*
- * Allocate space to the file, as necessary.
- */
- while (oblocks < nblocks) {
- resblks = XFS_GROWFSRT_SPACE_RES(mp, nblocks - oblocks);
- /*
- * Reserve space & log for one extent added to the file.
- */
- error = xfs_trans_alloc(mp, &M_RES(mp)->tr_growrtalloc, resblks,
- 0, 0, &tp);
- if (error)
- return error;
- /*
- * Lock the inode.
- */
- xfs_ilock(ip, XFS_ILOCK_EXCL);
- xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL);
-
- error = xfs_iext_count_extend(tp, ip, XFS_DATA_FORK,
- XFS_IEXT_ADD_NOSPLIT_CNT);
- if (error)
- goto out_trans_cancel;
-
- /*
- * Allocate blocks to the bitmap file.
- */
- nmap = 1;
- error = xfs_bmapi_write(tp, ip, oblocks, nblocks - oblocks,
- XFS_BMAPI_METADATA, 0, &map, &nmap);
- if (error)
- goto out_trans_cancel;
- /*
- * Free any blocks freed up in the transaction, then commit.
- */
- error = xfs_trans_commit(tp);
- if (error)
- return error;
- /*
- * Now we need to clear the allocated blocks.
- * Do this one block per transaction, to keep it simple.
- */
- for (bno = map.br_startoff, fsbno = map.br_startblock;
- bno < map.br_startoff + map.br_blockcount;
- bno++, fsbno++) {
- /*
- * Reserve log for one block zeroing.
- */
- error = xfs_trans_alloc(mp, &M_RES(mp)->tr_growrtzero,
- 0, 0, 0, &tp);
- if (error)
- return error;
- /*
- * Lock the bitmap inode.
- */
- xfs_ilock(ip, XFS_ILOCK_EXCL);
- xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL);
- /*
- * Get a buffer for the block.
- */
- d = XFS_FSB_TO_DADDR(mp, fsbno);
- error = xfs_trans_get_buf(tp, mp->m_ddev_targp, d,
- mp->m_bsize, 0, &bp);
- if (error)
- goto out_trans_cancel;
-
- xfs_trans_buf_set_type(tp, bp, buf_type);
- bp->b_ops = &xfs_rtbuf_ops;
- memset(bp->b_addr, 0, mp->m_sb.sb_blocksize);
- xfs_trans_log_buf(tp, bp, 0, mp->m_sb.sb_blocksize - 1);
- /*
- * Commit the transaction.
- */
- error = xfs_trans_commit(tp);
- if (error)
- return error;
- }
- /*
- * Go on to the next extent, if any.
- */
- oblocks = map.br_startoff + map.br_blockcount;
- }
-
- return 0;
-
-out_trans_cancel:
- xfs_trans_cancel(tp);
- return error;
-}
-
static int
xfs_alloc_rsum_cache(
struct xfs_mount *mp,
@@ -1062,10 +947,12 @@ xfs_growfs_rt(
/*
* Allocate space to the bitmap and summary files, as necessary.
*/
- error = xfs_growfs_rt_alloc(mp, rbmblocks, nrbmblocks, mp->m_rbmip);
+ error = xfs_rtfile_initialize_blocks(mp->m_rbmip, rbmblocks,
+ nrbmblocks, NULL);
if (error)
goto out_unlock;
- error = xfs_growfs_rt_alloc(mp, rsumblocks, nrsumblocks, mp->m_rsumip);
+ error = xfs_rtfile_initialize_blocks(mp->m_rsumip, rsumblocks,
+ nrsumblocks, NULL);
if (error)
goto out_unlock;
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH 12/12] xfs: push transaction join out of xfs_rtbitmap_lock and xfs_rtgroup_lock
2024-09-02 18:21 ` [PATCHSET v4.2 3/8] xfs: clean up the rtbitmap code Darrick J. Wong
` (10 preceding siblings ...)
2024-09-02 18:26 ` [PATCH 11/12] xfs: factor out rtbitmap/summary initialization helpers Darrick J. Wong
@ 2024-09-02 18:27 ` Darrick J. Wong
11 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:27 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Christoph Hellwig <hch@lst.de>
To prepare for being able to join an already-locked rtbitmap inode to a
transaction, split out separate helpers for joining the transaction from
the locking helpers.
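The benefit of separating the two steps can be shown with a toy model.
All names below (struct toy_inode, toy_rtbitmap_lock(),
toy_rtbitmap_trans_join(), toy_trans_commit()) are hypothetical and the
"locks" are plain flags; the sketch only assumes, as the commit message
says, that joining hands the unlock responsibility to the transaction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_inode {
	bool	locked;
	bool	joined;		/* transaction will unlock on commit */
};

struct toy_mount {
	struct toy_inode	rbm;	/* bitmap inode */
	struct toy_inode	rsum;	/* summary inode */
};

struct toy_trans {
	struct toy_mount	*mp;
};

/* Step 1: take both locks; no transaction is needed. */
static void toy_rtbitmap_lock(struct toy_mount *mp)
{
	assert(!mp->rbm.locked && !mp->rsum.locked);
	mp->rbm.locked = true;
	mp->rsum.locked = true;
}

/* Step 2: hand ownership of the already held locks to the transaction. */
static void toy_rtbitmap_trans_join(struct toy_trans *tp)
{
	assert(tp->mp->rbm.locked && tp->mp->rsum.locked);
	tp->mp->rbm.joined = true;
	tp->mp->rsum.joined = true;
}

/* Commit releases every joined inode. */
static void toy_trans_commit(struct toy_trans *tp)
{
	struct toy_inode *inodes[] = { &tp->mp->rbm, &tp->mp->rsum };

	for (size_t i = 0; i < 2; i++) {
		if (inodes[i]->joined) {
			inodes[i]->locked = false;
			inodes[i]->joined = false;
		}
	}
}
```

A caller that takes the locks early can now join them to a transaction
later, or to none at all and unlock by hand.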
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
fs/xfs/libxfs/xfs_bmap.c | 3 ++-
fs/xfs/libxfs/xfs_rtbitmap.c | 24 +++++++++++++-----------
fs/xfs/libxfs/xfs_rtbitmap.h | 6 ++++--
fs/xfs/xfs_rtalloc.c | 6 ++++--
4 files changed, 23 insertions(+), 16 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 7df74c35d9f9..112c7ee2d493 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -5376,7 +5376,8 @@ xfs_bmap_del_extent_real(
*/
if (!(tp->t_flags & XFS_TRANS_RTBITMAP_LOCKED)) {
tp->t_flags |= XFS_TRANS_RTBITMAP_LOCKED;
- xfs_rtbitmap_lock(tp, mp);
+ xfs_rtbitmap_lock(mp);
+ xfs_rtbitmap_trans_join(tp);
}
error = xfs_rtfree_blocks(tp, del->br_startblock,
del->br_blockcount);
diff --git a/fs/xfs/libxfs/xfs_rtbitmap.c b/fs/xfs/libxfs/xfs_rtbitmap.c
index 715d2c54ce02..d7c731aeee12 100644
--- a/fs/xfs/libxfs/xfs_rtbitmap.c
+++ b/fs/xfs/libxfs/xfs_rtbitmap.c
@@ -1201,23 +1201,25 @@ xfs_rtsummary_wordcount(
return XFS_FSB_TO_B(mp, blocks) >> XFS_WORDLOG;
}
-/*
- * Lock both realtime free space metadata inodes for a freespace update. If a
- * transaction is given, the inodes will be joined to the transaction and the
- * ILOCKs will be released on transaction commit.
- */
+/* Lock both realtime free space metadata inodes for a freespace update. */
void
xfs_rtbitmap_lock(
- struct xfs_trans *tp,
struct xfs_mount *mp)
{
xfs_ilock(mp->m_rbmip, XFS_ILOCK_EXCL | XFS_ILOCK_RTBITMAP);
- if (tp)
- xfs_trans_ijoin(tp, mp->m_rbmip, XFS_ILOCK_EXCL);
-
xfs_ilock(mp->m_rsumip, XFS_ILOCK_EXCL | XFS_ILOCK_RTSUM);
- if (tp)
- xfs_trans_ijoin(tp, mp->m_rsumip, XFS_ILOCK_EXCL);
+}
+
+/*
+ * Join both realtime free space metadata inodes to the transaction. The
+ * ILOCKs will be released on transaction commit.
+ */
+void
+xfs_rtbitmap_trans_join(
+ struct xfs_trans *tp)
+{
+ xfs_trans_ijoin(tp, tp->t_mountp->m_rbmip, XFS_ILOCK_EXCL);
+ xfs_trans_ijoin(tp, tp->t_mountp->m_rsumip, XFS_ILOCK_EXCL);
}
/* Unlock both realtime free space metadata inodes after a freespace update. */
diff --git a/fs/xfs/libxfs/xfs_rtbitmap.h b/fs/xfs/libxfs/xfs_rtbitmap.h
index 0d5ab5e2cb6a..523d3d3c12c6 100644
--- a/fs/xfs/libxfs/xfs_rtbitmap.h
+++ b/fs/xfs/libxfs/xfs_rtbitmap.h
@@ -346,8 +346,9 @@ unsigned long long xfs_rtsummary_wordcount(struct xfs_mount *mp,
int xfs_rtfile_initialize_blocks(struct xfs_inode *ip,
xfs_fileoff_t offset_fsb, xfs_fileoff_t end_fsb, void *data);
-void xfs_rtbitmap_lock(struct xfs_trans *tp, struct xfs_mount *mp);
+void xfs_rtbitmap_lock(struct xfs_mount *mp);
void xfs_rtbitmap_unlock(struct xfs_mount *mp);
+void xfs_rtbitmap_trans_join(struct xfs_trans *tp);
/* Lock the rt bitmap inode in shared mode */
#define XFS_RBMLOCK_BITMAP (1U << 0)
@@ -376,7 +377,8 @@ xfs_rtbitmap_blockcount(struct xfs_mount *mp, xfs_rtbxlen_t rtextents)
# define xfs_rtbitmap_wordcount(mp, r) (0)
# define xfs_rtsummary_blockcount(mp, l, b) (0)
# define xfs_rtsummary_wordcount(mp, l, b) (0)
-# define xfs_rtbitmap_lock(tp, mp) do { } while (0)
+# define xfs_rtbitmap_lock(mp) do { } while (0)
+# define xfs_rtbitmap_trans_join(tp) do { } while (0)
# define xfs_rtbitmap_unlock(mp) do { } while (0)
# define xfs_rtbitmap_lock_shared(mp, lf) do { } while (0)
# define xfs_rtbitmap_unlock_shared(mp, lf) do { } while (0)
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index 114807cd80ba..d290749b0304 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -739,7 +739,8 @@ xfs_growfs_rt_bmblock(
goto out_free;
nargs.tp = args.tp;
- xfs_rtbitmap_lock(args.tp, mp);
+ xfs_rtbitmap_lock(mp);
+ xfs_rtbitmap_trans_join(args.tp);
/*
* Update the bitmap inode's size ondisk and incore. We need to update
@@ -1313,7 +1314,8 @@ xfs_bmap_rtalloc(
* Lock out modifications to both the RT bitmap and summary inodes
*/
if (!rtlocked) {
- xfs_rtbitmap_lock(ap->tp, mp);
+ xfs_rtbitmap_lock(mp);
+ xfs_rtbitmap_trans_join(ap->tp);
rtlocked = true;
}
* [PATCH 01/10] xfs: use the recalculated transaction reservation in xfs_growfs_rt_bmblock
2024-09-02 18:21 ` [PATCHSET v4.2 4/8] xfs: fixes for the realtime allocator Darrick J. Wong
@ 2024-09-02 18:27 ` Darrick J. Wong
2024-09-02 18:27 ` [PATCH 02/10] xfs: ensure rtx mask/shift are correct after growfs Darrick J. Wong
` (8 subsequent siblings)
9 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:27 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Christoph Hellwig <hch@lst.de>
After going to great lengths to calculate the transaction reservation
for the new geometry, we should also use it to allocate the transaction
it was calculated for.
Fixes: 578bd4ce7100 ("xfs: recompute growfsrtfree transaction reservation while growing rt volume")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
fs/xfs/xfs_rtalloc.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index d290749b0304..a9f08d96f1fe 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -730,10 +730,12 @@ xfs_growfs_rt_bmblock(
xfs_rtsummary_blockcount(mp, nmp->m_rsumlevels,
nmp->m_sb.sb_rbmblocks));
- /* recompute growfsrt reservation from new rsumsize */
+ /*
+ * Recompute the growfsrt reservation from the new rsumsize, so that the
+ * transaction below uses the new, potentially larger value.
+ */
xfs_trans_resv_calc(nmp, &nmp->m_resv);
-
- error = xfs_trans_alloc(mp, &M_RES(mp)->tr_growrtfree, 0, 0, 0,
+ error = xfs_trans_alloc(mp, &M_RES(nmp)->tr_growrtfree, 0, 0, 0,
&args.tp);
if (error)
goto out_free;
* [PATCH 02/10] xfs: ensure rtx mask/shift are correct after growfs
2024-09-02 18:21 ` [PATCHSET v4.2 4/8] xfs: fixes for the realtime allocator Darrick J. Wong
2024-09-02 18:27 ` [PATCH 01/10] xfs: use the recalculated transaction reservation in xfs_growfs_rt_bmblock Darrick J. Wong
@ 2024-09-02 18:27 ` Darrick J. Wong
2024-09-02 18:27 ` [PATCH 03/10] xfs: don't return too-short extents from xfs_rtallocate_extent_block Darrick J. Wong
` (7 subsequent siblings)
9 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:27 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Christoph Hellwig <hch@lst.de>
When growfs sets an extent size, it doesn't update the m_rtxblklog and
m_rtxblkmask values, which could lead to incorrect usage of them if they
were set before and can't be used for the new extent size.
Add a xfs_mount_sb_set_rextsize helper that updates the two fields, and
also use it when calculating the new RT geometry instead of disabling
the optimization there.
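To make the hazard concrete, the following is a small userspace sketch
of a cached shift/mask pair that must be refreshed whenever the extent
size changes. All demo_* names are hypothetical, and the power-of-two
helper only approximates what log2_if_power2 is assumed to do (return
the log2 for powers of two, -1 otherwise).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the cached geometry fields. */
struct demo_geom {
	uint32_t rextsize;	/* blocks per rt extent */
	int	 rtxblklog;	/* log2(rextsize), or -1 if not a power of 2 */
	uint64_t rtxblkmask;	/* rextsize - 1, or 0 if not a power of 2 */
};

/* Return log2(v) if v is a power of two, else -1. */
int demo_log2_if_power2(uint32_t v)
{
	int log = 0;

	if (v == 0 || (v & (v - 1)) != 0)
		return -1;
	while (v > 1) {
		v >>= 1;
		log++;
	}
	return log;
}

/* Update the extent size and the two derived fields together. */
void demo_set_rextsize(struct demo_geom *g, uint32_t rextsize)
{
	assert(rextsize > 0);
	g->rextsize = rextsize;
	g->rtxblklog = demo_log2_if_power2(rextsize);
	g->rtxblkmask = g->rtxblklog >= 0 ? (uint64_t)rextsize - 1 : 0;
}

/* Convert a block number to an extent number, shifting when possible. */
uint64_t demo_rtb_to_rtx(const struct demo_geom *g, uint64_t rtbno)
{
	if (g->rtxblklog >= 0)
		return rtbno >> g->rtxblklog;
	return rtbno / g->rextsize;
}
```

If only rextsize is changed from 4 to 6 and the derived fields are left
alone, block 13 is still shifted by 2 and maps to extent 3 instead of
extent 2. Going through the single setter avoids that.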
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
fs/xfs/libxfs/xfs_sb.c | 12 ++++++++++--
fs/xfs/libxfs/xfs_sb.h | 2 ++
fs/xfs/xfs_rtalloc.c | 5 +++--
3 files changed, 15 insertions(+), 4 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
index f9c3045f71e0..a6fa9aedb28b 100644
--- a/fs/xfs/libxfs/xfs_sb.c
+++ b/fs/xfs/libxfs/xfs_sb.c
@@ -965,6 +965,15 @@ const struct xfs_buf_ops xfs_sb_quiet_buf_ops = {
.verify_write = xfs_sb_write_verify,
};
+void
+xfs_mount_sb_set_rextsize(
+ struct xfs_mount *mp,
+ struct xfs_sb *sbp)
+{
+ mp->m_rtxblklog = log2_if_power2(sbp->sb_rextsize);
+ mp->m_rtxblkmask = mask64_if_power2(sbp->sb_rextsize);
+}
+
/*
* xfs_mount_common
*
@@ -989,8 +998,7 @@ xfs_sb_mount_common(
mp->m_blockmask = sbp->sb_blocksize - 1;
mp->m_blockwsize = sbp->sb_blocksize >> XFS_WORDLOG;
mp->m_blockwmask = mp->m_blockwsize - 1;
- mp->m_rtxblklog = log2_if_power2(sbp->sb_rextsize);
- mp->m_rtxblkmask = mask64_if_power2(sbp->sb_rextsize);
+ xfs_mount_sb_set_rextsize(mp, sbp);
mp->m_alloc_mxr[0] = xfs_allocbt_maxrecs(mp, sbp->sb_blocksize, 1);
mp->m_alloc_mxr[1] = xfs_allocbt_maxrecs(mp, sbp->sb_blocksize, 0);
diff --git a/fs/xfs/libxfs/xfs_sb.h b/fs/xfs/libxfs/xfs_sb.h
index 796f02191dfd..885c83755991 100644
--- a/fs/xfs/libxfs/xfs_sb.h
+++ b/fs/xfs/libxfs/xfs_sb.h
@@ -17,6 +17,8 @@ extern void xfs_log_sb(struct xfs_trans *tp);
extern int xfs_sync_sb(struct xfs_mount *mp, bool wait);
extern int xfs_sync_sb_buf(struct xfs_mount *mp);
extern void xfs_sb_mount_common(struct xfs_mount *mp, struct xfs_sb *sbp);
+void xfs_mount_sb_set_rextsize(struct xfs_mount *mp,
+ struct xfs_sb *sbp);
extern void xfs_sb_from_disk(struct xfs_sb *to, struct xfs_dsb *from);
extern void xfs_sb_to_disk(struct xfs_dsb *to, struct xfs_sb *from);
extern void xfs_sb_quota_from_disk(struct xfs_sb *sbp);
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index a9f08d96f1fe..4bbb50d5a4b7 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -719,8 +719,8 @@ xfs_growfs_rt_bmblock(
/*
* Calculate new sb and mount fields for this round.
*/
- nmp->m_rtxblklog = -1; /* don't use shift or masking */
nmp->m_sb.sb_rextsize = rextsize;
+ xfs_mount_sb_set_rextsize(nmp, &nmp->m_sb);
nmp->m_sb.sb_rbmblocks = bmbno + 1;
nmp->m_sb.sb_rblocks = min(nrblocks, nrblocks_step);
nmp->m_sb.sb_rextents = xfs_rtb_to_rtx(nmp, nmp->m_sb.sb_rblocks);
@@ -807,10 +807,11 @@ xfs_growfs_rt_bmblock(
xfs_trans_mod_sb(args.tp, XFS_TRANS_SB_FREXTENTS, freed_rtx);
/*
- * Update mp values into the real mp structure.
+ * Update the calculated values in the real mount structure.
*/
mp->m_rsumlevels = nmp->m_rsumlevels;
mp->m_rsumsize = nmp->m_rsumsize;
+ xfs_mount_sb_set_rextsize(mp, &mp->m_sb);
/*
* Recompute the growfsrt reservation from the new rsumsize.
* [PATCH 03/10] xfs: don't return too-short extents from xfs_rtallocate_extent_block
2024-09-02 18:21 ` [PATCHSET v4.2 4/8] xfs: fixes for the realtime allocator Darrick J. Wong
2024-09-02 18:27 ` [PATCH 01/10] xfs: use the recalculated transaction reservation in xfs_growfs_rt_bmblock Darrick J. Wong
2024-09-02 18:27 ` [PATCH 02/10] xfs: ensure rtx mask/shift are correct after growfs Darrick J. Wong
@ 2024-09-02 18:27 ` Darrick J. Wong
2024-09-02 18:28 ` [PATCH 04/10] xfs: don't scan off the end of the rt volume in xfs_rtallocate_extent_block Darrick J. Wong
` (6 subsequent siblings)
9 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:27 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
If xfs_rtallocate_extent_block is asked for a variable-sized allocation,
it will try to return the best-sized free extent, which is apparently
the largest one that it finds starting in this rtbitmap block. It will
then trim the size of the extent as needed to align it with prod.
However, it misses one thing -- rounding down the best-fit candidate to
the required alignment could make the extent shorter than minlen. In
the case where minlen > 1, we'd rather the caller relaxed its alignment
requirements and tried again, as the allocator already supports that.
Returning a too-short extent causes xfs_bmapi_write to return ENOSR if
there aren't enough nmaps to handle multiple new allocations, which can
then cause filesystem shutdowns.
I haven't seen this happen on any production systems, but then I don't
think it's very common to set a per-file extent size hint on realtime
files. I tripped it while working on the rtgroups feature and pounding
on the realtime allocator enthusiastically.
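A worked example may help here. The sketch below is a hypothetical
userspace model of the final length selection (demo_pick_len is not a
kernel function): it trims the candidate to a multiple of prod and then
re-checks it against minlen, which is the check this patch adds.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: trim bestlen down to a multiple of prod, then
 * refuse the result if trimming made it shorter than minlen.
 */
int demo_pick_len(uint32_t bestlen, uint32_t minlen, uint32_t prod,
		  uint32_t *len)
{
	assert(prod > 0);
	if (prod > 1)
		bestlen -= bestlen % prod;
	if (bestlen < minlen)
		return -ENOSPC;	/* caller may relax alignment and retry */
	*len = bestlen;
	return 0;
}
```

With bestlen = 7, minlen = 5 and prod = 4, rounding down yields 4, which
is below minlen, so the sketch reports -ENOSPC rather than handing back
a 4-extent allocation. With bestlen = 9 the rounded length is 8 and the
allocation succeeds.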
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/xfs_rtalloc.c | 21 +++++++++++----------
1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index 4bbb50d5a4b7..c65ee8d1d38d 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -289,16 +289,9 @@ xfs_rtallocate_extent_block(
return error;
}
- /*
- * Searched the whole thing & didn't find a maxlen free extent.
- */
- if (minlen > maxlen || besti == -1) {
- /*
- * Allocation failed. Set *nextp to the next block to try.
- */
- *nextp = next;
- return -ENOSPC;
- }
+ /* Searched the whole thing & didn't find a maxlen free extent. */
+ if (minlen > maxlen || besti == -1)
+ goto nospace;
/*
* If size should be a multiple of prod, make that so.
@@ -311,12 +304,20 @@ xfs_rtallocate_extent_block(
bestlen -= p;
}
+ /* Don't return a too-short extent. */
+ if (bestlen < minlen)
+ goto nospace;
+
/*
* Pick besti for bestlen & return that.
*/
*len = bestlen;
*rtx = besti;
return 0;
+nospace:
+ /* Allocation failed. Set *nextp to the next block to try. */
+ *nextp = next;
+ return -ENOSPC;
}
/*
* [PATCH 04/10] xfs: don't scan off the end of the rt volume in xfs_rtallocate_extent_block
2024-09-02 18:21 ` [PATCHSET v4.2 4/8] xfs: fixes for the realtime allocator Darrick J. Wong
` (2 preceding siblings ...)
2024-09-02 18:27 ` [PATCH 03/10] xfs: don't return too-short extents from xfs_rtallocate_extent_block Darrick J. Wong
@ 2024-09-02 18:28 ` Darrick J. Wong
2024-09-02 18:28 ` [PATCH 05/10] xfs: refactor aligning bestlen to prod Darrick J. Wong
` (5 subsequent siblings)
9 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:28 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
The loop conditional here is not quite correct because an rtbitmap block
can represent rtextents beyond the end of the rt volume. There's no way
that it makes sense to scan for free space beyond EOFS, so don't do it.
This overrun has been present since v2.6.0.
Also fix the type of bestlen, which was incorrectly converted.
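The corrected loop bound can be sketched in isolation as follows. This
is a hypothetical userspace model: rtx_per_rbmblock stands in for the
number of rt extents tracked by one bitmap block, and rextents for the
size of the rt volume in extents.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: last rt extent worth scanning in bitmap block
 * bbno, never past the end of the volume.
 */
uint64_t demo_last_rtx_to_scan(uint64_t rextents, uint64_t rtx_per_rbmblock,
			       uint64_t bbno)
{
	/* One past the last extent this bitmap block can describe. */
	uint64_t block_end = (bbno + 1) * rtx_per_rbmblock;
	uint64_t limit = rextents < block_end ? rextents : block_end;

	assert(rextents > 0);
	return limit - 1;
}
```

For a volume of 40000 extents and 32768 extents per bitmap block, the
first block is scanned up to extent 32767, while the second block stops
at extent 39999 rather than 65535.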
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/xfs_rtalloc.c | 16 +++++++---------
1 file changed, 7 insertions(+), 9 deletions(-)
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index c65ee8d1d38d..58081ce5247b 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -229,22 +229,20 @@ xfs_rtallocate_extent_block(
xfs_rtxnum_t *rtx) /* out: start rtext allocated */
{
struct xfs_mount *mp = args->mp;
- xfs_rtxnum_t besti; /* best rtext found so far */
- xfs_rtxnum_t bestlen;/* best length found so far */
+ xfs_rtxnum_t besti = -1; /* best rtext found so far */
xfs_rtxnum_t end; /* last rtext in chunk */
- int error;
xfs_rtxnum_t i; /* current rtext trying */
xfs_rtxnum_t next; /* next rtext to try */
+ xfs_rtxlen_t bestlen = 0; /* best length found so far */
int stat; /* status from internal calls */
+ int error;
/*
- * Loop over all the extents starting in this bitmap block,
- * looking for one that's long enough.
+ * Loop over all the extents starting in this bitmap block up to the
+ * end of the rt volume, looking for one that's long enough.
*/
- for (i = xfs_rbmblock_to_rtx(mp, bbno), besti = -1, bestlen = 0,
- end = xfs_rbmblock_to_rtx(mp, bbno + 1) - 1;
- i <= end;
- i++) {
+ end = min(mp->m_sb.sb_rextents, xfs_rbmblock_to_rtx(mp, bbno + 1)) - 1;
+ for (i = xfs_rbmblock_to_rtx(mp, bbno); i <= end; i++) {
/* Make sure we don't scan off the end of the rt volume. */
maxlen = xfs_rtallocate_clamp_len(mp, i, maxlen, prod);
* [PATCH 05/10] xfs: refactor aligning bestlen to prod
2024-09-02 18:21 ` [PATCHSET v4.2 4/8] xfs: fixes for the realtime allocator Darrick J. Wong
` (3 preceding siblings ...)
2024-09-02 18:28 ` [PATCH 04/10] xfs: don't scan off the end of the rt volume in xfs_rtallocate_extent_block Darrick J. Wong
@ 2024-09-02 18:28 ` Darrick J. Wong
2024-09-02 18:28 ` [PATCH 06/10] xfs: clean up xfs_rtallocate_extent_exact a bit Darrick J. Wong
` (4 subsequent siblings)
9 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:28 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
There are two places in xfs_rtalloc.c where we want to make sure that a
count of rt extents is aligned with a particular prod(uct) factor. In
one spot, we actually use rounddown(), albeit unnecessarily if prod < 2.
In the other case, we open-code this rounding inefficiently by promoting
the 32-bit length value to a 64-bit value and then performing a 64-bit
division to figure out the subtraction.
Refactor this into a single helper that uses the correct types and
division method for the type, and skips the division entirely unless
prod is large enough to make a difference.
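For readers without the tree at hand, the shape of such a helper can be
sketched in plain C. demo_align_len is a hypothetical userspace
equivalent that uses the % operator in place of the kernel's rounddown()
macro.

```c
#include <assert.h>
#include <stdint.h>

/* Reduce rtxlen to a multiple of prod; skip the work when prod < 2. */
uint32_t demo_align_len(uint32_t rtxlen, uint32_t prod)
{
	if (prod > 1)
		return rtxlen - (rtxlen % prod);
	return rtxlen;
}
```

Both operands are 32 bits wide, so no 64-bit division is needed, and in
this sketch the prod > 1 test also means that a prod of 0 never reaches
the modulo.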
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/xfs_rtalloc.c | 26 +++++++++++++++-----------
1 file changed, 15 insertions(+), 11 deletions(-)
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index 58081ce5247b..11c58f12bcb2 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -194,6 +194,17 @@ xfs_rtallocate_range(
return xfs_rtmodify_range(args, start, len, 0);
}
+/* Reduce @rtxlen until it is a multiple of @prod. */
+static inline xfs_rtxlen_t
+xfs_rtalloc_align_len(
+ xfs_rtxlen_t rtxlen,
+ xfs_rtxlen_t prod)
+{
+ if (unlikely(prod > 1))
+ return rounddown(rtxlen, prod);
+ return rtxlen;
+}
+
/*
* Make sure we don't run off the end of the rt volume. Be careful that
* adjusting maxlen downwards doesn't cause us to fail the alignment checks.
@@ -208,7 +219,7 @@ xfs_rtallocate_clamp_len(
xfs_rtxlen_t ret;
ret = min(mp->m_sb.sb_rextents, startrtx + rtxlen) - startrtx;
- return rounddown(ret, prod);
+ return xfs_rtalloc_align_len(ret, prod);
}
/*
@@ -292,17 +303,10 @@ xfs_rtallocate_extent_block(
goto nospace;
/*
- * If size should be a multiple of prod, make that so.
+ * Ensure bestlen is a multiple of prod, but don't return a too-short
+ * extent.
*/
- if (prod > 1) {
- xfs_rtxlen_t p; /* amount to trim length by */
-
- div_u64_rem(bestlen, prod, &p);
- if (p)
- bestlen -= p;
- }
-
- /* Don't return a too-short extent. */
+ bestlen = xfs_rtalloc_align_len(bestlen, prod);
if (bestlen < minlen)
goto nospace;
* [PATCH 06/10] xfs: clean up xfs_rtallocate_extent_exact a bit
2024-09-02 18:21 ` [PATCHSET v4.2 4/8] xfs: fixes for the realtime allocator Darrick J. Wong
` (4 preceding siblings ...)
2024-09-02 18:28 ` [PATCH 05/10] xfs: refactor aligning bestlen to prod Darrick J. Wong
@ 2024-09-02 18:28 ` Darrick J. Wong
2024-09-02 18:28 ` [PATCH 07/10] xfs: reduce excessive clamping of maxlen in xfs_rtallocate_extent_near Darrick J. Wong
` (3 subsequent siblings)
9 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:28 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Before we start doing more surgery on the rt allocator, let's clean up
the exact allocator so that it doesn't change its arguments and uses the
helper introduced in the previous patch.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/xfs_rtalloc.c | 41 +++++++++++++++++++++--------------------
1 file changed, 21 insertions(+), 20 deletions(-)
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index 11c58f12bcb2..af357704895d 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -338,10 +338,10 @@ xfs_rtallocate_extent_exact(
xfs_rtxlen_t prod, /* extent product factor */
xfs_rtxnum_t *rtx) /* out: start rtext allocated */
{
- int error;
- xfs_rtxlen_t i; /* extent length trimmed due to prod */
- int isfree; /* extent is free */
xfs_rtxnum_t next; /* next rtext to try (dummy) */
+ xfs_rtxlen_t alloclen; /* candidate length */
+ int isfree; /* extent is free */
+ int error;
ASSERT(minlen % prod == 0);
ASSERT(maxlen % prod == 0);
@@ -352,25 +352,26 @@ xfs_rtallocate_extent_exact(
if (error)
return error;
- if (!isfree) {
- /*
- * If not, allocate what there is, if it's at least minlen.
- */
- maxlen = next - start;
- if (maxlen < minlen)
- return -ENOSPC;
-
- /*
- * Trim off tail of extent, if prod is specified.
- */
- if (prod > 1 && (i = maxlen % prod)) {
- maxlen -= i;
- if (maxlen < minlen)
- return -ENOSPC;
- }
+ if (isfree) {
+ /* start to maxlen is all free; allocate it. */
+ *len = maxlen;
+ *rtx = start;
+ return 0;
}
- *len = maxlen;
+ /*
+ * If not, allocate what there is, if it's at least minlen.
+ */
+ alloclen = next - start;
+ if (alloclen < minlen)
+ return -ENOSPC;
+
+ /* Ensure alloclen is a multiple of prod. */
+ alloclen = xfs_rtalloc_align_len(alloclen, prod);
+ if (alloclen < minlen)
+ return -ENOSPC;
+
+ *len = alloclen;
*rtx = start;
return 0;
}
* [PATCH 07/10] xfs: reduce excessive clamping of maxlen in xfs_rtallocate_extent_near
2024-09-02 18:21 ` [PATCHSET v4.2 4/8] xfs: fixes for the realtime allocator Darrick J. Wong
` (5 preceding siblings ...)
2024-09-02 18:28 ` [PATCH 06/10] xfs: clean up xfs_rtallocate_extent_exact a bit Darrick J. Wong
@ 2024-09-02 18:28 ` Darrick J. Wong
2024-09-02 18:29 ` [PATCH 08/10] xfs: fix broken variable-sized allocation detection in xfs_rtallocate_extent_block Darrick J. Wong
` (2 subsequent siblings)
9 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:28 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
The near rt allocator employs two allocation strategies -- first it
tries to allocate at exactly @start. If that fails, it will pivot back
and forth around that starting point looking for an appropriately sized
free space.
However, I clamped maxlen ages ago to prevent the exact allocation scan
from running off the end of the rt volume. This, I realize, was
excessive. If the allocation request is (say) for 32 rtx but the start
position is 5 rtx from the end of the volume, we clamp maxlen to 5. If
the exact allocation fails, we then pivot back and forth looking for 5
rtx, even though the original intent was to try to get 32 rtx.
If we then find 5 rtx when we could have gotten 32 rtx, we've not done
as well as we could have. This may be moot if the caller immediately
comes back for more space, but it might not be. Either way, we can do
better here.
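The numbers in the example above can be checked with a small userspace
sketch of the clamping step. demo_clamp_len is a hypothetical model and
not the kernel function; the volume size of 1000 rtx is made up for
illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of the clamp: limit len so that start + len does
 * not pass rextents, then round down to a multiple of prod.
 */
uint32_t demo_clamp_len(uint64_t rextents, uint64_t start, uint32_t len,
			uint32_t prod)
{
	uint64_t end = start + len;
	uint32_t ret;

	assert(start <= rextents);
	if (end > rextents)
		end = rextents;
	ret = (uint32_t)(end - start);
	if (prod > 1)
		ret -= ret % prod;
	return ret;
}
```

On a 1000 rtx volume, a 32 rtx request starting at rtx 995 clamps to 5.
Keeping that result in a separate variable for the exact scan only
means the later back-and-forth search can still look for the full 32
rtx elsewhere on the volume.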
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/xfs_rtalloc.c | 23 ++++++++++++-----------
1 file changed, 12 insertions(+), 11 deletions(-)
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index af357704895d..d27bfec08ef8 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -338,23 +338,29 @@ xfs_rtallocate_extent_exact(
xfs_rtxlen_t prod, /* extent product factor */
xfs_rtxnum_t *rtx) /* out: start rtext allocated */
{
+ struct xfs_mount *mp = args->mp;
xfs_rtxnum_t next; /* next rtext to try (dummy) */
xfs_rtxlen_t alloclen; /* candidate length */
+ xfs_rtxlen_t scanlen; /* number of free rtx to look for */
int isfree; /* extent is free */
int error;
ASSERT(minlen % prod == 0);
ASSERT(maxlen % prod == 0);
- /*
- * Check if the range in question (for maxlen) is free.
- */
- error = xfs_rtcheck_range(args, start, maxlen, 1, &next, &isfree);
+
+ /* Make sure we don't run off the end of the rt volume. */
+ scanlen = xfs_rtallocate_clamp_len(mp, start, maxlen, prod);
+ if (scanlen < minlen)
+ return -ENOSPC;
+
+ /* Check if the range in question (for scanlen) is free. */
+ error = xfs_rtcheck_range(args, start, scanlen, 1, &next, &isfree);
if (error)
return error;
if (isfree) {
- /* start to maxlen is all free; allocate it. */
- *len = maxlen;
+ /* start to scanlen is all free; allocate it. */
+ *len = scanlen;
*rtx = start;
return 0;
}
@@ -410,11 +416,6 @@ xfs_rtallocate_extent_near(
if (start >= mp->m_sb.sb_rextents)
start = mp->m_sb.sb_rextents - 1;
- /* Make sure we don't run off the end of the rt volume. */
- maxlen = xfs_rtallocate_clamp_len(mp, start, maxlen, prod);
- if (maxlen < minlen)
- return -ENOSPC;
-
/*
* Try the exact allocation first.
*/
* [PATCH 08/10] xfs: fix broken variable-sized allocation detection in xfs_rtallocate_extent_block
2024-09-02 18:21 ` [PATCHSET v4.2 4/8] xfs: fixes for the realtime allocator Darrick J. Wong
` (6 preceding siblings ...)
2024-09-02 18:28 ` [PATCH 07/10] xfs: reduce excessive clamping of maxlen in xfs_rtallocate_extent_near Darrick J. Wong
@ 2024-09-02 18:29 ` Darrick J. Wong
2024-09-02 18:29 ` [PATCH 09/10] xfs: remove xfs_rtb_to_rtxrem Darrick J. Wong
2024-09-02 18:29 ` [PATCH 10/10] xfs: simplify xfs_rtalloc_query_range Darrick J. Wong
9 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:29 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
This function tries to find a suitable free space extent starting from
a particular rtbitmap block. Some time ago, I added a clamping function
to prevent the free space scans from running off the end of the bitmap,
but I didn't quite get the logic right.
Let's say there's an allocation request with a minlen of 5 and a maxlen
of 32 and we're scanning the last rtbitmap block. If we come within 4
rtx of the end of the rt volume, maxlen will get clamped to 4. If the
next 3 rtx are free, we could have satisfied the allocation, but the
code setting partial besti/bestlen for "minlen < maxlen" will think that
we're doing a non-variable allocation and ignore it.
The root of this problem is overwriting maxlen; I should have stuffed
the results in a different variable, which would not have introduced
this bug.
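The effect of overwriting maxlen inside the loop can be shown with a
deliberately simplified userspace model. demo_maxlen_after_scan is
hypothetical and ignores the bitmap entirely; it only tracks what value
of maxlen the code after the loop would see under the old and the new
behaviour.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Walk i from first to the end of the volume, clamping the scan length
 * at each step. Returns the maxlen that post-loop code would observe.
 */
uint32_t demo_maxlen_after_scan(uint64_t rextents, uint64_t first,
				uint32_t minlen, uint32_t maxlen,
				int overwrite)
{
	for (uint64_t i = first; i < rextents; i++) {
		uint64_t room = rextents - i;
		uint32_t scanlen = room < maxlen ? (uint32_t)room : maxlen;

		if (overwrite)
			maxlen = scanlen;	/* old: request size is lost */
		else if (scanlen < minlen)
			break;			/* new: stop, request intact */
	}
	return maxlen;
}
```

Starting 10 rtx from the end of a 100 rtx volume with minlen = 5 and
maxlen = 32, the overwriting variant leaves maxlen at 1, so a later
"minlen < maxlen" test no longer recognises the request as
variable-sized. The scanlen variant leaves maxlen at 32.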
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/xfs_rtalloc.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index d27bfec08ef8..72123e2337d8 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -244,6 +244,7 @@ xfs_rtallocate_extent_block(
xfs_rtxnum_t end; /* last rtext in chunk */
xfs_rtxnum_t i; /* current rtext trying */
xfs_rtxnum_t next; /* next rtext to try */
+ xfs_rtxlen_t scanlen; /* number of free rtx to look for */
xfs_rtxlen_t bestlen = 0; /* best length found so far */
int stat; /* status from internal calls */
int error;
@@ -255,20 +256,22 @@ xfs_rtallocate_extent_block(
end = min(mp->m_sb.sb_rextents, xfs_rbmblock_to_rtx(mp, bbno + 1)) - 1;
for (i = xfs_rbmblock_to_rtx(mp, bbno); i <= end; i++) {
/* Make sure we don't scan off the end of the rt volume. */
- maxlen = xfs_rtallocate_clamp_len(mp, i, maxlen, prod);
+ scanlen = xfs_rtallocate_clamp_len(mp, i, maxlen, prod);
+ if (scanlen < minlen)
+ break;
/*
- * See if there's a free extent of maxlen starting at i.
+ * See if there's a free extent of scanlen starting at i.
* If it's not so then next will contain the first non-free.
*/
- error = xfs_rtcheck_range(args, i, maxlen, 1, &next, &stat);
+ error = xfs_rtcheck_range(args, i, scanlen, 1, &next, &stat);
if (error)
return error;
if (stat) {
/*
- * i for maxlen is all free, allocate and return that.
+ * i to scanlen is all free, allocate and return that.
*/
- *len = maxlen;
+ *len = scanlen;
*rtx = i;
return 0;
}
@@ -299,7 +302,7 @@ xfs_rtallocate_extent_block(
}
/* Searched the whole thing & didn't find a maxlen free extent. */
- if (minlen > maxlen || besti == -1)
+ if (besti == -1)
goto nospace;
/*
* [PATCH 09/10] xfs: remove xfs_rtb_to_rtxrem
2024-09-02 18:21 ` [PATCHSET v4.2 4/8] xfs: fixes for the realtime allocator Darrick J. Wong
` (7 preceding siblings ...)
2024-09-02 18:29 ` [PATCH 08/10] xfs: fix broken variable-sized allocation detection in xfs_rtallocate_extent_block Darrick J. Wong
@ 2024-09-02 18:29 ` Darrick J. Wong
2024-09-02 18:29 ` [PATCH 10/10] xfs: simplify xfs_rtalloc_query_range Darrick J. Wong
9 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:29 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Christoph Hellwig <hch@lst.de>
Reduce the number of block number conversion helpers by removing
xfs_rtb_to_rtxrem. Any recent compiler is smart enough to eliminate
the double divisions if using separate xfs_rtb_to_rtx and
xfs_rtb_to_rtxoff calls.
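A minimal userspace sketch of the resulting pattern follows: one helper
for the quotient, one for the remainder, and a caller that checks
alignment before converting. The demo_* functions are hypothetical and
use plain / and % rather than the kernel's division helpers.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Extent number containing block rtbno. */
uint64_t demo_rtb_to_rtx(uint64_t rtbno, uint32_t rextsize)
{
	return rtbno / rextsize;
}

/* Offset of block rtbno within its extent. */
uint32_t demo_rtb_to_rtxoff(uint64_t rtbno, uint32_t rextsize)
{
	return (uint32_t)(rtbno % rextsize);
}

/* Reject unaligned ranges, otherwise convert start and length. */
int demo_rtfree_blocks(uint64_t rtbno, uint64_t rtlen, uint32_t rextsize,
		       uint64_t *start, uint64_t *len)
{
	assert(rextsize > 0);
	if (demo_rtb_to_rtxoff(rtlen, rextsize))
		return -EIO;
	if (demo_rtb_to_rtxoff(rtbno, rextsize))
		return -EIO;
	*start = demo_rtb_to_rtx(rtbno, rextsize);
	*len = demo_rtb_to_rtx(rtlen, rextsize);
	return 0;
}
```

The remainder and quotient of the same operands are computed in
separate calls, which is the form the commit message expects the
compiler to merge where it can.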
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
fs/xfs/libxfs/xfs_rtbitmap.c | 9 ++++-----
fs/xfs/libxfs/xfs_rtbitmap.h | 18 ------------------
2 files changed, 4 insertions(+), 23 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_rtbitmap.c b/fs/xfs/libxfs/xfs_rtbitmap.c
index d7c731aeee12..431ef62939ca 100644
--- a/fs/xfs/libxfs/xfs_rtbitmap.c
+++ b/fs/xfs/libxfs/xfs_rtbitmap.c
@@ -1022,25 +1022,24 @@ xfs_rtfree_blocks(
xfs_filblks_t rtlen)
{
struct xfs_mount *mp = tp->t_mountp;
- xfs_rtxnum_t start;
- xfs_filblks_t len;
xfs_extlen_t mod;
ASSERT(rtlen <= XFS_MAX_BMBT_EXTLEN);
- len = xfs_rtb_to_rtxrem(mp, rtlen, &mod);
+ mod = xfs_rtb_to_rtxoff(mp, rtlen);
if (mod) {
ASSERT(mod == 0);
return -EIO;
}
- start = xfs_rtb_to_rtxrem(mp, rtbno, &mod);
+ mod = xfs_rtb_to_rtxoff(mp, rtbno);
if (mod) {
ASSERT(mod == 0);
return -EIO;
}
- return xfs_rtfree_extent(tp, start, len);
+ return xfs_rtfree_extent(tp, xfs_rtb_to_rtx(mp, rtbno),
+ xfs_rtb_to_rtx(mp, rtlen));
}
/* Find all the free records within a given range. */
diff --git a/fs/xfs/libxfs/xfs_rtbitmap.h b/fs/xfs/libxfs/xfs_rtbitmap.h
index 523d3d3c12c6..69ddacd4b01e 100644
--- a/fs/xfs/libxfs/xfs_rtbitmap.h
+++ b/fs/xfs/libxfs/xfs_rtbitmap.h
@@ -86,24 +86,6 @@ xfs_rtb_to_rtxoff(
return do_div(rtbno, mp->m_sb.sb_rextsize);
}
-/*
- * Crack an rt block number into an rt extent number and an offset within that
- * rt extent. Returns the rt extent number directly and the offset in @off.
- */
-static inline xfs_rtxnum_t
-xfs_rtb_to_rtxrem(
- struct xfs_mount *mp,
- xfs_rtblock_t rtbno,
- xfs_extlen_t *off)
-{
- if (likely(mp->m_rtxblklog >= 0)) {
- *off = rtbno & mp->m_rtxblkmask;
- return rtbno >> mp->m_rtxblklog;
- }
-
- return div_u64_rem(rtbno, mp->m_sb.sb_rextsize, off);
-}
-
/*
* Convert an rt block number into an rt extent number, rounding up to the next
* rt extent if the rt block is not aligned to an rt extent boundary.
* [PATCH 10/10] xfs: simplify xfs_rtalloc_query_range
2024-09-02 18:21 ` [PATCHSET v4.2 4/8] xfs: fixes for the realtime allocator Darrick J. Wong
` (8 preceding siblings ...)
2024-09-02 18:29 ` [PATCH 09/10] xfs: remove xfs_rtb_to_rtxrem Darrick J. Wong
@ 2024-09-02 18:29 ` Darrick J. Wong
9 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:29 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Christoph Hellwig <hch@lst.de>
There isn't much of a good reason to pass the xfs_rtalloc_rec structures
that describe extents to xfs_rtalloc_query_range as we really just want
a lower and upper bound xfs_rtxnum_t. Pass the rtxnum directly and
simplify the interface.
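To show what the simplified calling convention looks like from the
outside, here is a hypothetical userspace model that walks a byte-per-
extent free map between two plain extent numbers. Nothing here is the
kernel implementation; demo_query_range and demo_count_free exist only
for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

typedef int (*demo_query_fn)(uint64_t start, uint64_t count, void *priv);

/* Example callback: add up the number of free extents reported. */
int demo_count_free(uint64_t start, uint64_t count, void *priv)
{
	(void)start;
	*(uint64_t *)priv += count;
	return 0;
}

/* Report each run of free extents in [start, end] to fn. */
int demo_query_range(const unsigned char *is_free, uint64_t rextents,
		     uint64_t start, uint64_t end, demo_query_fn fn,
		     void *priv)
{
	int error = 0;

	if (start > end)
		return -EINVAL;
	if (start == end || start >= rextents)
		return 0;
	if (end > rextents - 1)
		end = rextents - 1;

	while (start <= end) {
		uint64_t rtend = start;

		/* Extend the run while the free state stays the same. */
		while (rtend < end && is_free[rtend + 1] == is_free[start])
			rtend++;
		if (is_free[start]) {
			error = fn(start, rtend - start + 1, priv);
			if (error)
				break;
		}
		start = rtend + 1;
	}
	return error;
}
```

A caller passes two extent numbers and a callback, for example
demo_query_range(map, 8, 0, 7, demo_count_free, &total), with no record
structures to fill in first.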
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
fs/xfs/libxfs/xfs_rtbitmap.c | 42 +++++++++++++++++-------------------------
fs/xfs/libxfs/xfs_rtbitmap.h | 3 +--
fs/xfs/xfs_discard.c | 15 +++++++--------
fs/xfs/xfs_fsmap.c | 11 +++++------
4 files changed, 30 insertions(+), 41 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_rtbitmap.c b/fs/xfs/libxfs/xfs_rtbitmap.c
index 431ef62939ca..c58eb75ef0fa 100644
--- a/fs/xfs/libxfs/xfs_rtbitmap.c
+++ b/fs/xfs/libxfs/xfs_rtbitmap.c
@@ -1047,8 +1047,8 @@ int
xfs_rtalloc_query_range(
struct xfs_mount *mp,
struct xfs_trans *tp,
- const struct xfs_rtalloc_rec *low_rec,
- const struct xfs_rtalloc_rec *high_rec,
+ xfs_rtxnum_t start,
+ xfs_rtxnum_t end,
xfs_rtalloc_query_range_fn fn,
void *priv)
{
@@ -1056,45 +1056,42 @@ xfs_rtalloc_query_range(
.mp = mp,
.tp = tp,
};
- struct xfs_rtalloc_rec rec;
- xfs_rtxnum_t rtstart;
- xfs_rtxnum_t rtend;
- xfs_rtxnum_t high_key;
- int is_free;
int error = 0;
- if (low_rec->ar_startext > high_rec->ar_startext)
+ if (start > end)
return -EINVAL;
- if (low_rec->ar_startext >= mp->m_sb.sb_rextents ||
- low_rec->ar_startext == high_rec->ar_startext)
+ if (start == end || start >= mp->m_sb.sb_rextents)
return 0;
- high_key = min(high_rec->ar_startext, mp->m_sb.sb_rextents - 1);
+ end = min(end, mp->m_sb.sb_rextents - 1);
/* Iterate the bitmap, looking for discrepancies. */
- rtstart = low_rec->ar_startext;
- while (rtstart <= high_key) {
+ while (start <= end) {
+ struct xfs_rtalloc_rec rec;
+ int is_free;
+ xfs_rtxnum_t rtend;
+
/* Is the first block free? */
- error = xfs_rtcheck_range(&args, rtstart, 1, 1, &rtend,
+ error = xfs_rtcheck_range(&args, start, 1, 1, &rtend,
&is_free);
if (error)
break;
/* How long does the extent go for? */
- error = xfs_rtfind_forw(&args, rtstart, high_key, &rtend);
+ error = xfs_rtfind_forw(&args, start, end, &rtend);
if (error)
break;
if (is_free) {
- rec.ar_startext = rtstart;
- rec.ar_extcount = rtend - rtstart + 1;
+ rec.ar_startext = start;
+ rec.ar_extcount = rtend - start + 1;
error = fn(mp, tp, &rec, priv);
if (error)
break;
}
- rtstart = rtend + 1;
+ start = rtend + 1;
}
xfs_rtbuf_cache_relse(&args);
@@ -1109,13 +1106,8 @@ xfs_rtalloc_query_all(
xfs_rtalloc_query_range_fn fn,
void *priv)
{
- struct xfs_rtalloc_rec keys[2];
-
- keys[0].ar_startext = 0;
- keys[1].ar_startext = mp->m_sb.sb_rextents - 1;
- keys[0].ar_extcount = keys[1].ar_extcount = 0;
-
- return xfs_rtalloc_query_range(mp, tp, &keys[0], &keys[1], fn, priv);
+ return xfs_rtalloc_query_range(mp, tp, 0, mp->m_sb.sb_rextents - 1, fn,
+ priv);
}
/* Is the given extent all free? */
diff --git a/fs/xfs/libxfs/xfs_rtbitmap.h b/fs/xfs/libxfs/xfs_rtbitmap.h
index 69ddacd4b01e..0dbc9bb40668 100644
--- a/fs/xfs/libxfs/xfs_rtbitmap.h
+++ b/fs/xfs/libxfs/xfs_rtbitmap.h
@@ -292,8 +292,7 @@ int xfs_rtmodify_summary(struct xfs_rtalloc_args *args, int log,
int xfs_rtfree_range(struct xfs_rtalloc_args *args, xfs_rtxnum_t start,
xfs_rtxlen_t len);
int xfs_rtalloc_query_range(struct xfs_mount *mp, struct xfs_trans *tp,
- const struct xfs_rtalloc_rec *low_rec,
- const struct xfs_rtalloc_rec *high_rec,
+ xfs_rtxnum_t start, xfs_rtxnum_t end,
xfs_rtalloc_query_range_fn fn, void *priv);
int xfs_rtalloc_query_all(struct xfs_mount *mp, struct xfs_trans *tp,
xfs_rtalloc_query_range_fn fn,
diff --git a/fs/xfs/xfs_discard.c b/fs/xfs/xfs_discard.c
index 25f5dffeab2a..bf1e3f330018 100644
--- a/fs/xfs/xfs_discard.c
+++ b/fs/xfs/xfs_discard.c
@@ -554,11 +554,10 @@ xfs_trim_rtdev_extents(
xfs_daddr_t end,
xfs_daddr_t minlen)
{
- struct xfs_rtalloc_rec low = { };
- struct xfs_rtalloc_rec high = { };
struct xfs_trim_rtdev tr = {
.minlen_fsb = XFS_BB_TO_FSB(mp, minlen),
};
+ xfs_rtxnum_t low, high;
struct xfs_trans *tp;
xfs_daddr_t rtdev_daddr;
int error;
@@ -584,17 +583,17 @@ xfs_trim_rtdev_extents(
XFS_FSB_TO_BB(mp, mp->m_sb.sb_rblocks) - 1);
/* Convert the rt blocks to rt extents */
- low.ar_startext = xfs_rtb_to_rtxup(mp, XFS_BB_TO_FSB(mp, start));
- high.ar_startext = xfs_rtb_to_rtx(mp, XFS_BB_TO_FSBT(mp, end));
+ low = xfs_rtb_to_rtxup(mp, XFS_BB_TO_FSB(mp, start));
+ high = xfs_rtb_to_rtx(mp, XFS_BB_TO_FSBT(mp, end));
/*
* Walk the free ranges between low and high. The query_range function
* trims the extents returned.
*/
do {
- tr.stop_rtx = low.ar_startext + (mp->m_sb.sb_blocksize * NBBY);
+ tr.stop_rtx = low + (mp->m_sb.sb_blocksize * NBBY);
xfs_rtbitmap_lock_shared(mp, XFS_RBMLOCK_BITMAP);
- error = xfs_rtalloc_query_range(mp, tp, &low, &high,
+ error = xfs_rtalloc_query_range(mp, tp, low, high,
xfs_trim_gather_rtextent, &tr);
if (error == -ECANCELED)
@@ -615,8 +614,8 @@ xfs_trim_rtdev_extents(
if (error)
break;
- low.ar_startext = tr.restart_rtx;
- } while (!xfs_trim_should_stop() && low.ar_startext <= high.ar_startext);
+ low = tr.restart_rtx;
+ } while (!xfs_trim_should_stop() && low <= high);
xfs_trans_cancel(tp);
return error;
diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c
index 71f32354944e..e15446626875 100644
--- a/fs/xfs/xfs_fsmap.c
+++ b/fs/xfs/xfs_fsmap.c
@@ -520,11 +520,11 @@ xfs_getfsmap_rtdev_rtbitmap(
struct xfs_getfsmap_info *info)
{
- struct xfs_rtalloc_rec alow = { 0 };
struct xfs_rtalloc_rec ahigh = { 0 };
struct xfs_mount *mp = tp->t_mountp;
xfs_rtblock_t start_rtb;
xfs_rtblock_t end_rtb;
+ xfs_rtxnum_t high;
uint64_t eofs;
int error;
@@ -553,10 +553,9 @@ xfs_getfsmap_rtdev_rtbitmap(
* Set up query parameters to return free rtextents covering the range
* we want.
*/
- alow.ar_startext = xfs_rtb_to_rtx(mp, start_rtb);
- ahigh.ar_startext = xfs_rtb_to_rtxup(mp, end_rtb);
- error = xfs_rtalloc_query_range(mp, tp, &alow, &ahigh,
- xfs_getfsmap_rtdev_rtbitmap_helper, info);
+ high = xfs_rtb_to_rtxup(mp, end_rtb);
+ error = xfs_rtalloc_query_range(mp, tp, xfs_rtb_to_rtx(mp, start_rtb),
+ high, xfs_getfsmap_rtdev_rtbitmap_helper, info);
if (error)
goto err;
@@ -565,7 +564,7 @@ xfs_getfsmap_rtdev_rtbitmap(
* rmap starting at the block after the end of the query range.
*/
info->last = true;
- ahigh.ar_startext = min(mp->m_sb.sb_rextents, ahigh.ar_startext);
+ ahigh.ar_startext = min(mp->m_sb.sb_rextents, high);
error = xfs_getfsmap_rtdev_rtbitmap_helper(mp, tp, &ahigh, info);
if (error)
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH 01/10] xfs: clean up the ISVALID macro in xfs_bmap_adjacent
2024-09-02 18:22 ` [PATCHSET v4.2 5/8] xfs: cleanups for the realtime allocator Darrick J. Wong
@ 2024-09-02 18:29 ` Darrick J. Wong
2024-09-02 18:30 ` [PATCH 02/10] xfs: factor out a xfs_rtallocate helper Darrick J. Wong
` (8 subsequent siblings)
9 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:29 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Christoph Hellwig <hch@lst.de>
Turn the ISVALID macro, which is defined and used inside xfs_bmap_adjacent
and relies on implicit context, into a proper inline function.
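As a rough illustration of why the conversion helps, here is a minimal
userspace sketch. The demo_* names and the toy block-number mapping are
assumptions made up for this example; they are not the kernel's structures
or macros.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not the kernel's real structures. */
struct demo_geom {
	uint64_t	rblocks;	/* size of the realtime device, in blocks */
	uint32_t	agcount;	/* number of allocation groups */
	uint32_t	agblocks;	/* blocks per allocation group */
};

struct demo_alloc {
	const struct demo_geom	*geom;
	bool			is_rt;	/* allocating realtime user data? */
};

/*
 * Every input is an explicit, typed parameter, so the check does not
 * depend on what the caller's local variables happen to be called, and
 * x and y are each evaluated exactly once.
 */
static inline bool
demo_adjacent_valid(
	const struct demo_alloc	*ap,
	uint64_t		x,
	uint64_t		y)
{
	const struct demo_geom	*g = ap->geom;

	if (ap->is_rt)
		return x < g->rblocks;

	/* Toy mapping (assumption): block number = agno * agblocks + agbno. */
	return x / g->agblocks == y / g->agblocks &&
	       x / g->agblocks < g->agcount;
}
```

A macro version of the same check would silently pick up whatever "mp" and
"rt" are in scope at the point of expansion, which is the implicit context
the commit message refers to.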
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
fs/xfs/libxfs/xfs_bmap.c | 55 +++++++++++++++++++++++++++-------------------
1 file changed, 32 insertions(+), 23 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 112c7ee2d493..434433ed29dc 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -3112,6 +3112,23 @@ xfs_bmap_extsize_align(
return 0;
}
+static inline bool
+xfs_bmap_adjacent_valid(
+ struct xfs_bmalloca *ap,
+ xfs_fsblock_t x,
+ xfs_fsblock_t y)
+{
+ struct xfs_mount *mp = ap->ip->i_mount;
+
+ if (XFS_IS_REALTIME_INODE(ap->ip) &&
+ (ap->datatype & XFS_ALLOC_USERDATA))
+ return x < mp->m_sb.sb_rblocks;
+
+ return XFS_FSB_TO_AGNO(mp, x) == XFS_FSB_TO_AGNO(mp, y) &&
+ XFS_FSB_TO_AGNO(mp, x) < mp->m_sb.sb_agcount &&
+ XFS_FSB_TO_AGBNO(mp, x) < mp->m_sb.sb_agblocks;
+}
+
#define XFS_ALLOC_GAP_UNITS 4
/* returns true if ap->blkno was modified */
@@ -3119,36 +3136,25 @@ bool
xfs_bmap_adjacent(
struct xfs_bmalloca *ap) /* bmap alloc argument struct */
{
- xfs_fsblock_t adjust; /* adjustment to block numbers */
- xfs_mount_t *mp; /* mount point structure */
- int rt; /* true if inode is realtime */
+ xfs_fsblock_t adjust; /* adjustment to block numbers */
-#define ISVALID(x,y) \
- (rt ? \
- (x) < mp->m_sb.sb_rblocks : \
- XFS_FSB_TO_AGNO(mp, x) == XFS_FSB_TO_AGNO(mp, y) && \
- XFS_FSB_TO_AGNO(mp, x) < mp->m_sb.sb_agcount && \
- XFS_FSB_TO_AGBNO(mp, x) < mp->m_sb.sb_agblocks)
-
- mp = ap->ip->i_mount;
- rt = XFS_IS_REALTIME_INODE(ap->ip) &&
- (ap->datatype & XFS_ALLOC_USERDATA);
/*
* If allocating at eof, and there's a previous real block,
* try to use its last block as our starting point.
*/
if (ap->eof && ap->prev.br_startoff != NULLFILEOFF &&
!isnullstartblock(ap->prev.br_startblock) &&
- ISVALID(ap->prev.br_startblock + ap->prev.br_blockcount,
- ap->prev.br_startblock)) {
+ xfs_bmap_adjacent_valid(ap,
+ ap->prev.br_startblock + ap->prev.br_blockcount,
+ ap->prev.br_startblock)) {
ap->blkno = ap->prev.br_startblock + ap->prev.br_blockcount;
/*
* Adjust for the gap between prevp and us.
*/
adjust = ap->offset -
(ap->prev.br_startoff + ap->prev.br_blockcount);
- if (adjust &&
- ISVALID(ap->blkno + adjust, ap->prev.br_startblock))
+ if (adjust && xfs_bmap_adjacent_valid(ap, ap->blkno + adjust,
+ ap->prev.br_startblock))
ap->blkno += adjust;
return true;
}
@@ -3171,7 +3177,8 @@ xfs_bmap_adjacent(
!isnullstartblock(ap->prev.br_startblock) &&
(prevbno = ap->prev.br_startblock +
ap->prev.br_blockcount) &&
- ISVALID(prevbno, ap->prev.br_startblock)) {
+ xfs_bmap_adjacent_valid(ap, prevbno,
+ ap->prev.br_startblock)) {
/*
* Calculate gap to end of previous block.
*/
@@ -3187,8 +3194,8 @@ xfs_bmap_adjacent(
* number, then just use the end of the previous block.
*/
if (prevdiff <= XFS_ALLOC_GAP_UNITS * ap->length &&
- ISVALID(prevbno + prevdiff,
- ap->prev.br_startblock))
+ xfs_bmap_adjacent_valid(ap, prevbno + prevdiff,
+ ap->prev.br_startblock))
prevbno += adjust;
else
prevdiff += adjust;
@@ -3220,9 +3227,11 @@ xfs_bmap_adjacent(
* offset by our length.
*/
if (gotdiff <= XFS_ALLOC_GAP_UNITS * ap->length &&
- ISVALID(gotbno - gotdiff, gotbno))
+ xfs_bmap_adjacent_valid(ap, gotbno - gotdiff,
+ gotbno))
gotbno -= adjust;
- else if (ISVALID(gotbno - ap->length, gotbno)) {
+ else if (xfs_bmap_adjacent_valid(ap, gotbno - ap->length,
+ gotbno)) {
gotbno -= ap->length;
gotdiff += adjust - ap->length;
} else
@@ -3250,7 +3259,7 @@ xfs_bmap_adjacent(
return true;
}
}
-#undef ISVALID
+
return false;
}
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH 02/10] xfs: factor out a xfs_rtallocate helper
2024-09-02 18:22 ` [PATCHSET v4.2 5/8] xfs: cleanups for the realtime allocator Darrick J. Wong
2024-09-02 18:29 ` [PATCH 01/10] xfs: clean up the ISVALID macro in xfs_bmap_adjacent Darrick J. Wong
@ 2024-09-02 18:30 ` Darrick J. Wong
2024-09-02 18:30 ` [PATCH 03/10] xfs: rework the rtalloc fallback handling Darrick J. Wong
` (7 subsequent siblings)
9 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:30 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Christoph Hellwig <hch@lst.de>
Split out a helper from xfs_bmap_rtalloc that performs the actual
allocation. This keeps the scope of the xfs_rtalloc_args structure
contained, and prepares for rtgroups support.
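The shape of such a helper can be sketched in userspace as follows. This is
only an illustration under stated assumptions: demo_args stands in for a
per-call cache like xfs_rtalloc_args, and every demo_* function is
hypothetical.

```c
#include <errno.h>

/* Hypothetical stand-in for a per-call buffer cache. */
struct demo_args {
	int	bufs_held;
};

static int demo_releases;	/* counts releases, for illustration only */

static void demo_cache_relse(struct demo_args *args)
{
	args->bufs_held = 0;
	demo_releases++;
}

/* Pretend to search for space; fails with -ENOSPC if too little is free. */
static int demo_find_extent(struct demo_args *args, unsigned int avail,
		unsigned int minlen, unsigned int maxlen, unsigned int *len)
{
	args->bufs_held++;
	if (avail < minlen)
		return -ENOSPC;
	*len = avail < maxlen ? avail : maxlen;
	return 0;
}

/*
 * The args structure lives and dies inside the helper: callers never see
 * it, and both the success and the error path leave through out_release.
 */
static int demo_allocate(unsigned int avail, unsigned int minlen,
		unsigned int maxlen, unsigned int *blen)
{
	struct demo_args	args = { 0 };
	unsigned int		len = 0;
	int			error;

	error = demo_find_extent(&args, avail, minlen, maxlen, &len);
	if (error)
		goto out_release;

	*blen = len;

out_release:
	demo_cache_relse(&args);
	return error;
}
```

With this layout the caller only has to look at the return value; it cannot
forget to release the cache on an early return.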
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
fs/xfs/xfs_rtalloc.c | 81 +++++++++++++++++++++++++++++++-------------------
1 file changed, 50 insertions(+), 31 deletions(-)
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index 72123e2337d8..12cf7cb3c02c 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -1263,6 +1263,51 @@ xfs_rtalloc_align_minmax(
*raminlen = newminlen;
}
+static int
+xfs_rtallocate(
+ struct xfs_trans *tp,
+ xfs_rtxnum_t start,
+ xfs_rtxlen_t minlen,
+ xfs_rtxlen_t maxlen,
+ xfs_rtxlen_t prod,
+ bool wasdel,
+ xfs_rtblock_t *bno,
+ xfs_extlen_t *blen)
+{
+ struct xfs_rtalloc_args args = {
+ .mp = tp->t_mountp,
+ .tp = tp,
+ };
+ xfs_rtxnum_t rtx;
+ xfs_rtxlen_t len = 0;
+ int error;
+
+ if (start) {
+ error = xfs_rtallocate_extent_near(&args, start, minlen, maxlen,
+ &len, prod, &rtx);
+ } else {
+ error = xfs_rtallocate_extent_size(&args, minlen, maxlen, &len,
+ prod, &rtx);
+ }
+
+ if (error)
+ goto out_release;
+
+ error = xfs_rtallocate_range(&args, rtx, len);
+ if (error)
+ goto out_release;
+
+ xfs_trans_mod_sb(tp, wasdel ?
+ XFS_TRANS_SB_RES_FREXTENTS : XFS_TRANS_SB_FREXTENTS,
+ -(long)len);
+ *bno = xfs_rtx_to_rtb(args.mp, rtx);
+ *blen = xfs_rtxlen_to_extlen(args.mp, len);
+
+out_release:
+ xfs_rtbuf_cache_relse(&args);
+ return error;
+}
+
int
xfs_bmap_rtalloc(
struct xfs_bmalloca *ap)
@@ -1270,7 +1315,6 @@ xfs_bmap_rtalloc(
struct xfs_mount *mp = ap->ip->i_mount;
xfs_fileoff_t orig_offset = ap->offset;
xfs_rtxnum_t start; /* allocation hint rtextent no */
- xfs_rtxnum_t rtx; /* actually allocated rtextent no */
xfs_rtxlen_t prod = 0; /* product factor for allocators */
xfs_extlen_t mod = 0; /* product factor for allocators */
xfs_rtxlen_t ralen = 0; /* realtime allocation length */
@@ -1280,10 +1324,6 @@ xfs_bmap_rtalloc(
xfs_rtxlen_t raminlen;
bool rtlocked = false;
bool ignore_locality = false;
- struct xfs_rtalloc_args args = {
- .mp = mp,
- .tp = ap->tp,
- };
int error;
align = xfs_get_extsz_hint(ap->ip);
@@ -1357,19 +1397,9 @@ xfs_bmap_rtalloc(
xfs_rtalloc_align_minmax(&raminlen, &ralen, &prod);
}
- if (start) {
- error = xfs_rtallocate_extent_near(&args, start, raminlen,
- ralen, &ralen, prod, &rtx);
- } else {
- error = xfs_rtallocate_extent_size(&args, raminlen,
- ralen, &ralen, prod, &rtx);
- }
-
- if (error) {
- xfs_rtbuf_cache_relse(&args);
- if (error != -ENOSPC)
- return error;
-
+ error = xfs_rtallocate(ap->tp, start, raminlen, ralen, prod, ap->wasdel,
+ &ap->blkno, &ap->length);
+ if (error == -ENOSPC) {
if (align > mp->m_sb.sb_rextsize) {
/*
* We previously enlarged the request length to try to
@@ -1397,20 +1427,9 @@ xfs_bmap_rtalloc(
ap->length = 0;
return 0;
}
-
- error = xfs_rtallocate_range(&args, rtx, ralen);
if (error)
- goto out_release;
+ return error;
- xfs_trans_mod_sb(ap->tp, ap->wasdel ?
- XFS_TRANS_SB_RES_FREXTENTS : XFS_TRANS_SB_FREXTENTS,
- -(long)ralen);
-
- ap->blkno = xfs_rtx_to_rtb(mp, rtx);
- ap->length = xfs_rtxlen_to_extlen(mp, ralen);
xfs_bmap_alloc_account(ap);
-
-out_release:
- xfs_rtbuf_cache_relse(&args);
- return error;
+ return 0;
}
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH 03/10] xfs: rework the rtalloc fallback handling
2024-09-02 18:22 ` [PATCHSET v4.2 5/8] xfs: cleanups for the realtime allocator Darrick J. Wong
2024-09-02 18:29 ` [PATCH 01/10] xfs: clean up the ISVALID macro in xfs_bmap_adjacent Darrick J. Wong
2024-09-02 18:30 ` [PATCH 02/10] xfs: factor out a xfs_rtallocate helper Darrick J. Wong
@ 2024-09-02 18:30 ` Darrick J. Wong
2024-09-02 18:30 ` [PATCH 04/10] xfs: factor out a xfs_rtallocate_align helper Darrick J. Wong
` (6 subsequent siblings)
9 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:30 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Christoph Hellwig <hch@lst.de>
xfs_bmap_rtalloc currently has two fallbacks when an allocation fails:
1) drop the requested extent size alignment, if any, and retry
2) ignore the locality hint
Oddly enough it tries them in that order, even though trying a different
location is more in line with what the user asked for, and it does so in
a very unstructured way.
Lift the fallback to try to allocate without the locality hint into
xfs_rtallocate to both perform them in a more sensible order and to
clean up the code.
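The resulting ordering (give up locality first, alignment second) can be
sketched like this. It is a toy model, not the kernel code: the single free
run, the distance limit of 10 units and all demo_* names are assumptions
chosen for the example.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical free-space model: one free run of 4 units at unit 100. */
static unsigned int demo_free_start = 100;
static unsigned int demo_free_len = 4;

/* Succeeds only if the free run is within 10 units of the hint. */
static int demo_alloc_near(unsigned int hint, unsigned int len,
		unsigned int *start)
{
	unsigned int	dist = hint > demo_free_start ?
			hint - demo_free_start : demo_free_start - hint;

	if (len > demo_free_len || dist > 10)
		return -ENOSPC;
	*start = demo_free_start;
	return 0;
}

/* Succeeds wherever the free run is, as long as it is big enough. */
static int demo_alloc_anywhere(unsigned int len, unsigned int *start)
{
	if (len > demo_free_len)
		return -ENOSPC;
	*start = demo_free_start;
	return 0;
}

/* Inner helper: the locality fallback lives here. */
static int demo_allocate(unsigned int hint, unsigned int len,
		unsigned int *start)
{
	int	error;

	if (hint) {
		error = demo_alloc_near(hint, len, start);
		if (error != -ENOSPC)
			return error;
		/* Could not honour the hint; retry without locality. */
	}
	return demo_alloc_anywhere(len, start);
}

/* Outer caller: only drops alignment once every location has failed. */
static int demo_bmap_alloc(unsigned int hint, unsigned int want,
		unsigned int aligned_want, unsigned int *start,
		unsigned int *len)
{
	bool	noalign = (aligned_want == want);
	int	error;

retry:
	*len = noalign ? want : aligned_want;
	error = demo_allocate(hint, *len, start);
	if (error == -ENOSPC && !noalign) {
		noalign = true;
		goto retry;
	}
	return error;
}
```

In this model an aligned request with a distant hint still gets its full
aligned length from the free run; the length is only cut back to the
unaligned size when no location can satisfy the aligned request.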
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
fs/xfs/xfs_rtalloc.c | 69 +++++++++++++++++++++++++-------------------------
1 file changed, 34 insertions(+), 35 deletions(-)
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index 12cf7cb3c02c..a6b9ba572cdc 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -1271,6 +1271,8 @@ xfs_rtallocate(
xfs_rtxlen_t maxlen,
xfs_rtxlen_t prod,
bool wasdel,
+ bool initial_user_data,
+ bool *rtlocked,
xfs_rtblock_t *bno,
xfs_extlen_t *blen)
{
@@ -1280,12 +1282,38 @@ xfs_rtallocate(
};
xfs_rtxnum_t rtx;
xfs_rtxlen_t len = 0;
- int error;
+ int error = 0;
+
+ /*
+ * Lock out modifications to both the RT bitmap and summary inodes.
+ */
+ if (!*rtlocked) {
+ xfs_rtbitmap_lock(args.mp);
+ xfs_rtbitmap_trans_join(tp);
+ *rtlocked = true;
+ }
+
+ /*
+ * For an allocation to an empty file at offset 0, pick an extent that
+ * will space things out in the rt area.
+ */
+ if (!start && initial_user_data)
+ start = xfs_rtpick_extent(args.mp, tp, maxlen);
if (start) {
error = xfs_rtallocate_extent_near(&args, start, minlen, maxlen,
&len, prod, &rtx);
- } else {
+ /*
+ * If we can't allocate near a specific rt extent, try again
+ * without locality criteria.
+ */
+ if (error == -ENOSPC) {
+ xfs_rtbuf_cache_relse(&args);
+ error = 0;
+ }
+ }
+
+ if (!error) {
error = xfs_rtallocate_extent_size(&args, minlen, maxlen, &len,
prod, &rtx);
}
@@ -1314,7 +1342,7 @@ xfs_bmap_rtalloc(
{
struct xfs_mount *mp = ap->ip->i_mount;
xfs_fileoff_t orig_offset = ap->offset;
- xfs_rtxnum_t start; /* allocation hint rtextent no */
+ xfs_rtxnum_t start = 0; /* allocation hint rtextent no */
xfs_rtxlen_t prod = 0; /* product factor for allocators */
xfs_extlen_t mod = 0; /* product factor for allocators */
xfs_rtxlen_t ralen = 0; /* realtime allocation length */
@@ -1323,7 +1351,6 @@ xfs_bmap_rtalloc(
xfs_extlen_t minlen = mp->m_sb.sb_rextsize;
xfs_rtxlen_t raminlen;
bool rtlocked = false;
- bool ignore_locality = false;
int error;
align = xfs_get_extsz_hint(ap->ip);
@@ -1361,28 +1388,8 @@ xfs_bmap_rtalloc(
ASSERT(raminlen > 0);
ASSERT(raminlen <= ralen);
- /*
- * Lock out modifications to both the RT bitmap and summary inodes
- */
- if (!rtlocked) {
- xfs_rtbitmap_lock(mp);
- xfs_rtbitmap_trans_join(ap->tp);
- rtlocked = true;
- }
-
- if (ignore_locality) {
- start = 0;
- } else if (xfs_bmap_adjacent(ap)) {
+ if (xfs_bmap_adjacent(ap))
start = xfs_rtb_to_rtx(mp, ap->blkno);
- } else if (ap->datatype & XFS_ALLOC_INITIAL_USER_DATA) {
- /*
- * If it's an allocation to an empty file at offset 0, pick an
- * extent that will space things out in the rt area.
- */
- start = xfs_rtpick_extent(mp, ap->tp, ralen);
- } else {
- start = 0;
- }
/*
* Only bother calculating a real prod factor if offset & length are
@@ -1398,7 +1405,8 @@ xfs_bmap_rtalloc(
}
error = xfs_rtallocate(ap->tp, start, raminlen, ralen, prod, ap->wasdel,
- &ap->blkno, &ap->length);
+ ap->datatype & XFS_ALLOC_INITIAL_USER_DATA, &rtlocked,
+ &ap->blkno, &ap->length);
if (error == -ENOSPC) {
if (align > mp->m_sb.sb_rextsize) {
/*
@@ -1414,15 +1422,6 @@ xfs_bmap_rtalloc(
goto retry;
}
- if (!ignore_locality && start != 0) {
- /*
- * If we can't allocate near a specific rt extent, try
- * again without locality criteria.
- */
- ignore_locality = true;
- goto retry;
- }
-
ap->blkno = NULLFSBLOCK;
ap->length = 0;
return 0;
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH 04/10] xfs: factor out a xfs_rtallocate_align helper
2024-09-02 18:22 ` [PATCHSET v4.2 5/8] xfs: cleanups for the realtime allocator Darrick J. Wong
` (2 preceding siblings ...)
2024-09-02 18:30 ` [PATCH 03/10] xfs: rework the rtalloc fallback handling Darrick J. Wong
@ 2024-09-02 18:30 ` Darrick J. Wong
2024-09-02 18:30 ` [PATCH 05/10] xfs: make the rtalloc start hint a xfs_rtblock_t Darrick J. Wong
` (5 subsequent siblings)
9 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:30 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Christoph Hellwig <hch@lst.de>
Split the code to calculate the aligned allocation request from
xfs_bmap_rtalloc into a separate self-contained helper.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
fs/xfs/xfs_rtalloc.c | 93 ++++++++++++++++++++++++++++++++------------------
1 file changed, 59 insertions(+), 34 deletions(-)
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index a6b9ba572cdc..61e0c5b7a327 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -1336,30 +1336,33 @@ xfs_rtallocate(
return error;
}
-int
-xfs_bmap_rtalloc(
- struct xfs_bmalloca *ap)
+static int
+xfs_rtallocate_align(
+ struct xfs_bmalloca *ap,
+ xfs_rtxlen_t *ralen,
+ xfs_rtxlen_t *raminlen,
+ xfs_rtxlen_t *prod,
+ bool *noalign)
{
struct xfs_mount *mp = ap->ip->i_mount;
xfs_fileoff_t orig_offset = ap->offset;
- xfs_rtxnum_t start = 0; /* allocation hint rtextent no */
- xfs_rtxlen_t prod = 0; /* product factor for allocators */
- xfs_extlen_t mod = 0; /* product factor for allocators */
- xfs_rtxlen_t ralen = 0; /* realtime allocation length */
- xfs_extlen_t align; /* minimum allocation alignment */
- xfs_extlen_t orig_length = ap->length;
xfs_extlen_t minlen = mp->m_sb.sb_rextsize;
- xfs_rtxlen_t raminlen;
- bool rtlocked = false;
+ xfs_extlen_t align; /* minimum allocation alignment */
+ xfs_extlen_t mod; /* product factor for allocators */
int error;
- align = xfs_get_extsz_hint(ap->ip);
- if (!align)
- align = 1;
-retry:
- error = xfs_bmap_extsize_align(mp, &ap->got, &ap->prev,
- align, 1, ap->eof, 0,
- ap->conv, &ap->offset, &ap->length);
+ if (*noalign) {
+ align = mp->m_sb.sb_rextsize;
+ } else {
+ align = xfs_get_extsz_hint(ap->ip);
+ if (!align)
+ align = 1;
+ if (align == mp->m_sb.sb_rextsize)
+ *noalign = true;
+ }
+
+ error = xfs_bmap_extsize_align(mp, &ap->got, &ap->prev, align, 1,
+ ap->eof, 0, ap->conv, &ap->offset, &ap->length);
if (error)
return error;
ASSERT(ap->length);
@@ -1383,32 +1386,54 @@ xfs_bmap_rtalloc(
* XFS_BMBT_MAX_EXTLEN), we don't hear about that number, and can't
* adjust the starting point to match it.
*/
- ralen = xfs_extlen_to_rtxlen(mp, min(ap->length, XFS_MAX_BMBT_EXTLEN));
- raminlen = max_t(xfs_rtxlen_t, 1, xfs_extlen_to_rtxlen(mp, minlen));
- ASSERT(raminlen > 0);
- ASSERT(raminlen <= ralen);
-
- if (xfs_bmap_adjacent(ap))
- start = xfs_rtb_to_rtx(mp, ap->blkno);
+ *ralen = xfs_extlen_to_rtxlen(mp, min(ap->length, XFS_MAX_BMBT_EXTLEN));
+ *raminlen = max_t(xfs_rtxlen_t, 1, xfs_extlen_to_rtxlen(mp, minlen));
+ ASSERT(*raminlen > 0);
+ ASSERT(*raminlen <= *ralen);
/*
* Only bother calculating a real prod factor if offset & length are
* perfectly aligned, otherwise it will just get us in trouble.
*/
div_u64_rem(ap->offset, align, &mod);
- if (mod || ap->length % align) {
- prod = 1;
- } else {
- prod = xfs_extlen_to_rtxlen(mp, align);
- if (prod > 1)
- xfs_rtalloc_align_minmax(&raminlen, &ralen, &prod);
- }
+ if (mod || ap->length % align)
+ *prod = 1;
+ else
+ *prod = xfs_extlen_to_rtxlen(mp, align);
+
+ if (*prod > 1)
+ xfs_rtalloc_align_minmax(raminlen, ralen, prod);
+ return 0;
+}
+
+int
+xfs_bmap_rtalloc(
+ struct xfs_bmalloca *ap)
+{
+ struct xfs_mount *mp = ap->ip->i_mount;
+ xfs_fileoff_t orig_offset = ap->offset;
+ xfs_rtxnum_t start = 0; /* allocation hint rtextent no */
+ xfs_rtxlen_t prod = 0; /* product factor for allocators */
+ xfs_rtxlen_t ralen = 0; /* realtime allocation length */
+ xfs_extlen_t orig_length = ap->length;
+ xfs_rtxlen_t raminlen;
+ bool rtlocked = false;
+ bool noalign = false;
+ int error;
+
+retry:
+ error = xfs_rtallocate_align(ap, &ralen, &raminlen, &prod, &noalign);
+ if (error)
+ return error;
+
+ if (xfs_bmap_adjacent(ap))
+ start = xfs_rtb_to_rtx(mp, ap->blkno);
error = xfs_rtallocate(ap->tp, start, raminlen, ralen, prod, ap->wasdel,
ap->datatype & XFS_ALLOC_INITIAL_USER_DATA, &rtlocked,
&ap->blkno, &ap->length);
if (error == -ENOSPC) {
- if (align > mp->m_sb.sb_rextsize) {
+ if (!noalign) {
/*
* We previously enlarged the request length to try to
* satisfy an extent size hint. The allocator didn't
@@ -1418,7 +1443,7 @@ xfs_bmap_rtalloc(
*/
ap->offset = orig_offset;
ap->length = orig_length;
- minlen = align = mp->m_sb.sb_rextsize;
+ noalign = true;
goto retry;
}
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH 05/10] xfs: make the rtalloc start hint a xfs_rtblock_t
2024-09-02 18:22 ` [PATCHSET v4.2 5/8] xfs: cleanups for the realtime allocator Darrick J. Wong
` (3 preceding siblings ...)
2024-09-02 18:30 ` [PATCH 04/10] xfs: factor out a xfs_rtallocate_align helper Darrick J. Wong
@ 2024-09-02 18:30 ` Darrick J. Wong
2024-09-02 18:31 ` [PATCH 06/10] xfs: add xchk_setup_nothing and xchk_nothing helpers Darrick J. Wong
` (4 subsequent siblings)
9 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:30 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Christoph Hellwig <hch@lst.de>
0 is a valid start RT extent, and with pending changes it will become
both more common and non-unique. Switch to passing an xfs_rtblock_t instead
so that we can use NULLRTBLOCK to determine whether a hint was set.
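A minimal sketch of the sentinel idea, assuming an all-ones "no block"
value. DEMO_NULLRTBLOCK and the demo_* helpers are hypothetical names for
illustration and do not mirror the kernel's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t demo_rtblock_t;

/* Hypothetical all-ones sentinel, in the spirit of NULLRTBLOCK. */
#define DEMO_NULLRTBLOCK	((demo_rtblock_t)-1)

/* Block 0 is a legitimate hint, so compare against the sentinel. */
static inline bool demo_hint_is_set(demo_rtblock_t bno_hint)
{
	return bno_hint != DEMO_NULLRTBLOCK;
}

/*
 * Convert a block hint into an extent number. Returns false and leaves
 * *rtx untouched when the caller did not supply a hint.
 */
static inline bool demo_hint_to_rtx(demo_rtblock_t bno_hint,
		uint32_t rextsize, uint64_t *rtx)
{
	if (!demo_hint_is_set(bno_hint))
		return false;
	*rtx = bno_hint / rextsize;
	return true;
}
```

The point of the sketch is that "is a hint present" must be an explicit
comparison with the sentinel; a plain truth test would treat block 0 as
"no hint" and the sentinel itself as a valid block.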
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
fs/xfs/xfs_rtalloc.c | 18 +++++++++++-------
1 file changed, 11 insertions(+), 7 deletions(-)
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index 61e0c5b7a327..29edb8044b00 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -1266,7 +1266,7 @@ xfs_rtalloc_align_minmax(
static int
xfs_rtallocate(
struct xfs_trans *tp,
- xfs_rtxnum_t start,
+ xfs_rtblock_t bno_hint,
xfs_rtxlen_t minlen,
xfs_rtxlen_t maxlen,
xfs_rtxlen_t prod,
@@ -1280,6 +1280,7 @@ xfs_rtallocate(
.mp = tp->t_mountp,
.tp = tp,
};
+ xfs_rtxnum_t start = 0;
xfs_rtxnum_t rtx;
xfs_rtxlen_t len = 0;
int error = 0;
@@ -1297,7 +1298,9 @@ xfs_rtallocate(
* For an allocation to an empty file at offset 0, pick an extent that
* will space things out in the rt area.
*/
- if (!start && initial_user_data)
+ if (bno_hint)
+ start = xfs_rtb_to_rtx(args.mp, bno_hint);
+ else if (initial_user_data)
start = xfs_rtpick_extent(args.mp, tp, maxlen);
if (start) {
@@ -1410,15 +1413,16 @@ int
xfs_bmap_rtalloc(
struct xfs_bmalloca *ap)
{
- struct xfs_mount *mp = ap->ip->i_mount;
xfs_fileoff_t orig_offset = ap->offset;
- xfs_rtxnum_t start = 0; /* allocation hint rtextent no */
xfs_rtxlen_t prod = 0; /* product factor for allocators */
xfs_rtxlen_t ralen = 0; /* realtime allocation length */
+ xfs_rtblock_t bno_hint = NULLRTBLOCK;
xfs_extlen_t orig_length = ap->length;
xfs_rtxlen_t raminlen;
bool rtlocked = false;
bool noalign = false;
+ bool initial_user_data =
+ ap->datatype & XFS_ALLOC_INITIAL_USER_DATA;
int error;
retry:
@@ -1427,10 +1431,10 @@ xfs_bmap_rtalloc(
return error;
if (xfs_bmap_adjacent(ap))
- start = xfs_rtb_to_rtx(mp, ap->blkno);
+ bno_hint = ap->blkno;
- error = xfs_rtallocate(ap->tp, start, raminlen, ralen, prod, ap->wasdel,
- ap->datatype & XFS_ALLOC_INITIAL_USER_DATA, &rtlocked,
+ error = xfs_rtallocate(ap->tp, bno_hint, raminlen, ralen, prod,
+ ap->wasdel, initial_user_data, &rtlocked,
&ap->blkno, &ap->length);
if (error == -ENOSPC) {
if (!noalign) {
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH 06/10] xfs: add xchk_setup_nothing and xchk_nothing helpers
2024-09-02 18:22 ` [PATCHSET v4.2 5/8] xfs: cleanups for the realtime allocator Darrick J. Wong
` (4 preceding siblings ...)
2024-09-02 18:30 ` [PATCH 05/10] xfs: make the rtalloc start hint a xfs_rtblock_t Darrick J. Wong
@ 2024-09-02 18:31 ` Darrick J. Wong
2024-09-02 18:31 ` [PATCH 07/10] xfs: remove xfs_{rtbitmap,rtsummary}_wordcount Darrick J. Wong
` (3 subsequent siblings)
9 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:31 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Add common helpers for no-op scrubbing methods.
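The pattern is roughly the following, shown as a self-contained sketch.
DEMO_FEATURE_RT, demo_scrub and the other demo_* names are assumptions for
illustration; the config switch is deliberately left undefined so the stub
path is the one that gets compiled.

```c
#include <errno.h>

struct demo_scrub {
	int	unused;
};

/* One shared stub instead of a copy per compiled-out scrubber. */
static inline int demo_nothing(struct demo_scrub *sc)
{
	(void)sc;
	return -ENOENT;
}

/* DEMO_FEATURE_RT is a hypothetical config switch, not defined here. */
#ifdef DEMO_FEATURE_RT
int demo_scrub_rtbitmap(struct demo_scrub *sc);
#else
# define demo_scrub_rtbitmap	demo_nothing
#endif

typedef int (*demo_scrub_fn)(struct demo_scrub *sc);

/* The dispatch table is written the same way for either configuration. */
static const demo_scrub_fn demo_ops[] = {
	demo_scrub_rtbitmap,
};
```

Because the alias is a plain object-like macro, it works both for direct
calls and for taking the function's address in a table.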
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
[hch: split from a larger patch]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/scrub/common.h | 29 +++++++++--------------------
fs/xfs/scrub/scrub.h | 29 +++++++++--------------------
2 files changed, 18 insertions(+), 40 deletions(-)
diff --git a/fs/xfs/scrub/common.h b/fs/xfs/scrub/common.h
index 3d5f1f6b4b7b..47148cc4a833 100644
--- a/fs/xfs/scrub/common.h
+++ b/fs/xfs/scrub/common.h
@@ -53,6 +53,11 @@ int xchk_checkpoint_log(struct xfs_mount *mp);
bool xchk_should_check_xref(struct xfs_scrub *sc, int *error,
struct xfs_btree_cur **curpp);
+static inline int xchk_setup_nothing(struct xfs_scrub *sc)
+{
+ return -ENOENT;
+}
+
/* Setup functions */
int xchk_setup_agheader(struct xfs_scrub *sc);
int xchk_setup_fs(struct xfs_scrub *sc);
@@ -72,16 +77,8 @@ int xchk_setup_dirtree(struct xfs_scrub *sc);
int xchk_setup_rtbitmap(struct xfs_scrub *sc);
int xchk_setup_rtsummary(struct xfs_scrub *sc);
#else
-static inline int
-xchk_setup_rtbitmap(struct xfs_scrub *sc)
-{
- return -ENOENT;
-}
-static inline int
-xchk_setup_rtsummary(struct xfs_scrub *sc)
-{
- return -ENOENT;
-}
+# define xchk_setup_rtbitmap xchk_setup_nothing
+# define xchk_setup_rtsummary xchk_setup_nothing
#endif
#ifdef CONFIG_XFS_QUOTA
int xchk_ino_dqattach(struct xfs_scrub *sc);
@@ -93,16 +90,8 @@ xchk_ino_dqattach(struct xfs_scrub *sc)
{
return 0;
}
-static inline int
-xchk_setup_quota(struct xfs_scrub *sc)
-{
- return -ENOENT;
-}
-static inline int
-xchk_setup_quotacheck(struct xfs_scrub *sc)
-{
- return -ENOENT;
-}
+# define xchk_setup_quota xchk_setup_nothing
+# define xchk_setup_quotacheck xchk_setup_nothing
#endif
int xchk_setup_fscounters(struct xfs_scrub *sc);
int xchk_setup_nlinks(struct xfs_scrub *sc);
diff --git a/fs/xfs/scrub/scrub.h b/fs/xfs/scrub/scrub.h
index 1bc33f010d0e..5993fcaffb2c 100644
--- a/fs/xfs/scrub/scrub.h
+++ b/fs/xfs/scrub/scrub.h
@@ -231,6 +231,11 @@ xchk_should_terminate(
return false;
}
+static inline int xchk_nothing(struct xfs_scrub *sc)
+{
+ return -ENOENT;
+}
+
/* Metadata scrubbers */
int xchk_tester(struct xfs_scrub *sc);
int xchk_superblock(struct xfs_scrub *sc);
@@ -254,31 +259,15 @@ int xchk_dirtree(struct xfs_scrub *sc);
int xchk_rtbitmap(struct xfs_scrub *sc);
int xchk_rtsummary(struct xfs_scrub *sc);
#else
-static inline int
-xchk_rtbitmap(struct xfs_scrub *sc)
-{
- return -ENOENT;
-}
-static inline int
-xchk_rtsummary(struct xfs_scrub *sc)
-{
- return -ENOENT;
-}
+# define xchk_rtbitmap xchk_nothing
+# define xchk_rtsummary xchk_nothing
#endif
#ifdef CONFIG_XFS_QUOTA
int xchk_quota(struct xfs_scrub *sc);
int xchk_quotacheck(struct xfs_scrub *sc);
#else
-static inline int
-xchk_quota(struct xfs_scrub *sc)
-{
- return -ENOENT;
-}
-static inline int
-xchk_quotacheck(struct xfs_scrub *sc)
-{
- return -ENOENT;
-}
+# define xchk_quota xchk_nothing
+# define xchk_quotacheck xchk_nothing
#endif
int xchk_fscounters(struct xfs_scrub *sc);
int xchk_nlinks(struct xfs_scrub *sc);
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH 07/10] xfs: remove xfs_{rtbitmap,rtsummary}_wordcount
2024-09-02 18:22 ` [PATCHSET v4.2 5/8] xfs: cleanups for the realtime allocator Darrick J. Wong
` (5 preceding siblings ...)
2024-09-02 18:31 ` [PATCH 06/10] xfs: add xchk_setup_nothing and xchk_nothing helpers Darrick J. Wong
@ 2024-09-02 18:31 ` Darrick J. Wong
2024-09-02 18:31 ` [PATCH 08/10] xfs: replace m_rsumsize with m_rsumblocks Darrick J. Wong
` (2 subsequent siblings)
9 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:31 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Christoph Hellwig <hch@lst.de>
xfs_rtbitmap_wordcount and xfs_rtsummary_wordcount are currently unused,
so remove them to simplify refactoring other rtbitmap helpers. They
can be added back or simply open coded when actually needed.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
fs/xfs/libxfs/xfs_rtbitmap.c | 31 -------------------------------
fs/xfs/libxfs/xfs_rtbitmap.h | 7 -------
2 files changed, 38 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_rtbitmap.c b/fs/xfs/libxfs/xfs_rtbitmap.c
index c58eb75ef0fa..76706e8bbc4e 100644
--- a/fs/xfs/libxfs/xfs_rtbitmap.c
+++ b/fs/xfs/libxfs/xfs_rtbitmap.c
@@ -1148,21 +1148,6 @@ xfs_rtbitmap_blockcount(
return howmany_64(rtextents, NBBY * mp->m_sb.sb_blocksize);
}
-/*
- * Compute the number of rtbitmap words needed to populate every block of a
- * bitmap that is large enough to track the given number of rt extents.
- */
-unsigned long long
-xfs_rtbitmap_wordcount(
- struct xfs_mount *mp,
- xfs_rtbxlen_t rtextents)
-{
- xfs_filblks_t blocks;
-
- blocks = xfs_rtbitmap_blockcount(mp, rtextents);
- return XFS_FSB_TO_B(mp, blocks) >> XFS_WORDLOG;
-}
-
/* Compute the number of rtsummary blocks needed to track the given rt space. */
xfs_filblks_t
xfs_rtsummary_blockcount(
@@ -1176,22 +1161,6 @@ xfs_rtsummary_blockcount(
return XFS_B_TO_FSB(mp, rsumwords << XFS_WORDLOG);
}
-/*
- * Compute the number of rtsummary info words needed to populate every block of
- * a summary file that is large enough to track the given rt space.
- */
-unsigned long long
-xfs_rtsummary_wordcount(
- struct xfs_mount *mp,
- unsigned int rsumlevels,
- xfs_extlen_t rbmblocks)
-{
- xfs_filblks_t blocks;
-
- blocks = xfs_rtsummary_blockcount(mp, rsumlevels, rbmblocks);
- return XFS_FSB_TO_B(mp, blocks) >> XFS_WORDLOG;
-}
-
/* Lock both realtime free space metadata inodes for a freespace update. */
void
xfs_rtbitmap_lock(
diff --git a/fs/xfs/libxfs/xfs_rtbitmap.h b/fs/xfs/libxfs/xfs_rtbitmap.h
index 0dbc9bb40668..140513d1d6bc 100644
--- a/fs/xfs/libxfs/xfs_rtbitmap.h
+++ b/fs/xfs/libxfs/xfs_rtbitmap.h
@@ -316,13 +316,8 @@ int xfs_rtfree_blocks(struct xfs_trans *tp, xfs_fsblock_t rtbno,
xfs_filblks_t xfs_rtbitmap_blockcount(struct xfs_mount *mp, xfs_rtbxlen_t
rtextents);
-unsigned long long xfs_rtbitmap_wordcount(struct xfs_mount *mp,
- xfs_rtbxlen_t rtextents);
-
xfs_filblks_t xfs_rtsummary_blockcount(struct xfs_mount *mp,
unsigned int rsumlevels, xfs_extlen_t rbmblocks);
-unsigned long long xfs_rtsummary_wordcount(struct xfs_mount *mp,
- unsigned int rsumlevels, xfs_extlen_t rbmblocks);
int xfs_rtfile_initialize_blocks(struct xfs_inode *ip,
xfs_fileoff_t offset_fsb, xfs_fileoff_t end_fsb, void *data);
@@ -355,9 +350,7 @@ xfs_rtbitmap_blockcount(struct xfs_mount *mp, xfs_rtbxlen_t rtextents)
/* shut up gcc */
return 0;
}
-# define xfs_rtbitmap_wordcount(mp, r) (0)
# define xfs_rtsummary_blockcount(mp, l, b) (0)
-# define xfs_rtsummary_wordcount(mp, l, b) (0)
# define xfs_rtbitmap_lock(mp) do { } while (0)
# define xfs_rtbitmap_trans_join(tp) do { } while (0)
# define xfs_rtbitmap_unlock(mp) do { } while (0)
* [PATCH 08/10] xfs: replace m_rsumsize with m_rsumblocks
2024-09-02 18:22 ` [PATCHSET v4.2 5/8] xfs: cleanups for the realtime allocator Darrick J. Wong
` (6 preceding siblings ...)
2024-09-02 18:31 ` [PATCH 07/10] xfs: remove xfs_{rtbitmap,rtsummary}_wordcount Darrick J. Wong
@ 2024-09-02 18:31 ` Darrick J. Wong
2024-09-02 18:31 ` [PATCH 09/10] xfs: rearrange xfs_fsmap.c a little bit Darrick J. Wong
2024-09-02 18:32 ` [PATCH 10/10] xfs: move xfs_ioc_getfsmap out of xfs_ioctl.c Darrick J. Wong
9 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:31 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Christoph Hellwig <hch@lst.de>
Track the RT summary file size in blocks, just like the RT bitmap
file. While we have users of both units, blocks are used slightly
more often and this matches the bitmap file for consistency.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
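Not part of the patch: a minimal userspace sketch of the unit change, for readers following along. The demo_* names are hypothetical stand-ins, and the two helpers are assumed to mirror what XFS_FSB_TO_B and XFS_B_TO_FSB do (a shift by the block size log2, with round-up on the bytes-to-blocks side). Once the size is kept in blocks, the range check in xfs_rtsummary_read_buf needs no conversion, and only the byte-oriented users (xfile_create, i_disk_size, the log reservation) convert at the point of use.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two mount fields this sketch needs. */
struct demo_mount {
	uint8_t		blocklog;	/* log2 of the fs block size */
	uint64_t	rsumblocks;	/* rt summary size, in fs blocks */
};

/* Blocks to bytes (assumed equivalent of XFS_FSB_TO_B). */
static uint64_t demo_fsb_to_b(const struct demo_mount *mp, uint64_t fsb)
{
	return fsb << mp->blocklog;
}

/* Bytes to blocks, rounding up (assumed equivalent of XFS_B_TO_FSB). */
static uint64_t demo_b_to_fsb(const struct demo_mount *mp, uint64_t bytes)
{
	uint64_t mask = ((uint64_t)1 << mp->blocklog) - 1;

	return (bytes + mask) >> mp->blocklog;
}

/* Range check in block units, as the patched read_buf check does it. */
static int demo_block_in_range(const struct demo_mount *mp, uint64_t block)
{
	return block < mp->rsumblocks;
}
```

With 4k blocks (blocklog 12) and a three-block summary file, demo_fsb_to_b() gives 12288 bytes, and block 3 is the first one rejected by the range check.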
fs/xfs/libxfs/xfs_rtbitmap.c | 2 +-
fs/xfs/libxfs/xfs_trans_resv.c | 2 +-
fs/xfs/scrub/rtsummary.c | 11 +++++------
fs/xfs/scrub/rtsummary.h | 2 +-
fs/xfs/scrub/rtsummary_repair.c | 12 +++++-------
fs/xfs/xfs_mount.h | 2 +-
fs/xfs/xfs_rtalloc.c | 13 +++++--------
7 files changed, 19 insertions(+), 25 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_rtbitmap.c b/fs/xfs/libxfs/xfs_rtbitmap.c
index 76706e8bbc4e..27a4472402ba 100644
--- a/fs/xfs/libxfs/xfs_rtbitmap.c
+++ b/fs/xfs/libxfs/xfs_rtbitmap.c
@@ -162,7 +162,7 @@ xfs_rtsummary_read_buf(
{
struct xfs_mount *mp = args->mp;
- if (XFS_IS_CORRUPT(mp, block >= XFS_B_TO_FSB(mp, mp->m_rsumsize))) {
+ if (XFS_IS_CORRUPT(mp, block >= mp->m_rsumblocks)) {
xfs_rt_mark_sick(args->mp, XFS_SICK_RT_SUMMARY);
return -EFSCORRUPTED;
}
diff --git a/fs/xfs/libxfs/xfs_trans_resv.c b/fs/xfs/libxfs/xfs_trans_resv.c
index 45aaf169806a..2e6d7bb3b5a2 100644
--- a/fs/xfs/libxfs/xfs_trans_resv.c
+++ b/fs/xfs/libxfs/xfs_trans_resv.c
@@ -918,7 +918,7 @@ xfs_calc_growrtfree_reservation(
return xfs_calc_buf_res(1, mp->m_sb.sb_sectsize) +
xfs_calc_inode_res(mp, 2) +
xfs_calc_buf_res(1, mp->m_sb.sb_blocksize) +
- xfs_calc_buf_res(1, mp->m_rsumsize);
+ xfs_calc_buf_res(1, XFS_FSB_TO_B(mp, mp->m_rsumblocks));
}
/*
diff --git a/fs/xfs/scrub/rtsummary.c b/fs/xfs/scrub/rtsummary.c
index 3fee603f5244..7c7366c98338 100644
--- a/fs/xfs/scrub/rtsummary.c
+++ b/fs/xfs/scrub/rtsummary.c
@@ -63,7 +63,8 @@ xchk_setup_rtsummary(
* us to avoid pinning kernel memory for this purpose.
*/
descr = xchk_xfile_descr(sc, "realtime summary file");
- error = xfile_create(descr, mp->m_rsumsize, &sc->xfile);
+ error = xfile_create(descr, XFS_FSB_TO_B(mp, mp->m_rsumblocks),
+ &sc->xfile);
kfree(descr);
if (error)
return error;
@@ -95,16 +96,14 @@ xchk_setup_rtsummary(
* volume. Hence it is safe to compute and check the geometry values.
*/
if (mp->m_sb.sb_rblocks) {
- xfs_filblks_t rsumblocks;
int rextslog;
rts->rextents = xfs_rtb_to_rtx(mp, mp->m_sb.sb_rblocks);
rextslog = xfs_compute_rextslog(rts->rextents);
rts->rsumlevels = rextslog + 1;
rts->rbmblocks = xfs_rtbitmap_blockcount(mp, rts->rextents);
- rsumblocks = xfs_rtsummary_blockcount(mp, rts->rsumlevels,
+ rts->rsumblocks = xfs_rtsummary_blockcount(mp, rts->rsumlevels,
rts->rbmblocks);
- rts->rsumsize = XFS_FSB_TO_B(mp, rsumblocks);
}
return 0;
}
@@ -316,7 +315,7 @@ xchk_rtsummary(
}
/* Is m_rsumsize correct? */
- if (mp->m_rsumsize != rts->rsumsize) {
+ if (mp->m_rsumblocks != rts->rsumblocks) {
xchk_ino_set_corrupt(sc, mp->m_rsumip->i_ino);
goto out_rbm;
}
@@ -332,7 +331,7 @@ xchk_rtsummary(
* growfsrt expands the summary file before updating sb_rextents, so
* the file can be larger than rsumsize.
*/
- if (mp->m_rsumip->i_disk_size < rts->rsumsize) {
+ if (mp->m_rsumip->i_disk_size < XFS_FSB_TO_B(mp, rts->rsumblocks)) {
xchk_ino_set_corrupt(sc, mp->m_rsumip->i_ino);
goto out_rbm;
}
diff --git a/fs/xfs/scrub/rtsummary.h b/fs/xfs/scrub/rtsummary.h
index e1d50304d8d4..e44b04cb6e2d 100644
--- a/fs/xfs/scrub/rtsummary.h
+++ b/fs/xfs/scrub/rtsummary.h
@@ -14,7 +14,7 @@ struct xchk_rtsummary {
uint64_t rextents;
uint64_t rbmblocks;
- uint64_t rsumsize;
+ xfs_filblks_t rsumblocks;
unsigned int rsumlevels;
unsigned int resblks;
diff --git a/fs/xfs/scrub/rtsummary_repair.c b/fs/xfs/scrub/rtsummary_repair.c
index d9e971c4c79f..7deeb948cb70 100644
--- a/fs/xfs/scrub/rtsummary_repair.c
+++ b/fs/xfs/scrub/rtsummary_repair.c
@@ -56,7 +56,7 @@ xrep_setup_rtsummary(
* transaction (which we cannot drop because we cannot drop the
* rtsummary ILOCK) and cannot ask for more reservation.
*/
- blocks = XFS_B_TO_FSB(mp, mp->m_rsumsize);
+ blocks = mp->m_rsumblocks;
blocks += xfs_bmbt_calc_size(mp, blocks) * 2;
if (blocks > UINT_MAX)
return -EOPNOTSUPP;
@@ -100,7 +100,6 @@ xrep_rtsummary(
{
struct xchk_rtsummary *rts = sc->buf;
struct xfs_mount *mp = sc->mp;
- xfs_filblks_t rsumblocks;
int error;
/* We require the rmapbt to rebuild anything. */
@@ -131,10 +130,9 @@ xrep_rtsummary(
}
/* Make sure we have space allocated for the entire summary file. */
- rsumblocks = XFS_B_TO_FSB(mp, rts->rsumsize);
xfs_trans_ijoin(sc->tp, sc->ip, 0);
xfs_trans_ijoin(sc->tp, sc->tempip, 0);
- error = xrep_tempfile_prealloc(sc, 0, rsumblocks);
+ error = xrep_tempfile_prealloc(sc, 0, rts->rsumblocks);
if (error)
return error;
@@ -143,11 +141,11 @@ xrep_rtsummary(
return error;
/* Copy the rtsummary file that we generated. */
- error = xrep_tempfile_copyin(sc, 0, rsumblocks,
+ error = xrep_tempfile_copyin(sc, 0, rts->rsumblocks,
xrep_rtsummary_prep_buf, rts);
if (error)
return error;
- error = xrep_tempfile_set_isize(sc, rts->rsumsize);
+ error = xrep_tempfile_set_isize(sc, XFS_FSB_TO_B(mp, rts->rsumblocks));
if (error)
return error;
@@ -168,7 +166,7 @@ xrep_rtsummary(
memset(mp->m_rsum_cache, 0xFF, mp->m_sb.sb_rbmblocks);
mp->m_rsumlevels = rts->rsumlevels;
- mp->m_rsumsize = rts->rsumsize;
+ mp->m_rsumblocks = rts->rsumblocks;
/* Free the old rtsummary blocks if they're not in use. */
return xrep_reap_ifork(sc, sc->tempip, XFS_DATA_FORK);
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index d0567dfbc036..7bf635cccaa1 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -147,7 +147,7 @@ typedef struct xfs_mount {
int m_logbufs; /* number of log buffers */
int m_logbsize; /* size of each log buffer */
uint m_rsumlevels; /* rt summary levels */
- uint m_rsumsize; /* size of rt summary, bytes */
+ xfs_filblks_t m_rsumblocks; /* size of rt summary, FSBs */
int m_fixedfsid[2]; /* unchanged for life of FS */
uint m_qflags; /* quota status flags */
uint64_t m_features; /* active filesystem features */
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index 29edb8044b00..3a2005a1e673 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -734,9 +734,8 @@ xfs_growfs_rt_bmblock(
nmp->m_sb.sb_rextents = xfs_rtb_to_rtx(nmp, nmp->m_sb.sb_rblocks);
nmp->m_sb.sb_rextslog = xfs_compute_rextslog(nmp->m_sb.sb_rextents);
nmp->m_rsumlevels = nmp->m_sb.sb_rextslog + 1;
- nmp->m_rsumsize = XFS_FSB_TO_B(mp,
- xfs_rtsummary_blockcount(mp, nmp->m_rsumlevels,
- nmp->m_sb.sb_rbmblocks));
+ nmp->m_rsumblocks = xfs_rtsummary_blockcount(mp, nmp->m_rsumlevels,
+ nmp->m_sb.sb_rbmblocks);
/*
* Recompute the growfsrt reservation from the new rsumsize, so that the
@@ -766,7 +765,7 @@ xfs_growfs_rt_bmblock(
* so that inode inactivation won't punch what it thinks are "posteof"
* blocks.
*/
- rsumip->i_disk_size = nmp->m_rsumsize;
+ rsumip->i_disk_size = nmp->m_rsumblocks * nmp->m_sb.sb_blocksize;
i_size_write(VFS_I(rsumip), rsumip->i_disk_size);
xfs_trans_log_inode(args.tp, rsumip, XFS_ILOG_CORE);
@@ -818,7 +817,7 @@ xfs_growfs_rt_bmblock(
* Update the calculated values in the real mount structure.
*/
mp->m_rsumlevels = nmp->m_rsumlevels;
- mp->m_rsumsize = nmp->m_rsumsize;
+ mp->m_rsumblocks = nmp->m_rsumblocks;
xfs_mount_sb_set_rextsize(mp, &mp->m_sb);
/*
@@ -1022,7 +1021,6 @@ xfs_rtmount_init(
struct xfs_buf *bp; /* buffer for last block of subvolume */
struct xfs_sb *sbp; /* filesystem superblock copy in mount */
xfs_daddr_t d; /* address of last block of subvolume */
- unsigned int rsumblocks;
int error;
sbp = &mp->m_sb;
@@ -1034,9 +1032,8 @@ xfs_rtmount_init(
return -ENODEV;
}
mp->m_rsumlevels = sbp->sb_rextslog + 1;
- rsumblocks = xfs_rtsummary_blockcount(mp, mp->m_rsumlevels,
+ mp->m_rsumblocks = xfs_rtsummary_blockcount(mp, mp->m_rsumlevels,
mp->m_sb.sb_rbmblocks);
- mp->m_rsumsize = XFS_FSB_TO_B(mp, rsumblocks);
mp->m_rbmip = mp->m_rsumip = NULL;
/*
* Check that the realtime section is an ok size.
* [PATCH 09/10] xfs: rearrange xfs_fsmap.c a little bit
2024-09-02 18:22 ` [PATCHSET v4.2 5/8] xfs: cleanups for the realtime allocator Darrick J. Wong
` (7 preceding siblings ...)
2024-09-02 18:31 ` [PATCH 08/10] xfs: replace m_rsumsize with m_rsumblocks Darrick J. Wong
@ 2024-09-02 18:31 ` Darrick J. Wong
2024-09-02 18:32 ` [PATCH 10/10] xfs: move xfs_ioc_getfsmap out of xfs_ioctl.c Darrick J. Wong
9 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:31 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
The order of the functions in this file has gotten a little confusing
over the years. Specifically, the two data device implementations
(bnobt and rmapbt) could be adjacent in the source code instead of split
in two by the logdev and rtdev fsmap implementations. We're about to
add more functionality to this file, so rearrange things now.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/xfs_fsmap.c | 268 ++++++++++++++++++++++++++--------------------------
1 file changed, 134 insertions(+), 134 deletions(-)
diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c
index e15446626875..615253406fde 100644
--- a/fs/xfs/xfs_fsmap.c
+++ b/fs/xfs/xfs_fsmap.c
@@ -441,140 +441,6 @@ xfs_getfsmap_set_irec_flags(
irec->rm_flags |= XFS_RMAP_UNWRITTEN;
}
-/* Execute a getfsmap query against the log device. */
-STATIC int
-xfs_getfsmap_logdev(
- struct xfs_trans *tp,
- const struct xfs_fsmap *keys,
- struct xfs_getfsmap_info *info)
-{
- struct xfs_mount *mp = tp->t_mountp;
- struct xfs_rmap_irec rmap;
- xfs_daddr_t rec_daddr, len_daddr;
- xfs_fsblock_t start_fsb, end_fsb;
- uint64_t eofs;
-
- eofs = XFS_FSB_TO_BB(mp, mp->m_sb.sb_logblocks);
- if (keys[0].fmr_physical >= eofs)
- return 0;
- start_fsb = XFS_BB_TO_FSBT(mp,
- keys[0].fmr_physical + keys[0].fmr_length);
- end_fsb = XFS_BB_TO_FSB(mp, min(eofs - 1, keys[1].fmr_physical));
-
- /* Adjust the low key if we are continuing from where we left off. */
- if (keys[0].fmr_length > 0)
- info->low_daddr = XFS_FSB_TO_BB(mp, start_fsb);
-
- trace_xfs_fsmap_low_key_linear(mp, info->dev, start_fsb);
- trace_xfs_fsmap_high_key_linear(mp, info->dev, end_fsb);
-
- if (start_fsb > 0)
- return 0;
-
- /* Fabricate an rmap entry for the external log device. */
- rmap.rm_startblock = 0;
- rmap.rm_blockcount = mp->m_sb.sb_logblocks;
- rmap.rm_owner = XFS_RMAP_OWN_LOG;
- rmap.rm_offset = 0;
- rmap.rm_flags = 0;
-
- rec_daddr = XFS_FSB_TO_BB(mp, rmap.rm_startblock);
- len_daddr = XFS_FSB_TO_BB(mp, rmap.rm_blockcount);
- return xfs_getfsmap_helper(tp, info, &rmap, rec_daddr, len_daddr);
-}
-
-#ifdef CONFIG_XFS_RT
-/* Transform a rtbitmap "record" into a fsmap */
-STATIC int
-xfs_getfsmap_rtdev_rtbitmap_helper(
- struct xfs_mount *mp,
- struct xfs_trans *tp,
- const struct xfs_rtalloc_rec *rec,
- void *priv)
-{
- struct xfs_getfsmap_info *info = priv;
- struct xfs_rmap_irec irec;
- xfs_rtblock_t rtbno;
- xfs_daddr_t rec_daddr, len_daddr;
-
- rtbno = xfs_rtx_to_rtb(mp, rec->ar_startext);
- rec_daddr = XFS_FSB_TO_BB(mp, rtbno);
- irec.rm_startblock = rtbno;
-
- rtbno = xfs_rtx_to_rtb(mp, rec->ar_extcount);
- len_daddr = XFS_FSB_TO_BB(mp, rtbno);
- irec.rm_blockcount = rtbno;
-
- irec.rm_owner = XFS_RMAP_OWN_NULL; /* "free" */
- irec.rm_offset = 0;
- irec.rm_flags = 0;
-
- return xfs_getfsmap_helper(tp, info, &irec, rec_daddr, len_daddr);
-}
-
-/* Execute a getfsmap query against the realtime device rtbitmap. */
-STATIC int
-xfs_getfsmap_rtdev_rtbitmap(
- struct xfs_trans *tp,
- const struct xfs_fsmap *keys,
- struct xfs_getfsmap_info *info)
-{
-
- struct xfs_rtalloc_rec ahigh = { 0 };
- struct xfs_mount *mp = tp->t_mountp;
- xfs_rtblock_t start_rtb;
- xfs_rtblock_t end_rtb;
- xfs_rtxnum_t high;
- uint64_t eofs;
- int error;
-
- eofs = XFS_FSB_TO_BB(mp, xfs_rtx_to_rtb(mp, mp->m_sb.sb_rextents));
- if (keys[0].fmr_physical >= eofs)
- return 0;
- start_rtb = XFS_BB_TO_FSBT(mp,
- keys[0].fmr_physical + keys[0].fmr_length);
- end_rtb = XFS_BB_TO_FSB(mp, min(eofs - 1, keys[1].fmr_physical));
-
- info->missing_owner = XFS_FMR_OWN_UNKNOWN;
-
- /* Adjust the low key if we are continuing from where we left off. */
- if (keys[0].fmr_length > 0) {
- info->low_daddr = XFS_FSB_TO_BB(mp, start_rtb);
- if (info->low_daddr >= eofs)
- return 0;
- }
-
- trace_xfs_fsmap_low_key_linear(mp, info->dev, start_rtb);
- trace_xfs_fsmap_high_key_linear(mp, info->dev, end_rtb);
-
- xfs_rtbitmap_lock_shared(mp, XFS_RBMLOCK_BITMAP);
-
- /*
- * Set up query parameters to return free rtextents covering the range
- * we want.
- */
- high = xfs_rtb_to_rtxup(mp, end_rtb);
- error = xfs_rtalloc_query_range(mp, tp, xfs_rtb_to_rtx(mp, start_rtb),
- high, xfs_getfsmap_rtdev_rtbitmap_helper, info);
- if (error)
- goto err;
-
- /*
- * Report any gaps at the end of the rtbitmap by simulating a null
- * rmap starting at the block after the end of the query range.
- */
- info->last = true;
- ahigh.ar_startext = min(mp->m_sb.sb_rextents, high);
-
- error = xfs_getfsmap_rtdev_rtbitmap_helper(mp, tp, &ahigh, info);
- if (error)
- goto err;
-err:
- xfs_rtbitmap_unlock_shared(mp, XFS_RBMLOCK_BITMAP);
- return error;
-}
-#endif /* CONFIG_XFS_RT */
-
static inline bool
rmap_not_shareable(struct xfs_mount *mp, const struct xfs_rmap_irec *r)
{
@@ -799,6 +665,140 @@ xfs_getfsmap_datadev_bnobt(
xfs_getfsmap_datadev_bnobt_query, &akeys[0]);
}
+/* Execute a getfsmap query against the log device. */
+STATIC int
+xfs_getfsmap_logdev(
+ struct xfs_trans *tp,
+ const struct xfs_fsmap *keys,
+ struct xfs_getfsmap_info *info)
+{
+ struct xfs_mount *mp = tp->t_mountp;
+ struct xfs_rmap_irec rmap;
+ xfs_daddr_t rec_daddr, len_daddr;
+ xfs_fsblock_t start_fsb, end_fsb;
+ uint64_t eofs;
+
+ eofs = XFS_FSB_TO_BB(mp, mp->m_sb.sb_logblocks);
+ if (keys[0].fmr_physical >= eofs)
+ return 0;
+ start_fsb = XFS_BB_TO_FSBT(mp,
+ keys[0].fmr_physical + keys[0].fmr_length);
+ end_fsb = XFS_BB_TO_FSB(mp, min(eofs - 1, keys[1].fmr_physical));
+
+ /* Adjust the low key if we are continuing from where we left off. */
+ if (keys[0].fmr_length > 0)
+ info->low_daddr = XFS_FSB_TO_BB(mp, start_fsb);
+
+ trace_xfs_fsmap_low_key_linear(mp, info->dev, start_fsb);
+ trace_xfs_fsmap_high_key_linear(mp, info->dev, end_fsb);
+
+ if (start_fsb > 0)
+ return 0;
+
+ /* Fabricate an rmap entry for the external log device. */
+ rmap.rm_startblock = 0;
+ rmap.rm_blockcount = mp->m_sb.sb_logblocks;
+ rmap.rm_owner = XFS_RMAP_OWN_LOG;
+ rmap.rm_offset = 0;
+ rmap.rm_flags = 0;
+
+ rec_daddr = XFS_FSB_TO_BB(mp, rmap.rm_startblock);
+ len_daddr = XFS_FSB_TO_BB(mp, rmap.rm_blockcount);
+ return xfs_getfsmap_helper(tp, info, &rmap, rec_daddr, len_daddr);
+}
+
+#ifdef CONFIG_XFS_RT
+/* Transform a rtbitmap "record" into a fsmap */
+STATIC int
+xfs_getfsmap_rtdev_rtbitmap_helper(
+ struct xfs_mount *mp,
+ struct xfs_trans *tp,
+ const struct xfs_rtalloc_rec *rec,
+ void *priv)
+{
+ struct xfs_getfsmap_info *info = priv;
+ struct xfs_rmap_irec irec;
+ xfs_rtblock_t rtbno;
+ xfs_daddr_t rec_daddr, len_daddr;
+
+ rtbno = xfs_rtx_to_rtb(mp, rec->ar_startext);
+ rec_daddr = XFS_FSB_TO_BB(mp, rtbno);
+ irec.rm_startblock = rtbno;
+
+ rtbno = xfs_rtx_to_rtb(mp, rec->ar_extcount);
+ len_daddr = XFS_FSB_TO_BB(mp, rtbno);
+ irec.rm_blockcount = rtbno;
+
+ irec.rm_owner = XFS_RMAP_OWN_NULL; /* "free" */
+ irec.rm_offset = 0;
+ irec.rm_flags = 0;
+
+ return xfs_getfsmap_helper(tp, info, &irec, rec_daddr, len_daddr);
+}
+
+/* Execute a getfsmap query against the realtime device rtbitmap. */
+STATIC int
+xfs_getfsmap_rtdev_rtbitmap(
+ struct xfs_trans *tp,
+ const struct xfs_fsmap *keys,
+ struct xfs_getfsmap_info *info)
+{
+
+ struct xfs_rtalloc_rec ahigh = { 0 };
+ struct xfs_mount *mp = tp->t_mountp;
+ xfs_rtblock_t start_rtb;
+ xfs_rtblock_t end_rtb;
+ xfs_rtxnum_t high;
+ uint64_t eofs;
+ int error;
+
+ eofs = XFS_FSB_TO_BB(mp, xfs_rtx_to_rtb(mp, mp->m_sb.sb_rextents));
+ if (keys[0].fmr_physical >= eofs)
+ return 0;
+ start_rtb = XFS_BB_TO_FSBT(mp,
+ keys[0].fmr_physical + keys[0].fmr_length);
+ end_rtb = XFS_BB_TO_FSB(mp, min(eofs - 1, keys[1].fmr_physical));
+
+ info->missing_owner = XFS_FMR_OWN_UNKNOWN;
+
+ /* Adjust the low key if we are continuing from where we left off. */
+ if (keys[0].fmr_length > 0) {
+ info->low_daddr = XFS_FSB_TO_BB(mp, start_rtb);
+ if (info->low_daddr >= eofs)
+ return 0;
+ }
+
+ trace_xfs_fsmap_low_key_linear(mp, info->dev, start_rtb);
+ trace_xfs_fsmap_high_key_linear(mp, info->dev, end_rtb);
+
+ xfs_rtbitmap_lock_shared(mp, XFS_RBMLOCK_BITMAP);
+
+ /*
+ * Set up query parameters to return free rtextents covering the range
+ * we want.
+ */
+ high = xfs_rtb_to_rtxup(mp, end_rtb);
+ error = xfs_rtalloc_query_range(mp, tp, xfs_rtb_to_rtx(mp, start_rtb),
+ high, xfs_getfsmap_rtdev_rtbitmap_helper, info);
+ if (error)
+ goto err;
+
+ /*
+ * Report any gaps at the end of the rtbitmap by simulating a null
+ * rmap starting at the block after the end of the query range.
+ */
+ info->last = true;
+ ahigh.ar_startext = min(mp->m_sb.sb_rextents, high);
+
+ error = xfs_getfsmap_rtdev_rtbitmap_helper(mp, tp, &ahigh, info);
+ if (error)
+ goto err;
+err:
+ xfs_rtbitmap_unlock_shared(mp, XFS_RBMLOCK_BITMAP);
+ return error;
+}
+#endif /* CONFIG_XFS_RT */
+
/* Do we recognize the device? */
STATIC bool
xfs_getfsmap_is_valid_device(
* [PATCH 10/10] xfs: move xfs_ioc_getfsmap out of xfs_ioctl.c
2024-09-02 18:22 ` [PATCHSET v4.2 5/8] xfs: cleanups for the realtime allocator Darrick J. Wong
` (8 preceding siblings ...)
2024-09-02 18:31 ` [PATCH 09/10] xfs: rearrange xfs_fsmap.c a little bit Darrick J. Wong
@ 2024-09-02 18:32 ` Darrick J. Wong
9 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:32 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Move this function out of xfs_ioctl.c to reduce the clutter in there,
and make the entire getfsmap implementation self-contained in a single
file.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
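Not part of the patch: a simplified userspace model of the chunked copy loop in xfs_ioc_getfsmap, in case it helps review the moved code. Everything named demo_* is hypothetical. The record source is reduced to a plain counter ("available"), the stand-in struct is assumed to match the 64-byte layout of the UAPI struct fsmap, and a pass is assumed to hit the "buffer full" case only when more records remain than fit. The real xfs_getfsmap can behave differently at the boundaries.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in assumed to have the same layout as the UAPI struct fsmap. */
struct demo_fsmap {
	uint32_t	fmr_device;
	uint32_t	fmr_flags;
	uint64_t	fmr_physical;
	uint64_t	fmr_owner;
	uint64_t	fmr_offset;
	uint64_t	fmr_length;
	uint64_t	fmr_reserved[3];
};

struct demo_result {
	unsigned int	entries;	/* records reported to the caller */
	unsigned int	passes;		/* trips through the query loop */
	int		last_flagged;	/* would FMR_OF_LAST be set? */
};

static unsigned int demo_min(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Records that fit in the 128k internal buffer, capped by the request. */
static unsigned int demo_buffer_records(unsigned int fmh_count)
{
	return demo_min(fmh_count,
			(unsigned int)(131072 / sizeof(struct demo_fsmap)));
}

static struct demo_result
demo_getfsmap_loop(unsigned int fmh_count, unsigned int available)
{
	struct demo_result res = { 0, 0, 0 };
	unsigned int count = demo_buffer_records(fmh_count);
	int done = 0;

	do {
		unsigned int room = demo_min(count, fmh_count - res.entries);
		unsigned int left = available - res.entries;
		unsigned int got;

		res.passes++;
		if (fmh_count == 0 || left <= room) {
			/* Counting mode, or the result set ran out. */
			got = left;
			done = 1;
		} else {
			/* Internal buffer full; go around again. */
			got = room;
		}
		res.entries += got;
		if (fmh_count == 0 || got == 0)
			break;
	} while (!done && res.entries < fmh_count);

	res.last_flagged = done && fmh_count > 0 && res.entries > 0;
	return res;
}
```

In this model a request for 5000 records against 10000 available takes three passes (2048, 2048, 904) and does not flag the last record, while a request for 5000 against 3000 available takes two passes and does.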
fs/xfs/xfs_fsmap.c | 134 +++++++++++++++++++++++++++++++++++++++++++++++++++-
fs/xfs/xfs_fsmap.h | 6 +-
fs/xfs/xfs_ioctl.c | 130 --------------------------------------------------
3 files changed, 134 insertions(+), 136 deletions(-)
diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c
index 615253406fde..ae18ab86e608 100644
--- a/fs/xfs/xfs_fsmap.c
+++ b/fs/xfs/xfs_fsmap.c
@@ -44,7 +44,7 @@ xfs_fsmap_from_internal(
}
/* Convert an fsmap to an xfs_fsmap. */
-void
+static void
xfs_fsmap_to_internal(
struct xfs_fsmap *dest,
struct fsmap *src)
@@ -889,7 +889,7 @@ xfs_getfsmap_check_keys(
* xfs_getfsmap_info.low/high -- per-AG low/high keys computed from
* dkeys; used to query the metadata.
*/
-int
+STATIC int
xfs_getfsmap(
struct xfs_mount *mp,
struct xfs_fsmap_head *head,
@@ -1019,3 +1019,133 @@ xfs_getfsmap(
head->fmh_oflags = FMH_OF_DEV_T;
return error;
}
+
+int
+xfs_ioc_getfsmap(
+ struct xfs_inode *ip,
+ struct fsmap_head __user *arg)
+{
+ struct xfs_fsmap_head xhead = {0};
+ struct fsmap_head head;
+ struct fsmap *recs;
+ unsigned int count;
+ __u32 last_flags = 0;
+ bool done = false;
+ int error;
+
+ if (copy_from_user(&head, arg, sizeof(struct fsmap_head)))
+ return -EFAULT;
+ if (memchr_inv(head.fmh_reserved, 0, sizeof(head.fmh_reserved)) ||
+ memchr_inv(head.fmh_keys[0].fmr_reserved, 0,
+ sizeof(head.fmh_keys[0].fmr_reserved)) ||
+ memchr_inv(head.fmh_keys[1].fmr_reserved, 0,
+ sizeof(head.fmh_keys[1].fmr_reserved)))
+ return -EINVAL;
+
+ /*
+ * Use an internal memory buffer so that we don't have to copy fsmap
+ * data to userspace while holding locks. Start by trying to allocate
+ * up to 128k for the buffer, but fall back to a single page if needed.
+ */
+ count = min_t(unsigned int, head.fmh_count,
+ 131072 / sizeof(struct fsmap));
+ recs = kvcalloc(count, sizeof(struct fsmap), GFP_KERNEL);
+ if (!recs) {
+ count = min_t(unsigned int, head.fmh_count,
+ PAGE_SIZE / sizeof(struct fsmap));
+ recs = kvcalloc(count, sizeof(struct fsmap), GFP_KERNEL);
+ if (!recs)
+ return -ENOMEM;
+ }
+
+ xhead.fmh_iflags = head.fmh_iflags;
+ xfs_fsmap_to_internal(&xhead.fmh_keys[0], &head.fmh_keys[0]);
+ xfs_fsmap_to_internal(&xhead.fmh_keys[1], &head.fmh_keys[1]);
+
+ trace_xfs_getfsmap_low_key(ip->i_mount, &xhead.fmh_keys[0]);
+ trace_xfs_getfsmap_high_key(ip->i_mount, &xhead.fmh_keys[1]);
+
+ head.fmh_entries = 0;
+ do {
+ struct fsmap __user *user_recs;
+ struct fsmap *last_rec;
+
+ user_recs = &arg->fmh_recs[head.fmh_entries];
+ xhead.fmh_entries = 0;
+ xhead.fmh_count = min_t(unsigned int, count,
+ head.fmh_count - head.fmh_entries);
+
+ /* Run query, record how many entries we got. */
+ error = xfs_getfsmap(ip->i_mount, &xhead, recs);
+ switch (error) {
+ case 0:
+ /*
+ * There are no more records in the result set. Copy
+ * whatever we got to userspace and break out.
+ */
+ done = true;
+ break;
+ case -ECANCELED:
+ /*
+ * The internal memory buffer is full. Copy whatever
+ * records we got to userspace and go again if we have
+ * not yet filled the userspace buffer.
+ */
+ error = 0;
+ break;
+ default:
+ goto out_free;
+ }
+ head.fmh_entries += xhead.fmh_entries;
+ head.fmh_oflags = xhead.fmh_oflags;
+
+ /*
+ * If the caller wanted a record count or there aren't any
+ * new records to return, we're done.
+ */
+ if (head.fmh_count == 0 || xhead.fmh_entries == 0)
+ break;
+
+ /* Copy all the records we got out to userspace. */
+ if (copy_to_user(user_recs, recs,
+ xhead.fmh_entries * sizeof(struct fsmap))) {
+ error = -EFAULT;
+ goto out_free;
+ }
+
+ /* Remember the last record flags we copied to userspace. */
+ last_rec = &recs[xhead.fmh_entries - 1];
+ last_flags = last_rec->fmr_flags;
+
+ /* Set up the low key for the next iteration. */
+ xfs_fsmap_to_internal(&xhead.fmh_keys[0], last_rec);
+ trace_xfs_getfsmap_low_key(ip->i_mount, &xhead.fmh_keys[0]);
+ } while (!done && head.fmh_entries < head.fmh_count);
+
+ /*
+ * If there are no more records in the query result set and we're not
+ * in counting mode, mark the last record returned with the LAST flag.
+ */
+ if (done && head.fmh_count > 0 && head.fmh_entries > 0) {
+ struct fsmap __user *user_rec;
+
+ last_flags |= FMR_OF_LAST;
+ user_rec = &arg->fmh_recs[head.fmh_entries - 1];
+
+ if (copy_to_user(&user_rec->fmr_flags, &last_flags,
+ sizeof(last_flags))) {
+ error = -EFAULT;
+ goto out_free;
+ }
+ }
+
+ /* copy back header */
+ if (copy_to_user(arg, &head, sizeof(struct fsmap_head))) {
+ error = -EFAULT;
+ goto out_free;
+ }
+
+out_free:
+ kvfree(recs);
+ return error;
+}
diff --git a/fs/xfs/xfs_fsmap.h b/fs/xfs/xfs_fsmap.h
index a0775788e7b1..a0bcc38486a5 100644
--- a/fs/xfs/xfs_fsmap.h
+++ b/fs/xfs/xfs_fsmap.h
@@ -7,6 +7,7 @@
#define __XFS_FSMAP_H__
struct fsmap;
+struct fsmap_head;
/* internal fsmap representation */
struct xfs_fsmap {
@@ -27,9 +28,6 @@ struct xfs_fsmap_head {
struct xfs_fsmap fmh_keys[2]; /* low and high keys */
};
-void xfs_fsmap_to_internal(struct xfs_fsmap *dest, struct fsmap *src);
-
-int xfs_getfsmap(struct xfs_mount *mp, struct xfs_fsmap_head *head,
- struct fsmap *out_recs);
+int xfs_ioc_getfsmap(struct xfs_inode *ip, struct fsmap_head __user *arg);
#endif /* __XFS_FSMAP_H__ */
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 90b3ee21e7fe..7226d27e8afc 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -876,136 +876,6 @@ xfs_ioc_getbmap(
return error;
}
-STATIC int
-xfs_ioc_getfsmap(
- struct xfs_inode *ip,
- struct fsmap_head __user *arg)
-{
- struct xfs_fsmap_head xhead = {0};
- struct fsmap_head head;
- struct fsmap *recs;
- unsigned int count;
- __u32 last_flags = 0;
- bool done = false;
- int error;
-
- if (copy_from_user(&head, arg, sizeof(struct fsmap_head)))
- return -EFAULT;
- if (memchr_inv(head.fmh_reserved, 0, sizeof(head.fmh_reserved)) ||
- memchr_inv(head.fmh_keys[0].fmr_reserved, 0,
- sizeof(head.fmh_keys[0].fmr_reserved)) ||
- memchr_inv(head.fmh_keys[1].fmr_reserved, 0,
- sizeof(head.fmh_keys[1].fmr_reserved)))
- return -EINVAL;
-
- /*
- * Use an internal memory buffer so that we don't have to copy fsmap
- * data to userspace while holding locks. Start by trying to allocate
- * up to 128k for the buffer, but fall back to a single page if needed.
- */
- count = min_t(unsigned int, head.fmh_count,
- 131072 / sizeof(struct fsmap));
- recs = kvcalloc(count, sizeof(struct fsmap), GFP_KERNEL);
- if (!recs) {
- count = min_t(unsigned int, head.fmh_count,
- PAGE_SIZE / sizeof(struct fsmap));
- recs = kvcalloc(count, sizeof(struct fsmap), GFP_KERNEL);
- if (!recs)
- return -ENOMEM;
- }
-
- xhead.fmh_iflags = head.fmh_iflags;
- xfs_fsmap_to_internal(&xhead.fmh_keys[0], &head.fmh_keys[0]);
- xfs_fsmap_to_internal(&xhead.fmh_keys[1], &head.fmh_keys[1]);
-
- trace_xfs_getfsmap_low_key(ip->i_mount, &xhead.fmh_keys[0]);
- trace_xfs_getfsmap_high_key(ip->i_mount, &xhead.fmh_keys[1]);
-
- head.fmh_entries = 0;
- do {
- struct fsmap __user *user_recs;
- struct fsmap *last_rec;
-
- user_recs = &arg->fmh_recs[head.fmh_entries];
- xhead.fmh_entries = 0;
- xhead.fmh_count = min_t(unsigned int, count,
- head.fmh_count - head.fmh_entries);
-
- /* Run query, record how many entries we got. */
- error = xfs_getfsmap(ip->i_mount, &xhead, recs);
- switch (error) {
- case 0:
- /*
- * There are no more records in the result set. Copy
- * whatever we got to userspace and break out.
- */
- done = true;
- break;
- case -ECANCELED:
- /*
- * The internal memory buffer is full. Copy whatever
- * records we got to userspace and go again if we have
- * not yet filled the userspace buffer.
- */
- error = 0;
- break;
- default:
- goto out_free;
- }
- head.fmh_entries += xhead.fmh_entries;
- head.fmh_oflags = xhead.fmh_oflags;
-
- /*
- * If the caller wanted a record count or there aren't any
- * new records to return, we're done.
- */
- if (head.fmh_count == 0 || xhead.fmh_entries == 0)
- break;
-
- /* Copy all the records we got out to userspace. */
- if (copy_to_user(user_recs, recs,
- xhead.fmh_entries * sizeof(struct fsmap))) {
- error = -EFAULT;
- goto out_free;
- }
-
- /* Remember the last record flags we copied to userspace. */
- last_rec = &recs[xhead.fmh_entries - 1];
- last_flags = last_rec->fmr_flags;
-
- /* Set up the low key for the next iteration. */
- xfs_fsmap_to_internal(&xhead.fmh_keys[0], last_rec);
- trace_xfs_getfsmap_low_key(ip->i_mount, &xhead.fmh_keys[0]);
- } while (!done && head.fmh_entries < head.fmh_count);
-
- /*
- * If there are no more records in the query result set and we're not
- * in counting mode, mark the last record returned with the LAST flag.
- */
- if (done && head.fmh_count > 0 && head.fmh_entries > 0) {
- struct fsmap __user *user_rec;
-
- last_flags |= FMR_OF_LAST;
- user_rec = &arg->fmh_recs[head.fmh_entries - 1];
-
- if (copy_to_user(&user_rec->fmr_flags, &last_flags,
- sizeof(last_flags))) {
- error = -EFAULT;
- goto out_free;
- }
- }
-
- /* copy back header */
- if (copy_to_user(arg, &head, sizeof(struct fsmap_head))) {
- error = -EFAULT;
- goto out_free;
- }
-
-out_free:
- kvfree(recs);
- return error;
-}
-
int
xfs_ioc_swapext(
xfs_swapext_t *sxp)
* [PATCH 1/1] xfs: refactor loading quota inodes in the regular case
2024-09-02 18:22 ` [PATCHSET v4.2 6/8] xfs: cleanups for quota mount Darrick J. Wong
@ 2024-09-02 18:32 ` Darrick J. Wong
0 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:32 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Create a helper function to load quota inodes in the case where the
dqtype and the sb quota inode fields correspond. This is true for
nearly all the iget callsites in the quota code, except for when we're
switching the group and project quota inodes. We'll need this in
subsequent patches to make the metadir handling less convoluted.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
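Not part of the patch: a small userspace sketch of the calling convention the new helper sets up, assuming only what the commit message and the diff show. The demo_* names are hypothetical, the superblock is reduced to the three quota inode fields, and -EINVAL stands in for the kernel-only -EFSCORRUPTED. The point is that -ENOENT means "no inode configured for this type", which callers such as xfs_qm_scall_trunc_qfile treat as success rather than as a failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define DEMO_NULLFSINO	((uint64_t)-1)

enum demo_dqtype {
	DEMO_DQTYPE_USER,
	DEMO_DQTYPE_GROUP,
	DEMO_DQTYPE_PROJ,
};

/* Hypothetical stand-in for the quota inode fields of the superblock. */
struct demo_sb {
	uint64_t	uquotino;
	uint64_t	gquotino;
	uint64_t	pquotino;
};

/*
 * Map a quota type to its superblock inode field.  Returns 0 and the inode
 * number, -ENOENT if the field is unset, or -EINVAL for an unknown type.
 */
static int
demo_qino_lookup(const struct demo_sb *sb, enum demo_dqtype type,
		 uint64_t *inop)
{
	uint64_t ino = DEMO_NULLFSINO;

	switch (type) {
	case DEMO_DQTYPE_USER:
		ino = sb->uquotino;
		break;
	case DEMO_DQTYPE_GROUP:
		ino = sb->gquotino;
		break;
	case DEMO_DQTYPE_PROJ:
		ino = sb->pquotino;
		break;
	default:
		return -EINVAL;
	}

	if (ino == DEMO_NULLFSINO)
		return -ENOENT;

	*inop = ino;
	return 0;
}

/* Caller pattern: a missing quota inode is not an error. */
static int
demo_trunc_qfile(const struct demo_sb *sb, enum demo_dqtype type,
		 int *truncated)
{
	uint64_t ino;
	int error;

	*truncated = 0;
	error = demo_qino_lookup(sb, type, &ino);
	if (error == -ENOENT)
		return 0;
	if (error)
		return error;

	/* The real code would truncate the quota file here. */
	*truncated = 1;
	return 0;
}
```

With the group quota inode unset, demo_trunc_qfile() returns 0 without doing any work for the group type and still processes the user and project types.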
fs/xfs/xfs_qm.c | 46 +++++++++++++++++++++++++++++++++++-----
fs/xfs/xfs_qm.h | 3 +++
fs/xfs/xfs_qm_syscalls.c | 13 +++++------
fs/xfs/xfs_quotaops.c | 53 +++++++++++++++++++++++++++-------------------
4 files changed, 80 insertions(+), 35 deletions(-)
diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index 63f6ca2db251..7e2307921deb 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -1538,6 +1538,43 @@ xfs_qm_mount_quotas(
}
}
+/*
+ * Load the inode for a given type of quota, assuming that the sb fields have
+ * been sorted out. This is not true when switching quota types on a V4
+ * filesystem, so do not use this function for that.
+ *
+ * Returns -ENOENT if the quota inode field is NULLFSINO; 0 and an inode on
+ * success; or a negative errno.
+ */
+int
+xfs_qm_qino_load(
+ struct xfs_mount *mp,
+ xfs_dqtype_t type,
+ struct xfs_inode **ipp)
+{
+ xfs_ino_t ino = NULLFSINO;
+
+ switch (type) {
+ case XFS_DQTYPE_USER:
+ ino = mp->m_sb.sb_uquotino;
+ break;
+ case XFS_DQTYPE_GROUP:
+ ino = mp->m_sb.sb_gquotino;
+ break;
+ case XFS_DQTYPE_PROJ:
+ ino = mp->m_sb.sb_pquotino;
+ break;
+ default:
+ ASSERT(0);
+ return -EFSCORRUPTED;
+ }
+
+ if (ino == NULLFSINO)
+ return -ENOENT;
+
+ return xfs_iget(mp, NULL, ino, 0, 0, ipp);
+}
+
/*
* This is called after the superblock has been read in and we're ready to
* iget the quota inodes.
@@ -1561,24 +1598,21 @@ xfs_qm_init_quotainos(
if (XFS_IS_UQUOTA_ON(mp) &&
mp->m_sb.sb_uquotino != NULLFSINO) {
ASSERT(mp->m_sb.sb_uquotino > 0);
- error = xfs_iget(mp, NULL, mp->m_sb.sb_uquotino,
- 0, 0, &uip);
+ error = xfs_qm_qino_load(mp, XFS_DQTYPE_USER, &uip);
if (error)
return error;
}
if (XFS_IS_GQUOTA_ON(mp) &&
mp->m_sb.sb_gquotino != NULLFSINO) {
ASSERT(mp->m_sb.sb_gquotino > 0);
- error = xfs_iget(mp, NULL, mp->m_sb.sb_gquotino,
- 0, 0, &gip);
+ error = xfs_qm_qino_load(mp, XFS_DQTYPE_GROUP, &gip);
if (error)
goto error_rele;
}
if (XFS_IS_PQUOTA_ON(mp) &&
mp->m_sb.sb_pquotino != NULLFSINO) {
ASSERT(mp->m_sb.sb_pquotino > 0);
- error = xfs_iget(mp, NULL, mp->m_sb.sb_pquotino,
- 0, 0, &pip);
+ error = xfs_qm_qino_load(mp, XFS_DQTYPE_PROJ, &pip);
if (error)
goto error_rele;
}
diff --git a/fs/xfs/xfs_qm.h b/fs/xfs/xfs_qm.h
index 6e09dfcd13e2..e919c7f62f57 100644
--- a/fs/xfs/xfs_qm.h
+++ b/fs/xfs/xfs_qm.h
@@ -184,4 +184,7 @@ xfs_get_defquota(struct xfs_quotainfo *qi, xfs_dqtype_t type)
}
}
+int xfs_qm_qino_load(struct xfs_mount *mp, xfs_dqtype_t type,
+ struct xfs_inode **ipp);
+
#endif /* __XFS_QM_H__ */
diff --git a/fs/xfs/xfs_qm_syscalls.c b/fs/xfs/xfs_qm_syscalls.c
index 392cb39cc10c..4eda50ae2d1c 100644
--- a/fs/xfs/xfs_qm_syscalls.c
+++ b/fs/xfs/xfs_qm_syscalls.c
@@ -53,16 +53,15 @@ xfs_qm_scall_quotaoff(
STATIC int
xfs_qm_scall_trunc_qfile(
struct xfs_mount *mp,
- xfs_ino_t ino)
+ xfs_dqtype_t type)
{
struct xfs_inode *ip;
struct xfs_trans *tp;
int error;
- if (ino == NULLFSINO)
+ error = xfs_qm_qino_load(mp, type, &ip);
+ if (error == -ENOENT)
return 0;
-
- error = xfs_iget(mp, NULL, ino, 0, 0, &ip);
if (error)
return error;
@@ -113,17 +112,17 @@ xfs_qm_scall_trunc_qfiles(
}
if (flags & XFS_QMOPT_UQUOTA) {
- error = xfs_qm_scall_trunc_qfile(mp, mp->m_sb.sb_uquotino);
+ error = xfs_qm_scall_trunc_qfile(mp, XFS_DQTYPE_USER);
if (error)
return error;
}
if (flags & XFS_QMOPT_GQUOTA) {
- error = xfs_qm_scall_trunc_qfile(mp, mp->m_sb.sb_gquotino);
+ error = xfs_qm_scall_trunc_qfile(mp, XFS_DQTYPE_GROUP);
if (error)
return error;
}
if (flags & XFS_QMOPT_PQUOTA)
- error = xfs_qm_scall_trunc_qfile(mp, mp->m_sb.sb_pquotino);
+ error = xfs_qm_scall_trunc_qfile(mp, XFS_DQTYPE_PROJ);
return error;
}
diff --git a/fs/xfs/xfs_quotaops.c b/fs/xfs/xfs_quotaops.c
index 9c162e69976b..4c7f7ce4fd2f 100644
--- a/fs/xfs/xfs_quotaops.c
+++ b/fs/xfs/xfs_quotaops.c
@@ -16,24 +16,25 @@
#include "xfs_qm.h"
-static void
+static int
xfs_qm_fill_state(
struct qc_type_state *tstate,
struct xfs_mount *mp,
- struct xfs_inode *ip,
- xfs_ino_t ino,
- struct xfs_def_quota *defq)
+ xfs_dqtype_t type)
{
- bool tempqip = false;
+ struct xfs_inode *ip;
+ struct xfs_def_quota *defq;
+ int error;
- tstate->ino = ino;
- if (!ip && ino == NULLFSINO)
- return;
- if (!ip) {
- if (xfs_iget(mp, NULL, ino, 0, 0, &ip))
- return;
- tempqip = true;
+ error = xfs_qm_qino_load(mp, type, &ip);
+ if (error) {
+ tstate->ino = NULLFSINO;
+ return error != -ENOENT ? error : 0;
}
+
+ defq = xfs_get_defquota(mp->m_quotainfo, type);
+
+ tstate->ino = ip->i_ino;
tstate->flags |= QCI_SYSFILE;
tstate->blocks = ip->i_nblocks;
tstate->nextents = ip->i_df.if_nextents;
@@ -43,8 +44,9 @@ xfs_qm_fill_state(
tstate->spc_warnlimit = 0;
tstate->ino_warnlimit = 0;
tstate->rt_spc_warnlimit = 0;
- if (tempqip)
- xfs_irele(ip);
+ xfs_irele(ip);
+
+ return 0;
}
/*
@@ -56,8 +58,9 @@ xfs_fs_get_quota_state(
struct super_block *sb,
struct qc_state *state)
{
- struct xfs_mount *mp = XFS_M(sb);
- struct xfs_quotainfo *q = mp->m_quotainfo;
+ struct xfs_mount *mp = XFS_M(sb);
+ struct xfs_quotainfo *q = mp->m_quotainfo;
+ int error;
memset(state, 0, sizeof(*state));
if (!XFS_IS_QUOTA_ON(mp))
@@ -76,12 +79,18 @@ xfs_fs_get_quota_state(
if (XFS_IS_PQUOTA_ENFORCED(mp))
state->s_state[PRJQUOTA].flags |= QCI_LIMITS_ENFORCED;
- xfs_qm_fill_state(&state->s_state[USRQUOTA], mp, q->qi_uquotaip,
- mp->m_sb.sb_uquotino, &q->qi_usr_default);
- xfs_qm_fill_state(&state->s_state[GRPQUOTA], mp, q->qi_gquotaip,
- mp->m_sb.sb_gquotino, &q->qi_grp_default);
- xfs_qm_fill_state(&state->s_state[PRJQUOTA], mp, q->qi_pquotaip,
- mp->m_sb.sb_pquotino, &q->qi_prj_default);
+ error = xfs_qm_fill_state(&state->s_state[USRQUOTA], mp,
+ XFS_DQTYPE_USER);
+ if (error)
+ return error;
+ error = xfs_qm_fill_state(&state->s_state[GRPQUOTA], mp,
+ XFS_DQTYPE_GROUP);
+ if (error)
+ return error;
+ error = xfs_qm_fill_state(&state->s_state[PRJQUOTA], mp,
+ XFS_DQTYPE_PROJ);
+ if (error)
+ return error;
return 0;
}
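The error convention that xfs_qm_qino_load introduces in the patch above
(-ENOENT for an unset quota inode, 0 plus an inode on success, some other
negative errno on failure) can be sketched outside the kernel as follows.
This is only an illustration under stated assumptions: demo_load,
demo_truncate, demo_table and DEMO_NULLINO are hypothetical stand-ins, not
XFS functions, and the lookup table replaces the real inode cache.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "no inode" sentinel, standing in for NULLFSINO. */
#define DEMO_NULLINO	((uint64_t)-1)

struct demo_inode {
	uint64_t	ino;
};

/* Hypothetical inode cache with two loadable inodes. */
static struct demo_inode demo_table[] = { { 128 }, { 129 } };

/*
 * Returns -ENOENT if ino is the null sentinel; 0 and an inode on success;
 * or another negative errno (-EIO here) if the lookup fails.
 */
static int
demo_load(uint64_t ino, struct demo_inode **ipp)
{
	size_t	i;

	assert(ipp != NULL);
	if (ino == DEMO_NULLINO)
		return -ENOENT;

	for (i = 0; i < sizeof(demo_table) / sizeof(demo_table[0]); i++) {
		if (demo_table[i].ino == ino) {
			*ipp = &demo_table[i];
			return 0;
		}
	}
	return -EIO;
}

/* A caller for which "no such quota file" is not an error. */
static int
demo_truncate(uint64_t ino)
{
	struct demo_inode	*ip;
	int			error;

	error = demo_load(ino, &ip);
	if (error == -ENOENT)
		return 0;
	if (error)
		return error;

	/* Real code would operate on ip here and then release it. */
	(void)ip;
	return 0;
}
```

The point of the distinct -ENOENT return is that each caller decides
whether a missing quota inode is benign, instead of every caller open-coding
the NULLFSINO comparison before the lookup.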
* [PATCH 1/3] xfs: fix C++ compilation errors in xfs_fs.h
2024-09-02 18:22 ` [PATCHSET 7/8] xfs: various bug fixes for 6.12 Darrick J. Wong
@ 2024-09-02 18:32 ` Darrick J. Wong
2024-09-02 18:33 ` [PATCH 2/3] xfs: fix FITRIM reporting again Darrick J. Wong
2024-09-02 18:33 ` [PATCH 3/3] xfs: fix a sloppy memory handling bug in xfs_iroot_realloc Darrick J. Wong
2 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:32 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: kernel, sam, Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Several people reported C++ compilation errors in xfs_getparents_next_rec
due to two things that C compilers allow but C++ compilers do not:
arithmetic on a void pointer, and implicit conversion of a void pointer
to a struct pointer on return. Fix both of these problems, and hope there
aren't more of these brown paper bags in 2 months when we finally get
these fixes through the process into a released xfsprogs.
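As a rough illustration of the portable form, here is a minimal sketch of a
record-walking helper written so that it is accepted by both C and C++
compilers. The names demo_rec and demo_next_rec are hypothetical and the
record layout is invented for the example; it is not the real
xfs_getparents_rec.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record; reclen is the distance in bytes to the next record. */
struct demo_rec {
	uint32_t	reclen;
	uint32_t	value;
};

static inline struct demo_rec *
demo_next_rec(void *buf, size_t bufsize, struct demo_rec *rec)
{
	/* Arithmetic on char *, not void *: void * arithmetic is a GNU C extension. */
	char	*next = (char *)rec + rec->reclen;
	char	*end = (char *)buf + bufsize;

	/* Assumption for this sketch: records are never shorter than the header. */
	assert(rec->reclen >= sizeof(*rec));

	if (next >= end)
		return NULL;

	/* Explicit cast: C++ has no implicit conversion from void * or char *. */
	return (struct demo_rec *)next;
}
```

Both changes are no-ops for a C compiler, which is why a header can adopt
them without affecting existing C users.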
NOTE: I am submitting this bugfix over the objections of a former
maintainer, who insists that we should remove this function from the
published userspace ABI instead of fixing the C++ compilation errors.
No deprecation period, no discussion, just a hard drop of an already
provided and correct C function, which would be in contravention of
Linus' rules. IOWs, removing ABIs that have already shipped in a
released kernel requires a careful deprecation period, so I will let
that maintainer run that process.
Reported-by: kernel@mattwhitlock.name
Reported-by: sam@gentoo.org
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219203
Fixes: 233f4e12bbb2c ("xfs: add parent pointer ioctls")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/libxfs/xfs_fs.h | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_fs.h b/fs/xfs/libxfs/xfs_fs.h
index c85c8077fac3..860284064c5a 100644
--- a/fs/xfs/libxfs/xfs_fs.h
+++ b/fs/xfs/libxfs/xfs_fs.h
@@ -8,6 +8,7 @@
/*
* SGI's XFS filesystem's major stuff (constants, structures)
+ * NOTE: This file must be compile-able with C++ compilers.
*/
/*
@@ -930,13 +931,13 @@ static inline struct xfs_getparents_rec *
xfs_getparents_next_rec(struct xfs_getparents *gp,
struct xfs_getparents_rec *gpr)
{
- void *next = ((void *)gpr + gpr->gpr_reclen);
+ void *next = ((char *)gpr + gpr->gpr_reclen);
void *end = (void *)(uintptr_t)(gp->gp_buffer + gp->gp_bufsize);
if (next >= end)
return NULL;
- return next;
+ return (struct xfs_getparents_rec *)next;
}
/* Iterate through this file handle's directory parent pointers. */
* [PATCH 2/3] xfs: fix FITRIM reporting again
2024-09-02 18:22 ` [PATCHSET 7/8] xfs: various bug fixes for 6.12 Darrick J. Wong
2024-09-02 18:32 ` [PATCH 1/3] xfs: fix C++ compilation errors in xfs_fs.h Darrick J. Wong
@ 2024-09-02 18:33 ` Darrick J. Wong
2024-09-02 18:33 ` [PATCH 3/3] xfs: fix a sloppy memory handling bug in xfs_iroot_realloc Darrick J. Wong
2 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:33 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Don't report FITRIMming more bytes than could possibly exist in the
filesystem past the start of the requested range.
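The arithmetic behind the fix can be sketched as below. This is a
standalone illustration, not the kernel code: demo_clamp_trim_len is a
hypothetical helper, and it assumes the caller has already rejected a
start offset beyond the end of the filesystem.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Clamp the reported trim length to the number of bytes that lie between
 * the start of the requested range and the end of the filesystem.
 */
static uint64_t
demo_clamp_trim_len(uint64_t start, uint64_t len, uint64_t fs_bytes)
{
	uint64_t	avail;

	/* Assumption: out-of-range start offsets were rejected earlier. */
	assert(start <= fs_bytes);

	avail = fs_bytes - start;
	return len < avail ? len : avail;
}
```

Clamping to fs_bytes alone would let a request that starts partway into
the filesystem report more bytes than remain after its start offset.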
Fixes: 410e8a18f8e93 ("xfs: don't bother reporting blocks trimmed via FITRIM")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/xfs_discard.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/xfs/xfs_discard.c b/fs/xfs/xfs_discard.c
index bf1e3f330018..d8c4a5dcca7a 100644
--- a/fs/xfs/xfs_discard.c
+++ b/fs/xfs/xfs_discard.c
@@ -707,7 +707,7 @@ xfs_ioc_trim(
return last_error;
range.len = min_t(unsigned long long, range.len,
- XFS_FSB_TO_B(mp, max_blocks));
+ XFS_FSB_TO_B(mp, max_blocks) - range.start);
if (copy_to_user(urange, &range, sizeof(range)))
return -EFAULT;
return 0;
* [PATCH 3/3] xfs: fix a sloppy memory handling bug in xfs_iroot_realloc
2024-09-02 18:22 ` [PATCHSET 7/8] xfs: various bug fixes for 6.12 Darrick J. Wong
2024-09-02 18:32 ` [PATCH 1/3] xfs: fix C++ compilation errors in xfs_fs.h Darrick J. Wong
2024-09-02 18:33 ` [PATCH 2/3] xfs: fix FITRIM reporting again Darrick J. Wong
@ 2024-09-02 18:33 ` Darrick J. Wong
2 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:33 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
While refactoring code, I noticed that when xfs_iroot_realloc tries to
shrink a bmbt root block, it allocates a smaller new block and then
copies "records" and pointers to the new block. However, bmbt root
blocks cannot ever be leaves, which means that it's not technically
correct to copy records. We /should/ be copying keys.
Note that this has never resulted in actual memory corruption because
sizeof(bmbt_rec) == (sizeof(bmbt_key) + sizeof(bmbt_ptr)). However,
this will no longer be true when we start adding realtime rmap stuff,
so fix this now.
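To make the size coincidence concrete, here is a small sketch using
stand-in types. The demo_* names are hypothetical; the layouts (a 16-byte
record, an 8-byte key, an 8-byte pointer) are assumptions chosen to mirror
the equality stated above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_rec { uint64_t l0, l1; };		/* leaf record: 16 bytes */
struct demo_key { uint64_t startoff; };		/* node key: 8 bytes */
typedef uint64_t demo_ptr_t;			/* node pointer: 8 bytes */

/* The coincidence that kept the old copy length inside the new block. */
static_assert(sizeof(struct demo_rec) ==
	      sizeof(struct demo_key) + sizeof(demo_ptr_t),
	      "record size happens to equal key size plus pointer size");

/* Payload bytes in a node block sized for nrecs key/pointer pairs. */
static size_t
demo_node_payload(unsigned int nrecs)
{
	return nrecs * (sizeof(struct demo_key) + sizeof(demo_ptr_t));
}

/* Bytes copied when the length is computed from the record size. */
static size_t
demo_old_copy_len(unsigned int nrecs)
{
	return nrecs * sizeof(struct demo_rec);
}

/* Bytes copied when the length is computed from the key size. */
static size_t
demo_new_copy_len(unsigned int nrecs)
{
	return nrecs * sizeof(struct demo_key);
}

/* Nonzero if a copy of len bytes from the first key slot stays in bounds. */
static int
demo_copy_fits(size_t len, unsigned int nrecs)
{
	return len <= demo_node_payload(nrecs);
}
```

With these sizes the record-sized copy fills the payload exactly, and the
excess half is overwritten by the later pointer copy. If a btree type ever
had records larger than key plus pointer, the same copy would overrun the
new block.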
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/libxfs/xfs_inode_fork.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c
index 9d11ae015909..622382300904 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.c
+++ b/fs/xfs/libxfs/xfs_inode_fork.c
@@ -463,15 +463,15 @@ xfs_iroot_realloc(
}
/*
- * Only copy the records and pointers if there are any.
+ * Only copy the keys and pointers if there are any.
*/
if (new_max > 0) {
/*
- * First copy the records.
+ * First copy the keys.
*/
- op = (char *)XFS_BMBT_REC_ADDR(mp, ifp->if_broot, 1);
- np = (char *)XFS_BMBT_REC_ADDR(mp, new_broot, 1);
- memcpy(np, op, new_max * (uint)sizeof(xfs_bmbt_rec_t));
+ op = (char *)XFS_BMBT_KEY_ADDR(mp, ifp->if_broot, 1);
+ np = (char *)XFS_BMBT_KEY_ADDR(mp, new_broot, 1);
+ memcpy(np, op, new_max * (uint)sizeof(xfs_bmbt_key_t));
/*
* Then copy the pointers.
* [PATCH 1/2] xfs: replace shouty XFS_BM{BT,DR} macros
2024-09-02 18:22 ` [PATCHSET v4.2 8/8] xfs: cleanups for inode rooted btree code Darrick J. Wong
@ 2024-09-02 18:33 ` Darrick J. Wong
2024-09-02 18:33 ` [PATCH 2/2] xfs: standardize the btree maxrecs function parameters Darrick J. Wong
1 sibling, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:33 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Replace all the shouty bmap btree and bmap disk root macros with actual
functions.
sed \
-e 's/XFS_BMBT_BLOCK_LEN/xfs_bmbt_block_len/g' \
-e 's/XFS_BMBT_REC_ADDR/xfs_bmbt_rec_addr/g' \
-e 's/XFS_BMBT_KEY_ADDR/xfs_bmbt_key_addr/g' \
-e 's/XFS_BMBT_PTR_ADDR/xfs_bmbt_ptr_addr/g' \
-e 's/XFS_BMDR_REC_ADDR/xfs_bmdr_rec_addr/g' \
-e 's/XFS_BMDR_KEY_ADDR/xfs_bmdr_key_addr/g' \
-e 's/XFS_BMDR_PTR_ADDR/xfs_bmdr_ptr_addr/g' \
-e 's/XFS_BMAP_BROOT_PTR_ADDR/xfs_bmap_broot_ptr_addr/g' \
-e 's/XFS_BMAP_BROOT_SPACE_CALC/xfs_bmap_broot_space_calc/g' \
-e 's/XFS_BMAP_BROOT_SPACE/xfs_bmap_broot_space/g' \
-e 's/XFS_BMDR_SPACE_CALC/xfs_bmdr_space_calc/g' \
-e 's/XFS_BMAP_BMDR_SPACE/xfs_bmap_bmdr_space/g' \
-i $(git ls-files fs/xfs/*.[ch] fs/xfs/libxfs/*.[ch] fs/xfs/scrub/*.[ch])
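The shape of the conversion can be sketched with a toy block layout. The
names below (demo_block, demo_key, DEMO_KEY_ADDR, demo_key_addr) are
hypothetical and only illustrate the macro-to-inline pattern; they are not
the XFS definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 8-byte block header followed by an array of keys. */
struct demo_block {
	uint32_t	magic;
	uint16_t	level;
	uint16_t	numrecs;
};

struct demo_key {
	uint64_t	startoff;
};

/* Old style: untyped macro, every argument needs defensive parentheses. */
#define DEMO_KEY_ADDR(block, index) \
	((struct demo_key *) \
		((char *)(block) + sizeof(struct demo_block) + \
		 ((index) - 1) * sizeof(struct demo_key)))

/* New style: the compiler checks argument types and the return type. */
static inline struct demo_key *
demo_key_addr(struct demo_block *block, unsigned int index)
{
	/* Indexes are 1-based, as in the macro. */
	assert(index >= 1);

	return (struct demo_key *)
		((char *)block + sizeof(struct demo_block) +
		 (index - 1) * sizeof(struct demo_key));
}
```

One thing to watch when doing this kind of conversion by hand is that each
sizeof still names the object type rather than a pointer to it, since the
two can differ in size on some architectures.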
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/libxfs/xfs_attr_leaf.c | 8 +-
fs/xfs/libxfs/xfs_bmap.c | 40 ++++----
fs/xfs/libxfs/xfs_bmap_btree.c | 18 ++--
fs/xfs/libxfs/xfs_bmap_btree.h | 204 +++++++++++++++++++++++++++-------------
fs/xfs/libxfs/xfs_inode_fork.c | 30 +++---
fs/xfs/libxfs/xfs_trans_resv.c | 2
fs/xfs/scrub/bmap_repair.c | 2
fs/xfs/scrub/inode_repair.c | 12 +-
fs/xfs/xfs_bmap_util.c | 4 -
9 files changed, 198 insertions(+), 122 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c
index b9e98950eb3d..6aaec1246c95 100644
--- a/fs/xfs/libxfs/xfs_attr_leaf.c
+++ b/fs/xfs/libxfs/xfs_attr_leaf.c
@@ -686,7 +686,7 @@ xfs_attr_shortform_bytesfit(
*/
if (!dp->i_forkoff && dp->i_df.if_bytes >
xfs_default_attroffset(dp))
- dsize = XFS_BMDR_SPACE_CALC(MINDBTPTRS);
+ dsize = xfs_bmdr_space_calc(MINDBTPTRS);
break;
case XFS_DINODE_FMT_BTREE:
/*
@@ -700,7 +700,7 @@ xfs_attr_shortform_bytesfit(
return 0;
return dp->i_forkoff;
}
- dsize = XFS_BMAP_BROOT_SPACE(mp, dp->i_df.if_broot);
+ dsize = xfs_bmap_bmdr_space(dp->i_df.if_broot);
break;
}
@@ -708,11 +708,11 @@ xfs_attr_shortform_bytesfit(
* A data fork btree root must have space for at least
* MINDBTPTRS key/ptr pairs if the data fork is small or empty.
*/
- minforkoff = max_t(int64_t, dsize, XFS_BMDR_SPACE_CALC(MINDBTPTRS));
+ minforkoff = max_t(int64_t, dsize, xfs_bmdr_space_calc(MINDBTPTRS));
minforkoff = roundup(minforkoff, 8) >> 3;
/* attr fork btree root can have at least this many key/ptr pairs */
- maxforkoff = XFS_LITINO(mp) - XFS_BMDR_SPACE_CALC(MINABTPTRS);
+ maxforkoff = XFS_LITINO(mp) - xfs_bmdr_space_calc(MINABTPTRS);
maxforkoff = maxforkoff >> 3; /* rounded down */
if (offset >= maxforkoff)
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 434433ed29dc..00cac756c956 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -79,9 +79,9 @@ xfs_bmap_compute_maxlevels(
maxleafents = xfs_iext_max_nextents(xfs_has_large_extent_counts(mp),
whichfork);
if (whichfork == XFS_DATA_FORK)
- sz = XFS_BMDR_SPACE_CALC(MINDBTPTRS);
+ sz = xfs_bmdr_space_calc(MINDBTPTRS);
else
- sz = XFS_BMDR_SPACE_CALC(MINABTPTRS);
+ sz = xfs_bmdr_space_calc(MINABTPTRS);
maxrootrecs = xfs_bmdr_maxrecs(sz, 0);
minleafrecs = mp->m_bmap_dmnr[0];
@@ -102,8 +102,8 @@ xfs_bmap_compute_attr_offset(
struct xfs_mount *mp)
{
if (mp->m_sb.sb_inodesize == 256)
- return XFS_LITINO(mp) - XFS_BMDR_SPACE_CALC(MINABTPTRS);
- return XFS_BMDR_SPACE_CALC(6 * MINABTPTRS);
+ return XFS_LITINO(mp) - xfs_bmdr_space_calc(MINABTPTRS);
+ return xfs_bmdr_space_calc(6 * MINABTPTRS);
}
STATIC int /* error */
@@ -298,7 +298,7 @@ xfs_check_block(
prevp = NULL;
for( i = 1; i <= xfs_btree_get_numrecs(block); i++) {
dmxr = mp->m_bmap_dmxr[0];
- keyp = XFS_BMBT_KEY_ADDR(mp, block, i);
+ keyp = xfs_bmbt_key_addr(mp, block, i);
if (prevp) {
ASSERT(be64_to_cpu(prevp->br_startoff) <
@@ -310,15 +310,15 @@ xfs_check_block(
* Compare the block numbers to see if there are dups.
*/
if (root)
- pp = XFS_BMAP_BROOT_PTR_ADDR(mp, block, i, sz);
+ pp = xfs_bmap_broot_ptr_addr(mp, block, i, sz);
else
- pp = XFS_BMBT_PTR_ADDR(mp, block, i, dmxr);
+ pp = xfs_bmbt_ptr_addr(mp, block, i, dmxr);
for (j = i+1; j <= be16_to_cpu(block->bb_numrecs); j++) {
if (root)
- thispa = XFS_BMAP_BROOT_PTR_ADDR(mp, block, j, sz);
+ thispa = xfs_bmap_broot_ptr_addr(mp, block, j, sz);
else
- thispa = XFS_BMBT_PTR_ADDR(mp, block, j, dmxr);
+ thispa = xfs_bmbt_ptr_addr(mp, block, j, dmxr);
if (*thispa == *pp) {
xfs_warn(mp, "%s: thispa(%d) == pp(%d) %lld",
__func__, j, i,
@@ -373,7 +373,7 @@ xfs_bmap_check_leaf_extents(
level = be16_to_cpu(block->bb_level);
ASSERT(level > 0);
xfs_check_block(block, mp, 1, ifp->if_broot_bytes);
- pp = XFS_BMAP_BROOT_PTR_ADDR(mp, block, 1, ifp->if_broot_bytes);
+ pp = xfs_bmap_broot_ptr_addr(mp, block, 1, ifp->if_broot_bytes);
bno = be64_to_cpu(*pp);
ASSERT(bno != NULLFSBLOCK);
@@ -406,7 +406,7 @@ xfs_bmap_check_leaf_extents(
*/
xfs_check_block(block, mp, 0, 0);
- pp = XFS_BMBT_PTR_ADDR(mp, block, 1, mp->m_bmap_dmxr[1]);
+ pp = xfs_bmbt_ptr_addr(mp, block, 1, mp->m_bmap_dmxr[1]);
bno = be64_to_cpu(*pp);
if (XFS_IS_CORRUPT(mp, !xfs_verify_fsbno(mp, bno))) {
xfs_btree_mark_sick(cur);
@@ -446,14 +446,14 @@ xfs_bmap_check_leaf_extents(
* conform with the first entry in this one.
*/
- ep = XFS_BMBT_REC_ADDR(mp, block, 1);
+ ep = xfs_bmbt_rec_addr(mp, block, 1);
if (i) {
ASSERT(xfs_bmbt_disk_get_startoff(&last) +
xfs_bmbt_disk_get_blockcount(&last) <=
xfs_bmbt_disk_get_startoff(ep));
}
for (j = 1; j < num_recs; j++) {
- nextp = XFS_BMBT_REC_ADDR(mp, block, j + 1);
+ nextp = xfs_bmbt_rec_addr(mp, block, j + 1);
ASSERT(xfs_bmbt_disk_get_startoff(ep) +
xfs_bmbt_disk_get_blockcount(ep) <=
xfs_bmbt_disk_get_startoff(nextp));
@@ -586,7 +586,7 @@ xfs_bmap_btree_to_extents(
ASSERT(be16_to_cpu(rblock->bb_numrecs) == 1);
ASSERT(xfs_bmbt_maxrecs(mp, ifp->if_broot_bytes, 0) == 1);
- pp = XFS_BMAP_BROOT_PTR_ADDR(mp, rblock, 1, ifp->if_broot_bytes);
+ pp = xfs_bmap_broot_ptr_addr(mp, rblock, 1, ifp->if_broot_bytes);
cbno = be64_to_cpu(*pp);
#ifdef DEBUG
if (XFS_IS_CORRUPT(cur->bc_mp, !xfs_verify_fsbno(mp, cbno))) {
@@ -714,7 +714,7 @@ xfs_bmap_extents_to_btree(
for_each_xfs_iext(ifp, &icur, &rec) {
if (isnullstartblock(rec.br_startblock))
continue;
- arp = XFS_BMBT_REC_ADDR(mp, ablock, 1 + cnt);
+ arp = xfs_bmbt_rec_addr(mp, ablock, 1 + cnt);
xfs_bmbt_disk_set_all(arp, &rec);
cnt++;
}
@@ -724,10 +724,10 @@ xfs_bmap_extents_to_btree(
/*
* Fill in the root key and pointer.
*/
- kp = XFS_BMBT_KEY_ADDR(mp, block, 1);
- arp = XFS_BMBT_REC_ADDR(mp, ablock, 1);
+ kp = xfs_bmbt_key_addr(mp, block, 1);
+ arp = xfs_bmbt_rec_addr(mp, ablock, 1);
kp->br_startoff = cpu_to_be64(xfs_bmbt_disk_get_startoff(arp));
- pp = XFS_BMBT_PTR_ADDR(mp, block, 1, xfs_bmbt_get_maxrecs(cur,
+ pp = xfs_bmbt_ptr_addr(mp, block, 1, xfs_bmbt_get_maxrecs(cur,
be16_to_cpu(block->bb_level)));
*pp = cpu_to_be64(args.fsbno);
@@ -896,7 +896,7 @@ xfs_bmap_add_attrfork_btree(
mp = ip->i_mount;
- if (XFS_BMAP_BMDR_SPACE(block) <= xfs_inode_data_fork_size(ip))
+ if (xfs_bmap_bmdr_space(block) <= xfs_inode_data_fork_size(ip))
*flags |= XFS_ILOG_DBROOT;
else {
cur = xfs_bmbt_init_cursor(mp, tp, ip, XFS_DATA_FORK);
@@ -1160,7 +1160,7 @@ xfs_iread_bmbt_block(
}
/* Copy records into the incore cache. */
- frp = XFS_BMBT_REC_ADDR(mp, block, 1);
+ frp = xfs_bmbt_rec_addr(mp, block, 1);
for (j = 0; j < num_recs; j++, frp++, ir->loaded++) {
struct xfs_bmbt_irec new;
xfs_failaddr_t fa;
diff --git a/fs/xfs/libxfs/xfs_bmap_btree.c b/fs/xfs/libxfs/xfs_bmap_btree.c
index d1b06ccde19e..3695b3ad07d4 100644
--- a/fs/xfs/libxfs/xfs_bmap_btree.c
+++ b/fs/xfs/libxfs/xfs_bmap_btree.c
@@ -65,10 +65,10 @@ xfs_bmdr_to_bmbt(
ASSERT(be16_to_cpu(rblock->bb_level) > 0);
rblock->bb_numrecs = dblock->bb_numrecs;
dmxr = xfs_bmdr_maxrecs(dblocklen, 0);
- fkp = XFS_BMDR_KEY_ADDR(dblock, 1);
- tkp = XFS_BMBT_KEY_ADDR(mp, rblock, 1);
- fpp = XFS_BMDR_PTR_ADDR(dblock, 1, dmxr);
- tpp = XFS_BMAP_BROOT_PTR_ADDR(mp, rblock, 1, rblocklen);
+ fkp = xfs_bmdr_key_addr(dblock, 1);
+ tkp = xfs_bmbt_key_addr(mp, rblock, 1);
+ fpp = xfs_bmdr_ptr_addr(dblock, 1, dmxr);
+ tpp = xfs_bmap_broot_ptr_addr(mp, rblock, 1, rblocklen);
dmxr = be16_to_cpu(dblock->bb_numrecs);
memcpy(tkp, fkp, sizeof(*fkp) * dmxr);
memcpy(tpp, fpp, sizeof(*fpp) * dmxr);
@@ -168,10 +168,10 @@ xfs_bmbt_to_bmdr(
dblock->bb_level = rblock->bb_level;
dblock->bb_numrecs = rblock->bb_numrecs;
dmxr = xfs_bmdr_maxrecs(dblocklen, 0);
- fkp = XFS_BMBT_KEY_ADDR(mp, rblock, 1);
- tkp = XFS_BMDR_KEY_ADDR(dblock, 1);
- fpp = XFS_BMAP_BROOT_PTR_ADDR(mp, rblock, 1, rblocklen);
- tpp = XFS_BMDR_PTR_ADDR(dblock, 1, dmxr);
+ fkp = xfs_bmbt_key_addr(mp, rblock, 1);
+ tkp = xfs_bmdr_key_addr(dblock, 1);
+ fpp = xfs_bmap_broot_ptr_addr(mp, rblock, 1, rblocklen);
+ tpp = xfs_bmdr_ptr_addr(dblock, 1, dmxr);
dmxr = be16_to_cpu(dblock->bb_numrecs);
memcpy(tkp, fkp, sizeof(*fkp) * dmxr);
memcpy(tpp, fpp, sizeof(*fpp) * dmxr);
@@ -651,7 +651,7 @@ xfs_bmbt_maxrecs(
int blocklen,
int leaf)
{
- blocklen -= XFS_BMBT_BLOCK_LEN(mp);
+ blocklen -= xfs_bmbt_block_len(mp);
return xfs_bmbt_block_maxrecs(blocklen, leaf);
}
diff --git a/fs/xfs/libxfs/xfs_bmap_btree.h b/fs/xfs/libxfs/xfs_bmap_btree.h
index de1b73f1225c..d006798d591b 100644
--- a/fs/xfs/libxfs/xfs_bmap_btree.h
+++ b/fs/xfs/libxfs/xfs_bmap_btree.h
@@ -13,70 +13,6 @@ struct xfs_inode;
struct xfs_trans;
struct xbtree_ifakeroot;
-/*
- * Btree block header size depends on a superblock flag.
- */
-#define XFS_BMBT_BLOCK_LEN(mp) \
- (xfs_has_crc(((mp))) ? \
- XFS_BTREE_LBLOCK_CRC_LEN : XFS_BTREE_LBLOCK_LEN)
-
-#define XFS_BMBT_REC_ADDR(mp, block, index) \
- ((xfs_bmbt_rec_t *) \
- ((char *)(block) + \
- XFS_BMBT_BLOCK_LEN(mp) + \
- ((index) - 1) * sizeof(xfs_bmbt_rec_t)))
-
-#define XFS_BMBT_KEY_ADDR(mp, block, index) \
- ((xfs_bmbt_key_t *) \
- ((char *)(block) + \
- XFS_BMBT_BLOCK_LEN(mp) + \
- ((index) - 1) * sizeof(xfs_bmbt_key_t)))
-
-#define XFS_BMBT_PTR_ADDR(mp, block, index, maxrecs) \
- ((xfs_bmbt_ptr_t *) \
- ((char *)(block) + \
- XFS_BMBT_BLOCK_LEN(mp) + \
- (maxrecs) * sizeof(xfs_bmbt_key_t) + \
- ((index) - 1) * sizeof(xfs_bmbt_ptr_t)))
-
-#define XFS_BMDR_REC_ADDR(block, index) \
- ((xfs_bmdr_rec_t *) \
- ((char *)(block) + \
- sizeof(struct xfs_bmdr_block) + \
- ((index) - 1) * sizeof(xfs_bmdr_rec_t)))
-
-#define XFS_BMDR_KEY_ADDR(block, index) \
- ((xfs_bmdr_key_t *) \
- ((char *)(block) + \
- sizeof(struct xfs_bmdr_block) + \
- ((index) - 1) * sizeof(xfs_bmdr_key_t)))
-
-#define XFS_BMDR_PTR_ADDR(block, index, maxrecs) \
- ((xfs_bmdr_ptr_t *) \
- ((char *)(block) + \
- sizeof(struct xfs_bmdr_block) + \
- (maxrecs) * sizeof(xfs_bmdr_key_t) + \
- ((index) - 1) * sizeof(xfs_bmdr_ptr_t)))
-
-/*
- * These are to be used when we know the size of the block and
- * we don't have a cursor.
- */
-#define XFS_BMAP_BROOT_PTR_ADDR(mp, bb, i, sz) \
- XFS_BMBT_PTR_ADDR(mp, bb, i, xfs_bmbt_maxrecs(mp, sz, 0))
-
-#define XFS_BMAP_BROOT_SPACE_CALC(mp, nrecs) \
- (int)(XFS_BMBT_BLOCK_LEN(mp) + \
- ((nrecs) * (sizeof(xfs_bmbt_key_t) + sizeof(xfs_bmbt_ptr_t))))
-
-#define XFS_BMAP_BROOT_SPACE(mp, bb) \
- (XFS_BMAP_BROOT_SPACE_CALC(mp, be16_to_cpu((bb)->bb_numrecs)))
-#define XFS_BMDR_SPACE_CALC(nrecs) \
- (int)(sizeof(xfs_bmdr_block_t) + \
- ((nrecs) * (sizeof(xfs_bmbt_key_t) + sizeof(xfs_bmbt_ptr_t))))
-#define XFS_BMAP_BMDR_SPACE(bb) \
- (XFS_BMDR_SPACE_CALC(be16_to_cpu((bb)->bb_numrecs)))
-
/*
* Maximum number of bmap btree levels.
*/
@@ -121,4 +57,144 @@ void xfs_bmbt_destroy_cur_cache(void);
void xfs_bmbt_init_block(struct xfs_inode *ip, struct xfs_btree_block *buf,
struct xfs_buf *bp, __u16 level, __u16 numrecs);
+/*
+ * Btree block header size depends on a superblock flag.
+ */
+static inline size_t
+xfs_bmbt_block_len(struct xfs_mount *mp)
+{
+ return xfs_has_crc(mp) ?
+ XFS_BTREE_LBLOCK_CRC_LEN : XFS_BTREE_LBLOCK_LEN;
+}
+
+/* Addresses of keys, pointers, and records within an incore bmbt block. */
+
+static inline struct xfs_bmbt_rec *
+xfs_bmbt_rec_addr(
+ struct xfs_mount *mp,
+ struct xfs_btree_block *block,
+ unsigned int index)
+{
+ return (struct xfs_bmbt_rec *)
+ ((char *)block + xfs_bmbt_block_len(mp) +
+ (index - 1) * sizeof(struct xfs_bmbt_rec));
+}
+
+static inline struct xfs_bmbt_key *
+xfs_bmbt_key_addr(
+ struct xfs_mount *mp,
+ struct xfs_btree_block *block,
+ unsigned int index)
+{
+ return (struct xfs_bmbt_key *)
+ ((char *)block + xfs_bmbt_block_len(mp) +
+ (index - 1) * sizeof(struct xfs_bmbt_key));
+}
+
+static inline xfs_bmbt_ptr_t *
+xfs_bmbt_ptr_addr(
+ struct xfs_mount *mp,
+ struct xfs_btree_block *block,
+ unsigned int index,
+ unsigned int maxrecs)
+{
+ return (xfs_bmbt_ptr_t *)
+ ((char *)block + xfs_bmbt_block_len(mp) +
+ maxrecs * sizeof(struct xfs_bmbt_key) +
+ (index - 1) * sizeof(xfs_bmbt_ptr_t));
+}
+
+/* Addresses of keys, pointers, and records within an ondisk bmbt block. */
+
+static inline struct xfs_bmbt_rec *
+xfs_bmdr_rec_addr(
+ struct xfs_bmdr_block *block,
+ unsigned int index)
+{
+ return (struct xfs_bmbt_rec *)
+ ((char *)(block + 1) +
+ (index - 1) * sizeof(struct xfs_bmbt_rec));
+}
+
+static inline struct xfs_bmbt_key *
+xfs_bmdr_key_addr(
+ struct xfs_bmdr_block *block,
+ unsigned int index)
+{
+ return (struct xfs_bmbt_key *)
+ ((char *)(block + 1) +
+ (index - 1) * sizeof(struct xfs_bmbt_key));
+}
+
+static inline xfs_bmbt_ptr_t *
+xfs_bmdr_ptr_addr(
+ struct xfs_bmdr_block *block,
+ unsigned int index,
+ unsigned int maxrecs)
+{
+ return (xfs_bmbt_ptr_t *)
+ ((char *)(block + 1) +
+ maxrecs * sizeof(struct xfs_bmbt_key) +
+ (index - 1) * sizeof(xfs_bmbt_ptr_t));
+}
+
+/*
+ * Address of pointers within the incore btree root.
+ *
+ * These are to be used when we know the size of the block and
+ * we don't have a cursor.
+ */
+static inline xfs_bmbt_ptr_t *
+xfs_bmap_broot_ptr_addr(
+ struct xfs_mount *mp,
+ struct xfs_btree_block *bb,
+ unsigned int i,
+ unsigned int sz)
+{
+ return xfs_bmbt_ptr_addr(mp, bb, i, xfs_bmbt_maxrecs(mp, sz, 0));
+}
+
+/*
+ * Compute the space required for the incore btree root containing the given
+ * number of records.
+ */
+static inline size_t
+xfs_bmap_broot_space_calc(
+ struct xfs_mount *mp,
+ unsigned int nrecs)
+{
+ return xfs_bmbt_block_len(mp) +
+ (nrecs * (sizeof(struct xfs_bmbt_key) + sizeof(xfs_bmbt_ptr_t)));
+}
+
+/*
+ * Compute the space required for the incore btree root given the ondisk
+ * btree root block.
+ */
+static inline size_t
+xfs_bmap_broot_space(
+ struct xfs_mount *mp,
+ struct xfs_bmdr_block *bb)
+{
+ return xfs_bmap_broot_space_calc(mp, be16_to_cpu(bb->bb_numrecs));
+}
+
+/* Compute the space required for the ondisk root block. */
+static inline size_t
+xfs_bmdr_space_calc(unsigned int nrecs)
+{
+ return sizeof(struct xfs_bmdr_block) +
+ (nrecs * (sizeof(struct xfs_bmbt_key) + sizeof(xfs_bmbt_ptr_t)));
+}
+
+/*
+ * Compute the space required for the ondisk root block given an incore root
+ * block.
+ */
+static inline size_t
+xfs_bmap_bmdr_space(struct xfs_btree_block *bb)
+{
+ return xfs_bmdr_space_calc(be16_to_cpu(bb->bb_numrecs));
+}
+
#endif /* __XFS_BMAP_BTREE_H__ */
diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c
index 622382300904..973e027e3d88 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.c
+++ b/fs/xfs/libxfs/xfs_inode_fork.c
@@ -185,7 +185,7 @@ xfs_iformat_btree(
ifp = xfs_ifork_ptr(ip, whichfork);
dfp = (xfs_bmdr_block_t *)XFS_DFORK_PTR(dip, whichfork);
- size = XFS_BMAP_BROOT_SPACE(mp, dfp);
+ size = xfs_bmap_broot_space(mp, dfp);
nrecs = be16_to_cpu(dfp->bb_numrecs);
level = be16_to_cpu(dfp->bb_level);
@@ -198,7 +198,7 @@ xfs_iformat_btree(
*/
if (unlikely(ifp->if_nextents <= XFS_IFORK_MAXEXT(ip, whichfork) ||
nrecs == 0 ||
- XFS_BMDR_SPACE_CALC(nrecs) >
+ xfs_bmdr_space_calc(nrecs) >
XFS_DFORK_SIZE(dip, mp, whichfork) ||
ifp->if_nextents > ip->i_nblocks) ||
level == 0 || level > XFS_BM_MAXLEVELS(mp, whichfork)) {
@@ -409,7 +409,7 @@ xfs_iroot_realloc(
* allocate it now and get out.
*/
if (ifp->if_broot_bytes == 0) {
- new_size = XFS_BMAP_BROOT_SPACE_CALC(mp, rec_diff);
+ new_size = xfs_bmap_broot_space_calc(mp, rec_diff);
ifp->if_broot = kmalloc(new_size,
GFP_KERNEL | __GFP_NOFAIL);
ifp->if_broot_bytes = (int)new_size;
@@ -424,15 +424,15 @@ xfs_iroot_realloc(
*/
cur_max = xfs_bmbt_maxrecs(mp, ifp->if_broot_bytes, 0);
new_max = cur_max + rec_diff;
- new_size = XFS_BMAP_BROOT_SPACE_CALC(mp, new_max);
+ new_size = xfs_bmap_broot_space_calc(mp, new_max);
ifp->if_broot = krealloc(ifp->if_broot, new_size,
GFP_KERNEL | __GFP_NOFAIL);
- op = (char *)XFS_BMAP_BROOT_PTR_ADDR(mp, ifp->if_broot, 1,
+ op = (char *)xfs_bmap_broot_ptr_addr(mp, ifp->if_broot, 1,
ifp->if_broot_bytes);
- np = (char *)XFS_BMAP_BROOT_PTR_ADDR(mp, ifp->if_broot, 1,
+ np = (char *)xfs_bmap_broot_ptr_addr(mp, ifp->if_broot, 1,
(int)new_size);
ifp->if_broot_bytes = (int)new_size;
- ASSERT(XFS_BMAP_BMDR_SPACE(ifp->if_broot) <=
+ ASSERT(xfs_bmap_bmdr_space(ifp->if_broot) <=
xfs_inode_fork_size(ip, whichfork));
memmove(np, op, cur_max * (uint)sizeof(xfs_fsblock_t));
return;
@@ -448,7 +448,7 @@ xfs_iroot_realloc(
new_max = cur_max + rec_diff;
ASSERT(new_max >= 0);
if (new_max > 0)
- new_size = XFS_BMAP_BROOT_SPACE_CALC(mp, new_max);
+ new_size = xfs_bmap_broot_space_calc(mp, new_max);
else
new_size = 0;
if (new_size > 0) {
@@ -457,7 +457,7 @@ xfs_iroot_realloc(
* First copy over the btree block header.
*/
memcpy(new_broot, ifp->if_broot,
- XFS_BMBT_BLOCK_LEN(ip->i_mount));
+ xfs_bmbt_block_len(ip->i_mount));
} else {
new_broot = NULL;
}
@@ -469,16 +469,16 @@ xfs_iroot_realloc(
/*
* First copy the keys.
*/
- op = (char *)XFS_BMBT_KEY_ADDR(mp, ifp->if_broot, 1);
- np = (char *)XFS_BMBT_KEY_ADDR(mp, new_broot, 1);
+ op = (char *)xfs_bmbt_key_addr(mp, ifp->if_broot, 1);
+ np = (char *)xfs_bmbt_key_addr(mp, new_broot, 1);
memcpy(np, op, new_max * (uint)sizeof(xfs_bmbt_key_t));
/*
* Then copy the pointers.
*/
- op = (char *)XFS_BMAP_BROOT_PTR_ADDR(mp, ifp->if_broot, 1,
+ op = (char *)xfs_bmap_broot_ptr_addr(mp, ifp->if_broot, 1,
ifp->if_broot_bytes);
- np = (char *)XFS_BMAP_BROOT_PTR_ADDR(mp, new_broot, 1,
+ np = (char *)xfs_bmap_broot_ptr_addr(mp, new_broot, 1,
(int)new_size);
memcpy(np, op, new_max * (uint)sizeof(xfs_fsblock_t));
}
@@ -486,7 +486,7 @@ xfs_iroot_realloc(
ifp->if_broot = new_broot;
ifp->if_broot_bytes = (int)new_size;
if (ifp->if_broot)
- ASSERT(XFS_BMAP_BMDR_SPACE(ifp->if_broot) <=
+ ASSERT(xfs_bmap_bmdr_space(ifp->if_broot) <=
xfs_inode_fork_size(ip, whichfork));
return;
}
@@ -655,7 +655,7 @@ xfs_iflush_fork(
if ((iip->ili_fields & brootflag[whichfork]) &&
(ifp->if_broot_bytes > 0)) {
ASSERT(ifp->if_broot != NULL);
- ASSERT(XFS_BMAP_BMDR_SPACE(ifp->if_broot) <=
+ ASSERT(xfs_bmap_bmdr_space(ifp->if_broot) <=
xfs_inode_fork_size(ip, whichfork));
xfs_bmbt_to_bmdr(mp, ifp->if_broot, ifp->if_broot_bytes,
(xfs_bmdr_block_t *)cp,
diff --git a/fs/xfs/libxfs/xfs_trans_resv.c b/fs/xfs/libxfs/xfs_trans_resv.c
index 2e6d7bb3b5a2..1a7f95bcf069 100644
--- a/fs/xfs/libxfs/xfs_trans_resv.c
+++ b/fs/xfs/libxfs/xfs_trans_resv.c
@@ -130,7 +130,7 @@ xfs_calc_inode_res(
(4 * sizeof(struct xlog_op_header) +
sizeof(struct xfs_inode_log_format) +
mp->m_sb.sb_inodesize +
- 2 * XFS_BMBT_BLOCK_LEN(mp));
+ 2 * xfs_bmbt_block_len(mp));
}
/*
diff --git a/fs/xfs/scrub/bmap_repair.c b/fs/xfs/scrub/bmap_repair.c
index 1e656fab5e41..49dc38acc66b 100644
--- a/fs/xfs/scrub/bmap_repair.c
+++ b/fs/xfs/scrub/bmap_repair.c
@@ -480,7 +480,7 @@ xrep_bmap_iroot_size(
{
ASSERT(level > 0);
- return XFS_BMAP_BROOT_SPACE_CALC(cur->bc_mp, nr_this_level);
+ return xfs_bmap_broot_space_calc(cur->bc_mp, nr_this_level);
}
/* Update the inode counters. */
diff --git a/fs/xfs/scrub/inode_repair.c b/fs/xfs/scrub/inode_repair.c
index daf9f1ee7c2c..3e45b9b72312 100644
--- a/fs/xfs/scrub/inode_repair.c
+++ b/fs/xfs/scrub/inode_repair.c
@@ -846,7 +846,7 @@ xrep_dinode_bad_bmbt_fork(
nrecs = be16_to_cpu(dfp->bb_numrecs);
level = be16_to_cpu(dfp->bb_level);
- if (nrecs == 0 || XFS_BMDR_SPACE_CALC(nrecs) > dfork_size)
+ if (nrecs == 0 || xfs_bmdr_space_calc(nrecs) > dfork_size)
return true;
if (level == 0 || level >= XFS_BM_MAXLEVELS(sc->mp, whichfork))
return true;
@@ -858,12 +858,12 @@ xrep_dinode_bad_bmbt_fork(
xfs_fileoff_t fileoff;
xfs_fsblock_t fsbno;
- fkp = XFS_BMDR_KEY_ADDR(dfp, i);
+ fkp = xfs_bmdr_key_addr(dfp, i);
fileoff = be64_to_cpu(fkp->br_startoff);
if (!xfs_verify_fileoff(sc->mp, fileoff))
return true;
- fpp = XFS_BMDR_PTR_ADDR(dfp, i, dmxr);
+ fpp = xfs_bmdr_ptr_addr(dfp, i, dmxr);
fsbno = be64_to_cpu(*fpp);
if (!xfs_verify_fsbno(sc->mp, fsbno))
return true;
@@ -1121,7 +1121,7 @@ xrep_dinode_ensure_forkoff(
struct xfs_bmdr_block *bmdr;
struct xfs_scrub *sc = ri->sc;
xfs_extnum_t attr_extents, data_extents;
- size_t bmdr_minsz = XFS_BMDR_SPACE_CALC(1);
+ size_t bmdr_minsz = xfs_bmdr_space_calc(1);
unsigned int lit_sz = XFS_LITINO(sc->mp);
unsigned int afork_min, dfork_min;
@@ -1173,7 +1173,7 @@ xrep_dinode_ensure_forkoff(
case XFS_DINODE_FMT_BTREE:
/* Must have space for btree header and key/pointers. */
bmdr = XFS_DFORK_PTR(dip, XFS_ATTR_FORK);
- afork_min = XFS_BMAP_BROOT_SPACE(sc->mp, bmdr);
+ afork_min = xfs_bmap_broot_space(sc->mp, bmdr);
break;
default:
/* We should never see any other formats. */
@@ -1223,7 +1223,7 @@ xrep_dinode_ensure_forkoff(
case XFS_DINODE_FMT_BTREE:
/* Must have space for btree header and key/pointers. */
bmdr = XFS_DFORK_PTR(dip, XFS_DATA_FORK);
- dfork_min = XFS_BMAP_BROOT_SPACE(sc->mp, bmdr);
+ dfork_min = xfs_bmap_broot_space(sc->mp, bmdr);
break;
default:
dfork_min = 0;
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index fe2e2c930975..a2c8f0dd85d0 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -1184,7 +1184,7 @@ xfs_swap_extents_check_format(
*/
if (tifp->if_format == XFS_DINODE_FMT_BTREE) {
if (xfs_inode_has_attr_fork(ip) &&
- XFS_BMAP_BMDR_SPACE(tifp->if_broot) > xfs_inode_fork_boff(ip))
+ xfs_bmap_bmdr_space(tifp->if_broot) > xfs_inode_fork_boff(ip))
return -EINVAL;
if (tifp->if_nextents <= XFS_IFORK_MAXEXT(ip, XFS_DATA_FORK))
return -EINVAL;
@@ -1193,7 +1193,7 @@ xfs_swap_extents_check_format(
/* Reciprocal target->temp btree format checks */
if (ifp->if_format == XFS_DINODE_FMT_BTREE) {
if (xfs_inode_has_attr_fork(tip) &&
- XFS_BMAP_BMDR_SPACE(ip->i_df.if_broot) > xfs_inode_fork_boff(tip))
+ xfs_bmap_bmdr_space(ip->i_df.if_broot) > xfs_inode_fork_boff(tip))
return -EINVAL;
if (ifp->if_nextents <= XFS_IFORK_MAXEXT(tip, XFS_DATA_FORK))
return -EINVAL;
* [PATCH 2/2] xfs: standardize the btree maxrecs function parameters
2024-09-02 18:22 ` [PATCHSET v4.2 8/8] xfs: cleanups for inode rooted btree code Darrick J. Wong
2024-09-02 18:33 ` [PATCH 1/2] xfs: replace shouty XFS_BM{BT,DR} macros Darrick J. Wong
@ 2024-09-02 18:33 ` Darrick J. Wong
1 sibling, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-09-02 18:33 UTC (permalink / raw)
To: chandanbabu, djwong; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Standardize the parameters in xfs_{alloc,bm,ino,rmap,refcount}bt_maxrecs
so that we have consistent calling conventions. This doesn't affect the
kernel that much, but enables us to clean up userspace a bit.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
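For readers following along outside the kernel tree, the unified calling convention can be sketched in plain C. This is a minimal userspace model, not the kernel code: `struct model_mount`, `model_allocbt_maxrecs()`, and the header, record, key, and pointer sizes are all assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct xfs_mount; only what the model needs. */
struct model_mount {
	bool		has_crc;	/* selects the larger block header */
};

/* Illustrative sizes in bytes; assumptions, not taken from the patch. */
#define MODEL_HDR_LEN		16u
#define MODEL_HDR_CRC_LEN	56u
#define MODEL_REC_LEN		8u
#define MODEL_KEY_LEN		8u
#define MODEL_PTR_LEN		4u

static_assert(MODEL_HDR_LEN < MODEL_HDR_CRC_LEN, "CRC header must be larger");

/* Records (leaf) or key/pointer pairs (node) that fit in the payload. */
static unsigned int
model_block_maxrecs(unsigned int blocklen, bool leaf)
{
	if (leaf)
		return blocklen / MODEL_REC_LEN;
	return blocklen / (MODEL_KEY_LEN + MODEL_PTR_LEN);
}

/* Same parameter shape as the standardized prototypes: (mp, blocklen, leaf). */
unsigned int
model_allocbt_maxrecs(const struct model_mount *mp, unsigned int blocklen,
		      bool leaf)
{
	blocklen -= mp->has_crc ? MODEL_HDR_CRC_LEN : MODEL_HDR_LEN;
	return model_block_maxrecs(blocklen, leaf);
}
```

Because the arithmetic in this model is unsigned, a blocklen smaller than the header would wrap around rather than go negative, so the sketch assumes callers always pass at least a full header's worth of bytes.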
fs/xfs/libxfs/xfs_alloc_btree.c | 6 +++---
fs/xfs/libxfs/xfs_alloc_btree.h | 3 ++-
fs/xfs/libxfs/xfs_bmap.c | 2 +-
fs/xfs/libxfs/xfs_bmap_btree.c | 6 +++---
fs/xfs/libxfs/xfs_bmap_btree.h | 5 +++--
fs/xfs/libxfs/xfs_ialloc.c | 4 ++--
fs/xfs/libxfs/xfs_ialloc_btree.c | 6 +++---
fs/xfs/libxfs/xfs_ialloc_btree.h | 3 ++-
fs/xfs/libxfs/xfs_inode_fork.c | 4 ++--
fs/xfs/libxfs/xfs_refcount_btree.c | 5 +++--
fs/xfs/libxfs/xfs_refcount_btree.h | 3 ++-
fs/xfs/libxfs/xfs_rmap_btree.c | 7 ++++---
fs/xfs/libxfs/xfs_rmap_btree.h | 3 ++-
fs/xfs/libxfs/xfs_sb.c | 16 ++++++++--------
14 files changed, 40 insertions(+), 33 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_alloc_btree.c b/fs/xfs/libxfs/xfs_alloc_btree.c
index 585e98e87ef9..aada676eee51 100644
--- a/fs/xfs/libxfs/xfs_alloc_btree.c
+++ b/fs/xfs/libxfs/xfs_alloc_btree.c
@@ -569,11 +569,11 @@ xfs_allocbt_block_maxrecs(
/*
* Calculate number of records in an alloc btree block.
*/
-int
+unsigned int
xfs_allocbt_maxrecs(
struct xfs_mount *mp,
- int blocklen,
- int leaf)
+ unsigned int blocklen,
+ bool leaf)
{
blocklen -= XFS_ALLOC_BLOCK_LEN(mp);
return xfs_allocbt_block_maxrecs(blocklen, leaf);
diff --git a/fs/xfs/libxfs/xfs_alloc_btree.h b/fs/xfs/libxfs/xfs_alloc_btree.h
index 155b47f231ab..12647f9aaa6d 100644
--- a/fs/xfs/libxfs/xfs_alloc_btree.h
+++ b/fs/xfs/libxfs/xfs_alloc_btree.h
@@ -53,7 +53,8 @@ struct xfs_btree_cur *xfs_bnobt_init_cursor(struct xfs_mount *mp,
struct xfs_btree_cur *xfs_cntbt_init_cursor(struct xfs_mount *mp,
struct xfs_trans *tp, struct xfs_buf *bp,
struct xfs_perag *pag);
-extern int xfs_allocbt_maxrecs(struct xfs_mount *, int, int);
+unsigned int xfs_allocbt_maxrecs(struct xfs_mount *mp, unsigned int blocklen,
+ bool leaf);
extern xfs_extlen_t xfs_allocbt_calc_size(struct xfs_mount *mp,
unsigned long long len);
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 00cac756c956..28473b6a95cc 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -584,7 +584,7 @@ xfs_bmap_btree_to_extents(
ASSERT(ifp->if_format == XFS_DINODE_FMT_BTREE);
ASSERT(be16_to_cpu(rblock->bb_level) == 1);
ASSERT(be16_to_cpu(rblock->bb_numrecs) == 1);
- ASSERT(xfs_bmbt_maxrecs(mp, ifp->if_broot_bytes, 0) == 1);
+ ASSERT(xfs_bmbt_maxrecs(mp, ifp->if_broot_bytes, false) == 1);
pp = xfs_bmap_broot_ptr_addr(mp, rblock, 1, ifp->if_broot_bytes);
cbno = be64_to_cpu(*pp);
diff --git a/fs/xfs/libxfs/xfs_bmap_btree.c b/fs/xfs/libxfs/xfs_bmap_btree.c
index 3695b3ad07d4..3464be771f95 100644
--- a/fs/xfs/libxfs/xfs_bmap_btree.c
+++ b/fs/xfs/libxfs/xfs_bmap_btree.c
@@ -645,11 +645,11 @@ xfs_bmbt_commit_staged_btree(
/*
* Calculate number of records in a bmap btree block.
*/
-int
+unsigned int
xfs_bmbt_maxrecs(
struct xfs_mount *mp,
- int blocklen,
- int leaf)
+ unsigned int blocklen,
+ bool leaf)
{
blocklen -= xfs_bmbt_block_len(mp);
return xfs_bmbt_block_maxrecs(blocklen, leaf);
diff --git a/fs/xfs/libxfs/xfs_bmap_btree.h b/fs/xfs/libxfs/xfs_bmap_btree.h
index d006798d591b..49a3bae3f6ec 100644
--- a/fs/xfs/libxfs/xfs_bmap_btree.h
+++ b/fs/xfs/libxfs/xfs_bmap_btree.h
@@ -35,7 +35,8 @@ extern void xfs_bmbt_to_bmdr(struct xfs_mount *, struct xfs_btree_block *, int,
extern int xfs_bmbt_get_maxrecs(struct xfs_btree_cur *, int level);
extern int xfs_bmdr_maxrecs(int blocklen, int leaf);
-extern int xfs_bmbt_maxrecs(struct xfs_mount *, int blocklen, int leaf);
+unsigned int xfs_bmbt_maxrecs(struct xfs_mount *mp, unsigned int blocklen,
+ bool leaf);
extern int xfs_bmbt_change_owner(struct xfs_trans *tp, struct xfs_inode *ip,
int whichfork, xfs_ino_t new_owner,
@@ -151,7 +152,7 @@ xfs_bmap_broot_ptr_addr(
unsigned int i,
unsigned int sz)
{
- return xfs_bmbt_ptr_addr(mp, bb, i, xfs_bmbt_maxrecs(mp, sz, 0));
+ return xfs_bmbt_ptr_addr(mp, bb, i, xfs_bmbt_maxrecs(mp, sz, false));
}
/*
diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c
index fc70601e8d8e..20bb5ce38134 100644
--- a/fs/xfs/libxfs/xfs_ialloc.c
+++ b/fs/xfs/libxfs/xfs_ialloc.c
@@ -2948,8 +2948,8 @@ xfs_ialloc_setup_geometry(
/* Compute inode btree geometry. */
igeo->agino_log = sbp->sb_inopblog + sbp->sb_agblklog;
- igeo->inobt_mxr[0] = xfs_inobt_maxrecs(mp, sbp->sb_blocksize, 1);
- igeo->inobt_mxr[1] = xfs_inobt_maxrecs(mp, sbp->sb_blocksize, 0);
+ igeo->inobt_mxr[0] = xfs_inobt_maxrecs(mp, sbp->sb_blocksize, true);
+ igeo->inobt_mxr[1] = xfs_inobt_maxrecs(mp, sbp->sb_blocksize, false);
igeo->inobt_mnr[0] = igeo->inobt_mxr[0] / 2;
igeo->inobt_mnr[1] = igeo->inobt_mxr[1] / 2;
diff --git a/fs/xfs/libxfs/xfs_ialloc_btree.c b/fs/xfs/libxfs/xfs_ialloc_btree.c
index 797d5b5f7b72..401b42d52af6 100644
--- a/fs/xfs/libxfs/xfs_ialloc_btree.c
+++ b/fs/xfs/libxfs/xfs_ialloc_btree.c
@@ -572,11 +572,11 @@ xfs_inobt_block_maxrecs(
/*
* Calculate number of records in an inobt btree block.
*/
-int
+unsigned int
xfs_inobt_maxrecs(
struct xfs_mount *mp,
- int blocklen,
- int leaf)
+ unsigned int blocklen,
+ bool leaf)
{
blocklen -= XFS_INOBT_BLOCK_LEN(mp);
return xfs_inobt_block_maxrecs(blocklen, leaf);
diff --git a/fs/xfs/libxfs/xfs_ialloc_btree.h b/fs/xfs/libxfs/xfs_ialloc_btree.h
index 6472ec1ecbb4..300edf5bc009 100644
--- a/fs/xfs/libxfs/xfs_ialloc_btree.h
+++ b/fs/xfs/libxfs/xfs_ialloc_btree.h
@@ -50,7 +50,8 @@ struct xfs_btree_cur *xfs_inobt_init_cursor(struct xfs_perag *pag,
struct xfs_trans *tp, struct xfs_buf *agbp);
struct xfs_btree_cur *xfs_finobt_init_cursor(struct xfs_perag *pag,
struct xfs_trans *tp, struct xfs_buf *agbp);
-extern int xfs_inobt_maxrecs(struct xfs_mount *, int, int);
+unsigned int xfs_inobt_maxrecs(struct xfs_mount *mp, unsigned int blocklen,
+ bool leaf);
/* ir_holemask to inode allocation bitmap conversion */
uint64_t xfs_inobt_irec_to_allocmask(const struct xfs_inobt_rec_incore *irec);
diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c
index 973e027e3d88..1158ca48626b 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.c
+++ b/fs/xfs/libxfs/xfs_inode_fork.c
@@ -422,7 +422,7 @@ xfs_iroot_realloc(
* location. The records don't change location because
* they are kept butted up against the btree block header.
*/
- cur_max = xfs_bmbt_maxrecs(mp, ifp->if_broot_bytes, 0);
+ cur_max = xfs_bmbt_maxrecs(mp, ifp->if_broot_bytes, false);
new_max = cur_max + rec_diff;
new_size = xfs_bmap_broot_space_calc(mp, new_max);
ifp->if_broot = krealloc(ifp->if_broot, new_size,
@@ -444,7 +444,7 @@ xfs_iroot_realloc(
* records, just get rid of the root and clear the status bit.
*/
ASSERT((ifp->if_broot != NULL) && (ifp->if_broot_bytes > 0));
- cur_max = xfs_bmbt_maxrecs(mp, ifp->if_broot_bytes, 0);
+ cur_max = xfs_bmbt_maxrecs(mp, ifp->if_broot_bytes, false);
new_max = cur_max + rec_diff;
ASSERT(new_max >= 0);
if (new_max > 0)
diff --git a/fs/xfs/libxfs/xfs_refcount_btree.c b/fs/xfs/libxfs/xfs_refcount_btree.c
index cb3b1d42ae9a..795928d1a66d 100644
--- a/fs/xfs/libxfs/xfs_refcount_btree.c
+++ b/fs/xfs/libxfs/xfs_refcount_btree.c
@@ -417,9 +417,10 @@ xfs_refcountbt_block_maxrecs(
/*
* Calculate the number of records in a refcount btree block.
*/
-int
+unsigned int
xfs_refcountbt_maxrecs(
- int blocklen,
+ struct xfs_mount *mp,
+ unsigned int blocklen,
bool leaf)
{
blocklen -= XFS_REFCOUNT_BLOCK_LEN;
diff --git a/fs/xfs/libxfs/xfs_refcount_btree.h b/fs/xfs/libxfs/xfs_refcount_btree.h
index 1e0ab25f6c68..beb93bef6a81 100644
--- a/fs/xfs/libxfs/xfs_refcount_btree.h
+++ b/fs/xfs/libxfs/xfs_refcount_btree.h
@@ -48,7 +48,8 @@ struct xbtree_afakeroot;
extern struct xfs_btree_cur *xfs_refcountbt_init_cursor(struct xfs_mount *mp,
struct xfs_trans *tp, struct xfs_buf *agbp,
struct xfs_perag *pag);
-extern int xfs_refcountbt_maxrecs(int blocklen, bool leaf);
+unsigned int xfs_refcountbt_maxrecs(struct xfs_mount *mp, unsigned int blocklen,
+ bool leaf);
extern void xfs_refcountbt_compute_maxlevels(struct xfs_mount *mp);
extern xfs_extlen_t xfs_refcountbt_calc_size(struct xfs_mount *mp,
diff --git a/fs/xfs/libxfs/xfs_rmap_btree.c b/fs/xfs/libxfs/xfs_rmap_btree.c
index 56fd6c4bd8b4..ac2f1f499b76 100644
--- a/fs/xfs/libxfs/xfs_rmap_btree.c
+++ b/fs/xfs/libxfs/xfs_rmap_btree.c
@@ -731,10 +731,11 @@ xfs_rmapbt_block_maxrecs(
/*
* Calculate number of records in an rmap btree block.
*/
-int
+unsigned int
xfs_rmapbt_maxrecs(
- int blocklen,
- int leaf)
+ struct xfs_mount *mp,
+ unsigned int blocklen,
+ bool leaf)
{
blocklen -= XFS_RMAP_BLOCK_LEN;
return xfs_rmapbt_block_maxrecs(blocklen, leaf);
diff --git a/fs/xfs/libxfs/xfs_rmap_btree.h b/fs/xfs/libxfs/xfs_rmap_btree.h
index eb90d89e8086..119b1567cd0e 100644
--- a/fs/xfs/libxfs/xfs_rmap_btree.h
+++ b/fs/xfs/libxfs/xfs_rmap_btree.h
@@ -47,7 +47,8 @@ struct xfs_btree_cur *xfs_rmapbt_init_cursor(struct xfs_mount *mp,
struct xfs_perag *pag);
void xfs_rmapbt_commit_staged_btree(struct xfs_btree_cur *cur,
struct xfs_trans *tp, struct xfs_buf *agbp);
-int xfs_rmapbt_maxrecs(int blocklen, int leaf);
+unsigned int xfs_rmapbt_maxrecs(struct xfs_mount *mp, unsigned int blocklen,
+ bool leaf);
extern void xfs_rmapbt_compute_maxlevels(struct xfs_mount *mp);
extern xfs_extlen_t xfs_rmapbt_calc_size(struct xfs_mount *mp,
diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
index a6fa9aedb28b..d95409f3cba6 100644
--- a/fs/xfs/libxfs/xfs_sb.c
+++ b/fs/xfs/libxfs/xfs_sb.c
@@ -1000,23 +1000,23 @@ xfs_sb_mount_common(
mp->m_blockwmask = mp->m_blockwsize - 1;
xfs_mount_sb_set_rextsize(mp, sbp);
- mp->m_alloc_mxr[0] = xfs_allocbt_maxrecs(mp, sbp->sb_blocksize, 1);
- mp->m_alloc_mxr[1] = xfs_allocbt_maxrecs(mp, sbp->sb_blocksize, 0);
+ mp->m_alloc_mxr[0] = xfs_allocbt_maxrecs(mp, sbp->sb_blocksize, true);
+ mp->m_alloc_mxr[1] = xfs_allocbt_maxrecs(mp, sbp->sb_blocksize, false);
mp->m_alloc_mnr[0] = mp->m_alloc_mxr[0] / 2;
mp->m_alloc_mnr[1] = mp->m_alloc_mxr[1] / 2;
- mp->m_bmap_dmxr[0] = xfs_bmbt_maxrecs(mp, sbp->sb_blocksize, 1);
- mp->m_bmap_dmxr[1] = xfs_bmbt_maxrecs(mp, sbp->sb_blocksize, 0);
+ mp->m_bmap_dmxr[0] = xfs_bmbt_maxrecs(mp, sbp->sb_blocksize, true);
+ mp->m_bmap_dmxr[1] = xfs_bmbt_maxrecs(mp, sbp->sb_blocksize, false);
mp->m_bmap_dmnr[0] = mp->m_bmap_dmxr[0] / 2;
mp->m_bmap_dmnr[1] = mp->m_bmap_dmxr[1] / 2;
- mp->m_rmap_mxr[0] = xfs_rmapbt_maxrecs(sbp->sb_blocksize, 1);
- mp->m_rmap_mxr[1] = xfs_rmapbt_maxrecs(sbp->sb_blocksize, 0);
+ mp->m_rmap_mxr[0] = xfs_rmapbt_maxrecs(mp, sbp->sb_blocksize, true);
+ mp->m_rmap_mxr[1] = xfs_rmapbt_maxrecs(mp, sbp->sb_blocksize, false);
mp->m_rmap_mnr[0] = mp->m_rmap_mxr[0] / 2;
mp->m_rmap_mnr[1] = mp->m_rmap_mxr[1] / 2;
- mp->m_refc_mxr[0] = xfs_refcountbt_maxrecs(sbp->sb_blocksize, true);
- mp->m_refc_mxr[1] = xfs_refcountbt_maxrecs(sbp->sb_blocksize, false);
+ mp->m_refc_mxr[0] = xfs_refcountbt_maxrecs(mp, sbp->sb_blocksize, true);
+ mp->m_refc_mxr[1] = xfs_refcountbt_maxrecs(mp, sbp->sb_blocksize, false);
mp->m_refc_mnr[0] = mp->m_refc_mxr[0] / 2;
mp->m_refc_mnr[1] = mp->m_refc_mxr[1] / 2;
* Re: [PATCH 1/1] xfs: introduce new file range commit ioctls
2024-09-02 18:23 ` [PATCH 1/1] xfs: introduce new file range commit ioctls Darrick J. Wong
@ 2024-09-03 7:52 ` Christian Brauner
2024-10-25 21:23 ` Darrick J. Wong
0 siblings, 1 reply; 53+ messages in thread
From: Christian Brauner @ 2024-09-03 7:52 UTC (permalink / raw)
To: Darrick J. Wong
Cc: chandanbabu, Jeff Layton, Christoph Hellwig, linux-fsdevel,
linux-xfs
On Mon, Sep 02, 2024 at 11:23:07AM GMT, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> This patch introduces two more new ioctls to manage atomic updates to
> file contents -- XFS_IOC_START_COMMIT and XFS_IOC_COMMIT_RANGE. The
> commit mechanism here is exactly the same as what XFS_IOC_EXCHANGE_RANGE
> does, but with the additional requirement that file2 cannot have changed
> since some sampling point. The start-commit ioctl performs the sampling
> of file attributes.
>
> Note: This patch currently samples i_ctime during START_COMMIT and
> checks that it hasn't changed during COMMIT_RANGE. This isn't entirely
> safe in kernels prior to 6.12 because ctime only had coarse grained
> granularity and very fast updates could collide with a COMMIT_RANGE.
> With the multi-granularity ctime introduced by Jeff Layton, it's now
> possible to update ctime such that this does not happen.
>
> It is critical, then, that this patch must not be backported to any
> kernel that does not support fine-grained file change timestamps.
>
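The granularity concern in the note above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct model_fresh`, `model_write()`, `model_start_commit()`, and `model_check_freshness()` are hypothetical names, timestamps are plain nanosecond counters, and the clock granularity is a parameter rather than anything the kernel exposes.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical snapshot of the attributes sampled for file2. */
struct model_fresh {
	uint64_t	ino;
	uint32_t	gen;
	int64_t		mtime_ns;
	int64_t		ctime_ns;
};

/* Mirrors the patch's size check: the snapshot must fit in six __u64 slots. */
static_assert(sizeof(struct model_fresh) <= 6 * sizeof(uint64_t),
	      "snapshot must fit the opaque freshness blob");

/* Hypothetical file state. */
struct model_file {
	struct model_fresh	attrs;
};

/* Round a non-negative timestamp down to the clock granularity. */
static int64_t
model_stamp(int64_t now_ns, int64_t granularity_ns)
{
	return now_ns - (now_ns % granularity_ns);
}

/* Record a write at now_ns using a clock with the given granularity. */
void
model_write(struct model_file *f, int64_t now_ns, int64_t granularity_ns)
{
	f->attrs.mtime_ns = model_stamp(now_ns, granularity_ns);
	f->attrs.ctime_ns = f->attrs.mtime_ns;
}

/* START_COMMIT analogue: sample the attributes. */
struct model_fresh
model_start_commit(const struct model_file *f)
{
	return f->attrs;
}

/* COMMIT_RANGE analogue: 0 if the sample still matches, else -EBUSY. */
int
model_check_freshness(const struct model_file *f,
		      const struct model_fresh *snap)
{
	if (snap->ino != f->attrs.ino ||
	    snap->gen != f->attrs.gen ||
	    snap->mtime_ns != f->attrs.mtime_ns ||
	    snap->ctime_ns != f->attrs.ctime_ns)
		return -EBUSY;
	return 0;
}
```

In this model, two writes that land in the same coarse clock tick produce identical timestamps, so a write made after the sample goes unnoticed; with a fine-grained clock the same sequence is rejected with -EBUSY.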
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> Acked-by: Jeff Layton <jlayton@kernel.org>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> ---
> fs/xfs/libxfs/xfs_fs.h | 26 +++++++++
> fs/xfs/xfs_exchrange.c | 143 ++++++++++++++++++++++++++++++++++++++++++++++++
> fs/xfs/xfs_exchrange.h | 16 +++++
> fs/xfs/xfs_ioctl.c | 4 +
> fs/xfs/xfs_trace.h | 57 +++++++++++++++++++
> 5 files changed, 243 insertions(+), 3 deletions(-)
>
>
> diff --git a/fs/xfs/libxfs/xfs_fs.h b/fs/xfs/libxfs/xfs_fs.h
> index 454b63ef7201..c85c8077fac3 100644
> --- a/fs/xfs/libxfs/xfs_fs.h
> +++ b/fs/xfs/libxfs/xfs_fs.h
> @@ -825,6 +825,30 @@ struct xfs_exchange_range {
> __u64 flags; /* see XFS_EXCHANGE_RANGE_* below */
> };
>
> +/*
> + * Using the same definition of file2 as struct xfs_exchange_range, commit the
> + * contents of file1 into file2 if file2 has the same inode number, mtime, and
> + * ctime as the arguments provided to the call. The old contents of file2 will
> + * be moved to file1.
> + *
> + * Returns -EBUSY if there isn't an exact match for the file2 fields.
> + *
> + * Filesystems must be able to restart and complete the operation even after
> + * the system goes down.
> + */
> +struct xfs_commit_range {
> + __s32 file1_fd;
> + __u32 pad; /* must be zeroes */
> + __u64 file1_offset; /* file1 offset, bytes */
> + __u64 file2_offset; /* file2 offset, bytes */
> + __u64 length; /* bytes to exchange */
> +
> + __u64 flags; /* see XFS_EXCHANGE_RANGE_* below */
> +
> + /* opaque file2 metadata for freshness checks */
> + __u64 file2_freshness[6];
> +};
> +
> /*
> * Exchange file data all the way to the ends of both files, and then exchange
> * the file sizes. This flag can be used to replace a file's contents with a
> @@ -997,6 +1021,8 @@ struct xfs_getparents_by_handle {
> #define XFS_IOC_BULKSTAT _IOR ('X', 127, struct xfs_bulkstat_req)
> #define XFS_IOC_INUMBERS _IOR ('X', 128, struct xfs_inumbers_req)
> #define XFS_IOC_EXCHANGE_RANGE _IOW ('X', 129, struct xfs_exchange_range)
> +#define XFS_IOC_START_COMMIT _IOR ('X', 130, struct xfs_commit_range)
> +#define XFS_IOC_COMMIT_RANGE _IOW ('X', 131, struct xfs_commit_range)
> /* XFS_IOC_GETFSUUID ---------- deprecated 140 */
>
>
> diff --git a/fs/xfs/xfs_exchrange.c b/fs/xfs/xfs_exchrange.c
> index c8a655c92c92..d0889190ab7f 100644
> --- a/fs/xfs/xfs_exchrange.c
> +++ b/fs/xfs/xfs_exchrange.c
> @@ -72,6 +72,34 @@ xfs_exchrange_estimate(
> return error;
> }
>
> +/*
> + * Check that file2's metadata agree with the snapshot that we took for the
> + * range commit request.
> + *
> + * This should be called after the filesystem has locked /all/ inode metadata
> + * against modification.
> + */
> +STATIC int
> +xfs_exchrange_check_freshness(
> + const struct xfs_exchrange *fxr,
> + struct xfs_inode *ip2)
> +{
> + struct inode *inode2 = VFS_I(ip2);
> + struct timespec64 ctime = inode_get_ctime(inode2);
> + struct timespec64 mtime = inode_get_mtime(inode2);
> +
> + trace_xfs_exchrange_freshness(fxr, ip2);
> +
> + /* Check that file2 hasn't otherwise been modified. */
> + if (fxr->file2_ino != ip2->i_ino ||
> + fxr->file2_gen != inode2->i_generation ||
> + !timespec64_equal(&fxr->file2_ctime, &ctime) ||
> + !timespec64_equal(&fxr->file2_mtime, &mtime))
> + return -EBUSY;
> +
> + return 0;
> +}
> +
> #define QRETRY_IP1 (0x1)
> #define QRETRY_IP2 (0x2)
>
> @@ -607,6 +635,12 @@ xfs_exchrange_prep(
> if (error || fxr->length == 0)
> return error;
>
> + if (fxr->flags & __XFS_EXCHANGE_RANGE_CHECK_FRESH2) {
> + error = xfs_exchrange_check_freshness(fxr, ip2);
> + if (error)
> + return error;
> + }
> +
> /* Attach dquots to both inodes before changing block maps. */
> error = xfs_qm_dqattach(ip2);
> if (error)
> @@ -719,7 +753,8 @@ xfs_exchange_range(
> if (fxr->file1->f_path.mnt != fxr->file2->f_path.mnt)
> return -EXDEV;
>
> - if (fxr->flags & ~XFS_EXCHANGE_RANGE_ALL_FLAGS)
> + if (fxr->flags & ~(XFS_EXCHANGE_RANGE_ALL_FLAGS |
> + __XFS_EXCHANGE_RANGE_CHECK_FRESH2))
> return -EINVAL;
>
> /* Userspace requests only honored for regular files. */
> @@ -802,3 +837,109 @@ xfs_ioc_exchange_range(
> fdput(file1);
> return error;
> }
> +
> +/* Opaque freshness blob for XFS_IOC_COMMIT_RANGE */
> +struct xfs_commit_range_fresh {
> + xfs_fsid_t fsid; /* m_fixedfsid */
> + __u64 file2_ino; /* inode number */
> + __s64 file2_mtime; /* modification time */
> + __s64 file2_ctime; /* change time */
> + __s32 file2_mtime_nsec; /* mod time, nsec */
> + __s32 file2_ctime_nsec; /* change time, nsec */
> + __u32 file2_gen; /* inode generation */
> + __u32 magic; /* zero */
> +};
> +#define XCR_FRESH_MAGIC 0x444F524B /* DORK */
> +
> +/* Set up a commitrange operation by sampling file2's write-related attrs */
> +long
> +xfs_ioc_start_commit(
> + struct file *file,
> + struct xfs_commit_range __user *argp)
> +{
> + struct xfs_commit_range args = { };
> + struct timespec64 ts;
> + struct xfs_commit_range_fresh *kern_f;
> + struct xfs_commit_range_fresh __user *user_f;
> + struct inode *inode2 = file_inode(file);
> + struct xfs_inode *ip2 = XFS_I(inode2);
> + const unsigned int lockflags = XFS_IOLOCK_SHARED |
> + XFS_MMAPLOCK_SHARED |
> + XFS_ILOCK_SHARED;
> +
> + BUILD_BUG_ON(sizeof(struct xfs_commit_range_fresh) !=
> + sizeof(args.file2_freshness));
> +
> + kern_f = (struct xfs_commit_range_fresh *)&args.file2_freshness;
> +
> + memcpy(&kern_f->fsid, ip2->i_mount->m_fixedfsid, sizeof(xfs_fsid_t));
> +
> + xfs_ilock(ip2, lockflags);
> + ts = inode_get_ctime(inode2);
> + kern_f->file2_ctime = ts.tv_sec;
> + kern_f->file2_ctime_nsec = ts.tv_nsec;
> + ts = inode_get_mtime(inode2);
> + kern_f->file2_mtime = ts.tv_sec;
> + kern_f->file2_mtime_nsec = ts.tv_nsec;
> + kern_f->file2_ino = ip2->i_ino;
> + kern_f->file2_gen = inode2->i_generation;
> + kern_f->magic = XCR_FRESH_MAGIC;
> + xfs_iunlock(ip2, lockflags);
> +
> + user_f = (struct xfs_commit_range_fresh __user *)&argp->file2_freshness;
> + if (copy_to_user(user_f, kern_f, sizeof(*kern_f)))
> + return -EFAULT;
> +
> + return 0;
> +}
> +
> +/*
> + * Exchange file1 and file2 contents if file2 has not been written since the
> + * start commit operation.
> + */
> +long
> +xfs_ioc_commit_range(
> + struct file *file,
> + struct xfs_commit_range __user *argp)
> +{
> + struct xfs_exchrange fxr = {
> + .file2 = file,
> + };
> + struct xfs_commit_range args;
> + struct xfs_commit_range_fresh *kern_f;
> + struct xfs_inode *ip2 = XFS_I(file_inode(file));
> + struct xfs_mount *mp = ip2->i_mount;
> + struct fd file1;
> + int error;
> +
> + kern_f = (struct xfs_commit_range_fresh *)&args.file2_freshness;
> +
> + if (copy_from_user(&args, argp, sizeof(args)))
> + return -EFAULT;
> + if (args.flags & ~XFS_EXCHANGE_RANGE_ALL_FLAGS)
> + return -EINVAL;
> + if (kern_f->magic != XCR_FRESH_MAGIC)
> + return -EBUSY;
> + if (memcmp(&kern_f->fsid, mp->m_fixedfsid, sizeof(xfs_fsid_t)))
> + return -EBUSY;
So, I mentioned this before in another mail a few months ago and I think
you liked the idea so just as a reminder in case you forgot:
Ioctls are extensible if done correctly:
	switch (_IOC_NR(ioctl)) {
	case _IOC_NR(XFS_IOC_START_COMMIT): {
		size_t usize = _IOC_SIZE(ioctl);
		struct xfs_commit_range args;
		if (usize < XFS_IOC_START_COMMIT_SIZE_VER0)
			return -EINVAL;
		if (copy_struct_from_user(&args, sizeof(args), argp, usize))
			return -EFAULT;
	}
	}
If you code it this way, relying on copy_struct_from_user() right from
the start, you can easily extend your struct in a backward and forward
compatible manner.
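The size-compatibility rules that make this kind of struct extensible can be modeled in userspace. The following is a rough sketch, not the kernel helper: `model_copy_struct()` is a hypothetical name, it uses memcpy() in place of a real user copy, and it only mirrors the zero-fill and trailing-zero checks as I understand them.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy a caller-supplied struct of usize bytes into a ksize-byte struct.
 * No user/kernel boundary is crossed here; this only models the size rules.
 */
int
model_copy_struct(void *dst, size_t ksize, const void *src, size_t usize)
{
	size_t			size = usize < ksize ? usize : ksize;
	const unsigned char	*tail;
	size_t			i;

	assert(dst != NULL && src != NULL);
	tail = (const unsigned char *)src + size;

	if (usize > ksize) {
		/* Newer caller: fields we do not know about must be zero. */
		for (i = 0; i < usize - ksize; i++)
			if (tail[i] != 0)
				return -E2BIG;
	} else if (usize < ksize) {
		/* Older caller: fields it does not know about become zero. */
		memset((unsigned char *)dst + size, 0, ksize - usize);
	}
	memcpy(dst, src, size);
	return 0;
}
```

An older, smaller struct is accepted and the missing tail reads as zero; a newer, larger struct is accepted only while the fields the receiver does not understand are zero.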
> +
> + fxr.file1_offset = args.file1_offset;
> + fxr.file2_offset = args.file2_offset;
> + fxr.length = args.length;
> + fxr.flags = args.flags | __XFS_EXCHANGE_RANGE_CHECK_FRESH2;
> + fxr.file2_ino = kern_f->file2_ino;
> + fxr.file2_gen = kern_f->file2_gen;
> + fxr.file2_mtime.tv_sec = kern_f->file2_mtime;
> + fxr.file2_mtime.tv_nsec = kern_f->file2_mtime_nsec;
> + fxr.file2_ctime.tv_sec = kern_f->file2_ctime;
> + fxr.file2_ctime.tv_nsec = kern_f->file2_ctime_nsec;
> +
> + file1 = fdget(args.file1_fd);
> + if (!file1.file)
> + return -EBADF;
Please use CLASS(fd, f)(args.file1_fd) :)
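For context, the point of a scope-based helper is that the reference is dropped on every return path without an explicit put call. A minimal userspace sketch of that idea follows, assuming a GCC or Clang compiler that supports the `cleanup` variable attribute; `struct model_fd`, `model_fdget()`, `model_fdput()`, and `model_commit()` are hypothetical names and do not reflect the kernel's actual definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical handle standing in for a file descriptor reference. */
struct model_fd {
	int	valid;
};

/* Counts releases so the automatic cleanup is observable. */
int model_put_count;

static void
model_fdput(struct model_fd *f)
{
	assert(f != NULL);
	if (f->valid)
		model_put_count++;
}

/* Compiler extension: call model_fdput() when the variable leaves scope. */
#define MODEL_CLEANUP_FD	__attribute__((cleanup(model_fdput)))

static struct model_fd
model_fdget(int fd)
{
	struct model_fd f = { .valid = fd >= 0 };

	return f;
}

/* Every return path below releases the handle automatically. */
int
model_commit(int fd, int fail_late)
{
	struct model_fd f MODEL_CLEANUP_FD = model_fdget(fd);

	if (!f.valid)
		return -EBADF;
	if (fail_late)
		return -EINVAL;
	return 0;
}
```

The design benefit is that adding a new early return later cannot leak the reference, which is the class of bug the explicit get/put pairing invites.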
> + fxr.file1 = file1.file;
> +
> + error = xfs_exchange_range(&fxr);
> + fdput(file1);
> + return error;
> +}
> diff --git a/fs/xfs/xfs_exchrange.h b/fs/xfs/xfs_exchrange.h
> index 039abcca546e..bc1298aba806 100644
> --- a/fs/xfs/xfs_exchrange.h
> +++ b/fs/xfs/xfs_exchrange.h
> @@ -10,8 +10,12 @@
> #define __XFS_EXCHANGE_RANGE_UPD_CMTIME1 (1ULL << 63)
> #define __XFS_EXCHANGE_RANGE_UPD_CMTIME2 (1ULL << 62)
>
> +/* Freshness check required */
> +#define __XFS_EXCHANGE_RANGE_CHECK_FRESH2 (1ULL << 61)
> +
> #define XFS_EXCHANGE_RANGE_PRIV_FLAGS (__XFS_EXCHANGE_RANGE_UPD_CMTIME1 | \
> - __XFS_EXCHANGE_RANGE_UPD_CMTIME2)
> + __XFS_EXCHANGE_RANGE_UPD_CMTIME2 | \
> + __XFS_EXCHANGE_RANGE_CHECK_FRESH2)
>
> struct xfs_exchrange {
> struct file *file1;
> @@ -22,10 +26,20 @@ struct xfs_exchrange {
> u64 length;
>
> u64 flags; /* XFS_EXCHANGE_RANGE flags */
> +
> + /* file2 metadata for freshness checks */
> + u64 file2_ino;
> + struct timespec64 file2_mtime;
> + struct timespec64 file2_ctime;
> + u32 file2_gen;
> };
>
> long xfs_ioc_exchange_range(struct file *file,
> struct xfs_exchange_range __user *argp);
> +long xfs_ioc_start_commit(struct file *file,
> + struct xfs_commit_range __user *argp);
> +long xfs_ioc_commit_range(struct file *file,
> + struct xfs_commit_range __user *argp);
>
> struct xfs_exchmaps_req;
>
> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> index 6b13666d4e96..90b3ee21e7fe 100644
> --- a/fs/xfs/xfs_ioctl.c
> +++ b/fs/xfs/xfs_ioctl.c
> @@ -1518,6 +1518,10 @@ xfs_file_ioctl(
>
> case XFS_IOC_EXCHANGE_RANGE:
> return xfs_ioc_exchange_range(filp, arg);
> + case XFS_IOC_START_COMMIT:
> + return xfs_ioc_start_commit(filp, arg);
> + case XFS_IOC_COMMIT_RANGE:
> + return xfs_ioc_commit_range(filp, arg);
>
> default:
> return -ENOTTY;
> diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
> index 180ce697305a..4cf0fa71ba9c 100644
> --- a/fs/xfs/xfs_trace.h
> +++ b/fs/xfs/xfs_trace.h
> @@ -4926,7 +4926,8 @@ DEFINE_INODE_ERROR_EVENT(xfs_exchrange_error);
> { XFS_EXCHANGE_RANGE_DRY_RUN, "DRY_RUN" }, \
> { XFS_EXCHANGE_RANGE_FILE1_WRITTEN, "F1_WRITTEN" }, \
> { __XFS_EXCHANGE_RANGE_UPD_CMTIME1, "CMTIME1" }, \
> - { __XFS_EXCHANGE_RANGE_UPD_CMTIME2, "CMTIME2" }
> + { __XFS_EXCHANGE_RANGE_UPD_CMTIME2, "CMTIME2" }, \
> + { __XFS_EXCHANGE_RANGE_CHECK_FRESH2, "FRESH2" }
>
> /* file exchange-range tracepoint class */
> DECLARE_EVENT_CLASS(xfs_exchrange_class,
> @@ -4986,6 +4987,60 @@ DEFINE_EXCHRANGE_EVENT(xfs_exchrange_prep);
> DEFINE_EXCHRANGE_EVENT(xfs_exchrange_flush);
> DEFINE_EXCHRANGE_EVENT(xfs_exchrange_mappings);
>
> +TRACE_EVENT(xfs_exchrange_freshness,
> + TP_PROTO(const struct xfs_exchrange *fxr, struct xfs_inode *ip2),
> + TP_ARGS(fxr, ip2),
> + TP_STRUCT__entry(
> + __field(dev_t, dev)
> + __field(xfs_ino_t, ip2_ino)
> + __field(long long, ip2_mtime)
> + __field(long long, ip2_ctime)
> + __field(int, ip2_mtime_nsec)
> + __field(int, ip2_ctime_nsec)
> +
> + __field(xfs_ino_t, file2_ino)
> + __field(long long, file2_mtime)
> + __field(long long, file2_ctime)
> + __field(int, file2_mtime_nsec)
> + __field(int, file2_ctime_nsec)
> + ),
> + TP_fast_assign(
> + struct timespec64 ts64;
> + struct inode *inode2 = VFS_I(ip2);
> +
> + __entry->dev = inode2->i_sb->s_dev;
> + __entry->ip2_ino = ip2->i_ino;
> +
> + ts64 = inode_get_ctime(inode2);
> + __entry->ip2_ctime = ts64.tv_sec;
> + __entry->ip2_ctime_nsec = ts64.tv_nsec;
> +
> + ts64 = inode_get_mtime(inode2);
> + __entry->ip2_mtime = ts64.tv_sec;
> + __entry->ip2_mtime_nsec = ts64.tv_nsec;
> +
> + __entry->file2_ino = fxr->file2_ino;
> + __entry->file2_mtime = fxr->file2_mtime.tv_sec;
> + __entry->file2_ctime = fxr->file2_ctime.tv_sec;
> + __entry->file2_mtime_nsec = fxr->file2_mtime.tv_nsec;
> + __entry->file2_ctime_nsec = fxr->file2_ctime.tv_nsec;
> + ),
> + TP_printk("dev %d:%d "
> + "ino 0x%llx mtime %lld:%d ctime %lld:%d -> "
> + "file 0x%llx mtime %lld:%d ctime %lld:%d",
> + MAJOR(__entry->dev), MINOR(__entry->dev),
> + __entry->ip2_ino,
> + __entry->ip2_mtime,
> + __entry->ip2_mtime_nsec,
> + __entry->ip2_ctime,
> + __entry->ip2_ctime_nsec,
> + __entry->file2_ino,
> + __entry->file2_mtime,
> + __entry->file2_mtime_nsec,
> + __entry->file2_ctime,
> + __entry->file2_ctime_nsec)
> +);
> +
> TRACE_EVENT(xfs_exchmaps_overhead,
> TP_PROTO(struct xfs_mount *mp, unsigned long long bmbt_blocks,
> unsigned long long rmapbt_blocks),
>
* Re: [PATCH 1/1] xfs: introduce new file range commit ioctls
2024-09-03 7:52 ` Christian Brauner
@ 2024-10-25 21:23 ` Darrick J. Wong
0 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2024-10-25 21:23 UTC (permalink / raw)
To: Christian Brauner
Cc: chandanbabu, Jeff Layton, Christoph Hellwig, linux-fsdevel,
linux-xfs
On Tue, Sep 03, 2024 at 09:52:43AM +0200, Christian Brauner wrote:
> On Mon, Sep 02, 2024 at 11:23:07AM GMT, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> >
> > This patch introduces two more new ioctls to manage atomic updates to
> > file contents -- XFS_IOC_START_COMMIT and XFS_IOC_COMMIT_RANGE. The
> > commit mechanism here is exactly the same as what XFS_IOC_EXCHANGE_RANGE
> > does, but with the additional requirement that file2 cannot have changed
> > since some sampling point. The start-commit ioctl performs the sampling
> > of file attributes.
> >
> > Note: This patch currently samples i_ctime during START_COMMIT and
> > checks that it hasn't changed during COMMIT_RANGE. This isn't entirely
> > safe in kernels prior to 6.12 because ctime only had coarse grained
> > granularity and very fast updates could collide with a COMMIT_RANGE.
> > With the multi-granularity ctime introduced by Jeff Layton, it's now
> > possible to update ctime such that this does not happen.
> >
> > It is critical, then, that this patch must not be backported to any
> > kernel that does not support fine-grained file change timestamps.
> >
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > Acked-by: Jeff Layton <jlayton@kernel.org>
> > Reviewed-by: Christoph Hellwig <hch@lst.de>
> > ---
> > fs/xfs/libxfs/xfs_fs.h | 26 +++++++++
> > fs/xfs/xfs_exchrange.c | 143 ++++++++++++++++++++++++++++++++++++++++++++++++
> > fs/xfs/xfs_exchrange.h | 16 +++++
> > fs/xfs/xfs_ioctl.c | 4 +
> > fs/xfs/xfs_trace.h | 57 +++++++++++++++++++
> > 5 files changed, 243 insertions(+), 3 deletions(-)
> >
> >
> > diff --git a/fs/xfs/libxfs/xfs_fs.h b/fs/xfs/libxfs/xfs_fs.h
> > index 454b63ef7201..c85c8077fac3 100644
> > --- a/fs/xfs/libxfs/xfs_fs.h
> > +++ b/fs/xfs/libxfs/xfs_fs.h
> > @@ -825,6 +825,30 @@ struct xfs_exchange_range {
> > __u64 flags; /* see XFS_EXCHANGE_RANGE_* below */
> > };
> >
> > +/*
> > + * Using the same definition of file2 as struct xfs_exchange_range, commit the
> > + * contents of file1 into file2 if file2 has the same inode number, mtime, and
> > + * ctime as the arguments provided to the call. The old contents of file2 will
> > + * be moved to file1.
> > + *
> > + * Returns -EBUSY if there isn't an exact match for the file2 fields.
> > + *
> > + * Filesystems must be able to restart and complete the operation even after
> > + * the system goes down.
> > + */
> > +struct xfs_commit_range {
> > + __s32 file1_fd;
> > + __u32 pad; /* must be zeroes */
> > + __u64 file1_offset; /* file1 offset, bytes */
> > + __u64 file2_offset; /* file2 offset, bytes */
> > + __u64 length; /* bytes to exchange */
> > +
> > + __u64 flags; /* see XFS_EXCHANGE_RANGE_* below */
> > +
> > + /* opaque file2 metadata for freshness checks */
> > + __u64 file2_freshness[6];
> > +};
> > +
> > /*
> > * Exchange file data all the way to the ends of both files, and then exchange
> > * the file sizes. This flag can be used to replace a file's contents with a
> > @@ -997,6 +1021,8 @@ struct xfs_getparents_by_handle {
> > #define XFS_IOC_BULKSTAT _IOR ('X', 127, struct xfs_bulkstat_req)
> > #define XFS_IOC_INUMBERS _IOR ('X', 128, struct xfs_inumbers_req)
> > #define XFS_IOC_EXCHANGE_RANGE _IOW ('X', 129, struct xfs_exchange_range)
> > +#define XFS_IOC_START_COMMIT _IOR ('X', 130, struct xfs_commit_range)
> > +#define XFS_IOC_COMMIT_RANGE _IOW ('X', 131, struct xfs_commit_range)
> > /* XFS_IOC_GETFSUUID ---------- deprecated 140 */
> >
> >
> > diff --git a/fs/xfs/xfs_exchrange.c b/fs/xfs/xfs_exchrange.c
> > index c8a655c92c92..d0889190ab7f 100644
> > --- a/fs/xfs/xfs_exchrange.c
> > +++ b/fs/xfs/xfs_exchrange.c
> > @@ -72,6 +72,34 @@ xfs_exchrange_estimate(
> > return error;
> > }
> >
> > +/*
> > + * Check that file2's metadata agree with the snapshot that we took for the
> > + * range commit request.
> > + *
> > + * This should be called after the filesystem has locked /all/ inode metadata
> > + * against modification.
> > + */
> > +STATIC int
> > +xfs_exchrange_check_freshness(
> > + const struct xfs_exchrange *fxr,
> > + struct xfs_inode *ip2)
> > +{
> > + struct inode *inode2 = VFS_I(ip2);
> > + struct timespec64 ctime = inode_get_ctime(inode2);
> > + struct timespec64 mtime = inode_get_mtime(inode2);
> > +
> > + trace_xfs_exchrange_freshness(fxr, ip2);
> > +
> > + /* Check that file2 hasn't otherwise been modified. */
> > + if (fxr->file2_ino != ip2->i_ino ||
> > + fxr->file2_gen != inode2->i_generation ||
> > + !timespec64_equal(&fxr->file2_ctime, &ctime) ||
> > + !timespec64_equal(&fxr->file2_mtime, &mtime))
> > + return -EBUSY;
> > +
> > + return 0;
> > +}
> > +
> > #define QRETRY_IP1 (0x1)
> > #define QRETRY_IP2 (0x2)
> >
> > @@ -607,6 +635,12 @@ xfs_exchrange_prep(
> > if (error || fxr->length == 0)
> > return error;
> >
> > + if (fxr->flags & __XFS_EXCHANGE_RANGE_CHECK_FRESH2) {
> > + error = xfs_exchrange_check_freshness(fxr, ip2);
> > + if (error)
> > + return error;
> > + }
> > +
> > /* Attach dquots to both inodes before changing block maps. */
> > error = xfs_qm_dqattach(ip2);
> > if (error)
> > @@ -719,7 +753,8 @@ xfs_exchange_range(
> > if (fxr->file1->f_path.mnt != fxr->file2->f_path.mnt)
> > return -EXDEV;
> >
> > - if (fxr->flags & ~XFS_EXCHANGE_RANGE_ALL_FLAGS)
> > + if (fxr->flags & ~(XFS_EXCHANGE_RANGE_ALL_FLAGS |
> > + __XFS_EXCHANGE_RANGE_CHECK_FRESH2))
> > return -EINVAL;
> >
> > /* Userspace requests only honored for regular files. */
> > @@ -802,3 +837,109 @@ xfs_ioc_exchange_range(
> > fdput(file1);
> > return error;
> > }
> > +
> > +/* Opaque freshness blob for XFS_IOC_COMMIT_RANGE */
> > +struct xfs_commit_range_fresh {
> > + xfs_fsid_t fsid; /* m_fixedfsid */
> > + __u64 file2_ino; /* inode number */
> > + __s64 file2_mtime; /* modification time */
> > + __s64 file2_ctime; /* change time */
> > + __s32 file2_mtime_nsec; /* mod time, nsec */
> > + __s32 file2_ctime_nsec; /* change time, nsec */
> > + __u32 file2_gen; /* inode generation */
> > + __u32 magic; /* XCR_FRESH_MAGIC */
> > +};
> > +#define XCR_FRESH_MAGIC 0x444F524B /* DORK */
> > +
> > +/* Set up a commitrange operation by sampling file2's write-related attrs */
> > +long
> > +xfs_ioc_start_commit(
> > + struct file *file,
> > + struct xfs_commit_range __user *argp)
> > +{
> > + struct xfs_commit_range args = { };
> > + struct timespec64 ts;
> > + struct xfs_commit_range_fresh *kern_f;
> > + struct xfs_commit_range_fresh __user *user_f;
> > + struct inode *inode2 = file_inode(file);
> > + struct xfs_inode *ip2 = XFS_I(inode2);
> > + const unsigned int lockflags = XFS_IOLOCK_SHARED |
> > + XFS_MMAPLOCK_SHARED |
> > + XFS_ILOCK_SHARED;
> > +
> > + BUILD_BUG_ON(sizeof(struct xfs_commit_range_fresh) !=
> > + sizeof(args.file2_freshness));
> > +
> > + kern_f = (struct xfs_commit_range_fresh *)&args.file2_freshness;
> > +
> > + memcpy(&kern_f->fsid, ip2->i_mount->m_fixedfsid, sizeof(xfs_fsid_t));
> > +
> > + xfs_ilock(ip2, lockflags);
> > + ts = inode_get_ctime(inode2);
> > + kern_f->file2_ctime = ts.tv_sec;
> > + kern_f->file2_ctime_nsec = ts.tv_nsec;
> > + ts = inode_get_mtime(inode2);
> > + kern_f->file2_mtime = ts.tv_sec;
> > + kern_f->file2_mtime_nsec = ts.tv_nsec;
> > + kern_f->file2_ino = ip2->i_ino;
> > + kern_f->file2_gen = inode2->i_generation;
> > + kern_f->magic = XCR_FRESH_MAGIC;
> > + xfs_iunlock(ip2, lockflags);
> > +
> > + user_f = (struct xfs_commit_range_fresh __user *)&argp->file2_freshness;
> > + if (copy_to_user(user_f, kern_f, sizeof(*kern_f)))
> > + return -EFAULT;
> > +
> > + return 0;
> > +}
> > +
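
The opaque-blob pattern used by xfs_ioc_start_commit can be sketched in
userspace as follows.  Everything here is an assumption made for
illustration: the *_model names are hypothetical, the 48-byte blob size
is inferred from the field widths quoted above (with the fsid taken to
be two 32-bit words), and memcpy() replaces the kernel's pointer cast so
that the sketch stays within standard C aliasing rules.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed public layout: an opaque array the caller must not interpret. */
struct commit_range_model {
	uint64_t	file2_freshness[6];	/* assumption: 48 opaque bytes */
};

/* Private view of the same bytes, mirroring the fields quoted above. */
struct fresh_model {
	uint32_t	fsid[2];		/* assumed fsid layout */
	uint64_t	file2_ino;
	int64_t		file2_mtime;
	int64_t		file2_ctime;
	int32_t		file2_mtime_nsec;
	int32_t		file2_ctime_nsec;
	uint32_t	file2_gen;
	uint32_t	magic;
};

#define FRESH_MAGIC_MODEL	0x444F524BU	/* "DORK" */

/* Userspace analogue of the BUILD_BUG_ON size check. */
_Static_assert(sizeof(struct fresh_model) ==
	       sizeof(((struct commit_range_model *)0)->file2_freshness),
	       "private view must exactly fill the opaque blob");

/* Store the private view into the opaque blob. */
static void fresh_pack(struct commit_range_model *args,
		       const struct fresh_model *f)
{
	assert(args && f);
	memcpy(args->file2_freshness, f, sizeof(*f));
}

/* Load the private view back; return -1 if the magic does not match. */
static int fresh_unpack(const struct commit_range_model *args,
			struct fresh_model *f)
{
	assert(args && f);
	memcpy(f, args->file2_freshness, sizeof(*f));
	return f->magic == FRESH_MAGIC_MODEL ? 0 : -1;
}
```

The magic value lets the commit side reject a blob that was never
filled in by the start-commit side, such as an all-zeroes structure.
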
> > +/*
> > + * Exchange file1 and file2 contents if file2 has not been written since the
> > + * start commit operation.
> > + */
> > +long
> > +xfs_ioc_commit_range(
> > + struct file *file,
> > + struct xfs_commit_range __user *argp)
> > +{
> > + struct xfs_exchrange fxr = {
> > + .file2 = file,
> > + };
> > + struct xfs_commit_range args;
> > + struct xfs_commit_range_fresh *kern_f;
> > + struct xfs_inode *ip2 = XFS_I(file_inode(file));
> > + struct xfs_mount *mp = ip2->i_mount;
> > + struct fd file1;
> > + int error;
> > +
> > + kern_f = (struct xfs_commit_range_fresh *)&args.file2_freshness;
> > +
> > + if (copy_from_user(&args, argp, sizeof(args)))
> > + return -EFAULT;
> > + if (args.flags & ~XFS_EXCHANGE_RANGE_ALL_FLAGS)
> > + return -EINVAL;
> > + if (kern_f->magic != XCR_FRESH_MAGIC)
> > + return -EBUSY;
> > + if (memcmp(&kern_f->fsid, mp->m_fixedfsid, sizeof(xfs_fsid_t)))
> > + return -EBUSY;
>
> So, I mentioned this before in another mail a few months ago and I think
> you liked the idea so just as a reminder in case you forgot:
>
> Ioctls are extensible if done correctly:
>
> 	switch (_IOC_NR(ioctl)) {
> 	case _IOC_NR(XFS_IOC_START_COMMIT): {
> 		size_t usize = _IOC_SIZE(ioctl);
> 		struct xfs_commit_range args;
>
> 		if (usize < XFS_IOC_START_COMMIT_SIZE_VER0)
> 			return -EINVAL;
>
> 		if (copy_struct_from_user(&args, sizeof(args), argp, usize))
> 			return -EFAULT;
> 	}
>
> If you code it this way, relying on copy_struct_from_user() right from
> the start you can easily extend your struct in a backward and forward
> compatible manner.
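
As a small illustration of the dispatch idea quoted above, the sketch
below decodes the argument size that _IOWR() folds into an ioctl
number.  The DEMO_* numbers, the demo_args_* structs and
demo_user_size() are hypothetical and are not part of the XFS ABI; only
the _IOWR, _IOC_TYPE, _IOC_NR and _IOC_SIZE macros from <linux/ioctl.h>
are real.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <linux/ioctl.h>

/* Hypothetical version 0 and version 1 of an extensible argument struct. */
struct demo_args_v0 { uint64_t offset; uint64_t length; };
struct demo_args_v1 { uint64_t offset; uint64_t length; uint64_t flags; };

/* Hypothetical ioctls: same type ('X') and nr (200), different sizes. */
#define DEMO_IOC_V0	_IOWR('X', 200, struct demo_args_v0)
#define DEMO_IOC_V1	_IOWR('X', 200, struct demo_args_v1)
#define DEMO_SIZE_VER0	sizeof(struct demo_args_v0)

/* Return the caller's struct size if cmd is a DEMO ioctl, else 0. */
static size_t demo_user_size(unsigned int cmd)
{
	/* Match on type and nr only, so every struct version lands here. */
	if (_IOC_TYPE(cmd) != 'X' || _IOC_NR(cmd) != _IOC_NR(DEMO_IOC_V0))
		return 0;
	/* Anything smaller than the first published version is invalid. */
	if (_IOC_SIZE(cmd) < DEMO_SIZE_VER0)
		return 0;
	return _IOC_SIZE(cmd);
}
```

Because the size is part of the number, the two versions are distinct
ioctl values that share one nr, which is what lets a single case label
serve both.
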

I don't know that we'd really need it for commitrange since there's
plenty of space (~40 bytes) in the "opaque" freshness blob.  I suspect
that if we ever add subvolumes to XFS then we might want to take over
the 12 bytes used by mtime* for the subvolume id.

That said, I also think we could convert to this format pretty easily.
Also, copy_struct_from_user can return -E2BIG so I think that needs to
be:

	ret = copy_struct_from_user(&args, sizeof(args), argp, usize);
	if (ret)
		return ret;

Though the overriding reason for not writing the _IOC_NR dispatch code
this way is that every time I've tried to extend an xfs ioctl in this
manner, Dave says no because (I think) he doesn't trust how the struct
size is opaquely encoded in the ioctl number /and/ doesn't trust the
BUILD_BUG_ONs I put in the code to guarantee uniqueness.  So I pick a
new number so I can complete the review instead of starting over with a
different reviewer who doesn't have that particular preference.

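
To make the -E2BIG case concrete, here is a userspace model of the
size-negotiation rules that copy_struct_from_user() applies.  It is a
sketch only: model_copy_struct() is a hypothetical name, plain memcpy()
stands in for the real user-memory access, and the -EFAULT path is not
modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy a caller struct of usize bytes into a kernel struct of ksize
 * bytes.  Returns 0 on success or -E2BIG if a larger caller struct
 * sets bytes that this "kernel" does not understand.
 */
static int model_copy_struct(void *dst, size_t ksize,
			     const void *src, size_t usize)
{
	size_t size = ksize < usize ? ksize : usize;
	const unsigned char *tail = (const unsigned char *)src + size;
	size_t i;

	assert(dst && src);

	if (usize > ksize) {
		/* Newer caller: unknown trailing fields must all be zero. */
		for (i = 0; i < usize - ksize; i++)
			if (tail[i] != 0)
				return -E2BIG;
	} else if (usize < ksize) {
		/* Older caller: fields it does not know about read as zero. */
		memset((unsigned char *)dst + size, 0, ksize - usize);
	}

	/* Copy the part that both sides agree on. */
	memcpy(dst, src, size);
	return 0;
}
```

Propagating the return value, as suggested above, is what lets a newer
program learn that an older kernel cannot honor the extra fields
instead of seeing a misleading -EFAULT.
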
> }
>
> > +
> > + fxr.file1_offset = args.file1_offset;
> > + fxr.file2_offset = args.file2_offset;
> > + fxr.length = args.length;
> > + fxr.flags = args.flags | __XFS_EXCHANGE_RANGE_CHECK_FRESH2;
> > + fxr.file2_ino = kern_f->file2_ino;
> > + fxr.file2_gen = kern_f->file2_gen;
> > + fxr.file2_mtime.tv_sec = kern_f->file2_mtime;
> > + fxr.file2_mtime.tv_nsec = kern_f->file2_mtime_nsec;
> > + fxr.file2_ctime.tv_sec = kern_f->file2_ctime;
> > + fxr.file2_ctime.tv_nsec = kern_f->file2_ctime_nsec;
> > +
> > + file1 = fdget(args.file1_fd);
> > + if (!file1.file)
> > + return -EBADF;
>
> Please use CLASS(fd, f)(args.file1_fd) :)

Yeah, I saw that the fd cleanups collided with xfs in for-next, thanks
for the heads up.

--D

> > + fxr.file1 = file1.file;
> > +
> > + error = xfs_exchange_range(&fxr);
> > + fdput(file1);
> > + return error;
> > +}
> > diff --git a/fs/xfs/xfs_exchrange.h b/fs/xfs/xfs_exchrange.h
> > index 039abcca546e..bc1298aba806 100644
> > --- a/fs/xfs/xfs_exchrange.h
> > +++ b/fs/xfs/xfs_exchrange.h
> > @@ -10,8 +10,12 @@
> > #define __XFS_EXCHANGE_RANGE_UPD_CMTIME1 (1ULL << 63)
> > #define __XFS_EXCHANGE_RANGE_UPD_CMTIME2 (1ULL << 62)
> >
> > +/* Freshness check required */
> > +#define __XFS_EXCHANGE_RANGE_CHECK_FRESH2 (1ULL << 61)
> > +
> > #define XFS_EXCHANGE_RANGE_PRIV_FLAGS (__XFS_EXCHANGE_RANGE_UPD_CMTIME1 | \
> > - __XFS_EXCHANGE_RANGE_UPD_CMTIME2)
> > + __XFS_EXCHANGE_RANGE_UPD_CMTIME2 | \
> > + __XFS_EXCHANGE_RANGE_CHECK_FRESH2)
> >
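
The split between public and private flag bits can be sketched as
below.  The DEMO_* names and the two public bit values are hypothetical
(the real XFS_EXCHANGE_RANGE_* values are not quoted in this message);
only the positions of the three private bits are taken from the header
hunk above.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical public bits that userspace is allowed to pass in. */
#define DEMO_PUBLIC_FLAGS	((1ULL << 0) | (1ULL << 1))

/* Private bits, positioned as in the quoted header. */
#define DEMO_UPD_CMTIME1	(1ULL << 63)
#define DEMO_UPD_CMTIME2	(1ULL << 62)
#define DEMO_CHECK_FRESH2	(1ULL << 61)

/*
 * ioctl entry point: reject anything outside the public mask, then
 * add the private freshness request on the caller's behalf.
 */
static int demo_commit_flags(uint64_t user_flags, uint64_t *kernel_flags)
{
	assert(kernel_flags);

	if (user_flags & ~DEMO_PUBLIC_FLAGS)
		return -EINVAL;
	*kernel_flags = user_flags | DEMO_CHECK_FRESH2;
	return 0;
}

/* Inner check: public bits plus the one private bit callers may add. */
static int demo_validate(uint64_t flags)
{
	if (flags & ~(DEMO_PUBLIC_FLAGS | DEMO_CHECK_FRESH2))
		return -EINVAL;
	return 0;
}
```

Keeping the private bits at the top of the 64-bit word leaves the low
bits free for future public flags without renumbering anything.
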
> > struct xfs_exchrange {
> > struct file *file1;
> > @@ -22,10 +26,20 @@ struct xfs_exchrange {
> > u64 length;
> >
> > u64 flags; /* XFS_EXCHANGE_RANGE flags */
> > +
> > + /* file2 metadata for freshness checks */
> > + u64 file2_ino;
> > + struct timespec64 file2_mtime;
> > + struct timespec64 file2_ctime;
> > + u32 file2_gen;
> > };
> >
> > long xfs_ioc_exchange_range(struct file *file,
> > struct xfs_exchange_range __user *argp);
> > +long xfs_ioc_start_commit(struct file *file,
> > + struct xfs_commit_range __user *argp);
> > +long xfs_ioc_commit_range(struct file *file,
> > + struct xfs_commit_range __user *argp);
> >
> > struct xfs_exchmaps_req;
> >
> > diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> > index 6b13666d4e96..90b3ee21e7fe 100644
> > --- a/fs/xfs/xfs_ioctl.c
> > +++ b/fs/xfs/xfs_ioctl.c
> > @@ -1518,6 +1518,10 @@ xfs_file_ioctl(
> >
> > case XFS_IOC_EXCHANGE_RANGE:
> > return xfs_ioc_exchange_range(filp, arg);
> > + case XFS_IOC_START_COMMIT:
> > + return xfs_ioc_start_commit(filp, arg);
> > + case XFS_IOC_COMMIT_RANGE:
> > + return xfs_ioc_commit_range(filp, arg);
> >
> > default:
> > return -ENOTTY;
> > diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
> > index 180ce697305a..4cf0fa71ba9c 100644
> > --- a/fs/xfs/xfs_trace.h
> > +++ b/fs/xfs/xfs_trace.h
> > @@ -4926,7 +4926,8 @@ DEFINE_INODE_ERROR_EVENT(xfs_exchrange_error);
> > { XFS_EXCHANGE_RANGE_DRY_RUN, "DRY_RUN" }, \
> > { XFS_EXCHANGE_RANGE_FILE1_WRITTEN, "F1_WRITTEN" }, \
> > { __XFS_EXCHANGE_RANGE_UPD_CMTIME1, "CMTIME1" }, \
> > - { __XFS_EXCHANGE_RANGE_UPD_CMTIME2, "CMTIME2" }
> > + { __XFS_EXCHANGE_RANGE_UPD_CMTIME2, "CMTIME2" }, \
> > + { __XFS_EXCHANGE_RANGE_CHECK_FRESH2, "FRESH2" }
> >
> > /* file exchange-range tracepoint class */
> > DECLARE_EVENT_CLASS(xfs_exchrange_class,
> > @@ -4986,6 +4987,60 @@ DEFINE_EXCHRANGE_EVENT(xfs_exchrange_prep);
> > DEFINE_EXCHRANGE_EVENT(xfs_exchrange_flush);
> > DEFINE_EXCHRANGE_EVENT(xfs_exchrange_mappings);
> >
> > +TRACE_EVENT(xfs_exchrange_freshness,
> > + TP_PROTO(const struct xfs_exchrange *fxr, struct xfs_inode *ip2),
> > + TP_ARGS(fxr, ip2),
> > + TP_STRUCT__entry(
> > + __field(dev_t, dev)
> > + __field(xfs_ino_t, ip2_ino)
> > + __field(long long, ip2_mtime)
> > + __field(long long, ip2_ctime)
> > + __field(int, ip2_mtime_nsec)
> > + __field(int, ip2_ctime_nsec)
> > +
> > + __field(xfs_ino_t, file2_ino)
> > + __field(long long, file2_mtime)
> > + __field(long long, file2_ctime)
> > + __field(int, file2_mtime_nsec)
> > + __field(int, file2_ctime_nsec)
> > + ),
> > + TP_fast_assign(
> > + struct timespec64 ts64;
> > + struct inode *inode2 = VFS_I(ip2);
> > +
> > + __entry->dev = inode2->i_sb->s_dev;
> > + __entry->ip2_ino = ip2->i_ino;
> > +
> > + ts64 = inode_get_ctime(inode2);
> > + __entry->ip2_ctime = ts64.tv_sec;
> > + __entry->ip2_ctime_nsec = ts64.tv_nsec;
> > +
> > + ts64 = inode_get_mtime(inode2);
> > + __entry->ip2_mtime = ts64.tv_sec;
> > + __entry->ip2_mtime_nsec = ts64.tv_nsec;
> > +
> > + __entry->file2_ino = fxr->file2_ino;
> > + __entry->file2_mtime = fxr->file2_mtime.tv_sec;
> > + __entry->file2_ctime = fxr->file2_ctime.tv_sec;
> > + __entry->file2_mtime_nsec = fxr->file2_mtime.tv_nsec;
> > + __entry->file2_ctime_nsec = fxr->file2_ctime.tv_nsec;
> > + ),
> > + TP_printk("dev %d:%d "
> > + "ino 0x%llx mtime %lld:%d ctime %lld:%d -> "
> > + "file 0x%llx mtime %lld:%d ctime %lld:%d",
> > + MAJOR(__entry->dev), MINOR(__entry->dev),
> > + __entry->ip2_ino,
> > + __entry->ip2_mtime,
> > + __entry->ip2_mtime_nsec,
> > + __entry->ip2_ctime,
> > + __entry->ip2_ctime_nsec,
> > + __entry->file2_ino,
> > + __entry->file2_mtime,
> > + __entry->file2_mtime_nsec,
> > + __entry->file2_ctime,
> > + __entry->file2_ctime_nsec)
> > +);
> > +
> > TRACE_EVENT(xfs_exchmaps_overhead,
> > TP_PROTO(struct xfs_mount *mp, unsigned long long bmbt_blocks,
> > unsigned long long rmapbt_blocks),
> >
>
Thread overview: 53+ messages
2024-09-02 18:16 [PATCHBOMB 6.12] xfs: a ton of bugfixes and cleanups Darrick J. Wong
2024-09-02 18:21 ` [PATCHSET v31.1 1/8] xfs: atomic file content commits Darrick J. Wong
2024-09-02 18:23 ` [PATCH 1/1] xfs: introduce new file range commit ioctls Darrick J. Wong
2024-09-03 7:52 ` Christian Brauner
2024-10-25 21:23 ` Darrick J. Wong
2024-09-02 18:21 ` [PATCHSET v4.2 2/8] xfs: cleanups before adding metadata directories Darrick J. Wong
2024-09-02 18:23 ` [PATCH 1/3] xfs: validate inumber in xfs_iget Darrick J. Wong
2024-09-02 18:23 ` [PATCH 2/3] xfs: match on the global RT inode numbers in xfs_is_metadata_inode Darrick J. Wong
2024-09-02 18:23 ` [PATCH 3/3] xfs: pass the icreate args object to xfs_dialloc Darrick J. Wong
2024-09-02 18:21 ` [PATCHSET v4.2 3/8] xfs: clean up the rtbitmap code Darrick J. Wong
2024-09-02 18:24 ` [PATCH 01/12] xfs: remove xfs_validate_rtextents Darrick J. Wong
2024-09-02 18:24 ` [PATCH 02/12] xfs: factor out a xfs_validate_rt_geometry helper Darrick J. Wong
2024-09-02 18:24 ` [PATCH 03/12] xfs: make the RT rsum_cache mandatory Darrick J. Wong
2024-09-02 18:24 ` [PATCH 04/12] xfs: remove the limit argument to xfs_rtfind_back Darrick J. Wong
2024-09-02 18:25 ` [PATCH 05/12] xfs: assert a valid limit in xfs_rtfind_forw Darrick J. Wong
2024-09-02 18:25 ` [PATCH 06/12] xfs: add bounds checking to xfs_rt{bitmap,summary}_read_buf Darrick J. Wong
2024-09-02 18:25 ` [PATCH 07/12] xfs: cleanup the calling convention for xfs_rtpick_extent Darrick J. Wong
2024-09-02 18:25 ` [PATCH 08/12] xfs: push the calls to xfs_rtallocate_range out to xfs_bmap_rtalloc Darrick J. Wong
2024-09-02 18:26 ` [PATCH 09/12] xfs: factor out a xfs_growfs_rt_bmblock helper Darrick J. Wong
2024-09-02 18:26 ` [PATCH 10/12] xfs: factor out a xfs_last_rt_bmblock helper Darrick J. Wong
2024-09-02 18:26 ` [PATCH 11/12] xfs: factor out rtbitmap/summary initialization helpers Darrick J. Wong
2024-09-02 18:27 ` [PATCH 12/12] xfs: push transaction join out of xfs_rtbitmap_lock and xfs_rtgroup_lock Darrick J. Wong
2024-09-02 18:21 ` [PATCHSET v4.2 4/8] xfs: fixes for the realtime allocator Darrick J. Wong
2024-09-02 18:27 ` [PATCH 01/10] xfs: use the recalculated transaction reservation in xfs_growfs_rt_bmblock Darrick J. Wong
2024-09-02 18:27 ` [PATCH 02/10] xfs: ensure rtx mask/shift are correct after growfs Darrick J. Wong
2024-09-02 18:27 ` [PATCH 03/10] xfs: don't return too-short extents from xfs_rtallocate_extent_block Darrick J. Wong
2024-09-02 18:28 ` [PATCH 04/10] xfs: don't scan off the end of the rt volume in xfs_rtallocate_extent_block Darrick J. Wong
2024-09-02 18:28 ` [PATCH 05/10] xfs: refactor aligning bestlen to prod Darrick J. Wong
2024-09-02 18:28 ` [PATCH 06/10] xfs: clean up xfs_rtallocate_extent_exact a bit Darrick J. Wong
2024-09-02 18:28 ` [PATCH 07/10] xfs: reduce excessive clamping of maxlen in xfs_rtallocate_extent_near Darrick J. Wong
2024-09-02 18:29 ` [PATCH 08/10] xfs: fix broken variable-sized allocation detection in xfs_rtallocate_extent_block Darrick J. Wong
2024-09-02 18:29 ` [PATCH 09/10] xfs: remove xfs_rtb_to_rtxrem Darrick J. Wong
2024-09-02 18:29 ` [PATCH 10/10] xfs: simplify xfs_rtalloc_query_range Darrick J. Wong
2024-09-02 18:22 ` [PATCHSET v4.2 5/8] xfs: cleanups for the realtime allocator Darrick J. Wong
2024-09-02 18:29 ` [PATCH 01/10] xfs: clean up the ISVALID macro in xfs_bmap_adjacent Darrick J. Wong
2024-09-02 18:30 ` [PATCH 02/10] xfs: factor out a xfs_rtallocate helper Darrick J. Wong
2024-09-02 18:30 ` [PATCH 03/10] xfs: rework the rtalloc fallback handling Darrick J. Wong
2024-09-02 18:30 ` [PATCH 04/10] xfs: factor out a xfs_rtallocate_align helper Darrick J. Wong
2024-09-02 18:30 ` [PATCH 05/10] xfs: make the rtalloc start hint a xfs_rtblock_t Darrick J. Wong
2024-09-02 18:31 ` [PATCH 06/10] xfs: add xchk_setup_nothing and xchk_nothing helpers Darrick J. Wong
2024-09-02 18:31 ` [PATCH 07/10] xfs: remove xfs_{rtbitmap,rtsummary}_wordcount Darrick J. Wong
2024-09-02 18:31 ` [PATCH 08/10] xfs: replace m_rsumsize with m_rsumblocks Darrick J. Wong
2024-09-02 18:31 ` [PATCH 09/10] xfs: rearrange xfs_fsmap.c a little bit Darrick J. Wong
2024-09-02 18:32 ` [PATCH 10/10] xfs: move xfs_ioc_getfsmap out of xfs_ioctl.c Darrick J. Wong
2024-09-02 18:22 ` [PATCHSET v4.2 6/8] xfs: cleanups for quota mount Darrick J. Wong
2024-09-02 18:32 ` [PATCH 1/1] xfs: refactor loading quota inodes in the regular case Darrick J. Wong
2024-09-02 18:22 ` [PATCHSET 7/8] xfs: various bug fixes for 6.12 Darrick J. Wong
2024-09-02 18:32 ` [PATCH 1/3] xfs: fix C++ compilation errors in xfs_fs.h Darrick J. Wong
2024-09-02 18:33 ` [PATCH 2/3] xfs: fix FITRIM reporting again Darrick J. Wong
2024-09-02 18:33 ` [PATCH 3/3] xfs: fix a sloppy memory handling bug in xfs_iroot_realloc Darrick J. Wong
2024-09-02 18:22 ` [PATCHSET v4.2 8/8] xfs: cleanups for inode rooted btree code Darrick J. Wong
2024-09-02 18:33 ` [PATCH 1/2] xfs: replace shouty XFS_BM{BT,DR} macros Darrick J. Wong
2024-09-02 18:33 ` [PATCH 2/2] xfs: standardize the btree maxrecs function parameters Darrick J. Wong