* [PATCH 0/3] btrfs: make extent buffer memory continuous
@ 2023-08-24 6:33 Qu Wenruo
2023-08-24 6:33 ` [PATCH 1/3] btrfs: warn on tree blocks which are not nodesize aligned Qu Wenruo
` (3 more replies)
0 siblings, 4 replies; 9+ messages in thread
From: Qu Wenruo @ 2023-08-24 6:33 UTC (permalink / raw)
To: linux-btrfs
[CHANGELOG]
RFC->v1:
- Rebased to the latest misc-next branch
  Just a small conflict in extent_buffer_memmove().
- Further cleaned up the extent buffer bitmap operations
[REPO]
https://github.com/adam900710/linux/tree/eb_page_cleanups
This includes the previously submitted extent buffer accessor cleanups
as a dependency.
[BACKGROUND]
We have a lot of extent buffer code dealing with cross-page accesses.
Other filesystems, such as XFS, instead map their xfs_buf into the
kernel virtual address space, so they can access the contents of an
xfs_buf without worrying about page boundaries.
[OBJECTIVE]
This patchset mostly borrows from the xfs_buf approach to greatly
simplify the extent buffer accessors.
Now all the extent buffer accessors are thin wrappers around
memcpy()/memcmp()/memmove().
The series passes the btrfs test group without any new regressions.
Qu Wenruo (3):
btrfs: warn on tree blocks which are not nodesize aligned
btrfs: map uncontinuous extent buffer pages into virtual address space
btrfs: utilize the physically/virtually continuous extent buffer
memory
fs/btrfs/disk-io.c | 18 +--
fs/btrfs/extent_io.c | 360 +++++++++++++------------------------------
fs/btrfs/extent_io.h | 17 ++
fs/btrfs/fs.h | 7 +
4 files changed, 139 insertions(+), 263 deletions(-)
--
2.41.0
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 1/3] btrfs: warn on tree blocks which are not nodesize aligned
2023-08-24 6:33 [PATCH 0/3] btrfs: make extent buffer memory continuous Qu Wenruo
@ 2023-08-24 6:33 ` Qu Wenruo
2023-09-06 9:34 ` Anand Jain
2023-08-24 6:33 ` [PATCH 2/3] btrfs: map uncontinuous extent buffer pages into virtual address space Qu Wenruo
` (2 subsequent siblings)
3 siblings, 1 reply; 9+ messages in thread
From: Qu Wenruo @ 2023-08-24 6:33 UTC (permalink / raw)
To: linux-btrfs
A long time ago, btrfs could create metadata chunks which started at a
sector boundary but were not aligned to a nodesize boundary.
This led to some older filesystems having tree blocks that are only
aligned to sectorsize, but not to nodesize.
Later, btrfs check gained the ability to detect and warn about such tree
blocks, and the kernel fixed its chunk allocation behavior, so nowadays
those tree blocks should be pretty rare.
However, if we want to migrate metadata to folios in the future, we
cannot have such tree blocks, as filemap_add_folio() requires the page
index to be aligned to the folio's number of pages.
(In other words, such unaligned tree blocks can trigger a VM_BUG_ON().)
So this patch adds an extra warning for those unaligned tree blocks, as
preparation for the future folio migration.
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/extent_io.c | 8 ++++++++
fs/btrfs/fs.h | 7 +++++++
2 files changed, 15 insertions(+)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index ac3fca5a5e41..f13211975e0b 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -3462,6 +3462,14 @@ static int check_eb_alignment(struct btrfs_fs_info *fs_info, u64 start)
start, fs_info->nodesize);
return -EINVAL;
}
+ if (!IS_ALIGNED(start, fs_info->nodesize) &&
+ !test_and_set_bit(BTRFS_FS_UNALIGNED_TREE_BLOCK,
+ &fs_info->flags)) {
+ btrfs_warn(fs_info,
+ "tree block not nodesize aligned, start %llu nodesize %u",
+ start, fs_info->nodesize);
+ btrfs_warn(fs_info, "this can be solved by a full metadata balance");
+ }
return 0;
}
diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h
index a523d64d5491..4dc16d74437c 100644
--- a/fs/btrfs/fs.h
+++ b/fs/btrfs/fs.h
@@ -139,6 +139,13 @@ enum {
*/
BTRFS_FS_FEATURE_CHANGED,
+	/*
+	 * Indicate if we have tree blocks which are only aligned to
+	 * sectorsize, but not to nodesize.
+	 * This should be rare nowadays.
+	 */
+	BTRFS_FS_UNALIGNED_TREE_BLOCK,
+
#if BITS_PER_LONG == 32
/* Indicate if we have error/warn message printed on 32bit systems */
BTRFS_FS_32BIT_ERROR,
--
2.41.0
* [PATCH 2/3] btrfs: map uncontinuous extent buffer pages into virtual address space
2023-08-24 6:33 [PATCH 0/3] btrfs: make extent buffer memory continuous Qu Wenruo
2023-08-24 6:33 ` [PATCH 1/3] btrfs: warn on tree blocks which are not nodesize aligned Qu Wenruo
@ 2023-08-24 6:33 ` Qu Wenruo
2023-08-28 10:36 ` Johannes Thumshirn
2023-08-24 6:33 ` [PATCH 3/3] btrfs: utilize the physically/virtually continuous extent buffer memory Qu Wenruo
2023-09-06 17:49 ` [PATCH 0/3] btrfs: make extent buffer memory continuous David Sterba
3 siblings, 1 reply; 9+ messages in thread
From: Qu Wenruo @ 2023-08-24 6:33 UTC (permalink / raw)
To: linux-btrfs
Currently btrfs implements its extent buffer reads and writes using
various helpers that do cross-page handling for the pages array.
However, other filesystems like XFS map the pages into the kernel
virtual address space, which greatly simplifies the access.
This patch follows XFS and maps the pages into the virtual address
space, if and only if the pages are not physically contiguous.
(Note that a single page counts as physically contiguous.)
For now we only do the mapping, without actually using the mapped
address yet.
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/extent_io.c | 70 ++++++++++++++++++++++++++++++++++++++++++++
fs/btrfs/extent_io.h | 7 +++++
2 files changed, 77 insertions(+)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index f13211975e0b..9f9a3ab82f04 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -14,6 +14,7 @@
#include <linux/pagevec.h>
#include <linux/prefetch.h>
#include <linux/fsverity.h>
+#include <linux/vmalloc.h>
#include "misc.h"
#include "extent_io.h"
#include "extent-io-tree.h"
@@ -3153,6 +3154,8 @@ static void btrfs_release_extent_buffer_pages(struct extent_buffer *eb)
ASSERT(!extent_buffer_under_io(eb));
num_pages = num_extent_pages(eb);
+ if (eb->vaddr)
+ vm_unmap_ram(eb->vaddr, num_pages);
for (i = 0; i < num_pages; i++) {
struct page *page = eb->pages[i];
@@ -3202,6 +3205,7 @@ struct extent_buffer *btrfs_clone_extent_buffer(const struct extent_buffer *src)
{
int i;
struct extent_buffer *new;
+ bool pages_contig = true;
int num_pages = num_extent_pages(src);
int ret;
@@ -3226,6 +3230,9 @@ struct extent_buffer *btrfs_clone_extent_buffer(const struct extent_buffer *src)
int ret;
struct page *p = new->pages[i];
+ if (i && p != new->pages[i - 1] + 1)
+ pages_contig = false;
+
ret = attach_extent_buffer_page(new, p, NULL);
if (ret < 0) {
btrfs_release_extent_buffer(new);
@@ -3233,6 +3240,23 @@ struct extent_buffer *btrfs_clone_extent_buffer(const struct extent_buffer *src)
}
WARN_ON(PageDirty(p));
}
+ if (!pages_contig) {
+ unsigned int nofs_flag;
+ int retried = 0;
+
+ nofs_flag = memalloc_nofs_save();
+ do {
+ new->vaddr = vm_map_ram(new->pages, num_pages, -1);
+ if (new->vaddr)
+ break;
+ vm_unmap_aliases();
+ } while ((retried++) <= 1);
+ memalloc_nofs_restore(nofs_flag);
+ if (!new->vaddr) {
+ btrfs_release_extent_buffer(new);
+ return NULL;
+ }
+ }
copy_extent_buffer_full(new, src);
set_extent_buffer_uptodate(new);
@@ -3243,6 +3267,7 @@ struct extent_buffer *__alloc_dummy_extent_buffer(struct btrfs_fs_info *fs_info,
u64 start, unsigned long len)
{
struct extent_buffer *eb;
+ bool pages_contig = true;
int num_pages;
int i;
int ret;
@@ -3259,11 +3284,29 @@ struct extent_buffer *__alloc_dummy_extent_buffer(struct btrfs_fs_info *fs_info,
for (i = 0; i < num_pages; i++) {
struct page *p = eb->pages[i];
+ if (i && p != eb->pages[i - 1] + 1)
+ pages_contig = false;
+
ret = attach_extent_buffer_page(eb, p, NULL);
if (ret < 0)
goto err;
}
+ if (!pages_contig) {
+ unsigned int nofs_flag;
+ int retried = 0;
+
+ nofs_flag = memalloc_nofs_save();
+ do {
+ eb->vaddr = vm_map_ram(eb->pages, num_pages, -1);
+ if (eb->vaddr)
+ break;
+ vm_unmap_aliases();
+ } while ((retried++) <= 1);
+ memalloc_nofs_restore(nofs_flag);
+ if (!eb->vaddr)
+ goto err;
+ }
set_extent_buffer_uptodate(eb);
btrfs_set_header_nritems(eb, 0);
set_bit(EXTENT_BUFFER_UNMAPPED, &eb->bflags);
@@ -3486,6 +3529,7 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
struct address_space *mapping = fs_info->btree_inode->i_mapping;
struct btrfs_subpage *prealloc = NULL;
u64 lockdep_owner = owner_root;
+ bool pages_contig = true;
int uptodate = 1;
int ret;
@@ -3558,6 +3602,10 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
/* Should not fail, as we have preallocated the memory */
ret = attach_extent_buffer_page(eb, p, prealloc);
ASSERT(!ret);
+
+ if (i && p != eb->pages[i - 1] + 1)
+ pages_contig = false;
+
/*
* To inform we have extra eb under allocation, so that
* detach_extent_buffer_page() won't release the page private
@@ -3583,6 +3631,28 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
* we could crash.
*/
}
+
+ /*
+ * If pages are not continuous, here we map it into a continuous virtual
+ * range to make later access easier.
+ */
+ if (!pages_contig) {
+ unsigned int nofs_flag;
+ int retried = 0;
+
+ nofs_flag = memalloc_nofs_save();
+ do {
+ eb->vaddr = vm_map_ram(eb->pages, num_pages, -1);
+ if (eb->vaddr)
+ break;
+ vm_unmap_aliases();
+ } while ((retried++) <= 1);
+ memalloc_nofs_restore(nofs_flag);
+ if (!eb->vaddr) {
+ exists = ERR_PTR(-ENOMEM);
+ goto free_eb;
+ }
+ }
if (uptodate)
set_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags);
again:
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 68368ba99321..930a2dc38157 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -87,6 +87,13 @@ struct extent_buffer {
struct rw_semaphore lock;
+ /*
+ * For virtually mapped address.
+ *
+ * NULL if the pages are physically continuous.
+ */
+ void *vaddr;
+
struct page *pages[INLINE_EXTENT_BUFFER_PAGES];
#ifdef CONFIG_BTRFS_DEBUG
struct list_head leak_list;
--
2.41.0
* [PATCH 3/3] btrfs: utilize the physically/virtually continuous extent buffer memory
2023-08-24 6:33 [PATCH 0/3] btrfs: make extent buffer memory continuous Qu Wenruo
2023-08-24 6:33 ` [PATCH 1/3] btrfs: warn on tree blocks which are not nodesize aligned Qu Wenruo
2023-08-24 6:33 ` [PATCH 2/3] btrfs: map uncontinuous extent buffer pages into virtual address space Qu Wenruo
@ 2023-08-24 6:33 ` Qu Wenruo
2023-09-06 2:45 ` kernel test robot
2023-09-06 17:49 ` [PATCH 0/3] btrfs: make extent buffer memory continuous David Sterba
3 siblings, 1 reply; 9+ messages in thread
From: Qu Wenruo @ 2023-08-24 6:33 UTC (permalink / raw)
To: linux-btrfs
Since the extent buffer memory is now either physically or virtually
contiguous, let's benefit from the new feature.
This involves the following changes:
- Extent buffer accessors
  Now the read/write/memcpy/memmove_extent_buffer() functions are just
  wrappers around memcpy()/memmove().
  Cross-page handling is done by the hardware MMU.
- Extent buffer bitmap accessors
  These are likewise reimplemented on top of the single contiguous
  address.
- csum_tree_block()
  We can call crypto_shash_digest() directly, as we no longer need to
  handle page boundaries.
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/disk-io.c | 18 +--
fs/btrfs/extent_io.c | 282 +++++--------------------------------------
fs/btrfs/extent_io.h | 10 ++
3 files changed, 47 insertions(+), 263 deletions(-)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 0a96ea8c1d3a..03a423f687b8 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -75,24 +75,14 @@ static void btrfs_free_csum_hash(struct btrfs_fs_info *fs_info)
static void csum_tree_block(struct extent_buffer *buf, u8 *result)
{
struct btrfs_fs_info *fs_info = buf->fs_info;
- const int num_pages = num_extent_pages(buf);
- const int first_page_part = min_t(u32, PAGE_SIZE, fs_info->nodesize);
SHASH_DESC_ON_STACK(shash, fs_info->csum_shash);
- char *kaddr;
- int i;
+ void *eb_addr = btrfs_get_eb_addr(buf);
+ memset(result, 0, BTRFS_CSUM_SIZE);
shash->tfm = fs_info->csum_shash;
crypto_shash_init(shash);
- kaddr = page_address(buf->pages[0]) + offset_in_page(buf->start);
- crypto_shash_update(shash, kaddr + BTRFS_CSUM_SIZE,
- first_page_part - BTRFS_CSUM_SIZE);
-
- for (i = 1; i < num_pages && INLINE_EXTENT_BUFFER_PAGES > 1; i++) {
- kaddr = page_address(buf->pages[i]);
- crypto_shash_update(shash, kaddr, PAGE_SIZE);
- }
- memset(result, 0, BTRFS_CSUM_SIZE);
- crypto_shash_final(shash, result);
+ crypto_shash_digest(shash, eb_addr + BTRFS_CSUM_SIZE,
+ buf->len - BTRFS_CSUM_SIZE, result);
}
/*
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 9f9a3ab82f04..70e22b9ccd28 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4073,100 +4073,39 @@ static inline int check_eb_range(const struct extent_buffer *eb,
void read_extent_buffer(const struct extent_buffer *eb, void *dstv,
unsigned long start, unsigned long len)
{
- size_t cur;
- size_t offset;
- struct page *page;
- char *kaddr;
- char *dst = (char *)dstv;
- unsigned long i = get_eb_page_index(start);
+ void *eb_addr = btrfs_get_eb_addr(eb);
if (check_eb_range(eb, start, len))
return;
- offset = get_eb_offset_in_page(eb, start);
-
- while (len > 0) {
- page = eb->pages[i];
-
- cur = min(len, (PAGE_SIZE - offset));
- kaddr = page_address(page);
- memcpy(dst, kaddr + offset, cur);
-
- dst += cur;
- len -= cur;
- offset = 0;
- i++;
- }
+ memcpy(dstv, eb_addr + start, len);
}
int read_extent_buffer_to_user_nofault(const struct extent_buffer *eb,
void __user *dstv,
unsigned long start, unsigned long len)
{
- size_t cur;
- size_t offset;
- struct page *page;
- char *kaddr;
- char __user *dst = (char __user *)dstv;
- unsigned long i = get_eb_page_index(start);
- int ret = 0;
+ void *eb_addr = btrfs_get_eb_addr(eb);
+ int ret;
WARN_ON(start > eb->len);
WARN_ON(start + len > eb->start + eb->len);
- offset = get_eb_offset_in_page(eb, start);
-
- while (len > 0) {
- page = eb->pages[i];
-
- cur = min(len, (PAGE_SIZE - offset));
- kaddr = page_address(page);
- if (copy_to_user_nofault(dst, kaddr + offset, cur)) {
- ret = -EFAULT;
- break;
- }
-
- dst += cur;
- len -= cur;
- offset = 0;
- i++;
- }
-
- return ret;
+ ret = copy_to_user_nofault(dstv, eb_addr + start, len);
+ if (ret)
+ return -EFAULT;
+ return 0;
}
int memcmp_extent_buffer(const struct extent_buffer *eb, const void *ptrv,
unsigned long start, unsigned long len)
{
- size_t cur;
- size_t offset;
- struct page *page;
- char *kaddr;
- char *ptr = (char *)ptrv;
- unsigned long i = get_eb_page_index(start);
- int ret = 0;
+ void *eb_addr = btrfs_get_eb_addr(eb);
if (check_eb_range(eb, start, len))
return -EINVAL;
- offset = get_eb_offset_in_page(eb, start);
-
- while (len > 0) {
- page = eb->pages[i];
-
- cur = min(len, (PAGE_SIZE - offset));
-
- kaddr = page_address(page);
- ret = memcmp(ptr, kaddr + offset, cur);
- if (ret)
- break;
-
- ptr += cur;
- len -= cur;
- offset = 0;
- i++;
- }
- return ret;
+ return memcmp(ptrv, eb_addr + start, len);
}
/*
@@ -4200,67 +4139,20 @@ static void assert_eb_page_uptodate(const struct extent_buffer *eb,
}
}
-static void __write_extent_buffer(const struct extent_buffer *eb,
- const void *srcv, unsigned long start,
- unsigned long len, bool use_memmove)
-{
- size_t cur;
- size_t offset;
- struct page *page;
- char *kaddr;
- char *src = (char *)srcv;
- unsigned long i = get_eb_page_index(start);
- /* For unmapped (dummy) ebs, no need to check their uptodate status. */
- const bool check_uptodate = !test_bit(EXTENT_BUFFER_UNMAPPED, &eb->bflags);
-
- WARN_ON(test_bit(EXTENT_BUFFER_NO_CHECK, &eb->bflags));
-
- if (check_eb_range(eb, start, len))
- return;
-
- offset = get_eb_offset_in_page(eb, start);
-
- while (len > 0) {
- page = eb->pages[i];
- if (check_uptodate)
- assert_eb_page_uptodate(eb, page);
-
- cur = min(len, PAGE_SIZE - offset);
- kaddr = page_address(page);
- if (use_memmove)
- memmove(kaddr + offset, src, cur);
- else
- memcpy(kaddr + offset, src, cur);
-
- src += cur;
- len -= cur;
- offset = 0;
- i++;
- }
-}
-
void write_extent_buffer(const struct extent_buffer *eb, const void *srcv,
unsigned long start, unsigned long len)
{
- return __write_extent_buffer(eb, srcv, start, len, false);
+ void *eb_addr = btrfs_get_eb_addr(eb);
+
+ memcpy(eb_addr + start, srcv, len);
}
static void memset_extent_buffer(const struct extent_buffer *eb, int c,
unsigned long start, unsigned long len)
{
- unsigned long cur = start;
+ void *eb_addr = btrfs_get_eb_addr(eb);
- while (cur < start + len) {
- unsigned long index = get_eb_page_index(cur);
- unsigned int offset = get_eb_offset_in_page(eb, cur);
- unsigned int cur_len = min(start + len - cur, PAGE_SIZE - offset);
- struct page *page = eb->pages[index];
-
- assert_eb_page_uptodate(eb, page);
- memset(page_address(page) + offset, c, cur_len);
-
- cur += cur_len;
- }
+ memset(eb_addr + start, c, len);
}
void memzero_extent_buffer(const struct extent_buffer *eb, unsigned long start,
@@ -4274,20 +4166,12 @@ void memzero_extent_buffer(const struct extent_buffer *eb, unsigned long start,
void copy_extent_buffer_full(const struct extent_buffer *dst,
const struct extent_buffer *src)
{
- unsigned long cur = 0;
+ void *dst_addr = btrfs_get_eb_addr(dst);
+ void *src_addr = btrfs_get_eb_addr(src);
ASSERT(dst->len == src->len);
- while (cur < src->len) {
- unsigned long index = get_eb_page_index(cur);
- unsigned long offset = get_eb_offset_in_page(src, cur);
- unsigned long cur_len = min(src->len, PAGE_SIZE - offset);
- void *addr = page_address(src->pages[index]) + offset;
-
- write_extent_buffer(dst, addr, cur, cur_len);
-
- cur += cur_len;
- }
+ memcpy(dst_addr, src_addr, dst->len);
}
void copy_extent_buffer(const struct extent_buffer *dst,
@@ -4296,11 +4180,8 @@ void copy_extent_buffer(const struct extent_buffer *dst,
unsigned long len)
{
u64 dst_len = dst->len;
- size_t cur;
- size_t offset;
- struct page *page;
- char *kaddr;
- unsigned long i = get_eb_page_index(dst_offset);
+ void *dst_addr = btrfs_get_eb_addr(dst);
+ void *src_addr = btrfs_get_eb_addr(src);
if (check_eb_range(dst, dst_offset, len) ||
check_eb_range(src, src_offset, len))
@@ -4308,54 +4189,7 @@ void copy_extent_buffer(const struct extent_buffer *dst,
WARN_ON(src->len != dst_len);
- offset = get_eb_offset_in_page(dst, dst_offset);
-
- while (len > 0) {
- page = dst->pages[i];
- assert_eb_page_uptodate(dst, page);
-
- cur = min(len, (unsigned long)(PAGE_SIZE - offset));
-
- kaddr = page_address(page);
- read_extent_buffer(src, kaddr + offset, src_offset, cur);
-
- src_offset += cur;
- len -= cur;
- offset = 0;
- i++;
- }
-}
-
-/*
- * eb_bitmap_offset() - calculate the page and offset of the byte containing the
- * given bit number
- * @eb: the extent buffer
- * @start: offset of the bitmap item in the extent buffer
- * @nr: bit number
- * @page_index: return index of the page in the extent buffer that contains the
- * given bit number
- * @page_offset: return offset into the page given by page_index
- *
- * This helper hides the ugliness of finding the byte in an extent buffer which
- * contains a given bit.
- */
-static inline void eb_bitmap_offset(const struct extent_buffer *eb,
- unsigned long start, unsigned long nr,
- unsigned long *page_index,
- size_t *page_offset)
-{
- size_t byte_offset = BIT_BYTE(nr);
- size_t offset;
-
- /*
- * The byte we want is the offset of the extent buffer + the offset of
- * the bitmap item in the extent buffer + the offset of the byte in the
- * bitmap item.
- */
- offset = start + offset_in_page(eb->start) + byte_offset;
-
- *page_index = offset >> PAGE_SHIFT;
- *page_offset = offset_in_page(offset);
+ memcpy(dst_addr + dst_offset, src_addr + src_offset, len);
}
/*
@@ -4368,25 +4202,18 @@ static inline void eb_bitmap_offset(const struct extent_buffer *eb,
int extent_buffer_test_bit(const struct extent_buffer *eb, unsigned long start,
unsigned long nr)
{
- u8 *kaddr;
- struct page *page;
- unsigned long i;
- size_t offset;
+ const u8 *kaddr = btrfs_get_eb_addr(eb);
+ const unsigned long first_byte = start + BIT_BYTE(nr);
- eb_bitmap_offset(eb, start, nr, &i, &offset);
- page = eb->pages[i];
- assert_eb_page_uptodate(eb, page);
- kaddr = page_address(page);
- return 1U & (kaddr[offset] >> (nr & (BITS_PER_BYTE - 1)));
+ assert_eb_page_uptodate(eb, eb->pages[first_byte >> PAGE_SHIFT]);
+ return 1U & (kaddr[first_byte] >> (nr & (BITS_PER_BYTE - 1)));
}
static u8 *extent_buffer_get_byte(const struct extent_buffer *eb, unsigned long bytenr)
{
- unsigned long index = get_eb_page_index(bytenr);
-
if (check_eb_range(eb, bytenr, 1))
return NULL;
- return page_address(eb->pages[index]) + get_eb_offset_in_page(eb, bytenr);
+ return btrfs_get_eb_addr(eb) + bytenr;
}
/*
@@ -4471,72 +4298,29 @@ void memcpy_extent_buffer(const struct extent_buffer *dst,
unsigned long dst_offset, unsigned long src_offset,
unsigned long len)
{
- unsigned long cur_off = 0;
+ void *eb_addr = btrfs_get_eb_addr(dst);
if (check_eb_range(dst, dst_offset, len) ||
check_eb_range(dst, src_offset, len))
return;
- while (cur_off < len) {
- unsigned long cur_src = cur_off + src_offset;
- unsigned long pg_index = get_eb_page_index(cur_src);
- unsigned long pg_off = get_eb_offset_in_page(dst, cur_src);
- unsigned long cur_len = min(src_offset + len - cur_src,
- PAGE_SIZE - pg_off);
- void *src_addr = page_address(dst->pages[pg_index]) + pg_off;
- const bool use_memmove = areas_overlap(src_offset + cur_off,
- dst_offset + cur_off, cur_len);
-
- __write_extent_buffer(dst, src_addr, dst_offset + cur_off, cur_len,
- use_memmove);
- cur_off += cur_len;
- }
+ if (areas_overlap(dst_offset, src_offset, len))
+ memmove(eb_addr + dst_offset, eb_addr + src_offset, len);
+ else
+ memcpy(eb_addr + dst_offset, eb_addr + src_offset, len);
}
void memmove_extent_buffer(const struct extent_buffer *dst,
unsigned long dst_offset, unsigned long src_offset,
unsigned long len)
{
- unsigned long dst_end = dst_offset + len - 1;
- unsigned long src_end = src_offset + len - 1;
+ void *eb_addr = btrfs_get_eb_addr(dst);
if (check_eb_range(dst, dst_offset, len) ||
check_eb_range(dst, src_offset, len))
return;
- if (dst_offset < src_offset) {
- memcpy_extent_buffer(dst, dst_offset, src_offset, len);
- return;
- }
-
- while (len > 0) {
- unsigned long src_i;
- size_t cur;
- size_t dst_off_in_page;
- size_t src_off_in_page;
- void *src_addr;
- bool use_memmove;
-
- src_i = get_eb_page_index(src_end);
-
- dst_off_in_page = get_eb_offset_in_page(dst, dst_end);
- src_off_in_page = get_eb_offset_in_page(dst, src_end);
-
- cur = min_t(unsigned long, len, src_off_in_page + 1);
- cur = min(cur, dst_off_in_page + 1);
-
- src_addr = page_address(dst->pages[src_i]) + src_off_in_page -
- cur + 1;
- use_memmove = areas_overlap(src_end - cur + 1, dst_end - cur + 1,
- cur);
-
- __write_extent_buffer(dst, src_addr, dst_end - cur + 1, cur,
- use_memmove);
-
- dst_end -= cur;
- src_end -= cur;
- len -= cur;
- }
+ memmove(eb_addr + dst_offset, eb_addr + src_offset, len);
}
#define GANG_LOOKUP_SIZE 16
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 930a2dc38157..bfa14457f461 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -140,6 +140,16 @@ static inline unsigned long get_eb_page_index(unsigned long offset)
return offset >> PAGE_SHIFT;
}
+static inline void *btrfs_get_eb_addr(const struct extent_buffer *eb)
+{
+ /* For fallback vmapped extent buffer. */
+ if (eb->vaddr)
+ return eb->vaddr;
+
+ /* For physically continuous pages and subpage cases. */
+ return page_address(eb->pages[0]) + offset_in_page(eb->start);
+}
+
/*
* Structure to record how many bytes and which ranges are set/cleared
*/
--
2.41.0
* Re: [PATCH 2/3] btrfs: map uncontinuous extent buffer pages into virtual address space
2023-08-24 6:33 ` [PATCH 2/3] btrfs: map uncontinuous extent buffer pages into virtual address space Qu Wenruo
@ 2023-08-28 10:36 ` Johannes Thumshirn
0 siblings, 0 replies; 9+ messages in thread
From: Johannes Thumshirn @ 2023-08-28 10:36 UTC (permalink / raw)
To: Qu Wenruo, linux-btrfs@vger.kernel.org
On 24.08.23 08:34, Qu Wenruo wrote:
> + do {
> + new->vaddr = vm_map_ram(new->pages, num_pages, -1);
Please use NUMA_NO_NODE instead of -1. This makes it easier to read.
* Re: [PATCH 3/3] btrfs: utilize the physically/virtually continuous extent buffer memory
2023-08-24 6:33 ` [PATCH 3/3] btrfs: utilize the physically/virtually continuous extent buffer memory Qu Wenruo
@ 2023-09-06 2:45 ` kernel test robot
0 siblings, 0 replies; 9+ messages in thread
From: kernel test robot @ 2023-09-06 2:45 UTC (permalink / raw)
To: Qu Wenruo
Cc: oe-lkp, lkp, linux-btrfs, ying.huang, feng.tang, fengwei.yin,
oliver.sang
Hello,
kernel test robot noticed a 12.0% improvement of filebench.sum_operations/s on:
commit: 2fa4ac9754a7fa77bad88aae11ac77ba137d3858 ("[PATCH 3/3] btrfs: utilize the physically/virtually continuous extent buffer memory")
url: https://github.com/intel-lab-lkp/linux/commits/Qu-Wenruo/btrfs-warn-on-tree-blocks-which-are-not-nodesize-aligned/20230824-143628
base: https://git.kernel.org/cgit/linux/kernel/git/kdave/linux.git for-next
patch link: https://lore.kernel.org/all/8bc15bfdaa2805d1d1b660b8b2e07a55aa02027d.1692858397.git.wqu@suse.com/
patch subject: [PATCH 3/3] btrfs: utilize the physically/virtually continuous extent buffer memory
testcase: filebench
test machine: 96 threads 2 sockets (Ice Lake) with 128G memory
parameters:
disk: 1HDD
fs: btrfs
fs2: cifs
test: webproxy.f
cpufreq_governor: performance
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20230906/202309061050.19c12499-oliver.sang@intel.com
=========================================================================================
compiler/cpufreq_governor/disk/fs2/fs/kconfig/rootfs/tbox_group/test/testcase:
gcc-12/performance/1HDD/cifs/btrfs/x86_64-rhel-8.3/debian-11.1-x86_64-20220510.cgz/lkp-icl-2sp1/webproxy.f/filebench
commit:
19e81514b8 ("btrfs: map uncontinuous extent buffer pages into virtual address space")
2fa4ac9754 ("btrfs: utilize the physically/virtually continuous extent buffer memory")
19e81514b8c09202 2fa4ac9754a7fa77bad88aae11a
---------------- ---------------------------
%stddev %change %stddev
\ | \
30592 ±194% -92.3% 2343 ± 24% sched_debug.cpu.avg_idle.min
1.38 -5.9% 1.30 iostat.cpu.iowait
4.63 +8.9% 5.04 iostat.cpu.system
2.56 +0.5 3.09 mpstat.cpu.all.sys%
0.54 +0.1 0.61 mpstat.cpu.all.usr%
1996 +3.3% 2062 vmstat.io.bo
33480 +13.5% 37993 vmstat.system.cs
152.67 +12.6% 171.83 turbostat.Avg_MHz
2562 +4.2% 2670 turbostat.Bzy_MHz
5.34 +0.5 5.83 turbostat.C1E%
7.12 ± 12% -21.6% 5.58 ± 12% turbostat.Pkg%pc2
209.72 +1.5% 212.81 turbostat.PkgWatt
4.92 ± 24% +3.5 8.37 ± 32% perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
5.13 ± 28% +3.6 8.68 ± 31% perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
5.13 ± 28% +3.8 8.90 ± 30% perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
5.13 ± 28% +3.8 8.90 ± 30% perf-profile.children.cycles-pp.cpuidle_enter
5.13 ± 28% +3.8 8.90 ± 30% perf-profile.children.cycles-pp.cpuidle_enter_state
5.34 ± 34% +3.9 9.21 ± 28% perf-profile.children.cycles-pp.cpuidle_idle_call
13.90 +9.6% 15.23 filebench.sum_bytes_mb/s
238030 +12.0% 266575 filebench.sum_operations
3966 +12.0% 4442 filebench.sum_operations/s
1043 +12.0% 1168 filebench.sum_reads/s
25.14 -10.7% 22.46 filebench.sum_time_ms/op
208.83 +11.9% 233.67 filebench.sum_writes/s
506705 +5.8% 536097 filebench.time.file_system_outputs
1597 ± 5% -36.1% 1020 ± 3% filebench.time.involuntary_context_switches
61810 ± 2% +6.0% 65519 filebench.time.minor_page_faults
157.67 ± 2% +31.5% 207.33 filebench.time.percent_of_cpu_this_job_got
117.60 ± 2% +27.1% 149.48 filebench.time.system_time
375177 +10.3% 413862 filebench.time.voluntary_context_switches
18717 +6.5% 19942 proc-vmstat.nr_active_anon
20206 +1.2% 20445 proc-vmstat.nr_active_file
298911 +2.2% 305406 proc-vmstat.nr_anon_pages
132893 +5.6% 140397 proc-vmstat.nr_dirtied
313040 +2.0% 319443 proc-vmstat.nr_inactive_anon
32910 +3.4% 34035 proc-vmstat.nr_shmem
62503 +1.4% 63367 proc-vmstat.nr_slab_unreclaimable
99471 +3.7% 103159 proc-vmstat.nr_written
18717 +6.5% 19942 proc-vmstat.nr_zone_active_anon
20206 +1.2% 20445 proc-vmstat.nr_zone_active_file
313040 +2.0% 319443 proc-vmstat.nr_zone_inactive_anon
943632 +3.2% 974142 proc-vmstat.numa_hit
841654 +3.6% 871757 proc-vmstat.numa_local
453634 ± 17% +27.0% 576268 ± 5% proc-vmstat.numa_pte_updates
87464 +6.1% 92814 proc-vmstat.pgactivate
1595438 +2.9% 1641074 proc-vmstat.pgalloc_normal
1453326 +3.0% 1497530 proc-vmstat.pgfree
17590 ± 5% +14.0% 20045 ± 7% proc-vmstat.pgreuse
732160 -1.8% 719104 proc-vmstat.unevictable_pgs_scanned
19.10 -8.1% 17.55 perf-stat.i.MPKI
2.039e+09 +17.3% 2.393e+09 perf-stat.i.branch-instructions
1.27 ± 2% -0.1 1.15 perf-stat.i.branch-miss-rate%
25600761 +5.8% 27075672 perf-stat.i.branch-misses
5037721 ± 4% +11.4% 5612619 perf-stat.i.cache-misses
1.632e+08 +5.9% 1.729e+08 perf-stat.i.cache-references
34079 +14.1% 38871 perf-stat.i.context-switches
1.326e+10 +14.7% 1.521e+10 perf-stat.i.cpu-cycles
551.02 ± 2% +21.0% 666.59 ± 3% perf-stat.i.cpu-migrations
3953434 ± 2% +10.8% 4381924 ± 3% perf-stat.i.dTLB-load-misses
2.343e+09 +15.4% 2.704e+09 perf-stat.i.dTLB-loads
1.141e+09 +14.3% 1.303e+09 perf-stat.i.dTLB-stores
9.047e+09 +14.9% 1.039e+10 perf-stat.i.instructions
0.69 +2.0% 0.71 perf-stat.i.ipc
0.14 +14.7% 0.16 perf-stat.i.metric.GHz
34.94 ± 4% +11.1% 38.80 perf-stat.i.metric.K/sec
59.21 +15.6% 68.43 perf-stat.i.metric.M/sec
3999 ± 3% +6.3% 4250 perf-stat.i.minor-faults
1116010 ± 4% +14.8% 1280875 ± 2% perf-stat.i.node-load-misses
1168171 ± 3% +7.9% 1259922 ± 2% perf-stat.i.node-stores
3999 ± 3% +6.3% 4250 perf-stat.i.page-faults
18.04 -7.8% 16.64 perf-stat.overall.MPKI
1.26 ± 2% -0.1 1.13 perf-stat.overall.branch-miss-rate%
2.012e+09 +17.3% 2.359e+09 perf-stat.ps.branch-instructions
25253051 +5.7% 26690222 perf-stat.ps.branch-misses
4970910 ± 4% +11.3% 5534021 perf-stat.ps.cache-misses
1.61e+08 +5.9% 1.705e+08 perf-stat.ps.cache-references
33628 +14.0% 38332 perf-stat.ps.context-switches
1.308e+10 +14.6% 1.5e+10 perf-stat.ps.cpu-cycles
543.73 ± 2% +20.9% 657.37 ± 3% perf-stat.ps.cpu-migrations
3900887 ± 2% +10.8% 4321011 ± 3% perf-stat.ps.dTLB-load-misses
2.312e+09 +15.3% 2.666e+09 perf-stat.ps.dTLB-loads
1.125e+09 +14.2% 1.285e+09 perf-stat.ps.dTLB-stores
8.925e+09 +14.8% 1.024e+10 perf-stat.ps.instructions
3943 ± 3% +6.2% 4187 perf-stat.ps.minor-faults
1101275 ± 4% +14.7% 1263151 ± 2% perf-stat.ps.node-load-misses
1152648 ± 3% +7.7% 1241973 ± 2% perf-stat.ps.node-stores
3943 ± 3% +6.2% 4187 perf-stat.ps.page-faults
6.777e+11 +10.5% 7.49e+11 perf-stat.total.instructions
0.01 ± 7% -28.2% 0.00 ± 26% perf-sched.sch_delay.avg.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.__btrfs_tree_read_lock
0.30 ± 35% -63.0% 0.11 ± 25% perf-sched.sch_delay.max.ms.__cond_resched.__kmem_cache_alloc_node.__kmalloc.cifs_strndup_to_utf16.cifs_convert_path_to_utf16
30.21 ± 3% -6.2% 28.33 ± 3% perf-sched.total_wait_and_delay.average.ms
30.15 ± 3% -6.2% 28.28 ± 3% perf-sched.total_wait_time.average.ms
1.08 -20.5% 0.86 ± 2% perf-sched.wait_and_delay.avg.ms.io_schedule.folio_wait_bit_common.filemap_update_page.filemap_get_pages
99.86 ± 27% +71.6% 171.38 ± 32% perf-sched.wait_and_delay.avg.ms.kthreadd.ret_from_fork.ret_from_fork_asm
1.10 ± 2% -16.3% 0.92 perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
1.41 ± 5% -87.1% 0.18 ±223% perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.cifs_call_async
0.21 -13.4% 0.18 perf-sched.wait_and_delay.avg.ms.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked
195.95 ± 10% -18.4% 159.83 ± 12% perf-sched.wait_and_delay.avg.ms.wait_for_response.compound_send_recv.cifs_send_recv.__SMB2_close
2.60 -23.5% 1.99 perf-sched.wait_and_delay.avg.ms.wait_for_response.compound_send_recv.cifs_send_recv.query_info
20.46 -13.7% 17.66 ± 4% perf-sched.wait_and_delay.avg.ms.wait_for_response.compound_send_recv.smb2_compound_op.smb2_query_path_info
3.35 ± 66% +342.5% 14.82 ± 20% perf-sched.wait_and_delay.avg.ms.wait_for_response.compound_send_recv.smb2_compound_op.smb2_unlink
2103 +10.0% 2312 ± 3% perf-sched.wait_and_delay.count.__lock_sock.sk_wait_data.tcp_recvmsg_locked.tcp_recvmsg
1025 +14.8% 1176 perf-sched.wait_and_delay.count.io_schedule.folio_wait_bit_common.folio_wait_writeback.__filemap_fdatawait_range
9729 ± 2% +21.1% 11779 perf-sched.wait_and_delay.count.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
2349 ± 9% +29.3% 3038 ± 10% perf-sched.wait_and_delay.count.schedule_preempt_disabled.__mutex_lock.constprop.0.compound_send_recv
998.00 +14.3% 1140 perf-sched.wait_and_delay.count.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.do_unlinkat
1026 +15.0% 1181 perf-sched.wait_and_delay.count.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.open_last_lookups
18409 +12.5% 20714 ± 4% perf-sched.wait_and_delay.count.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked
1011 +14.8% 1160 perf-sched.wait_and_delay.count.wait_for_response.compound_send_recv.cifs_send_recv.query_info
1013 +14.5% 1160 perf-sched.wait_and_delay.count.wait_for_response.compound_send_recv.smb2_compound_op.smb2_unlink
2.68 ± 4% -19.6% 2.16 ± 7% perf-sched.wait_and_delay.max.ms.__lock_sock.sk_wait_data.tcp_recvmsg_locked.tcp_recvmsg
282.00 ± 3% -11.3% 250.07 ± 4% perf-sched.wait_and_delay.max.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.do_unlinkat
280.97 ± 2% -12.8% 244.97 ± 2% perf-sched.wait_and_delay.max.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.open_last_lookups
0.49 ±125% -97.2% 0.01 ±198% perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_call_function_single
1.05 -20.9% 0.83 ± 2% perf-sched.wait_time.avg.ms.io_schedule.folio_wait_bit_common.filemap_update_page.filemap_get_pages
2.14 ± 4% +19.1% 2.55 ± 8% perf-sched.wait_time.avg.ms.io_schedule.rq_qos_wait.wbt_wait.__rq_qos_throttle
99.82 ± 27% +69.8% 169.46 ± 31% perf-sched.wait_time.avg.ms.kthreadd.ret_from_fork.ret_from_fork_asm
1.08 ± 2% -16.6% 0.90 perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
1.37 ± 5% -24.5% 1.03 ± 5% perf-sched.wait_time.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.cifs_call_async
0.20 -14.2% 0.17 perf-sched.wait_time.avg.ms.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked
195.53 ± 10% -18.4% 159.54 ± 12% perf-sched.wait_time.avg.ms.wait_for_response.compound_send_recv.cifs_send_recv.__SMB2_close
2.54 -24.0% 1.93 perf-sched.wait_time.avg.ms.wait_for_response.compound_send_recv.cifs_send_recv.query_info
20.44 -13.8% 17.63 ± 4% perf-sched.wait_time.avg.ms.wait_for_response.compound_send_recv.smb2_compound_op.smb2_query_path_info
3.32 ± 67% +345.6% 14.78 ± 20% perf-sched.wait_time.avg.ms.wait_for_response.compound_send_recv.smb2_compound_op.smb2_unlink
245.89 ± 9% -11.8% 216.92 ± 6% perf-sched.wait_time.max.ms.__cond_resched.__kmem_cache_alloc_node.__kmalloc.cifs_strndup_to_utf16.cifs_convert_path_to_utf16
3.14 ± 9% -43.6% 1.77 ± 40% perf-sched.wait_time.max.ms.__cond_resched.dput.terminate_walk.path_openat.do_filp_open
2.65 ± 3% -19.9% 2.12 ± 6% perf-sched.wait_time.max.ms.__lock_sock.sk_wait_data.tcp_recvmsg_locked.tcp_recvmsg
0.57 ±101% -91.5% 0.05 ±213% perf-sched.wait_time.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_call_function_single
1.79 ± 82% -86.4% 0.24 ± 58% perf-sched.wait_time.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi
281.92 ± 3% -11.3% 249.99 ± 4% perf-sched.wait_time.max.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.do_unlinkat
280.90 ± 2% -12.8% 244.88 ± 2% perf-sched.wait_time.max.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.open_last_lookups
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH 1/3] btrfs: warn on tree blocks which are not nodesize aligned
2023-08-24 6:33 ` [PATCH 1/3] btrfs: warn on tree blocks which are not nodesize aligned Qu Wenruo
@ 2023-09-06 9:34 ` Anand Jain
2023-09-06 16:53 ` David Sterba
0 siblings, 1 reply; 9+ messages in thread
From: Anand Jain @ 2023-09-06 9:34 UTC (permalink / raw)
To: Qu Wenruo, linux-btrfs
On 8/24/23 14:33, Qu Wenruo wrote:
> A long time ago, we had some metadata chunks which started at a sector
> boundary but were not aligned to the nodesize boundary.
>
> + if (!IS_ALIGNED(start, fs_info->nodesize) &&
> + !test_and_set_bit(BTRFS_FS_UNALIGNED_TREE_BLOCK,
> + &fs_info->flags)) {
> + btrfs_warn(fs_info,
> + "tree block not nodesize aligned, start %llu nodesize %u",
> + start, fs_info->nodesize);
> + btrfs_warn(fs_info, "this can be solved by a full metadata balance");
> + }
> return 0;
I don't know if rate limiting is required here. But that shouldn't be a
no-go for the patch.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Thanks.
* Re: [PATCH 1/3] btrfs: warn on tree blocks which are not nodesize aligned
2023-09-06 9:34 ` Anand Jain
@ 2023-09-06 16:53 ` David Sterba
0 siblings, 0 replies; 9+ messages in thread
From: David Sterba @ 2023-09-06 16:53 UTC (permalink / raw)
To: Anand Jain; +Cc: Qu Wenruo, linux-btrfs
On Wed, Sep 06, 2023 at 05:34:15PM +0800, Anand Jain wrote:
> On 8/24/23 14:33, Qu Wenruo wrote:
> > A long time ago, we had some metadata chunks which started at a sector
> > boundary but were not aligned to the nodesize boundary.
> >
> > + if (!IS_ALIGNED(start, fs_info->nodesize) &&
> > + !test_and_set_bit(BTRFS_FS_UNALIGNED_TREE_BLOCK,
> > + &fs_info->flags)) {
> > + btrfs_warn(fs_info,
> > + "tree block not nodesize aligned, start %llu nodesize %u",
> > + start, fs_info->nodesize);
> > + btrfs_warn(fs_info, "this can be solved by a full metadata balance");
> > + }
> > return 0;
>
> I don't know if rate limiting is required here. But that shouldn't be a
> no-go for the patch.
There will be only one such message, as it's tracked by the global state
bit BTRFS_FS_UNALIGNED_TREE_BLOCK. However, the message should be printed
in one go so it's not mixed with other system messages.
* Re: [PATCH 0/3] btrfs: make extent buffer memory continuous
2023-08-24 6:33 [PATCH 0/3] btrfs: make extent buffer memory continuous Qu Wenruo
` (2 preceding siblings ...)
2023-08-24 6:33 ` [PATCH 3/3] btrfs: utilize the physically/virtually continuous extent buffer memory Qu Wenruo
@ 2023-09-06 17:49 ` David Sterba
3 siblings, 0 replies; 9+ messages in thread
From: David Sterba @ 2023-09-06 17:49 UTC (permalink / raw)
To: Qu Wenruo; +Cc: linux-btrfs
On Thu, Aug 24, 2023 at 02:33:35PM +0800, Qu Wenruo wrote:
> [CHANGELOG]
> RFC->v1:
> - Rebased to the latest misc-next branch
> Just a small conflict in extent_buffer_memmove().
>
> - Further cleanup the extent buffer bitmap operations
>
> [REPO]
> https://github.com/adam900710/linux/tree/eb_page_cleanups
>
> This includes the previously submitted extent buffer accessors cleanup
> as a dependency.
>
> [BACKGROUND]
> We have a lot of extent buffer code handling cross-page accesses. On
> the other hand, other filesystems like XFS map their xfs_buf into
> kernel virtual address space, so that they can access the contents of
> xfs_buf without worrying about page boundaries.
>
> [OBJECTIVE]
> This patchset mostly learns from xfs_buf to greatly simplify
> the extent buffer accessors.
>
> Now all the extent buffer accessors are turned into wrappers of
> memcpy()/memcmp()/memmove().
>
> For now, it passes the test cases from the btrfs group without new
> regressions.
>
> Qu Wenruo (3):
> btrfs: warn on tree blocks which are not nodesize aligned
> btrfs: map uncontinuous extent buffer pages into virtual address space
> btrfs: utilize the physically/virtually continuous extent buffer
> memory
My objections stand and we can continue the discussion under the RFC
series, but for testing purposes I'll add the series to for-next.
Thread overview: 9+ messages
2023-08-24 6:33 [PATCH 0/3] btrfs: make extent buffer memory continuous Qu Wenruo
2023-08-24 6:33 ` [PATCH 1/3] btrfs: warn on tree blocks which are not nodesize aligned Qu Wenruo
2023-09-06 9:34 ` Anand Jain
2023-09-06 16:53 ` David Sterba
2023-08-24 6:33 ` [PATCH 2/3] btrfs: map uncontinuous extent buffer pages into virtual address space Qu Wenruo
2023-08-28 10:36 ` Johannes Thumshirn
2023-08-24 6:33 ` [PATCH 3/3] btrfs: utilize the physically/virtually continuous extent buffer memory Qu Wenruo
2023-09-06 2:45 ` kernel test robot
2023-09-06 17:49 ` [PATCH 0/3] btrfs: make extent buffer memory continuous David Sterba