* Re: [PATCH 09/11] drm, cgroup: Introduce lgpu as DRM cgroup resource
From: Tejun Heo @ 2020-02-14 21:15 UTC (permalink / raw)
To: Kenny Ho
Cc: juan.zuniga-anaya, Daniel Vetter, Kenny Ho, Kuehling, Felix,
jsparks, nirmoy.das, Maling list - DRI developers, lkaplan,
Greathouse, Joseph, amd-gfx mailing list, Jason Ekstrand,
Johannes Weiner, Alex Deucher, cgroups, Christian König,
damon.mcdougall
In-Reply-To: <CAOWid-dA2Ad-FTZDDLOs4pperYbsru9cknSuXo_2ajpPbQH0Xg@mail.gmail.com>
On Fri, Feb 14, 2020 at 03:28:40PM -0500, Kenny Ho wrote:
> Can you elaborate, per your understanding, how the lgpu weight
> attribute differ from the io.weight you suggested? Is it merely a
Oh, it's the non-weight part which is problematic.
> formatting/naming issue or is it the implementation details that you
> find troubling? From my perspective, the weight attribute implements
> as you suggested back in RFCv4 (proportional control on top of a unit
> - either physical or time unit.)
>
> Perhaps more explicit questions would help me understand what you
> mean. If I remove the 'list' and 'count' attributes leaving just
> weight, is that satisfactory? Are you saying the idea of affinity or
At least from interface pov, yes, although I think it should be clear
what the weight controls.
> named-resource is banned from cgroup entirely (even though it exists
> in the form of cpuset already and users are interested in having such
> options [i.e. userspace OpenCL] when needed?)
>
> To be clear, I am not saying no proportional control. I am saying
> give the user the options, which is what has been implemented.
We can get there if we *really* have to but not from the get-go but
I'd rather avoid affinities if at all possible.
Thanks.
--
tejun
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
^ permalink raw reply
* Re: Regression: hibernation is broken since e6bc9de714972cac34daa1dc1567ee48a47a9342
From: Domenico Andreoli @ 2020-02-14 21:15 UTC (permalink / raw)
To: Darrick J. Wong
Cc: Christoph Hellwig, Linus Torvalds, Linux Kernel Mailing List,
linux-pm
In-Reply-To: <20200213194135.GF6870@magnolia>
[ added linux-pm ]
On Thu, Feb 13, 2020 at 11:41:35AM -0800, Darrick J. Wong wrote:
> On Thu, Feb 13, 2020 at 11:34:10AM -0800, Darrick J. Wong wrote:
> >
> > Well ... you could try the in-kernel hibernate (which I think is what
> > 'systemctl hibernate' does), though you'd lose the nifty features of
> > µswsusp.
Indeed 'systemctl hibernate' works perfectly with v5.6-rc1 in my setup.
> > In the end, though, I'll probably have to revert all those IS_SWAPFILE
> > checks (at least if CONFIG_HIBERNATION=y) since it's not fair to force
> > you to totally reconfigure your hibernation setup.
>
> Also, does the following partial revert fix uswsusp for you? It'll
> allow the direct writes that uswsusp wants to do, while leaving the rest
> (mmap writes) in place.
>
> --D
>
> diff --git a/fs/block_dev.c b/fs/block_dev.c
> index 69bf2fb6f7cd..077d9fa6b87d 100644
> --- a/fs/block_dev.c
> +++ b/fs/block_dev.c
> @@ -2001,8 +2001,10 @@ ssize_t blkdev_write_iter(struct kiocb *iocb, struct iov_iter *from)
> if (bdev_read_only(I_BDEV(bd_inode)))
> return -EPERM;
>
> +#ifndef CONFIG_HIBERNATION
> if (IS_SWAPFILE(bd_inode))
> return -ETXTBSY;
> +#endif
This alone is enough to make uswsusp work again.
I propose this alternative:
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -2001,7 +2001,8 @@ ssize_t blkdev_write_iter(struct kiocb *iocb, struct iov_iter *from)
if (bdev_read_only(I_BDEV(bd_inode)))
return -EPERM;
- if (IS_SWAPFILE(bd_inode))
+ /* Hibernation might happen via uswsusp, let it write to the swap */
+ if (IS_SWAPFILE(bd_inode) && !IS_ENABLED(CONFIG_HIBERNATION))
return -ETXTBSY;
if (!iov_iter_count(from))
I looked for a more selective way to enable writes to swap at runtime,
so I tried with system_entering_hibernation() but it's not yet armed
at the point in which uswsusp wants to write to the swap and therefore
it does not work.
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -34,6 +34,7 @@
#include <linux/task_io_accounting_ops.h>
#include <linux/falloc.h>
#include <linux/uaccess.h>
+#include <linux/suspend.h>
#include "internal.h"
struct bdev_inode {
@@ -2001,7 +2002,8 @@ ssize_t blkdev_write_iter(struct kiocb *iocb, struct iov_iter *from)
if (bdev_read_only(I_BDEV(bd_inode)))
return -EPERM;
- if (IS_SWAPFILE(bd_inode))
+ /* Hibernation might happen via uswsusp, let it write to the swap */
+ if (IS_SWAPFILE(bd_inode) && !system_entering_hibernation())
return -ETXTBSY;
if (!iov_iter_count(from))
> if (!iov_iter_count(from))
> return 0;
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 1784478270e1..3df3211abe25 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -2920,8 +2920,10 @@ inline ssize_t generic_write_checks(struct kiocb *iocb, struct iov_iter *from)
> loff_t count;
> int ret;
>
> +#ifndef CONFIG_HIBERNATION
> if (IS_SWAPFILE(inode))
> return -ETXTBSY;
> +#endif
>
> if (!iov_iter_count(from))
> return 0;
The above is not needed in my case but I'm not sure it would not be
needed in some other configuration of uswsusp.
Dom
--
rsa4096: 3B10 0CA1 8674 ACBA B4FE FCD2 CE5B CF17 9960 DE13
ed25519: FFB4 0CC3 7F2E 091D F7DA 356E CC79 2832 ED38 CB05
^ permalink raw reply
* Re: [PATCH 09/11] drm, cgroup: Introduce lgpu as DRM cgroup resource
From: Tejun Heo @ 2020-02-14 21:15 UTC (permalink / raw)
To: Kenny Ho
Cc: Daniel Vetter, Jason Ekstrand, Kenny Ho,
cgroups-u79uwXL29TY76Z2rM5mHXA, Maling list - DRI developers,
amd-gfx mailing list, Alex Deucher, Christian König,
Kuehling, Felix, Greathouse, Joseph, jsparks-WVYJKLFxKCc,
lkaplan-WVYJKLFxKCc, nirmoy.das-5C7GfCeVMHo,
damon.mcdougall-5C7GfCeVMHo, juan.zuniga-anaya-5C7GfCeVMHo,
Johannes Weiner
In-Reply-To: <CAOWid-dA2Ad-FTZDDLOs4pperYbsru9cknSuXo_2ajpPbQH0Xg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
On Fri, Feb 14, 2020 at 03:28:40PM -0500, Kenny Ho wrote:
> Can you elaborate, per your understanding, how the lgpu weight
> attribute differ from the io.weight you suggested? Is it merely a
Oh, it's the non-weight part which is problematic.
> formatting/naming issue or is it the implementation details that you
> find troubling? From my perspective, the weight attribute implements
> as you suggested back in RFCv4 (proportional control on top of a unit
> - either physical or time unit.)
>
> Perhaps more explicit questions would help me understand what you
> mean. If I remove the 'list' and 'count' attributes leaving just
> weight, is that satisfactory? Are you saying the idea of affinity or
At least from interface pov, yes, although I think it should be clear
what the weight controls.
> named-resource is banned from cgroup entirely (even though it exists
> in the form of cpuset already and users are interested in having such
> options [i.e. userspace OpenCL] when needed?)
>
> To be clear, I am not saying no proportional control. I am saying
> give the user the options, which is what has been implemented.
We can get there if we *really* have to but not from the get-go but
I'd rather avoid affinities if at all possible.
Thanks.
--
tejun
^ permalink raw reply
* Re: [PATCH] Fix bitbake-layerindex to checkout the requested branch
From: "Jan-Simon Möller" @ 2020-02-14 21:07 UTC (permalink / raw)
To: Mark Hatle; +Cc: bitbake-devel
In-Reply-To: <80610678-1f40-8b81-f852-4d48a950ef73@kernel.crashing.org>
Hi Mark!
Thanks for reviewing. I had hoped to find the right variables. A pity it did not land earlier.
Can we backport this to zeus, please ?
> This issue was already found (and I thought fixed?) Maybe the submission hadn't
> been accepted. See:
>
> http://lists.openembedded.org/pipermail/bitbake-devel/2019-December/020645.html
Please ignore my patchset then.
Best,
JS
^ permalink raw reply
* Re: [PATCH] drm/amdgpu: stop allocating PDs/PTs with the eviction lock held
From: Felix Kuehling @ 2020-02-14 21:12 UTC (permalink / raw)
To: Christian König, alex.sierra, amd-gfx
In-Reply-To: <20200212151426.14197-1-christian.koenig@amd.com>
Now you allow eviction of page tables while you allocate page tables.
Isn't the whole point of the eviction lock to prevent page table
evictions while manipulating page tables?
Or does this only apply to PTE invalidations which never allocated
memory? Is that the only case that doesn't reserve page tables?
Regards,
Felix
On 2020-02-12 10:14 a.m., Christian König wrote:
> We need to make sure to not allocate PDs/PTs while holding
> the eviction lock or otherwise we will run into lock inversion
> in the MM as soon as we enable the MMU notifier.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 31 +++++++++++++++++++++-----
> 1 file changed, 25 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 77c400675b79..e7ab0c1e2793 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -897,27 +897,42 @@ static int amdgpu_vm_alloc_pts(struct amdgpu_device *adev,
> struct amdgpu_vm_pt *entry = cursor->entry;
> struct amdgpu_bo_param bp;
> struct amdgpu_bo *pt;
> + bool need_entries;
> int r;
>
> - if (cursor->level < AMDGPU_VM_PTB && !entry->entries) {
> + need_entries = cursor->level < AMDGPU_VM_PTB && !entry->entries;
> + if (!need_entries && entry->base.bo)
> + return 0;
> +
> + /* We need to make sure that we don't allocate PDs/PTs while holding the
> + * eviction lock or we run into lock recursion in the MM.
> + */
> + amdgpu_vm_eviction_unlock(vm);
> +
> + if (need_entries) {
> unsigned num_entries;
>
> num_entries = amdgpu_vm_num_entries(adev, cursor->level);
> entry->entries = kvmalloc_array(num_entries,
> sizeof(*entry->entries),
> GFP_KERNEL | __GFP_ZERO);
> - if (!entry->entries)
> - return -ENOMEM;
> + if (!entry->entries) {
> + r = -ENOMEM;
> + goto error_lock;
> + }
> }
>
> - if (entry->base.bo)
> - return 0;
> + if (entry->base.bo) {
> + r = 0;
> + goto error_lock;
> + }
>
> amdgpu_vm_bo_param(adev, vm, cursor->level, direct, &bp);
>
> r = amdgpu_bo_create(adev, &bp, &pt);
> + amdgpu_vm_eviction_lock(vm);
> if (r)
> - return r;
> + goto error_free_pt;
>
> /* Keep a reference to the root directory to avoid
> * freeing them up in the wrong order.
> @@ -936,6 +951,10 @@ static int amdgpu_vm_alloc_pts(struct amdgpu_device *adev,
> amdgpu_bo_unref(&pt);
> entry->base.bo = NULL;
> return r;
> +
> +error_lock:
> + amdgpu_vm_eviction_lock(vm);
> + return r;
> }
>
> /**
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
^ permalink raw reply
* [PATCH v2 6/6] NFS: Decode multiple READ_PLUS segments
From: schumaker.anna @ 2020-02-14 21:12 UTC (permalink / raw)
To: Trond.Myklebust, linux-nfs; +Cc: Anna.Schumaker
In-Reply-To: <20200214211227.407836-1-Anna.Schumaker@Netapp.com>
From: Anna Schumaker <Anna.Schumaker@Netapp.com>
We now have everything we need to read holes and shift data to where
it's supposed to be. I switch over to using xdr_align_data() to put data
segments in the proper place.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
---
fs/nfs/nfs42xdr.c | 43 +++++++++++++++++++++++--------------------
1 file changed, 23 insertions(+), 20 deletions(-)
diff --git a/fs/nfs/nfs42xdr.c b/fs/nfs/nfs42xdr.c
index 3407a3cf2e13..b5c638bcab66 100644
--- a/fs/nfs/nfs42xdr.c
+++ b/fs/nfs/nfs42xdr.c
@@ -53,7 +53,7 @@
#define decode_read_plus_maxsz (op_decode_hdr_maxsz + \
1 /* rpr_eof */ + \
1 /* rpr_contents count */ + \
- NFS42_READ_PLUS_SEGMENT_SIZE)
+ (2 * NFS42_READ_PLUS_SEGMENT_SIZE))
#define encode_seek_maxsz (op_encode_hdr_maxsz + \
encode_stateid_maxsz + \
2 /* offset */ + \
@@ -745,7 +745,7 @@ static int decode_deallocate(struct xdr_stream *xdr, struct nfs42_falloc_res *re
}
static uint32_t decode_read_plus_data(struct xdr_stream *xdr, struct nfs_pgio_res *res,
- uint32_t *eof)
+ uint32_t *eof, uint64_t total)
{
__be32 *p;
uint32_t count, recvd;
@@ -760,7 +760,7 @@ static uint32_t decode_read_plus_data(struct xdr_stream *xdr, struct nfs_pgio_re
if (count == 0)
return 0;
- recvd = xdr_read_pages(xdr, count);
+ recvd = xdr_align_data(xdr, total, count);
if (count > recvd) {
dprintk("NFS: server cheating in read reply: "
"count %u > recvd %u\n", count, recvd);
@@ -772,7 +772,7 @@ static uint32_t decode_read_plus_data(struct xdr_stream *xdr, struct nfs_pgio_re
}
static uint32_t decode_read_plus_hole(struct xdr_stream *xdr, struct nfs_pgio_res *res,
- uint32_t *eof)
+ uint32_t *eof, uint64_t total)
{
__be32 *p;
uint64_t offset, length;
@@ -787,7 +787,7 @@ static uint32_t decode_read_plus_hole(struct xdr_stream *xdr, struct nfs_pgio_re
if (length == 0)
return 0;
- recvd = xdr_expand_hole(xdr, 0, length);
+ recvd = xdr_expand_hole(xdr, total, length);
if (recvd < length) {
*eof = 0;
length = recvd;
@@ -799,8 +799,9 @@ static uint32_t decode_read_plus_hole(struct xdr_stream *xdr, struct nfs_pgio_re
static int decode_read_plus(struct xdr_stream *xdr, struct nfs_pgio_res *res)
{
__be32 *p;
- uint32_t count, eof, segments, type;
- int status;
+ uint32_t eof, segments, type, total;
+ int32_t count;
+ int status, i;
status = decode_op_hdr(xdr, OP_READ_PLUS);
if (status)
@@ -810,26 +811,28 @@ static int decode_read_plus(struct xdr_stream *xdr, struct nfs_pgio_res *res)
if (unlikely(!p))
return -EIO;
+ total = 0;
eof = be32_to_cpup(p++);
segments = be32_to_cpup(p++);
- if (segments == 0)
- return 0;
- p = xdr_inline_decode(xdr, 4);
- if (unlikely(!p))
- return -EIO;
+ for (i = 0; i < segments; i++) {
+ p = xdr_inline_decode(xdr, 4);
+ if (unlikely(!p))
+ return -EIO;
- type = be32_to_cpup(p++);
- if (type == NFS4_CONTENT_DATA)
- count = decode_read_plus_data(xdr, res, &eof);
- else
- count = decode_read_plus_hole(xdr, res, &eof);
+ type = be32_to_cpup(p);
+ if (type == NFS4_CONTENT_DATA)
+ count = decode_read_plus_data(xdr, res, &eof, total);
+ else
+ count = decode_read_plus_hole(xdr, res, &eof, total);
- if (segments > 1)
- eof = 0;
+ if (count < 0)
+ return count;
+ total += count;
+ }
res->eof = eof;
- res->count = count;
+ res->count = total;
return 0;
}
--
2.25.0
^ permalink raw reply related
* [PATCH v2 4/6] NFS: Add READ_PLUS data segment support
From: schumaker.anna @ 2020-02-14 21:12 UTC (permalink / raw)
To: Trond.Myklebust, linux-nfs; +Cc: Anna.Schumaker
In-Reply-To: <20200214211227.407836-1-Anna.Schumaker@Netapp.com>
From: Anna Schumaker <Anna.Schumaker@Netapp.com>
This patch adds client support for decoding a single NFS4_CONTENT_DATA
segment returned by the server. This is the simplest implementation
possible, since it does not account for any hole segments in the reply.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
---
fs/nfs/nfs42xdr.c | 138 ++++++++++++++++++++++++++++++++++++++
fs/nfs/nfs4proc.c | 43 +++++++++++-
fs/nfs/nfs4xdr.c | 1 +
include/linux/nfs4.h | 2 +-
include/linux/nfs_fs_sb.h | 1 +
include/linux/nfs_xdr.h | 2 +-
6 files changed, 182 insertions(+), 5 deletions(-)
diff --git a/fs/nfs/nfs42xdr.c b/fs/nfs/nfs42xdr.c
index c03f3246d6c5..bf118ecabe2c 100644
--- a/fs/nfs/nfs42xdr.c
+++ b/fs/nfs/nfs42xdr.c
@@ -45,6 +45,15 @@
#define encode_deallocate_maxsz (op_encode_hdr_maxsz + \
encode_fallocate_maxsz)
#define decode_deallocate_maxsz (op_decode_hdr_maxsz)
+#define encode_read_plus_maxsz (op_encode_hdr_maxsz + \
+ encode_stateid_maxsz + 3)
+#define NFS42_READ_PLUS_SEGMENT_SIZE (1 /* data_content4 */ + \
+ 2 /* data_info4.di_offset */ + \
+ 2 /* data_info4.di_length */)
+#define decode_read_plus_maxsz (op_decode_hdr_maxsz + \
+ 1 /* rpr_eof */ + \
+ 1 /* rpr_contents count */ + \
+ NFS42_READ_PLUS_SEGMENT_SIZE)
#define encode_seek_maxsz (op_encode_hdr_maxsz + \
encode_stateid_maxsz + \
2 /* offset */ + \
@@ -128,6 +137,14 @@
decode_putfh_maxsz + \
decode_deallocate_maxsz + \
decode_getattr_maxsz)
+#define NFS4_enc_read_plus_sz (compound_encode_hdr_maxsz + \
+ encode_sequence_maxsz + \
+ encode_putfh_maxsz + \
+ encode_read_plus_maxsz)
+#define NFS4_dec_read_plus_sz (compound_decode_hdr_maxsz + \
+ decode_sequence_maxsz + \
+ decode_putfh_maxsz + \
+ decode_read_plus_maxsz)
#define NFS4_enc_seek_sz (compound_encode_hdr_maxsz + \
encode_sequence_maxsz + \
encode_putfh_maxsz + \
@@ -252,6 +269,16 @@ static void encode_deallocate(struct xdr_stream *xdr,
encode_fallocate(xdr, args);
}
+static void encode_read_plus(struct xdr_stream *xdr,
+ const struct nfs_pgio_args *args,
+ struct compound_hdr *hdr)
+{
+ encode_op_hdr(xdr, OP_READ_PLUS, decode_read_plus_maxsz, hdr);
+ encode_nfs4_stateid(xdr, &args->stateid);
+ encode_uint64(xdr, args->offset);
+ encode_uint32(xdr, args->count);
+}
+
static void encode_seek(struct xdr_stream *xdr,
const struct nfs42_seek_args *args,
struct compound_hdr *hdr)
@@ -446,6 +473,29 @@ static void nfs4_xdr_enc_deallocate(struct rpc_rqst *req,
encode_nops(&hdr);
}
+/*
+ * Encode READ_PLUS request
+ */
+static void nfs4_xdr_enc_read_plus(struct rpc_rqst *req,
+ struct xdr_stream *xdr,
+ const void *data)
+{
+ const struct nfs_pgio_args *args = data;
+ struct compound_hdr hdr = {
+ .minorversion = nfs4_xdr_minorversion(&args->seq_args),
+ };
+
+ encode_compound_hdr(xdr, req, &hdr);
+ encode_sequence(xdr, &args->seq_args, &hdr);
+ encode_putfh(xdr, args->fh, &hdr);
+ encode_read_plus(xdr, args, &hdr);
+
+ rpc_prepare_reply_pages(req, args->pages, args->pgbase,
+ args->count, hdr.replen);
+ req->rq_rcv_buf.flags |= XDRBUF_READ;
+ encode_nops(&hdr);
+}
+
/*
* Encode SEEK request
*/
@@ -694,6 +744,67 @@ static int decode_deallocate(struct xdr_stream *xdr, struct nfs42_falloc_res *re
return decode_op_hdr(xdr, OP_DEALLOCATE);
}
+static uint32_t decode_read_plus_data(struct xdr_stream *xdr, struct nfs_pgio_res *res,
+ uint32_t *eof)
+{
+ __be32 *p;
+ uint32_t count, recvd;
+ uint64_t offset;
+
+ p = xdr_inline_decode(xdr, 8 + 4);
+ if (unlikely(!p))
+ return -EIO;
+
+ p = xdr_decode_hyper(p, &offset);
+ count = be32_to_cpup(p);
+ if (count == 0)
+ return 0;
+
+ recvd = xdr_read_pages(xdr, count);
+ if (count > recvd) {
+ dprintk("NFS: server cheating in read reply: "
+ "count %u > recvd %u\n", count, recvd);
+ count = recvd;
+ *eof = 0;
+ }
+
+ return count;
+}
+
+static int decode_read_plus(struct xdr_stream *xdr, struct nfs_pgio_res *res)
+{
+ __be32 *p;
+ uint32_t count, eof, segments, type;
+ int status;
+
+ status = decode_op_hdr(xdr, OP_READ_PLUS);
+ if (status)
+ return status;
+
+ p = xdr_inline_decode(xdr, 4 + 4);
+ if (unlikely(!p))
+ return -EIO;
+
+ eof = be32_to_cpup(p++);
+ segments = be32_to_cpup(p++);
+ if (segments == 0)
+ return 0;
+
+ p = xdr_inline_decode(xdr, 4);
+ if (unlikely(!p))
+ return -EIO;
+
+ type = be32_to_cpup(p++);
+ if (type == NFS4_CONTENT_DATA)
+ count = decode_read_plus_data(xdr, res, &eof);
+ else
+ return -EINVAL;
+
+ res->eof = eof;
+ res->count = count;
+ return 0;
+}
+
static int decode_seek(struct xdr_stream *xdr, struct nfs42_seek_res *res)
{
int status;
@@ -870,6 +981,33 @@ static int nfs4_xdr_dec_deallocate(struct rpc_rqst *rqstp,
return status;
}
+/*
+ * Decode READ_PLUS request
+ */
+static int nfs4_xdr_dec_read_plus(struct rpc_rqst *rqstp,
+ struct xdr_stream *xdr,
+ void *data)
+{
+ struct nfs_pgio_res *res = data;
+ struct compound_hdr hdr;
+ int status;
+
+ status = decode_compound_hdr(xdr, &hdr);
+ if (status)
+ goto out;
+ status = decode_sequence(xdr, &res->seq_res, rqstp);
+ if (status)
+ goto out;
+ status = decode_putfh(xdr);
+ if (status)
+ goto out;
+ status = decode_read_plus(xdr, res);
+ if (!status)
+ status = res->count;
+out:
+ return status;
+}
+
/*
* Decode SEEK request
*/
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 95d07a3dc5d1..ed3ec8c36273 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -69,6 +69,10 @@
#include "nfs4trace.h"
+#ifdef CONFIG_NFS_V4_2
+#include "nfs42.h"
+#endif /* CONFIG_NFS_V4_2 */
+
#define NFSDBG_FACILITY NFSDBG_PROC
#define NFS4_BITMASK_SZ 3
@@ -5199,28 +5203,60 @@ static bool nfs4_read_stateid_changed(struct rpc_task *task,
return true;
}
+static bool nfs4_read_plus_not_supported(struct rpc_task *task,
+ struct nfs_pgio_header *hdr)
+{
+ struct nfs_server *server = NFS_SERVER(hdr->inode);
+ struct rpc_message *msg = &task->tk_msg;
+
+ if (msg->rpc_proc == &nfs4_procedures[NFSPROC4_CLNT_READ_PLUS] &&
+ server->caps & NFS_CAP_READ_PLUS && task->tk_status == -ENOTSUPP) {
+ server->caps &= ~NFS_CAP_READ_PLUS;
+ msg->rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_READ];
+ rpc_restart_call_prepare(task);
+ return true;
+ }
+ return false;
+}
+
static int nfs4_read_done(struct rpc_task *task, struct nfs_pgio_header *hdr)
{
-
dprintk("--> %s\n", __func__);
if (!nfs4_sequence_done(task, &hdr->res.seq_res))
return -EAGAIN;
if (nfs4_read_stateid_changed(task, &hdr->args))
return -EAGAIN;
+ if (nfs4_read_plus_not_supported(task, hdr))
+ return -EAGAIN;
if (task->tk_status > 0)
nfs_invalidate_atime(hdr->inode);
return hdr->pgio_done_cb ? hdr->pgio_done_cb(task, hdr) :
nfs4_read_done_cb(task, hdr);
}
+#ifdef CONFIG_NFS_V4_2
+static void nfs42_read_plus_support(struct nfs_server *server, struct rpc_message *msg)
+{
+ if (server->caps & NFS_CAP_READ_PLUS)
+ msg->rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_READ_PLUS];
+ else
+ msg->rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_READ];
+}
+#else
+static void nfs42_read_plus_support(struct nfs_server *server, struct rpc_message *msg)
+{
+ msg->rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_READ];
+}
+#endif /* CONFIG_NFS_V4_2 */
+
static void nfs4_proc_read_setup(struct nfs_pgio_header *hdr,
struct rpc_message *msg)
{
hdr->timestamp = jiffies;
if (!hdr->pgio_done_cb)
hdr->pgio_done_cb = nfs4_read_done_cb;
- msg->rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_READ];
+ nfs42_read_plus_support(NFS_SERVER(hdr->inode), msg);
nfs4_init_sequence(&hdr->args.seq_args, &hdr->res.seq_res, 0, 0);
}
@@ -9970,7 +10006,8 @@ static const struct nfs4_minor_version_ops nfs_v4_2_minor_ops = {
| NFS_CAP_SEEK
| NFS_CAP_LAYOUTSTATS
| NFS_CAP_CLONE
- | NFS_CAP_LAYOUTERROR,
+ | NFS_CAP_LAYOUTERROR
+ | NFS_CAP_READ_PLUS,
.init_client = nfs41_init_client,
.shutdown_client = nfs41_shutdown_client,
.match_stateid = nfs41_match_stateid,
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index 47817ef0aadb..68b2917d0537 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -7584,6 +7584,7 @@ const struct rpc_procinfo nfs4_procedures[] = {
PROC42(COPY_NOTIFY, enc_copy_notify, dec_copy_notify),
PROC(LOOKUPP, enc_lookupp, dec_lookupp),
PROC42(LAYOUTERROR, enc_layouterror, dec_layouterror),
+ PROC42(READ_PLUS, enc_read_plus, dec_read_plus),
};
static unsigned int nfs_version4_counts[ARRAY_SIZE(nfs4_procedures)];
diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
index 82d8fb422092..c1eeef52545c 100644
--- a/include/linux/nfs4.h
+++ b/include/linux/nfs4.h
@@ -540,8 +540,8 @@ enum {
NFSPROC4_CLNT_LOOKUPP,
NFSPROC4_CLNT_LAYOUTERROR,
-
NFSPROC4_CLNT_COPY_NOTIFY,
+ NFSPROC4_CLNT_READ_PLUS,
};
/* nfs41 types */
diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
index 465fa98258a3..11248c5a7b24 100644
--- a/include/linux/nfs_fs_sb.h
+++ b/include/linux/nfs_fs_sb.h
@@ -281,5 +281,6 @@ struct nfs_server {
#define NFS_CAP_OFFLOAD_CANCEL (1U << 25)
#define NFS_CAP_LAYOUTERROR (1U << 26)
#define NFS_CAP_COPY_NOTIFY (1U << 27)
+#define NFS_CAP_READ_PLUS (1U << 28)
#endif
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index 94c77ed55ce1..8efbf3d8b263 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -655,7 +655,7 @@ struct nfs_pgio_args {
struct nfs_pgio_res {
struct nfs4_sequence_res seq_res;
struct nfs_fattr * fattr;
- __u32 count;
+ __u64 count;
__u32 op_status;
union {
struct {
--
2.25.0
^ permalink raw reply related
* [PATCH v2 5/6] NFS: Add READ_PLUS hole segment decoding
From: schumaker.anna @ 2020-02-14 21:12 UTC (permalink / raw)
To: Trond.Myklebust, linux-nfs; +Cc: Anna.Schumaker
In-Reply-To: <20200214211227.407836-1-Anna.Schumaker@Netapp.com>
From: Anna Schumaker <Anna.Schumaker@Netapp.com>
We keep things simple for now by only decoding a single hole or data
segment returned by the server, even if they returned more to us.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
---
fs/nfs/nfs42xdr.c | 32 ++++++++++++++++++++++++++++++--
1 file changed, 30 insertions(+), 2 deletions(-)
diff --git a/fs/nfs/nfs42xdr.c b/fs/nfs/nfs42xdr.c
index bf118ecabe2c..3407a3cf2e13 100644
--- a/fs/nfs/nfs42xdr.c
+++ b/fs/nfs/nfs42xdr.c
@@ -47,7 +47,7 @@
#define decode_deallocate_maxsz (op_decode_hdr_maxsz)
#define encode_read_plus_maxsz (op_encode_hdr_maxsz + \
encode_stateid_maxsz + 3)
-#define NFS42_READ_PLUS_SEGMENT_SIZE (1 /* data_content4 */ + \
+#define NFS42_READ_PLUS_SEGMENT_SIZE (2 /* data_content4 */ + \
2 /* data_info4.di_offset */ + \
2 /* data_info4.di_length */)
#define decode_read_plus_maxsz (op_decode_hdr_maxsz + \
@@ -771,6 +771,31 @@ static uint32_t decode_read_plus_data(struct xdr_stream *xdr, struct nfs_pgio_re
return count;
}
+static uint32_t decode_read_plus_hole(struct xdr_stream *xdr, struct nfs_pgio_res *res,
+ uint32_t *eof)
+{
+ __be32 *p;
+ uint64_t offset, length;
+ size_t recvd;
+
+ p = xdr_inline_decode(xdr, 8 + 8);
+ if (unlikely(!p))
+ return -EIO;
+
+ p = xdr_decode_hyper(p, &offset);
+ p = xdr_decode_hyper(p, &length);
+ if (length == 0)
+ return 0;
+
+ recvd = xdr_expand_hole(xdr, 0, length);
+ if (recvd < length) {
+ *eof = 0;
+ length = recvd;
+ }
+
+ return length;
+}
+
static int decode_read_plus(struct xdr_stream *xdr, struct nfs_pgio_res *res)
{
__be32 *p;
@@ -798,7 +823,10 @@ static int decode_read_plus(struct xdr_stream *xdr, struct nfs_pgio_res *res)
if (type == NFS4_CONTENT_DATA)
count = decode_read_plus_data(xdr, res, &eof);
else
- return -EINVAL;
+ count = decode_read_plus_hole(xdr, res, &eof);
+
+ if (segments > 1)
+ eof = 0;
res->eof = eof;
res->count = count;
--
2.25.0
^ permalink raw reply related
* [PATCH v2 3/6] SUNRPC: Add the ability to shift data to a specific offset
From: schumaker.anna @ 2020-02-14 21:12 UTC (permalink / raw)
To: Trond.Myklebust, linux-nfs; +Cc: Anna.Schumaker
In-Reply-To: <20200214211227.407836-1-Anna.Schumaker@Netapp.com>
From: Anna Schumaker <Anna.Schumaker@Netapp.com>
Expanding holes tends to put the data content a few bytes to the right
of where we want it. This patch implements a left-shift operation to
line everything up properly.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
---
include/linux/sunrpc/xdr.h | 1 +
net/sunrpc/xdr.c | 135 +++++++++++++++++++++++++++++++++++++
2 files changed, 136 insertions(+)
diff --git a/include/linux/sunrpc/xdr.h b/include/linux/sunrpc/xdr.h
index 81a79099ed3b..90abf732c8ee 100644
--- a/include/linux/sunrpc/xdr.h
+++ b/include/linux/sunrpc/xdr.h
@@ -264,6 +264,7 @@ extern unsigned int xdr_read_pages(struct xdr_stream *xdr, unsigned int len);
extern void xdr_enter_page(struct xdr_stream *xdr, unsigned int len);
extern int xdr_process_buf(struct xdr_buf *buf, unsigned int offset, unsigned int len, int (*actor)(struct scatterlist *, void *), void *data);
extern size_t xdr_expand_hole(struct xdr_stream *, size_t, uint64_t);
+extern uint64_t xdr_align_data(struct xdr_stream *, uint64_t, uint64_t);
/**
* xdr_stream_remaining - Return the number of bytes remaining in the stream
diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
index bc9b9b0945f5..a857c2308ee1 100644
--- a/net/sunrpc/xdr.c
+++ b/net/sunrpc/xdr.c
@@ -19,6 +19,9 @@
#include <linux/bvec.h>
#include <trace/events/sunrpc.h>
+static void _copy_to_pages(struct page **, size_t, const char *, size_t);
+
+
/*
* XDR functions for basic NFS types
*/
@@ -300,6 +303,117 @@ _shift_data_right(struct xdr_buf *buf, size_t to, size_t from, size_t len)
shift);
}
+
+/**
+ * _shift_data_left_pages
+ * @pages: vector of pages containing both the source and dest memory area.
+ * @pgto_base: page vector address of destination
+ * @pgfrom_base: page vector address of source
+ * @len: number of bytes to copy
+ *
+ * Note: the addresses pgto_base and pgfrom_base are both calculated in
+ * the same way:
+ * if a memory area starts at byte 'base' in page 'pages[i]',
+ * then its address is given as (i << PAGE_CACHE_SHIFT) + base
+ * Alse note: pgto_base must be < pgfrom_base, but the memory areas
+ * they point to may overlap.
+ */
+static void
+_shift_data_left_pages(struct page **pages, size_t pgto_base,
+ size_t pgfrom_base, size_t len)
+{
+ struct page **pgfrom, **pgto;
+ char *vfrom, *vto;
+ size_t copy;
+
+ BUG_ON(pgfrom_base <= pgto_base);
+
+ pgto = pages + (pgto_base >> PAGE_SHIFT);
+ pgfrom = pages + (pgfrom_base >> PAGE_SHIFT);
+
+ pgto_base = pgto_base % PAGE_SIZE;
+ pgfrom_base = pgfrom_base % PAGE_SIZE;
+
+ do {
+ if (pgto_base >= PAGE_SIZE) {
+ pgto_base = 0;
+ pgto++;
+ }
+ if (pgfrom_base >= PAGE_SIZE){
+ pgfrom_base = 0;
+ pgfrom++;
+ }
+
+ copy = len;
+ if (copy > (PAGE_SIZE - pgto_base))
+ copy = PAGE_SIZE - pgto_base;
+ if (copy > (PAGE_SIZE - pgfrom_base))
+ copy = PAGE_SIZE - pgfrom_base;
+
+ if (pgto_base == 131056)
+ break;
+
+ vto = kmap_atomic(*pgto);
+ if (*pgto != *pgfrom) {
+ vfrom = kmap_atomic(*pgfrom);
+ memcpy(vto + pgto_base, vfrom + pgfrom_base, copy);
+ kunmap_atomic(vfrom);
+ } else
+ memmove(vto + pgto_base, vto + pgfrom_base, copy);
+ flush_dcache_page(*pgto);
+ kunmap_atomic(vto);
+
+ pgto_base += copy;
+ pgfrom_base += copy;
+
+ } while ((len -= copy) != 0);
+}
+
+static void
+_shift_data_left_tail(struct xdr_buf *buf, size_t pgto_base,
+ size_t tail_from, size_t len)
+{
+ struct kvec *tail = buf->tail;
+ size_t shift = len;
+
+ if (len == 0)
+ return;
+ if (pgto_base + len > buf->page_len)
+ shift = buf->page_len - pgto_base;
+
+ _copy_to_pages(buf->pages,
+ buf->page_base + pgto_base,
+ (char *)(tail->iov_base + tail_from),
+ shift);
+
+ memmove((char *)tail->iov_base, tail->iov_base + tail_from + shift, shift);
+ tail->iov_len -= (tail_from + shift);
+}
+
+static void
+_shift_data_left(struct xdr_buf *buf, size_t to, size_t from, size_t len)
+{
+ size_t shift = len;
+
+ if (from < buf->page_len) {
+ shift = min(len, buf->page_len - from);
+ _shift_data_left_pages(buf->pages,
+ buf->page_base + to,
+ buf->page_base + from,
+ shift);
+ to += shift;
+ from += shift;
+ shift = len - shift;
+ }
+
+ if (shift == 0)
+ return;
+ if (from >= buf->page_len)
+ from -= buf->page_len;
+
+ _shift_data_left_tail(buf, to, from, shift);
+}
+
/**
* _copy_to_pages
* @pages: array of pages
@@ -1162,6 +1276,27 @@ size_t xdr_expand_hole(struct xdr_stream *xdr, size_t offset, uint64_t length)
}
EXPORT_SYMBOL_GPL(xdr_expand_hole);
+uint64_t xdr_align_data(struct xdr_stream *xdr, uint64_t offset, uint64_t length)
+{
+ struct xdr_buf *buf = xdr->buf;
+ size_t from = offset;
+
+ if (offset + length > buf->page_len)
+ length = buf->page_len - offset;
+
+ if (offset == 0)
+ xdr_align_pages(xdr, xdr->nwords << 2);
+ else {
+ from = xdr_page_pos(xdr);
+ _shift_data_left(buf, offset, from, length);
+ }
+
+ xdr->nwords -= XDR_QUADLEN(length);
+ xdr_set_page(xdr, from + length);
+ return length;
+}
+EXPORT_SYMBOL_GPL(xdr_align_data);
+
/**
* xdr_enter_page - decode data from the XDR page
* @xdr: pointer to xdr_stream struct
--
2.25.0
^ permalink raw reply related
* [PATCH v2 2/6] SUNRPC: Add the ability to expand holes in data pages
From: schumaker.anna @ 2020-02-14 21:12 UTC (permalink / raw)
To: Trond.Myklebust, linux-nfs; +Cc: Anna.Schumaker
In-Reply-To: <20200214211227.407836-1-Anna.Schumaker@Netapp.com>
From: Anna Schumaker <Anna.Schumaker@Netapp.com>
This patch adds the ability to "read a hole" into a set of XDR data
pages by taking the following steps:
1) Shift all data after the current xdr->p to the right, possibly into
the tail,
2) Zero the specified range, and
3) Update xdr->p to point beyond the hole.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
---
include/linux/sunrpc/xdr.h | 1 +
net/sunrpc/xdr.c | 100 +++++++++++++++++++++++++++++++++++++
2 files changed, 101 insertions(+)
diff --git a/include/linux/sunrpc/xdr.h b/include/linux/sunrpc/xdr.h
index b41f34977995..81a79099ed3b 100644
--- a/include/linux/sunrpc/xdr.h
+++ b/include/linux/sunrpc/xdr.h
@@ -263,6 +263,7 @@ extern __be32 *xdr_inline_decode(struct xdr_stream *xdr, size_t nbytes);
extern unsigned int xdr_read_pages(struct xdr_stream *xdr, unsigned int len);
extern void xdr_enter_page(struct xdr_stream *xdr, unsigned int len);
extern int xdr_process_buf(struct xdr_buf *buf, unsigned int offset, unsigned int len, int (*actor)(struct scatterlist *, void *), void *data);
+extern size_t xdr_expand_hole(struct xdr_stream *, size_t, uint64_t);
/**
* xdr_stream_remaining - Return the number of bytes remaining in the stream
diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
index 8bed0ec21563..bc9b9b0945f5 100644
--- a/net/sunrpc/xdr.c
+++ b/net/sunrpc/xdr.c
@@ -266,6 +266,40 @@ _shift_data_right_pages(struct page **pages, size_t pgto_base,
} while ((len -= copy) != 0);
}
+static void
+_shift_data_right_tail(struct xdr_buf *buf, size_t pgfrom_base, size_t len)
+{
+ struct kvec *tail = buf->tail;
+
+ /* Make room for new data. */
+ if (tail->iov_len > 0)
+ memmove((char *)tail->iov_base + len, tail->iov_base, len);
+
+ _copy_from_pages((char *)tail->iov_base,
+ buf->pages,
+ buf->page_base + pgfrom_base,
+ len);
+
+ tail->iov_len += len;
+}
+
+static void
+_shift_data_right(struct xdr_buf *buf, size_t to, size_t from, size_t len)
+{
+ size_t shift = len;
+
+ if ((to + len) > buf->page_len) {
+ shift = (to + len) - buf->page_len;
+ _shift_data_right_tail(buf, (from + len) - shift, shift);
+ shift = len - shift;
+ }
+
+ _shift_data_right_pages(buf->pages,
+ buf->page_base + to,
+ buf->page_base + from,
+ shift);
+}
+
/**
* _copy_to_pages
* @pages: array of pages
@@ -350,6 +384,33 @@ _copy_from_pages(char *p, struct page **pages, size_t pgbase, size_t len)
}
EXPORT_SYMBOL_GPL(_copy_from_pages);
+/**
+ * _zero_data_pages
+ * @pages: array of pages
+ * @pgbase: beginning page vector address
+ * @len: length
+ */
+static void
+_zero_data_pages(struct page **pages, size_t pgbase, size_t len)
+{
+ struct page **page;
+ size_t zero;
+
+ page = pages + (pgbase >> PAGE_SHIFT);
+ pgbase &= ~PAGE_MASK;
+
+ do {
+ zero = len;
+ if (pgbase + zero > PAGE_SIZE)
+ zero = PAGE_SIZE - pgbase;
+
+ zero_user_segment(*page, pgbase, pgbase + zero);
+ page++;
+ pgbase = 0;
+
+ } while ((len -= zero) != 0);
+}
+
/**
* xdr_shrink_bufhead
* @buf: xdr_buf
@@ -505,6 +566,24 @@ unsigned int xdr_stream_pos(const struct xdr_stream *xdr)
}
EXPORT_SYMBOL_GPL(xdr_stream_pos);
+/**
+ * xdr_page_pos - Return the current offset from the start of the xdr->buf->pages
+ * @xdr: pointer to struct xdr_stream
+ */
+static size_t xdr_page_pos(const struct xdr_stream *xdr)
+{
+ unsigned int offset;
+ unsigned int base = xdr->buf->page_len;
+ void *kaddr = xdr->buf->tail->iov_base;;
+
+ if (xdr->page_ptr) {
+ base = (xdr->page_ptr - xdr->buf->pages) * PAGE_SIZE;
+ kaddr = page_address(*xdr->page_ptr);
+ }
+ offset = xdr->p - (__be32 *)kaddr;
+ return base + (offset * sizeof(__be32));
+}
+
/**
* xdr_init_encode - Initialize a struct xdr_stream for sending data.
* @xdr: pointer to xdr_stream struct
@@ -1062,6 +1141,27 @@ unsigned int xdr_read_pages(struct xdr_stream *xdr, unsigned int len)
}
EXPORT_SYMBOL_GPL(xdr_read_pages);
+size_t xdr_expand_hole(struct xdr_stream *xdr, size_t offset, uint64_t length)
+{
+ struct xdr_buf *buf = xdr->buf;
+ size_t from = 0;
+
+ if ((offset + length) < offset ||
+ (offset + length) > buf->page_len)
+ length = buf->page_len - offset;
+
+ if (offset == 0)
+ xdr_align_pages(xdr, xdr->nwords << 2);
+ else
+ from = xdr_page_pos(xdr);
+
+ _shift_data_right(buf, offset + length, from, xdr->nwords << 2);
+ _zero_data_pages(buf->pages, buf->page_base + offset, length);
+ xdr_set_page(xdr, offset + length);
+ return length;
+}
+EXPORT_SYMBOL_GPL(xdr_expand_hole);
+
/**
* xdr_enter_page - decode data from the XDR page
* @xdr: pointer to xdr_stream struct
--
2.25.0
^ permalink raw reply related
* [PATCH v2 1/6] SUNRPC: Split out a function for setting current page
From: schumaker.anna @ 2020-02-14 21:12 UTC (permalink / raw)
To: Trond.Myklebust, linux-nfs; +Cc: Anna.Schumaker
In-Reply-To: <20200214211227.407836-1-Anna.Schumaker@Netapp.com>
From: Anna Schumaker <Anna.Schumaker@Netapp.com>
I'm going to need this bit of code in a few places for READ_PLUS
decoding, so let's make it a helper function.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
---
net/sunrpc/xdr.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
index e5497dc2475b..8bed0ec21563 100644
--- a/net/sunrpc/xdr.c
+++ b/net/sunrpc/xdr.c
@@ -825,6 +825,12 @@ static int xdr_set_page_base(struct xdr_stream *xdr,
return 0;
}
+static void xdr_set_page(struct xdr_stream *xdr, unsigned int base)
+{
+ if (xdr_set_page_base(xdr, base, PAGE_SIZE) < 0)
+ xdr_set_iov(xdr, xdr->buf->tail, xdr->nwords << 2);
+}
+
static void xdr_set_next_page(struct xdr_stream *xdr)
{
unsigned int newbase;
@@ -832,8 +838,7 @@ static void xdr_set_next_page(struct xdr_stream *xdr)
newbase = (1 + xdr->page_ptr - xdr->buf->pages) << PAGE_SHIFT;
newbase -= xdr->buf->page_base;
- if (xdr_set_page_base(xdr, newbase, PAGE_SIZE) < 0)
- xdr_set_iov(xdr, xdr->buf->tail, xdr->nwords << 2);
+ xdr_set_page(xdr, newbase);
}
static bool xdr_set_next_buffer(struct xdr_stream *xdr)
--
2.25.0
^ permalink raw reply related
* [PATCH v2 0/6] NFS: Add support for the v4.2 READ_PLUS operation
From: schumaker.anna @ 2020-02-14 21:12 UTC (permalink / raw)
To: Trond.Myklebust, linux-nfs; +Cc: Anna.Schumaker
From: Anna Schumaker <Anna.Schumaker@Netapp.com>
These patches add client support for the READ_PLUS operation, which
breaks read requests into several "data" and "hole" segments when
replying to the client. I also add a "noreadplus" mount option to allow
users to disable the new operation if it becomes a problem, similar to
the "nordirplus" mount option that we already have.
Here are the results of some performance tests I ran on Netapp lab
machines. I tested by reading various 2G files from a few different
undelying filesystems and across several NFS versions. I used the
`vmtouch` utility to make sure files were only cached when we wanted
them to be. In addition to 100% data and 100% hole cases, I also tested
with files that alternate between data and hole segments. These files
have either 4K, 8K, 16K, or 32K segment sizes and start with either data
or hole segments. So the file mixed-4d has a 4K segment size beginning
with a data segment, but mixed-32h hase 32K segments beginning with a
hole. The units are in seconds, with the first number for each NFS
version being the uncached read time and the second number is for when
the file is cached on the server.
ext4 | v3 | v4.0 | v4.1 | v4.2 |
----------|-----------------|-----------------|-----------------|-----------------|
data | 22.909 : 18.253 | 22.934 : 18.252 | 22.902 : 18.253 | 23.485 : 18.253 |
hole | 18.256 : 18.253 | 18.255 : 18.252 | 18.256 : 18.253 | 0.708 : 0.709 |
mixed-4d | 28.261 : 18.253 | 29.616 : 18.252 | 28.341 : 18.252 | 24.508 : 9.150 |
mixed-8d | 27.956 : 18.253 | 28.404 : 18.252 | 28.320 : 18.252 | 23.967 : 9.140 |
mixed-16d | 28.172 : 18.253 | 27.946 : 18.252 | 27.627 : 18.252 | 23.043 : 9.134 |
mixed-32d | 25.350 : 18.253 | 24.406 : 18.252 | 24.384 : 18.253 | 20.698 : 9.132 |
mixed-4h | 28.913 : 18.253 | 28.564 : 18.252 | 27.996 : 18.252 | 21.837 : 9.150 |
mixed-8h | 28.625 : 18.253 | 27.833 : 18.252 | 27.798 : 18.253 | 21.710 : 9.140 |
mixed-16h | 27.975 : 18.253 | 27.662 : 18.252 | 27.795 : 18.253 | 20.585 : 9.134 |
mixed-32h | 25.958 : 18.253 | 25.491 : 18.252 | 24.856 : 18.252 | 21.018 : 9.132 |
xfs | v3 | v4.0 | v4.1 | v4.2 |
----------|-----------------|-----------------|-----------------|-----------------|
data | 22.041 : 18.253 | 22.618 : 18.252 | 23.067 : 18.253 | 23.496 : 18.253 |
hole | 18.256 : 18.253 | 18.255 : 18.252 | 18.256 : 18.253 | 0.723 : 0.708 |
mixed-4d | 29.417 : 18.253 | 28.503 : 18.252 | 28.671 : 18.253 | 24.957 : 9.150 |
mixed-8d | 29.080 : 18.253 | 29.401 : 18.252 | 29.251 : 18.252 | 24.625 : 9.140 |
mixed-16d | 27.638 : 18.253 | 28.606 : 18.252 | 27.871 : 18.253 | 25.511 : 9.135 |
mixed-32d | 24.967 : 18.253 | 25.239 : 18.252 | 25.434 : 18.252 | 21.728 : 9.132 |
mixed-4h | 34.816 : 18.253 | 36.243 : 18.252 | 35.837 : 18.252 | 32.332 : 9.150 |
mixed-8h | 43.469 : 18.253 | 44.009 : 18.252 | 43.810 : 18.253 | 37.962 : 9.140 |
mixed-16h | 29.280 : 18.253 | 28.563 : 18.252 | 28.241 : 18.252 | 22.116 : 9.134 |
mixed-32h | 29.428 : 18.253 | 29.378 : 18.252 | 28.808 : 18.253 | 27.378 : 9.134 |
btrfs | v3 | v4.0 | v4.1 | v4.2 |
----------|-----------------|-----------------|-----------------|-----------------|
data | 25.547 : 18.253 | 25.053 : 18.252 | 24.209 : 18.253 | 32.121 : 18.253 |
hole | 18.256 : 18.253 | 18.255 : 18.252 | 18.256 : 18.252 | 0.702 : 0.724 |
mixed-4d | 19.016 : 18.253 | 18.822 : 18.252 | 18.955 : 18.253 | 18.697 : 9.150 |
mixed-8d | 19.186 : 18.253 | 19.444 : 18.252 | 18.841 : 18.253 | 18.452 : 9.140 |
mixed-16d | 18.480 : 18.253 | 19.010 : 18.252 | 19.167 : 18.252 | 16.000 : 9.134 |
mixed-32d | 18.635 : 18.253 | 18.565 : 18.252 | 18.550 : 18.252 | 15.930 : 9.132 |
mixed-4h | 19.079 : 18.253 | 18.990 : 18.252 | 19.157 : 18.253 | 27.834 : 9.150 |
mixed-8h | 18.613 : 18.253 | 19.234 : 18.252 | 18.616 : 18.253 | 20.177 : 9.140 |
mixed-16h | 18.590 : 18.253 | 19.221 : 18.252 | 19.654 : 18.253 | 17.273 : 9.135 |
mixed-32h | 18.768 : 18.253 | 19.122 : 18.252 | 18.535 : 18.252 | 15.791 : 9.132 |
ext3 | v3 | v4.0 | v4.1 | v4.2 |
----------|-----------------|-----------------|-----------------|-----------------|
data | 34.292 : 18.253 | 33.810 : 18.252 | 33.450 : 18.253 | 33.390 : 18.254 |
hole | 18.256 : 18.253 | 18.255 : 18.252 | 18.256 : 18.253 | 0.718 : 0.728 |
mixed-4d | 46.818 : 18.253 | 47.140 : 18.252 | 48.385 : 18.253 | 42.887 : 9.150 |
mixed-8d | 58.554 : 18.253 | 59.277 : 18.252 | 59.673 : 18.253 | 56.760 : 9.140 |
mixed-16d | 44.631 : 18.253 | 44.291 : 18.252 | 44.729 : 18.253 | 40.237 : 9.135 |
mixed-32d | 39.110 : 18.253 | 38.735 : 18.252 | 38.902 : 18.252 | 35.270 : 9.132 |
mixed-4h | 56.396 : 18.253 | 56.387 : 18.252 | 56.573 : 18.253 | 67.661 : 9.150 |
mixed-8h | 58.483 : 18.253 | 58.484 : 18.252 | 59.099 : 18.253 | 77.958 : 9.140 |
mixed-16h | 42.511 : 18.253 | 42.338 : 18.252 | 42.356 : 18.252 | 51.805 : 9.135 |
mixed-32h | 38.419 : 18.253 | 38.504 : 18.252 | 38.643 : 18.252 | 40.411 : 9.132 |
Changes since v1:
- Rebase to 5.6-rc1
- Drop the mount option patch for now
- Fix fallback to READ when the server doesn't support READ_PLUS
Any questions?
Anna
Anna Schumaker (6):
SUNRPC: Split out a function for setting current page
SUNRPC: Add the ability to expand holes in data pages
SUNRPC: Add the ability to shift data to a specific offset
NFS: Add READ_PLUS data segment support
NFS: Add READ_PLUS hole segment decoding
NFS: Decode multiple READ_PLUS segments
fs/nfs/nfs42xdr.c | 169 +++++++++++++++++++++++++
fs/nfs/nfs4proc.c | 43 ++++++-
fs/nfs/nfs4xdr.c | 1 +
include/linux/nfs4.h | 2 +-
include/linux/nfs_fs_sb.h | 1 +
include/linux/nfs_xdr.h | 2 +-
include/linux/sunrpc/xdr.h | 2 +
net/sunrpc/xdr.c | 244 ++++++++++++++++++++++++++++++++++++-
8 files changed, 457 insertions(+), 7 deletions(-)
--
2.25.0
^ permalink raw reply
* [PATCH v2 4/4] NFSD: Encode a full READ_PLUS reply
From: schumaker.anna @ 2020-02-14 21:12 UTC (permalink / raw)
To: bfields, linux-nfs; +Cc: Anna.Schumaker
In-Reply-To: <20200214211206.407725-1-Anna.Schumaker@Netapp.com>
From: Anna Schumaker <Anna.Schumaker@Netapp.com>
Reply to the client with multiple hole and data segments. This might
have performance issues due to the number of calls to vfs_llseek(),
depending on the underlying filesystem used on the server.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
---
fs/nfsd/nfs4xdr.c | 41 +++++++++++++++++++++++++++++------------
1 file changed, 29 insertions(+), 12 deletions(-)
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 1a2f06de651d..44bd0b8deafb 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -4385,14 +4385,18 @@ nfsd4_encode_offload_status(struct nfsd4_compoundres *resp, __be32 nfserr,
static __be32
nfsd4_encode_read_plus_data(struct nfsd4_compoundres *resp,
- struct nfsd4_read *read,
- unsigned long maxcount, u32 *eof)
+ struct nfsd4_read *read, u32 *eof)
{
struct xdr_stream *xdr = &resp->xdr;
struct file *file = read->rd_nf->nf_file;
+ unsigned long maxcount = read->rd_length;
+ loff_t hole_pos = vfs_llseek(file, read->rd_offset, SEEK_HOLE);
__be32 nfserr;
__be32 *p;
+ if (hole_pos > read->rd_offset)
+ maxcount = min_t(unsigned long, maxcount, hole_pos - read->rd_offset);
+
/* Content type, offset, byte count */
p = xdr_reserve_space(xdr, 4 + 8 + 4);
if (!p)
@@ -4404,6 +4408,7 @@ nfsd4_encode_read_plus_data(struct nfsd4_compoundres *resp,
nfserr = nfsd4_encode_splice_read(resp, read, file, &maxcount, eof);
else
nfserr = nfsd4_encode_readv(resp, read, file, &maxcount, eof);
+ clear_bit(RQ_SPLICE_OK, &resp->rqstp->rq_flags);
if (nfserr)
return nfserr;
@@ -4418,18 +4423,24 @@ nfsd4_encode_read_plus_data(struct nfsd4_compoundres *resp,
}
static __be32
-nfsd4_encode_read_plus_hole(struct nfsd4_compoundres *resp, struct nfsd4_read *read,
- unsigned long maxcount, u32 *eof)
+nfsd4_encode_read_plus_hole(struct nfsd4_compoundres *resp,
+ struct nfsd4_read *read, u32 *eof, loff_t data_pos)
{
struct file *file = read->rd_nf->nf_file;
+ unsigned long maxcount = read->rd_length;
__be32 *p;
+ if (data_pos == 0)
+ data_pos = vfs_llseek(file, read->rd_offset, SEEK_DATA);
+ if (data_pos == -ENXIO)
+ data_pos = i_size_read(file_inode(file));
+
/* Content type, offset, byte count */
p = xdr_reserve_space(&resp->xdr, 4 + 8 + 8);
if (!p)
return nfserr_resource;
- maxcount = min_t(unsigned long, maxcount, read->rd_length);
+ maxcount = min_t(unsigned long, maxcount, data_pos - read->rd_offset);
*p++ = cpu_to_be32(NFS4_CONTENT_HOLE);
p = xdr_encode_hyper(p, read->rd_offset);
@@ -4453,6 +4464,7 @@ nfsd4_encode_read_plus(struct nfsd4_compoundres *resp, __be32 nfserr,
int starting_len = xdr->buf->len;
unsigned int segments = 0;
loff_t data_pos;
+ bool is_data;
__be32 *p;
if (nfserr)
@@ -4476,21 +4488,26 @@ nfsd4_encode_read_plus(struct nfsd4_compoundres *resp, __be32 nfserr,
maxcount = min_t(unsigned long, maxcount,
(xdr->buf->buflen - xdr->buf->len));
maxcount = min_t(unsigned long, maxcount, read->rd_length);
+ read->rd_length = maxcount;
data_pos = vfs_llseek(file, read->rd_offset, SEEK_DATA);
if (data_pos == -ENXIO)
data_pos = i_size_read(file_inode(file));
else if (data_pos < 0)
data_pos = read->rd_offset;
+ is_data = (data_pos == read->rd_offset);
+ eof = read->rd_offset > i_size_read(file_inode(file));
- if (data_pos > read->rd_offset) {
- nfserr = nfsd4_encode_read_plus_hole(resp, read,
- data_pos - read->rd_offset, &eof);
- segments++;
- }
+ while (read->rd_length > 0 && !eof) {
+ if (is_data)
+ nfserr = nfsd4_encode_read_plus_data(resp, read, &eof);
+ else
+ nfserr = nfsd4_encode_read_plus_hole(resp, read, &eof, data_pos);
- if (!nfserr && !eof && read->rd_length > 0) {
- nfserr = nfsd4_encode_read_plus_data(resp, read, maxcount, &eof);
+ if (nfserr)
+ break;
+ is_data = !is_data;
+ data_pos = 0;
segments++;
}
--
2.25.0
^ permalink raw reply related
* [PATCH v2 3/4] NFSD: Add READ_PLUS hole segment encoding
From: schumaker.anna @ 2020-02-14 21:12 UTC (permalink / raw)
To: bfields, linux-nfs; +Cc: Anna.Schumaker
In-Reply-To: <20200214211206.407725-1-Anna.Schumaker@Netapp.com>
From: Anna Schumaker <Anna.Schumaker@Netapp.com>
However, we only encode the hole if it is at the beginning of the range
and treat everything else as data to keep things simple.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
---
fs/nfsd/nfs4proc.c | 2 +-
fs/nfsd/nfs4xdr.c | 47 ++++++++++++++++++++++++++++++++++++++++++++--
2 files changed, 46 insertions(+), 3 deletions(-)
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 9643181591e3..c65939a1e40c 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -2527,7 +2527,7 @@ static inline u32 nfsd4_read_plus_rsize(struct svc_rqst *rqstp, struct nfsd4_op
u32 maxcount = svc_max_payload(rqstp);
u32 rlen = min(op->u.read.rd_length, maxcount);
/* enough extra xdr space for encoding either a hole or data segment. */
- u32 segments = 1 + 2 + 2;
+ u32 segments = 2 * (1 + 2 + 2);
return (op_encode_hdr_size + 2 + segments + XDR_QUADLEN(rlen)) * sizeof(__be32);
}
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 8efb59d4fda4..1a2f06de651d 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -4417,6 +4417,31 @@ nfsd4_encode_read_plus_data(struct nfsd4_compoundres *resp,
return nfserr;
}
+static __be32
+nfsd4_encode_read_plus_hole(struct nfsd4_compoundres *resp, struct nfsd4_read *read,
+ unsigned long maxcount, u32 *eof)
+{
+ struct file *file = read->rd_nf->nf_file;
+ __be32 *p;
+
+ /* Content type, offset, byte count */
+ p = xdr_reserve_space(&resp->xdr, 4 + 8 + 8);
+ if (!p)
+ return nfserr_resource;
+
+ maxcount = min_t(unsigned long, maxcount, read->rd_length);
+
+ *p++ = cpu_to_be32(NFS4_CONTENT_HOLE);
+ p = xdr_encode_hyper(p, read->rd_offset);
+ p = xdr_encode_hyper(p, maxcount);
+
+ *eof = (read->rd_offset + maxcount) >= i_size_read(file_inode(file));
+
+ read->rd_offset += maxcount;
+ read->rd_length = (maxcount > 0) ? read->rd_length - maxcount : 0;
+ return nfs_ok;
+}
+
static __be32
nfsd4_encode_read_plus(struct nfsd4_compoundres *resp, __be32 nfserr,
struct nfsd4_read *read)
@@ -4426,6 +4451,8 @@ nfsd4_encode_read_plus(struct nfsd4_compoundres *resp, __be32 nfserr,
struct xdr_stream *xdr = &resp->xdr;
struct file *file;
int starting_len = xdr->buf->len;
+ unsigned int segments = 0;
+ loff_t data_pos;
__be32 *p;
if (nfserr)
@@ -4450,12 +4477,28 @@ nfsd4_encode_read_plus(struct nfsd4_compoundres *resp, __be32 nfserr,
(xdr->buf->buflen - xdr->buf->len));
maxcount = min_t(unsigned long, maxcount, read->rd_length);
- nfserr = nfsd4_encode_read_plus_data(resp, read, maxcount, &eof);
+ data_pos = vfs_llseek(file, read->rd_offset, SEEK_DATA);
+ if (data_pos == -ENXIO)
+ data_pos = i_size_read(file_inode(file));
+ else if (data_pos < 0)
+ data_pos = read->rd_offset;
+
+ if (data_pos > read->rd_offset) {
+ nfserr = nfsd4_encode_read_plus_hole(resp, read,
+ data_pos - read->rd_offset, &eof);
+ segments++;
+ }
+
+ if (!nfserr && !eof && read->rd_length > 0) {
+ nfserr = nfsd4_encode_read_plus_data(resp, read, maxcount, &eof);
+ segments++;
+ }
+
if (nfserr)
xdr_truncate_encode(xdr, starting_len);
else {
*p++ = htonl(eof);
- *p++ = htonl(1);
+ *p++ = htonl(segments);
}
return nfserr;
--
2.25.0
^ permalink raw reply related
* [PATCH v2 2/4] NFSD: Add READ_PLUS data support
From: schumaker.anna @ 2020-02-14 21:12 UTC (permalink / raw)
To: bfields, linux-nfs; +Cc: Anna.Schumaker
In-Reply-To: <20200214211206.407725-1-Anna.Schumaker@Netapp.com>
From: Anna Schumaker <Anna.Schumaker@Netapp.com>
This patch adds READ_PLUS support for returning a single
NFS4_CONTENT_DATA segment to the client. This is basically the same as
the READ operation, only with the extra information about data segments.
Note that Wireshark 3.0 will report "malformed packed" when trying to
decode NFS4_CONTENT_DATA segments. This is because the actual data is
encoded as a variable length array, which RFC 4506 says should start
with a 32-bit length value. Wireshark incorrectly expects length to be a
64-bit value instead.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
---
fs/nfsd/nfs4proc.c | 17 +++++++++
fs/nfsd/nfs4xdr.c | 90 ++++++++++++++++++++++++++++++++++++++++++----
2 files changed, 101 insertions(+), 6 deletions(-)
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 0e75f7fb5fec..9643181591e3 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -2522,6 +2522,16 @@ static inline u32 nfsd4_read_rsize(struct svc_rqst *rqstp, struct nfsd4_op *op)
return (op_encode_hdr_size + 2 + XDR_QUADLEN(rlen)) * sizeof(__be32);
}
+static inline u32 nfsd4_read_plus_rsize(struct svc_rqst *rqstp, struct nfsd4_op *op)
+{
+ u32 maxcount = svc_max_payload(rqstp);
+ u32 rlen = min(op->u.read.rd_length, maxcount);
+ /* enough extra xdr space for encoding either a hole or data segment. */
+ u32 segments = 1 + 2 + 2;
+
+ return (op_encode_hdr_size + 2 + segments + XDR_QUADLEN(rlen)) * sizeof(__be32);
+}
+
static inline u32 nfsd4_readdir_rsize(struct svc_rqst *rqstp, struct nfsd4_op *op)
{
u32 maxcount = 0, rlen = 0;
@@ -3058,6 +3068,13 @@ static const struct nfsd4_operation nfsd4_ops[] = {
.op_name = "OP_COPY",
.op_rsize_bop = nfsd4_copy_rsize,
},
+ [OP_READ_PLUS] = {
+ .op_func = nfsd4_read,
+ .op_release = nfsd4_read_release,
+ .op_name = "OP_READ_PLUS",
+ .op_rsize_bop = nfsd4_read_plus_rsize,
+ .op_get_currentstateid = nfsd4_get_readstateid,
+ },
[OP_SEEK] = {
.op_func = nfsd4_seek,
.op_name = "OP_SEEK",
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 45f0623f6488..8efb59d4fda4 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -1957,7 +1957,7 @@ static const nfsd4_dec nfsd4_dec_ops[] = {
[OP_LAYOUTSTATS] = (nfsd4_dec)nfsd4_decode_notsupp,
[OP_OFFLOAD_CANCEL] = (nfsd4_dec)nfsd4_decode_offload_status,
[OP_OFFLOAD_STATUS] = (nfsd4_dec)nfsd4_decode_offload_status,
- [OP_READ_PLUS] = (nfsd4_dec)nfsd4_decode_notsupp,
+ [OP_READ_PLUS] = (nfsd4_dec)nfsd4_decode_read,
[OP_SEEK] = (nfsd4_dec)nfsd4_decode_seek,
[OP_WRITE_SAME] = (nfsd4_dec)nfsd4_decode_notsupp,
[OP_CLONE] = (nfsd4_dec)nfsd4_decode_clone,
@@ -3664,10 +3664,11 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,
if (nfserr)
xdr_truncate_encode(xdr, starting_len);
-
- read->rd_length = maxcount;
- *p++ = htonl(eof);
- *p++ = htonl(maxcount);
+ else {
+ read->rd_length = maxcount;
+ *p++ = htonl(eof);
+ *p++ = htonl(maxcount);
+ }
return nfserr;
}
@@ -4379,6 +4380,83 @@ nfsd4_encode_offload_status(struct nfsd4_compoundres *resp, __be32 nfserr,
return nfserr_resource;
p = xdr_encode_hyper(p, os->count);
*p++ = cpu_to_be32(0);
+ return nfserr;
+}
+
+static __be32
+nfsd4_encode_read_plus_data(struct nfsd4_compoundres *resp,
+ struct nfsd4_read *read,
+ unsigned long maxcount, u32 *eof)
+{
+ struct xdr_stream *xdr = &resp->xdr;
+ struct file *file = read->rd_nf->nf_file;
+ __be32 nfserr;
+ __be32 *p;
+
+ /* Content type, offset, byte count */
+ p = xdr_reserve_space(xdr, 4 + 8 + 4);
+ if (!p)
+ return nfserr_resource;
+ xdr_commit_encode(xdr);
+
+ if (file->f_op->splice_read &&
+ test_bit(RQ_SPLICE_OK, &resp->rqstp->rq_flags))
+ nfserr = nfsd4_encode_splice_read(resp, read, file, &maxcount, eof);
+ else
+ nfserr = nfsd4_encode_readv(resp, read, file, &maxcount, eof);
+
+ if (nfserr)
+ return nfserr;
+
+ *p++ = htonl(NFS4_CONTENT_DATA);
+ p = xdr_encode_hyper(p, read->rd_offset);
+ *p++ = htonl(maxcount);
+
+ read->rd_offset += maxcount;
+ read->rd_length = (maxcount > 0) ? read->rd_length - maxcount : 0;
+ return nfserr;
+}
+
+static __be32
+nfsd4_encode_read_plus(struct nfsd4_compoundres *resp, __be32 nfserr,
+ struct nfsd4_read *read)
+{
+ unsigned long maxcount;
+ u32 eof;
+ struct xdr_stream *xdr = &resp->xdr;
+ struct file *file;
+ int starting_len = xdr->buf->len;
+ __be32 *p;
+
+ if (nfserr)
+ return nfserr;
+ file = read->rd_nf->nf_file;
+
+ /* eof flag, segment count */
+ p = xdr_reserve_space(xdr, 4 + 4);
+ if (!p) {
+ WARN_ON_ONCE(test_bit(RQ_SPLICE_OK, &resp->rqstp->rq_flags));
+ return nfserr_resource;
+ }
+ if (resp->xdr.buf->page_len &&
+ test_bit(RQ_SPLICE_OK, &resp->rqstp->rq_flags)) {
+ WARN_ON_ONCE(1);
+ return nfserr_resource;
+ }
+ xdr_commit_encode(xdr);
+
+ maxcount = svc_max_payload(resp->rqstp);
+ maxcount = min_t(unsigned long, maxcount,
+ (xdr->buf->buflen - xdr->buf->len));
+ maxcount = min_t(unsigned long, maxcount, read->rd_length);
+
+ nfserr = nfsd4_encode_read_plus_data(resp, read, maxcount, &eof);
+ if (nfserr)
+ xdr_truncate_encode(xdr, starting_len);
+ else {
+ *p++ = htonl(eof);
+ *p++ = htonl(1);
+ }
return nfserr;
}
@@ -4521,7 +4599,7 @@ static const nfsd4_enc nfsd4_enc_ops[] = {
[OP_LAYOUTSTATS] = (nfsd4_enc)nfsd4_encode_noop,
[OP_OFFLOAD_CANCEL] = (nfsd4_enc)nfsd4_encode_noop,
[OP_OFFLOAD_STATUS] = (nfsd4_enc)nfsd4_encode_offload_status,
- [OP_READ_PLUS] = (nfsd4_enc)nfsd4_encode_noop,
+ [OP_READ_PLUS] = (nfsd4_enc)nfsd4_encode_read_plus,
[OP_SEEK] = (nfsd4_enc)nfsd4_encode_seek,
[OP_WRITE_SAME] = (nfsd4_enc)nfsd4_encode_noop,
[OP_CLONE] = (nfsd4_enc)nfsd4_encode_noop,
--
2.25.0
^ permalink raw reply related
* [PATCH v2 1/4] NFSD: Return eof and maxcount to nfsd4_encode_read()
From: schumaker.anna @ 2020-02-14 21:12 UTC (permalink / raw)
To: bfields, linux-nfs; +Cc: Anna.Schumaker
In-Reply-To: <20200214211206.407725-1-Anna.Schumaker@Netapp.com>
From: Anna Schumaker <Anna.Schumaker@Netapp.com>
I want to reuse nfsd4_encode_readv() and nfsd4_encode_splice_read() in
READ_PLUS rather than reimplementing them. READ_PLUS returns a single
eof flag for the entire call and a separate maxcount for each data
segment, so we need to have the READ call encode these values in a
different place.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
---
fs/nfsd/nfs4xdr.c | 60 ++++++++++++++++++++---------------------------
1 file changed, 26 insertions(+), 34 deletions(-)
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 9761512674a0..45f0623f6488 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3521,23 +3521,22 @@ nfsd4_encode_open_downgrade(struct nfsd4_compoundres *resp, __be32 nfserr, struc
static __be32 nfsd4_encode_splice_read(
struct nfsd4_compoundres *resp,
- struct nfsd4_read *read,
- struct file *file, unsigned long maxcount)
+ struct nfsd4_read *read, struct file *file,
+ unsigned long *maxcount, u32 *eof)
{
struct xdr_stream *xdr = &resp->xdr;
struct xdr_buf *buf = xdr->buf;
- u32 eof;
+ long len;
int space_left;
__be32 nfserr;
- __be32 *p = xdr->p - 2;
/* Make sure there will be room for padding if needed */
if (xdr->end - xdr->p < 1)
return nfserr_resource;
+ len = *maxcount;
nfserr = nfsd_splice_read(read->rd_rqstp, read->rd_fhp,
- file, read->rd_offset, &maxcount, &eof);
- read->rd_length = maxcount;
+ file, read->rd_offset, maxcount, eof);
if (nfserr) {
/*
* nfsd_splice_actor may have already messed with the
@@ -3548,24 +3547,21 @@ static __be32 nfsd4_encode_splice_read(
return nfserr;
}
- *(p++) = htonl(eof);
- *(p++) = htonl(maxcount);
-
- buf->page_len = maxcount;
- buf->len += maxcount;
- xdr->page_ptr += (buf->page_base + maxcount + PAGE_SIZE - 1)
+ buf->page_len = *maxcount;
+ buf->len += *maxcount;
+ xdr->page_ptr += (buf->page_base + *maxcount + PAGE_SIZE - 1)
/ PAGE_SIZE;
/* Use rest of head for padding and remaining ops: */
buf->tail[0].iov_base = xdr->p;
buf->tail[0].iov_len = 0;
xdr->iov = buf->tail;
- if (maxcount&3) {
- int pad = 4 - (maxcount&3);
+ if (*maxcount&3) {
+ int pad = 4 - (*maxcount&3);
*(xdr->p++) = 0;
- buf->tail[0].iov_base += maxcount&3;
+ buf->tail[0].iov_base += *maxcount&3;
buf->tail[0].iov_len = pad;
buf->len += pad;
}
@@ -3579,22 +3575,20 @@ static __be32 nfsd4_encode_splice_read(
}
static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
- struct nfsd4_read *read,
- struct file *file, unsigned long maxcount)
+ struct nfsd4_read *read, struct file *file,
+ unsigned long *maxcount, u32 *eof)
{
struct xdr_stream *xdr = &resp->xdr;
- u32 eof;
int v;
int starting_len = xdr->buf->len - 8;
long len;
int thislen;
__be32 nfserr;
- __be32 tmp;
__be32 *p;
u32 zzz = 0;
int pad;
- len = maxcount;
+ len = *maxcount;
v = 0;
thislen = min_t(long, len, ((void *)xdr->end - (void *)xdr->p));
@@ -3616,22 +3610,15 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
}
read->rd_vlen = v;
- len = maxcount;
+ len = *maxcount;
nfserr = nfsd_readv(resp->rqstp, read->rd_fhp, file, read->rd_offset,
- resp->rqstp->rq_vec, read->rd_vlen, &maxcount,
- &eof);
- read->rd_length = maxcount;
+ resp->rqstp->rq_vec, read->rd_vlen, maxcount, eof);
if (nfserr)
return nfserr;
- xdr_truncate_encode(xdr, starting_len + 8 + ((maxcount+3)&~3));
+ xdr_truncate_encode(xdr, starting_len + 8 + ((*maxcount+3)&~3));
- tmp = htonl(eof);
- write_bytes_to_xdr_buf(xdr->buf, starting_len , &tmp, 4);
- tmp = htonl(maxcount);
- write_bytes_to_xdr_buf(xdr->buf, starting_len + 4, &tmp, 4);
-
- pad = (maxcount&3) ? 4 - (maxcount&3) : 0;
- write_bytes_to_xdr_buf(xdr->buf, starting_len + 8 + maxcount,
+ pad = (*maxcount&3) ? 4 - (*maxcount&3) : 0;
+ write_bytes_to_xdr_buf(xdr->buf, starting_len + 8 + *maxcount,
&zzz, pad);
return 0;
@@ -3642,6 +3629,7 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,
struct nfsd4_read *read)
{
unsigned long maxcount;
+ u32 eof;
struct xdr_stream *xdr = &resp->xdr;
struct file *file;
int starting_len = xdr->buf->len;
@@ -3670,13 +3658,17 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,
if (file->f_op->splice_read &&
test_bit(RQ_SPLICE_OK, &resp->rqstp->rq_flags))
- nfserr = nfsd4_encode_splice_read(resp, read, file, maxcount);
+ nfserr = nfsd4_encode_splice_read(resp, read, file, &maxcount, &eof);
else
- nfserr = nfsd4_encode_readv(resp, read, file, maxcount);
+ nfserr = nfsd4_encode_readv(resp, read, file, &maxcount, &eof);
if (nfserr)
xdr_truncate_encode(xdr, starting_len);
+ read->rd_length = maxcount;
+ *p++ = htonl(eof);
+ *p++ = htonl(maxcount);
+
return nfserr;
}
--
2.25.0
^ permalink raw reply related
* [PATCH v2 0/4] NFSD: Add support for the v4.2 READ_PLUS operation
From: schumaker.anna @ 2020-02-14 21:12 UTC (permalink / raw)
To: bfields, linux-nfs; +Cc: Anna.Schumaker
From: Anna Schumaker <Anna.Schumaker@Netapp.com>
These patches add server support for the READ_PLUS operation, which
breaks read requests into several "data" and "hole" segments when
replying to the client.
Here are the results of some performance tests I ran on Netapp lab
machines. I tested by reading various 2G files from a few different
undelying filesystems and across several NFS versions. I used the
`vmtouch` utility to make sure files were only cached when we wanted
them to be. In addition to 100% data and 100% hole cases, I also tested
with files that alternate between data and hole segments. These files
have either 4K, 8K, 16K, or 32K segment sizes and start with either data
or hole segments. So the file mixed-4d has a 4K segment size beginning
with a data segment, but mixed-32h hase 32K segments beginning with a
hole. The units are in seconds, with the first number for each NFS
version being the uncached read time and the second number is for when
the file is cached on the server.
ext4 | v3 | v4.0 | v4.1 | v4.2 |
----------|-----------------|-----------------|-----------------|-----------------|
data | 22.909 : 18.253 | 22.934 : 18.252 | 22.902 : 18.253 | 23.485 : 18.253 |
hole | 18.256 : 18.253 | 18.255 : 18.252 | 18.256 : 18.253 | 0.708 : 0.709 |
mixed-4d | 28.261 : 18.253 | 29.616 : 18.252 | 28.341 : 18.252 | 24.508 : 9.150 |
mixed-8d | 27.956 : 18.253 | 28.404 : 18.252 | 28.320 : 18.252 | 23.967 : 9.140 |
mixed-16d | 28.172 : 18.253 | 27.946 : 18.252 | 27.627 : 18.252 | 23.043 : 9.134 |
mixed-32d | 25.350 : 18.253 | 24.406 : 18.252 | 24.384 : 18.253 | 20.698 : 9.132 |
mixed-4h | 28.913 : 18.253 | 28.564 : 18.252 | 27.996 : 18.252 | 21.837 : 9.150 |
mixed-8h | 28.625 : 18.253 | 27.833 : 18.252 | 27.798 : 18.253 | 21.710 : 9.140 |
mixed-16h | 27.975 : 18.253 | 27.662 : 18.252 | 27.795 : 18.253 | 20.585 : 9.134 |
mixed-32h | 25.958 : 18.253 | 25.491 : 18.252 | 24.856 : 18.252 | 21.018 : 9.132 |
xfs | v3 | v4.0 | v4.1 | v4.2 |
----------|-----------------|-----------------|-----------------|-----------------|
data | 22.041 : 18.253 | 22.618 : 18.252 | 23.067 : 18.253 | 23.496 : 18.253 |
hole | 18.256 : 18.253 | 18.255 : 18.252 | 18.256 : 18.253 | 0.723 : 0.708 |
mixed-4d | 29.417 : 18.253 | 28.503 : 18.252 | 28.671 : 18.253 | 24.957 : 9.150 |
mixed-8d | 29.080 : 18.253 | 29.401 : 18.252 | 29.251 : 18.252 | 24.625 : 9.140 |
mixed-16d | 27.638 : 18.253 | 28.606 : 18.252 | 27.871 : 18.253 | 25.511 : 9.135 |
mixed-32d | 24.967 : 18.253 | 25.239 : 18.252 | 25.434 : 18.252 | 21.728 : 9.132 |
mixed-4h | 34.816 : 18.253 | 36.243 : 18.252 | 35.837 : 18.252 | 32.332 : 9.150 |
mixed-8h | 43.469 : 18.253 | 44.009 : 18.252 | 43.810 : 18.253 | 37.962 : 9.140 |
mixed-16h | 29.280 : 18.253 | 28.563 : 18.252 | 28.241 : 18.252 | 22.116 : 9.134 |
mixed-32h | 29.428 : 18.253 | 29.378 : 18.252 | 28.808 : 18.253 | 27.378 : 9.134 |
btrfs | v3 | v4.0 | v4.1 | v4.2 |
----------|-----------------|-----------------|-----------------|-----------------|
data | 25.547 : 18.253 | 25.053 : 18.252 | 24.209 : 18.253 | 32.121 : 18.253 |
hole | 18.256 : 18.253 | 18.255 : 18.252 | 18.256 : 18.252 | 0.702 : 0.724 |
mixed-4d | 19.016 : 18.253 | 18.822 : 18.252 | 18.955 : 18.253 | 18.697 : 9.150 |
mixed-8d | 19.186 : 18.253 | 19.444 : 18.252 | 18.841 : 18.253 | 18.452 : 9.140 |
mixed-16d | 18.480 : 18.253 | 19.010 : 18.252 | 19.167 : 18.252 | 16.000 : 9.134 |
mixed-32d | 18.635 : 18.253 | 18.565 : 18.252 | 18.550 : 18.252 | 15.930 : 9.132 |
mixed-4h | 19.079 : 18.253 | 18.990 : 18.252 | 19.157 : 18.253 | 27.834 : 9.150 |
mixed-8h | 18.613 : 18.253 | 19.234 : 18.252 | 18.616 : 18.253 | 20.177 : 9.140 |
mixed-16h | 18.590 : 18.253 | 19.221 : 18.252 | 19.654 : 18.253 | 17.273 : 9.135 |
mixed-32h | 18.768 : 18.253 | 19.122 : 18.252 | 18.535 : 18.252 | 15.791 : 9.132 |
ext3 | v3 | v4.0 | v4.1 | v4.2 |
----------|-----------------|-----------------|-----------------|-----------------|
data | 34.292 : 18.253 | 33.810 : 18.252 | 33.450 : 18.253 | 33.390 : 18.254 |
hole | 18.256 : 18.253 | 18.255 : 18.252 | 18.256 : 18.253 | 0.718 : 0.728 |
mixed-4d | 46.818 : 18.253 | 47.140 : 18.252 | 48.385 : 18.253 | 42.887 : 9.150 |
mixed-8d | 58.554 : 18.253 | 59.277 : 18.252 | 59.673 : 18.253 | 56.760 : 9.140 |
mixed-16d | 44.631 : 18.253 | 44.291 : 18.252 | 44.729 : 18.253 | 40.237 : 9.135 |
mixed-32d | 39.110 : 18.253 | 38.735 : 18.252 | 38.902 : 18.252 | 35.270 : 9.132 |
mixed-4h | 56.396 : 18.253 | 56.387 : 18.252 | 56.573 : 18.253 | 67.661 : 9.150 |
mixed-8h | 58.483 : 18.253 | 58.484 : 18.252 | 59.099 : 18.253 | 77.958 : 9.140 |
mixed-16h | 42.511 : 18.253 | 42.338 : 18.252 | 42.356 : 18.252 | 51.805 : 9.135 |
mixed-32h | 38.419 : 18.253 | 38.504 : 18.252 | 38.643 : 18.252 | 40.411 : 9.132 |
Any questions?
Anna
Anna Schumaker (4):
NFSD: Return eof and maxcount to nfsd4_encode_read()
NFSD: Add READ_PLUS data support
NFSD: Add READ_PLUS hole segment encoding
NFSD: Encode a full READ_PLUS reply
fs/nfsd/nfs4proc.c | 17 ++++
fs/nfsd/nfs4xdr.c | 202 +++++++++++++++++++++++++++++++++++++--------
2 files changed, 183 insertions(+), 36 deletions(-)
--
2.25.0
^ permalink raw reply
* [PATCH 8/8] btrfs: kill the subvol_srcu
From: Josef Bacik @ 2020-02-14 21:11 UTC (permalink / raw)
To: linux-btrfs, kernel-team
In-Reply-To: <20200214211147.24610-1-josef@toxicpanda.com>
Now that we have proper root ref counting everywhere we can kill the
subvol_srcu.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
fs/btrfs/backref.c | 12 +-----------
fs/btrfs/ctree.h | 1 -
fs/btrfs/disk-io.c | 37 +++++++++---------------------------
fs/btrfs/export.c | 21 ++++----------------
fs/btrfs/file.c | 5 -----
fs/btrfs/inode.c | 3 ---
fs/btrfs/send.c | 14 --------------
fs/btrfs/tests/btrfs-tests.c | 9 ---------
8 files changed, 14 insertions(+), 88 deletions(-)
diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
index ded46efac27d..9d0f87df2c35 100644
--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -512,23 +512,18 @@ static int resolve_indirect_ref(struct btrfs_fs_info *fs_info,
int ret = 0;
int root_level;
int level = ref->level;
- int index;
root_key.objectid = ref->root_id;
root_key.type = BTRFS_ROOT_ITEM_KEY;
root_key.offset = (u64)-1;
- index = srcu_read_lock(&fs_info->subvol_srcu);
-
root = btrfs_get_fs_root(fs_info, &root_key, false);
if (IS_ERR(root)) {
- srcu_read_unlock(&fs_info->subvol_srcu, index);
ret = PTR_ERR(root);
goto out_free;
}
if (btrfs_is_testing(fs_info)) {
- srcu_read_unlock(&fs_info->subvol_srcu, index);
ret = -ENOENT;
goto out;
}
@@ -540,10 +535,8 @@ static int resolve_indirect_ref(struct btrfs_fs_info *fs_info,
else
root_level = btrfs_old_root_level(root, time_seq);
- if (root_level + 1 == level) {
- srcu_read_unlock(&fs_info->subvol_srcu, index);
+ if (root_level + 1 == level)
goto out;
- }
path->lowest_level = level;
if (time_seq == SEQ_LAST)
@@ -553,9 +546,6 @@ static int resolve_indirect_ref(struct btrfs_fs_info *fs_info,
ret = btrfs_search_old_slot(root, &ref->key_for_search, path,
time_seq);
- /* root node has been locked, we can release @subvol_srcu safely here */
- srcu_read_unlock(&fs_info->subvol_srcu, index);
-
btrfs_debug(fs_info,
"search slot in root %llu (level %d, ref count %d) returned %d for key (%llu %u %llu)",
ref->root_id, level, ref->count, ret,
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 000ad54629e3..069868a37785 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -696,7 +696,6 @@ struct btrfs_fs_info {
struct rw_semaphore cleanup_work_sem;
struct rw_semaphore subvol_sem;
- struct srcu_struct subvol_srcu;
spinlock_t trans_lock;
/*
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index ca46ca3e9dca..9159da5b0a67 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2806,46 +2806,33 @@ static int init_mount_fs_info(struct btrfs_fs_info *fs_info, struct super_block
sb->s_blocksize = BTRFS_BDEV_BLOCKSIZE;
sb->s_blocksize_bits = blksize_bits(BTRFS_BDEV_BLOCKSIZE);
- ret = init_srcu_struct(&fs_info->subvol_srcu);
- if (ret)
- return ret;
-
ret = percpu_counter_init(&fs_info->dio_bytes, 0, GFP_KERNEL);
if (ret)
- goto fail;
+ return ret;
ret = percpu_counter_init(&fs_info->dirty_metadata_bytes, 0, GFP_KERNEL);
if (ret)
- goto fail;
+ return ret;
fs_info->dirty_metadata_batch = PAGE_SIZE *
(1 + ilog2(nr_cpu_ids));
ret = percpu_counter_init(&fs_info->delalloc_bytes, 0, GFP_KERNEL);
if (ret)
- goto fail;
+ return ret;
ret = percpu_counter_init(&fs_info->dev_replace.bio_counter, 0,
GFP_KERNEL);
if (ret)
- goto fail;
+ return ret;
fs_info->delayed_root = kmalloc(sizeof(struct btrfs_delayed_root),
GFP_KERNEL);
- if (!fs_info->delayed_root) {
- ret = -ENOMEM;
- goto fail;
- }
+ if (!fs_info->delayed_root)
+ return -ENOMEM;
btrfs_init_delayed_root(fs_info->delayed_root);
- ret = btrfs_alloc_stripe_hash_table(fs_info);
- if (ret)
- goto fail;
-
- return 0;
-fail:
- cleanup_srcu_struct(&fs_info->subvol_srcu);
- return ret;
+ return btrfs_alloc_stripe_hash_table(fs_info);
}
int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_devices,
@@ -2883,13 +2870,13 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
fs_info->chunk_root = chunk_root;
if (!tree_root || !chunk_root) {
err = -ENOMEM;
- goto fail_srcu;
+ goto fail;
}
fs_info->btree_inode = new_inode(sb);
if (!fs_info->btree_inode) {
err = -ENOMEM;
- goto fail_srcu;
+ goto fail;
}
mapping_set_gfp_mask(fs_info->btree_inode->i_mapping, GFP_NOFS);
btrfs_init_btree_inode(fs_info);
@@ -3398,8 +3385,6 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
btrfs_mapping_tree_free(&fs_info->mapping_tree);
iput(fs_info->btree_inode);
-fail_srcu:
- cleanup_srcu_struct(&fs_info->subvol_srcu);
fail:
btrfs_close_devices(fs_info->fs_devices);
return err;
@@ -3894,9 +3879,6 @@ void btrfs_drop_and_free_fs_root(struct btrfs_fs_info *fs_info,
drop_ref = true;
spin_unlock(&fs_info->fs_roots_radix_lock);
- if (btrfs_root_refs(&root->root_item) == 0)
- synchronize_srcu(&fs_info->subvol_srcu);
-
if (test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state)) {
btrfs_free_log(NULL, root);
if (root->reloc_root) {
@@ -4095,7 +4077,6 @@ void __cold close_ctree(struct btrfs_fs_info *fs_info)
btrfs_mapping_tree_free(&fs_info->mapping_tree);
btrfs_close_devices(fs_info->fs_devices);
- cleanup_srcu_struct(&fs_info->subvol_srcu);
}
int btrfs_buffer_uptodate(struct extent_buffer *buf, u64 parent_transid,
diff --git a/fs/btrfs/export.c b/fs/btrfs/export.c
index 657fd6ad6e18..7f9dd8384083 100644
--- a/fs/btrfs/export.c
+++ b/fs/btrfs/export.c
@@ -65,8 +65,6 @@ static struct dentry *btrfs_get_dentry(struct super_block *sb, u64 objectid,
struct btrfs_root *root;
struct inode *inode;
struct btrfs_key key;
- int index;
- int err = 0;
if (objectid < BTRFS_FIRST_FREE_OBJECTID)
return ERR_PTR(-ESTALE);
@@ -75,13 +73,9 @@ static struct dentry *btrfs_get_dentry(struct super_block *sb, u64 objectid,
key.type = BTRFS_ROOT_ITEM_KEY;
key.offset = (u64)-1;
- index = srcu_read_lock(&fs_info->subvol_srcu);
-
root = btrfs_get_fs_root(fs_info, &key, true);
- if (IS_ERR(root)) {
- err = PTR_ERR(root);
- goto fail;
- }
+ if (IS_ERR(root))
+ return ERR_CAST(root);
key.objectid = objectid;
key.type = BTRFS_INODE_ITEM_KEY;
@@ -89,12 +83,8 @@ static struct dentry *btrfs_get_dentry(struct super_block *sb, u64 objectid,
inode = btrfs_iget(sb, &key, root);
btrfs_put_root(root);
- if (IS_ERR(inode)) {
- err = PTR_ERR(inode);
- goto fail;
- }
-
- srcu_read_unlock(&fs_info->subvol_srcu, index);
+ if (IS_ERR(inode))
+ return ERR_CAST(inode);
if (check_generation && generation != inode->i_generation) {
iput(inode);
@@ -102,9 +92,6 @@ static struct dentry *btrfs_get_dentry(struct super_block *sb, u64 objectid,
}
return d_obtain_alias(inode);
-fail:
- srcu_read_unlock(&fs_info->subvol_srcu, index);
- return ERR_PTR(err);
}
static struct dentry *btrfs_fh_to_parent(struct super_block *sb, struct fid *fh,
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index a37feadd9344..5cdcdbdd908b 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -277,7 +277,6 @@ static int __btrfs_run_defrag_inode(struct btrfs_fs_info *fs_info,
struct btrfs_key key;
struct btrfs_ioctl_defrag_range_args range;
int num_defrag;
- int index;
int ret;
/* get the inode */
@@ -285,8 +284,6 @@ static int __btrfs_run_defrag_inode(struct btrfs_fs_info *fs_info,
key.type = BTRFS_ROOT_ITEM_KEY;
key.offset = (u64)-1;
- index = srcu_read_lock(&fs_info->subvol_srcu);
-
inode_root = btrfs_get_fs_root(fs_info, &key, true);
if (IS_ERR(inode_root)) {
ret = PTR_ERR(inode_root);
@@ -302,7 +299,6 @@ static int __btrfs_run_defrag_inode(struct btrfs_fs_info *fs_info,
ret = PTR_ERR(inode);
goto cleanup;
}
- srcu_read_unlock(&fs_info->subvol_srcu, index);
/* do a chunk of defrag */
clear_bit(BTRFS_INODE_IN_DEFRAG, &BTRFS_I(inode)->runtime_flags);
@@ -338,7 +334,6 @@ static int __btrfs_run_defrag_inode(struct btrfs_fs_info *fs_info,
iput(inode);
return 0;
cleanup:
- srcu_read_unlock(&fs_info->subvol_srcu, index);
kmem_cache_free(btrfs_inode_defrag_cachep, defrag);
return ret;
}
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 338283bb66f1..b2dbfb25c3d2 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -5363,7 +5363,6 @@ struct inode *btrfs_lookup_dentry(struct inode *dir, struct dentry *dentry)
struct btrfs_root *sub_root = root;
struct btrfs_key location;
u8 di_type = 0;
- int index;
int ret = 0;
if (dentry->d_name.len > BTRFS_NAME_LEN)
@@ -5390,7 +5389,6 @@ struct inode *btrfs_lookup_dentry(struct inode *dir, struct dentry *dentry)
return inode;
}
- index = srcu_read_lock(&fs_info->subvol_srcu);
ret = fixup_tree_root_location(fs_info, dir, dentry,
&location, &sub_root);
if (ret < 0) {
@@ -5403,7 +5401,6 @@ struct inode *btrfs_lookup_dentry(struct inode *dir, struct dentry *dentry)
}
if (root != sub_root)
btrfs_put_root(sub_root);
- srcu_read_unlock(&fs_info->subvol_srcu, index);
if (!IS_ERR(inode) && root != sub_root) {
down_read(&fs_info->cleanup_work_sem);
diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 6b86841315be..21e1c77832ed 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -7066,7 +7066,6 @@ long btrfs_ioctl_send(struct file *mnt_file, struct btrfs_ioctl_send_args *arg)
int clone_sources_to_rollback = 0;
unsigned alloc_size;
int sort_clone_roots = 0;
- int index;
if (!capable(CAP_SYS_ADMIN))
return -EPERM;
@@ -7193,11 +7192,8 @@ long btrfs_ioctl_send(struct file *mnt_file, struct btrfs_ioctl_send_args *arg)
key.type = BTRFS_ROOT_ITEM_KEY;
key.offset = (u64)-1;
- index = srcu_read_lock(&fs_info->subvol_srcu);
-
clone_root = btrfs_get_fs_root(fs_info, &key, true);
if (IS_ERR(clone_root)) {
- srcu_read_unlock(&fs_info->subvol_srcu, index);
ret = PTR_ERR(clone_root);
goto out;
}
@@ -7206,7 +7202,6 @@ long btrfs_ioctl_send(struct file *mnt_file, struct btrfs_ioctl_send_args *arg)
btrfs_root_dead(clone_root)) {
spin_unlock(&clone_root->root_item_lock);
btrfs_put_root(clone_root);
- srcu_read_unlock(&fs_info->subvol_srcu, index);
ret = -EPERM;
goto out;
}
@@ -7214,13 +7209,11 @@ long btrfs_ioctl_send(struct file *mnt_file, struct btrfs_ioctl_send_args *arg)
dedupe_in_progress_warn(clone_root);
spin_unlock(&clone_root->root_item_lock);
btrfs_put_root(clone_root);
- srcu_read_unlock(&fs_info->subvol_srcu, index);
ret = -EAGAIN;
goto out;
}
clone_root->send_in_progress++;
spin_unlock(&clone_root->root_item_lock);
- srcu_read_unlock(&fs_info->subvol_srcu, index);
sctx->clone_roots[i].root = clone_root;
clone_sources_to_rollback = i + 1;
@@ -7234,11 +7227,8 @@ long btrfs_ioctl_send(struct file *mnt_file, struct btrfs_ioctl_send_args *arg)
key.type = BTRFS_ROOT_ITEM_KEY;
key.offset = (u64)-1;
- index = srcu_read_lock(&fs_info->subvol_srcu);
-
sctx->parent_root = btrfs_get_fs_root(fs_info, &key, true);
if (IS_ERR(sctx->parent_root)) {
- srcu_read_unlock(&fs_info->subvol_srcu, index);
ret = PTR_ERR(sctx->parent_root);
goto out;
}
@@ -7248,20 +7238,16 @@ long btrfs_ioctl_send(struct file *mnt_file, struct btrfs_ioctl_send_args *arg)
if (!btrfs_root_readonly(sctx->parent_root) ||
btrfs_root_dead(sctx->parent_root)) {
spin_unlock(&sctx->parent_root->root_item_lock);
- srcu_read_unlock(&fs_info->subvol_srcu, index);
ret = -EPERM;
goto out;
}
if (sctx->parent_root->dedupe_in_progress) {
dedupe_in_progress_warn(sctx->parent_root);
spin_unlock(&sctx->parent_root->root_item_lock);
- srcu_read_unlock(&fs_info->subvol_srcu, index);
ret = -EAGAIN;
goto out;
}
spin_unlock(&sctx->parent_root->root_item_lock);
-
- srcu_read_unlock(&fs_info->subvol_srcu, index);
}
/*
diff --git a/fs/btrfs/tests/btrfs-tests.c b/fs/btrfs/tests/btrfs-tests.c
index 42e62fd2809c..999c14e5d0bd 100644
--- a/fs/btrfs/tests/btrfs-tests.c
+++ b/fs/btrfs/tests/btrfs-tests.c
@@ -134,14 +134,6 @@ struct btrfs_fs_info *btrfs_alloc_dummy_fs_info(u32 nodesize, u32 sectorsize)
fs_info->nodesize = nodesize;
fs_info->sectorsize = sectorsize;
-
- if (init_srcu_struct(&fs_info->subvol_srcu)) {
- kfree(fs_info->fs_devices);
- kfree(fs_info->super_copy);
- kfree(fs_info);
- return NULL;
- }
-
set_bit(BTRFS_FS_STATE_DUMMY_FS_INFO, &fs_info->fs_state);
test_mnt->mnt_sb->s_fs_info = fs_info;
@@ -191,7 +183,6 @@ void btrfs_free_dummy_fs_info(struct btrfs_fs_info *fs_info)
}
btrfs_free_qgroup_config(fs_info);
btrfs_free_fs_roots(fs_info);
- cleanup_srcu_struct(&fs_info->subvol_srcu);
kfree(fs_info->super_copy);
btrfs_check_leaked_roots(fs_info);
btrfs_extent_buffer_leak_debug_check(fs_info);
--
2.24.1
^ permalink raw reply related
* [PATCH 7/8] btrfs: make btrfs_cleanup_fs_roots use the fs_roots_radix_lock
From: Josef Bacik @ 2020-02-14 21:11 UTC (permalink / raw)
To: linux-btrfs, kernel-team
In-Reply-To: <20200214211147.24610-1-josef@toxicpanda.com>
The radix root is primarily protected by the fs_roots_radix_lock, so use
that to lookup and get a ref on all of our fs roots in
btrfs_cleanup_fs_roots.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
fs/btrfs/disk-io.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 052690799d64..ca46ca3e9dca 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3924,15 +3924,14 @@ int btrfs_cleanup_fs_roots(struct btrfs_fs_info *fs_info)
int i = 0;
int err = 0;
unsigned int ret = 0;
- int index;
while (1) {
- index = srcu_read_lock(&fs_info->subvol_srcu);
+ spin_lock(&fs_info->fs_roots_radix_lock);
ret = radix_tree_gang_lookup(&fs_info->fs_roots_radix,
(void **)gang, root_objectid,
ARRAY_SIZE(gang));
if (!ret) {
- srcu_read_unlock(&fs_info->subvol_srcu, index);
+ spin_unlock(&fs_info->fs_roots_radix_lock);
break;
}
root_objectid = gang[ret - 1]->root_key.objectid + 1;
@@ -3946,7 +3945,7 @@ int btrfs_cleanup_fs_roots(struct btrfs_fs_info *fs_info)
/* grab all the search result for later use */
gang[i] = btrfs_grab_root(gang[i]);
}
- srcu_read_unlock(&fs_info->subvol_srcu, index);
+ spin_unlock(&fs_info->fs_roots_radix_lock);
for (i = 0; i < ret; i++) {
if (!gang[i])
--
2.24.1
^ permalink raw reply related
* [PATCH 6/8] btrfs: don't take an extra root ref at allocation time
From: Josef Bacik @ 2020-02-14 21:11 UTC (permalink / raw)
To: linux-btrfs, kernel-team
In-Reply-To: <20200214211147.24610-1-josef@toxicpanda.com>
Now that all the users of roots take references for them we can drop the
extra root ref we've been taking. Before we had roots at 2 refs for the
life of the file system, one for the radix tree, and one simply for
existing. Now that we have proper ref accounting in all places that use
roots we can drop this extra ref simply for existing as we no longer
need it.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
fs/btrfs/disk-io.c | 20 ++++++--------------
fs/btrfs/tests/qgroup-tests.c | 2 ++
2 files changed, 8 insertions(+), 14 deletions(-)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index ffbc4c6c17fe..052690799d64 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1645,22 +1645,11 @@ struct btrfs_root *btrfs_get_fs_root(struct btrfs_fs_info *fs_info,
if (ret == 0)
set_bit(BTRFS_ROOT_ORPHAN_ITEM_INSERTED, &root->state);
- /*
- * All roots have two refs on them at all times, one for the mounted fs,
- * and one for being in the radix tree. This way we only free the root
- * when we are unmounting or deleting the subvolume. We get one ref
- * from __setup_root, one for inserting it into the radix tree, and then
- * we have the third for returning it, and the caller will put it when
- * it's done with the root.
- */
- btrfs_grab_root(root);
ret = btrfs_insert_fs_root(fs_info, root);
if (ret) {
btrfs_put_root(root);
- if (ret == -EEXIST) {
- btrfs_put_root(root);
+ if (ret == -EEXIST)
goto again;
- }
goto fail;
}
return root;
@@ -3896,11 +3885,13 @@ int write_all_supers(struct btrfs_fs_info *fs_info, int max_mirrors)
void btrfs_drop_and_free_fs_root(struct btrfs_fs_info *fs_info,
struct btrfs_root *root)
{
+ bool drop_ref = false;
+
spin_lock(&fs_info->fs_roots_radix_lock);
radix_tree_delete(&fs_info->fs_roots_radix,
(unsigned long)root->root_key.objectid);
if (test_and_clear_bit(BTRFS_ROOT_IN_RADIX, &root->state))
- btrfs_put_root(root);
+ drop_ref = true;
spin_unlock(&fs_info->fs_roots_radix_lock);
if (btrfs_root_refs(&root->root_item) == 0)
@@ -3922,7 +3913,8 @@ void btrfs_drop_and_free_fs_root(struct btrfs_fs_info *fs_info,
iput(root->ino_cache_inode);
root->ino_cache_inode = NULL;
}
- btrfs_put_root(root);
+ if (drop_ref)
+ btrfs_put_root(root);
}
int btrfs_cleanup_fs_roots(struct btrfs_fs_info *fs_info)
diff --git a/fs/btrfs/tests/qgroup-tests.c b/fs/btrfs/tests/qgroup-tests.c
index ac035a6fa003..ce1ca8e73c2d 100644
--- a/fs/btrfs/tests/qgroup-tests.c
+++ b/fs/btrfs/tests/qgroup-tests.c
@@ -507,6 +507,7 @@ int btrfs_test_qgroups(u32 sectorsize, u32 nodesize)
test_err("couldn't insert fs root %d", ret);
goto out;
}
+ btrfs_put_root(tmp_root);
tmp_root = btrfs_alloc_dummy_root(fs_info);
if (IS_ERR(tmp_root)) {
@@ -521,6 +522,7 @@ int btrfs_test_qgroups(u32 sectorsize, u32 nodesize)
test_err("couldn't insert fs root %d", ret);
goto out;
}
+ btrfs_put_root(tmp_root);
test_msg("running qgroup tests");
ret = test_no_shared_qgroup(root, sectorsize, nodesize);
--
2.24.1
^ permalink raw reply related
* [PATCH 5/8] btrfs: hold a ref on the root on the dead roots list
From: Josef Bacik @ 2020-02-14 21:11 UTC (permalink / raw)
To: linux-btrfs, kernel-team
In-Reply-To: <20200214211147.24610-1-josef@toxicpanda.com>
At the point we add a root to the dead roots list we have no open inodes
for that root, so we need to hold a ref on that root to keep it from
disappearing.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
fs/btrfs/disk-io.c | 3 +--
fs/btrfs/transaction.c | 6 ++++--
2 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 26eb1564028d..ffbc4c6c17fe 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2079,8 +2079,7 @@ void btrfs_free_fs_roots(struct btrfs_fs_info *fs_info)
if (test_bit(BTRFS_ROOT_IN_RADIX, &gang[0]->state))
btrfs_drop_and_free_fs_root(fs_info, gang[0]);
- else
- btrfs_put_root(gang[0]);
+ btrfs_put_root(gang[0]);
}
while (1) {
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 3ffeb9a2c6e6..53af0f55f5f9 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -1260,8 +1260,10 @@ void btrfs_add_dead_root(struct btrfs_root *root)
struct btrfs_fs_info *fs_info = root->fs_info;
spin_lock(&fs_info->trans_lock);
- if (list_empty(&root->root_list))
+ if (list_empty(&root->root_list)) {
+ btrfs_grab_root(root);
list_add_tail(&root->root_list, &fs_info->dead_roots);
+ }
spin_unlock(&fs_info->trans_lock);
}
@@ -2443,7 +2445,7 @@ int btrfs_clean_one_deleted_snapshot(struct btrfs_root *root)
ret = btrfs_drop_snapshot(root, NULL, 0, 0);
else
ret = btrfs_drop_snapshot(root, NULL, 1, 0);
-
+ btrfs_put_root(root);
return (ret < 0) ? 0 : 1;
}
--
2.24.1
^ permalink raw reply related
* [PATCH 4/8] btrfs: make inodes hold a ref on their roots
From: Josef Bacik @ 2020-02-14 21:11 UTC (permalink / raw)
To: linux-btrfs, kernel-team
In-Reply-To: <20200214211147.24610-1-josef@toxicpanda.com>
If we make sure all the inodes have refs on their root we don't have to
worry about the root disappearing while we have open inodes.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
fs/btrfs/disk-io.c | 2 +-
fs/btrfs/inode.c | 8 +++++---
2 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 953190b03043..26eb1564028d 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2142,7 +2142,7 @@ static void btrfs_init_btree_inode(struct btrfs_fs_info *fs_info)
BTRFS_I(inode)->io_tree.ops = &btree_extent_io_ops;
- BTRFS_I(inode)->root = fs_info->tree_root;
+ BTRFS_I(inode)->root = btrfs_grab_root(fs_info->tree_root);
memset(&BTRFS_I(inode)->location, 0, sizeof(struct btrfs_key));
set_bit(BTRFS_INODE_DUMMY, &BTRFS_I(inode)->runtime_flags);
btrfs_insert_inode_hash(inode);
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 20a6ee83f709..338283bb66f1 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -5234,7 +5234,8 @@ static int btrfs_init_locked_inode(struct inode *inode, void *p)
inode->i_ino = args->location->objectid;
memcpy(&BTRFS_I(inode)->location, args->location,
sizeof(*args->location));
- BTRFS_I(inode)->root = args->root;
+ BTRFS_I(inode)->root = btrfs_grab_root(args->root);
+ BUG_ON(args->root && !BTRFS_I(inode)->root);
return 0;
}
@@ -5315,7 +5316,7 @@ static struct inode *new_simple_dir(struct super_block *s,
if (!inode)
return ERR_PTR(-ENOMEM);
- BTRFS_I(inode)->root = root;
+ BTRFS_I(inode)->root = btrfs_grab_root(root);
memcpy(&BTRFS_I(inode)->location, key, sizeof(*key));
set_bit(BTRFS_INODE_DUMMY, &BTRFS_I(inode)->runtime_flags);
@@ -5883,7 +5884,7 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
*/
BTRFS_I(inode)->index_cnt = 2;
BTRFS_I(inode)->dir_index = *index;
- BTRFS_I(inode)->root = root;
+ BTRFS_I(inode)->root = btrfs_grab_root(root);
BTRFS_I(inode)->generation = trans->transid;
inode->i_generation = BTRFS_I(inode)->generation;
@@ -8898,6 +8899,7 @@ void btrfs_destroy_inode(struct inode *inode)
inode_tree_del(inode);
btrfs_drop_extent_cache(BTRFS_I(inode), 0, (u64)-1, 0);
btrfs_inode_clear_file_extent_range(BTRFS_I(inode), 0, (u64)-1);
+ btrfs_put_root(BTRFS_I(inode)->root);
}
int btrfs_drop_inode(struct inode *inode)
--
2.24.1
^ permalink raw reply related
* [PATCH 3/8] btrfs: move the root freeing stuff into btrfs_put_root
From: Josef Bacik @ 2020-02-14 21:11 UTC (permalink / raw)
To: linux-btrfs, kernel-team
In-Reply-To: <20200214211147.24610-1-josef@toxicpanda.com>
There are a few different ways to free roots, either you allocated them
yourself and you just do
free_extent_buffer(root->node);
free_extent_buffer(root->commit_node);
btrfs_put_root(root);
Which is the pattern for log roots. Or for snapshots/subvolumes that
are being dropped you simply call btrfs_free_fs_root() which does all
the cleanup for you.
Unify this all into btrfs_put_root(), so that we don't free up things
associated with the root until the last reference is dropped. This
makes the root freeing code much more significant.
The only caveat is at close_ctree() time we have to free the extent
buffers for all of our main roots (extent_root, chunk_root, etc) because
we have to drop the btree_inode and we'll run into issues if we hold
onto those nodes until ->kill_sb() time. This will be addressed in the
future when we kill the btree_inode.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
fs/btrfs/disk-io.c | 64 ++++++++++++++++++------------------
fs/btrfs/disk-io.h | 16 +--------
fs/btrfs/extent-tree.c | 7 ++--
fs/btrfs/extent_io.c | 16 +++++++--
fs/btrfs/free-space-tree.c | 2 --
fs/btrfs/qgroup.c | 7 +---
fs/btrfs/relocation.c | 4 ---
fs/btrfs/tests/btrfs-tests.c | 5 +--
fs/btrfs/tree-log.c | 6 ----
9 files changed, 50 insertions(+), 77 deletions(-)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index b7d6b4436f7d..953190b03043 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1302,11 +1302,8 @@ struct btrfs_root *btrfs_create_tree(struct btrfs_trans_handle *trans,
return root;
fail:
- if (leaf) {
+ if (leaf)
btrfs_tree_unlock(leaf);
- free_extent_buffer(root->commit_root);
- free_extent_buffer(leaf);
- }
btrfs_put_root(root);
return ERR_PTR(ret);
@@ -1429,10 +1426,10 @@ struct btrfs_root *btrfs_read_tree_root(struct btrfs_root *tree_root,
generation, level, NULL);
if (IS_ERR(root->node)) {
ret = PTR_ERR(root->node);
+ root->node = NULL;
goto find_fail;
} else if (!btrfs_buffer_uptodate(root->node, generation, 0)) {
ret = -EIO;
- free_extent_buffer(root->node);
goto find_fail;
}
root->commit_root = btrfs_root_node(root);
@@ -1661,14 +1658,14 @@ struct btrfs_root *btrfs_get_fs_root(struct btrfs_fs_info *fs_info,
if (ret) {
btrfs_put_root(root);
if (ret == -EEXIST) {
- btrfs_free_fs_root(root);
+ btrfs_put_root(root);
goto again;
}
goto fail;
}
return root;
fail:
- btrfs_free_fs_root(root);
+ btrfs_put_root(root);
return ERR_PTR(ret);
}
@@ -2040,11 +2037,35 @@ static void free_root_pointers(struct btrfs_fs_info *info, bool free_chunk_root)
free_root_extent_buffers(info->csum_root);
free_root_extent_buffers(info->quota_root);
free_root_extent_buffers(info->uuid_root);
+ free_root_extent_buffers(info->fs_root);
if (free_chunk_root)
free_root_extent_buffers(info->chunk_root);
free_root_extent_buffers(info->free_space_root);
}
+void btrfs_put_root(struct btrfs_root *root)
+{
+ if (!root)
+ return;
+ if (refcount_dec_and_test(&root->refs)) {
+ WARN_ON(!RB_EMPTY_ROOT(&root->inode_tree));
+ if (root->anon_dev)
+ free_anon_bdev(root->anon_dev);
+ if (root->subv_writers)
+ btrfs_free_subvolume_writers(root->subv_writers);
+ free_extent_buffer(root->node);
+ free_extent_buffer(root->commit_root);
+ kfree(root->free_ino_ctl);
+ kfree(root->free_ino_pinned);
+#ifdef CONFIG_BTRFS_DEBUG
+ spin_lock(&root->fs_info->fs_roots_radix_lock);
+ list_del_init(&root->leak_list);
+ spin_unlock(&root->fs_info->fs_roots_radix_lock);
+#endif
+ kfree(root);
+ }
+}
+
void btrfs_free_fs_roots(struct btrfs_fs_info *fs_info)
{
int ret;
@@ -2056,13 +2077,10 @@ void btrfs_free_fs_roots(struct btrfs_fs_info *fs_info)
struct btrfs_root, root_list);
list_del(&gang[0]->root_list);
- if (test_bit(BTRFS_ROOT_IN_RADIX, &gang[0]->state)) {
+ if (test_bit(BTRFS_ROOT_IN_RADIX, &gang[0]->state))
btrfs_drop_and_free_fs_root(fs_info, gang[0]);
- } else {
- free_extent_buffer(gang[0]->node);
- free_extent_buffer(gang[0]->commit_root);
+ else
btrfs_put_root(gang[0]);
- }
}
while (1) {
@@ -2269,11 +2287,11 @@ static int btrfs_replay_log(struct btrfs_fs_info *fs_info,
if (IS_ERR(log_tree_root->node)) {
btrfs_warn(fs_info, "failed to read log tree");
ret = PTR_ERR(log_tree_root->node);
+ log_tree_root->node = NULL;
btrfs_put_root(log_tree_root);
return ret;
} else if (!extent_buffer_uptodate(log_tree_root->node)) {
btrfs_err(fs_info, "failed to read log tree");
- free_extent_buffer(log_tree_root->node);
btrfs_put_root(log_tree_root);
return -EIO;
}
@@ -2282,7 +2300,6 @@ static int btrfs_replay_log(struct btrfs_fs_info *fs_info,
if (ret) {
btrfs_handle_fs_error(fs_info, ret,
"Failed to recover log tree");
- free_extent_buffer(log_tree_root->node);
btrfs_put_root(log_tree_root);
return ret;
}
@@ -3893,8 +3910,6 @@ void btrfs_drop_and_free_fs_root(struct btrfs_fs_info *fs_info,
if (test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state)) {
btrfs_free_log(NULL, root);
if (root->reloc_root) {
- free_extent_buffer(root->reloc_root->node);
- free_extent_buffer(root->reloc_root->commit_root);
btrfs_put_root(root->reloc_root);
root->reloc_root = NULL;
}
@@ -3908,20 +3923,6 @@ void btrfs_drop_and_free_fs_root(struct btrfs_fs_info *fs_info,
iput(root->ino_cache_inode);
root->ino_cache_inode = NULL;
}
- btrfs_free_fs_root(root);
-}
-
-void btrfs_free_fs_root(struct btrfs_root *root)
-{
- WARN_ON(!RB_EMPTY_ROOT(&root->inode_tree));
- if (root->anon_dev)
- free_anon_bdev(root->anon_dev);
- if (root->subv_writers)
- btrfs_free_subvolume_writers(root->subv_writers);
- free_extent_buffer(root->node);
- free_extent_buffer(root->commit_root);
- kfree(root->free_ino_ctl);
- kfree(root->free_ino_pinned);
btrfs_put_root(root);
}
@@ -4073,8 +4074,6 @@ void __cold close_ctree(struct btrfs_fs_info *fs_info)
btrfs_sysfs_remove_mounted(fs_info);
btrfs_sysfs_remove_fsid(fs_info->fs_devices);
- btrfs_free_fs_roots(fs_info);
-
btrfs_put_block_group_cache(fs_info);
/*
@@ -4086,6 +4085,7 @@ void __cold close_ctree(struct btrfs_fs_info *fs_info)
clear_bit(BTRFS_FS_OPEN, &fs_info->flags);
free_root_pointers(fs_info, true);
+ btrfs_free_fs_roots(fs_info);
/*
* We must free the block groups after dropping the fs_roots as we could
diff --git a/fs/btrfs/disk-io.h b/fs/btrfs/disk-io.h
index db21ab614357..860a9a3bd8d1 100644
--- a/fs/btrfs/disk-io.h
+++ b/fs/btrfs/disk-io.h
@@ -76,7 +76,6 @@ void btrfs_btree_balance_dirty(struct btrfs_fs_info *fs_info);
void btrfs_btree_balance_dirty_nodelay(struct btrfs_fs_info *fs_info);
void btrfs_drop_and_free_fs_root(struct btrfs_fs_info *fs_info,
struct btrfs_root *root);
-void btrfs_free_fs_root(struct btrfs_root *root);
#ifdef CONFIG_BTRFS_FS_RUN_SANITY_TESTS
struct btrfs_root *btrfs_alloc_dummy_root(struct btrfs_fs_info *fs_info);
@@ -98,20 +97,7 @@ static inline struct btrfs_root *btrfs_grab_root(struct btrfs_root *root)
return NULL;
}
-static inline void btrfs_put_root(struct btrfs_root *root)
-{
- if (!root)
- return;
- if (refcount_dec_and_test(&root->refs)) {
-#ifdef CONFIG_BTRFS_DEBUG
- spin_lock(&root->fs_info->fs_roots_radix_lock);
- list_del_init(&root->leak_list);
- spin_unlock(&root->fs_info->fs_roots_radix_lock);
-#endif
- kfree(root);
- }
-}
-
+void btrfs_put_root(struct btrfs_root *root);
void btrfs_mark_buffer_dirty(struct extent_buffer *buf);
int btrfs_buffer_uptodate(struct extent_buffer *buf, u64 parent_transid,
int atomic);
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 08481a64f5d4..bb9a180e5b96 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -5411,13 +5411,10 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
}
}
- if (test_bit(BTRFS_ROOT_IN_RADIX, &root->state)) {
+ if (test_bit(BTRFS_ROOT_IN_RADIX, &root->state))
btrfs_add_dropped_root(trans, root);
- } else {
- free_extent_buffer(root->node);
- free_extent_buffer(root->commit_root);
+ else
btrfs_put_root(root);
- }
root_dropped = true;
out_end_trans:
btrfs_end_transaction_throttle(trans);
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index e82e3817e811..deab1430d9c5 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -64,12 +64,20 @@ void btrfs_extent_buffer_leak_debug_check(struct btrfs_fs_info *fs_info)
struct extent_buffer *eb;
unsigned long flags;
+ /*
+ * If we didn't get into open_ctree our alloced_ebs will not be init'ed,
+ * so just skip this.
+ */
+ if (fs_info->alloced_ebs.next == NULL)
+ return;
+
spin_lock_irqsave(&fs_info->eb_leak_lock, flags);
while (!list_empty(&fs_info->alloced_ebs)) {
eb = list_first_entry(&fs_info->alloced_ebs,
struct extent_buffer, leak_list);
- pr_err("BTRFS: buffer leak start %llu len %lu refs %d bflags %lu\n",
- eb->start, eb->len, atomic_read(&eb->refs), eb->bflags);
+ pr_err("BTRFS: buffer leak start %llu len %lu refs %d bflags %lu owner %llu\n",
+ eb->start, eb->len, atomic_read(&eb->refs), eb->bflags,
+ btrfs_header_owner(eb));
list_del(&eb->leak_list);
kmem_cache_free(extent_buffer_cache, eb);
}
@@ -4795,7 +4803,6 @@ int extent_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
static void __free_extent_buffer(struct extent_buffer *eb)
{
- btrfs_leak_debug_del(&eb->fs_info->eb_leak_lock, &eb->leak_list);
kmem_cache_free(extent_buffer_cache, eb);
}
@@ -4861,6 +4868,7 @@ static void btrfs_release_extent_buffer_pages(struct extent_buffer *eb)
static inline void btrfs_release_extent_buffer(struct extent_buffer *eb)
{
btrfs_release_extent_buffer_pages(eb);
+ btrfs_leak_debug_del(&eb->fs_info->eb_leak_lock, &eb->leak_list);
__free_extent_buffer(eb);
}
@@ -5248,6 +5256,8 @@ static int release_extent_buffer(struct extent_buffer *eb)
spin_unlock(&eb->refs_lock);
}
+ btrfs_leak_debug_del(&eb->fs_info->eb_leak_lock,
+ &eb->leak_list);
/* Should be safe to release our pages at this point */
btrfs_release_extent_buffer_pages(eb);
#ifdef CONFIG_BTRFS_FS_RUN_SANITY_TESTS
diff --git a/fs/btrfs/free-space-tree.c b/fs/btrfs/free-space-tree.c
index bc43950eb32f..8b1f5c8897b7 100644
--- a/fs/btrfs/free-space-tree.c
+++ b/fs/btrfs/free-space-tree.c
@@ -1251,8 +1251,6 @@ int btrfs_clear_free_space_tree(struct btrfs_fs_info *fs_info)
btrfs_free_tree_block(trans, free_space_root, free_space_root->node,
0, 1);
- free_extent_buffer(free_space_root->node);
- free_extent_buffer(free_space_root->commit_root);
btrfs_put_root(free_space_root);
return btrfs_commit_transaction(trans);
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 6ae868eb9a17..ac192a7696f3 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -1037,11 +1037,8 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info)
out_free_path:
btrfs_free_path(path);
out_free_root:
- if (ret) {
- free_extent_buffer(quota_root->node);
- free_extent_buffer(quota_root->commit_root);
+ if (ret)
btrfs_put_root(quota_root);
- }
out:
if (ret) {
ulist_free(fs_info->qgroup_ulist);
@@ -1104,8 +1101,6 @@ int btrfs_quota_disable(struct btrfs_fs_info *fs_info)
btrfs_tree_unlock(quota_root->node);
btrfs_free_tree_block(trans, quota_root, quota_root->node, 0, 1);
- free_extent_buffer(quota_root->node);
- free_extent_buffer(quota_root->commit_root);
btrfs_put_root(quota_root);
end_trans:
diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 034f5f151a74..4fb7e3cc2aca 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -2549,10 +2549,6 @@ void free_reloc_roots(struct list_head *list)
reloc_root = list_entry(list->next, struct btrfs_root,
root_list);
__del_reloc_root(reloc_root);
- free_extent_buffer(reloc_root->node);
- free_extent_buffer(reloc_root->commit_root);
- reloc_root->node = NULL;
- reloc_root->commit_root = NULL;
}
}
diff --git a/fs/btrfs/tests/btrfs-tests.c b/fs/btrfs/tests/btrfs-tests.c
index 69c9afef06e3..42e62fd2809c 100644
--- a/fs/btrfs/tests/btrfs-tests.c
+++ b/fs/btrfs/tests/btrfs-tests.c
@@ -194,6 +194,7 @@ void btrfs_free_dummy_fs_info(struct btrfs_fs_info *fs_info)
cleanup_srcu_struct(&fs_info->subvol_srcu);
kfree(fs_info->super_copy);
btrfs_check_leaked_roots(fs_info);
+ btrfs_extent_buffer_leak_debug_check(fs_info);
kfree(fs_info->fs_devices);
kfree(fs_info);
}
@@ -205,10 +206,6 @@ void btrfs_free_dummy_root(struct btrfs_root *root)
/* Will be freed by btrfs_free_fs_roots */
if (WARN_ON(test_bit(BTRFS_ROOT_IN_RADIX, &root->state)))
return;
- if (root->node) {
- /* One for allocate_extent_buffer */
- free_extent_buffer(root->node);
- }
btrfs_put_root(root);
}
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 4ea47bac5fcf..d27aa289dc50 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -3288,7 +3288,6 @@ static void free_log_tree(struct btrfs_trans_handle *trans,
clear_extent_bits(&log->dirty_log_pages, 0, (u64)-1,
EXTENT_DIRTY | EXTENT_NEW | EXTENT_NEED_WAIT);
- free_extent_buffer(log->node);
btrfs_put_root(log);
}
@@ -6181,8 +6180,6 @@ int btrfs_recover_log_trees(struct btrfs_root *log_root_tree)
ret = btrfs_pin_extent_for_log_replay(fs_info,
log->node->start,
log->node->len);
- free_extent_buffer(log->node);
- free_extent_buffer(log->commit_root);
btrfs_put_root(log);
if (!ret)
@@ -6220,8 +6217,6 @@ int btrfs_recover_log_trees(struct btrfs_root *log_root_tree)
wc.replay_dest->log_root = NULL;
btrfs_put_root(wc.replay_dest);
- free_extent_buffer(log->node);
- free_extent_buffer(log->commit_root);
btrfs_put_root(log);
if (ret)
@@ -6253,7 +6248,6 @@ int btrfs_recover_log_trees(struct btrfs_root *log_root_tree)
if (ret)
return ret;
- free_extent_buffer(log_root_tree->node);
log_root_tree->log_root = NULL;
clear_bit(BTRFS_FS_LOG_RECOVERING, &fs_info->flags);
btrfs_put_root(log_root_tree);
--
2.24.1
^ permalink raw reply related
* [PATCH 2/8] btrfs: move ino_cache_inode dropping
From: Josef Bacik @ 2020-02-14 21:11 UTC (permalink / raw)
To: linux-btrfs, kernel-team
In-Reply-To: <20200214211147.24610-1-josef@toxicpanda.com>
We are going to make root life be controlled soley by refcounting, and
inodes will be one of the things that hold a ref on the root. This
means we need to handle dropping the ino_cache_inode outside of the root
freeing logic, so move it into btrfs_drop_and_free_fs_root() so it is
cleaned up properly on unmount.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
fs/btrfs/disk-io.c | 5 ++++-
fs/btrfs/transaction.c | 4 ++++
2 files changed, 8 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 2a237aecc6d7..b7d6b4436f7d 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3904,12 +3904,15 @@ void btrfs_drop_and_free_fs_root(struct btrfs_fs_info *fs_info,
__btrfs_remove_free_space_cache(root->free_ino_pinned);
if (root->free_ino_ctl)
__btrfs_remove_free_space_cache(root->free_ino_ctl);
+ if (root->ino_cache_inode) {
+ iput(root->ino_cache_inode);
+ root->ino_cache_inode = NULL;
+ }
btrfs_free_fs_root(root);
}
void btrfs_free_fs_root(struct btrfs_root *root)
{
- iput(root->ino_cache_inode);
WARN_ON(!RB_EMPTY_ROOT(&root->inode_tree));
if (root->anon_dev)
free_anon_bdev(root->anon_dev);
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index c2fff16842ac..3ffeb9a2c6e6 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -2433,6 +2433,10 @@ int btrfs_clean_one_deleted_snapshot(struct btrfs_root *root)
btrfs_debug(fs_info, "cleaner removing %llu", root->root_key.objectid);
btrfs_kill_all_delayed_nodes(root);
+ if (root->ino_cache_inode) {
+ iput(root->ino_cache_inode);
+ root->ino_cache_inode = NULL;
+ }
if (btrfs_header_backref_rev(root->node) <
BTRFS_MIXED_BACKREF_REV)
--
2.24.1
^ permalink raw reply related
* [PATCH 1/8] btrfs: make the extent buffer leak check per fs info
From: Josef Bacik @ 2020-02-14 21:11 UTC (permalink / raw)
To: linux-btrfs, kernel-team
In-Reply-To: <20200214211147.24610-1-josef@toxicpanda.com>
I'm going to make the entire destruction of btrfs_root's controlled by
their refcount, so it will be helpful to notice if we're leaking their
eb's on umount.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
fs/btrfs/ctree.h | 3 +++
fs/btrfs/disk-io.c | 3 +++
fs/btrfs/extent_io.c | 45 ++++++++++++++++++++++----------------------
fs/btrfs/extent_io.h | 7 +++++++
4 files changed, 36 insertions(+), 22 deletions(-)
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index ffd99c3f64db..000ad54629e3 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -948,6 +948,9 @@ struct btrfs_fs_info {
struct kobject *debug_kobj;
struct kobject *discard_debug_kobj;
struct list_head allocated_roots;
+
+ spinlock_t eb_leak_lock;
+ struct list_head alloced_ebs;
#endif
};
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 018681ec159b..2a237aecc6d7 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1574,6 +1574,7 @@ void btrfs_free_fs_info(struct btrfs_fs_info *fs_info)
btrfs_put_root(fs_info->free_space_root);
btrfs_put_root(fs_info->fs_root);
btrfs_check_leaked_roots(fs_info);
+ btrfs_extent_buffer_leak_debug_check(fs_info);
kfree(fs_info->super_copy);
kfree(fs_info->super_for_commit);
kvfree(fs_info);
@@ -2702,6 +2703,8 @@ void btrfs_init_fs_info(struct btrfs_fs_info *fs_info)
INIT_LIST_HEAD(&fs_info->unused_bgs);
#ifdef CONFIG_BTRFS_DEBUG
INIT_LIST_HEAD(&fs_info->allocated_roots);
+ INIT_LIST_HEAD(&fs_info->alloced_ebs);
+ spin_lock_init(&fs_info->eb_leak_lock);
#endif
extent_map_tree_init(&fs_info->mapping_tree);
btrfs_init_block_rsv(&fs_info->global_block_rsv,
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 6f9638350470..e82e3817e811 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -35,42 +35,45 @@ static inline bool extent_state_in_tree(const struct extent_state *state)
}
#ifdef CONFIG_BTRFS_DEBUG
-static LIST_HEAD(buffers);
static LIST_HEAD(states);
-
static DEFINE_SPINLOCK(leak_lock);
-static inline
-void btrfs_leak_debug_add(struct list_head *new, struct list_head *head)
+static inline void btrfs_leak_debug_add(spinlock_t *lock,
+ struct list_head *new,
+ struct list_head *head)
{
unsigned long flags;
- spin_lock_irqsave(&leak_lock, flags);
+ spin_lock_irqsave(lock, flags);
list_add(new, head);
- spin_unlock_irqrestore(&leak_lock, flags);
+ spin_unlock_irqrestore(lock, flags);
}
-static inline
-void btrfs_leak_debug_del(struct list_head *entry)
+static inline void btrfs_leak_debug_del(spinlock_t *lock,
+ struct list_head *entry)
{
unsigned long flags;
- spin_lock_irqsave(&leak_lock, flags);
+ spin_lock_irqsave(lock, flags);
list_del(entry);
- spin_unlock_irqrestore(&leak_lock, flags);
+ spin_unlock_irqrestore(lock, flags);
}
-static inline void btrfs_extent_buffer_leak_debug_check(void)
+void btrfs_extent_buffer_leak_debug_check(struct btrfs_fs_info *fs_info)
{
struct extent_buffer *eb;
+ unsigned long flags;
- while (!list_empty(&buffers)) {
- eb = list_entry(buffers.next, struct extent_buffer, leak_list);
+ spin_lock_irqsave(&fs_info->eb_leak_lock, flags);
+ while (!list_empty(&fs_info->alloced_ebs)) {
+ eb = list_first_entry(&fs_info->alloced_ebs,
+ struct extent_buffer, leak_list);
pr_err("BTRFS: buffer leak start %llu len %lu refs %d bflags %lu\n",
eb->start, eb->len, atomic_read(&eb->refs), eb->bflags);
list_del(&eb->leak_list);
kmem_cache_free(extent_buffer_cache, eb);
}
+ spin_unlock_irqrestore(&fs_info->eb_leak_lock, flags);
}
static inline void btrfs_extent_state_leak_debug_check(void)
@@ -107,9 +110,8 @@ static inline void __btrfs_debug_check_extent_io_range(const char *caller,
}
}
#else
-#define btrfs_leak_debug_add(new, head) do {} while (0)
-#define btrfs_leak_debug_del(entry) do {} while (0)
-#define btrfs_extent_buffer_leak_debug_check() do {} while (0)
+#define btrfs_leak_debug_add(lock, new, head) do {} while (0)
+#define btrfs_leak_debug_del(lock, entry) do {} while (0)
#define btrfs_extent_state_leak_debug_check() do {} while (0)
#define btrfs_debug_check_extent_io_range(c, s, e) do {} while (0)
#endif
@@ -245,8 +247,6 @@ void __cold extent_state_cache_exit(void)
void __cold extent_io_exit(void)
{
- btrfs_extent_buffer_leak_debug_check();
-
/*
* Make sure all delayed rcu free are flushed before we
* destroy caches.
@@ -324,7 +324,7 @@ static struct extent_state *alloc_extent_state(gfp_t mask)
state->state = 0;
state->failrec = NULL;
RB_CLEAR_NODE(&state->rb_node);
- btrfs_leak_debug_add(&state->leak_list, &states);
+ btrfs_leak_debug_add(&leak_lock, &state->leak_list, &states);
refcount_set(&state->refs, 1);
init_waitqueue_head(&state->wq);
trace_alloc_extent_state(state, mask, _RET_IP_);
@@ -337,7 +337,7 @@ void free_extent_state(struct extent_state *state)
return;
if (refcount_dec_and_test(&state->refs)) {
WARN_ON(extent_state_in_tree(state));
- btrfs_leak_debug_del(&state->leak_list);
+ btrfs_leak_debug_del(&leak_lock, &state->leak_list);
trace_free_extent_state(state, _RET_IP_);
kmem_cache_free(extent_state_cache, state);
}
@@ -4795,7 +4795,7 @@ int extent_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
static void __free_extent_buffer(struct extent_buffer *eb)
{
- btrfs_leak_debug_del(&eb->leak_list);
+ btrfs_leak_debug_del(&eb->fs_info->eb_leak_lock, &eb->leak_list);
kmem_cache_free(extent_buffer_cache, eb);
}
@@ -4882,7 +4882,8 @@ __alloc_extent_buffer(struct btrfs_fs_info *fs_info, u64 start,
init_waitqueue_head(&eb->write_lock_wq);
init_waitqueue_head(&eb->read_lock_wq);
- btrfs_leak_debug_add(&eb->leak_list, &buffers);
+ btrfs_leak_debug_add(&fs_info->eb_leak_lock, &eb->leak_list,
+ &fs_info->alloced_ebs);
spin_lock_init(&eb->refs_lock);
atomic_set(&eb->refs, 1);
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 234622101230..2ed65bd0760e 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -325,4 +325,11 @@ bool find_lock_delalloc_range(struct inode *inode,
#endif
struct extent_buffer *alloc_test_extent_buffer(struct btrfs_fs_info *fs_info,
u64 start);
+
+#ifdef CONFIG_BTRFS_DEBUG
+void btrfs_extent_buffer_leak_debug_check(struct btrfs_fs_info *fs_info);
+#else
+#define btrfs_extent_buffer_leak_debug_check(fs_info) do {} while (0)
+#endif
+
#endif
--
2.24.1
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.