Re: [LTP] [PATCH 1/1] nfsstat01: Update client RPC calls for kernel 6.9

public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed

* Re: [LTP] [PATCH 1/1] nfsstat01: Update client RPC calls for kernel 6.9
       [not found] <d4b235df-4ee5-4824-9d48-e3b3c1f1f4d1@oracle.com>
@ 2024-07-02 22:55 ` Calum Mackay
  2024-07-05 14:19   ` Chuck Lever III
  0 siblings, 1 reply; 21+ messages in thread
From: Calum Mackay @ 2024-07-02 22:55 UTC (permalink / raw)
  To: Petr Vorel
  Cc: Calum Mackay, Trond Myklebust, anna, Linux NFS Mailing List,
	kernel-team, ltp, Avinesh Kumar, NeilBrown, Sherry Yang,
	Josef Bacik, linux-stable, Jeff Layton

To clarify…

On 02/07/2024 5:54 pm, Calum Mackay wrote:
> hi Petr,
> 
> I noticed your LTP patch [1][2] which adjusts the nfsstat01 test on v6.9 
> kernels, to account for Josef's changes [3], which restrict the NFS/RPC 
> stats per-namespace.
> 
> I see that Josef's changes were backported, as far back as longterm 
> v5.4,

Sorry, that's not quite accurate.

Josef's NFS client changes were all backported from v6.9, as far as 
longterm v5.4.y:

2057a48d0dd0 sunrpc: add a struct rpc_stats arg to rpc_create_args
d47151b79e32 nfs: expose /proc/net/sunrpc/nfs in net namespaces
1548036ef120 nfs: make the rpc_stat per net namespace


Of Josef's NFS server changes, four were backported from v6.9 to v6.8:

418b9687dece sunrpc: use the struct net as the svc proc private
d98416cc2154 nfsd: rename NFSD_NET_* to NFSD_STATS_*
93483ac5fec6 nfsd: expose /proc/net/sunrpc/nfsd in net namespaces
4b14885411f7 nfsd: make all of the nfsd stats per-network namespace

and the others remained only in v6.9:

ab42f4d9a26f sunrpc: don't change ->sv_stats if it doesn't exist
a2214ed588fb nfsd: stop setting ->pg_stats for unused stats
f09432386766 sunrpc: pass in the sv_stats struct through svc_create_pooled
3f6ef182f144 sunrpc: remove ->pg_stats from svc_program
e41ee44cc6a4 nfsd: remove nfsd_stats, make th_cnt a global counter
16fb9808ab2c nfsd: make svc_stat per-network namespace instead of global



I'm wondering if this difference between NFS client, and NFS server, 
stat behaviour, across kernel versions, may perhaps cause some user 
confusion?


cheers,
calum.




> so your check for kernel version "6.9" in the test may need to be 
> adjusted, if LTP is intended to be run on stable kernels?
> 
> best wishes,
> calum.
> 
> 
> [1] https://lore.kernel.org/ltp/20240620111129.594449-1-pvorel@suse.cz/
> [2] https://patchwork.ozlabs.org/project/ltp/ 
> patch/20240620111129.594449-1-pvorel@suse.cz/
> [3] https://lore.kernel.org/linux-nfs/ 
> cover.1708026931.git.josef@toxicpanda.com/



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [LTP] [PATCH 1/1] nfsstat01: Update client RPC calls for kernel 6.9
  2024-07-02 22:55 ` [LTP] [PATCH 1/1] nfsstat01: Update client RPC calls for kernel 6.9 Calum Mackay
@ 2024-07-05 14:19   ` Chuck Lever III
  2024-07-06  7:11     ` Greg KH
  2024-07-08  4:02     ` Petr Vorel
  0 siblings, 2 replies; 21+ messages in thread
From: Chuck Lever III @ 2024-07-05 14:19 UTC (permalink / raw)
  To: Calum Mackay, linux-stable
  Cc: Petr Vorel, Trond Myklebust, Anna Schumaker,
	Linux NFS Mailing List, kernel-team@fb.com, ltp@lists.linux.it,
	Avinesh Kumar, Neil Brown, Sherry Yang, Josef Bacik, Jeff Layton



> On Jul 2, 2024, at 6:55 PM, Calum Mackay <calum.mackay@oracle.com> wrote:
> 
> To clarify…
> 
> On 02/07/2024 5:54 pm, Calum Mackay wrote:
>> hi Petr,
>> I noticed your LTP patch [1][2] which adjusts the nfsstat01 test on v6.9 kernels, to account for Josef's changes [3], which restrict the NFS/RPC stats per-namespace.
>> I see that Josef's changes were backported, as far back as longterm v5.4,
> 
> Sorry, that's not quite accurate.
> 
> Josef's NFS client changes were all backported from v6.9, as far as longterm v5.4.y:
> 
> 2057a48d0dd0 sunrpc: add a struct rpc_stats arg to rpc_create_args
> d47151b79e32 nfs: expose /proc/net/sunrpc/nfs in net namespaces
> 1548036ef120 nfs: make the rpc_stat per net namespace
> 
> 
> Of Josef's NFS server changes, four were backported from v6.9 to v6.8:
> 
> 418b9687dece sunrpc: use the struct net as the svc proc private
> d98416cc2154 nfsd: rename NFSD_NET_* to NFSD_STATS_*
> 93483ac5fec6 nfsd: expose /proc/net/sunrpc/nfsd in net namespaces
> 4b14885411f7 nfsd: make all of the nfsd stats per-network namespace
> 
> and the others remained only in v6.9:
> 
> ab42f4d9a26f sunrpc: don't change ->sv_stats if it doesn't exist
> a2214ed588fb nfsd: stop setting ->pg_stats for unused stats
> f09432386766 sunrpc: pass in the sv_stats struct through svc_create_pooled
> 3f6ef182f144 sunrpc: remove ->pg_stats from svc_program
> e41ee44cc6a4 nfsd: remove nfsd_stats, make th_cnt a global counter
> 16fb9808ab2c nfsd: make svc_stat per-network namespace instead of global
> 
> 
> 
> I'm wondering if this difference between NFS client, and NFS server, stat behaviour, across kernel versions, may perhaps cause some user confusion?

As a refresher for the stable folken, Josef's changes make
nfsstats silo'd, so they no longer show counts from the whole
system, but only for NFS operations relating to the local net
namespace. That is a surprising change for some users, tools,
and testing.

I'm not clear on whether there are any rules/guidelines around
LTS backports causing behavior changes that user tools, like
nfsstat, might be impacted by.

The client-side nfsstat changes are fully backported to all
TS kernels. But should they have been?

The server-side nfsstat changes appear in only v6.9. Should
they be backported to the other LTS kernels, or not?


>> so your check for kernel version "6.9" in the test may need to be adjusted, if LTP is intended to be run on stable kernels?
>> best wishes,
>> calum.
>> [1] https://lore.kernel.org/ltp/20240620111129.594449-1-pvorel@suse.cz/
>> [2] https://patchwork.ozlabs.org/project/ltp/ patch/20240620111129.594449-1-pvorel@suse.cz/
>> [3] https://lore.kernel.org/linux-nfs/ cover.1708026931.git.josef@toxicpanda.com/

--
Chuck Lever



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [LTP] [PATCH 1/1] nfsstat01: Update client RPC calls for kernel 6.9
  2024-07-05 14:19   ` Chuck Lever III
@ 2024-07-06  7:11     ` Greg KH
  2024-07-06  7:46       ` Sherry Yang
  2024-07-08  4:02     ` Petr Vorel
  1 sibling, 1 reply; 21+ messages in thread
From: Greg KH @ 2024-07-06  7:11 UTC (permalink / raw)
  To: Chuck Lever III
  Cc: Calum Mackay, linux-stable, Petr Vorel, Trond Myklebust,
	Anna Schumaker, Linux NFS Mailing List, kernel-team@fb.com,
	ltp@lists.linux.it, Avinesh Kumar, Neil Brown, Sherry Yang,
	Josef Bacik, Jeff Layton

On Fri, Jul 05, 2024 at 02:19:18PM +0000, Chuck Lever III wrote:
> 
> 
> > On Jul 2, 2024, at 6:55 PM, Calum Mackay <calum.mackay@oracle.com> wrote:
> > 
> > To clarify…
> > 
> > On 02/07/2024 5:54 pm, Calum Mackay wrote:
> >> hi Petr,
> >> I noticed your LTP patch [1][2] which adjusts the nfsstat01 test on v6.9 kernels, to account for Josef's changes [3], which restrict the NFS/RPC stats per-namespace.
> >> I see that Josef's changes were backported, as far back as longterm v5.4,
> > 
> > Sorry, that's not quite accurate.
> > 
> > Josef's NFS client changes were all backported from v6.9, as far as longterm v5.4.y:
> > 
> > 2057a48d0dd0 sunrpc: add a struct rpc_stats arg to rpc_create_args
> > d47151b79e32 nfs: expose /proc/net/sunrpc/nfs in net namespaces
> > 1548036ef120 nfs: make the rpc_stat per net namespace
> > 
> > 
> > Of Josef's NFS server changes, four were backported from v6.9 to v6.8:
> > 
> > 418b9687dece sunrpc: use the struct net as the svc proc private
> > d98416cc2154 nfsd: rename NFSD_NET_* to NFSD_STATS_*
> > 93483ac5fec6 nfsd: expose /proc/net/sunrpc/nfsd in net namespaces
> > 4b14885411f7 nfsd: make all of the nfsd stats per-network namespace
> > 
> > and the others remained only in v6.9:
> > 
> > ab42f4d9a26f sunrpc: don't change ->sv_stats if it doesn't exist
> > a2214ed588fb nfsd: stop setting ->pg_stats for unused stats
> > f09432386766 sunrpc: pass in the sv_stats struct through svc_create_pooled
> > 3f6ef182f144 sunrpc: remove ->pg_stats from svc_program
> > e41ee44cc6a4 nfsd: remove nfsd_stats, make th_cnt a global counter
> > 16fb9808ab2c nfsd: make svc_stat per-network namespace instead of global
> > 
> > 
> > 
> > I'm wondering if this difference between NFS client, and NFS server, stat behaviour, across kernel versions, may perhaps cause some user confusion?
> 
> As a refresher for the stable folken, Josef's changes make
> nfsstats silo'd, so they no longer show counts from the whole
> system, but only for NFS operations relating to the local net
> namespace. That is a surprising change for some users, tools,
> and testing.
> 
> I'm not clear on whether there are any rules/guidelines around
> LTS backports causing behavior changes that user tools, like
> nfsstat, might be impacted by.

The same rules that apply for Linus's tree (i.e. no userspace
regressions.)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [LTP] [PATCH 1/1] nfsstat01: Update client RPC calls for kernel 6.9
  2024-07-06  7:11     ` Greg KH
@ 2024-07-06  7:46       ` Sherry Yang
  2024-07-08 10:36         ` Greg KH
  0 siblings, 1 reply; 21+ messages in thread
From: Sherry Yang @ 2024-07-06  7:46 UTC (permalink / raw)
  To: Greg KH, Chuck Lever III
  Cc: Calum Mackay, linux-stable, Petr Vorel, Trond Myklebust,
	Anna Schumaker, Linux NFS Mailing List, kernel-team@fb.com,
	ltp@lists.linux.it, Avinesh Kumar, Neil Brown, Josef Bacik,
	Jeff Layton



> On Jul 6, 2024, at 12:11 AM, Greg KH <greg@kroah.com> wrote:
> 
> On Fri, Jul 05, 2024 at 02:19:18PM +0000, Chuck Lever III wrote:
>> 
>> 
>>> On Jul 2, 2024, at 6:55 PM, Calum Mackay <calum.mackay@oracle.com> wrote:
>>> 
>>> To clarify…
>>> 
>>> On 02/07/2024 5:54 pm, Calum Mackay wrote:
>>>> hi Petr,
>>>> I noticed your LTP patch [1][2] which adjusts the nfsstat01 test on v6.9 kernels, to account for Josef's changes [3], which restrict the NFS/RPC stats per-namespace.
>>>> I see that Josef's changes were backported, as far back as longterm v5.4,
>>> 
>>> Sorry, that's not quite accurate.
>>> 
>>> Josef's NFS client changes were all backported from v6.9, as far as longterm v5.4.y:
>>> 
>>> 2057a48d0dd0 sunrpc: add a struct rpc_stats arg to rpc_create_args
>>> d47151b79e32 nfs: expose /proc/net/sunrpc/nfs in net namespaces
>>> 1548036ef120 nfs: make the rpc_stat per net namespace
>>> 
>>> 
>>> Of Josef's NFS server changes, four were backported from v6.9 to v6.8:
>>> 
>>> 418b9687dece sunrpc: use the struct net as the svc proc private
>>> d98416cc2154 nfsd: rename NFSD_NET_* to NFSD_STATS_*
>>> 93483ac5fec6 nfsd: expose /proc/net/sunrpc/nfsd in net namespaces
>>> 4b14885411f7 nfsd: make all of the nfsd stats per-network namespace
>>> 
>>> and the others remained only in v6.9:
>>> 
>>> ab42f4d9a26f sunrpc: don't change ->sv_stats if it doesn't exist
>>> a2214ed588fb nfsd: stop setting ->pg_stats for unused stats
>>> f09432386766 sunrpc: pass in the sv_stats struct through svc_create_pooled
>>> 3f6ef182f144 sunrpc: remove ->pg_stats from svc_program
>>> e41ee44cc6a4 nfsd: remove nfsd_stats, make th_cnt a global counter
>>> 16fb9808ab2c nfsd: make svc_stat per-network namespace instead of global
>>> 
>>> 
>>> 
>>> I'm wondering if this difference between NFS client, and NFS server, stat behaviour, across kernel versions, may perhaps cause some user confusion?
>> 
>> As a refresher for the stable folken, Josef's changes make
>> nfsstats silo'd, so they no longer show counts from the whole
>> system, but only for NFS operations relating to the local net
>> namespace. That is a surprising change for some users, tools,
>> and testing.
>> 
>> I'm not clear on whether there are any rules/guidelines around
>> LTS backports causing behavior changes that user tools, like
>> nfsstat, might be impacted by.
> 
> The same rules that apply for Linus's tree (i.e. no userspace
> regressions.)

Given the current data we have, LTP nfsstat01[1] failed on LTS v5.4.278 because of kernel commit 1548036ef1204 ("nfs:
make the rpc_stat per net namespace") [2]. Other LTS which backported the same commit are very likely troubled with the same LTP test failure.

The following are the LTP nfsstat01 failure output

========
network 1 TINFO: initialize 'lhost' 'ltp_ns_veth2' interface
network 1 TINFO: add local addr 10.0.0.2/24
network 1 TINFO: add local addr fd00:1:1:1::2/64
network 1 TINFO: initialize 'rhost' 'ltp_ns_veth1' interface
network 1 TINFO: add remote addr 10.0.0.1/24
network 1 TINFO: add remote addr fd00:1:1:1::1/64
network 1 TINFO: Network config (local -- remote):
network 1 TINFO: ltp_ns_veth2 -- ltp_ns_veth1
network 1 TINFO: 10.0.0.2/24 -- 10.0.0.1/24
network 1 TINFO: fd00:1:1:1::2/64 -- fd00:1:1:1::1/64
<<<test_start>>>
tag=veth|nfsstat3_01 stime=1719943586
cmdline="nfsstat01"
contacts=""
analysis=exit
<<<test_output>>>
incrementing stop
nfsstat01 1 TINFO: timeout per run is 0h 20m 0s
nfsstat01 1 TINFO: setup NFSv3, socket type udp
nfsstat01 1 TINFO: Mounting NFS: mount -t nfs -o proto=udp,vers=3 10.0.0.2:/tmp/netpan-4577/LTP_nfsstat01.lz6zhgQHoV/3/udp /tmp/netpan-4577/LTP_nfsstat01.lz6zhgQHoV/3/0
nfsstat01 1 TINFO: checking RPC calls for server/client
nfsstat01 1 TINFO: calls 98/0
nfsstat01 1 TINFO: Checking for tracking of RPC calls for server/client
nfsstat01 1 TINFO: new calls 102/0
nfsstat01 1 TPASS: server RPC calls increased
nfsstat01 1 TFAIL: client RPC calls not increased
nfsstat01 1 TINFO: checking NFS calls for server/client
nfsstat01 1 TINFO: calls 2/2
nfsstat01 1 TINFO: Checking for tracking of NFS calls for server/client
nfsstat01 1 TINFO: new calls 3/3
nfsstat01 1 TPASS: server NFS calls increased
nfsstat01 1 TPASS: client NFS calls increased
nfsstat01 2 TINFO: Cleaning up testcase
nfsstat01 2 TINFO: SELinux enabled in enforcing mode, this may affect test results
nfsstat01 2 TINFO: it can be disabled with TST_DISABLE_SELINUX=1 (requires super/root)
nfsstat01 2 TINFO: install seinfo to find used SELinux profiles
nfsstat01 2 TINFO: loaded SELinux profiles: none

Summary:
passed 3
failed 1
skipped 0
warnings 0
<<<execution_status>>>
initiation_status="ok"
duration=1 termination_type=exited termination_id=1 corefile=no
cutime=11 cstime=16
<<<test_end>>>
ltp-pan reported FAIL
========

We can observe the number of RPC client calls is 0, which is wired. And this happens from the kernel commit 57d1ce96d7655 ("nfs: make the rpc_stat per net namespace”). So now we’re not sure the kernel backport of nfs client changes is proper, or the LTP tests / userspace need to be modified.

If no userspace regression, should we revert the Josef’s NFS client-side changes on LTS?

Thanks,
Sherry

Reference:
[1] https://github.com/linux-test-project/ltp/blob/master/testcases/network/nfs/nfsstat01/nfsstat01.sh
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1548036ef1204df65ca5a16e8b199c858cb80075
> 
> thanks,
> 
> greg k-h



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [LTP] [PATCH 1/1] nfsstat01: Update client RPC calls for kernel 6.9
  2024-07-05 14:19   ` Chuck Lever III
  2024-07-06  7:11     ` Greg KH
@ 2024-07-08  4:02     ` Petr Vorel
  1 sibling, 0 replies; 21+ messages in thread
From: Petr Vorel @ 2024-07-08  4:02 UTC (permalink / raw)
  To: Chuck Lever III
  Cc: Calum Mackay, linux-stable, Trond Myklebust, Anna Schumaker,
	Linux NFS Mailing List, kernel-team@fb.com, ltp@lists.linux.it,
	Avinesh Kumar, Neil Brown, Sherry Yang, Josef Bacik, Jeff Layton

Hi all,


> > On Jul 2, 2024, at 6:55 PM, Calum Mackay <calum.mackay@oracle.com> wrote:

> > To clarify…

> > On 02/07/2024 5:54 pm, Calum Mackay wrote:
> >> hi Petr,
> >> I noticed your LTP patch [1][2] which adjusts the nfsstat01 test on v6.9 kernels, to account for Josef's changes [3], which restrict the NFS/RPC stats per-namespace.
> >> I see that Josef's changes were backported, as far back as longterm v5.4,

> > Sorry, that's not quite accurate.

> > Josef's NFS client changes were all backported from v6.9, as far as longterm v5.4.y:

> > 2057a48d0dd0 sunrpc: add a struct rpc_stats arg to rpc_create_args
> > d47151b79e32 nfs: expose /proc/net/sunrpc/nfs in net namespaces
> > 1548036ef120 nfs: make the rpc_stat per net namespace


> > Of Josef's NFS server changes, four were backported from v6.9 to v6.8:

> > 418b9687dece sunrpc: use the struct net as the svc proc private
> > d98416cc2154 nfsd: rename NFSD_NET_* to NFSD_STATS_*
> > 93483ac5fec6 nfsd: expose /proc/net/sunrpc/nfsd in net namespaces
> > 4b14885411f7 nfsd: make all of the nfsd stats per-network namespace

> > and the others remained only in v6.9:

> > ab42f4d9a26f sunrpc: don't change ->sv_stats if it doesn't exist
> > a2214ed588fb nfsd: stop setting ->pg_stats for unused stats
> > f09432386766 sunrpc: pass in the sv_stats struct through svc_create_pooled
> > 3f6ef182f144 sunrpc: remove ->pg_stats from svc_program
> > e41ee44cc6a4 nfsd: remove nfsd_stats, make th_cnt a global counter
> > 16fb9808ab2c nfsd: make svc_stat per-network namespace instead of global



> > I'm wondering if this difference between NFS client, and NFS server, stat behaviour, across kernel versions, may perhaps cause some user confusion?

> As a refresher for the stable folken, Josef's changes make
> nfsstats silo'd, so they no longer show counts from the whole
> system, but only for NFS operations relating to the local net
> namespace. That is a surprising change for some users, tools,
> and testing.

> I'm not clear on whether there are any rules/guidelines around
> LTS backports causing behavior changes that user tools, like
> nfsstat, might be impacted by.

> The client-side nfsstat changes are fully backported to all
> TS kernels. But should they have been?

> The server-side nfsstat changes appear in only v6.9. Should
> they be backported to the other LTS kernels, or not?

First, thanks a lot for having a look into the issue.

It looks to me as a functional change, thus I would not backport
changes unless changes they are needed to be backported (part some larger fix).
Thus maybe revert?

And if backported, I would expect changes on both sides (client and server)
would be backported (not just server side).

Kind regards,
Petr

> >> so your check for kernel version "6.9" in the test may need to be adjusted, if LTP is intended to be run on stable kernels?
> >> best wishes,
> >> calum.
> >> [1] https://lore.kernel.org/ltp/20240620111129.594449-1-pvorel@suse.cz/
> >> [2] https://patchwork.ozlabs.org/project/ltp/ patch/20240620111129.594449-1-pvorel@suse.cz/
> >> [3] https://lore.kernel.org/linux-nfs/ cover.1708026931.git.josef@toxicpanda.com/

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [LTP] [PATCH 1/1] nfsstat01: Update client RPC calls for kernel 6.9
  2024-07-06  7:46       ` Sherry Yang
@ 2024-07-08 10:36         ` Greg KH
  2024-07-08 17:49           ` Chuck Lever III
  0 siblings, 1 reply; 21+ messages in thread
From: Greg KH @ 2024-07-08 10:36 UTC (permalink / raw)
  To: Sherry Yang
  Cc: Chuck Lever III, Calum Mackay, linux-stable, Petr Vorel,
	Trond Myklebust, Anna Schumaker, Linux NFS Mailing List,
	kernel-team@fb.com, ltp@lists.linux.it, Avinesh Kumar, Neil Brown,
	Josef Bacik, Jeff Layton

On Sat, Jul 06, 2024 at 07:46:19AM +0000, Sherry Yang wrote:
> 
> 
> > On Jul 6, 2024, at 12:11 AM, Greg KH <greg@kroah.com> wrote:
> > 
> > On Fri, Jul 05, 2024 at 02:19:18PM +0000, Chuck Lever III wrote:
> >> 
> >> 
> >>> On Jul 2, 2024, at 6:55 PM, Calum Mackay <calum.mackay@oracle.com> wrote:
> >>> 
> >>> To clarify…
> >>> 
> >>> On 02/07/2024 5:54 pm, Calum Mackay wrote:
> >>>> hi Petr,
> >>>> I noticed your LTP patch [1][2] which adjusts the nfsstat01 test on v6.9 kernels, to account for Josef's changes [3], which restrict the NFS/RPC stats per-namespace.
> >>>> I see that Josef's changes were backported, as far back as longterm v5.4,
> >>> 
> >>> Sorry, that's not quite accurate.
> >>> 
> >>> Josef's NFS client changes were all backported from v6.9, as far as longterm v5.4.y:
> >>> 
> >>> 2057a48d0dd0 sunrpc: add a struct rpc_stats arg to rpc_create_args
> >>> d47151b79e32 nfs: expose /proc/net/sunrpc/nfs in net namespaces
> >>> 1548036ef120 nfs: make the rpc_stat per net namespace
> >>> 
> >>> 
> >>> Of Josef's NFS server changes, four were backported from v6.9 to v6.8:
> >>> 
> >>> 418b9687dece sunrpc: use the struct net as the svc proc private
> >>> d98416cc2154 nfsd: rename NFSD_NET_* to NFSD_STATS_*
> >>> 93483ac5fec6 nfsd: expose /proc/net/sunrpc/nfsd in net namespaces
> >>> 4b14885411f7 nfsd: make all of the nfsd stats per-network namespace
> >>> 
> >>> and the others remained only in v6.9:
> >>> 
> >>> ab42f4d9a26f sunrpc: don't change ->sv_stats if it doesn't exist
> >>> a2214ed588fb nfsd: stop setting ->pg_stats for unused stats
> >>> f09432386766 sunrpc: pass in the sv_stats struct through svc_create_pooled
> >>> 3f6ef182f144 sunrpc: remove ->pg_stats from svc_program
> >>> e41ee44cc6a4 nfsd: remove nfsd_stats, make th_cnt a global counter
> >>> 16fb9808ab2c nfsd: make svc_stat per-network namespace instead of global
> >>> 
> >>> 
> >>> 
> >>> I'm wondering if this difference between NFS client, and NFS server, stat behaviour, across kernel versions, may perhaps cause some user confusion?
> >> 
> >> As a refresher for the stable folken, Josef's changes make
> >> nfsstats silo'd, so they no longer show counts from the whole
> >> system, but only for NFS operations relating to the local net
> >> namespace. That is a surprising change for some users, tools,
> >> and testing.
> >> 
> >> I'm not clear on whether there are any rules/guidelines around
> >> LTS backports causing behavior changes that user tools, like
> >> nfsstat, might be impacted by.
> > 
> > The same rules that apply for Linus's tree (i.e. no userspace
> > regressions.)
> 
> Given the current data we have, LTP nfsstat01[1] failed on LTS v5.4.278 because of kernel commit 1548036ef1204 ("nfs:
> make the rpc_stat per net namespace") [2]. Other LTS which backported the same commit are very likely troubled with the same LTP test failure.
> 
> The following are the LTP nfsstat01 failure output
> 
> ========
> network 1 TINFO: initialize 'lhost' 'ltp_ns_veth2' interface
> network 1 TINFO: add local addr 10.0.0.2/24
> network 1 TINFO: add local addr fd00:1:1:1::2/64
> network 1 TINFO: initialize 'rhost' 'ltp_ns_veth1' interface
> network 1 TINFO: add remote addr 10.0.0.1/24
> network 1 TINFO: add remote addr fd00:1:1:1::1/64
> network 1 TINFO: Network config (local -- remote):
> network 1 TINFO: ltp_ns_veth2 -- ltp_ns_veth1
> network 1 TINFO: 10.0.0.2/24 -- 10.0.0.1/24
> network 1 TINFO: fd00:1:1:1::2/64 -- fd00:1:1:1::1/64
> <<<test_start>>>
> tag=veth|nfsstat3_01 stime=1719943586
> cmdline="nfsstat01"
> contacts=""
> analysis=exit
> <<<test_output>>>
> incrementing stop
> nfsstat01 1 TINFO: timeout per run is 0h 20m 0s
> nfsstat01 1 TINFO: setup NFSv3, socket type udp
> nfsstat01 1 TINFO: Mounting NFS: mount -t nfs -o proto=udp,vers=3 10.0.0.2:/tmp/netpan-4577/LTP_nfsstat01.lz6zhgQHoV/3/udp /tmp/netpan-4577/LTP_nfsstat01.lz6zhgQHoV/3/0
> nfsstat01 1 TINFO: checking RPC calls for server/client
> nfsstat01 1 TINFO: calls 98/0
> nfsstat01 1 TINFO: Checking for tracking of RPC calls for server/client
> nfsstat01 1 TINFO: new calls 102/0
> nfsstat01 1 TPASS: server RPC calls increased
> nfsstat01 1 TFAIL: client RPC calls not increased
> nfsstat01 1 TINFO: checking NFS calls for server/client
> nfsstat01 1 TINFO: calls 2/2
> nfsstat01 1 TINFO: Checking for tracking of NFS calls for server/client
> nfsstat01 1 TINFO: new calls 3/3
> nfsstat01 1 TPASS: server NFS calls increased
> nfsstat01 1 TPASS: client NFS calls increased
> nfsstat01 2 TINFO: Cleaning up testcase
> nfsstat01 2 TINFO: SELinux enabled in enforcing mode, this may affect test results
> nfsstat01 2 TINFO: it can be disabled with TST_DISABLE_SELINUX=1 (requires super/root)
> nfsstat01 2 TINFO: install seinfo to find used SELinux profiles
> nfsstat01 2 TINFO: loaded SELinux profiles: none
> 
> Summary:
> passed 3
> failed 1
> skipped 0
> warnings 0
> <<<execution_status>>>
> initiation_status="ok"
> duration=1 termination_type=exited termination_id=1 corefile=no
> cutime=11 cstime=16
> <<<test_end>>>
> ltp-pan reported FAIL
> ========
> 
> We can observe the number of RPC client calls is 0, which is wired. And this happens from the kernel commit 57d1ce96d7655 ("nfs: make the rpc_stat per net namespace”). So now we’re not sure the kernel backport of nfs client changes is proper, or the LTP tests / userspace need to be modified.
> 
> If no userspace regression, should we revert the Josef’s NFS client-side changes on LTS?

This sounds like a regression in Linus's tree too, so why isn't it
reverted there first?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [LTP] [PATCH 1/1] nfsstat01: Update client RPC calls for kernel 6.9
  2024-07-08 10:36         ` Greg KH
@ 2024-07-08 17:49           ` Chuck Lever III
  2024-07-09  6:48             ` Cyril Hrubis
                               ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Chuck Lever III @ 2024-07-08 17:49 UTC (permalink / raw)
  To: Greg KH
  Cc: Sherry Yang, Calum Mackay, linux-stable, Petr Vorel,
	Trond Myklebust, Anna Schumaker, Linux NFS Mailing List,
	kernel-team@fb.com, ltp@lists.linux.it, Avinesh Kumar, Neil Brown,
	Josef Bacik, Jeff Layton



> On Jul 8, 2024, at 6:36 AM, Greg KH <greg@kroah.com> wrote:
> 
> On Sat, Jul 06, 2024 at 07:46:19AM +0000, Sherry Yang wrote:
>> 
>> 
>>> On Jul 6, 2024, at 12:11 AM, Greg KH <greg@kroah.com> wrote:
>>> 
>>> On Fri, Jul 05, 2024 at 02:19:18PM +0000, Chuck Lever III wrote:
>>>> 
>>>> 
>>>>> On Jul 2, 2024, at 6:55 PM, Calum Mackay <calum.mackay@oracle.com> wrote:
>>>>> 
>>>>> To clarify…
>>>>> 
>>>>> On 02/07/2024 5:54 pm, Calum Mackay wrote:
>>>>>> hi Petr,
>>>>>> I noticed your LTP patch [1][2] which adjusts the nfsstat01 test on v6.9 kernels, to account for Josef's changes [3], which restrict the NFS/RPC stats per-namespace.
>>>>>> I see that Josef's changes were backported, as far back as longterm v5.4,
>>>>> 
>>>>> Sorry, that's not quite accurate.
>>>>> 
>>>>> Josef's NFS client changes were all backported from v6.9, as far as longterm v5.4.y:
>>>>> 
>>>>> 2057a48d0dd0 sunrpc: add a struct rpc_stats arg to rpc_create_args
>>>>> d47151b79e32 nfs: expose /proc/net/sunrpc/nfs in net namespaces
>>>>> 1548036ef120 nfs: make the rpc_stat per net namespace
>>>>> 
>>>>> 
>>>>> Of Josef's NFS server changes, four were backported from v6.9 to v6.8:
>>>>> 
>>>>> 418b9687dece sunrpc: use the struct net as the svc proc private
>>>>> d98416cc2154 nfsd: rename NFSD_NET_* to NFSD_STATS_*
>>>>> 93483ac5fec6 nfsd: expose /proc/net/sunrpc/nfsd in net namespaces
>>>>> 4b14885411f7 nfsd: make all of the nfsd stats per-network namespace
>>>>> 
>>>>> and the others remained only in v6.9:
>>>>> 
>>>>> ab42f4d9a26f sunrpc: don't change ->sv_stats if it doesn't exist
>>>>> a2214ed588fb nfsd: stop setting ->pg_stats for unused stats
>>>>> f09432386766 sunrpc: pass in the sv_stats struct through svc_create_pooled
>>>>> 3f6ef182f144 sunrpc: remove ->pg_stats from svc_program
>>>>> e41ee44cc6a4 nfsd: remove nfsd_stats, make th_cnt a global counter
>>>>> 16fb9808ab2c nfsd: make svc_stat per-network namespace instead of global
>>>>> 
>>>>> 
>>>>> 
>>>>> I'm wondering if this difference between NFS client, and NFS server, stat behaviour, across kernel versions, may perhaps cause some user confusion?
>>>> 
>>>> As a refresher for the stable folken, Josef's changes make
>>>> nfsstats silo'd, so they no longer show counts from the whole
>>>> system, but only for NFS operations relating to the local net
>>>> namespace. That is a surprising change for some users, tools,
>>>> and testing.
>>>> 
>>>> I'm not clear on whether there are any rules/guidelines around
>>>> LTS backports causing behavior changes that user tools, like
>>>> nfsstat, might be impacted by.
>>> 
>>> The same rules that apply for Linus's tree (i.e. no userspace
>>> regressions.)
>> 
>> Given the current data we have, LTP nfsstat01[1] failed on LTS v5.4.278 because of kernel commit 1548036ef1204 ("nfs:
>> make the rpc_stat per net namespace") [2]. Other LTS which backported the same commit are very likely troubled with the same LTP test failure.
>> 
>> The following are the LTP nfsstat01 failure output
>> 
>> ========
>> network 1 TINFO: initialize 'lhost' 'ltp_ns_veth2' interface
>> network 1 TINFO: add local addr 10.0.0.2/24
>> network 1 TINFO: add local addr fd00:1:1:1::2/64
>> network 1 TINFO: initialize 'rhost' 'ltp_ns_veth1' interface
>> network 1 TINFO: add remote addr 10.0.0.1/24
>> network 1 TINFO: add remote addr fd00:1:1:1::1/64
>> network 1 TINFO: Network config (local -- remote):
>> network 1 TINFO: ltp_ns_veth2 -- ltp_ns_veth1
>> network 1 TINFO: 10.0.0.2/24 -- 10.0.0.1/24
>> network 1 TINFO: fd00:1:1:1::2/64 -- fd00:1:1:1::1/64
>> <<<test_start>>>
>> tag=veth|nfsstat3_01 stime=1719943586
>> cmdline="nfsstat01"
>> contacts=""
>> analysis=exit
>> <<<test_output>>>
>> incrementing stop
>> nfsstat01 1 TINFO: timeout per run is 0h 20m 0s
>> nfsstat01 1 TINFO: setup NFSv3, socket type udp
>> nfsstat01 1 TINFO: Mounting NFS: mount -t nfs -o proto=udp,vers=3 10.0.0.2:/tmp/netpan-4577/LTP_nfsstat01.lz6zhgQHoV/3/udp /tmp/netpan-4577/LTP_nfsstat01.lz6zhgQHoV/3/0
>> nfsstat01 1 TINFO: checking RPC calls for server/client
>> nfsstat01 1 TINFO: calls 98/0
>> nfsstat01 1 TINFO: Checking for tracking of RPC calls for server/client
>> nfsstat01 1 TINFO: new calls 102/0
>> nfsstat01 1 TPASS: server RPC calls increased
>> nfsstat01 1 TFAIL: client RPC calls not increased
>> nfsstat01 1 TINFO: checking NFS calls for server/client
>> nfsstat01 1 TINFO: calls 2/2
>> nfsstat01 1 TINFO: Checking for tracking of NFS calls for server/client
>> nfsstat01 1 TINFO: new calls 3/3
>> nfsstat01 1 TPASS: server NFS calls increased
>> nfsstat01 1 TPASS: client NFS calls increased
>> nfsstat01 2 TINFO: Cleaning up testcase
>> nfsstat01 2 TINFO: SELinux enabled in enforcing mode, this may affect test results
>> nfsstat01 2 TINFO: it can be disabled with TST_DISABLE_SELINUX=1 (requires super/root)
>> nfsstat01 2 TINFO: install seinfo to find used SELinux profiles
>> nfsstat01 2 TINFO: loaded SELinux profiles: none
>> 
>> Summary:
>> passed 3
>> failed 1
>> skipped 0
>> warnings 0
>> <<<execution_status>>>
>> initiation_status="ok"
>> duration=1 termination_type=exited termination_id=1 corefile=no
>> cutime=11 cstime=16
>> <<<test_end>>>
>> ltp-pan reported FAIL
>> ========
>> 
>> We can observe the number of RPC client calls is 0, which is wired. And this happens from the kernel commit 57d1ce96d7655 ("nfs: make the rpc_stat per net namespace”). So now we’re not sure the kernel backport of nfs client changes is proper, or the LTP tests / userspace need to be modified.
>> 
>> If no userspace regression, should we revert the Josef’s NFS client-side changes on LTS?
> 
> This sounds like a regression in Linus's tree too, so why isn't it
> reverted there first?

There is a change in behavior in the upstream code, but Josef's
patches fix an information leak and make the statistics more
sensible in container environments. I'm not certain that
should be considered a regression, but confess I don't know
the regression rules to this fine a degree of detail.

If it is indeed a regression, how can we go about retaining
both behaviors (selectable by Kconfig or perhaps administrative
UI)?


--
Chuck Lever



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [LTP] [PATCH 1/1] nfsstat01: Update client RPC calls for kernel 6.9
  2024-07-08 17:49           ` Chuck Lever III
@ 2024-07-09  6:48             ` Cyril Hrubis
  2024-07-11 21:18             ` Jeff Layton
  2024-07-12 13:45             ` Thorsten Leemhuis
  2 siblings, 0 replies; 21+ messages in thread
From: Cyril Hrubis @ 2024-07-09  6:48 UTC (permalink / raw)
  To: Chuck Lever III
  Cc: Greg KH, Linux NFS Mailing List, Neil Brown, Jeff Layton,
	Sherry Yang, linux-stable, Josef Bacik, Anna Schumaker,
	Trond Myklebust, Calum Mackay, kernel-team@fb.com,
	ltp@lists.linux.it

Hi!
> There is a change in behavior in the upstream code, but Josef's
> patches fix an information leak and make the statistics more
> sensible in container environments. I'm not certain that
> should be considered a regression, but confess I don't know
> the regression rules to this fine a degree of detail.
> 
> If it is indeed a regression, how can we go about retaining
> both behaviors (selectable by Kconfig or perhaps administrative
> UI)?

That is IMHO the worst solution, every userspace tool would have to be
able to work with both formats for an undefinite amount of time and the
only added value of this approach would be a Kconfig option to enable
information leak...

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [LTP] [PATCH 1/1] nfsstat01: Update client RPC calls for kernel 6.9
  2024-07-08 17:49           ` Chuck Lever III
  2024-07-09  6:48             ` Cyril Hrubis
@ 2024-07-11 21:18             ` Jeff Layton
  2024-07-11 22:58               ` NeilBrown
  2024-07-12 13:45             ` Thorsten Leemhuis
  2 siblings, 1 reply; 21+ messages in thread
From: Jeff Layton @ 2024-07-11 21:18 UTC (permalink / raw)
  To: Chuck Lever III, Greg KH
  Cc: Sherry Yang, Calum Mackay, linux-stable, Petr Vorel,
	Trond Myklebust, Anna Schumaker, Linux NFS Mailing List,
	kernel-team@fb.com, ltp@lists.linux.it, Avinesh Kumar, Neil Brown,
	Josef Bacik

On Mon, 2024-07-08 at 17:49 +0000, Chuck Lever III wrote:
> 
> > On Jul 8, 2024, at 6:36 AM, Greg KH <greg@kroah.com> wrote:
> > 
> > On Sat, Jul 06, 2024 at 07:46:19AM +0000, Sherry Yang wrote:
> > > 
> > > 
> > > > On Jul 6, 2024, at 12:11 AM, Greg KH <greg@kroah.com> wrote:
> > > > 
> > > > On Fri, Jul 05, 2024 at 02:19:18PM +0000, Chuck Lever III wrote:
> > > > > 
> > > > > 
> > > > > > On Jul 2, 2024, at 6:55 PM, Calum Mackay <calum.mackay@oracle.com> wrote:
> > > > > > 
> > > > > > To clarify…
> > > > > > 
> > > > > > On 02/07/2024 5:54 pm, Calum Mackay wrote:
> > > > > > > hi Petr,
> > > > > > > I noticed your LTP patch [1][2] which adjusts the nfsstat01 test on v6.9 kernels, to account for Josef's changes [3], which restrict the NFS/RPC stats per-namespace.
> > > > > > > I see that Josef's changes were backported, as far back as longterm v5.4,
> > > > > > 
> > > > > > Sorry, that's not quite accurate.
> > > > > > 
> > > > > > Josef's NFS client changes were all backported from v6.9, as far as longterm v5.4.y:
> > > > > > 
> > > > > > 2057a48d0dd0 sunrpc: add a struct rpc_stats arg to rpc_create_args
> > > > > > d47151b79e32 nfs: expose /proc/net/sunrpc/nfs in net namespaces
> > > > > > 1548036ef120 nfs: make the rpc_stat per net namespace
> > > > > > 
> > > > > > 
> > > > > > Of Josef's NFS server changes, four were backported from v6.9 to v6.8:
> > > > > > 
> > > > > > 418b9687dece sunrpc: use the struct net as the svc proc private
> > > > > > d98416cc2154 nfsd: rename NFSD_NET_* to NFSD_STATS_*
> > > > > > 93483ac5fec6 nfsd: expose /proc/net/sunrpc/nfsd in net namespaces
> > > > > > 4b14885411f7 nfsd: make all of the nfsd stats per-network namespace
> > > > > > 
> > > > > > and the others remained only in v6.9:
> > > > > > 
> > > > > > ab42f4d9a26f sunrpc: don't change ->sv_stats if it doesn't exist
> > > > > > a2214ed588fb nfsd: stop setting ->pg_stats for unused stats
> > > > > > f09432386766 sunrpc: pass in the sv_stats struct through svc_create_pooled
> > > > > > 3f6ef182f144 sunrpc: remove ->pg_stats from svc_program
> > > > > > e41ee44cc6a4 nfsd: remove nfsd_stats, make th_cnt a global counter
> > > > > > 16fb9808ab2c nfsd: make svc_stat per-network namespace instead of global
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > I'm wondering if this difference between NFS client, and NFS server, stat behaviour, across kernel versions, may perhaps cause some user confusion?
> > > > > 
> > > > > As a refresher for the stable folken, Josef's changes make
> > > > > nfsstats silo'd, so they no longer show counts from the whole
> > > > > system, but only for NFS operations relating to the local net
> > > > > namespace. That is a surprising change for some users, tools,
> > > > > and testing.
> > > > > 
> > > > > I'm not clear on whether there are any rules/guidelines around
> > > > > LTS backports causing behavior changes that user tools, like
> > > > > nfsstat, might be impacted by.
> > > > 
> > > > The same rules that apply for Linus's tree (i.e. no userspace
> > > > regressions.)
> > > 
> > > Given the current data we have, LTP nfsstat01[1] failed on LTS v5.4.278 because of kernel commit 1548036ef1204 ("nfs:
> > > make the rpc_stat per net namespace") [2]. Other LTS which backported the same commit are very likely troubled with the same LTP test failure.
> > > 
> > > The following are the LTP nfsstat01 failure output
> > > 
> > > ========
> > > network 1 TINFO: initialize 'lhost' 'ltp_ns_veth2' interface
> > > network 1 TINFO: add local addr 10.0.0.2/24
> > > network 1 TINFO: add local addr fd00:1:1:1::2/64
> > > network 1 TINFO: initialize 'rhost' 'ltp_ns_veth1' interface
> > > network 1 TINFO: add remote addr 10.0.0.1/24
> > > network 1 TINFO: add remote addr fd00:1:1:1::1/64
> > > network 1 TINFO: Network config (local -- remote):
> > > network 1 TINFO: ltp_ns_veth2 -- ltp_ns_veth1
> > > network 1 TINFO: 10.0.0.2/24 -- 10.0.0.1/24
> > > network 1 TINFO: fd00:1:1:1::2/64 -- fd00:1:1:1::1/64
> > > <<<test_start>>>
> > > tag=veth|nfsstat3_01 stime=1719943586
> > > cmdline="nfsstat01"
> > > contacts=""
> > > analysis=exit
> > > <<<test_output>>>
> > > incrementing stop
> > > nfsstat01 1 TINFO: timeout per run is 0h 20m 0s
> > > nfsstat01 1 TINFO: setup NFSv3, socket type udp
> > > nfsstat01 1 TINFO: Mounting NFS: mount -t nfs -o proto=udp,vers=3 10.0.0.2:/tmp/netpan-4577/LTP_nfsstat01.lz6zhgQHoV/3/udp /tmp/netpan-4577/LTP_nfsstat01.lz6zhgQHoV/3/0
> > > nfsstat01 1 TINFO: checking RPC calls for server/client
> > > nfsstat01 1 TINFO: calls 98/0
> > > nfsstat01 1 TINFO: Checking for tracking of RPC calls for server/client
> > > nfsstat01 1 TINFO: new calls 102/0
> > > nfsstat01 1 TPASS: server RPC calls increased
> > > nfsstat01 1 TFAIL: client RPC calls not increased
> > > nfsstat01 1 TINFO: checking NFS calls for server/client
> > > nfsstat01 1 TINFO: calls 2/2
> > > nfsstat01 1 TINFO: Checking for tracking of NFS calls for server/client
> > > nfsstat01 1 TINFO: new calls 3/3
> > > nfsstat01 1 TPASS: server NFS calls increased
> > > nfsstat01 1 TPASS: client NFS calls increased
> > > nfsstat01 2 TINFO: Cleaning up testcase
> > > nfsstat01 2 TINFO: SELinux enabled in enforcing mode, this may affect test results
> > > nfsstat01 2 TINFO: it can be disabled with TST_DISABLE_SELINUX=1 (requires super/root)
> > > nfsstat01 2 TINFO: install seinfo to find used SELinux profiles
> > > nfsstat01 2 TINFO: loaded SELinux profiles: none
> > > 
> > > Summary:
> > > passed 3
> > > failed 1
> > > skipped 0
> > > warnings 0
> > > <<<execution_status>>>
> > > initiation_status="ok"
> > > duration=1 termination_type=exited termination_id=1 corefile=no
> > > cutime=11 cstime=16
> > > <<<test_end>>>
> > > ltp-pan reported FAIL
> > > ========
> > > 
> > > We can observe the number of RPC client calls is 0, which is wired. And this happens from the kernel commit 57d1ce96d7655 ("nfs: make the rpc_stat per net namespace”). So now we’re not sure the kernel backport of nfs client changes is proper, or the LTP tests / userspace need to be modified.
> > > 
> > > If no userspace regression, should we revert the Josef’s NFS client-side changes on LTS?
> > 
> > This sounds like a regression in Linus's tree too, so why isn't it
> > reverted there first?
> 
> There is a change in behavior in the upstream code, but Josef's
> patches fix an information leak and make the statistics more
> sensible in container environments. I'm not certain that
> should be considered a regression, but confess I don't know
> the regression rules to this fine a degree of detail.
> 
> If it is indeed a regression, how can we go about retaining
> both behaviors (selectable by Kconfig or perhaps administrative
> UI)?
> 

I'd argue that the old behavior was a bug, and that Josef fixed
it. These stats should probably have been made per-net when all of the
original nfsd namespace work was done, but no one noticed until
recently. Whoops. 

A couple of hacky ideas for how we might deal with this:

1/ add a new line to the output of /proc/net/rpc/nfsd. It could just
say "per-net\n" or "per-net <netns_id_number>\n" or something. nfsstat
should ignore it, but LTP test could look for it and handle it
appropriately. That could even be useful later for nfsstat too I guess.

2/ move the file to a new name and make the old filename be a symlink
to the new one. nfsstat would still work, but LTP would be able to see
whether it was a symlink to detect the difference...or could just make
a new symlink that points to the file and LTP could look for its
presence.

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [LTP] [PATCH 1/1] nfsstat01: Update client RPC calls for kernel 6.9
  2024-07-11 21:18             ` Jeff Layton
@ 2024-07-11 22:58               ` NeilBrown
  2024-07-12  0:40                 ` Jeff Layton
  0 siblings, 1 reply; 21+ messages in thread
From: NeilBrown @ 2024-07-11 22:58 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Chuck Lever III, Greg KH, Sherry Yang, Calum Mackay, linux-stable,
	Petr Vorel, Trond Myklebust, Anna Schumaker,
	Linux NFS Mailing List, kernel-team@fb.com, ltp@lists.linux.it,
	Avinesh Kumar, Josef Bacik

On Fri, 12 Jul 2024, Jeff Layton wrote:
> On Mon, 2024-07-08 at 17:49 +0000, Chuck Lever III wrote:
> > 
> > > On Jul 8, 2024, at 6:36 AM, Greg KH <greg@kroah.com> wrote:
> > > 
> > > On Sat, Jul 06, 2024 at 07:46:19AM +0000, Sherry Yang wrote:
> > > > 
> > > > 
> > > > > On Jul 6, 2024, at 12:11 AM, Greg KH <greg@kroah.com> wrote:
> > > > > 
> > > > > On Fri, Jul 05, 2024 at 02:19:18PM +0000, Chuck Lever III wrote:
> > > > > > 
> > > > > > 
> > > > > > > On Jul 2, 2024, at 6:55 PM, Calum Mackay <calum.mackay@oracle.com> wrote:
> > > > > > > 
> > > > > > > To clarify…
> > > > > > > 
> > > > > > > On 02/07/2024 5:54 pm, Calum Mackay wrote:
> > > > > > > > hi Petr,
> > > > > > > > I noticed your LTP patch [1][2] which adjusts the nfsstat01 test on v6.9 kernels, to account for Josef's changes [3], which restrict the NFS/RPC stats per-namespace.
> > > > > > > > I see that Josef's changes were backported, as far back as longterm v5.4,
> > > > > > > 
> > > > > > > Sorry, that's not quite accurate.
> > > > > > > 
> > > > > > > Josef's NFS client changes were all backported from v6.9, as far as longterm v5.4.y:
> > > > > > > 
> > > > > > > 2057a48d0dd0 sunrpc: add a struct rpc_stats arg to rpc_create_args
> > > > > > > d47151b79e32 nfs: expose /proc/net/sunrpc/nfs in net namespaces
> > > > > > > 1548036ef120 nfs: make the rpc_stat per net namespace
> > > > > > > 
> > > > > > > 
> > > > > > > Of Josef's NFS server changes, four were backported from v6.9 to v6.8:
> > > > > > > 
> > > > > > > 418b9687dece sunrpc: use the struct net as the svc proc private
> > > > > > > d98416cc2154 nfsd: rename NFSD_NET_* to NFSD_STATS_*
> > > > > > > 93483ac5fec6 nfsd: expose /proc/net/sunrpc/nfsd in net namespaces
> > > > > > > 4b14885411f7 nfsd: make all of the nfsd stats per-network namespace
> > > > > > > 
> > > > > > > and the others remained only in v6.9:
> > > > > > > 
> > > > > > > ab42f4d9a26f sunrpc: don't change ->sv_stats if it doesn't exist
> > > > > > > a2214ed588fb nfsd: stop setting ->pg_stats for unused stats
> > > > > > > f09432386766 sunrpc: pass in the sv_stats struct through svc_create_pooled
> > > > > > > 3f6ef182f144 sunrpc: remove ->pg_stats from svc_program
> > > > > > > e41ee44cc6a4 nfsd: remove nfsd_stats, make th_cnt a global counter
> > > > > > > 16fb9808ab2c nfsd: make svc_stat per-network namespace instead of global
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > I'm wondering if this difference between NFS client, and NFS server, stat behaviour, across kernel versions, may perhaps cause some user confusion?
> > > > > > 
> > > > > > As a refresher for the stable folken, Josef's changes make
> > > > > > nfsstats silo'd, so they no longer show counts from the whole
> > > > > > system, but only for NFS operations relating to the local net
> > > > > > namespace. That is a surprising change for some users, tools,
> > > > > > and testing.
> > > > > > 
> > > > > > I'm not clear on whether there are any rules/guidelines around
> > > > > > LTS backports causing behavior changes that user tools, like
> > > > > > nfsstat, might be impacted by.
> > > > > 
> > > > > The same rules that apply for Linus's tree (i.e. no userspace
> > > > > regressions.)
> > > > 
> > > > Given the current data we have, LTP nfsstat01[1] failed on LTS v5.4.278 because of kernel commit 1548036ef1204 ("nfs:
> > > > make the rpc_stat per net namespace") [2]. Other LTS which backported the same commit are very likely troubled with the same LTP test failure.
> > > > 
> > > > The following are the LTP nfsstat01 failure output
> > > > 
> > > > ========
> > > > network 1 TINFO: initialize 'lhost' 'ltp_ns_veth2' interface
> > > > network 1 TINFO: add local addr 10.0.0.2/24
> > > > network 1 TINFO: add local addr fd00:1:1:1::2/64
> > > > network 1 TINFO: initialize 'rhost' 'ltp_ns_veth1' interface
> > > > network 1 TINFO: add remote addr 10.0.0.1/24
> > > > network 1 TINFO: add remote addr fd00:1:1:1::1/64
> > > > network 1 TINFO: Network config (local -- remote):
> > > > network 1 TINFO: ltp_ns_veth2 -- ltp_ns_veth1
> > > > network 1 TINFO: 10.0.0.2/24 -- 10.0.0.1/24
> > > > network 1 TINFO: fd00:1:1:1::2/64 -- fd00:1:1:1::1/64
> > > > <<<test_start>>>
> > > > tag=veth|nfsstat3_01 stime=1719943586
> > > > cmdline="nfsstat01"
> > > > contacts=""
> > > > analysis=exit
> > > > <<<test_output>>>
> > > > incrementing stop
> > > > nfsstat01 1 TINFO: timeout per run is 0h 20m 0s
> > > > nfsstat01 1 TINFO: setup NFSv3, socket type udp
> > > > nfsstat01 1 TINFO: Mounting NFS: mount -t nfs -o proto=udp,vers=3 10.0.0.2:/tmp/netpan-4577/LTP_nfsstat01.lz6zhgQHoV/3/udp /tmp/netpan-4577/LTP_nfsstat01.lz6zhgQHoV/3/0
> > > > nfsstat01 1 TINFO: checking RPC calls for server/client
> > > > nfsstat01 1 TINFO: calls 98/0
> > > > nfsstat01 1 TINFO: Checking for tracking of RPC calls for server/client
> > > > nfsstat01 1 TINFO: new calls 102/0
> > > > nfsstat01 1 TPASS: server RPC calls increased
> > > > nfsstat01 1 TFAIL: client RPC calls not increased
> > > > nfsstat01 1 TINFO: checking NFS calls for server/client
> > > > nfsstat01 1 TINFO: calls 2/2
> > > > nfsstat01 1 TINFO: Checking for tracking of NFS calls for server/client
> > > > nfsstat01 1 TINFO: new calls 3/3
> > > > nfsstat01 1 TPASS: server NFS calls increased
> > > > nfsstat01 1 TPASS: client NFS calls increased
> > > > nfsstat01 2 TINFO: Cleaning up testcase
> > > > nfsstat01 2 TINFO: SELinux enabled in enforcing mode, this may affect test results
> > > > nfsstat01 2 TINFO: it can be disabled with TST_DISABLE_SELINUX=1 (requires super/root)
> > > > nfsstat01 2 TINFO: install seinfo to find used SELinux profiles
> > > > nfsstat01 2 TINFO: loaded SELinux profiles: none
> > > > 
> > > > Summary:
> > > > passed 3
> > > > failed 1
> > > > skipped 0
> > > > warnings 0
> > > > <<<execution_status>>>
> > > > initiation_status="ok"
> > > > duration=1 termination_type=exited termination_id=1 corefile=no
> > > > cutime=11 cstime=16
> > > > <<<test_end>>>
> > > > ltp-pan reported FAIL
> > > > ========
> > > > 
> > > > We can observe the number of RPC client calls is 0, which is wired. And this happens from the kernel commit 57d1ce96d7655 ("nfs: make the rpc_stat per net namespace”). So now we’re not sure the kernel backport of nfs client changes is proper, or the LTP tests / userspace need to be modified.
> > > > 
> > > > If no userspace regression, should we revert the Josef’s NFS client-side changes on LTS?
> > > 
> > > This sounds like a regression in Linus's tree too, so why isn't it
> > > reverted there first?
> > 
> > There is a change in behavior in the upstream code, but Josef's
> > patches fix an information leak and make the statistics more
> > sensible in container environments. I'm not certain that
> > should be considered a regression, but confess I don't know
> > the regression rules to this fine a degree of detail.
> > 
> > If it is indeed a regression, how can we go about retaining
> > both behaviors (selectable by Kconfig or perhaps administrative
> > UI)?
> > 
> 
> I'd argue that the old behavior was a bug, and that Josef fixed
> it. These stats should probably have been made per-net when all of the
> original nfsd namespace work was done, but no one noticed until
> recently. Whoops. 
> 
> A couple of hacky ideas for how we might deal with this:
> 
> 1/ add a new line to the output of /proc/net/rpc/nfsd. It could just
> say "per-net\n" or "per-net <netns_id_number>\n" or something. nfsstat
> should ignore it, but LTP test could look for it and handle it
> appropriately. That could even be useful later for nfsstat too I guess.
> 
> 2/ move the file to a new name and make the old filename be a symlink
> to the new one. nfsstat would still work, but LTP would be able to see
> whether it was a symlink to detect the difference...or could just make
> a new symlink that points to the file and LTP could look for its
> presence.

I don't think it makes sense to present a solution which requires
LTP to be modified.  If we are willing to modify LTP, then we should
modify it to work with the per-net stats.

I think we need to create a new interface for the per-net stats, then
deprecate the old interface and remove it in (say) 2 years.  That given
LTP time to update, and means that an old LTP won't give incorrect
numbers, it will simply fail.

All we need to do is bikeshed the new interface.
  netlink ?
  /proc/net/rpc-pernet/nfsd ?

This means that we still need to keep the combined stats, or to combine
all the per-net stats on each access.

NeilBrown

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [LTP] [PATCH 1/1] nfsstat01: Update client RPC calls for kernel 6.9
  2024-07-11 22:58               ` NeilBrown
@ 2024-07-12  0:40                 ` Jeff Layton
  2024-07-12  6:12                   ` NeilBrown
  0 siblings, 1 reply; 21+ messages in thread
From: Jeff Layton @ 2024-07-12  0:40 UTC (permalink / raw)
  To: NeilBrown
  Cc: Chuck Lever III, Greg KH, Sherry Yang, Calum Mackay, linux-stable,
	Petr Vorel, Trond Myklebust, Anna Schumaker,
	Linux NFS Mailing List, kernel-team@fb.com, ltp@lists.linux.it,
	Avinesh Kumar, Josef Bacik

On Fri, 2024-07-12 at 08:58 +1000, NeilBrown wrote:
> On Fri, 12 Jul 2024, Jeff Layton wrote:
> > On Mon, 2024-07-08 at 17:49 +0000, Chuck Lever III wrote:
> > > 
> > > > On Jul 8, 2024, at 6:36 AM, Greg KH <greg@kroah.com> wrote:
> > > > 
> > > > On Sat, Jul 06, 2024 at 07:46:19AM +0000, Sherry Yang wrote:
> > > > > 
> > > > > 
> > > > > > On Jul 6, 2024, at 12:11 AM, Greg KH <greg@kroah.com> wrote:
> > > > > > 
> > > > > > On Fri, Jul 05, 2024 at 02:19:18PM +0000, Chuck Lever III wrote:
> > > > > > > 
> > > > > > > 
> > > > > > > > On Jul 2, 2024, at 6:55 PM, Calum Mackay <calum.mackay@oracle.com> wrote:
> > > > > > > > 
> > > > > > > > To clarify…
> > > > > > > > 
> > > > > > > > On 02/07/2024 5:54 pm, Calum Mackay wrote:
> > > > > > > > > hi Petr,
> > > > > > > > > I noticed your LTP patch [1][2] which adjusts the nfsstat01 test on v6.9 kernels, to account for Josef's changes [3], which restrict the NFS/RPC stats per-namespace.
> > > > > > > > > I see that Josef's changes were backported, as far back as longterm v5.4,
> > > > > > > > 
> > > > > > > > Sorry, that's not quite accurate.
> > > > > > > > 
> > > > > > > > Josef's NFS client changes were all backported from v6.9, as far as longterm v5.4.y:
> > > > > > > > 
> > > > > > > > 2057a48d0dd0 sunrpc: add a struct rpc_stats arg to rpc_create_args
> > > > > > > > d47151b79e32 nfs: expose /proc/net/sunrpc/nfs in net namespaces
> > > > > > > > 1548036ef120 nfs: make the rpc_stat per net namespace
> > > > > > > > 
> > > > > > > > 
> > > > > > > > Of Josef's NFS server changes, four were backported from v6.9 to v6.8:
> > > > > > > > 
> > > > > > > > 418b9687dece sunrpc: use the struct net as the svc proc private
> > > > > > > > d98416cc2154 nfsd: rename NFSD_NET_* to NFSD_STATS_*
> > > > > > > > 93483ac5fec6 nfsd: expose /proc/net/sunrpc/nfsd in net namespaces
> > > > > > > > 4b14885411f7 nfsd: make all of the nfsd stats per-network namespace
> > > > > > > > 
> > > > > > > > and the others remained only in v6.9:
> > > > > > > > 
> > > > > > > > ab42f4d9a26f sunrpc: don't change ->sv_stats if it doesn't exist
> > > > > > > > a2214ed588fb nfsd: stop setting ->pg_stats for unused stats
> > > > > > > > f09432386766 sunrpc: pass in the sv_stats struct through svc_create_pooled
> > > > > > > > 3f6ef182f144 sunrpc: remove ->pg_stats from svc_program
> > > > > > > > e41ee44cc6a4 nfsd: remove nfsd_stats, make th_cnt a global counter
> > > > > > > > 16fb9808ab2c nfsd: make svc_stat per-network namespace instead of global
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > I'm wondering if this difference between NFS client, and NFS server, stat behaviour, across kernel versions, may perhaps cause some user confusion?
> > > > > > > 
> > > > > > > As a refresher for the stable folken, Josef's changes make
> > > > > > > nfsstats silo'd, so they no longer show counts from the whole
> > > > > > > system, but only for NFS operations relating to the local net
> > > > > > > namespace. That is a surprising change for some users, tools,
> > > > > > > and testing.
> > > > > > > 
> > > > > > > I'm not clear on whether there are any rules/guidelines around
> > > > > > > LTS backports causing behavior changes that user tools, like
> > > > > > > nfsstat, might be impacted by.
> > > > > > 
> > > > > > The same rules that apply for Linus's tree (i.e. no userspace
> > > > > > regressions.)
> > > > > 
> > > > > Given the current data we have, LTP nfsstat01[1] failed on LTS v5.4.278 because of kernel commit 1548036ef1204 ("nfs:
> > > > > make the rpc_stat per net namespace") [2]. Other LTS which backported the same commit are very likely troubled with the same LTP test failure.
> > > > > 
> > > > > The following are the LTP nfsstat01 failure output
> > > > > 
> > > > > ========
> > > > > network 1 TINFO: initialize 'lhost' 'ltp_ns_veth2' interface
> > > > > network 1 TINFO: add local addr 10.0.0.2/24
> > > > > network 1 TINFO: add local addr fd00:1:1:1::2/64
> > > > > network 1 TINFO: initialize 'rhost' 'ltp_ns_veth1' interface
> > > > > network 1 TINFO: add remote addr 10.0.0.1/24
> > > > > network 1 TINFO: add remote addr fd00:1:1:1::1/64
> > > > > network 1 TINFO: Network config (local -- remote):
> > > > > network 1 TINFO: ltp_ns_veth2 -- ltp_ns_veth1
> > > > > network 1 TINFO: 10.0.0.2/24 -- 10.0.0.1/24
> > > > > network 1 TINFO: fd00:1:1:1::2/64 -- fd00:1:1:1::1/64
> > > > > <<<test_start>>>
> > > > > tag=veth|nfsstat3_01 stime=1719943586
> > > > > cmdline="nfsstat01"
> > > > > contacts=""
> > > > > analysis=exit
> > > > > <<<test_output>>>
> > > > > incrementing stop
> > > > > nfsstat01 1 TINFO: timeout per run is 0h 20m 0s
> > > > > nfsstat01 1 TINFO: setup NFSv3, socket type udp
> > > > > nfsstat01 1 TINFO: Mounting NFS: mount -t nfs -o proto=udp,vers=3 10.0.0.2:/tmp/netpan-4577/LTP_nfsstat01.lz6zhgQHoV/3/udp /tmp/netpan-4577/LTP_nfsstat01.lz6zhgQHoV/3/0
> > > > > nfsstat01 1 TINFO: checking RPC calls for server/client
> > > > > nfsstat01 1 TINFO: calls 98/0
> > > > > nfsstat01 1 TINFO: Checking for tracking of RPC calls for server/client
> > > > > nfsstat01 1 TINFO: new calls 102/0
> > > > > nfsstat01 1 TPASS: server RPC calls increased
> > > > > nfsstat01 1 TFAIL: client RPC calls not increased
> > > > > nfsstat01 1 TINFO: checking NFS calls for server/client
> > > > > nfsstat01 1 TINFO: calls 2/2
> > > > > nfsstat01 1 TINFO: Checking for tracking of NFS calls for server/client
> > > > > nfsstat01 1 TINFO: new calls 3/3
> > > > > nfsstat01 1 TPASS: server NFS calls increased
> > > > > nfsstat01 1 TPASS: client NFS calls increased
> > > > > nfsstat01 2 TINFO: Cleaning up testcase
> > > > > nfsstat01 2 TINFO: SELinux enabled in enforcing mode, this may affect test results
> > > > > nfsstat01 2 TINFO: it can be disabled with TST_DISABLE_SELINUX=1 (requires super/root)
> > > > > nfsstat01 2 TINFO: install seinfo to find used SELinux profiles
> > > > > nfsstat01 2 TINFO: loaded SELinux profiles: none
> > > > > 
> > > > > Summary:
> > > > > passed 3
> > > > > failed 1
> > > > > skipped 0
> > > > > warnings 0
> > > > > <<<execution_status>>>
> > > > > initiation_status="ok"
> > > > > duration=1 termination_type=exited termination_id=1 corefile=no
> > > > > cutime=11 cstime=16
> > > > > <<<test_end>>>
> > > > > ltp-pan reported FAIL
> > > > > ========
> > > > > 
> > > > > We can observe the number of RPC client calls is 0, which is wired. And this happens from the kernel commit 57d1ce96d7655 ("nfs: make the rpc_stat per net namespace”). So now we’re not sure the kernel backport of nfs client changes is proper, or the LTP tests / userspace need to be modified.
> > > > > 
> > > > > If no userspace regression, should we revert the Josef’s NFS client-side changes on LTS?
> > > > 
> > > > This sounds like a regression in Linus's tree too, so why isn't it
> > > > reverted there first?
> > > 
> > > There is a change in behavior in the upstream code, but Josef's
> > > patches fix an information leak and make the statistics more
> > > sensible in container environments. I'm not certain that
> > > should be considered a regression, but confess I don't know
> > > the regression rules to this fine a degree of detail.
> > > 
> > > If it is indeed a regression, how can we go about retaining
> > > both behaviors (selectable by Kconfig or perhaps administrative
> > > UI)?
> > > 
> > 
> > I'd argue that the old behavior was a bug, and that Josef fixed
> > it. These stats should probably have been made per-net when all of the
> > original nfsd namespace work was done, but no one noticed until
> > recently. Whoops. 
> > 
> > A couple of hacky ideas for how we might deal with this:
> > 
> > 1/ add a new line to the output of /proc/net/rpc/nfsd. It could just
> > say "per-net\n" or "per-net <netns_id_number>\n" or something. nfsstat
> > should ignore it, but LTP test could look for it and handle it
> > appropriately. That could even be useful later for nfsstat too I guess.
> > 
> > 2/ move the file to a new name and make the old filename be a symlink
> > to the new one. nfsstat would still work, but LTP would be able to see
> > whether it was a symlink to detect the difference...or could just make
> > a new symlink that points to the file and LTP could look for its
> > presence.
> 
> I don't think it makes sense to present a solution which requires
> LTP to be modified.  If we are willing to modify LTP, then we should
> modify it to work with the per-net stats.
> 
> I think we need to create a new interface for the per-net stats, then
> deprecate the old interface and remove it in (say) 2 years.  That given
> LTP time to update, and means that an old LTP won't give incorrect
> numbers, it will simply fail.
> 
> All we need to do is bikeshed the new interface.
>   netlink ?
>   /proc/net/rpc-pernet/nfsd ?
> 
> This means that we still need to keep the combined stats, or to combine
> all the per-net stats on each access.
> 

How much of this functionality would we need to restore?

Prior to Josef's patches, you would get info about global stats from
relevant stats procfiles in a container. That seems like an information
leak to me, but fixing that is probably going to break _somebody_.
Where do we draw the line and why?

LTP is just a testsuite. Asking them to alter tests in order to cope
with a bugfix seems entirely reasonable to me. If someone can make a
case for real-world applications that rely on the old semantics, then
I'd be more open to changing this, but I just don't see the upside of
restoring legacy behavior here.
-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [LTP] [PATCH 1/1] nfsstat01: Update client RPC calls for kernel 6.9
  2024-07-12  0:40                 ` Jeff Layton
@ 2024-07-12  6:12                   ` NeilBrown
  2024-07-12 10:16                     ` Jeff Layton
  0 siblings, 1 reply; 21+ messages in thread
From: NeilBrown @ 2024-07-12  6:12 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Chuck Lever III, Greg KH, Sherry Yang, Calum Mackay, linux-stable,
	Petr Vorel, Trond Myklebust, Anna Schumaker,
	Linux NFS Mailing List, kernel-team@fb.com, ltp@lists.linux.it,
	Avinesh Kumar, Josef Bacik

On Fri, 12 Jul 2024, Jeff Layton wrote:
> On Fri, 2024-07-12 at 08:58 +1000, NeilBrown wrote:
> > On Fri, 12 Jul 2024, Jeff Layton wrote:
> > > On Mon, 2024-07-08 at 17:49 +0000, Chuck Lever III wrote:
> > > > 
> > > > > On Jul 8, 2024, at 6:36 AM, Greg KH <greg@kroah.com> wrote:
> > > > > 
> > > > > On Sat, Jul 06, 2024 at 07:46:19AM +0000, Sherry Yang wrote:
> > > > > > 
> > > > > > 
> > > > > > > On Jul 6, 2024, at 12:11 AM, Greg KH <greg@kroah.com> wrote:
> > > > > > > 
> > > > > > > On Fri, Jul 05, 2024 at 02:19:18PM +0000, Chuck Lever III wrote:
> > > > > > > > 
> > > > > > > > 
> > > > > > > > > On Jul 2, 2024, at 6:55 PM, Calum Mackay <calum.mackay@oracle.com> wrote:
> > > > > > > > > 
> > > > > > > > > To clarify…
> > > > > > > > > 
> > > > > > > > > On 02/07/2024 5:54 pm, Calum Mackay wrote:
> > > > > > > > > > hi Petr,
> > > > > > > > > > I noticed your LTP patch [1][2] which adjusts the nfsstat01 test on v6.9 kernels, to account for Josef's changes [3], which restrict the NFS/RPC stats per-namespace.
> > > > > > > > > > I see that Josef's changes were backported, as far back as longterm v5.4,
> > > > > > > > > 
> > > > > > > > > Sorry, that's not quite accurate.
> > > > > > > > > 
> > > > > > > > > Josef's NFS client changes were all backported from v6.9, as far as longterm v5.4.y:
> > > > > > > > > 
> > > > > > > > > 2057a48d0dd0 sunrpc: add a struct rpc_stats arg to rpc_create_args
> > > > > > > > > d47151b79e32 nfs: expose /proc/net/sunrpc/nfs in net namespaces
> > > > > > > > > 1548036ef120 nfs: make the rpc_stat per net namespace
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > Of Josef's NFS server changes, four were backported from v6.9 to v6.8:
> > > > > > > > > 
> > > > > > > > > 418b9687dece sunrpc: use the struct net as the svc proc private
> > > > > > > > > d98416cc2154 nfsd: rename NFSD_NET_* to NFSD_STATS_*
> > > > > > > > > 93483ac5fec6 nfsd: expose /proc/net/sunrpc/nfsd in net namespaces
> > > > > > > > > 4b14885411f7 nfsd: make all of the nfsd stats per-network namespace
> > > > > > > > > 
> > > > > > > > > and the others remained only in v6.9:
> > > > > > > > > 
> > > > > > > > > ab42f4d9a26f sunrpc: don't change ->sv_stats if it doesn't exist
> > > > > > > > > a2214ed588fb nfsd: stop setting ->pg_stats for unused stats
> > > > > > > > > f09432386766 sunrpc: pass in the sv_stats struct through svc_create_pooled
> > > > > > > > > 3f6ef182f144 sunrpc: remove ->pg_stats from svc_program
> > > > > > > > > e41ee44cc6a4 nfsd: remove nfsd_stats, make th_cnt a global counter
> > > > > > > > > 16fb9808ab2c nfsd: make svc_stat per-network namespace instead of global
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > I'm wondering if this difference between NFS client, and NFS server, stat behaviour, across kernel versions, may perhaps cause some user confusion?
> > > > > > > > 
> > > > > > > > As a refresher for the stable folken, Josef's changes make
> > > > > > > > nfsstats silo'd, so they no longer show counts from the whole
> > > > > > > > system, but only for NFS operations relating to the local net
> > > > > > > > namespace. That is a surprising change for some users, tools,
> > > > > > > > and testing.
> > > > > > > > 
> > > > > > > > I'm not clear on whether there are any rules/guidelines around
> > > > > > > > LTS backports causing behavior changes that user tools, like
> > > > > > > > nfsstat, might be impacted by.
> > > > > > > 
> > > > > > > The same rules that apply for Linus's tree (i.e. no userspace
> > > > > > > regressions.)
> > > > > > 
> > > > > > Given the current data we have, LTP nfsstat01[1] failed on LTS v5.4.278 because of kernel commit 1548036ef1204 ("nfs:
> > > > > > make the rpc_stat per net namespace") [2]. Other LTS which backported the same commit are very likely troubled with the same LTP test failure.
> > > > > > 
> > > > > > The following are the LTP nfsstat01 failure output
> > > > > > 
> > > > > > ========
> > > > > > network 1 TINFO: initialize 'lhost' 'ltp_ns_veth2' interface
> > > > > > network 1 TINFO: add local addr 10.0.0.2/24
> > > > > > network 1 TINFO: add local addr fd00:1:1:1::2/64
> > > > > > network 1 TINFO: initialize 'rhost' 'ltp_ns_veth1' interface
> > > > > > network 1 TINFO: add remote addr 10.0.0.1/24
> > > > > > network 1 TINFO: add remote addr fd00:1:1:1::1/64
> > > > > > network 1 TINFO: Network config (local -- remote):
> > > > > > network 1 TINFO: ltp_ns_veth2 -- ltp_ns_veth1
> > > > > > network 1 TINFO: 10.0.0.2/24 -- 10.0.0.1/24
> > > > > > network 1 TINFO: fd00:1:1:1::2/64 -- fd00:1:1:1::1/64
> > > > > > <<<test_start>>>
> > > > > > tag=veth|nfsstat3_01 stime=1719943586
> > > > > > cmdline="nfsstat01"
> > > > > > contacts=""
> > > > > > analysis=exit
> > > > > > <<<test_output>>>
> > > > > > incrementing stop
> > > > > > nfsstat01 1 TINFO: timeout per run is 0h 20m 0s
> > > > > > nfsstat01 1 TINFO: setup NFSv3, socket type udp
> > > > > > nfsstat01 1 TINFO: Mounting NFS: mount -t nfs -o proto=udp,vers=3 10.0.0.2:/tmp/netpan-4577/LTP_nfsstat01.lz6zhgQHoV/3/udp /tmp/netpan-4577/LTP_nfsstat01.lz6zhgQHoV/3/0
> > > > > > nfsstat01 1 TINFO: checking RPC calls for server/client
> > > > > > nfsstat01 1 TINFO: calls 98/0
> > > > > > nfsstat01 1 TINFO: Checking for tracking of RPC calls for server/client
> > > > > > nfsstat01 1 TINFO: new calls 102/0
> > > > > > nfsstat01 1 TPASS: server RPC calls increased
> > > > > > nfsstat01 1 TFAIL: client RPC calls not increased
> > > > > > nfsstat01 1 TINFO: checking NFS calls for server/client
> > > > > > nfsstat01 1 TINFO: calls 2/2
> > > > > > nfsstat01 1 TINFO: Checking for tracking of NFS calls for server/client
> > > > > > nfsstat01 1 TINFO: new calls 3/3
> > > > > > nfsstat01 1 TPASS: server NFS calls increased
> > > > > > nfsstat01 1 TPASS: client NFS calls increased
> > > > > > nfsstat01 2 TINFO: Cleaning up testcase
> > > > > > nfsstat01 2 TINFO: SELinux enabled in enforcing mode, this may affect test results
> > > > > > nfsstat01 2 TINFO: it can be disabled with TST_DISABLE_SELINUX=1 (requires super/root)
> > > > > > nfsstat01 2 TINFO: install seinfo to find used SELinux profiles
> > > > > > nfsstat01 2 TINFO: loaded SELinux profiles: none
> > > > > > 
> > > > > > Summary:
> > > > > > passed 3
> > > > > > failed 1
> > > > > > skipped 0
> > > > > > warnings 0
> > > > > > <<<execution_status>>>
> > > > > > initiation_status="ok"
> > > > > > duration=1 termination_type=exited termination_id=1 corefile=no
> > > > > > cutime=11 cstime=16
> > > > > > <<<test_end>>>
> > > > > > ltp-pan reported FAIL
> > > > > > ========
> > > > > > 
> > > > > > We can observe the number of RPC client calls is 0, which is wired. And this happens from the kernel commit 57d1ce96d7655 ("nfs: make the rpc_stat per net namespace”). So now we’re not sure the kernel backport of nfs client changes is proper, or the LTP tests / userspace need to be modified.
> > > > > > 
> > > > > > If no userspace regression, should we revert the Josef’s NFS client-side changes on LTS?
> > > > > 
> > > > > This sounds like a regression in Linus's tree too, so why isn't it
> > > > > reverted there first?
> > > > 
> > > > There is a change in behavior in the upstream code, but Josef's
> > > > patches fix an information leak and make the statistics more
> > > > sensible in container environments. I'm not certain that
> > > > should be considered a regression, but confess I don't know
> > > > the regression rules to this fine a degree of detail.
> > > > 
> > > > If it is indeed a regression, how can we go about retaining
> > > > both behaviors (selectable by Kconfig or perhaps administrative
> > > > UI)?
> > > > 
> > > 
> > > I'd argue that the old behavior was a bug, and that Josef fixed
> > > it. These stats should probably have been made per-net when all of the
> > > original nfsd namespace work was done, but no one noticed until
> > > recently. Whoops. 
> > > 
> > > A couple of hacky ideas for how we might deal with this:
> > > 
> > > 1/ add a new line to the output of /proc/net/rpc/nfsd. It could just
> > > say "per-net\n" or "per-net <netns_id_number>\n" or something. nfsstat
> > > should ignore it, but LTP test could look for it and handle it
> > > appropriately. That could even be useful later for nfsstat too I guess.
> > > 
> > > 2/ move the file to a new name and make the old filename be a symlink
> > > to the new one. nfsstat would still work, but LTP would be able to see
> > > whether it was a symlink to detect the difference...or could just make
> > > a new symlink that points to the file and LTP could look for its
> > > presence.
> > 
> > I don't think it makes sense to present a solution which requires
> > LTP to be modified.  If we are willing to modify LTP, then we should
> > modify it to work with the per-net stats.
> > 
> > I think we need to create a new interface for the per-net stats, then
> > deprecate the old interface and remove it in (say) 2 years.  That given
> > LTP time to update, and means that an old LTP won't give incorrect
> > numbers, it will simply fail.
> > 
> > All we need to do is bikeshed the new interface.
> >   netlink ?
> >   /proc/net/rpc-pernet/nfsd ?
> > 
> > This means that we still need to keep the combined stats, or to combine
> > all the per-net stats on each access.
> > 
> 
> How much of this functionality would we need to restore?
> 
> Prior to Josef's patches, you would get info about global stats from
> relevant stats procfiles in a container. That seems like an information
> leak to me, but fixing that is probably going to break _somebody_.
> Where do we draw the line and why?
> 
> LTP is just a testsuite. Asking them to alter tests in order to cope
> with a bugfix seems entirely reasonable to me. If someone can make a
> case for real-world applications that rely on the old semantics, then
> I'd be more open to changing this, but I just don't see the upside of
> restoring legacy behavior here.

If it is OK to ask them to alter the tests, ask them to alter the tests
to work with today's kernel and don't make any change to the kernel.
Maybe the tests will have to be fixed to "PASS" both the old and the new
results, but that probably isn't rocket science.

My point is that if we are going to change the kernel to accommodate LTP
at all, we should accommodate LTP as it is today.  If we are going to
change LTP to accommodate the kernel, then it should accommodate the
kernel as it is today.

NeilBrown

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [LTP] [PATCH 1/1] nfsstat01: Update client RPC calls for kernel 6.9
  2024-07-12  6:12                   ` NeilBrown
@ 2024-07-12 10:16                     ` Jeff Layton
  2024-07-12 11:07                       ` Petr Vorel
  2024-07-12 11:13                       ` NeilBrown
  0 siblings, 2 replies; 21+ messages in thread
From: Jeff Layton @ 2024-07-12 10:16 UTC (permalink / raw)
  To: NeilBrown
  Cc: Chuck Lever III, Greg KH, Sherry Yang, Calum Mackay, linux-stable,
	Petr Vorel, Trond Myklebust, Anna Schumaker,
	Linux NFS Mailing List, kernel-team@fb.com, ltp@lists.linux.it,
	Avinesh Kumar, Josef Bacik

On Fri, 2024-07-12 at 16:12 +1000, NeilBrown wrote:
> On Fri, 12 Jul 2024, Jeff Layton wrote:
> > On Fri, 2024-07-12 at 08:58 +1000, NeilBrown wrote:
> > > On Fri, 12 Jul 2024, Jeff Layton wrote:
> > > > On Mon, 2024-07-08 at 17:49 +0000, Chuck Lever III wrote:
> > > > > 
> > > > > > On Jul 8, 2024, at 6:36 AM, Greg KH <greg@kroah.com> wrote:
> > > > > > 
> > > > > > On Sat, Jul 06, 2024 at 07:46:19AM +0000, Sherry Yang wrote:
> > > > > > > 
> > > > > > > 
> > > > > > > > On Jul 6, 2024, at 12:11 AM, Greg KH <greg@kroah.com> wrote:
> > > > > > > > 
> > > > > > > > On Fri, Jul 05, 2024 at 02:19:18PM +0000, Chuck Lever III wrote:
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > > On Jul 2, 2024, at 6:55 PM, Calum Mackay <calum.mackay@oracle.com> wrote:
> > > > > > > > > > 
> > > > > > > > > > To clarify…
> > > > > > > > > > 
> > > > > > > > > > On 02/07/2024 5:54 pm, Calum Mackay wrote:
> > > > > > > > > > > hi Petr,
> > > > > > > > > > > I noticed your LTP patch [1][2] which adjusts the nfsstat01 test on v6.9 kernels, to account for Josef's changes [3], which restrict the NFS/RPC stats per-namespace.
> > > > > > > > > > > I see that Josef's changes were backported, as far back as longterm v5.4,
> > > > > > > > > > 
> > > > > > > > > > Sorry, that's not quite accurate.
> > > > > > > > > > 
> > > > > > > > > > Josef's NFS client changes were all backported from v6.9, as far as longterm v5.4.y:
> > > > > > > > > > 
> > > > > > > > > > 2057a48d0dd0 sunrpc: add a struct rpc_stats arg to rpc_create_args
> > > > > > > > > > d47151b79e32 nfs: expose /proc/net/sunrpc/nfs in net namespaces
> > > > > > > > > > 1548036ef120 nfs: make the rpc_stat per net namespace
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > Of Josef's NFS server changes, four were backported from v6.9 to v6.8:
> > > > > > > > > > 
> > > > > > > > > > 418b9687dece sunrpc: use the struct net as the svc proc private
> > > > > > > > > > d98416cc2154 nfsd: rename NFSD_NET_* to NFSD_STATS_*
> > > > > > > > > > 93483ac5fec6 nfsd: expose /proc/net/sunrpc/nfsd in net namespaces
> > > > > > > > > > 4b14885411f7 nfsd: make all of the nfsd stats per-network namespace
> > > > > > > > > > 
> > > > > > > > > > and the others remained only in v6.9:
> > > > > > > > > > 
> > > > > > > > > > ab42f4d9a26f sunrpc: don't change ->sv_stats if it doesn't exist
> > > > > > > > > > a2214ed588fb nfsd: stop setting ->pg_stats for unused stats
> > > > > > > > > > f09432386766 sunrpc: pass in the sv_stats struct through svc_create_pooled
> > > > > > > > > > 3f6ef182f144 sunrpc: remove ->pg_stats from svc_program
> > > > > > > > > > e41ee44cc6a4 nfsd: remove nfsd_stats, make th_cnt a global counter
> > > > > > > > > > 16fb9808ab2c nfsd: make svc_stat per-network namespace instead of global
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > I'm wondering if this difference between NFS client, and NFS server, stat behaviour, across kernel versions, may perhaps cause some user confusion?
> > > > > > > > > 
> > > > > > > > > As a refresher for the stable folken, Josef's changes make
> > > > > > > > > nfsstats silo'd, so they no longer show counts from the whole
> > > > > > > > > system, but only for NFS operations relating to the local net
> > > > > > > > > namespace. That is a surprising change for some users, tools,
> > > > > > > > > and testing.
> > > > > > > > > 
> > > > > > > > > I'm not clear on whether there are any rules/guidelines around
> > > > > > > > > LTS backports causing behavior changes that user tools, like
> > > > > > > > > nfsstat, might be impacted by.
> > > > > > > > 
> > > > > > > > The same rules that apply for Linus's tree (i.e. no userspace
> > > > > > > > regressions.)
> > > > > > > 
> > > > > > > Given the current data we have, LTP nfsstat01[1] failed on LTS v5.4.278 because of kernel commit 1548036ef1204 ("nfs:
> > > > > > > make the rpc_stat per net namespace") [2]. Other LTS which backported the same commit are very likely troubled with the same LTP test failure.
> > > > > > > 
> > > > > > > The following are the LTP nfsstat01 failure output
> > > > > > > 
> > > > > > > ========
> > > > > > > network 1 TINFO: initialize 'lhost' 'ltp_ns_veth2' interface
> > > > > > > network 1 TINFO: add local addr 10.0.0.2/24
> > > > > > > network 1 TINFO: add local addr fd00:1:1:1::2/64
> > > > > > > network 1 TINFO: initialize 'rhost' 'ltp_ns_veth1' interface
> > > > > > > network 1 TINFO: add remote addr 10.0.0.1/24
> > > > > > > network 1 TINFO: add remote addr fd00:1:1:1::1/64
> > > > > > > network 1 TINFO: Network config (local -- remote):
> > > > > > > network 1 TINFO: ltp_ns_veth2 -- ltp_ns_veth1
> > > > > > > network 1 TINFO: 10.0.0.2/24 -- 10.0.0.1/24
> > > > > > > network 1 TINFO: fd00:1:1:1::2/64 -- fd00:1:1:1::1/64
> > > > > > > <<<test_start>>>
> > > > > > > tag=veth|nfsstat3_01 stime=1719943586
> > > > > > > cmdline="nfsstat01"
> > > > > > > contacts=""
> > > > > > > analysis=exit
> > > > > > > <<<test_output>>>
> > > > > > > incrementing stop
> > > > > > > nfsstat01 1 TINFO: timeout per run is 0h 20m 0s
> > > > > > > nfsstat01 1 TINFO: setup NFSv3, socket type udp
> > > > > > > nfsstat01 1 TINFO: Mounting NFS: mount -t nfs -o proto=udp,vers=3 10.0.0.2:/tmp/netpan-4577/LTP_nfsstat01.lz6zhgQHoV/3/udp /tmp/netpan-4577/LTP_nfsstat01.lz6zhgQHoV/3/0
> > > > > > > nfsstat01 1 TINFO: checking RPC calls for server/client
> > > > > > > nfsstat01 1 TINFO: calls 98/0
> > > > > > > nfsstat01 1 TINFO: Checking for tracking of RPC calls for server/client
> > > > > > > nfsstat01 1 TINFO: new calls 102/0
> > > > > > > nfsstat01 1 TPASS: server RPC calls increased
> > > > > > > nfsstat01 1 TFAIL: client RPC calls not increased
> > > > > > > nfsstat01 1 TINFO: checking NFS calls for server/client
> > > > > > > nfsstat01 1 TINFO: calls 2/2
> > > > > > > nfsstat01 1 TINFO: Checking for tracking of NFS calls for server/client
> > > > > > > nfsstat01 1 TINFO: new calls 3/3
> > > > > > > nfsstat01 1 TPASS: server NFS calls increased
> > > > > > > nfsstat01 1 TPASS: client NFS calls increased
> > > > > > > nfsstat01 2 TINFO: Cleaning up testcase
> > > > > > > nfsstat01 2 TINFO: SELinux enabled in enforcing mode, this may affect test results
> > > > > > > nfsstat01 2 TINFO: it can be disabled with TST_DISABLE_SELINUX=1 (requires super/root)
> > > > > > > nfsstat01 2 TINFO: install seinfo to find used SELinux profiles
> > > > > > > nfsstat01 2 TINFO: loaded SELinux profiles: none
> > > > > > > 
> > > > > > > Summary:
> > > > > > > passed 3
> > > > > > > failed 1
> > > > > > > skipped 0
> > > > > > > warnings 0
> > > > > > > <<<execution_status>>>
> > > > > > > initiation_status="ok"
> > > > > > > duration=1 termination_type=exited termination_id=1 corefile=no
> > > > > > > cutime=11 cstime=16
> > > > > > > <<<test_end>>>
> > > > > > > ltp-pan reported FAIL
> > > > > > > ========
> > > > > > > 
> > > > > > > We can observe the number of RPC client calls is 0, which is wired. And this happens from the kernel commit 57d1ce96d7655 ("nfs: make the rpc_stat per net namespace”). So now we’re not sure the kernel backport of nfs client changes is proper, or the LTP tests / userspace need to be modified.
> > > > > > > 
> > > > > > > If no userspace regression, should we revert the Josef’s NFS client-side changes on LTS?
> > > > > > 
> > > > > > This sounds like a regression in Linus's tree too, so why isn't it
> > > > > > reverted there first?
> > > > > 
> > > > > There is a change in behavior in the upstream code, but Josef's
> > > > > patches fix an information leak and make the statistics more
> > > > > sensible in container environments. I'm not certain that
> > > > > should be considered a regression, but confess I don't know
> > > > > the regression rules to this fine a degree of detail.
> > > > > 
> > > > > If it is indeed a regression, how can we go about retaining
> > > > > both behaviors (selectable by Kconfig or perhaps administrative
> > > > > UI)?
> > > > > 
> > > > 
> > > > I'd argue that the old behavior was a bug, and that Josef fixed
> > > > it. These stats should probably have been made per-net when all of the
> > > > original nfsd namespace work was done, but no one noticed until
> > > > recently. Whoops. 
> > > > 
> > > > A couple of hacky ideas for how we might deal with this:
> > > > 
> > > > 1/ add a new line to the output of /proc/net/rpc/nfsd. It could just
> > > > say "per-net\n" or "per-net <netns_id_number>\n" or something. nfsstat
> > > > should ignore it, but LTP test could look for it and handle it
> > > > appropriately. That could even be useful later for nfsstat too I guess.
> > > > 
> > > > 2/ move the file to a new name and make the old filename be a symlink
> > > > to the new one. nfsstat would still work, but LTP would be able to see
> > > > whether it was a symlink to detect the difference...or could just make
> > > > a new symlink that points to the file and LTP could look for its
> > > > presence.
> > > 
> > > I don't think it makes sense to present a solution which requires
> > > LTP to be modified.  If we are willing to modify LTP, then we should
> > > modify it to work with the per-net stats.
> > > 
> > > I think we need to create a new interface for the per-net stats, then
> > > deprecate the old interface and remove it in (say) 2 years.  That given
> > > LTP time to update, and means that an old LTP won't give incorrect
> > > numbers, it will simply fail.
> > > 
> > > All we need to do is bikeshed the new interface.
> > >   netlink ?
> > >   /proc/net/rpc-pernet/nfsd ?
> > > 
> > > This means that we still need to keep the combined stats, or to combine
> > > all the per-net stats on each access.
> > > 
> > 
> > How much of this functionality would we need to restore?
> > 
> > Prior to Josef's patches, you would get info about global stats from
> > relevant stats procfiles in a container. That seems like an information
> > leak to me, but fixing that is probably going to break _somebody_.
> > Where do we draw the line and why?
> > 
> > LTP is just a testsuite. Asking them to alter tests in order to cope
> > with a bugfix seems entirely reasonable to me. If someone can make a
> > case for real-world applications that rely on the old semantics, then
> > I'd be more open to changing this, but I just don't see the upside of
> > restoring legacy behavior here.
> 
> If it is OK to ask them to alter the tests, ask them to alter the tests
> to work with today's kernel and don't make any change to the kernel.
> Maybe the tests will have to be fixed to "PASS" both the old and the new
> results, but that probably isn't rocket science.
> 
> My point is that if we are going to change the kernel to accommodate LTP
> at all, we should accommodate LTP as it is today.  If we are going to
> change LTP to accommodate the kernel, then it should accommodate the
> kernel as it is today.
> 

The problem is that there is no way for userland tell the difference
between the older and newer behavior. That was what I was suggesting we
add.

To be clear, I hold this opinion loosely. If the consensus is that we
need to revert things then so be it. I just don't see the value of
doing that in this particular situation.
-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [LTP] [PATCH 1/1] nfsstat01: Update client RPC calls for kernel 6.9
  2024-07-12 10:16                     ` Jeff Layton
@ 2024-07-12 11:07                       ` Petr Vorel
  2024-07-12 14:03                         ` Chuck Lever III
  2024-07-12 11:13                       ` NeilBrown
  1 sibling, 1 reply; 21+ messages in thread
From: Petr Vorel @ 2024-07-12 11:07 UTC (permalink / raw)
  To: Jeff Layton
  Cc: NeilBrown, Chuck Lever III, Greg KH, Sherry Yang, Calum Mackay,
	linux-stable, Trond Myklebust, Anna Schumaker,
	Linux NFS Mailing List, kernel-team@fb.com, ltp@lists.linux.it,
	Avinesh Kumar, Josef Bacik

Hi all,

> On Fri, 2024-07-12 at 16:12 +1000, NeilBrown wrote:
> > On Fri, 12 Jul 2024, Jeff Layton wrote:
> > > On Fri, 2024-07-12 at 08:58 +1000, NeilBrown wrote:
> > > > On Fri, 12 Jul 2024, Jeff Layton wrote:
> > > > > On Mon, 2024-07-08 at 17:49 +0000, Chuck Lever III wrote:

> > > > > > > On Jul 8, 2024, at 6:36 AM, Greg KH <greg@kroah.com> wrote:

> > > > > > > On Sat, Jul 06, 2024 at 07:46:19AM +0000, Sherry Yang wrote:


> > > > > > > > > On Jul 6, 2024, at 12:11 AM, Greg KH <greg@kroah.com> wrote:

> > > > > > > > > On Fri, Jul 05, 2024 at 02:19:18PM +0000, Chuck Lever III wrote:


> > > > > > > > > > > On Jul 2, 2024, at 6:55 PM, Calum Mackay <calum.mackay@oracle.com> wrote:

> > > > > > > > > > > To clarify…

> > > > > > > > > > > On 02/07/2024 5:54 pm, Calum Mackay wrote:
> > > > > > > > > > > > hi Petr,
> > > > > > > > > > > > I noticed your LTP patch [1][2] which adjusts the nfsstat01 test on v6.9 kernels, to account for Josef's changes [3], which restrict the NFS/RPC stats per-namespace.
> > > > > > > > > > > > I see that Josef's changes were backported, as far back as longterm v5.4,

> > > > > > > > > > > Sorry, that's not quite accurate.

> > > > > > > > > > > Josef's NFS client changes were all backported from v6.9, as far as longterm v5.4.y:

> > > > > > > > > > > 2057a48d0dd0 sunrpc: add a struct rpc_stats arg to rpc_create_args
> > > > > > > > > > > d47151b79e32 nfs: expose /proc/net/sunrpc/nfs in net namespaces
> > > > > > > > > > > 1548036ef120 nfs: make the rpc_stat per net namespace


> > > > > > > > > > > Of Josef's NFS server changes, four were backported from v6.9 to v6.8:

> > > > > > > > > > > 418b9687dece sunrpc: use the struct net as the svc proc private
> > > > > > > > > > > d98416cc2154 nfsd: rename NFSD_NET_* to NFSD_STATS_*
> > > > > > > > > > > 93483ac5fec6 nfsd: expose /proc/net/sunrpc/nfsd in net namespaces
> > > > > > > > > > > 4b14885411f7 nfsd: make all of the nfsd stats per-network namespace

> > > > > > > > > > > and the others remained only in v6.9:

> > > > > > > > > > > ab42f4d9a26f sunrpc: don't change ->sv_stats if it doesn't exist
> > > > > > > > > > > a2214ed588fb nfsd: stop setting ->pg_stats for unused stats
> > > > > > > > > > > f09432386766 sunrpc: pass in the sv_stats struct through svc_create_pooled
> > > > > > > > > > > 3f6ef182f144 sunrpc: remove ->pg_stats from svc_program
> > > > > > > > > > > e41ee44cc6a4 nfsd: remove nfsd_stats, make th_cnt a global counter
> > > > > > > > > > > 16fb9808ab2c nfsd: make svc_stat per-network namespace instead of global



> > > > > > > > > > > I'm wondering if this difference between NFS client, and NFS server, stat behaviour, across kernel versions, may perhaps cause some user confusion?

> > > > > > > > > > As a refresher for the stable folken, Josef's changes make
> > > > > > > > > > nfsstats silo'd, so they no longer show counts from the whole
> > > > > > > > > > system, but only for NFS operations relating to the local net
> > > > > > > > > > namespace. That is a surprising change for some users, tools,
> > > > > > > > > > and testing.

> > > > > > > > > > I'm not clear on whether there are any rules/guidelines around
> > > > > > > > > > LTS backports causing behavior changes that user tools, like
> > > > > > > > > > nfsstat, might be impacted by.

> > > > > > > > > The same rules that apply for Linus's tree (i.e. no userspace
> > > > > > > > > regressions.)

> > > > > > > > Given the current data we have, LTP nfsstat01[1] failed on LTS v5.4.278 because of kernel commit 1548036ef1204 ("nfs:
> > > > > > > > make the rpc_stat per net namespace") [2]. Other LTS which backported the same commit are very likely troubled with the same LTP test failure.

> > > > > > > > The following are the LTP nfsstat01 failure output

> > > > > > > > ========
> > > > > > > > network 1 TINFO: initialize 'lhost' 'ltp_ns_veth2' interface
> > > > > > > > network 1 TINFO: add local addr 10.0.0.2/24
> > > > > > > > network 1 TINFO: add local addr fd00:1:1:1::2/64
> > > > > > > > network 1 TINFO: initialize 'rhost' 'ltp_ns_veth1' interface
> > > > > > > > network 1 TINFO: add remote addr 10.0.0.1/24
> > > > > > > > network 1 TINFO: add remote addr fd00:1:1:1::1/64
> > > > > > > > network 1 TINFO: Network config (local -- remote):
> > > > > > > > network 1 TINFO: ltp_ns_veth2 -- ltp_ns_veth1
> > > > > > > > network 1 TINFO: 10.0.0.2/24 -- 10.0.0.1/24
> > > > > > > > network 1 TINFO: fd00:1:1:1::2/64 -- fd00:1:1:1::1/64
> > > > > > > > <<<test_start>>>
> > > > > > > > tag=veth|nfsstat3_01 stime=1719943586
> > > > > > > > cmdline="nfsstat01"
> > > > > > > > contacts=""
> > > > > > > > analysis=exit
> > > > > > > > <<<test_output>>>
> > > > > > > > incrementing stop
> > > > > > > > nfsstat01 1 TINFO: timeout per run is 0h 20m 0s
> > > > > > > > nfsstat01 1 TINFO: setup NFSv3, socket type udp
> > > > > > > > nfsstat01 1 TINFO: Mounting NFS: mount -t nfs -o proto=udp,vers=3 10.0.0.2:/tmp/netpan-4577/LTP_nfsstat01.lz6zhgQHoV/3/udp /tmp/netpan-4577/LTP_nfsstat01.lz6zhgQHoV/3/0
> > > > > > > > nfsstat01 1 TINFO: checking RPC calls for server/client
> > > > > > > > nfsstat01 1 TINFO: calls 98/0
> > > > > > > > nfsstat01 1 TINFO: Checking for tracking of RPC calls for server/client
> > > > > > > > nfsstat01 1 TINFO: new calls 102/0
> > > > > > > > nfsstat01 1 TPASS: server RPC calls increased
> > > > > > > > nfsstat01 1 TFAIL: client RPC calls not increased
> > > > > > > > nfsstat01 1 TINFO: checking NFS calls for server/client
> > > > > > > > nfsstat01 1 TINFO: calls 2/2
> > > > > > > > nfsstat01 1 TINFO: Checking for tracking of NFS calls for server/client
> > > > > > > > nfsstat01 1 TINFO: new calls 3/3
> > > > > > > > nfsstat01 1 TPASS: server NFS calls increased
> > > > > > > > nfsstat01 1 TPASS: client NFS calls increased
> > > > > > > > nfsstat01 2 TINFO: Cleaning up testcase
> > > > > > > > nfsstat01 2 TINFO: SELinux enabled in enforcing mode, this may affect test results
> > > > > > > > nfsstat01 2 TINFO: it can be disabled with TST_DISABLE_SELINUX=1 (requires super/root)
> > > > > > > > nfsstat01 2 TINFO: install seinfo to find used SELinux profiles
> > > > > > > > nfsstat01 2 TINFO: loaded SELinux profiles: none

> > > > > > > > Summary:
> > > > > > > > passed 3
> > > > > > > > failed 1
> > > > > > > > skipped 0
> > > > > > > > warnings 0
> > > > > > > > <<<execution_status>>>
> > > > > > > > initiation_status="ok"
> > > > > > > > duration=1 termination_type=exited termination_id=1 corefile=no
> > > > > > > > cutime=11 cstime=16
> > > > > > > > <<<test_end>>>
> > > > > > > > ltp-pan reported FAIL
> > > > > > > > ========

> > > > > > > > We can observe the number of RPC client calls is 0, which is wired. And this happens from the kernel commit 57d1ce96d7655 ("nfs: make the rpc_stat per net namespace”). So now we’re not sure the kernel backport of nfs client changes is proper, or the LTP tests / userspace need to be modified.

> > > > > > > > If no userspace regression, should we revert the Josef’s NFS client-side changes on LTS?

> > > > > > > This sounds like a regression in Linus's tree too, so why isn't it
> > > > > > > reverted there first?

> > > > > > There is a change in behavior in the upstream code, but Josef's
> > > > > > patches fix an information leak and make the statistics more
> > > > > > sensible in container environments. I'm not certain that
> > > > > > should be considered a regression, but confess I don't know
> > > > > > the regression rules to this fine a degree of detail.

> > > > > > If it is indeed a regression, how can we go about retaining
> > > > > > both behaviors (selectable by Kconfig or perhaps administrative
> > > > > > UI)?


> > > > > I'd argue that the old behavior was a bug, and that Josef fixed
> > > > > it. These stats should probably have been made per-net when all of the
> > > > > original nfsd namespace work was done, but no one noticed until
> > > > > recently. Whoops. 

> > > > > A couple of hacky ideas for how we might deal with this:

> > > > > 1/ add a new line to the output of /proc/net/rpc/nfsd. It could just
> > > > > say "per-net\n" or "per-net <netns_id_number>\n" or something. nfsstat
> > > > > should ignore it, but LTP test could look for it and handle it
> > > > > appropriately. That could even be useful later for nfsstat too I guess.

> > > > > 2/ move the file to a new name and make the old filename be a symlink
> > > > > to the new one. nfsstat would still work, but LTP would be able to see
> > > > > whether it was a symlink to detect the difference...or could just make
> > > > > a new symlink that points to the file and LTP could look for its
> > > > > presence.

> > > > I don't think it makes sense to present a solution which requires
> > > > LTP to be modified.  If we are willing to modify LTP, then we should
> > > > modify it to work with the per-net stats.

> > > > I think we need to create a new interface for the per-net stats, then
> > > > deprecate the old interface and remove it in (say) 2 years.  That given
> > > > LTP time to update, and means that an old LTP won't give incorrect
> > > > numbers, it will simply fail.

> > > > All we need to do is bikeshed the new interface.
> > > >   netlink ?
> > > >   /proc/net/rpc-pernet/nfsd ?

> > > > This means that we still need to keep the combined stats, or to combine
> > > > all the per-net stats on each access.


> > > How much of this functionality would we need to restore?

> > > Prior to Josef's patches, you would get info about global stats from
> > > relevant stats procfiles in a container. That seems like an information
> > > leak to me, but fixing that is probably going to break _somebody_.
> > > Where do we draw the line and why?

> > > LTP is just a testsuite. Asking them to alter tests in order to cope
> > > with a bugfix seems entirely reasonable to me. If someone can make a
> > > case for real-world applications that rely on the old semantics, then
> > > I'd be more open to changing this, but I just don't see the upside of
> > > restoring legacy behavior here.

+1. Also people who test with LTP are advised to use at least the latest release
(we release every 3 months) or the current master branch (linux-next and kernel
rc testers should probably use master branch).

> > If it is OK to ask them to alter the tests, ask them to alter the tests
> > to work with today's kernel and don't make any change to the kernel.
> > Maybe the tests will have to be fixed to "PASS" both the old and the new
> > results, but that probably isn't rocket science.

> > My point is that if we are going to change the kernel to accommodate LTP
> > at all, we should accommodate LTP as it is today.  If we are going to
> > change LTP to accommodate the kernel, then it should accommodate the
> > kernel as it is today.


> The problem is that there is no way for userland tell the difference
> between the older and newer behavior. That was what I was suggesting we
> add.

> To be clear, I hold this opinion loosely. If the consensus is that we
> need to revert things then so be it. I just don't see the value of
> doing that in this particular situation.

I also think that from container POV it fixed an information leak.

Kind regards,
Petr

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [LTP] [PATCH 1/1] nfsstat01: Update client RPC calls for kernel 6.9
  2024-07-12 10:16                     ` Jeff Layton
  2024-07-12 11:07                       ` Petr Vorel
@ 2024-07-12 11:13                       ` NeilBrown
  2024-08-14 20:55                         ` Petr Vorel
  1 sibling, 1 reply; 21+ messages in thread
From: NeilBrown @ 2024-07-12 11:13 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Chuck Lever III, Greg KH, Sherry Yang, Calum Mackay, linux-stable,
	Petr Vorel, Trond Myklebust, Anna Schumaker,
	Linux NFS Mailing List, kernel-team@fb.com, ltp@lists.linux.it,
	Avinesh Kumar, Josef Bacik

On Fri, 12 Jul 2024, Jeff Layton wrote:
> On Fri, 2024-07-12 at 16:12 +1000, NeilBrown wrote:
> > 
> > My point is that if we are going to change the kernel to accommodate LTP
> > at all, we should accommodate LTP as it is today.  If we are going to
> > change LTP to accommodate the kernel, then it should accommodate the
> > kernel as it is today.
> > 
> 
> The problem is that there is no way for userland tell the difference
> between the older and newer behavior. That was what I was suggesting we
> add.

To make sure I wasn't talking through my hat, I had a look at the ltp
code.

The test in question simply tests that the count of RPC calls increases.

It can get the count of RPC calls in one of 2 ways :
 1/ "lhost" - look directly in /proc/net/rpc/{nfs,nfsd}
 2/ "rhost" - ssh to the server and look in that file.

The current test to "fix" this for kernels -ge "6.9" is to force the use
of "rhost".

I'm guessing that always using "rhost" for the nfsd stats would always
work.
But if not, the code could get both the local and remote nfsd stats, and
check that at least one of them increases (and neither decrease).

So ltp doesn't need to know which kernel is being used - it can be
written to work safely on either.

NeilBrown

> 
> To be clear, I hold this opinion loosely. If the consensus is that we
> need to revert things then so be it. I just don't see the value of
> doing that in this particular situation.
> -- 
> Jeff Layton <jlayton@kernel.org>
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [LTP] [PATCH 1/1] nfsstat01: Update client RPC calls for kernel 6.9
  2024-07-08 17:49           ` Chuck Lever III
  2024-07-09  6:48             ` Cyril Hrubis
  2024-07-11 21:18             ` Jeff Layton
@ 2024-07-12 13:45             ` Thorsten Leemhuis
  2024-07-12 14:07               ` Jeff Layton
  2 siblings, 1 reply; 21+ messages in thread
From: Thorsten Leemhuis @ 2024-07-12 13:45 UTC (permalink / raw)
  To: Chuck Lever III, Greg KH
  Cc: Sherry Yang, Calum Mackay, linux-stable, Petr Vorel,
	Trond Myklebust, Anna Schumaker, Linux NFS Mailing List,
	kernel-team@fb.com, ltp@lists.linux.it, Avinesh Kumar, Neil Brown,
	Josef Bacik, Jeff Layton, Linux kernel regressions list

On 08.07.24 19:49, Chuck Lever III wrote:
>> On Jul 8, 2024, at 6:36 AM, Greg KH <greg@kroah.com> wrote:
>> On Sat, Jul 06, 2024 at 07:46:19AM +0000, Sherry Yang wrote:
>>>> On Jul 6, 2024, at 12:11 AM, Greg KH <greg@kroah.com> wrote:
>>>> On Fri, Jul 05, 2024 at 02:19:18PM +0000, Chuck Lever III wrote:
>>>>>> On Jul 2, 2024, at 6:55 PM, Calum Mackay <calum.mackay@oracle.com> wrote:
>>>>>> On 02/07/2024 5:54 pm, Calum Mackay wrote:
>>>>>>> I noticed your LTP patch [1][2] which adjusts the nfsstat01 test on v6.9 kernels, to account for Josef's changes [3], which restrict the NFS/RPC stats per-namespace.
>>>>>>> I see that Josef's changes were backported, as far back as longterm v5.4,
> [...]
>>>>>> I'm wondering if this difference between NFS client, and NFS server, stat behaviour, across kernel versions, may perhaps cause some user confusion?
>>>>>
>>>>> As a refresher for the stable folken, Josef's changes make
>>>>> nfsstats silo'd, so they no longer show counts from the whole
>>>>> system, but only for NFS operations relating to the local net
>>>>> namespace. That is a surprising change for some users, tools,
>>>>> and testing.
>>>>>
>>>>> I'm not clear on whether there are any rules/guidelines around
>>>>> LTS backports causing behavior changes that user tools, like
>>>>> nfsstat, might be impacted by.
>>>>
>>>> The same rules that apply for Linus's tree (i.e. no userspace
>>>> regressions.)
>>> [...]
>>> If no userspace regression, should we revert the Josef’s NFS client-side changes on LTS?
>>
>> This sounds like a regression in Linus's tree too, so why isn't it
>> reverted there first?
> 
> There is a change in behavior in the upstream code, but Josef's
> patches fix an information leak and make the statistics more
> sensible in container environments. I'm not certain that
> should be considered a regression, but confess I don't know
> the regression rules to this fine a degree of detail.

Chuck pointed me to this thread (I had an eye on it already anyway) and
asked for advice. Take everything I write here with a grain of salt, as
this is somewhat tricky situation which makes it hard to predict how
Linus would actually want to see this handled. Maybe I should have CCed
him, but I doubt he cares right now; but we maybe should bring him in,
if an actual user complains.

With that out of the way, let me write a few thoughts:

* That some test breaks is not a regression, as regressions are about
"practical issues", not some ABI/API changes that only some tests care
about. So if it's just a test that broke update it.

* If a user would reported something like "this change broke my app" it
obviously would be something totally different. But that did not happen
yet afaics -- or did it? But from the discussion it sounds like that is
something that will likely happen down the road. If that's the case I'd
say it's best to prevent that from happening.

* Not sure how Linus would react if a user would complain that some
workflow broke because rpc_stat are now per net namespace and shows
different numbers (e.g. using a format that does not break any apps). It
would likely depend on the actual case and how bad he would consider the
information leak.

> If it is indeed a regression, how can we go about retaining
> both behaviors (selectable by Kconfig or perhaps administrative
> UI)?

That likely might be the best idea if user report an actual regression
due to this. But switching the format of any existing file creates quite
some trouble, as others already mentioned in this thread. So maybe
providing the newer format in a different file and allowing to disable
the older one though a Kconfig setting might be the best way forward.
Sure, it would take years until people would have switched over, but
that's how it is with our "no regressions" rule.

Does that help?

Ciao, Thorsten

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [LTP] [PATCH 1/1] nfsstat01: Update client RPC calls for kernel 6.9
  2024-07-12 11:07                       ` Petr Vorel
@ 2024-07-12 14:03                         ` Chuck Lever III
  0 siblings, 0 replies; 21+ messages in thread
From: Chuck Lever III @ 2024-07-12 14:03 UTC (permalink / raw)
  To: Petr Vorel
  Cc: Jeff Layton, Neil Brown, Greg KH, Sherry Yang, Calum Mackay,
	linux-stable, Trond Myklebust, Anna Schumaker,
	Linux NFS Mailing List, kernel-team@fb.com, ltp@lists.linux.it,
	Avinesh Kumar, Josef Bacik,
	Linux regression tracking (Thorsten Leemhuis)



> On Jul 12, 2024, at 7:07 AM, Petr Vorel <pvorel@suse.cz> wrote:
> 
> Hi all,
> 
>> On Fri, 2024-07-12 at 16:12 +1000, NeilBrown wrote:
>>> On Fri, 12 Jul 2024, Jeff Layton wrote:
>>>> On Fri, 2024-07-12 at 08:58 +1000, NeilBrown wrote:
>>>>> On Fri, 12 Jul 2024, Jeff Layton wrote:
>>>>>> On Mon, 2024-07-08 at 17:49 +0000, Chuck Lever III wrote:
> 
>>>>>>>> On Jul 8, 2024, at 6:36 AM, Greg KH <greg@kroah.com> wrote:
> 
>>>>>>>> On Sat, Jul 06, 2024 at 07:46:19AM +0000, Sherry Yang wrote:
> 
> 
>>>>>>>>>> On Jul 6, 2024, at 12:11 AM, Greg KH <greg@kroah.com> wrote:
> 
>>>>>>>>>> On Fri, Jul 05, 2024 at 02:19:18PM +0000, Chuck Lever III wrote:
> 
> 
>>>>>>>>>>>> On Jul 2, 2024, at 6:55 PM, Calum Mackay <calum.mackay@oracle.com> wrote:
> 
>>>>>>>>>>>> To clarify…
> 
>>>>>>>>>>>> On 02/07/2024 5:54 pm, Calum Mackay wrote:
>>>>>>>>>>>>> hi Petr,
>>>>>>>>>>>>> I noticed your LTP patch [1][2] which adjusts the nfsstat01 test on v6.9 kernels, to account for Josef's changes [3], which restrict the NFS/RPC stats per-namespace.
>>>>>>>>>>>>> I see that Josef's changes were backported, as far back as longterm v5.4,
> 
>>>>>>>>>>>> Sorry, that's not quite accurate.
> 
>>>>>>>>>>>> Josef's NFS client changes were all backported from v6.9, as far as longterm v5.4.y:
> 
>>>>>>>>>>>> 2057a48d0dd0 sunrpc: add a struct rpc_stats arg to rpc_create_args
>>>>>>>>>>>> d47151b79e32 nfs: expose /proc/net/sunrpc/nfs in net namespaces
>>>>>>>>>>>> 1548036ef120 nfs: make the rpc_stat per net namespace
> 
> 
>>>>>>>>>>>> Of Josef's NFS server changes, four were backported from v6.9 to v6.8:
> 
>>>>>>>>>>>> 418b9687dece sunrpc: use the struct net as the svc proc private
>>>>>>>>>>>> d98416cc2154 nfsd: rename NFSD_NET_* to NFSD_STATS_*
>>>>>>>>>>>> 93483ac5fec6 nfsd: expose /proc/net/sunrpc/nfsd in net namespaces
>>>>>>>>>>>> 4b14885411f7 nfsd: make all of the nfsd stats per-network namespace
> 
>>>>>>>>>>>> and the others remained only in v6.9:
> 
>>>>>>>>>>>> ab42f4d9a26f sunrpc: don't change ->sv_stats if it doesn't exist
>>>>>>>>>>>> a2214ed588fb nfsd: stop setting ->pg_stats for unused stats
>>>>>>>>>>>> f09432386766 sunrpc: pass in the sv_stats struct through svc_create_pooled
>>>>>>>>>>>> 3f6ef182f144 sunrpc: remove ->pg_stats from svc_program
>>>>>>>>>>>> e41ee44cc6a4 nfsd: remove nfsd_stats, make th_cnt a global counter
>>>>>>>>>>>> 16fb9808ab2c nfsd: make svc_stat per-network namespace instead of global
> 
> 
> 
>>>>>>>>>>>> I'm wondering if this difference between NFS client, and NFS server, stat behaviour, across kernel versions, may perhaps cause some user confusion?
> 
>>>>>>>>>>> As a refresher for the stable folken, Josef's changes make
>>>>>>>>>>> nfsstats silo'd, so they no longer show counts from the whole
>>>>>>>>>>> system, but only for NFS operations relating to the local net
>>>>>>>>>>> namespace. That is a surprising change for some users, tools,
>>>>>>>>>>> and testing.
> 
>>>>>>>>>>> I'm not clear on whether there are any rules/guidelines around
>>>>>>>>>>> LTS backports causing behavior changes that user tools, like
>>>>>>>>>>> nfsstat, might be impacted by.
> 
>>>>>>>>>> The same rules that apply for Linus's tree (i.e. no userspace
>>>>>>>>>> regressions.)
> 
>>>>>>>>> Given the current data we have, LTP nfsstat01[1] failed on LTS v5.4.278 because of kernel commit 1548036ef1204 ("nfs:
>>>>>>>>> make the rpc_stat per net namespace") [2]. Other LTS which backported the same commit are very likely troubled with the same LTP test failure.
> 
>>>>>>>>> The following are the LTP nfsstat01 failure output
> 
>>>>>>>>> ========
>>>>>>>>> network 1 TINFO: initialize 'lhost' 'ltp_ns_veth2' interface
>>>>>>>>> network 1 TINFO: add local addr 10.0.0.2/24
>>>>>>>>> network 1 TINFO: add local addr fd00:1:1:1::2/64
>>>>>>>>> network 1 TINFO: initialize 'rhost' 'ltp_ns_veth1' interface
>>>>>>>>> network 1 TINFO: add remote addr 10.0.0.1/24
>>>>>>>>> network 1 TINFO: add remote addr fd00:1:1:1::1/64
>>>>>>>>> network 1 TINFO: Network config (local -- remote):
>>>>>>>>> network 1 TINFO: ltp_ns_veth2 -- ltp_ns_veth1
>>>>>>>>> network 1 TINFO: 10.0.0.2/24 -- 10.0.0.1/24
>>>>>>>>> network 1 TINFO: fd00:1:1:1::2/64 -- fd00:1:1:1::1/64
>>>>>>>>> <<<test_start>>>
>>>>>>>>> tag=veth|nfsstat3_01 stime=1719943586
>>>>>>>>> cmdline="nfsstat01"
>>>>>>>>> contacts=""
>>>>>>>>> analysis=exit
>>>>>>>>> <<<test_output>>>
>>>>>>>>> incrementing stop
>>>>>>>>> nfsstat01 1 TINFO: timeout per run is 0h 20m 0s
>>>>>>>>> nfsstat01 1 TINFO: setup NFSv3, socket type udp
>>>>>>>>> nfsstat01 1 TINFO: Mounting NFS: mount -t nfs -o proto=udp,vers=3 10.0.0.2:/tmp/netpan-4577/LTP_nfsstat01.lz6zhgQHoV/3/udp /tmp/netpan-4577/LTP_nfsstat01.lz6zhgQHoV/3/0
>>>>>>>>> nfsstat01 1 TINFO: checking RPC calls for server/client
>>>>>>>>> nfsstat01 1 TINFO: calls 98/0
>>>>>>>>> nfsstat01 1 TINFO: Checking for tracking of RPC calls for server/client
>>>>>>>>> nfsstat01 1 TINFO: new calls 102/0
>>>>>>>>> nfsstat01 1 TPASS: server RPC calls increased
>>>>>>>>> nfsstat01 1 TFAIL: client RPC calls not increased
>>>>>>>>> nfsstat01 1 TINFO: checking NFS calls for server/client
>>>>>>>>> nfsstat01 1 TINFO: calls 2/2
>>>>>>>>> nfsstat01 1 TINFO: Checking for tracking of NFS calls for server/client
>>>>>>>>> nfsstat01 1 TINFO: new calls 3/3
>>>>>>>>> nfsstat01 1 TPASS: server NFS calls increased
>>>>>>>>> nfsstat01 1 TPASS: client NFS calls increased
>>>>>>>>> nfsstat01 2 TINFO: Cleaning up testcase
>>>>>>>>> nfsstat01 2 TINFO: SELinux enabled in enforcing mode, this may affect test results
>>>>>>>>> nfsstat01 2 TINFO: it can be disabled with TST_DISABLE_SELINUX=1 (requires super/root)
>>>>>>>>> nfsstat01 2 TINFO: install seinfo to find used SELinux profiles
>>>>>>>>> nfsstat01 2 TINFO: loaded SELinux profiles: none
> 
>>>>>>>>> Summary:
>>>>>>>>> passed 3
>>>>>>>>> failed 1
>>>>>>>>> skipped 0
>>>>>>>>> warnings 0
>>>>>>>>> <<<execution_status>>>
>>>>>>>>> initiation_status="ok"
>>>>>>>>> duration=1 termination_type=exited termination_id=1 corefile=no
>>>>>>>>> cutime=11 cstime=16
>>>>>>>>> <<<test_end>>>
>>>>>>>>> ltp-pan reported FAIL
>>>>>>>>> ========
> 
>>>>>>>>> We can observe the number of RPC client calls is 0, which is wired. And this happens from the kernel commit 57d1ce96d7655 ("nfs: make the rpc_stat per net namespace”). So now we’re not sure the kernel backport of nfs client changes is proper, or the LTP tests / userspace need to be modified.
> 
>>>>>>>>> If no userspace regression, should we revert the Josef’s NFS client-side changes on LTS?
> 
>>>>>>>> This sounds like a regression in Linus's tree too, so why isn't it
>>>>>>>> reverted there first?
> 
>>>>>>> There is a change in behavior in the upstream code, but Josef's
>>>>>>> patches fix an information leak and make the statistics more
>>>>>>> sensible in container environments. I'm not certain that
>>>>>>> should be considered a regression, but confess I don't know
>>>>>>> the regression rules to this fine a degree of detail.
> 
>>>>>>> If it is indeed a regression, how can we go about retaining
>>>>>>> both behaviors (selectable by Kconfig or perhaps administrative
>>>>>>> UI)?
> 
> 
>>>>>> I'd argue that the old behavior was a bug, and that Josef fixed
>>>>>> it. These stats should probably have been made per-net when all of the
>>>>>> original nfsd namespace work was done, but no one noticed until
>>>>>> recently. Whoops. 
> 
>>>>>> A couple of hacky ideas for how we might deal with this:
> 
>>>>>> 1/ add a new line to the output of /proc/net/rpc/nfsd. It could just
>>>>>> say "per-net\n" or "per-net <netns_id_number>\n" or something. nfsstat
>>>>>> should ignore it, but LTP test could look for it and handle it
>>>>>> appropriately. That could even be useful later for nfsstat too I guess.
> 
>>>>>> 2/ move the file to a new name and make the old filename be a symlink
>>>>>> to the new one. nfsstat would still work, but LTP would be able to see
>>>>>> whether it was a symlink to detect the difference...or could just make
>>>>>> a new symlink that points to the file and LTP could look for its
>>>>>> presence.
> 
>>>>> I don't think it makes sense to present a solution which requires
>>>>> LTP to be modified.  If we are willing to modify LTP, then we should
>>>>> modify it to work with the per-net stats.
> 
>>>>> I think we need to create a new interface for the per-net stats, then
>>>>> deprecate the old interface and remove it in (say) 2 years.  That given
>>>>> LTP time to update, and means that an old LTP won't give incorrect
>>>>> numbers, it will simply fail.
> 
>>>>> All we need to do is bikeshed the new interface.
>>>>>  netlink ?
>>>>>  /proc/net/rpc-pernet/nfsd ?
> 
>>>>> This means that we still need to keep the combined stats, or to combine
>>>>> all the per-net stats on each access.
> 
> 
>>>> How much of this functionality would we need to restore?
> 
>>>> Prior to Josef's patches, you would get info about global stats from
>>>> relevant stats procfiles in a container. That seems like an information
>>>> leak to me, but fixing that is probably going to break _somebody_.
>>>> Where do we draw the line and why?
> 
>>>> LTP is just a testsuite. Asking them to alter tests in order to cope
>>>> with a bugfix seems entirely reasonable to me. If someone can make a
>>>> case for real-world applications that rely on the old semantics, then
>>>> I'd be more open to changing this, but I just don't see the upside of
>>>> restoring legacy behavior here.
> 
> +1. Also people who test with LTP are advised to use at least the latest release
> (we release every 3 months) or the current master branch (linux-next and kernel
> rc testers should probably use master branch).
> 
>>> If it is OK to ask them to alter the tests, ask them to alter the tests
>>> to work with today's kernel and don't make any change to the kernel.
>>> Maybe the tests will have to be fixed to "PASS" both the old and the new
>>> results, but that probably isn't rocket science.
> 
>>> My point is that if we are going to change the kernel to accommodate LTP
>>> at all, we should accommodate LTP as it is today.  If we are going to
>>> change LTP to accommodate the kernel, then it should accommodate the
>>> kernel as it is today.
> 
> 
>> The problem is that there is no way for userland tell the difference
>> between the older and newer behavior. That was what I was suggesting we
>> add.
> 
>> To be clear, I hold this opinion loosely. If the consensus is that we
>> need to revert things then so be it. I just don't see the value of
>> doing that in this particular situation.
> 
> I also think that from container POV it fixed an information leak.

I would vastly prefer that the information leak is fixed
not only in upstream kernels, but also in LTS kernels.

Neil's suggestion of introducing a new kernel/userspace
API is the usual approach for API changes, but does not
address the information leak at all until the old API is
removed in 2+ years.

What I would like to see happen is:

 -- leave Josef's patches in the upstream kernel

 -- backport the missing parts of that series to LTS
    kernels (and I volunteer to handle that)

 -- Take Neil's advice of fixing ltp in a way that
    does not rely on kernel release information

I can wait for Peter to acknowledge that Neil's advice
will work for ltp (and of course, any other thoughts
from the To: or Cc: list).


Thorsten says:
> So maybe
> providing the newer format in a different file and allowing to disable
> the older one though a Kconfig setting might be the best way forward.


That is the "ideal" solution, but the downside is how
long it would take to reach all kernels and distros.

I'm happy to reconsider this approach if we receive
more than a report of test case breakage... that was
criteria I did not have earlier.


--
Chuck Lever



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [LTP] [PATCH 1/1] nfsstat01: Update client RPC calls for kernel 6.9
  2024-07-12 13:45             ` Thorsten Leemhuis
@ 2024-07-12 14:07               ` Jeff Layton
  0 siblings, 0 replies; 21+ messages in thread
From: Jeff Layton @ 2024-07-12 14:07 UTC (permalink / raw)
  To: Thorsten Leemhuis, Chuck Lever III, Greg KH
  Cc: Sherry Yang, Calum Mackay, linux-stable, Petr Vorel,
	Trond Myklebust, Anna Schumaker, Linux NFS Mailing List,
	kernel-team@fb.com, ltp@lists.linux.it, Avinesh Kumar, Neil Brown,
	Josef Bacik, Linux kernel regressions list

On Fri, 2024-07-12 at 15:45 +0200, Thorsten Leemhuis wrote:
> On 08.07.24 19:49, Chuck Lever III wrote:
> > > On Jul 8, 2024, at 6:36 AM, Greg KH <greg@kroah.com> wrote:
> > > On Sat, Jul 06, 2024 at 07:46:19AM +0000, Sherry Yang wrote:
> > > > > On Jul 6, 2024, at 12:11 AM, Greg KH <greg@kroah.com> wrote:
> > > > > On Fri, Jul 05, 2024 at 02:19:18PM +0000, Chuck Lever III
> > > > > wrote:
> > > > > > > On Jul 2, 2024, at 6:55 PM, Calum Mackay
> > > > > > > <calum.mackay@oracle.com> wrote:
> > > > > > > On 02/07/2024 5:54 pm, Calum Mackay wrote:
> > > > > > > > I noticed your LTP patch [1][2] which adjusts the
> > > > > > > > nfsstat01 test on v6.9 kernels, to account for Josef's
> > > > > > > > changes [3], which restrict the NFS/RPC stats per-
> > > > > > > > namespace.
> > > > > > > > I see that Josef's changes were backported, as far back
> > > > > > > > as longterm v5.4,
> > [...]
> > > > > > > I'm wondering if this difference between NFS client, and
> > > > > > > NFS server, stat behaviour, across kernel versions, may
> > > > > > > perhaps cause some user confusion?
> > > > > > 
> > > > > > As a refresher for the stable folken, Josef's changes make
> > > > > > nfsstats silo'd, so they no longer show counts from the
> > > > > > whole
> > > > > > system, but only for NFS operations relating to the local
> > > > > > net
> > > > > > namespace. That is a surprising change for some users,
> > > > > > tools,
> > > > > > and testing.
> > > > > > 
> > > > > > I'm not clear on whether there are any rules/guidelines
> > > > > > around
> > > > > > LTS backports causing behavior changes that user tools,
> > > > > > like
> > > > > > nfsstat, might be impacted by.
> > > > > 
> > > > > The same rules that apply for Linus's tree (i.e. no userspace
> > > > > regressions.)
> > > > [...]
> > > > If no userspace regression, should we revert the Josef’s NFS
> > > > client-side changes on LTS?
> > > 
> > > This sounds like a regression in Linus's tree too, so why isn't
> > > it
> > > reverted there first?
> > 
> > There is a change in behavior in the upstream code, but Josef's
> > patches fix an information leak and make the statistics more
> > sensible in container environments. I'm not certain that
> > should be considered a regression, but confess I don't know
> > the regression rules to this fine a degree of detail.
> 
> Chuck pointed me to this thread (I had an eye on it already anyway)
> and
> asked for advice. Take everything I write here with a grain of salt,
> as
> this is somewhat tricky situation which makes it hard to predict how
> Linus would actually want to see this handled. Maybe I should have
> CCed
> him, but I doubt he cares right now; but we maybe should bring him
> in,
> if an actual user complains.
> 
> With that out of the way, let me write a few thoughts:
> 
> * That some test breaks is not a regression, as regressions are about
> "practical issues", not some ABI/API changes that only some tests
> care
> about. So if it's just a test that broke update it.
> 
> * If a user would reported something like "this change broke my app"
> it
> obviously would be something totally different. But that did not
> happen
> yet afaics -- or did it? But from the discussion it sounds like that
> is
> something that will likely happen down the road. If that's the case
> I'd
> say it's best to prevent that from happening.
> 

I doubt anyone outside of automated testcases will ever notice this,
and if they do, then they probably want the new behavior. This was an
oversight when the nfs client and server were first containerized.
These stats should have been made per-net then, but never were.

Basically, the old "/proc/net/rpc/nfs{d}" files presented aggregate
stats for all of the nfsd's on the machine _and_ they presented the
same information no matter the net namespace you're in when reading
them.

Josef's patches changed it so that we collect this information on a
per-net namespace basis, and we only present the totals of the net
namespace reading the procfile.

There are 2 possibilities of breakage:

1) someone in a container expects to see stats for the entire host in
/proc/net/rpc/nfsd. This is a bug -- users in the container should
never have seen this in the first place. In practice, it's probably
pretty benign, but fixing it is the right thing to do.

2) someone in the init_net_ns reads the procfile and expects to see
global totals. Ok, this is a change, but I argue that it's a good one,
since it gives more immediate info about the server running in the
init_net_ns.

In practice most usage of these procfiles is pretty informal (mostly
acting as the source for nfsstat). Segregating this info by container
is the right outcome. It should have been done that way from the get-
go.

> * Not sure how Linus would react if a user would complain that some
> workflow broke because rpc_stat are now per net namespace and shows
> different numbers (e.g. using a format that does not break any apps).
> It
> would likely depend on the actual case and how bad he would consider
> the
> information leak.
> 
> > If it is indeed a regression, how can we go about retaining
> > both behaviors (selectable by Kconfig or perhaps administrative
> > UI)?
> 
> That likely might be the best idea if user report an actual
> regression
> due to this. But switching the format of any existing file creates
> quite
> some trouble, as others already mentioned in this thread. So maybe
> providing the newer format in a different file and allowing to
> disable
> the older one though a Kconfig setting might be the best way forward.
> Sure, it would take years until people would have switched over, but
> that's how it is with our "no regressions" rule.
> 
> Does that help?
> 
> Ciao, Thorsten

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [LTP] [PATCH 1/1] nfsstat01: Update client RPC calls for kernel 6.9
  2024-07-12 11:13                       ` NeilBrown
@ 2024-08-14 20:55                         ` Petr Vorel
  2024-08-14 22:17                           ` NeilBrown
  0 siblings, 1 reply; 21+ messages in thread
From: Petr Vorel @ 2024-08-14 20:55 UTC (permalink / raw)
  To: NeilBrown
  Cc: Jeff Layton, Chuck Lever III, Greg KH, Sherry Yang, Calum Mackay,
	linux-stable, Trond Myklebust, Anna Schumaker,
	Linux NFS Mailing List, kernel-team@fb.com, ltp@lists.linux.it,
	Avinesh Kumar, Josef Bacik

> On Fri, 12 Jul 2024, Jeff Layton wrote:
> > On Fri, 2024-07-12 at 16:12 +1000, NeilBrown wrote:

> > > My point is that if we are going to change the kernel to accommodate LTP
> > > at all, we should accommodate LTP as it is today.  If we are going to
> > > change LTP to accommodate the kernel, then it should accommodate the
> > > kernel as it is today.

> > The problem is that there is no way for userland tell the difference
> > between the older and newer behavior. That was what I was suggesting we
> > add.

> To make sure I wasn't talking through my hat, I had a look at the ltp
> code.

> The test in question simply tests that the count of RPC calls increases.

> It can get the count of RPC calls in one of 2 ways :
>  1/ "lhost" - look directly in /proc/net/rpc/{nfs,nfsd}
>  2/ "rhost" - ssh to the server and look in that file.

FYI "rhost" in LTP can be either using namespaces (Single Host Configuration [1]),
which is run by default, or SSH based (Two Host Configuration [2]). IMHO most of
the testers (including myself run tests simply via network namespaces).

NOTE: I suppose CONFIG_NAMESPACES=y is a must for 'ip netns' to be working, thus
tests would hopefully failed early on kernel having that disabled.

> The current test to "fix" this for kernels -ge "6.9" is to force the use
> of "rhost".

> I'm guessing that always using "rhost" for the nfsd stats would always
> work.

FYI this old commit [3] allowed these tests to be working in network namespaces.
It reads for network namespaces both /proc/net/rpc/{nfs,nfsd} from non-namespace
("lhost").  This is the subject of the change in 6.9, which now fails.
And for SSH based we obviously look on "rhost" already.

> But if not, the code could get both the local and remote nfsd stats, and
> check that at least one of them increases (and neither decrease).

This sounds reasonable, thanks for a hint. I'll just look for client RPC calls
(/proc/net/rpc/nfs) in both non-namespace and namespace. The only think is that
we effectively give up checking where it should be (if it for whatever reason in
the future changes again, we miss that). I'm not sure if this would be treated
the same as the current situation (Josef Bacik had obvious reasons for this to
be working).

@Josef @NFS maintainers: WDYT?

Kind regards,
Petr

> So ltp doesn't need to know which kernel is being used - it can be
> written to work safely on either.

> NeilBrown

[1] https://github.com/linux-test-project/ltp/tree/master/testcases/network#single-host-configuration
[2] https://github.com/linux-test-project/ltp/tree/master/testcases/network#two-host-configuration
[3] https://github.com/linux-test-project/ltp/commit/40958772f11d90e4b5052e7e772a3837d285cf89

> > To be clear, I hold this opinion loosely. If the consensus is that we
> > need to revert things then so be it. I just don't see the value of
> > doing that in this particular situation.
> > -- 
> > Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [LTP] [PATCH 1/1] nfsstat01: Update client RPC calls for kernel 6.9
  2024-08-14 20:55                         ` Petr Vorel
@ 2024-08-14 22:17                           ` NeilBrown
  2024-08-15  6:53                             ` Petr Vorel
  0 siblings, 1 reply; 21+ messages in thread
From: NeilBrown @ 2024-08-14 22:17 UTC (permalink / raw)
  To: Petr Vorel
  Cc: Jeff Layton, Chuck Lever III, Greg KH, Sherry Yang, Calum Mackay,
	linux-stable, Trond Myklebust, Anna Schumaker,
	Linux NFS Mailing List, kernel-team@fb.com, ltp@lists.linux.it,
	Avinesh Kumar, Josef Bacik

On Thu, 15 Aug 2024, Petr Vorel wrote:
> > On Fri, 12 Jul 2024, Jeff Layton wrote:
> > > On Fri, 2024-07-12 at 16:12 +1000, NeilBrown wrote:
> 
> > > > My point is that if we are going to change the kernel to accommodate LTP
> > > > at all, we should accommodate LTP as it is today.  If we are going to
> > > > change LTP to accommodate the kernel, then it should accommodate the
> > > > kernel as it is today.
> 
> 
> > > The problem is that there is no way for userland tell the difference
> > > between the older and newer behavior. That was what I was suggesting we
> > > add.
> 
> > To make sure I wasn't talking through my hat, I had a look at the ltp
> > code.
> 
> > The test in question simply tests that the count of RPC calls increases.
> 
> > It can get the count of RPC calls in one of 2 ways :
> >  1/ "lhost" - look directly in /proc/net/rpc/{nfs,nfsd}
> >  2/ "rhost" - ssh to the server and look in that file.
> 
> FYI "rhost" in LTP can be either using namespaces (Single Host Configuration [1]),
> which is run by default, or SSH based (Two Host Configuration [2]). IMHO most of
> the testers (including myself run tests simply via network namespaces).
> 
> NOTE: I suppose CONFIG_NAMESPACES=y is a must for 'ip netns' to be working, thus
> tests would hopefully failed early on kernel having that disabled.
> 
> > The current test to "fix" this for kernels -ge "6.9" is to force the use
> > of "rhost".
> 
> > I'm guessing that always using "rhost" for the nfsd stats would always
> > work.
> 
> FYI this old commit [3] allowed these tests to be working in network namespaces.
> It reads for network namespaces both /proc/net/rpc/{nfs,nfsd} from non-namespace
> ("lhost").  This is the subject of the change in 6.9, which now fails.
> And for SSH based we obviously look on "rhost" already.

That patch looks like a mistake.  The author noticed that the rpc stats
were not "namespacified" and instead of reporting the bug (and surely
the whole point of a test suite is to report bugs), they made a change
that seems completely unnecessary which had the effect of entrenching
the bug.  Unfortunately the commit message only says why it is same to
make the change, not why it us useful.

> 
> > But if not, the code could get both the local and remote nfsd stats, and
> > check that at least one of them increases (and neither decrease).
> 
> This sounds reasonable, thanks for a hint. I'll just look for client RPC calls
> (/proc/net/rpc/nfs) in both non-namespace and namespace. The only think is that
> we effectively give up checking where it should be (if it for whatever reason in
> the future changes again, we miss that). I'm not sure if this would be treated
> the same as the current situation (Josef Bacik had obvious reasons for this to
> be working).

Stats should always be visible in the relevant namespace.  server stats
should be visible in the name space where the server runs.  client stats
should be visible in the namespace where the filesystem is mounted.
This has always been true (as long as we have had stats) and if it ever
stops being true, that is a bug.
I think the test suite should test for precisely this case.  Testing if
the stats are visible from a different namespace is not likely to be an
interesting test - unless you want to guard against the possibility that
we will one day accidentally de-namespaceify the stats (stranger things
have happened).

Thanks,
NeilBrown

> 
> @Josef @NFS maintainers: WDYT?
> 
> Kind regards,
> Petr
> 
> > So ltp doesn't need to know which kernel is being used - it can be
> > written to work safely on either.
> 
> > NeilBrown
> 
> [1] https://github.com/linux-test-project/ltp/tree/master/testcases/network#single-host-configuration
> [2] https://github.com/linux-test-project/ltp/tree/master/testcases/network#two-host-configuration
> [3] https://github.com/linux-test-project/ltp/commit/40958772f11d90e4b5052e7e772a3837d285cf89
> 
> > > To be clear, I hold this opinion loosely. If the consensus is that we
> > > need to revert things then so be it. I just don't see the value of
> > > doing that in this particular situation.
> > > -- 
> > > Jeff Layton <jlayton@kernel.org>
> 
> 
> 


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [LTP] [PATCH 1/1] nfsstat01: Update client RPC calls for kernel 6.9
  2024-08-14 22:17                           ` NeilBrown
@ 2024-08-15  6:53                             ` Petr Vorel
  0 siblings, 0 replies; 21+ messages in thread
From: Petr Vorel @ 2024-08-15  6:53 UTC (permalink / raw)
  To: NeilBrown
  Cc: Jeff Layton, Chuck Lever III, Greg KH, Sherry Yang, Calum Mackay,
	linux-stable, Trond Myklebust, Anna Schumaker,
	Linux NFS Mailing List, kernel-team@fb.com, ltp@lists.linux.it,
	Avinesh Kumar, Josef Bacik

> On Thu, 15 Aug 2024, Petr Vorel wrote:
> > > On Fri, 12 Jul 2024, Jeff Layton wrote:
> > > > On Fri, 2024-07-12 at 16:12 +1000, NeilBrown wrote:

> > > > > My point is that if we are going to change the kernel to accommodate LTP
> > > > > at all, we should accommodate LTP as it is today.  If we are going to
> > > > > change LTP to accommodate the kernel, then it should accommodate the
> > > > > kernel as it is today.


> > > > The problem is that there is no way for userland tell the difference
> > > > between the older and newer behavior. That was what I was suggesting we
> > > > add.

> > > To make sure I wasn't talking through my hat, I had a look at the ltp
> > > code.

> > > The test in question simply tests that the count of RPC calls increases.

> > > It can get the count of RPC calls in one of 2 ways :
> > >  1/ "lhost" - look directly in /proc/net/rpc/{nfs,nfsd}
> > >  2/ "rhost" - ssh to the server and look in that file.

> > FYI "rhost" in LTP can be either using namespaces (Single Host Configuration [1]),
> > which is run by default, or SSH based (Two Host Configuration [2]). IMHO most of
> > the testers (including myself run tests simply via network namespaces).

> > NOTE: I suppose CONFIG_NAMESPACES=y is a must for 'ip netns' to be working, thus
> > tests would hopefully failed early on kernel having that disabled.

> > > The current test to "fix" this for kernels -ge "6.9" is to force the use
> > > of "rhost".

> > > I'm guessing that always using "rhost" for the nfsd stats would always
> > > work.

> > FYI this old commit [3] allowed these tests to be working in network namespaces.
> > It reads for network namespaces both /proc/net/rpc/{nfs,nfsd} from non-namespace
> > ("lhost").  This is the subject of the change in 6.9, which now fails.
> > And for SSH based we obviously look on "rhost" already.

> That patch looks like a mistake.  The author noticed that the rpc stats
> were not "namespacified" and instead of reporting the bug (and surely
> the whole point of a test suite is to report bugs), they made a change
> that seems completely unnecessary which had the effect of entrenching
> the bug.  Unfortunately the commit message only says why it is same to
> make the change, not why it us useful.

Fully agree. With nowadays experiences I would have asked him to discuss that on
this ML. Ironically, Alexey back then (as part of Oracle) did had report and
even fix few network protocol related bugs, did some testing for NFS related
fixes.

> > > But if not, the code could get both the local and remote nfsd stats, and
> > > check that at least one of them increases (and neither decrease).

> > This sounds reasonable, thanks for a hint. I'll just look for client RPC calls
> > (/proc/net/rpc/nfs) in both non-namespace and namespace. The only think is that
> > we effectively give up checking where it should be (if it for whatever reason in
> > the future changes again, we miss that). I'm not sure if this would be treated
> > the same as the current situation (Josef Bacik had obvious reasons for this to
> > be working).

> Stats should always be visible in the relevant namespace.  server stats
> should be visible in the name space where the server runs.  client stats
> should be visible in the namespace where the filesystem is mounted.
> This has always been true (as long as we have had stats) and if it ever
> stops being true, that is a bug.

+1

> I think the test suite should test for precisely this case.  Testing if
> the stats are visible from a different namespace is not likely to be an
> interesting test - unless you want to guard against the possibility that
> we will one day accidentally de-namespaceify the stats (stranger things
> have happened).

There could be additional check for namespaces only (skip in SSH) that there is
no information leak.

Kind regards,
Petr

> Thanks,
> NeilBrown


> > @Josef @NFS maintainers: WDYT?

> > Kind regards,
> > Petr

> > > So ltp doesn't need to know which kernel is being used - it can be
> > > written to work safely on either.

> > > NeilBrown

> > [1] https://github.com/linux-test-project/ltp/tree/master/testcases/network#single-host-configuration
> > [2] https://github.com/linux-test-project/ltp/tree/master/testcases/network#two-host-configuration
> > [3] https://github.com/linux-test-project/ltp/commit/40958772f11d90e4b5052e7e772a3837d285cf89

> > > > To be clear, I hold this opinion loosely. If the consensus is that we
> > > > need to revert things then so be it. I just don't see the value of
> > > > doing that in this particular situation.
> > > > -- 
> > > > Jeff Layton <jlayton@kernel.org>





^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2024-08-15  6:53 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <d4b235df-4ee5-4824-9d48-e3b3c1f1f4d1@oracle.com>
2024-07-02 22:55 ` [LTP] [PATCH 1/1] nfsstat01: Update client RPC calls for kernel 6.9 Calum Mackay
2024-07-05 14:19   ` Chuck Lever III
2024-07-06  7:11     ` Greg KH
2024-07-06  7:46       ` Sherry Yang
2024-07-08 10:36         ` Greg KH
2024-07-08 17:49           ` Chuck Lever III
2024-07-09  6:48             ` Cyril Hrubis
2024-07-11 21:18             ` Jeff Layton
2024-07-11 22:58               ` NeilBrown
2024-07-12  0:40                 ` Jeff Layton
2024-07-12  6:12                   ` NeilBrown
2024-07-12 10:16                     ` Jeff Layton
2024-07-12 11:07                       ` Petr Vorel
2024-07-12 14:03                         ` Chuck Lever III
2024-07-12 11:13                       ` NeilBrown
2024-08-14 20:55                         ` Petr Vorel
2024-08-14 22:17                           ` NeilBrown
2024-08-15  6:53                             ` Petr Vorel
2024-07-12 13:45             ` Thorsten Leemhuis
2024-07-12 14:07               ` Jeff Layton
2024-07-08  4:02     ` Petr Vorel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox