From: Firo Yang <firo.yang@suse.com>
To: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Cc: vyasevich@gmail.com, nhorman@tuxdriver.com, mkubecek@suse.com,
davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
pabeni@redhat.com, linux-sctp@vger.kernel.org,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
firogm@gmail.com
Subject: Re: [PATCH 1/1] sctp: sysctl: referring the correct net namespace
Date: Thu, 24 Nov 2022 14:29:38 +0800
Message-ID: <Y38PUmjeFWApHnrh@suse.com>
In-Reply-To: <Y34ZVEeSryB0UTFD@t14s.localdomain>
On 11/23/2022 10:00, Marcelo Ricardo Leitner wrote:
> On Wed, Nov 23, 2022 at 05:44:06PM +0800, Firo Yang wrote:
> > Recently, a customer reported that from their container whose
> > net namespace is different to the host's init_net, they can't set
> > the container's net.sctp.rto_max to any value smaller than
> > init_net.sctp.rto_min.
> >
> > For instance,
> > Host:
> > sudo sysctl net.sctp.rto_min
> > net.sctp.rto_min = 1000
> >
> > Container:
> > echo 100 > /mnt/proc-net/sctp/rto_min
> > echo 400 > /mnt/proc-net/sctp/rto_max
> > echo: write error: Invalid argument
> >
> > This is caused by the check introduced by commit 4f3fdf3bc59c
> > ("sctp: add check rto_min and rto_max in sysctl").
> > When validating the input value, it always refers to the boundary
> > value set for the init_net namespace.
> >
> > Having a container's rto_max smaller than the host's
> > init_net.sctp.rto_min does make sense, considering that the RTO
> > between two containers on the same host is very likely smaller than
> > the RTO between two hosts.
>
> Makes sense. And also, here, it is not using the init_net as
> boundaries for the values themselves. I mean, rto_min in init_net
> won't be the minimum allowed for rto_min in other netns. Ditto for
> rto_max.
>
> More below.
>
> >
> > So to fix this problem, referring to the boundary value from the net
> > namespace where the new input value came from should be enough.
> >
> > Signed-off-by: Firo Yang <firo.yang@suse.com>
> > ---
> > net/sctp/sysctl.c | 6 ++++++
> > 1 file changed, 6 insertions(+)
> >
> > diff --git a/net/sctp/sysctl.c b/net/sctp/sysctl.c
> > index b46a416787ec..e167df4dc60b 100644
> > --- a/net/sctp/sysctl.c
> > +++ b/net/sctp/sysctl.c
> > @@ -429,6 +429,9 @@ static int proc_sctp_do_rto_min(struct ctl_table *ctl, int write,
> >  	else
> >  		tbl.data = &net->sctp.rto_min;
> >
> > +	if (net != &init_net)
> > +		max = net->sctp.rto_max;
>
> This also affects other sysctls:
>
> $ grep -e procname -e extra sysctl.c | grep -B1 extra.*init_net
> .extra1 = SYSCTL_ONE,
> .extra2 = &init_net.sctp.rto_max
> .procname = "rto_max",
> .extra1 = &init_net.sctp.rto_min,
> --
> .extra1 = SYSCTL_ZERO,
> .extra2 = &init_net.sctp.ps_retrans,
> .procname = "ps_retrans",
> .extra1 = &init_net.sctp.pf_retrans,
>
> And apparently, SCTP is the only one doing such dynamic limits. At
> least in networking.
>
> While the issue you reported is fixable this way, for ps/pf_retrans,
> it is not, as it is using proc_dointvec_minmax() and it will simply
> consume those values (with no netns translation).
>
> So what about patching sctp_sysctl_net_register() instead, to update
> these pointers during netns creation? Right after where it updates the
> 'data' one in there:
>
> 	for (i = 0; table[i].data; i++)
> 		table[i].data += (char *)(&net->sctp) - (char *)&init_net.sctp;

Thanks Marcelo. That's better. So you mean something like the following?

--- a/net/sctp/sysctl.c
+++ b/net/sctp/sysctl.c
@@ -586,6 +586,11 @@ int sctp_sysctl_net_register(struct net *net)
 	for (i = 0; table[i].data; i++)
 		table[i].data += (char *)(&net->sctp) - (char *)&init_net.sctp;
 
+#define SCTP_RTO_MIN_IDX 1
+#define SCTP_RTO_MAX_IDX 2
+	table[SCTP_RTO_MIN_IDX].extra2 = &net->sctp.rto_max;
+	table[SCTP_RTO_MAX_IDX].extra1 = &net->sctp.rto_min;
+
 	net->sctp.sysctl_header = register_net_sysctl(net, "net/sctp", table);
 	if (net->sctp.sysctl_header == NULL) {
 		kfree(table);
>
> Thanks,
> Marcelo
>
> > +
> >  	ret = proc_dointvec(&tbl, write, buffer, lenp, ppos);
> >  	if (write && ret == 0) {
> >  		if (new_value > max || new_value < min)
> > @@ -457,6 +460,9 @@ static int proc_sctp_do_rto_max(struct ctl_table *ctl, int write,
> >  	else
> >  		tbl.data = &net->sctp.rto_max;
> >
> > +	if (net != &init_net)
> > +		min = net->sctp.rto_min;
> > +
> >  	ret = proc_dointvec(&tbl, write, buffer, lenp, ppos);
> >  	if (write && ret == 0) {
> >  		if (new_value > max || new_value < min)
> > --
> > 2.26.2
> >
Thread overview: 5+ messages
2022-11-23 9:44 [PATCH 1/1] sctp: sysctl: referring the correct net namespace Firo Yang
2022-11-23 13:00 ` Marcelo Ricardo Leitner
2022-11-24 6:29 ` Firo Yang [this message]
2022-11-24 17:57 ` Marcelo Ricardo Leitner
2022-11-25 5:53 ` Firo Yang