Re: [PATCH net-next v3 0/4] net: move .getsockopt away from __user buffers

From: Breno Leitao <leitao@debian.org>
To: David Laight <david.laight.linux@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>,
	 Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>,
	 Paolo Abeni <pabeni@redhat.com>, Simon Horman <horms@kernel.org>,
	 Kuniyuki Iwashima <kuniyu@google.com>,
	Willem de Bruijn <willemb@google.com>,
	metze@samba.org,  axboe@kernel.dk,
	Stanislav Fomichev <sdf@fomichev.me>,
	io-uring@vger.kernel.org,  bpf@vger.kernel.org,
	netdev@vger.kernel.org,
	 Linus Torvalds <torvalds@linux-foundation.org>,
	linux-kernel@vger.kernel.org, kernel-team@meta.com
Subject: Re: [PATCH net-next v3 0/4] net: move .getsockopt away from __user buffers
Date: Fri, 10 Apr 2026 05:29:37 -0700
Message-ID: <adjkn7p4U13WBs2o@gmail.com>
In-Reply-To: <20260408195640.324ee932@pumpkin>

On Wed, Apr 08, 2026 at 07:56:40PM +0100, David Laight wrote:
> On Wed, 8 Apr 2026 06:52:54 -0700
> Breno Leitao <leitao@debian.org> wrote:
>
> > Hello David,
> >
> > On Wed, Apr 08, 2026 at 12:26:53PM +0100, David Laight wrote:
> > > On Wed, 08 Apr 2026 03:30:28 -0700
> > > Breno Leitao <leitao@debian.org> wrote:
> > >
> > > > Currently, the .getsockopt callback requires __user pointers:
> > > >
> > > >   int (*getsockopt)(struct socket *sock, int level,
> > > >                     int optname, char __user *optval, int __user *optlen);
> > > >
> > > > This prevents kernel callers (io_uring, BPF) from using getsockopt on
> > > > levels other than SOL_SOCKET, since they pass kernel pointers.
> > > >
> > > > Following Linus' suggestion [0], this series introduces sockopt_t, a
> > > > type-safe wrapper around iov_iter, and a getsockopt_iter callback that
> > > > works with both user and kernel buffers. AF_PACKET and CAN raw are
> > > > converted as initial users, with selftests covering the trickiest
> > > > conversion patterns.
> > >
> > > What are you doing about the cases where 'optlen' is a complete lie?
> >
> > Is this incorrect optlen originating from userspace, and getting into
> > the .getsockopt callbacks?
>
> Look at tcp_ao_copy_mptks_to_user() in net/ipv4/tcp_ao.c
> This isn't 'old code' it was added in 2023.
>
> Basically what is being transferred is an array and 'optlen' is the
> size of one element.
> The number of elements is in the first one.

Thank you for pointing this out. I now understand the issue you're raising.

The problem is that optlen doesn't represent the full buffer length. Instead:
	optlen = per-element struct size (stride)
	actual buffer length = optlen * nkeys (where nkeys comes from optval[0])
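
As a rough illustration of that arithmetic, here is a minimal userspace
sketch (not kernel code) that derives the real buffer length from the
stride and the element count while guarding against overflow. The helper
name sockopt_array_len and the choice of a 32-bit nkeys are assumptions
made for this example only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: optlen is the per-element stride and nkeys is the
 * element count taken from the first element. On success, *total holds
 * the real buffer length. Returns false for a zero stride or when the
 * product would not fit in size_t.
 */
static bool sockopt_array_len(size_t optlen, uint32_t nkeys, size_t *total)
{
	if (optlen == 0)
		return false;
	if (nkeys > SIZE_MAX / optlen)
		return false;
	*total = optlen * (size_t)nkeys;
	return true;
}
```

Whatever mechanism ends up resizing the iter would likely need a check
of this kind, since nkeys is supplied by the caller.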

For handling these cases, I can think of several approaches:

1) Dynamically resize the iter once the actual buffer size is discovered:

  int sockopt_set_buflen(sockopt_t *opt, size_t new_len)
  {
      if (!opt->legacy)   /* Following Stefan's suggestion */
          return -EINVAL;

      /* Re-initialize iter with the actual buffer size.
       * For ubuf: same base pointer, updated count.
       * For kvec: same iov_base, updated iov_len + re-init.
       */
  }

This allows legacy protocols to adjust the iov_iter later, mirroring
current behavior. Note that this doesn't worsen the existing situation:
today the code behaves as if the iov_iter length were set to INT_MAX,
since the callback can read or write any location reachable from that
__user pointer.
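
The resize in approach 1 can be modelled in userspace roughly as follows.
This is only a sketch: struct sockopt_model and its fields are stand-ins
invented for the example, not the layout of sockopt_t in the series, and
no real iov_iter is involved.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the proposed sockopt_t; all fields are assumed. */
struct sockopt_model {
	void *base;     /* start of the caller's buffer */
	size_t count;   /* bytes the iterator may currently touch */
	size_t offset;  /* bytes already consumed */
	bool legacy;    /* set only for unconverted, legacy protocols */
};

/* Widen (or shrink) the window once the callee knows the true size. */
static int sockopt_model_set_buflen(struct sockopt_model *opt, size_t new_len)
{
	if (!opt->legacy)
		return -EINVAL;

	/* Same base pointer, new length; iteration restarts at offset 0. */
	opt->count = new_len;
	opt->offset = 0;
	return 0;
}
```

Gating on the legacy flag keeps converted protocols from ever growing
their window past the length the caller supplied.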

2) Use a special legacy callback path, as proposed by Stefan.

3) Store base pointers in sockopt_t and defer iter initialization for
   legacy callbacks:

  static int tcp_ao_copy_mkts_to_user(const struct sock *sk,
                                      struct tcp_ao_info *ao_info,
                                      sockopt_t *opt)
  {
        struct tcp_ao_getsockopt opt_in;
        int user_len = opt->optlen;
        struct kvec kvec;

        /* First, initialize a small iter to read the first element */
        sockopt_init_iter(opt, user_len, &kvec);

        if (copy_from_iter(&opt_in, user_len, &opt->iter_in) != user_len)
                return -EFAULT;

        /* Now we know the actual buffer size */
        sockopt_init_iter(opt, user_len * opt_in.nkeys, &kvec);

        /* ... write the full array via copy_to_iter() ... */
  }
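
The two-phase pattern above (read the first element, then treat the
buffer as an array) can be tried out in userspace with a sketch like the
one below. struct elem, fill_array() and the buf_cap parameter are all
hypothetical; in particular, a bare __user pointer gives the kernel no
equivalent of buf_cap, which is exactly the problem being discussed.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical element layout; the real struct tcp_ao_getsockopt differs. */
struct elem {
	uint32_t nkeys;  /* only meaningful in the first element */
	uint32_t value;
};

/*
 * Phase 1: read the first element to learn nkeys.
 * Phase 2: treat the buffer as nkeys elements and fill each one.
 * Returns the number of elements written, or a negative errno.
 */
static int fill_array(void *buf, size_t buf_cap, size_t stride)
{
	struct elem first;
	uint32_t i;

	if (stride != sizeof(first) || buf_cap < stride)
		return -EINVAL;
	memcpy(&first, buf, sizeof(first));

	/* Reject counts that would run past the real capacity. */
	if (first.nkeys > buf_cap / stride)
		return -EFAULT;

	for (i = 0; i < first.nkeys; i++) {
		struct elem e = { .nkeys = first.nkeys, .value = i * 10 };

		memcpy((char *)buf + (size_t)i * stride, &e, sizeof(e));
	}
	return (int)first.nkeys;
}
```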

4) Maintain two separate callbacks: ->getsockopt_iter (to be renamed to ->getsockopt
   after the transition) and ->getsockopt_unsafe for legacy cases.
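
For approach 4, the dispatch could look roughly like this userspace
model. The struct, the callback signatures and the stub callbacks are
simplified placeholders for illustration; only the two callback names
come from the proposal above.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified stand-in for proto_ops with both callbacks present. */
struct ops_model {
	int (*getsockopt_iter)(int optname);    /* converted protocols */
	int (*getsockopt_unsafe)(int optname);  /* legacy, unbounded access */
};

/* Stub callbacks so the dispatch can be exercised. */
static int iter_stub(int optname)   { return optname; }
static int unsafe_stub(int optname) { return -optname; }

/* Prefer the iter-based callback, fall back to the legacy one. */
static int do_getsockopt_model(const struct ops_model *ops, int optname)
{
	if (ops->getsockopt_iter)
		return ops->getsockopt_iter(optname);
	if (ops->getsockopt_unsafe)
		return ops->getsockopt_unsafe(optname);
	return -EOPNOTSUPP;
}
```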


Regardless of which approach we take for these legacy implementations, I don't
believe any of them invalidate the current patchset from a design standpoint.

Since these legacy protocols represent less than 1% of the cases, I'd prefer to
optimize for the common path and handle the exceptional cases as exceptions.

Thread overview: 16+ messages
2026-04-08 10:30 [PATCH net-next v3 0/4] net: move .getsockopt away from __user buffers Breno Leitao
2026-04-08 10:30 ` [PATCH net-next v3 1/4] net: add getsockopt_iter callback to proto_ops Breno Leitao
2026-04-08 10:30 ` [PATCH net-next v3 2/4] net: call getsockopt_iter if available Breno Leitao
2026-04-08 10:30 ` [PATCH net-next v3 3/4] af_packet: convert to getsockopt_iter Breno Leitao
2026-04-08 10:30 ` [PATCH net-next v3 4/4] can: raw: " Breno Leitao
2026-04-08 11:26 ` [PATCH net-next v3 0/4] net: move .getsockopt away from __user buffers David Laight
2026-04-08 13:52   ` Breno Leitao
2026-04-08 18:56     ` David Laight
2026-04-10 12:29       ` Breno Leitao [this message]
2026-04-10 14:15         ` David Laight
2026-04-08 13:56   ` Stefan Metzmacher
2026-04-09  8:39     ` Stefan Metzmacher
2026-04-08 17:02 ` Stanislav Fomichev
2026-04-10 12:52   ` Breno Leitao
2026-04-10 15:11     ` Stanislav Fomichev
2026-04-13 22:30 ` patchwork-bot+netdevbpf
