public inbox for netdev@vger.kernel.org
From: Breno Leitao <leitao@debian.org>
To: "David S. Miller" <davem@davemloft.net>,
	 Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>,
	 Paolo Abeni <pabeni@redhat.com>, Simon Horman <horms@kernel.org>,
	 Kuniyuki Iwashima <kuniyu@google.com>,
	 Willem de Bruijn <willemb@google.com>,
	metze@samba.org, axboe@kernel.dk,
	 Stanislav Fomichev <sdf@fomichev.me>
Cc: io-uring@vger.kernel.org, bpf@vger.kernel.org,
	netdev@vger.kernel.org,
	 Linus Torvalds <torvalds@linux-foundation.org>,
	 linux-kernel@vger.kernel.org, kernel-team@meta.com,
	 Breno Leitao <leitao@debian.org>
Subject: [PATCH net-next v2 0/4] net: move .getsockopt away from __user buffers
Date: Wed, 01 Apr 2026 08:44:25 -0700
Message-ID: <20260401-getsockopt-v2-0-611df6771aff@debian.org>

Currently, the .getsockopt callback requires __user pointers:

  int (*getsockopt)(struct socket *sock, int level,
                    int optname, char __user *optval, int __user *optlen);

This prevents kernel callers (io_uring, BPF) from using getsockopt on
levels other than SOL_SOCKET, since they pass kernel pointers.

Following Linus' suggestion [0], this series introduces sockopt_t, a
type-safe wrapper around iov_iter, and a getsockopt_iter callback that
works with both user and kernel buffers. AF_PACKET and CAN raw are
converted as initial users, with selftests covering the trickiest
conversion patterns.

[0] https://lore.kernel.org/all/CAHk-=whmzrO-BMU=uSVXbuoLi-3tJsO=0kHj1BCPBE3F2kVhTA@mail.gmail.com/

Below are some questions raised during the RFC discussion:

1) Should optlen be an iov_iter as well?
  No. optlen can remain a plain kernel int since do_sock_getsockopt_iter() syncs
  it back to userspace on both success and failure. The existing callback
  patterns all work with this approach:

  a) Most callbacks (roughly 2/3) always write back optlen.

  b) Some callbacks read optlen but never update it. The original
     value is written back unchanged.

  c) CAN raw updates optlen even on error (-ERANGE) to report the
     required buffer size:

          err = -ERANGE;
          if (put_user(fsize, optlen))
                  err = -EFAULT;

     No regression, since opt.optlen is always written back to
     userspace by the wrapper.

  d) Bluetooth uses put_user() with mixed sizes (u32, u16, u8) but
     never updates optlen. Same as case (b).
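
  The write-back rule can be modelled in plain userspace C. This is only a
  sketch under stated assumptions: toy_sockopt, toy_getsockopt() and both
  callbacks are hypothetical names, not code from this series, and a plain
  int pointer stands in for the userspace optlen.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for sockopt_t; only optlen is modelled here. */
struct toy_sockopt {
	int optlen;	/* plain kernel-side int */
};

/* Like case (c): report the required size even when failing. */
static int toy_get_filter(struct toy_sockopt *opt)
{
	const int fsize = 16;	/* made-up required size */
	int err = opt->optlen < fsize ? -ERANGE : 0;

	opt->optlen = fsize;	/* updated on success and on -ERANGE */
	return err;
}

/* Like cases (b) and (d): read optlen but never update it. */
static int toy_get_flag(struct toy_sockopt *opt)
{
	return opt->optlen < (int)sizeof(int) ? -EINVAL : 0;
}

/*
 * Toy wrapper: copy optlen in, run the callback, and copy optlen back out
 * whether or not the callback failed. In the kernel the two copies would
 * go to and from userspace; here user_optlen is just a pointer.
 */
static int toy_getsockopt(int (*cb)(struct toy_sockopt *), int *user_optlen)
{
	struct toy_sockopt opt = { .optlen = *user_optlen };
	int err;

	assert(cb);
	err = cb(&opt);
	*user_optlen = opt.optlen;	/* written back unconditionally */
	return err;
}
```

  With a 4-byte buffer, toy_get_filter fails with -ERANGE and the caller
  still sees optlen set to 16; toy_get_flag leaves the caller's optlen as it
  was, because the unchanged value is what gets written back.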

2) Can callbacks change iov_iter direction mid-flight?

  Yes. Some protocols read from and then write back to optval in the same
  getsockopt call. For example, PACKET_HDRLEN reads a tpacket version from optval
  and writes back the corresponding header size.

  The converted callback handles this by temporarily flipping the iter direction,
  reverting the position, and writing back:

          case PACKET_HDRLEN:
                  // opt->iter.data_source is ITER_SOURCE here
                  if (copy_from_iter(&val, len, &opt->iter) != len)
                          return -EFAULT;
                  // rewind over the bytes just read
                  iov_iter_revert(&opt->iter, len);
                  opt->iter.data_source = ITER_DEST;
                  // ... update val ...
                  if (copy_to_iter(&val, len, &opt->iter) != len)
                          return -EFAULT;

  The callback needs to handle two things after reading from the iter:
  reset the position with iov_iter_revert(), and flip data_source back
  to ITER_DEST before writing.

  - ITER_DEST: the iter is a destination (the kernel writes to it).
    copy_to_iter() works, copy_from_iter() refuses.
  - ITER_SOURCE: the iter is a source (the kernel reads from it).
    copy_from_iter() works, copy_to_iter() refuses.
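
  The read, revert, flip, write sequence can be tried out on a toy iterator
  in userspace. This is a minimal sketch, not the kernel's struct iov_iter:
  every toy_* name is hypothetical, the buffer is a single segment, copies
  are all-or-nothing, and multiplying by 10 is only a placeholder for
  "update val".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

enum toy_dir { TOY_DEST, TOY_SOURCE };	/* mirrors ITER_DEST / ITER_SOURCE */

/* Toy single-segment iterator; NOT the kernel's struct iov_iter. */
struct toy_iter {
	unsigned char *buf;
	size_t len;		/* total buffer size */
	size_t pos;		/* current offset, advanced by every copy */
	enum toy_dir dir;
};

/* Read from the buffer; refuses (returns 0) unless the iter is a source. */
static size_t toy_copy_from_iter(void *dst, size_t n, struct toy_iter *it)
{
	assert(it->pos <= it->len);
	if (it->dir != TOY_SOURCE || n > it->len - it->pos)
		return 0;
	memcpy(dst, it->buf + it->pos, n);
	it->pos += n;
	return n;
}

/* Write to the buffer; refuses (returns 0) unless the iter is a destination. */
static size_t toy_copy_to_iter(const void *src, size_t n, struct toy_iter *it)
{
	assert(it->pos <= it->len);
	if (it->dir != TOY_DEST || n > it->len - it->pos)
		return 0;
	memcpy(it->buf + it->pos, src, n);
	it->pos += n;
	return n;
}

/* Move the position back by n bytes. */
static void toy_iter_revert(struct toy_iter *it, size_t n)
{
	assert(n <= it->pos);
	it->pos -= n;
}

/* Read an int, rewind, flip direction, write the updated int back. */
static int toy_hdrlen(struct toy_iter *it)
{
	int val;

	if (toy_copy_from_iter(&val, sizeof(val), it) != sizeof(val))
		return -EFAULT;
	toy_iter_revert(it, sizeof(val));	/* back to the start */
	it->dir = TOY_DEST;			/* now writable */
	val *= 10;				/* placeholder for "update val" */
	if (toy_copy_to_iter(&val, sizeof(val), it) != sizeof(val))
		return -EFAULT;
	return 0;
}
```

  In this toy, dropping the toy_iter_revert() call makes the final copy fail,
  because the position already sits at the end of the 4-byte buffer.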

3) When does iov_iter_revert() need to be called?

  When a callback needs to read from and then write back to the same
  buffer in a single getsockopt call. The iter advances its position on
  copy_from_iter(), so you need iov_iter_revert() to reset the position
  back to the start before you can copy_to_iter() into the same location.

  Without the revert, copy_to_iter() would start where the read left off,
  after the bytes just consumed, instead of overwriting the value at the
  start of the buffer.

4) Do we have any selftest for this change?

  Yes. I have a commit that I use for testing, but I am not sure how
  useful it is right now, so I am not including it in this series.

  You can find it at
  https://github.com/leitao/linux/commit/2d9311947061f1baa43858f597dd6c54d7ccc5d2

Note: The dance around opt->iter.data_source (2) and iov_iter_revert() (3)
is a bit fragile. A helper (e.g., sockopt_read_val()) would be safer and
would keep others from getting it wrong.

I am not adding it now so that the bare bones of the change are easier to
read; helpers can come later.
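
For illustration only, here is one shape such a helper could take, written
against a toy iterator in userspace. Everything here is an assumption: the
series does not define sockopt_read_val(), so the signature, the toy_iter
type and the -EFAULT return are all hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the iter inside sockopt_t; NOT the kernel's iov_iter. */
struct toy_iter {
	unsigned char *buf;
	size_t len;		/* total buffer size */
	size_t pos;		/* current offset */
	bool is_source;		/* true: readable, false: writable */
};

/*
 * Hypothetical helper: read n bytes of input from the option buffer, then
 * leave the iter rewound and flipped so the next operation can write the
 * result over the same bytes.
 */
static int toy_sockopt_read_val(struct toy_iter *it, void *val, size_t n)
{
	assert(it->pos <= it->len);
	if (!it->is_source || n > it->len - it->pos)
		return -EFAULT;

	memcpy(val, it->buf + it->pos, n);	/* the read ... */
	it->pos += n;				/* ... advances the iter */

	it->pos -= n;				/* step 1: revert the position */
	it->is_source = false;			/* step 2: flip to destination */
	return 0;
}
```

Keeping both steps in one function means a caller cannot do one and forget
the other. A second call on the same iter fails in this sketch, since the
iter is no longer a source.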

Link: https://lore.kernel.org/all/CAHk-=whmzrO-BMU=uSVXbuoLi-3tJsO=0kHj1BCPBE3F2kVhTA@mail.gmail.com/ [0]
---
Changes in v2:
- Restore optlen even on error path (getsockopt_iter fails)
- Convert af_packet and CAN raw instead of netlink (these are the most
  complicated ones).
- Link to v1: https://patch.msgid.link/20260130-getsockopt-v1-0-9154fcff6f95@debian.org

---
Breno Leitao (4):
      net: add getsockopt_iter callback to proto_ops
      net: call getsockopt_iter if available
      af_packet: convert to getsockopt_iter
      can: raw: convert to getsockopt_iter

 include/linux/net.h    | 19 +++++++++++++++++++
 net/can/raw.c          | 28 +++++++++++++---------------
 net/packet/af_packet.c | 18 ++++++++++--------
 net/socket.c           | 48 +++++++++++++++++++++++++++++++++++++++++++++---
 4 files changed, 87 insertions(+), 26 deletions(-)
---
base-commit: 2d9311947061f1baa43858f597dd6c54d7ccc5d2
change-id: 20260130-getsockopt-9f36625eedcb

Best regards,
--  
Breno Leitao <leitao@debian.org>

