From: Kuniyuki Iwashima <kuniyu@amazon.com>
To: "David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Willem de Bruijn <willemb@google.com>
Cc: Simon Horman <horms@kernel.org>,
Kuniyuki Iwashima <kuniyu@amazon.com>,
Kuniyuki Iwashima <kuni1840@gmail.com>,
Chuck Lever <chuck.lever@oracle.com>,
Jeff Layton <jlayton@kernel.org>,
Matthieu Baerts <matttbe@kernel.org>,
"Keith Busch" <kbusch@kernel.org>, Jens Axboe <axboe@kernel.dk>,
Christoph Hellwig <hch@lst.de>,
Wenjia Zhang <wenjia@linux.ibm.com>,
Jan Karcher <jaka@linux.ibm.com>,
Steve French <sfrench@samba.org>, <netdev@vger.kernel.org>,
<mptcp@lists.linux.dev>, <linux-nfs@vger.kernel.org>,
<linux-rdma@vger.kernel.org>, <linux-nvme@lists.infradead.org>
Subject: [PATCH v2 net-next 3/7] socket: Restore sock_create_kern().
Date: Fri, 23 May 2025 11:21:09 -0700 [thread overview]
Message-ID: <20250523182128.59346-4-kuniyu@amazon.com> (raw)
In-Reply-To: <20250523182128.59346-1-kuniyu@amazon.com>
Let's restore sock_create_kern() that holds a netns reference.
Now, it's the same as the version before commit 26abe14379f8 ("net:
Modify sk_alloc to not reference count the netns of kernel sockets.").
Back then, after creating a socket in init_net, we used sk_change_net()
to drop the netns ref and switch to another netns, but now we can
simply use __sock_create_kern() instead.
$ git blame -L:sk_change_net include/net/sock.h 26abe14379f8~
DEBUG_NET_WARN_ON_ONCE() is to catch a path calling sock_create_kern()
from __net_init functions, since doing so would leak the netns as
__net_exit functions cannot run until the socket is removed.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
v2: s/ret/err/ in sock_create_kern() for clarity
---
include/linux/net.h | 2 ++
net/socket.c | 42 ++++++++++++++++++++++++++++++++++++++++++
2 files changed, 44 insertions(+)
diff --git a/include/linux/net.h b/include/linux/net.h
index 12180e00f882..b60e3afab344 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -254,6 +254,8 @@ bool sock_is_registered(int family);
int sock_create(int family, int type, int proto, struct socket **res);
int __sock_create_kern(struct net *net, int family, int type, int proto,
struct socket **res);
+int sock_create_kern(struct net *net, int family, int type, int proto,
+ struct socket **res);
int sock_create_lite(int family, int type, int proto, struct socket **res);
struct socket *sock_alloc(void);
void sock_release(struct socket *sock);
diff --git a/net/socket.c b/net/socket.c
index 7c4474c966c0..9ad352183fae 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -1632,6 +1632,48 @@ int __sock_create_kern(struct net *net, int family, int type, int protocol, stru
}
EXPORT_SYMBOL(__sock_create_kern);
+/**
+ * sock_create_kern - creates a socket for kernel space
+ *
+ * @net: net namespace
+ * @family: protocol family (AF_INET, ...)
+ * @type: communication type (SOCK_STREAM, ...)
+ * @protocol: protocol (0, ...)
+ * @res: new socket
+ *
+ * Creates a new socket and assigns it to @res.
+ *
+ * The socket is for kernel space and should not be exposed to
+ * userspace via a file descriptor nor BPF hooks except for LSM
+ * (see inet_create(), inet_release(), etc).
+ *
+ * The socket bypasses some LSMs that take care of @kern in
+ * security_socket_create() and security_socket_post_create().
+ *
+ * The socket holds a reference count of @net so that the caller
+ * does not need to care about @net's lifetime.
+ *
+ * This MUST NOT be called from the __net_init path and @net MUST
+ * be alive as of calling sock_create_kern().
+ *
+ * Context: Process context. This function internally uses GFP_KERNEL.
+ * Return: 0 or an error.
+ */
+int sock_create_kern(struct net *net, int family, int type, int protocol,
+ struct socket **res)
+{
+ int err;
+
+ DEBUG_NET_WARN_ON_ONCE(!net_initialized(net));
+
+ err = __sock_create(net, family, type, protocol, res, 1);
+ if (!err)
+ sk_net_refcnt_upgrade((*res)->sk);
+
+ return err;
+}
+EXPORT_SYMBOL(sock_create_kern);
+
static struct socket *__sys_socket_create(int family, int type, int protocol)
{
struct socket *sock;
--
2.49.0
next prev parent reply other threads:[~2025-05-23 18:23 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-23 18:21 [PATCH v2 net-next 0/7] socket: Make sock_create_kern() robust against misuse Kuniyuki Iwashima
2025-05-23 18:21 ` [PATCH v2 net-next 1/7] socket: Un-export __sock_create() Kuniyuki Iwashima
2025-05-26 5:29 ` Christoph Hellwig
2025-05-26 10:06 ` David Laight
2025-05-30 2:42 ` Kuniyuki Iwashima
2025-05-23 18:21 ` [PATCH v2 net-next 2/7] socket: Rename sock_create_kern() to __sock_create_kern() Kuniyuki Iwashima
2025-05-26 5:30 ` Christoph Hellwig
2025-05-29 21:29 ` David Laight
2025-05-30 3:05 ` Kuniyuki Iwashima
2025-05-30 6:48 ` David Laight
2025-05-30 2:45 ` Kuniyuki Iwashima
2025-05-23 18:21 ` Kuniyuki Iwashima [this message]
2025-05-26 5:32 ` [PATCH v2 net-next 3/7] socket: Restore sock_create_kern() Christoph Hellwig
2025-05-30 2:53 ` Kuniyuki Iwashima
2025-06-02 5:08 ` Christoph Hellwig
2025-06-03 21:30 ` David Laight
2025-06-04 18:36 ` Kuniyuki Iwashima
2025-05-23 18:21 ` [PATCH v2 net-next 4/7] smb: client: Add missing net_passive_dec() Kuniyuki Iwashima
2025-05-23 18:21 ` [PATCH v2 net-next 5/7] socket: Remove kernel socket conversion except for net/rds/ Kuniyuki Iwashima
2025-05-26 5:33 ` Christoph Hellwig
2025-05-30 2:59 ` Kuniyuki Iwashima
2025-06-02 5:08 ` Christoph Hellwig
2025-05-23 18:21 ` [PATCH v2 net-next 6/7] socket: Replace most sock_create() calls with sock_create_kern() Kuniyuki Iwashima
2025-05-26 5:33 ` Christoph Hellwig
2025-05-26 5:35 ` Christoph Hellwig
2025-05-30 3:03 ` Kuniyuki Iwashima
2025-06-02 5:09 ` Christoph Hellwig
2025-06-02 21:52 ` Kuniyuki Iwashima
2025-06-03 4:50 ` Christoph Hellwig
2025-06-04 18:20 ` Kuniyuki Iwashima
2025-06-05 4:28 ` Christoph Hellwig
2025-05-23 18:21 ` [PATCH v2 net-next 7/7] socket: Clean up kdoc for sock_create() and sock_create_lite() Kuniyuki Iwashima
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250523182128.59346-4-kuniyu@amazon.com \
--to=kuniyu@amazon.com \
--cc=axboe@kernel.dk \
--cc=chuck.lever@oracle.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=hch@lst.de \
--cc=horms@kernel.org \
--cc=jaka@linux.ibm.com \
--cc=jlayton@kernel.org \
--cc=kbusch@kernel.org \
--cc=kuba@kernel.org \
--cc=kuni1840@gmail.com \
--cc=linux-nfs@vger.kernel.org \
--cc=linux-nvme@lists.infradead.org \
--cc=linux-rdma@vger.kernel.org \
--cc=matttbe@kernel.org \
--cc=mptcp@lists.linux.dev \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=sfrench@samba.org \
--cc=wenjia@linux.ibm.com \
--cc=willemb@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox