* Is xt_owner's owner_mt() racy with sock_orphan()? [worse with new TYPESAFE_BY_RCU file lifetime?]
@ 2023-12-05 16:40 Jann Horn
2023-12-05 17:08 ` Jann Horn
0 siblings, 1 reply; 13+ messages in thread
From: Jann Horn @ 2023-12-05 16:40 UTC (permalink / raw)
To: Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal,
netfilter-devel, coreteam
Cc: Christian Brauner, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Network Development, kernel list
Hi!
I think this code is racy, but testing that seems like a pain...
owner_mt() in xt_owner runs in context of a NF_INET_LOCAL_OUT or
NF_INET_POST_ROUTING hook. It first checks that sk->sk_socket is
non-NULL, then checks that sk->sk_socket->file is non-NULL, then
accesses the ->f_cred of that file.
I don't see anything that protects this against a concurrent
sock_orphan(), which NULLs out the sk->sk_socket pointer, if we're in
the context of a TCP retransmit or something like that. So I think we
can theoretically have a NULL deref, though the compiler will probably
optimize it away by merging the two loads of sk->sk_socket:
static bool
owner_mt(const struct sk_buff *skb, struct xt_action_param *par)
{
[...]
const struct file *filp;
struct sock *sk = skb_to_full_sk(skb);
[...]
if (!sk || !sk->sk_socket || !net_eq(net, sock_net(sk)))
return (info->match ^ info->invert) == 0;
else if (info->match & info->invert & XT_OWNER_SOCKET)
[...]
// concurrent sock_orphan() while we're here
// null deref on second access to sk->sk_socket
filp = sk->sk_socket->file;
if (filp == NULL)
[...]
[...]
}
(Sidenote: I guess this also means xt_owner will treat a sock as
having no owner as soon as it is orphaned? Which probably means that
when a TCP socket enters linger state because it is close()d with data
remaining in the output buffer, the remaining packets will be treated
differently by netfilter?)
I also think we have no guarantee here that the socket's ->file won't
go away due to a concurrent __sock_release(), which could cause us to
continue reading file credentials out of a file whose refcount has
already dropped to zero?
That was probably working sorta okay-ish in practice until now because
files had RCU lifetime, "struct cred" also normally has RCU lifetime,
and nf_hook() runs in an RCU read-side critical section; but now that
"struct file" uses SLAB_TYPESAFE_BY_RCU, I think we can theoretically
race such that the "struct file" could have been freed and reallocated
in the meantime, causing us to see an entirely unrelated "struct file"
and look at creds that are unrelated to the context from which the
packet was sent.
But again, I haven't actually tested this yet, I might be getting it wrong.
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: Is xt_owner's owner_mt() racy with sock_orphan()? [worse with new TYPESAFE_BY_RCU file lifetime?] 2023-12-05 16:40 Is xt_owner's owner_mt() racy with sock_orphan()? [worse with new TYPESAFE_BY_RCU file lifetime?] Jann Horn @ 2023-12-05 17:08 ` Jann Horn 2023-12-05 21:40 ` Phil Sutter 2023-12-06 13:58 ` Christian Brauner 0 siblings, 2 replies; 13+ messages in thread From: Jann Horn @ 2023-12-05 17:08 UTC (permalink / raw) To: Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal, netfilter-devel, coreteam Cc: Christian Brauner, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Network Development, kernel list On Tue, Dec 5, 2023 at 5:40 PM Jann Horn <jannh@google.com> wrote: > > Hi! > > I think this code is racy, but testing that seems like a pain... > > owner_mt() in xt_owner runs in context of a NF_INET_LOCAL_OUT or > NF_INET_POST_ROUTING hook. It first checks that sk->sk_socket is > non-NULL, then checks that sk->sk_socket->file is non-NULL, then > accesses the ->f_cred of that file. > > I don't see anything that protects this against a concurrent > sock_orphan(), which NULLs out the sk->sk_socket pointer, if we're in Ah, and all the other users of ->sk_socket in net/netfilter/ do it under the sk_callback_lock... so I guess the fix would be to add the same in owner_mt? ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Is xt_owner's owner_mt() racy with sock_orphan()? [worse with new TYPESAFE_BY_RCU file lifetime?] 2023-12-05 17:08 ` Jann Horn @ 2023-12-05 21:40 ` Phil Sutter 2023-12-06 16:28 ` Jann Horn 2023-12-06 16:42 ` Pablo Neira Ayuso 2023-12-06 13:58 ` Christian Brauner 1 sibling, 2 replies; 13+ messages in thread From: Phil Sutter @ 2023-12-05 21:40 UTC (permalink / raw) To: Jann Horn Cc: Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal, netfilter-devel, coreteam, Christian Brauner, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Network Development, kernel list [-- Attachment #1: Type: text/plain, Size: 1054 bytes --] Hi, On Tue, Dec 05, 2023 at 06:08:29PM +0100, Jann Horn wrote: > On Tue, Dec 5, 2023 at 5:40 PM Jann Horn <jannh@google.com> wrote: > > > > Hi! > > > > I think this code is racy, but testing that seems like a pain... > > > > owner_mt() in xt_owner runs in context of a NF_INET_LOCAL_OUT or > > NF_INET_POST_ROUTING hook. It first checks that sk->sk_socket is > > non-NULL, then checks that sk->sk_socket->file is non-NULL, then > > accesses the ->f_cred of that file. > > > > I don't see anything that protects this against a concurrent > > sock_orphan(), which NULLs out the sk->sk_socket pointer, if we're in > > Ah, and all the other users of ->sk_socket in net/netfilter/ do it > under the sk_callback_lock... so I guess the fix would be to add the > same in owner_mt? Sounds reasonable, although I wonder how likely a socket is to orphan while netfilter is processing a packet it just sent. How about the attached patch? Not sure what hash to put into a Fixes: tag given this is a day 1 bug and ipt_owner/ip6t_owner predate git. Thanks, Phil [-- Attachment #2: 0001-netfilter-xt_owner-Fix-for-unsafe-access-of-sk-sk_so.patch --] [-- Type: text/x-diff, Size: 1977 bytes --] From 3e28490e43b04d49e6e7f145a70fff7dd42c8cc5 Mon Sep 17 00:00:00 2001 From: Phil Sutter <phil@nwl.cc> Date: Tue, 5 Dec 2023 21:58:12 +0100 Subject: [nf PATCH] netfilter: xt_owner: Fix for unsafe access of sk->sk_socket A concurrently running sock_orphan() may NULL the sk_socket pointer in between check and deref. Follow other users (like nft_meta.c for instance) and acquire sk_callback_lock before dereferencing sk_socket. Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Phil Sutter <phil@nwl.cc> --- net/netfilter/xt_owner.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/net/netfilter/xt_owner.c b/net/netfilter/xt_owner.c index e85ce69924ae..50332888c8d2 100644 --- a/net/netfilter/xt_owner.c +++ b/net/netfilter/xt_owner.c @@ -76,18 +76,23 @@ owner_mt(const struct sk_buff *skb, struct xt_action_param *par) */ return false; - filp = sk->sk_socket->file; - if (filp == NULL) + read_lock_bh(&sk->sk_callback_lock); + filp = sk->sk_socket ? sk->sk_socket->file : NULL; + if (filp == NULL) { + read_unlock_bh(&sk->sk_callback_lock); return ((info->match ^ info->invert) & (XT_OWNER_UID | XT_OWNER_GID)) == 0; + } if (info->match & XT_OWNER_UID) { kuid_t uid_min = make_kuid(net->user_ns, info->uid_min); kuid_t uid_max = make_kuid(net->user_ns, info->uid_max); if ((uid_gte(filp->f_cred->fsuid, uid_min) && uid_lte(filp->f_cred->fsuid, uid_max)) ^ - !(info->invert & XT_OWNER_UID)) + !(info->invert & XT_OWNER_UID)) { + read_unlock_bh(&sk->sk_callback_lock); return false; + } } if (info->match & XT_OWNER_GID) { @@ -112,10 +117,13 @@ owner_mt(const struct sk_buff *skb, struct xt_action_param *par) } } - if (match ^ !(info->invert & XT_OWNER_GID)) + if (match ^ !(info->invert & XT_OWNER_GID)) { + read_unlock_bh(&sk->sk_callback_lock); return false; + } } + read_unlock_bh(&sk->sk_callback_lock); return true; } -- 2.41.0 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: Is xt_owner's owner_mt() racy with sock_orphan()? [worse with new TYPESAFE_BY_RCU file lifetime?] 2023-12-05 21:40 ` Phil Sutter @ 2023-12-06 16:28 ` Jann Horn 2023-12-06 16:48 ` Pablo Neira Ayuso ` (2 more replies) 2023-12-06 16:42 ` Pablo Neira Ayuso 1 sibling, 3 replies; 13+ messages in thread From: Jann Horn @ 2023-12-06 16:28 UTC (permalink / raw) To: Phil Sutter, Jann Horn, Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal, netfilter-devel, coreteam, Christian Brauner, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Network Development, kernel list On Tue, Dec 5, 2023 at 10:40 PM Phil Sutter <phil@nwl.cc> wrote: > On Tue, Dec 05, 2023 at 06:08:29PM +0100, Jann Horn wrote: > > On Tue, Dec 5, 2023 at 5:40 PM Jann Horn <jannh@google.com> wrote: > > > > > > Hi! > > > > > > I think this code is racy, but testing that seems like a pain... > > > > > > owner_mt() in xt_owner runs in context of a NF_INET_LOCAL_OUT or > > > NF_INET_POST_ROUTING hook. It first checks that sk->sk_socket is > > > non-NULL, then checks that sk->sk_socket->file is non-NULL, then > > > accesses the ->f_cred of that file. > > > > > > I don't see anything that protects this against a concurrent > > > sock_orphan(), which NULLs out the sk->sk_socket pointer, if we're in > > > > Ah, and all the other users of ->sk_socket in net/netfilter/ do it > > under the sk_callback_lock... so I guess the fix would be to add the > > same in owner_mt? > > Sounds reasonable, although I wonder how likely a socket is to > orphan while netfilter is processing a packet it just sent. > > How about the attached patch? Not sure what hash to put into a Fixes: > tag given this is a day 1 bug and ipt_owner/ip6t_owner predate git. Looks mostly reasonable to me; though I guess it's a bit weird to have two separate bailout paths for checking whether sk->sk_socket is NULL, where the first check can race, and the second check uses different logic for determining the return value; I don't know whether that actually matters semantically. But I'm not sure how to make it look nicer either. I guess you could add a READ_ONCE() around the first read to signal that that's a potentially racy read, but I don't feel strongly about that. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Is xt_owner's owner_mt() racy with sock_orphan()? [worse with new TYPESAFE_BY_RCU file lifetime?] 2023-12-06 16:28 ` Jann Horn @ 2023-12-06 16:48 ` Pablo Neira Ayuso 2023-12-06 16:49 ` Christian Brauner 2023-12-06 20:42 ` Phil Sutter 2 siblings, 0 replies; 13+ messages in thread From: Pablo Neira Ayuso @ 2023-12-06 16:48 UTC (permalink / raw) To: Jann Horn Cc: Phil Sutter, Jozsef Kadlecsik, Florian Westphal, netfilter-devel, coreteam, Christian Brauner, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Network Development, kernel list On Wed, Dec 06, 2023 at 05:28:44PM +0100, Jann Horn wrote: > On Tue, Dec 5, 2023 at 10:40 PM Phil Sutter <phil@nwl.cc> wrote: > > On Tue, Dec 05, 2023 at 06:08:29PM +0100, Jann Horn wrote: > > > On Tue, Dec 5, 2023 at 5:40 PM Jann Horn <jannh@google.com> wrote: > > > > > > > > Hi! > > > > > > > > I think this code is racy, but testing that seems like a pain... > > > > > > > > owner_mt() in xt_owner runs in context of a NF_INET_LOCAL_OUT or > > > > NF_INET_POST_ROUTING hook. It first checks that sk->sk_socket is > > > > non-NULL, then checks that sk->sk_socket->file is non-NULL, then > > > > accesses the ->f_cred of that file. > > > > > > > > I don't see anything that protects this against a concurrent > > > > sock_orphan(), which NULLs out the sk->sk_socket pointer, if we're in > > > > > > Ah, and all the other users of ->sk_socket in net/netfilter/ do it > > > under the sk_callback_lock... so I guess the fix would be to add the > > > same in owner_mt? > > > > Sounds reasonable, although I wonder how likely a socket is to > > orphan while netfilter is processing a packet it just sent. > > > > How about the attached patch? Not sure what hash to put into a Fixes: > > tag given this is a day 1 bug and ipt_owner/ip6t_owner predate git. > > Looks mostly reasonable to me; though I guess it's a bit weird to have > two separate bailout paths for checking whether sk->sk_socket is NULL, > where the first check can race, and the second check uses different > logic for determining the return value; I don't know whether that > actually matters semantically. But I'm not sure how to make it look > nicer either. > I guess you could add a READ_ONCE() around the first read to signal > that that's a potentially racy read, but I don't feel strongly about > that. The lack of READ_ONCE() on sk->sk_socket also got me thinking here, given this sk_set_socket(sk, NULL) in sock_orphan is done under the lock. I am taking Phil's patch, I think it is leaving things in better shape. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Is xt_owner's owner_mt() racy with sock_orphan()? [worse with new TYPESAFE_BY_RCU file lifetime?] 2023-12-06 16:28 ` Jann Horn 2023-12-06 16:48 ` Pablo Neira Ayuso @ 2023-12-06 16:49 ` Christian Brauner 2023-12-06 20:42 ` Phil Sutter 2 siblings, 0 replies; 13+ messages in thread From: Christian Brauner @ 2023-12-06 16:49 UTC (permalink / raw) To: Jann Horn Cc: Phil Sutter, Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal, netfilter-devel, coreteam, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Network Development, kernel list On Wed, Dec 06, 2023 at 05:28:44PM +0100, Jann Horn wrote: > On Tue, Dec 5, 2023 at 10:40 PM Phil Sutter <phil@nwl.cc> wrote: > > On Tue, Dec 05, 2023 at 06:08:29PM +0100, Jann Horn wrote: > > > On Tue, Dec 5, 2023 at 5:40 PM Jann Horn <jannh@google.com> wrote: > > > > > > > > Hi! > > > > > > > > I think this code is racy, but testing that seems like a pain... > > > > > > > > owner_mt() in xt_owner runs in context of a NF_INET_LOCAL_OUT or > > > > NF_INET_POST_ROUTING hook. It first checks that sk->sk_socket is > > > > non-NULL, then checks that sk->sk_socket->file is non-NULL, then > > > > accesses the ->f_cred of that file. > > > > > > > > I don't see anything that protects this against a concurrent > > > > sock_orphan(), which NULLs out the sk->sk_socket pointer, if we're in > > > > > > Ah, and all the other users of ->sk_socket in net/netfilter/ do it > > > under the sk_callback_lock... so I guess the fix would be to add the > > > same in owner_mt? > > > > Sounds reasonable, although I wonder how likely a socket is to > > orphan while netfilter is processing a packet it just sent. > > > > How about the attached patch? Not sure what hash to put into a Fixes: > > tag given this is a day 1 bug and ipt_owner/ip6t_owner predate git. > > Looks mostly reasonable to me; though I guess it's a bit weird to have > two separate bailout paths for checking whether sk->sk_socket is NULL, > where the first check can race, and the second check uses different > logic for determining the return value; I don't know whether that > actually matters semantically. But I'm not sure how to make it look > nicer either. > I guess you could add a READ_ONCE() around the first read to signal > that that's a potentially racy read, but I don't feel strongly about > that. It should be possible to split it into two static inlin helpers: owner_mt_fast() owner_mt_slow() And then abstract the lockless and locked fetches into the two helpers. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Is xt_owner's owner_mt() racy with sock_orphan()? [worse with new TYPESAFE_BY_RCU file lifetime?] 2023-12-06 16:28 ` Jann Horn 2023-12-06 16:48 ` Pablo Neira Ayuso 2023-12-06 16:49 ` Christian Brauner @ 2023-12-06 20:42 ` Phil Sutter 2023-12-06 21:02 ` Jann Horn 2 siblings, 1 reply; 13+ messages in thread From: Phil Sutter @ 2023-12-06 20:42 UTC (permalink / raw) To: Jann Horn Cc: Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal, netfilter-devel, coreteam, Christian Brauner, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Network Development, kernel list On Wed, Dec 06, 2023 at 05:28:44PM +0100, Jann Horn wrote: > On Tue, Dec 5, 2023 at 10:40 PM Phil Sutter <phil@nwl.cc> wrote: > > On Tue, Dec 05, 2023 at 06:08:29PM +0100, Jann Horn wrote: > > > On Tue, Dec 5, 2023 at 5:40 PM Jann Horn <jannh@google.com> wrote: > > > > > > > > Hi! > > > > > > > > I think this code is racy, but testing that seems like a pain... > > > > > > > > owner_mt() in xt_owner runs in context of a NF_INET_LOCAL_OUT or > > > > NF_INET_POST_ROUTING hook. It first checks that sk->sk_socket is > > > > non-NULL, then checks that sk->sk_socket->file is non-NULL, then > > > > accesses the ->f_cred of that file. > > > > > > > > I don't see anything that protects this against a concurrent > > > > sock_orphan(), which NULLs out the sk->sk_socket pointer, if we're in > > > > > > Ah, and all the other users of ->sk_socket in net/netfilter/ do it > > > under the sk_callback_lock... so I guess the fix would be to add the > > > same in owner_mt? > > > > Sounds reasonable, although I wonder how likely a socket is to > > orphan while netfilter is processing a packet it just sent. > > > > How about the attached patch? Not sure what hash to put into a Fixes: > > tag given this is a day 1 bug and ipt_owner/ip6t_owner predate git. > > Looks mostly reasonable to me; though I guess it's a bit weird to have > two separate bailout paths for checking whether sk->sk_socket is NULL, > where the first check can race, and the second check uses different > logic for determining the return value; I don't know whether that > actually matters semantically. But I'm not sure how to make it look > nicer either. I find the code pretty confusing since it combines three matches (socket UID, socket GID and socket existence) via binary ops. The second bail disregards socket existence bits, I assumed it was deliberate and thus decided to leave the first part as-is. > I guess you could add a READ_ONCE() around the first read to signal > that that's a potentially racy read, but I don't feel strongly about > that. Is this just annotation or do you see a practical effect of using READ_ONCE() there? Either way, thanks for the review! Phil ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Is xt_owner's owner_mt() racy with sock_orphan()? [worse with new TYPESAFE_BY_RCU file lifetime?] 2023-12-06 20:42 ` Phil Sutter @ 2023-12-06 21:02 ` Jann Horn 2023-12-07 18:09 ` Phil Sutter 0 siblings, 1 reply; 13+ messages in thread From: Jann Horn @ 2023-12-06 21:02 UTC (permalink / raw) To: Phil Sutter, Jann Horn, Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal, netfilter-devel, coreteam, Christian Brauner, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Network Development, kernel list On Wed, Dec 6, 2023 at 9:42 PM Phil Sutter <phil@nwl.cc> wrote: > > On Wed, Dec 06, 2023 at 05:28:44PM +0100, Jann Horn wrote: > > On Tue, Dec 5, 2023 at 10:40 PM Phil Sutter <phil@nwl.cc> wrote: > > > On Tue, Dec 05, 2023 at 06:08:29PM +0100, Jann Horn wrote: > > > > On Tue, Dec 5, 2023 at 5:40 PM Jann Horn <jannh@google.com> wrote: > > > > > > > > > > Hi! > > > > > > > > > > I think this code is racy, but testing that seems like a pain... > > > > > > > > > > owner_mt() in xt_owner runs in context of a NF_INET_LOCAL_OUT or > > > > > NF_INET_POST_ROUTING hook. It first checks that sk->sk_socket is > > > > > non-NULL, then checks that sk->sk_socket->file is non-NULL, then > > > > > accesses the ->f_cred of that file. > > > > > > > > > > I don't see anything that protects this against a concurrent > > > > > sock_orphan(), which NULLs out the sk->sk_socket pointer, if we're in > > > > > > > > Ah, and all the other users of ->sk_socket in net/netfilter/ do it > > > > under the sk_callback_lock... so I guess the fix would be to add the > > > > same in owner_mt? > > > > > > Sounds reasonable, although I wonder how likely a socket is to > > > orphan while netfilter is processing a packet it just sent. > > > > > > How about the attached patch? Not sure what hash to put into a Fixes: > > > tag given this is a day 1 bug and ipt_owner/ip6t_owner predate git. > > > > Looks mostly reasonable to me; though I guess it's a bit weird to have > > two separate bailout paths for checking whether sk->sk_socket is NULL, > > where the first check can race, and the second check uses different > > logic for determining the return value; I don't know whether that > > actually matters semantically. But I'm not sure how to make it look > > nicer either. > > I find the code pretty confusing since it combines three matches (socket > UID, socket GID and socket existence) via binary ops. The second bail > disregards socket existence bits, I assumed it was deliberate and thus > decided to leave the first part as-is. > > > I guess you could add a READ_ONCE() around the first read to signal > > that that's a potentially racy read, but I don't feel strongly about > > that. > > Is this just annotation or do you see a practical effect of using > READ_ONCE() there? I mostly just meant that as an annotation. My understanding is that in theory, racy reads can cause the compiler to do some terrible things to your code (https://lore.kernel.org/all/CAG48ez2nFks+yN1Kp4TZisso+rjvv_4UW0FTo8iFUd4Qyq1qDw@mail.gmail.com/), but that's almost certainly not going to happen here. (Well, I guess doing a READ_ONCE() at one side without doing WRITE_ONCE() on the other side is also unclean...) ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Is xt_owner's owner_mt() racy with sock_orphan()? [worse with new TYPESAFE_BY_RCU file lifetime?] 2023-12-06 21:02 ` Jann Horn @ 2023-12-07 18:09 ` Phil Sutter 0 siblings, 0 replies; 13+ messages in thread From: Phil Sutter @ 2023-12-07 18:09 UTC (permalink / raw) To: Jann Horn Cc: Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal, netfilter-devel, coreteam, Christian Brauner, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Network Development, kernel list On Wed, Dec 06, 2023 at 10:02:04PM +0100, Jann Horn wrote: > On Wed, Dec 6, 2023 at 9:42 PM Phil Sutter <phil@nwl.cc> wrote: > > > > On Wed, Dec 06, 2023 at 05:28:44PM +0100, Jann Horn wrote: > > > On Tue, Dec 5, 2023 at 10:40 PM Phil Sutter <phil@nwl.cc> wrote: > > > > On Tue, Dec 05, 2023 at 06:08:29PM +0100, Jann Horn wrote: > > > > > On Tue, Dec 5, 2023 at 5:40 PM Jann Horn <jannh@google.com> wrote: > > > > > > > > > > > > Hi! > > > > > > > > > > > > I think this code is racy, but testing that seems like a pain... > > > > > > > > > > > > owner_mt() in xt_owner runs in context of a NF_INET_LOCAL_OUT or > > > > > > NF_INET_POST_ROUTING hook. It first checks that sk->sk_socket is > > > > > > non-NULL, then checks that sk->sk_socket->file is non-NULL, then > > > > > > accesses the ->f_cred of that file. > > > > > > > > > > > > I don't see anything that protects this against a concurrent > > > > > > sock_orphan(), which NULLs out the sk->sk_socket pointer, if we're in > > > > > > > > > > Ah, and all the other users of ->sk_socket in net/netfilter/ do it > > > > > under the sk_callback_lock... so I guess the fix would be to add the > > > > > same in owner_mt? > > > > > > > > Sounds reasonable, although I wonder how likely a socket is to > > > > orphan while netfilter is processing a packet it just sent. > > > > > > > > How about the attached patch? Not sure what hash to put into a Fixes: > > > > tag given this is a day 1 bug and ipt_owner/ip6t_owner predate git. > > > > > > Looks mostly reasonable to me; though I guess it's a bit weird to have > > > two separate bailout paths for checking whether sk->sk_socket is NULL, > > > where the first check can race, and the second check uses different > > > logic for determining the return value; I don't know whether that > > > actually matters semantically. But I'm not sure how to make it look > > > nicer either. > > > > I find the code pretty confusing since it combines three matches (socket > > UID, socket GID and socket existence) via binary ops. The second bail > > disregards socket existence bits, I assumed it was deliberate and thus > > decided to leave the first part as-is. > > > > > I guess you could add a READ_ONCE() around the first read to signal > > > that that's a potentially racy read, but I don't feel strongly about > > > that. > > > > Is this just annotation or do you see a practical effect of using > > READ_ONCE() there? > > I mostly just meant that as an annotation. My understanding is that in > theory, racy reads can cause the compiler to do some terrible things > to your code (https://lore.kernel.org/all/CAG48ez2nFks+yN1Kp4TZisso+rjvv_4UW0FTo8iFUd4Qyq1qDw@mail.gmail.com/), Thanks for the pointer, this was an educational read! > but that's almost certainly not going to happen here. At least it's not a switch on a value in user-controlled memory. ;) > (Well, I guess doing a READ_ONCE() at one side without doing > WRITE_ONCE() on the other side is also unclean...) For the annotation aspect it won't matter. Though since it will merely improve reliability of that check in the given corner-case which is an unreliable situation in the first place, I'd just leave it alone and hope for the code to be replaced by the one in nft_meta.c eventually. Thanks, Phil ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Is xt_owner's owner_mt() racy with sock_orphan()? [worse with new TYPESAFE_BY_RCU file lifetime?] 2023-12-05 21:40 ` Phil Sutter 2023-12-06 16:28 ` Jann Horn @ 2023-12-06 16:42 ` Pablo Neira Ayuso 1 sibling, 0 replies; 13+ messages in thread From: Pablo Neira Ayuso @ 2023-12-06 16:42 UTC (permalink / raw) To: Phil Sutter, Jann Horn, Jozsef Kadlecsik, Florian Westphal, netfilter-devel, coreteam, Christian Brauner, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Network Development, kernel list On Tue, Dec 05, 2023 at 10:40:15PM +0100, Phil Sutter wrote: > Hi, > > On Tue, Dec 05, 2023 at 06:08:29PM +0100, Jann Horn wrote: > > On Tue, Dec 5, 2023 at 5:40 PM Jann Horn <jannh@google.com> wrote: > > > > > > Hi! > > > > > > I think this code is racy, but testing that seems like a pain... > > > > > > owner_mt() in xt_owner runs in context of a NF_INET_LOCAL_OUT or > > > NF_INET_POST_ROUTING hook. It first checks that sk->sk_socket is > > > non-NULL, then checks that sk->sk_socket->file is non-NULL, then > > > accesses the ->f_cred of that file. > > > > > > I don't see anything that protects this against a concurrent > > > sock_orphan(), which NULLs out the sk->sk_socket pointer, if we're in > > > > Ah, and all the other users of ->sk_socket in net/netfilter/ do it > > under the sk_callback_lock... so I guess the fix would be to add the > > same in owner_mt? > > Sounds reasonable, although I wonder how likely a socket is to > orphan while netfilter is processing a packet it just sent. > > How about the attached patch? Not sure what hash to put into a Fixes: > tag given this is a day 1 bug and ipt_owner/ip6t_owner predate git. > > Thanks, Phil > From 3e28490e43b04d49e6e7f145a70fff7dd42c8cc5 Mon Sep 17 00:00:00 2001 > From: Phil Sutter <phil@nwl.cc> > Date: Tue, 5 Dec 2023 21:58:12 +0100 > Subject: [nf PATCH] netfilter: xt_owner: Fix for unsafe access of sk->sk_socket > > A concurrently running sock_orphan() may NULL the sk_socket pointer in > between check and deref. Follow other users (like nft_meta.c for > instance) and acquire sk_callback_lock before dereferencing sk_socket. For the record, I have placed this patch in nf.git Thanks. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Is xt_owner's owner_mt() racy with sock_orphan()? [worse with new TYPESAFE_BY_RCU file lifetime?] 2023-12-05 17:08 ` Jann Horn 2023-12-05 21:40 ` Phil Sutter @ 2023-12-06 13:58 ` Christian Brauner 2023-12-06 14:38 ` Jann Horn 1 sibling, 1 reply; 13+ messages in thread From: Christian Brauner @ 2023-12-06 13:58 UTC (permalink / raw) To: Jann Horn Cc: Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal, netfilter-devel, coreteam, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Network Development, kernel list On Tue, Dec 05, 2023 at 06:08:29PM +0100, Jann Horn wrote: > On Tue, Dec 5, 2023 at 5:40 PM Jann Horn <jannh@google.com> wrote: > > > > Hi! > > > > I think this code is racy, but testing that seems like a pain... > > > > owner_mt() in xt_owner runs in context of a NF_INET_LOCAL_OUT or > > NF_INET_POST_ROUTING hook. It first checks that sk->sk_socket is > > non-NULL, then checks that sk->sk_socket->file is non-NULL, then > > accesses the ->f_cred of that file. > > > > I don't see anything that protects this against a concurrent > > sock_orphan(), which NULLs out the sk->sk_socket pointer, if we're in > > Ah, and all the other users of ->sk_socket in net/netfilter/ do it > under the sk_callback_lock... so I guess the fix would be to add the > same in owner_mt? In your other mail you wrote: > I also think we have no guarantee here that the socket's ->file won't > go away due to a concurrent __sock_release(), which could cause us to > continue reading file credentials out of a file whose refcount has > already dropped to zero? Is this an independent worry or can the concurrent __sock_release() issue only happen due to a sock_orphan() having happened first? I think that it requires a sock_orphan() having happend, presumably because the socket gets marked SOCK_DEAD and can thus be released via __sock_release() asynchronously? If so then taking sk_callback_lock() in owner_mt() should fix this. (Otherwise we might need an additional get_active_file() on sk->sk_socker->file in owner_mt() in addition to the other fix.) ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Is xt_owner's owner_mt() racy with sock_orphan()? [worse with new TYPESAFE_BY_RCU file lifetime?] 2023-12-06 13:58 ` Christian Brauner @ 2023-12-06 14:38 ` Jann Horn 2023-12-06 16:50 ` Christian Brauner 0 siblings, 1 reply; 13+ messages in thread From: Jann Horn @ 2023-12-06 14:38 UTC (permalink / raw) To: Christian Brauner Cc: Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal, netfilter-devel, coreteam, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Network Development, kernel list On Wed, Dec 6, 2023 at 2:58 PM Christian Brauner <brauner@kernel.org> wrote: > > On Tue, Dec 05, 2023 at 06:08:29PM +0100, Jann Horn wrote: > > On Tue, Dec 5, 2023 at 5:40 PM Jann Horn <jannh@google.com> wrote: > > > > > > Hi! > > > > > > I think this code is racy, but testing that seems like a pain... > > > > > > owner_mt() in xt_owner runs in context of a NF_INET_LOCAL_OUT or > > > NF_INET_POST_ROUTING hook. It first checks that sk->sk_socket is > > > non-NULL, then checks that sk->sk_socket->file is non-NULL, then > > > accesses the ->f_cred of that file. > > > > > > I don't see anything that protects this against a concurrent > > > sock_orphan(), which NULLs out the sk->sk_socket pointer, if we're in > > > > Ah, and all the other users of ->sk_socket in net/netfilter/ do it > > under the sk_callback_lock... so I guess the fix would be to add the > > same in owner_mt? > > In your other mail you wrote: > > > I also think we have no guarantee here that the socket's ->file won't > > go away due to a concurrent __sock_release(), which could cause us to > > continue reading file credentials out of a file whose refcount has > > already dropped to zero? > > Is this an independent worry or can the concurrent __sock_release() > issue only happen due to a sock_orphan() having happened first? I think > that it requires a sock_orphan() having happend, presumably because the > socket gets marked SOCK_DEAD and can thus be released via > __sock_release() asynchronously? > > If so then taking sk_callback_lock() in owner_mt() should fix this. > (Otherwise we might need an additional get_active_file() on > sk->sk_socker->file in owner_mt() in addition to the other fix.) My understanding is that it could only happen due to a sock_orphan() having happened first, and so just sk_callback_lock() should probably be a sufficient fix. (I'm not an expert on net subsystem locking rules though.) ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Is xt_owner's owner_mt() racy with sock_orphan()? [worse with new TYPESAFE_BY_RCU file lifetime?] 2023-12-06 14:38 ` Jann Horn @ 2023-12-06 16:50 ` Christian Brauner 0 siblings, 0 replies; 13+ messages in thread From: Christian Brauner @ 2023-12-06 16:50 UTC (permalink / raw) To: Jann Horn Cc: Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal, netfilter-devel, coreteam, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Network Development, kernel list On Wed, Dec 06, 2023 at 03:38:50PM +0100, Jann Horn wrote: > On Wed, Dec 6, 2023 at 2:58 PM Christian Brauner <brauner@kernel.org> wrote: > > > > On Tue, Dec 05, 2023 at 06:08:29PM +0100, Jann Horn wrote: > > > On Tue, Dec 5, 2023 at 5:40 PM Jann Horn <jannh@google.com> wrote: > > > > > > > > Hi! > > > > > > > > I think this code is racy, but testing that seems like a pain... > > > > > > > > owner_mt() in xt_owner runs in context of a NF_INET_LOCAL_OUT or > > > > NF_INET_POST_ROUTING hook. It first checks that sk->sk_socket is > > > > non-NULL, then checks that sk->sk_socket->file is non-NULL, then > > > > accesses the ->f_cred of that file. > > > > > > > > I don't see anything that protects this against a concurrent > > > > sock_orphan(), which NULLs out the sk->sk_socket pointer, if we're in > > > > > > Ah, and all the other users of ->sk_socket in net/netfilter/ do it > > > under the sk_callback_lock... so I guess the fix would be to add the > > > same in owner_mt? > > > > In your other mail you wrote: > > > > > I also think we have no guarantee here that the socket's ->file won't > > > go away due to a concurrent __sock_release(), which could cause us to > > > continue reading file credentials out of a file whose refcount has > > > already dropped to zero? > > > > Is this an independent worry or can the concurrent __sock_release() > > issue only happen due to a sock_orphan() having happened first? I think > > that it requires a sock_orphan() having happend, presumably because the > > socket gets marked SOCK_DEAD and can thus be released via > > __sock_release() asynchronously? > > > > If so then taking sk_callback_lock() in owner_mt() should fix this. > > (Otherwise we might need an additional get_active_file() on > > sk->sk_socker->file in owner_mt() in addition to the other fix.) > > My understanding is that it could only happen due to a sock_orphan() > having happened first, and so just sk_callback_lock() should probably > be a sufficient fix. (I'm not an expert on net subsystem locking rules > though.) Ok, so as I suspected. That's good. ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2023-12-07 18:09 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-12-05 16:40 Is xt_owner's owner_mt() racy with sock_orphan()? [worse with new TYPESAFE_BY_RCU file lifetime?] Jann Horn 2023-12-05 17:08 ` Jann Horn 2023-12-05 21:40 ` Phil Sutter 2023-12-06 16:28 ` Jann Horn 2023-12-06 16:48 ` Pablo Neira Ayuso 2023-12-06 16:49 ` Christian Brauner 2023-12-06 20:42 ` Phil Sutter 2023-12-06 21:02 ` Jann Horn 2023-12-07 18:09 ` Phil Sutter 2023-12-06 16:42 ` Pablo Neira Ayuso 2023-12-06 13:58 ` Christian Brauner 2023-12-06 14:38 ` Jann Horn 2023-12-06 16:50 ` Christian Brauner
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).