netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Missing a write memory barrier in tls_init()
@ 2023-11-02  7:11 Dae R. Jeong
  2023-11-06 22:36 ` Jakub Kicinski
  0 siblings, 1 reply; 8+ messages in thread
From: Dae R. Jeong @ 2023-11-02  7:11 UTC (permalink / raw)
  To: borisp, john.fastabend, kuba, davem, edumazet, pabeni, netdev,
	linux-kernel
  Cc: ywchoi

Hello,

It seems a write memory barrier is missing in tls_init() (or
tls_ctx_create()). In the following execution, NULL dereference may
happen in {tls_setsockopt, tls_getsockopt}.

CPU0                                         CPU1
-----                                        -----
// In tls_init()
// In tls_ctx_create()
ctx = kzalloc()
ctx->sk_proto = READ_ONCE(sk->sk_prot) - (1)

// In update_sk_prot()
WRITE_ONCE(sk->sk_prot, tls_prots)     - (2)
                                              // In sock_common_setsockopt()
                                              READ_ONCE(sk->sk_prot)->setsockopt()

                                              // In tls_{setsockopt,getsockopt}()
                                              ctx->sk_proto->setsockopt()    - (3)


In the above concurrent execution, nothing prevents store-store
reordering in CPU0, so it is possible that CPU0 completes (2) before
(1). If it happens, CPU1 may crash at (3).

To prevent such out-of-order execution, I think we need something like
this (although I don't like smp_wmb(). smp_store_release() should be
better):

diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index 1c2c6800949d..5dccde91f9b1 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -819,6 +819,7 @@ struct tls_context *tls_ctx_create(struct sock *sk)
        rcu_assign_pointer(icsk->icsk_ulp_data, ctx);
        ctx->sk_proto = READ_ONCE(sk->sk_prot);
        ctx->sk = sk;
+       smp_wmb();
        return ctx;
 }

In addition, I believe the {tls_setsockopt, tls_getsockopt}
implementation is fine because of the address dependency. I think
load-load reordering is prohibited in this case so we don't need a
read barrier.

Could you check this?


Best regards,
Dae R. Jeong

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: Missing a write memory barrier in tls_init()
  2023-11-02  7:11 Missing a write memory barrier in tls_init() Dae R. Jeong
@ 2023-11-06 22:36 ` Jakub Kicinski
  2023-11-07  8:07   ` Dae R. Jeong
  2023-11-07 22:45   ` Sabrina Dubroca
  0 siblings, 2 replies; 8+ messages in thread
From: Jakub Kicinski @ 2023-11-06 22:36 UTC (permalink / raw)
  To: Dae R. Jeong
  Cc: borisp, john.fastabend, davem, edumazet, pabeni, netdev,
	linux-kernel, ywchoi

On Thu, 2 Nov 2023 16:11:29 +0900 Dae R. Jeong wrote:
> In addition, I believe the {tls_setsockopt, tls_getsockopt}
> implementation is fine because of the address dependency. I think
> load-load reordering is prohibited in this case so we don't need a
> read barrier.

Sounds plausible, could you send a patch?

The smb_wmb() would be better placed in tls_init(), IMHO.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Missing a write memory barrier in tls_init()
  2023-11-06 22:36 ` Jakub Kicinski
@ 2023-11-07  8:07   ` Dae R. Jeong
  2023-11-07 22:45   ` Sabrina Dubroca
  1 sibling, 0 replies; 8+ messages in thread
From: Dae R. Jeong @ 2023-11-07  8:07 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: borisp, john.fastabend, davem, edumazet, pabeni, netdev,
	linux-kernel, ywchoi

Hi, Jakub,

Thank you for your reply.

On Mon, Nov 06, 2023 at 02:36:59PM -0800, Jakub Kicinski wrote:
> On Thu, 2 Nov 2023 16:11:29 +0900 Dae R. Jeong wrote:
> > In addition, I believe the {tls_setsockopt, tls_getsockopt}
> > implementation is fine because of the address dependency. I think
> > load-load reordering is prohibited in this case so we don't need a
> > read barrier.
> 
> Sounds plausible, could you send a patch?

Sure. I am doing something else today, so I will send a patch tomorrow
or the day after tomorrow.


> The smb_wmb() would be better placed in tls_init(), IMHO.

It sounds better. I will write a patch in that way.


Best regards,
Dae R. Jeong

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Missing a write memory barrier in tls_init()
  2023-11-06 22:36 ` Jakub Kicinski
  2023-11-07  8:07   ` Dae R. Jeong
@ 2023-11-07 22:45   ` Sabrina Dubroca
  2023-11-08  2:53     ` Jakub Kicinski
  1 sibling, 1 reply; 8+ messages in thread
From: Sabrina Dubroca @ 2023-11-07 22:45 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Dae R. Jeong, borisp, john.fastabend, davem, edumazet, pabeni,
	netdev, linux-kernel, ywchoi

2023-11-06, 14:36:59 -0800, Jakub Kicinski wrote:
> On Thu, 2 Nov 2023 16:11:29 +0900 Dae R. Jeong wrote:
> > In addition, I believe the {tls_setsockopt, tls_getsockopt}
> > implementation is fine because of the address dependency. I think
> > load-load reordering is prohibited in this case so we don't need a
> > read barrier.
> 
> Sounds plausible, could you send a patch?
>
> The smb_wmb() would be better placed in tls_init(), IMHO.

Wouldn't it be enough to just move the rcu_assign_pointer after ctx is
fully initialized, ie just before update_sk_prot? also clearer wrt
RCU.
(and maybe get rid of tls_ctx_create and move all that into tls_init,
it's not much and we don't even set ctx->{tx,rx}_conf in there)

-- 
Sabrina


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Missing a write memory barrier in tls_init()
  2023-11-07 22:45   ` Sabrina Dubroca
@ 2023-11-08  2:53     ` Jakub Kicinski
  2023-11-08  9:07       ` Sabrina Dubroca
  0 siblings, 1 reply; 8+ messages in thread
From: Jakub Kicinski @ 2023-11-08  2:53 UTC (permalink / raw)
  To: Sabrina Dubroca
  Cc: Dae R. Jeong, borisp, john.fastabend, davem, edumazet, pabeni,
	netdev, linux-kernel, ywchoi

On Tue, 7 Nov 2023 23:45:46 +0100 Sabrina Dubroca wrote:
> Wouldn't it be enough to just move the rcu_assign_pointer after ctx is
> fully initialized, ie just before update_sk_prot? also clearer wrt
> RCU.

I'm not sure, IIUC rcu_assign_pointer() is equivalent to
WRITE_ONCE() on any sane architecture, it depends on address
dependencies to provide ordering. Since here we care about
ctx->sk_prot being updated, when changes to sk->sk_prot
are visible there is no super-obvious address dependency.

There may be one. But to me at least it isn't an obvious
"RCU used right will handle this" case.

> (and maybe get rid of tls_ctx_create and move all that into tls_init,
> it's not much and we don't even set ctx->{tx,rx}_conf in there)

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Missing a write memory barrier in tls_init()
  2023-11-08  2:53     ` Jakub Kicinski
@ 2023-11-08  9:07       ` Sabrina Dubroca
  2023-11-10 10:22         ` Dae R. Jeong
  0 siblings, 1 reply; 8+ messages in thread
From: Sabrina Dubroca @ 2023-11-08  9:07 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Dae R. Jeong, borisp, john.fastabend, davem, edumazet, pabeni,
	netdev, linux-kernel, ywchoi

2023-11-07, 18:53:24 -0800, Jakub Kicinski wrote:
> On Tue, 7 Nov 2023 23:45:46 +0100 Sabrina Dubroca wrote:
> > Wouldn't it be enough to just move the rcu_assign_pointer after ctx is
> > fully initialized, ie just before update_sk_prot? also clearer wrt
> > RCU.
> 
> I'm not sure, IIUC rcu_assign_pointer() is equivalent to
> WRITE_ONCE() on any sane architecture, it depends on address
> dependencies to provide ordering.

Not what the doc says:

    /**
     * rcu_assign_pointer() - assign to RCU-protected pointer
     [...]
     * Inserts memory barriers on architectures that require them
     * (which is most of them), and also prevents the compiler from
     * reordering the code that initializes the structure after the pointer
     * assignment.
     [...]
     */

And it uses smp_store_release (unless writing NULL).


rcu_dereference is the one that usually doesn't contain a barrier:

    /**
     * rcu_dereference_check() - rcu_dereference with debug checking
     [...]
     * Inserts memory barriers on architectures that require them
     * (currently only the Alpha), prevents the compiler from refetching
     * (and from merging fetches), and, more importantly, documents exactly
     * which pointers are protected by RCU and checks that the pointer is
     * annotated as __rcu.
     */


> Since here we care about
> ctx->sk_prot being updated, when changes to sk->sk_prot
> are visible there is no super-obvious address dependency.
> 
> There may be one. But to me at least it isn't an obvious
> "RCU used right will handle this" case.

Ok, I think you're right. Looking at smp_store_release used by rcu_assign_pointer:

    #define __smp_store_release(p, v)					\
    do {									\
    	compiletime_assert_atomic_type(*p);				\
    	barrier();							\
    	WRITE_ONCE(*p, v);						\
    } while (0)

it's only going to make sure ctx->sk_proto is set when ctx is visible,
and not guarantee that ctx is visible whenever sk->sk_prot has been
switched over.

-- 
Sabrina


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Missing a write memory barrier in tls_init()
  2023-11-08  9:07       ` Sabrina Dubroca
@ 2023-11-10 10:22         ` Dae R. Jeong
  2023-11-10 11:04           ` Dae R. Jeong
  0 siblings, 1 reply; 8+ messages in thread
From: Dae R. Jeong @ 2023-11-10 10:22 UTC (permalink / raw)
  To: Sabrina Dubroca
  Cc: Jakub Kicinski, borisp, john.fastabend, davem, edumazet, pabeni,
	netdev, linux-kernel, ywchoi

On Wed, Nov 08, 2023 at 10:07:58AM +0100, Sabrina Dubroca wrote:
> 2023-11-07, 18:53:24 -0800, Jakub Kicinski wrote:
> > On Tue, 7 Nov 2023 23:45:46 +0100 Sabrina Dubroca wrote:
> > > Wouldn't it be enough to just move the rcu_assign_pointer after ctx is
> > > fully initialized, ie just before update_sk_prot? also clearer wrt
> > > RCU.
> > 
> > I'm not sure, IIUC rcu_assign_pointer() is equivalent to
> > WRITE_ONCE() on any sane architecture, it depends on address
> > dependencies to provide ordering.
> 
> Not what the doc says:
> 
>     /**
>      * rcu_assign_pointer() - assign to RCU-protected pointer
>      [...]
>      * Inserts memory barriers on architectures that require them
>      * (which is most of them), and also prevents the compiler from
>      * reordering the code that initializes the structure after the pointer
>      * assignment.
>      [...]
>      */
> 
> And it uses smp_store_release (unless writing NULL).
> 

I think Sabrina is right. We can rely on the release semantic implied
in rcu_assign_pointer(). Simply moving rcu_assign_pointer() to the end
of tls_ctx_create() should prevent a scenario what I thought (ie.,
store-store reordering between ctx->sk_proto and sk->sk_prot).

diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index 1c2c6800949d..d20b823c68d4 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -816,9 +816,9 @@ struct tls_context *tls_ctx_create(struct sock *sk)
                return NULL;
 
        mutex_init(&ctx->tx_lock);
-       rcu_assign_pointer(icsk->icsk_ulp_data, ctx);
        ctx->sk_proto = READ_ONCE(sk->sk_prot);
        ctx->sk = sk;
+       rcu_assign_pointer(icsk->icsk_ulp_data, ctx);
        return ctx;
 }

But what I also wonder is that, do we need to ensure that
ctx->{tx,rx}_conf is visible before updating sk->sk_prot? If so, as
Sabrina suggested, we may want to move rcu_assign_pointer() right
before update_sk_prot().


Best regards,
Dae R. Jeong

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: Missing a write memory barrier in tls_init()
  2023-11-10 10:22         ` Dae R. Jeong
@ 2023-11-10 11:04           ` Dae R. Jeong
  0 siblings, 0 replies; 8+ messages in thread
From: Dae R. Jeong @ 2023-11-10 11:04 UTC (permalink / raw)
  To: Sabrina Dubroca
  Cc: Jakub Kicinski, borisp, john.fastabend, davem, edumazet, pabeni,
	netdev, linux-kernel, ywchoi

On Fri, Nov 10, 2023 at 07:22:48PM +0900, Dae R. Jeong wrote:
> On Wed, Nov 08, 2023 at 10:07:58AM +0100, Sabrina Dubroca wrote:
> > 2023-11-07, 18:53:24 -0800, Jakub Kicinski wrote:
> > > On Tue, 7 Nov 2023 23:45:46 +0100 Sabrina Dubroca wrote:
> > > > Wouldn't it be enough to just move the rcu_assign_pointer after ctx is
> > > > fully initialized, ie just before update_sk_prot? also clearer wrt
> > > > RCU.
> > > 
> > > I'm not sure, IIUC rcu_assign_pointer() is equivalent to
> > > WRITE_ONCE() on any sane architecture, it depends on address
> > > dependencies to provide ordering.
> > 
> > Not what the doc says:
> > 
> >     /**
> >      * rcu_assign_pointer() - assign to RCU-protected pointer
> >      [...]
> >      * Inserts memory barriers on architectures that require them
> >      * (which is most of them), and also prevents the compiler from
> >      * reordering the code that initializes the structure after the pointer
> >      * assignment.
> >      [...]
> >      */
> > 
> > And it uses smp_store_release (unless writing NULL).
> > 
> 
> I think Sabrina is right. We can rely on the release semantic implied
> in rcu_assign_pointer(). Simply moving rcu_assign_pointer() to the end
> of tls_ctx_create() should prevent a scenario what I thought (ie.,
> store-store reordering between ctx->sk_proto and sk->sk_prot).
> 
> diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
> index 1c2c6800949d..d20b823c68d4 100644
> --- a/net/tls/tls_main.c
> +++ b/net/tls/tls_main.c
> @@ -816,9 +816,9 @@ struct tls_context *tls_ctx_create(struct sock *sk)
>                 return NULL;
>  
>         mutex_init(&ctx->tx_lock);
> -       rcu_assign_pointer(icsk->icsk_ulp_data, ctx);
>         ctx->sk_proto = READ_ONCE(sk->sk_prot);
>         ctx->sk = sk;
> +       rcu_assign_pointer(icsk->icsk_ulp_data, ctx);
>         return ctx;
>  }
> 
> But what I also wonder is that, do we need to ensure that
> ctx->{tx,rx}_conf is visible before updating sk->sk_prot? If so, as
> Sabrina suggested, we may want to move rcu_assign_pointer() right
> before update_sk_prot().
> 
> 
> Best regards,
> Dae R. Jeong

I sent a patch by taking suggestions from Sabrina. The patches 1)
moves rcu_assign_pointer() after fully initializing ctx, and 2) gets
rid of tls_ctx_create().

I'm not sure whether removing tls_ctx_create() is a good idea or not,
but it still did not fully initialize ctx (i.e., ctx->{tx,rx}_conf).

Let me know if there is any issue, then I will rewrite a patch.


Best regards,
Dae R. Jeong

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-11-10 11:05 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-11-02  7:11 Missing a write memory barrier in tls_init() Dae R. Jeong
2023-11-06 22:36 ` Jakub Kicinski
2023-11-07  8:07   ` Dae R. Jeong
2023-11-07 22:45   ` Sabrina Dubroca
2023-11-08  2:53     ` Jakub Kicinski
2023-11-08  9:07       ` Sabrina Dubroca
2023-11-10 10:22         ` Dae R. Jeong
2023-11-10 11:04           ` Dae R. Jeong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).