public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Kuniyuki Iwashima <kuniyu@amazon.com>
To: <rostedt@goodmis.org>
Cc: <davem@davemloft.net>, <edumazet@google.com>,
	<keescook@chromium.org>, <kuba@kernel.org>, <kuni1840@gmail.com>,
	<kuniyu@amazon.com>, <linux-kernel@vger.kernel.org>,
	<mcgrof@kernel.org>, <netdev@vger.kernel.org>,
	<pabeni@redhat.com>, <satoru.moriya@hds.com>,
	<yzaikin@google.com>
Subject: Re: [PATCH v1 net 11/16] net: Fix a data-race around sysctl_mem.
Date: Wed, 6 Jul 2022 09:27:53 -0700	[thread overview]
Message-ID: <20220706162753.47894-1-kuniyu@amazon.com> (raw)
In-Reply-To: <20220706092711.28ce57e6@gandalf.local.home>

From:   Steven Rostedt <rostedt@goodmis.org>
Date:   Wed, 6 Jul 2022 09:27:11 -0400
> On Wed, 6 Jul 2022 09:17:07 -0400
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > On Tue, 5 Jul 2022 22:21:25 -0700
> > Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> > 
> > > --- a/include/trace/events/sock.h
> > > +++ b/include/trace/events/sock.h
> > > @@ -122,9 +122,9 @@ TRACE_EVENT(sock_exceed_buf_limit,
> > >  
> > >  	TP_printk("proto:%s sysctl_mem=%ld,%ld,%ld allocated=%ld sysctl_rmem=%d rmem_alloc=%d sysctl_wmem=%d wmem_alloc=%d wmem_queued=%d kind=%s",
> > >  		__entry->name,
> > > -		__entry->sysctl_mem[0],
> > > -		__entry->sysctl_mem[1],
> > > -		__entry->sysctl_mem[2],
> > > +		READ_ONCE(__entry->sysctl_mem[0]),
> > > +		READ_ONCE(__entry->sysctl_mem[1]),
> > > +		READ_ONCE(__entry->sysctl_mem[2]),  
> > 
> > This is not reading anything to do with sysctl. It's reading the content of
> > what was recorded in the ring buffer.
> > 
> > That is, the READ_ONCE() here is not necessary, and if anything will break
> > user space parsing, as this is exported to user space to tell it how to
> > read the binary format in the ring buffer.
> 
> I take that back. Looking at the actual trace event, it is pointing to
> sysctl memory, which is a major bug.
> 
> TRACE_EVENT(sock_exceed_buf_limit,
> 
>         TP_PROTO(struct sock *sk, struct proto *prot, long allocated, int kind),
> 
>         TP_ARGS(sk, prot, allocated, kind),
> 
>         TP_STRUCT__entry(
>                 __array(char, name, 32)
>                 __field(long *, sysctl_mem)
> 
> sysctl_mem is a pointer.
> 
>                 __field(long, allocated)
>                 __field(int, sysctl_rmem)
>                 __field(int, rmem_alloc)
>                 __field(int, sysctl_wmem)
>                 __field(int, wmem_alloc)
>                 __field(int, wmem_queued)
>                 __field(int, kind)
>         ),
> 
>         TP_fast_assign(
>                 strncpy(__entry->name, prot->name, 32);
> 
>                 __entry->sysctl_mem = prot->sysctl_mem;
> 
> 
> They save the pointer **IN THE RING BUFFER**!!!
> 
>                 __entry->allocated = allocated;
>                 __entry->sysctl_rmem = sk_get_rmem0(sk, prot);
>                 __entry->rmem_alloc = atomic_read(&sk->sk_rmem_alloc);
>                 __entry->sysctl_wmem = sk_get_wmem0(sk, prot);
>                 __entry->wmem_alloc = refcount_read(&sk->sk_wmem_alloc);
>                 __entry->wmem_queued = READ_ONCE(sk->sk_wmem_queued);
>                 __entry->kind = kind;
>         ),
> 
>         TP_printk("proto:%s sysctl_mem=%ld,%ld,%ld allocated=%ld sysctl_rmem=%d rmem_alloc=%d sysctl_wmem=%d wmem_alloc=%d wmem_queued=%d kind=%s",
>                 __entry->name,
>                 __entry->sysctl_mem[0],
>                 __entry->sysctl_mem[1],
>                 __entry->sysctl_mem[2],
> 
> They are now reading a stale pointer, which can be read at any time. That
> is, you get the information of what is in sysctl_mem at the time the ring
> buffer is read (which is useless from user space), and not at the time of
> the event.
> 
> Thanks for pointing this out. This needs to be fixed.

For the record, Steve fixed this properly here, so I'll drop the tracing
part in v2.
https://lore.kernel.org/netdev/20220706105040.54fc03b0@gandalf.local.home/

  reply	other threads:[~2022-07-06 16:28 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-06  5:21 [PATCH v1 net 00/16] sysctl: Fix data-races around ipv4_table Kuniyuki Iwashima
2022-07-06  5:21 ` [PATCH v1 net 01/16] sysctl: Clean up proc_handler definitions Kuniyuki Iwashima
2022-07-06  5:21 ` [PATCH v1 net 02/16] sysctl: Add proc_dobool_lockless() Kuniyuki Iwashima
2022-07-06  5:21 ` [PATCH v1 net 03/16] sysctl: Add proc_dointvec_lockless() Kuniyuki Iwashima
2022-07-06  7:00   ` Eric Dumazet
2022-07-06 16:15     ` Kuniyuki Iwashima
2022-07-06  5:21 ` [PATCH v1 net 04/16] sysctl: Add proc_douintvec_lockless() Kuniyuki Iwashima
2022-07-06  5:21 ` [PATCH v1 net 05/16] sysctl: Add proc_dointvec_minmax_lockless() Kuniyuki Iwashima
2022-07-06  5:21 ` [PATCH v1 net 06/16] sysctl: Add proc_douintvec_minmax_lockless() Kuniyuki Iwashima
2022-07-06  5:21 ` [PATCH v1 net 07/16] sysctl: Add proc_doulongvec_minmax_lockless() Kuniyuki Iwashima
2022-07-06  5:21 ` [PATCH v1 net 08/16] sysctl: Add proc_dointvec_jiffies_lockless() Kuniyuki Iwashima
2022-07-06  5:21 ` [PATCH v1 net 09/16] tcp: Fix a data-race around sysctl_tcp_max_orphans Kuniyuki Iwashima
2022-07-06  5:21 ` [PATCH v1 net 10/16] inetpeer: Fix data-races around sysctl Kuniyuki Iwashima
2022-07-06  5:21 ` [PATCH v1 net 11/16] net: Fix a data-race around sysctl_mem Kuniyuki Iwashima
2022-07-06 13:17   ` Steven Rostedt
2022-07-06 13:27     ` Steven Rostedt
2022-07-06 16:27       ` Kuniyuki Iwashima [this message]
2022-07-06  5:21 ` [PATCH v1 net 12/16] tcp: Mark sysctl_tcp_low_latency obsolete Kuniyuki Iwashima
2022-07-06  5:21 ` [PATCH v1 net 13/16] cipso: Fix a data-race around cipso_v4_cache_bucketsize Kuniyuki Iwashima
2022-07-06  5:21 ` [PATCH v1 net 14/16] cipso: Fix data-races around boolean sysctl Kuniyuki Iwashima
2022-07-06  5:21 ` [PATCH v1 net 15/16] icmp: Fix data-races around sysctl Kuniyuki Iwashima
2022-07-06  5:21 ` [PATCH v1 net 16/16] ipv4: Fix a data-race around sysctl_fib_sync_mem Kuniyuki Iwashima

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220706162753.47894-1-kuniyu@amazon.com \
    --to=kuniyu@amazon.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=keescook@chromium.org \
    --cc=kuba@kernel.org \
    --cc=kuni1840@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mcgrof@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=rostedt@goodmis.org \
    --cc=satoru.moriya@hds.com \
    --cc=yzaikin@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox