From: Kuniyuki Iwashima <kuniyu@amazon.com>
To: <rostedt@goodmis.org>
Cc: <davem@davemloft.net>, <edumazet@google.com>,
<keescook@chromium.org>, <kuba@kernel.org>, <kuni1840@gmail.com>,
<kuniyu@amazon.com>, <linux-kernel@vger.kernel.org>,
<mcgrof@kernel.org>, <netdev@vger.kernel.org>,
<pabeni@redhat.com>, <satoru.moriya@hds.com>,
<yzaikin@google.com>
Subject: Re: [PATCH v1 net 11/16] net: Fix a data-race around sysctl_mem.
Date: Wed, 6 Jul 2022 09:27:53 -0700 [thread overview]
Message-ID: <20220706162753.47894-1-kuniyu@amazon.com> (raw)
In-Reply-To: <20220706092711.28ce57e6@gandalf.local.home>
From: Steven Rostedt <rostedt@goodmis.org>
Date: Wed, 6 Jul 2022 09:27:11 -0400
> On Wed, 6 Jul 2022 09:17:07 -0400
> Steven Rostedt <rostedt@goodmis.org> wrote:
>
> > On Tue, 5 Jul 2022 22:21:25 -0700
> > Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> >
> > > --- a/include/trace/events/sock.h
> > > +++ b/include/trace/events/sock.h
> > > @@ -122,9 +122,9 @@ TRACE_EVENT(sock_exceed_buf_limit,
> > >
> > > TP_printk("proto:%s sysctl_mem=%ld,%ld,%ld allocated=%ld sysctl_rmem=%d rmem_alloc=%d sysctl_wmem=%d wmem_alloc=%d wmem_queued=%d kind=%s",
> > > __entry->name,
> > > - __entry->sysctl_mem[0],
> > > - __entry->sysctl_mem[1],
> > > - __entry->sysctl_mem[2],
> > > + READ_ONCE(__entry->sysctl_mem[0]),
> > > + READ_ONCE(__entry->sysctl_mem[1]),
> > > + READ_ONCE(__entry->sysctl_mem[2]),
> >
> > This is not reading anything to do with sysctl. It's reading the content of
> > what was recorded in the ring buffer.
> >
> > That is, the READ_ONCE() here is not necessary, and if anything will break
> > user space parsing, as this is exported to user space to tell it how to
> > read the binary format in the ring buffer.
>
> I take that back. Looking at the actual trace event, it is pointing to
> sysctl memory, which is a major bug.
>
> TRACE_EVENT(sock_exceed_buf_limit,
>
> TP_PROTO(struct sock *sk, struct proto *prot, long allocated, int kind),
>
> TP_ARGS(sk, prot, allocated, kind),
>
> TP_STRUCT__entry(
> __array(char, name, 32)
> __field(long *, sysctl_mem)
>
> sysctl_mem is a pointer.
>
> __field(long, allocated)
> __field(int, sysctl_rmem)
> __field(int, rmem_alloc)
> __field(int, sysctl_wmem)
> __field(int, wmem_alloc)
> __field(int, wmem_queued)
> __field(int, kind)
> ),
>
> TP_fast_assign(
> strncpy(__entry->name, prot->name, 32);
>
> __entry->sysctl_mem = prot->sysctl_mem;
>
>
> They save the pointer **IN THE RING BUFFER**!!!
>
> __entry->allocated = allocated;
> __entry->sysctl_rmem = sk_get_rmem0(sk, prot);
> __entry->rmem_alloc = atomic_read(&sk->sk_rmem_alloc);
> __entry->sysctl_wmem = sk_get_wmem0(sk, prot);
> __entry->wmem_alloc = refcount_read(&sk->sk_wmem_alloc);
> __entry->wmem_queued = READ_ONCE(sk->sk_wmem_queued);
> __entry->kind = kind;
> ),
>
> TP_printk("proto:%s sysctl_mem=%ld,%ld,%ld allocated=%ld sysctl_rmem=%d rmem_alloc=%d sysctl_wmem=%d wmem_alloc=%d wmem_queued=%d kind=%s",
> __entry->name,
> __entry->sysctl_mem[0],
> __entry->sysctl_mem[1],
> __entry->sysctl_mem[2],
>
> They are now reading a stale pointer, which can be read at any time. That
> is, you get the information of what is in sysctl_mem at the time the ring
> buffer is read (which is useless from user space), and not at the time of
> the event.
>
> Thanks for pointing this out. This needs to be fixed.
For the record, Steve fixed this properly here, so I'll drop the tracing
part in v2.
https://lore.kernel.org/netdev/20220706105040.54fc03b0@gandalf.local.home/
next prev parent reply other threads:[~2022-07-06 16:28 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-07-06 5:21 [PATCH v1 net 00/16] sysctl: Fix data-races around ipv4_table Kuniyuki Iwashima
2022-07-06 5:21 ` [PATCH v1 net 01/16] sysctl: Clean up proc_handler definitions Kuniyuki Iwashima
2022-07-06 5:21 ` [PATCH v1 net 02/16] sysctl: Add proc_dobool_lockless() Kuniyuki Iwashima
2022-07-06 5:21 ` [PATCH v1 net 03/16] sysctl: Add proc_dointvec_lockless() Kuniyuki Iwashima
2022-07-06 7:00 ` Eric Dumazet
2022-07-06 16:15 ` Kuniyuki Iwashima
2022-07-06 5:21 ` [PATCH v1 net 04/16] sysctl: Add proc_douintvec_lockless() Kuniyuki Iwashima
2022-07-06 5:21 ` [PATCH v1 net 05/16] sysctl: Add proc_dointvec_minmax_lockless() Kuniyuki Iwashima
2022-07-06 5:21 ` [PATCH v1 net 06/16] sysctl: Add proc_douintvec_minmax_lockless() Kuniyuki Iwashima
2022-07-06 5:21 ` [PATCH v1 net 07/16] sysctl: Add proc_doulongvec_minmax_lockless() Kuniyuki Iwashima
2022-07-06 5:21 ` [PATCH v1 net 08/16] sysctl: Add proc_dointvec_jiffies_lockless() Kuniyuki Iwashima
2022-07-06 5:21 ` [PATCH v1 net 09/16] tcp: Fix a data-race around sysctl_tcp_max_orphans Kuniyuki Iwashima
2022-07-06 5:21 ` [PATCH v1 net 10/16] inetpeer: Fix data-races around sysctl Kuniyuki Iwashima
2022-07-06 5:21 ` [PATCH v1 net 11/16] net: Fix a data-race around sysctl_mem Kuniyuki Iwashima
2022-07-06 13:17 ` Steven Rostedt
2022-07-06 13:27 ` Steven Rostedt
2022-07-06 16:27 ` Kuniyuki Iwashima [this message]
2022-07-06 5:21 ` [PATCH v1 net 12/16] tcp: Mark sysctl_tcp_low_latency obsolete Kuniyuki Iwashima
2022-07-06 5:21 ` [PATCH v1 net 13/16] cipso: Fix a data-race around cipso_v4_cache_bucketsize Kuniyuki Iwashima
2022-07-06 5:21 ` [PATCH v1 net 14/16] cipso: Fix data-races around boolean sysctl Kuniyuki Iwashima
2022-07-06 5:21 ` [PATCH v1 net 15/16] icmp: Fix data-races around sysctl Kuniyuki Iwashima
2022-07-06 5:21 ` [PATCH v1 net 16/16] ipv4: Fix a data-race around sysctl_fib_sync_mem Kuniyuki Iwashima
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220706162753.47894-1-kuniyu@amazon.com \
--to=kuniyu@amazon.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=keescook@chromium.org \
--cc=kuba@kernel.org \
--cc=kuni1840@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mcgrof@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=rostedt@goodmis.org \
--cc=satoru.moriya@hds.com \
--cc=yzaikin@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox