linux-trace-kernel.vger.kernel.org archive mirror
From: Zheng Yejian <zhengyejian1@huawei.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: <mhiramat@kernel.org>, <laijs@cn.fujitsu.com>,
	<linux-kernel@vger.kernel.org>,
	<linux-trace-kernel@vger.kernel.org>
Subject: Re: [PATCH] tracing: Fix race when concurrently splice_read trace_pipe
Date: Sat, 12 Aug 2023 15:38:12 +0800
Message-ID: <b5dbdbeb-be3a-3434-0909-0697d8cb15bf@huawei.com>
In-Reply-To: <20230811152525.2511f8f0@gandalf.local.home>

On 2023/8/12 03:25, Steven Rostedt wrote:
> On Thu, 10 Aug 2023 20:39:05 +0800
> Zheng Yejian <zhengyejian1@huawei.com> wrote:
> 
>> When concurrently splice_read file trace_pipe and per_cpu/cpu*/trace_pipe,
>> there are more data being read out than expected.
> 
> Honestly the real fix is to prevent that use case. We should probably have
> access to trace_pipe lock all the per_cpu trace_pipes too.
> 
> -- Steve
> 

Hi~

The reproduction testcase is shown below. It always reproduces the
issue on v5.10, and with this patch applied the testcase passes.

On v5.10, running `cat trace_pipe > /tmp/myfile &` calls sendfile()
to transfer data from trace_pipe into /tmp/myfile. In the kernel, the
.splice_read() of trace_pipe is then called, and the issue is
reproduced.

However, on the latest v6.5 this reproduction case does not reach the
.splice_read() of trace_pipe: since commit 97ef77c52b78 ("fs: check
FMODE_LSEEK to control internal pipe splicing"), the non-seekable
trace_pipe can no longer be sendfile-ed.
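
To check which path a given setup actually takes, one option is to trace
the reader with strace and summarize the data-transfer syscalls it made.
The snippet below is only a sketch: the helper name syscalls_used and the
log path /tmp/cat.strace are made up for illustration, and which syscall
`cat` issues depends on the cat implementation and version in use.

```shell
#!/bin/bash

# Hypothetical helper: summarize which data-transfer syscalls appear in a
# strace log. Assumes the log was recorded with a filter such as
# -e trace=read,sendfile,splice,copy_file_range
syscalls_used()
{
         local log=$1

         # Extract the syscall name at the start of each call, then count
         # how many times each name occurs.
         grep -oE '(sendfile|splice|copy_file_range|read)\(' "${log}" \
                 | tr -d '(' | sort | uniq -c
}

# Example usage (needs root, strace and a mounted tracefs):
#   timeout 5 strace -f -o /tmp/cat.strace \
#           -e trace=read,sendfile,splice,copy_file_range \
#           cat /sys/kernel/tracing/trace_pipe > /tmp/myfile
#   syscalls_used /tmp/cat.strace
```

If sendfile shows up in the summary, the reader went through the splice
path described above; if only read shows up, it did not.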

``` repro.sh
#!/bin/bash


do_test()
{
         local trace_dir=/sys/kernel/tracing
         local trace=${trace_dir}/trace
         local old_trace_lines
         local new_trace_lines
         local tempfiles
         local testlog="trace pipe concurrency issue"
         local pipe_pids
         local i
         local write_cnt=1000
         local read_cnt=0
         local nr_cpu=`nproc`

         # 1. First, clear the ring buffers
         echo > ${trace}

         # 2. Count how many lines are in the trace file now
         old_trace_lines=`cat ${trace} | wc -l`

         # 3. Disable the watermark so that readers wake up as soon as events arrive
         echo 0 > ${trace_dir}/buffer_percent

         # 4. Read the per-CPU trace_pipes into local files in the background.
         #    'cat' must use splice read here, otherwise the race cannot
         #    be reproduced !!!
         i=0
         while [ ${i} -lt ${nr_cpu} ]; do
                 tempfiles[${i}]=/tmp/percpu_trace_pipe_${i}
                 cat ${trace_dir}/per_cpu/cpu${i}/trace_pipe > ${tempfiles[${i}]} &
                 pipe_pids[${i}]=$!
                 let i=i+1
         done

         # 5. Read the main trace_pipe into a local file in the background.
         #    Likewise, splice read must be used to reproduce the issue !!!
         tempfiles[${i}]=/tmp/main_trace_pipe
         cat ${trace_dir}/trace_pipe > ${tempfiles[${i}]} &
         pipe_pids[${i}]=$!

         echo "Take a break, let readers run."
         sleep 3

         # 6. Write events into ring buffer through trace_marker, so that
         #    hungry readers start racing these events.
         i=0
         while [ ${i} -lt ${write_cnt} ]; do
                 echo "${testlog} <${i}>" > ${trace_dir}/trace_marker
                 let i=i+1
         done

         # 7. Wait until all events have been consumed
         new_trace_lines=`cat ${trace} | wc -l`
         while [ "${new_trace_lines}" != "${old_trace_lines}" ]; do
                 new_trace_lines=`cat ${trace} | wc -l`
                 sleep 1
         done
         echo "All written events have been consumed."

         # 8. Kill all readers and count the events they read
         i=0
         while [ ${i} -lt ${#pipe_pids[*]} ]; do
                 local num

                 kill -9 ${pipe_pids[${i}]}
                 wait ${pipe_pids[${i}]}
                 num=`cat ${tempfiles[${i}]} | grep "${testlog}" | wc -l`
                 let read_cnt=read_cnt+num
                 let i=i+1
         done

         # 9. Expect to read as many events as were written
         if [ "${read_cnt}" != "${write_cnt}" ]; then
                 echo "Test fail: write ${write_cnt} but read ${read_cnt} !!!"
                 return 1
         fi

         # 10. Clean up temp files if the test succeeded
         i=0
         while [ ${i} -lt ${#tempfiles[*]} ]; do
                 rm ${tempfiles[${i}]}
                 let i=i+1
         done
         return 0
}

do_test
```
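
When the test fails, the temp files are left in place, so the events that
were read more than once can be listed directly. The following is a small
sketch, not part of the testcase itself: the function name
find_duplicate_events is made up, and it assumes the marker text and the
/tmp file names used by repro.sh above.

```shell
#!/bin/bash

# Hypothetical helper: print every test event that appears more than once
# across the given reader output files.
find_duplicate_events()
{
         local testlog="trace pipe concurrency issue"

         # -o prints only the matched marker, -h suppresses file names, so
         # the same event read by two readers yields two identical lines.
         # uniq -d then prints one line per duplicated event.
         grep -oh "${testlog} <[0-9][0-9]*>" "$@" | sort | uniq -d
}

# Example usage after a failed run of repro.sh:
#   find_duplicate_events /tmp/percpu_trace_pipe_* /tmp/main_trace_pipe
```

An empty result with a mismatched count would instead point at lost
events rather than duplicated ones.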

-- Zheng Yejian

Thread overview: 30+ messages
2023-08-10 12:39 [PATCH] tracing: Fix race when concurrently splice_read trace_pipe Zheng Yejian
2023-08-11 11:42 ` Masami Hiramatsu
2023-08-11 12:37   ` Zheng Yejian
2023-08-11 19:24     ` Steven Rostedt
2023-08-12  2:22       ` Zheng Yejian
2023-08-12 21:08         ` Steven Rostedt
2023-08-16 19:23         ` Steven Rostedt
2023-08-17 11:50           ` [RFC PATCH] tracing: Introduce pipe_cpumask to avoid race on trace_pipes Zheng Yejian
2023-08-17 14:13             ` Steven Rostedt
2023-08-18  1:38               ` Zheng Yejian
2023-08-18  1:43                 ` Steven Rostedt
2023-08-18  2:26             ` [PATCH v2] " Zheng Yejian
2023-08-18  5:03               ` Masami Hiramatsu
2023-08-18 13:41                 ` Steven Rostedt
2023-08-18 14:23                   ` Masami Hiramatsu
2023-08-18 15:53                     ` Steven Rostedt
2023-08-19  1:42                       ` Masami Hiramatsu
2023-08-20 13:18                         ` Masami Hiramatsu
2023-08-21  2:19                         ` Masami Hiramatsu
2023-08-21  2:33                           ` Steven Rostedt
2023-08-21  9:21                             ` Masami Hiramatsu
2023-08-21 15:17                         ` Steven Rostedt
2023-08-21 22:32                           ` Masami Hiramatsu
2023-08-11 19:25 ` [PATCH] tracing: Fix race when concurrently splice_read trace_pipe Steven Rostedt
2023-08-12  1:45   ` Zheng Yejian
2023-08-12 20:47     ` Masami Hiramatsu
2023-08-12  7:38   ` Zheng Yejian [this message]
2023-08-13  1:13     ` Steven Rostedt
2023-08-13 16:41       ` Linus Torvalds
2023-08-14 20:16         ` Steven Rostedt
