From: "Bayduraev, Alexey V" <alexey.v.bayduraev@linux.intel.com>
To: Ian Rogers <irogers@google.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>,
	Jiri Olsa <jolsa@redhat.com>, Namhyung Kim <namhyung@kernel.org>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Andi Kleen <ak@linux.intel.com>,
	Adrian Hunter <adrian.hunter@intel.com>,
	Alexander Antonov <alexander.antonov@linux.intel.com>,
	Alexei Budankov <abudankov@huawei.com>,
	Riccardo Mancini <rickyman7@gmail.com>
Subject: Re: [PATCH v13 01/16] perf record: Introduce thread affinity and mmap masks
Date: Tue, 5 Apr 2022 19:21:04 +0300
Message-ID: <b1b875cb-cae5-e707-7897-047f991c3db0@linux.intel.com>
In-Reply-To: <CAP-5=fXK6BfeUdA=h=Rx3K96Hn5nHHPrz7hQDwCMRpSbzcBXvw@mail.gmail.com>

On 05.04.2022 1:25, Ian Rogers wrote:
> On Mon, Jan 17, 2022 at 10:38 AM Alexey Bayduraev
> <alexey.v.bayduraev@linux.intel.com> wrote:
>>
>> Introduce affinity and mmap thread masks. Thread affinity mask

<SNIP>

> 
> In per-thread mode it is possible that cpus is the dummy CPU map here.
> This means that the cpu below has the value -1 and setting bit -1
> actually has the effect of setting bit 63. Here is a reproduction
> based on the acme/perf/core branch:
> 
> ```
> $ make STATIC=1 DEBUG=1 EXTRA_CFLAGS='-fno-omit-frame-pointer
> -fsanitize=undefined -fno-sanitize-recover'
> $ perf record -o /tmp/perf.data  --per-thread true
> tools/include/asm-generic/bitops/atomic.h:10:36: runtime error: shift
> exponent -1 is negative
> $ UBSAN_OPTIONS=abort_on_error=1 gdb --args perf record -o
> /tmp/perf.data --per-thread true
> (gdb) r
> tools/include/asm-generic/bitops/atomic.h:10:36: runtime error: shift
> exponent -1 is negative
> (gdb) bt
> #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:49
> #1  0x00007ffff71d2546 in __GI_abort () at abort.c:79
> #2  0x00007ffff640db9f in __sanitizer::Abort () at
> ../../../../src/libsanitizer/sanitizer_common/sanitizer_posix_libcdep.cpp:151
> #3  0x00007ffff6418efc in __sanitizer::Die () at
> ../../../../src/libsanitizer/sanitizer_common/sanitizer_termination.cpp:58
> #4  0x00007ffff63fd99e in
> __ubsan::__ubsan_handle_shift_out_of_bounds_abort (Data=<optimized
> out>, LHS=<optimized out>,
>     RHS=<optimized out>) at
> ../../../../src/libsanitizer/ubsan/ubsan_handlers.cpp:378
> #5  0x0000555555c54405 in set_bit (nr=-1, addr=0x555556ecd0a0)
>     at tools/include/asm-generic/bitops/atomic.h:10
> #6  0x0000555555c6ddaf in record__mmap_cpu_mask_init
> (mask=0x555556ecd070, cpus=0x555556ecd050) at builtin-record.c:3333
> #7  0x0000555555c7044c in record__init_thread_default_masks
> (rec=0x55555681b100 <record>, cpus=0x555556ecd050) at
> builtin-record.c:3668
> #8  0x0000555555c705b3 in record__init_thread_masks
> (rec=0x55555681b100 <record>) at builtin-record.c:3681
> #9  0x0000555555c7297a in cmd_record (argc=1, argv=0x7fffffffdcc0) at
> builtin-record.c:3976
> #10 0x0000555555e06d41 in run_builtin (p=0x555556827538
> <commands+216>, argc=5, argv=0x7fffffffdcc0) at perf.c:313
> #11 0x0000555555e07253 in handle_internal_command (argc=5,
> argv=0x7fffffffdcc0) at perf.c:365
> #12 0x0000555555e07508 in run_argv (argcp=0x7fffffffdb0c,
> argv=0x7fffffffdb00) at perf.c:409
> #13 0x0000555555e07b32 in main (argc=5, argv=0x7fffffffdcc0) at perf.c:539
> ```
> 
> Not setting the mask->bits if the cpu map is dummy causes no data to
> be written. Setting mask->bits 0 causes a segv. Setting bit 63 works
> but feels like there are more invariants broken in the code.
> 
> Here is a not good workaround patch:
> 
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index ba74fab02e62..62727b676f98 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -3329,6 +3329,11 @@ static void record__mmap_cpu_mask_init(struct
> mmap_cpu_mask *mask, struct perf_c
>  {
>         int c;
> 
> +       if (cpu_map__is_dummy(cpus)) {
> +               set_bit(63, mask->bits);
> +               return;
> +       }
> +
>         for (c = 0; c < cpus->nr; c++)
>                 set_bit(cpus->map[c].cpu, mask->bits);
>  }
> 
> Alexey, what should the expected behavior be with per-thread mmaps?
> 
> Thanks,
> Ian

Thanks a lot,

In the per-thread mmap case we should initialize thread_data[0]->maps[i] from
evlist->mmap[i]. It looks like this patchset missed that.

Your patch works because it makes test_bit() succeed in
record__thread_data_init_maps(), so the thread_data maps get initialized correctly.

However, it would be better to ignore thread_data->masks in
record__mmap_cpu_mask_init() and set up the thread_data maps explicitly for the
per-thread case.
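
As a rough illustration of that idea (not the actual fix), the map assignment
could branch on the per-thread case and never consult the mask there. The
struct and function below are simplified stand-ins with hypothetical names;
the real thread_data and evlist structures are more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_MAPS 8

/* Hypothetical stand-in for the perf thread data; names are illustrative. */
struct model_thread {
	int maps[MAX_MAPS];	/* indices of the mmaps owned by this thread */
	int nr_maps;
	uint64_t cpu_mask;	/* one bit per CPU, CPUs 0..63 only */
};

/*
 * Assign mmaps to a thread. cpus[i] is the CPU behind mmap i, or -1 for
 * the dummy map used in per-thread mode.
 */
static void model_init_maps(struct model_thread *t, const int *cpus,
			    int nr_mmaps, bool per_thread)
{
	int i;

	assert(nr_mmaps <= MAX_MAPS);
	t->nr_maps = 0;

	for (i = 0; i < nr_mmaps; i++) {
		if (per_thread) {
			/* Per-thread case: take every mmap, ignore the mask. */
			t->maps[t->nr_maps++] = i;
		} else if (cpus[i] >= 0 && cpus[i] < 64 &&
			   ((t->cpu_mask >> cpus[i]) & 1)) {
			/* Per-CPU case: take only mmaps whose CPU is in the mask. */
			t->maps[t->nr_maps++] = i;
		}
	}
}
```

With this shape the value -1 never reaches a bit operation, so no bit has to
be set artificially to make the per-thread case work.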

Also, to prevent further runtime crashes, the --per-thread and --threads
options should be mutually exclusive.
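
The check could look roughly like the sketch below. The option struct, field
names and error text are assumptions made for illustration; the real perf
record option layout and message may differ.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical option flags; not the real perf record option structure. */
struct model_opts {
	bool per_thread;	/* --per-thread was given */
	bool threads;		/* --threads was given */
};

/* Reject the combination early, before any masks or maps are set up. */
static int model_check_opts(const struct model_opts *opts)
{
	assert(opts != NULL);

	if (opts->per_thread && opts->threads) {
		fprintf(stderr,
			"--per-thread and --threads are mutually exclusive\n");
		return -EINVAL;
	}
	return 0;
}
```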

I will prepare a fix for this issue soon.

Regards,
Alexey

> 
>> +static void record__free_thread_masks(struct record *rec, int nr_threads)
>> +{
>> +       int t;
>> +
>> +       if (rec->thread_masks)
>> +               for (t = 0; t < nr_threads; t++)
>> +                       record__thread_mask_free(&rec->thread_masks[t]);
>> +
>> +       zfree(&rec->thread_masks);
>> +}
>> +
>> +static int record__alloc_thread_masks(struct record *rec, int nr_threads, int nr_bits)
>> +{
>> +       int t, ret;
>> +
>> +       rec->thread_masks = zalloc(nr_threads * sizeof(*(rec->thread_masks)));
>> +       if (!rec->thread_masks) {
>> +               pr_err("Failed to allocate thread masks\n");
>> +               return -ENOMEM;
>> +       }
>> +
>> +       for (t = 0; t < nr_threads; t++) {
>> +               ret = record__thread_mask_alloc(&rec->thread_masks[t], nr_bits);
>> +               if (ret) {
>> +                       pr_err("Failed to allocate thread masks[%d]\n", t);
>> +                       goto out_free;
>> +               }
>> +       }
>> +
>> +       return 0;
>> +
>> +out_free:
>> +       record__free_thread_masks(rec, nr_threads);
>> +
>> +       return ret;
>> +}
>> +
>> +static int record__init_thread_default_masks(struct record *rec, struct perf_cpu_map *cpus)
>> +{
>> +       int ret;
>> +
>> +       ret = record__alloc_thread_masks(rec, 1, cpu__max_cpu().cpu);
>> +       if (ret)
>> +               return ret;
>> +
>> +       record__mmap_cpu_mask_init(&rec->thread_masks->maps, cpus);
>> +
>> +       rec->nr_threads = 1;
>> +
>> +       return 0;
>> +}
>> +
>> +static int record__init_thread_masks(struct record *rec)
>> +{
>> +       struct perf_cpu_map *cpus = rec->evlist->core.cpus;
>> +
>> +       return record__init_thread_default_masks(rec, cpus);
>> +}
>> +
>>  int cmd_record(int argc, const char **argv)
>>  {
>>         int err;
>> @@ -2948,6 +3063,12 @@ int cmd_record(int argc, const char **argv)
>>                 goto out;
>>         }
>>
>> +       err = record__init_thread_masks(rec);
>> +       if (err) {
>> +               pr_err("Failed to initialize parallel data streaming masks\n");
>> +               goto out;
>> +       }
>> +
>>         if (rec->opts.nr_cblocks > nr_cblocks_max)
>>                 rec->opts.nr_cblocks = nr_cblocks_max;
>>         pr_debug("nr_cblocks: %d\n", rec->opts.nr_cblocks);
>> @@ -2966,6 +3087,8 @@ int cmd_record(int argc, const char **argv)
>>         symbol__exit();
>>         auxtrace_record__free(rec->itr);
>>  out_opts:
>> +       record__free_thread_masks(rec, rec->nr_threads);
>> +       rec->nr_threads = 0;
>>         evlist__close_control(rec->opts.ctl_fd, rec->opts.ctl_fd_ack, &rec->opts.ctl_fd_close);
>>         return err;
>>  }
>> --
>> 2.19.0
>>
