public inbox for linux-kernel@vger.kernel.org
From: Jiri Olsa <jolsa@redhat.com>
To: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>,
	Namhyung Kim <namhyung@kernel.org>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>, Andi Kleen <ak@linux.intel.com>,
	linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v5 0/3] perf record: adapt NUMA awareness to machines with #CPUs > 1K
Date: Tue, 3 Dec 2019 13:17:45 +0100
Message-ID: <20191203121745.GA14125@krava>
In-Reply-To: <d1aead99-474a-46d3-36be-36dbb8e5581b@linux.intel.com>

On Tue, Dec 03, 2019 at 02:41:29PM +0300, Alexey Budankov wrote:
> 
> The current glibc implementation of the cpu_set_t type limits the CPU
> mask to 1024 CPUs. On machines with more CPUs, this limit confines the
> NUMA awareness of the Perf tool in record mode, enabled through the
> --affinity option, to the first 1024 CPUs.
> 
> This patch set lets the Perf tool overcome the 1024-CPU limit by
> using a dedicated struct mmap_cpu_mask type and applying the tool's
> bitmap API operations to manipulate the affinity masks of the tool's
> thread and the mmaped data buffers.
> 
> The tools bitmap API has been extended with a bitmap_free() function
> and a bitmap_equal() operation whose implementation is derived from
> the kernel one.
> 
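To illustrate the description above: glibc's cpu_set_t is a fixed
1024-bit type, whereas a heap-allocated bitmap can be sized for any
number of CPUs. What follows is a minimal sketch under stated
assumptions, not the code from the series: struct cpu_mask_sketch and
the mask_*() helpers are hypothetical names, and mask_equal() only
mirrors the idea behind bitmap_equal().

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define BITS_PER_ULONG	(sizeof(unsigned long) * CHAR_BIT)
#define ULONGS_FOR(nbits) (((nbits) + BITS_PER_ULONG - 1) / BITS_PER_ULONG)

/* Hypothetical stand-in for a CPU mask of arbitrary length. */
struct cpu_mask_sketch {
	unsigned long *bits;	/* zero-initialized bitmap storage */
	size_t nbits;		/* number of CPUs the mask can describe */
};

/* Allocate a cleared mask for nbits CPUs; returns 0 on success, -1 on failure. */
static inline int mask_alloc(struct cpu_mask_sketch *m, size_t nbits)
{
	m->bits = calloc(ULONGS_FOR(nbits), sizeof(unsigned long));
	m->nbits = m->bits ? nbits : 0;
	return m->bits ? 0 : -1;
}

/* free(NULL) is a no-op, so callers need no NULL check before releasing. */
static inline void mask_free(struct cpu_mask_sketch *m)
{
	free(m->bits);
	m->bits = NULL;
	m->nbits = 0;
}

/* Set the bit for one CPU; out-of-range CPUs are ignored. */
static inline void mask_set(struct cpu_mask_sketch *m, size_t cpu)
{
	if (cpu < m->nbits)
		m->bits[cpu / BITS_PER_ULONG] |= 1UL << (cpu % BITS_PER_ULONG);
}

/* Size of the bitmap storage in bytes. */
static inline size_t mask_bytes(const struct cpu_mask_sketch *m)
{
	return ULONGS_FOR(m->nbits) * sizeof(unsigned long);
}

/*
 * Compare two masks. Whole words can be compared because mask_alloc()
 * zeroes the storage and mask_set() never touches bits past nbits.
 */
static inline bool mask_equal(const struct cpu_mask_sketch *a,
			      const struct cpu_mask_sketch *b)
{
	if (a->nbits != b->nbits)
		return false;
	return memcmp(a->bits, b->bits, mask_bytes(a)) == 0;
}
```

On Linux, the pair (mask_bytes(&m), m.bits) is what would be handed to
sched_setaffinity(2) as cpusetsize and mask, which is how a mask wider
than cpu_set_t can be applied. An equality check like mask_equal() lets
a caller skip that call when the requested mask already matches the
current one.
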
> ---
> Changes in v5:
> - avoided allocation of mmap affinity masks in case of 
>   rec->opts.affinity == PERF_AFFINITY_SYS

Acked-by: Jiri Olsa <jolsa@redhat.com>

thanks,
jirka

> Changes in v4:
> - renamed perf_mmap__print_cpu_mask() to mmap_cpu_mask__scnprintf()
> - avoided checking mask bits for NULL prior to calling bitmap_free()
> - avoided thread affinity mask allocation for case of 
>   rec->opts.affinity == PERF_AFFINITY_SYS
> Changes in v3:
> - implemented perf_mmap__print_cpu_mask() function
> - used perf_mmap__print_cpu_mask() to log thread and mmap cpu masks
>   when verbose level is equal to 2
> Changes in v2:
> - implemented bitmap_free() for symmetry with bitmap_alloc()
> - capitalized MMAP_CPU_MASK_BYTES() macro
> - returned -1 from perf_mmap__setup_affinity_mask()
> - implemented releasing of masks using bitmap_free()
> - moved debug printing under -vv option
> 
> ---
> Alexey Budankov (3):
>   tools bitmap: implement bitmap_equal() operation at bitmap API
>   perf mmap: declare type for cpu mask of arbitrary length
>   perf record: adapt affinity to machines with #CPUs > 1K
> 
>  tools/include/linux/bitmap.h | 30 +++++++++++++++++++++++++++
>  tools/lib/bitmap.c           | 15 ++++++++++++++
>  tools/perf/builtin-record.c  | 28 +++++++++++++++++++------
>  tools/perf/util/mmap.c       | 40 ++++++++++++++++++++++++++++++------
>  tools/perf/util/mmap.h       | 13 +++++++++++-
>  5 files changed, 113 insertions(+), 13 deletions(-)
> 
> ---
> Validation:
> 
> # tools/perf/perf record -vv -- ls
> Using CPUID GenuineIntel-6-5E-3
> intel_pt default config: tsc,mtc,mtc_period=3,psb_period=3,pt,branch
> nr_cblocks: 0
> affinity: SYS
> mmap flush: 1
> comp level: 0
> ------------------------------------------------------------
> perf_event_attr:
>   size                             120
>   { sample_period, sample_freq }   4000
>   sample_type                      IP|TID|TIME|PERIOD
>   read_format                      ID
>   disabled                         1
>   inherit                          1
>   mmap                             1
>   comm                             1
>   freq                             1
>   enable_on_exec                   1
>   task                             1
>   precise_ip                       3
>   sample_id_all                    1
>   exclude_guest                    1
>   mmap2                            1
>   comm_exec                        1
>   ksymbol                          1
>   bpf_event                        1
> ------------------------------------------------------------
> sys_perf_event_open: pid 23718  cpu 0  group_fd -1  flags 0x8 = 4
> sys_perf_event_open: pid 23718  cpu 1  group_fd -1  flags 0x8 = 5
> sys_perf_event_open: pid 23718  cpu 2  group_fd -1  flags 0x8 = 6
> sys_perf_event_open: pid 23718  cpu 3  group_fd -1  flags 0x8 = 9
> sys_perf_event_open: pid 23718  cpu 4  group_fd -1  flags 0x8 = 10
> sys_perf_event_open: pid 23718  cpu 5  group_fd -1  flags 0x8 = 11
> sys_perf_event_open: pid 23718  cpu 6  group_fd -1  flags 0x8 = 12
> sys_perf_event_open: pid 23718  cpu 7  group_fd -1  flags 0x8 = 13
> mmap size 528384B
> 0x7f3e06e060b8: mmap mask[8]: 
> 0x7f3e06e16180: mmap mask[8]: 
> 0x7f3e06e26248: mmap mask[8]: 
> 0x7f3e06e36310: mmap mask[8]: 
> 0x7f3e06e463d8: mmap mask[8]: 
> 0x7f3e06e564a0: mmap mask[8]: 
> 0x7f3e06e66568: mmap mask[8]: 
> 0x7f3e06e76630: mmap mask[8]: 
> ------------------------------------------------------------
> perf_event_attr:
>   type                             1
>   size                             120
>   config                           0x9
>   watermark                        1
>   sample_id_all                    1
>   bpf_event                        1
>   { wakeup_events, wakeup_watermark } 1
> ------------------------------------------------------------
> sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 14
> sys_perf_event_open: pid -1  cpu 1  group_fd -1  flags 0x8 = 15
> sys_perf_event_open: pid -1  cpu 2  group_fd -1  flags 0x8 = 16
> sys_perf_event_open: pid -1  cpu 3  group_fd -1  flags 0x8 = 17
> sys_perf_event_open: pid -1  cpu 4  group_fd -1  flags 0x8 = 18
> sys_perf_event_open: pid -1  cpu 5  group_fd -1  flags 0x8 = 19
> sys_perf_event_open: pid -1  cpu 6  group_fd -1  flags 0x8 = 20
> sys_perf_event_open: pid -1  cpu 7  group_fd -1  flags 0x8 = 21
> mmap size 528384B
> 0x7f3e0697d0b8: mmap mask[8]: 
> 0x7f3e0698d180: mmap mask[8]: 
> 0x7f3e0699d248: mmap mask[8]: 
> 0x7f3e069ad310: mmap mask[8]: 
> 0x7f3e069bd3d8: mmap mask[8]: 
> 0x7f3e069cd4a0: mmap mask[8]: 
> 0x7f3e069dd568: mmap mask[8]: 
> 0x7f3e069ed630: mmap mask[8]: 
> Synthesizing TSC conversion information
> arch			      copy     Documentation  init     kernel	 MAINTAINERS	  modules.builtin.modinfo  perf.data	  scripts   System.map	vmlinux
> block			      COPYING  drivers	      ipc      lbuild	 Makefile	  modules.order		   perf.data.old  security  tools	vmlinux.o
> certs			      CREDITS  fs	      Kbuild   lib	 mm		  Module.symvers	   README	  sound     usr
> config-5.2.7-100.fc29.x86_64  crypto   include	      Kconfig  LICENSES  modules.builtin  net			   samples	  stdio     virt
> [ perf record: Woken up 1 times to write data ]
> Looking at the vmlinux_path (8 entries long)
> Using vmlinux for symbols
> [ perf record: Captured and wrote 0.013 MB perf.data (8 samples) ]
> 
> # tools/perf/perf record -vv --affinity=cpu -- ls
> thread mask[8]: empty
> Using CPUID GenuineIntel-6-5E-3
> intel_pt default config: tsc,mtc,mtc_period=3,psb_period=3,pt,branch
> nr_cblocks: 0
> affinity: CPU
> mmap flush: 1
> comp level: 0
> ------------------------------------------------------------
> perf_event_attr:
>   size                             120
>   { sample_period, sample_freq }   4000
>   sample_type                      IP|TID|TIME|PERIOD
>   read_format                      ID
>   disabled                         1
>   inherit                          1
>   mmap                             1
>   comm                             1
>   freq                             1
>   enable_on_exec                   1
>   task                             1
>   precise_ip                       3
>   sample_id_all                    1
>   exclude_guest                    1
>   mmap2                            1
>   comm_exec                        1
>   ksymbol                          1
>   bpf_event                        1
> ------------------------------------------------------------
> sys_perf_event_open: pid 23713  cpu 0  group_fd -1  flags 0x8 = 4
> sys_perf_event_open: pid 23713  cpu 1  group_fd -1  flags 0x8 = 5
> sys_perf_event_open: pid 23713  cpu 2  group_fd -1  flags 0x8 = 6
> sys_perf_event_open: pid 23713  cpu 3  group_fd -1  flags 0x8 = 9
> sys_perf_event_open: pid 23713  cpu 4  group_fd -1  flags 0x8 = 10
> sys_perf_event_open: pid 23713  cpu 5  group_fd -1  flags 0x8 = 11
> sys_perf_event_open: pid 23713  cpu 6  group_fd -1  flags 0x8 = 12
> sys_perf_event_open: pid 23713  cpu 7  group_fd -1  flags 0x8 = 13
> mmap size 528384B
> 0x7f3e005bc0b8: mmap mask[8]: 0
> 0x7f3e005cc180: mmap mask[8]: 1
> 0x7f3e005dc248: mmap mask[8]: 2
> 0x7f3e005ec310: mmap mask[8]: 3
> 0x7f3e005fc3d8: mmap mask[8]: 4
> 0x7f3e0060c4a0: mmap mask[8]: 5
> 0x7f3e0061c568: mmap mask[8]: 6
> 0x7f3e0062c630: mmap mask[8]: 7
> ------------------------------------------------------------
> perf_event_attr:
>   type                             1
>   size                             120
>   config                           0x9
>   watermark                        1
>   sample_id_all                    1
>   bpf_event                        1
>   { wakeup_events, wakeup_watermark } 1
> ------------------------------------------------------------
> sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 14
> sys_perf_event_open: pid -1  cpu 1  group_fd -1  flags 0x8 = 15
> sys_perf_event_open: pid -1  cpu 2  group_fd -1  flags 0x8 = 16
> sys_perf_event_open: pid -1  cpu 3  group_fd -1  flags 0x8 = 17
> sys_perf_event_open: pid -1  cpu 4  group_fd -1  flags 0x8 = 18
> sys_perf_event_open: pid -1  cpu 5  group_fd -1  flags 0x8 = 19
> sys_perf_event_open: pid -1  cpu 6  group_fd -1  flags 0x8 = 20
> sys_perf_event_open: pid -1  cpu 7  group_fd -1  flags 0x8 = 21
> mmap size 528384B
> 0x7f3e001330b8: mmap mask[8]: 
> 0x7f3e00143180: mmap mask[8]: 
> 0x7f3e00153248: mmap mask[8]: 
> 0x7f3e00163310: mmap mask[8]: 
> 0x7f3e001733d8: mmap mask[8]: 
> 0x7f3e001834a0: mmap mask[8]: 
> 0x7f3e00193568: mmap mask[8]: 
> 0x7f3e001a3630: mmap mask[8]: 
> Synthesizing TSC conversion information
> 0x9c9ff0: thread mask[8]: 0
> 0x9c9ff0: thread mask[8]: 1
> 0x9c9ff0: thread mask[8]: 2
> 0x9c9ff0: thread mask[8]: 3
> 0x9c9ff0: thread mask[8]: 4
> arch			      copy     Documentation  init     kernel	 MAINTAINERS	  modules.builtin.modinfo  perf.data	  scripts   System.map	vmlinux
> block			      COPYING  drivers	      ipc      lbuild	 Makefile	  modules.order		   perf.data.old  security  tools	vmlinux.o
> certs			      CREDITS  fs	      Kbuild   lib	 mm		  Module.symvers	   README	  sound     usr
> config-5.2.7-100.fc29.x86_64  crypto   include	      Kconfig  LICENSES  modules.builtin  net			   samples	  stdio     virt
> 0x9c9ff0: thread mask[8]: 5
> 0x9c9ff0: thread mask[8]: 6
> 0x9c9ff0: thread mask[8]: 7
> 0x9c9ff0: thread mask[8]: 0
> 0x9c9ff0: thread mask[8]: 1
> 0x9c9ff0: thread mask[8]: 2
> 0x9c9ff0: thread mask[8]: 3
> 0x9c9ff0: thread mask[8]: 4
> 0x9c9ff0: thread mask[8]: 5
> 0x9c9ff0: thread mask[8]: 6
> 0x9c9ff0: thread mask[8]: 7
> [ perf record: Woken up 0 times to write data ]
> 0x9c9ff0: thread mask[8]: 0
> 0x9c9ff0: thread mask[8]: 1
> 0x9c9ff0: thread mask[8]: 2
> 0x9c9ff0: thread mask[8]: 3
> 0x9c9ff0: thread mask[8]: 4
> 0x9c9ff0: thread mask[8]: 5
> 0x9c9ff0: thread mask[8]: 6
> 0x9c9ff0: thread mask[8]: 7
> Looking at the vmlinux_path (8 entries long)
> Using vmlinux for symbols
> ...
> [ perf record: Captured and wrote 0.013 MB perf.data (10 samples) ]
> 
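
The mmap mask lines in the validation output above list the set CPUs
after a "mask[nbits]:" tag and leave the list blank when no CPU is set.
A minimal sketch of such a formatter follows. It assumes a list syntax
with ranges such as "0-3,7" (the logs only show single CPUs, so the
range form is an assumption), and mask_list_scnprintf() and
bit_is_set() are hypothetical names, not the functions from the series.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define BITS_PER_ULONG	(sizeof(unsigned long) * CHAR_BIT)

/* Test a single bit of the bitmap. */
static inline bool bit_is_set(const unsigned long *bits, size_t n)
{
	return (bits[n / BITS_PER_ULONG] >> (n % BITS_PER_ULONG)) & 1UL;
}

/*
 * Format the set bits as a CPU list into buf (always NUL-terminated
 * when size > 0). Returns the number of characters stored, excluding
 * the terminating NUL; output is truncated if buf is too small.
 */
static inline size_t mask_list_scnprintf(const unsigned long *bits,
					 size_t nbits, char *buf, size_t size)
{
	size_t len = 0, cpu = 0;

	if (size == 0)
		return 0;
	buf[0] = '\0';

	while (cpu < nbits) {
		size_t first, last;
		int n;

		if (!bit_is_set(bits, cpu)) {
			cpu++;
			continue;
		}

		/* Extend the run of consecutive set bits starting at cpu. */
		first = cpu;
		while (cpu + 1 < nbits && bit_is_set(bits, cpu + 1))
			cpu++;
		last = cpu++;

		if (first == last)
			n = snprintf(buf + len, size - len, "%s%zu",
				     len ? "," : "", first);
		else
			n = snprintf(buf + len, size - len, "%s%zu-%zu",
				     len ? "," : "", first, last);
		if (n < 0)
			break;
		if ((size_t)n >= size - len) {
			len = size - 1;	/* truncated by snprintf() */
			break;
		}
		len += (size_t)n;
	}
	return len;
}
```

A caller could then print a line in the style of the log with
something like printf("%p: mmap mask[%zu]: %s\n", ptr, nbits, buf),
where ptr, nbits and buf are the caller's own variables.
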


Thread overview: 13+ messages
2019-12-03 11:41 [PATCH v5 0/3] perf record: adapt NUMA awareness to machines with #CPUs > 1K Alexey Budankov
2019-12-03 11:43 ` [PATCH v5 1/3] tools bitmap: implement bitmap_equal() operation at bitmap API Alexey Budankov
2020-01-10 17:53   ` [tip: perf/core] tools bitmap: Implement " tip-bot2 for Alexey Budankov
2019-12-03 11:44 ` [PATCH v5 2/3] perf mmap: declare type for cpu mask of arbitrary length Alexey Budankov
2019-12-04 13:49   ` Arnaldo Carvalho de Melo
2019-12-05  7:31     ` Alexey Budankov
2020-01-10 17:53   ` [tip: perf/core] perf mmap: Declare " tip-bot2 for Alexey Budankov
2019-12-03 11:45 ` [PATCH v5 3/3] perf record: adapt affinity to machines with #CPUs > 1K Alexey Budankov
2019-12-04 13:48   ` Arnaldo Carvalho de Melo
2019-12-05  7:30     ` Alexey Budankov
2020-01-10 17:53   ` [tip: perf/core] perf record: Adapt " tip-bot2 for Alexey Budankov
2019-12-03 12:17 ` Jiri Olsa [this message]
2019-12-03 18:36   ` [PATCH v5 0/3] perf record: adapt NUMA awareness " Arnaldo Carvalho de Melo
