linux-perf-users.vger.kernel.org archive mirror
* [PATCH 0/5] perf sched: Minor optimizations for resource initialization
@ 2024-02-05 10:46 Yang Jihong
  2024-02-05 10:46 ` [PATCH 1/5] perf sched: Move start_work_mutex and work_done_wait_mutex initialization to perf_sched__replay() Yang Jihong
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Yang Jihong @ 2024-02-05 10:46 UTC (permalink / raw)
  To: peterz, mingo, acme, namhyung, mark.rutland, alexander.shishkin,
	jolsa, irogers, adrian.hunter, linux-perf-users, linux-kernel
  Cc: yangjihong1

start_work_mutex, work_done_wait_mutex, curr_thread, curr_pid, and
cpu_last_switched are all initialized together in cmd_sched(), but some of
these initializations are unnecessary for certain perf sched subcommands,
especially 'perf sched record'.
This series of patches initializes only the resources that each subcommand
actually requires.

Simple functional testing:

  # perf sched record perf bench sched messaging
  # Running 'sched/messaging' benchmark:
  # 20 sender and receiver processes per group
  # 10 groups == 400 processes run
  
       Total time: 0.204 [sec]
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 14.868 MB perf.data (133947 samples) ]
  
  # perf sched latency
  
   -------------------------------------------------------------------------------------------------------------------------------------------
    Task                  |   Runtime ms  | Switches | Avg delay ms    | Max delay ms    | Max delay start           | Max delay end          |
   -------------------------------------------------------------------------------------------------------------------------------------------
    qemu-system-x86:(3)   |      2.753 ms |       10 | avg:   0.963 ms | max:   8.526 ms | max start: 457541.695704 s | max end: 457541.704230 s
    sched-messaging:(401) |   2995.481 ms |    36312 | avg:   0.944 ms | max: 111.551 ms | max start: 457541.645349 s | max end: 457541.756900 s
    rcu_sched:14          |      0.402 ms |       32 | avg:   0.266 ms | max:   7.996 ms | max start: 457541.581363 s | max end: 457541.589359 s
    sshd:48125            |      0.461 ms |        2 | avg:   0.033 ms | max:   0.036 ms | max start: 457541.584374 s | max end: 457541.584410 s
    perf:112408           |      2.321 ms |        2 | avg:   0.031 ms | max:   0.032 ms | max start: 457541.846874 s | max end: 457541.846906 s
  <SNIP>
    ksoftirqd/1:22        |      0.704 ms |        3 | avg:   0.005 ms | max:   0.005 ms | max start: 457541.845388 s | max end: 457541.845393 s
    kworker/15:0-mm:61886 |      0.227 ms |       21 | avg:   0.005 ms | max:   0.006 ms | max start: 457541.739187 s | max end: 457541.739193 s
    kworker/13:1-ev:92643 |      0.249 ms |       22 | avg:   0.005 ms | max:   0.006 ms | max start: 457541.768695 s | max end: 457541.768701 s
    kworker/12:1-ev:61202 |      0.418 ms |       40 | avg:   0.005 ms | max:   0.007 ms | max start: 457541.739679 s | max end: 457541.739687 s
   -----------------------------------------------------------------------------------------------------------------
    TOTAL:                |   3007.424 ms |    36776 |
   ---------------------------------------------------
  
  # echo $?
  0
  
  # perf sched map
    *A0                                                               457541.580856 secs A0 => migration/0:15
    *.                                                                457541.580886 secs .  => swapper:0
     .  *B0                                                           457541.581018 secs B0 => migration/1:21
     .  *.                                                            457541.581050 secs
     .   .  *C0                                                       457541.581147 secs C0 => migration/2:27
     .   .  *.                                                        457541.581174 secs
     .   .   .  *D0                                                   457541.581259 secs D0 => migration/3:33
  <SNIP>
     E7  E7  E7  E7  E7  E7  E7  E7  E7  E7  E7 *E7  .   .   .   .    457541.847783 secs
     E7  E7  E7  E7  E7  E7  E7  E7  E7  E7  E7  E7 *E7  .   .   .    457541.847873 secs
     E7  E7  E7  E7  E7  E7  E7  E7  E7  E7  E7  E7  E7 *E7  .   .    457541.847954 secs
     E7  E7  E7  E7  E7  E7  E7  E7  E7  E7  E7  E7  E7  E7 *E7  .    457541.848034 secs
     E7  E7  E7  E7  E7  E7  E7  E7  E7  E7  E7  E7  E7  E7  E7 *E7   457541.848108 secs
  # echo $?
  0
  
  # perf sched replay
  run measurement overhead: 108 nsecs
  sleep measurement overhead: 65244 nsecs
  the run test took 1000002 nsecs
  the sleep test took 1079677 nsecs
  nr_run_events:        40700
  nr_sleep_events:      41415
  nr_wakeup_events:     31601
  target-less wakeups:  15
  multi-target wakeups: 805
  task      0 (             swapper:         0), nr_events: 7307
  task      1 (             swapper:         1), nr_events: 1
  task      2 (             swapper:         2), nr_events: 1
  <SNIP>
  task    713 (     sched-messaging:    112809), nr_events: 987
  task    714 (     sched-messaging:    112810), nr_events: 2706
  ------------------------------------------------------------
  #1  : 1298.443, ravg: 1298.44, cpu: 8281.74 / 8281.74
  #2  : 1316.426, ravg: 1300.24, cpu: 7577.61 / 8211.32
  #3  : 1323.864, ravg: 1302.60, cpu: 7932.48 / 8183.44
  #4  : 1329.423, ravg: 1305.29, cpu: 7646.17 / 8129.71
  #5  : 1321.419, ravg: 1306.90, cpu: 7669.90 / 8083.73
  #6  : 1322.082, ravg: 1308.42, cpu: 7755.66 / 8050.92
  #7  : 1324.330, ravg: 1310.01, cpu: 7361.51 / 7981.98
  #8  : 1312.489, ravg: 1310.26, cpu: 7170.11 / 7900.80
  #9  : 1312.002, ravg: 1310.43, cpu: 7176.40 / 7828.36
  #10 : 1311.737, ravg: 1310.56, cpu: 7121.14 / 7757.63
  # echo $?
  0
  
  # perf sched script
              perf  112408 [000] 457541.580826: sched:sched_stat_runtime: comm=perf pid=112408 runtime=53050 [ns] vruntime=621244222333 [ns]
              perf  112408 [000] 457541.580831:       sched:sched_waking: comm=migration/0 pid=15 prio=0 target_cpu=000
              perf  112408 [000] 457541.580853: sched:sched_stat_runtime: comm=perf pid=112408 runtime=24379 [ns] vruntime=621244246712 [ns]
              perf  112408 [000] 457541.580856:       sched:sched_switch: prev_comm=perf prev_pid=112408 prev_prio=120 prev_state=R+ ==> next_comm=migration/0 next_pid=15 next_prio=0
  <SNIP>
           swapper       0 [012] 457541.847873:       sched:sched_switch: prev_comm=swapper/12 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=perf next_pid=112408 next_prio=120
           swapper       0 [013] 457541.847954:       sched:sched_switch: prev_comm=swapper/13 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=perf next_pid=112408 next_prio=120
           swapper       0 [014] 457541.848034:       sched:sched_switch: prev_comm=swapper/14 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=perf next_pid=112408 next_prio=120
           swapper       0 [015] 457541.848108:       sched:sched_switch: prev_comm=swapper/15 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=perf next_pid=112408 next_prio=120
  # echo $?
  0
  
  # perf sched timehist
  Samples do not have callchains.
             time    cpu  task name                       wait time  sch delay   run time
                          [tid/pid]                          (msec)     (msec)     (msec)
  --------------- ------  ------------------------------  ---------  ---------  ---------
    457541.580856 [0000]  perf[112408]                        0.000      0.000      0.000
    457541.580886 [0000]  migration/0[15]                     0.000      0.024      0.029
    457541.581018 [0001]  perf[112408]                        0.000      0.000      0.000
    457541.581050 [0001]  migration/1[21]                     0.000      0.006      0.032
    457541.581147 [0002]  perf[112408]                        0.000      0.000      0.000
    457541.581174 [0002]  migration/2[27]                     0.000      0.005      0.026
  <SNIP>
    457541.847623 [0009]  <idle>                              0.010      0.000      1.489
    457541.847704 [0010]  <idle>                              0.012      0.000      1.777
    457541.847783 [0011]  <idle>                              0.233      0.000      1.213
    457541.847873 [0012]  <idle>                              0.008      0.000     24.188
    457541.847954 [0013]  <idle>                              0.009      0.000      1.734
    457541.848034 [0014]  <idle>                              0.009      0.000      2.969
    457541.848108 [0015]  <idle>                              0.220      0.000      1.205
  # echo $?
  0

Yang Jihong (5):
  perf sched: Move start_work_mutex and work_done_wait_mutex
    initialization to perf_sched__replay()
  perf sched: Fix memory leak in perf_sched__map()
  perf sched: Move curr_thread initialization to perf_sched__map()
  perf sched: Move curr_pid and cpu_last_switched initialization to
    perf_sched__{lat|map|replay}()
  perf thread_map: Free strlist on normal path in
    thread_map__new_by_tid_str()

 tools/perf/builtin-sched.c   | 163 ++++++++++++++++++++++-------------
 tools/perf/util/thread_map.c |   2 +-
 2 files changed, 105 insertions(+), 60 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/5] perf sched: Move start_work_mutex and work_done_wait_mutex initialization to perf_sched__replay()
  2024-02-05 10:46 [PATCH 0/5] perf sched: Minor optimizations for resource initialization Yang Jihong
@ 2024-02-05 10:46 ` Yang Jihong
  2024-02-05 10:46 ` [PATCH 2/5] perf sched: Fix memory leak in perf_sched__map() Yang Jihong
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Yang Jihong @ 2024-02-05 10:46 UTC (permalink / raw)
  To: peterz, mingo, acme, namhyung, mark.rutland, alexander.shishkin,
	jolsa, irogers, adrian.hunter, linux-perf-users, linux-kernel
  Cc: yangjihong1

start_work_mutex and work_done_wait_mutex are used only by
'perf sched replay'. Move their initialization into perf_sched__replay()
to avoid unnecessary work in the other subcommands.

Simple functional testing:

  # perf sched record perf bench sched messaging
  # Running 'sched/messaging' benchmark:
  # 20 sender and receiver processes per group
  # 10 groups == 400 processes run

       Total time: 0.197 [sec]
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 14.952 MB perf.data (134165 samples) ]

  # perf sched replay
  run measurement overhead: 108 nsecs
  sleep measurement overhead: 65658 nsecs
  the run test took 999991 nsecs
  the sleep test took 1079324 nsecs
  nr_run_events:        42378
  nr_sleep_events:      43102
  nr_wakeup_events:     31852
  target-less wakeups:  17
  multi-target wakeups: 712
  task      0 (             swapper:         0), nr_events: 10451
  task      1 (             swapper:         1), nr_events: 3
  task      2 (             swapper:         2), nr_events: 1
  <SNIP>
  task    717 (     sched-messaging:     74483), nr_events: 152
  task    718 (     sched-messaging:     74484), nr_events: 1944
  task    719 (     sched-messaging:     74485), nr_events: 73
  task    720 (     sched-messaging:     74486), nr_events: 163
  task    721 (     sched-messaging:     74487), nr_events: 942
  task    722 (     sched-messaging:     74488), nr_events: 78
  task    723 (     sched-messaging:     74489), nr_events: 1090
  ------------------------------------------------------------
  #1  : 1366.507, ravg: 1366.51, cpu: 7682.70 / 7682.70
  #2  : 1410.072, ravg: 1370.86, cpu: 7723.88 / 7686.82
  #3  : 1396.296, ravg: 1373.41, cpu: 7568.20 / 7674.96
  #4  : 1381.019, ravg: 1374.17, cpu: 7531.81 / 7660.64
  #5  : 1393.826, ravg: 1376.13, cpu: 7725.25 / 7667.11
  #6  : 1401.581, ravg: 1378.68, cpu: 7594.82 / 7659.88
  #7  : 1381.337, ravg: 1378.94, cpu: 7371.22 / 7631.01
  #8  : 1373.842, ravg: 1378.43, cpu: 7894.92 / 7657.40
  #9  : 1364.697, ravg: 1377.06, cpu: 7324.91 / 7624.15
  #10 : 1363.613, ravg: 1375.72, cpu: 7209.55 / 7582.69
  # echo $?
  0

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
---
 tools/perf/builtin-sched.c | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 42d5fc5d6b7b..08dec557e6be 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -3285,15 +3285,20 @@ static int perf_sched__map(struct perf_sched *sched)
 
 static int perf_sched__replay(struct perf_sched *sched)
 {
+	int ret;
 	unsigned long i;
 
+	mutex_init(&sched->start_work_mutex);
+	mutex_init(&sched->work_done_wait_mutex);
+
 	calibrate_run_measurement_overhead(sched);
 	calibrate_sleep_measurement_overhead(sched);
 
 	test_calibrations(sched);
 
-	if (perf_sched__read_events(sched))
-		return -1;
+	ret = perf_sched__read_events(sched);
+	if (ret)
+		goto out_mutex_destroy;
 
 	printf("nr_run_events:        %ld\n", sched->nr_run_events);
 	printf("nr_sleep_events:      %ld\n", sched->nr_sleep_events);
@@ -3318,7 +3323,11 @@ static int perf_sched__replay(struct perf_sched *sched)
 
 	sched->thread_funcs_exit = true;
 	destroy_tasks(sched);
-	return 0;
+
+out_mutex_destroy:
+	mutex_destroy(&sched->start_work_mutex);
+	mutex_destroy(&sched->work_done_wait_mutex);
+	return ret;
 }
 
 static void setup_sorting(struct perf_sched *sched, const struct option *options,
@@ -3556,8 +3565,6 @@ int cmd_sched(int argc, const char **argv)
 	unsigned int i;
 	int ret = 0;
 
-	mutex_init(&sched.start_work_mutex);
-	mutex_init(&sched.work_done_wait_mutex);
 	sched.curr_thread = calloc(MAX_CPUS, sizeof(*sched.curr_thread));
 	if (!sched.curr_thread) {
 		ret = -ENOMEM;
@@ -3645,8 +3652,6 @@ int cmd_sched(int argc, const char **argv)
 	free(sched.curr_pid);
 	free(sched.cpu_last_switched);
 	free(sched.curr_thread);
-	mutex_destroy(&sched.start_work_mutex);
-	mutex_destroy(&sched.work_done_wait_mutex);
 
 	return ret;
 }
-- 
2.34.1



* [PATCH 2/5] perf sched: Fix memory leak in perf_sched__map()
  2024-02-05 10:46 [PATCH 0/5] perf sched: Minor optimizations for resource initialization Yang Jihong
  2024-02-05 10:46 ` [PATCH 1/5] perf sched: Move start_work_mutex and work_done_wait_mutex initialization to perf_sched__replay() Yang Jihong
@ 2024-02-05 10:46 ` Yang Jihong
  2024-02-05 18:58   ` Arnaldo Carvalho de Melo
  2024-02-05 10:46 ` [PATCH 3/5] perf sched: Move curr_thread initialization to perf_sched__map() Yang Jihong
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 12+ messages in thread
From: Yang Jihong @ 2024-02-05 10:46 UTC (permalink / raw)
  To: peterz, mingo, acme, namhyung, mark.rutland, alexander.shishkin,
	jolsa, irogers, adrian.hunter, linux-perf-users, linux-kernel
  Cc: yangjihong1

perf_sched__map() needs to free the memory of map_cpus, color_pids and
color_cpus on the normal path and roll back the allocated memory on the
error path.

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
---
 tools/perf/builtin-sched.c | 41 ++++++++++++++++++++++++--------------
 1 file changed, 26 insertions(+), 15 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 08dec557e6be..26dbfa4aab61 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -3208,8 +3208,6 @@ static int perf_sched__lat(struct perf_sched *sched)
 
 static int setup_map_cpus(struct perf_sched *sched)
 {
-	struct perf_cpu_map *map;
-
 	sched->max_cpu.cpu  = sysconf(_SC_NPROCESSORS_CONF);
 
 	if (sched->map.comp) {
@@ -3218,16 +3216,15 @@ static int setup_map_cpus(struct perf_sched *sched)
 			return -1;
 	}
 
-	if (!sched->map.cpus_str)
-		return 0;
-
-	map = perf_cpu_map__new(sched->map.cpus_str);
-	if (!map) {
-		pr_err("failed to get cpus map from %s\n", sched->map.cpus_str);
-		return -1;
+	if (sched->map.cpus_str) {
+		sched->map.cpus = perf_cpu_map__new(sched->map.cpus_str);
+		if (!sched->map.cpus) {
+			pr_err("failed to get cpus map from %s\n", sched->map.cpus_str);
+			free(sched->map.comp_cpus);
+			return -1;
+		}
 	}
 
-	sched->map.cpus = map;
 	return 0;
 }
 
@@ -3267,20 +3264,34 @@ static int setup_color_cpus(struct perf_sched *sched)
 
 static int perf_sched__map(struct perf_sched *sched)
 {
+	int rc = -1;
+
 	if (setup_map_cpus(sched))
-		return -1;
+		return rc;
 
 	if (setup_color_pids(sched))
-		return -1;
+		goto out_free_map_cpus;
 
 	if (setup_color_cpus(sched))
-		return -1;
+		goto out_free_color_pids;
 
 	setup_pager();
 	if (perf_sched__read_events(sched))
-		return -1;
+		goto out_free_color_cpus;
+
+	rc = 0;
 	print_bad_events(sched);
-	return 0;
+
+out_free_color_cpus:
+	perf_cpu_map__put(sched->map.color_cpus);
+
+out_free_color_pids:
+	perf_thread_map__put(sched->map.color_pids);
+
+out_free_map_cpus:
+	free(sched->map.comp_cpus);
+	perf_cpu_map__put(sched->map.cpus);
+	return rc;
 }
 
 static int perf_sched__replay(struct perf_sched *sched)
-- 
2.34.1



* [PATCH 3/5] perf sched: Move curr_thread initialization to perf_sched__map()
  2024-02-05 10:46 [PATCH 0/5] perf sched: Minor optimizations for resource initialization Yang Jihong
  2024-02-05 10:46 ` [PATCH 1/5] perf sched: Move start_work_mutex and work_done_wait_mutex initialization to perf_sched__replay() Yang Jihong
  2024-02-05 10:46 ` [PATCH 2/5] perf sched: Fix memory leak in perf_sched__map() Yang Jihong
@ 2024-02-05 10:46 ` Yang Jihong
  2024-02-05 18:59   ` Arnaldo Carvalho de Melo
  2024-02-05 10:46 ` [PATCH 4/5] perf sched: Move curr_pid and cpu_last_switched initialization to perf_sched__{lat|map|replay}() Yang Jihong
  2024-02-05 10:46 ` [PATCH 5/5] perf thread_map: Free strlist on normal path in thread_map__new_by_tid_str() Yang Jihong
  4 siblings, 1 reply; 12+ messages in thread
From: Yang Jihong @ 2024-02-05 10:46 UTC (permalink / raw)
  To: peterz, mingo, acme, namhyung, mark.rutland, alexander.shishkin,
	jolsa, irogers, adrian.hunter, linux-perf-users, linux-kernel
  Cc: yangjihong1

curr_thread is used only by 'perf sched map'. Move its initialization into
perf_sched__map() to avoid unnecessary work in the other subcommands.

Simple functional testing:

  # perf sched record perf bench sched messaging
  # Running 'sched/messaging' benchmark:
  # 20 sender and receiver processes per group
  # 10 groups == 400 processes run

       Total time: 0.197 [sec]
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 15.526 MB perf.data (140095 samples) ]

  # perf sched map
    *A0                                                               451264.532445 secs A0 => migration/0:15
    *.                                                                451264.532468 secs .  => swapper:0
     .  *B0                                                           451264.532537 secs B0 => migration/1:21
     .  *.                                                            451264.532560 secs
     .   .  *C0                                                       451264.532644 secs C0 => migration/2:27
     .   .  *.                                                        451264.532668 secs
     .   .   .  *D0                                                   451264.532753 secs D0 => migration/3:33
     .   .   .  *.                                                    451264.532778 secs
     .   .   .   .  *E0                                               451264.532861 secs E0 => migration/4:39
     .   .   .   .  *.                                                451264.532886 secs
     .   .   .   .   .  *F0                                           451264.532973 secs F0 => migration/5:45
  <SNIP>
     A7  A7  A7  A7  A7 *A7  .   .   .   .   .   .   .   .   .   .    451264.790785 secs
     A7  A7  A7  A7  A7  A7 *A7  .   .   .   .   .   .   .   .   .    451264.790858 secs
     A7  A7  A7  A7  A7  A7  A7 *A7  .   .   .   .   .   .   .   .    451264.790934 secs
     A7  A7  A7  A7  A7  A7  A7  A7 *A7  .   .   .   .   .   .   .    451264.791004 secs
     A7  A7  A7  A7  A7  A7  A7  A7  A7 *A7  .   .   .   .   .   .    451264.791075 secs
     A7  A7  A7  A7  A7  A7  A7  A7  A7  A7 *A7  .   .   .   .   .    451264.791143 secs
     A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7 *A7  .   .   .   .    451264.791232 secs
     A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7 *A7  .   .   .    451264.791336 secs
     A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7 *A7  .   .    451264.791407 secs
     A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7 *A7  .    451264.791484 secs
     A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7 *A7   451264.791553 secs
  # echo $?
  0

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
---
 tools/perf/builtin-sched.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 26dbfa4aab61..54d79e560617 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -3266,9 +3266,13 @@ static int perf_sched__map(struct perf_sched *sched)
 {
 	int rc = -1;
 
-	if (setup_map_cpus(sched))
+	sched->curr_thread = calloc(MAX_CPUS, sizeof(*(sched->curr_thread)));
+	if (!sched->curr_thread)
 		return rc;
 
+	if (setup_map_cpus(sched))
+		goto out_free_curr_thread;
+
 	if (setup_color_pids(sched))
 		goto out_free_map_cpus;
 
@@ -3291,6 +3295,9 @@ static int perf_sched__map(struct perf_sched *sched)
 out_free_map_cpus:
 	free(sched->map.comp_cpus);
 	perf_cpu_map__put(sched->map.cpus);
+
+out_free_curr_thread:
+	free(sched->curr_thread);
 	return rc;
 }
 
@@ -3576,11 +3583,6 @@ int cmd_sched(int argc, const char **argv)
 	unsigned int i;
 	int ret = 0;
 
-	sched.curr_thread = calloc(MAX_CPUS, sizeof(*sched.curr_thread));
-	if (!sched.curr_thread) {
-		ret = -ENOMEM;
-		goto out;
-	}
 	sched.cpu_last_switched = calloc(MAX_CPUS, sizeof(*sched.cpu_last_switched));
 	if (!sched.cpu_last_switched) {
 		ret = -ENOMEM;
@@ -3662,7 +3664,6 @@ int cmd_sched(int argc, const char **argv)
 out:
 	free(sched.curr_pid);
 	free(sched.cpu_last_switched);
-	free(sched.curr_thread);
 
 	return ret;
 }
-- 
2.34.1



* [PATCH 4/5] perf sched: Move curr_pid and cpu_last_switched initialization to perf_sched__{lat|map|replay}()
  2024-02-05 10:46 [PATCH 0/5] perf sched: Minor optimizations for resource initialization Yang Jihong
                   ` (2 preceding siblings ...)
  2024-02-05 10:46 ` [PATCH 3/5] perf sched: Move curr_thread initialization to perf_sched__map() Yang Jihong
@ 2024-02-05 10:46 ` Yang Jihong
  2024-02-05 19:01   ` Arnaldo Carvalho de Melo
  2024-02-05 10:46 ` [PATCH 5/5] perf thread_map: Free strlist on normal path in thread_map__new_by_tid_str() Yang Jihong
  4 siblings, 1 reply; 12+ messages in thread
From: Yang Jihong @ 2024-02-05 10:46 UTC (permalink / raw)
  To: peterz, mingo, acme, namhyung, mark.rutland, alexander.shishkin,
	jolsa, irogers, adrian.hunter, linux-perf-users, linux-kernel
  Cc: yangjihong1

curr_pid and cpu_last_switched are used only by
'perf sched replay/latency/map'. Move their initialization into
perf_sched__{lat|map|replay}() to avoid unnecessary work in the other
subcommands.

Simple functional testing:

  # perf sched record perf bench sched messaging
  # Running 'sched/messaging' benchmark:
  # 20 sender and receiver processes per group
  # 10 groups == 400 processes run

       Total time: 0.209 [sec]
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 16.456 MB perf.data (147907 samples) ]

  # perf sched lat

   -------------------------------------------------------------------------------------------------------------------------------------------
    Task                  |   Runtime ms  | Switches | Avg delay ms    | Max delay ms    | Max delay start           | Max delay end          |
   -------------------------------------------------------------------------------------------------------------------------------------------
    sched-messaging:(401) |   2990.699 ms |    38705 | avg:   0.661 ms | max:  67.046 ms | max start: 456532.624830 s | max end: 456532.691876 s
    qemu-system-x86:(7)   |    179.764 ms |     2191 | avg:   0.152 ms | max:  21.857 ms | max start: 456532.576434 s | max end: 456532.598291 s
    sshd:48125            |      0.522 ms |        2 | avg:   0.037 ms | max:   0.046 ms | max start: 456532.514610 s | max end: 456532.514656 s
  <SNIP>
    ksoftirqd/11:82       |      0.063 ms |        1 | avg:   0.005 ms | max:   0.005 ms | max start: 456532.769366 s | max end: 456532.769371 s
    kworker/9:0-mm_:34624 |      0.233 ms |       20 | avg:   0.004 ms | max:   0.007 ms | max start: 456532.690804 s | max end: 456532.690812 s
    migration/13:93       |      0.000 ms |        1 | avg:   0.004 ms | max:   0.004 ms | max start: 456532.512669 s | max end: 456532.512674 s
   -----------------------------------------------------------------------------------------------------------------
    TOTAL:                |   3180.750 ms |    41368 |
   ---------------------------------------------------

  # echo $?
  0

  # perf sched map
    *A0                                                               456532.510141 secs A0 => migration/0:15
    *.                                                                456532.510171 secs .  => swapper:0
     .  *B0                                                           456532.510261 secs B0 => migration/1:21
     .  *.                                                            456532.510279 secs
  <SNIP>
     L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7 *L7  .   .   .   .    456532.785979 secs
     L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7 *L7  .   .   .    456532.786054 secs
     L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7 *L7  .   .    456532.786127 secs
     L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7 *L7  .    456532.786197 secs
     L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7 *L7   456532.786270 secs
  # echo $?
  0

  # perf sched replay
  run measurement overhead: 108 nsecs
  sleep measurement overhead: 66473 nsecs
  the run test took 1000002 nsecs
  the sleep test took 1082686 nsecs
  nr_run_events:        49334
  nr_sleep_events:      50054
  nr_wakeup_events:     34701
  target-less wakeups:  165
  multi-target wakeups: 766
  task      0 (             swapper:         0), nr_events: 15419
  task      1 (             swapper:         1), nr_events: 1
  task      2 (             swapper:         2), nr_events: 1
  <SNIP>
  task    715 (     sched-messaging:    110248), nr_events: 1438
  task    716 (     sched-messaging:    110249), nr_events: 512
  task    717 (     sched-messaging:    110250), nr_events: 500
  task    718 (     sched-messaging:    110251), nr_events: 537
  task    719 (     sched-messaging:    110252), nr_events: 823
  ------------------------------------------------------------
  #1  : 1325.288, ravg: 1325.29, cpu: 7823.35 / 7823.35
  #2  : 1363.606, ravg: 1329.12, cpu: 7655.53 / 7806.56
  #3  : 1349.494, ravg: 1331.16, cpu: 7544.80 / 7780.39
  #4  : 1311.488, ravg: 1329.19, cpu: 7495.13 / 7751.86
  #5  : 1309.902, ravg: 1327.26, cpu: 7266.65 / 7703.34
  #6  : 1309.535, ravg: 1325.49, cpu: 7843.86 / 7717.39
  #7  : 1316.482, ravg: 1324.59, cpu: 7854.41 / 7731.09
  #8  : 1366.604, ravg: 1328.79, cpu: 7955.81 / 7753.57
  #9  : 1326.286, ravg: 1328.54, cpu: 7466.86 / 7724.90
  #10 : 1356.653, ravg: 1331.35, cpu: 7566.60 / 7709.07
  # echo $?
  0

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
---
 tools/perf/builtin-sched.c | 94 +++++++++++++++++++++++++-------------
 1 file changed, 61 insertions(+), 33 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 54d79e560617..af76366ec798 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -3167,14 +3167,44 @@ static void perf_sched__merge_lat(struct perf_sched *sched)
 	}
 }
 
+static int setup_cpus_switch_event(struct perf_sched *sched)
+{
+	unsigned int i;
+
+	sched->cpu_last_switched = calloc(MAX_CPUS, sizeof(*(sched->cpu_last_switched)));
+	if (!sched->cpu_last_switched)
+		return -1;
+
+	sched->curr_pid = malloc(MAX_CPUS * sizeof(*(sched->curr_pid)));
+	if (!sched->curr_pid) {
+		free(sched->cpu_last_switched);
+		return -1;
+	}
+
+	for (i = 0; i < MAX_CPUS; i++)
+		sched->curr_pid[i] = -1;
+
+	return 0;
+}
+
+static void free_cpus_switch_event(struct perf_sched *sched)
+{
+	free(sched->curr_pid);
+	free(sched->cpu_last_switched);
+}
+
 static int perf_sched__lat(struct perf_sched *sched)
 {
+	int rc = -1;
 	struct rb_node *next;
 
 	setup_pager();
 
+	if (setup_cpus_switch_event(sched))
+		return rc;
+
 	if (perf_sched__read_events(sched))
-		return -1;
+		goto out_free_cpus_switch_event;
 
 	perf_sched__merge_lat(sched);
 	perf_sched__sort_lat(sched);
@@ -3203,7 +3233,11 @@ static int perf_sched__lat(struct perf_sched *sched)
 	print_bad_events(sched);
 	printf("\n");
 
-	return 0;
+	rc = 0;
+
+out_free_cpus_switch_event:
+	free_cpus_switch_event(sched);
+	return rc;
 }
 
 static int setup_map_cpus(struct perf_sched *sched)
@@ -3270,9 +3304,12 @@ static int perf_sched__map(struct perf_sched *sched)
 	if (!sched->curr_thread)
 		return rc;
 
-	if (setup_map_cpus(sched))
+	if (setup_cpus_switch_event(sched))
 		goto out_free_curr_thread;
 
+	if (setup_map_cpus(sched))
+		goto out_free_cpus_switch_event;
+
 	if (setup_color_pids(sched))
 		goto out_free_map_cpus;
 
@@ -3296,6 +3333,9 @@ static int perf_sched__map(struct perf_sched *sched)
 	free(sched->map.comp_cpus);
 	perf_cpu_map__put(sched->map.cpus);
 
+out_free_cpus_switch_event:
+	free_cpus_switch_event(sched);
+
 out_free_curr_thread:
 	free(sched->curr_thread);
 	return rc;
@@ -3309,6 +3349,10 @@ static int perf_sched__replay(struct perf_sched *sched)
 	mutex_init(&sched->start_work_mutex);
 	mutex_init(&sched->work_done_wait_mutex);
 
+	ret = setup_cpus_switch_event(sched);
+	if (ret)
+		goto out_mutex_destroy;
+
 	calibrate_run_measurement_overhead(sched);
 	calibrate_sleep_measurement_overhead(sched);
 
@@ -3316,7 +3360,7 @@ static int perf_sched__replay(struct perf_sched *sched)
 
 	ret = perf_sched__read_events(sched);
 	if (ret)
-		goto out_mutex_destroy;
+		goto out_free_cpus_switch_event;
 
 	printf("nr_run_events:        %ld\n", sched->nr_run_events);
 	printf("nr_sleep_events:      %ld\n", sched->nr_sleep_events);
@@ -3342,6 +3386,9 @@ static int perf_sched__replay(struct perf_sched *sched)
 	sched->thread_funcs_exit = true;
 	destroy_tasks(sched);
 
+out_free_cpus_switch_event:
+	free_cpus_switch_event(sched);
+
 out_mutex_destroy:
 	mutex_destroy(&sched->start_work_mutex);
 	mutex_destroy(&sched->work_done_wait_mutex);
@@ -3580,21 +3627,7 @@ int cmd_sched(int argc, const char **argv)
 		.switch_event	    = replay_switch_event,
 		.fork_event	    = replay_fork_event,
 	};
-	unsigned int i;
-	int ret = 0;
-
-	sched.cpu_last_switched = calloc(MAX_CPUS, sizeof(*sched.cpu_last_switched));
-	if (!sched.cpu_last_switched) {
-		ret = -ENOMEM;
-		goto out;
-	}
-	sched.curr_pid = malloc(MAX_CPUS * sizeof(*sched.curr_pid));
-	if (!sched.curr_pid) {
-		ret = -ENOMEM;
-		goto out;
-	}
-	for (i = 0; i < MAX_CPUS; i++)
-		sched.curr_pid[i] = -1;
+	int ret;
 
 	argc = parse_options_subcommand(argc, argv, sched_options, sched_subcommands,
 					sched_usage, PARSE_OPT_STOP_AT_NON_OPTION);
@@ -3605,9 +3638,9 @@ int cmd_sched(int argc, const char **argv)
 	 * Aliased to 'perf script' for now:
 	 */
 	if (!strcmp(argv[0], "script")) {
-		ret = cmd_script(argc, argv);
+		return cmd_script(argc, argv);
 	} else if (strlen(argv[0]) > 2 && strstarts("record", argv[0])) {
-		ret = __cmd_record(argc, argv);
+		return __cmd_record(argc, argv);
 	} else if (strlen(argv[0]) > 2 && strstarts("latency", argv[0])) {
 		sched.tp_handler = &lat_ops;
 		if (argc > 1) {
@@ -3616,7 +3649,7 @@ int cmd_sched(int argc, const char **argv)
 				usage_with_options(latency_usage, latency_options);
 		}
 		setup_sorting(&sched, latency_options, latency_usage);
-		ret = perf_sched__lat(&sched);
+		return perf_sched__lat(&sched);
 	} else if (!strcmp(argv[0], "map")) {
 		if (argc) {
 			argc = parse_options(argc, argv, map_options, map_usage, 0);
@@ -3625,7 +3658,7 @@ int cmd_sched(int argc, const char **argv)
 		}
 		sched.tp_handler = &map_ops;
 		setup_sorting(&sched, latency_options, latency_usage);
-		ret = perf_sched__map(&sched);
+		return perf_sched__map(&sched);
 	} else if (strlen(argv[0]) > 2 && strstarts("replay", argv[0])) {
 		sched.tp_handler = &replay_ops;
 		if (argc) {
@@ -3633,7 +3666,7 @@ int cmd_sched(int argc, const char **argv)
 			if (argc)
 				usage_with_options(replay_usage, replay_options);
 		}
-		ret = perf_sched__replay(&sched);
+		return perf_sched__replay(&sched);
 	} else if (!strcmp(argv[0], "timehist")) {
 		if (argc) {
 			argc = parse_options(argc, argv, timehist_options,
@@ -3649,21 +3682,16 @@ int cmd_sched(int argc, const char **argv)
 				parse_options_usage(NULL, timehist_options, "w", true);
 			if (sched.show_next)
 				parse_options_usage(NULL, timehist_options, "n", true);
-			ret = -EINVAL;
-			goto out;
+			return -EINVAL;
 		}
 		ret = symbol__validate_sym_arguments();
 		if (ret)
-			goto out;
+			return ret;
 
-		ret = perf_sched__timehist(&sched);
+		return perf_sched__timehist(&sched);
 	} else {
 		usage_with_options(sched_usage, sched_options);
 	}
 
-out:
-	free(sched.curr_pid);
-	free(sched.cpu_last_switched);
-
-	return ret;
+	return 0;
 }
-- 
2.34.1
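
The series above replaces the unconditional calloc()/malloc() in cmd_sched() with per-subcommand setup helpers. One detail worth illustrating: the timestamp array can be calloc'd (zero is a sane "never switched" value), but the pid array must start at -1, since pid 0 is a real task (swapper), so it is malloc'd and filled by hand. A minimal standalone sketch of that pattern (MAX_CPUS value and helper name are illustrative, not taken from builtin-sched.c):

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_CPUS 4096	/* illustrative; builtin-sched.c defines its own */

/* Sketch of the per-CPU pid array initialization moved by this series:
 * calloc() zero-fills, which would be ambiguous here because 0 is a
 * valid pid (swapper), so the array is malloc'd and set to -1
 * ("no task seen yet on this CPU") explicitly. */
static int *alloc_curr_pid(void)
{
	int *curr_pid = malloc(MAX_CPUS * sizeof(*curr_pid));
	unsigned int i;

	if (!curr_pid)
		return NULL;

	for (i = 0; i < MAX_CPUS; i++)
		curr_pid[i] = -1;

	return curr_pid;
}
```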



* [PATCH 5/5] perf thread_map: Free strlist on normal path in thread_map__new_by_tid_str()
  2024-02-05 10:46 [PATCH 0/5] perf sched: Minor optimizations for resource initialization Yang Jihong
                   ` (3 preceding siblings ...)
  2024-02-05 10:46 ` [PATCH 4/5] perf sched: Move curr_pid and cpu_last_switched initialization to perf_sched__{lat|map|replay}() Yang Jihong
@ 2024-02-05 10:46 ` Yang Jihong
  2024-02-05 19:09   ` Arnaldo Carvalho de Melo
  4 siblings, 1 reply; 12+ messages in thread
From: Yang Jihong @ 2024-02-05 10:46 UTC (permalink / raw)
  To: peterz, mingo, acme, namhyung, mark.rutland, alexander.shishkin,
	jolsa, irogers, adrian.hunter, linux-perf-users, linux-kernel
  Cc: yangjihong1

slist needs to be freed on both the error path and the normal path in
thread_map__new_by_tid_str().

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
---
 tools/perf/util/thread_map.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/thread_map.c b/tools/perf/util/thread_map.c
index e848579e61a8..ea3b431b9783 100644
--- a/tools/perf/util/thread_map.c
+++ b/tools/perf/util/thread_map.c
@@ -280,13 +280,13 @@ struct perf_thread_map *thread_map__new_by_tid_str(const char *tid_str)
 		threads->nr = ntasks;
 	}
 out:
+	strlist__delete(slist);
 	if (threads)
 		refcount_set(&threads->refcnt, 1);
 	return threads;
 
 out_free_threads:
 	zfree(&threads);
-	strlist__delete(slist);
 	goto out;
 }
 
-- 
2.34.1



* Re: [PATCH 2/5] perf sched: Fix memory leak in perf_sched__map()
  2024-02-05 10:46 ` [PATCH 2/5] perf sched: Fix memory leak in perf_sched__map() Yang Jihong
@ 2024-02-05 18:58   ` Arnaldo Carvalho de Melo
  2024-02-06  7:08     ` Yang Jihong
  0 siblings, 1 reply; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2024-02-05 18:58 UTC (permalink / raw)
  To: Yang Jihong
  Cc: peterz, mingo, namhyung, mark.rutland, alexander.shishkin, jolsa,
	irogers, adrian.hunter, linux-perf-users, linux-kernel

On Mon, Feb 05, 2024 at 10:46:13AM +0000, Yang Jihong wrote:
> +++ b/tools/perf/builtin-sched.c
> @@ -3267,20 +3264,34 @@ static int setup_color_cpus(struct perf_sched *sched)
>  
>  static int perf_sched__map(struct perf_sched *sched)
>  {
> +	int rc = -1;
> +
>  	if (setup_map_cpus(sched))
> -		return -1;
> +		return rc;
>  
>  	if (setup_color_pids(sched))
> -		return -1;
> +		goto out_free_map_cpus;

I think renaming the goto labels to what they will do, dropping a
refcount, is more clear, i.e.:

		goto out_put_map_cpus;

>  
>  	if (setup_color_cpus(sched))
> -		return -1;
> +		goto out_free_color_pids;
>  
>  	setup_pager();
>  	if (perf_sched__read_events(sched))
> -		return -1;
> +		goto out_free_color_cpus;
> +
> +	rc = 0;
>  	print_bad_events(sched);
> -	return 0;
> +
> +out_free_color_cpus:
> +	perf_cpu_map__put(sched->map.color_cpus);
> +
> +out_free_color_pids:
> +	perf_thread_map__put(sched->map.color_pids);
> +
> +out_free_map_cpus:
> +	free(sched->map.comp_cpus);

Please use:

	zfree(&sched->map.comp_cpus);

> +	perf_cpu_map__put(sched->map.cpus);
> +	return rc;
>  }
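
A minimal standalone sketch of the unwind pattern under discussion, with hypothetical resources (in the patch these are cpu maps and thread maps released with the __put() helpers and zfree()). Note that on success the code falls through the same labels, so every resource is released on every path:

```c
#include <assert.h>
#include <stdlib.h>

/* Sketch of reverse-order goto unwinding: each failure jumps to the
 * label that releases everything acquired so far, and the labels are
 * stacked so later failures fall through the earlier cleanups. */
static int do_setup(void)
{
	int rc = -1;
	char *map_cpus, *color_pids, *color_cpus;

	map_cpus = malloc(16);
	if (!map_cpus)
		return rc;

	color_pids = malloc(16);
	if (!color_pids)
		goto out_free_map_cpus;

	color_cpus = malloc(16);
	if (!color_cpus)
		goto out_free_color_pids;

	rc = 0;			/* success: fall through the cleanup */

	free(color_cpus);
out_free_color_pids:
	free(color_pids);
out_free_map_cpus:
	free(map_cpus);
	return rc;
}
```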


* Re: [PATCH 3/5] perf sched: Move curr_thread initialization to perf_sched__map()
  2024-02-05 10:46 ` [PATCH 3/5] perf sched: Move curr_thread initialization to perf_sched__map() Yang Jihong
@ 2024-02-05 18:59   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2024-02-05 18:59 UTC (permalink / raw)
  To: Yang Jihong
  Cc: peterz, mingo, namhyung, mark.rutland, alexander.shishkin, jolsa,
	irogers, adrian.hunter, linux-perf-users, linux-kernel

On Mon, Feb 05, 2024 at 10:46:14AM +0000, Yang Jihong wrote:
> The curr_thread array is used only by 'perf sched map'. Move its
> initialization into perf_sched__map() to avoid unnecessary work in the
> other subcommands.
> 
> Simple functional testing:
> 
>   # perf sched record perf bench sched messaging
>   # Running 'sched/messaging' benchmark:
>   # 20 sender and receiver processes per group
>   # 10 groups == 400 processes run
> 
>        Total time: 0.197 [sec]
>   [ perf record: Woken up 1 times to write data ]
>   [ perf record: Captured and wrote 15.526 MB perf.data (140095 samples) ]
> 
>   # perf sched map
>     *A0                                                               451264.532445 secs A0 => migration/0:15
>     *.                                                                451264.532468 secs .  => swapper:0
>      .  *B0                                                           451264.532537 secs B0 => migration/1:21
>      .  *.                                                            451264.532560 secs
>      .   .  *C0                                                       451264.532644 secs C0 => migration/2:27
>      .   .  *.                                                        451264.532668 secs
>      .   .   .  *D0                                                   451264.532753 secs D0 => migration/3:33
>      .   .   .  *.                                                    451264.532778 secs
>      .   .   .   .  *E0                                               451264.532861 secs E0 => migration/4:39
>      .   .   .   .  *.                                                451264.532886 secs
>      .   .   .   .   .  *F0                                           451264.532973 secs F0 => migration/5:45
>   <SNIP>
>      A7  A7  A7  A7  A7 *A7  .   .   .   .   .   .   .   .   .   .    451264.790785 secs
>      A7  A7  A7  A7  A7  A7 *A7  .   .   .   .   .   .   .   .   .    451264.790858 secs
>      A7  A7  A7  A7  A7  A7  A7 *A7  .   .   .   .   .   .   .   .    451264.790934 secs
>      A7  A7  A7  A7  A7  A7  A7  A7 *A7  .   .   .   .   .   .   .    451264.791004 secs
>      A7  A7  A7  A7  A7  A7  A7  A7  A7 *A7  .   .   .   .   .   .    451264.791075 secs
>      A7  A7  A7  A7  A7  A7  A7  A7  A7  A7 *A7  .   .   .   .   .    451264.791143 secs
>      A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7 *A7  .   .   .   .    451264.791232 secs
>      A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7 *A7  .   .   .    451264.791336 secs
>      A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7 *A7  .   .    451264.791407 secs
>      A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7 *A7  .    451264.791484 secs
>      A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7 *A7   451264.791553 secs
>   # echo $?
>   0
> 
> Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
> ---
>  tools/perf/builtin-sched.c | 15 ++++++++-------
>  1 file changed, 8 insertions(+), 7 deletions(-)
> 
> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
> index 26dbfa4aab61..54d79e560617 100644
> --- a/tools/perf/builtin-sched.c
> +++ b/tools/perf/builtin-sched.c
> @@ -3266,9 +3266,13 @@ static int perf_sched__map(struct perf_sched *sched)
>  {
>  	int rc = -1;
>  
> -	if (setup_map_cpus(sched))
> +	sched->curr_thread = calloc(MAX_CPUS, sizeof(*(sched->curr_thread)));
> +	if (!sched->curr_thread)
>  		return rc;
>  
> +	if (setup_map_cpus(sched))
> +		goto out_free_curr_thread;
> +
>  	if (setup_color_pids(sched))
>  		goto out_free_map_cpus;
>  
> @@ -3291,6 +3295,9 @@ static int perf_sched__map(struct perf_sched *sched)
>  out_free_map_cpus:
>  	free(sched->map.comp_cpus);
>  	perf_cpu_map__put(sched->map.cpus);
> +
> +out_free_curr_thread:
> +	free(sched->curr_thread);

	zfree(&sched->curr_thread);

>  	return rc;
>  }
>  
> @@ -3576,11 +3583,6 @@ int cmd_sched(int argc, const char **argv)
>  	unsigned int i;
>  	int ret = 0;
>  
> -	sched.curr_thread = calloc(MAX_CPUS, sizeof(*sched.curr_thread));
> -	if (!sched.curr_thread) {
> -		ret = -ENOMEM;
> -		goto out;
> -	}
>  	sched.cpu_last_switched = calloc(MAX_CPUS, sizeof(*sched.cpu_last_switched));
>  	if (!sched.cpu_last_switched) {
>  		ret = -ENOMEM;
> @@ -3662,7 +3664,6 @@ int cmd_sched(int argc, const char **argv)
>  out:
>  	free(sched.curr_pid);
>  	free(sched.cpu_last_switched);
> -	free(sched.curr_thread);
>  
>  	return ret;
>  }
> -- 
> 2.34.1


* Re: [PATCH 4/5] perf sched: Move curr_pid and cpu_last_switched initialization to perf_sched__{lat|map|replay}()
  2024-02-05 10:46 ` [PATCH 4/5] perf sched: Move curr_pid and cpu_last_switched initialization to perf_sched__{lat|map|replay}() Yang Jihong
@ 2024-02-05 19:01   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2024-02-05 19:01 UTC (permalink / raw)
  To: Yang Jihong
  Cc: peterz, mingo, namhyung, mark.rutland, alexander.shishkin, jolsa,
	irogers, adrian.hunter, linux-perf-users, linux-kernel

On Mon, Feb 05, 2024 at 10:46:15AM +0000, Yang Jihong wrote:
> +static int setup_cpus_switch_event(struct perf_sched *sched)
> +{
> +	unsigned int i;
> +
> +	sched->cpu_last_switched = calloc(MAX_CPUS, sizeof(*(sched->cpu_last_switched)));
> +	if (!sched->cpu_last_switched)
> +		return -1;
> +
> +	sched->curr_pid = malloc(MAX_CPUS * sizeof(*(sched->curr_pid)));
> +	if (!sched->curr_pid) {
> +		free(sched->cpu_last_switched);

		zfree(&sched->cpu_last_switched);

> +		return -1;
> +	}
> +
> +	for (i = 0; i < MAX_CPUS; i++)
> +		sched->curr_pid[i] = -1;
> +
> +	return 0;
> +}
> +
> +static void free_cpus_switch_event(struct perf_sched *sched)
> +{
> +	free(sched->curr_pid);
> +	free(sched->cpu_last_switched);

	zfree(&sched->curr_pid);
	zfree(&sched->cpu_last_switched);

> +}
> +
>  static int perf_sched__lat(struct perf_sched *sched)
>  {
> +	int rc = -1;
>  	struct rb_node *next;
>  
>  	setup_pager();
>  
> +	if (setup_cpus_switch_event(sched))
> +		return rc;
> +
>  	if (perf_sched__read_events(sched))
> -		return -1;
> +		goto out_free_cpus_switch_event;
>  
>  	perf_sched__merge_lat(sched);
>  	perf_sched__sort_lat(sched);
> @@ -3203,7 +3233,11 @@ static int perf_sched__lat(struct perf_sched *sched)
>  	print_bad_events(sched);
>  	printf("\n");
>  
> -	return 0;
> +	rc = 0;
> +
> +out_free_cpus_switch_event:
> +	free_cpus_switch_event(sched);
> +	return rc;
>  }
>  
>  static int setup_map_cpus(struct perf_sched *sched)
> @@ -3270,9 +3304,12 @@ static int perf_sched__map(struct perf_sched *sched)
>  	if (!sched->curr_thread)
>  		return rc;
>  
> -	if (setup_map_cpus(sched))
> +	if (setup_cpus_switch_event(sched))
>  		goto out_free_curr_thread;
>  
> +	if (setup_map_cpus(sched))
> +		goto out_free_cpus_switch_event;
> +
>  	if (setup_color_pids(sched))
>  		goto out_free_map_cpus;
>  
> @@ -3296,6 +3333,9 @@ static int perf_sched__map(struct perf_sched *sched)
>  	free(sched->map.comp_cpus);
>  	perf_cpu_map__put(sched->map.cpus);
>  
> +out_free_cpus_switch_event:
> +	free_cpus_switch_event(sched);
> +
>  out_free_curr_thread:
>  	free(sched->curr_thread);
>  	return rc;
> @@ -3309,6 +3349,10 @@ static int perf_sched__replay(struct perf_sched *sched)
>  	mutex_init(&sched->start_work_mutex);
>  	mutex_init(&sched->work_done_wait_mutex);
>  
> +	ret = setup_cpus_switch_event(sched);
> +	if (ret)
> +		goto out_mutex_destroy;
> +
>  	calibrate_run_measurement_overhead(sched);
>  	calibrate_sleep_measurement_overhead(sched);
>  
> @@ -3316,7 +3360,7 @@ static int perf_sched__replay(struct perf_sched *sched)
>  
>  	ret = perf_sched__read_events(sched);
>  	if (ret)
> -		goto out_mutex_destroy;
> +		goto out_free_cpus_switch_event;
>  
>  	printf("nr_run_events:        %ld\n", sched->nr_run_events);
>  	printf("nr_sleep_events:      %ld\n", sched->nr_sleep_events);
> @@ -3342,6 +3386,9 @@ static int perf_sched__replay(struct perf_sched *sched)
>  	sched->thread_funcs_exit = true;
>  	destroy_tasks(sched);
>  
> +out_free_cpus_switch_event:
> +	free_cpus_switch_event(sched);
> +
>  out_mutex_destroy:
>  	mutex_destroy(&sched->start_work_mutex);
>  	mutex_destroy(&sched->work_done_wait_mutex);
> @@ -3580,21 +3627,7 @@ int cmd_sched(int argc, const char **argv)
>  		.switch_event	    = replay_switch_event,
>  		.fork_event	    = replay_fork_event,
>  	};
> -	unsigned int i;
> -	int ret = 0;
> -
> -	sched.cpu_last_switched = calloc(MAX_CPUS, sizeof(*sched.cpu_last_switched));
> -	if (!sched.cpu_last_switched) {
> -		ret = -ENOMEM;
> -		goto out;
> -	}
> -	sched.curr_pid = malloc(MAX_CPUS * sizeof(*sched.curr_pid));
> -	if (!sched.curr_pid) {
> -		ret = -ENOMEM;
> -		goto out;
> -	}
> -	for (i = 0; i < MAX_CPUS; i++)
> -		sched.curr_pid[i] = -1;
> +	int ret;
>  
>  	argc = parse_options_subcommand(argc, argv, sched_options, sched_subcommands,
>  					sched_usage, PARSE_OPT_STOP_AT_NON_OPTION);
> @@ -3605,9 +3638,9 @@ int cmd_sched(int argc, const char **argv)
>  	 * Aliased to 'perf script' for now:
>  	 */
>  	if (!strcmp(argv[0], "script")) {
> -		ret = cmd_script(argc, argv);
> +		return cmd_script(argc, argv);
>  	} else if (strlen(argv[0]) > 2 && strstarts("record", argv[0])) {
> -		ret = __cmd_record(argc, argv);
> +		return __cmd_record(argc, argv);
>  	} else if (strlen(argv[0]) > 2 && strstarts("latency", argv[0])) {
>  		sched.tp_handler = &lat_ops;
>  		if (argc > 1) {
> @@ -3616,7 +3649,7 @@ int cmd_sched(int argc, const char **argv)
>  				usage_with_options(latency_usage, latency_options);
>  		}
>  		setup_sorting(&sched, latency_options, latency_usage);
> -		ret = perf_sched__lat(&sched);
> +		return perf_sched__lat(&sched);
>  	} else if (!strcmp(argv[0], "map")) {
>  		if (argc) {
>  			argc = parse_options(argc, argv, map_options, map_usage, 0);
> @@ -3625,7 +3658,7 @@ int cmd_sched(int argc, const char **argv)
>  		}
>  		sched.tp_handler = &map_ops;
>  		setup_sorting(&sched, latency_options, latency_usage);
> -		ret = perf_sched__map(&sched);
> +		return perf_sched__map(&sched);
>  	} else if (strlen(argv[0]) > 2 && strstarts("replay", argv[0])) {
>  		sched.tp_handler = &replay_ops;
>  		if (argc) {
> @@ -3633,7 +3666,7 @@ int cmd_sched(int argc, const char **argv)
>  			if (argc)
>  				usage_with_options(replay_usage, replay_options);
>  		}
> -		ret = perf_sched__replay(&sched);
> +		return perf_sched__replay(&sched);
>  	} else if (!strcmp(argv[0], "timehist")) {
>  		if (argc) {
>  			argc = parse_options(argc, argv, timehist_options,
> @@ -3649,21 +3682,16 @@ int cmd_sched(int argc, const char **argv)
>  				parse_options_usage(NULL, timehist_options, "w", true);
>  			if (sched.show_next)
>  				parse_options_usage(NULL, timehist_options, "n", true);
> -			ret = -EINVAL;
> -			goto out;
> +			return -EINVAL;
>  		}
>  		ret = symbol__validate_sym_arguments();
>  		if (ret)
> -			goto out;
> +			return ret;
>  
> -		ret = perf_sched__timehist(&sched);
> +		return perf_sched__timehist(&sched);
>  	} else {
>  		usage_with_options(sched_usage, sched_options);
>  	}
>  
> -out:
> -	free(sched.curr_pid);
> -	free(sched.cpu_last_switched);
> -
> -	return ret;
> +	return 0;
>  }
> -- 
> 2.34.1
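
For reference, a minimal stand-in for the zfree() helper requested throughout this review (the real helper lives in the kernel's tools/ headers; this sketch only mirrors its behavior): it frees the pointee and NULLs the pointer in one step, so a stale pointer can never be dereferenced or double-freed on a later error path.

```c
#include <assert.h>
#include <stdlib.h>

/* Sketch of the zfree() idiom: take the ADDRESS of the pointer so the
 * caller's copy is cleared after the free. free(NULL) is a no-op, so a
 * repeated zfree() on the same pointer is harmless. */
static void __zfree(void **ptr)
{
	free(*ptr);
	*ptr = NULL;
}

#define zfree(ptr) __zfree((void **)(ptr))
```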


* Re: [PATCH 5/5] perf thread_map: Free strlist on normal path in thread_map__new_by_tid_str()
  2024-02-05 10:46 ` [PATCH 5/5] perf thread_map: Free strlist on normal path in thread_map__new_by_tid_str() Yang Jihong
@ 2024-02-05 19:09   ` Arnaldo Carvalho de Melo
  2024-02-06  7:10     ` Yang Jihong
  0 siblings, 1 reply; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2024-02-05 19:09 UTC (permalink / raw)
  To: Yang Jihong
  Cc: peterz, mingo, namhyung, mark.rutland, alexander.shishkin, jolsa,
	irogers, adrian.hunter, linux-perf-users, linux-kernel

On Mon, Feb 05, 2024 at 10:46:16AM +0000, Yang Jihong wrote:
> slist needs to be freed on both the error path and the normal path in
> thread_map__new_by_tid_str().

Please add:

Fixes: b52956c961be3a04 ("perf tools: Allow multiple threads or processes in record, stat, top")

Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>

- Arnaldo
 
> Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
> ---
>  tools/perf/util/thread_map.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/thread_map.c b/tools/perf/util/thread_map.c
> index e848579e61a8..ea3b431b9783 100644
> --- a/tools/perf/util/thread_map.c
> +++ b/tools/perf/util/thread_map.c
> @@ -280,13 +280,13 @@ struct perf_thread_map *thread_map__new_by_tid_str(const char *tid_str)
>  		threads->nr = ntasks;
>  	}
>  out:
> +	strlist__delete(slist);
>  	if (threads)
>  		refcount_set(&threads->refcnt, 1);
>  	return threads;
>  
>  out_free_threads:
>  	zfree(&threads);
> -	strlist__delete(slist);
>  	goto out;
>  }
>  
> -- 
> 2.34.1


* Re: [PATCH 2/5] perf sched: Fix memory leak in perf_sched__map()
  2024-02-05 18:58   ` Arnaldo Carvalho de Melo
@ 2024-02-06  7:08     ` Yang Jihong
  0 siblings, 0 replies; 12+ messages in thread
From: Yang Jihong @ 2024-02-06  7:08 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: peterz, mingo, namhyung, mark.rutland, alexander.shishkin, jolsa,
	irogers, adrian.hunter, linux-perf-users, linux-kernel

Hello,

On 2024/2/6 2:58, Arnaldo Carvalho de Melo wrote:
> On Mon, Feb 05, 2024 at 10:46:13AM +0000, Yang Jihong wrote:
>> +++ b/tools/perf/builtin-sched.c
>> @@ -3267,20 +3264,34 @@ static int setup_color_cpus(struct perf_sched *sched)
>>   
>>   static int perf_sched__map(struct perf_sched *sched)
>>   {
>> +	int rc = -1;
>> +
>>   	if (setup_map_cpus(sched))
>> -		return -1;
>> +		return rc;
>>   
>>   	if (setup_color_pids(sched))
>> -		return -1;
>> +		goto out_free_map_cpus;
> 
> I think renaming the goto labels to what they will do, dropping a
> refcount, is more clear, i.e.:
> 
> 		goto out_put_map_cpus;
OK, will modify in v2.
> 
>>   
>>   	if (setup_color_cpus(sched))
>> -		return -1;
>> +		goto out_free_color_pids;
>>   
>>   	setup_pager();
>>   	if (perf_sched__read_events(sched))
>> -		return -1;
>> +		goto out_free_color_cpus;
>> +
>> +	rc = 0;
>>   	print_bad_events(sched);
>> -	return 0;
>> +
>> +out_free_color_cpus:
>> +	perf_cpu_map__put(sched->map.color_cpus);
>> +
>> +out_free_color_pids:
>> +	perf_thread_map__put(sched->map.color_pids);
>> +
>> +out_free_map_cpus:
>> +	free(sched->map.comp_cpus);
> 
> Please use:
> 
> 	zfree(&sched->map.comp_cpus);
> 
OK, will change to zfree() in the next version.
The other patches that use free() will also be switched to zfree().

Thanks,
Yang


* Re: [PATCH 5/5] perf thread_map: Free strlist on normal path in thread_map__new_by_tid_str()
  2024-02-05 19:09   ` Arnaldo Carvalho de Melo
@ 2024-02-06  7:10     ` Yang Jihong
  0 siblings, 0 replies; 12+ messages in thread
From: Yang Jihong @ 2024-02-06  7:10 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: peterz, mingo, namhyung, mark.rutland, alexander.shishkin, jolsa,
	irogers, adrian.hunter, linux-perf-users, linux-kernel

Hello,

On 2024/2/6 3:09, Arnaldo Carvalho de Melo wrote:
> On Mon, Feb 05, 2024 at 10:46:16AM +0000, Yang Jihong wrote:
>> slist needs to be freed in both error path and normal path in
>> thread_map__new_by_tid_str().
> 
> Please add:
> 
> Fixes: b52956c961be3a04 ("perf tools: Allow multiple threads or processes in record, stat, top")
> 
OK, will add the Fixes tag in v2.

> Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> 
Thanks for the Reviewed-by tag.

> - Arnaldo

Thanks,
Yang



Thread overview: 12+ messages
2024-02-05 10:46 [PATCH 0/5] perf sched: Minor optimizations for resource initialization Yang Jihong
2024-02-05 10:46 ` [PATCH 1/5] perf sched: Move start_work_mutex and work_done_wait_mutex initialization to perf_sched__replay() Yang Jihong
2024-02-05 10:46 ` [PATCH 2/5] perf sched: Fix memory leak in perf_sched__map() Yang Jihong
2024-02-05 18:58   ` Arnaldo Carvalho de Melo
2024-02-06  7:08     ` Yang Jihong
2024-02-05 10:46 ` [PATCH 3/5] perf sched: Move curr_thread initialization to perf_sched__map() Yang Jihong
2024-02-05 18:59   ` Arnaldo Carvalho de Melo
2024-02-05 10:46 ` [PATCH 4/5] perf sched: Move curr_pid and cpu_last_switched initialization to perf_sched__{lat|map|replay}() Yang Jihong
2024-02-05 19:01   ` Arnaldo Carvalho de Melo
2024-02-05 10:46 ` [PATCH 5/5] perf thread_map: Free strlist on normal path in thread_map__new_by_tid_str() Yang Jihong
2024-02-05 19:09   ` Arnaldo Carvalho de Melo
2024-02-06  7:10     ` Yang Jihong
