From: "Mi, Dapeng" <dapeng1.mi@linux.intel.com>
To: kernel test robot <oliver.sang@intel.com>
Cc: oe-lkp@lists.linux.dev, lkp@intel.com,
	Kan Liang <kan.liang@linux.intel.com>,
	linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Namhyung Kim <namhyung@kernel.org>,
	Ian Rogers <irogers@google.com>,
	Adrian Hunter <adrian.hunter@intel.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Andi Kleen <ak@linux.intel.com>,
	Eranian Stephane <eranian@google.com>,
	Dapeng Mi <dapeng1.mi@intel.com>
Subject: Re: [Patch v7 02/12] perf/x86/intel: Fix NULL event access and potential PEBS record loss
Date: Sun, 28 Sep 2025 14:00:28 +0800	[thread overview]
Message-ID: <bd48852d-e5d3-4d58-9d71-891a4e31dd5b@linux.intel.com> (raw)
In-Reply-To: <e92a703d-6a92-474c-acba-b15176b97548@linux.intel.com>

[-- Attachment #1: Type: text/plain, Size: 9245 bytes --]

Hi Oliver,

Could you please help validate the attached patch? It should fix this
warning. (Please apply it on top of the whole patch series.)
Thanks.

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 65908880f424..ef32714cb182 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -2821,8 +2821,11 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d
                 * If collision happened, the record will be dropped.
                 */
                if (pebs_status != (1ULL << bit)) {
-                       for_each_set_bit(i, (unsigned long *)&pebs_status, size)
+                       for_each_set_bit(i, (unsigned long *)&pebs_status, size) {
                                error[i]++;
+                               if (error[i] && !events[i])
+                                       events[i] = cpuc->events[i];
+                       }
                        continue;
                }
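
For readers without the full series at hand, here is a rough userspace
sketch of what the added lines are meant to do. It is not the kernel
code: struct fake_event, MAX_COUNTERS, note_collision() and the active[]
parameter (standing in for cpuc->events[]) are hypothetical names used
only for illustration, and the plain loop stands in for
for_each_set_bit(). The point it shows is that every counter flagged in
a colliding record gets its local events[] slot filled, so later code
that sees error[i] != 0 does not find a NULL event.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_COUNTERS 8	/* hypothetical size, for illustration only */

/* Hypothetical stand-in for struct perf_event. */
struct fake_event {
	int id;
};

/*
 * Model of the collision branch for one PEBS record.
 *
 * pebs_status: status bitmask of the record
 * bit:         counter the record is being attributed to
 * active[]:    stand-in for cpuc->events[]
 * events[]:    local per-drain event table (may contain NULL slots)
 * error[]:     per-counter count of dropped (collided) records
 *
 * Returns 1 if the record is a collision and must be dropped, else 0.
 */
static int note_collision(uint64_t pebs_status, unsigned int bit,
			  struct fake_event *active[],
			  struct fake_event *events[],
			  short error[])
{
	unsigned int i;

	assert(bit < MAX_COUNTERS);

	/* Exactly one status bit set: no collision, nothing to record. */
	if (pebs_status == (1ULL << bit))
		return 0;

	for (i = 0; i < MAX_COUNTERS; i++) {
		if (!(pebs_status & (1ULL << i)))
			continue;
		error[i]++;
		/* Same check as the added lines: fill the slot if still NULL. */
		if (error[i] && !events[i])
			events[i] = active[i];
	}
	return 1;
}
```

With a status of 0x5 attributed to counter 0, both counters 0 and 2
would have error[] incremented and events[] filled from active[]; a
status of 0x4 attributed to counter 2 is not a collision and leaves
everything untouched.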


On 9/8/2025 5:05 PM, Mi, Dapeng wrote:
> On 9/8/2025 4:43 PM, kernel test robot wrote:
>> Hello,
>>
>> kernel test robot noticed "WARNING:at_arch/x86/events/intel/ds.c:#intel_pmu_drain_pebs_nhm" on:
>>
>> commit: a7138973beb1d124386472663cf50a571a2059ce ("[Patch v7 02/12] perf/x86/intel: Fix NULL event access and potential PEBS record loss")
>> url: https://github.com/intel-lab-lkp/linux/commits/Dapeng-Mi/perf-x86-Remove-redundant-is_x86_event-prototype/20250828-094117
>> patch link: https://lore.kernel.org/all/20250828013435.1528459-3-dapeng1.mi@linux.intel.com/
>> patch subject: [Patch v7 02/12] perf/x86/intel: Fix NULL event access and potential PEBS record loss
>>
>> in testcase: phoronix-test-suite
>> version: 
>> with following parameters:
>>
>> 	test: stress-ng-1.11.0
>> 	option_a: Socket Activity
>> 	cpufreq_governor: performance
>>
>>
>>
>> config: x86_64-rhel-9.4
>> compiler: gcc-12
>> test machine: 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz (Cascade Lake) with 512G memory
>>
>> (please refer to attached dmesg/kmsg for entire log/backtrace)
>>
>>
>>
>> If you fix the issue in a separate patch/commit (i.e. not just a new version of
>> the same patch/commit), kindly add following tags
>> | Reported-by: kernel test robot <oliver.sang@intel.com>
>> | Closes: https://lore.kernel.org/oe-lkp/202509081646.d101cfb7-lkp@intel.com
>>
>>
>>
>> The kernel config and materials to reproduce are available at:
>> https://download.01.org/0day-ci/archive/20250908/202509081646.d101cfb7-lkp@intel.com
>>
>>
>> the dmesg in above link is not very clear, so we also attached one dmesg FYI,
>> from which:
>>
>> [   41.225784][   C82] ------------[ cut here ]------------
>> [   41.225786][   C82] WARNING: CPU: 82 PID: 3704 at arch/x86/events/intel/ds.c:2592 intel_pmu_drain_pebs_nhm+0x56b/0x630
>> [   41.225791][   C82] Modules linked in: xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_addrtype nft_compat nf_tables nfnetlink br_netfilter bridge stp llc dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio qrtr sg binfmt_misc loop fuse dm_mod overlay btrfs blake2b_generic xor raid6_pq intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common skx_edac skx_edac_common nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irdma sd_mod ast irqbypass ice ipmi_ssif drm_client_lib snd_pcm ghash_clmulni_intel drm_shmem_helper snd_timer gnss rapl drm_kms_helper intel_cstate snd ahci ib_uverbs libahci mei_me soundcore acpi_power_meter i2c_i801 ioatdma drm ib_core pcspkr intel_uncore ipmi_si acpi_ipmi libata mei joydev i2c_smbus intel_pch_thermal lpc_ich dca wmi ipmi_devintf ipmi_msghandler acpi_pad
>> [   41.225831][   C82] CPU: 82 UID: 0 PID: 3704 Comm: sleep Tainted: G S                  6.17.0-rc1-00052-ga7138973beb1 #1 VOLUNTARY
>> [   41.225834][   C82] Tainted: [S]=CPU_OUT_OF_SPEC
>> [   41.225835][   C82] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019
>> [   41.225836][   C82] RIP: 0010:intel_pmu_drain_pebs_nhm+0x56b/0x630
>> [   41.225839][   C82] Code: 48 e8 b9 cd fe ff 85 c0 0f 84 a9 00 00 00 41 f6 84 24 a4 01 00 00 80 0f 84 9a 00 00 00 4c 89 ef e8 1a 2a 34 00 e9 c7 fc ff ff <0f> 0b e9 c0 fc ff ff 0f 0b e9 b9 fc ff ff 48 8b 04 cb 48 89 84 cc
>> [   41.225841][   C82] RSP: 0018:fffffe00012f38c0 EFLAGS: 00010046
>> [   41.225843][   C82] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
>> [   41.225844][   C82] RDX: 0000000000000001 RSI: 0000000000000004 RDI: fffffe00012f3900
>> [   41.225845][   C82] RBP: fffffe00013120c8 R08: 0000000000000000 R09: 0000000000000000
>> [   20.931889][ T1340] Error: Driver 'pcspkr' is already registered, aborting...
>> [   41.225846][   C82] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
>> [   41.225847][   C82] R13: 0000000000000000 R14: fffffe00012f3c80 R15: 0000000000000000
>> [   41.225848][   C82] FS:  0000000000000000(0000) GS:ffff88f027c62000(0000) knlGS:0000000000000000
>> [   21.006859][ T1512] sd 6:0:0:0: Attached scsi generic sg0 type 0
>> [   21.013583][ T1512] sd 7:0:0:0: Attached scsi generic sg1 type 0
>> [   41.225849][   C82] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [   41.225851][   C82] CR2: 00007ffe5571fe7c CR3: 00000040c5ae1003 CR4: 00000000007726f0
>> [   41.225852][   C82] PKRU: 55555554
>> [   41.225853][   C82] Call Trace:
>> [   41.225855][   C82]  <NMI>
>> [   41.225861][   C82]  handle_pmi_common+0x29b/0x430
>> [   41.225865][   C82]  intel_pmu_handle_irq+0x109/0x2b0
>> [   41.225867][   C82]  perf_event_nmi_handler+0x2a/0x70
>> [   41.225870][   C82]  nmi_handle+0x53/0x130
>> [   41.225873][   C82]  default_do_nmi+0x11d/0x170
>> [   41.225876][   C82]  exc_nmi+0x106/0x1b0
>> [   41.225878][   C82]  end_repeat_nmi+0xf/0x53
>> [   41.225880][   C82] RIP: 0010:find_next_fd+0x2a/0xb0
>> [   41.225883][   C82] Code: 0f 1f 44 00 00 41 54 89 f2 48 c7 c0 ff ff ff ff 49 89 fc 55 c1 ea 06 53 89 f3 48 8b 77 18 89 d9 48 d3 e0 48 f7 d0 48 0b 04 d6 <48> 83 f8 ff 74 0d 48 f7 d0 f3 48 0f bc c0 83 f8 3f 76 3a 41 8b 2c
>> [   41.225885][   C82] RSP: 0018:ffffc90025283b90 EFLAGS: 00000206
>> [   41.225886][   C82] RAX: 0000000000000017 RBX: 0000000000000003 RCX: 0000000000000003
>> [   41.225887][   C82] RDX: 0000000000000000 RSI: ffff88f06d277150 RDI: ffff88f06d2770e8
>> [   41.225888][   C82] RBP: 0000000000000400 R08: 8080808080808080 R09: 979c8d9e9a8cdfff
>> [   41.225889][   C82] R10: fefefefefefefeff R11: 0000000000000000 R12: ffff88f06d2770e8
>> [   41.225890][   C82] R13: 0000000000088000 R14: ffff88f06d2770c0 R15: ffff88f06d2770e8
>> [   41.225893][   C82]  ? find_next_fd+0x2a/0xb0
>> [   41.225896][   C82]  ? find_next_fd+0x2a/0xb0
>> [   41.225899][   C82]  </NMI>
>> [   41.225899][   C82]  <TASK>
>> [   41.225900][   C82]  alloc_fd+0x55/0x130
>> [   41.225902][   C82]  do_sys_openat2+0x5a/0xf0
>> [   41.225905][   C82]  __x64_sys_openat+0x6d/0xb0
>> [   41.225907][   C82]  do_syscall_64+0x7f/0x2b0
>> [   41.225909][   C82]  ? vfs_statx+0x68/0x170
>> [   41.225911][   C82]  ? strncpy_from_user+0x26/0xf0
>> [   41.225914][   C82]  ? vfs_fstatat+0x75/0xb0
>> [   41.225917][   C82]  ? __do_sys_newfstatat+0x25/0x70
>> [   41.225919][   C82]  ? path_openat+0xb6/0x2b0
>> [   41.225921][   C82]  ? do_syscall_64+0x7f/0x2b0
>> [   41.225922][   C82]  ? do_filp_open+0xc3/0x170
>> [   41.225924][   C82]  ? do_syscall_64+0x7f/0x2b0
>> [   41.225925][   C82]  ? __cond_resched+0x1e/0x70
>> [   41.225928][   C82]  ? check_heap_object+0x34/0x1b0
>> [   41.225931][   C82]  ? __check_object_size+0x5c/0x130
>> [   41.225933][   C82]  ? do_sys_openat2+0x8a/0xf0
>> [   41.225936][   C82]  ? __x64_sys_openat+0x6d/0xb0
>> [   41.225938][   C82]  ? clear_bhb_loop+0x30/0x80
>> [   41.225940][   C82]  ? clear_bhb_loop+0x30/0x80
>> [   41.225942][   C82]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
>> [   41.225944][   C82] RIP: 0033:0x7eff04bb9a2d
>> [   41.225946][   C82] Code: 48 89 54 24 e0 41 83 e2 40 75 32 89 f0 25 00 00 41 00 3d 00 00 41 00 74 24 89 f2 b8 01 01 00 00 48 89 fe bf 9c ff ff ff 0f 05 <48> 3d 00 f0 ff ff 77 33 c3 66 2e 0f 1f 84 00 00 00 00 00 48 8d 44
>> [   41.225947][   C82] RSP: 002b:00007ffe5571f7e8 EFLAGS: 00000287 ORIG_RAX: 0000000000000101
>> [   41.225949][   C82] RAX: ffffffffffffffda RBX: 0000558b3236dbe6 RCX: 00007eff04bb9a2d
>> [   41.225950][   C82] RDX: 0000000000080000 RSI: 00007eff04bc20b1 RDI: 00000000ffffff9c
>> [   41.225951][   C82] RBP: 00007eff04bcd1f8 R08: 0000000000000000 R09: 0000558b3236dbe6
>> [   41.225952][   C82] R10: 0000000000000000 R11: 0000000000000287 R12: ffffffffffffffff
>> [   41.225953][   C82] R13: 0000000000000001 R14: 00007eff04bcc020 R15: 00007eff04bcd6b8
>> [   41.225954][   C82]  </TASK>
>> [   41.225955][   C82] ---[ end trace 0000000000000000 ]---
>>
>>
> It looks like the warning is triggered in the "error[i] != 0" case, where
> the local events[] array is not initialized. I will fix it in the next
> version.
>
>
>
>

[-- Attachment #2: 0001-perf-x86-intel-Fix-NULL-event-access-waring-from-tes.patch --]
[-- Type: text/plain, Size: 1110 bytes --]

From 0afc076a7836433b455c8aff15d4a3ad9631ebc2 Mon Sep 17 00:00:00 2001
From: Dapeng Mi <dapeng1.mi@linux.intel.com>
Date: Sun, 28 Sep 2025 13:50:47 +0800
Subject: [PATCH] perf/x86/intel: Fix NULL event access warning from test robot

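This patch fixes the warning reported by the kernel test robot at
https://lore.kernel.org/all/202509081646.d101cfb7-lkp@intel.com/.

When a PEBS record collision is detected, error[i] is incremented but the
local events[i] entry may still be NULL. Populate it from cpuc->events[i]
in that case.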

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
 arch/x86/events/intel/ds.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 65908880f424..ef32714cb182 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -2821,8 +2821,11 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d
 		 * If collision happened, the record will be dropped.
 		 */
 		if (pebs_status != (1ULL << bit)) {
-			for_each_set_bit(i, (unsigned long *)&pebs_status, size)
+			for_each_set_bit(i, (unsigned long *)&pebs_status, size) {
 				error[i]++;
+				if (error[i] && !events[i])
+					events[i] = cpuc->events[i];
+			}
 			continue;
 		}
 
-- 
2.34.1

