From: "Liang, Kan" <kan.liang@linux.intel.com>
To: Alexander Antonov <alexander.antonov@linux.intel.com>,
peterz@infradead.org, linux-kernel@vger.kernel.org
Cc: kyle.meyer@hpe.com, alexey.v.bayduraev@linux.intel.com
Subject: Re: [PATCH] perf/x86/intel/uncore: Fix NULL pointer dereference issue in upi_fill_topology()
Date: Mon, 20 Nov 2023 16:21:59 -0500
Message-ID: <7caf86b8-f050-4d0f-8aba-e2d725a0ab64@linux.intel.com>
In-Reply-To: <50ce6fce-c2fc-4392-b405-5c9a7a93f061@linux.intel.com>

On 2023-11-20 2:49 p.m., Alexander Antonov wrote:
>
> On 11/15/2023 8:00 PM, Liang, Kan wrote:
>>
>> On 2023-11-15 10:13 a.m., alexander.antonov@linux.intel.com wrote:
>>> From: Alexander Antonov <alexander.antonov@linux.intel.com>
>>>
>>> The NULL dereference happens inside the upi_fill_topology() procedure
>>> when one of the sockets on the system is disabled.
>>>
>>> For example, if you disable the 2nd socket on a 4-socket system then
>>> uncore_max_dies() returns 3 and inside pmu_alloc_topology() memory will
>>> be allocated only for 3 sockets and stored in type->topology.
>>> In discover_upi_topology() memory is accessed by socket id from CPUNODEID
>>> registers which contain physical ids (from 0 to 3) and on the line:
>>>
>>> upi = &type->topology[nid][idx];
>>>
>>> an out-of-bounds access will happen and the 'upi' pointer will be passed
>>> to upi_fill_topology() where it will be dereferenced.
>>>
>>> To avoid this issue update the code to convert physical socket id to
>>> logical socket id in discover_upi_topology() before accessing memory.
>>>
>>> Fixes: f680b6e6062e ("perf/x86/intel/uncore: Enable UPI topology discovery for Icelake Server")
>>> Reported-by: Kyle Meyer <kyle.meyer@hpe.com>
>>> Tested-by: Kyle Meyer <kyle.meyer@hpe.com>
>>> Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com>
>>> ---
>>> arch/x86/events/intel/uncore_snbep.c | 10 ++++++++--
>>> 1 file changed, 8 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
>>> index 8250f0f59c2b..49bc27ab26ad 100644
>>> --- a/arch/x86/events/intel/uncore_snbep.c
>>> +++ b/arch/x86/events/intel/uncore_snbep.c
>>> @@ -5596,7 +5596,7 @@ static int discover_upi_topology(struct
>>> intel_uncore_type *type, int ubox_did, i
>>> struct pci_dev *ubox = NULL;
>>> struct pci_dev *dev = NULL;
>>> u32 nid, gid;
>>> - int i, idx, ret = -EPERM;
>>> + int i, idx, lgc_pkg, ret = -EPERM;
>>> struct intel_uncore_topology *upi;
>>> unsigned int devfn;
>>> @@ -5614,8 +5614,13 @@ static int discover_upi_topology(struct
>>> intel_uncore_type *type, int ubox_did, i
>>> for (i = 0; i < 8; i++) {
>>> if (nid != GIDNIDMAP(gid, i))
>>> continue;
>>> + lgc_pkg = topology_phys_to_logical_pkg(i);
>>> + if (lgc_pkg < 0) {
>>> + ret = -EPERM;
>>> + goto err;
>>> + }
>> In snbep_pci2phy_map_init(), there is similar code to find the
>> logical die id. Can we factor out a common function for both of them?
>>
>> Thanks,
>> Kan
> Hi Kan,
>
> Thank you for your comment.
> Yes, I think we can factor out the common loop where GIDNIDMAP is being
> checked.
> But inside snbep_pci2phy_map_init() we have a slightly different
> procedure, which also does the following:
>
> if (topology_max_die_per_package() > 1)
> die_id = i;
>
> I think that keeping this code, at least in our case, could lead to the
> same issue that we are trying to fix. But of course we could
> parameterize this check.

topology_max_die_per_package() > 1 means there is more than 1 die
in a socket. AFAIK, it only happens on Cascade Lake AP.
Did you observe it on ICX?

Thanks,
Kan

>
> What do you think?
>
> Thanks,
> Alexander
>>
>>> for (idx = 0; idx < type->num_boxes; idx++) {
>>> - upi = &type->topology[nid][idx];
>>> + upi = &type->topology[lgc_pkg][idx];
>>> devfn = PCI_DEVFN(dev_link0 + idx,
>>> ICX_UPI_REGS_ADDR_FUNCTION);
>>> dev =
>>> pci_get_domain_bus_and_slot(pci_domain_nr(ubox->bus),
>>> ubox->bus->number,
>>> @@ -5626,6 +5631,7 @@ static int discover_upi_topology(struct
>>> intel_uncore_type *type, int ubox_did, i
>>> goto err;
>>> }
>>> }
>>> + break;
>>> }
>>> }
>>> err:
>>>
>>> base-commit: 9bacdd8996c77c42ca004440be610692275ff9d0
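
For readers following the thread, here is a minimal userspace sketch of the
indexing problem described in the commit message. It is not kernel code and
not the patch itself. The lookup table, the struct layout, NUM_BOXES and the
helper names (phys_to_logical_pkg(), count_present_pkgs(), upi_slot()) are
all hypothetical stand-ins. It assumes a 4-socket system with physical
socket 1 disabled, so the table is sized for 3 dies and must be indexed by
logical id rather than by the physical id read from the hardware.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PHYS_PKGS 4	/* physical socket ids 0..3, as in the example */
#define NUM_BOXES     3	/* hypothetical number of UPI boxes per socket */

struct upi_entry {
	int pkg;
	int link;
};

/*
 * Hypothetical stand-in for topology_phys_to_logical_pkg(): physical
 * socket 1 is disabled, so sockets 0, 2, 3 get logical ids 0, 1, 2.
 */
static const int phys_to_logical[MAX_PHYS_PKGS] = { 0, -1, 1, 2 };

static int phys_to_logical_pkg(unsigned int phys)
{
	if (phys >= MAX_PHYS_PKGS)
		return -1;
	return phys_to_logical[phys];
}

/* Number of present sockets; plays the role of uncore_max_dies() here. */
static int count_present_pkgs(void)
{
	int i, n = 0;

	for (i = 0; i < MAX_PHYS_PKGS; i++)
		if (phys_to_logical[i] >= 0)
			n++;
	return n;
}

/*
 * Return the table slot for (physical socket, box index), or NULL when the
 * socket is not present or an index is out of range. Indexing the table
 * with the physical id directly would run past the end for phys == 3.
 */
static struct upi_entry *upi_slot(struct upi_entry (*topology)[NUM_BOXES],
				  int num_dies, unsigned int phys, int idx)
{
	int lgc_pkg = phys_to_logical_pkg(phys);

	assert(topology != NULL);
	if (lgc_pkg < 0 || lgc_pkg >= num_dies || idx < 0 || idx >= NUM_BOXES)
		return NULL;
	return &topology[lgc_pkg][idx];
}
```

With a table of count_present_pkgs() rows, upi_slot() maps physical socket 3
to the last row instead of one row past the end, and returns NULL for the
disabled socket so the caller can bail out with an error.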