All of lore.kernel.org
 help / color / mirror / Atom feed
From: Borislav Petkov <bp@suse.de>
To: Andre Przywara <andre.przywara@arm.com>,
	Sudeep Holla <sudeep.holla@arm.com>, Tejun Heo <tj@kernel.org>
Cc: linux-kernel@vger.kernel.org,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	x86@kernel.org
Subject: Re: [PATCH RFT v2] x86: move cacheinfo sysfs to generic cacheinfo infrastructure
Date: Tue, 3 Mar 2015 19:45:27 +0100	[thread overview]
Message-ID: <20150303184527.GE3648@pd.tnic> (raw)
In-Reply-To: <1425249564-28347-1-git-send-email-andre.przywara@arm.com>

@Tejun: scroll to the end :)

On Sun, Mar 01, 2015 at 10:39:24PM +0000, Andre Przywara wrote:
> So I gave that a spin on my old PhenomII X6 box, and indeed there is a
> NULL pointer dereference. I tracked it down to
> __cache_amd_cpumap_setup(), where the sibling map for the L3 is
> populated. When the function is called for the first CPU, subsequent
> CPU's cacheinfo structures obviously haven't been initialized yet, so

Right, let me try to understand the splat:

[    3.030598] RIP: 0010:[<ffffffff810141c0>]  [<ffffffff810141c0>] _populate_cache_leaves+0x440/0x460

This is:

     9d6:       49 0f a3 06             bt     %rax,(%r14)
     9da:       19 c9                   sbb    %ecx,%ecx
     9dc:       85 c9                   test   %ecx,%ecx
     9de:       74 d0                   je     9b0 <_populate_cache_leaves+0x410>
     9e0:       f0 49 0f ab 84 24 f8    lock bts %rax,0xf8(%r12)			<----
     9e7:       00 00 00 
     9ea:       eb c4                   jmp    9b0 <_populate_cache_leaves+0x410>
     9ec:       4c 8b 65 98             mov    -0x68(%rbp),%r12
     9f0:       e9 eb fd ff ff          jmpq   7e0 <_populate_cache_leaves+0x240>
     9f5:       66 66 2e 0f 1f 84 00    data16 nopw %cs:0x0(%rax,%rax,1)

%r12 is 0 and that is this hunk in __cache_amd_cpumap_setup():

        } else if (index == 3) {
                for_each_cpu(i, cpu_llc_shared_mask(cpu)) {
                        this_cpu_ci = get_cpu_cacheinfo(i);
                        this_leaf = this_cpu_ci->info_list + index;
                        for_each_cpu(sibling, cpu_llc_shared_mask(cpu)) {
                                if (!cpu_online(sibling))
                                        continue;
                                cpumask_set_cpu(sibling,			<--- set_bit() does LOCK
                                                &this_leaf->shared_cpu_map);
                        }
                }

so this_leaf is NULL.

We get it by doing

	this_leaf = this_cpu_ci->info_list + index;

and this_cpu_ci is a per-CPU variable.

Now, previously the code did

-			if (!per_cpu(ici_cpuid4_info, i))
-				continue;


and __cache_cpumap_setup() already does:

                        if (i == cpu || !sib_cpu_ci->info_list)
                                continue;/* skip if itself or no cacheinfo */

so maybe we should do that too in __cache_amd_cpumap_setup():

                        if (!this_cpu_ci->info_list)
                                continue;

for the index == 3 case?

It boots fine here with that change and it is consistent with the
previous code.

And yes, the x86 cacheinfo code could use a serious rubbing and cleanup.

Btw, this patch introduces a bunch of new sysfs nodes in the caches
hierarchy:

--- caches-guest-before.txt     2015-03-03 15:11:09.168276423 +0100
+++ caches-guest-after.txt      2015-03-03 18:19:04.084426130 +0100
@@ -1,6 +1,22 @@
+/sys/devices/system/cpu/cpu0/cache/power/control:1:auto
+/sys/devices/system/cpu/cpu0/cache/power/async:1:disabled
+/sys/devices/system/cpu/cpu0/cache/power/runtime_enabled:1:disabled
+/sys/devices/system/cpu/cpu0/cache/power/runtime_active_kids:1:0
+/sys/devices/system/cpu/cpu0/cache/power/runtime_active_time:1:0
+/sys/devices/system/cpu/cpu0/cache/power/runtime_status:1:unsupported
+/sys/devices/system/cpu/cpu0/cache/power/runtime_usage:1:0
+/sys/devices/system/cpu/cpu0/cache/power/runtime_suspended_time:1:0
 /sys/devices/system/cpu/cpu0/cache/index0/size:1:64K
 /sys/devices/system/cpu/cpu0/cache/index0/type:1:Data
 /sys/devices/system/cpu/cpu0/cache/index0/level:1:1
+/sys/devices/system/cpu/cpu0/cache/index0/power/control:1:auto
+/sys/devices/system/cpu/cpu0/cache/index0/power/async:1:disabled
+/sys/devices/system/cpu/cpu0/cache/index0/power/runtime_enabled:1:disabled
+/sys/devices/system/cpu/cpu0/cache/index0/power/runtime_active_kids:1:0
+/sys/devices/system/cpu/cpu0/cache/index0/power/runtime_active_time:1:0
+/sys/devices/system/cpu/cpu0/cache/index0/power/runtime_status:1:unsupported
+/sys/devices/system/cpu/cpu0/cache/index0/power/runtime_usage:1:0
+/sys/devices/system/cpu/cpu0/cache/index0/power/runtime_suspended_time:1:0
 /sys/devices/system/cpu/cpu0/cache/index0/number_of_sets:1:512
 /sys/devices/system/cpu/cpu0/cache/index0/shared_cpu_map:1:1
 /sys/devices/system/cpu/cpu0/cache/index0/shared_cpu_list:1:0
 ...

What do those things mean? runtime_active_kids ?? Kids are active during
runtime?! Well, that's a given, no need for a sysfs node for that :-)

> > Indeed I see a regression between 3.19.0 and a clean 4.0.0-rc1 in this
> > respect, the diff is like:
> > -cpu5/cache/index3/shared_cpu_map:00000000,0000003f
> > +cpu5/cache/index3/shared_cpu_map:3f
> > (for each cpu and index)
> >
> > Can someone confirm (and investigate) this?
>
> I saw this even on ARM64 platforms with 4.0-rc1 and did narrow down to
> series of formatting functions changes introduced by Tejun Heo, mainly
> commit 4a0792b0e7a6("bitmap: use %*pb[l] to print bitmaps including
> cpumasks and nodemasks")

Strictly speaking, this is a regression if userspace is parsing those
masks. I hardly doubt anything is doing that but if we had to be
completely strict we should add back the zeroes.

Tejun, we have this change in user-visible masks formatting in sysfs
after your bitmaps printing changes:

-cpu5/cache/index3/shared_cpu_map:00000000,0000003f
+cpu5/cache/index3/shared_cpu_map:3f

What's the rationale on userspace parsing bitmaps in sysfs, we don't
care?

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

  parent reply	other threads:[~2015-03-03 18:46 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-01-08  7:50 [PATCH RFT] x86: move cacheinfo sysfs to generic cacheinfo infrastructure Sudeep Holla
2015-01-23 13:55 ` Thomas Gleixner
2015-01-23 18:15   ` Borislav Petkov
2015-02-23 16:36     ` Sudeep Holla
2015-02-23 18:14 ` [PATCH RFT v2] " Sudeep Holla
2015-02-24  7:58   ` Ingo Molnar
2015-02-24  9:30     ` Sudeep Holla
2015-02-24 17:57   ` Borislav Petkov
2015-02-24 18:09     ` Sudeep Holla
2015-02-24 18:58       ` Borislav Petkov
2015-03-01 22:39     ` Andre Przywara
2015-03-02 10:17       ` Sudeep Holla
2015-03-03 18:45       ` Borislav Petkov [this message]
2015-03-03 18:53         ` Tejun Heo
2015-03-03 18:57           ` Borislav Petkov
2015-03-04 11:15             ` Sudeep Holla
2015-03-04 11:27               ` Borislav Petkov
2015-03-04 11:35                 ` Sudeep Holla
2015-03-04  9:52         ` Sudeep Holla
2015-03-04 12:00   ` [PATCH v3] " Sudeep Holla
2015-03-04 12:27     ` Borislav Petkov
2015-03-05  8:16       ` Borislav Petkov
2015-03-05  9:28         ` Sudeep Holla
2015-03-10 11:37           ` Borislav Petkov
2015-03-10 11:53             ` Sudeep Holla
2015-03-10 14:22               ` Sudeep Holla
2015-03-10 14:26                 ` Borislav Petkov
2015-03-10 14:35                   ` Sudeep Holla
2015-03-11 13:36                     ` Borislav Petkov
2015-03-11 15:44                       ` Sudeep Holla
2015-03-11 20:23                         ` Borislav Petkov
2015-03-09 14:46     ` [tip:x86/cpu] x86/cacheinfo: Move cacheinfo sysfs code to generic infrastructure tip-bot for Sudeep Holla

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150303184527.GE3648@pd.tnic \
    --to=bp@suse.de \
    --cc=andre.przywara@arm.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=sudeep.holla@arm.com \
    --cc=tglx@linutronix.de \
    --cc=tj@kernel.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.