From: "Luck, Tony" <tony.luck@intel.com>
To: Borislav Petkov <bp@alien8.de>
Cc: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>,
Greg KH <gregkh@linuxfoundation.org>,
Justin Ernst <justin.ernst@hpe.com>,
russ.anderson@hpe.com, Mauro Carvalho Chehab <mchehab@kernel.org>,
linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org,
Aristeu Rozanski Filho <arozansk@redhat.com>
Subject: Re: [PATCH] Raise maximum number of memory controllers
Date: Wed, 26 Sep 2018 11:10:35 -0700
Message-ID: <20180926181035.GA1132@agluck-desk>
In-Reply-To: <20180926161749.GI5584@zn.tnic>
On Wed, Sep 26, 2018 at 06:17:49PM +0200, Borislav Petkov wrote:
> On Wed, Sep 26, 2018 at 01:03:40PM -0300, Mauro Carvalho Chehab wrote:
> > I guess this is/was needed to create things like this:
> >
> > lrwxrwxrwx 1 root root 0 set 26 05:24 /sys/bus/edac/devices/mc -> ../../../devices/system/edac/mc
>
> They're still there:
>
> $ ls -l /sys/bus/edac/devices/
> total 0
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 csrow0 -> ../../../devices/system/edac/mc/mc0/csrow0
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm0 -> ../../../devices/system/edac/mc/mc0/dimm0
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm3 -> ../../../devices/system/edac/mc/mc0/dimm3
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm6 -> ../../../devices/system/edac/mc/mc0/dimm6
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 dimm9 -> ../../../devices/system/edac/mc/mc0/dimm9
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 mc -> ../../../devices/system/edac/mc
> lrwxrwxrwx 1 root root 0 Sep 26 18:15 mc0 -> ../../../devices/system/edac/mc/mc0
I ran into trouble on my 4-socket Broadwell server (so 8 memory controllers,
a whole pile of DIMMs, driven by sb_edac.c).
Things start going wrong with:
[ 45.216657] sysfs: cannot create duplicate filename '/bus/edac/devices/dimm0'
[ 45.216663] CPU: 37 PID: 2034 Comm: systemd-udevd Not tainted 4.19.0-rc5 #1
[ 45.216665] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRBDXSD1.86B.0338.V01.1603162127 03/16/2016
[ 45.216667] Call Trace:
[ 45.216688] dump_stack+0x5c/0x7b
[ 45.216697] sysfs_warn_dup+0x56/0x70
[ 45.216702] sysfs_do_create_link_sd.isra.2+0x98/0xb0
[ 45.216714] bus_add_device+0x77/0x160
[ 45.216720] device_add+0x424/0x660
[ 45.216731] edac_create_sysfs_mci_device+0xb9/0x2f0
[ 45.216738] edac_mc_add_mc_with_groups+0x111/0x2b0
[ 45.216747] sbridge_init+0x13c9/0x2000 [sb_edac]
[ 45.216757] ? _raw_spin_lock+0x1d/0x20
[ 45.216765] ? free_pcppages_bulk+0x2ca/0x630
[ 45.216769] ? 0xffffffffc050f000
[ 45.216779] do_one_initcall+0x46/0x1c8
[ 45.216784] ? free_unref_page_commit+0x95/0x120
[ 45.216791] ? _cond_resched+0x15/0x40
[ 45.216798] ? kmem_cache_alloc_trace+0x153/0x1c0
[ 45.216805] do_init_module+0x5b/0x208
[ 45.216826] load_module+0x1a2d/0x1fb0
[ 45.216835] ? __do_sys_finit_module+0xe9/0x110
[ 45.216840] __do_sys_finit_module+0xe9/0x110
[ 45.216847] do_syscall_64+0x5b/0x180
[ 45.216852] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 45.216856] RIP: 0033:0x7fcdec618bd9
and fell off a cliff after that.
Going back to the old code I have a "dimm0" on each of the eight controllers:
# find /sys -name dimm0
/sys/devices/system/edac/mc/mc6/dimm0
/sys/devices/system/edac/mc/mc4/dimm0
/sys/devices/system/edac/mc/mc2/dimm0
/sys/devices/system/edac/mc/mc0/dimm0
/sys/devices/system/edac/mc/mc7/dimm0
/sys/devices/system/edac/mc/mc5/dimm0
/sys/devices/system/edac/mc/mc3/dimm0
/sys/devices/system/edac/mc/mc1/dimm0
/sys/bus/mc6/devices/dimm0
/sys/bus/mc4/devices/dimm0
/sys/bus/mc2/devices/dimm0
/sys/bus/mc0/devices/dimm0
/sys/bus/mc7/devices/dimm0
/sys/bus/mc5/devices/dimm0
/sys/bus/mc3/devices/dimm0
/sys/bus/mc1/devices/dimm0
# ls -l /sys/bus/mc0/devices
total 0
lrwxrwxrwx. 1 root root 0 Sep 26 11:08 csrow0 -> ../../../devices/system/edac/mc/mc0/csrow0
lrwxrwxrwx. 1 root root 0 Sep 26 11:08 dimm0 -> ../../../devices/system/edac/mc/mc0/dimm0
lrwxrwxrwx. 1 root root 0 Sep 26 11:08 dimm3 -> ../../../devices/system/edac/mc/mc0/dimm3
lrwxrwxrwx. 1 root root 0 Sep 26 11:08 dimm6 -> ../../../devices/system/edac/mc/mc0/dimm6
lrwxrwxrwx. 1 root root 0 Sep 26 11:08 dimm9 -> ../../../devices/system/edac/mc/mc0/dimm9
lrwxrwxrwx. 1 root root 0 Sep 26 11:08 mc0 -> ../../../devices/system/edac/mc/mc0
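It looks like the new code isn't placing the dimm symlinks in
per-controller subdirectories: every controller's dimm0 link goes into
/sys/bus/edac/devices, so the second one collides with the first.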
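To see which names would clash if all the per-controller entries were linked
into one flat directory, something like the following could be run against the
old layout. This is only a rough sketch, not kernel code: it assumes the
/sys/devices/system/edac/mc/mcN/{dimmM,csrowM} layout shown in the find output
above, and the helper name colliding_names is made up for illustration.

```python
import os
import re
from collections import defaultdict


def colliding_names(edac_mc_root):
    """Return {entry name: [controllers]} for every dimm/csrow name that
    appears under more than one mcN directory below edac_mc_root."""
    owners = defaultdict(list)
    for mc in sorted(os.listdir(edac_mc_root)):
        mc_dir = os.path.join(edac_mc_root, mc)
        # Only descend into per-controller directories such as mc0, mc1, ...
        if not (re.fullmatch(r"mc\d+", mc) and os.path.isdir(mc_dir)):
            continue
        for entry in sorted(os.listdir(mc_dir)):
            if re.fullmatch(r"(dimm|csrow)\d+", entry):
                owners[entry].append(mc)
    # A name owned by two or more controllers cannot be linked twice
    # into a single shared directory.
    return {name: mcs for name, mcs in owners.items() if len(mcs) > 1}


if __name__ == "__main__":
    root = "/sys/devices/system/edac/mc"  # path taken from the listing above
    if os.path.isdir(root):
        for name, mcs in sorted(colliding_names(root).items()):
            print(f"{name}: {' '.join(mcs)}")
```

On a box like mine I would expect dimm0 to show up against all eight
controllers, which matches the duplicate filename warning in the trace.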
-Tony