From: Johannes Hirte <johannes.hirte@datenkhaos.de>
To: "Ghannam, Yazen" <Yazen.Ghannam@amd.com>
Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"bp@suse.de" <bp@suse.de>,
"tony.luck@intel.com" <tony.luck@intel.com>,
"x86@kernel.org" <x86@kernel.org>
Subject: Re: [PATCH 3/3] x86/MCE/AMD: Get address from already initialized block
Date: Tue, 15 May 2018 11:39:54 +0200
Message-ID: <20180515093953.GA1746@probook>
In-Reply-To: <CY4PR12MB1557FB8EB5C17E55A9B405D8F8B70@CY4PR12MB1557.namprd12.prod.outlook.com>

On 2018 Apr 17, Ghannam, Yazen wrote:
> > -----Original Message-----
> > From: linux-edac-owner@vger.kernel.org <linux-edac-
> > owner@vger.kernel.org> On Behalf Of Johannes Hirte
> > Sent: Monday, April 16, 2018 7:56 AM
> > To: Ghannam, Yazen <Yazen.Ghannam@amd.com>
> > Cc: linux-edac@vger.kernel.org; linux-kernel@vger.kernel.org; bp@suse.de;
> > tony.luck@intel.com; x86@kernel.org
> > Subject: Re: [PATCH 3/3] x86/MCE/AMD: Get address from already initialized
> > block
> >
> > On 2018 Apr 14, Johannes Hirte wrote:
> > > On 2018 Feb 01, Yazen Ghannam wrote:
> > > > From: Yazen Ghannam <yazen.ghannam@amd.com>
> > > >
> > > > The block address is saved after the block is initialized when
> > > > threshold_init_device() is called.
> > > >
> > > > Use the saved block address, if available, rather than trying to
> > > > rediscover it.
> > > >
> > > > We can avoid some *on_cpu() calls in the init path that will cause a
> > > > call trace when resuming from suspend.
> > > >
> > > > Cc: <stable@vger.kernel.org> # 4.14.x
> > > > Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
> > > > ---
> > > > arch/x86/kernel/cpu/mcheck/mce_amd.c | 15 +++++++++++++++
> > > > 1 file changed, 15 insertions(+)
> > > >
> > > > diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
> > > > index bf53b4549a17..8c4f8f30c779 100644
> > > > --- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
> > > > +++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
> > > > @@ -436,6 +436,21 @@ static u32 get_block_address(unsigned int cpu, u32 current_addr, u32 low, u32 hi
> > > >  {
> > > >         u32 addr = 0, offset = 0;
> > > >
> > > > +       if ((bank >= mca_cfg.banks) || (block >= NR_BLOCKS))
> > > > +               return addr;
> > > > +
> > > > +       /* Get address from already initialized block. */
> > > > +       if (per_cpu(threshold_banks, cpu)) {
> > > > +               struct threshold_bank *bankp = per_cpu(threshold_banks, cpu)[bank];
> > > > +
> > > > +               if (bankp && bankp->blocks) {
> > > > +                       struct threshold_block *blockp = &bankp->blocks[block];
> > > > +
> > > > +                       if (blockp)
> > > > +                               return blockp->address;
> > > > +               }
> > > > +       }
> > > > +
> > > >         if (mce_flags.smca) {
> > > >                 if (smca_get_bank_type(bank) == SMCA_RESERVED)
> > > >                         return addr;
> > > > --
> > > > 2.14.1
> > >
> > > I have a KASAN: slab-out-of-bounds, and git bisect points me to this
> > > change:
> > >
> > > > Apr 13 00:40:32 probook kernel: ==================================================================
> > > > Apr 13 00:40:32 probook kernel: BUG: KASAN: slab-out-of-bounds in get_block_address.isra.3+0x1e9/0x520
> > > > Apr 13 00:40:32 probook kernel: Read of size 4 at addr ffff8803f165ddf4 by task swapper/0/1
> > > > Apr 13 00:40:32 probook kernel:
> > > > Apr 13 00:40:32 probook kernel: CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.16.0-10757-g4ca8ba4ccff9 #532
> > > > Apr 13 00:40:32 probook kernel: Hardware name: HP HP ProBook 645 G2/80FE, BIOS N77 Ver. 01.12 12/19/2017
> > > > Apr 13 00:40:32 probook kernel: Call Trace:
> > > > Apr 13 00:40:32 probook kernel: dump_stack+0x5b/0x8b
> > > > Apr 13 00:40:32 probook kernel: ? get_block_address.isra.3+0x1e9/0x520
> > > > Apr 13 00:40:32 probook kernel: print_address_description+0x65/0x270
> > > > Apr 13 00:40:32 probook kernel: ? get_block_address.isra.3+0x1e9/0x520
> > > > Apr 13 00:40:32 probook kernel: kasan_report+0x232/0x350
> > > > Apr 13 00:40:32 probook kernel: get_block_address.isra.3+0x1e9/0x520
> > > > Apr 13 00:40:32 probook kernel: ? kobject_init_and_add+0xde/0x130
> > > > Apr 13 00:40:32 probook kernel: ? get_name+0x390/0x390
> > > > Apr 13 00:40:32 probook kernel: ? kasan_unpoison_shadow+0x30/0x40
> > > > Apr 13 00:40:32 probook kernel: ? kasan_kmalloc+0xa0/0xd0
> > > > Apr 13 00:40:32 probook kernel: allocate_threshold_blocks+0x12c/0xc60
> > > > Apr 13 00:40:32 probook kernel: ? kobject_add_internal+0x800/0x800
> > > > Apr 13 00:40:32 probook kernel: ? get_block_address.isra.3+0x520/0x520
> > > > Apr 13 00:40:32 probook kernel: ? kasan_kmalloc+0xa0/0xd0
> > > > Apr 13 00:40:32 probook kernel: mce_threshold_create_device+0x35b/0x990
> > > > Apr 13 00:40:32 probook kernel: ? init_special_inode+0x1d0/0x230
> > > > Apr 13 00:40:32 probook kernel: threshold_init_device+0x98/0xa7
> > > > Apr 13 00:40:32 probook kernel: ? mcheck_vendor_init_severity+0x43/0x43
> > > > Apr 13 00:40:32 probook kernel: do_one_initcall+0x76/0x30c
> > > > Apr 13 00:40:32 probook kernel: ? trace_event_raw_event_initcall_finish+0x190/0x190
> > > > Apr 13 00:40:32 probook kernel: ? kasan_unpoison_shadow+0xb/0x40
> > > > Apr 13 00:40:32 probook kernel: ? kasan_unpoison_shadow+0x30/0x40
> > > > Apr 13 00:40:32 probook kernel: kernel_init_freeable+0x3d6/0x471
> > > > Apr 13 00:40:32 probook kernel: ? rest_init+0xf0/0xf0
> > > > Apr 13 00:40:32 probook kernel: kernel_init+0xa/0x120
> > > > Apr 13 00:40:32 probook kernel: ? rest_init+0xf0/0xf0
> > > > Apr 13 00:40:32 probook kernel: ret_from_fork+0x22/0x40
> > > > Apr 13 00:40:32 probook kernel:
> > > > Apr 13 00:40:32 probook kernel: Allocated by task 1:
> > > > Apr 13 00:40:32 probook kernel: kasan_kmalloc+0xa0/0xd0
> > > > Apr 13 00:40:32 probook kernel: kmem_cache_alloc_trace+0xf3/0x1f0
> > > > Apr 13 00:40:32 probook kernel: allocate_threshold_blocks+0x1bc/0xc60
> > > > Apr 13 00:40:32 probook kernel: mce_threshold_create_device+0x35b/0x990
> > > > Apr 13 00:40:32 probook kernel: threshold_init_device+0x98/0xa7
> > > > Apr 13 00:40:32 probook kernel: do_one_initcall+0x76/0x30c
> > > > Apr 13 00:40:32 probook kernel: kernel_init_freeable+0x3d6/0x471
> > > > Apr 13 00:40:32 probook kernel: kernel_init+0xa/0x120
> > > > Apr 13 00:40:32 probook kernel: ret_from_fork+0x22/0x40
> > > > Apr 13 00:40:32 probook kernel:
> > > > Apr 13 00:40:32 probook kernel: Freed by task 0:
> > > > Apr 13 00:40:32 probook kernel: (stack is not available)
> > > > Apr 13 00:40:32 probook kernel:
> > > > Apr 13 00:40:32 probook kernel: The buggy address belongs to the object at ffff8803f165dd80
> > > > which belongs to the cache kmalloc-128 of size 128
> > > > Apr 13 00:40:32 probook kernel: The buggy address is located 116 bytes inside of
> > > > 128-byte region [ffff8803f165dd80, ffff8803f165de00)
> > > > Apr 13 00:40:32 probook kernel: The buggy address belongs to the page:
> > > > Apr 13 00:40:32 probook kernel: page:ffffea000fc59740 count:1 mapcount:0 mapping:0000000000000000 index:0x0
> > > > Apr 13 00:40:32 probook kernel: flags: 0x2000000000000100(slab)
> > > > Apr 13 00:40:32 probook kernel: raw: 2000000000000100 0000000000000000 0000000000000000 0000000180150015
> > > > Apr 13 00:40:32 probook kernel: raw: dead000000000100 dead000000000200 ffff8803f3403340 0000000000000000
> > > > Apr 13 00:40:32 probook kernel: page dumped because: kasan: bad access detected
> > > > Apr 13 00:40:32 probook kernel:
> > > > Apr 13 00:40:32 probook kernel: Memory state around the buggy address:
> > > > Apr 13 00:40:32 probook kernel: ffff8803f165dc80: fc fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00
> > > > Apr 13 00:40:32 probook kernel: ffff8803f165dd00: 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc
> > > > Apr 13 00:40:32 probook kernel: >ffff8803f165dd80: 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc
> > > > Apr 13 00:40:32 probook kernel: ^
> > > > Apr 13 00:40:32 probook kernel: ffff8803f165de00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> > > > Apr 13 00:40:32 probook kernel: ffff8803f165de80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> > > > Apr 13 00:40:32 probook kernel: ==================================================================
> > >
> >
> > Putting the whole caching part under the
> >
> > if (mce_flags.smca) {
> >
> > solved the issue on my Carrizo.
> >
>
> Thanks for reporting this. I'm able to reproduce this on my Fam17h system. The
> caching should still be the same on non-SMCA systems. Putting it all under the
> SMCA flags effectively removes it on Carrizo.
>
> Here is when get_block_address() is called:
> 1) Boot time MCE init. Called on each CPU. No caching.
> 2) Init of the MCE device. Called on a single CPU. Values are cached here.
> 3) CPU on/offling which calls MCE init. Should use the cached values.
>
> It seems to me that the KASAN bug is detected during #2 though it's not yet clear
> to me what the issue is. I need to read up on KASAN and keep debugging.
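
For reference, the numbers in the KASAN report above can be checked by hand.
The sketch below is only an illustration, assuming generic KASAN where one
shadow byte covers 8 bytes of memory; the helper names are made up for this
example and are not kernel functions.

```c
#include <stdint.h>

/* Assumption: generic KASAN, one shadow byte describes 8 bytes of memory. */
#define KASAN_GRANULE 8u

/* Offset of the faulting address inside its slab object. */
static uint64_t offset_in_object(uint64_t fault, uint64_t object)
{
	return fault - object;
}

/* Bytes covered by a run of fully accessible (00) shadow bytes. */
static uint64_t accessible_bytes(unsigned int zero_shadow_bytes)
{
	return (uint64_t)zero_shadow_bytes * KASAN_GRANULE;
}

/* Nonzero if [offset, offset + size) reaches past the accessible part. */
static int is_out_of_bounds(uint64_t offset, uint64_t size, uint64_t accessible)
{
	return offset + size > accessible;
}
```

With the values from the report, 0xffff8803f165ddf4 - 0xffff8803f165dd80 is
116 bytes into the object, the 13 zero shadow bytes on the marked line give
13 * 8 = 104 accessible bytes, and a 4-byte read at offset 116 therefore
lands past the allocation but still inside the 128-byte kmalloc-128 slot.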

The out-of-bounds access happens in get_block_address():

	if (bankp && bankp->blocks) {
		struct threshold_block *blockp = &bankp->blocks[block];

with block=1. That element doesn't exist. I can't even find an array here.
There is only a linked list, created in allocate_threshold_blocks(). On my
system I get 17 lists with one element each.
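
A lookup that matches the data structure would walk the list instead of
indexing it. The following is a minimal user-space sketch under that
assumption, not the kernel code: struct tblock, its fields and
cached_block_address() are simplified stand-ins invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one threshold block. */
struct tblock {
	unsigned int block;	/* block number within the bank */
	uint32_t address;	/* cached register address */
	struct tblock *next;	/* next block of the same bank, or NULL */
};

/*
 * Walk the per-bank list and return the cached address of the requested
 * block. Returns 0 if the bank has no such block, so the caller can fall
 * back to rediscovering the address.
 */
static uint32_t cached_block_address(const struct tblock *head,
				     unsigned int block)
{
	for (const struct tblock *b = head; b != NULL; b = b->next) {
		if (b->block == block)
			return b->address;
	}
	return 0;
}
```

With a single-element list, as on my system, asking for block 1 then
returns 0 instead of reading past the end of the allocation.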
--
Regards,
Johannes