Message-ID: <59F45332.7050806@huawei.com>
Date: Sat, 28 Oct 2017 17:51:46 +0800
From: zhouchengming
To: Masami Hiramatsu
CC: Borislav Petkov
Subject: Re: [PATCH] kprobes, x86/alternatives: use text_mutex to protect smp_alt_modules
References: <1509096884-22993-1-git-send-email-zhouchengming1@huawei.com> <20171027111527.GD1305@nazgul.tnic> <59F31BB5.90905@huawei.com> <20171027123348.GE1305@nazgul.tnic> <59F334F0.2070900@huawei.com> <20171028174310.384d62976cc5ba4859325d3a@kernel.org>
In-Reply-To: <20171028174310.384d62976cc5ba4859325d3a@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2017/10/28 16:43, Masami Hiramatsu wrote:
> On Fri, 27 Oct 2017 21:30:24 +0800
> zhouchengming wrote:
>
>> On 2017/10/27 20:33, Borislav Petkov wrote:
>>> On Fri, Oct 27, 2017 at 07:42:45PM +0800, zhouchengming wrote:
>>>> This is a real bug happened on one of our machines, below is the calltrace.
>>>> We can see the trigger is at alternatives_text_reserved+0x20/0x80, and
>>>> encounter a deleted (poisoned) list_head.
>>> Looks like some out-of-tree, old kernel thing. We don't have
>>> mlx4_stats_sysfs_create() upstream and looking at the boot timestamps,
>>> it could be that register_jprobe() is not ready yet.
>> Yes, it's an out-of-tree module, loaded when boot kernel. register_kprobe()
>> maybe not ready yet, but the bug is not caused by it obviously.
>>
>>> Looking at the Code, though:
>>>
>>>   20:   74 59                   je     0x7b
>>>   22:   66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
>>>   29:   00 00
>>>   2b:*  48 3b 71 20             cmp    0x20(%rcx),%rsi   <-- trapping instruction
>>>   2f:   72 3a                   jb     0x6b
>>>   31:   48 3b 79 28             cmp    0x28(%rcx),%rdi
>>>   35:   77 34                   ja     0x6b
>>>
>>> %rcx is 0xdead0000000000d0 and that is POISON_POINTER_DELTA + 0xd0 so
>>> that looks more like smp_alt_modules is not initialized yet but I could
>>> could very well be wrong because this is an old kernel. So trigger that
>>> with the upstream kernel without out of tree modules.
>> The smp_alt_modules is defined by LIST_HEAD, so it's initialized at start.
>>
>> A deleted list_head->next = LIST_POISON1 = 0xdead000000000000 + 0x100, then
>> container_of() to get the struct smp_alt_module: -0x30 = 0xdead0000000000d0
>>
>> Obviously, it's a deleted list_head, and I have explained clearly how it happen in
>> the patch comment.
> Ah, I see. It looks alternatives_text_reserved() bug at a glance.
> But simply adding smp_alt mutex to alternatives_text_reserved() causes
> ABBA deadlock in the kprobe's path.
> So your solution is to replace the smp_alt with text_mutex, since
> alternatives_text_reserved is x86 specific function.
>
> Hmm, let me see... I agree that will be a simple way to solve, but
> it also means we have 2 resources protected by text_mutex.

Yes. The smp_alt mutex must be taken outside the text_mutex, so using
text_mutex is the simpler fix: otherwise we would need another
x86-specific interface just to take the smp_alt mutex.
But as you said, it's not good to use the one text_mutex to protect two
resources... I hope there is a better way.

Thanks.

> Thank you,
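
P.S. To make the poisoned-pointer arithmetic quoted above easy to check, here is a minimal userspace sketch. It is only an illustration under stated assumptions, not kernel code: the two constants are the x86_64 values used in the mail, and smp_alt_module_model is a hypothetical stand-in whose field order is assumed so that the list member sits at offset 0x30, as the "-0x30" in the mail implies. uint64_t fields stand in for 8-byte pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed x86_64 values, as quoted in the mail. */
#define POISON_POINTER_DELTA 0xdead000000000000ULL
#define LIST_POISON1         (POISON_POINTER_DELTA + 0x100ULL)

/* Hypothetical model of a list_head: two 8-byte pointers. */
struct list_head_model {
	uint64_t next;
	uint64_t prev;
};

/*
 * Hypothetical model of struct smp_alt_module. The field order is an
 * assumption chosen so that 'next' lands at offset 0x30.
 */
struct smp_alt_module_model {
	uint64_t mod;
	uint64_t name;
	uint64_t locks;
	uint64_t locks_end;
	uint64_t text;                /* offset 0x20 in this model */
	uint64_t text_end;            /* offset 0x28 in this model */
	struct list_head_model next;  /* offset 0x30 in this model */
};

/*
 * What container_of() yields when list_head->next was poisoned by
 * list_del(): the poison value minus the member offset.
 */
static uint64_t poisoned_entry(void)
{
	return LIST_POISON1 - offsetof(struct smp_alt_module_model, next);
}

/* Address the 'cmp 0x20(%rcx),%rsi' would dereference in this model. */
static uint64_t trapping_address(void)
{
	return poisoned_entry() + offsetof(struct smp_alt_module_model, text);
}

/* Sanity check of the arithmetic described in the mail. */
static void check_poison_math(void)
{
	assert(offsetof(struct smp_alt_module_model, next) == 0x30);
	assert(poisoned_entry() == 0xdead0000000000d0ULL);
}
```

In this model poisoned_entry() gives 0xdead0000000000d0, the %rcx value in the oops, and the offsets 0x20 and 0x28 line up with the two cmp instructions in the quoted Code dump. That is consistent with walking onto an entry that was already removed with list_del(), rather than with an uninitialized list head.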