From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751284AbdEPBmF (ORCPT );
	Mon, 15 May 2017 21:42:05 -0400
Received: from szxga02-in.huawei.com ([45.249.212.188]:5882 "EHLO
	szxga02-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1750991AbdEPBmE (ORCPT );
	Mon, 15 May 2017 21:42:04 -0400
Subject: Re: [PATCH] perf/x86/intel/cqm: Make sure the head event of
 cache_groups always has valid RMID
To: Thomas Gleixner
References: <590A928F.8020201@huawei.com>
CC: Peter Zijlstra , "H. Peter Anvin" , , , LKML
From: Zefan Li
Message-ID: <591A5889.2080303@huawei.com>
Date: Tue, 16 May 2017 09:40:25 +0800
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101
 Thunderbird/38.5.1
MIME-Version: 1.0
In-Reply-To: <590A928F.8020201@huawei.com>
Content-Type: text/plain; charset="gbk"
Content-Transfer-Encoding: 7bit
X-Originating-IP: [10.177.19.236]
X-CFilter-Loop: Reflected
X-Mirapoint-Virus-RAPID-Raw: score=unknown(0),
 refid=str=0001.0A020204.591A588E.011E,ss=1,re=0.000,recu=0.000,reip=0.000,
 cl=1,cld=1,fgs=0, ip=0.0.0.0, so=2014-11-16 11:51:01, dmn=2013-03-21 17:37:32
X-Mirapoint-Loop-Id: 17b71e9950e5c2df537fe0f531fab489
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Any comments?

On 2017/5/4 10:31, Zefan Li wrote:
> It is assumed that the head of cache_groups always has a valid RMID,
> which isn't true.
>
> When we deallocate RMIDs from conflicting events, we currently don't
> move those events to the tail of cache_groups, so one of them can end
> up at the head. Another case is that intel_cqm_sched_in_event()
> allocates RMIDs for every event except the head event.
>
> Besides, there is another bug: __intel_cqm_rmid_rotate() retries
> rotation without resetting nr_needed and start.
>
> Those bugs combined led to the following oops.
> > WARNING: at arch/x86/kernel/cpu/perf_event_intel_cqm.c:186 __put_rmid+0x28/0x80()
> > ...
> > [] __put_rmid+0x28/0x80
> > [] intel_cqm_rmid_rotate+0xba/0x440
> > [] process_one_work+0x17b/0x470
> > [] worker_thread+0x11b/0x400
> > ...
> > BUG: unable to handle kernel NULL pointer dereference at (null)
> > ...
> > [] intel_cqm_rmid_rotate+0xba/0x440
> > [] process_one_work+0x17b/0x470
> > [] worker_thread+0x11b/0x400
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Zefan Li
> ---
>  arch/x86/events/intel/cqm.c | 19 ++++++++++++++++---
>  1 file changed, 16 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/events/intel/cqm.c b/arch/x86/events/intel/cqm.c
> index 8c00dc0..c06a5ba 100644
> --- a/arch/x86/events/intel/cqm.c
> +++ b/arch/x86/events/intel/cqm.c
> @@ -553,8 +553,13 @@ static bool intel_cqm_sched_in_event(u32 rmid)
>  
>  	leader = list_first_entry(&cache_groups, struct perf_event,
>  				  hw.cqm_groups_entry);
> -	event = leader;
>  
> +	if (!list_empty(&cache_groups) && !__rmid_valid(leader->hw.cqm_rmid)) {
> +		intel_cqm_xchg_rmid(leader, rmid);
> +		return true;
> +	}
> +
> +	event = leader;
>  	list_for_each_entry_continue(event, &cache_groups,
>  				     hw.cqm_groups_entry) {
>  		if (__rmid_valid(event->hw.cqm_rmid))
> @@ -721,6 +726,7 @@ static void intel_cqm_sched_out_conflicting_events(struct perf_event *event)
>  {
>  	struct perf_event *group, *g;
>  	u32 rmid;
> +	LIST_HEAD(conflicting_groups);
>  
>  	lockdep_assert_held(&cache_mutex);
>  
> @@ -744,7 +750,11 @@ static void intel_cqm_sched_out_conflicting_events(struct perf_event *event)
>  
>  		intel_cqm_xchg_rmid(group, INVALID_RMID);
>  		__put_rmid(rmid);
> +		list_move_tail(&group->hw.cqm_groups_entry,
> +			       &conflicting_groups);
>  	}
> +
> +	list_splice_tail(&conflicting_groups, &cache_groups);
>  }
>  
>  /*
> @@ -773,9 +783,9 @@ static void intel_cqm_sched_out_conflicting_events(struct perf_event *event)
>   */
>  static bool __intel_cqm_rmid_rotate(void)
>  {
> -	struct perf_event *group, *start = NULL;
> +	struct perf_event *group, *start;
>  	unsigned int threshold_limit;
> -	unsigned int nr_needed = 0;
> +	unsigned int nr_needed;
>  	unsigned int nr_available;
>  	bool rotated = false;
>  
> @@ -789,6 +799,9 @@ static bool __intel_cqm_rmid_rotate(void)
>  	if (list_empty(&cache_groups) && list_empty(&cqm_rmid_limbo_lru))
>  		goto out;
>  
> +	nr_needed = 0;
> +	start = NULL;
> +
>  	list_for_each_entry(group, &cache_groups, hw.cqm_groups_entry) {
>  		if (!__rmid_valid(group->hw.cqm_rmid)) {
>  			if (!start)
>