Date: Wed, 25 May 2016 01:29:58 +0200
From: Oleg Nesterov
To: Andy Lutomirski
Cc: Andrei Vagin, LKML, X86 ML, Andy Lutomirski, Cyrill Gorcunov
Subject: Re: x86: A process doesn't stop on hw breakpoints sometimes
Message-ID: <20160524232958.GA14477@redhat.com>

On 05/23, Andy Lutomirski wrote:
>
> I'm guessing you're either hitting a subtle bug in the mess that is
> breakpoint handling or you're hitting a bug in perf's context switch
> code.

yes, same feeling...

> Given that the breakpoint gets missed many times in a row,

yes, the child deliberately tries to hit the same bp again and again,

> this is
> presumably either a bug in breakpoint programming (i.e. the thing
> isn't actually set in dr0/dr7) or a bug in the bp state tracking.

or some bug in perf_sched_in(). In fact this is what I think now, but
I can be wrong.

> If
> it were a bug in RF flag handling, I'd expect it to skip once and trip
> the second time through.

Exactly.

It would be nice to confirm that this problem has actually gone away,
and how.

So, Andrei, if you have any motivation, we can continue. The next step
needs a simple kernel patch or kernel module which allows us to read
dr0/dr7 and print these registers in the "fail" loop.

Oleg.
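
For reference when reading such a dr0/dr7 dump, below is a minimal sketch
of how a raw dr7 value could be decoded to see whether a given breakpoint
slot is armed. It only assumes the architectural DR7 layout (Ln/Gn enable
bits in the low byte, R/Wn and LENn fields starting at bit 16). The names
dr7_slot, dr7_decode and dr7_slot_armed are hypothetical helpers for
illustration, not kernel APIs, and obtaining the real register values
from the kernel is not shown here.

```c
#include <stdint.h>

/* Decoded view of one hardware breakpoint slot (0..3) in DR7.
 * Hypothetical helper type, not taken from the kernel sources. */
struct dr7_slot {
	int local_enable;	/* Ln bit:  bit 2*n */
	int global_enable;	/* Gn bit:  bit 2*n + 1 */
	unsigned int rw;	/* R/Wn:    bits 16+4n..17+4n (0=exec, 1=write, 3=read/write) */
	unsigned int len;	/* LENn:    bits 18+4n..19+4n (0=1 byte, 1=2, 3=4, 2=8 bytes) */
};

/* Extract the fields that control breakpoint slot `slot` from a raw DR7 value. */
static struct dr7_slot dr7_decode(uint64_t dr7, unsigned int slot)
{
	struct dr7_slot s;

	s.local_enable  = (int)((dr7 >> (2 * slot)) & 1);
	s.global_enable = (int)((dr7 >> (2 * slot + 1)) & 1);
	s.rw            = (unsigned int)((dr7 >> (16 + 4 * slot)) & 3);
	s.len           = (unsigned int)((dr7 >> (18 + 4 * slot)) & 3);
	return s;
}

/* A slot can only trigger if its local or global enable bit is set. */
static int dr7_slot_armed(uint64_t dr7, unsigned int slot)
{
	struct dr7_slot s = dr7_decode(dr7, slot);

	return s.local_enable || s.global_enable;
}
```

If the "fail" loop printed a dr7 for which dr7_slot_armed(dr7, 0) is 0
while the test expects the breakpoint in dr0 to be active, that would
point at breakpoint programming rather than at RF flag handling.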