From mboxrd@z Thu Jan 1 00:00:00 1970
From: tixy@linaro.org (Jon Medhurst (Tixy))
Date: Thu, 04 Sep 2014 11:40:35 +0100
Subject: [PATCH v5 3/3] kprobes: arm: enable OPTPROBES for ARM 32
In-Reply-To: <20140903103044.GC32378@arm.com>
References: <1409144552-12751-1-git-send-email-wangnan0@huawei.com>
 <1409144552-12751-4-git-send-email-wangnan0@huawei.com>
 <1409665784.2873.49.camel@linaro1.home>
 <5406EADC.8080009@hitachi.com>
 <20140903103044.GC32378@arm.com>
Message-ID: <1409827235.3008.46.camel@linaro1.home>
To: linux-arm-kernel@lists.infradead.org
List-Id: linux-arm-kernel.lists.infradead.org

On Wed, 2014-09-03 at 11:30 +0100, Will Deacon wrote:
> On Wed, Sep 03, 2014 at 11:18:04AM +0100, Masami Hiramatsu wrote:
> > (2014/09/02 22:49), Jon Medhurst (Tixy) wrote:
> > > 1. On SMP systems it's very slow because of kprobe's use of stop_machine
> > >    for applying and removing probes, this forces the system to idle and
> > >    wait for the next scheduler tick for each probe change.
> >
> > Hmm, agreed. It seems that arm32 limitation of self-modifying code on SMP.
> > I'm not sure how we can handle it, but I guess;
> > - for some processors which have better coherent cache for SMP, we can
> >   atomically replace the breakpoint code with original code.
>
> Except that it's not an architected breakpoint instruction, as I mentioned
> before. It's also not really a property of the cache.
>
> > - Even if we get an "undefined instruction" exception, its handler can
> >   ask kprobes if the address is under modifying or not. And if it is,
> >   we can just return from the exception to retry the execution.
>
> It's not as simple as that -- you could potentially see an interleaving of
> the two instructions.
> The architecture is even broader than that:
>
>   Concurrent modification and execution of instructions can lead to the
>   resulting instruction performing any behavior that can be achieved by
>   executing any sequence of instructions that can be executed from the
>   same Exception level,
>
> There are additional guarantees for some instructions (like the architected
> BKPT instruction).

I should point out that where the current implementation of kprobes uses
stop_machine, it isn't doing so to meet the above architecture
restrictions, and that arming kprobes (changing the probed instruction to
an undefined instruction) isn't usually done under stop_machine, so other
CPUs could be executing the original instruction as it's being modified.

So, should we make patch_text unconditionally use stop_machine and remove
all direct use of __patch_text (e.g. by jump labels)?

-- 
Tixy