From: Thomas Gleixner
To: Andy Lutomirski
Subject: Re: ptrace_syscall_32 is failing
References: <87k0xdjbtt.fsf@nanos.tec.linutronix.de> <87blioinub.fsf@nanos.tec.linutronix.de>
Date: Fri, 04 Sep 2020 12:13:15 +0200
Message-ID: <87mu254zpg.fsf@nanos.tec.linutronix.de>
Cc: linux-s390, linuxppc-dev, Benjamin Herrenschmidt, Vasily Gorbik,
    Brian Gerst, Heiko Carstens, X86 ML, LKML, Christian Borntraeger,
    Paul Mackerras, Catalin Marinas, Andy Lutomirski, Michael Ellerman,
    Will Deacon, linux-arm-kernel

Andy,

On Wed, Sep 02 2020 at 09:49, Andy Lutomirski wrote:
> On Wed, Sep 2, 2020 at 1:29 AM Thomas Gleixner wrote:
>>
>> But you might tell me where exactly you want to inject the SIGTRAP in
>> the syscall exit code flow.
>
> It would be a bit complicated. Definitely after any signals from the
> syscall are delivered. Right now, I think that we don't deliver a
> SIGTRAP on the instruction boundary after SYSCALL while
> single-stepping. (I think we used to, but only sometimes, and now we
> are at least consistent.) This is because IRET will not trap if it
> starts with TF clear and ends up setting it. (I asked Intel to
> document this, and I think they finally did, although I haven't gotten
> around to reading the new docs. Certainly the old docs as of a year
> or two ago had no description whatsoever of how TF changes worked.)
>
> Deciding exactly *when* a trap should occur would be nontrivial -- we
> can't trap on sigreturn() from a SIGTRAP, for example.
>
> So this isn't fully worked out.

Oh well.

>> >> I don't think we want that in general. The current variant is perfectly
>> >> fine for everything except the 32bit fast syscall nonsense. Also
>> >> irqentry_entry/exit is not equivalent to the syscall_enter/exit
>> >> counterparts.
>> >
>> > If there are any architectures in which actual work is needed to
>> > figure out whether something is a syscall in the first place, they'll
>> > want to do the usual kernel entry work before the syscall entry work.
>>
>> That's low level entry code which does not require RCU, lockdep, tracing
>> or whatever muck we setup before actual work can be done.
>>
>> arch_asm_entry()
>>   ...
>>   arch_c_entry(cause) {
>>     switch(cause) {
>>       case EXCEPTION: arch_c_exception(...);
>>       case SYSCALL: arch_c_syscall(...);
>>       ...
>>     }
>
> You're assuming that figuring out the cause doesn't need the kernel
> entry code to run first. In the case of the 32-bit vDSO fast
> syscalls, we arguably don't know whether an entry is a syscall until
> we have done a user memory access. Logically, we're doing:
>
> if (get_user() < 0) {
>   /* Not a syscall. This is actually a silly operation that sets AX =
>      -EFAULT and returns. Do not audit or invoke ptrace. */
> } else {
>   /* This actually is a syscall. */
> }

Yes, that's what I've addressed with providing split interfaces.

>> You really want to differentiate between exception and syscall
>> entry/exit.
>>
>
> Why do we want to distinguish between exception and syscall
> entry/exit? For the enter part, AFAICS the exception case boils down
> to enter_from_user_mode() and the syscall case is:
>
>   enter_from_user_mode(regs);
>   instrumentation_begin();
>
>   local_irq_enable();
>   ti_work = READ_ONCE(current_thread_info()->flags);
>   if (ti_work & SYSCALL_ENTER_WORK)
>       syscall = syscall_trace_enter(regs, syscall, ti_work);
>   instrumentation_end();
>
> Which would decompose quite nicely as a regular (non-syscall) entry
> plus the syscall part later.

There is a difference between syscall entry and exception entry at
least in my view:

syscall:
      enter_from_user_mode(regs);
      local_irq_enable();

exception:
      enter_from_user_mode(regs);

>> we'd have:
>>
>> arch_c_entry()
>>    irqentry_enter();
>>    local_irq_enable();
>>    nr = syscall_enter_from_user_mode_work();
>>    ...
>>
>> which enforces two calls for sane entries and more code in arch/....
>
> This is why I still like my:
>
> arch_c_entry()
>     irqentry_enter_from_user_mode();
>     generic_syscall();
>     exit...

So what we have now (with my patch applied) is either:

1) arch_c_entry()
       nr = syscall_enter_from_user_mode();
       arch_handle_syscall(nr);
       syscall_exit_to_user_mode();

or for that extra 32bit fast syscall thing:

2) arch_c_entry()
       syscall_enter_from_user_mode_prepare();
       arch_do_stuff();
       nr = syscall_enter_from_user_mode_work();
       arch_handle_syscall(nr);
       syscall_exit_to_user_mode();

So for sane cases you just use #1.

Ideally we'd not need arch_handle_syscall(nr) at all, but that does not
work with multiple ABIs supported, i.e. the compat muck.

The only way we could make that work is to have:

    syscall_enter_exit(regs, mode)
        nr = syscall_enter_from_user_mode();
        arch_handle_syscall(mode, nr);
        syscall_exit_to_user_mode();

and then arch_c_entry() becomes:

    syscall_enter_exit(regs, mode);

which means that arch_handle_syscall() would have to evaluate the mode
and choose the appropriate syscall table. Not sure whether that's a win.

Thanks,

        tglx
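
For illustration, the two call sequences (#1 and #2) and the
mode-dispatching variant can be mocked up in user space. This is only a
sketch, not the kernel implementation: the function names are taken from
the mail, but every body is a stub, and struct fake_regs, enum abi_mode,
the trace buffer, the arch_c_entry_sane()/arch_c_entry_fast32() wrappers
and arch_handle_syscall_mode() are hypothetical names made up for the
example. Treating the combined entry function as "prepare followed by
work" is likewise an assumption of the mock-up.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * User-space mock-up, NOT kernel code. Function names follow the mail;
 * the bodies, struct fake_regs, enum abi_mode and the trace buffer are
 * assumptions made up for this illustration.
 */
struct fake_regs { long nr; };             /* stands in for struct pt_regs */
enum abi_mode { ABI_NATIVE, ABI_COMPAT };  /* hypothetical "mode" argument */

static char trace[256];                    /* records the order of calls */

static void trace_reset(void) { trace[0] = '\0'; }

static void step(const char *name)
{
	size_t used = strlen(trace);

	assert(used < sizeof(trace));
	snprintf(trace + used, sizeof(trace) - used, "%s;", name);
}

/* Split interface, first half: runs before any arch specific work. */
static void syscall_enter_from_user_mode_prepare(struct fake_regs *regs)
{
	(void)regs;
	step("prepare");
}

/* Split interface, second half: returns the syscall number to run. */
static long syscall_enter_from_user_mode_work(struct fake_regs *regs)
{
	step("work");
	return regs->nr;
}

/* Combined interface for sane entries (modelled as prepare + work). */
static long syscall_enter_from_user_mode(struct fake_regs *regs)
{
	syscall_enter_from_user_mode_prepare(regs);
	return syscall_enter_from_user_mode_work(regs);
}

static void arch_do_stuff(struct fake_regs *regs) { (void)regs; step("arch_do_stuff"); }
static void arch_handle_syscall(long nr) { (void)nr; step("handle"); }
static void syscall_exit_to_user_mode(struct fake_regs *regs) { (void)regs; step("exit"); }

/* Variant #1: the sane case, one enter call and one exit call. */
static void arch_c_entry_sane(struct fake_regs *regs)
{
	long nr = syscall_enter_from_user_mode(regs);

	arch_handle_syscall(nr);
	syscall_exit_to_user_mode(regs);
}

/* Variant #2: the 32bit fast syscall case with arch work in between. */
static void arch_c_entry_fast32(struct fake_regs *regs)
{
	long nr;

	syscall_enter_from_user_mode_prepare(regs);
	arch_do_stuff(regs);
	nr = syscall_enter_from_user_mode_work(regs);
	arch_handle_syscall(nr);
	syscall_exit_to_user_mode(regs);
}

/* Hypothetical two-argument handler which picks the table by mode. */
static void arch_handle_syscall_mode(enum abi_mode mode, long nr)
{
	(void)nr;
	step(mode == ABI_COMPAT ? "handle:compat" : "handle:native");
}

/* The fully generic variant: arch code only passes regs and mode. */
static void syscall_enter_exit(struct fake_regs *regs, enum abi_mode mode)
{
	long nr = syscall_enter_from_user_mode(regs);

	arch_handle_syscall_mode(mode, nr);
	syscall_exit_to_user_mode(regs);
}
```

Calling arch_c_entry_sane() on a struct fake_regs leaves
"prepare;work;handle;exit;" in trace, while arch_c_entry_fast32() yields
"prepare;arch_do_stuff;work;handle;exit;". The only difference is the
arch specific step wedged between the two halves of the split interface.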