From: ebiederm@xmission.com (Eric W. Biederman)
To: Pavel Begunkov
Cc: linux-kernel@vger.kernel.org, Ingo Molnar, Jens Axboe
Date: Mon, 17 May 2021 12:22:13 -0500
In-Reply-To: (Pavel Begunkov's message of "Mon, 17 May 2021 11:18:07 +0100")
Subject: Re: [PATCH] signal: optimise signal_pending()

Pavel Begunkov writes:

> Optimise signal_pending() by checking both TIF_SIGPENDING and
> TIF_NOTIFY_SIGNAL at once. Saves quite a bit of generated instructions,
> e.g. sheds 240B from io_uring alone, some including ones in hot paths.
>
>    text    data     bss     dec     hex filename
>   84087   12414       8   96509   178fd ./fs/io_uring.o
>   83847   12414       8   96269   1780d ./fs/io_uring.o

I believe the atomic test_bit is pretty fundamental, especially with its
implied barriers. I believe you are optimizing out the code that makes
signal_pending work in a loop.

I have tried looking and I really don't understand why TIF_NOTIFY_SIGNAL
was added.

Perhaps instead of trying to optimize the test, you should optimize by
combining TIF_NOTIFY_SIGNAL with TIF_SIGPENDING. Perhaps
set_notify_signal could be optimized to set both.

I think I only see 4 calls in the tree.
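
The suggestion above (have set_notify_signal set both flags, so that
waiters only need to test TIF_SIGPENDING) could be sketched roughly as
follows. This is a user-space model using C11 atomics, not kernel code:
the bit numbers, struct thread_info_model and the *_model functions are
all hypothetical, and it does not model when the kernel clears either
flag, which any real change would have to deal with.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit numbers; the real ones are architecture-specific. */
#define _TIF_SIGPENDING    (1UL << 2)
#define _TIF_NOTIFY_SIGNAL (1UL << 17)

/* Stand-in for struct thread_info: only the flags word is modelled. */
struct thread_info_model {
	atomic_ulong flags;
};

/* Model of the suggestion: one atomic OR sets both bits together. */
static void set_notify_signal_model(struct thread_info_model *ti)
{
	atomic_fetch_or(&ti->flags, _TIF_NOTIFY_SIGNAL | _TIF_SIGPENDING);
}

/* With both bits set together, a waiter only tests TIF_SIGPENDING. */
static bool signal_pending_model(struct thread_info_model *ti)
{
	return (atomic_load(&ti->flags) & _TIF_SIGPENDING) != 0;
}

/* Small usage example: nothing pending, then a notification arrives. */
static void model_self_check(void)
{
	struct thread_info_model ti;

	atomic_init(&ti.flags, 0);
	assert(!signal_pending_model(&ti));
	set_notify_signal_model(&ti);
	assert(signal_pending_model(&ti));
}
```

The point of the sketch is only that the cost moves from every
signal_pending() call to the few places that set the notification.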

> Signed-off-by: Pavel Begunkov
> ---
>
> Suggestions on how to make it less disruptive to abstractions are most
> welcome, as even the one below fails to generate anything sane because
> of test_bit()
>
> return unlikely(test_ti_thread_flag(ti, TIF_SIGPENDING) |
> 		test_ti_thread_flag(ti, TIF_NOTIFY_SIGNAL));
>
>  include/linux/sched/signal.h | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h
> index 3f6a0fcaa10c..97e1963a13fc 100644
> --- a/include/linux/sched/signal.h
> +++ b/include/linux/sched/signal.h
> @@ -361,14 +361,14 @@ static inline int task_sigpending(struct task_struct *p)
>  
>  static inline int signal_pending(struct task_struct *p)
>  {
> +	struct thread_info *ti = task_thread_info(p);
> +
>  	/*
>  	 * TIF_NOTIFY_SIGNAL isn't really a signal, but it requires the same
>  	 * behavior in terms of ensuring that we break out of wait loops
>  	 * so that notify signal callbacks can be processed.
>  	 */
> -	if (unlikely(test_tsk_thread_flag(p, TIF_NOTIFY_SIGNAL)))
> -		return 1;
> -	return task_sigpending(p);
> +	return unlikely(ti->flags & (_TIF_SIGPENDING | _TIF_NOTIFY_SIGNAL));
>  }
>  
>  static inline int __fatal_signal_pending(struct task_struct *p)
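
For reference, the two shapes of the check in the quoted hunk can be
compared in a small user-space model. This is only a sketch: the bit
numbers, struct thread_info_model and the function names are
hypothetical, and plain loads stand in for test_bit(). It shows that the
two forms return the same answer for a given flags value; it says
nothing about the barrier or re-read behaviour of test_bit() raised in
the reply.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit numbers; the real ones are architecture-specific. */
#define _TIF_SIGPENDING    (1UL << 2)
#define _TIF_NOTIFY_SIGNAL (1UL << 17)

/* Stand-in for struct thread_info: only the flags word is modelled. */
struct thread_info_model {
	unsigned long flags;
};

/* Old shape: two separate single-bit tests. */
static bool signal_pending_two_tests(const struct thread_info_model *ti)
{
	if (ti->flags & _TIF_NOTIFY_SIGNAL)
		return true;
	return (ti->flags & _TIF_SIGPENDING) != 0;
}

/* New shape: one read of flags, tested against a combined mask. */
static bool signal_pending_one_mask(const struct thread_info_model *ti)
{
	return (ti->flags & (_TIF_SIGPENDING | _TIF_NOTIFY_SIGNAL)) != 0;
}

/* Compare both shapes over a handful of flag combinations. */
static void check_equivalence(void)
{
	const unsigned long other = 1UL << 5;	/* unrelated, hypothetical flag */
	const unsigned long cases[] = {
		0, other,
		_TIF_SIGPENDING, _TIF_NOTIFY_SIGNAL,
		_TIF_SIGPENDING | _TIF_NOTIFY_SIGNAL | other,
	};

	for (size_t i = 0; i < sizeof(cases) / sizeof(cases[0]); i++) {
		struct thread_info_model ti = { .flags = cases[i] };

		assert(signal_pending_two_tests(&ti) ==
		       signal_pending_one_mask(&ti));
	}
}
```

Calling check_equivalence() from a test harness exercises both forms;
whether the single-mask form is safe inside a wait loop is the separate
question the reply is asking.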