From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9FF68C433FE for ; Wed, 1 Jun 2022 06:33:11 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 1AD256B0071; Wed, 1 Jun 2022 02:33:11 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 15EBE6B0073; Wed, 1 Jun 2022 02:33:11 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 04B956B0074; Wed, 1 Jun 2022 02:33:10 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id EBF0B6B0071 for ; Wed, 1 Jun 2022 02:33:10 -0400 (EDT) Received: from smtpin17.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id B46A634B61 for ; Wed, 1 Jun 2022 06:33:10 +0000 (UTC) X-FDA: 79528699740.17.A980909 Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) by imf08.hostedemail.com (Postfix) with ESMTP id A8535160056 for ; Wed, 1 Jun 2022 06:32:43 +0000 (UTC) Received: from canpemm500002.china.huawei.com (unknown [172.30.72.53]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4LCfQF2WqczRhTc; Wed, 1 Jun 2022 14:29:57 +0800 (CST) Received: from [10.174.177.76] (10.174.177.76) by canpemm500002.china.huawei.com (7.192.104.244) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.24; Wed, 1 Jun 2022 14:33:02 +0800 Subject: Re: [PATCH v4 1/4] mm: reduce the rcu lock duration To: "Eric W. Biederman" CC: Ying Huang , , , , , , , , , , , , References: <20220530113016.16663-1-linmiaohe@huawei.com> <20220530113016.16663-2-linmiaohe@huawei.com> <87bkvdfzvm.fsf@email.froward.int.ebiederm.org> From: Miaohe Lin Message-ID: Date: Wed, 1 Jun 2022 14:33:02 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.6.0 MIME-Version: 1.0 In-Reply-To: <87bkvdfzvm.fsf@email.froward.int.ebiederm.org> Content-Type: text/plain; charset="utf-8" Content-Language: en-US Content-Transfer-Encoding: 7bit X-Originating-IP: [10.174.177.76] X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To canpemm500002.china.huawei.com (7.192.104.244) X-CFilter-Loop: Reflected X-Rspamd-Server: rspam07 X-Rspamd-Queue-Id: A8535160056 X-Stat-Signature: s4gf9u4jr4a4ryb3zppd8ip795s1nwyb X-Rspam-User: Authentication-Results: imf08.hostedemail.com; dkim=none; spf=pass (imf08.hostedemail.com: domain of linmiaohe@huawei.com designates 45.249.212.188 as permitted sender) smtp.mailfrom=linmiaohe@huawei.com; dmarc=pass (policy=quarantine) header.from=huawei.com X-HE-Tag: 1654065163-984949 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On 2022/6/1 0:09, Eric W. Biederman wrote: > Miaohe Lin writes: snip >> >> " >> Commit 3268c63eded4 ("mm: fix move/migrate_pages() race on task struct") >> extends the period of the rcu_read_lock until after the permissions checks >> are done because it suspects the permissions checks are not safe unless >> performed under both rcu_read_lock and task_lock to ensure the task<->mm >> association does not change on us while we are working [1]. But extended >> rcu read lock does not add much value. Because after permission checking >> the permission may still be changed. There's no much difference. So it's >> unnecessary to extend the period of the rcu_read_lock. Release the rcu >> lock after task refcount is successfully grabbed to reduce the rcu holding >> time. >> >> [1] https://lore.kernel.org/lkml/87sjhzun47.fsf@xmission.com/ >> " > > It doesn't make sense to me. > > I don't see any sleeping functions called from find_mm_struct or > kernel_migrate_pages in the area kernel_migrate_pages in the area of the > code protected by get_task_struct. So at a very basic level I see a > justification for dirtying a cache line twice with get_task_struct and > put_task_struct to reduce rcu_read_lock hold times. > > I would contend that a reasonable cleanup based up on the current state > of the code would be to extend the rcu_read_lock over get_task_mm so If so, security_task_movememory will be called inside rcu lock. It might call sleeping functions, e.g. smack_log(). I think it's not a good idea. > that a reference to task_struct does not need to be taken. That has > the potential to reduce contention and reduce lock hold times. > > > The code is missing a big fat comment with the assertion that it is ok > if the permission checks are racy because the race is small, and the > worst case thing that happens is the page is migrated to another > numa node. > > > Given that the get_mm_task takes task_lock the cost of dirtying the > cache line is already being paid. Perhaps not extending task_lock hold > times a little bit is justified, but I haven't seen that case made. > > This seems like code that is called little enough it would be better for > it to be correct, and not need big fat comments explaining why it > doesn't matter that they code is deliberately buggy. > Agree. A big fat comments will make code hard to follow. > > In short it does not make sense to me to justify a patch for performance > reasons when it appears that extending the rcu_read_lock hold time and > not touch the task reference count would stop dirtying a cache line and > likely have more impact. IMHO, incremented task refcount should make code works correctly. And extending the rcu_read_lock over get_task_mm will break the things because sleeping functions might be called while holding rcu lock. Does the patch itself makes sense for you? Should I rephase the commit log further? I'm afraid I didn't get your point correctly. > > Eric Thanks! > . >