From: "Huang\, Ying"
To: "Paul E. McKenney"
Cc: Andrew Morton, Minchan Kim, Hugh Dickins, Johannes Weiner,
	Tim Chen, Shaohua Li, Mel Gorman, Jérôme Glisse, Michal Hocko,
	Andrea Arcangeli, David Rientjes, Rik van Riel, Jan Kara,
	Dave Jiang, Aaron Lu
Subject: Re: [PATCH -mm] mm, swap: Fix race between swapoff and some swap operations
References: <20171207011426.1633-1-ying.huang@intel.com>
	<20171207162937.6a179063a7c92ecac77e44af@linux-foundation.org>
	<20171208014346.GA8915@bbox>
	<87po7pg4jt.fsf@yhuang-dev.intel.com>
	<20171208082644.GA14361@bbox>
	<87k1xxbohp.fsf@yhuang-dev.intel.com>
	<20171208140909.4e31ba4f1235b638ae68fd5c@linux-foundation.org>
	<87609dvnl0.fsf@yhuang-dev.intel.com>
	<20171211170449.GS7829@linux.vnet.ibm.com>
Date: Tue, 12 Dec 2017 09:12:20 +0800
In-Reply-To: <20171211170449.GS7829@linux.vnet.ibm.com> (Paul E. McKenney's
	message of "Mon, 11 Dec 2017 09:04:49 -0800")
Message-ID: <87374grbpn.fsf@yhuang-dev.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi, Paul,

"Paul E. McKenney" writes:

> On Mon, Dec 11, 2017 at 01:30:03PM +0800, Huang, Ying wrote:
>> Andrew Morton writes:
>>
>> > On Fri, 08 Dec 2017 16:41:38 +0800 "Huang\, Ying" wrote:
>> >
>> >> > Why do we need srcu here?
>> >> > Is it enough with rcu like below?
>> >> >
>> >> > It might have a bug/room to be optimized about performance/naming.
>> >> > I just wanted to show my intention.
>> >>
>> >> Yes.  rcu should work too.  But if we use rcu, it may need to be called
>> >> several times to make sure the swap device under us doesn't go away, for
>> >> example, when checking si->max in __swp_swapcount() and
>> >> add_swap_count_continuation().  And I found we need rcu to protect swap
>> >> cache radix tree array too.  So I think it may be better to use one
>> >> calling to srcu_read_lock/unlock() instead of multiple callings to
>> >> rcu_read_lock/unlock().
>> >
>> > Or use stop_machine() ;)  It's very crude but it sure is simple.  Does
>> > anyone have a swapoff-intensive workload?
>>
>> Sorry, I don't know how to solve the problem with stop_machine().
>>
>> The problem we try to resolve is that, we have a swap entry, but that
>> swap entry can become invalid because of swapoff between we check it
>> and we use it.  So we need to prevent swapoff to be run between checking
>> and using.
>>
>> I don't know how to use stop_machine() in swapoff to wait for all users
>> of swap entry to finish.  Anyone can help me on this?
>
> You can think of stop_machine() as being sort of like a reader-writer
> lock.  The readers can be any section of code with preemption disabled,
> and the writer is the function passed to stop_machine().
>
> Users running real-time applications on Linux don't tend to like
> stop_machine() much, but perhaps it is nevertheless the right tool
> for this particular job.

Thanks a lot for the explanation!  Now I understand this.

One more question: for this specific problem, I think both a
stop_machine()-based solution and an rcu_read_lock/unlock() +
synchronize_rcu()-based solution would work.  If so, what is the
difference between them?  I guess the RCU-based solution would be a
little better for real-time applications.  So what is the advantage of
the stop_machine()-based solution?

Best Regards,
Huang, Ying