Date: Thu, 9 Jan 2025 12:07:43 +0800
From: Baoquan He
To: Kairui Song
Cc: linux-mm@kvack.org, Andrew Morton, Chris Li, Barry Song, Ryan Roberts,
 Hugh Dickins, Yosry Ahmed, "Huang, Ying", Nhat Pham, Johannes Weiner,
 Kalesh Singh, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 04/13] mm, swap: use cluster lock for HDD
References: <20241230174621.61185-1-ryncsn@gmail.com>
 <20241230174621.61185-5-ryncsn@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20241230174621.61185-5-ryncsn@gmail.com>
On 12/31/24 at 01:46am, Kairui Song wrote:
> From: Kairui Song
> 
> Cluster lock (ci->lock) was introduce to reduce contention for certain
                              ~~~~~~~~~ typo, introduced.
> operations. Using cluster lock for HDD is not helpful as HDD have a poor
> performance, so locking isn't the bottleneck. But having different set
> of locks for HDD / non-HDD prevents further rework of device lock
> (si->lock).
> 
> This commit just changed all lock_cluster_or_swap_info to lock_cluster,
> which is a safe and straight conversion since cluster info is always
> allocated now, also removed all cluster_info related checks.
> 
> Suggested-by: Chris Li
> Signed-off-by: Kairui Song
> ---
>  mm/swapfile.c | 107 ++++++++++++++++----------------------------------
>  1 file changed, 34 insertions(+), 73 deletions(-)

Other than the nit in patch log, LGTM,

Reviewed-by: Baoquan He

> 
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index fca58d43b836..d0e5b9fa0c48 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -58,10 +58,9 @@ static void swap_entry_range_free(struct swap_info_struct *si, swp_entry_t entry
>  static void swap_range_alloc(struct swap_info_struct *si, unsigned long offset,
>  			     unsigned int nr_entries);
>  static bool folio_swapcache_freeable(struct folio *folio);
> -static struct swap_cluster_info *lock_cluster_or_swap_info(
> -		struct swap_info_struct *si, unsigned long offset);
> -static void unlock_cluster_or_swap_info(struct swap_info_struct *si,
> -		struct swap_cluster_info *ci);
> +static struct swap_cluster_info *lock_cluster(struct swap_info_struct *si,
> +		unsigned long offset);
> +static void unlock_cluster(struct swap_cluster_info *ci);
>  
>  static DEFINE_SPINLOCK(swap_lock);
>  static unsigned int nr_swapfiles;
> @@ -222,9 +221,9 @@ static int __try_to_reclaim_swap(struct swap_info_struct *si,
>  	 * swap_map is HAS_CACHE only, which means the slots have no page table
>  	 * reference or pending writeback, and can't be allocated to others.
>  	 */
> -	ci = lock_cluster_or_swap_info(si, offset);
> +	ci = lock_cluster(si, offset);
>  	need_reclaim = swap_is_has_cache(si, offset, nr_pages);
> -	unlock_cluster_or_swap_info(si, ci);
> +	unlock_cluster(ci);
>  	if (!need_reclaim)
>  		goto out_unlock;
>  
> @@ -404,45 +403,15 @@ static inline struct swap_cluster_info *lock_cluster(struct swap_info_struct *si
>  {
>  	struct swap_cluster_info *ci;
>  
> -	ci = si->cluster_info;
> -	if (ci) {
> -		ci += offset / SWAPFILE_CLUSTER;
> -		spin_lock(&ci->lock);
> -	}
> -	return ci;
> -}
> -
> -static inline void unlock_cluster(struct swap_cluster_info *ci)
> -{
> -	if (ci)
> -		spin_unlock(&ci->lock);
> -}
> -
> -/*
> - * Determine the locking method in use for this device.  Return
> - * swap_cluster_info if SSD-style cluster-based locking is in place.
> - */
> -static inline struct swap_cluster_info *lock_cluster_or_swap_info(
> -		struct swap_info_struct *si, unsigned long offset)
> -{
> -	struct swap_cluster_info *ci;
> -
> -	/* Try to use fine-grained SSD-style locking if available: */
> -	ci = lock_cluster(si, offset);
> -	/* Otherwise, fall back to traditional, coarse locking: */
> -	if (!ci)
> -		spin_lock(&si->lock);
> +	ci = &si->cluster_info[offset / SWAPFILE_CLUSTER];
> +	spin_lock(&ci->lock);
>  
>  	return ci;
>  }
>  
> -static inline void unlock_cluster_or_swap_info(struct swap_info_struct *si,
> -		struct swap_cluster_info *ci)
> +static inline void unlock_cluster(struct swap_cluster_info *ci)
>  {
> -	if (ci)
> -		unlock_cluster(ci);
> -	else
> -		spin_unlock(&si->lock);
> +	spin_unlock(&ci->lock);
>  }
>  
>  /* Add a cluster to discard list and schedule it to do discard */
> @@ -558,9 +527,6 @@ static void inc_cluster_info_page(struct swap_info_struct *si,
>  	unsigned long idx = page_nr / SWAPFILE_CLUSTER;
>  	struct swap_cluster_info *ci;
>  
> -	if (!cluster_info)
> -		return;
> -
>  	ci = cluster_info + idx;
>  	ci->count++;
>  
> @@ -576,9 +542,6 @@ static void inc_cluster_info_page(struct swap_info_struct *si,
>  static void dec_cluster_info_page(struct swap_info_struct *si,
>  		struct swap_cluster_info *ci, int nr_pages)
>  {
> -	if (!si->cluster_info)
> -		return;
> -
>  	VM_BUG_ON(ci->count < nr_pages);
>  	VM_BUG_ON(cluster_is_free(ci));
>  	lockdep_assert_held(&si->lock);
> @@ -1007,8 +970,6 @@ static int cluster_alloc_swap(struct swap_info_struct *si,
>  {
>  	int n_ret = 0;
>  
> -	VM_BUG_ON(!si->cluster_info);
> -
>  	si->flags += SWP_SCANNING;
>  
>  	while (n_ret < nr) {
> @@ -1052,10 +1013,10 @@ static int scan_swap_map_slots(struct swap_info_struct *si,
>  		}
>  
>  		/*
> -		 * Swapfile is not block device or not using clusters so unable
> +		 * Swapfile is not block device so unable
>  		 * to allocate large entries.
>  		 */
> -		if (!(si->flags & SWP_BLKDEV) || !si->cluster_info)
> +		if (!(si->flags & SWP_BLKDEV))
>  			return 0;
>  	}
>  
> @@ -1295,9 +1256,9 @@ static unsigned char __swap_entry_free(struct swap_info_struct *si,
>  	unsigned long offset = swp_offset(entry);
>  	unsigned char usage;
>  
> -	ci = lock_cluster_or_swap_info(si, offset);
> +	ci = lock_cluster(si, offset);
>  	usage = __swap_entry_free_locked(si, offset, 1);
> -	unlock_cluster_or_swap_info(si, ci);
> +	unlock_cluster(ci);
>  	if (!usage)
>  		free_swap_slot(entry);
>  
> @@ -1320,14 +1281,14 @@ static bool __swap_entries_free(struct swap_info_struct *si,
>  	if (nr > SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER)
>  		goto fallback;
>  
> -	ci = lock_cluster_or_swap_info(si, offset);
> +	ci = lock_cluster(si, offset);
>  	if (!swap_is_last_map(si, offset, nr, &has_cache)) {
> -		unlock_cluster_or_swap_info(si, ci);
> +		unlock_cluster(ci);
>  		goto fallback;
>  	}
>  	for (i = 0; i < nr; i++)
>  		WRITE_ONCE(si->swap_map[offset + i], SWAP_HAS_CACHE);
> -	unlock_cluster_or_swap_info(si, ci);
> +	unlock_cluster(ci);
>  
>  	if (!has_cache) {
>  		for (i = 0; i < nr; i++)
> @@ -1383,7 +1344,7 @@ static void cluster_swap_free_nr(struct swap_info_struct *si,
>  	DECLARE_BITMAP(to_free, BITS_PER_LONG) = { 0 };
>  	int i, nr;
>  
> -	ci = lock_cluster_or_swap_info(si, offset);
> +	ci = lock_cluster(si, offset);
>  	while (nr_pages) {
>  		nr = min(BITS_PER_LONG, nr_pages);
>  		for (i = 0; i < nr; i++) {
> @@ -1391,18 +1352,18 @@ static void cluster_swap_free_nr(struct swap_info_struct *si,
>  			bitmap_set(to_free, i, 1);
>  		}
>  		if (!bitmap_empty(to_free, BITS_PER_LONG)) {
> -			unlock_cluster_or_swap_info(si, ci);
> +			unlock_cluster(ci);
>  			for_each_set_bit(i, to_free, BITS_PER_LONG)
>  				free_swap_slot(swp_entry(si->type, offset + i));
>  			if (nr == nr_pages)
>  				return;
>  			bitmap_clear(to_free, 0, BITS_PER_LONG);
> -			ci = lock_cluster_or_swap_info(si, offset);
> +			ci = lock_cluster(si, offset);
>  		}
>  		offset += nr;
>  		nr_pages -= nr;
>  	}
> -	unlock_cluster_or_swap_info(si, ci);
> +	unlock_cluster(ci);
>  }
>  
>  /*
> @@ -1441,9 +1402,9 @@ void put_swap_folio(struct folio *folio, swp_entry_t entry)
>  	if (!si)
>  		return;
>  
> -	ci = lock_cluster_or_swap_info(si, offset);
> +	ci = lock_cluster(si, offset);
>  	if (size > 1 && swap_is_has_cache(si, offset, size)) {
> -		unlock_cluster_or_swap_info(si, ci);
> +		unlock_cluster(ci);
>  		spin_lock(&si->lock);
>  		swap_entry_range_free(si, entry, size);
>  		spin_unlock(&si->lock);
> @@ -1451,14 +1412,14 @@ void put_swap_folio(struct folio *folio, swp_entry_t entry)
>  	}
>  	for (int i = 0; i < size; i++, entry.val++) {
>  		if (!__swap_entry_free_locked(si, offset + i, SWAP_HAS_CACHE)) {
> -			unlock_cluster_or_swap_info(si, ci);
> +			unlock_cluster(ci);
>  			free_swap_slot(entry);
>  			if (i == size - 1)
>  				return;
> -			lock_cluster_or_swap_info(si, offset);
> +			lock_cluster(si, offset);
>  		}
>  	}
> -	unlock_cluster_or_swap_info(si, ci);
> +	unlock_cluster(ci);
>  }
>  
>  static int swp_entry_cmp(const void *ent1, const void *ent2)
> @@ -1522,9 +1483,9 @@ int swap_swapcount(struct swap_info_struct *si, swp_entry_t entry)
>  	struct swap_cluster_info *ci;
>  	int count;
>  
> -	ci = lock_cluster_or_swap_info(si, offset);
> +	ci = lock_cluster(si, offset);
>  	count = swap_count(si->swap_map[offset]);
> -	unlock_cluster_or_swap_info(si, ci);
> +	unlock_cluster(ci);
>  	return count;
>  }
>  
> @@ -1547,7 +1508,7 @@ int swp_swapcount(swp_entry_t entry)
>  
>  	offset = swp_offset(entry);
>  
> -	ci = lock_cluster_or_swap_info(si, offset);
> +	ci = lock_cluster(si, offset);
>  
>  	count = swap_count(si->swap_map[offset]);
>  	if (!(count & COUNT_CONTINUED))
> @@ -1570,7 +1531,7 @@ int swp_swapcount(swp_entry_t entry)
>  		n *= (SWAP_CONT_MAX + 1);
>  	} while (tmp_count & COUNT_CONTINUED);
> out:
> -	unlock_cluster_or_swap_info(si, ci);
> +	unlock_cluster(ci);
>  	return count;
>  }
>  
> @@ -1585,8 +1546,8 @@ static bool swap_page_trans_huge_swapped(struct swap_info_struct *si,
>  	int i;
>  	bool ret = false;
>  
> -	ci = lock_cluster_or_swap_info(si, offset);
> -	if (!ci || nr_pages == 1) {
> +	ci = lock_cluster(si, offset);
> +	if (nr_pages == 1) {
>  		if (swap_count(map[roffset]))
>  			ret = true;
>  		goto unlock_out;
> @@ -1598,7 +1559,7 @@ static bool swap_page_trans_huge_swapped(struct swap_info_struct *si,
>  		}
>  	}
> unlock_out:
> -	unlock_cluster_or_swap_info(si, ci);
> +	unlock_cluster(ci);
>  	return ret;
>  }
>  
> @@ -3428,7 +3389,7 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage, int nr)
>  	offset = swp_offset(entry);
>  	VM_WARN_ON(nr > SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER);
>  	VM_WARN_ON(usage == 1 && nr > 1);
> -	ci = lock_cluster_or_swap_info(si, offset);
> +	ci = lock_cluster(si, offset);
>  
>  	err = 0;
>  	for (i = 0; i < nr; i++) {
> @@ -3483,7 +3444,7 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage, int nr)
>  	}
>  
> unlock_out:
> -	unlock_cluster_or_swap_info(si, ci);
> +	unlock_cluster(ci);
>  	return err;
>  }
>  
> -- 
> 2.47.1
> 
> 