From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: GPF on 0xdead000000000100 in nvme_map_data - Linux 5.9.9
To: Jason Andryuk, Roger Pau Monné
Cc: Sagi Grimberg, Marek Marczykowski-Górecki,
 linux-nvme@lists.infradead.org, Jens Axboe, Keith Busch, xen-devel,
 Christoph Hellwig
References: <20201129035639.GW2532@mail-itl>
 <20201130164010.GA23494@redsun51.ssa.fujisawa.hgst.com>
 <20201202000642.GJ201140@mail-itl>
 <20201204110847.GU201140@mail-itl>
 <20201204120803.GA20727@lst.de>
 <20201204122054.GV201140@mail-itl>
 <20201205082839.ts3ju6yta46cgwjn@Air-de-Roger>
From: Jürgen Groß
Message-ID: <7de66323-8a1f-fe95-c9d2-d2a5b1318d2f@suse.com>
Date: Mon, 7 Dec 2020 09:53:43 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.4.0
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============6406864312272058914=="

This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
--===============6406864312272058914==
Content-Type: multipart/signed; micalg=pgp-sha256;
 protocol="application/pgp-signature";
 boundary="DjInZuhMnyFoVAwIkWsZZLcVQw8Nn1gxE"

--DjInZuhMnyFoVAwIkWsZZLcVQw8Nn1gxE
Content-Type: multipart/mixed; boundary="dj6QMlgeBtfrRkv9HmCRuQeYiLeVMFEpv";
 protected-headers="v1"

--dj6QMlgeBtfrRkv9HmCRuQeYiLeVMFEpv
Content-Type: multipart/mixed;
 boundary="------------8A9C09CAAF7010E62E1E799D"
Content-Language: en-US

This is a multi-part message in MIME format.
--------------8A9C09CAAF7010E62E1E799D
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: quoted-printable

Marek,

On 06.12.20 17:47, Jason Andryuk wrote:
> On Sat, Dec 5, 2020 at 3:29 AM Roger Pau Monné wrote:
>>
>> On Fri, Dec 04, 2020 at 01:20:54PM +0100, Marek Marczykowski-Górecki wrote:
>>> On Fri, Dec 04, 2020 at 01:08:03PM +0100, Christoph Hellwig wrote:
>>>> On Fri, Dec 04, 2020 at 12:08:47PM +0100, Marek Marczykowski-Górecki wrote:
>>>>> culprit:
>>>>>
>>>>> commit 9e2369c06c8a181478039258a4598c1ddd2cadfa
>>>>> Author: Roger Pau Monne
>>>>> Date: Tue Sep 1 10:33:26 2020 +0200
>>>>>
>>>>>     xen: add helpers to allocate unpopulated memory
>>>>>
>>>>> I'm adding relevant people and xen-devel to the thread.
>>>>> For completeness, here is the original crash message:
>>>>
>>>> That commit definitively adds a new ZONE_DEVICE user, so it does look
>>>> related. But you are not running on Xen, are you?
>>>
>>> I am. It is Xen dom0.
>>
>> I'm afraid I'm on leave and won't be able to look into this until the
>> beginning of January. I would guess it's some kind of bad
>> interaction between blkback and NVMe drivers both using ZONE_DEVICE?
>>
>> Maybe the best is to revert this change and I will look into it when
>> I get back, unless someone is willing to debug this further.
>
> Looking at commit 9e2369c06c8a and xen-blkback put_free_pages(), they
> both use page->lru which is part of the anonymous union shared with
> *pgmap. That matches Marek's suspicion that the ZONE_DEVICE memory is
> being used as ZONE_NORMAL.
>
> memmap_init_zone_device() says:
> * ZONE_DEVICE pages union ->lru with a ->pgmap back pointer
> * and zone_device_data. It is a bug if a ZONE_DEVICE page is
> * ever freed or placed on a driver-private list.

Could you test whether the two attached patches help? They are only
compile tested.

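For anyone who wants to see the aliasing Jason describes without booting a
kernel, here is a minimal user-space sketch. It is not kernel code:
`struct sketch_page`, the `SKETCH_LIST_POISON*` macros and
`pgmap_after_private_list()` are hypothetical names made up for this
illustration, the list helpers are simplified copies of the kernel's
list_add()/list_del(), and the poison values assume an x86_64 build with
CONFIG_ILLEGAL_POINTER_VALUE=0xdead000000000000. Under those assumptions it
shows that a page which went through a driver-private free list ends up with
0xdead000000000100 where ->pgmap used to be, which would be consistent with
the address in the reported GPF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

_Static_assert(sizeof(void *) == 8, "sketch assumes a 64-bit build");

/* Assumed x86_64 poison values; only the first one matters here. */
#define SKETCH_LIST_POISON1 ((void *)(uintptr_t)0xdead000000000100ULL)
#define SKETCH_LIST_POISON2 ((void *)(uintptr_t)0xdead000000000122ULL)

struct list_head {
	struct list_head *next, *prev;
};

/* Simplified stand-in for the union inside struct page. */
struct sketch_page {
	union {
		struct list_head lru;		/* ordinary pages */
		struct {			/* ZONE_DEVICE pages */
			void *pgmap;
			void *zone_device_data;
		};
	};
};

/* Insert entry right after head, like the kernel's list_add(). */
static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Unlink entry and poison its pointers, like the kernel's list_del(). */
static void list_del(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	entry->next = SKETCH_LIST_POISON1;
	entry->prev = SKETCH_LIST_POISON2;
}

/*
 * Returns what a later reader finds in ->pgmap after the page was put on
 * and taken off a private list (the old put_free_pages()/get_free_page()).
 */
uintptr_t pgmap_after_private_list(void *real_pgmap)
{
	struct list_head free_pages = { &free_pages, &free_pages };
	struct sketch_page page;

	page.pgmap = real_pgmap;
	page.zone_device_data = NULL;

	list_add(&page.lru, &free_pages);	/* overwrites pgmap */
	list_del(&page.lru);			/* leaves the poison behind */

	return (uintptr_t)page.pgmap;
}
```

Calling pgmap_after_private_list() with any valid pointer returns the poison
value instead of that pointer, so whoever dereferences ->pgmap afterwards
faults on a non-canonical address.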
Juergen

--------------8A9C09CAAF7010E62E1E799D
Content-Type: text/x-patch; charset=UTF-8;
 name="0001-xen-add-helpers-for-caching-grant-mapping-pages.patch"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment;
 filename*0="0001-xen-add-helpers-for-caching-grant-mapping-pages.patch"

From 4f6ad98ce5fd457fd12e6617b0bc2a8f82fbce4d Mon Sep 17 00:00:00 2001
From: Juergen Gross
Date: Mon, 7 Dec 2020 08:31:22 +0100
Subject: [PATCH 1/2] xen: add helpers for caching grant mapping pages

Instead of having similar helpers in multiple backend drivers use
common helpers for caching pages allocated via gnttab_alloc_pages().

Make use of those helpers in blkback and scsiback.

Signed-off-by: Juergen Gross
---
 drivers/block/xen-blkback/blkback.c | 89 ++++++-----------------------
 drivers/block/xen-blkback/common.h  |  4 +-
 drivers/block/xen-blkback/xenbus.c  |  6 +-
 drivers/xen/grant-table.c           | 72 +++++++++++++++++++++++
 drivers/xen/xen-scsiback.c          | 60 ++++---------------
 include/xen/grant_table.h           | 13 +++++
 6 files changed, 116 insertions(+), 128 deletions(-)

diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
index 501e9dacfff9..9ebf53903d7b 100644
--- a/drivers/block/xen-blkback/blkback.c
+++ b/drivers/block/xen-blkback/blkback.c
@@ -132,73 +132,12 @@ module_param(log_stats, int, 0644);
 
 #define BLKBACK_INVALID_HANDLE (~0)
 
-/* Number of free pages to remove on each call to gnttab_free_pages */
-#define NUM_BATCH_FREE_PAGES 10
-
 static inline bool persistent_gnt_timeout(struct persistent_gnt *persistent_gnt)
 {
 	return pgrant_timeout && (jiffies - persistent_gnt->last_used >=
 			HZ * pgrant_timeout);
 }
 
-static inline int get_free_page(struct xen_blkif_ring *ring, struct page **page)
-{
-	unsigned long flags;
-
-	spin_lock_irqsave(&ring->free_pages_lock, flags);
-	if (list_empty(&ring->free_pages)) {
-		BUG_ON(ring->free_pages_num != 0);
-		spin_unlock_irqrestore(&ring->free_pages_lock, flags);
-		return gnttab_alloc_pages(1, page);
-	}
-	BUG_ON(ring->free_pages_num == 0);
-	page[0] = list_first_entry(&ring->free_pages, struct page, lru);
-	list_del(&page[0]->lru);
-	ring->free_pages_num--;
-	spin_unlock_irqrestore(&ring->free_pages_lock, flags);
-
-	return 0;
-}
-
-static inline void put_free_pages(struct xen_blkif_ring *ring, struct page **page,
-                                  int num)
-{
-	unsigned long flags;
-	int i;
-
-	spin_lock_irqsave(&ring->free_pages_lock, flags);
-	for (i = 0; i < num; i++)
-		list_add(&page[i]->lru, &ring->free_pages);
-	ring->free_pages_num += num;
-	spin_unlock_irqrestore(&ring->free_pages_lock, flags);
-}
-
-static inline void shrink_free_pagepool(struct xen_blkif_ring *ring, int num)
-{
-	/* Remove requested pages in batches of NUM_BATCH_FREE_PAGES */
-	struct page *page[NUM_BATCH_FREE_PAGES];
-	unsigned int num_pages = 0;
-	unsigned long flags;
-
-	spin_lock_irqsave(&ring->free_pages_lock, flags);
-	while (ring->free_pages_num > num) {
-		BUG_ON(list_empty(&ring->free_pages));
-		page[num_pages] = list_first_entry(&ring->free_pages,
-		                                   struct page, lru);
-		list_del(&page[num_pages]->lru);
-		ring->free_pages_num--;
-		if (++num_pages == NUM_BATCH_FREE_PAGES) {
-			spin_unlock_irqrestore(&ring->free_pages_lock, flags);
-			gnttab_free_pages(num_pages, page);
-			spin_lock_irqsave(&ring->free_pages_lock, flags);
-			num_pages = 0;
-		}
-	}
-	spin_unlock_irqrestore(&ring->free_pages_lock, flags);
-	if (num_pages != 0)
-		gnttab_free_pages(num_pages, page);
-}
-
 #define vaddr(page) ((unsigned long)pfn_to_kaddr(page_to_pfn(page)))
 
 static int do_block_io_op(struct xen_blkif_ring *ring, unsigned int *eoi_flags);
@@ -331,7 +270,8 @@ static void free_persistent_gnts(struct xen_blkif_ring *ring, struct rb_root *ro
 			unmap_data.count = segs_to_unmap;
 			BUG_ON(gnttab_unmap_refs_sync(&unmap_data));
 
-			put_free_pages(ring, pages, segs_to_unmap);
+			gnttab_page_cache_put(&ring->free_pages, pages,
+					      segs_to_unmap);
 			segs_to_unmap = 0;
 		}
 
@@ -371,7 +311,8 @@ void xen_blkbk_unmap_purged_grants(struct work_struct *work)
 		if (++segs_to_unmap == BLKIF_MAX_SEGMENTS_PER_REQUEST) {
 			unmap_data.count = segs_to_unmap;
 			BUG_ON(gnttab_unmap_refs_sync(&unmap_data));
-			put_free_pages(ring, pages, segs_to_unmap);
+			gnttab_page_cache_put(&ring->free_pages, pages,
+					      segs_to_unmap);
 			segs_to_unmap = 0;
 		}
 		kfree(persistent_gnt);
@@ -379,7 +320,7 @@ void xen_blkbk_unmap_purged_grants(struct work_struct *work)
 	if (segs_to_unmap > 0) {
 		unmap_data.count = segs_to_unmap;
 		BUG_ON(gnttab_unmap_refs_sync(&unmap_data));
-		put_free_pages(ring, pages, segs_to_unmap);
+		gnttab_page_cache_put(&ring->free_pages, pages, segs_to_unmap);
 	}
 }
 
@@ -664,9 +605,10 @@ int xen_blkif_schedule(void *arg)
 
 		/* Shrink the free pages pool if it is too large. */
 		if (time_before(jiffies, blkif->buffer_squeeze_end))
-			shrink_free_pagepool(ring, 0);
+			gnttab_page_cache_shrink(&ring->free_pages, 0);
 		else
-			shrink_free_pagepool(ring, max_buffer_pages);
+			gnttab_page_cache_shrink(&ring->free_pages,
+						 max_buffer_pages);
 
 		if (log_stats && time_after(jiffies, ring->st_print))
 			print_stats(ring);
@@ -697,7 +639,7 @@ void xen_blkbk_free_caches(struct xen_blkif_ring *ring)
 	ring->persistent_gnt_c = 0;
 
 	/* Since we are shutting down remove all pages from the buffer */
-	shrink_free_pagepool(ring, 0 /* All */);
+	gnttab_page_cache_shrink(&ring->free_pages, 0 /* All */);
 }
 
 static unsigned int xen_blkbk_unmap_prepare(
@@ -736,7 +678,7 @@ static void xen_blkbk_unmap_and_respond_callback(int result, struct gntab_unmap_
 	   but is this the best way to deal with this? */
 	BUG_ON(result);
 
-	put_free_pages(ring, data->pages, data->count);
+	gnttab_page_cache_put(&ring->free_pages, data->pages, data->count);
 	make_response(ring, pending_req->id,
 		      pending_req->operation, pending_req->status);
 	free_req(ring, pending_req);
@@ -803,7 +745,8 @@ static void xen_blkbk_unmap(struct xen_blkif_ring *ring,
 		if (invcount) {
 			ret = gnttab_unmap_refs(unmap, NULL, unmap_pages, invcount);
 			BUG_ON(ret);
-			put_free_pages(ring, unmap_pages, invcount);
+			gnttab_page_cache_put(&ring->free_pages, unmap_pages,
+					      invcount);
 		}
 		pages += batch;
 		num -= batch;
@@ -850,7 +793,8 @@ static int xen_blkbk_map(struct xen_blkif_ring *ring,
 			pages[i]->page = persistent_gnt->page;
 			pages[i]->persistent_gnt = persistent_gnt;
 		} else {
-			if (get_free_page(ring, &pages[i]->page))
+			if (gnttab_page_cache_get(&ring->free_pages,
+						  &pages[i]->page))
 				goto out_of_memory;
 			addr = vaddr(pages[i]->page);
 			pages_to_gnt[segs_to_map] = pages[i]->page;
@@ -883,7 +827,8 @@ static int xen_blkbk_map(struct xen_blkif_ring *ring,
 			BUG_ON(new_map_idx >= segs_to_map);
 			if (unlikely(map[new_map_idx].status != 0)) {
 				pr_debug("invalid buffer -- could not remap it\n");
-				put_free_pages(ring, &pages[seg_idx]->page, 1);
+				gnttab_page_cache_put(&ring->free_pages,
+						      &pages[seg_idx]->page, 1);
 				pages[seg_idx]->handle = BLKBACK_INVALID_HANDLE;
 				ret |= 1;
 				goto next;
@@ -944,7 +889,7 @@ static int xen_blkbk_map(struct xen_blkif_ring *ring,
 
 out_of_memory:
 	pr_alert("%s: out of memory\n", __func__);
-	put_free_pages(ring, pages_to_gnt, segs_to_map);
+	gnttab_page_cache_put(&ring->free_pages, pages_to_gnt, segs_to_map);
 	for (i = last_map; i < num; i++)
 		pages[i]->handle = BLKBACK_INVALID_HANDLE;
 	return -ENOMEM;
diff --git a/drivers/block/xen-blkback/common.h b/drivers/block/xen-blkback/common.h
index c6ea5d38c509..a1b9df2c4ef1 100644
--- a/drivers/block/xen-blkback/common.h
+++ b/drivers/block/xen-blkback/common.h
@@ -288,9 +288,7 @@ struct xen_blkif_ring {
 	struct work_struct	persistent_purge_work;
 
 	/* Buffer of free pages to map grant refs. */
-	spinlock_t		free_pages_lock;
-	int			free_pages_num;
-	struct list_head	free_pages;
+	struct gnttab_page_cache free_pages;
 
 	struct work_struct	free_work;
 	/* Thread shutdown wait queue. */
diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
index f5705569e2a7..76912c584a76 100644
--- a/drivers/block/xen-blkback/xenbus.c
+++ b/drivers/block/xen-blkback/xenbus.c
@@ -144,8 +144,7 @@ static int xen_blkif_alloc_rings(struct xen_blkif *blkif)
 		INIT_LIST_HEAD(&ring->pending_free);
 		INIT_LIST_HEAD(&ring->persistent_purge_list);
 		INIT_WORK(&ring->persistent_purge_work, xen_blkbk_unmap_purged_grants);
-		spin_lock_init(&ring->free_pages_lock);
-		INIT_LIST_HEAD(&ring->free_pages);
+		gnttab_page_cache_init(&ring->free_pages);
 
 		spin_lock_init(&ring->pending_free_lock);
 		init_waitqueue_head(&ring->pending_free_wq);
@@ -317,8 +316,7 @@ static int xen_blkif_disconnect(struct xen_blkif *blkif)
 		BUG_ON(atomic_read(&ring->persistent_gnt_in_use) != 0);
 		BUG_ON(!list_empty(&ring->persistent_purge_list));
 		BUG_ON(!RB_EMPTY_ROOT(&ring->persistent_gnts));
-		BUG_ON(!list_empty(&ring->free_pages));
-		BUG_ON(ring->free_pages_num != 0);
+		BUG_ON(ring->free_pages.num_pages != 0);
 		BUG_ON(ring->persistent_gnt_c != 0);
 		WARN_ON(i != (XEN_BLKIF_REQS_PER_PAGE * blkif->nr_ring_pages));
 		ring->active = false;
diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index 523dcdf39cc9..e2e42912f241 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -813,6 +813,78 @@ int gnttab_alloc_pages(int nr_pages, struct page **pages)
 }
 EXPORT_SYMBOL_GPL(gnttab_alloc_pages);
 
+void gnttab_page_cache_init(struct gnttab_page_cache *cache)
+{
+	spin_lock_init(&cache->lock);
+	INIT_LIST_HEAD(&cache->pages);
+	cache->num_pages = 0;
+}
+EXPORT_SYMBOL_GPL(gnttab_page_cache_init);
+
+int gnttab_page_cache_get(struct gnttab_page_cache *cache, struct page **page)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&cache->lock, flags);
+
+	if (list_empty(&cache->pages)) {
+		spin_unlock_irqrestore(&cache->lock, flags);
+		return gnttab_alloc_pages(1, page);
+	}
+
+	page[0] = list_first_entry(&cache->pages, struct page, lru);
+	list_del(&page[0]->lru);
+	cache->num_pages--;
+
+	spin_unlock_irqrestore(&cache->lock, flags);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(gnttab_page_cache_get);
+
+void gnttab_page_cache_put(struct gnttab_page_cache *cache, struct page **page,
+			   unsigned int num)
+{
+	unsigned long flags;
+	unsigned int i;
+
+	spin_lock_irqsave(&cache->lock, flags);
+
+	for (i = 0; i < num; i++)
+		list_add(&page[i]->lru, &cache->pages);
+	cache->num_pages += num;
+
+	spin_unlock_irqrestore(&cache->lock, flags);
+}
+EXPORT_SYMBOL_GPL(gnttab_page_cache_put);
+
+void gnttab_page_cache_shrink(struct gnttab_page_cache *cache, unsigned int num)
+{
+	struct page *page[10];
+	unsigned int i = 0;
+	unsigned long flags;
+
+	spin_lock_irqsave(&cache->lock, flags);
+
+	while (cache->num_pages > num) {
+		page[i] = list_first_entry(&cache->pages, struct page, lru);
+		list_del(&page[i]->lru);
+		cache->num_pages--;
+		if (++i == ARRAY_SIZE(page)) {
+			spin_unlock_irqrestore(&cache->lock, flags);
+			gnttab_free_pages(i, page);
+			i = 0;
+			spin_lock_irqsave(&cache->lock, flags);
+		}
+	}
+
+	spin_unlock_irqrestore(&cache->lock, flags);
+
+	if (i != 0)
+		gnttab_free_pages(i, page);
+}
+EXPORT_SYMBOL_GPL(gnttab_page_cache_shrink);
+
 void gnttab_pages_clear_private(int nr_pages, struct page **pages)
 {
 	int i;
diff --git a/drivers/xen/xen-scsiback.c b/drivers/xen/xen-scsiback.c
index 4acc4e899600..862162dca33c 100644
--- a/drivers/xen/xen-scsiback.c
+++ b/drivers/xen/xen-scsiback.c
@@ -99,6 +99,8 @@ struct vscsibk_info {
 	struct list_head v2p_entry_lists;
 
 	wait_queue_head_t waiting_to_free;
+
+	struct gnttab_page_cache free_pages;
 };
 
 /* theoretical maximum of grants for one request */
@@ -188,10 +190,6 @@ module_param_named(max_buffer_pages, scsiback_max_buffer_pages, int, 0644);
 MODULE_PARM_DESC(max_buffer_pages,
 "Maximum number of free pages to keep in backend buffer");
 
-static DEFINE_SPINLOCK(free_pages_lock);
-static int free_pages_num;
-static LIST_HEAD(scsiback_free_pages);
-
 /* Global spinlock to protect scsiback TPG list */
 static DEFINE_MUTEX(scsiback_mutex);
 static LIST_HEAD(scsiback_list);
@@ -207,41 +205,6 @@ static void scsiback_put(struct vscsibk_info *info)
 		wake_up(&info->waiting_to_free);
 }
 
-static void put_free_pages(struct page **page, int num)
-{
-	unsigned long flags;
-	int i = free_pages_num + num, n = num;
-
-	if (num == 0)
-		return;
-	if (i > scsiback_max_buffer_pages) {
-		n = min(num, i - scsiback_max_buffer_pages);
-		gnttab_free_pages(n, page + num - n);
-		n = num - n;
-	}
-	spin_lock_irqsave(&free_pages_lock, flags);
-	for (i = 0; i < n; i++)
-		list_add(&page[i]->lru, &scsiback_free_pages);
-	free_pages_num += n;
-	spin_unlock_irqrestore(&free_pages_lock, flags);
-}
-
-static int get_free_page(struct page **page)
-{
-	unsigned long flags;
-
-	spin_lock_irqsave(&free_pages_lock, flags);
-	if (list_empty(&scsiback_free_pages)) {
-		spin_unlock_irqrestore(&free_pages_lock, flags);
-		return gnttab_alloc_pages(1, page);
-	}
-	page[0] = list_first_entry(&scsiback_free_pages, struct page, lru);
-	list_del(&page[0]->lru);
-	free_pages_num--;
-	spin_unlock_irqrestore(&free_pages_lock, flags);
-	return 0;
-}
-
 static unsigned long vaddr_page(struct page *page)
 {
 	unsigned long pfn = page_to_pfn(page);
@@ -302,7 +265,8 @@ static void scsiback_fast_flush_area(struct vscsibk_pend *req)
 		BUG_ON(err);
 	}
 
-	put_free_pages(req->pages, req->n_grants);
+	gnttab_page_cache_put(&req->info->free_pages, req->pages,
+			      req->n_grants);
 	req->n_grants = 0;
 }
 
@@ -445,8 +409,8 @@ static int scsiback_gnttab_data_map_list(struct vscsibk_pend *pending_req,
 	struct vscsibk_info *info = pending_req->info;
 
 	for (i = 0; i < cnt; i++) {
-		if (get_free_page(pg + mapcount)) {
-			put_free_pages(pg, mapcount);
+		if (gnttab_page_cache_get(&info->free_pages, pg + mapcount)) {
+			gnttab_page_cache_put(&info->free_pages, pg, mapcount);
 			pr_err("no grant page\n");
 			return -ENOMEM;
 		}
@@ -796,6 +760,8 @@ static int scsiback_do_cmd_fn(struct vscsibk_info *info,
 		cond_resched();
 	}
 
+	gnttab_page_cache_shrink(&info->free_pages, scsiback_max_buffer_pages);
+
 	RING_FINAL_CHECK_FOR_REQUESTS(&info->ring, more_to_do);
 	return more_to_do;
 }
@@ -1233,6 +1199,8 @@ static int scsiback_remove(struct xenbus_device *dev)
 
 	scsiback_release_translation_entry(info);
 
+	gnttab_page_cache_shrink(&info->free_pages, 0);
+
 	dev_set_drvdata(&dev->dev, NULL);
 
 	return 0;
@@ -1263,6 +1231,7 @@ static int scsiback_probe(struct xenbus_device *dev,
 	info->irq = 0;
 	INIT_LIST_HEAD(&info->v2p_entry_lists);
 	spin_lock_init(&info->v2p_lock);
+	gnttab_page_cache_init(&info->free_pages);
 
 	err = xenbus_printf(XBT_NIL, dev->nodename, "feature-sg-grant", "%u",
 			    SG_ALL);
@@ -1879,13 +1848,6 @@ static int __init scsiback_init(void)
 
 static void __exit scsiback_exit(void)
 {
-	struct page *page;
-
-	while (free_pages_num) {
-		if (get_free_page(&page))
-			BUG();
-		gnttab_free_pages(1, &page);
-	}
 	target_unregister_template(&scsiback_ops);
 	xenbus_unregister_driver(&scsiback_driver);
 }
diff --git a/include/xen/grant_table.h b/include/xen/grant_table.h
index 9bc5bc07d4d3..c6ef8ffc1a09 100644
--- a/include/xen/grant_table.h
+++ b/include/xen/grant_table.h
@@ -198,6 +198,19 @@ void gnttab_free_auto_xlat_frames(void);
 int gnttab_alloc_pages(int nr_pages, struct page **pages);
 void gnttab_free_pages(int nr_pages, struct page **pages);
 
+struct gnttab_page_cache {
+	spinlock_t		lock;
+	struct list_head	pages;
+	unsigned int		num_pages;
+};
+
+void gnttab_page_cache_init(struct gnttab_page_cache *cache);
+int gnttab_page_cache_get(struct gnttab_page_cache *cache, struct page **page);
+void gnttab_page_cache_put(struct gnttab_page_cache *cache, struct page **page,
+			   unsigned int num);
+void gnttab_page_cache_shrink(struct gnttab_page_cache *cache,
+			      unsigned int num);
+
 #ifdef CONFIG_XEN_GRANT_DMA_ALLOC
 struct gnttab_dma_alloc_args {
 	/* Device for which DMA memory will be/was allocated. */
-- 
2.26.2

--------------8A9C09CAAF7010E62E1E799D
Content-Type: text/x-patch; charset=UTF-8;
 name="0002-xen-don-t-use-page-lru-for-ZONE_DEVICE-memory.patch"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment;
 filename="0002-xen-don-t-use-page-lru-for-ZONE_DEVICE-memory.patch"

From 061fee2e0b4cb7dc7deb07980fca8afa6349358b Mon Sep 17 00:00:00 2001
From: Juergen Gross
Date: Mon, 7 Dec 2020 09:36:14 +0100
Subject: [PATCH 2/2] xen: don't use page->lru for ZONE_DEVICE memory

Commit 9e2369c06c8a18 ("xen: add helpers to allocate unpopulated memory")
introduced usage of ZONE_DEVICE memory for foreign memory mappings.

Unfortunately this collides with using page->lru for Xen backend
private page caches.

Fix that by using page->zone_device_data instead.

Fixes: 9e2369c06c8a18 ("xen: add helpers to allocate unpopulated memory")
Signed-off-by: Juergen Gross
---
 drivers/xen/grant-table.c | 65 ++++++++++++++++++++++++++++++++++-----
 include/xen/grant_table.h |  4 +++
 2 files changed, 62 insertions(+), 7 deletions(-)

diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index e2e42912f241..ddb38a3d7680 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -813,10 +813,63 @@ int gnttab_alloc_pages(int nr_pages, struct page **pages)
 }
 EXPORT_SYMBOL_GPL(gnttab_alloc_pages);
 
+#ifdef CONFIG_XEN_UNPOPULATED_ALLOC
+static inline void cache_init(struct gnttab_page_cache *cache)
+{
+	cache->pages = NULL;
+}
+
+static inline bool cache_empty(struct gnttab_page_cache *cache)
+{
+	return !cache->pages;
+}
+
+static inline struct page *cache_deq(struct gnttab_page_cache *cache)
+{
+	struct page *page;
+
+	page = cache->pages;
+	cache->pages = page->zone_device_data;
+
+	return page;
+}
+
+static inline void cache_enq(struct gnttab_page_cache *cache, struct page *page)
+{
+	page->zone_device_data = cache->pages;
+	cache->pages = page;
+}
+#else
+static inline void cache_init(struct gnttab_page_cache *cache)
+{
+	INIT_LIST_HEAD(&cache->pages);
+}
+
+static inline bool cache_empty(struct gnttab_page_cache *cache)
+{
+	return list_empty(&cache->pages);
+}
+
+static inline struct page *cache_deq(struct gnttab_page_cache *cache)
+{
+	struct page *page;
+
+	page = list_first_entry(&cache->pages, struct page, lru);
+	list_del(&page->lru);
+
+	return page;
+}
+
+static inline void cache_enq(struct gnttab_page_cache *cache, struct page *page)
+{
+	list_add(&page->lru, &cache->pages);
+}
+#endif
+
 void gnttab_page_cache_init(struct gnttab_page_cache *cache)
 {
 	spin_lock_init(&cache->lock);
-	INIT_LIST_HEAD(&cache->pages);
+	cache_init(cache);
 	cache->num_pages = 0;
 }
 EXPORT_SYMBOL_GPL(gnttab_page_cache_init);
@@ -827,13 +880,12 @@ int gnttab_page_cache_get(struct gnttab_page_cache *cache, struct page **page)
 
 	spin_lock_irqsave(&cache->lock, flags);
 
-	if (list_empty(&cache->pages)) {
+	if (cache_empty(cache)) {
 		spin_unlock_irqrestore(&cache->lock, flags);
 		return gnttab_alloc_pages(1, page);
 	}
 
-	page[0] = list_first_entry(&cache->pages, struct page, lru);
-	list_del(&page[0]->lru);
+	page[0] = cache_deq(cache);
 	cache->num_pages--;
 
 	spin_unlock_irqrestore(&cache->lock, flags);
@@ -851,7 +903,7 @@ void gnttab_page_cache_put(struct gnttab_page_cache *cache, struct page **page,
 	spin_lock_irqsave(&cache->lock, flags);
 
 	for (i = 0; i < num; i++)
-		list_add(&page[i]->lru, &cache->pages);
+		cache_enq(cache, page[i]);
 	cache->num_pages += num;
 
 	spin_unlock_irqrestore(&cache->lock, flags);
@@ -867,8 +919,7 @@ void gnttab_page_cache_shrink(struct gnttab_page_cache *cache, unsigned int num)
 	spin_lock_irqsave(&cache->lock, flags);
 
 	while (cache->num_pages > num) {
-		page[i] = list_first_entry(&cache->pages, struct page, lru);
-		list_del(&page[i]->lru);
+		page[i] = cache_deq(cache);
 		cache->num_pages--;
 		if (++i == ARRAY_SIZE(page)) {
 			spin_unlock_irqrestore(&cache->lock, flags);
diff --git a/include/xen/grant_table.h b/include/xen/grant_table.h
index c6ef8ffc1a09..b9c937b3a149 100644
--- a/include/xen/grant_table.h
+++ b/include/xen/grant_table.h
@@ -200,7 +200,11 @@ void gnttab_free_pages(int nr_pages, struct page **pages);
 
 struct gnttab_page_cache {
 	spinlock_t		lock;
+#ifdef CONFIG_XEN_UNPOPULATED_ALLOC
+	struct page		*pages;
+#else
 	struct list_head	pages;
+#endif
 	unsigned int		num_pages;
 };
 
-- 
2.26.2

--------------8A9C09CAAF7010E62E1E799D--

--dj6QMlgeBtfrRkv9HmCRuQeYiLeVMFEpv--

--DjInZuhMnyFoVAwIkWsZZLcVQw8Nn1gxE--

--===============6406864312272058914==--
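
The approach of patch 2/2 (chaining cached pages through
page->zone_device_data instead of page->lru) can be tried out in user space
with the sketch below. Everything in it is an assumption made for
illustration: `struct sketch_page`, `struct sketch_cache` and the
`sketch_cache_*()` functions are hypothetical names, there is no locking, and
"freeing" a page is reduced to dropping it from the stack. It only shows
that a LIFO built on the second pointer never touches ->pgmap, and that the
empty check has to return true for a NULL head.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the ZONE_DEVICE view of struct page. */
struct sketch_page {
	void *pgmap;		/* back pointer, must stay untouched */
	void *zone_device_data;	/* free for driver use, holds the "next" link */
};

/* Stand-in for struct gnttab_page_cache without the spinlock. */
struct sketch_cache {
	struct sketch_page *pages;	/* top of the stack, NULL when empty */
	unsigned int num_pages;
};

/* True when no page is cached. */
static int sketch_cache_empty(const struct sketch_cache *cache)
{
	return !cache->pages;
}

/* Push one page, linking it through zone_device_data only. */
void sketch_cache_put(struct sketch_cache *cache, struct sketch_page *page)
{
	page->zone_device_data = cache->pages;
	cache->pages = page;
	cache->num_pages++;
}

/* Pop the most recently cached page, or NULL if the cache is empty. */
struct sketch_page *sketch_cache_get(struct sketch_cache *cache)
{
	struct sketch_page *page;

	if (sketch_cache_empty(cache))
		return NULL;

	page = cache->pages;
	cache->pages = page->zone_device_data;
	cache->num_pages--;

	return page;
}

/* Drop pages until at most keep remain; returns how many were dropped. */
unsigned int sketch_cache_shrink(struct sketch_cache *cache, unsigned int keep)
{
	unsigned int released = 0;

	while (cache->num_pages > keep) {
		(void)sketch_cache_get(cache);
		released++;
	}

	return released;
}
```

A caller would initialise the cache to `{ NULL, 0 }`, put pages as they are
unmapped, get them back on the next map request, and shrink to 0 on teardown.
The real helpers additionally take cache->lock and hand the dropped pages to
gnttab_free_pages() in batches.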