From mboxrd@z Thu Jan  1 00:00:00 1970
Subject: Re: [PATCH net-next v2 13/15] net: replace page_frag with page_frag_cache
To: Mat Martineau
CC: Paolo Abeni, Ayush Sawal, Eric Dumazet, Willem de Bruijn, Jason Wang,
 Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
 Steven Rostedt, Ben Segall, Mel Gorman, Daniel Bristot de Oliveira,
 Valentin Schneider, John Fastabend, Jakub Sitnicki, David Ahern,
 Matthieu Baerts, Geliang Tang, Boris Pismenny
References: <20240415131941.51153-1-linyunsheng@huawei.com>
 <20240415131941.51153-14-linyunsheng@huawei.com>
 <83991c67-8e4a-c287-b4a5-5dbba8835947@kernel.org>
From: Yunsheng Lin
Date: Fri, 19 Apr 2024 20:37:53 +0800
In-Reply-To: <83991c67-8e4a-c287-b4a5-5dbba8835947@kernel.org>
X-Mailing-List: bpf@vger.kernel.org

On 2024/4/17 5:40, Mat Martineau wrote:

...

> I wasn't concerned as much about the direct cost of the inlined
> page_frag_alloc_commit() helper, it was that we could make fewer
> prepare calls if the commit was deferred as long as possible. As we
> discussed above, I see now that the prepare is not expensive when
> there is more space available in the current frag.
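To keep the discussion concrete, the deferred-commit idea looks roughly
like the below from the caller side. This is only a sketch:
page_frag_alloc_prepare() and produce_data() are placeholder names for
illustration, not the exact API exposed by this patchset:

static int produce_one_datum(struct page_frag_cache *nc,
			     unsigned int max_len)
{
	unsigned int copied;
	void *va;

	/* prepare: make sure up to max_len bytes are available in the
	 * current fragment, without consuming them yet (placeholder
	 * signature for illustration)
	 */
	va = page_frag_alloc_prepare(nc, max_len, GFP_KERNEL);
	if (unlikely(!va))
		return -ENOMEM;

	/* produce_data() stands in for e.g. a copy_from_iter(); it may
	 * well write less than the max_len that was prepared
	 */
	copied = produce_data(va, max_len);

	/* commit: only now advance the cache by what was actually
	 * written, so the unused room stays available for the next call
	 */
	page_frag_alloc_commit(nc, copied);

	return 0;
}

Deferring the commit like this is what allows fewer prepare calls: as
long as the current frag has room, prepare does not need to touch the
page allocator at all.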
> 
>> Maybe what we could do is to do the prepare in the inline
>> helper instead of a function when cache is enough, so that
>> we can avoid a function call as the old code does, as an
>> inlined function requires less overhead and is generally
>> faster than a function call.
>>
>> But that requires more refactoring, and as this patchset is big
>> enough now, I guess we can try it later if it is possible.
> 
> A more generic (possible) optimization would be to inline some of
> page_frag_cache_refill(), but I'm not sure the code size tradeoff
> is worth it - would have to collect some data to find out for sure!

On my arm64 system, it seems that inlining some of page_frag_cache_refill()
results in a smaller kernel code size, as shown below. I have not done the
performance test yet, but the smaller code size seems odd here.

Without this patchset:
linyunsheng@ubuntu:~/ksize$ ./ksize.sh
Linux Kernel       total |      text      data       bss
--------------------------------------------------------------------------------
vmlinux         51974159 |  32238573  19087890    647696

With this patchset:
linyunsheng@ubuntu:~/ksize$ ./ksize.sh
Linux Kernel       total |      text      data       bss
--------------------------------------------------------------------------------
vmlinux         51970078 |  32234468  19087914    647696

With this patchset and the below patch inlining some of
page_frag_cache_refill():
linyunsheng@ubuntu:~/ksize$ ./ksize.sh
Linux Kernel       total |      text      data       bss
--------------------------------------------------------------------------------
vmlinux         51966078 |  32230468  19087914    647696

diff --git a/include/linux/page_frag_cache.h b/include/linux/page_frag_cache.h
index 529e7c040dad..3e1de20c8146 100644
--- a/include/linux/page_frag_cache.h
+++ b/include/linux/page_frag_cache.h
@@ -60,8 +60,33 @@ static inline bool page_frag_cache_is_pfmemalloc(struct page_frag_cache *nc)
 void page_frag_cache_drain(struct page_frag_cache *nc);
 void __page_frag_cache_drain(struct page *page, unsigned int count);
 
-void *page_frag_cache_refill(struct page_frag_cache *nc, unsigned int fragsz,
-			     gfp_t gfp_mask);
+void *__page_frag_cache_refill(struct page_frag_cache *nc, gfp_t gfp_mask);
+void *page_frag_cache_flush(struct page_frag_cache *nc, gfp_t gfp_mask);
+
+static inline void *page_frag_cache_refill(struct page_frag_cache *nc,
+					   unsigned int fragsz, gfp_t gfp_mask)
+{
+	unsigned long size_mask;
+	void *va;
+
+	if (unlikely(!nc->va)) {
+		va = __page_frag_cache_refill(nc, gfp_mask);
+		if (likely(va))
+			return va;
+	}
+
+#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
+	/* if size can vary use size else just use PAGE_SIZE */
+	size_mask = nc->size_mask;
+#else
+	size_mask = PAGE_SIZE - 1;
+#endif
+
+	if (unlikely(nc->offset + fragsz > (size_mask + 1)))
+		return page_frag_cache_flush(nc, gfp_mask);
+
+	return (void *)((unsigned long)nc->va & ~size_mask);
+}
 
 /**
  * page_frag_alloc_va() - Alloc a page fragment.
diff --git a/mm/page_frag_cache.c b/mm/page_frag_cache.c
index 8b1d35aafcc1..74f643d472fb 100644
--- a/mm/page_frag_cache.c
+++ b/mm/page_frag_cache.c
@@ -18,8 +18,7 @@
 #include
 #include "internal.h"
 
-static bool __page_frag_cache_refill(struct page_frag_cache *nc,
-				     gfp_t gfp_mask)
+void *__page_frag_cache_refill(struct page_frag_cache *nc, gfp_t gfp_mask)
 {
 	struct page *page = NULL;
 	gfp_t gfp = gfp_mask;
@@ -40,7 +39,7 @@ static bool __page_frag_cache_refill(struct page_frag_cache *nc,
 
 	if (unlikely(!page)) {
 		nc->va = NULL;
-		return false;
+		return NULL;
 	}
 
 	nc->va = page_address(page);
@@ -57,8 +56,9 @@ static bool __page_frag_cache_refill(struct page_frag_cache *nc,
 	nc->pfmemalloc = page_is_pfmemalloc(page);
 	nc->offset = 0;
 
-	return true;
+	return nc->va;
 }
+EXPORT_SYMBOL(__page_frag_cache_refill);
 
 /**
  * page_frag_cache_drain - Drain the current page from page_frag cache.
@@ -83,20 +83,12 @@ void __page_frag_cache_drain(struct page *page, unsigned int count)
 }
 EXPORT_SYMBOL(__page_frag_cache_drain);
 
-void *page_frag_cache_refill(struct page_frag_cache *nc, unsigned int fragsz,
-			     gfp_t gfp_mask)
+void *page_frag_cache_flush(struct page_frag_cache *nc, gfp_t gfp_mask)
 {
 	unsigned long size_mask;
-	unsigned int offset;
 	struct page *page;
 	void *va;
 
-	if (unlikely(!nc->va)) {
-refill:
-		if (!__page_frag_cache_refill(nc, gfp_mask))
-			return NULL;
-	}
-
 #if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
 	/* if size can vary use size else just use PAGE_SIZE */
 	size_mask = nc->size_mask;
@@ -105,41 +97,23 @@ void *page_frag_cache_refill(struct page_frag_cache *nc, unsigned int fragsz,
 #endif
 
 	va = (void *)((unsigned long)nc->va & ~size_mask);
-	offset = nc->offset;
-
-	if (unlikely(offset + fragsz > (size_mask + 1))) {
-		page = virt_to_page(va);
-
-		if (!page_ref_sub_and_test(page, nc->pagecnt_bias & size_mask))
-			goto refill;
-
-		if (unlikely(nc->pfmemalloc)) {
-			free_unref_page(page, compound_order(page));
-			goto refill;
-		}
-
-		/* OK, page count is 0, we can safely set it */
-		set_page_count(page, size_mask);
-		nc->pagecnt_bias |= size_mask;
-
-		nc->offset = 0;
-		if (unlikely(fragsz > (size_mask + 1))) {
-			/*
-			 * The caller is trying to allocate a fragment
-			 * with fragsz > PAGE_SIZE but the cache isn't big
-			 * enough to satisfy the request, this may
-			 * happen in low memory conditions.
-			 * We don't release the cache page because
-			 * it could make memory pressure worse
-			 * so we simply return NULL here.
-			 */
-			return NULL;
-		}
+	page = virt_to_page(va);
+	if (!page_ref_sub_and_test(page, nc->pagecnt_bias & size_mask))
+		return __page_frag_cache_refill(nc, gfp_mask);
+
+	if (unlikely(nc->pfmemalloc)) {
+		free_unref_page(page, compound_order(page));
+		return __page_frag_cache_refill(nc, gfp_mask);
 	}
 
+	/* OK, page count is 0, we can safely set it */
+	set_page_count(page, size_mask);
+	nc->pagecnt_bias |= size_mask;
+	nc->offset = 0;
+
 	return va;
 }
-EXPORT_SYMBOL(page_frag_cache_refill);
+EXPORT_SYMBOL(page_frag_cache_flush);
 
 /*
  * Frees a page fragment allocated out of either a compound or order 0 page.

> 
> Thanks,
> 
> Mat
> 
> .
> 
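For completeness, the intended effect of the above on callers can be
sketched like below; page_frag_alloc_va_sketch() is a made-up name to
illustrate how an allocation helper could sit on top of the inlined
refill, not the actual page_frag_alloc_va() from this patchset:

static inline void *page_frag_alloc_va_sketch(struct page_frag_cache *nc,
					      unsigned int fragsz,
					      gfp_t gfp_mask)
{
	unsigned int offset;
	void *va;

	/* hot path: when the current page still has room for fragsz,
	 * page_frag_cache_refill() above is pure inline pointer/offset
	 * arithmetic with no out-of-line call
	 */
	va = page_frag_cache_refill(nc, fragsz, gfp_mask);
	if (unlikely(!va))
		return NULL;

	/* refill/flush leave nc->offset pointing at free space */
	offset = nc->offset;
	nc->pagecnt_bias--;
	nc->offset = offset + fragsz;

	return va + offset;
}

The design point is that only the cold paths (empty cache, or not
enough room left) go through the out-of-line
__page_frag_cache_refill()/page_frag_cache_flush() helpers.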