From: Bodo Stroesser
Date: Tue, 22 Mar 2022 13:40:39 +0100
Subject: Re: [RFC 0/3] Add zero copy feature for tcmu
To: Xiaoguang Wang, linux-mm@kvack.org, target-devel@vger.kernel.org, linux-scsi@vger.kernel.org
Cc: linux-block@vger.kernel.org, xuyu@linux.alibaba.com
In-Reply-To: <20220318095531.15479-1-xiaoguang.wang@linux.alibaba.com>
References: <20220318095531.15479-1-xiaoguang.wang@linux.alibaba.com>

On 18.03.22 10:55, Xiaoguang Wang wrote:
> The core idea to implement tcmu zero copy feature is really straight,
> which just maps block device io request's sgl pages to tcmu user space
> backstore, then we can avoid extra copy overhead between sgl pages and
> tcmu internal data area(which really impacts io throughput), please see
> https://www.spinics.net/lists/target-devel/msg21121.html for detailed
> info.
>

Can you please tell us how big the performance improvement is and which
configuration you are using for measurements?

> Initially I use remap_pfn_range or vm_insert_pages to map sgl pages to
> user space, but both of them have limits:
> 1) Use vm_insert_pages
> which is like tcp getsockopt(TCP_ZEROCOPY_RECEIVE), but there're two
> restrictions:
>   1. anonymous pages can not be mmaped to user space.
>     ==> vm_insert_pages
>     ====> insert_pages
>     ======> insert_page_in_batch_locked
>     ========> validate_page_before_insert
>   In validate_page_before_insert(), it shows that anonymous page can not
>   be mapped to user space, we know that if issuing direct io to block
>   device, io request's sgl pages mostly comes from anonymous page.
>        if (PageAnon(page) || PageSlab(page) || page_has_type(page))
>            return -EINVAL;
>   I'm not sure why there is such restriction? for safety reasons ?
>
>   2. warn_on triggered in __folio_mark_dirty
>   When calling zap_page_range in tcmu user space backstore when io
>   completes, there is a warn_on triggered in __folio_mark_dirty:
>        if (folio->mapping) {   /* Race with truncate? */
>            WARN_ON_ONCE(warn && !folio_test_uptodate(folio));
>
>   I'm not familiar with folio yet, but I think the reason is that when
>   issuing a buffered read to tcmu block device, it's page cache mapped
>   to user space, backstore write this page and pte will be dirtied. but
>   initially it's newly allocated, hence page_update flag not set.
>   In zap_pte_range(), there is such codes:
>      if (!PageAnon(page)) {
>          if (pte_dirty(ptent)) {
>              force_flush = 1;
>              set_page_dirty(page);
>          }
>   So this warn_on is reasonable.
>   Indeed what I want is just to map io request sgl pages to tcmu user
>   space backstore, then backstore can read or write data to mapped area,
>   I don't want to care about page or its mapping status, so I choose to
>   use remap_pfn_range.
>
> 2) Use remap_pfn_range()
>   remap_pfn_range works well, but it has somewhat obvious overhead.
>   For a 512kb io request, it has 128 pages, and usually this 128 page's
>   pfn are not consecutive, so in worst cases, for a 512kb io request, I'd
>   need to issue 128 calls to remap_pfn_range, it's horrible. And in
>   remap_pfn_range, if x86 page attribute table feature is enabled,
>   lookup_memtype called by track_pfn_remap() also introduces obvious
>   overhead.
>
> Finally in order to solve these problems, Xu Yu helps to implement a new
> helper, which accepts an array of pages as parameter, anonymous pages can
> be mapped to user space, pages would be treated as special pte(pte_special
> returns true), so vm_normal_page returns NULL, above folio warn_on won't
> trigger.
>
> Thanks.
>
> Xiaoguang Wang (2):
>   mm: export zap_page_range()
>   scsi: target: tcmu: Support zero copy
>
> Xu Yu (1):
>   mm/memory.c: introduce vm_insert_page(s)_mkspecial
>
>  drivers/target/target_core_user.c | 257 +++++++++++++++++++++++++++++++++-----
>  include/linux/mm.h                |   2 +
>  mm/memory.c                       | 183 +++++++++++++++++++++++++++
>  3 files changed, 414 insertions(+), 28 deletions(-)
>
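
To make the worst-case figure in the quoted text concrete, here is a minimal user-space sketch of the arithmetic. It is not taken from the patch set: count_pfn_runs() and pages_for_request() are hypothetical helpers, and a 4 KiB page size is assumed (which is what "512kb ... 128 pages" implies). The idea is that one remap_pfn_range() call can cover one run of consecutive pfns, so the number of runs is the number of calls a request would need.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only: number of pages needed to
 * hold a request of 'bytes' bytes, rounding up to whole pages.
 */
static size_t pages_for_request(size_t bytes, size_t page_size)
{
	return (bytes + page_size - 1) / page_size;
}

/*
 * Hypothetical helper, for illustration only: count the runs of
 * consecutive page frame numbers in 'pfns'. Each run could be mapped
 * with a single remap_pfn_range() call, so the result is the number of
 * calls needed for the whole request.
 */
static size_t count_pfn_runs(const unsigned long *pfns, size_t n)
{
	size_t runs = 0;

	assert(pfns != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		/* A new run starts at index 0 or wherever the pfn does not
		 * directly follow the previous one. */
		if (i == 0 || pfns[i] != pfns[i - 1] + 1)
			runs++;
	}
	return runs;
}
```

With fully scattered pages every pfn starts its own run, which gives the 128 calls mentioned above for a 512 KiB request; a physically contiguous request would need only one.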