From: "Fabio M. De Francesco" <fmdefrancesco@gmail.com>
To: Chaitanya Kulkarni <chaitanyak@nvidia.com>
Cc: "linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
Christoph Hellwig <hch@lst.de>,
Chaitanya Kulkarni <chaitanyak@nvidia.com>,
Sagi Grimberg <sagi@grimberg.me>, Ira Weiny <ira.weiny@intel.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH] nvmet-tcp: Don't kmap() pages which can't come from HIGHMEM
Date: Tue, 16 Aug 2022 20:16:54 +0200 [thread overview]
Message-ID: <8992226.CDJkKcVGEf@opensuse> (raw)
In-Reply-To: <3e8a0bb1-4c69-93d2-71f9-81bb8466cb14@nvidia.com>
On martedì 16 agosto 2022 15:12:08 CEST Chaitanya Kulkarni wrote:
> Fabio,
>
> On 8/16/22 02:18, Fabio M. De Francesco wrote:
>
> > kmap() is being deprecated in favor of kmap_local_page().
> >
> > There are two main problems with kmap(): (1) It comes with an overhead as
> > mapping space is restricted and protected by a global lock for
> > synchronization and (2) it also requires global TLB invalidation when the
> > kmap’s pool wraps and it might block when the mapping space is fully
> > utilized until a slot becomes available.
> >
>
> so I believe this should give us better performance under heavy
> workload ?
>
Yes, correct. Can you please take a look at the highmem official documentation
(highmem.rst)? I reworked and extended it with two series of patches.
Everything about the deprecation of kmap() is explained there and in a patch
from Ira: "checkpatch: Add kmap and kmap_atomic to the deprecated list" which
you reviewed at https://lore.kernel.org/all/91f708ed-f456-dc83-281e-fc18a0b4b981@nvidia.com/
> > With kmap_local_page() the mappings are per thread, CPU local, can take
> > page faults, and can be called from any context (including interrupts).
> > It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore,
> > the tasks can be preempted and, when they are scheduled to run again, the
> > kernel virtual addresses are restored and are still valid.
> >
> > However, there is a huge constraint which might block some conversions
> > to kmap_local_page(): the kernel virtual address cannot be handed across
> > different threads. Ira made me notice that the kmap() and kunmap() in this
> > driver happen in two different workqueues. Therefore, kunmap_local() will
> > try to unmap an invalid address.
> >
> > Let me explain why I'm sending an RFC. When I hit the above mentioned
> > issues I tried to refactor the code in ways where mapping and unmapping
> > happen in a single thread (to not break the rules of threads locality).
> >
> > However, while reading this code again I think I noticed an important
> > prerequisite which may lead to a simpler solution... If I'm not wrong, it
> > looks like the pages are allocated in nvmet_tcp_map_data(), using the
> > GFP_KERNEL flag.
> >
> > This would assure that those pages _cannot_ come from HIGHMEM. If I'm not
> > missing something (again!), a plain page_address() could replace the
kmap()
> > of sg_page(sg); furthermore, we shouldn't need the un-mappings any longer.
> >
> > Unfortunately, I don't know this protocol and I'm not so experienced with
> > kernel development to be able to understand this code properly.
> >
> > Therefore, I have two questions: am I right about thinking that the pages
> > mapped in nvmet_tcp_map_pdu_iovec() are allocated with GFP_KERNEL? If so,
> > can anyone with more knowledge than mine please say if my changes make any
> > sense?
> >
> > Suggested-by: Ira Weiny <ira.weiny@intel.com>
> > Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
>
>
> Thanks a lot for detailed explanation.
You are welcome!
> Quick question what kind of performance benefits you have seen with
> this change ? we need to document the performance numbers since commit
> log mentions here that kmap_local_page() is faster than kmap().
OK, but kmap_local_page() was discarded because not applicable here without
heavy refactoring.
> In case you are not aware please have a look at the blktests to create
> a simple loopback setup with nvme-loop transport.
I have nothing against learning how blktests works and running this tool.
I'll do as you requested.
However, please read the implementation of kmap():
#ifdef CONFIG_HIGHMEM
static inline void *kmap(struct page *page)
{
void *addr;
might_sleep();
if (!PageHighMem(page))
addr = page_address(page);
else
addr = kmap_high(page);
kmap_flush_tlb((unsigned long)addr);
return addr;
}
If page is not from HIGHMEM it is a simple page_address(), like it is in my
RFC patch.
#else /* CONFIG_HIGHMEM */
static inline void *kmap(struct page *page)
{
might_sleep();
return page_address(page);
}
Again, a plain page_address().
Furthermore, with a simple page_address() we avoid the calls to kunmap().
I think it implicitly say all we need to know about why we should prefer
page_address() whenever we are _sure_ that pages cannot come from HIGHMEM.
Thanks for your comments and questions,
Fabio
> -ck
>
next prev parent reply other threads:[~2022-08-16 18:17 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-08-16 9:18 [RFC PATCH] nvmet-tcp: Don't kmap() pages which can't come from HIGHMEM Fabio M. De Francesco
2022-08-16 13:12 ` Chaitanya Kulkarni
2022-08-16 18:16 ` Fabio M. De Francesco [this message]
2022-08-16 18:59 ` Keith Busch
2022-08-17 9:44 ` Sagi Grimberg
2022-08-17 12:02 ` Fabio M. De Francesco
2022-08-17 23:42 ` Chaitanya Kulkarni
2022-08-17 14:18 ` Keith Busch
2022-08-17 14:25 ` Sagi Grimberg
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8992226.CDJkKcVGEf@opensuse \
--to=fmdefrancesco@gmail.com \
--cc=chaitanyak@nvidia.com \
--cc=hch@lst.de \
--cc=ira.weiny@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-nvme@lists.infradead.org \
--cc=sagi@grimberg.me \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.