linux-fsdevel.vger.kernel.org archive mirror
From: "Fabio M. De Francesco" <fmdefrancesco@gmail.com>
To: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>,
	linux-fsdevel@vger.kernel.org, linux-aio@kvack.org,
	linux-kernel@vger.kernel.org, "Venkataramanan,
	Anirudh" <anirudh.venkataramanan@intel.com>,
	Ira Weiny <ira.weiny@intel.com>, Jeff Moyer <jmoyer@redhat.com>,
	Kent Overstreet <kent.overstreet@linux.dev>
Subject: Re: [PATCH v3] fs/aio: Replace kmap{,_atomic}() with kmap_local_page()
Date: Fri, 03 Mar 2023 06:23:24 +0100
Message-ID: <2172918.1BCLMh4Saa@suse>
In-Reply-To: <20230119162055.20944-1-fmdefrancesco@gmail.com>

On Thursday, 19 January 2023 17:20:55 CET Fabio M. De Francesco wrote:
> The use of kmap() and kmap_atomic() is being deprecated in favor of
> kmap_local_page().
> 
> There are two main problems with kmap(): (1) It comes with an overhead as
> the mapping space is restricted and protected by a global lock for
> synchronization and (2) it also requires global TLB invalidation when the
> kmap’s pool wraps and it might block when the mapping space is fully
> utilized until a slot becomes available.
> 
> With kmap_local_page() the mappings are per thread, CPU local, can take
> page faults, and can be called from any context (including interrupts).
> It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore,
> the tasks can be preempted and, when they are scheduled to run again, the
> kernel virtual addresses are restored and still valid.
> 
> The use of kmap_local_page() in fs/aio.c is "safe" in the sense that the
> code does not hand the returned kernel virtual addresses to other threads
> and there are no nested mappings that would have to be unmapped in
> stack-based (LIFO) order. Furthermore, the code between the old
> kmap_atomic()/kunmap_atomic() did not depend on disabling page-faults
> and/or preemption, so that there is no need to call pagefault_disable()
> and/or preempt_disable() before the mappings.
> 
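
A note on the LIFO rule mentioned above: nested local mappings have to be
released in the reverse order of their creation. The sketch below is a toy
userspace model of that discipline, not the kernel implementation;
km_stack_sketch, map_local_sketch() and unmap_local_sketch() are
hypothetical names, and the fixed stack depth is an assumption made only
for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define KM_STACK_DEPTH_SKETCH 16	/* assumed depth, illustration only */

/* Toy per-thread stack of live mappings (hypothetical model). */
struct km_stack_sketch {
	const void *slot[KM_STACK_DEPTH_SKETCH];
	int depth;
};

/* Model of taking a local mapping: push it on the stack. */
static const void *map_local_sketch(struct km_stack_sketch *s,
				    const void *page)
{
	assert(s->depth < KM_STACK_DEPTH_SKETCH);
	s->slot[s->depth++] = page;
	return page;	/* the "address" is just the page pointer here */
}

/*
 * Model of dropping a local mapping: only the most recent mapping may
 * be released. Returns 0 on success, -1 on an ordering violation.
 */
static int unmap_local_sketch(struct km_stack_sketch *s, const void *addr)
{
	if (s->depth == 0 || s->slot[s->depth - 1] != addr)
		return -1;
	s->depth--;
	return 0;
}
```

In this model, mapping A and then B means B has to be released before A.
Every hunk in this patch maps a single page and unmaps it before taking
another mapping, so the ordering question does not arise in fs/aio.c.
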
> Therefore, replace kmap() and kmap_atomic() with kmap_local_page() in
> fs/aio.c.
> 
> Tested with xfstests on a QEMU/KVM x86_32 VM, 6GB RAM, booting a kernel
> with HIGHMEM64GB enabled.
> 
> Cc: "Venkataramanan, Anirudh" <anirudh.venkataramanan@intel.com>
> Suggested-by: Ira Weiny <ira.weiny@intel.com>
> Reviewed-by: Ira Weiny <ira.weiny@intel.com>
> Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
> Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev>
> Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
> ---
> 
> I've tested with "./check -g aio". The tests in this group fail 3/26
> times, with and without my patch. Therefore, these changes don't introduce
> further errors. I'm not aware of any other tests that I could run, so
> any suggestions would be much appreciated :-)
> 
> I'm resending this patch because some recipients were missing in the
> previous submissions. In the meantime I'm also adding some more information
> in the commit message. There are no changes in the code.
> 
> Changes from v1:
>         Add further information in the commit message, and the
>         "Reviewed-by" tags from Ira and Jeff (thanks!).
> 
> Changes from v2:
> 	Rewrite a block of code between mapping/un-mapping to improve
> 	readability in aio_setup_ring() and add a missing call to
> 	flush_dcache_page() in ioctx_add_table() (thanks to Al Viro);
> 	Add a "Reviewed-by" tag from Kent Overstreet (thanks).
> 
>  fs/aio.c | 46 +++++++++++++++++++++-------------------------
>  1 file changed, 21 insertions(+), 25 deletions(-)
> 
> diff --git a/fs/aio.c b/fs/aio.c
> index 562916d85cba..9b39063dc7ac 100644
> --- a/fs/aio.c
> +++ b/fs/aio.c
> @@ -486,7 +486,6 @@ static const struct address_space_operations aio_ctx_aops = {
> 
>  static int aio_setup_ring(struct kioctx *ctx, unsigned int nr_events)
>  {
> -	struct aio_ring *ring;
>  	struct mm_struct *mm = current->mm;
>  	unsigned long size, unused;
>  	int nr_pages;
> @@ -567,16 +566,12 @@ static int aio_setup_ring(struct kioctx *ctx, unsigned int nr_events)
>  	ctx->user_id = ctx->mmap_base;
>  	ctx->nr_events = nr_events; /* trusted copy */
> 
> -	ring = kmap_atomic(ctx->ring_pages[0]);
> -	ring->nr = nr_events;	/* user copy */
> -	ring->id = ~0U;
> -	ring->head = ring->tail = 0;
> -	ring->magic = AIO_RING_MAGIC;
> -	ring->compat_features = AIO_RING_COMPAT_FEATURES;
> -	ring->incompat_features = AIO_RING_INCOMPAT_FEATURES;
> -	ring->header_length = sizeof(struct aio_ring);
> -	kunmap_atomic(ring);
> -	flush_dcache_page(ctx->ring_pages[0]);
> +	memcpy_to_page(ctx->ring_pages[0], 0, (const char *)&(struct aio_ring) {
> +		       .nr = nr_events, .id = ~0U, .magic = AIO_RING_MAGIC,
> +		       .compat_features = AIO_RING_COMPAT_FEATURES,
> +		       .incompat_features = AIO_RING_INCOMPAT_FEATURES,
> +		       .header_length = sizeof(struct aio_ring) },
> +		       sizeof(struct aio_ring));
> 
>  	return 0;
>  }
> @@ -678,9 +673,10 @@ static int ioctx_add_table(struct kioctx *ctx, struct mm_struct *mm)
>  					 * we are protected from page migration
>  					 * changes ring_pages by ->ring_lock.
>  					 */
> -					ring = kmap_atomic(ctx->ring_pages[0]);
> +					ring = kmap_local_page(ctx->ring_pages[0]);
>  					ring->id = ctx->id;
> -					kunmap_atomic(ring);
> +					kunmap_local(ring);
> +					flush_dcache_page(ctx->ring_pages[0]);
>  					return 0;
>  				}
> 
> @@ -1021,9 +1017,9 @@ static void user_refill_reqs_available(struct kioctx *ctx)
>  		 * against ctx->completed_events below will make sure we do the
>  		 * safe/right thing.
>  		 */
> -		ring = kmap_atomic(ctx->ring_pages[0]);
> +		ring = kmap_local_page(ctx->ring_pages[0]);
>  		head = ring->head;
> -		kunmap_atomic(ring);
> +		kunmap_local(ring);
> 
>  		refill_reqs_available(ctx, head, ctx->tail);
>  	}
> @@ -1129,12 +1125,12 @@ static void aio_complete(struct aio_kiocb *iocb)
>  	if (++tail >= ctx->nr_events)
>  		tail = 0;
> 
> -	ev_page = kmap_atomic(ctx->ring_pages[pos / AIO_EVENTS_PER_PAGE]);
> +	ev_page = kmap_local_page(ctx->ring_pages[pos / AIO_EVENTS_PER_PAGE]);
>  	event = ev_page + pos % AIO_EVENTS_PER_PAGE;
> 
>  	*event = iocb->ki_res;
> 
> -	kunmap_atomic(ev_page);
> +	kunmap_local(ev_page);
>  	flush_dcache_page(ctx->ring_pages[pos / AIO_EVENTS_PER_PAGE]);
> 
>  	pr_debug("%p[%u]: %p: %p %Lx %Lx %Lx\n", ctx, tail, iocb,
> @@ -1148,10 +1144,10 @@ static void aio_complete(struct aio_kiocb *iocb)
> 
>  	ctx->tail = tail;
> 
> -	ring = kmap_atomic(ctx->ring_pages[0]);
> +	ring = kmap_local_page(ctx->ring_pages[0]);
>  	head = ring->head;
>  	ring->tail = tail;
> -	kunmap_atomic(ring);
> +	kunmap_local(ring);
>  	flush_dcache_page(ctx->ring_pages[0]);
> 
>  	ctx->completed_events++;
> @@ -1211,10 +1207,10 @@ static long aio_read_events_ring(struct kioctx *ctx,
>  	mutex_lock(&ctx->ring_lock);
> 
>  	/* Access to ->ring_pages here is protected by ctx->ring_lock. */
> -	ring = kmap_atomic(ctx->ring_pages[0]);
> +	ring = kmap_local_page(ctx->ring_pages[0]);
>  	head = ring->head;
>  	tail = ring->tail;
> -	kunmap_atomic(ring);
> +	kunmap_local(ring);
> 
>  	/*
>  	 * Ensure that once we've read the current tail pointer, that
> @@ -1246,10 +1242,10 @@ static long aio_read_events_ring(struct kioctx *ctx,
>  		avail = min(avail, nr - ret);
>  		avail = min_t(long, avail, AIO_EVENTS_PER_PAGE - pos);
> 
> -		ev = kmap(page);
> +		ev = kmap_local_page(page);
>  		copy_ret = copy_to_user(event + ret, ev + pos,
>  					sizeof(*ev) * avail);
> -		kunmap(page);
> +		kunmap_local(ev);
> 
>  		if (unlikely(copy_ret)) {
>  			ret = -EFAULT;
> @@ -1261,9 +1257,9 @@ static long aio_read_events_ring(struct kioctx *ctx,
>  		head %= ctx->nr_events;
>  	}
> 
> -	ring = kmap_atomic(ctx->ring_pages[0]);
> +	ring = kmap_local_page(ctx->ring_pages[0]);
>  	ring->head = head;
> -	kunmap_atomic(ring);
> +	kunmap_local(ring);
>  	flush_dcache_page(ctx->ring_pages[0]);
> 
>  	pr_debug("%li  h%u t%u\n", ret, head, tail);
> --
> 2.39.0
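
A note on the aio_setup_ring() hunk: the old code set ring->head and
ring->tail to 0 explicitly, while the new code omits them from the
compound literal. In C, members left out of a designated initializer are
zero-initialized, so the ring still starts with head == tail == 0. The
userspace sketch below illustrates this. Every name with a _sketch or
_SKETCH suffix is a hypothetical stand-in, the struct layout is
simplified and the magic value is a placeholder, so this is not the
kernel's struct aio_ring nor its memcpy_to_page().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE_SKETCH  4096u
#define RING_MAGIC_SKETCH 0xa10a10a1u	/* placeholder value */

/* Simplified stand-in for the ring header (assumed layout). */
struct aio_ring_sketch {
	unsigned id;
	unsigned nr;
	unsigned head;
	unsigned tail;
	unsigned magic;
	unsigned header_length;
};

/* Userspace stand-in for a page: a plain byte buffer. */
struct page_sketch {
	unsigned char data[PAGE_SIZE_SKETCH];
};

/* Models the copy step only; there is no mapping or cache flush here. */
static void memcpy_to_page_sketch(struct page_sketch *page, size_t offset,
				  const char *from, size_t len)
{
	assert(offset + len <= PAGE_SIZE_SKETCH);
	memcpy(page->data + offset, from, len);
}

/* Same pattern as the patch: copy a compound literal into page 0. */
static void init_ring_sketch(struct page_sketch *page, unsigned nr_events)
{
	memcpy_to_page_sketch(page, 0, (const char *)&(struct aio_ring_sketch) {
			      .nr = nr_events, .id = ~0U,
			      .magic = RING_MAGIC_SKETCH,
			      /* .head and .tail are omitted, so they are 0 */
			      .header_length = sizeof(struct aio_ring_sketch) },
			      sizeof(struct aio_ring_sketch));
}
```

Even if the buffer holds stale bytes beforehand, the head and tail fields
read back as 0 after init_ring_sketch(), because the whole header is
overwritten by the zero-filled literal.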

Hi Al,

This patch has been on the list since Jan 19, 2023.
Is there anything that prevents it from being merged? Am I expected to make
further changes? Please note that it already has three "Reviewed-by:" tags
(thanks again to Ira, Jeff and Kent).

Can you please take it in your tree?

Thanks,

Fabio

Thread overview: 7+ messages
2023-01-19 16:20 [PATCH v3] fs/aio: Replace kmap{,_atomic}() with kmap_local_page() Fabio M. De Francesco
2023-03-03  5:23 ` Fabio M. De Francesco [this message]
2023-03-27 10:08 ` Fabio M. De Francesco
2023-03-27 13:22   ` Matthew Wilcox
2023-03-27 18:37     ` Kent Overstreet
2023-06-07 14:59     ` Fabio M. De Francesco
2023-06-09 15:04 ` Fabio M. De Francesco
