From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-pb0-f48.google.com (mail-pb0-f48.google.com [209.85.160.48])
	by kanga.kvack.org (Postfix) with ESMTP id 6FAB76B006E
	for ; Thu, 17 Oct 2013 09:50:04 -0400 (EDT)
Received: by mail-pb0-f48.google.com with SMTP id ma3so2303603pbc.7
	for ; Thu, 17 Oct 2013 06:50:04 -0700 (PDT)
Received: from /spool/local
	by e23smtp05.au.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted
	for from ; Thu, 17 Oct 2013 23:49:59 +1000
Received: from d23relay03.au.ibm.com (d23relay03.au.ibm.com [9.190.235.21])
	by d23dlp03.au.ibm.com (Postfix) with ESMTP id 0A8363578040
	for ; Fri, 18 Oct 2013 00:49:57 +1100 (EST)
Received: from d23av04.au.ibm.com (d23av04.au.ibm.com [9.190.235.139])
	by d23relay03.au.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id r9HDnjZT5308766
	for ; Fri, 18 Oct 2013 00:49:45 +1100
Received: from d23av04.au.ibm.com (localhost [127.0.0.1])
	by d23av04.au.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id r9HDmRag006212
	for ; Fri, 18 Oct 2013 00:48:27 +1100
Date: Thu, 17 Oct 2013 08:48:27 -0500
From: Robert Jennings
Subject: Re: [PATCH 1/2] vmsplice: unmap gifted pages for recipient
Message-ID: <20131017134827.GB19741@linux.vnet.ibm.com>
References: <1381177293-27125-1-git-send-email-rcj@linux.vnet.ibm.com>
 <1381177293-27125-2-git-send-email-rcj@linux.vnet.ibm.com>
 <525FB9EE.3070609@suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <525FB9EE.3070609@suse.cz>
Sender: owner-linux-mm@kvack.org
List-ID:
To: Vlastimil Babka
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, Alexander Viro, Rik van Riel, Andrea Arcangeli,
	Dave Hansen, Matt Helsley, Anthony Liguori, Michael Roth, Lei Li,
	Leonardo Garcia

* Vlastimil Babka (vbabka@suse.cz) wrote:
> On 10/07/2013 10:21 PM, Robert C Jennings wrote:
> > Introduce use of the unused SPLICE_F_MOVE flag for vmsplice to zap
> > pages.
> >
> > When vmsplice is called with flags (SPLICE_F_GIFT | SPLICE_F_MOVE) the
> > writer's gift'ed pages would be zapped.  This patch supports further work
> > to move vmsplice'd pages rather than copying them.  That patch has the
> > restriction that the page must not be mapped by the source for the move,
> > otherwise it will fall back to copying the page.
> >
> > Signed-off-by: Matt Helsley
> > Signed-off-by: Robert C Jennings
> > ---
> > Since the RFC went out I have coalesced the zap_page_range() call to
> > operate on VMAs rather than calling this for each page.  For a 256MB
> > vmsplice this reduced the write side 50% from the RFC.
> > ---
> >  fs/splice.c            | 51 +++++++++++++++++++++++++++++++++++++++++++++++++-
> >  include/linux/splice.h |  1 +
> >  2 files changed, 51 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/splice.c b/fs/splice.c
> > index 3b7ee65..a62d61e 100644
> > --- a/fs/splice.c
> > +++ b/fs/splice.c
> > @@ -188,12 +188,17 @@ ssize_t splice_to_pipe(struct pipe_inode_info *pipe,
> >  {
> >  	unsigned int spd_pages = spd->nr_pages;
> >  	int ret, do_wakeup, page_nr;
> > +	struct vm_area_struct *vma;
> > +	unsigned long user_start, user_end;
> >
> >  	ret = 0;
> >  	do_wakeup = 0;
> >  	page_nr = 0;
> > +	vma = NULL;
> > +	user_start = user_end = 0;
> >
> >  	pipe_lock(pipe);
> > +	down_read(&current->mm->mmap_sem);
>
> Seems like you could take the mmap_sem only when GIFT and MOVE is set.
> Maybe it won't help that much for performance but at least serve as
> documenting the reason it's needed?
>
> Vlastimil
>

I had been doing that previously but moving this outside the loop and
acquiring it once did improve performance.  I'll add a comment on
down_read() as to the reason for taking this though.
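
For illustration only, here is a minimal userspace sketch of the two ideas
discussed above: take the lock once around the whole loop, and gather
contiguous pages into one zap call per region. It is not the kernel code.
struct range, PAGE_SZ, model_lock(), model_find() and model_zap() are
hypothetical stand-ins for the VMA, PAGE_SIZE, down_read(),
find_vma_intersection() and zap_page_range(); they only count calls.

```c
#include <stddef.h>

#define PAGE_SZ 4096UL	/* assumed page size for this model */

/* Hypothetical stand-in for a VMA: half-open range [start, end). */
struct range { unsigned long start, end; };

static int lock_acquisitions;	/* counts model "down_read" calls */
static int zap_calls;		/* counts model "zap_page_range" calls */

static void model_lock(void)   { lock_acquisitions++; }
static void model_unlock(void) { }

static void model_zap(unsigned long start, unsigned long len)
{
	(void)start;
	(void)len;
	zap_calls++;
}

/* Return the region that fully contains the page at addr, or NULL. */
static const struct range *model_find(const struct range *r, size_t n,
				      unsigned long addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr >= r[i].start && addr + PAGE_SZ <= r[i].end)
			return &r[i];
	return NULL;
}

/* Zap the given pages, batching contiguous runs within one region. */
static void zap_batched(const struct range *regions, size_t nregions,
			const unsigned long *pages, size_t npages)
{
	const struct range *cur = NULL;
	unsigned long run_start = 0, run_end = 0;

	model_lock();	/* taken once, outside the loop */
	for (size_t i = 0; i < npages; i++) {
		unsigned long addr = pages[i];

		if (cur && addr == run_end && addr + PAGE_SZ <= cur->end) {
			run_end += PAGE_SZ;	/* same region, no hole */
			continue;
		}
		if (cur)	/* flush the run gathered so far */
			model_zap(run_start, run_end - run_start);
		cur = model_find(regions, nregions, addr);
		if (cur) {
			run_start = addr;
			run_end = addr + PAGE_SZ;
		}
	}
	if (cur)	/* flush the final run */
		model_zap(run_start, run_end - run_start);
	model_unlock();
}
```

In this model the lock count stays at one no matter how many pages are
passed in, and the zap count follows the number of contiguous runs rather
than the number of pages.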
-Rob

> >  	for (;;) {
> >  		if (!pipe->readers) {
> > @@ -212,8 +217,44 @@ ssize_t splice_to_pipe(struct pipe_inode_info *pipe,
> >  			buf->len = spd->partial[page_nr].len;
> >  			buf->private = spd->partial[page_nr].private;
> >  			buf->ops = spd->ops;
> > -			if (spd->flags & SPLICE_F_GIFT)
> > +			if (spd->flags & SPLICE_F_GIFT) {
> > +				unsigned long useraddr =
> > +						spd->partial[page_nr].useraddr;
> > +
> > +				if ((spd->flags & SPLICE_F_MOVE) &&
> > +						!buf->offset &&
> > +						(buf->len == PAGE_SIZE)) {
> > +					/* Can move page aligned buf, gather
> > +					 * requests to make a single
> > +					 * zap_page_range() call per VMA
> > +					 */
> > +					if (vma && (useraddr == user_end) &&
> > +							((useraddr + PAGE_SIZE) <=
> > +							 vma->vm_end)) {
> > +						/* same vma, no holes */
> > +						user_end += PAGE_SIZE;
> > +					} else {
> > +						if (vma)
> > +							zap_page_range(vma,
> > +								user_start,
> > +								(user_end -
> > +								 user_start),
> > +								NULL);
> > +						vma = find_vma_intersection(
> > +								current->mm,
> > +								useraddr,
> > +								(useraddr +
> > +								 PAGE_SIZE));
> > +						if (!IS_ERR_OR_NULL(vma)) {
> > +							user_start = useraddr;
> > +							user_end = (useraddr +
> > +								    PAGE_SIZE);
> > +						} else
> > +							vma = NULL;
> > +					}
> > +				}
> >  				buf->flags |= PIPE_BUF_FLAG_GIFT;
> > +			}
> >
> >  			pipe->nrbufs++;
> >  			page_nr++;
> > @@ -255,6 +296,10 @@ ssize_t splice_to_pipe(struct pipe_inode_info *pipe,
> >  		pipe->waiting_writers--;
> >  	}
> >
> > +	if (vma)
> > +		zap_page_range(vma, user_start, (user_end - user_start), NULL);
> > +
> > +	up_read(&current->mm->mmap_sem);
> >  	pipe_unlock(pipe);
> >
> >  	if (do_wakeup)
> > @@ -485,6 +530,7 @@ fill_it:
> >
> >  		spd.partial[page_nr].offset = loff;
> >  		spd.partial[page_nr].len = this_len;
> > +		spd.partial[page_nr].useraddr = index << PAGE_CACHE_SHIFT;
> >  		len -= this_len;
> >  		loff = 0;
> >  		spd.nr_pages++;
> > @@ -656,6 +702,7 @@ ssize_t default_file_splice_read(struct file *in, loff_t *ppos,
> >  		this_len = min_t(size_t, vec[i].iov_len, res);
> >  		spd.partial[i].offset = 0;
> >  		spd.partial[i].len = this_len;
> > +
		spd.partial[i].useraddr = (unsigned long)vec[i].iov_base;
> >  		if (!this_len) {
> >  			__free_page(spd.pages[i]);
> >  			spd.pages[i] = NULL;
> > @@ -1475,6 +1522,8 @@ static int get_iovec_page_array(const struct iovec __user *iov,
> >
> >  				partial[buffers].offset = off;
> >  				partial[buffers].len = plen;
> > +				partial[buffers].useraddr = (unsigned long)base;
> > +				base = (void*)((unsigned long)base + PAGE_SIZE);
> >
> >  				off = 0;
> >  				len -= plen;
> > diff --git a/include/linux/splice.h b/include/linux/splice.h
> > index 74575cb..56661e3 100644
> > --- a/include/linux/splice.h
> > +++ b/include/linux/splice.h
> > @@ -44,6 +44,7 @@ struct partial_page {
> >  	unsigned int offset;
> >  	unsigned int len;
> >  	unsigned long private;
> > +	unsigned long useraddr;
> >  };
> >
> >  /*
> >
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org