All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jens Axboe <jens.axboe@oracle.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>, linux-kernel@vger.kernel.org
Subject: Re: splice: move balance_dirty_pages_ratelimited() outside of     splice actor
Date: Tue, 12 Jun 2007 20:15:41 +0200	[thread overview]
Message-ID: <20070612181540.GP18832@kernel.dk> (raw)
In-Reply-To: <20070612173202.GO18832@kernel.dk>

On Tue, Jun 12 2007, Jens Axboe wrote:
> On Tue, Jun 12 2007, Andrew Morton wrote:
> > On Tue, 12 Jun 2007 14:44:50 +0200 Jens Axboe <jens.axboe@oracle.com> wrote:
> > 
> > > splice
> > 
> > btw, I'm staring in profound mystification at this:
> > 
> > int generic_pipe_buf_steal(struct pipe_inode_info *pipe,
> > 			   struct pipe_buffer *buf)
> > {
> > 	struct page *page = buf->page;
> > 
> > 	if (page_count(page) == 1) {
> > 		lock_page(page);
> > 		return 0;
> > 	}
> > 
> > 	return 1;
> > }
> > 
> > 
> > afacit that `if page_count(page)' test could be replaced by
> > `if today_is_tuesday()'.  But then I don't have the foggiest idea
> > what it's trying to do.
> > 
> > It would be nice to get some comments in and around here.
> > 
> > Also, I was trying to work out the role and responsibility of the ->pin
> > callback, and gave up.
> > 
> > There isn't a lot of point in explaining this over email - one should be
> > able to gain an understanding of these things by reading the code.  I think
> > the best way of tackling this would be to comprehensively document
> > pipe_buf_operations and pipe_inode_info, please...
> 
> OK so I wont explain it in detail here, I'll write up a good set of
> comments tonight.

I'll merge this into the #splice branch.

diff --git a/fs/pipe.c b/fs/pipe.c
index 3694af1..d007830 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -164,6 +164,20 @@ static void anon_pipe_buf_release(struct pipe_inode_info *pipe,
 		page_cache_release(page);
 }
 
+/**
+ * generic_pipe_buf_map - virtually map a pipe buffer
+ * @pipe:	the pipe that the buffer belongs to
+ * @buf:	the buffer that should be mapped
+ * @atomic:	whether to use an atomic map
+ *
+ * Description:
+ *	This function returns a kernel virtual address mapping for the
+ *	passed in @pipe_buffer. If @atomic is set, an atomic map is provided
+ *	and the caller has to be careful not to fault before calling
+ *	the unmap function.
+ *
+ *	Note that this function occupies KM_USER0 if @atomic != 0.
+ */
 void *generic_pipe_buf_map(struct pipe_inode_info *pipe,
 			   struct pipe_buffer *buf, int atomic)
 {
@@ -175,6 +189,15 @@ void *generic_pipe_buf_map(struct pipe_inode_info *pipe,
 	return kmap(buf->page);
 }
 
+/**
+ * generic_pipe_buf_unmap - unmap a previously mapped pipe buffer
+ * @pipe:	the pipe that the buffer belongs to
+ * @buf:	the buffer that should be unmapped
+ * @map_data:	the data that the mapping function returned
+ *
+ * Description:
+ *	This function undoes the mapping that ->map() provided.
+ */
 void generic_pipe_buf_unmap(struct pipe_inode_info *pipe,
 			    struct pipe_buffer *buf, void *map_data)
 {
@@ -185,11 +208,28 @@ void generic_pipe_buf_unmap(struct pipe_inode_info *pipe,
 		kunmap(buf->page);
 }
 
+/**
+ * generic_pipe_buf_steal - attempt to take ownership of a @pipe_buffer
+ * @pipe:	the pipe that the buffer belongs to
+ * @buf:	the buffer to attempt to steal
+ *
+ * Description:
+ *	This function attempts to steal the @struct page attached to
+ *	@buf. If successful, this function returns 0 and returns with
+ *	the page locked. The caller may then reuse the page for whatever
+ *	he wishes, the typical use is insertion into a different file
+ *	page cache.
+ */
 int generic_pipe_buf_steal(struct pipe_inode_info *pipe,
 			   struct pipe_buffer *buf)
 {
 	struct page *page = buf->page;
 
+	/*
+	 * A reference of one is golden, that means that the owner of this
+	 * page is the only one holding a reference to it. lock the page
+	 * and return OK.
+	 */
 	if (page_count(page) == 1) {
 		lock_page(page);
 		return 0;
@@ -198,11 +238,30 @@ int generic_pipe_buf_steal(struct pipe_inode_info *pipe,
 	return 1;
 }
 
-void generic_pipe_buf_get(struct pipe_inode_info *info, struct pipe_buffer *buf)
+/**
+ * generic_pipe_buf_get - get a reference to a @struct pipe_buffer
+ * @pipe:	the pipe that the buffer belongs to
+ * @buf:	the buffer to get a reference to
+ *
+ * Description:
+ *	This function grabs an extra reference to @buf. It's used in
+ *	in the tee() system call, when we duplicate the buffers in one
+ *	pipe into another.
+ */
+void generic_pipe_buf_get(struct pipe_inode_info *pipe, struct pipe_buffer *buf)
 {
 	page_cache_get(buf->page);
 }
 
+/**
+ * generic_pipe_buf_confirm - verify contents of the pipe buffer
+ * @pipe:	the pipe that the buffer belongs to
+ * @buf:	the buffer to confirm
+ *
+ * Description:
+ *	This function does nothing, because the generic pipe code uses
+ *	pages that are always good when inserted into the pipe.
+ */
 int generic_pipe_buf_confirm(struct pipe_inode_info *info,
 			     struct pipe_buffer *buf)
 {
diff --git a/fs/splice.c b/fs/splice.c
index b125a61..dd3be2c 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -85,6 +85,10 @@ static void page_cache_pipe_buf_release(struct pipe_inode_info *pipe,
 	buf->flags &= ~PIPE_BUF_FLAG_LRU;
 }
 
+/*
+ * Check whether the contents of buf is OK to access. Since the content
+ * is a page cache page, IO may be in flight.
+ */
 static int page_cache_pipe_buf_confirm(struct pipe_inode_info *pipe,
 				       struct pipe_buffer *buf)
 {
diff --git a/include/linux/pipe_fs_i.h b/include/linux/pipe_fs_i.h
index cc09fe8..333f389 100644
--- a/include/linux/pipe_fs_i.h
+++ b/include/linux/pipe_fs_i.h
@@ -17,6 +17,22 @@ struct pipe_buffer {
 	unsigned long private;
 };
 
+/**
+ *	struct pipe_inode_info - a linux kernel pipe
+ *	@wait: reader/writer wait point in case of empty/full pipe
+ *	@nrbufs: the number of non-empty pipe buffers in this pipe
+ *	@curbuf: the current pipe buffer entry
+ *	@tmp_page: cached released page
+ *	@readers: number of current readers of this pipe
+ *	@writers: number of current writers of this pipe
+ *	@waiting_writers: number of writers blocked waiting for room
+ *	@r_counter: reader counter
+ *	@w_counter: writer counter
+ *	@fasync_readers: reader side fasync
+ *	@fasync_writers: writer side fasync
+ *	@inode: inode this pipe is attached to
+ *	@bufs: the circular array of pipe buffers
+ **/
 struct pipe_inode_info {
 	wait_queue_head_t wait;
 	unsigned int nrbufs, curbuf;
@@ -43,15 +59,65 @@ struct pipe_inode_info {
  *	->unmap()
  *
  * That is, ->map() must be called on a confirmed buffer,
- * same goes for ->steal().
+ * same goes for ->steal(). See below for the meaning of each
+ * operation. Also see kerneldoc in fs/pipe.c for the pipe
+ * and generic variants of these hooks.
  */
 struct pipe_buf_operations {
+	/*
+	 * This is set to 1, if the generic pipe read/write may coalesce
+	 * data into an existing buffer. If this is set to 0, a new pipe
+	 * page segment is always used for new data.
+	 */
 	int can_merge;
+
+	/*
+	 * ->map() returns a virtual address mapping of the pipe buffer.
+	 * The last integer flag reflects whether this should be an atomic
+	 * mapping or not. The atomic map is faster, however you can't take
+	 * page faults before calling ->unmap() again. So if you need to eg
+	 * access user data through copy_to/from_user(), then you must get
+	 * a non-atomic map. ->map() uses the KM_USER0 atomic slot for
+	 * atomic maps, so you can't map more than one pipe_buffer at once
+	 * and you have to be careful if mapping another page as source
+	 * or destination for a copy (IOW, it has to use something else
+	 * than KM_USER0).
+	 */
 	void * (*map)(struct pipe_inode_info *, struct pipe_buffer *, int);
+
+	/*
+	 * Undoes ->map(), finishes the virtual mapping of the pipe buffer.
+	 */
 	void (*unmap)(struct pipe_inode_info *, struct pipe_buffer *, void *);
+
+	/*
+	 * ->confirm() verifies that the data in the pipe buffer is there
+	 * and that the contents are good. If the pages in the pipe belong
+	 * to a file system, we may need to wait for IO completion in this
+	 * hook. Returns 0 for good, or a negative error value in case of
+	 * error.
+	 */
 	int (*confirm)(struct pipe_inode_info *, struct pipe_buffer *);
+
+	/*
+	 * When the contents of this pipe buffer has been completely
+	 * consumed by a reader, ->release() is called.
+	 */
 	void (*release)(struct pipe_inode_info *, struct pipe_buffer *);
+
+	/*
+	 * Attempt to take ownership of the pipe buffer and its contents.
+	 * ->steal() returns 0 for success, in which case the contents
+	 * of the pipe (the buf->page) is locked and now completely owned
+	 * by the caller. The page may then be transferred to a different
+	 * mapping, the most often used case is insertion into different
+	 * file address space cache.
+	 */
 	int (*steal)(struct pipe_inode_info *, struct pipe_buffer *);
+
+	/*
+	 * Get a reference to the pipe buffer.
+	 */
 	void (*get)(struct pipe_inode_info *, struct pipe_buffer *);
 };
 

-- 
Jens Axboe


  reply	other threads:[~2007-06-12 18:18 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <200706112159.l5BLxF5x004043@hera.kernel.org>
2007-06-11 23:34 ` splice: move balance_dirty_pages_ratelimited() outside of splice actor Andrew Morton
2007-06-12  6:35   ` Jens Axboe
2007-06-12 11:20     ` Jens Axboe
2007-06-12 11:31       ` Jens Axboe
2007-06-12 12:06         ` Peter Zijlstra
2007-06-12 12:10           ` Jens Axboe
2007-06-12 12:16             ` Peter Zijlstra
2007-06-12 12:44               ` Jens Axboe
2007-06-12 16:02                 ` Andrew Morton
2007-06-12 16:46                 ` Andrew Morton
2007-06-12 17:32                   ` Jens Axboe
2007-06-12 18:15                     ` Jens Axboe [this message]
2007-06-12 18:43                       ` Andrew Morton
2007-06-12 18:48                         ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070612181540.GP18832@kernel.dk \
    --to=jens.axboe@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.