Re: splice(SPLICE_F_MOVE) problems

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Jens Axboe <axboe@suse.de>
To: Oleg Nesterov <oleg@tv-sign.ru>
Cc: linux-kernel@vger.kernel.org, Linus Torvalds <torvalds@osdl.org>,
	Ingo Molnar <mingo@elte.hu>
Subject: Re: splice(SPLICE_F_MOVE) problems
Date: Wed, 3 May 2006 08:56:45 +0200	[thread overview]
Message-ID: <20060503065644.GJ9712@suse.de> (raw)
In-Reply-To: <20060503041455.GA158@oleg>

On Wed, May 03 2006, Oleg Nesterov wrote:
> I am looking at current fs/splice.c from
> 	http://kernel.org/git/?p=linux/kernel/git/torvalds/....
> 
> 
> 	pipe_to_file:
> 
> 		if ((sd->flags & SPLICE_F_MOVE) && this_len == PAGE_CACHE_SIZE) {
> 			/*
> 			 * If steal succeeds, buf->page is now pruned from the vm
> 			 * side (page cache) and we can reuse it. The page will also
> 			 * be locked on successful return.
> 			 */
> 			if (buf->ops->steal(info, buf))
> 				goto find_page;
> 
> 			page = buf->page;
> 			page_cache_get(page);
> 
> Isn't it better to do this after successful successful add_to_page_cache() ?
> This way you don't need to do page_cache_release() if add_to_page_cache fails.

It is, I dunno why I moved the ordering around when fixing that up. I'll
move it down a block and skip the early page_cache_get.

> 			/*
> 			 * page must be on the LRU for adding to the pagecache.
> 
> Is it true? (looking at add_to_page_cache_lru).

My wording is a little confusing probably, I didn't mean to imply any
ordering contraints there. Just simply state that a page cannot be in
the pagecache and not on the LRU.

> 			if (!PageLRU(page)) {
> 				struct zone *zone = page_zone(page);
> 
> 				spin_lock_irq(&zone->lru_lock);
> 				if (!PageLRU(page)) {
> 					SetPageLRU(page);
> 					add_page_to_inactive_list(zone, page);
> 				}
> 				spin_unlock_irq(&zone->lru_lock);
> 
> Why not lru_cache_add() ?

Because for calling lru_cache_add, I have to be absolutely certain that
the page isn't on the LRU already. So I'd have to grab the zone lru_lock
first anyways, and I cannot call lru_cache_add() with that held since
__pagevec_lru_add() also grabs it. Hence the "open coding".

> 			if (add_to_page_cache(page, mapping, index, gfp_mask)) {
> 				page_cache_release(page);
> 				unlock_page(page);
> 				goto find_page;
> 
> This leaves unmapped page on ->inactive_list. page_count(page) == 1.
> What if shrink_inactive_list() finds this page, increments
> page->count, and calls shrink_page_list()->remove_mapping() again?
> Hmm... So page_cache_pipe_buf_steal() has a reason to return a locked
> page (I am not an expert, please correct me).

With the block moved above the LRU business, this disappears as well.

> However, user_page_pipe_buf_steal() returns unlocked page in
> PIPE_BUF_FLAG_GIFT case. So, if add_to_page_cache() fails,
> unlock_page() will trigger BUG().

It does, it calls generic_pipe_buf_steal() which locks it.

> 		ret = mapping->a_ops->prepare_write(file, page, offset, offset+this_len);
> 		if (ret == AOP_TRUNCATED_PAGE) {
> 			page_cache_release(page);
> 			goto find_page;
> 
> We also need to unlock(page) if it was stealed.

Are you sure that's the right test? Don't you mean if ret !=
AOP_TRUNCATED_PAGE && ret?

How about the attached?

diff --git a/fs/splice.c b/fs/splice.c
index 7fb0497..9b10ec4 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -574,12 +574,17 @@ static int pipe_to_file(struct pipe_inod
 			goto find_page;
 
 		page = buf->page;
+		if (add_to_page_cache(page, mapping, index, gfp_mask)) {
+			unlock_page(page);
+			goto find_page;
+		}
+
 		page_cache_get(page);
 
 		/*
-		 * page must be on the LRU for adding to the pagecache.
-		 * Check this without grabbing the zone lock, if it isn't
-		 * the do grab the zone lock, recheck, and add if necessary.
+		 * The page cannot reside in the pagecache and not be on
+		 * the LRU list. It should be safe to check PageLRU()
+		 * outside the zone lru_lock, as long as we recheck it inside.
 		 */
 		if (!PageLRU(page)) {
 			struct zone *zone = page_zone(page);
@@ -591,12 +596,6 @@ static int pipe_to_file(struct pipe_inod
 			}
 			spin_unlock_irq(&zone->lru_lock);
 		}
-
-		if (add_to_page_cache(page, mapping, index, gfp_mask)) {
-			page_cache_release(page);
-			unlock_page(page);
-			goto find_page;
-		}
 	} else {
 find_page:
 		page = find_lock_page(mapping, index);
@@ -647,11 +646,15 @@ find_page:
 	}
 
 	ret = mapping->a_ops->prepare_write(file, page, offset, offset+this_len);
-	if (ret == AOP_TRUNCATED_PAGE) {
+	if (unlikely(ret)) {
+		if (ret != AOP_TRUNCATED_PAGE)
+			unlock_page(page);
 		page_cache_release(page);
-		goto find_page;
-	} else if (ret)
+		if (ret == AOP_TRUNCATED_PAGE)
+			goto find_page;
+
 		goto out;
+	}
 
 	if (buf->page != page) {
 		/*

-- 
Jens Axboe

next prev parent reply	other threads:[~2006-05-03  6:56 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-05-01  6:59 splice(SPLICE_F_MOVE) problems Oleg Nesterov
2006-05-01  6:54 ` Jens Axboe
2006-05-01 19:06   ` Oleg Nesterov
2006-05-01 17:41     ` Jens Axboe
2006-05-02  0:11       ` Oleg Nesterov
2006-05-02  5:28         ` Jens Axboe
2006-05-03  4:14           ` Oleg Nesterov
2006-05-03  6:56             ` Jens Axboe [this message]
2006-05-03 14:35               ` Oleg Nesterov
2006-05-03 10:48                 ` Jens Axboe

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:7fb0497 dfblob:9b10ec4 )
 OR (
bs:"Re: splice(SPLICE_F_MOVE) problems" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20060503065644.GJ9712@suse.de \
    --to=axboe@suse.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=oleg@tv-sign.ru \
    --cc=torvalds@osdl.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.