linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: NeilBrown <neilb@suse.com>
To: "Wang\,
	Alan 1. \(NSB - CN\/Hangzhou\)" <alan.1.wang@nokia-sbell.com>,
	"linux-nfs\@vger.kernel.org" <linux-nfs@vger.kernel.org>
To: "J. Bruce Fields" <bfields@fieldses.org>
Subject: Re: A special case in write_flush which cause the umount busy
Date: Wed, 14 Feb 2018 06:57:06 +1100	[thread overview]
Message-ID: <87inb0r7cd.fsf@notabene.neil.brown.name> (raw)
In-Reply-To: <VI1PR07MB342127E2A55088F691693A3BA2F00@VI1PR07MB3421.eurprd07.prod.outlook.com>

[-- Attachment #1: Type: text/plain, Size: 4649 bytes --]

On Sun, Feb 11 2018, Wang, Alan 1. (NSB - CN/Hangzhou) wrote:

> Hi,
>
> I have a test case on mount/umount on a partition from nfs server side. And encounter a problem of umount busy in a low probability.
>
> The Linux version is 3.10.64 with the patch "sunrpc/cache: make cache flushing more reliable".
> https://patchwork.kernel.org/patch/7410021/
>
> After some analysis and test in many times, I find that when it failed to mount, the time "then" and "now" are different, which caused the last_refresh is far beyond the flush_time. So this cache is not expired and won't be clean at once. 
> Then the ref in cache_head won't be released, and mntput_no_expire didn't be called to decrease the count. That caused the umount busy.
>
> Below are logs in my test.
>
> kernel: [  292.767801] write_flush 1480 then = 249, now = 250
> kernel: [  292.767817] cache_clean 451, cd name nfsd.fh expiry_time 790485253, cd flush_time 249, last_refresh 369, seconds_since_boot 250
> kernel: [  292.767913] write_flush 1480 then = 249, now = 250
> kernel: [  292.767928] cache_clean 451, cd name nfsd.export expiry_time 2049, cd flush_time 249, last_refresh 369, seconds_since_boot 250
> kernel: [  292.773229] do_refcount_check 283 mycount 4
> kernel: [  292.773245] do_umount 1344 retval -16
>
> I think this happens in such case that the exportfs writes the flush with current time, the time of "then". But when seconds_since_boot being called in function write_flush, the time is on the next second, so the "now" is one second after "then".
> Because "then" is less than "now", the flush_time is set directly to original "then", rather than "cd->flush_time + 1".
>
>
> And I want to change the condition as below. I'm not sure it's OK and has no effects to other part.

I think this change is safe.  It is a bit of a hack, but the "flush"
interface was actually very poorly designed and I don't think it *can*
be implemented cleanly.
Maybe we should just ignore the number written in and *always* flush
the cache completely.  That is was it always wanted.

See below.

Bruce: do you have any thoughts on this?

Thanks,
NeilBrown

diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index aa36dad32db1..43f117d1547e 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -1450,7 +1450,7 @@ static ssize_t write_flush(struct file *file, const char __user *buf,
 			   struct cache_detail *cd)
 {
 	char tbuf[20];
-	char *bp, *ep;
+	char *ep;
 	time_t then, now;
 
 	if (*ppos || count > sizeof(tbuf)-1)
@@ -1461,24 +1461,24 @@ static ssize_t write_flush(struct file *file, const char __user *buf,
 	simple_strtoul(tbuf, &ep, 0);
 	if (*ep && *ep != '\n')
 		return -EINVAL;
+	/* Note that while we check that 'buf' holds a valid number,
+	 * we always ignore the value and just flush everything.
+	 * Making use of the number leads to races.
+	 */
 
-	bp = tbuf;
-	then = get_expiry(&bp);
 	now = seconds_since_boot();
-	cd->nextcheck = now;
-	/* Can only set flush_time to 1 second beyond "now", or
+	/* Always flush everything, so behave like cache_purge()
+	 * Can only set flush_time to 1 second beyond "now", or
 	 * possibly 1 second beyond flushtime.  This is because
 	 * flush_time never goes backwards so it mustn't get too far
 	 * ahead of time.
 	 */
-	if (then >= now) {
-		/* Want to flush everything, so behave like cache_purge() */
-		if (cd->flush_time >= now)
-			now = cd->flush_time + 1;
-		then = now;
-	}
 
-	cd->flush_time = then;
+	if (cd->flush_time >= now)
+		now = cd->flush_time + 1;
+
+	cd->flush_time = now;
+	cd->nextcheck = now;
 	cache_flush();
 
 	*ppos += count;


>
> --------------------------------------------------------------------------
>        then = get_expiry(&bp);
>        now = seconds_since_boot();
>        cd->nextcheck = now;
>        /* Can only set flush_time to 1 second beyond "now", or
>        * possibly 1 second beyond flushtime.  This is because
>        * flush_time never goes backwards so it mustn't get too far
>        * ahead of time.
>        */
>        if (then != now)
>               printk("%s %d then = %d, now = %d\n", __func__, __LINE__, then, now);
> -      if (then >= now) {
> +     if (then >= now - 1) {
>               /* Want to flush everything, so behave like cache_purge() */
>               if (cd->flush_time >= now)
>                      now = cd->flush_time + 1;
>               then = now;
>        }
>
>        cd->flush_time = then;
> --------------------------------------------------------------------------
>
> Best Regards,
> Alan Wang

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

  reply	other threads:[~2018-02-13 19:57 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-02-11  6:55 A special case in write_flush which cause the umount busy Wang, Alan 1. (NSB - CN/Hangzhou)
2018-02-13 19:57 ` NeilBrown [this message]
2018-02-13 21:03   ` J. Bruce Fields
2018-02-14  1:15     ` [PATCH] SUNRPC: cache: ignore timestamp written to 'flush' file NeilBrown
2018-02-22  0:46       ` J. Bruce Fields

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87inb0r7cd.fsf@notabene.neil.brown.name \
    --to=neilb@suse.com \
    --cc=alan.1.wang@nokia-sbell.com \
    --cc=linux-nfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).