All of lore.kernel.org
 help / color / mirror / Atom feed
From: NeilBrown <neilb@suse.com>
To: "Wang\,
	Alan 1. \(NSB - CN\/Hangzhou\)" <alan.1.wang@nokia-sbell.com>,
	"linux-nfs\@vger.kernel.org" <linux-nfs@vger.kernel.org>
To: "J. Bruce Fields" <bfields@fieldses.org>
Subject: Re: A special case in write_flush which cause the umount busy
Date: Wed, 14 Feb 2018 06:57:06 +1100	[thread overview]
Message-ID: <87inb0r7cd.fsf@notabene.neil.brown.name> (raw)
In-Reply-To: <VI1PR07MB342127E2A55088F691693A3BA2F00@VI1PR07MB3421.eurprd07.prod.outlook.com>

[-- Attachment #1: Type: text/plain, Size: 4649 bytes --]

On Sun, Feb 11 2018, Wang, Alan 1. (NSB - CN/Hangzhou) wrote:

> Hi,
>
> I have a test case on mount/umount on a partition from nfs server side. And encounter a problem of umount busy in a low probability.
>
> The Linux version is 3.10.64 with the patch "sunrpc/cache: make cache flushing more reliable".
> https://patchwork.kernel.org/patch/7410021/
>
> After some analysis and test in many times, I find that when it failed to mount, the time "then" and "now" are different, which caused the last_refresh is far beyond the flush_time. So this cache is not expired and won't be clean at once. 
> Then the ref in cache_head won't be released, and mntput_no_expire didn't be called to decrease the count. That caused the umount busy.
>
> Below are logs in my test.
>
> kernel: [  292.767801] write_flush 1480 then = 249, now = 250
> kernel: [  292.767817] cache_clean 451, cd name nfsd.fh expiry_time 790485253, cd flush_time 249, last_refresh 369, seconds_since_boot 250
> kernel: [  292.767913] write_flush 1480 then = 249, now = 250
> kernel: [  292.767928] cache_clean 451, cd name nfsd.export expiry_time 2049, cd flush_time 249, last_refresh 369, seconds_since_boot 250
> kernel: [  292.773229] do_refcount_check 283 mycount 4
> kernel: [  292.773245] do_umount 1344 retval -16
>
> I think this happens in such case that the exportfs writes the flush with current time, the time of "then". But when seconds_since_boot being called in function write_flush, the time is on the next second, so the "now" is one second after "then".
> Because "then" is less than "now", the flush_time is set directly to original "then", rather than "cd->flush_time + 1".
>
>
> And I want to change the condition as below. I'm not sure it's OK and has no effects to other part.

I think this change is safe.  It is a bit of a hack, but the "flush"
interface was actually very poorly designed and I don't think it *can*
be implemented cleanly.
Maybe we should just ignore the number written in and *always* flush
the cache completely.  That is was it always wanted.

See below.

Bruce: do you have any thoughts on this?

Thanks,
NeilBrown

diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index aa36dad32db1..43f117d1547e 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -1450,7 +1450,7 @@ static ssize_t write_flush(struct file *file, const char __user *buf,
 			   struct cache_detail *cd)
 {
 	char tbuf[20];
-	char *bp, *ep;
+	char *ep;
 	time_t then, now;
 
 	if (*ppos || count > sizeof(tbuf)-1)
@@ -1461,24 +1461,24 @@ static ssize_t write_flush(struct file *file, const char __user *buf,
 	simple_strtoul(tbuf, &ep, 0);
 	if (*ep && *ep != '\n')
 		return -EINVAL;
+	/* Note that while we check that 'buf' holds a valid number,
+	 * we always ignore the value and just flush everything.
+	 * Making use of the number leads to races.
+	 */
 
-	bp = tbuf;
-	then = get_expiry(&bp);
 	now = seconds_since_boot();
-	cd->nextcheck = now;
-	/* Can only set flush_time to 1 second beyond "now", or
+	/* Always flush everything, so behave like cache_purge()
+	 * Can only set flush_time to 1 second beyond "now", or
 	 * possibly 1 second beyond flushtime.  This is because
 	 * flush_time never goes backwards so it mustn't get too far
 	 * ahead of time.
 	 */
-	if (then >= now) {
-		/* Want to flush everything, so behave like cache_purge() */
-		if (cd->flush_time >= now)
-			now = cd->flush_time + 1;
-		then = now;
-	}
 
-	cd->flush_time = then;
+	if (cd->flush_time >= now)
+		now = cd->flush_time + 1;
+
+	cd->flush_time = now;
+	cd->nextcheck = now;
 	cache_flush();
 
 	*ppos += count;


>
> --------------------------------------------------------------------------
>        then = get_expiry(&bp);
>        now = seconds_since_boot();
>        cd->nextcheck = now;
>        /* Can only set flush_time to 1 second beyond "now", or
>        * possibly 1 second beyond flushtime.  This is because
>        * flush_time never goes backwards so it mustn't get too far
>        * ahead of time.
>        */
>        if (then != now)
>               printk("%s %d then = %d, now = %d\n", __func__, __LINE__, then, now);
> -      if (then >= now) {
> +     if (then >= now - 1) {
>               /* Want to flush everything, so behave like cache_purge() */
>               if (cd->flush_time >= now)
>                      now = cd->flush_time + 1;
>               then = now;
>        }
>
>        cd->flush_time = then;
> --------------------------------------------------------------------------
>
> Best Regards,
> Alan Wang

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

  reply	other threads:[~2018-02-13 19:57 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-02-11  6:55 A special case in write_flush which cause the umount busy Wang, Alan 1. (NSB - CN/Hangzhou)
2018-02-13 19:57 ` NeilBrown [this message]
2018-02-13 21:03   ` J. Bruce Fields
2018-02-14  1:15     ` [PATCH] SUNRPC: cache: ignore timestamp written to 'flush' file NeilBrown
2018-02-22  0:46       ` J. Bruce Fields

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87inb0r7cd.fsf@notabene.neil.brown.name \
    --to=neilb@suse.com \
    --cc=alan.1.wang@nokia-sbell.com \
    --cc=linux-nfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.