public inbox for linux-kernel@vger.kernel.org
From: Kirill Korotaev <dev@sw.ru>
To: balbir@in.ibm.com
Cc: Jan Blunck <jblunck@suse.de>,
	viro@zeniv.linux.org.uk, linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@osdl.org>,
	olh@suse.de
Subject: Re: [PATCH] shrink_dcache_parent() races against shrink_dcache_memory()
Date: Tue, 24 Jan 2006 12:48:13 +0300	[thread overview]
Message-ID: <43D5F7DD.7010507@sw.ru> (raw)
In-Reply-To: <20060124055425.GA30609@in.ibm.com>

I like your idea, but see my comments below... I doubt it works.
I will think it over a bit later...

Kirill
P.S. It's not easily reproducible. Before my fix it took us 3-6 hours of
automated stress testing to hit this bug. Right now I can't set it up for
testing; maybe in a week or so.

>>On Mon, Jan 23, Kirill Korotaev wrote:
>>
>>[snip] 
>>
>>Hmm, will think about that one again. shrink_dcache_parent() and
>>shrink_dcache_memory()/dput() are not racing against each other now since the
>>reference counting is done before giving up dcache_lock and the select_parent
>>could start.
>>
>>Regards,
>>	Jan
>>
> 
> 
> I have been playing around with a possible solution to the problem.
> I have not been able to reproduce this issue, hence I am unable to verify
> if the patch below fixes the problem. I have run the system with this
> patch and verified that no obvious badness is observed.
> 
> Kirill, Jan: if you can easily reproduce the problem, could you
> try this patch and also review it for correctness of the solution?
> 
> All callers that try to free memory set the PF_MEMALLOC flag. We check
> if the super block is going away due to an unmount; if so, we ask the
> allocator to return.
> 
> The patch adds the additional cost of taking sb_lock for each dentry
> being pruned. It takes sb_lock under dentry->d_lock and dcache_lock;
> I am not sure about the locking order of these locks.
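
The deadlock question raised above comes down to lock ordering: as long as
every path acquires the three locks in one fixed order, no cycle of waiters
can form. A minimal userspace sketch of that discipline, using pthread
mutexes as stand-ins for the kernel spinlocks (all names here are
illustrative, not kernel API):

```c
#include <pthread.h>

/* Fixed acquisition order: dcache_lock -> d_lock -> sb_lock.
 * If every code path respects this order, no deadlock cycle can
 * form among these three locks.  Plain pthread mutexes stand in
 * for the kernel spinlocks; nothing here is kernel API. */
static pthread_mutex_t dcache_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t d_lock      = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t sb_lock     = PTHREAD_MUTEX_INITIALIZER;

/* Take all three locks in the fixed order, do some work, release
 * in reverse order.  Returns the updated progress counter. */
static int in_order_section(int *progress)
{
	pthread_mutex_lock(&dcache_lock);
	pthread_mutex_lock(&d_lock);
	pthread_mutex_lock(&sb_lock);
	(*progress)++;               /* critical section */
	pthread_mutex_unlock(&sb_lock);
	pthread_mutex_unlock(&d_lock);
	pthread_mutex_unlock(&dcache_lock);
	return *progress;
}
```

The danger would be a second path taking, say, sb_lock before
dcache_lock; then two CPUs could each hold one lock and wait forever
on the other.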
> 
> Signed-off-by: Balbir Singh <balbir@in.ibm.com>
> ---
> 
>  fs/dcache.c |   23 +++++++++++++++++++++++
>  1 files changed, 23 insertions(+)
> 
> diff -puN fs/dcache.c~dcache_race_fix2 fs/dcache.c
> --- linux-2.6/fs/dcache.c~dcache_race_fix2	2006-01-24 11:05:46.000000000 +0530
> +++ linux-2.6-balbir/fs/dcache.c	2006-01-24 11:05:46.000000000 +0530
> @@ -425,6 +425,29 @@ static void prune_dcache(int count)
>   			spin_unlock(&dentry->d_lock);
>  			continue;
>  		}
> +
> +		/*
> +		 * Note to reviewers: our current lock order is dcache_lock,
> +		 * dentry->d_lock & sb_lock. Could this create a deadlock?
> +		 */
> +		spin_lock(&sb_lock);
<<<< 1. sb_lock doesn't protect atomic_read() anyhow...
<<<<    I mean, sb_lock is not required to read its value...
> +		if (!atomic_read(&dentry->d_sb->s_active)) {
> +			/*
> +			 * Race condition, umount and other pruning is happening
> +			 * in parallel.
> +			 */
> +			if (current->flags & PF_MEMALLOC) {
> +				/*
> +				 * let the allocator leave this dentry alone
> +				 */
> +				spin_unlock(&sb_lock);
> +				spin_unlock(&dentry->d_lock);
> +				spin_unlock(&dcache_lock);
> +				return;
<<<< you should not return but rather 'continue'; otherwise you skip
_all_ remaining dentries, even those from active super blocks.
> +			}
> +		}
> +		spin_unlock(&sb_lock);
> +
<<<< and here, when you drop sb_lock, and then dentry->d_lock/dcache_lock
in prune_one_dentry(), it looks to me that we have exactly the same
situation as without your patch:
<<<< another CPU can start an umount in parallel.
<<<< maybe the sb_lock barrier helps this somehow, but I can't see how yet...

<<<< another idea: down_read(&sb->s_umount) probably could help...
<<<< because it would block the whole umount operation...
<<<< but we can't take it under dcache_lock...
>  		prune_one_dentry(dentry);
>  	}
>  	spin_unlock(&dcache_lock);
> 
> Thanks,
> Balbir
> _
> 
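
Kirill's return-vs-continue objection above can be seen in a toy model of
the prune loop: dentries from several superblocks sit on one LRU list, and
the two exit strategies skip very different amounts of work. The names
(toy_dentry, sb_active) are made up for illustration:

```c
/* Toy model of the prune loop.  With 'continue', only dentries
 * whose superblock is going away are skipped; with 'return',
 * every dentry after the first dead-sb one is abandoned too,
 * even those from perfectly active superblocks. */
struct toy_dentry { int sb_active; };

/* Skip only dead-sb dentries; returns how many were pruned. */
static int prune_continue(const struct toy_dentry *d, int n)
{
	int pruned = 0;
	for (int i = 0; i < n; i++) {
		if (!d[i].sb_active)
			continue;        /* skip this dentry only */
		pruned++;
	}
	return pruned;
}

/* Bail out at the first dead-sb dentry, as the patch does. */
static int prune_return(const struct toy_dentry *d, int n)
{
	int pruned = 0;
	for (int i = 0; i < n; i++) {
		if (!d[i].sb_active)
			return pruned;   /* abandons all remaining dentries */
		pruned++;
	}
	return pruned;
}
```

On a list of four dentries where only the second one belongs to a dying
superblock, the continue version prunes three dentries while the return
version prunes just one, which is the behavior Kirill flags.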




Thread overview: 20+ messages
2006-01-20 20:36 [PATCH] shrink_dcache_parent() races against shrink_dcache_memory() Jan Blunck
2006-01-23  5:22 ` Andrew Morton
2006-01-23  8:12   ` Kirill Korotaev
2006-01-23 15:13   ` Jan Blunck
2006-01-23  8:07 ` Kirill Korotaev
2006-01-23 15:57   ` Jan Blunck
2006-01-24  5:54     ` Balbir Singh
2006-01-24  9:48       ` Kirill Korotaev [this message]
2006-01-24 11:10         ` Balbir Singh
2006-01-24 17:18           ` Kirill Korotaev
2006-01-25  7:03             ` Balbir Singh
2006-01-30 12:03   ` Jan Blunck
2006-01-30 14:38     ` Balbir Singh
2006-01-30 14:54       ` Jan Blunck
2006-01-30 15:02         ` Kirill Korotaev
2006-01-30 15:25           ` Jan Blunck
2006-01-30 15:31             ` Kirill Korotaev
2006-01-30 14:42     ` Kirill Korotaev
2006-01-30 14:58       ` Jan Blunck
2006-01-30 15:59         ` Kirill Korotaev
