All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Jan Kara <jack@suse.cz>,
	linux-nvdimm@lists.01.org, Dave Chinner <david@fromorbit.com>,
	linux-kernel@vger.kernel.org, stable@vger.kernel.org,
	Christoph Hellwig <hch@lst.de>,
	linux-mm@kvack.org, Dave Hansen <dave.hansen@intel.com>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH] dax: fix deadlock with DAX 4k holes
Date: Wed, 4 Jan 2017 08:18:04 +0100	[thread overview]
Message-ID: <20170104071804.GF3780@quack2.suse.cz> (raw)
In-Reply-To: <1483479365-13607-1-git-send-email-ross.zwisler@linux.intel.com>

On Tue 03-01-17 14:36:05, Ross Zwisler wrote:
> Currently in DAX if we have three read faults on the same hole address we
> can end up with the following:
> 
> Thread 0		Thread 1		Thread 2
> --------		--------		--------
> dax_iomap_fault
>  grab_mapping_entry
>   lock_slot
>    <locks empty DAX entry>
> 
>   			dax_iomap_fault
> 			 grab_mapping_entry
> 			  get_unlocked_mapping_entry
> 			   <sleeps on empty DAX entry>
> 
> 						dax_iomap_fault
> 						 grab_mapping_entry
> 						  get_unlocked_mapping_entry
> 						   <sleeps on empty DAX entry>
>   dax_load_hole
>    find_or_create_page
>    ...
>     page_cache_tree_insert
>      dax_wake_mapping_entry_waiter
>       <wakes one sleeper>
>      __radix_tree_replace
>       <swaps empty DAX entry with 4k zero page>
> 
> 			<wakes>
> 			get_page
> 			lock_page
> 			...
> 			put_locked_mapping_entry
> 			unlock_page
> 			put_page
> 
> 						<sleeps forever on the DAX
> 						 wait queue>
> 
> The crux of the problem is that once we insert a 4k zero page, all locking
> from then on is done in terms of that 4k zero page and any additional
> threads sleeping on the empty DAX entry will never be woken.  Fix this by
> waking all sleepers when we replace the DAX radix tree entry with a 4k zero
> page.  This will allow all sleeping threads to successfully transition from
> locking based on the DAX empty entry to locking on the 4k zero page.
> 
> With the test case reported by Xiong this happens very regularly in my test
> setup, with some runs resulting in 9+ threads in this deadlocked state.
> With this fix I've been able to run that same test dozens of times in a
> loop without issue.
> 
> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> Reported-by: Xiong Zhou <xzhou@redhat.com>
> Fixes: commit ac401cc78242 ("dax: New fault locking")
> Cc: Jan Kara <jack@suse.cz>
> Cc: stable@vger.kernel.org # 4.7+

Ah, very good catch. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

I wonder why I was not able to reproduce this... Probably the timing didn't
work out right on my test machine.

								Honza

> ---
> 
> This issue exists as far back as v4.7, and I was easly able to reproduce it
> with v4.7 using the same test.
> 
> Unfortunately this patch won't apply cleanly to the stable trees, but the
> change is very simple and should be easy to replicate by hand.  Please ping
> me if you'd like patches that apply cleanly to the v4.9 and v4.8.15 trees.
> 
> ---
>  mm/filemap.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/filemap.c b/mm/filemap.c
> index d0e4d10..b772a33 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -138,7 +138,7 @@ static int page_cache_tree_insert(struct address_space *mapping,
>  				dax_radix_locked_entry(0, RADIX_DAX_EMPTY));
>  			/* Wakeup waiters for exceptional entry lock */
>  			dax_wake_mapping_entry_waiter(mapping, page->index, p,
> -						      false);
> +						      true);
>  		}
>  	}
>  	__radix_tree_replace(&mapping->page_tree, node, slot, page,
> -- 
> 2.7.4
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

WARNING: multiple messages have this Message-ID (diff)
From: Jan Kara <jack@suse.cz>
To: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Xiong Zhou <xzhou@redhat.com>,
	stable@vger.kernel.org, linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Christoph Hellwig <hch@lst.de>,
	Dan Williams <dan.j.williams@intel.com>,
	Dave Chinner <david@fromorbit.com>,
	Dave Hansen <dave.hansen@intel.com>, Jan Kara <jack@suse.cz>,
	linux-mm@kvack.org, linux-nvdimm@lists.01.org
Subject: Re: [PATCH] dax: fix deadlock with DAX 4k holes
Date: Wed, 4 Jan 2017 08:18:04 +0100	[thread overview]
Message-ID: <20170104071804.GF3780@quack2.suse.cz> (raw)
In-Reply-To: <1483479365-13607-1-git-send-email-ross.zwisler@linux.intel.com>

On Tue 03-01-17 14:36:05, Ross Zwisler wrote:
> Currently in DAX if we have three read faults on the same hole address we
> can end up with the following:
> 
> Thread 0		Thread 1		Thread 2
> --------		--------		--------
> dax_iomap_fault
>  grab_mapping_entry
>   lock_slot
>    <locks empty DAX entry>
> 
>   			dax_iomap_fault
> 			 grab_mapping_entry
> 			  get_unlocked_mapping_entry
> 			   <sleeps on empty DAX entry>
> 
> 						dax_iomap_fault
> 						 grab_mapping_entry
> 						  get_unlocked_mapping_entry
> 						   <sleeps on empty DAX entry>
>   dax_load_hole
>    find_or_create_page
>    ...
>     page_cache_tree_insert
>      dax_wake_mapping_entry_waiter
>       <wakes one sleeper>
>      __radix_tree_replace
>       <swaps empty DAX entry with 4k zero page>
> 
> 			<wakes>
> 			get_page
> 			lock_page
> 			...
> 			put_locked_mapping_entry
> 			unlock_page
> 			put_page
> 
> 						<sleeps forever on the DAX
> 						 wait queue>
> 
> The crux of the problem is that once we insert a 4k zero page, all locking
> from then on is done in terms of that 4k zero page and any additional
> threads sleeping on the empty DAX entry will never be woken.  Fix this by
> waking all sleepers when we replace the DAX radix tree entry with a 4k zero
> page.  This will allow all sleeping threads to successfully transition from
> locking based on the DAX empty entry to locking on the 4k zero page.
> 
> With the test case reported by Xiong this happens very regularly in my test
> setup, with some runs resulting in 9+ threads in this deadlocked state.
> With this fix I've been able to run that same test dozens of times in a
> loop without issue.
> 
> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> Reported-by: Xiong Zhou <xzhou@redhat.com>
> Fixes: commit ac401cc78242 ("dax: New fault locking")
> Cc: Jan Kara <jack@suse.cz>
> Cc: stable@vger.kernel.org # 4.7+

Ah, very good catch. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

I wonder why I was not able to reproduce this... Probably the timing didn't
work out right on my test machine.

								Honza

> ---
> 
> This issue exists as far back as v4.7, and I was easly able to reproduce it
> with v4.7 using the same test.
> 
> Unfortunately this patch won't apply cleanly to the stable trees, but the
> change is very simple and should be easy to replicate by hand.  Please ping
> me if you'd like patches that apply cleanly to the v4.9 and v4.8.15 trees.
> 
> ---
>  mm/filemap.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/filemap.c b/mm/filemap.c
> index d0e4d10..b772a33 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -138,7 +138,7 @@ static int page_cache_tree_insert(struct address_space *mapping,
>  				dax_radix_locked_entry(0, RADIX_DAX_EMPTY));
>  			/* Wakeup waiters for exceptional entry lock */
>  			dax_wake_mapping_entry_waiter(mapping, page->index, p,
> -						      false);
> +						      true);
>  		}
>  	}
>  	__radix_tree_replace(&mapping->page_tree, node, slot, page,
> -- 
> 2.7.4
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: Jan Kara <jack@suse.cz>
To: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Xiong Zhou <xzhou@redhat.com>,
	stable@vger.kernel.org, linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Christoph Hellwig <hch@lst.de>,
	Dan Williams <dan.j.williams@intel.com>,
	Dave Chinner <david@fromorbit.com>,
	Dave Hansen <dave.hansen@intel.com>, Jan Kara <jack@suse.cz>,
	linux-mm@kvack.org, linux-nvdimm@ml01.01.org
Subject: Re: [PATCH] dax: fix deadlock with DAX 4k holes
Date: Wed, 4 Jan 2017 08:18:04 +0100	[thread overview]
Message-ID: <20170104071804.GF3780@quack2.suse.cz> (raw)
In-Reply-To: <1483479365-13607-1-git-send-email-ross.zwisler@linux.intel.com>

On Tue 03-01-17 14:36:05, Ross Zwisler wrote:
> Currently in DAX if we have three read faults on the same hole address we
> can end up with the following:
> 
> Thread 0		Thread 1		Thread 2
> --------		--------		--------
> dax_iomap_fault
>  grab_mapping_entry
>   lock_slot
>    <locks empty DAX entry>
> 
>   			dax_iomap_fault
> 			 grab_mapping_entry
> 			  get_unlocked_mapping_entry
> 			   <sleeps on empty DAX entry>
> 
> 						dax_iomap_fault
> 						 grab_mapping_entry
> 						  get_unlocked_mapping_entry
> 						   <sleeps on empty DAX entry>
>   dax_load_hole
>    find_or_create_page
>    ...
>     page_cache_tree_insert
>      dax_wake_mapping_entry_waiter
>       <wakes one sleeper>
>      __radix_tree_replace
>       <swaps empty DAX entry with 4k zero page>
> 
> 			<wakes>
> 			get_page
> 			lock_page
> 			...
> 			put_locked_mapping_entry
> 			unlock_page
> 			put_page
> 
> 						<sleeps forever on the DAX
> 						 wait queue>
> 
> The crux of the problem is that once we insert a 4k zero page, all locking
> from then on is done in terms of that 4k zero page and any additional
> threads sleeping on the empty DAX entry will never be woken.  Fix this by
> waking all sleepers when we replace the DAX radix tree entry with a 4k zero
> page.  This will allow all sleeping threads to successfully transition from
> locking based on the DAX empty entry to locking on the 4k zero page.
> 
> With the test case reported by Xiong this happens very regularly in my test
> setup, with some runs resulting in 9+ threads in this deadlocked state.
> With this fix I've been able to run that same test dozens of times in a
> loop without issue.
> 
> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> Reported-by: Xiong Zhou <xzhou@redhat.com>
> Fixes: commit ac401cc78242 ("dax: New fault locking")
> Cc: Jan Kara <jack@suse.cz>
> Cc: stable@vger.kernel.org # 4.7+

Ah, very good catch. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

I wonder why I was not able to reproduce this... Probably the timing didn't
work out right on my test machine.

								Honza

> ---
> 
> This issue exists as far back as v4.7, and I was easly able to reproduce it
> with v4.7 using the same test.
> 
> Unfortunately this patch won't apply cleanly to the stable trees, but the
> change is very simple and should be easy to replicate by hand.  Please ping
> me if you'd like patches that apply cleanly to the v4.9 and v4.8.15 trees.
> 
> ---
>  mm/filemap.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/filemap.c b/mm/filemap.c
> index d0e4d10..b772a33 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -138,7 +138,7 @@ static int page_cache_tree_insert(struct address_space *mapping,
>  				dax_radix_locked_entry(0, RADIX_DAX_EMPTY));
>  			/* Wakeup waiters for exceptional entry lock */
>  			dax_wake_mapping_entry_waiter(mapping, page->index, p,
> -						      false);
> +						      true);
>  		}
>  	}
>  	__radix_tree_replace(&mapping->page_tree, node, slot, page,
> -- 
> 2.7.4
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

  reply	other threads:[~2017-01-04  7:18 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-10-27 11:22 multi-threads libvmmalloc fork test hang Xiong Zhou
2016-10-27 11:22 ` Xiong Zhou
     [not found] ` <20161027112230.wsumgs62fqdxt3sc-obHvtwIU90hQcClZ3XN9yxcY2uh10dtjAL8bYrjMMd8@public.gmane.org>
2016-10-27 13:37   ` Jan Kara
2016-10-27 13:37     ` Jan Kara
     [not found]     ` <20161027133720.GF19743-4I4JzKEfoa/jFM9bn6wA6Q@public.gmane.org>
2017-01-03 16:58       ` Ross Zwisler
2017-01-03 16:58         ` Ross Zwisler
2017-01-03 21:36 ` [PATCH] dax: fix deadlock with DAX 4k holes Ross Zwisler
2017-01-03 21:36   ` Ross Zwisler
2017-01-03 21:36   ` Ross Zwisler
2017-01-04  7:18   ` Jan Kara [this message]
2017-01-04  7:18     ` Jan Kara
2017-01-04  7:18     ` Jan Kara
2017-01-04 14:26   ` Xiong Zhou
2017-01-04 14:26     ` Xiong Zhou

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170104071804.GF3780@quack2.suse.cz \
    --to=jack@suse.cz \
    --cc=akpm@linux-foundation.org \
    --cc=dave.hansen@intel.com \
    --cc=david@fromorbit.com \
    --cc=hch@lst.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=ross.zwisler@linux.intel.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.