All of lore.kernel.org
 help / color / mirror / Atom feed
* [Ocfs2-devel] [PATCH] ocfs2: Avoid livelock in ocfs2_readpage()
@ 2011-06-23 20:51 Jan Kara
  2011-06-24 20:55 ` Sunil Mushran
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Jan Kara @ 2011-06-23 20:51 UTC (permalink / raw)
  To: ocfs2-devel

When someone writes to an inode, readers accessing the same inode via
ocfs2_readpage() just busyloop trying to get ip_alloc_sem because
do_generic_file_read() looks up the page again and retries ->readpage()
when previous attempt failed with AOP_TRUNCATED_PAGE. When there are enough
readers, they can occupy all CPUs and in non-preempt kernel the system is
deadlocked because writer holding ip_alloc_sem is never run to release the
semaphore. Fix the problem by making reader block on ip_alloc_sem to break
the busy loop.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ocfs2/aops.c |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
index ac97bca..0919e8f 100644
--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/aops.c
@@ -290,7 +290,15 @@ static int ocfs2_readpage(struct file *file, struct page *page)
 	}
 
 	if (down_read_trylock(&oi->ip_alloc_sem) == 0) {
+		/*
+		 * Unlock the page and cycle ip_alloc_sem so that we don't
+		 * busyloop waiting for ip_alloc_sem to unlock
+		 */
 		ret = AOP_TRUNCATED_PAGE;
+		unlock_page(page);
+		unlock = 0;
+		down_read(&oi->ip_alloc_sem);
+		up_read(&oi->ip_alloc_sem);
 		goto out_inode_unlock;
 	}
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [Ocfs2-devel] [PATCH] ocfs2: Avoid livelock in ocfs2_readpage()
  2011-06-23 20:51 [Ocfs2-devel] [PATCH] ocfs2: Avoid livelock in ocfs2_readpage() Jan Kara
@ 2011-06-24 20:55 ` Sunil Mushran
  2011-06-24 21:44   ` Jan Kara
  2011-06-26  7:26 ` Joel Becker
  2011-07-28  9:08 ` Joel Becker
  2 siblings, 1 reply; 6+ messages in thread
From: Sunil Mushran @ 2011-06-24 20:55 UTC (permalink / raw)
  To: ocfs2-devel

Looks reasonable. Was this something someone ran into or was
this reproduced only via a test workload?

On 06/23/2011 01:51 PM, Jan Kara wrote:
> When someone writes to an inode, readers accessing the same inode via
> ocfs2_readpage() just busyloop trying to get ip_alloc_sem because
> do_generic_file_read() looks up the page again and retries ->readpage()
> when previous attempt failed with AOP_TRUNCATED_PAGE. When there are enough
> readers, they can occupy all CPUs and in non-preempt kernel the system is
> deadlocked because writer holding ip_alloc_sem is never run to release the
> semaphore. Fix the problem by making reader block on ip_alloc_sem to break
> the busy loop.
>
> Signed-off-by: Jan Kara<jack@suse.cz>
> ---
>   fs/ocfs2/aops.c |    8 ++++++++
>   1 files changed, 8 insertions(+), 0 deletions(-)
>
> diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
> index ac97bca..0919e8f 100644
> --- a/fs/ocfs2/aops.c
> +++ b/fs/ocfs2/aops.c
> @@ -290,7 +290,15 @@ static int ocfs2_readpage(struct file *file, struct page *page)
>   	}
>
>   	if (down_read_trylock(&oi->ip_alloc_sem) == 0) {
> +		/*
> +		 * Unlock the page and cycle ip_alloc_sem so that we don't
> +		 * busyloop waiting for ip_alloc_sem to unlock
> +		 */
>   		ret = AOP_TRUNCATED_PAGE;
> +		unlock_page(page);
> +		unlock = 0;
> +		down_read(&oi->ip_alloc_sem);
> +		up_read(&oi->ip_alloc_sem);
>   		goto out_inode_unlock;
>   	}
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Ocfs2-devel] [PATCH] ocfs2: Avoid livelock in ocfs2_readpage()
  2011-06-24 20:55 ` Sunil Mushran
@ 2011-06-24 21:44   ` Jan Kara
  0 siblings, 0 replies; 6+ messages in thread
From: Jan Kara @ 2011-06-24 21:44 UTC (permalink / raw)
  To: ocfs2-devel

On Fri 24-06-11 13:55:28, Sunil Mushran wrote:
> Looks reasonable. Was this something someone ran into or was
> this reproduced only via a test workload?
  One of our customers hit this when running installation in KVM whose
filesystem image was stored on OCFS2... Generally loading KVM guest with
IO seemed to trigger this fairly regulary.

								Honza
> On 06/23/2011 01:51 PM, Jan Kara wrote:
> >When someone writes to an inode, readers accessing the same inode via
> >ocfs2_readpage() just busyloop trying to get ip_alloc_sem because
> >do_generic_file_read() looks up the page again and retries ->readpage()
> >when previous attempt failed with AOP_TRUNCATED_PAGE. When there are enough
> >readers, they can occupy all CPUs and in non-preempt kernel the system is
> >deadlocked because writer holding ip_alloc_sem is never run to release the
> >semaphore. Fix the problem by making reader block on ip_alloc_sem to break
> >the busy loop.
> >
> >Signed-off-by: Jan Kara<jack@suse.cz>
> >---
> >  fs/ocfs2/aops.c |    8 ++++++++
> >  1 files changed, 8 insertions(+), 0 deletions(-)
> >
> >diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
> >index ac97bca..0919e8f 100644
> >--- a/fs/ocfs2/aops.c
> >+++ b/fs/ocfs2/aops.c
> >@@ -290,7 +290,15 @@ static int ocfs2_readpage(struct file *file, struct page *page)
> >  	}
> >
> >  	if (down_read_trylock(&oi->ip_alloc_sem) == 0) {
> >+		/*
> >+		 * Unlock the page and cycle ip_alloc_sem so that we don't
> >+		 * busyloop waiting for ip_alloc_sem to unlock
> >+		 */
> >  		ret = AOP_TRUNCATED_PAGE;
> >+		unlock_page(page);
> >+		unlock = 0;
> >+		down_read(&oi->ip_alloc_sem);
> >+		up_read(&oi->ip_alloc_sem);
> >  		goto out_inode_unlock;
> >  	}
> >
> 
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Ocfs2-devel] [PATCH] ocfs2: Avoid livelock in ocfs2_readpage()
  2011-06-23 20:51 [Ocfs2-devel] [PATCH] ocfs2: Avoid livelock in ocfs2_readpage() Jan Kara
  2011-06-24 20:55 ` Sunil Mushran
@ 2011-06-26  7:26 ` Joel Becker
  2011-06-27 10:47   ` Jan Kara
  2011-07-28  9:08 ` Joel Becker
  2 siblings, 1 reply; 6+ messages in thread
From: Joel Becker @ 2011-06-26  7:26 UTC (permalink / raw)
  To: ocfs2-devel

On Thu, Jun 23, 2011 at 10:51:47PM +0200, Jan Kara wrote:
> When someone writes to an inode, readers accessing the same inode via
> ocfs2_readpage() just busyloop trying to get ip_alloc_sem because
> do_generic_file_read() looks up the page again and retries ->readpage()
> when previous attempt failed with AOP_TRUNCATED_PAGE. When there are enough
> readers, they can occupy all CPUs and in non-preempt kernel the system is
> deadlocked because writer holding ip_alloc_sem is never run to release the
> semaphore. Fix the problem by making reader block on ip_alloc_sem to break
> the busy loop.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  fs/ocfs2/aops.c |    8 ++++++++
>  1 files changed, 8 insertions(+), 0 deletions(-)
> 
> diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
> index ac97bca..0919e8f 100644
> --- a/fs/ocfs2/aops.c
> +++ b/fs/ocfs2/aops.c
> @@ -290,7 +290,15 @@ static int ocfs2_readpage(struct file *file, struct page *page)
>  	}
>  
>  	if (down_read_trylock(&oi->ip_alloc_sem) == 0) {
> +		/*
> +		 * Unlock the page and cycle ip_alloc_sem so that we don't
> +		 * busyloop waiting for ip_alloc_sem to unlock
> +		 */
>  		ret = AOP_TRUNCATED_PAGE;
> +		unlock_page(page);
> +		unlock = 0;
> +		down_read(&oi->ip_alloc_sem);
> +		up_read(&oi->ip_alloc_sem);
>  		goto out_inode_unlock;
>  	}

	Question: First, is it safe to drop the page lock here?
Can all callers of readpage (not just g_f_a_r()) handle that?>  

-- 

"But all my words come back to me
 In shades of mediocrity.
 Like emptiness in harmony
 I need someone to comfort me."

			http://www.jlbec.org/
			jlbec at evilplan.org

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Ocfs2-devel] [PATCH] ocfs2: Avoid livelock in ocfs2_readpage()
  2011-06-26  7:26 ` Joel Becker
@ 2011-06-27 10:47   ` Jan Kara
  0 siblings, 0 replies; 6+ messages in thread
From: Jan Kara @ 2011-06-27 10:47 UTC (permalink / raw)
  To: ocfs2-devel

On Sun 26-06-11 00:26:44, Joel Becker wrote:
> On Thu, Jun 23, 2011 at 10:51:47PM +0200, Jan Kara wrote:
> > When someone writes to an inode, readers accessing the same inode via
> > ocfs2_readpage() just busyloop trying to get ip_alloc_sem because
> > do_generic_file_read() looks up the page again and retries ->readpage()
> > when previous attempt failed with AOP_TRUNCATED_PAGE. When there are enough
> > readers, they can occupy all CPUs and in non-preempt kernel the system is
> > deadlocked because writer holding ip_alloc_sem is never run to release the
> > semaphore. Fix the problem by making reader block on ip_alloc_sem to break
> > the busy loop.
> > 
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> >  fs/ocfs2/aops.c |    8 ++++++++
> >  1 files changed, 8 insertions(+), 0 deletions(-)
> > 
> > diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
> > index ac97bca..0919e8f 100644
> > --- a/fs/ocfs2/aops.c
> > +++ b/fs/ocfs2/aops.c
> > @@ -290,7 +290,15 @@ static int ocfs2_readpage(struct file *file, struct page *page)
> >  	}
> >  
> >  	if (down_read_trylock(&oi->ip_alloc_sem) == 0) {
> > +		/*
> > +		 * Unlock the page and cycle ip_alloc_sem so that we don't
> > +		 * busyloop waiting for ip_alloc_sem to unlock
> > +		 */
> >  		ret = AOP_TRUNCATED_PAGE;
> > +		unlock_page(page);
> > +		unlock = 0;
> > +		down_read(&oi->ip_alloc_sem);
> > +		up_read(&oi->ip_alloc_sem);
> >  		goto out_inode_unlock;
> >  	}
> 
> 	Question: First, is it safe to drop the page lock here?
> Can all callers of readpage (not just g_f_a_r()) handle that?
  We do exactly the same thing in ocfs2_inode_lock_with_page() so we
definitely don't introduce a new problem. And every caller of readpage() is
*supposed* to handle AOP_TRUNCATED_PAGE return value (because page reading
can race with truncate) in which case page must be returned unlocked. So it
should be safe. Admittedly I did not bother to check the call sites.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Ocfs2-devel] [PATCH] ocfs2: Avoid livelock in ocfs2_readpage()
  2011-06-23 20:51 [Ocfs2-devel] [PATCH] ocfs2: Avoid livelock in ocfs2_readpage() Jan Kara
  2011-06-24 20:55 ` Sunil Mushran
  2011-06-26  7:26 ` Joel Becker
@ 2011-07-28  9:08 ` Joel Becker
  2 siblings, 0 replies; 6+ messages in thread
From: Joel Becker @ 2011-07-28  9:08 UTC (permalink / raw)
  To: ocfs2-devel

On Thu, Jun 23, 2011 at 10:51:47PM +0200, Jan Kara wrote:
> When someone writes to an inode, readers accessing the same inode via
> ocfs2_readpage() just busyloop trying to get ip_alloc_sem because
> do_generic_file_read() looks up the page again and retries ->readpage()
> when previous attempt failed with AOP_TRUNCATED_PAGE. When there are enough
> readers, they can occupy all CPUs and in non-preempt kernel the system is
> deadlocked because writer holding ip_alloc_sem is never run to release the
> semaphore. Fix the problem by making reader block on ip_alloc_sem to break
> the busy loop.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

This patch is now in the fixes branch of ocfs2.git.

Joel

-- 

Life's Little Instruction Book #30

	"Never buy a house without a fireplace."

			http://www.jlbec.org/
			jlbec at evilplan.org

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2011-07-28  9:08 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-06-23 20:51 [Ocfs2-devel] [PATCH] ocfs2: Avoid livelock in ocfs2_readpage() Jan Kara
2011-06-24 20:55 ` Sunil Mushran
2011-06-24 21:44   ` Jan Kara
2011-06-26  7:26 ` Joel Becker
2011-06-27 10:47   ` Jan Kara
2011-07-28  9:08 ` Joel Becker

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.