public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] xfs: fix memory leaks on shutdown w/ delayed logging.
@ 2011-01-14 13:07 Dave Chinner
  2011-01-14 13:07 ` [PATCH 1/2] xfs: fix log ticket leak on forced shutdown Dave Chinner
  2011-01-14 13:07 ` [PATCH 2/2] xfs: fix efi item " Dave Chinner
  0 siblings, 2 replies; 7+ messages in thread
From: Dave Chinner @ 2011-01-14 13:07 UTC (permalink / raw)
  To: xfs

These two patches fix memory leaks that can occur when a log write
aborts during a CIL flush. xfstest 139 triggers these leaks reliably.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/2] xfs: fix log ticket leak on forced shutdown.
  2011-01-14 13:07 [PATCH 0/2] xfs: fix memory leaks on shutdown w/ delayed logging Dave Chinner
@ 2011-01-14 13:07 ` Dave Chinner
  2011-01-18 12:35   ` Christoph Hellwig
  2011-01-14 13:07 ` [PATCH 2/2] xfs: fix efi item " Dave Chinner
  1 sibling, 1 reply; 7+ messages in thread
From: Dave Chinner @ 2011-01-14 13:07 UTC (permalink / raw)
  To: xfs

From: Dave Chinner <dchinner@redhat.com>

The kmemleak detector shows this after test 139:

unreferenced object 0xffff880079b88bb0 (size 264):
  comm "xfs_io", pid 4904, jiffies 4294909382 (age 276.824s)
  hex dump (first 32 bytes):
    00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00  .....N..........
    ff ff ff ff ff ff ff ff 48 7b c9 82 ff ff ff ff  ........H{......
  backtrace:
    [<ffffffff81afb04d>] kmemleak_alloc+0x2d/0x60
    [<ffffffff8115c6cf>] kmem_cache_alloc+0x13f/0x2b0
    [<ffffffff814aaa97>] kmem_zone_alloc+0x77/0xf0
    [<ffffffff814aab2e>] kmem_zone_zalloc+0x1e/0x50
    [<ffffffff8148f394>] xlog_ticket_alloc+0x34/0x170
    [<ffffffff81494444>] xlog_cil_push+0xa4/0x3f0
    [<ffffffff81494eca>] xlog_cil_force_lsn+0x15a/0x160
    [<ffffffff814933a5>] _xfs_log_force_lsn+0x75/0x2d0
    [<ffffffff814a264d>] _xfs_trans_commit+0x2bd/0x2f0
    [<ffffffff8148bfdd>] xfs_iomap_write_allocate+0x1ad/0x350
    [<ffffffff814ac17f>] xfs_map_blocks+0x21f/0x370
    [<ffffffff814ad1b7>] xfs_vm_writepage+0x1c7/0x550
    [<ffffffff8112200a>] __writepage+0x1a/0x50
    [<ffffffff81122df2>] write_cache_pages+0x1c2/0x4c0
    [<ffffffff81123117>] generic_writepages+0x27/0x30
    [<ffffffff814aba5d>] xfs_vm_writepages+0x5d/0x80

By inspection, the leak occurs when xlog_write() returns and error
and we jump to the abort path without dropping the reference on the
active ticket.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_log_cil.c |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c
index 9dc8125..c7eac5a 100644
--- a/fs/xfs/xfs_log_cil.c
+++ b/fs/xfs/xfs_log_cil.c
@@ -543,7 +543,7 @@ xlog_cil_push(
 
 	error = xlog_write(log, &lvhdr, tic, &ctx->start_lsn, NULL, 0);
 	if (error)
-		goto out_abort;
+		goto out_abort_free_ticket;
 
 	/*
 	 * now that we've written the checkpoint into the log, strictly
@@ -569,8 +569,9 @@ restart:
 	}
 	spin_unlock(&cil->xc_cil_lock);
 
+	/* xfs_log_done always frees the ticket on error. */
 	commit_lsn = xfs_log_done(log->l_mp, tic, &commit_iclog, 0);
-	if (error || commit_lsn == -1)
+	if (commit_lsn == -1)
 		goto out_abort;
 
 	/* attach all the transactions w/ busy extents to iclog */
@@ -600,6 +601,8 @@ out_free_ticket:
 	kmem_free(new_ctx);
 	return 0;
 
+out_abort_free_ticket:
+	xfs_log_ticket_put(tic);
 out_abort:
 	xlog_cil_committed(ctx, XFS_LI_ABORTED);
 	return XFS_ERROR(EIO);
-- 
1.7.2.3

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/2] xfs: fix efi item leak on forced shutdown
  2011-01-14 13:07 [PATCH 0/2] xfs: fix memory leaks on shutdown w/ delayed logging Dave Chinner
  2011-01-14 13:07 ` [PATCH 1/2] xfs: fix log ticket leak on forced shutdown Dave Chinner
@ 2011-01-14 13:07 ` Dave Chinner
  2011-01-18 12:46   ` Christoph Hellwig
  1 sibling, 1 reply; 7+ messages in thread
From: Dave Chinner @ 2011-01-14 13:07 UTC (permalink / raw)
  To: xfs

From: Dave Chinner <dchinner@redhat.com>

After test 139, kmemleak shows:

unreferenced object 0xffff880078b405d8 (size 400):
  comm "xfs_io", pid 4904, jiffies 4294909383 (age 1186.728s)
  hex dump (first 32 bytes):
    60 c1 17 79 00 88 ff ff 60 c1 17 79 00 88 ff ff  `..y....`..y....
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff81afb04d>] kmemleak_alloc+0x2d/0x60
    [<ffffffff8115c6cf>] kmem_cache_alloc+0x13f/0x2b0
    [<ffffffff814aaa97>] kmem_zone_alloc+0x77/0xf0
    [<ffffffff814aab2e>] kmem_zone_zalloc+0x1e/0x50
    [<ffffffff8147cd6b>] xfs_efi_init+0x4b/0xb0
    [<ffffffff814a4ee8>] xfs_trans_get_efi+0x58/0x90
    [<ffffffff81455fab>] xfs_bmap_finish+0x8b/0x1d0
    [<ffffffff814851b4>] xfs_itruncate_finish+0x2c4/0x5d0
    [<ffffffff814a970f>] xfs_setattr+0x8df/0xa70
    [<ffffffff814b5c7b>] xfs_vn_setattr+0x1b/0x20
    [<ffffffff8117dc00>] notify_change+0x170/0x2e0
    [<ffffffff81163bf6>] do_truncate+0x66/0xa0
    [<ffffffff81163d0b>] sys_ftruncate+0xdb/0xe0
    [<ffffffff8103a002>] system_call_fastpath+0x16/0x1b
    [<ffffffffffffffff>] 0xffffffffffffffff

The cause of the leak is that the "remove" parameter of IOP_UNPIN()
is never set when a CIL push is aborted. This means that the EFI
item is never freed if it was in the push being cancelled. The
problem is specific to delayed logging.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_trans.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index f80a067..e66ce5e 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -1472,6 +1472,16 @@ xfs_trans_committed_bulk(
 		if (XFS_LSN_CMP(item_lsn, (xfs_lsn_t)-1) == 0)
 			continue;
 
+		/*
+		 * if we are aborting the operation, no point in inserting the
+		 * object into the AIL as we areee in a shutdown situation.
+		 */
+		if (aborted) {
+			ASSERT(XFS_FORCED_SHUTDOWN(ailp->xa_mount));
+			IOP_UNPIN(lip, aborted);
+			continue;
+		}
+
 		if (item_lsn != commit_lsn) {
 
 			/*
-- 
1.7.2.3

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] xfs: fix log ticket leak on forced shutdown.
  2011-01-14 13:07 ` [PATCH 1/2] xfs: fix log ticket leak on forced shutdown Dave Chinner
@ 2011-01-18 12:35   ` Christoph Hellwig
  0 siblings, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2011-01-18 12:35 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

> By inspection, the leak occurs when xlog_write() returns and error
> and we jump to the abort path without dropping the reference on the
> active ticket.

Indeed,

Reviewed-by: Christoph Hellwig <hch@lst.de>

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/2] xfs: fix efi item leak on forced shutdown
  2011-01-14 13:07 ` [PATCH 2/2] xfs: fix efi item " Dave Chinner
@ 2011-01-18 12:46   ` Christoph Hellwig
  2011-01-18 23:33     ` Dave Chinner
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2011-01-18 12:46 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On Sat, Jan 15, 2011 at 12:07:10AM +1100, Dave Chinner wrote:
> The cause of the leak is that the "remove" parameter of IOP_UNPIN()
> is never set when a CIL push is aborted. This means that the EFI
> item is never freed if it was in the push being cancelled. The
> problem is specific to delayed logging.
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
>  fs/xfs/xfs_trans.c |   10 ++++++++++
>  1 files changed, 10 insertions(+), 0 deletions(-)
> 
> diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
> index f80a067..e66ce5e 100644
> --- a/fs/xfs/xfs_trans.c
> +++ b/fs/xfs/xfs_trans.c
> @@ -1472,6 +1472,16 @@ xfs_trans_committed_bulk(
>  		if (XFS_LSN_CMP(item_lsn, (xfs_lsn_t)-1) == 0)
>  			continue;
>  
> +		/*
> +		 * if we are aborting the operation, no point in inserting the
> +		 * object into the AIL as we areee in a shutdown situation.

that's a few 'e' too much.

> +		 */
> +		if (aborted) {
> +			ASSERT(XFS_FORCED_SHUTDOWN(ailp->xa_mount));
> +			IOP_UNPIN(lip, aborted);
> +			continue;
> +		}

Hmm, this is not symmetric with the non-delaylog path.
xfs_trans_item_committed never sets the remove flag to IOP_UNPIN,
even if the transaction commit was aborted.

It seems like the CIL code is missing an equivalent to
xfs_trans_uncommit for the case that xfs_log_write or xfs_log_done
fail.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/2] xfs: fix efi item leak on forced shutdown
  2011-01-18 12:46   ` Christoph Hellwig
@ 2011-01-18 23:33     ` Dave Chinner
  2011-01-20 13:27       ` Christoph Hellwig
  0 siblings, 1 reply; 7+ messages in thread
From: Dave Chinner @ 2011-01-18 23:33 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

On Tue, Jan 18, 2011 at 07:46:25AM -0500, Christoph Hellwig wrote:
> On Sat, Jan 15, 2011 at 12:07:10AM +1100, Dave Chinner wrote:
> > The cause of the leak is that the "remove" parameter of IOP_UNPIN()
> > is never set when a CIL push is aborted. This means that the EFI
> > item is never freed if it was in the push being cancelled. The
> > problem is specific to delayed logging.
> > 
> > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> > ---
> >  fs/xfs/xfs_trans.c |   10 ++++++++++
> >  1 files changed, 10 insertions(+), 0 deletions(-)
> > 
> > diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
> > index f80a067..e66ce5e 100644
> > --- a/fs/xfs/xfs_trans.c
> > +++ b/fs/xfs/xfs_trans.c
> > @@ -1472,6 +1472,16 @@ xfs_trans_committed_bulk(
> >  		if (XFS_LSN_CMP(item_lsn, (xfs_lsn_t)-1) == 0)
> >  			continue;
> >  
> > +		/*
> > +		 * if we are aborting the operation, no point in inserting the
> > +		 * object into the AIL as we areee in a shutdown situation.
> 
> that's a few 'e' too much.
> 
> > +		 */
> > +		if (aborted) {
> > +			ASSERT(XFS_FORCED_SHUTDOWN(ailp->xa_mount));
> > +			IOP_UNPIN(lip, aborted);
> > +			continue;
> > +		}
> 
> Hmm, this is not symmetric with the non-delaylog path.
> xfs_trans_item_committed never sets the remove flag to IOP_UNPIN,
> even if the transaction commit was aborted.

Right, because the delaylog and non-delaylog paths are not symmetric
w.r.t. log write failures.

> It seems like the CIL code is missing an equivalent to
> xfs_trans_uncommit for the case that xfs_log_write or xfs_log_done
> fail.

There isn't an equivalent. In the delaylog case, we don't have a
transaction to "uncommit" when a log write failure occurs - we are
aborting the checkpoint of the CIL, not a transaction. As the items
have already gone through IOP_COMMITTING and IOP_UNLOCK, we have to
treat the failures like they came from the log IO completion
handler.

In the case of non-delaylog, neither IOP_COMMITTING or IOP_UNLOCK
has been called on the items when the xfs_log_write() fails. They
are still linked into the xfs_trans structure, so they can be
handled by xfs_trans_uncommit() which simply needs to walk the items
in the transaction and  IOP_UNPIN(lip, abort), IOP_UNLOCK and free
the items.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/2] xfs: fix efi item leak on forced shutdown
  2011-01-18 23:33     ` Dave Chinner
@ 2011-01-20 13:27       ` Christoph Hellwig
  0 siblings, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2011-01-20 13:27 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On Wed, Jan 19, 2011 at 10:33:46AM +1100, Dave Chinner wrote:
> > Hmm, this is not symmetric with the non-delaylog path.
> > xfs_trans_item_committed never sets the remove flag to IOP_UNPIN,
> > even if the transaction commit was aborted.
> 
> Right, because the delaylog and non-delaylog paths are not symmetric
> w.r.t. log write failures.
> 
> > It seems like the CIL code is missing an equivalent to
> > xfs_trans_uncommit for the case that xfs_log_write or xfs_log_done
> > fail.
> 
> There isn't an equivalent. In the delaylog case, we don't have a
> transaction to "uncommit" when a log write failure occurs - we are
> aborting the checkpoint of the CIL, not a transaction. As the items
> have already gone through IOP_COMMITTING and IOP_UNLOCK, we have to
> treat the failures like they came from the log IO completion
> handler.

True.

> In the case of non-delaylog, neither IOP_COMMITTING or IOP_UNLOCK
> has been called on the items when the xfs_log_write() fails. They
> are still linked into the xfs_trans structure, so they can be
> handled by xfs_trans_uncommit() which simply needs to walk the items
> in the transaction and  IOP_UNPIN(lip, abort), IOP_UNLOCK and free
> the items.

You're right.  Care to extend the comment to explain this a bit better
in the code?

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2011-01-20 13:25 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-01-14 13:07 [PATCH 0/2] xfs: fix memory leaks on shutdown w/ delayed logging Dave Chinner
2011-01-14 13:07 ` [PATCH 1/2] xfs: fix log ticket leak on forced shutdown Dave Chinner
2011-01-18 12:35   ` Christoph Hellwig
2011-01-14 13:07 ` [PATCH 2/2] xfs: fix efi item " Dave Chinner
2011-01-18 12:46   ` Christoph Hellwig
2011-01-18 23:33     ` Dave Chinner
2011-01-20 13:27       ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox