public inbox for linux-ext4@vger.kernel.org
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Namjae Jeon <namjae.jeon@samsung.com>
Cc: 'Jan Kara' <jack@suse.cz>, 'Theodore Ts'o' <tytso@mit.edu>,
	linux-ext4@vger.kernel.org
Subject: Re: memory leak: data=journal and {collapse,insert,zero}_range
Date: Mon, 23 Nov 2015 14:53:42 +0100	[thread overview]
Message-ID: <20151123135342.GJ23418@quack.suse.cz> (raw)
In-Reply-To: <000001d1234c$c37079b0$4a516d10$@samsung.com>

[-- Attachment #1: Type: text/plain, Size: 1599 bytes --]

On Fri 20-11-15 13:34:36, Namjae Jeon wrote:
> Actually, I can quickly create memleak with simple testcase like:
> while(1000)
> {
> Random seek(max 500M) -> write 1M -> fsync ->truncate(random size)
> }
> 
> As there is significant memleak in this situation also, I started focusing
> on this simple testcase only. When use this testcase, I never saw EBUSY path
> in jbd2_journal_invalidatepage() -> journal_unmap_buffer().
> 
> On the other hand(), for memleak pages jbd2_journal_invalidatepage() ->
> journal_unmap_buffer() was almost bailing out every time from
>                 if (!buffer_dirty(bh)) {
>                         /* bdflush has written it.  We can drop it now */
>                         goto zap_buffer;
>                 }
> 
> As per debugging, NULL mapping page or buffer is created in below scenario:
> Write(or Kjournald) -> jbd2_journal_commit_transaction
>                     -> BH_JBDDirty buffer is added to forget list.
>                     -> Buffer is added to new checkpoint,
>                        additional reference count is taken on b_jcount which
>                        keeps buffer busy until remove checkpoint.
>                     -> Buffer is unfiled, removed from forget list, BH_JBDDirty
>                        is cleared and BH_Dirty is set, it is exposed to VM. 

Ah, I see! Thanks for the easy reproduction testcase and for the debugging.
Attached patch fixes the issue (at least the simple truncate case) for me.
Can you check whether it fixes the issue for you as well? Thanks!

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

[-- Attachment #2: 0001-jbd2-Fix-unreclaimed-pages-after-truncate-in-data-jo.patch --]
[-- Type: text/x-patch, Size: 2206 bytes --]

>From 29e43b5691866e8eb363c1ef64e1fc89b2792e86 Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Mon, 23 Nov 2015 13:34:30 +0100
Subject: [PATCH] jbd2: Fix unreclaimed pages after truncate in data=journal
 mode

Ted and Namjae have reported that truncated pages don't get timely
reclaimed after being truncated in data=journal mode. The following test
triggers the issue easily:

for (i = 0; i < 1000; i++) {
	pwrite(fd, buf, 1024*1024, 0);
	fsync(fd);
	fsync(fd);
	ftruncate(fd, 0);
}

The reason is that journal_unmap_buffer() finds that truncated buffers
are not journalled (jh->b_transaction == NULL), they are part of
checkpoint list of a transaction (jh->b_cp_transaction != NULL) and have
been already written out (!buffer_dirty(bh)). We clean such buffers but
we leave them in the checkpoint list. Since checkpoint transaction holds
a reference to the journal head, these buffers cannot be released until
the checkpoint transaction is cleaned up. And at that point we don't
call release_buffer_page() anymore so pages detached from mapping are
lingering in the system waiting for reclaim to find them and free them.

Fix the problem by removing buffers from transaction checkpoint lists
when journal_unmap_buffer() finds out they don't have to be there
anymore.

Reported-by: Namjae Jeon <namjae.jeon@samsung.com>
Fixes: de1b794130b130e77ffa975bb58cb843744f9ae5
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/jbd2/transaction.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index 89463eee6791..daa1514259e0 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -2152,6 +2152,7 @@ static int journal_unmap_buffer(journal_t *journal, struct buffer_head *bh,
 
 		if (!buffer_dirty(bh)) {
 			/* bdflush has written it.  We can drop it now */
+			__jbd2_journal_remove_checkpoint(jh);
 			goto zap_buffer;
 		}
 
@@ -2181,6 +2182,7 @@ static int journal_unmap_buffer(journal_t *journal, struct buffer_head *bh,
 				/* The orphan record's transaction has
 				 * committed.  We can cleanse this buffer */
 				clear_buffer_jbddirty(bh);
+				__jbd2_journal_remove_checkpoint(jh);
 				goto zap_buffer;
 			}
 		}
-- 
2.1.4


  reply	other threads:[~2015-11-23 13:53 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-10-17 16:02 memory leak: data=journal and {collapse,insert,zero}_range Theodore Ts'o
2015-10-20 12:06 ` Namjae Jeon
2015-10-20 15:54   ` Theodore Ts'o
2015-10-21  9:44     ` Namjae Jeon
2015-10-21 14:52       ` Theodore Ts'o
2015-11-09  5:21         ` Namjae Jeon
2015-11-10 14:49           ` Jan Kara
2015-11-17  4:47             ` Namjae Jeon
2015-11-18 21:36               ` Jan Kara
2015-11-19  9:42                 ` Jan Kara
2015-11-20  4:34                   ` Namjae Jeon
2015-11-23 13:53                     ` Jan Kara [this message]
2015-11-24  4:21                       ` Namjae Jeon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151123135342.GJ23418@quack.suse.cz \
    --to=jack@suse.cz \
    --cc=linux-ext4@vger.kernel.org \
    --cc=namjae.jeon@samsung.com \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox