linux-ext4.vger.kernel.org archive mirror
From: jing zhang <zj.barak@gmail.com>
To: Andreas Dilger <adilger@sun.com>
Cc: tytso@mit.edu, linux-ext4 <linux-ext4@vger.kernel.org>,
	Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Subject: Re: [PATCH] ext4: memory leakage in ext4_discard_preallocations
Date: Sat, 20 Mar 2010 22:05:13 +0800
Message-ID: <ac8f92701003200705u2bb6b65p4adce7b79f250705@mail.gmail.com>
In-Reply-To: <67790F0F-9921-4A98-8DC6-DA1C00CE6CA9@sun.com>

2010/3/20, Andreas Dilger <adilger@sun.com>:
> On 2010-03-19, at 08:17, jing zhang wrote:
>>>> 		ext4_get_group_no_and_offset(sb, pa->pa_pstart, &group, NULL);
>>>> @@ -3811,6 +3813,12 @@ repeat:
>>>> 		list_del(&pa->u.pa_tmp_list);
>>>> 		call_rcu(&(pa)->u.pa_rcu, ext4_mb_pa_callback);
>>>> 	}
>>>> +	if (! list_empty(&list)) {
>>>> +		if (occurs++ < 2)
>>>> +			goto best_efforts;
>>>> +		else
>>>> +			BUG();
>>>> +	}
>>>> 	if (ac)
>>>> 		kmem_cache_free(ext4_ac_cachep, ac);
>>>> }
>>>
>>> Hmm, I'm not sure that BUG() is appropriate here.  If there is an
>>> I/O error reading the block bitmap, #1, retrying isn't going to help,
>>> and #2, bringing down the entire system just because of an I/O error
>>> in reading the block bitmap doesn't seem right.
>>
>> But disk hardware error is not rare,
>
> Exactly, which is the reason why it should not cause the system to
> hang.  The filesystem should handle such errors gracefully if this is
> possible, return an error to the application, and/or marking the
> filesystem in error so that it will be checked on next boot, or similar.
>
>>> Right now, if there is a problem, we just end up leaving the
>>> preallocated list on the inode.  Does that cause problems later on
>>> down the line which you have observed?
>>>
>>> 					- Ted
>>
>> and is there still chance to call the
>>       call_rcu(&(pa)->u.pa_rcu, ext4_mb_pa_callback);
>> function again later on? (I am not sure yet the chance does exist.)
>>
>> If no chance, how about the kmem_cache subsystem then?
>> After reboot, the file system is still reliable, or just with a few
>> lost blocks?
>>
>> Thus it is necessary, at least for me, to make sure whether the
>> chance exists.
>>                                      - zj
>
>
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.

Evening,

Thanks, Andreas and Ted, for explaining how to handle the error
gracefully. I see now that the chance may exist, since the pa has not
been deleted from its group_list yet.

It also seems that there is still work worth doing here.
       - zj
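
To make the reasoning above concrete, here is a small user-space model of
the two discard paths. It is only a sketch: struct pa_model,
inode_discard() and group_discard() are hypothetical stand-ins, not kernel
code, and the only behaviour assumed from mballoc.c is that the group
discard path skips a pa whose pa_deleted flag is set.

```c
/* Hypothetical stand-in for struct ext4_prealloc_space (illustration only). */
struct pa_model {
	int pa_deleted;	/* mirrors pa->pa_deleted */
	int released;	/* set once the model has "freed" this pa */
};

/*
 * Models the inode discard path: every pa is first marked deleted, then
 * released only if its group data loads. load_ok[i] == 0 simulates an
 * I/O error for pa i. With reset_on_failure set, pa_deleted is cleared
 * again for each pa that could not be released.
 * Returns the number of pas released.
 */
static int inode_discard(struct pa_model *pas, const int *load_ok, int n,
			 int reset_on_failure)
{
	int i, freed = 0;

	for (i = 0; i < n; i++)
		pas[i].pa_deleted = 1;

	for (i = 0; i < n; i++) {
		if (!load_ok[i]) {
			if (reset_on_failure)
				pas[i].pa_deleted = 0;
			continue;
		}
		pas[i].released = 1;
		freed++;
	}
	return freed;
}

/*
 * Models the group discard path: a pa that is already marked deleted
 * (or already released) is skipped. Returns the number of pas released.
 */
static int group_discard(struct pa_model *pas, int n)
{
	int i, freed = 0;

	for (i = 0; i < n; i++) {
		if (pas[i].pa_deleted || pas[i].released)
			continue;
		pas[i].pa_deleted = 1;
		pas[i].released = 1;
		freed++;
	}
	return freed;
}
```

In this model, a pa that hits a load error stays marked deleted unless
reset_on_failure is set, so a later group_discard() pass never picks it
up. With the flag cleared, the later pass releases it.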

---

--- linux-2.6.32/fs/ext4/mballoc.c	2009-12-03 11:51:22.000000000 +0800
+++ fs/mballoc.c	2010-03-20 21:40:04.000000000 +0800
@@ -3788,14 +3788,14 @@ repeat:
 		err = ext4_mb_load_buddy(sb, group, &e4b);
 		if (err) {
 			ext4_error(sb, __func__, "Error in loading buddy "
-					"information for %u", group);
+			"information for group %u inode %lu", group, inode->i_ino);
 			continue;
 		}

 		bitmap_bh = ext4_read_block_bitmap(sb, group);
 		if (bitmap_bh == NULL) {
 			ext4_error(sb, __func__, "Error in reading block "
-					"bitmap for %u", group);
+			"bitmap for group %u inode %lu", group, inode->i_ino);
 			ext4_mb_release_desc(&e4b);
 			continue;
 		}
@@ -3811,6 +3811,14 @@ repeat:
 		list_del(&pa->u.pa_tmp_list);
 		call_rcu(&(pa)->u.pa_rcu, ext4_mb_pa_callback);
 	}
+	if (!list_empty(&list)) {
+		/*
+		 * Some PAs could not be released above. Clear pa_deleted so
+		 * ext4_mb_discard_group_preallocations() does not skip them.
+		 */
+		list_for_each_entry(pa, &list, u.pa_tmp_list)
+			pa->pa_deleted = 0;
+	}
 	if (ac)
 		kmem_cache_free(ext4_ac_cachep, ac);
 }


Thread overview: 7+ messages
2010-03-18 12:39 [PATCH] ext4: memory leakage in ext4_discard_preallocations jing zhang
2010-03-18 17:46 ` tytso
2010-03-19 14:17   ` jing zhang
2010-03-19 17:27     ` Andreas Dilger
2010-03-20 14:05       ` jing zhang [this message]
2010-03-26  8:37         ` Aneesh Kumar K. V
2010-03-26 14:12           ` jing zhang
