* [PATCH] ext4: memory leakage in ext4_discard_preallocations
@ 2010-03-18 12:39 jing zhang
  2010-03-18 17:46 ` tytso
  0 siblings, 1 reply; 7+ messages in thread
From: jing zhang @ 2010-03-18 12:39 UTC
  To: linux-ext4; +Cc: Theodore Ts'o, Andreas Dilger, Dave Kleikamp

From: Jing Zhang <zj.barak@gmail.com>

Date: Thu Mar 18 20:33:44 2010

When unexpected errors occur, there is memory leakage, and more.

Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Andreas Dilger <adilger@sun.com>
Cc: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Jing Zhang <zj.barak@gmail.com>

---

--- linux-2.6.32/fs/ext4/mballoc.c	2009-12-03 11:51:22.000000000 +0800
+++ zj/mballoc.c	2010-03-18 20:41:32.000000000 +0800
@@ -3717,6 +3717,7 @@ void ext4_discard_preallocations(struct
 	struct list_head list;
 	struct ext4_buddy e4b;
 	int err;
+	int occurs = 0;
 
 	if (!S_ISREG(inode->i_mode)) {
 		/*BUG_ON(!list_empty(&ei->i_prealloc_list));*/
@@ -3781,6 +3782,7 @@ repeat:
 	}
 	spin_unlock(&ei->i_prealloc_lock);
 
+best_efforts:
 	list_for_each_entry_safe(pa, tmp, &list, u.pa_tmp_list) {
 		BUG_ON(pa->pa_type != MB_INODE_PA);
 		ext4_get_group_no_and_offset(sb, pa->pa_pstart, &group, NULL);
@@ -3811,6 +3813,12 @@ repeat:
 		list_del(&pa->u.pa_tmp_list);
 		call_rcu(&(pa)->u.pa_rcu, ext4_mb_pa_callback);
 	}
+	if (! list_empty(&list)) {
+		if (occurs++ < 2)
+			goto best_efforts;
+		else
+			BUG();
+	}
 	if (ac)
 		kmem_cache_free(ext4_ac_cachep, ac);
 }

* Re: [PATCH] ext4: memory leakage in ext4_discard_preallocations
  2010-03-18 12:39 [PATCH] ext4: memory leakage in ext4_discard_preallocations jing zhang
@ 2010-03-18 17:46 ` tytso
  2010-03-19 14:17   ` jing zhang
  0 siblings, 1 reply; 7+ messages in thread
From: tytso @ 2010-03-18 17:46 UTC
  To: jing zhang; +Cc: linux-ext4, Andreas Dilger, Dave Kleikamp

>  		ext4_get_group_no_and_offset(sb, pa->pa_pstart, &group, NULL);
> @@ -3811,6 +3813,12 @@ repeat:
>  		list_del(&pa->u.pa_tmp_list);
>  		call_rcu(&(pa)->u.pa_rcu, ext4_mb_pa_callback);
>  	}
> +	if (! list_empty(&list)) {
> +		if (occurs++ < 2)
> +			goto best_efforts;
> +		else
> +			BUG();
> +	}
>  	if (ac)
>  		kmem_cache_free(ext4_ac_cachep, ac);
>  }

Hmm, I'm not sure that BUG() is appropriate here. If there is an
I/O error reading the block bitmap, #1, retrying isn't going to help,
and #2, bringing down the entire system just because of an I/O error
in reading the block bitmap doesn't seem right.

Right now, if there is a problem, we just end up leaving the
preallocated list on the inode. Does that cause problems later on
down the line which you have observed?

						- Ted

* Re: [PATCH] ext4: memory leakage in ext4_discard_preallocations
  2010-03-18 17:46 ` tytso
@ 2010-03-19 14:17   ` jing zhang
  2010-03-19 17:27     ` Andreas Dilger
  0 siblings, 1 reply; 7+ messages in thread
From: jing zhang @ 2010-03-19 14:17 UTC
  To: tytso; +Cc: linux-ext4, Andreas Dilger, Dave Kleikamp

>>  		ext4_get_group_no_and_offset(sb, pa->pa_pstart, &group, NULL);
>> @@ -3811,6 +3813,12 @@ repeat:
>>  		list_del(&pa->u.pa_tmp_list);
>>  		call_rcu(&(pa)->u.pa_rcu, ext4_mb_pa_callback);
>>  	}
>> +	if (! list_empty(&list)) {
>> +		if (occurs++ < 2)
>> +			goto best_efforts;
>> +		else
>> +			BUG();
>> +	}
>>  	if (ac)
>>  		kmem_cache_free(ext4_ac_cachep, ac);
>>  }
>
> Hmm, I'm not sure that BUG() is appropriate here. If there is an
> I/O error reading the block bitmap, #1, retrying isn't going to help,
> and #2, bringing down the entire system just because of an I/O error
> in reading the block bitmap doesn't seem right.

But disk hardware error is not rare,

> Right now, if there is a problem, we just end up leaving the
> preallocated list on the inode. Does that cause problems later on
> down the line which you have observed?
>
> 						- Ted

and is there still chance to call the
	call_rcu(&(pa)->u.pa_rcu, ext4_mb_pa_callback);
function again later on? (I am not sure yet the chance does exist.)

If no chance, how about the kmem_cache subsystem then?
After reboot, the file system is still reliable, or just with a few
lost blocks?

Thus it is necessary, at least for me, to make sure whether the
chance exists.

	- zj

* Re: [PATCH] ext4: memory leakage in ext4_discard_preallocations
  2010-03-19 14:17   ` jing zhang
@ 2010-03-19 17:27     ` Andreas Dilger
  2010-03-20 14:05       ` jing zhang
  0 siblings, 1 reply; 7+ messages in thread
From: Andreas Dilger @ 2010-03-19 17:27 UTC
  To: jing zhang; +Cc: tytso, linux-ext4, Dave Kleikamp

On 2010-03-19, at 08:17, jing zhang wrote:
>>>  		ext4_get_group_no_and_offset(sb, pa->pa_pstart, &group, NULL);
>>> @@ -3811,6 +3813,12 @@ repeat:
>>>  		list_del(&pa->u.pa_tmp_list);
>>>  		call_rcu(&(pa)->u.pa_rcu, ext4_mb_pa_callback);
>>>  	}
>>> +	if (! list_empty(&list)) {
>>> +		if (occurs++ < 2)
>>> +			goto best_efforts;
>>> +		else
>>> +			BUG();
>>> +	}
>>>  	if (ac)
>>>  		kmem_cache_free(ext4_ac_cachep, ac);
>>>  }
>>
>> Hmm, I'm not sure that BUG() is appropriate here. If there is an
>> I/O error reading the block bitmap, #1, retrying isn't going to help,
>> and #2, bringing down the entire system just because of an I/O error
>> in reading the block bitmap doesn't seem right.
>
> But disk hardware error is not rare,

Exactly, which is the reason why it should not cause the system to
hang. The filesystem should handle such errors gracefully if this is
possible, return an error to the application, and/or mark the
filesystem in error so that it will be checked on next boot, or similar.

>> Right now, if there is a problem, we just end up leaving the
>> preallocated list on the inode. Does that cause problems later on
>> down the line which you have observed?
>>
>> 						- Ted
>
> and is there still chance to call the
> 	call_rcu(&(pa)->u.pa_rcu, ext4_mb_pa_callback);
> function again later on? (I am not sure yet the chance does exist.)
>
> If no chance, how about the kmem_cache subsystem then?
> After reboot, the file system is still reliable, or just with a few
> lost blocks?
>
> Thus it is necessary, at least for me, to make sure whether the
> chance exists.
>
> 	- zj

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

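The error-handling policy Andreas describes can be illustrated with a small userspace sketch. This is not kernel code and not the ext4 implementation: `struct toy_fs`, `toy_read_bitmap()` and `toy_discard()` are hypothetical names invented for the illustration, and returning an error code from the discard routine is an assumption (the function in the patch is declared `void`). The sketch only shows the shape of the idea: on an I/O error, skip the affected item, flag the filesystem for a later check, and report the error to the caller instead of halting.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a mounted filesystem (not the ext4 superblock). */
struct toy_fs {
	bool needs_fsck;	/* set when an error should trigger a check on next boot */
	int bad_group;		/* group whose bitmap read fails; -1 for none */
};

/* Hypothetical bitmap read: fails with -EIO for the designated bad group. */
static int toy_read_bitmap(const struct toy_fs *fs, int group)
{
	return group == fs->bad_group ? -EIO : 0;
}

/*
 * Release one item per entry in groups[]. On an I/O error the item is
 * skipped, the filesystem is flagged, and the loop moves on; nothing is
 * retried and nothing aborts. Returns 0 or the first error seen, and
 * stores the number of successfully released items in *released.
 */
static int toy_discard(struct toy_fs *fs, const int *groups, size_t n,
		       size_t *released)
{
	int first_err = 0;

	*released = 0;
	for (size_t i = 0; i < n; i++) {
		int err = toy_read_bitmap(fs, groups[i]);

		if (err) {
			fs->needs_fsck = true;	/* "mark the filesystem in error" */
			if (!first_err)
				first_err = err;
			continue;		/* a retry would hit the same I/O error */
		}
		(*released)++;
	}
	assert(*released <= n);
	return first_err;
}
```

With groups {0, 2, 5} and group 2 failing, the call releases two items, returns -EIO, and leaves `needs_fsck` set, so the caller can decide what to report while the rest of the system keeps running.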
* Re: [PATCH] ext4: memory leakage in ext4_discard_preallocations
  2010-03-19 17:27     ` Andreas Dilger
@ 2010-03-20 14:05       ` jing zhang
  2010-03-26  8:37         ` Aneesh Kumar K. V
  0 siblings, 1 reply; 7+ messages in thread
From: jing zhang @ 2010-03-20 14:05 UTC
  To: Andreas Dilger; +Cc: tytso, linux-ext4, Dave Kleikamp

2010/3/20, Andreas Dilger <adilger@sun.com>:
> On 2010-03-19, at 08:17, jing zhang wrote:
>>>>  		ext4_get_group_no_and_offset(sb, pa->pa_pstart, &group, NULL);
>>>> @@ -3811,6 +3813,12 @@ repeat:
>>>>  		list_del(&pa->u.pa_tmp_list);
>>>>  		call_rcu(&(pa)->u.pa_rcu, ext4_mb_pa_callback);
>>>>  	}
>>>> +	if (! list_empty(&list)) {
>>>> +		if (occurs++ < 2)
>>>> +			goto best_efforts;
>>>> +		else
>>>> +			BUG();
>>>> +	}
>>>>  	if (ac)
>>>>  		kmem_cache_free(ext4_ac_cachep, ac);
>>>>  }
>>>
>>> Hmm, I'm not sure that BUG() is appropriate here. If there is an
>>> I/O error reading the block bitmap, #1, retrying isn't going to help,
>>> and #2, bringing down the entire system just because of an I/O error
>>> in reading the block bitmap doesn't seem right.
>>
>> But disk hardware error is not rare,
>
> Exactly, which is the reason why it should not cause the system to
> hang. The filesystem should handle such errors gracefully if this is
> possible, return an error to the application, and/or mark the
> filesystem in error so that it will be checked on next boot, or similar.
>
>>> Right now, if there is a problem, we just end up leaving the
>>> preallocated list on the inode. Does that cause problems later on
>>> down the line which you have observed?
>>>
>>> 						- Ted
>>
>> and is there still chance to call the
>> 	call_rcu(&(pa)->u.pa_rcu, ext4_mb_pa_callback);
>> function again later on? (I am not sure yet the chance does exist.)
>>
>> If no chance, how about the kmem_cache subsystem then?
>> After reboot, the file system is still reliable, or just with a few
>> lost blocks?
>>
>> Thus it is necessary, at least for me, to make sure whether the
>> chance exists.
>>
>> 	- zj
>
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.

Evening,

Thanks Andreas and Ted for your good explanations of how to deal with
errors in a gentle way. I understand now that the chance may exist,
since the pa is not deleted from its group_list yet.

It also seems that there is still some work to do here.

	- zj

---

--- linux-2.6.32/fs/ext4/mballoc.c	2009-12-03 11:51:22.000000000 +0800
+++ fs/mballoc.c	2010-03-20 21:40:04.000000000 +0800
@@ -3788,14 +3788,14 @@ repeat:
 		err = ext4_mb_load_buddy(sb, group, &e4b);
 		if (err) {
 			ext4_error(sb, __func__, "Error in loading buddy "
-					"information for %u", group);
+				"information for group %u inode %lu", group, inode->i_ino);
 			continue;
 		}
 
 		bitmap_bh = ext4_read_block_bitmap(sb, group);
 		if (bitmap_bh == NULL) {
 			ext4_error(sb, __func__, "Error in reading block "
-					"bitmap for %u", group);
+				"bitmap for group %u inode %lu", group, inode->i_ino);
 			ext4_mb_release_desc(&e4b);
 			continue;
 		}
@@ -3811,6 +3811,14 @@ repeat:
 		list_del(&pa->u.pa_tmp_list);
 		call_rcu(&(pa)->u.pa_rcu, ext4_mb_pa_callback);
 	}
+	if (! list_empty(&list)) {
+		/*
+		 * we have to do something for the check in
+		 * the function, ext4_mb_discard_group_preallocations()
+		 */
+		list_for_each_entry(pa, &list, u.pa_tmp_list)
+			pa->pa_deleted = 0;
+	}
 	if (ac)
 		kmem_cache_free(ext4_ac_cachep, ac);
 }

* Re: [PATCH] ext4: memory leakage in ext4_discard_preallocations
  2010-03-20 14:05       ` jing zhang
@ 2010-03-26  8:37         ` Aneesh Kumar K. V
  2010-03-26 14:12           ` jing zhang
  0 siblings, 1 reply; 7+ messages in thread
From: Aneesh Kumar K. V @ 2010-03-26 8:37 UTC
  To: jing zhang, Andreas Dilger; +Cc: tytso, linux-ext4, Dave Kleikamp

On Sat, 20 Mar 2010 22:05:13 +0800, jing zhang <zj.barak@gmail.com> wrote:
>
> Evening,
>
> Thanks Andreas and Ted for your good explanations of how to deal with
> errors in a gentle way. I understand now that the chance may exist,
> since the pa is not deleted from its group_list yet.
>
> It also seems that there is still some work to do here.
>
> 	- zj
>
> ---
>
> --- linux-2.6.32/fs/ext4/mballoc.c	2009-12-03 11:51:22.000000000 +0800
> +++ fs/mballoc.c	2010-03-20 21:40:04.000000000 +0800
> @@ -3788,14 +3788,14 @@ repeat:
>  		err = ext4_mb_load_buddy(sb, group, &e4b);
>  		if (err) {
>  			ext4_error(sb, __func__, "Error in loading buddy "
> -					"information for %u", group);
> +				"information for group %u inode %lu", group, inode->i_ino);
>  			continue;
>  		}
>
>  		bitmap_bh = ext4_read_block_bitmap(sb, group);
>  		if (bitmap_bh == NULL) {
>  			ext4_error(sb, __func__, "Error in reading block "
> -					"bitmap for %u", group);
> +				"bitmap for group %u inode %lu", group, inode->i_ino);
>  			ext4_mb_release_desc(&e4b);
>  			continue;
>  		}
> @@ -3811,6 +3811,14 @@ repeat:
>  		list_del(&pa->u.pa_tmp_list);
>  		call_rcu(&(pa)->u.pa_rcu, ext4_mb_pa_callback);
>  	}
> +	if (! list_empty(&list)) {
> +		/*
> +		 * we have to do something for the check in
> +		 * the function, ext4_mb_discard_group_preallocations()
> +		 */
> +		list_for_each_entry(pa, &list, u.pa_tmp_list)
> +			pa->pa_deleted = 0;
> +	}
>  	if (ac)
>  		kmem_cache_free(ext4_ac_cachep, ac);
>  }

Can you add a comment saying that if we fail to load the buddy or read
the block bitmap, we skip freeing the prealloc space, so we mark it
undeleted? The prealloc space is still removed from the inode, but it
remains linked to the group prealloc list via pa_group_list.

-aneesh

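The behavior Aneesh asks to have documented can be sketched in plain userspace C. This is a toy model, not the kernel's data structures: `struct toy_pa` and both functions are hypothetical, the list is replaced by an array, and locking and RCU are left out. It also assumes that the group-level pass skips entries whose deleted flag is already set, which is what the patch comment about the check in ext4_mb_discard_group_preallocations() suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified preallocation record (not the kernel struct). */
struct toy_pa {
	int group;		/* block group the preallocation belongs to */
	bool deleted;		/* claimed by a discard pass */
	bool on_group_list;	/* still linked into its group's list */
};

/*
 * Inode-level discard: claim every PA, then try to free each one.
 * If the group's metadata cannot be loaded (modelled by bad_group),
 * the PA is skipped and marked undeleted again, because it is still
 * on its group list. Returns the number of skipped PAs.
 */
static size_t toy_discard_inode_pas(struct toy_pa *pas, size_t n, int bad_group)
{
	size_t skipped = 0;

	assert(pas != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		pas[i].deleted = true;			/* claim */

	for (size_t i = 0; i < n; i++) {
		if (pas[i].group == bad_group) {
			pas[i].deleted = false;		/* undo the claim */
			skipped++;
			continue;
		}
		pas[i].on_group_list = false;		/* freed */
	}
	return skipped;
}

/*
 * Group-level discard: frees PAs of one group that are still on the
 * group list and not claimed by someone else. Returns how many it freed.
 */
static size_t toy_discard_group_pas(struct toy_pa *pas, size_t n, int group)
{
	size_t reclaimed = 0;

	for (size_t i = 0; i < n; i++) {
		if (pas[i].group != group || !pas[i].on_group_list)
			continue;
		if (pas[i].deleted)
			continue;	/* assumed check: already claimed, skip */
		pas[i].deleted = true;
		pas[i].on_group_list = false;
		reclaimed++;
	}
	return reclaimed;
}
```

In this model, a PA left with the deleted flag set while still on the group list would be skipped by every later group pass and never reclaimed; clearing the flag is what lets a later pass pick it up.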
* Re: [PATCH] ext4: memory leakage in ext4_discard_preallocations
  2010-03-26  8:37         ` Aneesh Kumar K. V
@ 2010-03-26 14:12           ` jing zhang
  0 siblings, 0 replies; 7+ messages in thread
From: jing zhang @ 2010-03-26 14:12 UTC
  To: Aneesh Kumar K. V; +Cc: Andreas Dilger, tytso, linux-ext4, Dave Kleikamp

2010/3/26, Aneesh Kumar K. V <aneesh.kumar@linux.vnet.ibm.com>:
> On Sat, 20 Mar 2010 22:05:13 +0800, jing zhang <zj.barak@gmail.com> wrote:
>>
>> Evening,
>>
>> Thanks Andreas and Ted for your good explanations of how to deal with
>> errors in a gentle way. I understand now that the chance may exist,
>> since the pa is not deleted from its group_list yet.
>>
>> It also seems that there is still some work to do here.
>>
>> 	- zj
>>
>> ---
>>
>> --- linux-2.6.32/fs/ext4/mballoc.c	2009-12-03 11:51:22.000000000 +0800
>> +++ fs/mballoc.c	2010-03-20 21:40:04.000000000 +0800
>> @@ -3788,14 +3788,14 @@ repeat:
>>  		err = ext4_mb_load_buddy(sb, group, &e4b);
>>  		if (err) {
>>  			ext4_error(sb, __func__, "Error in loading buddy "
>> -					"information for %u", group);
>> +				"information for group %u inode %lu", group, inode->i_ino);
>>  			continue;
>>  		}
>>
>>  		bitmap_bh = ext4_read_block_bitmap(sb, group);
>>  		if (bitmap_bh == NULL) {
>>  			ext4_error(sb, __func__, "Error in reading block "
>> -					"bitmap for %u", group);
>> +				"bitmap for group %u inode %lu", group, inode->i_ino);
>>  			ext4_mb_release_desc(&e4b);
>>  			continue;
>>  		}
>> @@ -3811,6 +3811,14 @@ repeat:
>>  		list_del(&pa->u.pa_tmp_list);
>>  		call_rcu(&(pa)->u.pa_rcu, ext4_mb_pa_callback);
>>  	}
>> +	if (! list_empty(&list)) {
>> +		/*
>> +		 * we have to do something for the check in
>> +		 * the function, ext4_mb_discard_group_preallocations()
>> +		 */
>> +		list_for_each_entry(pa, &list, u.pa_tmp_list)
>> +			pa->pa_deleted = 0;
>> +	}
>>  	if (ac)
>>  		kmem_cache_free(ext4_ac_cachep, ac);
>>  }
>
> Can you add a comment saying that if we fail to load the buddy or read
> the block bitmap, we skip freeing the prealloc space, so we mark it
> undeleted? The prealloc space is still removed from the inode, but it
> remains linked to the group prealloc list via pa_group_list.
>
> -aneesh
>

	/*
	 * The trick here is to mark the PAs undeleted,
	 * since they are still on their pa_group_list.
	 */

That is it, Aneesh.

I am still waiting for comments, if any, from Ted, since I am not sure
the trick is safe enough.

And I am not able to deliver a better patch tonight :(

	- zj