linux-ext4.vger.kernel.org archive mirror
* [v4.14-rc1 regression] ext4 failed fstests generic/233 quota test
@ 2017-10-08  5:42 Eryu Guan
  2017-10-10 11:43 ` Jan Kara
  0 siblings, 1 reply; 3+ messages in thread
From: Eryu Guan @ 2017-10-08  5:42 UTC (permalink / raw)
  To: linux-ext4; +Cc: linux-fsdevel, Jan Kara, Eric Whitney

Hi all,

After the generic/232 failure was reported and resolved[1], I can still
see fstests generic/233 fail on ext4 with the v4.14-rc3 kernel. The
failure does not reproduce on every run (block usage needs to exceed the
soft limit), but it reproduces reliably.

 seed = S
 Comparing user usage
-Comparing group usage
+4c4
+< #1001     +-   32064   32000   32000            998  1000  1000       
+---
+> #1001     +-   32064   32000   32000  7days     998  1000  1000

Grace time was not printed by repquota right after the fsstress run when
we exceeded the block soft limit, and only printed after a quotacheck
was run.  With v4.13 kernel, block grace time could be printed
immediately after the fsstress run.
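
The symptom in the diff above can be checked mechanically. The following is
a minimal sketch in Python, assuming the column layout shown in the two
repquota rows above; block_grace_missing is a hypothetical helper, not part
of fstests or quota-tools.

```python
def block_grace_missing(line):
    """Return True if a repquota-style row is over the block soft limit
    but shows no block grace period.

    Assumed column order (taken from the rows in the report):
    name, flags, blocks used, block soft, block hard, [block grace], ...
    """
    fields = line.split()
    flags = fields[1]
    used, soft = int(fields[2]), int(fields[3])
    # The first flag character is '+' when block usage is over the soft limit.
    over_soft = flags[0] == "+" and soft > 0 and used > soft
    # Assumption: a printed grace period is the sixth column and is not a
    # plain integer (for example "7days"); otherwise the sixth column is
    # already the inode usage count.
    has_grace = len(fields) > 5 and not fields[5].isdigit()
    return over_soft and not has_grace


bad = "#1001     +-   32064   32000   32000            998  1000  1000"
good = "#1001     +-   32064   32000   32000  7days     998  1000  1000"
print(block_grace_missing(bad), block_grace_missing(good))
```

The first row is the one printed right after the fsstress run on the failing
kernel, the second is the one printed after quotacheck.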

git bisect pointed to commit 7b9ca4c61bc2 ("quota: Reduce contention on
dq_data_lock") as the first bad commit. I've confirmed the bisection
result by reverting the commit in question and running generic/233 for 20
iterations without a failure.

Thanks,
Eryu

[1] https://www.spinics.net/lists/linux-ext4/msg58372.html


* Re: [v4.14-rc1 regression] ext4 failed fstests generic/233 quota test
  2017-10-08  5:42 [v4.14-rc1 regression] ext4 failed fstests generic/233 quota test Eryu Guan
@ 2017-10-10 11:43 ` Jan Kara
  2017-10-10 12:49   ` Jan Kara
  0 siblings, 1 reply; 3+ messages in thread
From: Jan Kara @ 2017-10-10 11:43 UTC (permalink / raw)
  To: Eryu Guan; +Cc: linux-ext4, linux-fsdevel, Jan Kara, Eric Whitney

Hi Eryu,

On Sun 08-10-17 13:42:36, Eryu Guan wrote:
> After the generic/232 failure was reported and resolved[1], I can still
> see fstests generic/233 fail on ext4 with the v4.14-rc3 kernel. The
> failure does not reproduce on every run (block usage needs to exceed the
> soft limit), but it reproduces reliably.
> 
>  seed = S
>  Comparing user usage
> -Comparing group usage
> +4c4
> +< #1001     +-   32064   32000   32000            998  1000  1000       
> +---
> +> #1001     +-   32064   32000   32000  7days     998  1000  1000
> 
> Grace time was not printed by repquota right after the fsstress run when
> we exceeded the block soft limit, and only printed after a quotacheck
> was run.  With v4.13 kernel, block grace time could be printed
> immediately after the fsstress run.

Well, I'd rather interpret the results as "the grace time didn't get set by
the failing kernel, only quotacheck would set it". This configuration with
softlimit == hardlimit is a bit ambiguous (as effectively softlimit and
grace time are unused) and I might have shortcut setting of grace time in
this case somewhere (which would be harmless). But still it warrants closer
investigation. I'll have a look.
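
To illustrate why softlimit == hardlimit leaves the soft limit and grace time
effectively unused, here is a minimal sketch in Python. It is a simplified
model written for this thread, not the kernel's quota code; charge_blocks and
QuotaExceeded are hypothetical names, and the model assumes an ordinary
allocation is rejected once it would exceed the hard limit.

```python
class QuotaExceeded(Exception):
    """Stands in for an allocation rejected because of the hard limit."""


def charge_blocks(usage, amount, soft, hard):
    """Model of an ordinary (failable) block allocation.

    Returns (new_usage, over_soft). A limit of 0 means "no limit".
    """
    new_usage = usage + amount
    if hard and new_usage > hard:
        raise QuotaExceeded("hard limit exceeded")
    return new_usage, bool(soft and new_usage > soft)


# With soft == hard, an allocation either stays at or below both limits
# or is rejected, so the "over soft limit" state is never reached here.
limit = 32000
usage, over_soft = charge_blocks(31990, 10, limit, limit)
print(usage, over_soft)
try:
    charge_blocks(usage, 64, limit, limit)
except QuotaExceeded as exc:
    print("rejected:", exc)
```

Under this model only an allocation that is allowed to ignore the hard limit
can push usage above the soft limit, which is the only case where the grace
time would matter in this configuration.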

> git bisect pointed to commit 7b9ca4c61bc2 ("quota: Reduce contention on
> dq_data_lock") as the first bad commit. I've confirmed the bisection
> result by reverting the commit in question and running generic/233 for 20
> iterations without a failure.

Thanks for digging into this!

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [v4.14-rc1 regression] ext4 failed fstests generic/233 quota test
  2017-10-10 11:43 ` Jan Kara
@ 2017-10-10 12:49   ` Jan Kara
  0 siblings, 0 replies; 3+ messages in thread
From: Jan Kara @ 2017-10-10 12:49 UTC (permalink / raw)
  To: Eryu Guan; +Cc: linux-ext4, linux-fsdevel, Jan Kara, Eric Whitney

On Tue 10-10-17 13:43:23, Jan Kara wrote:
> Hi Eryu,
> 
> On Sun 08-10-17 13:42:36, Eryu Guan wrote:
> > After the generic/232 failure was reported and resolved[1], I can still
> > see fstests generic/233 fail on ext4 with the v4.14-rc3 kernel. The
> > failure does not reproduce on every run (block usage needs to exceed the
> > soft limit), but it reproduces reliably.
> > 
> >  seed = S
> >  Comparing user usage
> > -Comparing group usage
> > +4c4
> > +< #1001     +-   32064   32000   32000            998  1000  1000       
> > +---
> > +> #1001     +-   32064   32000   32000  7days     998  1000  1000
> > 
> > Grace time was not printed by repquota right after the fsstress run when
> > we exceeded the block soft limit, and only printed after a quotacheck
> > was run.  With v4.13 kernel, block grace time could be printed
> > immediately after the fsstress run.
> 
> Well, I'd rather interpret the results as "the grace time didn't get set by
> the failing kernel, only quotacheck would set it". This configuration with
> softlimit == hardlimit is a bit ambiguous (as effectively softlimit and
> grace time are unused) and I might have shortcut setting of grace time in
> this case somewhere (which would be harmless). But still it warrants closer
> investigation. I'll have a look.
> 
> > git bisect pointed to commit 7b9ca4c61bc2 ("quota: Reduce contention on
> > dq_data_lock") as the first bad commit. I've confirmed the bisection
> > result by reverting the commit in question and running generic/233 for 20
> > iterations without a failure.
> 
> Thanks for digging into this!

OK, I've reproduced the issue (although it took me several xfstests runs to
hit it) and it is a real bug in the handling of DQUOT_ALLOC_NOFAIL quota
allocations. I'll send a fix shortly once testing completes.
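
As an illustration of how a no-fail allocation path could produce the
reported symptom (usage above the soft limit with no grace time set), here is
a minimal sketch in Python. It is a hypothetical model, not the kernel code
and not the fix: Dquot, charge and the skip_checks_on_nofail switch are
invented for this example, and the 7-day grace period is taken from the
"7days" in the report.

```python
from dataclasses import dataclass

GRACE_SECONDS = 7 * 24 * 3600  # assumption: matches "7days" in the report


class QuotaExceeded(Exception):
    """Stands in for an allocation rejected because of the hard limit."""


@dataclass
class Dquot:
    usage: int = 0
    soft: int = 0
    hard: int = 0
    grace_expiry: int = 0  # 0 means "no grace time set"


def charge(dq, amount, now, nofail=False, skip_checks_on_nofail=False):
    """Charge `amount` blocks to `dq` at time `now` (seconds)."""
    new_usage = dq.usage + amount
    if nofail and skip_checks_on_nofail:
        # Hypothetical faulty shortcut: account the space and return,
        # skipping the soft-limit bookkeeping below.
        dq.usage = new_usage
        return
    if dq.hard and new_usage > dq.hard and not nofail:
        raise QuotaExceeded("hard limit exceeded")
    if dq.soft and new_usage > dq.soft and dq.grace_expiry == 0:
        # First time over the soft limit: start the grace period.
        dq.grace_expiry = now + GRACE_SECONDS
    dq.usage = new_usage


expected = Dquot(usage=32000, soft=32000, hard=32000)
charge(expected, 64, now=1000, nofail=True)

symptom = Dquot(usage=32000, soft=32000, hard=32000)
charge(symptom, 64, now=1000, nofail=True, skip_checks_on_nofail=True)

print(expected.usage, expected.grace_expiry)
print(symptom.usage, symptom.grace_expiry)
```

Both objects end up with usage 32064 against limits of 32000, as in the
report, but only the first one has a grace time; the second would need
something like quotacheck to set it afterwards.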

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
