From mboxrd@z Thu Jan 1 00:00:00 1970
From: Zheng Liu
Subject: Re: ext4 xfstest regression due to ext4_es_lookup_extent
Date: Sun, 24 Feb 2013 11:21:56 +0800
Message-ID: <20130224032156.GA5840@gmail.com>
References: <87obfcs1x6.fsf@openvz.org> <20130222180325.GB21264@thunk.org> <87txp3cqwt.fsf@openvz.org> <51289343.90704@gmail.com> <20130224001447.GB1196@thunk.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Dmitry Monakhov , linux-ext4@vger.kernel.org
To: Theodore Ts'o
Return-path:
Received: from mail-pb0-f47.google.com ([209.85.160.47]:65029 "EHLO mail-pb0-f47.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1759410Ab3BXDHG (ORCPT ); Sat, 23 Feb 2013 22:07:06 -0500
Received: by mail-pb0-f47.google.com with SMTP id rp2so1086422pbb.34 for ; Sat, 23 Feb 2013 19:07:05 -0800 (PST)
Content-Disposition: inline
In-Reply-To: <20130224001447.GB1196@thunk.org>
Sender: linux-ext4-owner@vger.kernel.org
List-ID:

On Sat, Feb 23, 2013 at 07:14:47PM -0500, Theodore Ts'o wrote:
> On Sat, Feb 23, 2013 at 06:00:35PM +0800, Zheng Liu wrote:
> > > Actually I think that the regression in 269'th you have found recently
> > > caused by similar issue and commit which you foud by bisecting ( the one
> > > which allow migration between indirect<->extent based inodes)
> > > simply helps to spot real issue in es_caching code.
> >
> > I will revise this patch. IIRC, we forgot to update status tree after
> > an inode is migrated from extent-based to indirect-based. Thanks for
> > pointing out.
>
> Can you do this as a new commit? I've already bumped the master
> pointer up since I finished running xfstests and I'm seeing no
> regressions (at least with my set of xfstests). So given that
> everything has been tested and things looks pretty stable, I pushed up
> the master branch.

Yes, I will prepare it as a new commit. But I am not entirely sure that
the root cause is es_caching.
>
> I did remember that you were still working on this regression, but
> since we're already half-way through the merge window, I really want
> to make things are ready for a merge request to Linus. (Which I
> probably will be sending to Linus by Monday or Tuesday.)
>
> I do plan to collect bug fixes and any remaining regression fixes to
> push to Linus by -rc2 or -rc3, so if don't rush fixing up defrag
> functionality.

For the defrag regression, I see two ways to fix it.

One is a quick but sub-optimal fix: invalidate all written/unwritten/hole
extents in the status tree. This will hurt performance because we need
to load the extents into the status tree again. Further, one thing we
need to keep in mind is that some extents are both unwritten and
delayed, which complicates things. But for now we don't need to worry
about that, because a bigalloc file system doesn't support defrag. So we
are safe.

The other is to update all extents in the status tree. I think this is
the better choice, and I think Dmitry is working on it. Dmitry, I
haven't received your response. Could you confirm?

TBH, we never use the migration and defrag features in our production
systems. I admit that I have paid almost no attention to them. It's my
fault.

Here is my plan for the next pieces of work.

1. Try to prepare a patch that invalidates the whole cache in the status
   tree to fix the defrag regression, and wait for Dmitry's patch.
2. Revise the migration patch.
3. Submit the remaining patches for the extent status tree, which try to
   convert unwritten extents in the end_io callback function and remove
   a bogus wait in ext4_ind_direct_IO. The patch is already done but
   still needs to be tested.
4. get_block_t and *map_blocks cleanup.
5. Extent-level locking.

Any comments?

Thanks,
- Zheng