From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: with ECARTIS (v1.0.0; list xfs); Thu, 09 Aug 2007 23:08:54 -0700 (PDT) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l7A68ibm014782 for ; Thu, 9 Aug 2007 23:08:47 -0700 Message-ID: <46BC00ED.1010201@sgi.com> Date: Fri, 10 Aug 2007 16:08:45 +1000 From: Timothy Shimmin MIME-Version: 1.0 Subject: Re: REVIEW: fix xfs_repair phase 4 ag_stride with prefetch References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-15; format=flowed Content-Transfer-Encoding: 7bit Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com List-Id: xfs To: Barry Naujok Cc: "xfs@oss.sgi.com" , xfs-dev Barry Naujok wrote: > > AG stride testing was performed on a system with ample amounts > of memory, so prefetching with AG stride during Phase 4 was > missed. The attached patch fixes this. > > 32 AGs, ag_stride = 4: > > Phase 3 - for each AG... > - scan and clear agi unlinked lists... > - process known inodes and perform inode discovery... > - agno = 0 > - agno = 4 > - agno = 8 > - agno = 12 > - agno = 16 > - agno = 20 > - agno = 24 > - agno = 28 > > which is correct... but in Phase 4: > > Phase 4 - check for duplicate blocks... > - setting up duplicate extent list... > - check for inodes claiming duplicate blocks... > - agno = 0 > - agno = 1 > - agno = 2 > - agno = 5 > - agno = 6 > - agno = 4 > - agno = 7 > - agno = 3 Okay I replied to this in the bug but forgot about the review email request so I will reply here too:) ========================== ADDITIONAL INFORMATION (ADD) From: timothy shimmin Date: Aug 09 2007 10:59:01PM [BugWorks mailnews processor v1.7.1] ========================== bnaujok@sgi.com via BugWorks wrote: > > [ EXCESSIVE QUOTED TEXT DELETED ] > > - 12:00:16: setting up duplicate extent list - 32 of 32 allocation groups done > - check for inodes claiming duplicate blocks... > - agno = 0 > - agno = 1 > - agno = 2 > - agno = 5 > - agno = 6 > - agno = 4 > - agno = 7 > - agno = 3 So you need to replicate the logic/code you used in phase 3 and put it into phase4 I presume. And the changes are in phase4.c/process_ags() (like in phase3.c/process_ags). So previously it would iterate thru the thread-count and then prefetch and queue up to the ag-count stepping by the ag-stride. And now we go thru each AG but only do an ag-stride worth at a time for a thread. So each thread will get an ag-stride worth of ags to deal with: Fixed code: Thread 0 - gets ag 0-3 (go thru a loop doing a fetch & queue work on ag 0-3) Thread 1 - gets ag 4-7 Thread 2 - gets ag 8-11 etc... Okay, starting to get it now :) So previously (broken code) it would do: Thread 0 - gets ag 0,4,8,12,16,... Thread 1 - gets ag 1,5,9,13,17,... Thread 2 - gets ag 2,6,10,14,18,... I presume the bad phase 4 msgs above are coming out in a different order than what I said above just because of the threading. Or did I make a mistake in my reading of the code. Pity we couldn't share some code there between the phases but I haven't looked to see the difficulties with that are - do tell me. Would be nice to guarantee handling the striding/threading the same in both phases by both using the same iterator code - particularly if it needs to be changed in the future etc. But whatever. And I presume the fix is so that the prefetching for a thread is done in ag-order (and they are just given a chunks worth) instead of the old bad way where the prefetching would be done with a jump over AGs by the stride instead of in order. So it looks fine if my understanding is correct. Cheers, Tim.