* fsck memory usage
From: Subranshu Patel @ 2013-04-17 15:10 UTC
To: linux-ext4

I performed some recovery (fsck) tests with a large EXT4 filesystem.
The filesystem size was 500GB (3 million files, 5000 directories).
I performed a forced fsck on the clean filesystem and measured the
memory usage, which was around 2GB.

Then I performed metadata corruption - on 10% of the files, 10% of the
directories and some superblock attributes - using debugfs.  Then I
executed fsck and found a memory usage of around 8GB, a much larger
value.

1. Is there a way to reduce the memory usage (apart from the
   scratch_files option, as it increases the recovery time)?

2. This question is not strictly related to this EXT4 mailing list,
   but in a real scenario how is this kind of situation (large memory
   usage) handled in large-scale filesystem deployments when actual
   filesystem corruption occurs (perhaps due to some fault in the
   hardware or controller)?
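The scratch_files option referred to above is configured through
e2fsck.conf.  A minimal sketch, assuming the stock configuration file
location and an example cache directory (both are placeholders, not
taken from the thread), looks like this:

  # /etc/e2fsck.conf - illustrative sketch; the directory below is an
  # example path and must already exist and be writable by e2fsck
  [scratch_files]
  directory = /var/cache/e2fsck

With this set, e2fsck keeps its directory-info and inode-count tables
in on-disk tdb files under that directory instead of in memory, which
is exactly the RAM-for-runtime trade-off mentioned in question 1.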
* Re: fsck memory usage
From: Theodore Ts'o @ 2013-04-17 23:07 UTC
To: Subranshu Patel; +Cc: linux-ext4

On Wed, Apr 17, 2013 at 08:40:08PM +0530, Subranshu Patel wrote:
> I performed some recovery (fsck) tests with a large EXT4 filesystem.
> The filesystem size was 500GB (3 million files, 5000 directories).
> I performed a forced fsck on the clean filesystem and measured the
> memory usage, which was around 2GB.

What version of e2fsprogs are you using?  A number of changes have
been made to improve both CPU and memory utilization in more recent
versions of e2fsprogs.

What would be useful would be for you to run the command:

   /usr/bin/time e2fsck -nvftt /dev/XXX

Here's a run that I've done on a 1TB disk that was about 70% filled
with 8M files.  It has fewer directories (1000) and far fewer files
(3000) than your filesystem, but you'll see it uses much less memory:

e2fsck 1.42.6+git2 (29-Nov-2012)
Pass 1: Checking inodes, blocks, and sizes
Pass 1: Memory used: 400k/7888k (299k/102k), time:  9.64/ 1.04/ 0.02
Pass 1: I/O read: 4MB, write: 0MB, rate: 0.41MB/s
Pass 2: Checking directory structure
Pass 2: Memory used: 400k/15536k (276k/125k), time:  3.72/ 0.02/ 0.05
Pass 2: I/O read: 5MB, write: 0MB, rate: 1.34MB/s
Pass 3: Checking directory connectivity
Peak memory: Memory used: 400k/15536k (276k/125k), time: 13.59/ 1.28/ 0.07
Pass 3A: Memory used: 400k/15536k (297k/104k), time:  0.00/ 0.00/ 0.00
Pass 3A: I/O read: 0MB, write: 0MB, rate: 0.00MB/s
Pass 3: Memory used: 400k/15536k (263k/138k), time:  0.00/ 0.00/ 0.00
Pass 3: I/O read: 1MB, write: 0MB, rate: 1162.79MB/s
Pass 4: Checking reference counts
Pass 4: Memory used: 400k/240k (228k/173k), time:  1.90/ 1.88/ 0.00
Pass 4: I/O read: 0MB, write: 0MB, rate: 0.00MB/s
Pass 5: Checking group summary information
Pass 5: Memory used: 400k/240k (206k/195k), time:  6.25/ 1.46/ 0.38
Pass 5: I/O read: 31MB, write: 0MB, rate: 4.96MB/s
/dev/hdw3: 4272/48891680 files (0.6% non-contiguous), 170570829/244190000 blocks
Memory used: 400k/240k (206k/195k), time: 21.93/ 4.78/ 0.46
I/O read: 39MB, write: 0MB, rate: 1.78MB/s
4.78user 0.55system 0:22.08elapsed 24%CPU (0avgtext+0avgdata 68608maxresident)k
0inputs+0outputs (5major+2323minor)pagefaults 0swaps

It would be useful to see what your run reports, and to see what
version of e2fsprogs you are using.

> Then I performed metadata corruption - on 10% of the files, 10% of
> the directories and some superblock attributes - using debugfs.
> Then I executed fsck and found a memory usage of around 8GB, a much
> larger value.

It's going to depend on what sort of metadata corruption was suffered.
If e2fsck needs to do the pass 1b/c/d fix-ups, it will need more
memory.  That's pretty much unavoidable, but it's also not the common
case.  In most use cases, if those rare cases require using swap,
that's generally OK, so long as it's the rare case and not the common
one.  That's why it's not something I've really been worried about.

> 2. This question is not strictly related to this EXT4 mailing list,
> but in a real scenario how is this kind of situation (large memory
> usage) handled in large-scale filesystem deployments when actual
> filesystem corruption occurs (perhaps due to some fault in the
> hardware or controller)?

What's your use case where you are memory constrained?  Is it a
bookshelf NAS configuration?  Are you hooking up a large number of
disks to a memory-constrained server and then trying to run fsck in
parallel across a large number of 3TB or 4TB disks?  Depending on
what you are trying to do, there may be different solutions.

In general ext4 has always assumed at least a "reasonable" amount of
memory for a large amount of storage, but it's understood that
"reasonable" has changed over the years.  So there have been some
improvements made more recently, but they may or may not be good
enough for your use case.  Can you give us more details about what
your requirements are?

Regards,

						- Ted
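For capturing the peak memory of a run like the one above, one option
(a sketch, assuming GNU time is what is installed at /usr/bin/time;
/dev/XXX and the log file name are placeholders) is:

  /usr/bin/time -v e2fsck -nvftt /dev/XXX 2>&1 | tee e2fsck-run.log

GNU time's -v output includes a "Maximum resident set size" line,
which corresponds to the maxresident figure in the run shown above
(68608k, roughly 67MB), and is a more direct measure of e2fsck's own
memory footprint than system-wide counters.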
* Re: fsck memory usage
From: Andreas Dilger @ 2013-04-18 18:34 UTC
To: Theodore Ts'o; +Cc: Subranshu Patel, linux-ext4

On 2013-04-17, at 5:07 PM, Theodore Ts'o wrote:
> On Wed, Apr 17, 2013 at 08:40:08PM +0530, Subranshu Patel wrote:
>> I performed some recovery (fsck) tests with a large EXT4 filesystem.
>> The filesystem size was 500GB (3 million files, 5000 directories).
>> I performed a forced fsck on the clean filesystem and measured the
>> memory usage, which was around 2GB.
>>
>> Then I performed metadata corruption - on 10% of the files, 10% of
>> the directories and some superblock attributes - using debugfs.
>> Then I executed fsck and found a memory usage of around 8GB, a much
>> larger value.
>
> It's going to depend on what sort of metadata corruption was
> suffered.  If e2fsck needs to do the pass 1b/c/d fix-ups, it will
> need more memory.  That's pretty much unavoidable, but it's also not
> the common case.  In most use cases, if those rare cases require
> using swap, that's generally OK, so long as it's the rare case and
> not the common one.  That's why it's not something I've really been
> worried about.

This is also where the "inode badness" patch would potentially help,
by avoiding even trying to fix inodes that are random garbage, so that
the duplicate-block processing would be skipped:

http://git.whamcloud.com/?p=tools/e2fsprogs.git;a=commitdiff;h=c17983c570d4fd87e628dd4fdf12d232cfd00694

I was just discussing this patch today, but unfortunately I don't
think the rewrite of that patch will happen any time soon.  Is there
any chance that the existing patch could be landed as is?  The
original objection was that all of the inode checks should be
centralized into a single location, but I don't think it is so bad
for the current patch to mark the inode bad at the same locations
that call fix_problem().

Cheers, Andreas

>> 2. This question is not strictly related to this EXT4 mailing list,
>> but in a real scenario how is this kind of situation (large memory
>> usage) handled in large-scale filesystem deployments when actual
>> filesystem corruption occurs (perhaps due to some fault in the
>> hardware or controller)?
>
> What's your use case where you are memory constrained?  Is it a
> bookshelf NAS configuration?  Are you hooking up a large number of
> disks to a memory-constrained server and then trying to run fsck
> in parallel across a large number of 3TB or 4TB disks?  Depending
> on what you are trying to do, there may be different solutions.
>
> In general ext4 has always assumed at least a "reasonable" amount
> of memory for a large amount of storage, but it's understood that
> "reasonable" has changed over the years.  So there have been some
> improvements made more recently, but they may or may not be good
> enough for your use case.  Can you give us more details about what
> your requirements are?

Cheers, Andreas
* Re: fsck memory usage
From: Subranshu Patel @ 2013-05-01 2:42 UTC
To: Theodore Ts'o; +Cc: linux-ext4

> What version of e2fsprogs are you using?  A number of changes have
> been made to improve both CPU and memory utilization in more recent
> versions of e2fsprogs.

I am using version 1.41.12.

>> Then I performed metadata corruption - on 10% of the files, 10% of
>> the directories and some superblock attributes - using debugfs.
>> Then I executed fsck and found a memory usage of around 8GB, a much
>> larger value.
>
> It's going to depend on what sort of metadata corruption was
> suffered.  If e2fsck needs to do the pass 1b/c/d fix-ups, it will
> need more memory.  That's pretty much unavoidable, but it's also not
> the common case.  In most use cases, if those rare cases require
> using swap, that's generally OK, so long as it's the rare case and
> not the common one.  That's why it's not something I've really been
> worried about.

I used the sar command for tracking memory usage.  The total memory
usage reported by sar is around 8GB, but that includes the buffer and
cache memory:

  memused = 8GB
  buffers = 6.7GB
  cache   = negligible (a few MB)

So I think the effective memory usage is 1.3GB (8 - 6.7), since the
memory reported under buffers and cache remains available for use if
any other process requires it.  Please correct my understanding.

-- Subranshu
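The same subtraction can be sanity-checked with free; the figures
below are illustrative only, not taken from the run being discussed:

  $ free -m
               total       used       free     shared    buffers     cached
  Mem:          9000       8192        808          0       6860        120
  -/+ buffers/cache:       1212       7788
  Swap:         4095          0       4095

The "-/+ buffers/cache" line performs exactly the "used minus buffers
minus cached" calculation, so its used column (about 1.2GB in this
made-up example) is the number comparable to e2fsck's own allocations.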
* Re: fsck memory usage
From: Theodore Ts'o @ 2013-05-01 4:09 UTC
To: Subranshu Patel; +Cc: linux-ext4

On Wed, May 01, 2013 at 08:12:14AM +0530, Subranshu Patel wrote:
> I used the sar command for tracking memory usage.  The total memory
> usage reported by sar is around 8GB, but that includes the buffer
> and cache memory:
>
>   memused = 8GB
>   buffers = 6.7GB
>   cache   = negligible (a few MB)
>
> So I think the effective memory usage is 1.3GB (8 - 6.7), since the
> memory reported under buffers and cache remains available for use if
> any other process requires it.  Please correct my understanding.

Yes, in general this is true.  I'm curious why you don't just use
/usr/bin/time, though.

					- Ted
* Re: fsck memory usage
From: Andreas Dilger @ 2013-05-06 1:27 UTC
To: Subranshu Patel; +Cc: Theodore Ts'o, linux-ext4

On 2013-04-30, at 8:42 PM, Subranshu Patel wrote:
>> What version of e2fsprogs are you using?  A number of changes have
>> been made to improve both CPU and memory utilization in more recent
>> versions of e2fsprogs.
>
> I am using version 1.41.12.

Could you please retest with a recent release like 1.42.7?  That
would allow us to compare the memory usage of the newer bitmap code.

To make it fair, it would probably be best to run the 1.41.12 and
1.42.7 e2fsck on the same image, so you should make a copy of the
block device after corrupting it, but before the first e2fsck run.

>>> Then I performed metadata corruption - on 10% of the files, 10% of
>>> the directories and some superblock attributes - using debugfs.
>>> Then I executed fsck and found a memory usage of around 8GB, a
>>> much larger value.
>>
>> It's going to depend on what sort of metadata corruption was
>> suffered.  If e2fsck needs to do the pass 1b/c/d fix-ups, it will
>> need more memory.
>>
>> That's pretty much unavoidable, but it's also not the common case.
>> In most use cases, if those rare cases require using swap, that's
>> generally OK, so long as it's the rare case and not the common one.
>> That's why it's not something I've really been worried about.
>
> I used the sar command for tracking memory usage.  The total memory
> usage reported by sar is around 8GB, but that includes the buffer
> and cache memory:
>
>   memused = 8GB
>   buffers = 6.7GB
>   cache   = negligible (a few MB)
>
> So I think the effective memory usage is 1.3GB (8 - 6.7), since the
> memory reported under buffers and cache remains available for use if
> any other process requires it.  Please correct my understanding.

It would also be useful to compare the "sar" memory usage to the
usage reported by e2fsck itself with "-ttt", to see if they match
relatively well or not.

Cheers, Andreas
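One way to set up that like-for-like comparison (a sketch; the device,
file, and binary names are placeholders, and it assumes the filesystem
is unmounted and there is space for the copies) is to capture the
corrupted filesystem once with e2image and give each e2fsck build its
own copy:

  # capture the metadata of the corrupted device into a sparse raw image
  e2image -r /dev/sdX1 corrupted.img

  # one working copy per e2fsck version (hypothetical binary names)
  cp --sparse=always corrupted.img test-1.41.12.img
  cp --sparse=always corrupted.img test-1.42.7.img

  /usr/bin/time ./e2fsck-1.41.12 -fvtt test-1.41.12.img
  /usr/bin/time ./e2fsck-1.42.7  -fvtt test-1.42.7.img

Because "e2image -r" writes only the metadata blocks into a sparse
file, the copies stay much smaller than the 500GB device while still
driving the same repair paths, and both versions then start from an
identical corrupted state.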