* question about i_dtime being used as an orphan list pointer
From: Vitali Lovich @ 2010-08-25 19:22 UTC
To: linux-ext4

So I've run into this problem where the clock was reset into the 1970s
on my system, causing e2fsck to get confused & think a file I deleted
actually had an orphan list inode pointer stored in the i_dtime
instead of the deletion time, and then return an error code.

My current idea for a workaround is to clamp the value to midnight 2000
if get_seconds returns something earlier than that, but I'm not really
enamoured with that idea.

Looking at this further, I made the following observations (please let
me know if I'm completely off base):

Even a value of midnight 2010 corresponds to a limit of only about 1
billion files (1 262 304 000). Thus it seems if you delete a file on
a partition with more than a billion files, it will make e2fsck think
you've got a corrupt file-system even though you don't.

A slightly related question I have is whether anyone knows if i_dtime is
actually used as a timestamp for anything useful in kernel or
user-space. Can I just set i_dtime to 0xffffffff when the file is
deleted instead of giving it wall-clock time?

Thanks for any feedback,
Vitali
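To put numbers on the observation above, here is a minimal Python sketch that converts a few wall-clock dates to seconds since the epoch, the unit a deletion time is stored in. The helper name `epoch_seconds`, the one-week reset date, and the sample inode count are illustrative assumptions, not values taken from the kernel or e2fsprogs.

```python
from datetime import datetime, timezone


def epoch_seconds(year, month, day):
    """Seconds since 1970-01-01 00:00:00 UTC at midnight UTC of the given date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


MIDNIGHT_2000 = epoch_seconds(2000, 1, 1)
MIDNIGHT_2010 = epoch_seconds(2010, 1, 1)

# Assumption: a clock that was reset and has been running for one week.
RESET_CLOCK = epoch_seconds(1970, 1, 8)

# Assumption: a hypothetical inode count for a small embedded partition.
SAMPLE_INODE_COUNT = 1_000_000

if __name__ == "__main__":
    print("midnight 2000:", MIDNIGHT_2000)
    print("midnight 2010:", MIDNIGHT_2010)
    print("reset clock:  ", RESET_CLOCK)
    # A deletion time at or below the inode count cannot be told apart
    # from an inode number by its value alone.
    print("ambiguous:", RESET_CLOCK <= SAMPLE_INODE_COUNT)
```

The midnight 2010 value matches the 1 262 304 000 figure in the message. A timestamp from the first days of 1970 is small enough to fall inside the inode number range of even a modest partition.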
* Re: question about i_dtime being used as an orphan list pointer
From: Ted Ts'o @ 2010-08-25 20:39 UTC
To: Vitali Lovich; +Cc: linux-ext4

On Wed, Aug 25, 2010 at 12:22:22PM -0700, Vitali Lovich wrote:
> So I've run into this problem where the clock was reset into the 1970s
> on my system, causing e2fsck to get confused & think a file I deleted
> actually had an orphan list inode pointer stored in the i_dtime
> instead of the deletion time, and then return an error code.

Are you worried about solving this problem as a one-shot deal, or as a
long-term design issue? The long-term proper solution is to run your
clock so it is correct. :-)

In the short term, probably the easiest thing to do is just run e2fsck
-y and let those inodes get populated into lost+found, and then delete
them by hand afterwards.

For a long-term fix, it probably would make sense to patch ext3/ext4
so that when we delete a file, and the current time is less than the
number of inodes, we set dtime to 0xffffffff.

> Even a value of midnight 2010 corresponds to a limit of only about 1
> billion files (1 262 304 000). Thus it seems if you delete a file on
> a partition with more than a billion files, it will make e2fsck think
> you've got a corrupt file-system even though you don't.

And if you assume the smallest possible inode ratio, that's a 4.5TB
file system. If you use the default inode ratio, that's 18TB. Ext3 is
limited to 16TB, so that's not even an issue....

- Ted
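The capacity estimate above is inode count multiplied by the inode ratio (bytes of file system per inode). A minimal Python sketch of that arithmetic follows. The two ratios, 4096 and 16384 bytes per inode, are assumptions inferred from the sizes quoted in the message, and the function name is hypothetical.

```python
TIB = 2 ** 40  # bytes in one tebibyte

# Inode count equal to the epoch-seconds value of midnight 2010.
INODES_AT_MIDNIGHT_2010 = 1_262_304_000

# Assumed inode ratios (bytes of file system per inode).
SMALL_RATIO = 4096
DEFAULT_RATIO = 16384


def fs_bytes_for_inodes(inode_count, bytes_per_inode):
    """Approximate file system size needed to hold inode_count inodes."""
    return inode_count * bytes_per_inode


if __name__ == "__main__":
    for label, ratio in (("small", SMALL_RATIO), ("default", DEFAULT_RATIO)):
        size = fs_bytes_for_inodes(INODES_AT_MIDNIGHT_2010, ratio)
        print(f"{label} ratio ({ratio} B/inode): {size / TIB:.2f} TiB")
```

Under these assumed ratios the results come out slightly above the 4.5TB and 18TB quoted, which is consistent with those figures being rounded from roughly 1.2 billion inodes. As the follow-up notes, the smallest usable ratio depends on the block size.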
* Re: question about i_dtime being used as an orphan list pointer
From: Vitali Lovich @ 2010-08-25 21:25 UTC
To: Ted Ts'o; +Cc: linux-ext4

On Wed, Aug 25, 2010 at 1:39 PM, Ted Ts'o <tytso@mit.edu> wrote:
> On Wed, Aug 25, 2010 at 12:22:22PM -0700, Vitali Lovich wrote:
>> So I've run into this problem where the clock was reset into the 1970s
>> on my system, causing e2fsck to get confused & think a file I deleted
>> actually had an orphan list inode pointer stored in the i_dtime
>> instead of the deletion time, and then return an error code.
>
> Are you worried about solving this problem as a one-shot deal, or as a
> long-term design issue? The long-term proper solution is to run your
> clock so it is correct. :-)

Right :). The problem is we haven't completely root-caused the issue
of why the clock had the wrong time, so I wanted to put in a robust
fix that makes the failure mode more graceful. I don't like using the
number of inodes as the threshold, since that might change, which gets
us back to the problem of putting inodes on the orphan list that
shouldn't be there.

> In the short term, probably the easiest thing to do is just run e2fsck
> -y and let those inodes get populated into lost+found, and then delete
> them by hand afterwards.

This is an embedded device, so doing anything by hand is not an option.
When e2fsck fails we assume that there was a catastrophic failure & we
brick the device to avoid any potential corruption of user data (this
is only when e2fsck fails on the root or boot partitions right after
we unmount them after an OS upgrade).

> For a long-term fix, it probably would make sense to patch ext3/ext4
> so that when we delete a file, and the current time is less than the
> number of inodes, we set dtime to 0xffffffff.

Is the i_dtime actually used at all? Could we just patch it to always
set that as the i_dtime? I grepped the source, & no one is actually
using the i_dtime field.

>> Even a value of midnight 2010 corresponds to a limit of only about 1
>> billion files (1 262 304 000). Thus it seems if you delete a file on
>> a partition with more than a billion files, it will make e2fsck think
>> you've got a corrupt file-system even though you don't.
>
> And if you assume the smallest possible inode ratio, that's a 4.5TB
> file system. If you use the default inode ratio, that's 18TB. Ext3 is
> limited to 16TB, so that's not even an issue....

Depends on the block size, but yes, I don't think this is a super
important issue in general with respect to ext2/3 - not sure how this
affects ext4 since it, AFAIK, supports much higher capacity
partitions.

Thanks,
Vitali
* Re: question about i_dtime being used as an orphan list pointer
From: Ted Ts'o @ 2010-08-25 22:48 UTC
To: Vitali Lovich; +Cc: linux-ext4

On Wed, Aug 25, 2010 at 02:25:37PM -0700, Vitali Lovich wrote:
> > For a long-term fix, it probably would make sense to patch ext3/ext4
> > so that when we delete a file, and the current time is less than the
> > number of inodes, we set dtime to 0xffffffff.
>
> Is the i_dtime actually used at all? Could we just patch it to always
> set that as the i_dtime? I grepped the source, & no one is actually
> using the i_dtime field.

It's not used by the kernel, but it is used occasionally by people who
are troubleshooting file systems and want to know when a particular
inode was deleted.

So a local hack to always set i_dtime to ~0 would not be horrible.
But something which sets i_dtime to the system time, unless this would
cause confusion with an orphaned inode linked list, in which case ~0
is used instead, is probably a cleaner fix.

- Ted
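The cleaner fix described above can be sketched in a few lines of Python. This is only an illustration of the selection rule, not kernel code: the function name `choose_dtime`, the constant name, and the use of `<=` for the boundary (the thread says "less than") are assumptions made here.

```python
# ~0 as an unsigned 32-bit value; larger than any plausible inode count.
DTIME_SENTINEL = 0xFFFFFFFF


def choose_dtime(now_seconds, inodes_count):
    """Pick the value to store as the deletion time of an inode.

    If the wall clock is so small that it could be mistaken for an inode
    number on the orphan list, store the sentinel instead.
    """
    # Assumption: inode numbers run up to inodes_count inclusive, so the
    # boundary value is treated as ambiguous too.
    if now_seconds <= inodes_count:
        return DTIME_SENTINEL
    # The field is 32 bits wide, so keep only the low 32 bits.
    return now_seconds & 0xFFFFFFFF


if __name__ == "__main__":
    inodes = 1_000_000  # hypothetical partition
    print(hex(choose_dtime(604_800, inodes)))    # clock reset to early 1970
    print(choose_dtime(1_282_776_000, inodes))   # a sane clock in August 2010
```

With a sane clock the real deletion time is kept, so it stays available for troubleshooting. The sentinel is used only when the timestamp would collide with the inode number range. The concern raised earlier in the thread still applies: the inode count can change later, for example after a resize.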
* Re: question about i_dtime being used as an orphan list pointer
From: Vitali Lovich @ 2010-08-25 22:51 UTC
To: Ted Ts'o; +Cc: linux-ext4

That was my analysis as well. Thanks.

-Vitali