From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jan Kara Subject: Re: [PATCH-v2 1/2] fs: make sure the timestamps for lazytime inodes eventually get written Date: Tue, 17 Mar 2015 11:33:37 +0100 Message-ID: <20150317103337.GE23155@quack.suse.cz> References: <1426533260-3305-1-git-send-email-tytso@mit.edu> <1426533260-3305-2-git-send-email-tytso@mit.edu> <0ADB84CA-6F1D-41EC-A847-FCD5764FCAE9@dilger.ca> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: Theodore Ts'o , Linux Filesystem Development List , jack@suse.cz, torvalds@linux-foundation.org, viro@zeniv.linux.org.uk, stable@vger.kernel.org To: Andreas Dilger Return-path: Received: from cantor2.suse.de ([195.135.220.15]:39169 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752839AbbCQKdm (ORCPT ); Tue, 17 Mar 2015 06:33:42 -0400 Content-Disposition: inline In-Reply-To: <0ADB84CA-6F1D-41EC-A847-FCD5764FCAE9@dilger.ca> Sender: linux-fsdevel-owner@vger.kernel.org List-ID: On Mon 16-03-15 15:34:12, Andreas Dilger wrote: > On Mar 16, 2015, at 1:14 PM, Theodore Ts'o wrote: > > > > Jan Kara pointed out that if there is an inode which is constantly > > getting dirtied with I_DIRTY_PAGES, an inode with an updated timestamp > > will never be written since inode->dirtied_when is constantly getting > > updated. We fix this by adding an extra field to the inode, > > dirtied_time_when, so inodes with a stale dirtytime can get detected > > and handled. > > The drawback here is that this adds another 8 bytes to every inode for > a field of marginal value, since this is only important for the rare > case of a file that is being dirtied continuously. Yes. > I wonder if something more lightweight could be added to avoid this > problem? For example, we only care about this case if it has been > going on for more than the lazytime interval (about a day), so the > inode could store a 16-bit i_dirtied_time_when that is approximately > (jiffies >> bits_in_a_half_a_day) and only check time_after() that. > The __u16 could fit into some existing hole (e.g. after i_bytes on my > kernel) and avoid expanding the size of the inode at all. > > The remaining high bits of i_dirtied_time_when would be irrelevant, since > a __u16 of half-days is about 80 years, so it would be enough to compare: > > > time_after(i_dirtied_time_when, (__u16)(jiffies >> bits_in_half_a_day)) > > > A day is 86400s, so 43200s is close to (1 << 22) jiffies for HZ=100, and > (1 << 25) jiffies is about 3/8 of a day for HZ=1000. Since the exact > times for inode writeout don't matter very much here, having only shifts > to convert jiffies to i_dirtied_time_when in the kernel is better I think. Yes, something like this should be possible. But I wanted that to happen as a separate patch once we have everything working correctly. The code is subtle enough that I didn't want Ted to complicate it with further optimizations initially. > Minor issue, is there a good reason why dirtied_time_when doesn't have an > "i_" prefix? I guess it's matching with dirtied_when which doesn't have i_ prefix just because noone added it initially. I don't really care either way. Honza -- Jan Kara SUSE Labs, CR