Message-ID: <558BF083.5060205@gmail.com>
Date: Thu, 25 Jun 2015 14:13:55 +0200
From: Karsten Blees
To: John Stultz
CC: lkml, Thomas Gleixner
Subject: [PATCH v2] time.c::timespec_trunc: fix nanosecond file time rounding
References: <5577240D.7020309@gmail.com> <5580A586.7060202@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

timespec_trunc() avoids rounding if granularity <= nanoseconds-per-jiffy
(or TICK_NSEC). This optimization assumes that:

 1. current_kernel_time().tv_nsec is already rounded to TICK_NSEC (i.e.
    with HZ=1000 you'd get 1000000, 2000000, 3000000... but never
    1000001). This is no longer true (probably since hrtimers were
    introduced in 2.6.16).

 2. TICK_NSEC is evenly divisible by all possible granularities. This may
    be true for HZ=100, 250, 1000, but obviously not for HZ=300 /
    TICK_NSEC=3333333 (introduced in 2.6.20).

Thus, sub-second portions of in-core file times are not rounded to on-disk
granularity. I.e. file times may change when the inode is re-read from disk
or when the file system is remounted.

This affects all file systems with file time granularities > 1 ns and < 1 s,
e.g. CEPH (1000 ns), UDF (1000 ns), CIFS (100 ns), NTFS (100 ns) and FUSE
(configurable from user mode via struct fuse_init_out.time_gran).

Steps to reproduce with e.g. UDF:

$ dd if=/dev/zero of=udfdisk count=10000 && mkudffs udfdisk
$ mkdir udf && mount udfdisk udf
$ touch udf/test && stat -c %y udf/test
2015-06-09 10:22:56.130006767 +0200
$ umount udf && mount udfdisk udf
$ stat -c %y udf/test
2015-06-09 10:22:56.130006000 +0200

Remounting truncates the mtime to 1 µs.

Fix the rounding in timespec_trunc() and update the documentation.

timespec_trunc() is exclusively used to calculate inode's [acm]time (mostly
via current_fs_time()), and always with super_block.s_time_gran as second
argument. So this can safely be changed without side effects.

Note: This does _not_ fix the issue for FAT's 2 second mtime resolution,
as super_block.s_time_gran isn't prepared to handle different ctime /
mtime / atime resolutions nor resolutions > 1 second.

Signed-off-by: Karsten Blees
---

On 17.06.2015 at 01:08, John Stultz wrote:
>
> Logically its ok. I might suggest cleaning it up as:
>
> if ((gran < 1) || (gran > NSEC_PER_SEC))
>         WARN_ON(1); /* catch invalid granularity values */
> else if (gran == NSEC_PER_SEC)
>         t.tv_nsec = 0; /* special case to avoid div */
> else if ((gran > 1) && ( gran < NSEC_PER_SEC))
>         t.tv_nsec -= t.tv_nsec % gran;
> return t;
>
> Also it would be good to make it clear in the function comment that
> gran > NSEC_PER_SEC are invalid.
>
> thanks
> -john
>

I chose to stick to testing the most common cases first (1 ns and 1 s). I
don't think GCC would be smart enough to reorder the comparisons based on
'unlikely()' in the WARN macro...

Thanks,
Karsten

 kernel/time/time.c | 22 ++++++++--------------
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/kernel/time/time.c b/kernel/time/time.c
index 972e3bb..5733922 100644
--- a/kernel/time/time.c
+++ b/kernel/time/time.c
@@ -287,26 +287,20 @@ EXPORT_SYMBOL(jiffies_to_usecs);
  * @t: Timespec
  * @gran: Granularity in ns.
  *
- * Truncate a timespec to a granularity. gran must be smaller than a second.
- * Always rounds down.
- *
- * This function should be only used for timestamps returned by
- * current_kernel_time() or CURRENT_TIME, not with do_gettimeofday() because
- * it doesn't handle the better resolution of the latter.
+ * Truncate a timespec to a granularity. Always rounds down. gran must
+ * not be 0 nor greater than a second (NSEC_PER_SEC, or 10^9 ns).
  */
 struct timespec timespec_trunc(struct timespec t, unsigned gran)
 {
-	/*
-	 * Division is pretty slow so avoid it for common cases.
-	 * Currently current_kernel_time() never returns better than
-	 * jiffies resolution. Exploit that.
-	 */
-	if (gran <= jiffies_to_usecs(1) * 1000) {
+	/* Avoid division in the common cases 1 ns and 1 s. */
+	if (gran == 1) {
 		/* nothing */
-	} else if (gran == 1000000000) {
+	} else if (gran == NSEC_PER_SEC) {
 		t.tv_nsec = 0;
-	} else {
+	} else if (gran > 1 && gran < NSEC_PER_SEC) {
 		t.tv_nsec -= t.tv_nsec % gran;
+	} else {
+		WARN(1, "illegal file time granularity: %u", gran);
 	}
 	return t;
 }
-- 
2.0.0.791.g124e248
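
For readers who want to try the arithmetic outside the kernel, here is a
rough userspace sketch of the old and new checks. It is not kernel code and
not part of the patch: struct ts_model, trunc_old(), trunc_new() and the
tick_nsec parameter are hypothetical stand-ins (tick_nsec models
jiffies_to_usecs(1) * 1000, e.g. 1000000 for HZ=1000), and WARN() is
replaced by a plain fprintf().

```c
#include <stdio.h>

#define NSEC_PER_SEC 1000000000U

/* Hypothetical stand-in for struct timespec, to stay self-contained. */
struct ts_model {
	long tv_sec;
	long tv_nsec;
};

/*
 * Model of the old behaviour: rounding is skipped whenever gran is not
 * larger than one tick, on the assumption that tv_nsec is already a
 * multiple of the tick length.
 */
static struct ts_model trunc_old(struct ts_model t, unsigned gran,
				 unsigned tick_nsec)
{
	if (gran <= tick_nsec) {
		/* nothing */
	} else if (gran == 1000000000) {
		t.tv_nsec = 0;
	} else {
		t.tv_nsec -= t.tv_nsec % (long)gran;
	}
	return t;
}

/* Model of the patched behaviour: only 1 ns and 1 s skip the division. */
static struct ts_model trunc_new(struct ts_model t, unsigned gran)
{
	if (gran == 1) {
		/* nothing */
	} else if (gran == NSEC_PER_SEC) {
		t.tv_nsec = 0;
	} else if (gran > 1 && gran < NSEC_PER_SEC) {
		t.tv_nsec -= t.tv_nsec % (long)gran;
	} else {
		/* userspace substitute for WARN(); t is returned unchanged */
		fprintf(stderr, "illegal file time granularity: %u\n", gran);
	}
	return t;
}
```

With tv_nsec = 130006767 from the UDF example and gran = 1000, trunc_old()
with tick_nsec = 1000000 leaves the value untouched, while trunc_new()
yields 130006000, which is the value seen after the remount.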