[PATCH] e2fsck: Don't skip time based checks if broken_system

linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH] e2fsck: Don't skip time based checks if broken_system_clock=1 unnecessarily
@ 2014-03-28  3:15 Jan Kara
  2014-04-21  0:39 ` Theodore Ts'o
  0 siblings, 1 reply; 3+ messages in thread
From: Jan Kara @ 2014-03-28  3:15 UTC (permalink / raw)
  To: Ted Tso; +Cc: linux-ext4, Jan Kara

When option broken_system_clock is set, we unconditionally skip checks
whether e2fsck should check the fs because the fs was last checked too
long ago. Distributions however set broken_system_clock by default as
some of the systems simply have the clocks broken and e2fsck stops the
boot because of that which is harsh given how minor issue that is (at
least for the fs). Thus checking every X days doesn't work with these
distributions.

Change slightly how broken_system_clock works to make it more useful. We
first check in the superblock whether the current time and times stored
in the superblock look sensible. If yes, we also do honor check
interval. If the current time and times in superblock don't correspond
properly, we skip the check interval test.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 e2fsck/super.c | 44 ++++++++++++++++++++++++++------------------
 e2fsck/unix.c  | 11 ++++++-----
 2 files changed, 32 insertions(+), 23 deletions(-)

diff --git a/e2fsck/super.c b/e2fsck/super.c
index e9892e2db6b3..f2bdf98f26dc 100644
--- a/e2fsck/super.c
+++ b/e2fsck/super.c
@@ -840,28 +840,36 @@ void check_super_block(e2fsck_t ctx)
 	 * Check to see if the superblock last mount time or last
 	 * write time is in the future.
 	 */
-	if (!broken_system_clock &&
-	    !(ctx->flags & E2F_FLAG_TIME_INSANE) &&
+	if (!(ctx->flags & E2F_FLAG_TIME_INSANE) &&
 	    fs->super->s_mtime > (__u32) ctx->now) {
-		pctx.num = fs->super->s_mtime;
-		problem = PR_0_FUTURE_SB_LAST_MOUNT;
-		if (fs->super->s_mtime <= (__u32) ctx->now + ctx->time_fudge)
-			problem = PR_0_FUTURE_SB_LAST_MOUNT_FUDGED;
-		if (fix_problem(ctx, problem, &pctx)) {
-			fs->super->s_mtime = ctx->now;
-			fs->flags |= EXT2_FLAG_DIRTY;
+		if (broken_system_clock)
+			ctx->flags |= E2F_FLAG_TIME_INSANE;
+		else {
+			pctx.num = fs->super->s_mtime;
+			problem = PR_0_FUTURE_SB_LAST_MOUNT;
+			if (fs->super->s_mtime <=
+			    (__u32) ctx->now + ctx->time_fudge)
+				problem = PR_0_FUTURE_SB_LAST_MOUNT_FUDGED;
+			if (fix_problem(ctx, problem, &pctx)) {
+				fs->super->s_mtime = ctx->now;
+				fs->flags |= EXT2_FLAG_DIRTY;
+			}
 		}
 	}
-	if (!broken_system_clock &&
-	    !(ctx->flags & E2F_FLAG_TIME_INSANE) &&
+	if (!(ctx->flags & E2F_FLAG_TIME_INSANE) &&
 	    fs->super->s_wtime > (__u32) ctx->now) {
-		pctx.num = fs->super->s_wtime;
-		problem = PR_0_FUTURE_SB_LAST_WRITE;
-		if (fs->super->s_wtime <= (__u32) ctx->now + ctx->time_fudge)
-			problem = PR_0_FUTURE_SB_LAST_WRITE_FUDGED;
-		if (fix_problem(ctx, problem, &pctx)) {
-			fs->super->s_wtime = ctx->now;
-			fs->flags |= EXT2_FLAG_DIRTY;
+		if (broken_system_clock)
+			ctx->flags |= E2F_FLAG_TIME_INSANE;
+		else {
+			pctx.num = fs->super->s_wtime;
+			problem = PR_0_FUTURE_SB_LAST_WRITE;
+			if (fs->super->s_wtime <=
+			    (__u32) ctx->now + ctx->time_fudge)
+				problem = PR_0_FUTURE_SB_LAST_WRITE_FUDGED;
+			if (fix_problem(ctx, problem, &pctx)) {
+				fs->super->s_wtime = ctx->now;
+				fs->flags |= EXT2_FLAG_DIRTY;
+			}
 		}
 	}
 
diff --git a/e2fsck/unix.c b/e2fsck/unix.c
index b39383d7ef70..9caa17c02f9c 100644
--- a/e2fsck/unix.c
+++ b/e2fsck/unix.c
@@ -371,13 +371,13 @@ static void check_if_skip(e2fsck_t ctx)
 		if (batt && (fs->super->s_mnt_count <
 			     (unsigned) fs->super->s_max_mnt_count*2))
 			reason = 0;
-	} else if (!broken_system_clock && fs->super->s_checkinterval &&
-		   (ctx->now < lastcheck)) {
+	} else if (!(ctx->flags & E2F_FLAG_TIME_INSANE) &&
+		   fs->super->s_checkinterval && (ctx->now < lastcheck)) {
 		reason = _(" has filesystem last checked time in the future");
 		if (batt)
 			reason = 0;
-	} else if (!broken_system_clock && fs->super->s_checkinterval &&
-		   ((ctx->now - lastcheck) >=
+	} else if (!(ctx->flags & E2F_FLAG_TIME_INSANE) &&
+		   fs->super->s_checkinterval && ((ctx->now - lastcheck) >=
 		    ((time_t) fs->super->s_checkinterval))) {
 		reason = _(" has gone %u days without being checked");
 		reason_arg = (ctx->now - fs->super->s_lastcheck)/(3600*24);
@@ -434,7 +434,8 @@ static void check_if_skip(e2fsck_t ctx)
 		if (next_check <= 0)
 			next_check = 1;
 	}
-	if (!broken_system_clock && fs->super->s_checkinterval &&
+	if (!(ctx->flags & E2F_FLAG_TIME_INSANE) &&
+	    fs->super->s_checkinterval &&
 	    ((ctx->now - fs->super->s_lastcheck) >= fs->super->s_checkinterval))
 		next_check = 1;
 	if (next_check <= 5) {
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] e2fsck: Don't skip time based checks if broken_system_clock=1 unnecessarily
  2014-03-28  3:15 [PATCH] e2fsck: Don't skip time based checks if broken_system_clock=1 unnecessarily Jan Kara
@ 2014-04-21  0:39 ` Theodore Ts'o
  2014-04-22 15:47   ` Jan Kara
  0 siblings, 1 reply; 3+ messages in thread
From: Theodore Ts'o @ 2014-04-21  0:39 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-ext4

On Fri, Mar 28, 2014 at 04:15:31AM +0100, Jan Kara wrote:
> When option broken_system_clock is set, we unconditionally skip checks
> whether e2fsck should check the fs because the fs was last checked too
> long ago. Distributions however set broken_system_clock by default as
> some of the systems simply have the clocks broken and e2fsck stops the
> boot because of that which is harsh given how minor issue that is (at
> least for the fs). Thus checking every X days doesn't work with these
> distributions.

There are a couple of different ways in which the time can be
considered "broken":
>
1) The system has no battery-backed block and so the time is always
January 1, 1970 on boot.  These systems are taken care of by the
TIME_INSANE checks.

2) The distribution the timezone is incorrect when e2fsck is run, so
depending on whether you are east or west of GMT, the clock could
appear to be up to 24 hours in the future, which would falsely trigger
the e2fsck check.  That is now taken care of by the accept_time_fudge
hack, which is enabled by default.

3) The clock is totally insane, which means that it could be any
value; it could have a time in 2037; it could have a time in the
1970's.  This is what broken_system_clock means now.

If the system clock is completely untrustworthy, you really do want to
skip all time based checks, because there is no way to tell whether
the times look "sensible" or not.  If the clock warps forward by N
years, it's still going to look sensible, and it can force a time
based check when one shouldn't be needed.

I think the issue is that there may be some distributions that set
broken_system_lock way back in the past, before we added the
accept_time_fudge (which was added in 2009).  So if the concern is
dealing with people who have systems in Western Europe whose RTC ticks
localtime (as opposed to UTC), that problem was fixed by
accept_time_fudge.  And if the concern is systems in state #1, that
was dealt with by the TIME_INSANE checks in 2010.

So if those are the main concerns, you can simply get rid of setting
broken_system_clock, since those problems are now solved by default.

But if your worry is RTC clocks which are _totally_ bonkers, then we
really have to do what we are currently doing.  Besides, I thought
these days, most folks aren't enabling time based checks any more.
IMHO, the real right answer is that distributions should be setting
folks to use LVM by default, and to do checks using snapshots every
month or so run out of cron.

						 - Ted

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] e2fsck: Don't skip time based checks if broken_system_clock=1 unnecessarily
  2014-04-21  0:39 ` Theodore Ts'o
@ 2014-04-22 15:47   ` Jan Kara
  0 siblings, 0 replies; 3+ messages in thread
From: Jan Kara @ 2014-04-22 15:47 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Jan Kara, linux-ext4

On Sun 20-04-14 20:39:56, Ted Tso wrote:
> On Fri, Mar 28, 2014 at 04:15:31AM +0100, Jan Kara wrote:
> > When option broken_system_clock is set, we unconditionally skip checks
> > whether e2fsck should check the fs because the fs was last checked too
> > long ago. Distributions however set broken_system_clock by default as
> > some of the systems simply have the clocks broken and e2fsck stops the
> > boot because of that which is harsh given how minor issue that is (at
> > least for the fs). Thus checking every X days doesn't work with these
> > distributions.
> 
> There are a couple of different ways in which the time can be
> considered "broken":
> >
> 1) The system has no battery-backed block and so the time is always
> January 1, 1970 on boot.  These systems are taken care of by the
> TIME_INSANE checks.
> 
> 2) The distribution the timezone is incorrect when e2fsck is run, so
> depending on whether you are east or west of GMT, the clock could
> appear to be up to 24 hours in the future, which would falsely trigger
> the e2fsck check.  That is now taken care of by the accept_time_fudge
> hack, which is enabled by default.
> 
> 3) The clock is totally insane, which means that it could be any
> value; it could have a time in 2037; it could have a time in the
> 1970's.  This is what broken_system_clock means now.
> 
> If the system clock is completely untrustworthy, you really do want to
> skip all time based checks, because there is no way to tell whether
> the times look "sensible" or not.  If the clock warps forward by N
> years, it's still going to look sensible, and it can force a time
> based check when one shouldn't be needed.
>
> I think the issue is that there may be some distributions that set
> broken_system_lock way back in the past, before we added the
> accept_time_fudge (which was added in 2009).  So if the concern is
> dealing with people who have systems in Western Europe whose RTC ticks
> localtime (as opposed to UTC), that problem was fixed by
> accept_time_fudge.  And if the concern is systems in state #1, that
> was dealt with by the TIME_INSANE checks in 2010.
> 
> So if those are the main concerns, you can simply get rid of setting
> broken_system_clock, since those problems are now solved by default.
  So looking into our bugzilla we added broken_system_clock=1 to
e2fsck.conf because of a report of a machine where the time went few days
backward after a reboot (the most likely explanation to me seems that the
system time wasn't getting properly written to the hw clock for some
reason).

Anyway, thinking about this again maybe my point should have been: Isn't it
too harsh to declare mount time in future as requiring manual intervention?
Why don't we just reset it (or just leave it alone) in preen mode without
requiring user interaction? I understand it may indicate some corruption
problem but we don't really care about corruption in this particular field
much...

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2014-04-22 15:47 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-03-28  3:15 [PATCH] e2fsck: Don't skip time based checks if broken_system_clock=1 unnecessarily Jan Kara
2014-04-21  0:39 ` Theodore Ts'o
2014-04-22 15:47   ` Jan Kara

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).