* review: bump up xlog_state_do_callback loop checking
From: Nathan Scott @ 2006-07-24 5:57 UTC (permalink / raw)
To: xfs
Hi,
I started running the QA tests with an external log on a ramdisk &
now constantly see situations where xlog_state_do_callback reports:
Filesystem "sda2": xlog_state_do_callback: looping 10
Filesystem "sda2": xlog_state_do_callback: looping 20
Filesystem "sda2": xlog_state_do_callback: looping 10
Filesystem "sda2": xlog_state_do_callback: looping 10
Filesystem "sda2": xlog_state_do_callback: looping 10
Filesystem "sda2": xlog_state_do_callback: looping 20
on the system console. Tim and I looked into this further, and remembered
a list discussion on the topic from long ago, after others reported this too
(also on ramdisks, interestingly):
http://oss.sgi.com/archives/xfs/2005-02/msg00108.html
http://oss.sgi.com/archives/xfs/2005-02/msg00109.html
So, it seems Glen added this to try to detect infinite loops on systems where
we do log callback processing in interrupt context (i.e. IRIX). It seems
that with ramdisks it's causing spurious warnings, though, due to how quickly
the completion handlers are run (immediately, synchronously). We can quite
easily keep the same infinite loop check but bump up the reporting
threshold to something that won't happen for ramdisks/raid caches ... and
report every several thousand iterations instead of every tenth one. It
does still seem worthwhile to keep the infinite loop detection, so
at this stage I've left that in there.
Tim also insisted I optimise away the modulo operation that we do in the
callback processing loop, while I was fixing this other issue, and he's
pointed out an easy way to do that...
cheers.
--
Nathan
Index: xfs-linux/xfs_log.c
===================================================================
--- xfs-linux.orig/xfs_log.c 2006-07-20 12:06:56.455633750 +1000
+++ xfs-linux/xfs_log.c 2006-07-20 12:17:19.819492000 +1000
@@ -2243,9 +2243,13 @@ xlog_state_do_callback(
iclog = iclog->ic_next;
} while (first_iclog != iclog);
- if (repeats && (repeats % 10) == 0) {
+
+ if (repeats > 5000) {
+ flushcnt += repeats;
+ repeats = 0;
xfs_fs_cmn_err(CE_WARN, log->l_mp,
- "xlog_state_do_callback: looping %d", repeats);
+ "%s: possible infinite loop (%d iterations)",
+ __FUNCTION__, flushcnt);
}
} while (!ioerrors && loopdidcallbacks);
@@ -2277,6 +2281,7 @@ xlog_state_do_callback(
}
#endif
+ flushcnt = 0;
if (log->l_iclog->ic_state & (XLOG_STATE_ACTIVE|XLOG_STATE_IOERROR)) {
flushcnt = log->l_flushcnt;
log->l_flushcnt = 0;
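
The counter change in the patch above can be illustrated outside the kernel. What follows is a minimal user-space sketch, not the XFS code: the function names warnings_modulo and warnings_threshold, their parameters, and the WARN_THRESHOLD macro are made up for illustration. Only the threshold value of 5000 and the pattern of accumulating into a running total and resetting the counter are taken from the patch. It counts how many warnings each scheme would emit for a given number of passes through the outer loop.

```c
#include <assert.h>
#include <stddef.h>

#define WARN_THRESHOLD 5000	/* value used in the patch */

/*
 * Old scheme (hypothetical stand-alone model): bump the counter on
 * every pass and warn whenever it is a nonzero multiple of 10.
 * Costs one modulo operation per pass.
 */
int warnings_modulo(int passes)
{
	int repeats = 0, warnings = 0, i;

	for (i = 0; i < passes; i++) {
		repeats++;
		if (repeats && (repeats % 10) == 0)
			warnings++;
	}
	return warnings;
}

/*
 * New scheme (hypothetical stand-alone model): compare against a
 * threshold, and on a hit fold the counter into a running total and
 * reset it. No modulo is needed; the total is what gets reported.
 * The final running total is stored in *last_total.
 */
int warnings_threshold(int passes, int *last_total)
{
	int repeats = 0, total = 0, warnings = 0, i;

	assert(last_total != NULL);

	for (i = 0; i < passes; i++) {
		repeats++;
		if (repeats > WARN_THRESHOLD) {
			total += repeats;	/* remember passes seen so far */
			repeats = 0;		/* restart the count */
			warnings++;
		}
	}
	*last_total = total;
	return warnings;
}
```

In this model, a burst of 25 passes, comparable to the ramdisk reports above, triggers the old check twice (at 10 and 20) and the new check not at all. A loop that really is stuck still gets reported, first after 5001 passes and then once per further 5001 passes.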
* Re: review: bump up xlog_state_do_callback loop checking
From: Christoph Hellwig @ 2006-07-25 9:42 UTC (permalink / raw)
To: Nathan Scott; +Cc: xfs
On Mon, Jul 24, 2006 at 03:57:30PM +1000, Nathan Scott wrote:
> Hi,
>
> I started running the QA tests with an external log on a ramdisk &
> now constantly see situations where xlog_state_do_callback reports:
>
> Filesystem "sda2": xlog_state_do_callback: looping 10
> Filesystem "sda2": xlog_state_do_callback: looping 20
> Filesystem "sda2": xlog_state_do_callback: looping 10
> Filesystem "sda2": xlog_state_do_callback: looping 10
> Filesystem "sda2": xlog_state_do_callback: looping 10
> Filesystem "sda2": xlog_state_do_callback: looping 20
>
> on the system console. Tim and I looked into this further, and remembered
> a list discussion on the topic from long ago, after others reported this too
> (also on ramdisks, interestingly):
> http://oss.sgi.com/archives/xfs/2005-02/msg00108.html
> http://oss.sgi.com/archives/xfs/2005-02/msg00109.html
>
> So, it seems Glen added this to try to detect infinite loops on systems where
> we do log callback processing in interrupt context (i.e. IRIX). It seems
> that with ramdisks it's causing spurious warnings, though, due to how quickly
> the completion handlers are run (immediately, synchronously). We can quite
> easily keep the same infinite loop check but bump up the reporting
> threshold to something that won't happen for ramdisks/raid caches ... and
> report every several thousand iterations instead of every tenth one. It
> does still seem worthwhile to keep the infinite loop detection, so
> at this stage I've left that in there.
>
> Tim also insisted I optimise away the modulo operation that we do in the
> callback processing loop, while I was fixing this other issue, and he's
> pointed out an easy way to do that...
ok
> Index: xfs-linux/xfs_log.c
> ===================================================================
> --- xfs-linux.orig/xfs_log.c 2006-07-20 12:06:56.455633750 +1000
> +++ xfs-linux/xfs_log.c 2006-07-20 12:17:19.819492000 +1000
> @@ -2243,9 +2243,13 @@ xlog_state_do_callback(
>
> iclog = iclog->ic_next;
> } while (first_iclog != iclog);
> - if (repeats && (repeats % 10) == 0) {
> +
> + if (repeats > 5000) {
> + flushcnt += repeats;
> + repeats = 0;
> xfs_fs_cmn_err(CE_WARN, log->l_mp,
> - "xlog_state_do_callback: looping %d", repeats);
> + "%s: possible infinite loop (%d iterations)",
> + __FUNCTION__, flushcnt);
> }
> } while (!ioerrors && loopdidcallbacks);
>
> @@ -2277,6 +2281,7 @@ xlog_state_do_callback(
> }
> #endif
>
> + flushcnt = 0;
> if (log->l_iclog->ic_state & (XLOG_STATE_ACTIVE|XLOG_STATE_IOERROR)) {
> flushcnt = log->l_flushcnt;
> log->l_flushcnt = 0;
>
>
---end quoted text---