* Disk spin down
@ 2012-02-12 18:23 Andy Bennett
2012-02-12 20:06 ` Dave Chinner
2012-02-13 8:36 ` Brian Candler
0 siblings, 2 replies; 8+ messages in thread
From: Andy Bennett @ 2012-02-12 18:23 UTC (permalink / raw)
To: xfs
Hi,
Can someone help me work out where writes are coming from on my system?
They don't seem to be originating from a user process and it looks like
[xfsbufd] might be to blame.
I have recently bought a Samsung NC110 netbook with a Hitachi
HTS543232A7A384 320GB SATA disk.
I have installed Debian Squeeze. This comes with Linux 2.6.32-5-amd64
but I upgraded to the Linux-3.2.0-0.bpo.1-amd64 kernel in
squeeze-backports to get the appropriate firmware for the wireless card.
My partitions are mostly XFS, except /boot which is ext4.
I notice that the 3.2 kernel constantly spins the disk up and down at an
interval of about 10 seconds. Under the 2.6 one the interval is much
larger but it still happens.
In order to investigate I used 'iotop' invoked both as 'iotop' and
'iotop -oa'. On a quiescent system this shows periodic overall writes of
a few KB/s but it doesn't show any writes for a particular process. No
reads are ever shown on a quiescent system.
I used blktrace which suggested that sda6 (/var) was to blame. I used
'lsof |grep /var' to see what was running and switched off a bunch of
daemons just to be sure:
-----
sudo /etc/init.d/gdm stop; sudo /etc/init.d/cron stop; sudo
/etc/init.d/dbus stop; sudo /etc/init.d/rsyslog stop; sudo
/etc/init.d/atd stop; sudo /etc/init.d/acpid stop
-----
I used 'tail -f' on the remaining open files in /var to ensure that
nothing was being written.
The disk still spun up and down frequently.
I took some blktraces with 'sudo btrace /dev/sda >
/dev/shm/blktrace.Linux-...':
-----
$ cat blktrace.Linux-2.6.32-5-amd64
  8,0    0    1    0.000000000   644  A   W 28618136 + 8 <- (8,6) 3970456
  8,0    0    2    0.000002934   644  Q   W 28618136 + 8 [flush-8:0]
  8,0    0    3    0.000013899   644  G   W 28618136 + 8 [flush-8:0]
  8,0    0    4    0.000020464   644  P   N [flush-8:0]
  8,0    0    5    0.000024096   644  I   W 28618136 + 8 [flush-8:0]
  8,0    0    0    0.000032197     0  m   N cfq280 insert_request
  8,0    0    0    0.000035899     0  m   N cfq280 add_to_rr
  8,0    0    0    0.000041346     0  m   N cfq280 preempt
  8,0    0    0    0.000044559     0  m   N cfq644 slice expired t=1
  8,0    0    0    0.000047911     0  m   N cfq644 resid=-30761
  8,0    0    0    0.000054616     0  m   N cfq280 set_active
  8,0    0    0    0.000059994     0  m   N cfq280 fifo=(null)
  8,0    0    0    0.000062857     0  m   N cfq280 dispatch_insert
  8,0    0    0    0.000066419     0  m   N cfq280 del_from_rr
  8,0    0    0    0.000070121     0  m   N cfq280 dispatched a request
  8,0    0    0    0.000074521     0  m   N cfq280 activate rq, drv=1
  8,0    0    6    0.000076756   644  D   W 28618136 + 8 [flush-8:0]
  8,0    0    7    0.000139753   644  A  WS 28618208 + 16 <- (8,6) 3970528
  8,0    0    8    0.000142197   644  Q  WS 28618208 + 16 [flush-8:0]
  8,0    0    9    0.000147645   644  G  WS 28618208 + 16 [flush-8:0]
  8,0    0   10    0.000152603   644  P   N [flush-8:0]
  8,0    0   11    0.000155118   644  I   W 28618208 + 16 [flush-8:0]
  8,0    0    0    0.000159588     0  m   N cfq644 insert_request
  8,0    0    0    0.000162521     0  m   N cfq644 add_to_rr
  8,0    0    0    0.000166921     0  m   N cfq644 preempt
  8,0    0    0    0.000169784     0  m   N cfq280 slice expired t=1
  8,0    0    0    0.000174673     0  m   N cfq644 set_active
  8,0    0    0    0.000178375     0  m   N cfq644 fifo=(null)
  8,0    0    0    0.000181099     0  m   N cfq644 dispatch_insert
  8,0    0    0    0.000184311     0  m   N cfq644 del_from_rr
  8,0    0    0    0.000187664     0  m   N cfq644 dispatched a request
  8,0    0    0    0.000191086     0  m   N cfq644 activate rq, drv=2
  8,0    0   12    0.000192972   644  D   W 28618208 + 16 [flush-8:0]
  8,0    0   13    1.854932267     0  C   W 28618136 + 8 [0]
  8,0    0    0    1.854957131     0  m   N cfq280 complete
  8,0    0   14    1.866415850     0  C   W 28618208 + 16 [0]
  8,0    0    0    1.866432402     0  m   N cfq644 complete
  8,0    0    0    1.866436453     0  m   N cfq644 set_slice=25

CPU0 (8,0):
 Reads Queued:          0,        0KiB  Writes Queued:          2,       12KiB
 Read Dispatches:       0,        0KiB  Write Dispatches:       2,       12KiB
 Reads Requeued:        0               Writes Requeued:        0
 Reads Completed:       0,        0KiB  Writes Completed:       2,       12KiB
 Read Merges:           0,        0KiB  Write Merges:           0,        0KiB
 Read depth:            0               Write depth:            2
 IO unplugs:            0               Timer unplugs:          0

Throughput (R/W): 0KiB/s / 6KiB/s
Events (8,0): 38 entries
Skips: 3 forward (29 - 43.3%)
-----
-----
$ cat blktrace.Linux-3.2.0-0.bpo.1-amd64
  8,0    1    1    0.000000000     9  A WBSM 28595407 + 2 <- (8,6) 3947727
  8,0    1    2    0.000003771     9  Q WBSM 28595407 + 2 [kworker/1:0]
  8,0    1    3    0.000014736     9  G WBSM 28595407 + 2 [kworker/1:0]
  8,0    1    4    0.000024165     9  I WBSM 28595407 + 2 [kworker/1:0]
  8,0    3    1    0.000174603     0  D  WSM 28595407 + 2 [swapper/1]
  8,0    1    5    0.000363663     0  C  WSM 28595407 + 2 [0]
  8,0    1    6    0.037216442   936  A   WM 24647680 + 1 <- (8,6) 0
  8,0    1    7    0.037219166   936  Q   WM 24647680 + 1 [xfsbufd/sda6]
  8,0    1    8    0.037230480   936  G   WM 24647680 + 1 [xfsbufd/sda6]
  8,0    1    9    0.037233903   936  P    N [xfsbufd/sda6]
  8,0    1   10    0.037239001   936  I   WM 24647680 + 1 [xfsbufd/sda6]
  8,0    0    1    0.037009992     0  C  WSM 28595407 [0]
  8,0    1    0    0.037248988     0  m   N cfq202A / insert_request
  8,0    1   11    0.037254506   936  U   N [xfsbufd/sda6] 1
  8,0    1    0    0.037261211     0  m   N cfq202A / slice expired t=0
  8,0    1    0    0.037267496     0  m   N / served: vt=6391808 min_vt=6371328
  8,0    1    0    0.037274201     0  m   N cfq202A / sl_used=10 disp=1 charge=10 iops=0 sect=8
  8,0    1    0    0.037280626     0  m   N cfq workload slice:10
  8,0    1    0    0.037285236     0  m   N cfq202A / set_active wl_prio:0 wl_type:0
  8,0    1    0    0.037289287     0  m   N cfq202A / Not idling. st->count:1
  8,0    1    0    0.037294525     0  m   N cfq202A / fifo=(null)
  8,0    1    0    0.037298017     0  m   N cfq202A / dispatch_insert
  8,0    1    0    0.037303953     0  m   N cfq202A / dispatched a request
  8,0    1    0    0.037307515     0  m   N cfq202A / activate rq, drv=1
  8,0    1   12    0.037309401   936  D   WM 24647680 + 1 [xfsbufd/sda6]
  8,0    1   13    0.037518785     0  C   WM 24647680 + 1 [0]
  8,0    1    0    0.037539947     0  m   N cfq202A / complete rqnoidle 0
  8,0    1    0    0.037545115     0  m   N cfq202A / set_slice=10
  8,0    1    0    0.037547629     0  m   N cfq schedule dispatch
  8,0    1   14    0.087965523   936  A   WM 28618208 + 16 <- (8,6) 3970528
  8,0    1   15    0.087968386   936  Q   WM 28618208 + 16 [xfsbufd/sda6]
  8,0    1   16    0.087978443   936  G   WM 28618208 + 16 [xfsbufd/sda6]
  8,0    1   17    0.087981935   936  P    N [xfsbufd/sda6]
  8,0    1   18    0.087987592   936  I   WM 28618208 + 16 [xfsbufd/sda6]
  8,0    1    0    0.087994995     0  m   N cfq202A / insert_request
  8,0    1   19    0.087999255   936  U   N [xfsbufd/sda6] 1
  8,0    1    0    0.088004563     0  m   N cfq202A / slice expired t=0
  8,0    1    0    0.088009871     0  m   N / served: vt=6412288 min_vt=6391808
  8,0    1    0    0.088016017     0  m   N cfq202A / sl_used=10 disp=1 charge=10 iops=0 sect=1
  8,0    1    0    0.088021255     0  m   N cfq workload slice:10
  8,0    1    0    0.088025725     0  m   N cfq202A / set_active wl_prio:0 wl_type:0
  8,0    1    0    0.088029776     0  m   N cfq202A / Not idling. st->count:1
  8,0    1    0    0.088034246     0  m   N cfq202A / fifo=(null)
  8,0    1    0    0.088037528     0  m   N cfq202A / dispatch_insert
  8,0    1    0    0.088042696     0  m   N cfq202A / dispatched a request
  8,0    1    0    0.088046328     0  m   N cfq202A / activate rq, drv=1
  8,0    1   20    0.088048214   936  D   WM 28618208 + 16 [xfsbufd/sda6]
  8,0    1   21    0.088291817     0  C   WM 28618208 + 16 [0]
  8,0    1    0    0.088311442     0  m   N cfq202A / complete rqnoidle 0
  8,0    1    0    0.088316680     0  m   N cfq202A / set_slice=10
  8,0    1    0    0.088319055     0  m   N cfq schedule dispatch

CPU0 (8,0):
 Reads Queued:          0,        0KiB  Writes Queued:          0,        0KiB
 Read Dispatches:       0,        0KiB  Write Dispatches:       0,        0KiB
 Reads Requeued:        0               Writes Requeued:        0
 Reads Completed:       0,        0KiB  Writes Completed:       1,        0KiB
 Read Merges:           0,        0KiB  Write Merges:           0,        0KiB
 Read depth:            0               Write depth:            1
 IO unplugs:            0               Timer unplugs:          0
CPU1 (8,0):
 Reads Queued:          0,        0KiB  Writes Queued:          3,        9KiB
 Read Dispatches:       0,        0KiB  Write Dispatches:       2,        8KiB
 Reads Requeued:        0               Writes Requeued:        0
 Reads Completed:       0,        0KiB  Writes Completed:       3,        9KiB
 Read Merges:           0,        0KiB  Write Merges:           0,        0KiB
 Read depth:            0               Write depth:            1
 IO unplugs:            2               Timer unplugs:          0
CPU3 (8,0):
 Reads Queued:          0,        0KiB  Writes Queued:          0,        0KiB
 Read Dispatches:       0,        0KiB  Write Dispatches:       1,        1KiB
 Reads Requeued:        0               Writes Requeued:        0
 Reads Completed:       0,        0KiB  Writes Completed:       0,        0KiB
 Read Merges:           0,        0KiB  Write Merges:           0,        0KiB
 Read depth:            0               Write depth:            1
 IO unplugs:            0               Timer unplugs:          0

Total (8,0):
 Reads Queued:          0,        0KiB  Writes Queued:          3,        9KiB
 Read Dispatches:       0,        0KiB  Write Dispatches:       3,        9KiB
 Reads Requeued:        0               Writes Requeued:        0
 Reads Completed:       0,        0KiB  Writes Completed:       4,        9KiB
 Read Merges:           0,        0KiB  Write Merges:           0,        0KiB
 IO unplugs:            2               Timer unplugs:          0

Throughput (R/W): 0KiB/s / 102KiB/s
Events (8,0): 51 entries
Skips: 5 forward (71 - 58.2%)
-----
The blktrace from Linux 3.2 seems much more incriminating for XFS than
the Linux 2.6 one does.
Does anyone have any hints about whether this is an XFS bug or
regression or where I can look for more information?
Many thanks for your time.
Regards,
@ndy
--
andyjpb@ashurst.eu.org
http://www.ashurst.eu.org/
0x7EBA75FF
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Disk spin down
2012-02-12 18:23 Disk spin down Andy Bennett
@ 2012-02-12 20:06 ` Dave Chinner
2012-02-12 21:15 ` Andy Bennett
2012-02-13 8:36 ` Brian Candler
1 sibling, 1 reply; 8+ messages in thread
From: Dave Chinner @ 2012-02-12 20:06 UTC (permalink / raw)
To: Andy Bennett; +Cc: xfs
On Sun, Feb 12, 2012 at 06:23:45PM +0000, Andy Bennett wrote:
> Hi,
>
> Can someone help me work out where writes are coming from on my system?
> They don't seem to be originating from a user process and it looks like
> [xfsbufd] might be to blame.
>
> I have recently bought a Samsung NC110 netbook with a Hitachi
> HTS543232A7A384 320GB SATA disk.
>
> I have installed Debian Squeeze. This comes with Linux 2.6.32-5-amd64
> but I upgraded to the Linux-3.2.0-0.bpo.1-amd64 kernel in
> squeeze-backports to get the appropriate firmware for the wireless card.
>
> My partitions are mostly XFS, except /boot which is ext4.
>
> I notice that the 3.2 kernel constantly spins the disk up and down at an
> interval of about 10 seconds. Under the 2.6 one the interval is much
> larger but it still happens.
>
> In order to investigate I used 'iotop' invoked both as 'iotop' and
> 'iotop -oa'. On a quiescent system this shows periodic overall writes of
> a few KB/s but it doesn't show any writes for a particular process. No
> reads are ever shown on a quiescent system.
>
> I used blktrace which suggested that sda6 (/var) was to blame. I used
> 'lsof |grep /var' to see what was running and switched off a bunch of
> daemons just to be sure:
>
> -----
> sudo /etc/init.d/gdm stop; sudo /etc/init.d/cron stop; sudo
> /etc/init.d/dbus stop; sudo /etc/init.d/rsyslog stop; sudo
> /etc/init.d/atd stop; sudo /etc/init.d/acpid stop
> -----
>
> I used 'tail -f' on the remaining open files in /var to ensure that
> nothing was being written.
>
>
> The disk still spun up and down frequently.
>
> I took some blktraces with 'sudo btrace /dev/sda >
> /dev/shm/blktrace.Linux-...':
>
....
> -----
> $ cat blktrace.Linux-3.2.0-0.bpo.1-amd64
> 8,0 1 1 0.000000000 9 A WBSM 28595407 + 2 <- (8,6) 3947727
> 8,0 1 2 0.000003771 9 Q WBSM 28595407 + 2 [kworker/1:0]
> 8,0 1 3 0.000014736 9 G WBSM 28595407 + 2 [kworker/1:0]
> 8,0 1 4 0.000024165 9 I WBSM 28595407 + 2 [kworker/1:0]
> 8,0 3 1 0.000174603 0 D WSM 28595407 + 2 [swapper/1]
> 8,0 1 5 0.000363663 0 C WSM 28595407 + 2 [0]
That's a log write, probably from the periodic xfssync code.
491 STATIC void
492 xfs_sync_worker(
493 struct work_struct *work)
494 {
495 struct xfs_mount *mp = container_of(to_delayed_work(work),
496 struct xfs_mount, m_sync_work);
497 int error;
498
499 if (!(mp->m_flags & XFS_MOUNT_RDONLY)) {
500 /* dgc: errors ignored here */
501 if (mp->m_super->s_frozen == SB_UNFROZEN &&
502 xfs_log_need_covered(mp))
503 error = xfs_fs_log_dummy(mp);
504 else
505 xfs_log_force(mp, 0);
This branch here - an async log force.
506
507 /* start pushing all the metadata that is currently dirty */
508 xfs_ail_push_all(mp->m_ail);
Which will wake the xfsaild, which triggers the xfsbufd to write metadata.
509 }
510
511 /* queue us up again */
512 xfs_syncd_queue_sync(mp);
513 }
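The interval at which xfs_syncd_queue_sync() requeues this worker is tunable on these kernels via the xfssyncd_centisecs sysctl, documented in Documentation/filesystems/xfs.txt (default 3000 centiseconds, i.e. 30 seconds). A sketch, assuming an XFS filesystem is mounted and you have root:

```shell
# Show and (as root) lengthen the xfs_sync_worker interval.
# Values are centiseconds: 3000 = 30 seconds.
cat /proc/sys/fs/xfs/xfssyncd_centisecs
echo 6000 > /proc/sys/fs/xfs/xfssyncd_centisecs   # 60s between sync passes
```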
> 8,0 1 6 0.037216442 936 A WM 24647680 + 1 <- (8,6) 0
> 8,0 1 7 0.037219166 936 Q WM 24647680 + 1 [xfsbufd/sda6]
> 8,0 1 8 0.037230480 936 G WM 24647680 + 1 [xfsbufd/sda6]
> 8,0 1 9 0.037233903 936 P N [xfsbufd/sda6]
> 8,0 1 10 0.037239001 936 I WM 24647680 + 1 [xfsbufd/sda6]
xfsbufd is now writing a single-sector buffer; it plugged the queue.
> 8,0 0 1 0.037009992 0 C WSM 28595407 [0]
Log IO completes.
> 8,0 1 11 0.037254506 936 U N [xfsbufd/sda6] 1
xfsbufd unplugs.
> 8,0 1 12 0.037309401 936 D WM 24647680 + 1 [xfsbufd/sda6]
> 8,0 1 13 0.037518785 0 C WM 24647680 + 1 [0]
IO completes.
50ms passes. That's the xfsaild backoff time when it's walked all
the metadata in the dirty list. Then it starts again:
> 8,0 1 14 0.087965523 936 A WM 28618208 + 16 <- (8,6) 3970528
> 8,0 1 15 0.087968386 936 Q WM 28618208 + 16 [xfsbufd/sda6]
> 8,0 1 16 0.087978443 936 G WM 28618208 + 16 [xfsbufd/sda6]
> 8,0 1 17 0.087981935 936 P N [xfsbufd/sda6]
> 8,0 1 18 0.087987592 936 I WM 28618208 + 16 [xfsbufd/sda6]
> 8,0 1 19 0.087999255 936 U N [xfsbufd/sda6] 1
> 8,0 1 20 0.088048214 936 D WM 28618208 + 16 [xfsbufd/sda6]
> 8,0 1 21 0.088291817 0 C WM 28618208 + 16 [0]
And finds more work to do, which triggers the xfsbufd to flush an
8k buffer - most likely an inode buffer - and goes back to sleep.
Seems to me that something is still dirtying an inode regularly.
Perhaps you need to look at the XFS and writeback event traces to
find out what process is dirtying the inode. trace-cmd is your
friend...
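A minimal sketch of that, assuming trace-cmd is installed, run as root, and debugfs is mounted (the `xfs` and `writeback` names are the kernel's tracepoint event groups):

```shell
# Record all XFS and writeback tracepoints while the machine sits idle
# for 60 seconds, then look at which tasks show up in the report.
trace-cmd record -e xfs -e writeback sleep 60
trace-cmd report | head -n 50
```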
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: Disk spin down
2012-02-12 20:06 ` Dave Chinner
@ 2012-02-12 21:15 ` Andy Bennett
2012-02-12 21:28 ` Andy Bennett
0 siblings, 1 reply; 8+ messages in thread
From: Andy Bennett @ 2012-02-12 21:15 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
Hi,
> Seems to me that something is still dirtying an inode regularly.
>
> Perhaps you need to look at the XFS and writeback event traces to
> find out what process is dirtying the inode. trace-cmd is your
> friend...
Something like this?
-----
echo 1 > /sys/kernel/debug/tracing/events/xfs/enable
echo 0 > /sys/kernel/debug/tracing/events/xfs/enable
more /sys/kernel/debug/tracing/trace
-----
I tried recreating the situation of the last 2 days (clean boot, stopped
services) and it's currently quiescing nicely. :-(
I'll keep an eye on it and try to catch it in the act but every time I
turn the tracing on the HDD light stays firmly off. :-(
Regards,
@ndy
--
andyjpb@ashurst.eu.org
http://www.ashurst.eu.org/
0x7EBA75FF
* Re: Disk spin down
2012-02-12 21:15 ` Andy Bennett
@ 2012-02-12 21:28 ` Andy Bennett
2012-02-12 22:35 ` Dave Chinner
0 siblings, 1 reply; 8+ messages in thread
From: Andy Bennett @ 2012-02-12 21:28 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
Hi,
>> Seems to me that something is still dirtying an inode regularly.
>>
>> Perhaps you need to look at the XFS and writeback event traces to
>> find out what process is dirtying the inode. trace-cmd is your
>> friend...
>
> Something like this?
>
> -----
> echo 1 > /sys/kernel/debug/tracing/events/xfs/enable
>
> echo 0 > /sys/kernel/debug/tracing/events/xfs/enable
>
> more /sys/kernel/debug/tracing/trace
> -----
>
>
> I tried recreating the situation of the last 2 days (clean boot, stopped
> services) and it's currently quiescing nicely. :-(
>
> I'll keep an eye on it and try to catch it in the act but every time I
> turn the tracing on the HDD light stays firmly off. :-(
There is more interesting news already.
I had used 'hdparm -S 120' to set the spindown_timeout to 10 minutes. It
appears that setting persisted through a cold boot. Setting it back to
its previous value of 1 (5 seconds) makes the disk constantly spin up
and down when I suspect it is idle.
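For reference, the -S value is not in seconds; per the hdparm(8) encoding, 0 disables the timer, 1-240 count in 5-second units, and 241-251 count in 30-minute units (other values are vendor-specific). A small sketch converting a -S value to seconds:

```shell
# Convert an `hdparm -S` standby value to seconds.
# Encoding per hdparm(8): 0 = disabled, 1..240 = value*5 seconds,
# 241..251 = (value-240)*30 minutes.
hdparm_s_to_seconds() {
    v=$1
    if [ "$v" -eq 0 ]; then
        echo 0                        # spindown timer disabled
    elif [ "$v" -le 240 ]; then
        echo $((v * 5))               # 5-second units
    elif [ "$v" -le 251 ]; then
        echo $(( (v - 240) * 1800 ))  # 30-minute units
    else
        echo "vendor-specific" >&2
        return 1
    fi
}

hdparm_s_to_seconds 1     # 5   (the previous value here)
hdparm_s_to_seconds 120   # 600 (the 10-minute setting)
```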
I've caught a trace over the course of a few spinup/downs and attached
it (gzipped as it's 208K unpacked).
When the spindown_timeout was set to 10 minutes I managed to run the
trace for a minute without logging anything. When the spindown_timeout
is 5 seconds much more is logged.
Regards,
@ndy
--
andyjpb@ashurst.eu.org
http://www.ashurst.eu.org/
0x7EBA75FF
[-- Attachment #2: trace.gz --]
[-- Type: application/gzip, Size: 14229 bytes --]
* Re: Disk spin down
2012-02-12 21:28 ` Andy Bennett
@ 2012-02-12 22:35 ` Dave Chinner
2012-02-12 22:57 ` Andy Bennett
0 siblings, 1 reply; 8+ messages in thread
From: Dave Chinner @ 2012-02-12 22:35 UTC (permalink / raw)
To: Andy Bennett; +Cc: xfs
On Sun, Feb 12, 2012 at 09:28:32PM +0000, Andy Bennett wrote:
> Hi,
>
> >> Seems to me that something is still dirtying an inode regularly.
> >>
> >> Perhaps you need to look at the XFS and writeback event traces to
> >> find out what process is dirtying the inode. trace-cmd is your
> >> friend...
> >
> > Something like this?
> >
> > -----
> > echo 1 > /sys/kernel/debug/tracing/events/xfs/enable
> >
> > echo 0 > /sys/kernel/debug/tracing/events/xfs/enable
> >
> > more /sys/kernel/debug/tracing/trace
> > -----
> >
> >
> > I tried recreating the situation of the last 2 days (clean boot, stopped
> > services) and it's currently quiescing nicely. :-(
> >
> > I'll keep an eye on it and try to catch it in the act but every time I
> > turn the tracing on the HDD light stays firmly off. :-(
>
> There is more interesting news already.
>
> I had used 'hdparm -S 120' to set the spindown_timeout to 10 minutes. It
> appears that setting persisted through a cold boot. Setting it back to
> its previous value of 1 (5 seconds) makes the disk constantly spin up
> and down when I suspect it is idle.
Well, that's kind of important to know.
It takes XFS a minimum of 90s to idle a filesystem properly after
any modification. Setting a spindown time shorter than this will
cause the disk to spin up and down all the time until the filesystem
idles itself.
What else have you tuned on your system?
> I've caught a trace over the course of a few spinup/downs and attached
> it (gzipped as it's 208K unpacked).
Which you've taken about 90s after boot, so there are probably
still dirty inodes left over from the boot process. Indeed:
flush-8:0-1225 [002] 91.103273: xfs_ilock: dev 8:6 ino 0x80a124 flags ILOCK_EXCL caller xfs_iomap_write_allocate
flush-8:0-1225 [002] 91.103287: xfs_perag_get: dev 8:6 agno 2 refcount 28 caller xfs_bmap_btalloc_nullfb
flush-8:0-1225 [002] 91.103290: xfs_perag_put: dev 8:6 agno 2 refcount 27 caller xfs_bmap_btalloc_nullfb
flush-8:0-1225 [002] 91.103292: xfs_perag_get: dev 8:6 agno 3 refcount 32 caller xfs_bmap_btalloc_nullfb
flush-8:0-1225 [002] 91.103293: xfs_perag_put: dev 8:6 agno 3 refcount 31 caller xfs_bmap_btalloc_nullfb
flush-8:0-1225 [002] 91.103295: xfs_perag_get: dev 8:6 agno 2 refcount 28 caller xfs_alloc_vextent
That's data writeback happening, so filesystem idling is still at
least 90s away from this. So it's no surprise your disk is
spinning up and down here: the 5-second spindown timeout is on the
same order as the interval at which the system is issuing IO....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: Disk spin down
2012-02-12 22:35 ` Dave Chinner
@ 2012-02-12 22:57 ` Andy Bennett
0 siblings, 0 replies; 8+ messages in thread
From: Andy Bennett @ 2012-02-12 22:57 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
Hi,
>>>> Seems to me that something is still dirtying an inode regularly.
>>>>
>>>> Perhaps you need to look at the XFS and writeback event traces to
>>>> find out what process is dirtying the inode. trace-cmd is your
>>>> friend...
>>> Something like this?
>>>
>>> -----
>>> echo 1 > /sys/kernel/debug/tracing/events/xfs/enable
>>>
>>> echo 0 > /sys/kernel/debug/tracing/events/xfs/enable
>>>
>>> more /sys/kernel/debug/tracing/trace
>>> -----
>>>
>>>
>>> I tried recreating the situation of the last 2 days (clean boot, stopped
>>> services) and it's currently quiescing nicely. :-(
>>>
>>> I'll keep an eye on it and try to catch it in the act but every time I
>>> turn the tracing on the HDD light stays firmly off. :-(
>> There is more interesting news already.
>>
>> I had used 'hdparm -S 120' to set the spindown_timeout to 10 minutes. It
>> appears that that was sticking through a cold boot. Setting that back to
>> its previous value of 1 (5 seconds) makes the disk constantly spin up
>> and down when I suspect it is idle.
>
> Well, that's kind of important to know.
>
> It takes XFS a minimum of 90s to idle a filesystem properly after
> any modification. Setting a spindown time shorter than this will
> cause the disk to spin up and down all the time until the filesystem
> idles itself.
>
> What else have you tuned on your system?
This is a new laptop: 5 seconds was the factory default. I increased it
to 10 minutes between my first and second posts in an attempt to
investigate the problem.
Further investigation reveals that I need to switch off APM ('hdparm -B
255') on the disk as well, otherwise it still racks up spinup/down cycles
long after boot, at a rate of 2 or 3 a minute even if the spindown_timeout
is set to 10 minutes.
>> I've caught a trace over the course of a few spinup/downs and attached
>> it (gzipped as it's 208K unpacked).
>
> Which you've taken about 90s after boot, so there are probably
> still dirty inodes left over from the boot process. Indeed:
>
> flush-8:0-1225 [002] 91.103273: xfs_ilock: dev 8:6 ino 0x80a124 flags ILOCK_EXCL caller xfs_iomap_write_allocate
> flush-8:0-1225 [002] 91.103287: xfs_perag_get: dev 8:6 agno 2 refcount 28 caller xfs_bmap_btalloc_nullfb
> flush-8:0-1225 [002] 91.103290: xfs_perag_put: dev 8:6 agno 2 refcount 27 caller xfs_bmap_btalloc_nullfb
> flush-8:0-1225 [002] 91.103292: xfs_perag_get: dev 8:6 agno 3 refcount 32 caller xfs_bmap_btalloc_nullfb
> flush-8:0-1225 [002] 91.103293: xfs_perag_put: dev 8:6 agno 3 refcount 31 caller xfs_bmap_btalloc_nullfb
> flush-8:0-1225 [002] 91.103295: xfs_perag_get: dev 8:6 agno 2 refcount 28 caller xfs_alloc_vextent
>
> That's data writeback happening, so filesystem idling is still at
> least 90s away from this. So, it's no surprise your disk is
> spinning up and down here because there is IO being done every 5-10
> seconds which is in the same order of frequency as the IO the system
> is issuing....
OK. Thanks for pointing out my errors. I'll keep an eye on the situation.
Provided the spindown_timeout is >90s would you expect the disk to idle
properly? Is there something else (other than the spindown_timeout) that
could be encouraging the disk to go to sleep that would be mitigated by
switching off APM?
Many thanks for your time; especially your efforts analysing the logs.
Regards,
@ndy
--
andyjpb@ashurst.eu.org
http://www.ashurst.eu.org/
0x7EBA75FF
* Re: Disk spin down
2012-02-12 18:23 Disk spin down Andy Bennett
2012-02-12 20:06 ` Dave Chinner
@ 2012-02-13 8:36 ` Brian Candler
2012-02-13 23:41 ` Dave Chinner
1 sibling, 1 reply; 8+ messages in thread
From: Brian Candler @ 2012-02-13 8:36 UTC (permalink / raw)
To: Andy Bennett; +Cc: xfs
On Sun, Feb 12, 2012 at 06:23:45PM +0000, Andy Bennett wrote:
> Can someone help me work out where writes are coming from on my system?
Is the filesystem in question mounted with 'relatime' or 'noatime'?
Otherwise, merely reading a file will cause its inode to be dirtied.
(However in modern kernels, 'relatime' might be the default, I'm not sure)
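To illustrate the point, here is a hedged sketch (a hypothetical helper, not part of any tool) that classifies a comma-separated mount-option string by whether a plain read can dirty the inode through an atime update:

```shell
# Given a mount option list such as the 4th field of /proc/mounts,
# report whether reading a file may dirty its inode via an atime update.
# Hypothetical helper for illustration only.
read_may_dirty_inode() {
    case ",$1," in
        *,noatime,*)  echo no ;;      # atime never updated on read
        *,relatime,*) echo rarely ;;  # only when atime < mtime/ctime, or ~daily
        *)            echo yes ;;     # strictatime / old default behaviour
    esac
}

read_may_dirty_inode "rw,noatime,attr2"   # no
read_may_dirty_inode "rw,relatime,attr2"  # rarely
read_may_dirty_inode "rw,attr2"           # yes
```

The live options for /var can be checked with something like `awk '$2 == "/var" {print $4}' /proc/mounts`.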
* Re: Disk spin down
2012-02-13 8:36 ` Brian Candler
@ 2012-02-13 23:41 ` Dave Chinner
0 siblings, 0 replies; 8+ messages in thread
From: Dave Chinner @ 2012-02-13 23:41 UTC (permalink / raw)
To: Brian Candler; +Cc: Andy Bennett, xfs
On Mon, Feb 13, 2012 at 08:36:51AM +0000, Brian Candler wrote:
> On Sun, Feb 12, 2012 at 06:23:45PM +0000, Andy Bennett wrote:
> > Can someone help me work out where writes are coming from on my system?
>
> Is the filesystem in question mounted with 'relatime' or 'noatime'?
> Otherwise, merely reading a file will cause its inode to be dirtied.
>
> (However in modern kernels, 'relatime' might be the default, I'm not sure)
relatime has been the default for quite some time, and even before
relatime was implemented in the VFS, XFS had implemented atime updates
in a manner almost indistinguishable from relatime since about 2007....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
end of thread, other threads:[~2012-02-13 23:41 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-02-12 18:23 Disk spin down Andy Bennett
2012-02-12 20:06 ` Dave Chinner
2012-02-12 21:15 ` Andy Bennett
2012-02-12 21:28 ` Andy Bennett
2012-02-12 22:35 ` Dave Chinner
2012-02-12 22:57 ` Andy Bennett
2012-02-13 8:36 ` Brian Candler
2012-02-13 23:41 ` Dave Chinner
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox