* High latencies writing to a memory mapped file
@ 2010-07-22 14:47 Shawn Bohrer
2010-07-22 23:28 ` Dave Chinner
2010-07-27 9:24 ` Matthias Schniedermeyer
0 siblings, 2 replies; 13+ messages in thread
From: Shawn Bohrer @ 2010-07-22 14:47 UTC (permalink / raw)
To: xfs
Hello,
We have an application that receives data over the network, and then
writes data out to a shared memory mapped file on xfs. Previously
everything worked great when we were running a CentOS 5.3 kernel
(2.6.18-128.7.1.el5), but when we upgraded to the newer CentOS 5.4 or
5.5 kernels we started to see occasional spikes (up to 1 second) in
the time it takes to write the data to the memory mapped file. I
noticed that between the CentOS 5.3 and 5.4 kernels the upstream xfs
changes were backported. Testing on 2.6.32.16 shows the same
occasional latency spikes.
After doing some debugging on 2.6.32.16 I have found the cause of the
latency spikes. The spikes occur every time pdflush runs to flush the
dirty pages. Note our app only writes data at about 1.5 MB/s and we
are well below both the dirty_background_ratio and the dirty_ratio so
these flushes are only of pages that are dirty_expire_centisecs old.
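(For reference, these writeback tunables live under /proc/sys/vm and can be inspected directly. A quick sketch using the standard paths - the values printed are whatever the running system is configured with, not numbers from this report.)

```shell
# Inspect the writeback tunables that control when pdflush runs
cat /proc/sys/vm/dirty_background_ratio    # % of memory dirty before background writeback starts
cat /proc/sys/vm/dirty_ratio               # % of memory dirty before writers are throttled
cat /proc/sys/vm/dirty_expire_centisecs    # age (in 1/100 s) at which dirty pages are flushed
cat /proc/sys/vm/dirty_writeback_centisecs # how often the periodic flusher thread wakes up
```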
Capturing one of the spikes with ftrace shows:
...
flush-8:0-601 [011] 1291592.127590: funcgraph_entry: | xfs_ilock() {
flush-8:0-601 [011] 1291592.127590: funcgraph_entry: | down_write() {
flush-8:0-601 [011] 1291592.127590: funcgraph_entry: 0.213 us | _cond_resched();
flush-8:0-601 [011] 1291592.127591: funcgraph_exit: 0.657 us | }
flush-8:0-601 [011] 1291592.127591: funcgraph_exit: 1.174 us | }
flush-8:0-601 [011] 1291592.127764: sched_switch: task flush-8:0:601 [120] (D) ==> swapper:0 [120]
...
Here you can see flush-8:0 acquires the xfs_ilock() then I believe it
blocks waiting for the I/O to complete. Next you can see my
file_writer process generates a page fault while writing new data to
the memory mapped file. The page fault tries to acquire the
xfs_ilock() and then blocks because it is held by the flush-8:0
process.
...
file_writer-10549 [001] 1291593.100416: print: 00000000 write_delay start
file_writer-10549 [001] 1291593.100419: funcgraph_entry: | handle_mm_fault() {
file_writer-10549 [001] 1291593.100420: funcgraph_entry: | __do_fault() {
file_writer-10549 [001] 1291593.100420: funcgraph_entry: | filemap_fault() {
file_writer-10549 [001] 1291593.100421: funcgraph_entry: 0.775 us | find_get_page();
file_writer-10549 [001] 1291593.100422: funcgraph_entry: 0.191 us | _cond_resched();
file_writer-10549 [001] 1291593.100422: funcgraph_exit: 1.980 us | }
file_writer-10549 [001] 1291593.100422: funcgraph_entry: | unlock_page() {
file_writer-10549 [001] 1291593.100423: funcgraph_entry: 0.154 us | page_waitqueue();
file_writer-10549 [001] 1291593.100423: funcgraph_entry: 0.313 us | __wake_up_bit();
file_writer-10549 [001] 1291593.100423: funcgraph_exit: 1.080 us | }
file_writer-10549 [001] 1291593.100424: funcgraph_entry: | xfs_vm_page_mkwrite() {
file_writer-10549 [001] 1291593.100424: funcgraph_entry: | block_page_mkwrite() {
file_writer-10549 [001] 1291593.100424: funcgraph_entry: 0.240 us | _cond_resched();
file_writer-10549 [001] 1291593.100425: funcgraph_entry: | block_prepare_write() {
file_writer-10549 [001] 1291593.100425: funcgraph_entry: | __block_prepare_write() {
file_writer-10549 [001] 1291593.100426: funcgraph_entry: | create_empty_buffers() {
file_writer-10549 [001] 1291593.100426: funcgraph_entry: | alloc_page_buffers() {
file_writer-10549 [001] 1291593.100426: funcgraph_entry: | alloc_buffer_head() {
file_writer-10549 [001] 1291593.100426: funcgraph_entry: | kmem_cache_alloc() {
file_writer-10549 [001] 1291593.100427: funcgraph_entry: 0.491 us | _cond_resched();
file_writer-10549 [001] 1291593.100427: funcgraph_exit: 1.043 us | }
file_writer-10549 [001] 1291593.100428: funcgraph_entry: 0.185 us | recalc_bh_state();
file_writer-10549 [001] 1291593.100428: funcgraph_exit: 2.003 us | }
file_writer-10549 [001] 1291593.100428: funcgraph_entry: 0.137 us | set_bh_page();
file_writer-10549 [001] 1291593.100428: funcgraph_exit: 2.701 us | }
file_writer-10549 [001] 1291593.100429: funcgraph_entry: 0.271 us | _spin_lock();
file_writer-10549 [001] 1291593.100429: funcgraph_exit: 3.751 us | }
file_writer-10549 [001] 1291593.100429: funcgraph_entry: | xfs_get_blocks() {
file_writer-10549 [001] 1291593.100430: funcgraph_entry: | __xfs_get_blocks() {
file_writer-10549 [001] 1291593.100430: funcgraph_entry: | xfs_iomap() {
file_writer-10549 [001] 1291593.100430: funcgraph_entry: | xfs_ilock() {
file_writer-10549 [001] 1291593.100430: funcgraph_entry: | down_write() {
file_writer-10549 [001] 1291593.100431: funcgraph_entry: 0.241 us | _cond_resched();
file_writer-10549 [001] 1291593.100431: funcgraph_entry: 0.258 us | _spin_lock_irq();
file_writer-10549 [001] 1291593.100432: funcgraph_entry: | schedule() {
file_writer-10549 [001] 1291593.100432: funcgraph_entry: 0.236 us | rcu_sched_qs();
file_writer-10549 [001] 1291593.100432: funcgraph_entry: 0.152 us | _spin_lock_irq();
file_writer-10549 [001] 1291593.100433: funcgraph_entry: | deactivate_task() {
file_writer-10549 [001] 1291593.100433: funcgraph_entry: | dequeue_task() {
file_writer-10549 [001] 1291593.100433: funcgraph_entry: | dequeue_task_fair() {
file_writer-10549 [001] 1291593.100433: funcgraph_entry: | dequeue_entity() {
file_writer-10549 [001] 1291593.100434: funcgraph_entry: | update_curr() {
file_writer-10549 [001] 1291593.100434: funcgraph_entry: 0.351 us | task_of();
file_writer-10549 [001] 1291593.100434: funcgraph_exit: 0.862 us | }
file_writer-10549 [001] 1291593.100435: funcgraph_entry: 0.139 us | task_of();
file_writer-10549 [001] 1291593.100435: funcgraph_exit: 1.462 us | }
file_writer-10549 [001] 1291593.100435: funcgraph_entry: 0.159 us | hrtick_start_fair();
file_writer-10549 [001] 1291593.100435: funcgraph_exit: 2.188 us | }
file_writer-10549 [001] 1291593.100435: funcgraph_exit: 2.627 us | }
file_writer-10549 [001] 1291593.100436: funcgraph_exit: 2.934 us | }
file_writer-10549 [001] 1291593.100436: funcgraph_entry: 0.138 us | update_shares_locked();
file_writer-10549 [001] 1291593.100436: funcgraph_entry: | find_busiest_group() {
file_writer-10549 [001] 1291593.100436: funcgraph_entry: 0.144 us | default_scale_smt_power();
file_writer-10549 [001] 1291593.100437: funcgraph_entry: | scale_rt_power() {
file_writer-10549 [001] 1291593.100437: funcgraph_entry: 0.140 us | sched_avg_update();
file_writer-10549 [001] 1291593.100437: funcgraph_exit: 0.413 us | }
file_writer-10549 [001] 1291593.100437: funcgraph_exit: 1.290 us | }
file_writer-10549 [001] 1291593.100438: funcgraph_entry: 0.137 us | msecs_to_jiffies();
file_writer-10549 [001] 1291593.100438: funcgraph_entry: 0.140 us | update_shares_locked();
file_writer-10549 [001] 1291593.100438: funcgraph_entry: 0.502 us | find_busiest_group();
file_writer-10549 [001] 1291593.100439: funcgraph_entry: 0.140 us | msecs_to_jiffies();
file_writer-10549 [001] 1291593.100439: funcgraph_entry: 0.141 us | update_shares_locked();
file_writer-10549 [001] 1291593.100439: funcgraph_entry: 0.706 us | find_busiest_group();
file_writer-10549 [001] 1291593.100440: funcgraph_entry: 0.598 us | find_busiest_queue();
file_writer-10549 [001] 1291593.100441: funcgraph_entry: 0.137 us | msecs_to_jiffies();
file_writer-10549 [001] 1291593.100441: funcgraph_entry: 0.155 us | put_prev_task_fair();
file_writer-10549 [001] 1291593.100442: funcgraph_entry: 0.152 us | pick_next_task_fair();
file_writer-10549 [001] 1291593.100442: funcgraph_entry: 0.144 us | pick_next_task_rt();
file_writer-10549 [001] 1291593.100442: funcgraph_entry: 0.149 us | pick_next_task_fair();
file_writer-10549 [001] 1291593.100442: funcgraph_entry: 0.203 us | pick_next_task_idle();
file_writer-10549 [001] 1291593.100443: funcgraph_entry: 0.143 us | perf_event_task_sched_out();
file_writer-10549 [001] 1291593.100443: funcgraph_entry: | ftrace_raw_event_sched_switch() {
file_writer-10549 [001] 1291593.100443: sched_switch: task file_writer:10549 [120] (D) ==> swapper:0 [120]
...
Finally xfslogd/1 wakes up the flush-8:0 process which releases the
xfs_ilock() waking up my file_writer process and allowing it to
continue.
...
xfslogd/1-213 [001] 1291593.158606: sched_wakeup: task flush-8:0:601 [120] success=1 [011]
<idle>-0 [011] 1291593.158609: sched_switch: task swapper:0 [120] (R) ==> flush-8:0:601 [120]
flush-8:0-601 [011] 1291593.158623: funcgraph_entry: | xfs_iunlock() {
flush-8:0-601 [011] 1291593.158623: funcgraph_entry: | up_write() {
flush-8:0-601 [011] 1291593.158623: funcgraph_entry: 0.266 us | _spin_lock_irqsave();
flush-8:0-601 [011] 1291593.158624: funcgraph_entry: | wake_up_process() {
flush-8:0-601 [011] 1291593.158624: funcgraph_entry: | try_to_wake_up() {
flush-8:0-601 [011] 1291593.158625: funcgraph_entry: | task_rq_lock() {
flush-8:0-601 [011] 1291593.158625: funcgraph_entry: 0.159 us | _spin_lock();
flush-8:0-601 [011] 1291593.158625: funcgraph_exit: 0.763 us | }
flush-8:0-601 [011] 1291593.158626: funcgraph_entry: 0.237 us | _spin_unlock_irqrestore();
flush-8:0-601 [011] 1291593.158626: funcgraph_entry: | select_task_rq_fair() {
flush-8:0-601 [011] 1291593.158627: funcgraph_entry: 0.231 us | update_shares();
flush-8:0-601 [011] 1291593.158627: funcgraph_entry: 0.162 us | effective_load();
flush-8:0-601 [011] 1291593.158628: funcgraph_entry: 0.141 us | effective_load();
flush-8:0-601 [011] 1291593.158628: funcgraph_exit: 1.623 us | }
flush-8:0-601 [011] 1291593.158628: funcgraph_entry: | task_rq_lock() {
flush-8:0-601 [011] 1291593.158628: funcgraph_entry: 0.142 us | _spin_lock();
flush-8:0-601 [011] 1291593.158628: funcgraph_exit: 0.420 us | }
flush-8:0-601 [011] 1291593.158629: funcgraph_entry: | activate_task() {
flush-8:0-601 [011] 1291593.158629: funcgraph_entry: | enqueue_task() {
flush-8:0-601 [011] 1291593.158629: funcgraph_entry: | enqueue_task_fair() {
flush-8:0-601 [011] 1291593.158629: funcgraph_entry: | enqueue_entity() {
flush-8:0-601 [011] 1291593.158630: funcgraph_entry: 0.277 us | update_curr();
flush-8:0-601 [011] 1291593.158630: funcgraph_entry: 0.159 us | place_entity();
flush-8:0-601 [011] 1291593.158630: funcgraph_entry: 0.137 us | task_of();
flush-8:0-601 [011] 1291593.158631: funcgraph_entry: 0.172 us | __enqueue_entity();
flush-8:0-601 [011] 1291593.158631: funcgraph_exit: 1.654 us | }
flush-8:0-601 [011] 1291593.158631: funcgraph_exit: 2.062 us | }
flush-8:0-601 [011] 1291593.158631: funcgraph_exit: 2.444 us | }
flush-8:0-601 [011] 1291593.158631: funcgraph_exit: 2.760 us | }
flush-8:0-601 [011] 1291593.158632: funcgraph_entry: | ftrace_raw_event_sched_wakeup() {
flush-8:0-601 [011] 1291593.158632: sched_wakeup: task file_writer:10549 [120] success=1 [001]
...
Looking back at the 2.6.18-128.7.1.el5 kernel I see that the old
kernel provided the following vm operations:

static struct vm_operations_struct xfs_file_vm_ops = {
	.nopage = filemap_nopage,
	.populate = filemap_populate,
};

Where the newer kernels provide (presumably the CentOS 5.4/5.5
backport first, then the 2.6.32 version):

static struct vm_operations_struct xfs_file_vm_ops = {
	.nopage = filemap_nopage,
	.populate = filemap_populate,
	.page_mkwrite = xfs_vm_page_mkwrite,
};

static const struct vm_operations_struct xfs_file_vm_ops = {
	.fault = filemap_fault,
	.page_mkwrite = xfs_vm_page_mkwrite,
};
So I can see that the xfs_vm_page_mkwrite is a new addition and this
is what is causing the stalls on the newer kernels. I'll admit I know
very little about file systems, or the implications of these changes.
My question in general is what can we do to eliminate these latencies?
Thanks,
Shawn
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: High latencies writing to a memory mapped file
2010-07-22 14:47 High latencies writing to a memory mapped file Shawn Bohrer
@ 2010-07-22 23:28 ` Dave Chinner
2010-07-26 22:09 ` Shawn Bohrer
2010-07-27 9:24 ` Matthias Schniedermeyer
1 sibling, 1 reply; 13+ messages in thread
From: Dave Chinner @ 2010-07-22 23:28 UTC (permalink / raw)
To: Shawn Bohrer; +Cc: xfs
On Thu, Jul 22, 2010 at 09:47:06AM -0500, Shawn Bohrer wrote:
> We have an application that receives data over the network, and then
> writes data out to a shared memory mapped file on xfs.
[snip]
> ...
> flush-8:0-601 [011] 1291592.127590: funcgraph_entry: | xfs_ilock() {
> flush-8:0-601 [011] 1291592.127590: funcgraph_entry: | down_write() {
> ...
What is the lead-in context to this hunk? i.e. how does xfs_ilock()
get called? That will tell me if it is blocking on the iolock or the
ilock...
> The page fault tries to acquire the
> xfs_ilock() and then blocks because it is held by the flush-8:0
> process.
[snip trace]
> file_writer-10549 [001] 1291593.100430: funcgraph_entry: | xfs_iomap() {
> file_writer-10549 [001] 1291593.100430: funcgraph_entry: | xfs_ilock() {
So this is trying to lock the ilock exclusively to map the range or
do delayed allocation reservations.
> Finally xfslogd/1 wakes up the flush-8:0 process which releases the
> xfs_ilock() waking up my file_writer process and allowing it to
> continue.
> ...
> xfslogd/1-213 [001] 1291593.158606: sched_wakeup: task flush-8:0:601 [120] success=1 [011]
> <idle>-0 [011] 1291593.158609: sched_switch: task swapper:0 [120] (R) ==> flush-8:0:601 [120]
xfslogd? Inodes do not get locked across log writes; they are
unlocked in the transaction commit.
Hmmm - perhaps the issue is that the transaction commit has not been
able to get an iclog buffer to write the transaction into and so is
sleeping waiting for an iclog IO to complete. That would be woken by
the xfslogd, and then the transaction would write into the
now-available iclog and unlock the inode.
So, I'd say that in this case your application is waiting for log IO
completion to occur. Nothing you can do to avoid that, but you can
minimise the impact - increase the log buffer size via the mount
option logbsize (i.e. -o logbsize=262144) to reduce the frequency of
iclog buffer IOs and make it less likely the system will stall
having run out of iclog space.
The other possibility is that the log IO is being slowed by barriers
being issued with the log writes. If you've got non-volatile caches
on your storage, you can turn off barriers and that should help
reduce log IO latency significantly.
Another possibility is to switch to the deadline or no-op IO
scheduler to avoid log IO being delayed by (the ever changing)
CFQ IO scheduling idiosyncrasies.
> So I can see that the xfs_vm_page_mkwrite is a new addition and this
> is what is causing the stalls on the newer kernels.
->page_mkwrite is necessary to solve significant other problems with
mmap writes (like detecting ENOSPC correctly and allowing mmap
writes to execute delayed allocation and detect unwritten extents
correctly) so it's not going away.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
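(Dave's suggestions translate to something like the following. The device and mount point are hypothetical, nobarrier is only safe with a non-volatile write cache, and all of this needs root - a configuration sketch, not a tested recipe.)

```shell
# Bigger and more numerous log buffers to reduce the frequency of iclog IOs
mount -o remount,logbufs=8,logbsize=262144 /dev/sdb1 /data

# Only if the storage has a battery- or flash-backed write cache:
mount -o remount,nobarrier /dev/sdb1 /data

# Switch the block device away from CFQ to the deadline (or noop) scheduler
echo deadline > /sys/block/sdb/queue/scheduler
```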
* Re: High latencies writing to a memory mapped file
2010-07-22 23:28 ` Dave Chinner
@ 2010-07-26 22:09 ` Shawn Bohrer
2010-07-26 23:22 ` Dave Chinner
0 siblings, 1 reply; 13+ messages in thread
From: Shawn Bohrer @ 2010-07-26 22:09 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
Hey Dave,
Thanks for the suggestions.
On Fri, Jul 23, 2010 at 09:28:46AM +1000, Dave Chinner wrote:
> What is the lead-in context to this hunk? i.e. how does xfs_ilock()
> get called? That will tell me if it is blocking on the iolock or the
> ilock...
I don't have the lead-in context here. I'm using the function_graph
tracer and with stalls up to a second in length it easily overflowed
the ftrace ring buffer. Once I found my process was blocking on the
xfs_ilock(), I limited my traces to include xfs_ilock, xfs_iunlock,
handle_mm_fault, and the scheduler events. If you are still
interested in seeing some more detail here I can try to collect some
more captures.
[snip]
> The other possibility is that the log IO is being slowed by barriers
> being issued with the log writes. If you've got non-volatile caches
> on your storage, you can turn off barriers and that should help
> reduce log IO latency significantly.
I've tried mounting with "noatime,nobarrier,logbufs=8,logbsize=262144"
but I don't notice much difference and still see the stalls.
> Another possibility is to switch to the deadline or no-op IO
> scheduler to avoid log IO being delayed by (the ever changing)
> CFQ IO scheduling idiosyncrasies.
I also tried using the noop IO scheduler, but also didn't see any
improvement.
Thanks,
Shawn
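(The filtered function_graph capture Shawn describes can be set up roughly as follows. Paths are the standard tracefs layout, root is required, and exact file names vary a little across kernel versions - a sketch, not a verified session.)

```shell
cd /sys/kernel/debug/tracing
echo 0 > tracing_on
echo function_graph > current_tracer
# Trace only the functions of interest instead of everything
echo 'xfs_ilock xfs_iunlock handle_mm_fault' > set_ftrace_filter
# Plus the scheduler events
echo 1 > events/sched/sched_switch/enable
echo 1 > events/sched/sched_wakeup/enable
# Enlarge the per-CPU ring buffer so a one-second stall fits
echo 32768 > buffer_size_kb
echo 1 > tracing_on
# The application can timestamp its own stalls via the marker file, e.g.:
#   echo "write_delay start" > trace_marker
cat trace > /tmp/stall.trace
```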
* Re: High latencies writing to a memory mapped file
2010-07-26 22:09 ` Shawn Bohrer
@ 2010-07-26 23:22 ` Dave Chinner
2010-08-03 22:03 ` Shawn Bohrer
0 siblings, 1 reply; 13+ messages in thread
From: Dave Chinner @ 2010-07-26 23:22 UTC (permalink / raw)
To: Shawn Bohrer; +Cc: xfs
On Mon, Jul 26, 2010 at 05:09:09PM -0500, Shawn Bohrer wrote:
> I don't have the lead-in context here. I'm using the function_graph
> tracer and with stalls up to a second in length it easily overflowed
> the ftrace ring buffer.
I'm not sure if the tracepoint buffer is the same as the ftrace
buffer, but if it is you can increase the size by doing this sort of
thing:
# echo 32768 > /sys/kernel/debug/tracing/buffer_size_kb
I often do that to capture long XFS traces during testing....
[snip]
> I've tried mounting with "noatime,nobarrier,logbufs=8,logbsize=262144"
> but I don't notice much difference and still see the stalls.
Ok. One other possibility came to mind - the transaction commit is
blocking while locking the superblock buffer to apply free block
count changes. Do you have the "lazy-count" option set? (xfs_info
<mtpt> will tell you.) If it isn't set, then transaction commit
latency can be quite high if the sb buffer is being written to disk
when the transaction tries to lock it. If it is not set, you can use
xfs_admin to set that bit (mkfs has defaulted to lazy-count=1 for
some time now).
If this doesn't help, then we need to do more work to determine what
is happening...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
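(Checking and flipping the bit looks roughly like this. The mount point and device are hypothetical, xfs_admin needs the filesystem unmounted, and the flag syntax is from the xfs_admin(8) of that era - a sketch, not a tested procedure.)

```shell
# Does this filesystem already use lazy superblock counters?
xfs_info /data | grep -o 'lazy-count=[01]'

# If it reports lazy-count=0: unmount, enable, remount
umount /data
xfs_admin -c 1 /dev/sdb1   # -c 1 enables lazy-counters
mount /dev/sdb1 /data
```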
* Re: High latencies writing to a memory mapped file
  2010-07-26 23:22             ` Dave Chinner
@ 2010-08-03 22:03               ` Shawn Bohrer
  0 siblings, 0 replies; 13+ messages in thread
From: Shawn Bohrer @ 2010-08-03 22:03 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs

Hey Dave,

Thanks for the suggestions.

On Tue, Jul 27, 2010 at 09:22:24AM +1000, Dave Chinner wrote:
> On Mon, Jul 26, 2010 at 05:09:09PM -0500, Shawn Bohrer wrote:
> > On Fri, Jul 23, 2010 at 09:28:46AM +1000, Dave Chinner wrote:
> > > On Thu, Jul 22, 2010 at 09:47:06AM -0500, Shawn Bohrer wrote:
> > > > Hello,
> > > >
> > > > We have an application that receives data over the network, and then
> > > > writes data out to a shared memory mapped file on xfs.  Previously
> > > > everything worked great when we were running a CentOS 5.3 kernel
> > > > (2.6.18-128.7.1.el5), but when we upgraded to the newer CentOS 5.4 or
> > > > 5.5 kernels we started to see occasional spikes (up to 1 second) in
> > > > the time it takes to write the data to the memory mapped file.  I
> > > > noticed that between the CentOS 5.3 and 5.4 kernels the upstream xfs
> > > > changes were backported.  Testing on 2.6.32.16 shows the same
> > > > occasional latency spikes.
> > > >
> > > > After doing some debugging on 2.6.32.16 I have found the cause of the
> > > > latency spikes.  The spikes occur every time pdflush runs to flush the
> > > > dirty pages.  Note our app only writes data at about 1.5 MB/s and we
> > > > are well below both the dirty_background_ratio and the dirty_ratio, so
> > > > these flushes are only of pages that are dirty_expire_centisecs old.
> > > > Capturing one of the spikes with ftrace shows:
> > > >
> > > > ...
> > > > flush-8:0-601  [011] 1291592.127590: funcgraph_entry:           |  xfs_ilock() {
> > > > flush-8:0-601  [011] 1291592.127590: funcgraph_entry:           |    down_write() {
> > > > flush-8:0-601  [011] 1291592.127590: funcgraph_entry:  0.213 us |      _cond_resched();
> > > > flush-8:0-601  [011] 1291592.127591: funcgraph_exit:   0.657 us |    }
> > > > flush-8:0-601  [011] 1291592.127591: funcgraph_exit:   1.174 us |  }
> > > > flush-8:0-601  [011] 1291592.127764: sched_switch: task flush-8:0:601 [120] (D) ==> swapper:0 [120]
> > > > ...
> > >
> > > What is the lead-in context to this hunk? i.e. how does xfs_ilock()
> > > get called? That will tell me if it is blocking on the iolock or the
> > > ilock...
> >
> > I don't have the lead-in context here.  I'm using the function_graph
> > tracer, and with stalls up to a second in length it easily overflowed
> > the ftrace ring buffer.  Once I found my process was blocking on the
> > xfs_ilock(), I limited my traces to include xfs_ilock, xfs_iunlock,
> > handle_mm_fault, and the scheduler events.  If you are still
> > interested in seeing some more detail here I can try to collect some
> > more captures.
>
> I'm not sure if the tracepoint buffer is the same as the ftrace
> buffer, but if it is you can increase the size by doing this sort of
> thing:
>
> # echo 32768 > /sys/kernel/debug/tracing/buffer_size_kb
>
> I often do that to capture long XFS traces during testing....
>
> [snip]
>
> > > Hmmm - perhaps the issue is that the transaction commit has not been
> > > able to get an iclog buffer to write the transaction into and so is
> > > sleeping waiting for an iclog IO to complete.  That would be woken by
> > > the xfslogd, and then the transaction would write into the
> > > now-available iclog and unlock the inode.
> > >
> > > So, I'd say that in this case your application is waiting for log IO
> > > completion to occur.  Nothing you can do to avoid that, but you can
> > > minimise the impact - increase the log buffer size via the mount
> > > option logbsize (i.e. -o logbsize=262144) to reduce the frequency of
> > > iclog buffer IOs and make it less likely the system will stall
> > > having run out of iclog space.
> > >
> > > The other possibility is that the log IO is being slowed by barriers
> > > being issued with the log writes.  If you've got non-volatile caches
> > > on your storage, you can turn off barriers and that should help
> > > reduce log IO latency significantly.
> >
> > I've tried mounting with "noatime,nobarrier,logbufs=8,logbsize=262144"
> > but I don't notice much difference and still see the stalls.
>
> Ok. One other possibility came to mind - the transaction commit is
> blocking while locking the superblock buffer to apply free block
> count changes.  Do you have the "lazy-count" option set (xfs_info
> <mtpt> will tell you)?  If it isn't set, then transaction commit
> latency can be quite high if the sb buffer is being written to disk
> when the transaction tries to lock it.  If it is not set, you can use
> xfs_admin to set that bit (mkfs has defaulted to lazy-count=1 for
> some time now).

Setting lazy-count=1 looks like it has helped quite a bit to reduce
the frequency of the spikes.  I am still seeing spikes of up to a
second, but I haven't had time to capture any more traces to verify
whether they are the same issue.  Overall, though, this may be good
enough.

Thanks,
Shawn
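[Editorial note: the function_graph capture Shawn describes in the message above can be scripted as follows. This is a rough sketch, assuming debugfs is mounted at /sys/kernel/debug and a kernel with the ftrace interfaces mentioned in the thread; the sleep is a stand-in for reproducing the stall under your own workload.]

```shell
T=/sys/kernel/debug/tracing

# Enlarge the per-CPU trace buffer (in KiB) so a one-second stall fits,
# as Dave suggests above.
echo 32768 > $T/buffer_size_kb

# Limit tracing to the functions of interest to keep the volume down.
echo xfs_ilock xfs_iunlock handle_mm_fault > $T/set_ftrace_filter
echo function_graph > $T/current_tracer

# Also record context switches so blocked tasks show up in the trace.
echo 1 > $T/events/sched/sched_switch/enable

echo 1 > $T/tracing_on
sleep 60                 # reproduce the stall here
echo 0 > $T/tracing_on

cat $T/trace > /tmp/xfs-stall.trace
```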
* Re: High latencies writing to a memory mapped file
  2010-07-22 14:47 High latencies writing to a memory mapped file Shawn Bohrer
  2010-07-22 23:28 ` Dave Chinner
@ 2010-07-27  9:24 ` Matthias Schniedermeyer
  2010-07-27 10:47   ` Emmanuel Florac
       [not found]     ` <4C4EDEFD.7000401@hardwarefreak.com>
  1 sibling, 2 replies; 13+ messages in thread
From: Matthias Schniedermeyer @ 2010-07-27 9:24 UTC (permalink / raw)
To: Shawn Bohrer; +Cc: xfs

On 22.07.2010 09:47, Shawn Bohrer wrote:

I can't help with your exact problem, but this reminds me of a quite
similar-looking problem I have.

We have a Linux computer running Samba as a file server for a few
Windows clients, and ever since kernel 2.6.26 (it never happened with
2.6.25) the server randomly "hangs" for a few seconds (which is the
kind of problem that drives people crazy).  It takes about 1-2 weeks
for the symptoms to show, and only a reboot of the computer "cures" it
(for another 1-2 weeks).

Because of the long incubation period and the whole slew of subsystems
(and Samba) involved, I never had the drive to debug this, as a
"quick" reboot every 1-2 weeks only takes about 2 minutes.

By "whole slew of subsystems" I mean:

Hardware: SATA on the onboard controller (Intel G41 chipset (ICH7)
with the PIIX driver, because the BIOS is crippled and doesn't allow
AHCI).  (The hardware itself is unlikely to be at fault, because a few
months ago the complete computer and HDDs were switched, but the
problem persisted.)

Kernel: currently 2.6.32(-5, Debian Sid), first seen with 2.6.26
(vanilla).

Drivers/filesystem: md (RAID1), loop-AES, XFS.  (I tried
logbufs=8,logbsize=256k,nobarrier with no apparent change.  I'm not
100% sure whether I tried other schedulers, but I think I did.)

And finally Samba on top.

As I had to reboot the computer just yesterday, and will have to power
down the server over the weekend because we are moving, it will be at
least 2-3 weeks before I expect the problem to resurface.

In case anybody has an idea what I could do to identify and/or isolate
the root cause, I'm open to suggestions.

Bis denn

--
Real Programmers consider "what you see is what you get" to be just as
bad a concept in Text Editors as it is in women.  No, the Real
Programmer wants a "you asked for it, you got it" text editor --
complicated, cryptic, powerful, unforgiving, dangerous.
* Re: High latencies writing to a memory mapped file
  2010-07-27  9:24 ` Matthias Schniedermeyer
@ 2010-07-27 10:47   ` Emmanuel Florac
  2010-07-27 11:27     ` Matthias Schniedermeyer
  2010-07-27 13:00     ` Stan Hoeppner
  1 sibling, 2 replies; 13+ messages in thread
From: Emmanuel Florac @ 2010-07-27 10:47 UTC (permalink / raw)
To: Matthias Schniedermeyer; +Cc: Shawn Bohrer, xfs

Le Tue, 27 Jul 2010 11:24:52 +0200
Matthias Schniedermeyer <ms@citd.de> écrivait:

> In case anybody has an idea what I could do to identify and/or
> isolate the root cause, I'm open to suggestions.

I'd suspect the I/O scheduler is the culprit.  I've generally found
that the default CFQ scheduler is terrible for file servers.

--
Emmanuel Florac | Direction technique | Intellique
<eflorac@intellique.com> | +33 1 78 94 84 02
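[Editorial note: switching the I/O scheduler, as suggested above, can be done at runtime through sysfs. A minimal sketch; sda is a placeholder device name, and the runtime change does not survive a reboot.]

```shell
# Show the schedulers available for a disk; the active one is bracketed,
# e.g. "noop anticipatory deadline [cfq]".
cat /sys/block/sda/queue/scheduler

# Switch to deadline; takes effect immediately for new requests.
echo deadline > /sys/block/sda/queue/scheduler

# To make it the default for all disks, boot with the kernel parameter:
#   elevator=deadline
```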
* Re: High latencies writing to a memory mapped file
  2010-07-27 10:47 ` Emmanuel Florac
@ 2010-07-27 11:27   ` Matthias Schniedermeyer
  2010-07-27 13:06     ` Emmanuel Florac
  1 sibling, 1 reply; 13+ messages in thread
From: Matthias Schniedermeyer @ 2010-07-27 11:27 UTC (permalink / raw)
To: Emmanuel Florac; +Cc: Shawn Bohrer, xfs

On 27.07.2010 12:47, Emmanuel Florac wrote:
> Le Tue, 27 Jul 2010 11:24:52 +0200
> Matthias Schniedermeyer <ms@citd.de> écrivait:
>
> > In case anybody has an idea what I could do to identify and/or
> > isolate the root cause, I'm open to suggestions.
>
> I'd suspect the I/O scheduler is the culprit.  I've generally found
> that the default CFQ scheduler is terrible for file servers.

I have switched the server to 'deadline'; if that is the culprit, then
I will "know" in about a month.  (As I don't know how to force the
problem, I can only wait for it not to happen. :-| )

Bis denn
* Re: High latencies writing to a memory mapped file
  2010-07-27 11:27 ` Matthias Schniedermeyer
@ 2010-07-27 13:06   ` Emmanuel Florac
  0 siblings, 0 replies; 13+ messages in thread
From: Emmanuel Florac @ 2010-07-27 13:06 UTC (permalink / raw)
To: Matthias Schniedermeyer; +Cc: Shawn Bohrer, xfs

Le Tue, 27 Jul 2010 13:27:00 +0200
Matthias Schniedermeyer <ms@citd.de> écrivait:

> I have switched the server to 'deadline'; if that is the culprit,
> then I will "know" in about a month.  (As I don't know how to force
> the problem, I can only wait for it not to happen. :-| )

Let's hope for the best :)
* Re: High latencies writing to a memory mapped file
  2010-07-27 10:47 ` Emmanuel Florac
  2010-07-27 11:27   ` Matthias Schniedermeyer
@ 2010-07-27 13:00   ` Stan Hoeppner
  1 sibling, 0 replies; 13+ messages in thread
From: Stan Hoeppner @ 2010-07-27 13:00 UTC (permalink / raw)
To: xfs

Emmanuel Florac put forth on 7/27/2010 5:47 AM:
> Le Tue, 27 Jul 2010 11:24:52 +0200
> Matthias Schniedermeyer <ms@citd.de> écrivait:
>
> > In case anybody has an idea what I could do to identify and/or
> > isolate the root cause, I'm open to suggestions.
>
> I'd suspect the I/O scheduler is the culprit.  I've generally found
> that the default CFQ scheduler is terrible for file servers.

Agreed.  The scheduler may not be the cause of his issue, but
switching to deadline or noop should certainly help overall I/O a bit.
I'm no fan of CFQ either.

I've been using deadline for quite a while and it seems to be good for
local single disk and JBOD, as well as for Linux software RAID to
local disks.  Deadline and noop are both good choices if using a good
intelligent PCI RAID controller, or if doing I/O over an FC/iSCSI HBA
to a SAN controller; in fact, noop probably shouldn't be used with
anything but intelligent I/O controllers.

My informal testing some time ago revealed that deadline will pick up
some of the performance slack if you're unable to use NCQ due to a
crappy SATA chipset/HBA and/or drive firmware (or both).  Testing also
showed that it basically ties noop when used with Qlogic 4Gb FC HBAs
and IBM and Nexsan SAN array controllers.

Deadline just seems to work pretty well with anything, picking up
slack when lacking intelligent I/O hardware, and not really getting in
the way when you have it.  Testing will reveal which is better for a
given server/storage and application mix.  Such testing is why I only
include deadline in my custom kernels (that, and the fact that I like
to keep my kernels tight, as small as possible -- dropping CFQ and
noop simply cuts a little more dead weight from the binary).

--
Stan
[parent not found: <4C4EDEFD.7000401@hardwarefreak.com>]
* Re: High latencies writing to a memory mapped file
       [not found] ` <4C4EDEFD.7000401@hardwarefreak.com>
@ 2010-07-27 14:49   ` Matthias Schniedermeyer
  2010-07-27 19:59     ` Kinzel, David
  0 siblings, 1 reply; 13+ messages in thread
From: Matthias Schniedermeyer @ 2010-07-27 14:49 UTC (permalink / raw)
To: Stan Hoeppner; +Cc: xfs

On 27.07.2010 08:28, Stan Hoeppner wrote:
> Matthias Schniedermeyer put forth on 7/27/2010 4:24 AM:
>
> > We have a Linux computer running Samba as a file server for a few
> > Windows clients, and ever since kernel 2.6.26 (it never happened
> > with 2.6.25) the server randomly "hangs" for a few seconds (which
> > is the kind of problem that drives people crazy).
>
> Can you supply any kind of empirical data or logging that would
> suggest the problem lies with XFS?  You've not given us much to go on
> except anecdote.
>
> Google "linux server hangs" and you'll find thousands of causes and
> solutions.

The best further anecdote I have is this: I cloned(*) the server, on
identical hardware (ignoring HDDs), for someone else, with 2 major
differences:

- no loop-AES, as they didn't need encryption.
- ext4 instead of xfs

That machine currently has an uptime of 180 days, and I don't have a
single report of a "hang".  (Which technically doesn't mean that the
machine doesn't have the same bug, only that no one has ever
complained to me.)

As for empirical data: I don't know how or where to start.

*: In the best sense of the word.  As the first step I 'rsync'ed the
"system" HDD of our server to a new HDD. ...

Bis denn
* RE: High latencies writing to a memory mapped file
  2010-07-27 14:49 ` Matthias Schniedermeyer
@ 2010-07-27 19:59   ` Kinzel, David
  2010-07-27 21:02     ` Matthias Schniedermeyer
  0 siblings, 1 reply; 13+ messages in thread
From: Kinzel, David @ 2010-07-27 19:59 UTC (permalink / raw)
To: xfs

>> Matthias Schniedermeyer put forth on 7/27/2010 4:24 AM:
>>
>> > We have a Linux computer running Samba as a file server for a few
>> > Windows clients, and ever since kernel 2.6.26 (it never happened
>> > with 2.6.25) the server randomly "hangs" for a few seconds (which
>> > is the kind of problem that drives people crazy).

Does the server hang, or do the connections to Samba?  I've found that
Samba is given a ridiculously low I/O priority, so that any heavy I/O
on the server will pretty much cause it to stall -- be it updating the
locate database, a backup job, etc.
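[Editorial note: David's theory about Samba's I/O priority can be inspected and adjusted with ionice from util-linux. A sketch; it assumes the daemon is named smbd as in stock Samba, and note that I/O priorities are only honoured by the CFQ scheduler.]

```shell
# Show the I/O scheduling class and priority of each running smbd.
for pid in $(pidof smbd); do
    ionice -p "$pid"
done

# Raise one smbd to best-effort class (-c 2) at the highest
# priority level (-n 0); -s limits pidof to a single PID.
ionice -c 2 -n 0 -p "$(pidof -s smbd)"
```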
* Re: High latencies writing to a memory mapped file
  2010-07-27 19:59 ` Kinzel, David
@ 2010-07-27 21:02   ` Matthias Schniedermeyer
  0 siblings, 0 replies; 13+ messages in thread
From: Matthias Schniedermeyer @ 2010-07-27 21:02 UTC (permalink / raw)
To: Kinzel, David; +Cc: xfs

On 27.07.2010 13:59, Kinzel, David wrote:
> Does the server hang, or do the connections to Samba?

That's a question I actually hadn't thought about.  This certainly
extends the circle of possible culprits a little.  Off the top of my
head that's the NIC (Intel e100) and its e100 driver, and the
(as-cheap-as-possible) 24-port 100 Mbit switch used to connect all the
computers.  And last but not least, the whole TCP/IP network stack in
the kernel.

This also differentiates "our" server from the mentioned "copy", which
uses whichever GBit NIC was onboard (something from Realtek, AFAIR)
connected to a tree of even cheaper 5- and 8-port switches, not to
mention on-the-fly wiring.

> I've found that Samba is given a ridiculously low I/O priority, so
> that any heavy I/O on the server will pretty much cause it to stall
> -- be it updating the locate database, a backup job, etc.

Whatever the culprit is, in our case it's something that changed
between 2.6.25 and 2.6.26; before 2.6.26 we were happy with whatever
was current for years.  (The system has been running a regularly
updated Debian Sid for at least 6-7 years.)

And the most irritating thing is the "a reboot fixes it" part.  Once
the problem rears its head, you can restart Samba, drop all caches,
umount/decompose the RAID and whatnot; the problem immediately
reappears after you get back to working conditions.

Bis denn
end of thread, other threads:[~2010-08-03 22:03 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-07-22 14:47 High latencies writing to a memory mapped file Shawn Bohrer
2010-07-22 23:28 ` Dave Chinner
2010-07-26 22:09 ` Shawn Bohrer
2010-07-26 23:22 ` Dave Chinner
2010-08-03 22:03 ` Shawn Bohrer
2010-07-27 9:24 ` Matthias Schniedermeyer
2010-07-27 10:47 ` Emmanuel Florac
2010-07-27 11:27 ` Matthias Schniedermeyer
2010-07-27 13:06 ` Emmanuel Florac
2010-07-27 13:00 ` Stan Hoeppner
[not found] ` <4C4EDEFD.7000401@hardwarefreak.com>
2010-07-27 14:49 ` Matthias Schniedermeyer
2010-07-27 19:59 ` Kinzel, David
2010-07-27 21:02 ` Matthias Schniedermeyer