From: Dave Chinner <david@fromorbit.com>
To: Norman Cheung <norman.cheung@kla-tencor.com>
Cc: linux-xfs@oss.sgi.com
Subject: Re: Hung in D state during fclose
Date: Tue, 12 Feb 2013 17:55:45 +1100	[thread overview]
Message-ID: <20130212065545.GC10731@dastard> (raw)
In-Reply-To: <loom.20130212T071115-446@post.gmane.org>

On Tue, Feb 12, 2013 at 06:17:04AM +0000, Norman Cheung wrote:
> I am not sure if this forum is the same as xfs@oss.sgi.com; if so, my apologies
> for double posting.  I would appreciate any insight or workaround on this.
> 
> Every 3 - 4 days, my application will hang in D state at file close.
> Shortly after that, flush (from a different partition) is locked in D state
> also.
> 
> My application runs continuously; 5 threads are writing data at a rate of
> 1.5M/sec to 5 different XFS partitions.  Each of these partitions is a 2 disk
> RAID 0. In addition, I have other threads consuming 100% CPU at all times, and
> most of these threads are each tied to their own CPU.
> 
> The 5 data writing threads are also each set to run on a specific CPU (one CPU
> per thread), with priority set to high (-2).  The data writing pattern is:
> each disk writing thread will write a 1.5 Gig file, then the thread will
> pause for about 3 minutes. Hence we have 5 files of 1.5 Gig each after one
> processing cycle.  We keep 5 sets and delete the older ones.
> 
> After about 300 - 800 cycles, one or two of these disk writing threads will go
> into D state.  Within a second, flush for another partition will show up in
> D state.  Then, after 15 minutes of no activity, the parent task will lower
> the priority of all threads (to normal, 20) and abort the threads.  In all
> cases, lowering the priority will get the threads out of D state.  I have also
> tried running the disk writing threads with normal priority (20), with the same
> hangs.  Also, the fclose of all 5 files to 5 different partitions happens
> around the same time.
> 
> Thanks in advance,
> 
> Norman 
> 
> 
> Below is the sysrq for the 2 offending threads.
> 
> 1. the disk writing thread hung in fclose
> 
> 
> Tigris_IMC.exe  D 0000000000000000     0  4197   4100 0x00000000
> ffff881f3db921c0 0000000000000086 0000000000000000 ffff881f42eb8b80
> ffff880861419fd8 ffff880861419fd8 ffff880861419fd8 ffff881f3db921c0
> 0000000000080000 0000000000000000 00000000000401e0 00000000061805c1 Call Trace:
> [<ffffffff810d89ed>] ? zone_statistics+0x9d/0xa0 [<ffffffffa0402682>] ? 
> xfs_iomap_write_delay+0x172/0x2b0 [xfs] [<ffffffff813c7e35>] ? 
> rwsem_down_failed_common+0xc5/0x150
> [<ffffffff811f32a3>] ? call_rwsem_down_write_failed+0x13/0x20
> [<ffffffff813c74ec>] ? down_write+0x1c/0x1d [<ffffffffa03fba8e>] ? 
> xfs_ilock+0x7e/0xa0 [xfs] [<ffffffffa041b64b>] ? __xfs_get_blocks+0x1db/0x3d0 
> [xfs] [<ffffffff81103340>] ? kmem_cache_alloc+0x100/0x130 
> [<ffffffff8113fa2e>] ? alloc_page_buffers+0x6e/0xe0 [<ffffffff81141cdf>] ? 
> __block_write_begin+0x1cf/0x4d0 [<ffffffffa041b850>] ? 
> xfs_get_blocks_direct+0x10/0x10 [xfs] [<ffffffffa041b850>] ? 
> xfs_get_blocks_direct+0x10/0x10 [xfs] [<ffffffff8114226b>] ? 
> block_write_begin+0x4b/0xa0 [<ffffffffa041b8fb>] ? 
> xfs_vm_write_begin+0x3b/0x70 [xfs] [<ffffffff810c0258>] ? 
> generic_file_buffered_write+0xf8/0x250
> [<ffffffffa04207b5>] ? xfs_file_buffered_aio_write+0xc5/0x130 [xfs] 
> [<ffffffffa042099c>] ? xfs_file_aio_write+0x17c/0x2a0 [xfs] 
> [<ffffffff81115b28>] ? do_sync_write+0xb8/0xf0 [<ffffffff8119daa4>] ? 
> security_file_permission+0x24/0xc0
> [<ffffffff8111630a>] ? vfs_write+0xaa/0x190 [<ffffffff81116657>] ? 
> sys_write+0x47/0x90 [<ffffffff813ce412>] ? system_call_fastpath+0x16/0x1b

Can you please post non-mangled traces? These have all the lines run
together and then wrapped by your mailer....
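
If the original console output is no longer available, a trace mangled
like the one quoted above can often be reflowed mechanically. The
following is only a rough sketch, assuming every frame begins with a
"[<address>]" token as in the quoted output; the function name
reflow_trace is made up for illustration and is not part of any
kernel tooling.

```python
import re

# Assumption: each stack frame starts with "[<hex address>]".
FRAME_START = re.compile(r"\[<[0-9a-fA-F]+>\]")


def reflow_trace(text):
    """Return a list with one stack frame per entry.

    Email quote markers ("> ") are stripped, wrapped lines are joined,
    and the result is split again at every "[<address>]" token.
    Anything before the first frame (e.g. "Call Trace:") is dropped.
    """
    # Strip leading quote markers and surrounding whitespace per line.
    lines = [re.sub(r"^(>\s?)+", "", line).strip() for line in text.splitlines()]
    # Join the non-empty pieces and collapse runs of whitespace.
    joined = re.sub(r"\s+", " ", " ".join(l for l in lines if l))

    starts = [m.start() for m in FRAME_START.finditer(joined)]
    frames = []
    for i, begin in enumerate(starts):
        end = starts[i + 1] if i + 1 < len(starts) else len(joined)
        frames.append(joined[begin:end].strip())
    return frames


if __name__ == "__main__":
    import sys

    # Read a mangled trace on stdin, print one frame per line.
    for frame in reflow_trace(sys.stdin.read()):
        print(frame)
```

Piping the quoted text through this prints one frame per line. It
cannot recover anything the mailer actually dropped, so a fresh
capture straight from dmesg is still preferable.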

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

