Re: Array 'freezes' for some time after large writes?

From: Jim Duchek <jim.duchek@gmail.com>
To: Mark Knecht <markknecht@gmail.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: Array 'freezes' for some time after large writes?
Date: Tue, 30 Mar 2010 14:59:20 -0600
Message-ID: <r2vdead81ad1003301359g73b5a6ady3a9bcc174d556579@mail.gmail.com>
In-Reply-To: <5bdc1c8b1003301345p46efdaddv5d420c30e75b013f@mail.gmail.com>

I'm using ext4 on everything, but it's hard to judge which ext3 bugs
might affect ext4 as well.  I really can't test the array
destructively: I need all the data that's on it, and I don't have
enough spare space elsewhere to back it all up.  You might see if you
can trigger it with dd, writing to the drive directly with no
filesystem?
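
A rough sketch of that kind of test, in Python rather than dd so that
each chunk can be timed: write a large amount of data in fixed-size
chunks, fsync after every chunk, and record any chunk that takes longer
than a threshold.  The function name, chunk size and one-second
threshold are all made up for illustration.  By default it writes to a
scratch file; pointing it at a raw block device would skip the
filesystem as suggested above, but WILL DESTROY whatever is on that
device.

```python
import os
import tempfile
import time


def probe_write_stalls(path, total_bytes, chunk_bytes=1 << 20, stall_secs=1.0):
    """Write total_bytes to path in chunks, fsync after each chunk, and
    return (timings, stalls). timings holds the seconds taken per chunk;
    stalls lists (chunk_index, seconds) for chunks at or over stall_secs."""
    buf = b"\0" * chunk_bytes
    timings = []
    # WARNING: if path is a block device, its contents are overwritten.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        written = 0
        while written < total_bytes:
            n = min(chunk_bytes, total_bytes - written)
            view = memoryview(buf)[:n]
            start = time.monotonic()
            while len(view) > 0:  # os.write may write less than requested
                sent = os.write(fd, view)
                view = view[sent:]
            os.fsync(fd)  # force the chunk out so the timing reflects real I/O
            timings.append(time.monotonic() - start)
            written += n
    finally:
        os.close(fd)
    stalls = [(i, t) for i, t in enumerate(timings) if t >= stall_secs]
    return timings, stalls


if __name__ == "__main__":
    # Harmless demo: 8 MiB into a temporary file that is deleted afterwards.
    with tempfile.NamedTemporaryFile() as tmp:
        timings, stalls = probe_write_stalls(tmp.name, 8 << 20)
        print(f"{len(timings)} chunks, slowest {max(timings):.3f}s, "
              f"{len(stalls)} stalls")
```

A real test would need far more than 8 MiB to resemble the large
writes discussed in this thread; the small default only keeps the demo
quick.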

Jim



On 30 March 2010 14:45, Mark Knecht <markknecht@gmail.com> wrote:
> Hi,
>   I am running the nvidia binary drivers. I'm not doing anything with
> X at this point, so I can just unload them, I think. I could even
> remove the card, I suppose.
>
>   I built a machine for my dad a couple of months ago that uses the
> same 1TB WD drive that I am using now. I don't remember seeing
> anything like this on his machine but I'm going to go check that.
>
>   One other similarity I suspect we have is ext3? There were problems
> with ext3 priority inversion in earlier kernels. It's my understanding
> that they thought they had that worked out, but possibly we're
> triggering this somehow? Since I've got a lot of disk space, I can set
> up some other partitions (ext4, reiser4, etc.) and try copying files
> to trigger it. However, it's difficult for me if it requires read/write,
> as I'm not set up to really use the machine yet. Is that something you
> have room to try?
>
>   Also, we haven't discussed what drivers are loaded or kernel
> config. Here's my current driver set:
>
> keeper ~ # lsmod
> Module                  Size  Used by
> ipv6                  207757  30
> usbhid                 21529  0
> nvidia              10611606  22
> snd_hda_codec_realtek   239530  1
> snd_hda_intel          17688  0
> ehci_hcd               30854  0
> snd_hda_codec          45755  2 snd_hda_codec_realtek,snd_hda_intel
> snd_pcm                58104  2 snd_hda_intel,snd_hda_codec
> snd_timer              15030  1 snd_pcm
> snd                    37476  5 snd_hda_codec_realtek,snd_hda_intel,snd_hda_codec,snd_pcm,snd_timer
> soundcore                800  1 snd
> snd_page_alloc          5809  2 snd_hda_intel,snd_pcm
> rtc_cmos                7678  0
> rtc_core               11093  1 rtc_cmos
> sg                     23029  0
> uhci_hcd               18047  0
> usbcore               115023  4 usbhid,ehci_hcd,uhci_hcd
> agpgart                24341  1 nvidia
> processor              23121  0
> e1000e                111701  0
> firewire_ohci          20022  0
> rtc_lib                 1617  1 rtc_core
> firewire_core          36109  1 firewire_ohci
> thermal                11650  0
> keeper ~ #
>
> - Mark
>
> On Tue, Mar 30, 2010 at 1:32 PM, Jim Duchek <jim.duchek@gmail.com> wrote:
>> Hrm, I've never seen that kernel message.  I don't think any of my
>> freezes have lasted for up to 120 seconds though (my drives are half
>> as big -- might matter?)  It looks like we've both got WD drives --
>> and we both have nvidia 9500gt's as well.  Are you running the nvidia
>> binary drivers, or nouveau? (It seems like it wouldn't matter,
>> especially as, at least on my system, they don't share an interrupt or
>> anything, but I hate to ignore any hardware that we have in common.)
>> I did move to 2.6.33 for some time, but that didn't change the
>> behaviour.
>>
>> Jim
>>
>>
>> On 30 March 2010 13:05, Mark Knecht <markknecht@gmail.com> wrote:
>>> On Tue, Mar 30, 2010 at 10:47 AM, Jim Duchek <jim.duchek@gmail.com> wrote:
>>> <SNIP>
>>>>  You're having this happen even if the disk in question is not in an
>>>> array?  If so perhaps it's an SATA issue and not a RAID one, and we
>>>> should move this discussion accordingly.
>>>
>>> Yes, in my case the delays are so long - sometimes 2 or 3 minutes -
>>> that when I tried to build the system using RAID1 I got this kernel
>>> bug in dmesg. It's just info - not a real failure - but because it's
>>> talking about long delays I gave up on RAID and tried a standard
>>> single drive build. Turns out that it has (I think...) nothing to do
>>> with RAID at all. You'll note that there are instructions for turning
>>> the message off, but I've not tried them. I intend to do a parallel
>>> RAID1 build on this machine and be able to test both RAID vs non-RAID.
>>>
>>> - Mark
>>>
>>> INFO: task kjournald:17466 blocked for more than 120 seconds.
>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> kjournald     D ffff8800280bbe00     0 17466      2 0x00000000
>>>  ffff8801adf9d890 0000000000000046 0000000000000000 0000000000000000
>>>  ffff8801adcbde44 0000000000004000 000000000000fe00 000000000000c878
>>>  0000000800000050 ffff88017a99aa40 ffff8801af90a150 ffff8801adf9db08
>>> Call Trace:
>>>  [<ffffffff812dd063>] ? md_make_request+0xb6/0xf1
>>>  [<ffffffff8109c248>] ? sync_buffer+0x0/0x40
>>>  [<ffffffff8137a4fc>] ? io_schedule+0x2d/0x3a
>>>  [<ffffffff8109c283>] ? sync_buffer+0x3b/0x40
>>>  [<ffffffff8137a879>] ? __wait_on_bit+0x41/0x70
>>>  [<ffffffff8109c248>] ? sync_buffer+0x0/0x40
>>>  [<ffffffff8137a913>] ? out_of_line_wait_on_bit+0x6b/0x77
>>>  [<ffffffff810438b2>] ? wake_bit_function+0x0/0x23
>>>  [<ffffffff8109c637>] ? sync_dirty_buffer+0x72/0xaa
>>>  [<ffffffff81131b8e>] ? journal_commit_transaction+0xa74/0xde2
>>>  [<ffffffff8103abcc>] ? lock_timer_base+0x26/0x4b
>>>  [<ffffffff81043884>] ? autoremove_wake_function+0x0/0x2e
>>>  [<ffffffff81134804>] ? kjournald+0xe3/0x206
>>>  [<ffffffff81043884>] ? autoremove_wake_function+0x0/0x2e
>>>  [<ffffffff81134721>] ? kjournald+0x0/0x206
>>>  [<ffffffff81043591>] ? kthread+0x8b/0x93
>>>  [<ffffffff8100bd3a>] ? child_rip+0xa/0x20
>>>  [<ffffffff81043506>] ? kthread+0x0/0x93
>>>  [<ffffffff8100bd30>] ? child_rip+0x0/0x20
>>> livecd ~ #
>>>
>>
>
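
For comparing how long the stalls get on different machines, the
hung-task lines in the dmesg output quoted above can be pulled out
mechanically.  This is only a sketch: the regular expression is written
against the single "INFO: task ... blocked for more than N seconds"
line shown in this thread, and the function name is made up for
illustration.

```python
import re

# Matches lines such as:
#   INFO: task kjournald:17466 blocked for more than 120 seconds.
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<comm>.+?):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(log_text):
    """Return a list of (task_name, pid, seconds) for each hung-task report."""
    return [
        (m["comm"], int(m["pid"]), int(m["secs"]))
        for m in HUNG_TASK_RE.finditer(log_text)
    ]


if __name__ == "__main__":
    sample = (
        ">>> INFO: task kjournald:17466 blocked for more than 120 seconds.\n"
        ">>> kjournald     D ffff8800280bbe00     0 17466      2 0x00000000\n"
    )
    print(find_hung_tasks(sample))  # [('kjournald', 17466, 120)]
```

The pattern searches anywhere in a line, so quote prefixes or dmesg
timestamps in front of "INFO:" do not get in the way.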