public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Ross Biro <rossb@google.com>
To: Andre Hedrick <andre@linux-ide.org>
Cc: linux-kernel@vger.kernel.org
Subject: Re: BUG: [2.4.18+] IDE Race Condition
Date: Tue, 28 Jan 2003 09:42:21 -0800	[thread overview]
Message-ID: <3E36C0FD.1020002@google.com> (raw)
In-Reply-To: Pine.LNX.4.10.10301280918420.9272-100000@master.linux-ide.org


Sorry, we are not using sub-microsecond timers, we are just using the 
latest-greatest-bugiest hard drives and lots of them.  We can see 
problems like this because we can run on so many machines at once.  Even 
something that has a small percentage chance of showing up, we can see 
pretty quick.

We do have a bus analyzer that can time things down to about 4ns.  That 
has been invaluable for tracking down some of these issues.

    Ross

Andre Hedrick wrote:

>Ross,
>
>How did you create the sub-microsecond timers to profile the device/driver
>behavior?  I had been working on this for a while but little success.
>This is one of the key methods to predict device failure.
>
>One of the goals of the prebuilder it find slam prebuild commands down the
>pipes to force breakages and races to show up.
>
>So you have a possible solution?
>
>Cheers,
>
>Andre Hedrick
>LAD Storage Consulting Group
>
>
>On Tue, 28 Jan 2003, Ross Biro wrote:
>
>  
>
>>Easy, it happens all the time, there are just no tests in place to see it.
>>
>>We are keeping a histogram of how long every IDE DMA transfer takes 
>>place.  In ide_intr we record the time and set the start time in 
>>ide_drive_t to 0.  In ide_dma_proc, ide_dma_begin right AFTER activating 
>>the dma, we store the current value of jiffies in start time in ide_drive_t.
>>
>>In both those places we check to make sure that the value of start_time 
>>is sensible.  In ide_dma_begin, we make sure it's 0 and in ide_dma_intr, 
>>we make sure its non-zero.  Because of this race condition, we often saw 
>>DMAs finish before they began.
>>
>>In the normal kernel, the only thing I can see that could go wrong would 
>>be that the printk
>>
>>printk("%s: ide_intr: huh? expected NULL handler on exit\n", drive->name);
>>
>>in ide_intr  could be triggered.  I've never seen it happen, but I 
>>believe with enough effort, it could be made to occur.
>>
>>    Ross
>>
>>
>>Andre Hedrick wrote:
>>
>>    
>>
>>>Okay, how do you reproduce it to see the effects?
>>>
>>>On Mon, 27 Jan 2003, Ross Biro wrote:
>>>
>>> 
>>>
>>>      
>>>
>>>>The net effect of this race condition and the other one I spotted is 
>>>>that you may see some interesting messages in your log file and you can 
>>>>detect the race condition if you look for it hard enough.  I don't 
>>>>currently see any bad effects.
>>>>
>>>>   Ross
>>>>
>>>>Ross Biro wrote:
>>>>
>>>>   
>>>>
>>>>        
>>>>
>>>>>There is at least one more IDE race condition in 2.4.18 and 
>>>>>2.4.21-pre3. Basically the interrupt for the controller being serviced 
>>>>>is left on while setting up the next command.  I'm not sure how much 
>>>>>trouble it can cause but it does lead to some interesting stack traces.
>>>>>
>>>>>The condition
>>>>>if (masked_irq && hwif->irq != masked_irq)
>>>>>in ide_do_request should be replaced with
>>>>>if (!masked_irq || hwif->irq != masked_irq)
>>>>>in two places.
>>>>>
>>>>>     
>>>>>
>>>>>          
>>>>>
>>>>-
>>>>To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>>>>the body of a message to majordomo@vger.kernel.org
>>>>More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>Please read the FAQ at  http://www.tux.org/lkml/
>>>>
>>>>   
>>>>
>>>>        
>>>>
>>>Andre Hedrick
>>>LAD Storage Consulting Group
>>>
>>> 
>>>
>>>      
>>>
>
>  
>




  reply	other threads:[~2003-01-28 17:33 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2003-01-27 17:11 BUG: [2.4.18+] IDE Race Condition Ross Biro
2003-01-27 17:34 ` Ross Biro
2003-01-28  2:46   ` Andre Hedrick
2003-01-28 16:48     ` Ross Biro
2003-01-28 17:28       ` Andre Hedrick
2003-01-28 17:42         ` Ross Biro [this message]
2003-01-28 18:01           ` Andre Hedrick
2003-01-30 17:34 ` Alan Cox
2003-01-30 16:58   ` Ross Biro
2003-01-30 18:01     ` Alan Cox

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3E36C0FD.1020002@google.com \
    --to=rossb@google.com \
    --cc=andre@linux-ide.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox