linux-hotplug.vger.kernel.org archive mirror
From: prarit (prarit@prarit.com) <prarit@verizon.net>
To: linux-hotplug@vger.kernel.org
Subject: Re: Latest bk kernel does not properly free PCI IO & MEM allocations
Date: Sat, 12 Mar 2005 17:04:54 +0000
Message-ID: <0ID900FO80S63RF0@vms046.mailsrvcs.net>
In-Reply-To: <422F42A9.7050009@sgi.com>

Greg,

Thanks for the reply.  I'm at home today and don't have access to my SGI
account.  If you want to fwd this to the mailing list please feel free.
If I don't see it Monday morning, I'll fwd it back to the list then.

>> What I meant to say was pci_free_resources calls release_resource, where
>> release_region calls __release_region.  __release_region is called a 
>> "legacy" function?
 
> What documentation calls it that?

Hrmm ... I distinctly remember seeing that somewhere.  I'll find it again.
I also recall the term "compatibility cruft" used ... in ioport.h?

> I don't like the wording at all, no one will notice it (trust me I've
> tried stuff like this before...)

Yeah ... I know.  I've tried it before too.  I've been working in this
space for a long time and it's clear to me that it needs a compile time 
#define DEBUG printk option.  Issues within the resource allocation code 
are impossible to debug without sticking a large # of printk's 
throughout the code.
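
A minimal sketch of what such a compile-time option could look like, written
as plain userspace C rather than kernel code.  The names RES_DEBUG, res_dbg
and release_range are hypothetical and do not come from the kernel source;
fprintf stands in for printk.

```c
#include <stdio.h>

/*
 * Build with -DRES_DEBUG to turn tracing on.  RES_DEBUG and res_dbg() are
 * hypothetical names used only for this illustration.
 */
#ifdef RES_DEBUG
#define res_dbg(...) fprintf(stderr, "resource: " __VA_ARGS__)
#else
/* The if (0) keeps the format string type-checked while emitting no output. */
#define res_dbg(...) do { if (0) fprintf(stderr, "resource: " __VA_ARGS__); } while (0)
#endif

/* Hypothetical release path: rejects empty ranges, traces the rest. */
static int release_range(unsigned long start, unsigned long len)
{
    if (len == 0) {
        res_dbg("refusing to release empty range at %#lx\n", start);
        return -1;
    }
    res_dbg("release %#lx-%#lx\n", start, start + len - 1);
    return 0;
}
```

With tracing compiled out the calls cost nothing at run time, so the trace
points can stay in the code permanently instead of being added and removed
for each debugging session.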

> But what's really curious, is why no one has hit this before.  Nothing
> has changed recently in this area of the kernel.  Did this used to work
> before?  Does it work just fine without the patch for other drivers?

I haven't gone back through kernels to determine where this "breakage"
occurred, but I do know that it is in the 2.6.9 kernel as I have been 
developing on a RHEL4 platform.

This is how I stumbled across this:  As you know, I'm building and
testing an SGI Altix Hotplug Driver.

While testing I started up a few memory stress and IO stress tests
*that did not involve the card I was targeting for the test* and
approximately 5% of the time I hit a NULL pointer oops.  

At first glance, I thought the issue was within the sysfs/proc
filesystems or pci_free_resources, as that's where the oopses were
-- with more inspection I realized that didn't make sense and
wasn't the case.

A few things have to happen (in this precise order) for someone
to hit this issue.  Note that I have been running on 16, 32, and
64 cpu systems so it is very likely that #4 below is due to another CPU.

1.  HP slot is disabled via sysfs.
2.  PCI driver must call release_regions
3.  release_regions kfree's resource structures
4.  Context switch/Another CPU: resource area is alloc'd by 
    something else.
5.  pci_free_resources is called

Possible oops right here.
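
The sequence above can be modelled in userspace C.  This is only a sketch
under stated assumptions: struct fake_resource, the one-slot pool and
simulate() are hypothetical stand-ins, not the kernel's struct resource or
allocator.  The pool is static so that reading through the stale pointer is
well defined here, unlike the real kfree'd memory.

```c
#include <string.h>

/* Hypothetical stand-in for a kernel resource; not the real struct resource. */
struct fake_resource {
    unsigned long start;
    unsigned long end;
    char name[8];
};

/* One-slot pool: makes "memory gets reused" deterministic for the demo. */
static struct fake_resource slot;
static int slot_in_use;

static struct fake_resource *pool_alloc(void)
{
    if (slot_in_use)
        return NULL;
    slot_in_use = 1;
    memset(&slot, 0, sizeof(slot));
    return &slot;
}

static void pool_free(struct fake_resource *r)
{
    (void)r;
    slot_in_use = 0;
}

/*
 * Returns 1 if the stale pointer still describes the original region at
 * step 5, and 0 if the memory was reused in between (step 4).
 */
static int simulate(int other_allocation_happens)
{
    struct fake_resource *res, *stale;

    slot_in_use = 0;                    /* start each run with an empty pool */
    res = pool_alloc();                 /* region claimed by the driver */
    res->start = 0x1000;
    res->end = 0x1fff;
    strcpy(res->name, "card");
    stale = res;                        /* second path keeps its own pointer */

    pool_free(res);                     /* steps 1-3: region released and freed */

    if (other_allocation_happens) {     /* step 4: someone else gets the memory */
        struct fake_resource *other = pool_alloc();
        strcpy(other->name, "qla");
    }

    /* step 5: the second release path looks at the stale pointer */
    return stale->start == 0x1000 && strcmp(stale->name, "card") == 0;
}
```

In this model simulate(0) finds the old contents intact, while simulate(1)
finds someone else's data where the resource used to be.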

I've also stumbled across the case where pci_free_resources
tried to free non-existent regions -- dumping the memory address indicated
in one case that I was looking at char data ... I recall that it
looked like the word "qla".  I dumped /proc/iomem
and /proc/ioports and incurred an oops in /proc.

Suppose #4 above doesn't happen.  Step #5 occurs and no one (user or 
system) is the wiser.  The memory is still intact -- no oops.

Additionally, I've seen people using PCI Hotplug in the field.
Typically, sysadmins quiesce the system before removing a card.
That makes #4 just that much less likely to occur.

:) :)  Want the long explanation? :) :)

P.


