From: Linas Vepstas <linas@austin.ibm.com>
To: Paul Mackerras <paulus@samba.org>
Cc: anton@samba.org, akpm@osdl.org, linuxppc64-dev@ozlabs.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] PPC64: EEH Recovery
Date: Thu, 20 Jan 2005 16:39:16 -0600 [thread overview]
Message-ID: <20050120223916.GJ9140@austin.ibm.com> (raw)
In-Reply-To: <16877.63693.915740.385920@cargo.ozlabs.ibm.com>
On Wed, Jan 19, 2005 at 05:06:05PM +1100, Paul Mackerras was heard to remark:
> Linas Vepstas writes:
>
> > p.s. It was not clear to me if the EEH patch previously sent
> > (6 January 2005, same subject line) will be wending its way into
> > the main Torvalds kernel tree, or not. I hadn't really gotten
> > confirmation one way or another.
>
> I'm not really totally happy with it yet, on a number of fronts:
>
> 1. You're adding more PCI-specific stuff to the device_node struct,
> which I don't like. I would prefer that the device_node tree
> contains basically just what we get from OF, and that we have a
> separate struct for storing ppc64-specific information for each PCI
> device. Fixing that is outside the scope of your patch, though.
I wrote this down on my to-do list. Its the sort of thing that
evaporates from my consciousness when other things come along,
but I'll give it a shot.
> 2. I don't see why the device nodes for the PCI subtree being reset
> would go away, and thus I don't see the need for your eeh_cfg_tree
> struct.
Its not the reset, its the hot-plug remove. The hot plug code assumes
that you are going to physically remove the device from the slot, so
it removes the device_node as part of the "unconfig".
Of course, I found this out only after performing a null-pointer deref.
Note only does the node go away, but all of the various pointers it holds
are zeroed in the process.
The cfg tree holds on to those pointers, so that I wouldn't have to
muck with the device_node removal code to do something tricky.
> 3. Is there a good reason why we can't use the assigned-addresses
> property on the relevant device tree nodes to tell us what to set
> the BARs to?
Yes, the reason is that after a reset, that property doesn't hold any
decent data. I discussed this with the firmware developers, and thier
response was that it is the kernel's responsibility to compute
(or save/restore) such values. (Except for bridges, which they will do for us).
> 4. I think the 5 second sleep is quite bogus, and shows that we have
> the flow of control wrong.
:) Yes, well, indeed it is. Don't look at me, not my idea.
> In particular I think it should be a
> userland write to a sysfs file that kicks off the restart process
> rather than it just happening after 5 seconds. Anyway, what
> process or thread is executing that 5 second sleep? Is it keventd
> or something?
Its a workqueue.
> 5. AFAICS userland will get an unplug notification for the device, but
> nothing to indicate that is due to an EEH slot isolation event. I
> think userland should be told about EEH events.
In principle, I'd agree. In practice, this would seem to require changes
or additions or enhancements to udev that I don't quite understand, as
well as potential changes to udev scripts. Maybe I don't understand
sysfs sufficiently well. I am very tempted to punt on this, and wait
for the Intel-backed PCI-E code to get to this point, and then do whatever
they're doing.
--linas
next prev parent reply other threads:[~2005-01-20 22:39 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2005-01-06 19:24 [PATCH] PPC64: EEH Recovery Linas Vepstas
2005-01-17 20:14 ` Linas Vepstas
2005-01-19 6:06 ` Paul Mackerras
2005-01-19 16:00 ` Nathan Fontenot
2005-01-20 22:39 ` Linas Vepstas [this message]
2005-01-21 2:50 ` Paul Mackerras
2005-01-20 22:48 ` Linas Vepstas
-- strict thread matches above, loose matches on Subject: below --
2004-11-17 23:52 Linas Vepstas
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20050120223916.GJ9140@austin.ibm.com \
--to=linas@austin.ibm.com \
--cc=akpm@osdl.org \
--cc=anton@samba.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linuxppc64-dev@ozlabs.org \
--cc=paulus@samba.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.