LinuxPPC-Dev Archive on lore.kernel.org

LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed

* Re: [PATCH 0/3] Enable multiple MSI feature in pSeries
From: Michael Ellerman @ 2013-02-04  3:23 UTC (permalink / raw)
  To: Mike Qiu; +Cc: tglx, linuxppc-dev, linux-kernel
In-Reply-To: <1358235536-32741-1-git-send-email-qiudayu@linux.vnet.ibm.com>

On Tue, 2013-01-15 at 15:38 +0800, Mike Qiu wrote:
> Currently, multiple MSI feature hasn't been enabled in pSeries,
> These patches try to enbale this feature.

Hi Mike,

> These patches have been tested by using ipr driver, and the driver patch
> has been made by Wen Xiong <wenxiong@linux.vnet.ibm.com>:

So who wrote these patches? Normally we would expect the original author
to post the patches if at all possible.

> [PATCH 0/7] Add support for new IBM SAS controllers

I would like to see the full series, including the driver enablement.

> Test platform: One partition of pSeries with one cpu core(4 SMTs) and 
>                RAID bus controller: IBM PCI-E IPR SAS Adapter (ASIC) in POWER7
> OS version: SUSE Linux Enterprise Server 11 SP2  (ppc64) with 3.8-rc3 kernel 
> 
> IRQ 21 and 22 are assigned to the ipr device which support 2 mutiple MSI.
> 
> The test results is shown by 'cat /proc/interrups':
>           CPU0       CPU1       CPU2       CPU3       
> 21:          6          5          5          5      XICS Level     host1-0
> 22:        817        814        816        813      XICS Level     host1-1

This shows that you are correctly configuring two MSIs.

But the key advantage of using multiple interrupts is to distribute load
across CPUs and improve performance. So I would like to see some
performance numbers that show that there is a real benefit for all the
extra complexity in the code.

cheers

^ permalink raw reply

* Re: [PATCH?] Move ACPI device nodes under /sys/firmware/acpi (was: Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework)
From: Greg KH @ 2013-02-04  1:24 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-s390, Toshi Kani, jiang.liu, wency, linux-acpi, yinghai,
	linux-kernel, linux-mm, isimatu.yasuaki, srivatsa.bhat, guohanjun,
	bhelgaas, akpm, linuxppc-dev, lenb
In-Reply-To: <2806030.VWUMy6F7lm@vostro.rjw.lan>

On Sat, Feb 02, 2013 at 11:18:20PM +0100, Rafael J. Wysocki wrote:
> On Saturday, February 02, 2013 09:15:37 PM Rafael J. Wysocki wrote:
> > On Saturday, February 02, 2013 03:58:01 PM Greg KH wrote:
> [...]
> > 
> > > I know it's more complicated with these types of devices, and I think we
> > > are getting closer to the correct solution, I just don't want to ever
> > > see duplicate devices in the driver model for the same physical device.
> > 
> > Do you mean two things based on struct device for the same hardware component?
> > That's been happening already pretty much forever for every PCI device known
> > to the ACPI layer, for PNP and many others.  However, those ACPI things are (or
> > rather should be, but we're going to clean that up) only for convenience (to be
> > able to see the namespace structure and related things in sysfs).  So the stuff
> > under /sys/devices/LNXSYSTM\:00/ is not "real".  In my view it shouldn't even
> > be under /sys/devices/ (/sys/firmware/acpi/ seems to be a better place for it),
> > but that may be difficult to change without breaking user space (maybe we can
> > just symlink it from /sys/devices/ or something).  And the ACPI bus type
> > shouldn't even exist in my opinion.
> 
> Well, well.
> 
> In fact, the appended patch moves the whole ACPI device nodes tree under
> /sys/firmware/acpi/ and I'm not seeing any negative consequences of that on my
> test box (events work and so on).  User space is quite new on it, though, and
> the patch is hackish.

Try booting a RHEL 5 image on this type of kernel, or some old Fedora
releases, they were sensitive to changes in sysfs.

greg k-h

^ permalink raw reply

* Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework
From: Greg KH @ 2013-02-04  1:23 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-s390, Toshi Kani, jiang.liu, wency, linux-acpi, yinghai,
	linux-kernel, linux-mm, isimatu.yasuaki, srivatsa.bhat, guohanjun,
	bhelgaas, akpm, linuxppc-dev, lenb
In-Reply-To: <1810611.i6Sc4oLaux@vostro.rjw.lan>

On Sat, Feb 02, 2013 at 09:15:37PM +0100, Rafael J. Wysocki wrote:
> On Saturday, February 02, 2013 03:58:01 PM Greg KH wrote:
> > On Fri, Feb 01, 2013 at 11:12:59PM +0100, Rafael J. Wysocki wrote:
> > > On Friday, February 01, 2013 08:23:12 AM Greg KH wrote:
> > > > On Thu, Jan 31, 2013 at 09:54:51PM +0100, Rafael J. Wysocki wrote:
> > > > > > > But, again, I'm going to ask why you aren't using the existing cpu /
> > > > > > > memory / bridge / node devices that we have in the kernel.  Please use
> > > > > > > them, or give me a _really_ good reason why they will not work.
> > > > > > 
> > > > > > We cannot use the existing system devices or ACPI devices here.  During
> > > > > > hot-plug, ACPI handler sets this shp_device info, so that cpu and memory
> > > > > > handlers (drivers/cpu.c and mm/memory_hotplug.c) can obtain their target
> > > > > > device information in a platform-neutral way.  During hot-add, we first
> > > > > > creates an ACPI device node (i.e. device under /sys/bus/acpi/devices),
> > > > > > but platform-neutral modules cannot use them as they are ACPI-specific.
> > > > > 
> > > > > But suppose we're smart and have ACPI scan handlers that will create
> > > > > "physical" device nodes for those devices during the ACPI namespace scan.
> > > > > Then, the platform-neutral nodes will be able to bind to those "physical"
> > > > > nodes.  Moreover, it should be possible to get a hierarchy of device objects
> > > > > this way that will reflect all of the dependencies we need to take into
> > > > > account during hot-add and hot-remove operations.  That may not be what we
> > > > > have today, but I don't see any *fundamental* obstacles preventing us from
> > > > > using this approach.
> > > > 
> > > > I would _much_ rather see that be the solution here as I think it is the
> > > > proper one.
> > > > 
> > > > > This is already done for PCI host bridges and platform devices and I don't
> > > > > see why we can't do that for the other types of devices too.
> > > > 
> > > > I agree.
> > > > 
> > > > > The only missing piece I see is a way to handle the "eject" problem, i.e.
> > > > > when we try do eject a device at the top of a subtree and need to tear down
> > > > > the entire subtree below it, but if that's going to lead to a system crash,
> > > > > for example, we want to cancel the eject.  It seems to me that we'll need some
> > > > > help from the driver core here.
> > > > 
> > > > I say do what we always have done here, if the user asked us to tear
> > > > something down, let it happen as they are the ones that know best :)
> > > > 
> > > > Seriously, I guess this gets back to the "fail disconnect" idea that the
> > > > ACPI developers keep harping on.  I thought we already resolved this
> > > > properly by having them implement it in their bus code, no reason the
> > > > same thing couldn't happen here, right?
> > > 
> > > Not really. :-)  We haven't ever resolved that particular issue I'm afraid.
> > 
> > Ah, I didn't realize that.
> > 
> > > > I don't think the core needs to do anything special, but if so, I'll be glad
> > > > to review it.
> > > 
> > > OK, so this is the use case.  We have "eject" defined for something like
> > > a container with a number of CPU cores, PCI host bridge, and a memory
> > > controller under it.  And a few pretty much arbitrary I/O devices as a bonus.
> > > 
> > > Now, there's a button on the system case labeled as "Eject" and if that button
> > > is pressed, we're supposed to _try_ to eject all of those things at once.  We
> > > are allowed to fail that request, though, if that's problematic for some
> > > reason, but we're supposed to let the BIOS know about that.
> > > 
> > > Do you seriously think that if that button is pressed, we should just proceed
> > > with removing all that stuff no matter what?  That'd be kind of like Russian
> > > roulette for whoever pressed that button, because s/he could only press it and
> > > wait for the system to either crash or not.  Or maybe to crash a bit later
> > > because of some delayed stuff that would hit one of those devices that had just
> > > gone.  Surely not a situation any admin of a high-availability system would
> > > like to be in. :-)
> > > 
> > > Quite frankly, I have no idea how that can be addressed in a single bus type,
> > > let alone ACPI (which is not even a proper bus type, just something pretending
> > > to be one).
> > 
> > You don't have it as a single bus type, you have a controller somewhere,
> > off of the bus being destroyed, that handles sending remove events to
> > the device and tearing everything down.  PCI does this from the very
> > beginning.
> 
> Yes, but those are just remove events and we can only see how destructive they
> were after the removal.  The point is to be able to figure out whether or not
> we *want* to do the removal in the first place.

Yes, but, you will always race if you try to test to see if you can shut
down a device and then trying to do it.  So walking the bus ahead of
time isn't a good idea.

And, we really don't have a viable way to recover if disconnect() fails,
do we.  What do we do in that situation, restore the other devices we
disconnected successfully?  How do we remember/know what they were?

PCI hotplug almost had this same problem until the designers finally
realized that they just had to accept the fact that removing a PCI
device could either happen by:
	- a user yanking out the device, at which time the OS better
	  clean up properly no matter what happens
	- the user asked nicely to remove a device, and the OS can take
	  as long as it wants to complete that action, including
	  stalling for noticable amounts of time before eventually,
	  always letting the action succeed.

I think the second thing is what you have to do here.  If a user tells
the OS it wants to remove these devices, you better do it.  If you
can't, because memory is being used by someone else, either move them
off, or just hope that nothing bad happens, before the user gets
frustrated and yanks out the CPU/memory module themselves physically :)

> Say you have a computing node which signals a hardware problem in a processor
> package (the container with CPU cores, memory, PCI host bridge etc.).  You
> may want to eject that package, but you don't want to kill the system this
> way.  So if the eject is doable, it is very much desirable to do it, but if it
> is not doable, you'd rather shut the box down and do the replacement afterward.
> That may be costly, however (maybe weeks of computations), so it should be
> avoided if possible, but not at the expense of crashing the box if the eject
> doesn't work out.

These same "situations" came up for PCI hotplug, and I still say the
same resolution there holds true, as described above.  The user wants to
remove something, so let them do it.  They always know best, and get mad
at us if we think otherwise :)

What does the ACPI spec say about this type of thing?  Surely the same
people that did the PCI Hotplug spec were consulted when doing this part
of the spec, right?  Yeah, I know, I can dream...

> > I know it's more complicated with these types of devices, and I think we
> > are getting closer to the correct solution, I just don't want to ever
> > see duplicate devices in the driver model for the same physical device.
> 
> Do you mean two things based on struct device for the same hardware component?
> That's been happening already pretty much forever for every PCI device known
> to the ACPI layer, for PNP and many others.  However, those ACPI things are (or
> rather should be, but we're going to clean that up) only for convenience (to be
> able to see the namespace structure and related things in sysfs).  So the stuff
> under /sys/devices/LNXSYSTM\:00/ is not "real".

Yes, I've never treated that as a "real" device because they (usually)
didn't ever bind to the "real" driver that controlled the device and how
it talked to the rest of the os (like a USB device for example.)  I
always thought just of it as a "shadow" of the firmware image, nothing
that should be directly operated on if at all possible.

But, as you are pointing out, maybe this needs to be changed.  Having
users have to look in one part of the tree for one interface to a
device, and another totally different part for a different interface to
the same physical device is crazy, don't you agree?

As to how to solve it, I really have no idea, I don't know ACPI that
well at all, and honestly, don't want to, I want to keep what little
hair I have left...

> In my view it shouldn't even
> be under /sys/devices/ (/sys/firmware/acpi/ seems to be a better place for it),

I agree.

> but that may be difficult to change without breaking user space (maybe we can
> just symlink it from /sys/devices/ or something).  And the ACPI bus type
> shouldn't even exist in my opinion.
> 
> There's much confusion in there and much work to clean that up, I agree, but
> that's kind of separate from the hotplug thing.

I agree as well.

Best of luck.

greg k-h

^ permalink raw reply

* Re: 3.7-rc7: BUG: MAX_STACK_TRACE_ENTRIES too low!
From: Christian Kujau @ 2013-02-04  0:38 UTC (permalink / raw)
  To: Li Zhong; +Cc: linuxppc-dev, LKML
In-Reply-To: <alpine.DEB.2.01.1301271451390.7378@trent.utfs.org>

On Sun, 27 Jan 2013 at 14:56, Christian Kujau wrote:
> Hm, is there no chance to get this into 3.8? I've been running with this 
> patch applied since 3.7-rc7 and it got rid of this 
> "MAX_STACK_TRACE_ENTRIES too low" message. I've just upgraded to 3.8-rc5 
> and it's still not in mainline :-\

Hah! I just noticed that it got merged the next day - Thanks!

Christian.
-- 
BOFH excuse #80:

That's a great computer you have there; have you considered how it would work as a BSD machine?

^ permalink raw reply

* Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework
From: Toshi Kani @ 2013-02-04  0:28 UTC (permalink / raw)
  To: Greg KH
  Cc: linux-s390@vger.kernel.org, jiang.liu@huawei.com,
	wency@cn.fujitsu.com, linux-mm@kvack.org, yinghai@kernel.org,
	linux-kernel@vger.kernel.org, Rafael J. Wysocki,
	linux-acpi@vger.kernel.org, isimatu.yasuaki@jp.fujitsu.com,
	srivatsa.bhat@linux.vnet.ibm.com, guohanjun@huawei.com,
	bhelgaas@google.com, akpm@linux-foundation.org,
	linuxppc-dev@lists.ozlabs.org, lenb@kernel.org
In-Reply-To: <20130202150154.GC1434@kroah.com>

On Sat, 2013-02-02 at 16:01 +0100, Greg KH wrote:
> On Fri, Feb 01, 2013 at 01:40:10PM -0700, Toshi Kani wrote:
> > On Fri, 2013-02-01 at 07:30 +0000, Greg KH wrote:
> > > On Thu, Jan 31, 2013 at 06:32:18PM -0700, Toshi Kani wrote:
> > >  > This is already done for PCI host bridges and platform devices and I don't
> > > > > see why we can't do that for the other types of devices too.
> > > > > 
> > > > > The only missing piece I see is a way to handle the "eject" problem, i.e.
> > > > > when we try do eject a device at the top of a subtree and need to tear down
> > > > > the entire subtree below it, but if that's going to lead to a system crash,
> > > > > for example, we want to cancel the eject.  It seems to me that we'll need some
> > > > > help from the driver core here.
> > > > 
> > > > There are three different approaches suggested for system device
> > > > hot-plug:
> > > >  A. Proceed within system device bus scan.
> > > >  B. Proceed within ACPI bus scan.
> > > >  C. Proceed with a sequence (as a mini-boot).
> > > > 
> > > > Option A uses system devices as tokens, option B uses acpi devices as
> > > > tokens, and option C uses resource tables as tokens, for their handlers.
> > > > 
> > > > Here is summary of key questions & answers so far.  I hope this
> > > > clarifies why I am suggesting option 3.
> > > > 
> > > > 1. What are the system devices?
> > > > System devices provide system-wide core computing resources, which are
> > > > essential to compose a computer system.  System devices are not
> > > > connected to any particular standard buses.
> > > 
> > > Not a problem, lots of devices are not connected to any "particular
> > > standard busses".  All this means is that system devices are connected
> > > to the "system" bus, nothing more.
> > 
> > Can you give me a few examples of other devices that support hotplug and
> > are not connected to any particular buses?  I will investigate them to
> > see how they are managed to support hotplug.
> 
> Any device that is attached to any bus in the driver model can be
> hotunplugged from userspace by telling it to be "unbound" from the
> driver controlling it.  Try it for any platform device in your system to
> see how it happens.

The unbind operation, as I understand from you, is to detach a driver
from a device.  Yes, unbinding can be done for any devices.  It is
however different from hot-plug operation, which unplugs a device.

Today, the unbind operation to an ACPI cpu/memory devices causes
hot-unplug (offline) operation to them, which is one of the major issues
for us since unbind cannot fail.  This patchset addresses this issue by
making the unbind operation of ACPI cpu/memory devices to do the
unbinding only.  ACPI drivers no longer control cpu and memory as they
are supposed to be controlled by their drivers, cpu and memory modules.
The current hotplug code requires putting all device control stuff into
ACPI, which this patchset is trying to fix it.

> > > > 2. Why are the system devices special?
> > > > The system devices are initialized during early boot-time, by multiple
> > > > subsystems, from the boot-up sequence, in pre-defined order.  They
> > > > provide low-level services to enable other subsystems to come up.
> > > 
> > > Sorry, no, that doesn't mean they are special, nothing here is unique
> > > for the point of view of the driver model from any other device or bus.
> > 
> > I think system devices are unique in a sense that they are initialized
> > before drivers run.
> 
> No, most all devices are "initialized" before a driver runs on it, USB
> is one such example, PCI another, and I'm pretty sure that there are
> others.

USB devices can be initialized after the USB bus driver is initialized.
Similarly, PCI devices can be initialized after the PCI bus driver is
initialized.  However, CPU and memory are initialized without any
dependency to their bus driver since there is no such thing.

In addition, CPU and memory have two drivers -- their actual
drivers/subsystems and their ACPI drivers.  Their actual
drivers/subsystems initialize cpu and memory.  The ACPI drivers
initialize driver-specific data of their ACPI nodes.  During hot-plug
operation, however, the current code requires these ACPI drivers to do
their hot-plug operations.  This patchset keeps them separated during
hot-plug and let their actual drivers/subsystems to do the job.

> > > If you need to initialize the driver
> > > core earlier, then do so.  Or, even better, just wait until enough of
> > > the system has come up and then go initialize all of the devices you
> > > have found so far as part of your boot process.
> > 
> > They are pseudo drivers that provide sysfs entry points of cpu and
> > memory.  They do not actually initialize cpu and memory.  I do not think
> > initializing cpu and memory fits into the driver model either, since
> > drivers should run after cpu and memory are initialized.
> 
> We already represent CPUs in the sysfs tree, don't represent them in two
> different places with two different structures.  Use the existing ones
> please.

This patchset does not make any changes to sysfs.  It does however make
the system drivers to control the system devices, and ACPI to do the
ACPI stuff only.

The boot sequence calls the following steps to initialize memory and
cpu, as well as their acpi nodes.  These steps are independent, i.e. #1
and #2 run without ACPI.

 1. mm -> initialize memory -> create sysfs system/memory
 2. smp/cpu -> initialize cpu -> create sysfs system/cpu
 3. acpi core -> acpi scan -> create sysfs acpi nodes

While there are 3 separate steps at boot, the current hotplug code tries
to do everything in #3.  Therefore, this patchset provides a sequencer
(which is similar to the boot sequence) to run these steps during a
hot-plug operation as well.  Hence, we have consistent steps and role
mode between boot and hot-plug operations.

> > > None of the above things you have stated seem to have anything to do
> > > with your proposed patch, so I don't understand why you have mentioned
> > > them...
> > 
> > You suggested option A before, which uses system bus scan to initialize
> > all system devices at boot time as well as hot-plug.  I tried to say
> > that this option would not be doable.
> 
> I haven't yet been convinced otherwise, sorry.  Please prove me wrong :)

This patchset enables the system drivers (i.e. cpu and memory drivers)
to do the hot-plug operations, and ACPI core to do the ACPI stuff.

We should not go with the system driver only approach since we do need
to use ACPI stuff.  Also, we should not go with the ACPI only approach
since we do need to use the system (and other) drivers.  This patchset
provides a sequencer to manage these steps across multiple subsystems.

Thanks,
-Toshi

^ permalink raw reply

* Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework
From: Rafael J. Wysocki @ 2013-02-03 20:44 UTC (permalink / raw)
  To: Greg KH
  Cc: linux-s390, Toshi Kani, jiang.liu, wency, linux-acpi, yinghai,
	linux-kernel, linux-mm, isimatu.yasuaki, srivatsa.bhat, guohanjun,
	bhelgaas, akpm, linuxppc-dev, lenb
In-Reply-To: <1810611.i6Sc4oLaux@vostro.rjw.lan>

On Saturday, February 02, 2013 09:15:37 PM Rafael J. Wysocki wrote:
> On Saturday, February 02, 2013 03:58:01 PM Greg KH wrote:
> > On Fri, Feb 01, 2013 at 11:12:59PM +0100, Rafael J. Wysocki wrote:
> > > On Friday, February 01, 2013 08:23:12 AM Greg KH wrote:
> > > > On Thu, Jan 31, 2013 at 09:54:51PM +0100, Rafael J. Wysocki wrote:
> > > > > > > But, again, I'm going to ask why you aren't using the existing cpu /
> > > > > > > memory / bridge / node devices that we have in the kernel.  Please use
> > > > > > > them, or give me a _really_ good reason why they will not work.
> > > > > > 
> > > > > > We cannot use the existing system devices or ACPI devices here.  During
> > > > > > hot-plug, ACPI handler sets this shp_device info, so that cpu and memory
> > > > > > handlers (drivers/cpu.c and mm/memory_hotplug.c) can obtain their target
> > > > > > device information in a platform-neutral way.  During hot-add, we first
> > > > > > creates an ACPI device node (i.e. device under /sys/bus/acpi/devices),
> > > > > > but platform-neutral modules cannot use them as they are ACPI-specific.
> > > > > 
> > > > > But suppose we're smart and have ACPI scan handlers that will create
> > > > > "physical" device nodes for those devices during the ACPI namespace scan.
> > > > > Then, the platform-neutral nodes will be able to bind to those "physical"
> > > > > nodes.  Moreover, it should be possible to get a hierarchy of device objects
> > > > > this way that will reflect all of the dependencies we need to take into
> > > > > account during hot-add and hot-remove operations.  That may not be what we
> > > > > have today, but I don't see any *fundamental* obstacles preventing us from
> > > > > using this approach.
> > > > 
> > > > I would _much_ rather see that be the solution here as I think it is the
> > > > proper one.
> > > > 
> > > > > This is already done for PCI host bridges and platform devices and I don't
> > > > > see why we can't do that for the other types of devices too.
> > > > 
> > > > I agree.
> > > > 
> > > > > The only missing piece I see is a way to handle the "eject" problem, i.e.
> > > > > when we try do eject a device at the top of a subtree and need to tear down
> > > > > the entire subtree below it, but if that's going to lead to a system crash,
> > > > > for example, we want to cancel the eject.  It seems to me that we'll need some
> > > > > help from the driver core here.
> > > > 
> > > > I say do what we always have done here, if the user asked us to tear
> > > > something down, let it happen as they are the ones that know best :)
> > > > 
> > > > Seriously, I guess this gets back to the "fail disconnect" idea that the
> > > > ACPI developers keep harping on.  I thought we already resolved this
> > > > properly by having them implement it in their bus code, no reason the
> > > > same thing couldn't happen here, right?
> > > 
> > > Not really. :-)  We haven't ever resolved that particular issue I'm afraid.
> > 
> > Ah, I didn't realize that.
> > 
> > > > I don't think the core needs to do anything special, but if so, I'll be glad
> > > > to review it.
> > > 
> > > OK, so this is the use case.  We have "eject" defined for something like
> > > a container with a number of CPU cores, PCI host bridge, and a memory
> > > controller under it.  And a few pretty much arbitrary I/O devices as a bonus.
> > > 
> > > Now, there's a button on the system case labeled as "Eject" and if that button
> > > is pressed, we're supposed to _try_ to eject all of those things at once.  We
> > > are allowed to fail that request, though, if that's problematic for some
> > > reason, but we're supposed to let the BIOS know about that.
> > > 
> > > Do you seriously think that if that button is pressed, we should just proceed
> > > with removing all that stuff no matter what?  That'd be kind of like Russian
> > > roulette for whoever pressed that button, because s/he could only press it and
> > > wait for the system to either crash or not.  Or maybe to crash a bit later
> > > because of some delayed stuff that would hit one of those devices that had just
> > > gone.  Surely not a situation any admin of a high-availability system would
> > > like to be in. :-)
> > > 
> > > Quite frankly, I have no idea how that can be addressed in a single bus type,
> > > let alone ACPI (which is not even a proper bus type, just something pretending
> > > to be one).
> > 
> > You don't have it as a single bus type, you have a controller somewhere,
> > off of the bus being destroyed, that handles sending remove events to
> > the device and tearing everything down.  PCI does this from the very
> > beginning.
> 
> Yes, but those are just remove events and we can only see how destructive they
> were after the removal.  The point is to be able to figure out whether or not
> we *want* to do the removal in the first place.
> 
> Say you have a computing node which signals a hardware problem in a processor
> package (the container with CPU cores, memory, PCI host bridge etc.).  You
> may want to eject that package, but you don't want to kill the system this
> way.  So if the eject is doable, it is very much desirable to do it, but if it
> is not doable, you'd rather shut the box down and do the replacement afterward.
> That may be costly, however (maybe weeks of computations), so it should be
> avoided if possible, but not at the expense of crashing the box if the eject
> doesn't work out.

It seems to me that we could handle that with the help of a new flag, say
"no_eject", in struct device, a global mutex, and a function that will walk
the given subtree of the device hierarchy and check if "no_eject" is set for
any devices in there.  Plus a global "no_eject" switch, perhaps.

To be more precise, suppose we have a "no_eject" flag that is set for a device
when the kernel is about to start using it in such a way that ejecting it would
lead to serious trouble (i.e. it is a memory module holding the kernel's page
tables or something like that).  Next, suppose that we have a function called,
say, "device_may_eject()" that will walk the subtree of the device hierarchy
starting at the given node and return false if any of the devices in there have
"no_eject" set.  Further, require that device_may_eject() has to be called
under a global mutex called something like "device_eject_lock".  Then, we can
arrange things as follows:

1. When a device is about to be used for such purposes that it shouldn't be
   ejected, the relevant code should:
  (a) Acquire device_eject_lock.
  (b) Check if the device is still there and go to (e) if not.
  (c) Do whatever needs to be done to the device.
  (d) Set the device's no_eject flag.
  (e) Release device_eject_lock.

2. When an eject operation is about to be carried out on a subtree of the device
   hierarchy, the eject code (be it ACPI or something else) should:
  (a) Acquire device_eject_lock.
  (b) Call device_may_eject() on the starting device and go to (d) if it
      returns false.
  (c) Carry out the eject (that includes calling .remove() from all of the
      involved drivers in partiular).
  (d) Release device_eject_lock.

3. When it is OK to eject the device again, the relevant code should just clear
   its "no_eject" flag under device_eject_lock.

If we want to synchronize that with such things like boot or system suspend,
a global "no_eject" switch can be used for that (it needs to be manipulated
under device_eject_lock) and one more step (check the global "no_eject") is
needed between 2(a) and 2(b).

Does it look like a reasonable approach?

Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply

* [PATCH v2 1/1] powerpc/85xx: Board support for ppa8548
From: Stef van Os @ 2013-02-03 19:39 UTC (permalink / raw)
  To: Kumar Gala, linuxppc-dev
  Cc: Scott Wood, Paul Mackerras, Timur Tabi, Stef van Os

Initial board support for the Prodrive PPA8548 AMC module. Board
is an MPC8548 AMC platform used in RapidIO systems. This module is
also used to test/work on mainline linux RapidIO software.

PPA8548 overview:
- 1.3 GHz Freescale PowerQUICC III MPC8548 processor
- 1 GB DDR2 @ 266 MHz
- 8 MB NOR flash
- Serial RapidIO 1.2
- 1 x 10/100/1000 BASE-T front ethernet
- 1 x 1000 BASE-BX ethernet on AMC connector

Signed-off-by: Stef van Os <stef.van.os@prodrive.nl>
---
 arch/powerpc/boot/dts/ppa8548.dts           |  163 +++++++++++++++++++++++++++
 arch/powerpc/configs/85xx/ppa8548_defconfig |   65 +++++++++++
 arch/powerpc/platforms/85xx/Kconfig         |    7 +
 arch/powerpc/platforms/85xx/Makefile        |    1 +
 arch/powerpc/platforms/85xx/ppa8548.c       |   98 ++++++++++++++++
 5 files changed, 334 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/boot/dts/ppa8548.dts
 create mode 100644 arch/powerpc/configs/85xx/ppa8548_defconfig
 create mode 100644 arch/powerpc/platforms/85xx/ppa8548.c

diff --git a/arch/powerpc/boot/dts/ppa8548.dts b/arch/powerpc/boot/dts/ppa8548.dts
new file mode 100644
index 0000000..b5f5220
--- /dev/null
+++ b/arch/powerpc/boot/dts/ppa8548.dts
@@ -0,0 +1,163 @@
+/*
+ * PPA8548 Device Tree Source (36-bit address map)
+ * Copyright 2013 Prodrive B.V.
+ *
+ * Based on:
+ * MPC8548 CDS Device Tree Source (36-bit address map)
+ * Copyright 2012 Freescale Semiconductor Inc.
+ *
+ * This program is free software; you can redistribute  it and/or modify it
+ * under  the terms of  the GNU General  Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ */
+
+/include/ "fsl/mpc8548si-pre.dtsi"
+
+/ {
+	model = "ppa8548";
+	compatible = "ppa8548";
+	#address-cells = <2>;
+	#size-cells = <2>;
+	interrupt-parent = <&mpic>;
+
+	memory {
+		device_type = "memory";
+		reg = <0 0 0x0 0x40000000>;
+	};
+
+	lbc: localbus@fe0005000 {
+		reg = <0xf 0xe0005000 0 0x1000>;
+		ranges = <0x0 0x0 0xf 0xff800000 0x00800000>;
+	};
+
+	soc: soc8548@fe0000000 {
+		ranges = <0 0xf 0xe0000000 0x100000>;
+	};
+
+	pci0: pci@fe0008000 {
+		status = "disabled";
+	};
+
+	pci1: pci@fe0009000 {
+		status = "disabled";
+	};
+
+	pci2: pcie@fe000a000 {
+		status = "disabled";
+	};
+
+	rio: rapidio@fe00c0000 {
+		reg = <0xf 0xe00c0000 0x0 0x11000>;
+		port1 {
+			ranges = <0x0 0x0 0x0 0x80000000 0x0 0x40000000>;
+		};
+	};
+};
+
+&lbc {
+	nor@0 {
+		#address-cells = <1>;
+		#size-cells = <1>;
+		compatible = "cfi-flash";
+		reg = <0x0 0x0 0x00800000>;
+		bank-width = <2>;
+		device-width = <2>;
+
+		partition@0 {
+			reg = <0x0 0x7A0000>;
+			label = "user";
+		};
+
+		partition@7A0000 {
+			reg = <0x7A0000 0x20000>;
+			label = "env";
+			read-only;
+		};
+
+		partition@7C0000 {
+			reg = <0x7C0000 0x40000>;
+			label = "u-boot";
+			read-only;
+		};
+	};
+};
+
+&soc {
+	i2c@3000 {
+		rtc@6f {
+			compatible = "intersil,isl1208";
+			reg = <0x6f>;
+		};
+	};
+
+	i2c@3100 {
+	};
+
+	/*
+	 * Only ethernet controller @25000 and @26000 are used.
+	 * Use alias enet2 and enet3 for the remainig controllers,
+	 * to stay compatible with mpc8548si-pre.dtsi.
+	 */
+	enet2: ethernet@24000 {
+		status = "disabled";
+	};
+
+	mdio@24520 {
+		phy0: ethernet-phy@0 {
+			interrupts = <7 1 0 0>;
+			reg = <0x0>;
+			device_type = "ethernet-phy";
+		};
+		phy1: ethernet-phy@1 {
+			interrupts = <8 1 0 0>;
+			reg = <0x1>;
+			device_type = "ethernet-phy";
+		};
+		tbi0: tbi-phy@11 {
+			reg = <0x11>;
+			device_type = "tbi-phy";
+		};
+	};
+
+	enet0: ethernet@25000 {
+		tbi-handle = <&tbi1>;
+		phy-handle = <&phy0>;
+	};
+
+	mdio@25520 {
+		tbi1: tbi-phy@11 {
+			reg = <0x11>;
+			device_type = "tbi-phy";
+		};
+	};
+
+	enet1: ethernet@26000 {
+		tbi-handle = <&tbi2>;
+		phy-handle = <&phy1>;
+	};
+
+	mdio@26520 {
+		tbi2: tbi-phy@11 {
+			reg = <0x11>;
+			device_type = "tbi-phy";
+		};
+	};
+
+	enet3: ethernet@27000 {
+		status = "disabled";
+	};
+
+	mdio@27520 {
+		tbi3: tbi-phy@11 {
+			reg = <0x11>;
+			device_type = "tbi-phy";
+		};
+	};
+
+	crypto@30000 {
+		status = "disabled";
+	};
+};
+
+/include/ "fsl/mpc8548si-post.dtsi"
diff --git a/arch/powerpc/configs/85xx/ppa8548_defconfig b/arch/powerpc/configs/85xx/ppa8548_defconfig
new file mode 100644
index 0000000..a11337d
--- /dev/null
+++ b/arch/powerpc/configs/85xx/ppa8548_defconfig
@@ -0,0 +1,65 @@
+CONFIG_PPC_85xx=y
+CONFIG_PPA8548=y
+CONFIG_DTC=y
+CONFIG_DEFAULT_UIMAGE=y
+CONFIG_IKCONFIG=y
+CONFIG_IKCONFIG_PROC=y
+# CONFIG_PCI is not set
+# CONFIG_USB_SUPPORT is not set
+CONFIG_ADVANCED_OPTIONS=y
+CONFIG_LOWMEM_SIZE_BOOL=y
+CONFIG_LOWMEM_SIZE=0x40000000
+CONFIG_LOWMEM_CAM_NUM_BOOL=y
+CONFIG_LOWMEM_CAM_NUM=4
+CONFIG_PAGE_OFFSET_BOOL=y
+CONFIG_PAGE_OFFSET=0xb0000000
+CONFIG_KERNEL_START_BOOL=y
+CONFIG_KERNEL_START=0xb0000000
+# CONFIG_PHYSICAL_START_BOOL is not set
+CONFIG_PHYSICAL_START=0x00000000
+CONFIG_PHYSICAL_ALIGN=0x04000000
+CONFIG_TASK_SIZE_BOOL=y
+CONFIG_TASK_SIZE=0xb0000000
+
+CONFIG_FSL_LBC=y
+CONFIG_FSL_DMA=y
+CONFIG_FSL_RIO=y
+
+CONFIG_RAPIDIO=y
+CONFIG_RAPIDIO_DMA_ENGINE=y
+CONFIG_RAPIDIO_TSI57X=y
+CONFIG_RAPIDIO_TSI568=y
+CONFIG_RAPIDIO_CPS_XX=y
+CONFIG_RAPIDIO_CPS_GEN2=y
+CONFIG_SERIAL_8250=y
+CONFIG_SERIAL_8250_CONSOLE=y
+CONFIG_PROC_DEVICETREE=y
+
+CONFIG_MTD=y
+CONFIG_MTD_BLKDEVS=y
+CONFIG_MTD_BLOCK=y
+CONFIG_MTD_CFI=y
+CONFIG_MTD_CFI_AMDSTD=y
+CONFIG_MTD_CFI_INTELEXT=y
+CONFIG_MTD_CHAR=y
+CONFIG_MTD_CMDLINE_PARTS=y
+CONFIG_MTD_CONCAT=y
+CONFIG_MTD_PARTITIONS=y
+CONFIG_MTD_PHYSMAP_OF=y
+
+CONFIG_I2C=y
+CONFIG_I2C_MPC=y
+CONFIG_I2C_CHARDEV
+CONFIG_RTC_CLASS=y
+CONFIG_RTC_HCTOSYS=y
+CONFIG_RTC_DRV_ISL1208=y
+
+CONFIG_NET=y
+CONFIG_INET=y
+CONFIG_IP_PNP=y
+CONFIG_NETDEVICES=y
+CONFIG_MII=y
+CONFIG_GIANFAR=y
+CONFIG_MARVELL_PHY=y
+CONFIG_NFS_FS=y
+CONFIG_ROOT_NFS=y
diff --git a/arch/powerpc/platforms/85xx/Kconfig b/arch/powerpc/platforms/85xx/Kconfig
index 02d02a0..803bfed 100644
--- a/arch/powerpc/platforms/85xx/Kconfig
+++ b/arch/powerpc/platforms/85xx/Kconfig
@@ -191,6 +191,13 @@ config SBC8548
 	help
 	  This option enables support for the Wind River SBC8548 board
 
+config PPA8548
+	bool "Prodrive PPA8548"
+	help
+	  This option enables support for the Prodrive PPA8548 board.
+	select DEFAULT_UIMAGE
+	select HAS_RAPIDIO
+
 config GE_IMP3A
 	bool "GE Intelligent Platforms IMP3A"
 	select DEFAULT_UIMAGE
diff --git a/arch/powerpc/platforms/85xx/Makefile b/arch/powerpc/platforms/85xx/Makefile
index 76f679c..d5f4195 100644
--- a/arch/powerpc/platforms/85xx/Makefile
+++ b/arch/powerpc/platforms/85xx/Makefile
@@ -25,6 +25,7 @@ obj-$(CONFIG_P5040_DS)    += p5040_ds.o corenet_ds.o
 obj-$(CONFIG_STX_GP3)	  += stx_gp3.o
 obj-$(CONFIG_TQM85xx)	  += tqm85xx.o
 obj-$(CONFIG_SBC8548)     += sbc8548.o
+obj-$(CONFIG_PPA8548)     += ppa8548.o
 obj-$(CONFIG_SOCRATES)    += socrates.o socrates_fpga_pic.o
 obj-$(CONFIG_KSI8560)	  += ksi8560.o
 obj-$(CONFIG_XES_MPC85xx) += xes_mpc85xx.o
diff --git a/arch/powerpc/platforms/85xx/ppa8548.c b/arch/powerpc/platforms/85xx/ppa8548.c
new file mode 100644
index 0000000..af7bf1a
--- /dev/null
+++ b/arch/powerpc/platforms/85xx/ppa8548.c
@@ -0,0 +1,98 @@
+/*
+ * ppa8548 setup and early boot code.
+ *
+ * Copyright 2009 Prodrive B.V..
+ *
+ * By Stef van Os (see MAINTAINERS for contact information)
+ *
+ * Based on the SBC8548 support - Copyright 2007 Wind River Systems Inc.
+ * Based on the MPC8548CDS support - Copyright 2005 Freescale Inc.
+ *
+ * This program is free software; you can redistribute  it and/or modify it
+ * under  the terms of  the GNU General  Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ */
+
+#include <linux/stddef.h>
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/reboot.h>
+#include <linux/seq_file.h>
+#include <linux/of_platform.h>
+
+#include <asm/machdep.h>
+#include <asm/udbg.h>
+#include <asm/mpic.h>
+
+#include <sysdev/fsl_soc.h>
+
+static void __init ppa8548_pic_init(void)
+{
+	struct mpic *mpic = mpic_alloc(NULL, 0, MPIC_BIG_ENDIAN,
+			0, 256, " OpenPIC  ");
+	BUG_ON(mpic == NULL);
+	mpic_init(mpic);
+}
+
+/*
+ * Setup the architecture
+ */
+static void __init ppa8548_setup_arch(void)
+{
+	if (ppc_md.progress)
+		ppc_md.progress("ppa8548_setup_arch()", 0);
+}
+
+static void ppa8548_show_cpuinfo(struct seq_file *m)
+{
+	uint svid, phid1;
+
+	svid = mfspr(SPRN_SVR);
+
+	seq_printf(m, "Vendor\t\t: Prodrive B.V.\n");
+	seq_printf(m, "SVR\t\t: 0x%x\n", svid);
+
+	/* Display cpu Pll setting */
+	phid1 = mfspr(SPRN_HID1);
+	seq_printf(m, "PLL setting\t: 0x%x\n", ((phid1 >> 24) & 0x3f));
+}
+
+static struct of_device_id __initdata of_bus_ids[] = {
+	{ .name = "soc", },
+	{ .type = "soc", },
+	{ .compatible = "simple-bus", },
+	{ .compatible = "gianfar", },
+	{ .compatible = "fsl,srio", },
+	{},
+};
+
+static int __init declare_of_platform_devices(void)
+{
+	of_platform_bus_probe(NULL, of_bus_ids, NULL);
+
+	return 0;
+}
+machine_device_initcall(ppa8548, declare_of_platform_devices);
+
+/*
+ * Called very early, device-tree isn't unflattened
+ */
+static int __init ppa8548_probe(void)
+{
+	unsigned long root = of_get_flat_dt_root();
+
+	return of_flat_dt_is_compatible(root, "ppa8548");
+}
+
+define_machine(ppa8548) {
+	.name		= "ppa8548",
+	.probe		= ppa8548_probe,
+	.setup_arch	= ppa8548_setup_arch,
+	.init_IRQ	= ppa8548_pic_init,
+	.show_cpuinfo	= ppa8548_show_cpuinfo,
+	.get_irq	= mpic_get_irq,
+	.restart	= fsl_rstcr_restart,
+	.calibrate_decr = generic_calibrate_decr,
+	.progress	= udbg_progress,
+};
-- 
1.7.2.5

^ permalink raw reply related

* [PATCH?] Move ACPI device nodes under /sys/firmware/acpi (was: Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework)
From: Rafael J. Wysocki @ 2013-02-02 22:18 UTC (permalink / raw)
  To: Greg KH
  Cc: linux-s390, Toshi Kani, jiang.liu, wency, linux-acpi, yinghai,
	linux-kernel, linux-mm, isimatu.yasuaki, srivatsa.bhat, guohanjun,
	bhelgaas, akpm, linuxppc-dev, lenb
In-Reply-To: <1810611.i6Sc4oLaux@vostro.rjw.lan>

On Saturday, February 02, 2013 09:15:37 PM Rafael J. Wysocki wrote:
> On Saturday, February 02, 2013 03:58:01 PM Greg KH wrote:
[...]
> 
> > I know it's more complicated with these types of devices, and I think we
> > are getting closer to the correct solution, I just don't want to ever
> > see duplicate devices in the driver model for the same physical device.
> 
> Do you mean two things based on struct device for the same hardware component?
> That's been happening already pretty much forever for every PCI device known
> to the ACPI layer, for PNP and many others.  However, those ACPI things are (or
> rather should be, but we're going to clean that up) only for convenience (to be
> able to see the namespace structure and related things in sysfs).  So the stuff
> under /sys/devices/LNXSYSTM\:00/ is not "real".  In my view it shouldn't even
> be under /sys/devices/ (/sys/firmware/acpi/ seems to be a better place for it),
> but that may be difficult to change without breaking user space (maybe we can
> just symlink it from /sys/devices/ or something).  And the ACPI bus type
> shouldn't even exist in my opinion.

Well, well.

In fact, the appended patch moves the whole ACPI device nodes tree under
/sys/firmware/acpi/ and I'm not seeing any negative consequences of that on my
test box (events work and so on).  User space is quite new on it, though, and
the patch is hackish.

Still ...


---
Prototype, no sign-off
---
 drivers/acpi/scan.c |    2 ++
 1 file changed, 2 insertions(+)

Index: linux-pm/drivers/acpi/scan.c
===================================================================
--- linux-pm.orig/drivers/acpi/scan.c
+++ linux-pm/drivers/acpi/scan.c
@@ -1443,6 +1443,8 @@ void acpi_init_device_object(struct acpi
 	device->flags.match_driver = false;
 	device_initialize(&device->dev);
 	dev_set_uevent_suppress(&device->dev, true);
+	if (handle == ACPI_ROOT_OBJECT)
+		device->dev.kobj.parent = acpi_kobj;
 }
 
 void acpi_device_add_finalize(struct acpi_device *device)



-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply

* Re: [PATCH v3 00/22] PCI: Iterate pci host bridge instead of pci root bus
From: Bjorn Helgaas @ 2013-02-02 21:50 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: linux-ia64, Mauro Carvalho Chehab, David Airlie, linux-pci,
	dri-devel, David Howells, Paul Mackerras, sparclinux,
	linux-am33-list, Russell King, x86, linux-altix, Doug Thompson,
	Matt Turner, linux-edac, Fenghua Yu, Jiang Liu,
	microblaze-uclinux, Rafael J. Wysocki, Ivan Kokshaysky,
	Taku Izumi, linux-arm-kernel, Richard Henderson, Michal Simek,
	Tony Luck, Toshi Kani, Greg Kroah-Hartman, linux-alpha,
	Koichi Yasutake, linuxppc-dev, David S. Miller
In-Reply-To: <1359314629-18651-1-git-send-email-yinghai@kernel.org>

On Sun, Jan 27, 2013 at 12:23 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> Now we have pci_root_buses list, and there is lots of iteration with
> list_of_each of it, that is not safe after we add pci root bus hotplug
> support after booting stage.
>
> Add pci_get_next_host_bridge and use bus_find_device in driver core to
> iterate host bridge and the same time get root bus.
>
> We replace searching root bus with searching host_bridge,
> as host_bridge->bus is the root bus.
> After those replacing, we even could kill pci_root_buses list.

These are the problems I think you're fixing:

1) pci_find_next_bus() is not safe because even though it holds
pci_bus_sem while walking the pci_root_buses list, it doesn't hold a
reference on the bus it returns.  The bus could be removed while the
caller is using it.

2) "list_for_each_entry(bus, &pci_root_buses, node)" is not safe
because hotplug might modify the pci_root_buses list.  Replacing that
with for_each_pci_host_bridge() solves that problem by using
bus_find_device(), which is built on klists, which are designed for
that problem.

3) pci_find_next_bus() claims to iterate through all known PCI buses,
but in fact only iterates through root buses.

So far, so good.  Those are problems we need to fix.

Your solution is to introduce for_each_pci_host_bridge() as an
iterator through the known host bridges.  There are two scenarios
where we use something like this:

1) We want to perform an operation on every known host bridge.

2) We want to initialize something for every host bridge.

In my opinion, the only instance of scenario 1) is bus_rescan_store(),
where we want to rescan all PCI host bridges.

In every other case, we're doing some kind of initialization of all
the host bridges.  For these cases, for_each_pci_host_bridge() is the
wrong solution because it doesn't work for hot-added bridges.  I think
these cases should be changed to use pcibios_root_bridge_prepare() or
something something else called in the host bridge add path.

Bjorn

^ permalink raw reply

* Re: [PATCH 1/1] powerpc/85xx: Board support for ppa8548
From: Timur Tabi @ 2013-02-02 20:24 UTC (permalink / raw)
  To: Stef van Os; +Cc: Scott Wood, linuxppc-dev, Paul Mackerras
In-Reply-To: <510D6819.7050002@prodrive.nl>

On Sat, Feb 2, 2013 at 1:25 PM, Stef van Os <stef.van.os@prodrive.nl> wrote:

>> I guess the reason you're not using fsl/mpc8548si-post.dtsi is that you
>> don't want PCI.  Maybe PCI and srio should be moved out of that file, or
>> ifdeffed if 85xx ever ends up using the preprocessor for its device trees.
>>
> I went with timur's solution, patch v2 uses mpc8548si-post.dtsi and
> mpc8548si-pre.dtsi again, disabling everything that is not on the board.

FYI, someone posted some patches for dtc a couple months ago that
added an option to strip out disabled nodes when compiling the dts,
but those patches were rejected.  In your case, those patches could
significantly reduce the size of the compiled dtb.

^ permalink raw reply

* Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework
From: Rafael J. Wysocki @ 2013-02-02 20:15 UTC (permalink / raw)
  To: Greg KH
  Cc: linux-s390, Toshi Kani, jiang.liu, wency, linux-acpi, yinghai,
	linux-kernel, linux-mm, isimatu.yasuaki, srivatsa.bhat, guohanjun,
	bhelgaas, akpm, linuxppc-dev, lenb
In-Reply-To: <20130202145801.GB1434@kroah.com>

On Saturday, February 02, 2013 03:58:01 PM Greg KH wrote:
> On Fri, Feb 01, 2013 at 11:12:59PM +0100, Rafael J. Wysocki wrote:
> > On Friday, February 01, 2013 08:23:12 AM Greg KH wrote:
> > > On Thu, Jan 31, 2013 at 09:54:51PM +0100, Rafael J. Wysocki wrote:
> > > > > > But, again, I'm going to ask why you aren't using the existing cpu /
> > > > > > memory / bridge / node devices that we have in the kernel.  Please use
> > > > > > them, or give me a _really_ good reason why they will not work.
> > > > > 
> > > > > We cannot use the existing system devices or ACPI devices here.  During
> > > > > hot-plug, ACPI handler sets this shp_device info, so that cpu and memory
> > > > > handlers (drivers/cpu.c and mm/memory_hotplug.c) can obtain their target
> > > > > device information in a platform-neutral way.  During hot-add, we first
> > > > > creates an ACPI device node (i.e. device under /sys/bus/acpi/devices),
> > > > > but platform-neutral modules cannot use them as they are ACPI-specific.
> > > > 
> > > > But suppose we're smart and have ACPI scan handlers that will create
> > > > "physical" device nodes for those devices during the ACPI namespace scan.
> > > > Then, the platform-neutral nodes will be able to bind to those "physical"
> > > > nodes.  Moreover, it should be possible to get a hierarchy of device objects
> > > > this way that will reflect all of the dependencies we need to take into
> > > > account during hot-add and hot-remove operations.  That may not be what we
> > > > have today, but I don't see any *fundamental* obstacles preventing us from
> > > > using this approach.
> > > 
> > > I would _much_ rather see that be the solution here as I think it is the
> > > proper one.
> > > 
> > > > This is already done for PCI host bridges and platform devices and I don't
> > > > see why we can't do that for the other types of devices too.
> > > 
> > > I agree.
> > > 
> > > > The only missing piece I see is a way to handle the "eject" problem, i.e.
> > > > when we try do eject a device at the top of a subtree and need to tear down
> > > > the entire subtree below it, but if that's going to lead to a system crash,
> > > > for example, we want to cancel the eject.  It seems to me that we'll need some
> > > > help from the driver core here.
> > > 
> > > I say do what we always have done here, if the user asked us to tear
> > > something down, let it happen as they are the ones that know best :)
> > > 
> > > Seriously, I guess this gets back to the "fail disconnect" idea that the
> > > ACPI developers keep harping on.  I thought we already resolved this
> > > properly by having them implement it in their bus code, no reason the
> > > same thing couldn't happen here, right?
> > 
> > Not really. :-)  We haven't ever resolved that particular issue I'm afraid.
> 
> Ah, I didn't realize that.
> 
> > > I don't think the core needs to do anything special, but if so, I'll be glad
> > > to review it.
> > 
> > OK, so this is the use case.  We have "eject" defined for something like
> > a container with a number of CPU cores, PCI host bridge, and a memory
> > controller under it.  And a few pretty much arbitrary I/O devices as a bonus.
> > 
> > Now, there's a button on the system case labeled as "Eject" and if that button
> > is pressed, we're supposed to _try_ to eject all of those things at once.  We
> > are allowed to fail that request, though, if that's problematic for some
> > reason, but we're supposed to let the BIOS know about that.
> > 
> > Do you seriously think that if that button is pressed, we should just proceed
> > with removing all that stuff no matter what?  That'd be kind of like Russian
> > roulette for whoever pressed that button, because s/he could only press it and
> > wait for the system to either crash or not.  Or maybe to crash a bit later
> > because of some delayed stuff that would hit one of those devices that had just
> > gone.  Surely not a situation any admin of a high-availability system would
> > like to be in. :-)
> > 
> > Quite frankly, I have no idea how that can be addressed in a single bus type,
> > let alone ACPI (which is not even a proper bus type, just something pretending
> > to be one).
> 
> You don't have it as a single bus type, you have a controller somewhere,
> off of the bus being destroyed, that handles sending remove events to
> the device and tearing everything down.  PCI does this from the very
> beginning.

Yes, but those are just remove events and we can only see how destructive they
were after the removal.  The point is to be able to figure out whether or not
we *want* to do the removal in the first place.

Say you have a computing node which signals a hardware problem in a processor
package (the container with CPU cores, memory, PCI host bridge etc.).  You
may want to eject that package, but you don't want to kill the system this
way.  So if the eject is doable, it is very much desirable to do it, but if it
is not doable, you'd rather shut the box down and do the replacement afterward.
That may be costly, however (maybe weeks of computations), so it should be
avoided if possible, but not at the expense of crashing the box if the eject
doesn't work out.

> I know it's more complicated with these types of devices, and I think we
> are getting closer to the correct solution, I just don't want to ever
> see duplicate devices in the driver model for the same physical device.

Do you mean two things based on struct device for the same hardware component?
That's been happening already pretty much forever for every PCI device known
to the ACPI layer, for PNP and many others.  However, those ACPI things are (or
rather should be, but we're going to clean that up) only for convenience (to be
able to see the namespace structure and related things in sysfs).  So the stuff
under /sys/devices/LNXSYSTM\:00/ is not "real".  In my view it shouldn't even
be under /sys/devices/ (/sys/firmware/acpi/ seems to be a better place for it),
but that may be difficult to change without breaking user space (maybe we can
just symlink it from /sys/devices/ or something).  And the ACPI bus type
shouldn't even exist in my opinion.

There's much confusion in there and much work to clean that up, I agree, but
that's kind of separate from the hotplug thing.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply

* Re: [PATCH 1/1] powerpc/85xx: Board support for ppa8548
From: Stef van Os @ 2013-02-02 19:25 UTC (permalink / raw)
  To: Scott Wood; +Cc: Paul Mackerras, linuxppc-dev, timur.tabi
In-Reply-To: <1359765063.23561.14@snotra>

Thanks for the comment!

On 02/02/2013 01:31 AM, Scott Wood wrote:
> On 02/01/2013 09:36:53 AM, Stef van Os wrote:
>> +    memory {
>> +        device_type = "memory";
>> +        reg = <0 0 0x0 0x40000000>;
>> +    };
>
> You have a "filled in by U-Boot" comment elsewhere in the file, but 
> you aren't letting U-Boot fill in the memory size?
>
The U-Boot used in this board currently does not call the 
fdt_fixup_memory function. Would have been better, but changing it now 
requires production image changes and requalification.
>> +    board_lbc: lbc: localbus@fe0005000 {
>> +        reg = <0xf 0xe0005000 0 0x1000>;
>> +        ranges = <0x0 0x0 0xf 0xff800000 0x00800000>;
>> +    };
>> +
>> +    board_soc: soc: soc8548@fe0000000 {
>> +        ranges = <0 0xf 0xe0000000 0x100000>;
>> +    };
>
> I know some existing dts files do this, but there's no need for two 
> labels one one node.
True, fixed in v2 of patch.
>
>> +    rio: rapidio@fe00c0000 {
>> +        reg = <0xf 0xe00c0000 0x0 0x11000>;
>> +        port1 {
>> +            ranges = <0x0 0x0 0x0 0x80000000 0x0 0x40000000>;
>> +        };
>> +    };
>> +};
>> +
>> +&lbc {
>> +    #address-cells = <2>;
>> +    #size-cells = <1>;
>> +    compatible = "fsl,mpc8548-lbc", "fsl,pq3-localbus", "simple-bus";
>> +    interrupts = <19 2 0 0>;
>> +};
>> +
>> +&rio {
>> +    compatible = "fsl,srio";
>> +    interrupts = <48 2 0 0>;
>> +    #address-cells = <2>;
>> +    #size-cells = <2>;
>> +    fsl,srio-rmu-handle = <&rmu>;
>> +    ranges;
>> +
>> +    port1 {
>> +        #address-cells = <2>;
>> +        #size-cells = <2>;
>> +        cell-index = <1>;
>> +    };
>> +};
>> +
>> +&soc {
>> +    #address-cells = <1>;
>> +    #size-cells = <1>;
>> +    device_type = "soc";
>> +    compatible = "fsl,mpc8548-immr", "simple-bus";
>> +    bus-frequency = <0>;        // Filled out by uboot.
>> +
>> +    ecm-law@0 {
>> +        compatible = "fsl,ecm-law";
>> +        reg = <0x0 0x1000>;
>> +        fsl,num-laws = <10>;
>> +    };
>> +
>> +    ecm@1000 {
>> +        compatible = "fsl,mpc8548-ecm", "fsl,ecm";
>> +        reg = <0x1000 0x1000>;
>> +        interrupts = <17 2 0 0>;
>> +    };
>> +
>> +    memory-controller@2000 {
>> +        compatible = "fsl,mpc8548-memory-controller";
>> +        reg = <0x2000 0x1000>;
>> +        interrupts = <18 2 0 0>;
>> +    };
>> +
>> +/include/ "fsl/pq3-i2c-0.dtsi"
>> +/include/ "fsl/pq3-i2c-1.dtsi"
>> +/include/ "fsl/pq3-duart-0.dtsi"
>> +
>> +    L2: l2-cache-controller@20000 {
>> +        compatible = "fsl,mpc8548-l2-cache-controller";
>> +        reg = <0x20000 0x1000>;
>> +        cache-line-size = <32>;    // 32 bytes
>> +        cache-size = <0x80000>; // L2, 512K
>> +        interrupts = <16 2 0 0>;
>> +    };
>> +
>> +/include/ "fsl/pq3-dma-0.dtsi"
>> +/include/ "fsl/pq3-etsec1-0.dtsi"
>> +/include/ "fsl/pq3-etsec1-1.dtsi"
>> +/include/ "fsl/pq3-etsec1-2.dtsi"
>> +/include/ "fsl/pq3-etsec1-3.dtsi"
>> +
>> +/include/ "fsl/pq3-sec2.1-0.dtsi"
>> +/include/ "fsl/pq3-mpic.dtsi"
>> +/include/ "fsl/pq3-rmu-0.dtsi"
>> +
>> +    global-utilities@e0000 {
>> +        compatible = "fsl,mpc8548-guts";
>> +        reg = <0xe0000 0x1000>;
>> +        fsl,has-rstcr;
>> +    };
>> +};
>
> I guess the reason you're not using fsl/mpc8548si-post.dtsi is that 
> you don't want PCI.  Maybe PCI and srio should be moved out of that 
> file, or ifdeffed if 85xx ever ends up using the preprocessor for its 
> device trees.
>
I went with timur's solution, patch v2 uses mpc8548si-post.dtsi and 
mpc8548si-pre.dtsi again, disabling everything that is not on the board.
>> diff --git a/arch/powerpc/platforms/85xx/ppa8548.c 
>> b/arch/powerpc/platforms/85xx/ppa8548.c
>> new file mode 100644
>> index 0000000..80a9307
>> --- /dev/null
>> +++ b/arch/powerpc/platforms/85xx/ppa8548.c
>> @@ -0,0 +1,119 @@
>> +/*
>> + * ppa8548 setup and early boot code.
>> + *
>> + * Copyright 2009 Prodrive B.V..
>> + *
>> + * By Stef van Os (see MAINTAINERS for contact information)
>> + *
>> + * Based on the SBC8548 support - Copyright 2007 Wind River Systems 
>> Inc.
>> + * Based on the MPC8548CDS support - Copyright 2005 Freescale Inc.
>> + *
>> + * This program is free software; you can redistribute  it and/or 
>> modify it
>> + * under  the terms of  the GNU General  Public License as published 
>> by the
>> + * Free Software Foundation;  either version 2 of the  License, or 
>> (at your
>> + * option) any later version.
>> + */
>> +
>> +#include <linux/stddef.h>
>> +#include <linux/kernel.h>
>> +#include <linux/init.h>
>> +#include <linux/errno.h>
>> +#include <linux/reboot.h>
>> +#include <linux/kdev_t.h>
>> +#include <linux/major.h>
>> +#include <linux/console.h>
>> +#include <linux/delay.h>
>> +#include <linux/seq_file.h>
>> +#include <linux/initrd.h>
>> +#include <linux/module.h>
>> +#include <linux/interrupt.h>
>> +#include <linux/fsl_devices.h>
>> +#include <linux/of_platform.h>
>> +
>> +#include <asm/pgtable.h>
>> +#include <asm/page.h>
>> +#include <asm/atomic.h>
>> +#include <asm/time.h>
>> +#include <asm/io.h>
>> +#include <asm/machdep.h>
>> +#include <asm/ipic.h>
>> +#include <asm/irq.h>
>> +#include <mm/mmu_decl.h>
>> +#include <asm/prom.h>
>> +#include <asm/udbg.h>
>> +#include <asm/mpic.h>
>> +
>> +#include <sysdev/fsl_soc.h>
>
> I doubt you need all of these.
>
> E.g. asm/ipic.h is for 83xx and 512x chips.  Some others are for 
> things that haven't been done by board files for years (e.g. kdev_t.h).
Fixed in v2, 10 includes left, instead of 25+...
>
>> +static void ppa8548_show_cpuinfo(struct seq_file *m)
>> +{
>> +    uint pvid, svid, phid1;
>> +
>> +    pvid = mfspr(SPRN_PVR);
>> +    svid = mfspr(SPRN_SVR);
>> +
>> +    seq_printf(m, "Vendor\t\t: Prodrive B.V.\n");
>> +    seq_printf(m, "Machine\t\t: ppa8548\n");
>> +    seq_printf(m, "PVR\t\t: 0x%x\n", pvid);
>> +    seq_printf(m, "SVR\t\t: 0x%x\n", svid);
>> +
>> +    /* Display cpu Pll setting */
>> +    phid1 = mfspr(SPRN_HID1);
>> +    seq_printf(m, "PLL setting\t: 0x%x\n", ((phid1 >> 24) & 0x3f));
>> +}
>
> PVR and ppc_md.name are already shown by the generic /proc/cpuinfo code.
>
> -Scott
Fixed, removed PVR and Machine. SVR is still shown, because it is not in 
generic cpuinfo code.

Will do a test run of the changes and send out v2 later this weekend.

Regards,
Stef

^ permalink raw reply

* Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework
From: Greg KH @ 2013-02-02 15:01 UTC (permalink / raw)
  To: Toshi Kani
  Cc: linux-s390@vger.kernel.org, jiang.liu@huawei.com,
	wency@cn.fujitsu.com, linux-mm@kvack.org, yinghai@kernel.org,
	linux-kernel@vger.kernel.org, Rafael J. Wysocki,
	linux-acpi@vger.kernel.org, isimatu.yasuaki@jp.fujitsu.com,
	srivatsa.bhat@linux.vnet.ibm.com, guohanjun@huawei.com,
	bhelgaas@google.com, akpm@linux-foundation.org,
	linuxppc-dev@lists.ozlabs.org, lenb@kernel.org
In-Reply-To: <1359751210.15120.278.camel@misato.fc.hp.com>

On Fri, Feb 01, 2013 at 01:40:10PM -0700, Toshi Kani wrote:
> On Fri, 2013-02-01 at 07:30 +0000, Greg KH wrote:
> > On Thu, Jan 31, 2013 at 06:32:18PM -0700, Toshi Kani wrote:
> >  > This is already done for PCI host bridges and platform devices and I don't
> > > > see why we can't do that for the other types of devices too.
> > > > 
> > > > The only missing piece I see is a way to handle the "eject" problem, i.e.
> > > > when we try do eject a device at the top of a subtree and need to tear down
> > > > the entire subtree below it, but if that's going to lead to a system crash,
> > > > for example, we want to cancel the eject.  It seems to me that we'll need some
> > > > help from the driver core here.
> > > 
> > > There are three different approaches suggested for system device
> > > hot-plug:
> > >  A. Proceed within system device bus scan.
> > >  B. Proceed within ACPI bus scan.
> > >  C. Proceed with a sequence (as a mini-boot).
> > > 
> > > Option A uses system devices as tokens, option B uses acpi devices as
> > > tokens, and option C uses resource tables as tokens, for their handlers.
> > > 
> > > Here is summary of key questions & answers so far.  I hope this
> > > clarifies why I am suggesting option 3.
> > > 
> > > 1. What are the system devices?
> > > System devices provide system-wide core computing resources, which are
> > > essential to compose a computer system.  System devices are not
> > > connected to any particular standard buses.
> > 
> > Not a problem, lots of devices are not connected to any "particular
> > standard busses".  All this means is that system devices are connected
> > to the "system" bus, nothing more.
> 
> Can you give me a few examples of other devices that support hotplug and
> are not connected to any particular buses?  I will investigate them to
> see how they are managed to support hotplug.

Any device that is attached to any bus in the driver model can be
hotunplugged from userspace by telling it to be "unbound" from the
driver controlling it.  Try it for any platform device in your system to
see how it happens.

> > > 2. Why are the system devices special?
> > > The system devices are initialized during early boot-time, by multiple
> > > subsystems, from the boot-up sequence, in pre-defined order.  They
> > > provide low-level services to enable other subsystems to come up.
> > 
> > Sorry, no, that doesn't mean they are special, nothing here is unique
> > for the point of view of the driver model from any other device or bus.
> 
> I think system devices are unique in a sense that they are initialized
> before drivers run.

No, most all devices are "initialized" before a driver runs on it, USB
is one such example, PCI another, and I'm pretty sure that there are
others.

> > If you need to initialize the driver
> > core earlier, then do so.  Or, even better, just wait until enough of
> > the system has come up and then go initialize all of the devices you
> > have found so far as part of your boot process.
> 
> They are pseudo drivers that provide sysfs entry points of cpu and
> memory.  They do not actually initialize cpu and memory.  I do not think
> initializing cpu and memory fits into the driver model either, since
> drivers should run after cpu and memory are initialized.

We already represent CPUs in the sysfs tree, don't represent them in two
different places with two different structures.  Use the existing ones
please.

> > None of the above things you have stated seem to have anything to do
> > with your proposed patch, so I don't understand why you have mentioned
> > them...
> 
> You suggested option A before, which uses system bus scan to initialize
> all system devices at boot time as well as hot-plug.  I tried to say
> that this option would not be doable.

I haven't yet been convinced otherwise, sorry.  Please prove me wrong :)

thanks,

greg k-h

^ permalink raw reply

* Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework
From: Greg KH @ 2013-02-02 14:58 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-s390, Toshi Kani, jiang.liu, wency, linux-acpi, yinghai,
	linux-kernel, linux-mm, isimatu.yasuaki, srivatsa.bhat, guohanjun,
	bhelgaas, akpm, linuxppc-dev, lenb
In-Reply-To: <1987042.JQv02Zsfg5@vostro.rjw.lan>

On Fri, Feb 01, 2013 at 11:12:59PM +0100, Rafael J. Wysocki wrote:
> On Friday, February 01, 2013 08:23:12 AM Greg KH wrote:
> > On Thu, Jan 31, 2013 at 09:54:51PM +0100, Rafael J. Wysocki wrote:
> > > > > But, again, I'm going to ask why you aren't using the existing cpu /
> > > > > memory / bridge / node devices that we have in the kernel.  Please use
> > > > > them, or give me a _really_ good reason why they will not work.
> > > > 
> > > > We cannot use the existing system devices or ACPI devices here.  During
> > > > hot-plug, ACPI handler sets this shp_device info, so that cpu and memory
> > > > handlers (drivers/cpu.c and mm/memory_hotplug.c) can obtain their target
> > > > device information in a platform-neutral way.  During hot-add, we first
> > > > creates an ACPI device node (i.e. device under /sys/bus/acpi/devices),
> > > > but platform-neutral modules cannot use them as they are ACPI-specific.
> > > 
> > > But suppose we're smart and have ACPI scan handlers that will create
> > > "physical" device nodes for those devices during the ACPI namespace scan.
> > > Then, the platform-neutral nodes will be able to bind to those "physical"
> > > nodes.  Moreover, it should be possible to get a hierarchy of device objects
> > > this way that will reflect all of the dependencies we need to take into
> > > account during hot-add and hot-remove operations.  That may not be what we
> > > have today, but I don't see any *fundamental* obstacles preventing us from
> > > using this approach.
> > 
> > I would _much_ rather see that be the solution here as I think it is the
> > proper one.
> > 
> > > This is already done for PCI host bridges and platform devices and I don't
> > > see why we can't do that for the other types of devices too.
> > 
> > I agree.
> > 
> > > The only missing piece I see is a way to handle the "eject" problem, i.e.
> > > when we try do eject a device at the top of a subtree and need to tear down
> > > the entire subtree below it, but if that's going to lead to a system crash,
> > > for example, we want to cancel the eject.  It seems to me that we'll need some
> > > help from the driver core here.
> > 
> > I say do what we always have done here, if the user asked us to tear
> > something down, let it happen as they are the ones that know best :)
> > 
> > Seriously, I guess this gets back to the "fail disconnect" idea that the
> > ACPI developers keep harping on.  I thought we already resolved this
> > properly by having them implement it in their bus code, no reason the
> > same thing couldn't happen here, right?
> 
> Not really. :-)  We haven't ever resolved that particular issue I'm afraid.

Ah, I didn't realize that.

> > I don't think the core needs to do anything special, but if so, I'll be glad
> > to review it.
> 
> OK, so this is the use case.  We have "eject" defined for something like
> a container with a number of CPU cores, PCI host bridge, and a memory
> controller under it.  And a few pretty much arbitrary I/O devices as a bonus.
> 
> Now, there's a button on the system case labeled as "Eject" and if that button
> is pressed, we're supposed to _try_ to eject all of those things at once.  We
> are allowed to fail that request, though, if that's problematic for some
> reason, but we're supposed to let the BIOS know about that.
> 
> Do you seriously think that if that button is pressed, we should just proceed
> with removing all that stuff no matter what?  That'd be kind of like Russian
> roulette for whoever pressed that button, because s/he could only press it and
> wait for the system to either crash or not.  Or maybe to crash a bit later
> because of some delayed stuff that would hit one of those devices that had just
> gone.  Surely not a situation any admin of a high-availability system would
> like to be in. :-)
> 
> Quite frankly, I have no idea how that can be addressed in a single bus type,
> let alone ACPI (which is not even a proper bus type, just something pretending
> to be one).

You don't have it as a single bus type, you have a controller somewhere,
off of the bus being destroyed, that handles sending remove events to
the device and tearing everything down.  PCI does this from the very
beginning.

I know it's more complicated with these types of devices, and I think we
are getting closer to the correct solution, I just don't want to ever
see duplicate devices in the driver model for the same physical device.

thanks,

greg k-h

^ permalink raw reply

* Re: [PATCH] powerpc/512x: add function for CS parameter configuration
From: Timur Tabi @ 2013-02-02 12:31 UTC (permalink / raw)
  To: Anatolij Gustschin; +Cc: linuxppc-dev
In-Reply-To: <20130202130201.204ee2fd@crub>

Anatolij Gustschin wrote:

>>> +struct mpc512x_lpc {
>>> +       u32     cs_cfg[8];      /* CS config */
>>> +       u32     cs_ctrl;        /* CS Control Register */
>>> +       u32     cs_status;      /* CS Status Register */
>>> +       u32     burst_ctrl;     /* CS Burst Control Register */
>>> +       u32     deadcycle_ctrl; /* CS Deadcycle Control Register */
>>> +       u32     holdcycle_ctrl; /* CS Holdcycle Control Register */
>>> +       u32     alt;            /* Address Latch Timing Register */
>>> +};
>>
>> These should be __be32.
>
> Why? To add a new bunch of sparse warnings?

Hmm... I thought that making them __be32 will *avoid* sparse warnings.

>> You forgot the iounmap() if lpc == NULL.
>
> No, it is intentional, no need to map/unmap again and again for all
> subsequent calls.

Sorry, for some reason I thought that lpc was a parameter that you passed in.

-- 
Timur Tabi

^ permalink raw reply

* Re: [PATCH] powerpc/512x: add function for CS parameter configuration
From: Anatolij Gustschin @ 2013-02-02 12:02 UTC (permalink / raw)
  To: Timur Tabi; +Cc: linuxppc-dev
In-Reply-To: <CAOZdJXUfe5KWBtabUUPrGHhcFb6dbnPNJCsU23yoxb-4GeKQsg@mail.gmail.com>

On Fri, 1 Feb 2013 22:31:51 -0600
Timur Tabi <timur@tabi.org> wrote:

> On Fri, Feb 1, 2013 at 7:28 AM, Anatolij Gustschin <agust@denx.de> wrote:
> > Add ability to configure CS parameters for devices that need
> > different CS parameters setup after their configuration. I.e.
> > an FPGA device on LP bus can require different CS parameters
> > for its bus interface after loading firmware into it. A driver
> > can easily reconfigure the LPC CS parameters using this function.
> 
> Could you define "CS" somewhere?

I'll define it in revised commit log in v2.

> > +struct mpc512x_lpc {
> > +       u32     cs_cfg[8];      /* CS config */
> > +       u32     cs_ctrl;        /* CS Control Register */
> > +       u32     cs_status;      /* CS Status Register */
> > +       u32     burst_ctrl;     /* CS Burst Control Register */
> > +       u32     deadcycle_ctrl; /* CS Deadcycle Control Register */
> > +       u32     holdcycle_ctrl; /* CS Holdcycle Control Register */
> > +       u32     alt;            /* Address Latch Timing Register */
> > +};
> 
> These should be __be32.

Why? To add a new bunch of sparse warnings?

...
> > +       if (cs < 0 || cs > 7)
> > +               return -EINVAL;
> 
> If you make cs an unsigned int, you won't need to test for < 0.

can do that, yes.

> > +
> > +       if (!lpc) {
> > +               np = of_find_compatible_node(NULL, NULL, "fsl,mpc5121-lpc");
> > +               lpc = of_iomap(np, 0);
> > +               of_node_put(np);
> > +               if (!lpc)
> > +                       return -ENOMEM;
> > +       }
> > +
> > +       out_be32(&lpc->cs_cfg[cs], val);
> 
> You forgot the iounmap() if lpc == NULL.

No, it is intentional, no need to map/unmap again and again for all
subsequent calls.


> > +       return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(mpc512x_cs_config);
> 
> EXPORT_SYMBOL, please, to match the rest of the Freescale platforms.

I'll change it in v2. Thanks!

Anatolij

^ permalink raw reply

* AW: Why is the e500v2 core not using cpuidle?
From: Thomas Waldecker @ 2013-02-02  9:41 UTC (permalink / raw)
  To: Scott Wood; +Cc: Linux PPC dev mailing list (linuxppc-dev@lists.ozlabs.org)
In-Reply-To: <1359742517.23561.2@snotra>

Hi Scott,=0A=
=0A=
>> Why is there no support for the cpuidle framework?=0A=
> Because nobody implemented it. :-)=0A=
That's the reason I thought before :-)=0A=
=0A=
> The only reason I can think of to implement it on this chip would be to=
=0A=
> dynamically choose when to enter nap versus doze, rather than always=0A=
> just using doze.  It's not clear whether the difference in power=0A=
> savings is worth it -- do you have any way of measuring?=0A=
=0A=
Is the e500 only using doze? There are comments in the file =0A=
arch/powerpc/kernel/idle_e500.S=0A=
which are stating:=0A=
/*  Go to NAP or DOZE now */=0A=
or=0A=
/* Return from NAP/DOZE ...*/=0A=
=0A=
and because of this comments I thought that both modes are in use.=0A=
=0A=
I have a way of measuring the power and it is also a small part of my maste=
rthesis,=0A=
but it is not very meaningful because at the measuring point there are othe=
r peripheral=0A=
components too.=0A=
=0A=
According to the comments can I activate the nap mode somehow?=0A=
=0A=
>> How can I debug the e500 idle modes?=0A=
>> Are there any statistics?=0A=
> Top reports idle percentage...=0A=
If the e500 and e500v2 are indeed using only the doze mode it =0A=
would be enough statistics.=0A=
=0A=
>> I already ran PowerTOP on a QorIQ P2020 but it's almost useless=0A=
>> because the information is missing.=0A=
> What information are you looking for?=0A=
=0A=
On my laptop I get detailled idle stats in PowerTOP 2.2=0A=
=0A=
          Package   |            CPU 0=0A=
POLL        0.0%    | POLL        0.0%    0.0 ms=0A=
C1          0.7%    | C1          0.4%    0.2 ms=0A=
C2         86.0%    | C2         86.8%    1.4 ms=0A=
=0A=
                    |            CPU 1=0A=
                    | POLL        0.0%    0.4 ms=0A=
                    | C1          0.9%    0.6 ms=0A=
                    | C2         85.2%    1.5 ms=0A=
=0A=
It is using the sysfs interface=0A=
=0A=
thomas@localhost /s/d/s/c/c/cpuidle> pwd=0A=
/sys/devices/system/cpu/cpu0/cpuidle=0A=
thomas@localhost /s/d/s/c/c/cpuidle> tree=0A=
.=0A=
=86=80=80 state0=0A=
=81=9A=9A =86=80=80 desc=0A=
=81=9A=9A =86=80=80 disable=0A=
=81=9A=9A =86=80=80 latency=0A=
=81=9A=9A =86=80=80 name=0A=
=81=9A=9A =86=80=80 power=0A=
=81=9A=9A =86=80=80 time=0A=
=81=9A=9A =84=80=80 usage=0A=
=86=80=80 state1=0A=
    (same as state0)=0A=
=84=80=80 state2=0A=
    (also same as state 0)=0A=
thomas@localhost /s/d/s/c/c/cpuidle> cat */name=0A=
POLL=0A=
C1=0A=
C2=0A=
=0A=
Such statistics would be great for the doze, nap (and sleep for the whole p=
ackage).=0A=
=0A=
Kind regards,=0A=
Thomas=

^ permalink raw reply

* Re: [PATCH] PowerPC documentation: fixed path to the powerpc directory
From: Thomas Waldecker @ 2013-02-02  9:08 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev, paulus, linux-kernel, rob, linux-doc
In-Reply-To: <1357858846.4838.92.camel@pasglop>

Hi Ben,

I sent it twice, sorry. I haven't seen it applied for a long time and
got no response.

It is already applied.

Kind regards,
Thomas

^ permalink raw reply

* linux-next: build failure after merge of the tip tree
From: Stephen Rothwell @ 2013-02-02  5:04 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Peter Zijlstra
  Cc: Sukadev Bhattiprolu, linux-next, linux-kernel, linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 3674 bytes --]

Hi all,

After merging the tip tree, today's linux-next build (powerpc
ppc64_defconfig) failed like this:

arch/powerpc/perf/power7-pmu.c:397:2: error: initialization from incompatible pointer type [-Werror]
arch/powerpc/perf/power7-pmu.c:397:2: error: (near initialization for 'power7_events_attr[0]') [-Werror]
arch/powerpc/perf/power7-pmu.c:398:2: error: initialization from incompatible pointer type [-Werror]
arch/powerpc/perf/power7-pmu.c:398:2: error: (near initialization for 'power7_events_attr[1]') [-Werror]
arch/powerpc/perf/power7-pmu.c:399:2: error: initialization from incompatible pointer type [-Werror]
arch/powerpc/perf/power7-pmu.c:399:2: error: (near initialization for 'power7_events_attr[2]') [-Werror]
arch/powerpc/perf/power7-pmu.c:400:2: error: initialization from incompatible pointer type [-Werror]
arch/powerpc/perf/power7-pmu.c:400:2: error: (near initialization for 'power7_events_attr[3]') [-Werror]
arch/powerpc/perf/power7-pmu.c:401:2: error: initialization from incompatible pointer type [-Werror]
arch/powerpc/perf/power7-pmu.c:401:2: error: (near initialization for 'power7_events_attr[4]') [-Werror]
arch/powerpc/perf/power7-pmu.c:402:2: error: initialization from incompatible pointer type [-Werror]
arch/powerpc/perf/power7-pmu.c:402:2: error: (near initialization for 'power7_events_attr[5]') [-Werror]
arch/powerpc/perf/power7-pmu.c:403:2: error: initialization from incompatible pointer type [-Werror]
arch/powerpc/perf/power7-pmu.c:403:2: error: (near initialization for 'power7_events_attr[6]') [-Werror]
arch/powerpc/perf/power7-pmu.c:404:2: error: initialization from incompatible pointer type [-Werror]
arch/powerpc/perf/power7-pmu.c:404:2: error: (near initialization for 'power7_events_attr[7]') [-Werror]
arch/powerpc/perf/power7-pmu.c:406:2: error: initialization from incompatible pointer type [-Werror]
arch/powerpc/perf/power7-pmu.c:406:2: error: (near initialization for 'power7_events_attr[8]') [-Werror]
arch/powerpc/perf/power7-pmu.c:407:2: error: initialization from incompatible pointer type [-Werror]
arch/powerpc/perf/power7-pmu.c:407:2: error: (near initialization for 'power7_events_attr[9]') [-Werror]
arch/powerpc/perf/power7-pmu.c:408:2: error: initialization from incompatible pointer type [-Werror]
arch/powerpc/perf/power7-pmu.c:408:2: error: (near initialization for 'power7_events_attr[10]') [-Werror]
arch/powerpc/perf/power7-pmu.c:409:2: error: initialization from incompatible pointer type [-Werror]
arch/powerpc/perf/power7-pmu.c:409:2: error: (near initialization for 'power7_events_attr[11]') [-Werror]
arch/powerpc/perf/power7-pmu.c:410:2: error: initialization from incompatible pointer type [-Werror]
arch/powerpc/perf/power7-pmu.c:410:2: error: (near initialization for 'power7_events_attr[12]') [-Werror]
arch/powerpc/perf/power7-pmu.c:411:2: error: initialization from incompatible pointer type [-Werror]
arch/powerpc/perf/power7-pmu.c:411:2: error: (near initialization for 'power7_events_attr[13]') [-Werror]
arch/powerpc/perf/power7-pmu.c:412:2: error: initialization from incompatible pointer type [-Werror]
arch/powerpc/perf/power7-pmu.c:412:2: error: (near initialization for 'power7_events_attr[14]') [-Werror]
arch/powerpc/perf/power7-pmu.c:413:2: error: initialization from incompatible pointer type [-Werror]
arch/powerpc/perf/power7-pmu.c:413:2: error: (near initialization for 'power7_events_attr[15]') [-Werror]

Caused by commit 1c53a270724d ("perf/POWER7: Make generic event
translations available in sysfs").

I have used the tip tree from 20130128 for today.
-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* Re: [PATCH 1/1] powerpc/85xx: Board support for ppa8548
From: Timur Tabi @ 2013-02-02  4:34 UTC (permalink / raw)
  To: Scott Wood; +Cc: Stef van Os, Paul Mackerras, linuxppc-dev
In-Reply-To: <1359765063.23561.14@snotra>

On Fri, Feb 1, 2013 at 6:31 PM, Scott Wood <scottwood@freescale.com> wrote:
>
> I guess the reason you're not using fsl/mpc8548si-post.dtsi is that you
> don't want PCI.  Maybe PCI and srio should be moved out of that file, or
> ifdeffed if 85xx ever ends up using the preprocessor for its device trees.

Wouldn't it be easier to add status="disabled" in this dts file?  I do
something similar with the LBC on the p1022rdk.

	board_lbc: lbc: localbus@ffe05000 {
		/* The P1022 RDK does not have any localbus devices */
		status = "disabled";
	};

^ permalink raw reply

* Re: [PATCH] powerpc/512x: add function for CS parameter configuration
From: Timur Tabi @ 2013-02-02  4:31 UTC (permalink / raw)
  To: Anatolij Gustschin; +Cc: linuxppc-dev
In-Reply-To: <1359725305-23916-1-git-send-email-agust@denx.de>

On Fri, Feb 1, 2013 at 7:28 AM, Anatolij Gustschin <agust@denx.de> wrote:
> Add ability to configure CS parameters for devices that need
> different CS parameters setup after their configuration. I.e.
> an FPGA device on LP bus can require different CS parameters
> for its bus interface after loading firmware into it. A driver
> can easily reconfigure the LPC CS parameters using this function.

Could you define "CS" somewhere?

> +struct mpc512x_lpc {
> +       u32     cs_cfg[8];      /* CS config */
> +       u32     cs_ctrl;        /* CS Control Register */
> +       u32     cs_status;      /* CS Status Register */
> +       u32     burst_ctrl;     /* CS Burst Control Register */
> +       u32     deadcycle_ctrl; /* CS Deadcycle Control Register */
> +       u32     holdcycle_ctrl; /* CS Holdcycle Control Register */
> +       u32     alt;            /* Address Latch Timing Register */
> +};

These should be __be32.

> +
> +int mpc512x_cs_config(int cs, u32 val);
> +
>  #endif /* __ASM_POWERPC_MPC5121_H__ */
> diff --git a/arch/powerpc/platforms/512x/mpc512x_shared.c b/arch/powerpc/platforms/512x/mpc512x_shared.c
> index c344438..112b4f6 100644
> --- a/arch/powerpc/platforms/512x/mpc512x_shared.c
> +++ b/arch/powerpc/platforms/512x/mpc512x_shared.c
> @@ -436,3 +436,33 @@ void __init mpc512x_init(void)
>         mpc512x_restart_init();
>         mpc512x_psc_fifo_init();
>  }
> +
> +/**
> + * mpc512x_cs_config - Setup chip select configuration
> + * @cs: chip select number
> + * @val: chip select configuration value
> + *
> + * Perform chip select configuration for devices on LocalPlus Bus.
> + * Intended to dynamically reconfigure the chip select parameters
> + * for configurable devices on the bus.
> + */
> +int mpc512x_cs_config(int cs, u32 val)
> +{
> +       static struct mpc512x_lpc __iomem *lpc;
> +       struct device_node *np;
> +
> +       if (cs < 0 || cs > 7)
> +               return -EINVAL;

If you make cs an unsigned int, you won't need to test for < 0.

> +
> +       if (!lpc) {
> +               np = of_find_compatible_node(NULL, NULL, "fsl,mpc5121-lpc");
> +               lpc = of_iomap(np, 0);
> +               of_node_put(np);
> +               if (!lpc)
> +                       return -ENOMEM;
> +       }
> +
> +       out_be32(&lpc->cs_cfg[cs], val);

You forgot the iounmap() if lpc == NULL.

> +       return 0;
> +}
> +EXPORT_SYMBOL_GPL(mpc512x_cs_config);

EXPORT_SYMBOL, please, to match the rest of the Freescale platforms.



-- 
Timur Tabi
Linux kernel developer at Freescale

^ permalink raw reply

* Re: [PATCH 1/1] powerpc/85xx: Board support for ppa8548
From: Scott Wood @ 2013-02-02  0:31 UTC (permalink / raw)
  To: Stef van Os; +Cc: Paul Mackerras, linuxppc-dev, Stef van Os
In-Reply-To: <1359733013-722-1-git-send-email-stef.van.os@prodrive.nl>

On 02/01/2013 09:36:53 AM, Stef van Os wrote:
> +	memory {
> +		device_type =3D "memory";
> +		reg =3D <0 0 0x0 0x40000000>;
> +	};

You have a "filled in by U-Boot" comment elsewhere in the file, but you =20
aren't letting U-Boot fill in the memory size?

> +	board_lbc: lbc: localbus@fe0005000 {
> +		reg =3D <0xf 0xe0005000 0 0x1000>;
> +		ranges =3D <0x0 0x0 0xf 0xff800000 0x00800000>;
> +	};
> +
> +	board_soc: soc: soc8548@fe0000000 {
> +		ranges =3D <0 0xf 0xe0000000 0x100000>;
> +	};

I know some existing dts files do this, but there's no need for two =20
labels one one node.

> +	rio: rapidio@fe00c0000 {
> +		reg =3D <0xf 0xe00c0000 0x0 0x11000>;
> +		port1 {
> +			ranges =3D <0x0 0x0 0x0 0x80000000 0x0 =20
> 0x40000000>;
> +		};
> +	};
> +};
> +
> +&lbc {
> +	#address-cells =3D <2>;
> +	#size-cells =3D <1>;
> +	compatible =3D "fsl,mpc8548-lbc", "fsl,pq3-localbus", =20
> "simple-bus";
> +	interrupts =3D <19 2 0 0>;
> +};
> +
> +&rio {
> +	compatible =3D "fsl,srio";
> +	interrupts =3D <48 2 0 0>;
> +	#address-cells =3D <2>;
> +	#size-cells =3D <2>;
> +	fsl,srio-rmu-handle =3D <&rmu>;
> +	ranges;
> +
> +	port1 {
> +		#address-cells =3D <2>;
> +		#size-cells =3D <2>;
> +		cell-index =3D <1>;
> +	};
> +};
> +
> +&soc {
> +	#address-cells =3D <1>;
> +	#size-cells =3D <1>;
> +	device_type =3D "soc";
> +	compatible =3D "fsl,mpc8548-immr", "simple-bus";
> +	bus-frequency =3D <0>;		// Filled out by uboot.
> +
> +	ecm-law@0 {
> +		compatible =3D "fsl,ecm-law";
> +		reg =3D <0x0 0x1000>;
> +		fsl,num-laws =3D <10>;
> +	};
> +
> +	ecm@1000 {
> +		compatible =3D "fsl,mpc8548-ecm", "fsl,ecm";
> +		reg =3D <0x1000 0x1000>;
> +		interrupts =3D <17 2 0 0>;
> +	};
> +
> +	memory-controller@2000 {
> +		compatible =3D "fsl,mpc8548-memory-controller";
> +		reg =3D <0x2000 0x1000>;
> +		interrupts =3D <18 2 0 0>;
> +	};
> +
> +/include/ "fsl/pq3-i2c-0.dtsi"
> +/include/ "fsl/pq3-i2c-1.dtsi"
> +/include/ "fsl/pq3-duart-0.dtsi"
> +
> +	L2: l2-cache-controller@20000 {
> +		compatible =3D "fsl,mpc8548-l2-cache-controller";
> +		reg =3D <0x20000 0x1000>;
> +		cache-line-size =3D <32>;	// 32 bytes
> +		cache-size =3D <0x80000>; // L2, 512K
> +		interrupts =3D <16 2 0 0>;
> +	};
> +
> +/include/ "fsl/pq3-dma-0.dtsi"
> +/include/ "fsl/pq3-etsec1-0.dtsi"
> +/include/ "fsl/pq3-etsec1-1.dtsi"
> +/include/ "fsl/pq3-etsec1-2.dtsi"
> +/include/ "fsl/pq3-etsec1-3.dtsi"
> +
> +/include/ "fsl/pq3-sec2.1-0.dtsi"
> +/include/ "fsl/pq3-mpic.dtsi"
> +/include/ "fsl/pq3-rmu-0.dtsi"
> +
> +	global-utilities@e0000 {
> +		compatible =3D "fsl,mpc8548-guts";
> +		reg =3D <0xe0000 0x1000>;
> +		fsl,has-rstcr;
> +	};
> +};

I guess the reason you're not using fsl/mpc8548si-post.dtsi is that you =20
don't want PCI.  Maybe PCI and srio should be moved out of that file, =20
or ifdeffed if 85xx ever ends up using the preprocessor for its device =20
trees.

> diff --git a/arch/powerpc/platforms/85xx/ppa8548.c =20
> b/arch/powerpc/platforms/85xx/ppa8548.c
> new file mode 100644
> index 0000000..80a9307
> --- /dev/null
> +++ b/arch/powerpc/platforms/85xx/ppa8548.c
> @@ -0,0 +1,119 @@
> +/*
> + * ppa8548 setup and early boot code.
> + *
> + * Copyright 2009 Prodrive B.V..
> + *
> + * By Stef van Os (see MAINTAINERS for contact information)
> + *
> + * Based on the SBC8548 support - Copyright 2007 Wind River Systems =20
> Inc.
> + * Based on the MPC8548CDS support - Copyright 2005 Freescale Inc.
> + *
> + * This program is free software; you can redistribute  it and/or =20
> modify it
> + * under  the terms of  the GNU General  Public License as published =20
> by the
> + * Free Software Foundation;  either version 2 of the  License, or =20
> (at your
> + * option) any later version.
> + */
> +
> +#include <linux/stddef.h>
> +#include <linux/kernel.h>
> +#include <linux/init.h>
> +#include <linux/errno.h>
> +#include <linux/reboot.h>
> +#include <linux/kdev_t.h>
> +#include <linux/major.h>
> +#include <linux/console.h>
> +#include <linux/delay.h>
> +#include <linux/seq_file.h>
> +#include <linux/initrd.h>
> +#include <linux/module.h>
> +#include <linux/interrupt.h>
> +#include <linux/fsl_devices.h>
> +#include <linux/of_platform.h>
> +
> +#include <asm/pgtable.h>
> +#include <asm/page.h>
> +#include <asm/atomic.h>
> +#include <asm/time.h>
> +#include <asm/io.h>
> +#include <asm/machdep.h>
> +#include <asm/ipic.h>
> +#include <asm/irq.h>
> +#include <mm/mmu_decl.h>
> +#include <asm/prom.h>
> +#include <asm/udbg.h>
> +#include <asm/mpic.h>
> +
> +#include <sysdev/fsl_soc.h>

I doubt you need all of these.

E.g. asm/ipic.h is for 83xx and 512x chips.  Some others are for things =20
that haven't been done by board files for years (e.g. kdev_t.h).

> +static void ppa8548_show_cpuinfo(struct seq_file *m)
> +{
> +	uint pvid, svid, phid1;
> +
> +	pvid =3D mfspr(SPRN_PVR);
> +	svid =3D mfspr(SPRN_SVR);
> +
> +	seq_printf(m, "Vendor\t\t: Prodrive B.V.\n");
> +	seq_printf(m, "Machine\t\t: ppa8548\n");
> +	seq_printf(m, "PVR\t\t: 0x%x\n", pvid);
> +	seq_printf(m, "SVR\t\t: 0x%x\n", svid);
> +
> +	/* Display cpu Pll setting */
> +	phid1 =3D mfspr(SPRN_HID1);
> +	seq_printf(m, "PLL setting\t: 0x%x\n", ((phid1 >> 24) & 0x3f));
> +}

PVR and ppc_md.name are already shown by the generic /proc/cpuinfo code.

-Scott=

^ permalink raw reply

* Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework
From: Toshi Kani @ 2013-02-01 23:12 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-s390@vger.kernel.org, jiang.liu@huawei.com,
	wency@cn.fujitsu.com, linux-acpi@vger.kernel.org, Greg KH,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	isimatu.yasuaki@jp.fujitsu.com, yinghai@kernel.org,
	srivatsa.bhat@linux.vnet.ibm.com, guohanjun@huawei.com,
	bhelgaas@google.com, akpm@linux-foundation.org,
	linuxppc-dev@lists.ozlabs.org, lenb@kernel.org
In-Reply-To: <2370118.yuaZBuKn6n@vostro.rjw.lan>

On Fri, 2013-02-01 at 23:21 +0100, Rafael J. Wysocki wrote:
> On Friday, February 01, 2013 01:40:10 PM Toshi Kani wrote:
> > On Fri, 2013-02-01 at 07:30 +0000, Greg KH wrote:
> > > On Thu, Jan 31, 2013 at 06:32:18PM -0700, Toshi Kani wrote:
> > >  > This is already done for PCI host bridges and platform devices and I don't
> > > > > see why we can't do that for the other types of devices too.
> > > > > 
> > > > > The only missing piece I see is a way to handle the "eject" problem, i.e.
> > > > > when we try do eject a device at the top of a subtree and need to tear down
> > > > > the entire subtree below it, but if that's going to lead to a system crash,
> > > > > for example, we want to cancel the eject.  It seems to me that we'll need some
> > > > > help from the driver core here.
> > > > 
> > > > There are three different approaches suggested for system device
> > > > hot-plug:
> > > >  A. Proceed within system device bus scan.
> > > >  B. Proceed within ACPI bus scan.
> > > >  C. Proceed with a sequence (as a mini-boot).
> > > > 
> > > > Option A uses system devices as tokens, option B uses acpi devices as
> > > > tokens, and option C uses resource tables as tokens, for their handlers.
> > > > 
> > > > Here is summary of key questions & answers so far.  I hope this
> > > > clarifies why I am suggesting option 3.
> > > > 
> > > > 1. What are the system devices?
> > > > System devices provide system-wide core computing resources, which are
> > > > essential to compose a computer system.  System devices are not
> > > > connected to any particular standard buses.
> > > 
> > > Not a problem, lots of devices are not connected to any "particular
> > > standard busses".  All this means is that system devices are connected
> > > to the "system" bus, nothing more.
> > 
> > Can you give me a few examples of other devices that support hotplug and
> > are not connected to any particular buses?  I will investigate them to
> > see how they are managed to support hotplug.
> > 
> > > > 2. Why are the system devices special?
> > > > The system devices are initialized during early boot-time, by multiple
> > > > subsystems, from the boot-up sequence, in pre-defined order.  They
> > > > provide low-level services to enable other subsystems to come up.
> > > 
> > > Sorry, no, that doesn't mean they are special, nothing here is unique
> > > for the point of view of the driver model from any other device or bus.
> > 
> > I think system devices are unique in a sense that they are initialized
> > before drivers run.
> > 
> > > > 3. Why can't initialize the system devices from the driver structure at
> > > > boot?
> > > > The driver structure is initialized at the end of the boot sequence and
> > > > requires the low-level services from the system devices initialized
> > > > beforehand.
> > > 
> > > Wait, what "driver structure"?  
> > 
> > Sorry it was not clear.  cpu_dev_init() and memory_dev_init() are called
> > from driver_init() at the end of the boot sequence, and initialize
> > system/cpu and system/memory devices.  I assume they are the system bus
> > you are referring with option A.
> > 
> > > If you need to initialize the driver
> > > core earlier, then do so.  Or, even better, just wait until enough of
> > > the system has come up and then go initialize all of the devices you
> > > have found so far as part of your boot process.
> > 
> > They are pseudo drivers that provide sysfs entry points of cpu and
> > memory.  They do not actually initialize cpu and memory.  I do not think
> > initializing cpu and memory fits into the driver model either, since
> > drivers should run after cpu and memory are initialized.
> > 
> > > None of the above things you have stated seem to have anything to do
> > > with your proposed patch, so I don't understand why you have mentioned
> > > them...
> > 
> > You suggested option A before, which uses system bus scan to initialize
> > all system devices at boot time as well as hot-plug.  I tried to say
> > that this option would not be doable.
> > 
> > > > 4. Why do we need a new common framework?
> > > > Sysfs CPU and memory on-lining/off-lining are performed within the CPU
> > > > and memory modules.  They are common code and do not depend on ACPI.
> > > > Therefore, a new common framework is necessary to integrate both
> > > > on-lining/off-lining operation and hot-plugging operation of system
> > > > devices into a single framework.
> > > 
> > > {sigh}
> > > 
> > > Removing and adding devices and handling hotplug operations is what the
> > > driver core was written for, almost 10 years ago.  To somehow think that
> > > your devices are "special" just because they don't use ACPI is odd,
> > > because the driver core itself has nothing to do with ACPI.  Don't get
> > > the current mix of x86 system code tied into ACPI confused with an
> > > driver core issues here please.
> > 
> > CPU online/offline operation is performed within the CPU module.  Memory
> > online/offline operation is performed within the memory module.  CPU and
> > memory hotplug operations are performed within ACPI.  While they deal
> > with the same set of devices, they operate independently and are not
> > managed under a same framework.
> > 
> > I agree with you that not using ACPI is perfectly fine.  My point is
> > that ACPI framework won't be able to manage operations that do not use
> > ACPI.
> > 
> > > > 5. Why can't do everything with ACPI bus scan?
> > > > Software dependency among system devices may not be dictated by the ACPI
> > > > hierarchy.  For instance, memory should be initialized before CPUs (i.e.
> > > > a new cpu may need its local memory), but such ordering cannot be
> > > > guaranteed by the ACPI hierarchy.  Also, as described in 4,
> > > > online/offline operations are independent from ACPI.  
> > > 
> > > That's fine, the driver core is independant from ACPI.  I don't care how
> > > you do the scaning of your devices, but I do care about you creating new
> > > driver core pieces that duplicate the existing functionality of what we
> > > have today.
> > >
> > > In short, I like Rafael's proposal better, and I fail to see how
> > > anything you have stated here would matter in how this is implemented. :)
> > 
> > Doing everything within ACPI means we can only manage ACPI hotplug
> > operations, not online/offline operations.  But I understand that you
> > concern about adding a new framework with option C.  It is good to know
> > that you are fine with option B. :)  So, I will step back, and think
> > about what we can do within ACPI.
> 
> Not much, because ACPI only knows about a subset of devices that may be
> involved in that, and a limited one for that matter.  For one example,
> anything connected through PCI and not having a corresponding ACPI object (i.e.
> pretty much every add-in card in existence) will be unknown to ACPI.  And
> say one of these things is a SATA controller with a number of disks under it
> and so on.  ACPI won't even know that it exists.  Moreover, PCI won't know
> that those disks exist.  Etc.

Agreed.  Thanks for bringing I/Os into the picture.  I did not mention
them since they have not supported in this patchset, but we certainly
need to consider them into the design.

Thanks,
-Toshi

^ permalink raw reply

* Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework
From: Rafael J. Wysocki @ 2013-02-01 22:21 UTC (permalink / raw)
  To: Toshi Kani
  Cc: linux-s390@vger.kernel.org, jiang.liu@huawei.com,
	wency@cn.fujitsu.com, linux-acpi@vger.kernel.org, Greg KH,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	isimatu.yasuaki@jp.fujitsu.com, yinghai@kernel.org,
	srivatsa.bhat@linux.vnet.ibm.com, guohanjun@huawei.com,
	bhelgaas@google.com, akpm@linux-foundation.org,
	linuxppc-dev@lists.ozlabs.org, lenb@kernel.org
In-Reply-To: <1359751210.15120.278.camel@misato.fc.hp.com>

On Friday, February 01, 2013 01:40:10 PM Toshi Kani wrote:
> On Fri, 2013-02-01 at 07:30 +0000, Greg KH wrote:
> > On Thu, Jan 31, 2013 at 06:32:18PM -0700, Toshi Kani wrote:
> >  > This is already done for PCI host bridges and platform devices and I don't
> > > > see why we can't do that for the other types of devices too.
> > > > 
> > > > The only missing piece I see is a way to handle the "eject" problem, i.e.
> > > > when we try do eject a device at the top of a subtree and need to tear down
> > > > the entire subtree below it, but if that's going to lead to a system crash,
> > > > for example, we want to cancel the eject.  It seems to me that we'll need some
> > > > help from the driver core here.
> > > 
> > > There are three different approaches suggested for system device
> > > hot-plug:
> > >  A. Proceed within system device bus scan.
> > >  B. Proceed within ACPI bus scan.
> > >  C. Proceed with a sequence (as a mini-boot).
> > > 
> > > Option A uses system devices as tokens, option B uses acpi devices as
> > > tokens, and option C uses resource tables as tokens, for their handlers.
> > > 
> > > Here is summary of key questions & answers so far.  I hope this
> > > clarifies why I am suggesting option 3.
> > > 
> > > 1. What are the system devices?
> > > System devices provide system-wide core computing resources, which are
> > > essential to compose a computer system.  System devices are not
> > > connected to any particular standard buses.
> > 
> > Not a problem, lots of devices are not connected to any "particular
> > standard busses".  All this means is that system devices are connected
> > to the "system" bus, nothing more.
> 
> Can you give me a few examples of other devices that support hotplug and
> are not connected to any particular buses?  I will investigate them to
> see how they are managed to support hotplug.
> 
> > > 2. Why are the system devices special?
> > > The system devices are initialized during early boot-time, by multiple
> > > subsystems, from the boot-up sequence, in pre-defined order.  They
> > > provide low-level services to enable other subsystems to come up.
> > 
> > Sorry, no, that doesn't mean they are special, nothing here is unique
> > for the point of view of the driver model from any other device or bus.
> 
> I think system devices are unique in a sense that they are initialized
> before drivers run.
> 
> > > 3. Why can't initialize the system devices from the driver structure at
> > > boot?
> > > The driver structure is initialized at the end of the boot sequence and
> > > requires the low-level services from the system devices initialized
> > > beforehand.
> > 
> > Wait, what "driver structure"?  
> 
> Sorry it was not clear.  cpu_dev_init() and memory_dev_init() are called
> from driver_init() at the end of the boot sequence, and initialize
> system/cpu and system/memory devices.  I assume they are the system bus
> you are referring with option A.
> 
> > If you need to initialize the driver
> > core earlier, then do so.  Or, even better, just wait until enough of
> > the system has come up and then go initialize all of the devices you
> > have found so far as part of your boot process.
> 
> They are pseudo drivers that provide sysfs entry points of cpu and
> memory.  They do not actually initialize cpu and memory.  I do not think
> initializing cpu and memory fits into the driver model either, since
> drivers should run after cpu and memory are initialized.
> 
> > None of the above things you have stated seem to have anything to do
> > with your proposed patch, so I don't understand why you have mentioned
> > them...
> 
> You suggested option A before, which uses system bus scan to initialize
> all system devices at boot time as well as hot-plug.  I tried to say
> that this option would not be doable.
> 
> > > 4. Why do we need a new common framework?
> > > Sysfs CPU and memory on-lining/off-lining are performed within the CPU
> > > and memory modules.  They are common code and do not depend on ACPI.
> > > Therefore, a new common framework is necessary to integrate both
> > > on-lining/off-lining operation and hot-plugging operation of system
> > > devices into a single framework.
> > 
> > {sigh}
> > 
> > Removing and adding devices and handling hotplug operations is what the
> > driver core was written for, almost 10 years ago.  To somehow think that
> > your devices are "special" just because they don't use ACPI is odd,
> > because the driver core itself has nothing to do with ACPI.  Don't get
> > the current mix of x86 system code tied into ACPI confused with an
> > driver core issues here please.
> 
> CPU online/offline operation is performed within the CPU module.  Memory
> online/offline operation is performed within the memory module.  CPU and
> memory hotplug operations are performed within ACPI.  While they deal
> with the same set of devices, they operate independently and are not
> managed under a same framework.
> 
> I agree with you that not using ACPI is perfectly fine.  My point is
> that ACPI framework won't be able to manage operations that do not use
> ACPI.
> 
> > > 5. Why can't do everything with ACPI bus scan?
> > > Software dependency among system devices may not be dictated by the ACPI
> > > hierarchy.  For instance, memory should be initialized before CPUs (i.e.
> > > a new cpu may need its local memory), but such ordering cannot be
> > > guaranteed by the ACPI hierarchy.  Also, as described in 4,
> > > online/offline operations are independent from ACPI.  
> > 
> > That's fine, the driver core is independant from ACPI.  I don't care how
> > you do the scaning of your devices, but I do care about you creating new
> > driver core pieces that duplicate the existing functionality of what we
> > have today.
> >
> > In short, I like Rafael's proposal better, and I fail to see how
> > anything you have stated here would matter in how this is implemented. :)
> 
> Doing everything within ACPI means we can only manage ACPI hotplug
> operations, not online/offline operations.  But I understand that you
> concern about adding a new framework with option C.  It is good to know
> that you are fine with option B. :)  So, I will step back, and think
> about what we can do within ACPI.

Not much, because ACPI only knows about a subset of devices that may be
involved in that, and a limited one for that matter.  For one example,
anything connected through PCI and not having a corresponding ACPI object (i.e.
pretty much every add-in card in existence) will be unknown to ACPI.  And
say one of these things is a SATA controller with a number of disks under it
and so on.  ACPI won't even know that it exists.  Moreover, PCI won't know
that those disks exist.  Etc.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply

* Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework
From: Rafael J. Wysocki @ 2013-02-01 22:12 UTC (permalink / raw)
  To: Greg KH
  Cc: linux-s390, Toshi Kani, jiang.liu, wency, linux-acpi, yinghai,
	linux-kernel, linux-mm, isimatu.yasuaki, srivatsa.bhat, guohanjun,
	bhelgaas, akpm, linuxppc-dev, lenb
In-Reply-To: <20130201072312.GB1180@kroah.com>

On Friday, February 01, 2013 08:23:12 AM Greg KH wrote:
> On Thu, Jan 31, 2013 at 09:54:51PM +0100, Rafael J. Wysocki wrote:
> > > > But, again, I'm going to ask why you aren't using the existing cpu /
> > > > memory / bridge / node devices that we have in the kernel.  Please use
> > > > them, or give me a _really_ good reason why they will not work.
> > > 
> > > We cannot use the existing system devices or ACPI devices here.  During
> > > hot-plug, ACPI handler sets this shp_device info, so that cpu and memory
> > > handlers (drivers/cpu.c and mm/memory_hotplug.c) can obtain their target
> > > device information in a platform-neutral way.  During hot-add, we first
> > > creates an ACPI device node (i.e. device under /sys/bus/acpi/devices),
> > > but platform-neutral modules cannot use them as they are ACPI-specific.
> > 
> > But suppose we're smart and have ACPI scan handlers that will create
> > "physical" device nodes for those devices during the ACPI namespace scan.
> > Then, the platform-neutral nodes will be able to bind to those "physical"
> > nodes.  Moreover, it should be possible to get a hierarchy of device objects
> > this way that will reflect all of the dependencies we need to take into
> > account during hot-add and hot-remove operations.  That may not be what we
> > have today, but I don't see any *fundamental* obstacles preventing us from
> > using this approach.
> 
> I would _much_ rather see that be the solution here as I think it is the
> proper one.
> 
> > This is already done for PCI host bridges and platform devices and I don't
> > see why we can't do that for the other types of devices too.
> 
> I agree.
> 
> > The only missing piece I see is a way to handle the "eject" problem, i.e.
> > when we try do eject a device at the top of a subtree and need to tear down
> > the entire subtree below it, but if that's going to lead to a system crash,
> > for example, we want to cancel the eject.  It seems to me that we'll need some
> > help from the driver core here.
> 
> I say do what we always have done here, if the user asked us to tear
> something down, let it happen as they are the ones that know best :)
> 
> Seriously, I guess this gets back to the "fail disconnect" idea that the
> ACPI developers keep harping on.  I thought we already resolved this
> properly by having them implement it in their bus code, no reason the
> same thing couldn't happen here, right?

Not really. :-)  We haven't ever resolved that particular issue I'm afraid.

> I don't think the core needs to do anything special, but if so, I'll be glad
> to review it.

OK, so this is the use case.  We have "eject" defined for something like
a container with a number of CPU cores, PCI host bridge, and a memory
controller under it.  And a few pretty much arbitrary I/O devices as a bonus.

Now, there's a button on the system case labeled as "Eject" and if that button
is pressed, we're supposed to _try_ to eject all of those things at once.  We
are allowed to fail that request, though, if that's problematic for some
reason, but we're supposed to let the BIOS know about that.

Do you seriously think that if that button is pressed, we should just proceed
with removing all that stuff no matter what?  That'd be kind of like Russian
roulette for whoever pressed that button, because s/he could only press it and
wait for the system to either crash or not.  Or maybe to crash a bit later
because of some delayed stuff that would hit one of those devices that had just
gone.  Surely not a situation any admin of a high-availability system would
like to be in. :-)

Quite frankly, I have no idea how that can be addressed in a single bus type,
let alone ACPI (which is not even a proper bus type, just something pretending
to be one).

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox