LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH 1/1] powerpc/85xx: Board support for ppa8548
From: Scott Wood @ 2013-02-04 16:34 UTC (permalink / raw)
  To: Timur Tabi; +Cc: Stef van Os, Paul Mackerras, linuxppc-dev
In-Reply-To: <CAOZdJXU6OFu1878s9rrNws3uqT_F7-GFS95ZJT5EZmw35=6hbQ@mail.gmail.com>

On 02/01/2013 10:34:44 PM, Timur Tabi wrote:
> On Fri, Feb 1, 2013 at 6:31 PM, Scott Wood <scottwood@freescale.com> =20
> wrote:
> >
> > I guess the reason you're not using fsl/mpc8548si-post.dtsi is that =20
> you
> > don't want PCI.  Maybe PCI and srio should be moved out of that =20
> file, or
> > ifdeffed if 85xx ever ends up using the preprocessor for its device =20
> trees.
>=20
> Wouldn't it be easier to add status=3D"disabled" in this dts file?  I do
> something similar with the LBC on the p1022rdk.
>=20
> 	board_lbc: lbc: localbus@ffe05000 {
> 		/* The P1022 RDK does not have any localbus devices */
> 		status =3D "disabled";
> 	};

Yeah, that'd work.

-Scott=

^ permalink raw reply

* Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework
From: Toshi Kani @ 2013-02-04 16:19 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-s390, jiang.liu, wency, linux-acpi, Greg KH, linux-kernel,
	linux-mm, isimatu.yasuaki, yinghai, srivatsa.bhat, guohanjun,
	bhelgaas, akpm, linuxppc-dev, lenb
In-Reply-To: <2048116.Qo8UgQ5hjb@vostro.rjw.lan>

On Mon, 2013-02-04 at 15:21 +0100, Rafael J. Wysocki wrote:
> On Monday, February 04, 2013 04:48:10 AM Greg KH wrote:
> > On Sun, Feb 03, 2013 at 09:44:39PM +0100, Rafael J. Wysocki wrote:
> > > > Yes, but those are just remove events and we can only see how destructive they
> > > > were after the removal.  The point is to be able to figure out whether or not
> > > > we *want* to do the removal in the first place.
> > > > 
> > > > Say you have a computing node which signals a hardware problem in a processor
> > > > package (the container with CPU cores, memory, PCI host bridge etc.).  You
> > > > may want to eject that package, but you don't want to kill the system this
> > > > way.  So if the eject is doable, it is very much desirable to do it, but if it
> > > > is not doable, you'd rather shut the box down and do the replacement afterward.
> > > > That may be costly, however (maybe weeks of computations), so it should be
> > > > avoided if possible, but not at the expense of crashing the box if the eject
> > > > doesn't work out.
> > > 
> > > It seems to me that we could handle that with the help of a new flag, say
> > > "no_eject", in struct device, a global mutex, and a function that will walk
> > > the given subtree of the device hierarchy and check if "no_eject" is set for
> > > any devices in there.  Plus a global "no_eject" switch, perhaps.
> > 
> > I think this will always be racy, or at worst, slow things down on
> > normal device operations as you will always be having to grab this flag
> > whenever you want to do something new.
> 
> I don't see why this particular scheme should be racy, at least I don't see any
> obvious races in it (although I'm not that good at races detection in general,
> admittedly).
> 
> Also, I don't expect that flag to be used for everything, just for things known
> to seriously break if forcible eject is done.  That may be not precise enough,
> so that's a matter of defining its purpose more precisely.
> 
> We can do something like that on the ACPI level (ie. introduce a no_eject flag
> in struct acpi_device and provide an iterface for the layers above ACPI to
> manipulate it) but then devices without ACPI namespace objects won't be
> covered.  That may not be a big deal, though.

I am afraid that bringing the device status management into the ACPI
level would not a good idea.  acpi_device should only reflect ACPI
device object information, not how its actual device is being used.

I like your initiative of acpi_scan_driver and I think scanning /
trimming of ACPI object info is what the ACPI drivers should do.


Thanks,
-Toshi

^ permalink raw reply

* Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework
From: Toshi Kani @ 2013-02-04 16:02 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-s390, jiang.liu, wency, linux-acpi, Greg KH, linux-kernel,
	linux-mm, isimatu.yasuaki, yinghai, srivatsa.bhat, guohanjun,
	bhelgaas, akpm, linuxppc-dev, lenb
In-Reply-To: <5192355.CsKHU8mj3W@vostro.rjw.lan>

On Mon, 2013-02-04 at 14:41 +0100, Rafael J. Wysocki wrote:
> On Sunday, February 03, 2013 07:23:49 PM Greg KH wrote:
> > On Sat, Feb 02, 2013 at 09:15:37PM +0100, Rafael J. Wysocki wrote:
> > > On Saturday, February 02, 2013 03:58:01 PM Greg KH wrote:
  :
> > > Yes, but those are just remove events and we can only see how destructive they
> > > were after the removal.  The point is to be able to figure out whether or not
> > > we *want* to do the removal in the first place.
> > 
> > Yes, but, you will always race if you try to test to see if you can shut
> > down a device and then trying to do it.  So walking the bus ahead of
> > time isn't a good idea.
> >
> > And, we really don't have a viable way to recover if disconnect() fails,
> > do we.  What do we do in that situation, restore the other devices we
> > disconnected successfully?  How do we remember/know what they were?
> > 
> > PCI hotplug almost had this same problem until the designers finally
> > realized that they just had to accept the fact that removing a PCI
> > device could either happen by:
> > 	- a user yanking out the device, at which time the OS better
> > 	  clean up properly no matter what happens
> > 	- the user asked nicely to remove a device, and the OS can take
> > 	  as long as it wants to complete that action, including
> > 	  stalling for noticable amounts of time before eventually,
> > 	  always letting the action succeed.
> > 
> > I think the second thing is what you have to do here.  If a user tells
> > the OS it wants to remove these devices, you better do it.  If you
> > can't, because memory is being used by someone else, either move them
> > off, or just hope that nothing bad happens, before the user gets
> > frustrated and yanks out the CPU/memory module themselves physically :)
> 
> Well, that we can't help, but sometimes users really *want* the OS to tell them
> if it is safe to unplug something at this particualr time (think about the
> Windows' "safe remove" feature for USB sticks, for example; that came out of
> users' demand AFAIR).
> 
> So in my opinion it would be good to give them an option to do "safe eject" or
> "forcible eject", whichever they prefer.

For system device hot-plug, it always needs to be "safe eject".  This
feature will be implemented on mission critical servers, which are
managed by professional IT folks.  Crashing a server causes serious
money to the business.

A user yanking out a system device won't happen, and it immediately
crashes the system if it is done.  So, we have nothing to do with this
case.  The 2nd case can hang the operation, waiting forever to proceed,
which is still a serious issue for enterprise customers.


> > > Say you have a computing node which signals a hardware problem in a processor
> > > package (the container with CPU cores, memory, PCI host bridge etc.).  You
> > > may want to eject that package, but you don't want to kill the system this
> > > way.  So if the eject is doable, it is very much desirable to do it, but if it
> > > is not doable, you'd rather shut the box down and do the replacement afterward.
> > > That may be costly, however (maybe weeks of computations), so it should be
> > > avoided if possible, but not at the expense of crashing the box if the eject
> > > doesn't work out.
> > 
> > These same "situations" came up for PCI hotplug, and I still say the
> > same resolution there holds true, as described above.  The user wants to
> > remove something, so let them do it.  They always know best, and get mad
> > at us if we think otherwise :)
> 
> Well, not necessarily.  Users sometimes really don't know what they are doing
> and want us to give them a hint.  My opinion is that if we can give them a
> hint, there's no reason not to.
> 
> > What does the ACPI spec say about this type of thing?  Surely the same
> > people that did the PCI Hotplug spec were consulted when doing this part
> > of the spec, right?  Yeah, I know, I can dream...
> 
> It's not very specific (as usual), but it gives hints. :-)
> 
> For example, there is the _OST method (Section 6.3.5 of ACPI 5) that we are
> supposed to use to notify the platform of ejection failures and there are
> status codes like "0x81: Device in use by application" or "0x82: Device busy"
> that can be used in there.  So definitely the authors took ejection failures
> for software-related reasons into consideration.

That is correct.  Also, ACPI spec deliberately does not define
implementation details, so we defined DIG64 hotplug spec below (which I
contributed to the spec.)

http://www.dig64.org/home/DIG64_HPPF_R1_0.pdf

For example, Figure 2 in page 14 states memory hot-remove flow.  The
operation needs to either succeed or fail.  Crash or hang is not an
option.


Thanks,
-Toshi

^ permalink raw reply

* Re: [PATCH v2 1/1] powerpc/85xx: Board support for ppa8548
From: Timur Tabi @ 2013-02-04 15:30 UTC (permalink / raw)
  To: Stef van Os; +Cc: Scott Wood, Paul Mackerras, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <1359920347-15340-1-git-send-email-stef.van.os@prodrive.nl>

On 02/03/2013 01:39 PM, Stef van Os wrote:

> +	pci0: pci@fe0008000 {
> +		status = "disabled";
> +	};
> +
> +	pci1: pci@fe0009000 {
> +		status = "disabled";
> +	};
> +
> +	pci2: pcie@fe000a000 {
> +		status = "disabled";
> +	};

I was hoping you'd follow my example and include a comment indicating 
why the PCI devices are all disabled.

> +static void ppa8548_show_cpuinfo(struct seq_file *m)
> +{
> +	uint svid, phid1;

Please don't used unsized integers for hardware registers.

	uint32_t svid, phid1;


-- 
Timur Tabi

^ permalink raw reply

* Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework
From: Greg KH @ 2013-02-04 14:33 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-s390, Toshi Kani, jiang.liu, wency, linux-acpi, yinghai,
	linux-kernel, linux-mm, isimatu.yasuaki, srivatsa.bhat, guohanjun,
	bhelgaas, akpm, linuxppc-dev, lenb
In-Reply-To: <2048116.Qo8UgQ5hjb@vostro.rjw.lan>

On Mon, Feb 04, 2013 at 03:21:22PM +0100, Rafael J. Wysocki wrote:
> On Monday, February 04, 2013 04:48:10 AM Greg KH wrote:
> > On Sun, Feb 03, 2013 at 09:44:39PM +0100, Rafael J. Wysocki wrote:
> > > > Yes, but those are just remove events and we can only see how destructive they
> > > > were after the removal.  The point is to be able to figure out whether or not
> > > > we *want* to do the removal in the first place.
> > > > 
> > > > Say you have a computing node which signals a hardware problem in a processor
> > > > package (the container with CPU cores, memory, PCI host bridge etc.).  You
> > > > may want to eject that package, but you don't want to kill the system this
> > > > way.  So if the eject is doable, it is very much desirable to do it, but if it
> > > > is not doable, you'd rather shut the box down and do the replacement afterward.
> > > > That may be costly, however (maybe weeks of computations), so it should be
> > > > avoided if possible, but not at the expense of crashing the box if the eject
> > > > doesn't work out.
> > > 
> > > It seems to me that we could handle that with the help of a new flag, say
> > > "no_eject", in struct device, a global mutex, and a function that will walk
> > > the given subtree of the device hierarchy and check if "no_eject" is set for
> > > any devices in there.  Plus a global "no_eject" switch, perhaps.
> > 
> > I think this will always be racy, or at worst, slow things down on
> > normal device operations as you will always be having to grab this flag
> > whenever you want to do something new.
> 
> I don't see why this particular scheme should be racy, at least I don't see any
> obvious races in it (although I'm not that good at races detection in general,
> admittedly).
> 
> Also, I don't expect that flag to be used for everything, just for things known
> to seriously break if forcible eject is done.  That may be not precise enough,
> so that's a matter of defining its purpose more precisely.
> 
> We can do something like that on the ACPI level (ie. introduce a no_eject flag
> in struct acpi_device and provide an iterface for the layers above ACPI to
> manipulate it) but then devices without ACPI namespace objects won't be
> covered.  That may not be a big deal, though.
> 
> So say dev is about to be used for something incompatible with ejecting, so to
> speak.  Then, one would do platform_lock_eject(dev), which would check if dev
> has an ACPI handle and then take acpi_eject_lock (if so).  The return value of
> platform_lock_eject(dev) would need to be checked to see if the device is not
> gone.  If it returns success (0), one would do something to the device and
> call platform_no_eject(dev) and then platform_unlock_eject(dev).

How does a device "know" it is doing something that is incompatible with
ejecting?  That's a non-trivial task from what I can tell.

What happens if a device wants to set that flag, right after it was told
to eject and the device was in the middle of being removed?  How can you
"fail" the "I can't be removed me now, so don't" requirement that it now
has?

thanks,

greg k-h

^ permalink raw reply

* Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework
From: Rafael J. Wysocki @ 2013-02-04 14:21 UTC (permalink / raw)
  To: Greg KH
  Cc: linux-s390, Toshi Kani, jiang.liu, wency, linux-acpi, yinghai,
	linux-kernel, linux-mm, isimatu.yasuaki, srivatsa.bhat, guohanjun,
	bhelgaas, akpm, linuxppc-dev, lenb
In-Reply-To: <20130204124810.GB22096@kroah.com>

On Monday, February 04, 2013 04:48:10 AM Greg KH wrote:
> On Sun, Feb 03, 2013 at 09:44:39PM +0100, Rafael J. Wysocki wrote:
> > > Yes, but those are just remove events and we can only see how destructive they
> > > were after the removal.  The point is to be able to figure out whether or not
> > > we *want* to do the removal in the first place.
> > > 
> > > Say you have a computing node which signals a hardware problem in a processor
> > > package (the container with CPU cores, memory, PCI host bridge etc.).  You
> > > may want to eject that package, but you don't want to kill the system this
> > > way.  So if the eject is doable, it is very much desirable to do it, but if it
> > > is not doable, you'd rather shut the box down and do the replacement afterward.
> > > That may be costly, however (maybe weeks of computations), so it should be
> > > avoided if possible, but not at the expense of crashing the box if the eject
> > > doesn't work out.
> > 
> > It seems to me that we could handle that with the help of a new flag, say
> > "no_eject", in struct device, a global mutex, and a function that will walk
> > the given subtree of the device hierarchy and check if "no_eject" is set for
> > any devices in there.  Plus a global "no_eject" switch, perhaps.
> 
> I think this will always be racy, or at worst, slow things down on
> normal device operations as you will always be having to grab this flag
> whenever you want to do something new.

I don't see why this particular scheme should be racy, at least I don't see any
obvious races in it (although I'm not that good at races detection in general,
admittedly).

Also, I don't expect that flag to be used for everything, just for things known
to seriously break if forcible eject is done.  That may be not precise enough,
so that's a matter of defining its purpose more precisely.

We can do something like that on the ACPI level (ie. introduce a no_eject flag
in struct acpi_device and provide an iterface for the layers above ACPI to
manipulate it) but then devices without ACPI namespace objects won't be
covered.  That may not be a big deal, though.

So say dev is about to be used for something incompatible with ejecting, so to
speak.  Then, one would do platform_lock_eject(dev), which would check if dev
has an ACPI handle and then take acpi_eject_lock (if so).  The return value of
platform_lock_eject(dev) would need to be checked to see if the device is not
gone.  If it returns success (0), one would do something to the device and
call platform_no_eject(dev) and then platform_unlock_eject(dev).

To clear no_eject one would just call platform_allow_to_eject(dev) that would
do all of the locking and clearing in one operation.

The ACPI eject side would be similar to the thing I described previously,
so it would (1) take acpi_eject_lock, (2) see if any struct acpi_device
involved has no_eject set and if not, then (3) do acpi_bus_trim(), (4)
carry out the eject and (5) release acpi_eject_lock.

Step (2) above might be optional, ie. if eject is forcible, we would just do
(3) etc. without (2).

The locking should prevent races from happening (and it should prevent two
ejects from happening at the same time too, which is not a bad thing by itself).

> See my comments earlier about pci hotplug and the design decisions there
> about "no eject" capabilities for why.

Well, I replied to that one too. :-)

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply

* Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug
From: Srivatsa S. Bhat @ 2013-02-04 13:47 UTC (permalink / raw)
  To: tglx, peterz, tj, oleg, paulmck, rusty, mingo
  Cc: linux-arch, linux, nikunj, linux-pm, fweisbec, linux-doc,
	linux-kernel, rostedt, xiaoguangrong, rjw, sbw, wangyun,
	Srivatsa S. Bhat, netdev, namhyung, akpm, walken, linuxppc-dev,
	linux-arm-kernel
In-Reply-To: <20130122073210.13822.50434.stgit@srivatsabhat.in.ibm.com>

[-- Attachment #1: Type: text/plain, Size: 3770 bytes --]

On 01/22/2013 01:03 PM, Srivatsa S. Bhat wrote:
> Hi,
> 
> This patchset removes CPU hotplug's dependence on stop_machine() from the CPU
> offline path and provides an alternative (set of APIs) to preempt_disable() to
> prevent CPUs from going offline, which can be invoked from atomic context.
> The motivation behind the removal of stop_machine() is to avoid its ill-effects
> and thus improve the design of CPU hotplug. (More description regarding this
> is available in the patches).
> 
> All the users of preempt_disable()/local_irq_disable() who used to use it to
> prevent CPU offline, have been converted to the new primitives introduced in the
> patchset. Also, the CPU_DYING notifiers have been audited to check whether
> they can cope up with the removal of stop_machine() or whether they need to
> use new locks for synchronization (all CPU_DYING notifiers looked OK, without
> the need for any new locks).
> 
> Applies on v3.8-rc4. It currently has some locking issues with cpu idle (on
> which even lockdep didn't provide any insight unfortunately). So for now, it
> works with CONFIG_CPU_IDLE=n.
> 

I ran this patchset on a POWER 7 machine with 32 cores (128 logical CPUs)
[POWER doesn't have the cpu idle issue]. And the results (latency or the time
taken for a single CPU offline) are shown below.

Experiment:
----------

Run a heavy workload (genload from LTP) that generates significant system time;
With '# online CPUs' online, measure the time it takes to complete the stop-m/c
phase in mainline and the equivalent phase in the patched kernel for 1 CPU
offline operation. (It is important to note here that the measurement shows the
average time it takes to perform a *single* CPU offline operation).

Expected results:
----------------

Since stop-machine doesn't scale with no. of online CPUs, we expect the
mainline kernel to take longer and longer for taking 1 CPU offline, with
increasing no. of online CPUs. The patched kernel is expected to take a
constant amount of time, irrespective of the number of online CPUs, because it
has a scalable design.


Experimental results:
---------------------

                 Avg. latency of 1 CPU offline (ms) [stop-cpu/stop-m/c latency]

# online CPUs    Mainline (with stop-m/c)       This patchset (no stop-m/c)

      8                 17.04                          7.73

     16                 18.05                          6.44

     32                 17.31                          7.39

     64                 32.40                          9.28

    128                 98.23                          7.35


Analysis and conclusion:
------------------------

The patched kernel performs pretty well and meets our expectations. It beats
mainline easily. As shown in the table above and the graph attached with this
mail, it has the following advantages:

1. Avg. latency is less than mainline (roughly half that of even the least
   in mainline).

2. The avg. latency is a constant, irrespective of number of online CPUs in
   the system, which proves that the design/synchronization scheme is scalable.

3. Throughout the duration shown above, mainline disables interrupts on all
   CPUs. But the patched kernel not only has a smaller duration of hotplug,
   but also keeps interrupts enabled on other CPUs, which makes CPU offline
   less disruptive on latency-sensitive workloads running on the system.


So, this gives us an idea of how this patchset actually performs. Of course
there are bugs and issues that still need fixing (even mainline crashes with
hotplug sometimes), but I did the above experiment to verify whether the
design is working as expected and whether it really shows significant
improvements over mainline. And thankfully, it does :-)

Regards,
Srivatsa S. Bhat


[-- Attachment #2: CPU hotplug latency.png --]
[-- Type: image/png, Size: 172574 bytes --]

^ permalink raw reply

* Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework
From: Rafael J. Wysocki @ 2013-02-04 13:41 UTC (permalink / raw)
  To: Greg KH
  Cc: linux-s390, Toshi Kani, jiang.liu, wency, linux-acpi, yinghai,
	linux-kernel, linux-mm, isimatu.yasuaki, srivatsa.bhat, guohanjun,
	bhelgaas, akpm, linuxppc-dev, lenb
In-Reply-To: <20130204012349.GA6433@kroah.com>

On Sunday, February 03, 2013 07:23:49 PM Greg KH wrote:
> On Sat, Feb 02, 2013 at 09:15:37PM +0100, Rafael J. Wysocki wrote:
> > On Saturday, February 02, 2013 03:58:01 PM Greg KH wrote:
> > > On Fri, Feb 01, 2013 at 11:12:59PM +0100, Rafael J. Wysocki wrote:
> > > > On Friday, February 01, 2013 08:23:12 AM Greg KH wrote:
> > > > > On Thu, Jan 31, 2013 at 09:54:51PM +0100, Rafael J. Wysocki wrote:
> > > > > > > > But, again, I'm going to ask why you aren't using the existing cpu /
> > > > > > > > memory / bridge / node devices that we have in the kernel.  Please use
> > > > > > > > them, or give me a _really_ good reason why they will not work.
> > > > > > > 
> > > > > > > We cannot use the existing system devices or ACPI devices here.  During
> > > > > > > hot-plug, ACPI handler sets this shp_device info, so that cpu and memory
> > > > > > > handlers (drivers/cpu.c and mm/memory_hotplug.c) can obtain their target
> > > > > > > device information in a platform-neutral way.  During hot-add, we first
> > > > > > > creates an ACPI device node (i.e. device under /sys/bus/acpi/devices),
> > > > > > > but platform-neutral modules cannot use them as they are ACPI-specific.
> > > > > > 
> > > > > > But suppose we're smart and have ACPI scan handlers that will create
> > > > > > "physical" device nodes for those devices during the ACPI namespace scan.
> > > > > > Then, the platform-neutral nodes will be able to bind to those "physical"
> > > > > > nodes.  Moreover, it should be possible to get a hierarchy of device objects
> > > > > > this way that will reflect all of the dependencies we need to take into
> > > > > > account during hot-add and hot-remove operations.  That may not be what we
> > > > > > have today, but I don't see any *fundamental* obstacles preventing us from
> > > > > > using this approach.
> > > > > 
> > > > > I would _much_ rather see that be the solution here as I think it is the
> > > > > proper one.
> > > > > 
> > > > > > This is already done for PCI host bridges and platform devices and I don't
> > > > > > see why we can't do that for the other types of devices too.
> > > > > 
> > > > > I agree.
> > > > > 
> > > > > > The only missing piece I see is a way to handle the "eject" problem, i.e.
> > > > > > when we try do eject a device at the top of a subtree and need to tear down
> > > > > > the entire subtree below it, but if that's going to lead to a system crash,
> > > > > > for example, we want to cancel the eject.  It seems to me that we'll need some
> > > > > > help from the driver core here.
> > > > > 
> > > > > I say do what we always have done here, if the user asked us to tear
> > > > > something down, let it happen as they are the ones that know best :)
> > > > > 
> > > > > Seriously, I guess this gets back to the "fail disconnect" idea that the
> > > > > ACPI developers keep harping on.  I thought we already resolved this
> > > > > properly by having them implement it in their bus code, no reason the
> > > > > same thing couldn't happen here, right?
> > > > 
> > > > Not really. :-)  We haven't ever resolved that particular issue I'm afraid.
> > > 
> > > Ah, I didn't realize that.
> > > 
> > > > > I don't think the core needs to do anything special, but if so, I'll be glad
> > > > > to review it.
> > > > 
> > > > OK, so this is the use case.  We have "eject" defined for something like
> > > > a container with a number of CPU cores, PCI host bridge, and a memory
> > > > controller under it.  And a few pretty much arbitrary I/O devices as a bonus.
> > > > 
> > > > Now, there's a button on the system case labeled as "Eject" and if that button
> > > > is pressed, we're supposed to _try_ to eject all of those things at once.  We
> > > > are allowed to fail that request, though, if that's problematic for some
> > > > reason, but we're supposed to let the BIOS know about that.
> > > > 
> > > > Do you seriously think that if that button is pressed, we should just proceed
> > > > with removing all that stuff no matter what?  That'd be kind of like Russian
> > > > roulette for whoever pressed that button, because s/he could only press it and
> > > > wait for the system to either crash or not.  Or maybe to crash a bit later
> > > > because of some delayed stuff that would hit one of those devices that had just
> > > > gone.  Surely not a situation any admin of a high-availability system would
> > > > like to be in. :-)
> > > > 
> > > > Quite frankly, I have no idea how that can be addressed in a single bus type,
> > > > let alone ACPI (which is not even a proper bus type, just something pretending
> > > > to be one).
> > > 
> > > You don't have it as a single bus type, you have a controller somewhere,
> > > off of the bus being destroyed, that handles sending remove events to
> > > the device and tearing everything down.  PCI does this from the very
> > > beginning.
> > 
> > Yes, but those are just remove events and we can only see how destructive they
> > were after the removal.  The point is to be able to figure out whether or not
> > we *want* to do the removal in the first place.
> 
> Yes, but, you will always race if you try to test to see if you can shut
> down a device and then trying to do it.  So walking the bus ahead of
> time isn't a good idea.
>
> And, we really don't have a viable way to recover if disconnect() fails,
> do we.  What do we do in that situation, restore the other devices we
> disconnected successfully?  How do we remember/know what they were?
> 
> PCI hotplug almost had this same problem until the designers finally
> realized that they just had to accept the fact that removing a PCI
> device could either happen by:
> 	- a user yanking out the device, at which time the OS better
> 	  clean up properly no matter what happens
> 	- the user asked nicely to remove a device, and the OS can take
> 	  as long as it wants to complete that action, including
> 	  stalling for noticable amounts of time before eventually,
> 	  always letting the action succeed.
> 
> I think the second thing is what you have to do here.  If a user tells
> the OS it wants to remove these devices, you better do it.  If you
> can't, because memory is being used by someone else, either move them
> off, or just hope that nothing bad happens, before the user gets
> frustrated and yanks out the CPU/memory module themselves physically :)

Well, that we can't help, but sometimes users really *want* the OS to tell them
if it is safe to unplug something at this particualr time (think about the
Windows' "safe remove" feature for USB sticks, for example; that came out of
users' demand AFAIR).

So in my opinion it would be good to give them an option to do "safe eject" or
"forcible eject", whichever they prefer.

> > Say you have a computing node which signals a hardware problem in a processor
> > package (the container with CPU cores, memory, PCI host bridge etc.).  You
> > may want to eject that package, but you don't want to kill the system this
> > way.  So if the eject is doable, it is very much desirable to do it, but if it
> > is not doable, you'd rather shut the box down and do the replacement afterward.
> > That may be costly, however (maybe weeks of computations), so it should be
> > avoided if possible, but not at the expense of crashing the box if the eject
> > doesn't work out.
> 
> These same "situations" came up for PCI hotplug, and I still say the
> same resolution there holds true, as described above.  The user wants to
> remove something, so let them do it.  They always know best, and get mad
> at us if we think otherwise :)

Well, not necessarily.  Users sometimes really don't know what they are doing
and want us to give them a hint.  My opinion is that if we can give them a
hint, there's no reason not to.

> What does the ACPI spec say about this type of thing?  Surely the same
> people that did the PCI Hotplug spec were consulted when doing this part
> of the spec, right?  Yeah, I know, I can dream...

It's not very specific (as usual), but it gives hints. :-)

For example, there is the _OST method (Section 6.3.5 of ACPI 5) that we are
supposed to use to notify the platform of ejection failures and there are
status codes like "0x81: Device in use by application" or "0x82: Device busy"
that can be used in there.  So definitely the authors took ejection failures
for software-related reasons into consideration.

> > > I know it's more complicated with these types of devices, and I think we
> > > are getting closer to the correct solution, I just don't want to ever
> > > see duplicate devices in the driver model for the same physical device.
> > 
> > Do you mean two things based on struct device for the same hardware component?
> > That's been happening already pretty much forever for every PCI device known
> > to the ACPI layer, for PNP and many others.  However, those ACPI things are (or
> > rather should be, but we're going to clean that up) only for convenience (to be
> > able to see the namespace structure and related things in sysfs).  So the stuff
> > under /sys/devices/LNXSYSTM\:00/ is not "real".
> 
> Yes, I've never treated that as a "real" device because they (usually)
> didn't ever bind to the "real" driver that controlled the device and how
> it talked to the rest of the os (like a USB device for example.)  I
> always thought just of it as a "shadow" of the firmware image, nothing
> that should be directly operated on if at all possible.

Precisely.  That's why I'd like to move that stuff away from /sys/devices/
and I don't see a reason why these objects should be based on struct device.
They need kobjects to show up in sysfs, but apart from this they don't really
need anything from struct device as far as I can say.

> But, as you are pointing out, maybe this needs to be changed.  Having
> users have to look in one part of the tree for one interface to a
> device, and another totally different part for a different interface to
> the same physical device is crazy, don't you agree?

Well, it is confusing.  I don't have a problem with exposing the ACPI namespace
in the form of a directory structure in sysfs and I see some benefits from
doing that, but I'd like it to be clear what's represented by that directory
structure and I don't want people to confuse ACPI device objects with devices
(they are abstract interfaces to devices rather than anything else).

> As to how to solve it, I really have no idea, I don't know ACPI that
> well at all, and honestly, don't want to, I want to keep what little
> hair I have left...

I totally understand you. :-)

> > In my view it shouldn't even
> > be under /sys/devices/ (/sys/firmware/acpi/ seems to be a better place for it),
> 
> I agree.
> 
> > but that may be difficult to change without breaking user space (maybe we can
> > just symlink it from /sys/devices/ or something).  And the ACPI bus type
> > shouldn't even exist in my opinion.
> > 
> > There's much confusion in there and much work to clean that up, I agree, but
> > that's kind of separate from the hotplug thing.
> 
> I agree as well.
> 
> Best of luck.

Thanks. :-)

Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply

* Re: 3.8: possible circular locking dependency detected
From: Michael Ellerman @ 2013-02-04 13:05 UTC (permalink / raw)
  To: Christian Kujau
  Cc: linux-wireless, Johannes Berg, linuxppc-dev, LKML, b43-dev
In-Reply-To: <alpine.DEB.2.01.1302031843450.13050@trent.utfs.org>

On Sun, 2013-02-03 at 21:09 -0800, Christian Kujau wrote:
> Hi,
> 
> similar to what I reported earlier [0] for 3.8.0-rc1, this happens during 
> "ifup wlan0" (which in effect starts wpa_supplicant to bring up a Broadcom 
> b43 wifi network interface). The interface is working though and continues 
> to work over several ifup/ifdown iterations.
> 
> The backtrace looks awfully similar to the earlier[0] report, but this 
> time it had b43* stuff in it so I thought I should report it. Full dmesg 
> and .config here: http://nerdbynature.de/bits/3.8.0-rc6/

>[0] https://lkml.org/lkml/2013/1/3/543

Actually the backtrace looks very different, that was fb vs
console_lock.

This one should probably go to linux-wireless@vger.kernel.org (CC-ed).

IIUI it's actually the work handling that is the problem. In
b43_wireless_core_stop() it calls cancel_work_sync() with the RTNL held,
but the work function (b43_request_firmware()) also takes the RTNL via
ieee80211_register_hw() and wiphy_register(). eg:

> [  807.767412]        CPU0                    CPU1
> [  807.768561]        ----                    ----
> [  807.769690]   lock(rtnl_mutex);
> [  807.770822]                                process_one_work(..., &wl->firmware_load);
> [  807.771970]                                lock(rtnl_mutex);
> [  807.773115]   cancel_work_sync(&wl->firmware_load);

cheers

> [  807.693791] 
> [  807.695519] ======================================================
> [  807.697198] [ INFO: possible circular locking dependency detected ]
> [  807.698890] 3.8.0-rc6-00008-g8b31849 #1 Not tainted
> [  807.700573] -------------------------------------------------------
> [  807.702255] wpa_supplicant/4129 is trying to acquire lock:
> [  807.703925]  ((&wl->firmware_load)){+.+.+.}, at: [<c0049d1c>] flush_work+0x0/0x2b0
> [  807.705696] 
> [  807.705696] but task is already holding lock:
> [  807.709023]  (rtnl_mutex){+.+.+.}, at: [<c04362d4>] rtnl_lock+0x1c/0x2c
> [  807.710743] 
> [  807.710743] which lock already depends on the new lock.
> [  807.710743] 
> [  807.715541] 
> [  807.715541] the existing dependency chain (in reverse order) is:
> [  807.718691] 
> [  807.718691] -> #1 (rtnl_mutex){+.+.+.}:
> [  807.721903]        [<c04f69ac>] mutex_lock_nested+0x6c/0x2bc
> [  807.723533]        [<c04362d4>] rtnl_lock+0x1c/0x2c
> [  807.725138]        [<f20be0d8>] wiphy_register+0x510/0x53c [cfg80211]
> [  807.726798]        [<f64815ec>] ieee80211_register_hw+0x3f8/0x82c [mac80211]
> [  807.728431]        [<f64c6890>] b43_request_firmware+0x8c/0x198 [b43]
> [  807.730025]        [<c004b1f4>] process_one_work+0x1a4/0x498
> [  807.731549]        [<c004b8b4>] worker_thread+0x17c/0x428
> [  807.733025]        [<c0051280>] kthread+0xa8/0xac
> [  807.734439]        [<c0010990>] ret_from_kernel_thread+0x64/0x6c
> [  807.735810] 
> [  807.735810] -> #0 ((&wl->firmware_load)){+.+.+.}:
> [  807.738389]        [<c0071b38>] lock_acquire+0x50/0x6c
> [  807.739694]        [<c0049d58>] flush_work+0x3c/0x2b0
> [  807.740980]        [<c004c30c>] __cancel_work_timer+0x94/0xec
> [  807.742271]        [<f64c748c>] b43_wireless_core_stop+0x5c/0x234 [b43]
> [  807.743574]        [<f64c76b0>] b43_op_stop+0x4c/0x88 [b43]
> [  807.744888]        [<f64a4c08>] ieee80211_stop_device+0x4c/0x8c [mac80211]
> [  807.746240]        [<f64913a8>] ieee80211_do_stop+0x2c0/0x5dc [mac80211]
> [  807.747582]        [<f64916dc>] ieee80211_stop+0x18/0x2c [mac80211]
> [  807.748925]        [<c0423d94>] __dev_close_many+0xb0/0x100
> [  807.750257]        [<c0423e10>] __dev_close+0x2c/0x4c
> [  807.751560]        [<c0427514>] __dev_change_flags+0x124/0x178
> [  807.752868]        [<c042761c>] dev_change_flags+0x1c/0x64
> [  807.754177]        [<c047db9c>] devinet_ioctl+0x69c/0x74c
> [  807.755459]        [<c047ecf8>] inet_ioctl+0xcc/0xf8
> [  807.756709]        [<c040eab0>] sock_ioctl+0x70/0x2e8
> [  807.757948]        [<c00d9f70>] do_vfs_ioctl+0xa4/0x7c0
> [  807.759182]        [<c00da6d0>] sys_ioctl+0x44/0x70
> [  807.760407]        [<c001084c>] ret_from_syscall+0x0/0x38
> [  807.761622] 
> [  807.761622] other info that might help us debug this:
> [  807.761622] 
> [  807.765107]  Possible unsafe locking scenario:
> [  807.765107] 
> [  807.767412]        CPU0                    CPU1
> [  807.768561]        ----                    ----
> [  807.769690]   lock(rtnl_mutex);
> [  807.770822]                                lock((&wl->firmware_load));
> [  807.771970]                                lock(rtnl_mutex);
> [  807.773115]   lock((&wl->firmware_load));
> [  807.774244] 
> [  807.774244]  *** DEADLOCK ***
> [  807.774244] 
> [  807.777405] 1 lock held by wpa_supplicant/4129:
> [  807.778475]  #0:  (rtnl_mutex){+.+.+.}, at: [<c04362d4>] rtnl_lock+0x1c/0x2c
> [  807.779628] 
> [  807.779628] stack backtrace:
> [  807.781720] Call Trace:
> [  807.782765] [eea2db20] [c0009160] show_stack+0x48/0x15c (unreliable)
> [  807.784087] [eea2db60] [c04fae24] print_circular_bug+0x2b0/0x2c8
> [  807.785169] [eea2db90] [c0071300] __lock_acquire+0x14f4/0x18c8
> [  807.786254] [eea2dc30] [c0071b38] lock_acquire+0x50/0x6c
> [  807.787335] [eea2dc50] [c0049d58] flush_work+0x3c/0x2b0
> [  807.788418] [eea2dcc0] [c004c30c] __cancel_work_timer+0x94/0xec
> [  807.789516] [eea2dcf0] [f64c748c] b43_wireless_core_stop+0x5c/0x234 [b43]
> [  807.790629] [eea2dd20] [f64c76b0] b43_op_stop+0x4c/0x88 [b43]
> [  807.791754] [eea2dd40] [f64a4c08] ieee80211_stop_device+0x4c/0x8c [mac80211]
> [  807.792888] [eea2dd50] [f64913a8] ieee80211_do_stop+0x2c0/0x5dc [mac80211]
> [  807.794016] [eea2dd80] [f64916dc] ieee80211_stop+0x18/0x2c [mac80211]
> [  807.795124] [eea2dd90] [c0423d94] __dev_close_many+0xb0/0x100
> [  807.796237] [eea2ddb0] [c0423e10] __dev_close+0x2c/0x4c
> [  807.797345] [eea2ddd0] [c0427514] __dev_change_flags+0x124/0x178
> [  807.798451] [eea2ddf0] [c042761c] dev_change_flags+0x1c/0x64
> [  807.799564] [eea2de10] [c047db9c] devinet_ioctl+0x69c/0x74c
> [  807.800680] [eea2de70] [c047ecf8] inet_ioctl+0xcc/0xf8
> [  807.801792] [eea2de80] [c040eab0] sock_ioctl+0x70/0x2e8
> [  807.802905] [eea2dea0] [c00d9f70] do_vfs_ioctl+0xa4/0x7c0
> [  807.804025] [eea2df10] [c00da6d0] sys_ioctl+0x44/0x70
> [  807.805145] [eea2df40] [c001084c] ret_from_syscall+0x0/0x38
> [  807.806270] --- Exception: c01 at 0xfbf2758
> [  807.806270]     LR = 0xfbf26bc
> 
> [  850.445766] b43-phy1: Broadcom 4306 WLAN found (core revision 5)
> [  850.469062] b43-phy1: Found PHY: Analog 2, Type 2 (G), Revision 2
> [  850.493199] Broadcom 43xx driver loaded [ Features: P ]
> [  850.497448] ieee80211 phy1: Selected rate control algorithm 'minstrel_ht'
> [  860.820792] b43-phy1: Loading firmware version 410.2160 (2007-05-26 15:32:10)
> 
> 

^ permalink raw reply

* Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework
From: Greg KH @ 2013-02-04 12:48 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-s390, Toshi Kani, jiang.liu, wency, linux-acpi, yinghai,
	linux-kernel, linux-mm, isimatu.yasuaki, srivatsa.bhat, guohanjun,
	bhelgaas, akpm, linuxppc-dev, lenb
In-Reply-To: <5876609.Ic1nhHW6N2@vostro.rjw.lan>

On Sun, Feb 03, 2013 at 09:44:39PM +0100, Rafael J. Wysocki wrote:
> > Yes, but those are just remove events and we can only see how destructive they
> > were after the removal.  The point is to be able to figure out whether or not
> > we *want* to do the removal in the first place.
> > 
> > Say you have a computing node which signals a hardware problem in a processor
> > package (the container with CPU cores, memory, PCI host bridge etc.).  You
> > may want to eject that package, but you don't want to kill the system this
> > way.  So if the eject is doable, it is very much desirable to do it, but if it
> > is not doable, you'd rather shut the box down and do the replacement afterward.
> > That may be costly, however (maybe weeks of computations), so it should be
> > avoided if possible, but not at the expense of crashing the box if the eject
> > doesn't work out.
> 
> It seems to me that we could handle that with the help of a new flag, say
> "no_eject", in struct device, a global mutex, and a function that will walk
> the given subtree of the device hierarchy and check if "no_eject" is set for
> any devices in there.  Plus a global "no_eject" switch, perhaps.

I think this will always be racy, or at worst, slow things down on
normal device operations as you will always be having to grab this flag
whenever you want to do something new.

See my comments earlier about pci hotplug and the design decisions there
about "no eject" capabilities for why.

thanks,

greg k-h

^ permalink raw reply

* Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework
From: Greg KH @ 2013-02-04 12:46 UTC (permalink / raw)
  To: Toshi Kani
  Cc: linux-s390@vger.kernel.org, jiang.liu@huawei.com,
	wency@cn.fujitsu.com, linux-mm@kvack.org, yinghai@kernel.org,
	linux-kernel@vger.kernel.org, Rafael J. Wysocki,
	linux-acpi@vger.kernel.org, isimatu.yasuaki@jp.fujitsu.com,
	srivatsa.bhat@linux.vnet.ibm.com, guohanjun@huawei.com,
	bhelgaas@google.com, akpm@linux-foundation.org,
	linuxppc-dev@lists.ozlabs.org, lenb@kernel.org
In-Reply-To: <1359937689.23410.82.camel@misato.fc.hp.com>

On Sun, Feb 03, 2013 at 05:28:09PM -0700, Toshi Kani wrote:
> On Sat, 2013-02-02 at 16:01 +0100, Greg KH wrote:
> > On Fri, Feb 01, 2013 at 01:40:10PM -0700, Toshi Kani wrote:
> > > On Fri, 2013-02-01 at 07:30 +0000, Greg KH wrote:
> > > > On Thu, Jan 31, 2013 at 06:32:18PM -0700, Toshi Kani wrote:
> > > >  > This is already done for PCI host bridges and platform devices and I don't
> > > > > > see why we can't do that for the other types of devices too.
> > > > > > 
> > > > > > The only missing piece I see is a way to handle the "eject" problem, i.e.
> > > > > > when we try do eject a device at the top of a subtree and need to tear down
> > > > > > the entire subtree below it, but if that's going to lead to a system crash,
> > > > > > for example, we want to cancel the eject.  It seems to me that we'll need some
> > > > > > help from the driver core here.
> > > > > 
> > > > > There are three different approaches suggested for system device
> > > > > hot-plug:
> > > > >  A. Proceed within system device bus scan.
> > > > >  B. Proceed within ACPI bus scan.
> > > > >  C. Proceed with a sequence (as a mini-boot).
> > > > > 
> > > > > Option A uses system devices as tokens, option B uses acpi devices as
> > > > > tokens, and option C uses resource tables as tokens, for their handlers.
> > > > > 
> > > > > Here is summary of key questions & answers so far.  I hope this
> > > > > clarifies why I am suggesting option 3.
> > > > > 
> > > > > 1. What are the system devices?
> > > > > System devices provide system-wide core computing resources, which are
> > > > > essential to compose a computer system.  System devices are not
> > > > > connected to any particular standard buses.
> > > > 
> > > > Not a problem, lots of devices are not connected to any "particular
> > > > standard busses".  All this means is that system devices are connected
> > > > to the "system" bus, nothing more.
> > > 
> > > Can you give me a few examples of other devices that support hotplug and
> > > are not connected to any particular buses?  I will investigate them to
> > > see how they are managed to support hotplug.
> > 
> > Any device that is attached to any bus in the driver model can be
> > hotunplugged from userspace by telling it to be "unbound" from the
> > driver controlling it.  Try it for any platform device in your system to
> > see how it happens.
> 
> The unbind operation, as I understand from you, is to detach a driver
> from a device.  Yes, unbinding can be done for any devices.  It is
> however different from hot-plug operation, which unplugs a device.

Physically, yes, but to the driver involved, and the driver core, there
is no difference.  That was one of the primary goals of the driver core
creation so many years ago.

> Today, the unbind operation to an ACPI cpu/memory devices causes
> hot-unplug (offline) operation to them, which is one of the major issues
> for us since unbind cannot fail.  This patchset addresses this issue by
> making the unbind operation of ACPI cpu/memory devices to do the
> unbinding only.  ACPI drivers no longer control cpu and memory as they
> are supposed to be controlled by their drivers, cpu and memory modules.

I think that's the problem right there, solve that, please.

> > > > > 2. Why are the system devices special?
> > > > > The system devices are initialized during early boot-time, by multiple
> > > > > subsystems, from the boot-up sequence, in pre-defined order.  They
> > > > > provide low-level services to enable other subsystems to come up.
> > > > 
> > > > Sorry, no, that doesn't mean they are special, nothing here is unique
> > > > for the point of view of the driver model from any other device or bus.
> > > 
> > > I think system devices are unique in a sense that they are initialized
> > > before drivers run.
> > 
> > No, most all devices are "initialized" before a driver runs on it, USB
> > is one such example, PCI another, and I'm pretty sure that there are
> > others.
> 
> USB devices can be initialized after the USB bus driver is initialized.
> Similarly, PCI devices can be initialized after the PCI bus driver is
> initialized.  However, CPU and memory are initialized without any
> dependency to their bus driver since there is no such thing.

You can create such a thing if you want :)

> In addition, CPU and memory have two drivers -- their actual
> drivers/subsystems and their ACPI drivers.

Again, I feel that is the root of the problem.  Rafael seems to be
working on solving this, which I think is essencial to your work as
well.

thanks,

greg k-h

^ permalink raw reply

* Re: [PATCH?] Move ACPI device nodes under /sys/firmware/acpi (was: Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework)
From: Rafael J. Wysocki @ 2013-02-04 12:34 UTC (permalink / raw)
  To: Greg KH
  Cc: linux-s390, Toshi Kani, jiang.liu, wency, linux-acpi, yinghai,
	linux-kernel, linux-mm, isimatu.yasuaki, srivatsa.bhat, guohanjun,
	bhelgaas, akpm, linuxppc-dev, lenb
In-Reply-To: <20130204012447.GB6433@kroah.com>

On Sunday, February 03, 2013 07:24:47 PM Greg KH wrote:
> On Sat, Feb 02, 2013 at 11:18:20PM +0100, Rafael J. Wysocki wrote:
> > On Saturday, February 02, 2013 09:15:37 PM Rafael J. Wysocki wrote:
> > > On Saturday, February 02, 2013 03:58:01 PM Greg KH wrote:
> > [...]
> > > 
> > > > I know it's more complicated with these types of devices, and I think we
> > > > are getting closer to the correct solution, I just don't want to ever
> > > > see duplicate devices in the driver model for the same physical device.
> > > 
> > > Do you mean two things based on struct device for the same hardware component?
> > > That's been happening already pretty much forever for every PCI device known
> > > to the ACPI layer, for PNP and many others.  However, those ACPI things are (or
> > > rather should be, but we're going to clean that up) only for convenience (to be
> > > able to see the namespace structure and related things in sysfs).  So the stuff
> > > under /sys/devices/LNXSYSTM\:00/ is not "real".  In my view it shouldn't even
> > > be under /sys/devices/ (/sys/firmware/acpi/ seems to be a better place for it),
> > > but that may be difficult to change without breaking user space (maybe we can
> > > just symlink it from /sys/devices/ or something).  And the ACPI bus type
> > > shouldn't even exist in my opinion.
> > 
> > Well, well.
> > 
> > In fact, the appended patch moves the whole ACPI device nodes tree under
> > /sys/firmware/acpi/ and I'm not seeing any negative consequences of that on my
> > test box (events work and so on).  User space is quite new on it, though, and
> > the patch is hackish.
> 
> Try booting a RHEL 5 image on this type of kernel, or some old Fedora
> releases, they were sensitive to changes in sysfs.

Well, I've found a machine where it causes problems to happen.

I'll try to add a symlink from /sys/devices to that and see what happens then.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply

* Re: [PATCH RESEND 1/1] arch Kconfig: remove references to IRQ_PER_CPU
From: Paul Mundt @ 2013-02-04 11:30 UTC (permalink / raw)
  To: James Hogan
  Cc: linux-arch, linux-mips, linux-ia64, linux-parisc, linux-sh,
	linux-hexagon, Helge Deller, linux-kernel, Fenghua Yu,
	James E.J. Bottomley, Mike Frysinger, Ralf Baechle,
	uclinux-dist-devel, Thomas Gleixner, linuxppc-dev, Paul Mackerras
In-Reply-To: <1359972583-17134-1-git-send-email-james.hogan@imgtec.com>

On Mon, Feb 04, 2013 at 10:09:43AM +0000, James Hogan wrote:
> The IRQ_PER_CPU Kconfig symbol was removed in the following commit:
> 
> Commit 6a58fb3bad099076f36f0f30f44507bc3275cdb6 ("genirq: Remove
> CONFIG_IRQ_PER_CPU") merged in v2.6.39-rc1.
> 
> But IRQ_PER_CPU wasn't removed from any of the architecture Kconfig
> files where it was defined or selected. It's completely unused so remove
> the remaining references.
> 
> Signed-off-by: James Hogan <james.hogan@imgtec.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Mike Frysinger <vapier@gentoo.org>
> Cc: Fenghua Yu <fenghua.yu@intel.com>
> Cc: Ralf Baechle <ralf@linux-mips.org>
> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
> Cc: Helge Deller <deller@gmx.de>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Paul Mundt <lethal@linux-sh.org>
> Acked-by: Tony Luck <tony.luck@intel.com>
> Acked-by: Richard Kuo <rkuo@codeaurora.org>

Acked-by: Paul Mundt <lethal@linux-sh.org>

^ permalink raw reply

* [PATCH v2] powerpc/512x: add function for chip select parameter configuration
From: Anatolij Gustschin @ 2013-02-04 10:16 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Anatolij Gustschin
In-Reply-To: <1359725305-23916-1-git-send-email-agust@denx.de>

Add ability to configure chip select (CS) parameters for devices
that need different CS parameters setup after their configuration.
I.e. an FPGA device on LP bus can require different CS parameters
for its bus interface after loading firmware into it. A driver
can easily reconfigure the LPC CS parameters using this function.

Signed-off-by: Anatolij Gustschin <agust@denx.de>
---
v2:
 - define CS in commit log
 - use unsigned int type for cs argument to simplify
   valid CS range check
 - use EXPORT_SYMBOL

 arch/powerpc/include/asm/mpc5121.h           |   16 +++++++++++++
 arch/powerpc/platforms/512x/mpc512x_shared.c |   30 ++++++++++++++++++++++++++
 2 files changed, 46 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/mpc5121.h b/arch/powerpc/include/asm/mpc5121.h
index 8c0ab2c..8ae133e 100644
--- a/arch/powerpc/include/asm/mpc5121.h
+++ b/arch/powerpc/include/asm/mpc5121.h
@@ -53,4 +53,20 @@ struct mpc512x_ccm {
 	u32	m4ccr;	/* MSCAN4 CCR */
 	u8	res[0x98]; /* Reserved */
 };
+
+/*
+ * LPC Module
+ */
+struct mpc512x_lpc {
+	u32	cs_cfg[8];	/* CS config */
+	u32	cs_ctrl;	/* CS Control Register */
+	u32	cs_status;	/* CS Status Register */
+	u32	burst_ctrl;	/* CS Burst Control Register */
+	u32	deadcycle_ctrl;	/* CS Deadcycle Control Register */
+	u32	holdcycle_ctrl;	/* CS Holdcycle Control Register */
+	u32	alt;		/* Address Latch Timing Register */
+};
+
+int mpc512x_cs_config(unsigned int cs, u32 val);
+
 #endif /* __ASM_POWERPC_MPC5121_H__ */
diff --git a/arch/powerpc/platforms/512x/mpc512x_shared.c b/arch/powerpc/platforms/512x/mpc512x_shared.c
index 39d5630..8e2bde9 100644
--- a/arch/powerpc/platforms/512x/mpc512x_shared.c
+++ b/arch/powerpc/platforms/512x/mpc512x_shared.c
@@ -456,3 +456,33 @@ void __init mpc512x_init(void)
 	mpc512x_restart_init();
 	mpc512x_psc_fifo_init();
 }
+
+/**
+ * mpc512x_cs_config - Setup chip select configuration
+ * @cs: chip select number
+ * @val: chip select configuration value
+ *
+ * Perform chip select configuration for devices on LocalPlus Bus.
+ * Intended to dynamically reconfigure the chip select parameters
+ * for configurable devices on the bus.
+ */
+int mpc512x_cs_config(unsigned int cs, u32 val)
+{
+	static struct mpc512x_lpc __iomem *lpc;
+	struct device_node *np;
+
+	if (cs > 7)
+		return -EINVAL;
+
+	if (!lpc) {
+		np = of_find_compatible_node(NULL, NULL, "fsl,mpc5121-lpc");
+		lpc = of_iomap(np, 0);
+		of_node_put(np);
+		if (!lpc)
+			return -ENOMEM;
+	}
+
+	out_be32(&lpc->cs_cfg[cs], val);
+	return 0;
+}
+EXPORT_SYMBOL(mpc512x_cs_config);
-- 
1.7.5.4

^ permalink raw reply related

* [PATCH RESEND 1/1] arch Kconfig: remove references to IRQ_PER_CPU
From: James Hogan @ 2013-02-04 10:09 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, linux-mips, James Hogan, linux-ia64, linux-parisc,
	linux-sh, linux-hexagon, Helge Deller, James E.J. Bottomley,
	Fenghua Yu, Paul Mundt, Mike Frysinger, Ralf Baechle,
	uclinux-dist-devel, Thomas Gleixner, linuxppc-dev, Paul Mackerras

The IRQ_PER_CPU Kconfig symbol was removed in the following commit:

Commit 6a58fb3bad099076f36f0f30f44507bc3275cdb6 ("genirq: Remove
CONFIG_IRQ_PER_CPU") merged in v2.6.39-rc1.

But IRQ_PER_CPU wasn't removed from any of the architecture Kconfig
files where it was defined or selected. It's completely unused so remove
the remaining references.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Acked-by: Tony Luck <tony.luck@intel.com>
Acked-by: Richard Kuo <rkuo@codeaurora.org>
---

Does anybody want to pick this patch up?

 arch/blackfin/Kconfig |    1 -
 arch/hexagon/Kconfig  |    1 -
 arch/ia64/Kconfig     |    1 -
 arch/mips/Kconfig     |    1 -
 arch/parisc/Kconfig   |    1 -
 arch/powerpc/Kconfig  |    1 -
 arch/sh/Kconfig       |    3 ---
 7 files changed, 0 insertions(+), 9 deletions(-)

diff --git a/arch/blackfin/Kconfig b/arch/blackfin/Kconfig
index b6f3ad5..c709715 100644
--- a/arch/blackfin/Kconfig
+++ b/arch/blackfin/Kconfig
@@ -38,7 +38,6 @@ config BLACKFIN
 	select HAVE_GENERIC_HARDIRQS
 	select GENERIC_ATOMIC64
 	select GENERIC_IRQ_PROBE
-	select IRQ_PER_CPU if SMP
 	select USE_GENERIC_SMP_HELPERS if SMP
 	select HAVE_NMI_WATCHDOG if NMI_WATCHDOG
 	select GENERIC_SMP_IDLE_THREAD
diff --git a/arch/hexagon/Kconfig b/arch/hexagon/Kconfig
index 0744f7d..800dd9c 100644
--- a/arch/hexagon/Kconfig
+++ b/arch/hexagon/Kconfig
@@ -12,7 +12,6 @@ config HEXAGON
 	# select ARCH_WANT_OPTIONAL_GPIOLIB
 	# select ARCH_REQUIRE_GPIOLIB
 	# select HAVE_CLK
-	# select IRQ_PER_CPU
 	# select GENERIC_PENDING_IRQ if SMP
 	select HAVE_IRQ_WORK
 	select GENERIC_ATOMIC64
diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index 3279646..00c2e88 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -29,7 +29,6 @@ config IA64
 	select ARCH_DISCARD_MEMBLOCK
 	select GENERIC_IRQ_PROBE
 	select GENERIC_PENDING_IRQ if SMP
-	select IRQ_PER_CPU
 	select GENERIC_IRQ_SHOW
 	select ARCH_WANT_OPTIONAL_GPIOLIB
 	select ARCH_HAVE_NMI_SAFE_CMPXCHG
diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 2ac626a..451c2e7 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -2161,7 +2161,6 @@ source "mm/Kconfig"
 config SMP
 	bool "Multi-Processing support"
 	depends on SYS_SUPPORTS_SMP
-	select IRQ_PER_CPU
 	select USE_GENERIC_SMP_HELPERS
 	help
 	  This enables support for systems with more than one CPU. If you have
diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig
index b77feff..8525be4 100644
--- a/arch/parisc/Kconfig
+++ b/arch/parisc/Kconfig
@@ -16,7 +16,6 @@ config PARISC
 	select BROKEN_RODATA
 	select GENERIC_IRQ_PROBE
 	select GENERIC_PCI_IOMAP
-	select IRQ_PER_CPU
 	select ARCH_HAVE_NMI_SAFE_CMPXCHG
 	select GENERIC_SMP_IDLE_THREAD
 	select GENERIC_STRNCPY_FROM_USER
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 17903f1..f3d7090a 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -125,7 +125,6 @@ config PPC
 	select HAVE_GENERIC_HARDIRQS
 	select ARCH_WANT_IPC_PARSE_VERSION
 	select SPARSE_IRQ
-	select IRQ_PER_CPU
 	select IRQ_DOMAIN
 	select GENERIC_IRQ_SHOW
 	select GENERIC_IRQ_SHOW_LEVEL
diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig
index babc2b8..6f799ec 100644
--- a/arch/sh/Kconfig
+++ b/arch/sh/Kconfig
@@ -91,9 +91,6 @@ config GENERIC_CSUM
 config GENERIC_HWEIGHT
 	def_bool y
 
-config IRQ_PER_CPU
-	def_bool y
-
 config GENERIC_GPIO
 	def_bool n
 
-- 
1.7.7.6

^ permalink raw reply related

* Re: [PATCH 0/3] Enable multiple MSI feature in pSeries
From: Mike Qiu @ 2013-02-04  6:43 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: tglx, linuxppc-dev, linux-kernel
In-Reply-To: <1359957375.25414.31.camel@concordia>

2013/2/4 13:56, Michael Ellerman:
> On Mon, 2013-02-04 at 11:49 +0800, Mike Qiu wrote:
>>> On Tue, 2013-01-15 at 15:38 +0800, Mike Qiu wrote:
>>>> Currently, multiple MSI feature hasn't been enabled in pSeries,
>>>> These patches try to enbale this feature.
>>> Hi Mike,
>>>
>>>> These patches have been tested by using ipr driver, and the driver patch
>>>> has been made by Wen Xiong <wenxiong@linux.vnet.ibm.com>:
>>> So who wrote these patches? Normally we would expect the original author
>>> to post the patches if at all possible.
>> Hi Michael
>>
>> These Multiple MSI patches were wrote by myself, you know this feature
>> has not enabled
>> and it need device driver to test whether it works suitable. So I test
>> my patches use
>> Wen Xiong's ipr patches, which has been send out to the maillinglist.
>>
>> I'm the original author :)
> Ah OK, sorry, that was more or less clear from your mail but I just
> misunderstood.
>
>>>> [PATCH 0/7] Add support for new IBM SAS controllers
>>> I would like to see the full series, including the driver enablement.
>> Yep, but the driver patches were wrote by Wen Xiong and has been send
>> out.
> OK, you mean this series?
>
> http://thread.gmane.org/gmane.linux.scsi/79639
Yes, exactly.
>
>
>> I just use her patches to test my patches. all device support Multiple
>> MSI can use my feature not only IBM SAS controllers, I also test my
>> patches use the broadcom wireless card tg3, and also works OK.
> You mean drivers/net/ethernet/broadcom/tg3.c ? I don't see where it
> calls pci_enable_msi_block() ?
Yes, I just modify the driver to support mutiple MSI.
>
> All devices /can/ use it, but the driver needs to be updated. Currently
> we have two drivers that do so (in Linus' tree), plus the updated IPR.
Not all devices, just the device which support the multiple MSI by hardware,
can use it
>
>>>> Test platform: One partition of pSeries with one cpu core(4 SMTs) and
>>>>                 RAID bus controller: IBM PCI-E IPR SAS Adapter (ASIC) in POWER7
>>>> OS version: SUSE Linux Enterprise Server 11 SP2  (ppc64) with 3.8-rc3 kernel
>>>>
>>>> IRQ 21 and 22 are assigned to the ipr device which support 2 mutiple MSI.
>>>>
>>>> The test results is shown by 'cat /proc/interrups':
>>>>            CPU0       CPU1       CPU2       CPU3
>>>> 21:          6          5          5          5      XICS Level     host1-0
>>>> 22:        817        814        816        813      XICS Level     host1-1
>>> This shows that you are correctly configuring two MSIs.
>>>
>>> But the key advantage of using multiple interrupts is to distribute load
>>> across CPUs and improve performance. So I would like to see some
>>> performance numbers that show that there is a real benefit for all the
>>> extra complexity in the code.
>> Yes, the system just has suport two MSIs. Anyway, I will try to do
>> some proformance test, to show the real benefit.
>> But actually it needs the driver to do so. As the data show above, it
>> seems there is some problems in use the interrupt, the irq 21 use few,
>> most use 22, I will discuss with the driver author to see why and if
>> she fixed, I will give out the proformance result.
> Yeah that would be good.
>
> I really dislike that we have a separate API for multi-MSI vs MSI-X, and
> pci_enable_msi_block() also pushes the contiguous power-of-2 allocation
> into the irq domain layer, which is unpleasant. So if we really must do
> multi-MSI I would like to do it differently.
Yes, but the multi-MSI must need the hardware support, it is one extend 
for MSI,
The device may sopport MSI and multiple MSI, but not support MSI-X.
for these devices, we'd better use multiple MSI to makes it more efficiency,
compare with MSI.

multi-MSI just can use no more than 32 interrupts

Thanks
>
> cheers
>
>

^ permalink raw reply

* Re: [PATCH 0/3] Enable multiple MSI feature in pSeries
From: Michael Ellerman @ 2013-02-04  5:56 UTC (permalink / raw)
  To: Mike Qiu; +Cc: tglx, linuxppc-dev, linux-kernel
In-Reply-To: <510F2FDB.6020303@linux.vnet.ibm.com>

On Mon, 2013-02-04 at 11:49 +0800, Mike Qiu wrote:
> > On Tue, 2013-01-15 at 15:38 +0800, Mike Qiu wrote:
> > > Currently, multiple MSI feature hasn't been enabled in pSeries,
> > > These patches try to enbale this feature.
> > Hi Mike,
> > 
> > > These patches have been tested by using ipr driver, and the driver patch
> > > has been made by Wen Xiong <wenxiong@linux.vnet.ibm.com>:
> > So who wrote these patches? Normally we would expect the original author
> > to post the patches if at all possible.
> Hi Michael
> 
> These Multiple MSI patches were wrote by myself, you know this feature
> has not enabled
> and it need device driver to test whether it works suitable. So I test
> my patches use 
> Wen Xiong's ipr patches, which has been send out to the maillinglist.
> 
> I'm the original author :)

Ah OK, sorry, that was more or less clear from your mail but I just
misunderstood.

> > > [PATCH 0/7] Add support for new IBM SAS controllers
> > I would like to see the full series, including the driver enablement.
> Yep, but the driver patches were wrote by Wen Xiong and has been send
> out.

OK, you mean this series?

http://thread.gmane.org/gmane.linux.scsi/79639


> I just use her patches to test my patches. all device support Multiple
> MSI can use my feature not only IBM SAS controllers, I also test my
> patches use the broadcom wireless card tg3, and also works OK.

You mean drivers/net/ethernet/broadcom/tg3.c ? I don't see where it
calls pci_enable_msi_block() ?

All devices /can/ use it, but the driver needs to be updated. Currently
we have two drivers that do so (in Linus' tree), plus the updated IPR.

> > > Test platform: One partition of pSeries with one cpu core(4 SMTs) and 
> > >                RAID bus controller: IBM PCI-E IPR SAS Adapter (ASIC) in POWER7
> > > OS version: SUSE Linux Enterprise Server 11 SP2  (ppc64) with 3.8-rc3 kernel 
> > > 
> > > IRQ 21 and 22 are assigned to the ipr device which support 2 mutiple MSI.
> > > 
> > > The test results is shown by 'cat /proc/interrups':
> > >           CPU0       CPU1       CPU2       CPU3       
> > > 21:          6          5          5          5      XICS Level     host1-0
> > > 22:        817        814        816        813      XICS Level     host1-1
> > This shows that you are correctly configuring two MSIs.
> > 
> > But the key advantage of using multiple interrupts is to distribute load
> > across CPUs and improve performance. So I would like to see some
> > performance numbers that show that there is a real benefit for all the
> > extra complexity in the code.

> Yes, the system just has suport two MSIs. Anyway, I will try to do
> some proformance test, to show the real benefit.
> But actually it needs the driver to do so. As the data show above, it
> seems there is some problems in use the interrupt, the irq 21 use few,
> most use 22, I will discuss with the driver author to see why and if
> she fixed, I will give out the proformance result.

Yeah that would be good.

I really dislike that we have a separate API for multi-MSI vs MSI-X, and
pci_enable_msi_block() also pushes the contiguous power-of-2 allocation
into the irq domain layer, which is unpleasant. So if we really must do
multi-MSI I would like to do it differently.

cheers

^ permalink raw reply

* Re: [PATCH powerpc] Use VPA subfunction macros instead of numbers for vpa calls
From: Michael Ellerman @ 2013-02-04  5:30 UTC (permalink / raw)
  To: Li Zhong; +Cc: Paul Mackerras, PowerPC email list
In-Reply-To: <1359101541.2666.59.camel@ThinkPad-T5421.cn.ibm.com>

On Fri, 2013-01-25 at 16:12 +0800, Li Zhong wrote:
> Use macros in vpa calls. 
> 
> Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>

Looks good.

cheers

^ permalink raw reply

* 3.8: possible circular locking dependency detected
From: Christian Kujau @ 2013-02-04  5:09 UTC (permalink / raw)
  To: LKML; +Cc: linuxppc-dev, b43-dev

Hi,

similar to what I reported earlier [0] for 3.8.0-rc1, this happens during 
"ifup wlan0" (which in effect starts wpa_supplicant to bring up a Broadcom 
b43 wifi network interface). The interface is working though and continues 
to work over several ifup/ifdown iterations.

The backtrace looks awfully similar to the earlier[0] report, but this 
time it had b43* stuff in it so I thought I should report it. Full dmesg 
and .config here: http://nerdbynature.de/bits/3.8.0-rc6/

Christian.

[  807.693791] 
[  807.695519] ======================================================
[  807.697198] [ INFO: possible circular locking dependency detected ]
[  807.698890] 3.8.0-rc6-00008-g8b31849 #1 Not tainted
[  807.700573] -------------------------------------------------------
[  807.702255] wpa_supplicant/4129 is trying to acquire lock:
[  807.703925]  ((&wl->firmware_load)){+.+.+.}, at: [<c0049d1c>] flush_work+0x0/0x2b0
[  807.705696] 
[  807.705696] but task is already holding lock:
[  807.709023]  (rtnl_mutex){+.+.+.}, at: [<c04362d4>] rtnl_lock+0x1c/0x2c
[  807.710743] 
[  807.710743] which lock already depends on the new lock.
[  807.710743] 
[  807.715541] 
[  807.715541] the existing dependency chain (in reverse order) is:
[  807.718691] 
[  807.718691] -> #1 (rtnl_mutex){+.+.+.}:
[  807.721903]        [<c04f69ac>] mutex_lock_nested+0x6c/0x2bc
[  807.723533]        [<c04362d4>] rtnl_lock+0x1c/0x2c
[  807.725138]        [<f20be0d8>] wiphy_register+0x510/0x53c [cfg80211]
[  807.726798]        [<f64815ec>] ieee80211_register_hw+0x3f8/0x82c [mac80211]
[  807.728431]        [<f64c6890>] b43_request_firmware+0x8c/0x198 [b43]
[  807.730025]        [<c004b1f4>] process_one_work+0x1a4/0x498
[  807.731549]        [<c004b8b4>] worker_thread+0x17c/0x428
[  807.733025]        [<c0051280>] kthread+0xa8/0xac
[  807.734439]        [<c0010990>] ret_from_kernel_thread+0x64/0x6c
[  807.735810] 
[  807.735810] -> #0 ((&wl->firmware_load)){+.+.+.}:
[  807.738389]        [<c0071b38>] lock_acquire+0x50/0x6c
[  807.739694]        [<c0049d58>] flush_work+0x3c/0x2b0
[  807.740980]        [<c004c30c>] __cancel_work_timer+0x94/0xec
[  807.742271]        [<f64c748c>] b43_wireless_core_stop+0x5c/0x234 [b43]
[  807.743574]        [<f64c76b0>] b43_op_stop+0x4c/0x88 [b43]
[  807.744888]        [<f64a4c08>] ieee80211_stop_device+0x4c/0x8c [mac80211]
[  807.746240]        [<f64913a8>] ieee80211_do_stop+0x2c0/0x5dc [mac80211]
[  807.747582]        [<f64916dc>] ieee80211_stop+0x18/0x2c [mac80211]
[  807.748925]        [<c0423d94>] __dev_close_many+0xb0/0x100
[  807.750257]        [<c0423e10>] __dev_close+0x2c/0x4c
[  807.751560]        [<c0427514>] __dev_change_flags+0x124/0x178
[  807.752868]        [<c042761c>] dev_change_flags+0x1c/0x64
[  807.754177]        [<c047db9c>] devinet_ioctl+0x69c/0x74c
[  807.755459]        [<c047ecf8>] inet_ioctl+0xcc/0xf8
[  807.756709]        [<c040eab0>] sock_ioctl+0x70/0x2e8
[  807.757948]        [<c00d9f70>] do_vfs_ioctl+0xa4/0x7c0
[  807.759182]        [<c00da6d0>] sys_ioctl+0x44/0x70
[  807.760407]        [<c001084c>] ret_from_syscall+0x0/0x38
[  807.761622] 
[  807.761622] other info that might help us debug this:
[  807.761622] 
[  807.765107]  Possible unsafe locking scenario:
[  807.765107] 
[  807.767412]        CPU0                    CPU1
[  807.768561]        ----                    ----
[  807.769690]   lock(rtnl_mutex);
[  807.770822]                                lock((&wl->firmware_load));
[  807.771970]                                lock(rtnl_mutex);
[  807.773115]   lock((&wl->firmware_load));
[  807.774244] 
[  807.774244]  *** DEADLOCK ***
[  807.774244] 
[  807.777405] 1 lock held by wpa_supplicant/4129:
[  807.778475]  #0:  (rtnl_mutex){+.+.+.}, at: [<c04362d4>] rtnl_lock+0x1c/0x2c
[  807.779628] 
[  807.779628] stack backtrace:
[  807.781720] Call Trace:
[  807.782765] [eea2db20] [c0009160] show_stack+0x48/0x15c (unreliable)
[  807.784087] [eea2db60] [c04fae24] print_circular_bug+0x2b0/0x2c8
[  807.785169] [eea2db90] [c0071300] __lock_acquire+0x14f4/0x18c8
[  807.786254] [eea2dc30] [c0071b38] lock_acquire+0x50/0x6c
[  807.787335] [eea2dc50] [c0049d58] flush_work+0x3c/0x2b0
[  807.788418] [eea2dcc0] [c004c30c] __cancel_work_timer+0x94/0xec
[  807.789516] [eea2dcf0] [f64c748c] b43_wireless_core_stop+0x5c/0x234 [b43]
[  807.790629] [eea2dd20] [f64c76b0] b43_op_stop+0x4c/0x88 [b43]
[  807.791754] [eea2dd40] [f64a4c08] ieee80211_stop_device+0x4c/0x8c [mac80211]
[  807.792888] [eea2dd50] [f64913a8] ieee80211_do_stop+0x2c0/0x5dc [mac80211]
[  807.794016] [eea2dd80] [f64916dc] ieee80211_stop+0x18/0x2c [mac80211]
[  807.795124] [eea2dd90] [c0423d94] __dev_close_many+0xb0/0x100
[  807.796237] [eea2ddb0] [c0423e10] __dev_close+0x2c/0x4c
[  807.797345] [eea2ddd0] [c0427514] __dev_change_flags+0x124/0x178
[  807.798451] [eea2ddf0] [c042761c] dev_change_flags+0x1c/0x64
[  807.799564] [eea2de10] [c047db9c] devinet_ioctl+0x69c/0x74c
[  807.800680] [eea2de70] [c047ecf8] inet_ioctl+0xcc/0xf8
[  807.801792] [eea2de80] [c040eab0] sock_ioctl+0x70/0x2e8
[  807.802905] [eea2dea0] [c00d9f70] do_vfs_ioctl+0xa4/0x7c0
[  807.804025] [eea2df10] [c00da6d0] sys_ioctl+0x44/0x70
[  807.805145] [eea2df40] [c001084c] ret_from_syscall+0x0/0x38
[  807.806270] --- Exception: c01 at 0xfbf2758
[  807.806270]     LR = 0xfbf26bc

[  850.445766] b43-phy1: Broadcom 4306 WLAN found (core revision 5)
[  850.469062] b43-phy1: Found PHY: Analog 2, Type 2 (G), Revision 2
[  850.493199] Broadcom 43xx driver loaded [ Features: P ]
[  850.497448] ieee80211 phy1: Selected rate control algorithm 'minstrel_ht'
[  860.820792] b43-phy1: Loading firmware version 410.2160 (2007-05-26 15:32:10)


[0] https://lkml.org/lkml/2013/1/3/543
-- 
BOFH excuse #174:

Backbone adjustment

^ permalink raw reply

* Re: [PATCH] powerpc/mm: Fix hash computation function
From: Benjamin Herrenschmidt @ 2013-02-04  4:24 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: Mike Qiu, linuxppc-dev, paulus, Aneesh Kumar K.V
In-Reply-To: <1359948565.25414.12.camel@concordia>

On Mon, 2013-02-04 at 14:29 +1100, Michael Ellerman wrote:
> On Wed, 2013-01-30 at 11:10 +0530, Aneesh Kumar K.V wrote:
> > From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
> > 
> > The ASM version of hash computation function was truncating the upper bit.
> > Make the ASM version similar to hpt_hash function. Remove masking vsid bits.
> > Without this patch, we observed hang during bootup due to not satisfying page
> > fault request correctly.
> 
> Which commit(s) introduced the bug?

The bug has been there for ever (well, as far back as git goes without
digging the old monster repo) so may as well date from my original port
of the C hashing code to asm.

However it wasn't per-se a bug until Aneesh 64T support was added in 3.7
since we didn't use the top vsid bits that are masked.

I've merged the patch. My only worry is whether there might have been a
reason for the masking in the first place, ie, do we ever carry VSIDs
with bad bits at the top... though the hash mask should take care of
that in any case (well I hope...)

Cheers,
Ben.

^ permalink raw reply

* [git pull] Please pull powerpc.git merge branch
From: Benjamin Herrenschmidt @ 2013-02-04  4:22 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linuxppc-dev, Linux Kernel list

Hi Linus !

Just so that you don't get too bored on your Island here's a patch for
3.8 fixing a nasty bug that affects the new 64T support that was merged
in 3.7. Please apply whenever you have a chance (and an internet
connection !)

Cheers,
Ben.

The following changes since commit 689dfa894c57842a05bf6dc9f97e6bb71ec5f386:

  powerpc: Max next_tb to prevent from replaying timer interrupt (2013-01-29 10:18:16 +1100)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc.git merge

for you to fetch changes up to eda8eebdd153c48a4e2a3a3ac3cd9e2e31f5c6b3:

  powerpc/mm: Fix hash computation function (2013-02-04 15:15:08 +1100)

----------------------------------------------------------------
Aneesh Kumar K.V (1):
      powerpc/mm: Fix hash computation function

 arch/powerpc/mm/hash_low_64.S |   62 +++++++++++++++++++++++------------------
 1 file changed, 35 insertions(+), 27 deletions(-)

^ permalink raw reply

* Re: [PATCH 0/3] Enable multiple MSI feature in pSeries
From: Mike Qiu @ 2013-02-04  3:49 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: tglx, linuxppc-dev, linux-kernel
In-Reply-To: <1359948180.25414.11.camel@concordia>

[-- Attachment #1: Type: text/plain, Size: 2427 bytes --]

> On Tue, 2013-01-15 at 15:38 +0800, Mike Qiu wrote:
>> Currently, multiple MSI feature hasn't been enabled in pSeries,
>> These patches try to enbale this feature.
> Hi Mike,
>
>> These patches have been tested by using ipr driver, and the driver patch
>> has been made by Wen Xiong <wenxiong@linux.vnet.ibm.com>:
> So who wrote these patches? Normally we would expect the original author
> to post the patches if at all possible.
Hi Michael

These Multiple MSI patches were wrote by myself, you know this feature 
has not enabled
and it need device driver to test whether it works suitable. So I test 
my patches use
Wen Xiong's ipr patches, which has been send out to the maillinglist.

I'm the**original author :)
>
>> [PATCH 0/7] Add support for new IBM SAS controllers
> I would like to see the full series, including the driver enablement.
Yep, but the driver patches were wrote by Wen Xiong and has been send out.
I just use her patches to test my patches. all device support Multiple MSI
can use my feature not only IBM SAS controllers, I also test my patches use
the broadcom wireless card tg3, and also works OK.
>
>> Test platform: One partition of pSeries with one cpu core(4 SMTs) and
>>                 RAID bus controller: IBM PCI-E IPR SAS Adapter (ASIC) in POWER7
>> OS version: SUSE Linux Enterprise Server 11 SP2  (ppc64) with 3.8-rc3 kernel
>>
>> IRQ 21 and 22 are assigned to the ipr device which support 2 mutiple MSI.
>>
>> The test results is shown by 'cat /proc/interrups':
>>            CPU0       CPU1       CPU2       CPU3
>> 21:          6          5          5          5      XICS Level     host1-0
>> 22:        817        814        816        813      XICS Level     host1-1
> This shows that you are correctly configuring two MSIs.
>
> But the key advantage of using multiple interrupts is to distribute load
> across CPUs and improve performance. So I would like to see some
> performance numbers that show that there is a real benefit for all the
> extra complexity in the code.
Yes, the system just has suport two MSIs. Anyway, I will try to do some 
proformance
test, to show the real benefit.
But actually it needs the driver to do so. As the data show above, it 
seems there is
some problems in use the interrupt, the irq 21 use few, most use 22, I 
will discuss
with the driver author to see why and if she fixed, I will give out the 
proformance
result.

Thanks

Mike

>
> cheers
>


[-- Attachment #2: Type: text/html, Size: 3772 bytes --]

^ permalink raw reply

* Re: [PATCH 2/2] pseries/iommu: remove DDW on kexec
From: Michael Ellerman @ 2013-02-04  3:38 UTC (permalink / raw)
  To: Nishanth Aravamudan; +Cc: miltonm, paulus, anton, nfont, linuxppc-dev
In-Reply-To: <20130129203346.GA7664@linux.vnet.ibm.com>

On Tue, 2013-01-29 at 12:33 -0800, Nishanth Aravamudan wrote:
> Hi Michael,
> 
> On 29.01.2013 [21:58:28 +1100], Michael Ellerman wrote:
> > On Mon, 2013-01-28 at 18:03 -0800, Nishanth Aravamudan wrote:
> > > pseries/iommu: remove DDW on kexec
> > >  ...
> > >     
> > > I believe the simplest, easiest-to-maintain fix is to just change our
> > > initcall to, rather than detecting and updating the new kernel's DDW
> > > knowledge, just remove all DDW configurations. When the drivers
> > > re-initialize, we will set everything back up as it was before.
> > 
> > I don't know this code at all, but this sounds like it will also work
> > for kdump, right? ie. when the original kernel has crashed the 2nd
> > kernel will tear the DDW down and set it back up.
> 
> Yes, my actual test-case (and what was reported as broken) was kdump.
> From my relatively vague (but now growing) understanding of that
> process, kdump does use kexec under the covers to switch to the crash
> kernel, and so does get the same benefit from this change.

OK good. Yeah kdump is basically equivalent to kexec with one major
difference, which is that before a kexec the first kernel is shutdown
more or less normally - as if it was about to reboot.

kdump on the other hand is invoked in the panic path, so nothing is
shutdown in the first kernel (almost, see ehea). So to accommodate kdump
you want any logic to be in the startup path of the 2nd kernel.

> Another datapoint, though, is that it might make sense to recommend (and
> I'm working on figuring this out for the distros, etc) to use
> disable_ddw anyways for the kdump kernel command-line, as DDW isn't
> 'free' and it's unclear if performance is a huge concern for the crash
> kernel (sort of varies with where your storage is, and how much you need
> to dump, which for kdump generally doesn't seem like that much?).

The kdump kernel /should/ just be creating a crash dump and then
rebooting. That may involve writing all of RAM out to disk/nw, which
could be a lot of I/O. So performance may be a concern, but correctness
is much much much more important.

Some folks have talked about having the kdump kernel just come up and
pretend it is back in production, but that is madness for various
reasons and I don't think anyone ever did it. 

cheers

^ permalink raw reply

* Re: [PATCH] powerpc/mm: Fix hash computation function
From: Michael Ellerman @ 2013-02-04  3:29 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: Mike Qiu, paulus, linuxppc-dev
In-Reply-To: <1359524442-5861-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>

On Wed, 2013-01-30 at 11:10 +0530, Aneesh Kumar K.V wrote:
> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
> 
> The ASM version of hash computation function was truncating the upper bit.
> Make the ASM version similar to hpt_hash function. Remove masking vsid bits.
> Without this patch, we observed hang during bootup due to not satisfying page
> fault request correctly.

Which commit(s) introduced the bug?

cheers

^ permalink raw reply

* Re: [PATCH 0/3] Enable multiple MSI feature in pSeries
From: Michael Ellerman @ 2013-02-04  3:23 UTC (permalink / raw)
  To: Mike Qiu; +Cc: tglx, linuxppc-dev, linux-kernel
In-Reply-To: <1358235536-32741-1-git-send-email-qiudayu@linux.vnet.ibm.com>

On Tue, 2013-01-15 at 15:38 +0800, Mike Qiu wrote:
> Currently, multiple MSI feature hasn't been enabled in pSeries,
> These patches try to enbale this feature.

Hi Mike,

> These patches have been tested by using ipr driver, and the driver patch
> has been made by Wen Xiong <wenxiong@linux.vnet.ibm.com>:

So who wrote these patches? Normally we would expect the original author
to post the patches if at all possible.

> [PATCH 0/7] Add support for new IBM SAS controllers

I would like to see the full series, including the driver enablement.

> Test platform: One partition of pSeries with one cpu core(4 SMTs) and 
>                RAID bus controller: IBM PCI-E IPR SAS Adapter (ASIC) in POWER7
> OS version: SUSE Linux Enterprise Server 11 SP2  (ppc64) with 3.8-rc3 kernel 
> 
> IRQ 21 and 22 are assigned to the ipr device which support 2 mutiple MSI.
> 
> The test results is shown by 'cat /proc/interrups':
>           CPU0       CPU1       CPU2       CPU3       
> 21:          6          5          5          5      XICS Level     host1-0
> 22:        817        814        816        813      XICS Level     host1-1

This shows that you are correctly configuring two MSIs.

But the key advantage of using multiple interrupts is to distribute load
across CPUs and improve performance. So I would like to see some
performance numbers that show that there is a real benefit for all the
extra complexity in the code.

cheers

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox