LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout
From: Christian Kujau @ 2011-06-02  6:16 UTC (permalink / raw)
  To: Rafał Miłecki; +Cc: linux ppc dev, LKML
In-Reply-To: <BANLkTi=JRDrzn=-d8jVqZLPGtV1osa1eiw@mail.gmail.com>

On Thu, 2 Jun 2011 at 08:07, Rafał Miłecki wrote:
> 1) You didn't see (like Andres):
> Machine check in kernel mode.
>  Caused by (from SRR1=149030): Transfer error ack signal
>  Oops: Machine check, sig: 7 [#1]
> But, OK, maybe machine check requires something additional in kernel,
> I don't know...
> 
> 2) You didn't see SSB messages
> This is confusing. You should see SSB messages that appear before my
> invalid read happens. Did you somehow disable most of the important
> logs, or sth? Having ssb messages and the end of hung boot would
> directly point you to ssb module.

BenH advised to boot with udbg-immortal and out came:

 http://nerdbynature.de/bits/3.0-rc1/linux-3.0_powerpc_2.jpg
 http://nerdbynature.de/bits/3.0-rc1/linux-3.0_powerpc_2.mp4
 (watch it at very slow speed, as it's only 3sec long)

I've enabled[0] FB_NVIDIA and during normal booting the screen flickers 
after the "... : fixmap" message and the screen clears and is filled again 
from the top - maybe the messages would've been there if booted w/o the 
framebuffer enabled.

Right now I'm happy that Ben's fix helped to get past this message, but 
the system remains unsuable[1] with the latest -git, but more debugging 
has to wait until tomorrow...

Thanks,
Christian.

[0] http://nerdbynature.de/bits/3.0-rc1/config-2.6.39.txt
[1] https://lkml.org/lkml/2011/6/2/6
-- 
BOFH excuse #230:

Lusers learning curve appears to be fractal

^ permalink raw reply

* Re: [RFT][PATCH 3.0] ssb: fix PCI(e) driver regression causing oops on PCI cards
From: Rafał Miłecki @ 2011-06-02  6:13 UTC (permalink / raw)
  To: Andreas Schwab, linux-wireless, John W. Linville,
	Michael Büsch, Christian Kujau
  Cc: linuxppc-dev, Rafał Miłecki, b43-dev
In-Reply-To: <BANLkTinSdDZv7gshYfi4mK5N9K-E7vk3Yw@mail.gmail.com>

2011/6/1 Rafa=C5=82 Mi=C5=82ecki <zajec5@gmail.com>:
> 2011/6/1 Rafa=C5=82 Mi=C5=82ecki <zajec5@gmail.com>:
>> We were incorrectly executing PCIe specific workarounds on PCI cards.
>> This resulted in:
>> Machine check in kernel mode.
>> Caused by (from SRR1=3D149030): Transfer error ack signal
>> Oops: Machine check, sig: 7 [#1]
>
> John, I've tested this patch myself on my PCI BCM4318, including
> checking for 0xFFFFFFFF reads in MMIO dumps.
>
> The patch is correct, please take it for 3.0.

John, I'm afraid more and more people get angry at me because of this ;)

Christian Kujau confirmed this problem and fix.

--=20
Rafa=C5=82

^ permalink raw reply

* Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout
From: Rafał Miłecki @ 2011-06-02  6:07 UTC (permalink / raw)
  To: Christian Kujau; +Cc: linux ppc dev, LKML
In-Reply-To: <alpine.DEB.2.01.1106011712270.21697@trent.utfs.org>

On Tue, 31 May 2011 at 16:50, Christian Kujau wrote:
> trying to boot 3.0-rc1 on powerpc32 only progresses until:
>
> =C2=A0 > Kernel virtual memory layout:
> =C2=A0 > =C2=A0 * 0xfffcf000..0xfffff000 =C2=A0: fixmap

The weird thing is that:

1) You didn't see (like Andres):
Machine check in kernel mode.
 Caused by (from SRR1=3D149030): Transfer error ack signal
 Oops: Machine check, sig: 7 [#1]
But, OK, maybe machine check requires something additional in kernel,
I don't know...

2) You didn't see SSB messages
This is confusing. You should see SSB messages that appear before my
invalid read happens. Did you somehow disable most of the important
logs, or sth? Having ssb messages and the end of hung boot would
directly point you to ssb module.

--=20
Rafa=C5=82

^ permalink raw reply

* Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout
From: Rafał Miłecki @ 2011-06-02  6:00 UTC (permalink / raw)
  To: Christian Kujau; +Cc: linux ppc dev, LKML
In-Reply-To: <alpine.DEB.2.01.1106011712270.21697@trent.utfs.org>

2011/6/2 Christian Kujau <lists@nerdbynature.de>:
> On Tue, 31 May 2011 at 16:50, Christian Kujau wrote:
>> trying to boot 3.0-rc1 on powerpc32 only progresses until:
>>
>> =C2=A0 > Kernel virtual memory layout:
>> =C2=A0 > =C2=A0 * 0xfffcf000..0xfffff000 =C2=A0: fixmap
>
> After hours (and hours!) of git-bisecting, it said:
>
> -----------------------
> ccc7c28af205888798b51b6cbc0b557ac1170a49 is the first bad commit
> commit ccc7c28af205888798b51b6cbc0b557ac1170a49
> Author: Rafa=C5=82 Mi=C5=82ecki <zajec5@gmail.com>
> Date: =C2=A0 Fri Apr 1 13:26:52 2011 +0200
>
> =C2=A0 =C2=A0ssb: pci: implement serdes workaround
>
> =C2=A0 =C2=A0Signed-off-by: Rafa=C5=82 Mi=C5=82ecki <zajec5@gmail.com>
> =C2=A0 =C2=A0Signed-off-by: John W. Linville <linville@tuxdriver.com>
> -----------------------

I'm for the problem :(

Patch was already send yesterday, I've even CCed linuxppc-dev:
[RFT][PATCH 3.0] ssb: fix PCI(e) driver regression causing oops on PCI card=
s

--=20
Rafa=C5=82

^ permalink raw reply

* Re: [PATCH] Fix build warning of the defconfigs
From: Paul Mundt @ 2011-06-02  5:24 UTC (permalink / raw)
  To: Wanlong Gao
  Cc: linux-mips, david.woodhouse, linux-sh, paulus, gxt, sam, linux,
	manuel.lauss, rientjes, mingo, vapier, arnd, u.kleine-koenig,
	anton, linux-arm-kernel, linux-kernel, ralf, uclinux-dist-devel,
	akpm, linuxppc-dev, hans-christian.egtvedt
In-Reply-To: <1306945763-6583-1-git-send-email-wanlong.gao@gmail.com>

On Thu, Jun 02, 2011 at 12:29:23AM +0800, Wanlong Gao wrote:
> RTC_CLASS is changed to bool.
> So value 'm' is invalid.
> 
> Signed-off-by: Wanlong Gao <wanlong.gao@gmail.com>

>  arch/sh/configs/titan_defconfig            |    2 +-

Acked-by: Paul Mundt <lethal@linux-sh.org>

^ permalink raw reply

* Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout
From: Christian Kujau @ 2011-06-02  4:27 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: zajec5, linux ppc dev, LKML
In-Reply-To: <1306983467.29297.51.camel@pasglop>

On Thu, 2 Jun 2011 at 12:57, Benjamin Herrenschmidt wrote:
> Ok, thanks a lot, It looks rather trivial actually: That new workaround
> is PCIe specific but is called unconditionally, and will do bad things
> non-PCIe implementations.

OK, with your patch applied to Linus' latest git tree the machine 
continues to boot. Also, with the latest tree, the "machine is stuck after 
ide-cd init" problem[0] went away.

For this particular problem and patch, feel free to add:

Tested-by: Christian Kujau <lists@nerdbynature.de>

However, shortly after boot and loggin in to the box remotely, the bux did 
not respond any more. I'm not sure if these are related to those SSB/PCIe 
changes, but somehow I hope they are - bisecting those would take much 
longer, as it's not an "instant" death:

 * http://nerdbynature.de/bits/3.0-rc1/linux-3.0-rc1_stuck1.jpg
 * http://nerdbynature.de/bits/3.0-rc1/linux-3.0-rc1_stuck2.jpg

This is what an OCR program made of it:

irq euent stamp: 185804850
hardirqs last enabled at (185904849): [<c04005b0>] _raw_spin_unlock_irqrestore+0x40/0x?e
hardirqs last disabled at (185904850): [<c00120b8>] reenable_mmu+0x24/0x78
Softirqs last enabled at (185892414): [<c000fe8c>] call_do_softirq+0x14/0x24
softirqs last disabled at (18589240?): [<c000fe8c>] call_do_softirq+0x14/0x24
NIP: e04005b4 LR: e04005b0 CTR: 00000000
REGS: ef92be10 TRHP: 0901 Not tainted (3.0.0-rel-00049-g1fa?b6a-dirtg)
MSB: 00009032 <EE.ME.IR.DR> CR: 42002084
TRSK = ef8d0000[38B] ’kuorker/0:2’ THREAD:
GPR00: c04005b0 ef92bec0 efBd0000 00000001
GPR08: 00000000 0b14aed0 0049a306 00030600
HIP [c01005b1] _rau_spin_unlock_irqrestore+0x44/0x?c
LR [c04005b0] _rau_spin_unlock_irqrestore+0x40/0x?c
Call Trace:
[ef92bec0] [c04005b0] _raw_spin_unlock_irqrestore+0x40/0x?c (unreliable)
[ef92bed0] [c029c504] flush_tu_ldisc+0x121/0x230
[ef92bf10] [c001c86c] process_one_uork+0x1c1/0x4cB
[ef92bfS0] [c004efac] worker_thread+0x1?8/0x3c1
[ef92bf90] [c0051148] kthread+0x81/0x88
[ef92hff0] [c0810390] kernel_thread+0x1c/0x68

XER: 20000000
ef92a000 ef8d0660 00000006 00000000 18614000 22002088
Instruction dump:
??? 93e1060c ?c9f23?B 38800001 90010011 4bc6e9a9 ?fc3i`3?8 4be61a69
?3e08080 11820021 1bc6b515 ?fe00124
B8c16008 ?c0803a6 83c1000c

Well, the picture is way better :-\

Thanks,
Christian.

[0] http://nerdbynature.de/bits/3.0-rc1/linux-3.0-rc1-cdrom.jpg
-- 
BOFH excuse #399:

We are a 100% Microsoft Shop.

^ permalink raw reply

* Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout
From: Christian Kujau @ 2011-06-02  3:06 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: zajec5, linux ppc dev, LKML
In-Reply-To: <1306983467.29297.51.camel@pasglop>

On Thu, 2 Jun 2011 at 12:57, Benjamin Herrenschmidt wrote:
> Ok, thanks a lot, It looks rather trivial actually: That new workaround
> is PCIe specific but is called unconditionally, and will do bad things
> non-PCIe implementations.

Indeed. This PowerBook G4 does not has PCIe, yet the whole SSB thingy gets 
enabled in my .config somehow. Thanks for the quick fix, I tried to revert 
ccc7c28af2... from Linus' current tree, but I had to rip out some more to 
make it compile.

I'll try your fix in a minute and get back to you with those cdrom init 
problems as well.

Thanks,
Christian.
-- 
BOFH excuse #166:

/pub/lunch

^ permalink raw reply

* Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout
From: Benjamin Herrenschmidt @ 2011-06-02  2:57 UTC (permalink / raw)
  To: Christian Kujau, linville; +Cc: zajec5, linux ppc dev, LKML
In-Reply-To: <alpine.DEB.2.01.1106011712270.21697@trent.utfs.org>

On Wed, 2011-06-01 at 17:16 -0700, Christian Kujau wrote:
> On Tue, 31 May 2011 at 16:50, Christian Kujau wrote:
> > trying to boot 3.0-rc1 on powerpc32 only progresses until:
> > 
> >   > Kernel virtual memory layout:
> >   >   * 0xfffcf000..0xfffff000  : fixmap
> 
> After hours (and hours!) of git-bisecting, it said:
> 
> -----------------------
> ccc7c28af205888798b51b6cbc0b557ac1170a49 is the first bad commit
> commit ccc7c28af205888798b51b6cbc0b557ac1170a49
> Author: Rafał Miłecki <zajec5@gmail.com>
> Date:   Fri Apr 1 13:26:52 2011 +0200
> 
>     ssb: pci: implement serdes workaround
>     
>     Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
>     Signed-off-by: John W. Linville <linville@tuxdriver.com>
> -----------------------

Ok, thanks a lot, It looks rather trivial actually: That new workaround
is PCIe specific but is called unconditionally, and will do bad things
non-PCIe implementations.

John, care to send the patch below to Linus ASAP ? I could reproduce and
verify it fixes it. Thanks !

ssb: pci: Don't call PCIe specific workarounds on PCI cores

Otherwise it can/will crash....

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---

diff --git a/drivers/ssb/driver_pcicore.c b/drivers/ssb/driver_pcicore.c
index 82feb34..eddf1b9 100644
--- a/drivers/ssb/driver_pcicore.c
+++ b/drivers/ssb/driver_pcicore.c
@@ -540,7 +540,8 @@ void ssb_pcicore_init(struct ssb_pcicore *pc)
 		ssb_pcicore_init_clientmode(pc);
 
 	/* Additional always once-executed workarounds */
-	ssb_pcicore_serdes_workaround(pc);
+	if (dev->id.coreid == SSB_DEV_PCIE)
+		ssb_pcicore_serdes_workaround(pc);
 	/* TODO: ASPM */
 	/* TODO: Clock Request Update */
 }

^ permalink raw reply related

* Re: [PATCH] Fix build warning of the defconfigs
From: Guan Xuetao @ 2011-06-02  9:49 UTC (permalink / raw)
  To: Wanlong Gao
  Cc: linux-mips, david.woodhouse, linux-sh, paulus, sam, linux,
	manuel.lauss, rientjes, mingo, vapier, arnd, u.kleine-koenig,
	anton, linux-arm-kernel, linux-kernel, ralf, lethal,
	uclinux-dist-devel, akpm, linuxppc-dev, hans-christian.egtvedt
In-Reply-To: <1306945763-6583-1-git-send-email-wanlong.gao@gmail.com>

On Thu, 2011-06-02 at 00:29 +0800, Wanlong Gao wrote:
> RTC_CLASS is changed to bool.
> So value 'm' is invalid.
> 
> Signed-off-by: Wanlong Gao <wanlong.gao@gmail.com>
> ---
>  arch/arm/configs/davinci_all_defconfig     |    2 +-
>  arch/arm/configs/mxs_defconfig             |    2 +-
>  arch/arm/configs/netx_defconfig            |    2 +-
>  arch/arm/configs/viper_defconfig           |    2 +-
>  arch/arm/configs/xcep_defconfig            |    2 +-
>  arch/arm/configs/zeus_defconfig            |    2 +-
>  arch/avr32/configs/atngw100_mrmt_defconfig |    2 +-
>  arch/blackfin/configs/CM-BF548_defconfig   |    2 +-
>  arch/mips/configs/mtx1_defconfig           |    2 +-
>  arch/powerpc/configs/52xx/pcm030_defconfig |    2 +-
>  arch/powerpc/configs/ps3_defconfig         |    2 +-
>  arch/sh/configs/titan_defconfig            |    2 +-
>  arch/unicore32/configs/debug_defconfig     |    2 +-
>  13 files changed, 13 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/unicore32/configs/debug_defconfig b/arch/unicore32/configs/debug_defconfig
> index b5fbde9..1c367f0 100644
> --- a/arch/unicore32/configs/debug_defconfig
> +++ b/arch/unicore32/configs/debug_defconfig
> @@ -168,7 +168,7 @@ CONFIG_LEDS_TRIGGER_HEARTBEAT=y
>  
>  #	Real Time Clock
>  CONFIG_RTC_LIB=m
> -CONFIG_RTC_CLASS=m
> +CONFIG_RTC_CLASS=y
>  
>  ### File systems
>  CONFIG_EXT2_FS=m

I adjust this config option recently, and enable it with y.
So please just drop the modification in unicore32 config file.

Thanks & Regards.

Guan Xuetao

^ permalink raw reply

* Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout
From: Benjamin Herrenschmidt @ 2011-06-02  0:47 UTC (permalink / raw)
  To: Christian Kujau; +Cc: zajec5, linux ppc dev, LKML
In-Reply-To: <alpine.DEB.2.01.1106011712270.21697@trent.utfs.org>

On Wed, 2011-06-01 at 17:16 -0700, Christian Kujau wrote:
> ccc7c28af205888798b51b6cbc0b557ac1170a49 is the first bad commit
> commit ccc7c28af205888798b51b6cbc0b557ac1170a49
> Author: Rafał Miłecki <zajec5@gmail.com>
> Date:   Fri Apr 1 13:26:52 2011 +0200
> 
>     ssb: pci: implement serdes workaround
>     
>     Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
>     Signed-off-by: John W. Linville <linville@tuxdriver.com>
> -----------------------
> 
> When I reverted this one from the gi-bisected tree, the box continued to 
> boot (until it got stuck again during IDE/CDROM init, but that may be a 
> different story). I'l; try to revert this from a vanilla 3.0-rc1 and see 
> if it helps 

Thanks. I'll have a look later today. As for the IDE/CDROM init, have
you tried the very latest linus snapshot ? Does that still happens ?
What kind of error do you observe ?

There was some time during the 3.0 merge window process when interrupts
were broken on some PowerBooks, but that should be fixed now.

Cheers,
Ben.

^ permalink raw reply

* Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout
From: Christian Kujau @ 2011-06-02  0:16 UTC (permalink / raw)
  To: LKML; +Cc: zajec5, linux ppc dev
In-Reply-To: <alpine.DEB.2.01.1105311644010.21697@trent.utfs.org>

On Tue, 31 May 2011 at 16:50, Christian Kujau wrote:
> trying to boot 3.0-rc1 on powerpc32 only progresses until:
> 
>   > Kernel virtual memory layout:
>   >   * 0xfffcf000..0xfffff000  : fixmap

After hours (and hours!) of git-bisecting, it said:

-----------------------
ccc7c28af205888798b51b6cbc0b557ac1170a49 is the first bad commit
commit ccc7c28af205888798b51b6cbc0b557ac1170a49
Author: Rafał Miłecki <zajec5@gmail.com>
Date:   Fri Apr 1 13:26:52 2011 +0200

    ssb: pci: implement serdes workaround
    
    Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
    Signed-off-by: John W. Linville <linville@tuxdriver.com>
-----------------------

When I reverted this one from the gi-bisected tree, the box continued to 
boot (until it got stuck again during IDE/CDROM init, but that may be a 
different story). I'l; try to revert this from a vanilla 3.0-rc1 and see 
if it helps

Thanks,
Christian.

Full gist-bisect-log: http://nerdbynature.de/bits/3.0-rc1/

> And then the system hangs, does not respond to keyboard (sysrq does not 
> seem to work on this PowerBook G4). But after a while the system reboots 
> itself, so I guess the machine panicked but did not print anything on the 
> screen.
> 
> Full messages (picture), config & (working) dmesg:
> 
>    http://nerdbynature.de/bits/3.0-rc1/
> 
-- 
BOFH excuse #406:

Bad cafeteria food landed all the sysadmins in the hospital.

^ permalink raw reply

* Re: [PATCH 7/7] [v2] drivers/misc: introduce Freescale hypervisor management driver
From: Scott Wood @ 2011-06-01 22:24 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: greg, kumar.gala, linux-kernel, akpm, linux-console, linuxppc-dev,
	Timur Tabi
In-Reply-To: <201106012340.14237.arnd@arndb.de>

On Wed, 1 Jun 2011 23:40:14 +0200
Arnd Bergmann <arnd@arndb.de> wrote:

> > +static long ioctl_dtprop(struct fsl_hv_ioctl_prop __user *p, int set)
> > +{
> > +	struct fsl_hv_ioctl_prop param;
> > +	char __user *upath, *upropname;
> > +	void __user *upropval;
> > +	char *path = NULL, *propname = NULL;
> > +	void *propval = NULL;
> > +	int ret = 0;
> > +
> 
> I'm not convinced that an ioctl interface is the right way to work with
> device tree properties. A more natural way would be to export it as
> a file system, or maybe as a flattened device tree blob (the latter option
> would require changing the hypervisor interface, which might not be
> possible).

I wanted to have the hypervisor take an update dtb (we already have special
meta-properties for things like deletion as part of the hv config
mechanism).  But others on the project wanted to keep it simple, and so
get/set property it was. :-/

It's unlikely to change at this point without a real need.

As for a filesystem interface, it's not a good match either.  You can't
iterate over anything to read out the full tree from the hv.  You can't
delete anything.  You can't create empty nodes.  The hv interface was meant
to enable some specific management actions, rather than to provide general
device tree access.  This driver is a thin wrapper around the management
hcalls.

There would still be other ioctls needed for starting/stopping the
partition, etc.

> > +/**
> > + * fsl_hv_ioctl: ioctl main entry point
> > + */
> > +static long fsl_hv_ioctl(struct file *file, unsigned int cmd,
> > +			 unsigned long argaddr)
> > +{
> > +	union fsl_hv_ioctl_param __user *arg =
> > +		(union fsl_hv_ioctl_param __user *)argaddr;
> > +	long ret;
> > +
> 
> For an ioctl, please follow the normal pattern of defining a separate
> structure for each case, no union.

And have fsl_hypervisor.h provide the full set of proper ioctl numbers, with
the specific struct for each ioctl referenced, rather than having the client program do
"ioctl(f, _IOWR(0, cmd, union fsl_hv_ioctl_param), p)".

-Scott

^ permalink raw reply

* Re: [PATCH] document udbg-immortal
From: Jesse Larrew @ 2011-06-01 22:03 UTC (permalink / raw)
  To: linuxppc-dev
In-Reply-To: <alpine.DEB.2.01.1105311817240.21697@trent.utfs.org>

On 05/31/2011 08:22 PM, Christian Kujau wrote:

> +	udbg-immortal	[PPC] When debugging early kernel crashes that
> +			happen after console_init() and before a proper 
> +			console driver takes over, this boot options might
> +			help "seeing" what's going on.
> +

Thanks for documenting this!

Suggested grammatical tweaks:

s/options/option/
s/help "seeing"/help to "see"/

-- 

Jesse Larrew
Software Engineer, Linux on Power Kernel Team
IBM Linux Technology Center
Phone: (512) 973-2052 (T/L: 363-2052)
jlarrew@linux.vnet.ibm.com

^ permalink raw reply

* Re: [git pull] Please pull powerpc.git merge branch
From: Tabi Timur-B04825 @ 2011-06-01 21:56 UTC (permalink / raw)
  To: Kumar Gala; +Cc: linuxppc-dev@ozlabs.org
In-Reply-To: <alpine.LFD.2.00.1105200934580.11641@right.am.freescale.net>

On Fri, May 20, 2011 at 9:35 AM, Kumar Gala <galak@kernel.crashing.org> wro=
te:

> Shaohui Xie (2):
> =A0 =A0 =A0powerpc/fsl_rio: move machine_check handler

This patch causes a build failure if CONFIG_FSL_RIO is not defined:

arch/powerpc/kernel/built-in.o: In function `machine_check_e500mc':
/home/b04825/git/linux.hv/arch/powerpc/kernel/traps.c:429: undefined
reference to `fsl_rio_mcheck_exception'
arch/powerpc/kernel/built-in.o: In function `machine_check_e500':
/home/b04825/git/linux.hv/arch/powerpc/kernel/traps.c:519: undefined
reference to `fsl_rio_mcheck_exception'

--=20
Timur Tabi
Linux kernel developer at Freescale=

^ permalink raw reply

* Re: [PATCH 7/7] [v2] drivers/misc: introduce Freescale hypervisor management driver
From: Alan Cox @ 2011-06-01 21:45 UTC (permalink / raw)
  To: Scott Wood
  Cc: greg, kumar.gala, linux-kernel, akpm, linux-console, linuxppc-dev,
	Timur Tabi
In-Reply-To: <20110601155430.4d5aa6de@schlenkerla.am.freescale.net>


> There's a check for -ENOIOCTLCMD in vfs_ioctl() -- though it generates
> -EINVAL rather than -ENOTTY (why?).

Good question - perhaps someone didn't read the spec 8)

^ permalink raw reply

* Re: [PATCH 7/7] [v2] drivers/misc: introduce Freescale hypervisor management driver
From: Arnd Bergmann @ 2011-06-01 21:40 UTC (permalink / raw)
  To: Timur Tabi
  Cc: kumar.gala, linux-kernel, akpm, linux-console, greg, linuxppc-dev
In-Reply-To: <1306953337-15698-1-git-send-email-timur@freescale.com>

On Wednesday 01 June 2011, Timur Tabi wrote:
> The Freescale hypervisor management driver provides several services to
> drivers and applications related to the Freescale hypervisor:
> 
> 1. An ioctl interface for querying and managing partitions
> 
> 2. A file interface to reading incoming doorbells
> 
> 3. An interrupt handler for shutting down the partition upon receiving the
>    shutdown doorbell from a manager partition
> 
> 4. An interface for receiving callbacks when a managed partition shuts down.
> 
> Signed-off-by: Timur Tabi <timur@freescale.com>
> ---
>  drivers/misc/Kconfig           |    7 +
>  drivers/misc/Makefile          |    1 +
>  drivers/misc/fsl_hypervisor.c  |  941 ++++++++++++++++++++++++++++++++++++++++
>  include/linux/Kbuild           |    1 +
>  include/linux/fsl_hypervisor.h |  203 +++++++++
>  5 files changed, 1153 insertions(+), 0 deletions(-)
>  create mode 100644 drivers/misc/fsl_hypervisor.c
>  create mode 100644 include/linux/fsl_hypervisor.h

I think drivers/misc is not the right place for this, but I'm not completely
sure what is. drivers/firmware would be better at least, but virt/fsl might
also be ok.

> +static long ioctl_dtprop(struct fsl_hv_ioctl_prop __user *p, int set)
> +{
> +	struct fsl_hv_ioctl_prop param;
> +	char __user *upath, *upropname;
> +	void __user *upropval;
> +	char *path = NULL, *propname = NULL;
> +	void *propval = NULL;
> +	int ret = 0;
> +

I'm not convinced that an ioctl interface is the right way to work with
device tree properties. A more natural way would be to export it as
a file system, or maybe as a flattened device tree blob (the latter option
would require changing the hypervisor interface, which might not be
possible).

> +/**
> + * fsl_hv_ioctl: ioctl main entry point
> + */
> +static long fsl_hv_ioctl(struct file *file, unsigned int cmd,
> +			 unsigned long argaddr)
> +{
> +	union fsl_hv_ioctl_param __user *arg =
> +		(union fsl_hv_ioctl_param __user *)argaddr;
> +	long ret;
> +

For an ioctl, please follow the normal pattern of defining a separate
structure for each case, no union.

You can use a void __user * in the common ioctl function, and pass that
to the typed argument list in the specific functions.

	Arnd

^ permalink raw reply

* Re: [PATCH 7/7] [v2] drivers/misc: introduce Freescale hypervisor management driver
From: Scott Wood @ 2011-06-01 20:54 UTC (permalink / raw)
  To: Alan Cox
  Cc: greg, kumar.gala, linux-kernel, akpm, linux-console, linuxppc-dev,
	Timur Tabi
In-Reply-To: <20110601204618.32190be9@lxorguk.ukuu.org.uk>

On Wed, 1 Jun 2011 20:46:18 +0100
Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:

> > +static char *strdup_from_user(const char __user *ustr, size_t max)
> > +{
> > +	size_t len;
> > +	char *str;
> > +
> > +	len = strnlen_user(ustr, max);
> > +	if (len > max)
> > +		return ERR_PTR(-ENAMETOOLONG);

if (len >= max)

> > +	str = kmalloc(len, GFP_KERNEL);
> > +	if (!str)
> > +		return ERR_PTR(-ENOMEM);
> > +
> > +	if (copy_from_user(str, ustr, len))
> > +		return ERR_PTR(-EFAULT);
> > +
> > +	return str;
> > +}

Memory leak on the EFAULT path

If strnlen_user gets an exception, it'll return zero, causing a
zero-length kmalloc.  Will kmalloc(0, ...) return NULL?  If so, a bad
user pointer would result in -ENOMEM rather than -EFAULT.

> > +	default:
> > +		pr_debug("fsl-hv: unknown ioctl %u\n", cmd);
> > +		ret = -ENOIOCTLCMD;
> 
> -ENOTTY
> 
> (-ENOIOCTLCMD is an internal indicator designed so driver layers can say
> 'dunno, try the next layer up')

There's a check for -ENOIOCTLCMD in vfs_ioctl() -- though it generates
-EINVAL rather than -ENOTTY (why?).

Using -ENOIOCTLCMD consistently would make it easier to refactor a
toplevel ioctl handler into a nested one, plus consistency is good in
general.

> > + * We use the same interrupt handler for all doorbells.  Whenever a doorbell
> > + * is rung, and we receive an interrupt, we just put the handle for that
> > + * doorbell (passed to us as *data) into all of the queues.
> 
> I wonder if these should be presented as IRQs or whether that makes no
> sense ?

They are presented as IRQs.  This driver registers the same handler for all
doorbell IRQs, and pipes the notifications up to userspace, but you could
also directly register a kernel handler for an individual doorbell IRQ.

> > +static irqreturn_t fsl_hv_shutdown_isr(int irq, void *data)
> > +{
> > +	schedule_work(&power_off);
> > +
> > +	/* We should never get here */
> 
> Probably worth printing something if you do (panic(...) ?)

I don't think the comment is accurate.  We've just scheduled the workqueue,
not waited for it to complete.

Timur, shouldn't this schedule orderly_poweroff rather than
kernel_power_off?

-Scott

^ permalink raw reply

* Re: [PATCH 7/7] [v2] drivers/misc: introduce Freescale hypervisor management driver
From: Alan Cox @ 2011-06-01 20:34 UTC (permalink / raw)
  To: Timur Tabi
  Cc: kumar.gala, linux-kernel, akpm, linux-console, greg, linuxppc-dev
In-Reply-To: <4DE6A013.409@freescale.com>

On Wed, 1 Jun 2011 15:24:51 -0500
Timur Tabi <timur@freescale.com> wrote:

> Alan Cox wrote:
> > O> +	/* One partition must be local, the other must be remote.  In other
> >> +	   words, if source and target are both -1, or are both not -1, then
> >> +	   return an error. */
> >> +	if ((param.source == -1) == (param.target == -1))
> >> +		return -EINVAL;
> > 
> > Excess brackets (I just mention that one in passing)'
> 
> Do you mean excess parentheses?  If so, then I don't see how.  "(param.source ==
> -1)" and "(param.target == -1)" are expressions that return a boolean.  I'm
> comparing the two boolean results to see if they're equal

Ok - it's a bit non obvious but yes fair enough and it is commented.

> Where exactly in lib/ should it go?  lib/strings.c seems too low-level for a
> function like this.  lib/string_helpers.c is for sprintf-like functions.  And
> it's too generic for lib/powerpc/

I'd submit it to lib/string personally but I'm not sure where would be
better. If someone doesn't like lib/string they can suggest a better
place 8)

> Well, the "handles" are supposed to be just unique numbers.  In this case, they
> are IRQs, but we don't want to expose that.  The application is supposed to scan
> the device tree looking for the doorbell nodes that it wants, and in those nodes
> there is a 'reg' property that contains the handle for that doorbell.  So the
> numbers just need to match.  What they represent is not relevant.

Ok so they are always going to be exposed to users not to drivers, in
which case yes it makes sense. If they are going to get used by kernel
drivers as well an IRQ interface would probably also make sense.

Alan

^ permalink raw reply

* Re: [PATCH 7/7] [v2] drivers/misc: introduce Freescale hypervisor management driver
From: Timur Tabi @ 2011-06-01 20:24 UTC (permalink / raw)
  To: Alan Cox
  Cc: kumar.gala, linux-kernel, akpm, linux-console, greg, linuxppc-dev
In-Reply-To: <20110601204618.32190be9@lxorguk.ukuu.org.uk>

Alan Cox wrote:
> O> +	/* One partition must be local, the other must be remote.  In other
>> +	   words, if source and target are both -1, or are both not -1, then
>> +	   return an error. */
>> +	if ((param.source == -1) == (param.target == -1))
>> +		return -EINVAL;
> 
> Excess brackets (I just mention that one in passing)'

Do you mean excess parentheses?  If so, then I don't see how.  "(param.source ==
-1)" and "(param.target == -1)" are expressions that return a boolean.  I'm
comparing the two boolean results to see if they're equal.  I want to make sure
that the compiler doesn't do something like this:

	if (param.source == (-1 == (param.target == -1)))

I don't even know what that means, but it's not what I want.

>> +static char *strdup_from_user(const char __user *ustr, size_t max)
>> +{
>> +	size_t len;
>> +	char *str;
>> +
>> +	len = strnlen_user(ustr, max);
>> +	if (len > max)
>> +		return ERR_PTR(-ENAMETOOLONG);
>> +
>> +	str = kmalloc(len, GFP_KERNEL);
>> +	if (!str)
>> +		return ERR_PTR(-ENOMEM);
>> +
>> +	if (copy_from_user(str, ustr, len))
>> +		return ERR_PTR(-EFAULT);
>> +
>> +	return str;
>> +}
> 
> This belongs on lib/ if its of general use which I think it perhaps is
> and if we don't have one already.

Where exactly in lib/ should it go?  lib/strings.c seems too low-level for a
function like this.  lib/string_helpers.c is for sprintf-like functions.  And
it's too generic for lib/powerpc/

>> +	default:
>> +		pr_debug("fsl-hv: unknown ioctl %u\n", cmd);
>> +		ret = -ENOIOCTLCMD;
> 
> -ENOTTY
> 
> (-ENOIOCTLCMD is an internal indicator designed so driver layers can say
> 'dunno, try the next layer up')
> 
>> +/* Linked list of processes that have us open */
>> +struct list_head db_list;
> 
> static ?
> 
> 
>> + * We use the same interrupt handler for all doorbells.  Whenever a doorbell
>> + * is rung, and we receive an interrupt, we just put the handle for that
>> + * doorbell (passed to us as *data) into all of the queues.
> 
> I wonder if these should be presented as IRQs or whether that makes no
> sense ?

Well, the "handles" are supposed to be just unique numbers.  In this case, they
are IRQs, but we don't want to expose that.  The application is supposed to scan
the device tree looking for the doorbell nodes that it wants, and in those nodes
there is a 'reg' property that contains the handle for that doorbell.  So the
numbers just need to match.  What they represent is not relevant.

>> +static irqreturn_t fsl_hv_state_change_isr(int irq, void *data)
>> +{
>> +	unsigned int status;
>> +	struct doorbell_isr *dbisr = data;
>> +	int ret;
>> +
>> +	/* Determine the new state, and if it's stopped, notify the clients. */
>> +	ret = fh_partition_get_status(dbisr->partition, &status);
>> +	if (!ret && (status == FH_PARTITION_STOPPED))
>> +		schedule_work(&dbisr->work);
>> +
>> +	/* Call the normal handler */
>> +	return fsl_hv_isr(irq, (void *) (uintptr_t) dbisr->doorbell);
>> +}
> 
> Would a threaded IRQ clean this up by avoiding the queue/work stuff ?

Yes, I think so.  V3 coming up.

-- 
Timur Tabi
Linux kernel developer at Freescale

^ permalink raw reply

* [patch 13/14] PCSPKR: Cleanup Kconfig dependencies
From: ralf @ 2011-06-01 18:05 UTC (permalink / raw)
  To: linux-kernel, linux-arch
  Cc: linux-mips, linux-alpha, x86, Ingo Molnar, Ivan Kokshaysky,
	H. Peter Anvin, Paul Mackerras, Matt Turner, linuxppc-dev,
	Thomas Gleixner, Richard Henderson
In-Reply-To: <20110601180456.801265664@duck.linux-mips.net>

Lenghty lists of the kind "depends on ARCH1 || ARCH2 ... || ARCH123" are
usually either wrong or too coarse grained.  Or plain an ugly sin.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
To: linux-kernel@vger.kernel.org
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: linux-alpha@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: linuxppc-dev@lists.ozlabs.org

 arch/alpha/Kconfig                     |    1 +
 arch/mips/Kconfig                      |    1 +
 arch/powerpc/platforms/chrp/Kconfig    |    1 +
 arch/powerpc/platforms/prep/Kconfig    |    1 +
 arch/powerpc/platforms/pseries/Kconfig |    1 +
 arch/x86/Kconfig                       |    1 +
 init/Kconfig                           |    5 ++++-
 7 files changed, 10 insertions(+), 1 deletion(-)

Index: linux-mips/arch/alpha/Kconfig
===================================================================
--- linux-mips.orig/arch/alpha/Kconfig
+++ linux-mips/arch/alpha/Kconfig
@@ -6,6 +6,7 @@ config ALPHA
 	select HAVE_OPROFILE
 	select HAVE_SYSCALL_WRAPPERS
 	select HAVE_IRQ_WORK
+	select HAVE_PCSPKR_PLATFORM
 	select HAVE_PERF_EVENTS
 	select HAVE_DMA_ATTRS
 	select HAVE_GENERIC_HARDIRQS
Index: linux-mips/arch/mips/Kconfig
===================================================================
--- linux-mips.orig/arch/mips/Kconfig
+++ linux-mips/arch/mips/Kconfig
@@ -5,6 +5,7 @@ config MIPS
 	select HAVE_IDE
 	select HAVE_OPROFILE
 	select HAVE_IRQ_WORK
+	select HAVE_PCSPKR_PLATFORM
 	select HAVE_PERF_EVENTS
 	select PERF_USE_VMALLOC
 	select HAVE_ARCH_KGDB
Index: linux-mips/arch/powerpc/platforms/chrp/Kconfig
===================================================================
--- linux-mips.orig/arch/powerpc/platforms/chrp/Kconfig
+++ linux-mips/arch/powerpc/platforms/chrp/Kconfig
@@ -1,6 +1,7 @@
 config PPC_CHRP
 	bool "Common Hardware Reference Platform (CHRP) based machines"
 	depends on 6xx
+	select HAVE_PCSPKR_PLATFORM
 	select MPIC
 	select PPC_I8259
 	select PPC_INDIRECT_PCI
Index: linux-mips/arch/powerpc/platforms/prep/Kconfig
===================================================================
--- linux-mips.orig/arch/powerpc/platforms/prep/Kconfig
+++ linux-mips/arch/powerpc/platforms/prep/Kconfig
@@ -1,6 +1,7 @@
 config PPC_PREP
 	bool "PowerPC Reference Platform (PReP) based machines"
 	depends on 6xx && BROKEN
+	select HAVE_PCSPKR_PLATFORM
 	select MPIC
 	select PPC_I8259
 	select PPC_INDIRECT_PCI
Index: linux-mips/arch/powerpc/platforms/pseries/Kconfig
===================================================================
--- linux-mips.orig/arch/powerpc/platforms/pseries/Kconfig
+++ linux-mips/arch/powerpc/platforms/pseries/Kconfig
@@ -1,6 +1,7 @@
 config PPC_PSERIES
 	depends on PPC64 && PPC_BOOK3S
 	bool "IBM pSeries & new (POWER5-based) iSeries"
+	select HAVE_PCSPKR_PLATFORM
 	select MPIC
 	select PCI_MSI
 	select PPC_XICS
Index: linux-mips/arch/x86/Kconfig
===================================================================
--- linux-mips.orig/arch/x86/Kconfig
+++ linux-mips/arch/x86/Kconfig
@@ -20,6 +20,7 @@ config X86
 	select HAVE_UNSTABLE_SCHED_CLOCK
 	select HAVE_IDE
 	select HAVE_OPROFILE
+	select HAVE_PCSPKR_PLATFORM
 	select HAVE_PERF_EVENTS
 	select HAVE_IRQ_WORK
 	select HAVE_IOREMAP_PROT
Index: linux-mips/init/Kconfig
===================================================================
--- linux-mips.orig/init/Kconfig
+++ linux-mips/init/Kconfig
@@ -1001,12 +1001,15 @@ config ELF_CORE
 
 config PCSPKR_PLATFORM
 	bool "Enable PC-Speaker support" if EXPERT
-	depends on ALPHA || X86 || MIPS || PPC_PREP || PPC_CHRP || PPC_PSERIES
+	depends on HAVE_PCSPKR_PLATFORM
 	default y
 	help
           This option allows to disable the internal PC-Speaker
           support, saving some memory.
 
+config HAVE_PCSPKR_PLATFORM
+	bool
+
 config BASE_FULL
 	default y
 	bool "Enable full-sized data structures for core" if EXPERT

^ permalink raw reply

* [patch 09/14] i8253: Alpha, PowerPC: Remove unused <asm/8253pit.h> header.
From: ralf @ 2011-06-01 18:05 UTC (permalink / raw)
  To: linux-kernel, linux-arch
  Cc: Ivan Kokshaysky, linux-alpha, Paul Mackerras, Matt Turner,
	linuxppc-dev, Richard Henderson
In-Reply-To: <20110601180456.801265664@duck.linux-mips.net>

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
To: linux-kernel@vger.kernel.org
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: linux-alpha@vger.kernel.org
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org

 arch/alpha/include/asm/8253pit.h   |    3 ---
 arch/powerpc/include/asm/8253pit.h |    3 ---
 2 files changed, 6 deletions(-)

Index: linux-mips/arch/alpha/include/asm/8253pit.h
===================================================================
--- linux-mips.orig/arch/alpha/include/asm/8253pit.h
+++ /dev/null
@@ -1,3 +0,0 @@
-/*
- * 8253/8254 Programmable Interval Timer
- */
Index: linux-mips/arch/powerpc/include/asm/8253pit.h
===================================================================
--- linux-mips.orig/arch/powerpc/include/asm/8253pit.h
+++ /dev/null
@@ -1,3 +0,0 @@
-/*
- * 8253/8254 Programmable Interval Timer
- */

^ permalink raw reply

* [patch 00/14] Sort out i8253 and PC speaker locking and headers
From: ralf @ 2011-06-01 18:04 UTC (permalink / raw)
  To: linux-kernel, linux-arch, linux-alpha, linuxppc-dev, linux-mips

No longer terribly relevant these days but still broken and just an eyesore
mess of neglience just as I've already raised it a few days ago.  Time to
sort this.

drivers/input/misc/pcspkr.c:

#if defined(CONFIG_MIPS) || defined(CONFIG_X86)
/* Use the global PIT lock ! */
#include <asm/i8253.h>
#else
#include <asm/8253pit.h>
static DEFINE_RAW_SPINLOCK(i8253_lock);
#endif

sound/drivers/pcsp/pcsp.h:

#if defined(CONFIG_MIPS) || defined(CONFIG_X86)
/* Use the global PIT lock ! */
#include <asm/i8253.h>
#else
#include <asm/8253pit.h>
static DEFINE_RAW_SPINLOCK(i8253_lock);

$ git grep -F pcsp.h sound/drivers/pcsp
sound/drivers/pcsp/pcsp.c:#include "pcsp.h"
sound/drivers/pcsp/pcsp_input.c:#include "pcsp.h"
sound/drivers/pcsp/pcsp_lib.c:#include "pcsp.h"
sound/drivers/pcsp/pcsp_mixer.c:#include "pcsp.h"
$ git grep -w i8253_lock sound/drivers/pcsp/
sound/drivers/pcsp/pcsp.h:static DEFINE_RAW_SPINLOCK(i8253_lock);
sound/drivers/pcsp/pcsp_input.c:        raw_spin_lock_irqsave(&i8253_lock, flags
sound/drivers/pcsp/pcsp_input.c:        raw_spin_unlock_irqrestore(&i8253_lock, 
sound/drivers/pcsp/pcsp_lib.c:          raw_spin_lock_irqsave(&i8253_lock, flags
sound/drivers/pcsp/pcsp_lib.c:          raw_spin_unlock_irqrestore(&i8253_lock, 
sound/drivers/pcsp/pcsp_lib.c:  raw_spin_lock(&i8253_lock);
sound/drivers/pcsp/pcsp_lib.c:  raw_spin_unlock(&i8253_lock);
sound/drivers/pcsp/pcsp_lib.c:  raw_spin_lock(&i8253_lock);
sound/drivers/pcsp/pcsp_lib.c:  raw_spin_unlock(&i8253_lock);

Locks are great, everybody should have their own lock!

$ find . -name 8253pit.h
./arch/powerpc/include/asm/8253pit.h
./arch/alpha/include/asm/8253pit.h
$ cat arch/*/include/asm/8253pit.h
/*
 * 8253/8254 Programmable Interval Timer
 */
/*
 * 8253/8254 Programmable Interval Timer
 */
$

Eh...

$ git grep -w PCSPKR_PLATFORM 
arch/mips/Kconfig:      select PCSPKR_PLATFORM
arch/mips/Kconfig:      select PCSPKR_PLATFORM
arch/mips/Kconfig:      select PCSPKR_PLATFORM
arch/powerpc/platforms/amigaone/Kconfig:        select PCSPKR_PLATFORM
drivers/input/misc/Kconfig:     depends on PCSPKR_PLATFORM
init/Kconfig:config PCSPKR_PLATFORM
sound/drivers/Kconfig:  depends on PCSPKR_PLATFORM && X86 && HIGH_RES_TIMERS

So the status is:

 Alpha:		There is no PCSPKR_PLATFORM so while a platform device is
		being installed no drivers will be built.  I don't know
		which Alpha platforms or even if all of Alpha should be
		doing a PCSPKR_PLATFORM so I haven't even tried to sort
		this.
 ARM:		No PC speaker supported, yeah :)
 PowerPC:	Should compile but the locking is wrong but only the AmigaOne
   		platforms should be affected.
 MIPS:		Ok.
 x86:		Ok.
 All others:	No PC speaker supported

Also only the plain old IBM PC XT was using a i8253; every later system
had i8254.  So maybe this is the time for renaming the support code?

  Ralf

^ permalink raw reply

* Re: [PATCH 7/7] [v2] drivers/misc: introduce Freescale hypervisor management driver
From: Alan Cox @ 2011-06-01 19:46 UTC (permalink / raw)
  To: Timur Tabi
  Cc: kumar.gala, linux-kernel, akpm, linux-console, greg, linuxppc-dev
In-Reply-To: <1306953337-15698-1-git-send-email-timur@freescale.com>

O> +	/* One partition must be local, the other must be remote.  In other
> +	   words, if source and target are both -1, or are both not -1, then
> +	   return an error. */
> +	if ((param.source == -1) == (param.target == -1))
> +		return -EINVAL;

Excess brackets (I just mention that one in passing)

> +	/* pages is an array of struct page pointers that's initialized by
> +	   get_user_pages() */
> +	pages = kzalloc(nr_pages * sizeof(struct page *), GFP_KERNEL);
> +	if (!pages) {
> +		pr_debug("fsl-hv: could not allocate page list\n");
> +		return -ENOMEM;
> +	}

pages allocated

> +
> +	/* sg_list is the list of fh_sg_list objects that we pass to the
> +	   hypervisor */
> +	sg_list_unaligned = kmalloc(nr_pages * sizeof(struct fh_sg_list) +
> +		sizeof(struct fh_sg_list) - 1, GFP_KERNEL);
> +	if (!sg_list_unaligned) {
> +		pr_debug("fsl-hv: could not allocate S/G list\n");

but not freed on this path

Although the other paths look fine.

> +exit:
> +	if (pages) {
> +		for (i = 0; i < nr_pages; i++)
> +			if (pages[i])
> +				page_cache_release(pages[i]);
> +	}
> +
> +	kfree(sg_list_unaligned);
> +	kfree(pages);

> +static char *strdup_from_user(const char __user *ustr, size_t max)
> +{
> +	size_t len;
> +	char *str;
> +
> +	len = strnlen_user(ustr, max);
> +	if (len > max)
> +		return ERR_PTR(-ENAMETOOLONG);
> +
> +	str = kmalloc(len, GFP_KERNEL);
> +	if (!str)
> +		return ERR_PTR(-ENOMEM);
> +
> +	if (copy_from_user(str, ustr, len))
> +		return ERR_PTR(-EFAULT);
> +
> +	return str;
> +}

This belongs on lib/ if its of general use which I think it perhaps is
and if we don't have one already.


> +	default:
> +		pr_debug("fsl-hv: unknown ioctl %u\n", cmd);
> +		ret = -ENOIOCTLCMD;

-ENOTTY

(-ENOIOCTLCMD is an internal indicator designed so driver layers can say
'dunno, try the next layer up')

> +/* Linked list of processes that have us open */
> +struct list_head db_list;

static ?


> + * We use the same interrupt handler for all doorbells.  Whenever a doorbell
> + * is rung, and we receive an interrupt, we just put the handle for that
> + * doorbell (passed to us as *data) into all of the queues.

I wonder if these should be presented as IRQs or whether that makes no
sense ?


> +static irqreturn_t fsl_hv_state_change_isr(int irq, void *data)
> +{
> +	unsigned int status;
> +	struct doorbell_isr *dbisr = data;
> +	int ret;
> +
> +	/* Determine the new state, and if it's stopped, notify the clients. */
> +	ret = fh_partition_get_status(dbisr->partition, &status);
> +	if (!ret && (status == FH_PARTITION_STOPPED))
> +		schedule_work(&dbisr->work);
> +
> +	/* Call the normal handler */
> +	return fsl_hv_isr(irq, (void *) (uintptr_t) dbisr->doorbell);
> +}

Would a threaded IRQ clean this up by avoiding the queue/work stuff ?


> +static irqreturn_t fsl_hv_shutdown_isr(int irq, void *data)
> +{
> +	schedule_work(&power_off);
> +
> +	/* We should never get here */

Probably worth printing something if you do (panic(...) ?)

^ permalink raw reply

* Re: [RFT][PATCH 3.0] ssb: fix PCI(e) driver regression causing oops on PCI cards
From: Rafał Miłecki @ 2011-06-01 19:14 UTC (permalink / raw)
  To: Andreas Schwab, linux-wireless, John W. Linville,
	Michael Büsch
  Cc: linuxppc-dev, Rafał Miłecki, b43-dev
In-Reply-To: <1306918871-16965-1-git-send-email-zajec5@gmail.com>

2011/6/1 Rafa=C5=82 Mi=C5=82ecki <zajec5@gmail.com>:
> We were incorrectly executing PCIe specific workarounds on PCI cards.
> This resulted in:
> Machine check in kernel mode.
> Caused by (from SRR1=3D149030): Transfer error ack signal
> Oops: Machine check, sig: 7 [#1]

John, I've tested this patch myself on my PCI BCM4318, including
checking for 0xFFFFFFFF reads in MMIO dumps.

The patch is correct, please take it for 3.0.

--=20
Rafa=C5=82

^ permalink raw reply

* [PATCH 7/7] [v2] drivers/misc: introduce Freescale hypervisor management driver
From: Timur Tabi @ 2011-06-01 18:35 UTC (permalink / raw)
  To: kumar.gala, benh, greg, akpm, linuxppc-dev, linux-kernel,
	linux-console

The Freescale hypervisor management driver provides several services to
drivers and applications related to the Freescale hypervisor:

1. An ioctl interface for querying and managing partitions

2. A file interface to reading incoming doorbells

3. An interrupt handler for shutting down the partition upon receiving the
   shutdown doorbell from a manager partition

4. An interface for receiving callbacks when a managed partition shuts down.

Signed-off-by: Timur Tabi <timur@freescale.com>
---
 drivers/misc/Kconfig           |    7 +
 drivers/misc/Makefile          |    1 +
 drivers/misc/fsl_hypervisor.c  |  941 ++++++++++++++++++++++++++++++++++++++++
 include/linux/Kbuild           |    1 +
 include/linux/fsl_hypervisor.h |  203 +++++++++
 5 files changed, 1153 insertions(+), 0 deletions(-)
 create mode 100644 drivers/misc/fsl_hypervisor.c
 create mode 100644 include/linux/fsl_hypervisor.h

diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index d80dcde..3e016b3 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -216,6 +216,13 @@ config ENCLOSURE_SERVICES
 	  driver (SCSI/ATA) which supports enclosures
 	  or a SCSI enclosure device (SES) to use these services.
 
+config FSL_HV_MANAGER
+	tristate "Freescale hypervisor management driver"
+	depends on FSL_SOC
+	help
+	  This driver allows applications to communicate with the Freescale
+	  Hypervisor.
+
 config SGI_XP
 	tristate "Support communication between SGI SSIs"
 	depends on NET
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index 848e846..d93bd76 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -20,6 +20,7 @@ obj-$(CONFIG_SENSORS_BH1770)	+= bh1770glc.o
 obj-$(CONFIG_SENSORS_APDS990X)	+= apds990x.o
 obj-$(CONFIG_SGI_IOC4)		+= ioc4.o
 obj-$(CONFIG_ENCLOSURE_SERVICES) += enclosure.o
+obj-$(CONFIG_FSL_HV_MANAGER)	+= fsl_hypervisor.o
 obj-$(CONFIG_KGDB_TESTS)	+= kgdbts.o
 obj-$(CONFIG_SGI_XP)		+= sgi-xp/
 obj-$(CONFIG_SGI_GRU)		+= sgi-gru/
diff --git a/drivers/misc/fsl_hypervisor.c b/drivers/misc/fsl_hypervisor.c
new file mode 100644
index 0000000..a03aa7b
--- /dev/null
+++ b/drivers/misc/fsl_hypervisor.c
@@ -0,0 +1,941 @@
+/** @file
+ * Freescale Hypervisor Management Driver
+ *
+ * This driver contains functions to support the Freescale hypervisor.
+ */
+/* Copyright (C) 2008-2010 Freescale Semiconductor, Inc.
+ * Author: Timur Tabi <timur@freescale.com>
+ *
+ * This file is licensed under the terms of the GNU General Public License
+ * version 2.  This program is licensed "as is" without any warranty of any
+ * kind, whether express or implied.
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/types.h>
+#include <linux/err.h>
+#include <linux/fs.h>
+#include <linux/miscdevice.h>
+#include <linux/mm.h>
+#include <linux/pagemap.h>
+#include <linux/slab.h>
+#include <linux/poll.h>
+#include <linux/of.h>
+#include <linux/reboot.h>
+#include <linux/uaccess.h>
+#include <linux/notifier.h>
+
+#include <linux/io.h>
+#include <asm/fsl_hcalls.h>
+
+#include <linux/fsl_hypervisor.h>
+
+static BLOCKING_NOTIFIER_HEAD(failover_subscribers);
+
+/**
+ * ioctl_restart: ioctl interface for FSL_HV_IOCTL_PARTITION_RESTART
+ *
+ * Restart a running partition
+ */
+static long ioctl_restart(struct fsl_hv_ioctl_restart __user *p)
+{
+	struct fsl_hv_ioctl_restart param;
+
+	/* Get the parameters from the user */
+	if (copy_from_user(&param, p, sizeof(struct fsl_hv_ioctl_restart)))
+		return -EFAULT;
+
+	param.ret = fh_partition_restart(param.partition);
+
+	if (copy_to_user(&p->ret, &param.ret, sizeof(__u32)))
+		return -EFAULT;
+
+	return 0;
+}
+
+/**
+ * ioctl_status: ioctl interface for FSL_HV_IOCTL_PARTITION_STATUS
+ *
+ * Query the status of a partition
+ */
+static long ioctl_status(struct fsl_hv_ioctl_status __user *p)
+{
+	struct fsl_hv_ioctl_status param;
+	u32 status;
+
+	/* Get the parameters from the user */
+	if (copy_from_user(&param, p, sizeof(struct fsl_hv_ioctl_status)))
+		return -EFAULT;
+
+	param.ret = fh_partition_get_status(param.partition, &status);
+	if (!param.ret)
+		param.status = status;
+
+	if (copy_to_user(p, &param, sizeof(struct fsl_hv_ioctl_status)))
+		return -EFAULT;
+
+	return 0;
+}
+
+/**
+ * ioctl_start: ioctl interface for FSL_HV_IOCTL_PARTITION_START
+ *
+ * Start a stopped partition.
+ */
+static long ioctl_start(struct fsl_hv_ioctl_start __user *p)
+{
+	struct fsl_hv_ioctl_start param;
+
+	/* Get the parameters from the user */
+	if (copy_from_user(&param, p, sizeof(struct fsl_hv_ioctl_start)))
+		return -EFAULT;
+
+	param.ret = fh_partition_start(param.partition, param.entry_point,
+				       param.load);
+
+	if (copy_to_user(&p->ret, &param.ret, sizeof(__u32)))
+		return -EFAULT;
+
+	return 0;
+}
+
+/**
+ * ioctl_stop: ioctl interface for FSL_HV_IOCTL_PARTITION_STOP
+ *
+ * Stop a running partition
+ */
+static long ioctl_stop(struct fsl_hv_ioctl_stop __user *p)
+{
+	struct fsl_hv_ioctl_stop param;
+
+	/* Get the parameters from the user */
+	if (copy_from_user(&param, p, sizeof(struct fsl_hv_ioctl_stop)))
+		return -EFAULT;
+
+	param.ret = fh_partition_stop(param.partition);
+
+	if (copy_to_user(&p->ret, &param.ret, sizeof(__u32)))
+		return -EFAULT;
+
+	return 0;
+}
+
+/**
+ * ioctl_memcpy: ioctl interface for FSL_HV_IOCTL_MEMCPY
+ *
+ * The FH_MEMCPY hypercall takes an array of address/address/size structures
+ * to represent the data being copied.  As a convenience to the user, this
+ * ioctl takes a user-create buffer and a pointer to a guest physically
+ * contiguous buffer in the remote partition, and creates the
+ * address/address/size array for the hypercall.
+ */
+static long ioctl_memcpy(struct fsl_hv_ioctl_memcpy __user *p)
+{
+	struct fsl_hv_ioctl_memcpy param;
+
+	struct page **pages = NULL;
+	void *sg_list_unaligned = NULL;
+	struct fh_sg_list *sg_list = NULL;
+
+	unsigned int nr_pages;
+	unsigned long lb_offset; /* Offset within a page of the local buffer */
+
+	unsigned int i;
+	long ret = 0;
+	phys_addr_t remote_paddr; /* The next address in the remote buffer */
+	uint32_t count; /* The number of bytes left to copy */
+
+	/* Get the parameters from the user */
+	if (copy_from_user(&param, p, sizeof(struct fsl_hv_ioctl_memcpy)))
+		return -EFAULT;
+
+	/* One partition must be local, the other must be remote.  In other
+	   words, if source and target are both -1, or are both not -1, then
+	   return an error. */
+	if ((param.source == -1) == (param.target == -1))
+		return -EINVAL;
+
+	/*
+	 * The array of pages returned by get_user_pages() covers only
+	 * page-aligned memory.  Since the user buffer is probably not
+	 * page-aligned, we need to handle the discrepancy.
+	 *
+	 * We calculate the offset within a page of the S/G list, and make
+	 * adjustments accordingly.  This will result in a page list that looks
+	 * like this:
+	 *
+	 *      ----    <-- first page starts before the buffer
+	 *     |    |
+	 *     |////|-> ----
+	 *     |////|  |    |
+	 *      ----   |    |
+	 *             |    |
+	 *      ----   |    |
+	 *     |////|  |    |
+	 *     |////|  |    |
+	 *     |////|  |    |
+	 *      ----   |    |
+	 *             |    |
+	 *      ----   |    |
+	 *     |////|  |    |
+	 *     |////|  |    |
+	 *     |////|  |    |
+	 *      ----   |    |
+	 *             |    |
+	 *      ----   |    |
+	 *     |////|  |    |
+	 *     |////|-> ----
+	 *     |    |   <-- last page ends after the buffer
+	 *      ----
+	 *
+	 * The distance between the start of the first page and the start of the
+	 * buffer is lb_offset.  The hashed (///) areas are the parts of the
+	 * page list that contain the actual buffer.
+	 *
+	 * The advantage of this approach is that the number of pages is
+	 * equal to the number of entries in the S/G list that we give to the
+	 * hypervisor.
+	 */
+	lb_offset = param.local_vaddr & (PAGE_SIZE - 1);
+	nr_pages = (param.count + lb_offset + PAGE_SIZE - 1) >> PAGE_SHIFT;
+
+	/* Allocate the buffers we need */
+
+	/* pages is an array of struct page pointers that's initialized by
+	   get_user_pages() */
+	pages = kzalloc(nr_pages * sizeof(struct page *), GFP_KERNEL);
+	if (!pages) {
+		pr_debug("fsl-hv: could not allocate page list\n");
+		return -ENOMEM;
+	}
+
+	/* sg_list is the list of fh_sg_list objects that we pass to the
+	   hypervisor */
+	sg_list_unaligned = kmalloc(nr_pages * sizeof(struct fh_sg_list) +
+		sizeof(struct fh_sg_list) - 1, GFP_KERNEL);
+	if (!sg_list_unaligned) {
+		pr_debug("fsl-hv: could not allocate S/G list\n");
+		return -ENOMEM;
+	}
+	sg_list = PTR_ALIGN(sg_list_unaligned, sizeof(struct fh_sg_list));
+
+	/* Get the physical addresses of the source buffer */
+	down_read(&current->mm->mmap_sem);
+	ret = get_user_pages(current, current->mm,
+		param.local_vaddr - lb_offset, nr_pages,
+		(param.source == -1) ? READ : WRITE,
+		0, pages, NULL);
+	up_read(&current->mm->mmap_sem);
+
+	if (ret != nr_pages) {
+		/* get_user_pages() failed */
+		pr_debug("fsl-hv: could not lock source buffer\n");
+		ret = -EACCES;
+		goto exit;
+	}
+
+	/* reset ret here */
+	ret = 0;
+
+	/* Build the fh_sg_list[] array.  The first page is special
+	   because it's misaligned.*/
+	if (param.source == -1) {
+		sg_list[0].source = page_to_phys(pages[0]) + lb_offset;
+		sg_list[0].target = param.remote_paddr;
+	} else {
+		sg_list[0].source = param.remote_paddr;
+		sg_list[0].target = page_to_phys(pages[0]) + lb_offset;
+	}
+	sg_list[0].size = min_t(uint64_t, param.count, PAGE_SIZE - lb_offset);
+
+	remote_paddr = param.remote_paddr + sg_list[0].size;
+	count = param.count - sg_list[0].size;
+
+	for (i = 1; i < nr_pages; i++) {
+		if (param.source == -1) {
+			/* local to remote */
+			sg_list[i].source = page_to_phys(pages[i]);
+			sg_list[i].target = remote_paddr;
+		} else {
+			/* remote to local */
+			sg_list[i].source = remote_paddr;
+			sg_list[i].target = page_to_phys(pages[i]);
+		}
+		sg_list[i].size = min_t(uint64_t, count, PAGE_SIZE);
+
+		remote_paddr += sg_list[i].size;
+		count -= sg_list[i].size;
+	}
+
+	param.ret = fh_partition_memcpy(param.source, param.target,
+		virt_to_phys(sg_list), nr_pages);
+
+exit:
+	if (pages) {
+		for (i = 0; i < nr_pages; i++)
+			if (pages[i])
+				page_cache_release(pages[i]);
+	}
+
+	kfree(sg_list_unaligned);
+	kfree(pages);
+
+	if (!ret)
+		if (copy_to_user(&p->ret, &param.ret, sizeof(__u32)))
+			return -EFAULT;
+
+	return ret;
+}
+
+/**
+ * ioctl_doorbell: ioctl interface for FSL_HV_IOCTL_DOORBELL
+ *
+ * Ring a doorbell
+ */
+static long ioctl_doorbell(struct fsl_hv_ioctl_doorbell __user *p)
+{
+	struct fsl_hv_ioctl_doorbell param;
+
+	/* Get the parameters from the user */
+	if (copy_from_user(&param, p, sizeof(struct fsl_hv_ioctl_doorbell)))
+		return -EFAULT;
+
+	param.ret = ev_doorbell_send(param.doorbell);
+
+	if (copy_to_user(&p->ret, &param.ret, sizeof(__u32)))
+		return -EFAULT;
+
+	return 0;
+}
+
+static char *strdup_from_user(const char __user *ustr, size_t max)
+{
+	size_t len;
+	char *str;
+
+	len = strnlen_user(ustr, max);
+	if (len > max)
+		return ERR_PTR(-ENAMETOOLONG);
+
+	str = kmalloc(len, GFP_KERNEL);
+	if (!str)
+		return ERR_PTR(-ENOMEM);
+
+	if (copy_from_user(str, ustr, len))
+		return ERR_PTR(-EFAULT);
+
+	return str;
+}
+
+static long ioctl_dtprop(struct fsl_hv_ioctl_prop __user *p, int set)
+{
+	struct fsl_hv_ioctl_prop param;
+	char __user *upath, *upropname;
+	void __user *upropval;
+	char *path = NULL, *propname = NULL;
+	void *propval = NULL;
+	int ret = 0;
+
+	/* Get the parameters from the user */
+	if (copy_from_user(&param, p, sizeof(struct fsl_hv_ioctl_prop)))
+		return -EFAULT;
+
+	upath = (char __user *)(uintptr_t)param.path;
+	upropname = (char __user *)(uintptr_t)param.propname;
+	upropval = (void __user *)(uintptr_t)param.propval;
+
+	path = strdup_from_user(upath, FH_DTPROP_MAX_PATHLEN);
+	if (IS_ERR(path)) {
+		ret = PTR_ERR(path);
+		goto out;
+	}
+
+	propname = strdup_from_user(upropname, FH_DTPROP_MAX_PATHLEN);
+	if (IS_ERR(propname)) {
+		ret = PTR_ERR(propname);
+		goto out;
+	}
+
+	if (param.proplen > FH_DTPROP_MAX_PROPLEN) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	propval = kmalloc(param.proplen, GFP_KERNEL);
+	if (!propval) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	if (set) {
+		if (copy_from_user(propval, upropval, param.proplen)) {
+			ret = -EFAULT;
+			goto out;
+		}
+
+		param.ret = fh_partition_set_dtprop(param.handle,
+						    virt_to_phys(path),
+						    virt_to_phys(propname),
+						    virt_to_phys(propval),
+						    param.proplen);
+	} else {
+		param.ret = fh_partition_get_dtprop(param.handle,
+						    virt_to_phys(path),
+						    virt_to_phys(propname),
+						    virt_to_phys(propval),
+						    &param.proplen);
+
+		if (param.ret == 0) {
+			if (copy_to_user(upropval, propval, param.proplen) ||
+			    put_user(param.proplen, &p->proplen)) {
+				ret = -EFAULT;
+				goto out;
+			}
+		}
+	}
+
+	if (put_user(param.ret, &p->ret))
+		ret = -EFAULT;
+
+out:
+	kfree(path);
+	kfree(propval);
+	kfree(propname);
+
+	return ret;
+}
+
+/**
+ * fsl_hv_ioctl: ioctl main entry point
+ */
+static long fsl_hv_ioctl(struct file *file, unsigned int cmd,
+			 unsigned long argaddr)
+{
+	union fsl_hv_ioctl_param __user *arg =
+		(union fsl_hv_ioctl_param __user *)argaddr;
+	long ret;
+
+	/* Make sure the application is called the right driver. */
+	if (_IOC_TYPE(cmd) != 0) {
+		pr_debug("fsl-hv: ioctl type %u should be 0\n", _IOC_TYPE(cmd));
+		return -EINVAL;
+	}
+
+	/* Make sure the application set the direction flag correctly. */
+	if (_IOC_DIR(cmd) != (_IOC_READ | _IOC_WRITE)) {
+		pr_debug("fsl-hv: ioctl direction should be _IOWR\n");
+		return -EINVAL;
+	}
+
+	/* Make sure the application is passing the right structure to us. */
+	if (_IOC_SIZE(cmd) < sizeof(union fsl_hv_ioctl_param)) {
+		pr_debug("fsl-hv: ioctl size %u is too small (should be %u)\n",
+			 _IOC_SIZE(cmd), sizeof(union fsl_hv_ioctl_param));
+		return -EINVAL;
+	}
+
+	switch (_IOC_NR(cmd)) {
+	case FSL_HV_IOCTL_PARTITION_RESTART:
+		ret = ioctl_restart(&arg->restart);
+		break;
+	case FSL_HV_IOCTL_PARTITION_GET_STATUS:
+		ret = ioctl_status(&arg->status);
+		break;
+	case FSL_HV_IOCTL_PARTITION_START:
+		ret = ioctl_start(&arg->start);
+		break;
+	case FSL_HV_IOCTL_PARTITION_STOP:
+		ret = ioctl_stop(&arg->stop);
+		break;
+	case FSL_HV_IOCTL_MEMCPY:
+		ret = ioctl_memcpy(&arg->memcpy);
+		break;
+	case FSL_HV_IOCTL_DOORBELL:
+		ret = ioctl_doorbell(&arg->doorbell);
+		break;
+	case FSL_HV_IOCTL_GETPROP:
+		ret = ioctl_dtprop(&arg->prop, 0);
+		break;
+	case FSL_HV_IOCTL_SETPROP:
+		ret = ioctl_dtprop(&arg->prop, 1);
+		break;
+	default:
+		pr_debug("fsl-hv: unknown ioctl %u\n", cmd);
+		ret = -ENOIOCTLCMD;
+		break;
+	}
+
+	return ret;
+}
+
+/* Linked list of processes that have us open */
+struct list_head db_list;
+
+/* spinlock for db_list */
+static DEFINE_SPINLOCK(db_list_lock);
+
+/* The size of the doorbell event queue.  This must be a power of two. */
+#define QSIZE	16
+
+/* Returns the next head/tail pointer, wrapping around the queue if necessary */
+#define nextp(x) (((x) + 1) & (QSIZE - 1))
+
+/* Per-open data structure */
+struct doorbell_queue {
+	struct list_head list;
+	spinlock_t lock;
+	wait_queue_head_t wait;
+	unsigned int head;
+	unsigned int tail;
+	uint32_t q[QSIZE];
+};
+
+/* Linked list of ISRs that we registered */
+struct list_head isr_list;
+
+/* Per-ISR data structure */
+struct doorbell_isr {
+	struct list_head list;
+	unsigned int irq;
+	uint32_t doorbell;	/* The doorbell handle */
+	uint32_t partition;	/* The partition handle, if used */
+	struct work_struct work;
+};
+
+/**
+ * fsl_hv_isr - interrupt handler for all doorbells
+ * @param irq - the IRQ (a.k.a. receive handle)
+ *
+ * We use the same interrupt handler for all doorbells.  Whenever a doorbell
+ * is rung, and we receive an interrupt, we just put the handle for that
+ * doorbell (passed to us as *data) into all of the queues.
+ *
+ */
+static irqreturn_t fsl_hv_isr(int irq, void *data)
+{
+	struct doorbell_queue *dbq;
+	unsigned long flags;
+
+	/* Prevent another core from modifying db_list */
+	spin_lock_irqsave(&db_list_lock, flags);
+
+	list_for_each_entry(dbq, &db_list, list) {
+		if (dbq->head != nextp(dbq->tail)) {
+			dbq->q[dbq->tail] = (uint32_t) (uintptr_t) data;
+			/* This memory barrier eliminates the need to grab
+			 * the spinlock for dbq.
+			 */
+			smp_wmb();
+			dbq->tail = nextp(dbq->tail);
+			wake_up_interruptible(&dbq->wait);
+		}
+	}
+
+	spin_unlock_irqrestore(&db_list_lock, flags);
+
+	return IRQ_HANDLED;
+}
+
+/**
+ * fsl_hv_state_change_work_func -- state change worker function
+ *
+ * The state change notification arrives in an interrupt, but we can't call
+ * blocking_notifier_call_chain() in an interrupt handler.  We could call
+ * atomic_notifier_call_chain(), but that would require the clients' call-back
+ * function to run in interrupt context.  Since we don't want to impose that
+ * restriction on the clients, we create a work queue to process the
+ * notification in kernel context.
+ */
+static void fsl_hv_state_change_work_func(struct work_struct *work)
+{
+	struct doorbell_isr *dbisr =
+		container_of(work, struct doorbell_isr, work);
+
+	blocking_notifier_call_chain(&failover_subscribers, dbisr->partition,
+				     NULL);
+}
+
+/**
+ * fsl_hv_state_change_isr - interrupt handler for state-change doorbells
+ */
+static irqreturn_t fsl_hv_state_change_isr(int irq, void *data)
+{
+	unsigned int status;
+	struct doorbell_isr *dbisr = data;
+	int ret;
+
+	/* Determine the new state, and if it's stopped, notify the clients. */
+	ret = fh_partition_get_status(dbisr->partition, &status);
+	if (!ret && (status == FH_PARTITION_STOPPED))
+		schedule_work(&dbisr->work);
+
+	/* Call the normal handler */
+	return fsl_hv_isr(irq, (void *) (uintptr_t) dbisr->doorbell);
+}
+
+/**
+ * fsl_hv_poll - returns a bitmask indicating whether a read will block
+ *
+ * @return unsigned int
+ */
+static unsigned int fsl_hv_poll(struct file *filp, struct poll_table_struct *p)
+{
+	struct doorbell_queue *dbq = filp->private_data;
+	unsigned long flags;
+	unsigned int mask;
+
+	spin_lock_irqsave(&dbq->lock, flags);
+
+	poll_wait(filp, &dbq->wait, p);
+	mask = (dbq->head == dbq->tail) ? 0 : (POLLIN | POLLRDNORM);
+
+	spin_unlock_irqrestore(&dbq->lock, flags);
+
+	return mask;
+}
+
+/**
+ * fsl_hv_read - return the handles for any incoming doorbells
+ *
+ * If there are doorbell handles in the queue for this open instance, then
+ * return them to the caller as an array of 32-bit integers.  Otherwise,
+ * block until there is at least one handle to return.
+ */
+static ssize_t fsl_hv_read(struct file *filp, char __user *buf, size_t len,
+			   loff_t *off)
+{
+	struct doorbell_queue *dbq = filp->private_data;
+	uint32_t __user *p = (uint32_t __user *) buf; /* for put_user() */
+	unsigned long flags;
+	ssize_t count = 0;
+
+	/* Make sure we stop when the user buffer is full. */
+	while (len >= sizeof(uint32_t)) {
+		uint32_t dbell;	/* Local copy of doorbell queue data */
+
+		spin_lock_irqsave(&dbq->lock, flags);
+
+		/* If the queue is empty, then either we're done or we need
+		 * to block.  If the application specified O_NONBLOCK, then
+		 * we return the appropriate error code.
+		 */
+		if (dbq->head == dbq->tail) {
+			spin_unlock_irqrestore(&dbq->lock, flags);
+			if (count)
+				break;
+			if (filp->f_flags & O_NONBLOCK)
+				return -EAGAIN;
+			if (wait_event_interruptible(dbq->wait,
+						     dbq->head != dbq->tail))
+				return -ERESTARTSYS;
+			continue;
+		}
+
+		/* Even though we have an smp_wmb() in the ISR, the core
+		 * might speculatively execute the "dbell = ..." below while
+		 * it's evaluating the if-statement above.  In that case, the
+		 * value put into dbell could be stale if the core accepts the
+		 * speculation. To prevent that, we need a read memory barrier
+		 * here as well.
+		 */
+		smp_rmb();
+
+		/* Copy the data to a temporary local buffer, because
+		 * we can't call copy_to_user() from inside a spinlock
+		 */
+		dbell = dbq->q[dbq->head];
+		dbq->head = nextp(dbq->head);
+
+		spin_unlock_irqrestore(&dbq->lock, flags);
+
+		if (put_user(dbell, p))
+			return -EFAULT;
+		p++;
+		count += sizeof(uint32_t);
+		len -= sizeof(uint32_t);
+	}
+
+	return count;
+}
+
+/**
+ * fsl_hv_open - open the driver
+ *
+ * Open the driver and prepare for reading doorbells.
+ *
+ * Every time an application opens the driver, we create a doorbell queue
+ * for that file handle.  This queue is used for any incoming doorbells.
+ */
+static int fsl_hv_open(struct inode *inode, struct file *filp)
+{
+	struct doorbell_queue *dbq;
+	unsigned long flags;
+	int ret = 0;
+
+	dbq = kzalloc(sizeof(struct doorbell_queue), GFP_KERNEL);
+	if (!dbq) {
+		pr_err("fsl-hv: out of memory\n");
+		return -ENOMEM;
+	}
+
+	spin_lock_init(&dbq->lock);
+	init_waitqueue_head(&dbq->wait);
+
+	spin_lock_irqsave(&db_list_lock, flags);
+	list_add(&dbq->list, &db_list);
+	spin_unlock_irqrestore(&db_list_lock, flags);
+
+	filp->private_data = dbq;
+
+	return ret;
+}
+
+/**
+ * fsl_hv_close - close the driver
+ */
+static int fsl_hv_close(struct inode *inode, struct file *filp)
+{
+	struct doorbell_queue *dbq = filp->private_data;
+	unsigned long flags;
+
+	int ret = 0;
+
+	spin_lock_irqsave(&db_list_lock, flags);
+	list_del(&dbq->list);
+	spin_unlock_irqrestore(&db_list_lock, flags);
+
+	kfree(dbq);
+
+	return ret;
+}
+
+static const struct file_operations fsl_hv_fops = {
+	.owner = THIS_MODULE,
+	.open = fsl_hv_open,
+	.release = fsl_hv_close,
+	.poll = fsl_hv_poll,
+	.read = fsl_hv_read,
+	.unlocked_ioctl = fsl_hv_ioctl,
+};
+
+static struct miscdevice fsl_hv_misc_dev = {
+	MISC_DYNAMIC_MINOR,
+	"fsl-hv",
+	&fsl_hv_fops
+};
+
+static DECLARE_WORK(power_off, (work_func_t) kernel_power_off);
+
+static irqreturn_t fsl_hv_shutdown_isr(int irq, void *data)
+{
+	schedule_work(&power_off);
+
+	/* We should never get here */
+	return IRQ_NONE;
+}
+
+/**
+ * get_parent_handle -- returns the handle of the parent of the given node
+ *
+ * The handle is the value of the 'reg' property
+ */
+static int get_parent_handle(struct device_node *np)
+{
+	struct device_node *parent;
+	const uint32_t *prop;
+	int len;
+
+	parent = of_get_parent(np);
+	if (!parent)
+		/* It's not really possible for this to fail */
+		return -ENODEV;
+
+	prop = of_get_property(parent, "reg", &len);
+	of_node_put(parent);
+
+	if (!prop || (len != sizeof(uint32_t)))
+		/* This can happen only if the node is malformed */
+		return -ENODEV;
+
+	return *prop;
+}
+
+/**
+ * fsl_hv_failover_register -- register a callback for failover events
+ *
+ * This function is called by device drivers to register their callback
+ * functions for fail-over events.
+ */
+int fsl_hv_failover_register(struct notifier_block *nb)
+{
+	return blocking_notifier_chain_register(&failover_subscribers, nb);
+}
+EXPORT_SYMBOL(fsl_hv_failover_register);
+
+/**
+ * fsl_hv_failover_unregister -- unregister a callback for failover events
+ */
+int fsl_hv_failover_unregister(struct notifier_block *nb)
+{
+	return blocking_notifier_chain_unregister(&failover_subscribers, nb);
+}
+EXPORT_SYMBOL(fsl_hv_failover_unregister);
+
+/**
+ * has_fsl_hypervisor - return TRUE if we're running under FSL hypervisor
+ *
+ * This function checks to see if we're running under the Freescale
+ * hypervisor, and returns zero if we're not, or non-zero if we are.
+ *
+ * First, it checks if MSR[GS]==1, which means we're running under some
+ * hypervisor.  Then it checks if there is a hypervisor node in the device
+ * tree.  Currently, that means there needs to be a node in the root called
+ * "hypervisor" and which has a property named "fsl,hv-version".
+ */
+static int has_fsl_hypervisor(void)
+{
+	struct device_node *node;
+	int ret;
+
+	if (!(mfmsr() & MSR_GS))
+		return 0;
+
+	node = of_find_node_by_path("/hypervisor");
+	if (!node)
+		return 0;
+
+	ret = of_find_property(node, "fsl,hv-version", NULL) != NULL;
+
+	of_node_put(node);
+
+	return ret;
+}
+
+/**
+ * fsl_hypervisor_init: Freescale hypervisor management driver init
+ *
+ * This function is called when this module is loaded.
+ *
+ * Register ourselves as a miscellaneous driver.  This will register the
+ * fops structure and create the right sysfs entries for udev.
+ */
+static int __init fsl_hypervisor_init(void)
+{
+	struct device_node *np;
+	struct doorbell_isr *dbisr, *n;
+	int ret;
+
+	pr_info("Freescale hypervisor management driver\n");
+
+	if (!has_fsl_hypervisor()) {
+		pr_info("fsl-hv: no hypervisor found\n");
+		return -ENODEV;
+	}
+
+	ret = misc_register(&fsl_hv_misc_dev);
+	if (ret) {
+		pr_err("fsl-hv: cannot register device\n");
+		return ret;
+	}
+
+	INIT_LIST_HEAD(&db_list);
+	INIT_LIST_HEAD(&isr_list);
+
+	for_each_compatible_node(np, NULL, "epapr,hv-receive-doorbell") {
+		unsigned int irq;
+		const uint32_t *handle;
+
+		handle = of_get_property(np, "interrupts", NULL);
+		irq = irq_of_parse_and_map(np, 0);
+		if (!handle || (irq == NO_IRQ)) {
+			pr_err("fsl-hv: no 'interrupts' property in %s node\n",
+				np->full_name);
+			continue;
+		}
+
+		dbisr = kzalloc(sizeof(*dbisr), GFP_KERNEL);
+		if (!dbisr)
+			goto out_of_memory;
+
+		dbisr->irq = irq;
+		dbisr->doorbell = *handle;
+		INIT_WORK(&dbisr->work, fsl_hv_state_change_work_func);
+
+		if (of_device_is_compatible(np, "fsl,hv-shutdown-doorbell")) {
+			/* The shutdown doorbell gets its own ISR */
+			ret = request_irq(irq, fsl_hv_shutdown_isr, 0,
+					  np->name, dbisr);
+		} else if (of_device_is_compatible(np,
+			"fsl,hv-state-change-doorbell")) {
+			/* The state change doorbell triggers a notification if
+			 * the state of the managed partition changes to
+			 * "stopped". We need a separate interrupt handler for
+			 * that, and we also need to know the handle of the
+			 * target partition, not just the handle of the
+			 * doorbell.
+			 */
+			dbisr->partition = ret = get_parent_handle(np);
+			if (ret < 0) {
+				pr_err("fsl-hv: node %s has missing or "
+				       "malformed parent\n", np->full_name);
+				kfree(dbisr);
+				continue;
+			}
+			ret = request_irq(irq, fsl_hv_state_change_isr, 0,
+					  np->name, dbisr);
+		} else
+			ret = request_irq(irq, fsl_hv_isr, 0, np->name, dbisr);
+
+		if (ret < 0) {
+			pr_err("fsl-hv: could not request irq %u for node %s\n",
+			       irq, np->full_name);
+			kfree(dbisr);
+			continue;
+		}
+
+		list_add(&dbisr->list, &isr_list);
+
+		pr_info("fsl-hv: registered handler for doorbell %u\n",
+			*handle);
+	}
+
+	return 0;
+
+out_of_memory:
+	list_for_each_entry_safe(dbisr, n, &isr_list, list) {
+		free_irq(dbisr->irq, dbisr);
+		list_del(&dbisr->list);
+		kfree(dbisr);
+	}
+
+	misc_deregister(&fsl_hv_misc_dev);
+
+	return -ENOMEM;
+}
+
+/**
+ * fsl_hypervisor_exit: Freescale hypervisor management driver termination
+ *
+ * This function is called when this driver is unloaded.
+ */
+static void __exit fsl_hypervisor_exit(void)
+{
+	struct doorbell_isr *dbisr, *n;
+
+	list_for_each_entry_safe(dbisr, n, &isr_list, list) {
+		free_irq(dbisr->irq, dbisr);
+		list_del(&dbisr->list);
+		kfree(dbisr);
+	}
+
+	misc_deregister(&fsl_hv_misc_dev);
+}
+
+module_init(fsl_hypervisor_init);
+module_exit(fsl_hypervisor_exit);
+
+MODULE_AUTHOR("Timur Tabi <timur@freescale.com>");
+MODULE_DESCRIPTION("Freescale hypervisor management driver");
+MODULE_LICENSE("GPL v2");
diff --git a/include/linux/Kbuild b/include/linux/Kbuild
index 75cf611..68c341a 100644
--- a/include/linux/Kbuild
+++ b/include/linux/Kbuild
@@ -134,6 +134,7 @@ header-y += firewire-cdev.h
 header-y += firewire-constants.h
 header-y += flat.h
 header-y += fs.h
+header-y += fsl_hypervisor.h
 header-y += fuse.h
 header-y += futex.h
 header-y += gameport.h
diff --git a/include/linux/fsl_hypervisor.h b/include/linux/fsl_hypervisor.h
new file mode 100644
index 0000000..63740a2
--- /dev/null
+++ b/include/linux/fsl_hypervisor.h
@@ -0,0 +1,203 @@
+/*
+ * Freescale hypervisor ioctl interface
+ *
+ * Copyright (C) 2008-2011 Freescale Semiconductor, Inc.
+ *
+ * This file is licensed under the terms of the GNU General Public License
+ * version 2.  This program is licensed "as is" without any warranty of any
+ * kind, whether express or implied.
+ *
+ * This file is used by the Freescale hypervisor management driver.  It can
+ * also be included by applications that need to communicate with the driver
+ * via the ioctl interface.
+ */
+
+#ifndef FSL_HYPERVISOR_H
+#define FSL_HYPERVISOR_H
+
+#include <linux/types.h>
+
+/**
+ * Freescale hypervisor ioctl parameter
+ */
+union fsl_hv_ioctl_param {
+
+	/**
+	 * @ret: Return value.
+	 *
+	 * This is always the first word of any structure.
+	 */
+	__u32 ret;
+
+	/**
+	 * struct fsl_hv_ioctl_restart: restart a partition
+	 * @ret: return error code from the hypervisor
+	 * @partition: the ID of the partition to restart, or -1 for the
+	 *             calling partition
+	 *
+	 * Used by FSL_HV_IOCTL_PARTITION_RESTART
+	 */
+	struct fsl_hv_ioctl_restart {
+		__u32 ret;
+		__u32 partition;
+	} restart;
+
+	/**
+	 * struct fsl_hv_ioctl_status: get a partition's status
+	 * @ret: return error code from the hypervisor
+	 * @partition: the ID of the partition to query, or -1 for the
+	 *             calling partition
+	 * @status: The returned status of the partition
+	 *
+	 * Used by FSL_HV_IOCTL_PARTITION_GET_STATUS
+	 *
+	 * Values of 'status':
+	 *    0 = Stopped
+	 *    1 = Running
+	 *    2 = Starting
+	 *    3 = Stopping
+	 */
+	struct fsl_hv_ioctl_status {
+		__u32 ret;
+		__u32 partition;
+		__u32 status;
+	} status;
+
+	/**
+	 * struct fsl_hv_ioctl_start: start a partition
+	 * @ret: return error code from the hypervisor
+	 * @partition: the ID of the partition to control
+	 * @entry_point: The offset within the guest IMA to start execution
+	 * @load: If non-zero, reload the partition's images before starting
+	 *
+	 * Used by FSL_HV_IOCTL_PARTITION_START
+	 */
+	struct fsl_hv_ioctl_start {
+		__u32 ret;
+		__u32 partition;
+		__u32 entry_point;
+		__u32 load;
+	} start;
+
+	/**
+	 * struct fsl_hv_ioctl_stop: stop a partition
+	 * @ret: return error code from the hypervisor
+	 * @partition: the ID of the partition to stop, or -1 for the calling
+	 *             partition
+	 *
+	 * Used by FSL_HV_IOCTL_PARTITION_STOP
+	 */
+	struct fsl_hv_ioctl_stop {
+		__u32 ret;
+		__u32 partition;
+	} stop;
+
+	/**
+	 * struct fsl_hv_ioctl_memcpy: copy memory between partitions
+	 * @ret: return error code from the hypervisor
+	 * @source: the partition ID of the source partition, or -1 for this
+	 *          partition
+	 * @target: the partition ID of the target partition, or -1 for this
+	 *          partition
+	 * @local_addr: user-space virtual address of a buffer in the local
+	 *              partition
+	 * @remote_addr: guest physical address of a buffer in the
+	 *           remote partition
+	 * @count: the number of bytes to copy.  Both the local and remote
+	 *         buffers must be at least 'count' bytes long
+	 *
+	 * Used by FSL_HV_IOCTL_MEMCPY
+	 *
+	 * The 'local' partition is the partition that calls this ioctl.  The
+	 * 'remote' partition is a different partition.  The data is copied from
+	 * the 'source' paritition' to the 'target' partition.
+	 *
+	 * The buffer in the remote partition must be guest physically
+	 * contiguous.
+	 *
+	 * This ioctl does not support copying memory between two remote
+	 * partitions or within the same partition, so either 'source' or
+	 * 'target' (but not both) must be -1.  In other words, either
+	 *
+	 *      source == local and target == remote
+	 * or
+	 *      source == remote and target == local
+	 */
+	struct fsl_hv_ioctl_memcpy {
+		__u32 ret;
+		__u32 source;
+		__u32 target;
+		__u64 local_vaddr;
+		__u64 remote_paddr;
+		__u64 count;
+	} memcpy;
+
+	/**
+	 * struct fsl_hv_ioctl_doorbell: ring a doorbell
+	 * @ret: return error code from the hypervisor
+	 * @doorbell: the handle of the doorbell to ring doorbell
+	 *
+	 * Used by FSL_HV_IOCTL_DOORBELL
+	 */
+	struct fsl_hv_ioctl_doorbell {
+		__u32 ret;
+		__u32 doorbell;
+	} doorbell;
+
+	/**
+	 * struct fsl_hv_ioctl_prop: get/set a device tree property
+	 * @ret: return error code from the hypervisor
+	 * @handle: handle of partition whose tree to access
+	 * @path: virtual address of path name of node to access
+	 * @propname: virtual address of name of property to access
+	 * @propval: virtual address of property data buffer
+	 * @proplen: Size of property data buffer
+	 *
+	 * Used by FSL_HV_IOCTL_DOORBELL
+	 */
+	struct fsl_hv_ioctl_prop {
+		__u32 ret;
+		__u32 handle;
+		__u64 path;
+		__u64 propname;
+		__u64 propval;
+		__u32 proplen;
+	} prop;
+};
+
+/*
+ * ioctl commands.
+ */
+enum {
+	FSL_HV_IOCTL_PARTITION_RESTART = 1, /* Boot another partition */
+	FSL_HV_IOCTL_PARTITION_GET_STATUS = 2, /* Boot another partition */
+	FSL_HV_IOCTL_PARTITION_START = 3, /* Boot another partition */
+	FSL_HV_IOCTL_PARTITION_STOP = 4, /* Stop this or another partition */
+	FSL_HV_IOCTL_MEMCPY = 5, /* Copy data from one partition to another */
+	FSL_HV_IOCTL_DOORBELL = 6, /* Ring a doorbell */
+
+	/* Get a property from another guest's device tree */
+	FSL_HV_IOCTL_GETPROP = 7,
+
+	/* Set a property in another guest's device tree */
+	FSL_HV_IOCTL_SETPROP = 8,
+};
+
+#ifdef __KERNEL__
+
+/**
+ * fsl_hv_event_register -- register a callback for failover events
+ *
+ * This function is called by device drivers to register their callback
+ * functions for fail-over events.
+ */
+int fsl_hv_failover_register(struct notifier_block *nb);
+
+/**
+ * fsl_hv_event_unregister -- unregister a callback for failover events
+ */
+int fsl_hv_failover_unregister(struct notifier_block *nb);
+
+#endif
+
+#endif
-- 
1.7.3.4

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox