LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH 2/2] powerpc/mm: Fix memory_block_size_bytes() for non-pseries
From: Benjamin Herrenschmidt @ 2011-07-02 14:15 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-mm@kvack.org, Thomas Gleixner, linuxppc-dev,
	linux-kernel@vger.kernel.org
In-Reply-To: <20110702102333.GC17482@elte.hu>

On Sat, 2011-07-02 at 12:23 +0200, Ingo Molnar wrote:
> It's certainly not a hard rule - but note that the file in question 
> (arch/powerpc/platforms/pseries/hotplug-memory.c) has a rather 
> inconsistent comment style, sometimes even within the same function:
> 
>         /*
>          * Remove htab bolted mappings for this section of memory
>          */
> ...
> 
>         /* Ensure all vmalloc mappings are flushed in case they also
>          * hit that section of memory
>          */
> 
> That kind of inconsistency within the same .c file and within the 
> same function is not defensible with a "style is a matter of taste" 
> argument.

Right, that's a matter of different people with different taste mucking
around with the same file I suppose.

Most of this probably predates my involvement as a maintainer but even
if not (and I really can't be bothered digging into the history), it
might very well be something I didn't pay attention to while reviewing.

Seriously, it's so low on the scale of what matters ... I'm sure we both
have more valuable stuff to spend our time and energy on :-)

Cheers,
Ben.

^ permalink raw reply

* Re: [PATCH] powerpc: enable access to HT Host-Bridge on Maple
From: Segher Boessenkool @ 2011-07-02 21:50 UTC (permalink / raw)
  To: Dmitry Eremin-Solenikov; +Cc: linuxppc-dev
In-Reply-To: <BANLkTinarWc6n5KLJdx=kFC5wysw+8VQtw@mail.gmail.com>

>>> CPC925/CPC945 use special window to access host bridge
>>> functionality of
>>> u3-ht. Provide a way to access this device.
>>
>> Why?  Is anything going to use it?
>
> Hmmm. Why not?

Because if nothing uses it it is essentially dead code.

> Initially I stumbled upon the fact that powermac provides
> such acces. Then I discovered that it provides configuration for  
> memory windows
> and other parts.

There are no memory windows in there afaik.  The main use is the HT
link config stuff, which in a few places needs special handling to
work :-/

> (I needed it for my hack to add AGP bridge on U3 in Linux,
> as I didn't want to change firmware).

Can you tell more about this?

>>> +static int u3_ht_root_read_config(struct pci_controller *hose, u8
>>> offset,
>>> +				  int len, u32 *val)
>>> +{
>>> +	volatile void __iomem *addr;
>>> +
>>> +	addr = hose->cfg_addr;
>>> +	addr += ((offset & ~3) << 2) + (4 - len - (offset & 3));
>>
>> This will only work for len 1,2,4 with offset a multiple of len,  
>> is that
>> guaranteed here?
>
> I have to check. I think there is no guarantee, but standard accessors
> are created exactly for 1, 2 and 4-byte access. And IIRC according
> to PCI specs, offset should be len-aligned.

If other places that need it do not do an alignment check, you don't  
need
to either I suppose.  But could you make the switch for length have
explicit 1,2,4 and default error?

>>> +static int u3_ht_root_write_config(struct pci_controller *hose, u8
>>> offset,
>>> +				  int len, u32 val)
>>> +{
>>> +	volatile void __iomem *addr;
>>> +
>>> +	addr = hose->cfg_addr + ((offset & ~3) << 2) + (4 - len - (offset
>>> & 3));
>>> +
>>> +	if (offset >= PCI_BASE_ADDRESS_0 && offset < PCI_CAPABILITY_LIST)
>>> +		return PCIBIOS_SUCCESSFUL;
>>
>> This is a workaround for something; at the very least it needs a
>> comment,
>> but probably it shouldn't be here at all.
>
> I'll add a comment. Basically these registers are unimplemented on  
> u3/u4
> bridges and at least some kinds of access to them cause system freeze.
> I'll try to check what triggers what this night.

I don't think the hardware freezes, but probably Linux isn't happy  
when it
tries to write to the non-existent BARs.  Or something like that.

>>>  static int u3_ht_read_config(struct pci_bus *bus, unsigned int  
>>> devfn,
>>>  			     int offset, int len, u32 *val)
>>>  {
>>> @@ -217,6 +265,9 @@ static int u3_ht_read_config(struct pci_bus
>>> *bus, unsigned int devfn,
>>>  	if (hose == NULL)
>>>  		return PCIBIOS_DEVICE_NOT_FOUND;
>>>
>>> +	if (bus->number == hose->first_busno && devfn == PCI_DEVFN(0, 0))
>>> +		return u3_ht_root_read_config(hose, offset, len, val);
>>> +
>>>  	if (offset > 0xff)
>>>  		return PCIBIOS_BAD_REGISTER_NUMBER;
>>
>> u3_ht_root_read_config() can get an offset out of range this way.
>
> Yes, as offsets for this host bridge can be larger then 0xff. U3/U4  
> have
> some HT configuration there, and I didn't want to impose a limit  
> there.
> If you are against this, I can change the order of calls.

The HT config is at (PCI) offset 0x40.  There are no implemented  
registers
at PCI offset >= 0x100.  That corresponds to f8070000..f80703ff, there's
one 4-byte PCI reg per 16 byte U3 bus address.

>>>  	hose->cfg_data = ioremap(0xf2000000, 0x02000000);
>>> +	hose->cfg_addr = ioremap(0xf8070000, 0x1000);
>>
>> Eww.  You could just make a global instead of abusing existing  
>> fields,
>> there can be only one CPC9x5 in a system anyway.
>
> This was a copy-paste of corresponding PowerMac code. Do you really  
> want
> for me to change this to global variable?

It could use a comment at least.  The addresses aren't "cfg_data" and
"cfg_addr" at all.

> BTW: I see a lot of code duplication between PowerMac and Maple pci.c
> files. Would you like a patch that merges those files to something
> like arch/powerpc/sysdev/u3-pci.c? I can try merging them...

That would be most excellent!  I hope there isn't much pmac special- 
casing
needed for that.

Thanks,


Segher

^ permalink raw reply

* Re: [PATCH 6/7] [v2] tty/powerpc: introduce the ePAPR embedded hypervisor byte channel driver
From: Tabi Timur-B04825 @ 2011-07-03  1:43 UTC (permalink / raw)
  To: greg@kroah.com
  Cc: arnd@arndb.de, linux-kernel@vger.kernel.org, akpm@kernel.org,
	linux-console@vger.kernel.org, Gala Kumar-B11780,
	linuxppc-dev@lists.ozlabs.org, alan@lxorguk.ukuu.org.uk
In-Reply-To: <1305830092-20104-1-git-send-email-timur@freescale.com>

On Thu, May 19, 2011 at 1:34 PM, Timur Tabi <timur@freescale.com> wrote:
> The ePAPR embedded hypervisor specification provides an API for "byte
> channels", which are serial-like virtual devices for sending and receivin=
g
> streams of bytes. =A0This driver provides Linux kernel support for byte
> channels via three distinct interfaces:
>
> 1) An early-console (udbg) driver. =A0This provides early console output
> through a byte channel. =A0The byte channel handle must be specified in a
> Kconfig option.
>
> 2) A normal console driver. =A0Output is sent to the byte channel designa=
ted
> for stdout in the device tree. =A0The console driver is for handling kern=
el
> printk calls.
>
> 3) A tty driver, which is used to handle user-space input and output. =A0=
The
> byte channel used for the console is designated as the default tty.
>
> Signed-off-by: Timur Tabi <timur@freescale.com>

Greg, would you please pick up this patch for 3.1?  Patches 1-5 are
already queued through the PowerPC tree.

Thank you.

The patch is available here if you need to download it:
http://patchwork.ozlabs.org/patch/96443/

--=20
Timur Tabi
Linux kernel developer at Freescale=

^ permalink raw reply

* Re: [RFC PATCH 17/17] KVM: PPC: Add an ioctl for userspace to select which platform to emulate
From: Avi Kivity @ 2011-07-03  8:15 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Scott Wood, linuxppc-dev@ozlabs.org, Paul Mackerras,
	kvm-ppc@vger.kernel.org, kvm@vger.kernel.org
In-Reply-To: <2EDDAC02-4BDC-4007-82E1-2C6D732C24D5@suse.de>

On 06/30/2011 07:33 PM, Alexander Graf wrote:
> On 30.06.2011, at 18:00, Avi Kivity<avi@redhat.com>  wrote:
>
> >  On 06/30/2011 06:22 PM, Alexander Graf wrote:
> >>>  Regarding that.  There's another option - the ioctl code embeds the structure size.  So if we extend the ioctl parsing to pad up (or truncate down) from the user's size to our size, and similarly in the other direction, we can get away from this ugliness.
> >>>
> >>>  Some years ago I posted a generic helper that did this (and also kmalloc'ed and kfree'd the data itself), but it wasn't received favourably.  Maybe I should try again (and we can possibly use it in kvm even if it is rejected for general use, though that's against our principles of pushing all generic infrastructure to the wider kernel).
> >>
> >>
> >>  That does sound interesting, but requires a lot more thought to be put into the actual code, as we basically need to read out the feature bitmap, then provide a minimum size for the chosen features and then decide if they fit in.
> >
> >
> >  Why? just put the things you want in the structure.
> >
> >  old userspace ->  new kernel: we auto-zero the parts userspace left out, and zero means old behaviour, so everthing works
> >  new userspace ->  old kernel: truncate.  Userspace shouldn't have used any new features (KVM_CAP), and we can -EINVAL if the truncated section contains a nonzero bit.
>
> Yup, which requires knowledge in the code on what actually fits :). Logic we don't have today.

I don't follow.  What knowledge is required?  Please give an example.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply

* Re: [RFC PATCH 17/17] KVM: PPC: Add an ioctl for userspace to select which platform to emulate
From: Alexander Graf @ 2011-07-03  8:34 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Scott Wood, linuxppc-dev@ozlabs.org, Paul Mackerras,
	kvm-ppc@vger.kernel.org, kvm@vger.kernel.org
In-Reply-To: <4E102516.1040805@redhat.com>


On 03.07.2011, at 10:15, Avi Kivity wrote:

> On 06/30/2011 07:33 PM, Alexander Graf wrote:
>> On 30.06.2011, at 18:00, Avi Kivity<avi@redhat.com>  wrote:
>>=20
>> >  On 06/30/2011 06:22 PM, Alexander Graf wrote:
>> >>>  Regarding that.  There's another option - the ioctl code embeds =
the structure size.  So if we extend the ioctl parsing to pad up (or =
truncate down) from the user's size to our size, and similarly in the =
other direction, we can get away from this ugliness.
>> >>>
>> >>>  Some years ago I posted a generic helper that did this (and also =
kmalloc'ed and kfree'd the data itself), but it wasn't received =
favourably.  Maybe I should try again (and we can possibly use it in kvm =
even if it is rejected for general use, though that's against our =
principles of pushing all generic infrastructure to the wider kernel).
>> >>
>> >>
>> >>  That does sound interesting, but requires a lot more thought to =
be put into the actual code, as we basically need to read out the =
feature bitmap, then provide a minimum size for the chosen features and =
then decide if they fit in.
>> >
>> >
>> >  Why? just put the things you want in the structure.
>> >
>> >  old userspace ->  new kernel: we auto-zero the parts userspace =
left out, and zero means old behaviour, so everthing works
>> >  new userspace ->  old kernel: truncate.  Userspace shouldn't have =
used any new features (KVM_CAP), and we can -EINVAL if the truncated =
section contains a nonzero bit.
>>=20
>> Yup, which requires knowledge in the code on what actually fits :). =
Logic we don't have today.
>=20
> I don't follow.  What knowledge is required?  Please give an example.

Sure. Let's take an easy example Currently we have for get_pvinfo:

        case KVM_PPC_GET_PVINFO: {
                struct kvm_ppc_pvinfo pvinfo;
                memset(&pvinfo, 0, sizeof(pvinfo));
                r =3D kvm_vm_ioctl_get_pvinfo(&pvinfo);
                if (copy_to_user(argp, &pvinfo, sizeof(pvinfo))) {
                        r =3D -EFAULT;
                        goto out;
                }

                break;
        }

        /* for KVM_PPC_GET_PVINFO */
        struct kvm_ppc_pvinfo {
                /* out */
                __u32 flags;
                __u32 hcall[4];
                __u8  pad[108];
        };

The padding would not be there with your idea. An updated version could =
look like this:

        /* for KVM_PPC_GET_PVINFO */
        struct kvm_ppc_pvinfo {
                /* out */
                __u32 flags;
                __u32 hcall[4];
                __u64 features;  /* only there with =
PVINFO_FLAGS_FEATURES */
        };

Now, your idea was to not use copy_from/to_user directly, but instead =
some wrapper that could pad with zeros on read or truncate on write. So =
instead we would essentially get:

        int kvm_vm_ioctl_get_pvinfo(struct kvm_ppc_pvinfo *pvinfo, int =
*required_size)
        {
                [...]
		if (pvinfo_flags & PVINFO_FLAGS_FEATURES) {
                        *required_size =3D 16;
                } else {
                        *required_size =3D 8;
                }
                [...]
        }

        case KVM_PPC_GET_PVINFO: {
                struct kvm_ppc_pvinfo pvinfo;
                int required_size =3D 0;
                memset(&pvinfo, 0, sizeof(pvinfo));
                r =3D kvm_vm_ioctl_get_pvinfo(&pvinfo, &required_size);
                if (copy_to_user(argp, &pvinfo, required_size) {
                        r =3D -EFAULT;
                        goto out;
                }

                break;
        }

Otherwise we might write over data the user expected. And that logic =
that tells to copy_to_user how much data it actually takes to put all =
the information in is not there today and would have to be added. You =
can even verify that required_size with the ioctl passed size to make =
100% sure user space is sane, but I'd claim that a feature bitmap is =
plenty of information to ensure that we're not doing something stupid.


Alex

^ permalink raw reply

* Re: [RFC PATCH 17/17] KVM: PPC: Add an ioctl for userspace to select which platform to emulate
From: Avi Kivity @ 2011-07-03  8:56 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Scott Wood, linuxppc-dev@ozlabs.org, Paul Mackerras,
	kvm-ppc@vger.kernel.org, kvm@vger.kernel.org
In-Reply-To: <70A08140-B592-4B2F-985B-D8E5C78C743B@suse.de>

On 07/03/2011 11:34 AM, Alexander Graf wrote:
> >>
> >>  Yup, which requires knowledge in the code on what actually fits :). Logic we don't have today.
> >
> >  I don't follow.  What knowledge is required?  Please give an example.
>
> Sure. Let's take an easy example Currently we have for get_pvinfo:
>

<snip>

> The padding would not be there with your idea. An updated version could look like this:
>
>          /* for KVM_PPC_GET_PVINFO */
>          struct kvm_ppc_pvinfo {
>                  /* out */
>                  __u32 flags;
>                  __u32 hcall[4];
>                  __u64 features;  /* only there with PVINFO_FLAGS_FEATURES */
>          };
>
> Now, your idea was to not use copy_from/to_user directly, but instead some wrapper that could pad with zeros on read or truncate on write. So instead we would essentially get:
>
>          int kvm_vm_ioctl_get_pvinfo(struct kvm_ppc_pvinfo *pvinfo, int *required_size)
>          {
>                  [...]
> 		if (pvinfo_flags&  PVINFO_FLAGS_FEATURES) {
>                          *required_size = 16;
>                  } else {
>                          *required_size = 8;
>                  }
>                  [...]
>          }


Why?  Kernel code would only consider the full structure.


>          case KVM_PPC_GET_PVINFO: {
>                  struct kvm_ppc_pvinfo pvinfo;
>                  int required_size = 0;
>                  memset(&pvinfo, 0, sizeof(pvinfo));
>                  r = kvm_vm_ioctl_get_pvinfo(&pvinfo,&required_size);
>                  if (copy_to_user(argp,&pvinfo, required_size) {
>                          r = -EFAULT;
>                          goto out;
>                  }

required_size would come from the size encoded in the ioctl number, no 
need to calculate it separately.

>                  break;
>          }
>
> Otherwise we might write over data the user expected. And that logic that tells to copy_to_user how much data it actually takes to put all the information in is not there today and would have to be added. You can even verify that required_size with the ioctl passed size to make 100% sure user space is sane, but I'd claim that a feature bitmap is plenty of information to ensure that we're not doing something stupid.

I don't see why we have to caclulate something, then verify it against 
the correct answer.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply

* Re: [RFC PATCH 17/17] KVM: PPC: Add an ioctl for userspace to select which platform to emulate
From: Alexander Graf @ 2011-07-03  9:00 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Scott Wood, linuxppc-dev@ozlabs.org, Paul Mackerras,
	kvm-ppc@vger.kernel.org, kvm@vger.kernel.org
In-Reply-To: <4E102ECE.1060004@redhat.com>


On 03.07.2011, at 10:56, Avi Kivity wrote:

> On 07/03/2011 11:34 AM, Alexander Graf wrote:
>> >>
>> >>  Yup, which requires knowledge in the code on what actually fits =
:). Logic we don't have today.
>> >
>> >  I don't follow.  What knowledge is required?  Please give an =
example.
>>=20
>> Sure. Let's take an easy example Currently we have for get_pvinfo:
>>=20
>=20
> <snip>
>=20
>> The padding would not be there with your idea. An updated version =
could look like this:
>>=20
>>         /* for KVM_PPC_GET_PVINFO */
>>         struct kvm_ppc_pvinfo {
>>                 /* out */
>>                 __u32 flags;
>>                 __u32 hcall[4];
>>                 __u64 features;  /* only there with =
PVINFO_FLAGS_FEATURES */
>>         };
>>=20
>> Now, your idea was to not use copy_from/to_user directly, but instead =
some wrapper that could pad with zeros on read or truncate on write. So =
instead we would essentially get:
>>=20
>>         int kvm_vm_ioctl_get_pvinfo(struct kvm_ppc_pvinfo *pvinfo, =
int *required_size)
>>         {
>>                 [...]
>> 		if (pvinfo_flags&  PVINFO_FLAGS_FEATURES) {
>>                         *required_size =3D 16;
>>                 } else {
>>                         *required_size =3D 8;
>>                 }
>>                 [...]
>>         }
>=20
>=20
> Why?  Kernel code would only consider the full structure.
>=20
>=20
>>         case KVM_PPC_GET_PVINFO: {
>>                 struct kvm_ppc_pvinfo pvinfo;
>>                 int required_size =3D 0;
>>                 memset(&pvinfo, 0, sizeof(pvinfo));
>>                 r =3D =
kvm_vm_ioctl_get_pvinfo(&pvinfo,&required_size);
>>                 if (copy_to_user(argp,&pvinfo, required_size) {
>>                         r =3D -EFAULT;
>>                         goto out;
>>                 }
>=20
> required_size would come from the size encoded in the ioctl number, no =
need to calculate it separately.
>=20
>>                 break;
>>         }
>>=20
>> Otherwise we might write over data the user expected. And that logic =
that tells to copy_to_user how much data it actually takes to put all =
the information in is not there today and would have to be added. You =
can even verify that required_size with the ioctl passed size to make =
100% sure user space is sane, but I'd claim that a feature bitmap is =
plenty of information to ensure that we're not doing something stupid.
>=20
> I don't see why we have to caclulate something, then verify it against =
the correct answer.

Ah, I think I'm grasping your idea. You'd simply truncate the resulting =
struct according to the size passed by the ioctl and call it a day. =
Well, that works too. User space simply wouldn't be able to know if all =
information actually fit into the struct, but I guess that's fine :).


Alex

^ permalink raw reply

* Re: [RFC PATCH 17/17] KVM: PPC: Add an ioctl for userspace to select which platform to emulate
From: Avi Kivity @ 2011-07-03  9:05 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Scott Wood, linuxppc-dev@ozlabs.org, Paul Mackerras,
	kvm-ppc@vger.kernel.org, kvm@vger.kernel.org
In-Reply-To: <CBE84140-B1FC-48B6-A000-77CB58C5F928@suse.de>

On 07/03/2011 12:00 PM, Alexander Graf wrote:
> >>          }
> >>
> >>  Otherwise we might write over data the user expected. And that logic that tells to copy_to_user how much data it actually takes to put all the information in is not there today and would have to be added. You can even verify that required_size with the ioctl passed size to make 100% sure user space is sane, but I'd claim that a feature bitmap is plenty of information to ensure that we're not doing something stupid.
> >
> >  I don't see why we have to caclulate something, then verify it against the correct answer.
>
> Ah, I think I'm grasping your idea. You'd simply truncate the resulting struct according to the size passed by the ioctl and call it a day. Well, that works too. User space simply wouldn't be able to know if all information actually fit into the struct, but I guess that's fine :).
>

Right.  The idea is that if KVM_FLAG_BLAH implies a field 
kvm_struct::blah, then either both are present in the headers, or none  
of them.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply

* Re: [RFC PATCH 17/17] KVM: PPC: Add an ioctl for userspace to select which platform to emulate
From: Alexander Graf @ 2011-07-03  9:09 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Scott Wood, linuxppc-dev@ozlabs.org, Paul Mackerras,
	kvm-ppc@vger.kernel.org, kvm@vger.kernel.org
In-Reply-To: <4E1030E1.6070101@redhat.com>


On 03.07.2011, at 11:05, Avi Kivity wrote:

> On 07/03/2011 12:00 PM, Alexander Graf wrote:
>> >>          }
>> >>
>> >>  Otherwise we might write over data the user expected. And that =
logic that tells to copy_to_user how much data it actually takes to put =
all the information in is not there today and would have to be added. =
You can even verify that required_size with the ioctl passed size to =
make 100% sure user space is sane, but I'd claim that a feature bitmap =
is plenty of information to ensure that we're not doing something =
stupid.
>> >
>> >  I don't see why we have to caclulate something, then verify it =
against the correct answer.
>>=20
>> Ah, I think I'm grasping your idea. You'd simply truncate the =
resulting struct according to the size passed by the ioctl and call it a =
day. Well, that works too. User space simply wouldn't be able to know if =
all information actually fit into the struct, but I guess that's fine =
:).
>>=20
>=20
> Right.  The idea is that if KVM_FLAG_BLAH implies a field =
kvm_struct::blah, then either both are present in the headers, or none  =
of them.

Yup, makes sense. I like the idea :). Gets rid of all the useless =
paddings and reserved fields. We could even truncate the structs that =
already have paddings in them if we only copy min(sizeof(real_struct), =
ioctl_passed_size); (which we should anyways).

How long until we get a patch set? :)


Alex

^ permalink raw reply

* Re: [RFC PATCH 17/17] KVM: PPC: Add an ioctl for userspace to select which platform to emulate
From: Avi Kivity @ 2011-07-03  9:12 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Scott Wood, linuxppc-dev@ozlabs.org, Paul Mackerras,
	kvm-ppc@vger.kernel.org, kvm@vger.kernel.org
In-Reply-To: <77E6DEE3-F00A-46EA-B8AF-58BC62A8143D@suse.de>

On 07/03/2011 12:09 PM, Alexander Graf wrote:
> >
> >  Right.  The idea is that if KVM_FLAG_BLAH implies a field kvm_struct::blah, then either both are present in the headers, or none  of them.
>
> Yup, makes sense. I like the idea :). Gets rid of all the useless paddings and reserved fields. We could even truncate the structs that already have paddings in them if we only copy min(sizeof(real_struct), ioctl_passed_size); (which we should anyways).
>

No, we can't change anything that is already out.  If will change the 
ioctl numbers, so building against new headers but running against an 
old kernel will fail.

> How long until we get a patch set? :)

Well, I'd really like to get the qemu memory API out first.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply

* [PATCH 3/7] fault-injection: notifier error injection
From: Akinobu Mita @ 2011-07-03 14:16 UTC (permalink / raw)
  To: linux-kernel, akpm
  Cc: Greg Kroah-Hartman, Akinobu Mita, Rafael J. Wysocki, linux-mm,
	Paul Mackerras, Pavel Machek, linux-pm, linuxppc-dev
In-Reply-To: <1309702581-16863-1-git-send-email-akinobu.mita@gmail.com>

The notifier error injection provides the ability to inject artifical
errors to specified notifier chain callbacks. It is useful to test
the error handling of notifier call chain failures.

This adds common basic functions to define which type of events can be
fail and to initialize the debugfs interface to control what error code
should be returned and which event should be failed.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linux-pm@lists.linux-foundation.org
Cc: linux-mm@kvack.org
Cc: linuxppc-dev@lists.ozlabs.org
---
 include/linux/notifier.h |   25 ++++++++++++++++++++
 kernel/notifier.c        |   57 ++++++++++++++++++++++++++++++++++++++++++++++
 lib/Kconfig.debug        |   11 +++++++++
 3 files changed, 93 insertions(+), 0 deletions(-)

diff --git a/include/linux/notifier.h b/include/linux/notifier.h
index c0688b0..51882d6 100644
--- a/include/linux/notifier.h
+++ b/include/linux/notifier.h
@@ -278,5 +278,30 @@ extern struct blocking_notifier_head reboot_notifier_list;
 #define VT_UPDATE		0x0004 /* A bigger update occurred */
 #define VT_PREWRITE		0x0005 /* A char is about to be written to the console */
 
+#ifdef CONFIG_NOTIFIER_ERROR_INJECTION
+
+struct err_inject_notifier_action {
+	unsigned long val;
+	int error;
+	const char *name;
+};
+
+#define ERR_INJECT_NOTIFIER_ACTION(action)	\
+	.name = #action, .val = (action),
+
+struct err_inject_notifier_block {
+	struct notifier_block nb;
+	struct dentry *dir;
+	struct err_inject_notifier_action actions[];
+	/* The last slot must be terminated with zero sentinel */
+};
+
+extern int err_inject_notifier_block_init(struct err_inject_notifier_block *enb,
+				const char *name, int priority);
+extern void err_inject_notifier_block_cleanup(
+				struct err_inject_notifier_block *enb);
+
+#endif /* CONFIG_NOTIFIER_ERROR_INJECTION */
+
 #endif /* __KERNEL__ */
 #endif /* _LINUX_NOTIFIER_H */
diff --git a/kernel/notifier.c b/kernel/notifier.c
index 2488ba7..8dcc485 100644
--- a/kernel/notifier.c
+++ b/kernel/notifier.c
@@ -5,6 +5,7 @@
 #include <linux/rcupdate.h>
 #include <linux/vmalloc.h>
 #include <linux/reboot.h>
+#include <linux/debugfs.h>
 
 /*
  *	Notifier list for kernel code which wants to be called
@@ -584,3 +585,59 @@ int unregister_die_notifier(struct notifier_block *nb)
 	return atomic_notifier_chain_unregister(&die_chain, nb);
 }
 EXPORT_SYMBOL_GPL(unregister_die_notifier);
+
+#ifdef CONFIG_NOTIFIER_ERROR_INJECTION
+
+static int err_inject_notifier_callback(struct notifier_block *nb,
+				unsigned long val, void *p)
+{
+	int err = 0;
+	struct err_inject_notifier_block *enb =
+		container_of(nb, struct err_inject_notifier_block, nb);
+	struct err_inject_notifier_action *action;
+
+	for (action = enb->actions; action->name; action++) {
+		if (action->val == val) {
+			err = action->error;
+			break;
+		}
+	}
+	if (err) {
+		printk(KERN_INFO "Injecting error (%d) to %s\n",
+			err, action->name);
+	}
+
+	return notifier_from_errno(err);
+}
+
+int err_inject_notifier_block_init(struct err_inject_notifier_block *enb,
+				const char *name, int priority)
+{
+	struct err_inject_notifier_action *action;
+	mode_t mode = S_IFREG | S_IRUSR | S_IWUSR;
+
+	enb->nb.notifier_call = err_inject_notifier_callback;
+	enb->nb.priority = priority;
+
+	enb->dir = debugfs_create_dir(name, NULL);
+	if (!enb->dir)
+		return -ENOMEM;
+
+	for (action = enb->actions; action->name; action++) {
+		struct dentry *file = debugfs_create_int(action->name, mode,
+						enb->dir, &action->error);
+
+		if (!file) {
+			debugfs_remove_recursive(enb->dir);
+			return -ENOMEM;
+		}
+	}
+	return 0;
+}
+
+void err_inject_notifier_block_cleanup(struct err_inject_notifier_block *enb)
+{
+	debugfs_remove_recursive(enb->dir);
+}
+
+#endif /* CONFIG_NOTIFIER_ERROR_INJECTION */
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index dd373c8..8c6ce7e 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1018,6 +1018,17 @@ config LKDTM
 	Documentation on how to use the module can be found in
 	Documentation/fault-injection/provoke-crashes.txt
 
+config NOTIFIER_ERROR_INJECTION
+	bool "Notifier error injection"
+	depends on DEBUG_KERNEL
+	select DEBUG_FS
+	help
+	  This option provides the ability to inject artifical errors to
+	  specified notifier chain callbacks. It is useful to test the error
+	  handling of notifier call chain failures.
+
+	  Say N if unsure.
+
 config CPU_NOTIFIER_ERROR_INJECT
 	tristate "CPU notifier error injection module"
 	depends on HOTPLUG_CPU && DEBUG_KERNEL
-- 
1.7.4.4

^ permalink raw reply related

* [PATCH 7/7] powerpc: pSeries reconfig notifier error injection
From: Akinobu Mita @ 2011-07-03 14:16 UTC (permalink / raw)
  To: linux-kernel, akpm; +Cc: linuxppc-dev, Paul Mackerras, Akinobu Mita
In-Reply-To: <1309702581-16863-1-git-send-email-akinobu.mita@gmail.com>

This provides the ability to inject artifical errors to pSeries reconfig
notifier chain callbacks.  It is controlled through debugfs interface
under /sys/kernel/debug/pSeries-reconfig-notifier-error-inject/

Each of the files in the directory represents an event which can be
failed and contains the error code.  If the notifier call chain should
be failed with some events notified, write the error code to the files.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
---
 arch/powerpc/platforms/pseries/reconfig.c |   31 +++++++++++++++++++++++++++++
 lib/Kconfig.debug                         |    9 ++++++++
 2 files changed, 40 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/reconfig.c b/arch/powerpc/platforms/pseries/reconfig.c
index 168651a..31d9b0f 100644
--- a/arch/powerpc/platforms/pseries/reconfig.c
+++ b/arch/powerpc/platforms/pseries/reconfig.c
@@ -117,6 +117,37 @@ int pSeries_reconfig_notify(unsigned long action, void *p)
 	return notifier_to_errno(err);
 }
 
+#ifdef CONFIG_PSERIES_RECONFIG_NOTIFIER_ERROR_INJECTION
+
+static struct err_inject_notifier_block err_inject_reconfig_nb = {
+	.actions = {
+		{ ERR_INJECT_NOTIFIER_ACTION(PSERIES_RECONFIG_ADD) },
+		{ ERR_INJECT_NOTIFIER_ACTION(PSERIES_RECONFIG_REMOVE) },
+		{ ERR_INJECT_NOTIFIER_ACTION(PSERIES_DRCONF_MEM_ADD) },
+		{ ERR_INJECT_NOTIFIER_ACTION(PSERIES_DRCONF_MEM_REMOVE) },
+		{}
+	}
+};
+
+static int __init err_inject_reconfig_notifier_init(void)
+{
+	int err;
+
+	err = err_inject_notifier_block_init(&err_inject_reconfig_nb,
+			"pSeries-reconfig-notifier-error-inject", -1);
+	if (err)
+		return err;
+
+	err = pSeries_reconfig_notifier_register(&err_inject_reconfig_nb.nb);
+	if (err)
+		err_inject_notifier_block_cleanup(&err_inject_reconfig_nb);
+
+	return err;
+}
+late_initcall(err_inject_reconfig_notifier_init);
+
+#endif /* CONFIG_PSERIES_RECONFIG_NOTIFIER_ERROR_INJECTION */
+
 static int pSeries_reconfig_add_node(const char *path, struct property *proplist)
 {
 	struct device_node *np;
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 52f0b0e..2becf8c 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1059,6 +1059,15 @@ config MEMORY_NOTIFIER_ERROR_INJECTION
 	  memory hotplug notifier chain callbacks.  It is controlled through
 	  debugfs interface under /sys/kernel/debug/memory-notifier-error-inject/
 
+config PSERIES_RECONFIG_NOTIFIER_ERROR_INJECTION
+	bool "pSeries reconfig notifier error injection"
+	depends on PPC_PSERIES && NOTIFIER_ERROR_INJECTION
+	help
+	  This option provides the ability to inject artifical errors to
+	  pSeries reconfig notifier chain callbacks.  It is controlled
+	  through debugfs interface under
+	  /sys/kernel/debug/pSeries-reconfig-notifier-error-inject/
+
 config CPU_NOTIFIER_ERROR_INJECT
 	tristate "CPU notifier error injection module"
 	depends on HOTPLUG_CPU && DEBUG_KERNEL
-- 
1.7.4.4

^ permalink raw reply related

* Re: [PATCH 3/7] fault-injection: notifier error injection
From: Pavel Machek @ 2011-07-03 17:16 UTC (permalink / raw)
  To: Akinobu Mita
  Cc: Greg Kroah-Hartman, linux-kernel, Rafael J. Wysocki, linux-mm,
	Paul Mackerras, akpm, linuxppc-dev, linux-pm
In-Reply-To: <1309702581-16863-4-git-send-email-akinobu.mita@gmail.com>


> +	for (action = enb->actions; action->name; action++) {
> +		struct dentry *file = debugfs_create_int(action->name, mode,
> +						enb->dir, &action->error);
> +
> +		if (!file) {
> +			debugfs_remove_recursive(enb->dir);
> +			return -ENOMEM;
> +		}

Few lines how this work would be welcome...?
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply

* Re: linux tqm8260
From: Wolfgang Denk @ 2011-07-03 19:01 UTC (permalink / raw)
  To: bourkeb idir; +Cc: Linuxppc-dev@lists.ozlabs.org
In-Reply-To: <1309510675.9518.YahooMailNeo@web29503.mail.ird.yahoo.com>

Dear bourkeb idir,

In message <1309510675.9518.YahooMailNeo@web29503.mail.ird.yahoo.com> you wrote:

> i'm trying to port linux in a tqm8260 fresscale card with an mpc8260
> cpu .i've already ported u-boot and it seems working but now i'm
> trying to cross compile a linux kernel 2.4.4 (which have a config
> file for tqm8260) and i always get trouble when i try to make a
> zImage . can anyone help me ,please?

This statement makes little sense to me.

First, the TQM8260 is not a Freescare card.  The manufacturer is (or
rather was) TQ Systems.

Second, I wonder why you would spend any efforts into such a system -
the product was EOLed a long time ago.

Third, when the product was still alive, it was pretty well supported
by U-Boot and Linux.  U-Boot sopport for it is still available in
mainline.  Linux support was available in our linuxppc_2_4_devel tree
up to and including Linux version 2.4.25.  So using 2.4.4 as a
starting point for such a work is even more incomprehensible to me.


If you really have a TQM8260 board, then just use the mainline U-Boot
code (current release, i. e. v20100.06) and the old 2.4.25 based
source tree mentioned above.

But be warned: this is ancient hardware, and so is the kernel code.
There is no guarantee it will even compile using recent tool chains.

Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de
You've no idea of what a poor opinion  I  have  of  myself,  and  how
little I deserve it.                                  - W. S. Gilbert

^ permalink raw reply

* Re: [PATCH] powerpc: enable access to HT Host-Bridge on Maple
From: Dmitry Eremin-Solenikov @ 2011-07-03 23:31 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: linuxppc-dev
In-Reply-To: <288B47E0-3869-486D-B753-62EFA1EEF66D@kernel.crashing.org>

On 7/3/11, Segher Boessenkool <segher@kernel.crashing.org> wrote:
>>>> CPC925/CPC945 use special window to access host bridge
>>>> functionality of
>>>> u3-ht. Provide a way to access this device.
>>>
>>> Why?  Is anything going to use it?
>>
>> Hmmm. Why not?
>
> Because if nothing uses it it is essentially dead code.

1) It provides some (possibly debug) information to users
2) It provides a way to configure (e.g. HT address mask (reg 80),
   some bridge control regs (around c0-ff), etc.

>> Initially I stumbled upon the fact that powermac provides
>> such acces. Then I discovered that it provides configuration for
>> memory windows
>> and other parts.
>
> There are no memory windows in there afaik.  The main use is the HT
> link config stuff, which in a few places needs special handling to
> work :-/

Register 80 (0xf8070200) is an HT1 address mask, which configures
which memory belongs to HT bridge & devices (contrary to AGP/PCIe)

>> (I needed it for my hack to add AGP bridge on U3 in Linux,
>> as I didn't want to change firmware).
>
> Can you tell more about this?

I have a Mapple-D board (aka IBM PPC970FX ev. kit). It has an AGP slot,
which I would like to use. The slot is unsupported in PIBS (the original
firmware) (it doesn't enumerate devices behind the slot, it doesn't setup it,
it doesn't put u3-agp in device tree).

I didn't want to hack the PIBS (at least while it's possible and for initial
debug of the hardware setup),
so I had to develop a patch for Linux kernel which would:
1) configure hardware (memory windows, IRQs, etc).
2) tell Linux about the AGP bridge.

Second part is more or less complete (with some hacks though). Basically
I register an u3-agp slot if there is no u3-agp and u4-pcie devices. This still
needs an improvement before I'd submit this code anywhere.

I've mostly finished the HW setup (I can see devices behind the bridge).
The radeon plugged into AGP slot starts to warm up, but it fails somewhere.
The thing I still miss (or at least one of them) is the IRQ routing. Ben gave me
some possible pointers, so I'll have to check if he is right or not.
Unfortunately
I still hand't time to accomplish this. I can show my (possibly a bit
ugly) code,
but it still doesn't fully work.

>>>> +static int u3_ht_root_read_config(struct pci_controller *hose, u8
>>>> offset,
>>>> +				  int len, u32 *val)
>>>> +{
>>>> +	volatile void __iomem *addr;
>>>> +
>>>> +	addr = hose->cfg_addr;
>>>> +	addr += ((offset & ~3) << 2) + (4 - len - (offset & 3));
>>>
>>> This will only work for len 1,2,4 with offset a multiple of len,
>>> is that
>>> guaranteed here?
>>
>> I have to check. I think there is no guarantee, but standard accessors
>> are created exactly for 1, 2 and 4-byte access. And IIRC according
>> to PCI specs, offset should be len-aligned.
>
> If other places that need it do not do an alignment check, you don't
> need
> to either I suppose.  But could you make the switch for length have
> explicit 1,2,4 and default error?

Yes, of course. I wanted to overcome this, but of course I can split it
back to the switch.

BTW: the corresponding PowerMac code has a bug: try reading
VID/DID as a byte. word and dword and compare the outcome
with the output from typical PCI device.

>>>> +static int u3_ht_root_write_config(struct pci_controller *hose, u8
>>>> offset,
>>>> +				  int len, u32 val)
>>>> +{
>>>> +	volatile void __iomem *addr;
>>>> +
>>>> +	addr = hose->cfg_addr + ((offset & ~3) << 2) + (4 - len - (offset
>>>> & 3));
>>>> +
>>>> +	if (offset >= PCI_BASE_ADDRESS_0 && offset < PCI_CAPABILITY_LIST)
>>>> +		return PCIBIOS_SUCCESSFUL;
>>>
>>> This is a workaround for something; at the very least it needs a
>>> comment,
>>> but probably it shouldn't be here at all.
>>
>> I'll add a comment. Basically these registers are unimplemented on
>> u3/u4
>> bridges and at least some kinds of access to them cause system freeze.
>> I'll try to check what triggers what this night.
>
> I don't think the hardware freezes, but probably Linux isn't happy
> when it
> tries to write to the non-existent BARs.  Or something like that.

IIRC I've had problems even when reading. Will run an experiment tomorrow.

>>>>  static int u3_ht_read_config(struct pci_bus *bus, unsigned int
>>>> devfn,
>>>>  			     int offset, int len, u32 *val)
>>>>  {
>>>> @@ -217,6 +265,9 @@ static int u3_ht_read_config(struct pci_bus
>>>> *bus, unsigned int devfn,
>>>>  	if (hose == NULL)
>>>>  		return PCIBIOS_DEVICE_NOT_FOUND;
>>>>
>>>> +	if (bus->number == hose->first_busno && devfn == PCI_DEVFN(0, 0))
>>>> +		return u3_ht_root_read_config(hose, offset, len, val);
>>>> +
>>>>  	if (offset > 0xff)
>>>>  		return PCIBIOS_BAD_REGISTER_NUMBER;
>>>
>>> u3_ht_root_read_config() can get an offset out of range this way.
>>
>> Yes, as offsets for this host bridge can be larger then 0xff. U3/U4
>> have
>> some HT configuration there, and I didn't want to impose a limit
>> there.
>> If you are against this, I can change the order of calls.
>
> The HT config is at (PCI) offset 0x40.  There are no implemented
> registers
> at PCI offset >= 0x100.  That corresponds to f8070000..f80703ff, there's
> one 4-byte PCI reg per 16 byte U3 bus address.

Hmmm. I saw the "HT Performance Monitor" part of the U4 spec, but
didn't notice that all regs are marked as "not implemented".

>>>>  	hose->cfg_data = ioremap(0xf2000000, 0x02000000);
>>>> +	hose->cfg_addr = ioremap(0xf8070000, 0x1000);
>>>
>>> Eww.  You could just make a global instead of abusing existing
>>> fields,
>>> there can be only one CPC9x5 in a system anyway.
>>
>> This was a copy-paste of corresponding PowerMac code. Do you really
>> want
>> for me to change this to global variable?
>
> It could use a comment at least.  The addresses aren't "cfg_data" and
> "cfg_addr" at all.

Fine, I'll add a comment here.

>> BTW: I see a lot of code duplication between PowerMac and Maple pci.c
>> files. Would you like a patch that merges those files to something
>> like arch/powerpc/sysdev/u3-pci.c? I can try merging them...
>
> That would be most excellent!  I hope there isn't much pmac special-
> casing
> needed for that.

Still have to check. Please don't expect a patch soon.

-- 
With best wishes
Dmitry

^ permalink raw reply

* Re: [BUG?]3.0-rc4+ftrace+kprobe: set kprobe at instruction 'stwu' lead to system crash/freeze
From: Yong Zhang @ 2011-07-04  2:23 UTC (permalink / raw)
  To: tiejun.chen
  Cc: Jim Keniston, linux-kernel, Steven Rostedt, paulus,
	yrl.pp-manager.tt, Masami Hiramatsu, linuxppc-dev
In-Reply-To: <4E0D9B5E.3010901@windriver.com>

On Fri, Jul 1, 2011 at 6:03 PM, tiejun.chen <tiejun.chen@windriver.com> wro=
te:
>> root@unknown:/root> insmod kprobe_example.ko func=3Dshow_interrupts
>> Planted kprobe at c009be18
>> root@unknown:/root> cat /proc/interrupts
>> pre_handler: p->addr =3D 0xc009be18, nip =3D 0xc009be18, msr =3D 0x29000
>> post_handler: p->addr =3D 0xc009be18, msr =3D 0x29000,boostable =3D 1
>> Oops: Exception in kernel mode, sig: 11 [#1]
>> PREEMPT MPC8536 DS
>> Modules linked in: kprobe_example
>> NIP: df159e74 LR: c0106f40 CTR: c009be18
>> REGS: df159d90 TRAP: 0700 =C2=A0 Not tainted =C2=A0(3.0.0-rc4-00001-ge8f=
fcca-dirty)
>> MSR: 00029000 <EE,ME,CE> =C2=A0CR: 20202688 =C2=A0XER: 00000000
>> TASK =3D dfaa5340[613] 'cat' THREAD: df158000
>> GPR00: fffff000 df159e40 dfaa5340 df024a00 df159e78 00000000 df159f20 00=
000001
>> GPR08: c10060d0 c009be18 00029000 df159e70 00000000 1001ca74 1ffb5f00 10=
0a01cc
>> GPR16: 00000000 00000000 00000000 00000000 df024a28 df159f20 00000000 df=
bff080
>> GPR24: 10016000 00001000 df159f20 df159e78 dfbff080 df159e78 df024a00 df=
159e70
>> NIP [df159e74] 0xdf159e74
>> LR [c0106f40] seq_read+0x2a4/0x568
>> Call Trace:
>> [df159e40] [00029000] 0x29000 (unreliable)
>> [df159e74] [00000000] =C2=A0 (null)
>> Instruction dump:
>> XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>> XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>> ---[ end trace 60026bfc1fe79aed ]---
>> Segmentation fault
>
> Maybe I can understand this problem.
>
> When we kprobe these operations such as store-and-update-word for SP(r1),
>
> stwu r1, -A(r1)
>
> The program exception is triggered then PPC always allocate an exception =
frame
> as shown as the follows:
>
> old r1 --------
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 ...
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 nip
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 gpr[2]~gpr[31]
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 gpr[1] <--------- old r1 is stored here.
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 gpr[0]
> =C2=A0 =C2=A0 =C2=A0 -------- <-- pr_regs @offset 16 bytes
> =C2=A0 =C2=A0 =C2=A0 padding
> =C2=A0 =C2=A0 =C2=A0 STACK_FRAME_REGS_MARKER
> =C2=A0 =C2=A0 =C2=A0 LR
> =C2=A0 =C2=A0 =C2=A0 back chain
> new r1 --------
>
> Here emulate_step() is called to emulate 'stwu'. Actually this is equival=
ent to
> 1> update pr_regs->gpr[1] =3D mem(old r1 + (-A))
> 2> 'stw <old r1>, mem<(old r1 + (-A)) >
>
> You should notice the stack based on new r1 would be covered with mem<old=
 r1
> +(-A)>. So after this, the kernel exit from post_krpobe, something would =
be
> broken. This should depend on sizeof(-A).
>
> For kprobe show_interrupts, you can see pregs->nip is re-written violentl=
y so
> kernel issued.

Yeah, my debug also show this, so this is the root cause.
Thanks for your explanation.

>
> But sometimes we may only re-write some violate registers the kernel stil=
l
> alive. And so this is just why the kernel works well for some kprobed poi=
nt
> after you change some kernel options/toolchains.
>
> If I'm correct its difficult to kprobe these stwu sp operation since the
> sizeof(-A) is undermined for the kernel. So we have to implement in-depen=
d
> interrupt stack like PPC64.

Hmmm, a dedicated exception stack will smooth the concern IMHO,
Ben, Kuma?

Thanks,
Yong


--=20
Only stand for myself

^ permalink raw reply

* Re: [PATCH 3/7] fault-injection: notifier error injection
From: Akinobu Mita @ 2011-07-04  5:35 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Greg Kroah-Hartman, linux-kernel, Rafael J. Wysocki, linux-mm,
	Paul Mackerras, akpm, linuxppc-dev, linux-pm
In-Reply-To: <20110703171626.GG21127@elf.ucw.cz>

2011/7/4 Pavel Machek <pavel@ucw.cz>:
>
>> + =A0 =A0 for (action =3D enb->actions; action->name; action++) {
>> + =A0 =A0 =A0 =A0 =A0 =A0 struct dentry *file =3D debugfs_create_int(act=
ion->name, mode,
>> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =
=A0 =A0 =A0 =A0 =A0 enb->dir, &action->error);
>> +
>> + =A0 =A0 =A0 =A0 =A0 =A0 if (!file) {
>> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 debugfs_remove_recursive(enb->=
dir);
>> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 return -ENOMEM;
>> + =A0 =A0 =A0 =A0 =A0 =A0 }
>
> Few lines how this work would be welcome...?

OK, I'll add a comment like below.

/*
 * Create debugfs r/w file containing action->error. If notifier call
 * chain is called with action->val, it will fail with the error code
 */

^ permalink raw reply

* hvc_console change results in corrupt oops output
From: Anton Blanchard @ 2011-07-04 10:57 UTC (permalink / raw)
  To: benh, brueckner, borntraeger, linuxppc-dev


Hi,

We've been struggling to debug a hang on a large ppc64 box. Every time
we collect oops output there are pieces of the oops output missing and
in some cases entire CPUs are missing.

Eventually I realised the hvc_console driver is dropping characters.
The commit that caused this is:


commit 3feebbb5492e9e463467cefb633e23a3dfcec132
Author: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Date:   Mon Oct 13 23:12:50 2008 +0000

    hvc_console: Fix loop if put_char() returns 0
    
    If put_char() routine of a hvc console backend returns 0, then the
    hvc console starts looping in the following scenarios:
    
    1. hvc_console_print()
        If put_char() returns 0 then the while loop may loop forever.
        I have added the missing check for 0 to throw away console
        messages.


The hypervisor gives us a busy return, so we could retry a number of
times instead of dropping it on the floor. We'd need to do it in the
hvc_console driver - the tty drivers share the same backend
functions so we can't hide it in the pseries put_chars function.

Anton

^ permalink raw reply

* Re: [RFC PATCH 17/17] KVM: PPC: Add an ioctl for userspace to select which platform to emulate
From: Alexander Graf @ 2011-07-04 10:59 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Scott Wood, linuxppc-dev@ozlabs.org, Paul Mackerras,
	kvm-ppc@vger.kernel.org, kvm@vger.kernel.org
In-Reply-To: <4E103270.4040506@redhat.com>


On 03.07.2011, at 11:12, Avi Kivity wrote:

> On 07/03/2011 12:09 PM, Alexander Graf wrote:
>> >
>> >  Right.  The idea is that if KVM_FLAG_BLAH implies a field =
kvm_struct::blah, then either both are present in the headers, or none  =
of them.
>>=20
>> Yup, makes sense. I like the idea :). Gets rid of all the useless =
paddings and reserved fields. We could even truncate the structs that =
already have paddings in them if we only copy min(sizeof(real_struct), =
ioctl_passed_size); (which we should anyways).
>>=20
>=20
> No, we can't change anything that is already out.  If will change the =
ioctl numbers, so building against new headers but running against an =
old kernel will fail.

So this will only be enabled for completely new ioctls?

>=20
>> How long until we get a patch set? :)
>=20
> Well, I'd really like to get the qemu memory API out first.

Can we add the ioctl without padding now then and add your awesome =
extensibility stuff to it later on? We only have to make sure we don't =
actually release the intermediate steps as an upstream kernel then, =
right?


Alex

^ permalink raw reply

* Re: [RFC PATCH 17/17] KVM: PPC: Add an ioctl for userspace to select which platform to emulate
From: Avi Kivity @ 2011-07-04 11:22 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Scott Wood, linuxppc-dev@ozlabs.org, Paul Mackerras,
	kvm-ppc@vger.kernel.org, kvm@vger.kernel.org
In-Reply-To: <40F62A22-6A47-4414-8FF9-0534568353DA@suse.de>

On 07/04/2011 01:59 PM, Alexander Graf wrote:
> On 03.07.2011, at 11:12, Avi Kivity wrote:
>
> >  On 07/03/2011 12:09 PM, Alexander Graf wrote:
> >>  >
> >>  >   Right.  The idea is that if KVM_FLAG_BLAH implies a field kvm_struct::blah, then either both are present in the headers, or none  of them.
> >>
> >>  Yup, makes sense. I like the idea :). Gets rid of all the useless paddings and reserved fields. We could even truncate the structs that already have paddings in them if we only copy min(sizeof(real_struct), ioctl_passed_size); (which we should anyways).
> >>
> >
> >  No, we can't change anything that is already out.  If will change the ioctl numbers, so building against new headers but running against an old kernel will fail.
>
> So this will only be enabled for completely new ioctls?

Yes, unfortunately.

> >
> >>  How long until we get a patch set? :)
> >
> >  Well, I'd really like to get the qemu memory API out first.
>
> Can we add the ioctl without padding now then and add your awesome extensibility stuff to it later on?

Yes.

> We only have to make sure we don't actually release the intermediate steps as an upstream kernel then, right?

What intermediate steps?  We can't add fields to the structure before we 
get the extensibility infrastructure, but that's all.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply

* Re: [RFC PATCH 17/17] KVM: PPC: Add an ioctl for userspace to select which platform to emulate
From: Alexander Graf @ 2011-07-04 11:36 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Scott Wood, linuxppc-dev@ozlabs.org, Paul Mackerras,
	kvm-ppc@vger.kernel.org, kvm@vger.kernel.org
In-Reply-To: <4E11A293.2010505@redhat.com>


On 04.07.2011, at 13:22, Avi Kivity wrote:

> On 07/04/2011 01:59 PM, Alexander Graf wrote:
>> On 03.07.2011, at 11:12, Avi Kivity wrote:
>>=20
>> >  On 07/03/2011 12:09 PM, Alexander Graf wrote:
>> >>  >
>> >>  >   Right.  The idea is that if KVM_FLAG_BLAH implies a field =
kvm_struct::blah, then either both are present in the headers, or none  =
of them.
>> >>
>> >>  Yup, makes sense. I like the idea :). Gets rid of all the useless =
paddings and reserved fields. We could even truncate the structs that =
already have paddings in them if we only copy min(sizeof(real_struct), =
ioctl_passed_size); (which we should anyways).
>> >>
>> >
>> >  No, we can't change anything that is already out.  If will change =
the ioctl numbers, so building against new headers but running against =
an old kernel will fail.
>>=20
>> So this will only be enabled for completely new ioctls?
>=20
> Yes, unfortunately.
>=20
>> >
>> >>  How long until we get a patch set? :)
>> >
>> >  Well, I'd really like to get the qemu memory API out first.
>>=20
>> Can we add the ioctl without padding now then and add your awesome =
extensibility stuff to it later on?
>=20
> Yes.
>=20
>> We only have to make sure we don't actually release the intermediate =
steps as an upstream kernel then, right?
>=20
> What intermediate steps?  We can't add fields to the structure before =
we get the extensibility infrastructure, but that's all.

If we add it now without extensibility code, we will have a kernel that =
only knows the ioctl with the size as it is now. If we later add another =
field, the ioctl size changes which would render new user space running =
on that particular kernel to fail, because it doesn't have the "just =
truncate the ioctl param to the size" logic yet.


Alex

^ permalink raw reply

* Re: [RFC PATCH 17/17] KVM: PPC: Add an ioctl for userspace to select which platform to emulate
From: Avi Kivity @ 2011-07-04 11:37 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Scott Wood, linuxppc-dev@ozlabs.org, Paul Mackerras,
	kvm-ppc@vger.kernel.org, kvm@vger.kernel.org
In-Reply-To: <BC4ECEFA-15A4-49A9-AE53-9635F3F6F996@suse.de>

On 07/04/2011 02:36 PM, Alexander Graf wrote:
> >
> >  What intermediate steps?  We can't add fields to the structure before we get the extensibility infrastructure, but that's all.
>
> If we add it now without extensibility code, we will have a kernel that only knows the ioctl with the size as it is now. If we later add another field, the ioctl size changes which would render new user space running on that particular kernel to fail, because it doesn't have the "just truncate the ioctl param to the size" logic yet.

You're right.  So let's add the padding now, and only use the great new 
stuff after it's actually in.

Working ugly is better than broken pretty.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply

* Re: [RFC PATCH 17/17] KVM: PPC: Add an ioctl for userspace to select which platform to emulate
From: Alexander Graf @ 2011-07-04 11:41 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Scott Wood, linuxppc-dev@ozlabs.org, Paul Mackerras,
	kvm-ppc@vger.kernel.org, kvm@vger.kernel.org
In-Reply-To: <4E11A611.3010904@redhat.com>


On 04.07.2011, at 13:37, Avi Kivity wrote:

> On 07/04/2011 02:36 PM, Alexander Graf wrote:
>> >
>> >  What intermediate steps?  We can't add fields to the structure =
before we get the extensibility infrastructure, but that's all.
>>=20
>> If we add it now without extensibility code, we will have a kernel =
that only knows the ioctl with the size as it is now. If we later add =
another field, the ioctl size changes which would render new user space =
running on that particular kernel to fail, because it doesn't have the =
"just truncate the ioctl param to the size" logic yet.
>=20
> You're right.  So let's add the padding now, and only use the great =
new stuff after it's actually in.
>=20
> Working ugly is better than broken pretty.

Hrm. Too bad :).


Alex

^ permalink raw reply

* Re: [PATCH 10/17] KVM: PPC: Add support for Book3S processors in hypervisor mode
From: Paul Mackerras @ 2011-07-04 11:51 UTC (permalink / raw)
  To: Dave Hansen; +Cc: linuxppc-dev, kvm-ppc, kvm, Alexander Graf
In-Reply-To: <1309545462.16582.12.camel@nimitz>

On Fri, Jul 01, 2011 at 11:37:42AM -0700, Dave Hansen wrote:
> On Wed, 2011-06-29 at 20:21 +1000, Paul Mackerras wrote: 
> > +struct kvmppc_pginfo {
> > +	unsigned long pfn;
> > +	atomic_t refcnt;
> > +};
> 
> I only see this refcnt inc'd in one spot and never decremented or read.
> Is the refcnt just the number of hptes we have for this particular page
> at the moment?  

It's redundant at present, because the guest physical memory is all
large pages and we don't hand any pages back until the guest quits.

We're going to have to keep some sort of list of HPTEs for each
guest-physical page and we will probably need this refcnt then.

> > +static unsigned long user_page_size(unsigned long addr)
> > +{
> > +	struct vm_area_struct *vma;
> > +	unsigned long size = PAGE_SIZE;
> > +
> > +	down_read(&current->mm->mmap_sem);
> > +	vma = find_vma(current->mm, addr);
> > +	if (vma)
> > +		size = vma_kernel_pagesize(vma);
> > +	up_read(&current->mm->mmap_sem);
> > +	return size;
> > +}
> 
> That one looks pretty arch-independent and like it could use some
> consolidation with: virt/kvm/kvm_main.c::kvm_host_page_size()

Actually, it goes away again in patch 14. :)

> > +	/* VRMA can't be > 1TB */
> > +	if (npages > 1ul << (40 - kvm->arch.ram_porder))
> > +		npages = 1ul << (40 - kvm->arch.ram_porder);
> 
> Is that because it can only be a single segment?  Does that mean that we
> can't ever have guests larger than 1TB?  Or just that they have to live
> with 1TB until they get their own page tables up?

It can only be a single segment, and that's because that's how the
hardware works.

It means that the when running in real mode the guest can access at
most the first 1TB of physical memory.  Real mode is used early in the
boot process and also in the early stages of exception/interrupt
processing.  (Under pHyp we typically only get to access the first
128MB or 256MB of memory in real mode.)

> > +	/* Can't use more than 1 HPTE per HPTEG */
> > +	if (npages > HPT_NPTEG)
> > +		npages = HPT_NPTEG;
> > +
> > +	for (i = 0; i < npages; ++i) {
> > +		pfn = pginfo[i].pfn;
> > +		/* can't use hpt_hash since va > 64 bits */
> > +		hash = (i ^ (VRMA_VSID ^ (VRMA_VSID << 25))) & HPT_HASH_MASK;
> 
> Is that because 'i' could potentially have a very large pfn?  Nish
> thought it might have something to do with the hpte entries being larger
> than 64-bits themselves with the vsid included, but we got thoroughly
> confused. :)

It's because VRMA_VSID (which is set by hardware, we don't control it)
is larger than 2^24.  The hpt_hash() function takes a 64-bit VA
composed of a 24-bit VSID and 40 bits of segment offset (for addresses
in 1TB segments).

Paul.

^ permalink raw reply

* Re: [PATCH] powerpc: enable access to HT Host-Bridge on Maple
From: Dmitry Eremin-Solenikov @ 2011-07-04 12:46 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: linuxppc-dev
In-Reply-To: <288B47E0-3869-486D-B753-62EFA1EEF66D@kernel.crashing.org>

Hello,

On 03.07.2011 01:50, Segher Boessenkool wrote:
>>>> +static int u3_ht_root_write_config(struct pci_controller *hose, u8
>>>> offset,
>>>> + int len, u32 val)
>>>> +{
>>>> + volatile void __iomem *addr;
>>>> +
>>>> + addr = hose->cfg_addr + ((offset & ~3) << 2) + (4 - len - (offset
>>>> & 3));
>>>> +
>>>> + if (offset >= PCI_BASE_ADDRESS_0 && offset < PCI_CAPABILITY_LIST)
>>>> + return PCIBIOS_SUCCESSFUL;
>>>
>>> This is a workaround for something; at the very least it needs a
>>> comment,
>>> but probably it shouldn't be here at all.
>>
>> I'll add a comment. Basically these registers are unimplemented on u3/u4
>> bridges and at least some kinds of access to them cause system freeze.
>> I'll try to check what triggers what this night.
>
> I don't think the hardware freezes, but probably Linux isn't happy when it
> tries to write to the non-existent BARs. Or something like that.

I've run several experiments. At least writing to the ROM_ADDRESS (0x30) 
causes some kind of strange reboot (I see 'Error: Magic number in 
message area NVRAM is not valid.') errors on the service console and 
Linux consoles are silent. Writing to other BARs seem to cause no direct 
effect (reg remains =0), but causes very strange behaviour during boot. 
All PCI access cycles seem to return 0, strange DRAM ECC error pops up, 
etc. I'd prefer to leave these register writes disabled, even if it's 
the problem of some hardware revision (?).


-- 
With best wishes
Dmitry

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox