* Re: [PATCH] powerpc/64s: Fix THP PMD collapse serialisation
From: Nicholas Piggin @ 2019-06-03 7:33 UTC (permalink / raw)
To: Aneesh Kumar K.V, linuxppc-dev
In-Reply-To: <bb7eabc7-8dbb-14e9-f797-6dfd339bb0ba@linux.ibm.com>
Aneesh Kumar K.V's on June 3, 2019 4:43 pm:
> On 6/3/19 11:35 AM, Nicholas Piggin wrote:
>> Commit 1b2443a547f9 ("powerpc/book3s64: Avoid multiple endian conversion
>> in pte helpers") changed the actual bitwise tests in pte_access_permitted
>> by using pte_write() and pte_present() helpers rather than raw bitwise
>> testing _PAGE_WRITE and _PAGE_PRESENT bits.
>>
>> The pte_present change now returns true for ptes which are !_PAGE_PRESENT
>> and _PAGE_INVALID, which is the combination used by pmdp_invalidate to
>> synchronize access from lock-free lookups. pte_access_permitted is used by
>> pmd_access_permitted, so allowing GUP lock free access to proceed with
>> such PTEs breaks this synchronisation.
>>
>> This bug has been observed on HPT host, with random crashes and corruption
>> in guests, usually together with bad PMD messages in the host.
>>
>> Fix this by adding an explicit check in pmd_access_permitted, and
>> documenting the condition explicitly.
>>
>> The pte_write() change should be okay, and would prevent GUP from falling
>> back to the slow path when encountering savedwrite ptes, which matches
>> what x86 (that does not implement savedwrite) does.
>>
>> Fixes: 1b2443a547f9 ("powerpc/book3s64: Avoid multiple endian conversion in pte helpers")
>> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>> Cc: Christophe Leroy <christophe.leroy@c-s.fr>
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>> ---
>> arch/powerpc/include/asm/book3s/64/pgtable.h | 19 ++++++++++++++++++-
>> arch/powerpc/mm/book3s64/pgtable.c | 3 +++
>> 2 files changed, 21 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
>> index 7dede2e34b70..aaa72aa1b765 100644
>> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
>> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
>> @@ -1092,7 +1092,24 @@ static inline int pmd_protnone(pmd_t pmd)
>> #define pmd_access_permitted pmd_access_permitted
>> static inline bool pmd_access_permitted(pmd_t pmd, bool write)
>> {
>> - return pte_access_permitted(pmd_pte(pmd), write);
>> + pte_t pte = pmd_pte(pmd);
>> + unsigned long pteval = pte_val(pte);
>> +
>> + /*
>> + * pmdp_invalidate sets this combination (that is not caught by
>> + * !pte_present() check in pte_access_permitted), to prevent
>> + * lock-free lookups, as part of the serialize_against_pte_lookup()
>> + * synchronisation.
>> + *
>> + * This check inadvertently catches the case where the PTE's hardware
>> + * PRESENT bit is cleared while TLB is flushed, to work around
>> + * hardware TLB issues. This is suboptimal, but should not be hit
>> + * frequently and should be harmless.
>> + */
>> + if ((pteval & _PAGE_INVALID) && !(pteval & _PAGE_PRESENT))
>> + return false;
>> +
>> + return pte_access_permitted(pte, write);
>> }
>>
>
>
> you need to do similar for other lockless page table walk like
> find_linux_pte
Yeah good point as discussed offline. I was going to make that a
separate patch, it would have a different Fixes:. I have not been
able to trigger any bugs caused by it, whereas the bug caused by
this patch hits reliably in about 10 minutes or less.
Maybe the race window is just a lot smaller or the function is
less frequently used?
Thanks,
Nick
^ permalink raw reply
* Re: [PATCH 03/16] mm: simplify gup_fast_permitted
From: Christoph Hellwig @ 2019-06-03 7:41 UTC (permalink / raw)
To: Linus Torvalds
Cc: Rich Felker, Yoshinori Sato, Linux-sh list, James Hogan,
the arch/x86 maintainers, Khalid Aziz, Nicholas Piggin,
David S. Miller, Linux-MM, Paul Burton, Paul Mackerras,
Andrey Konovalov, sparclinux, linux-mips, linuxppc-dev,
Christoph Hellwig, Linux List Kernel Mailing
In-Reply-To: <CAHk-=whusWKhS=SYoC9f9HjVmPvR5uP51Mq=ZCtktqTBT2qiBw@mail.gmail.com>
On Sat, Jun 01, 2019 at 09:14:17AM -0700, Linus Torvalds wrote:
> On Sat, Jun 1, 2019 at 12:50 AM Christoph Hellwig <hch@lst.de> wrote:
> >
> > Pass in the already calculated end value instead of recomputing it, and
> > leave the end > start check in the callers instead of duplicating them
> > in the arch code.
>
> Good cleanup, except it's wrong.
>
> > - if (nr_pages <= 0)
> > + if (end < start)
> > return 0;
>
> You moved the overflow test to generic code - good.
>
> You removed the sign and zero test on nr_pages - bad.
I only removed a duplicate of it. The full (old) code in
get_user_pages_fast() looks like this:
if (nr_pages <= 0)
return 0;
if (unlikely(!access_ok((void __user *)start, len)))
return -EFAULT;
if (gup_fast_permitted(start, nr_pages)) {
^ permalink raw reply
* Re: [RFC PATCH] powerpc/64s: __find_linux_pte synchronization vs pmdp_invalidate
From: Christophe Leroy @ 2019-06-03 7:41 UTC (permalink / raw)
To: Nicholas Piggin, linuxppc-dev; +Cc: Aneesh Kumar K . V
In-Reply-To: <20190603064408.14735-1-npiggin@gmail.com>
Le 03/06/2019 à 08:44, Nicholas Piggin a écrit :
> The pmd_none check does not catch hugepage collapse, nor does the
> pmd_present check in pmd_trans_huge, because hugepage collapse sets
> !_PAGE_PRESENT && _PAGE_INVALID (which results in !pmd_none and
> pmd_present).
>
> Aneesh noticed we might need this check as well.
>
> ---
> arch/powerpc/mm/pgtable.c | 15 +++++++++++----
> 1 file changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
> index db4a6253df92..7a702d21400a 100644
> --- a/arch/powerpc/mm/pgtable.c
> +++ b/arch/powerpc/mm/pgtable.c
> @@ -372,13 +372,20 @@ pte_t *__find_linux_pte(pgd_t *pgdir, unsigned long ea,
> pdshift = PMD_SHIFT;
> pmdp = pmd_offset(&pud, ea);
> pmd = READ_ONCE(*pmdp);
> - /*
> - * A hugepage collapse is captured by pmd_none, because
> - * it mark the pmd none and do a hpte invalidate.
> - */
> +
> if (pmd_none(pmd))
> return NULL;
>
> +#ifdef CONFIG_PPC_BOOK3S_64
I can't see anything that would build fail on other subarches. Wouldn't
be better to use
if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) && (pmd_val(pmd) &
(_PAGE_PRESENT|_PAGE_INVALID) == _PAGE_INVALID))
> + if (pmd_val(pmd) & (_PAGE_PRESENT|_PAGE_INVALID) == _PAGE_INVALID) {
Maybe using pmd_raw() instead as in all the book3s64 helpers ?
Christophe
> + /*
> + * A hugepage collapse is captured by this condition, see
> + * pmdp_invalidate.
> + */
> + return NULL;
> + }
> +#endif
> +
> if (pmd_trans_huge(pmd) || pmd_devmap(pmd)) {
> if (is_thp)
> *is_thp = true;
>
^ permalink raw reply
* Re: [PATCH 08/16] sparc64: add the missing pgd_page definition
From: Christoph Hellwig @ 2019-06-03 7:44 UTC (permalink / raw)
To: Linus Torvalds
Cc: Rich Felker, Yoshinori Sato, Linux-sh list, James Hogan,
the arch/x86 maintainers, Khalid Aziz, Nicholas Piggin,
David S. Miller, Linux-MM, Paul Burton, Paul Mackerras,
Andrey Konovalov, sparclinux, linux-mips, linuxppc-dev,
Christoph Hellwig, Linux List Kernel Mailing
In-Reply-To: <CAHk-=wj9w5NxTcJsqpvYUiL3OBOH-J3=4-vXcc3GaG_U8H-gJw@mail.gmail.com>
On Sat, Jun 01, 2019 at 09:28:54AM -0700, Linus Torvalds wrote:
> Both sparc64 and sh had this pattern, but now that I look at it more
> closely, I think your version is wrong, or at least nonoptimal.
I bet it is. Then again these symbols are just required for the code
to compile, as neither sparc64 nor sh actually use the particular
variant of huge pages we need it for. Then again even actually dead
code should better be not too buggy if it isn't just a stub.
> So I thgink this would be better done with
>
> #define pgd_page(pgd) pfn_to_page(pgd_pfn(pgd))
>
> where that "pgd_pfn()" would need to be a new (but likely very
> trivial) function. That's what we do for pte_pfn().
>
> IOW, it would likely end up something like
>
> #define pgd_to_pfn(pgd) (pgd_val(x) >> PFN_PGD_SHIFT)
True. I guess it would be best if we could get most if not all
architectures to use common versions of these macros so that we have
the issue settled once.
^ permalink raw reply
* Re: [PATCH 10/16] sparc64: use the generic get_user_pages_fast code
From: Christoph Hellwig @ 2019-06-03 7:44 UTC (permalink / raw)
To: Hillf Danton
Cc: x86, Rich Felker, Yoshinori Sato, linux-sh, James Hogan,
linuxppc-dev, Khalid Aziz, Nicholas Piggin, David S. Miller,
linux-mm, Paul Burton, Paul Mackerras, Andrey Konovalov,
sparclinux, linux-mips, Linus Torvalds, Christoph Hellwig,
linux-kernel
In-Reply-To: <20190601074959.14036-11-hch@lst.de>
On Sun, Jun 02, 2019 at 03:39:48PM +0800, Hillf Danton wrote:
>
> Hi Christoph
>
> On Sat, 1 Jun 2019 09:49:53 +0200 Christoph Hellwig wrote:
> >
> > diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch/sparc/include/asm/pgtable_64.h
> > index a93eca29e85a..2301ab5250e4 100644
> > --- a/arch/sparc/include/asm/pgtable_64.h
> > +++ b/arch/sparc/include/asm/pgtable_64.h
> > @@ -1098,6 +1098,24 @@ static inline unsigned long untagged_addr(unsigned long start)
> > }
> > #define untagged_addr untagged_addr
> >
> > +static inline bool pte_access_permitted(pte_t pte, bool write)
> > +{
> > + u64 prot;
> > +
> > + if (tlb_type == hypervisor) {
> > + prot = _PAGE_PRESENT_4V | _PAGE_P_4V;
> > + if (prot)
>
> Feel free to correct me if I misread or miss anything.
> It looks like a typo: s/prot/write/, as checking _PAGE_PRESENT_4V and
> _PAGE_P_4V makes prot always have _PAGE_WRITE_4V set, regardless of write.
True, the if prot should be if write.
^ permalink raw reply
* Re: [PATCH BACKPORTv2 4.19, 5.0, 5.1] crypto: vmx - ghash: do nosimd fallback manually
From: Greg KH @ 2019-06-03 7:54 UTC (permalink / raw)
To: Daniel Axtens; +Cc: linuxppc-dev, Herbert Xu, stable
In-Reply-To: <20190603020848.9598-1-dja@axtens.net>
On Mon, Jun 03, 2019 at 12:08:48PM +1000, Daniel Axtens wrote:
> commit 357d065a44cdd77ed5ff35155a989f2a763e96ef upstream.
> [backported: the VMX driver did not use crypto_simd_usable() until
> after 5.1]
>
> VMX ghash was using a fallback that did not support interleaving simd
> and nosimd operations, leading to failures in the extended test suite.
>
> If I understood correctly, Eric's suggestion was to use the same
> data format that the generic code uses, allowing us to call into it
> with the same contexts. I wasn't able to get that to work - I think
> there's a very different key structure and data layout being used.
>
> So instead steal the arm64 approach and perform the fallback
> operations directly if required.
>
> Fixes: cc333cd68dfa ("crypto: vmx - Adding GHASH routines for VMX module")
> Cc: stable@vger.kernel.org # v4.1+
> Reported-by: Eric Biggers <ebiggers@google.com>
> Signed-off-by: Daniel Axtens <dja@axtens.net>
> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> Tested-by: Michael Ellerman <mpe@ellerman.id.au>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
> Signed-off-by: Daniel Axtens <dja@axtens.net>
> ---
>
> v2: do stable backport form correctly.
Thanks for all of these, now queued up.
greg k-h
^ permalink raw reply
* [PATCH v3] powerpc: fix kexec failure on book3s/32
From: Christophe Leroy @ 2019-06-03 8:20 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
Aaro Koskinen
Cc: linuxppc-dev, linux-kernel
In the old days, _PAGE_EXEC didn't exist on 6xx aka book3s/32.
Therefore, allthough __mapin_ram_chunk() was already mapping kernel
text with PAGE_KERNEL_TEXT and the rest with PAGE_KERNEL, the entire
memory was executable. Part of the memory (first 512kbytes) was
mapped with BATs instead of page table, but it was also entirely
mapped as executable.
In commit 385e89d5b20f ("powerpc/mm: add exec protection on
powerpc 603"), we started adding exec protection to some 6xx, namely
the 603, for pages mapped via pagetables.
Then, in commit 63b2bc619565 ("powerpc/mm/32s: Use BATs for
STRICT_KERNEL_RWX"), the exec protection was extended to BAT mapped
memory, so that really only the kernel text could be executed.
The problem here is that kexec is based on copying some code into
upper part of memory then executing it from there in order to install
a fresh new kernel at its definitive location.
However, the code is position independant and first part of it is
just there to deactivate the MMU and jump to the second part. So it
is possible to run this first part inplace instead of running the
copy. Once the MMU is off, there is no protection anymore and the
second part of the code will just run as before.
Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Fixes: 63b2bc619565 ("powerpc/mm/32s: Use BATs for STRICT_KERNEL_RWX")
Cc: stable@vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
Aaro, can you test this patch ? Thanks.
arch/powerpc/include/asm/kexec.h | 3 +++
arch/powerpc/kernel/machine_kexec_32.c | 4 +++-
2 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index 4a585cba1787..c68476818753 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -94,6 +94,9 @@ static inline bool kdump_in_progress(void)
return crashing_cpu >= 0;
}
+void relocate_new_kernel(unsigned long indirection_page, unsigned long reboot_code_buffer,
+ unsigned long start_address) __noreturn;
+
#ifdef CONFIG_KEXEC_FILE
extern const struct kexec_file_ops kexec_elf64_ops;
diff --git a/arch/powerpc/kernel/machine_kexec_32.c b/arch/powerpc/kernel/machine_kexec_32.c
index affe5dcce7f4..2b160d68db49 100644
--- a/arch/powerpc/kernel/machine_kexec_32.c
+++ b/arch/powerpc/kernel/machine_kexec_32.c
@@ -30,7 +30,6 @@ typedef void (*relocate_new_kernel_t)(
*/
void default_machine_kexec(struct kimage *image)
{
- extern const unsigned char relocate_new_kernel[];
extern const unsigned int relocate_new_kernel_size;
unsigned long page_list;
unsigned long reboot_code_buffer, reboot_code_buffer_phys;
@@ -58,6 +57,9 @@ void default_machine_kexec(struct kimage *image)
reboot_code_buffer + KEXEC_CONTROL_PAGE_SIZE);
printk(KERN_INFO "Bye!\n");
+ if (!IS_ENABLED(CONFIG_FSL_BOOKE) && !IS_ENABLED(CONFIG_44x))
+ relocate_new_kernel(page_list, reboot_code_buffer_phys, image->start);
+
/* now call it */
rnk = (relocate_new_kernel_t) reboot_code_buffer;
(*rnk)(page_list, reboot_code_buffer_phys, image->start);
--
2.13.3
^ permalink raw reply related
* Re: [PATCH v3 1/3] PCI: Introduce pcibios_ignore_alignment_request
From: Alexey Kardashevskiy @ 2019-06-03 8:35 UTC (permalink / raw)
To: Shawn Anastasio
Cc: Sam Bobroff, linux-pci, Linux Kernel Mailing List, rppt, Oliver,
Bjorn Helgaas, Paul Mackerras, xyjxie, linuxppc-dev
In-Reply-To: <2f4185ac-d19f-6668-7b3e-a300ce3b9e00@ozlabs.ru>
On 03/06/2019 15:02, Alexey Kardashevskiy wrote:
>
>
> On 03/06/2019 12:23, Shawn Anastasio wrote:
>>
>>
>> On 5/30/19 10:56 PM, Alexey Kardashevskiy wrote:
>>>
>>>
>>> On 31/05/2019 08:49, Shawn Anastasio wrote:
>>>> On 5/29/19 10:39 PM, Alexey Kardashevskiy wrote:
>>>>>
>>>>>
>>>>> On 28/05/2019 17:39, Shawn Anastasio wrote:
>>>>>>
>>>>>>
>>>>>> On 5/28/19 1:27 AM, Alexey Kardashevskiy wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 28/05/2019 15:36, Oliver wrote:
>>>>>>>> On Tue, May 28, 2019 at 2:03 PM Shawn Anastasio <shawn@anastas.io>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Introduce a new pcibios function pcibios_ignore_alignment_request
>>>>>>>>> which allows the PCI core to defer to platform-specific code to
>>>>>>>>> determine whether or not to ignore alignment requests for PCI
>>>>>>>>> resources.
>>>>>>>>>
>>>>>>>>> The existing behavior is to simply ignore alignment requests when
>>>>>>>>> PCI_PROBE_ONLY is set. This is behavior is maintained by the
>>>>>>>>> default implementation of pcibios_ignore_alignment_request.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Shawn Anastasio <shawn@anastas.io>
>>>>>>>>> ---
>>>>>>>>> drivers/pci/pci.c | 9 +++++++--
>>>>>>>>> include/linux/pci.h | 1 +
>>>>>>>>> 2 files changed, 8 insertions(+), 2 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>>>>>>>>> index 8abc843b1615..8207a09085d1 100644
>>>>>>>>> --- a/drivers/pci/pci.c
>>>>>>>>> +++ b/drivers/pci/pci.c
>>>>>>>>> @@ -5882,6 +5882,11 @@ resource_size_t __weak
>>>>>>>>> pcibios_default_alignment(void)
>>>>>>>>> return 0;
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> +int __weak pcibios_ignore_alignment_request(void)
>>>>>>>>> +{
>>>>>>>>> + return pci_has_flag(PCI_PROBE_ONLY);
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>> #define RESOURCE_ALIGNMENT_PARAM_SIZE COMMAND_LINE_SIZE
>>>>>>>>> static char
>>>>>>>>> resource_alignment_param[RESOURCE_ALIGNMENT_PARAM_SIZE] = {0};
>>>>>>>>> static DEFINE_SPINLOCK(resource_alignment_lock);
>>>>>>>>> @@ -5906,9 +5911,9 @@ static resource_size_t
>>>>>>>>> pci_specified_resource_alignment(struct pci_dev *dev,
>>>>>>>>> p = resource_alignment_param;
>>>>>>>>> if (!*p && !align)
>>>>>>>>> goto out;
>>>>>>>>> - if (pci_has_flag(PCI_PROBE_ONLY)) {
>>>>>>>>> + if (pcibios_ignore_alignment_request()) {
>>>>>>>>> align = 0;
>>>>>>>>> - pr_info_once("PCI: Ignoring requested alignments
>>>>>>>>> (PCI_PROBE_ONLY)\n");
>>>>>>>>> + pr_info_once("PCI: Ignoring requested
>>>>>>>>> alignments\n");
>>>>>>>>> goto out;
>>>>>>>>> }
>>>>>>>>
>>>>>>>> I think the logic here is questionable to begin with. If the user
>>>>>>>> has
>>>>>>>> explicitly requested re-aligning a resource via the command line
>>>>>>>> then
>>>>>>>> we should probably do it even if PCI_PROBE_ONLY is set. When it
>>>>>>>> breaks
>>>>>>>> they get to keep the pieces.
>>>>>>>>
>>>>>>>> That said, the real issue here is that PCI_PROBE_ONLY probably
>>>>>>>> shouldn't be set under qemu/kvm. Under the other hypervisor
>>>>>>>> (PowerVM)
>>>>>>>> hotplugged devices are configured by firmware before it's passed to
>>>>>>>> the guest and we need to keep the FW assignments otherwise things
>>>>>>>> break. QEMU however doesn't do any BAR assignments and relies on
>>>>>>>> that
>>>>>>>> being handled by the guest. At boot time this is done by SLOF, but
>>>>>>>> Linux only keeps SLOF around until it's extracted the device-tree.
>>>>>>>> Once that's done SLOF gets blown away and the kernel needs to do
>>>>>>>> it's
>>>>>>>> own BAR assignments. I'm guessing there's a hack in there to make it
>>>>>>>> work today, but it's a little surprising that it works at all...
>>>>>>>
>>>>>>>
>>>>>>> The hack is to run a modified qemu-aware "/usr/sbin/rtas_errd" in the
>>>>>>> guest which receives an event from qemu (RAS_EPOW from
>>>>>>> /proc/interrupts), fetches device tree chunks (and as I understand
>>>>>>> it -
>>>>>>> they come with BARs from phyp but without from qemu) and writes
>>>>>>> "1" to
>>>>>>> "/sys/bus/pci/rescan" which calls pci_assign_resource() eventually:
>>>>>>
>>>>>> Interesting. Does this mean that the PHYP hotplug path doesn't
>>>>>> call pci_assign_resource?
>>>>>
>>>>>
>>>>> I'd expect dlpar_add_slot() to be called under phyp and eventually
>>>>> pci_device_add() which (I think) may or may not trigger later
>>>>> reassignment.
>>>>>
>>>>>
>>>>>> If so it means the patch may not
>>>>>> break that platform after all, though it still may not be
>>>>>> the correct way of doing things.
>>>>>
>>>>>
>>>>> We should probably stop enforcing the PCI_PROBE_ONLY flag - it seems
>>>>> that (unless resource_alignment= is used) the pseries guest should just
>>>>> walk through all allocated resources and leave them unchanged.
>>>>
>>>> If we add a pcibios_default_alignment() implementation like was
>>>> suggested earlier, then it will behave as if the user has
>>>> specified resource_alignment= by default and SLOF's assignments
>>>> won't be honored (I think).
>>>
>>>
>>> I removed pci_add_flags(PCI_PROBE_ONLY) from pSeries_setup_arch and
>>> tried booting with and without pci=resource_alignment= and I can see no
>>> difference - BARs are still aligned to 64K as programmed in SLOF; if I
>>> hack SLOF to align to 4K or 32K - BARs get packed and the guest leaves
>>> them unchanged.
>>>
>>>
>>>> I guess it boils down to one question - is it important that we
>>>> observe SLOF's initial BAR assignments?
>>>
>>> It isn't if it's SLOF but it is if it's phyp. It used to not
>>> allow/support BAR reassignment and even if it does not, I'd rather avoid
>>> touching them.
>>
>> A quick update. I tried removing pci_add_flags(PCI_PROBE_ONLY) which
>> worked, but if I add an implementation of pcibios_default_alignment
>> which simply returns PAGE_SIZE, my VM fails to boot and many errors
>> from the virtio disk driver are printed to the console.
>>
>> After some investigation, it seems that with pcibios_default_alignment
>> present, Linux will reallocate all resources provided by SLOF on
>> boot. I'm still not sure why exactly this causes the virtio driver
>> to fail, but it does indicate that there is a reason to keep
>> SLOF's initial assignments.
>>
>> Anybody have an idea what's causing this?
>
> With your changes the guest feels the urge to reassign bars (no idea why
> but ok), when it does so, it puts both BARs (one is prefetchable) into
> the 32bit non-prefetchable window of the PHB (SLOF puts the prefetchable
> bar to a 64bit prefetchable window, I have no idea why the guest does it
> different either but this must still work) and then qemu does not
> emulate something properly - unassigned_mem_accepts() is triggered on
> the bar access - no idea why - I am debugging it right now.
Sooo the problem is that resouce::flags has 2 bits to describe 64bit
BARs - PCI_BASE_ADDRESS_MEM_TYPE_64 and IORESOURCE_MEM_64 - and we don't
set IORESOURCE_MEM_64 for 64bit BARs when parsing the fdt.
So the BAR reallocator moves the BAR to 32bit window (which is not
desirable but permitted, I still have to chase it) and then
pci_std_update_resource() writes BAR back but since now it is 32bit BAR,
it does not write to the upper 32bits so that half remains 0x2100, QEMU
does not move BAR to the right window and the MMIO stops working.
Try this in the guest kernel, it seems to keep bars where they were
after slof.
diff --git a/arch/powerpc/kernel/pci_of_scan.c
b/arch/powerpc/kernel/pci_of_scan.c
index 24191ea2d9a7..64ad92016b63 100644
--- a/arch/powerpc/kernel/pci_of_scan.c
+++ b/arch/powerpc/kernel/pci_of_scan.c
@@ -45,6 +45,8 @@ unsigned int pci_parse_of_flags(u32 addr0, int bridge)
if (addr0 & 0x02000000) {
flags = IORESOURCE_MEM | PCI_BASE_ADDRESS_SPACE_MEMORY;
flags |= (addr0 >> 22) & PCI_BASE_ADDRESS_MEM_TYPE_64;
+ if (flags & PCI_BASE_ADDRESS_MEM_TYPE_64)
+ flags |= IORESOURCE_MEM_64;
flags |= (addr0 >> 28) & PCI_BASE_ADDRESS_MEM_TYPE_1M;
if (addr0 & 0x40000000)
--
Alexey
^ permalink raw reply related
* Re: [PATCH] scsi: ibmvscsi: Don't use rc uninitialized in ibmvscsi_do_work
From: Christophe Leroy @ 2019-06-03 8:45 UTC (permalink / raw)
To: Nathan Chancellor, Tyrel Datwyler, James E.J. Bottomley,
Martin K. Petersen
Cc: clang-built-linux, linuxppc-dev, linux-kernel, linux-scsi
In-Reply-To: <20190531185306.41290-1-natechancellor@gmail.com>
Le 31/05/2019 à 20:53, Nathan Chancellor a écrit :
> clang warns:
>
> drivers/scsi/ibmvscsi/ibmvscsi.c:2126:7: warning: variable 'rc' is used
> uninitialized whenever switch case is taken [-Wsometimes-uninitialized]
> case IBMVSCSI_HOST_ACTION_NONE:
> ^~~~~~~~~~~~~~~~~~~~~~~~~
> drivers/scsi/ibmvscsi/ibmvscsi.c:2151:6: note: uninitialized use occurs
> here
> if (rc) {
> ^~
>
> Initialize rc to zero so that the atomic_set and dev_err statement don't
> trigger for the cases that just break.
>
> Fixes: 035a3c4046b5 ("scsi: ibmvscsi: redo driver work thread to use enum action states")
> Link: https://github.com/ClangBuiltLinux/linux/issues/502
> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
> ---
> drivers/scsi/ibmvscsi/ibmvscsi.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
> index 727c31dc11a0..6714d8043e62 100644
> --- a/drivers/scsi/ibmvscsi/ibmvscsi.c
> +++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
> @@ -2118,7 +2118,7 @@ static unsigned long ibmvscsi_get_desired_dma(struct vio_dev *vdev)
> static void ibmvscsi_do_work(struct ibmvscsi_host_data *hostdata)
> {
> unsigned long flags;
> - int rc;
> + int rc = 0;
I don't think the above is the best solution, as it hides the warning
instead of really fixing it.
Your problem is that some legs of the switch are missing setting the
value of rc, it would therefore be better to fix the legs instead of
setting a default value which may not be correct for every case,
allthough it may be at the time being.
Christophe
> char *action = "reset";
>
> spin_lock_irqsave(hostdata->host->host_lock, flags);
>
^ permalink raw reply
* Re: [PATCH v3] powerpc/pseries: Fix cpu_hotplug_lock acquisition in resize_hpt()
From: Gautham R Shenoy @ 2019-06-03 8:55 UTC (permalink / raw)
To: Gautham R. Shenoy
Cc: linux-kernel, Nicholas Piggin, Paul Mackerras, Aneesh Kumar K.V,
linuxppc-dev
In-Reply-To: <1557906352-29048-1-git-send-email-ego@linux.vnet.ibm.com>
Hi,
On Wed, May 15, 2019 at 01:15:52PM +0530, Gautham R. Shenoy wrote:
> From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
>
> The calls to arch_add_memory()/arch_remove_memory() are always made
> with the read-side cpu_hotplug_lock acquired via
> memory_hotplug_begin(). On pSeries,
> arch_add_memory()/arch_remove_memory() eventually call resize_hpt()
> which in turn calls stop_machine() which acquires the read-side
> cpu_hotplug_lock again, thereby resulting in the recursive acquisition
> of this lock.
A clarification regarding why we hadn't observed this problem earlier.
In the absence of CONFIG_PROVE_LOCKING, we hadn't observed a system
lockup during a memory hotplug operation because cpus_read_lock() is a
per-cpu rwsem read, which, in the fast-path (in the absence of the
writer, which in our case is a CPU-hotplug operation) simply
increments the read_count on the semaphore. Thus a recursive read in
the fast-path doesn't cause any problems.
However, we can hit this problem in practice if there is a concurrent
CPU-Hotplug operation in progress which is waiting to acquire the
write-side of the lock. This will cause the second recursive read to
block until the writer finishes. While the writer is blocked since the
first read holds the lock. Thus both the reader as well as the writers
fail to make any progress thereby blocking both CPU-Hotplug as well as
Memory Hotplug operations.
Memory-Hotplug CPU-Hotplug
CPU 0 CPU 1
------ ------
1. down_read(cpu_hotplug_lock.rw_sem)
[memory_hotplug_begin]
2. down_write(cpu_hotplug_lock.rw_sem)
[cpu_up/cpu_down]
3. down_read(cpu_hotplug_lock.rw_sem)
[stop_machine()]
>
> Lockdep complains as follows in these code-paths.
>
> swapper/0/1 is trying to acquire lock:
> (____ptrval____) (cpu_hotplug_lock.rw_sem){++++}, at: stop_machine+0x2c/0x60
>
> but task is already holding lock:
> (____ptrval____) (cpu_hotplug_lock.rw_sem){++++}, at: mem_hotplug_begin+0x20/0x50
>
> other info that might help us debug this:
> Possible unsafe locking scenario:
>
> CPU0
> ----
> lock(cpu_hotplug_lock.rw_sem);
> lock(cpu_hotplug_lock.rw_sem);
>
> *** DEADLOCK ***
>
> May be due to missing lock nesting notation
>
> 3 locks held by swapper/0/1:
> #0: (____ptrval____) (&dev->mutex){....}, at: __driver_attach+0x12c/0x1b0
> #1: (____ptrval____) (cpu_hotplug_lock.rw_sem){++++}, at: mem_hotplug_begin+0x20/0x50
> #2: (____ptrval____) (mem_hotplug_lock.rw_sem){++++}, at: percpu_down_write+0x54/0x1a0
>
> stack backtrace:
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc5-58373-gbc99402235f3-dirty #166
> Call Trace:
> [c0000000feb03150] [c000000000e32bd4] dump_stack+0xe8/0x164 (unreliable)
> [c0000000feb031a0] [c00000000020d6c0] __lock_acquire+0x1110/0x1c70
> [c0000000feb03320] [c00000000020f080] lock_acquire+0x240/0x290
> [c0000000feb033e0] [c00000000017f554] cpus_read_lock+0x64/0xf0
> [c0000000feb03420] [c00000000029ebac] stop_machine+0x2c/0x60
> [c0000000feb03460] [c0000000000d7f7c] pseries_lpar_resize_hpt+0x19c/0x2c0
> [c0000000feb03500] [c0000000000788d0] resize_hpt_for_hotplug+0x70/0xd0
> [c0000000feb03570] [c000000000e5d278] arch_add_memory+0x58/0xfc
> [c0000000feb03610] [c0000000003553a8] devm_memremap_pages+0x5e8/0x8f0
> [c0000000feb036c0] [c0000000009c2394] pmem_attach_disk+0x764/0x830
> [c0000000feb037d0] [c0000000009a7c38] nvdimm_bus_probe+0x118/0x240
> [c0000000feb03860] [c000000000968500] really_probe+0x230/0x4b0
> [c0000000feb038f0] [c000000000968aec] driver_probe_device+0x16c/0x1e0
> [c0000000feb03970] [c000000000968ca8] __driver_attach+0x148/0x1b0
> [c0000000feb039f0] [c0000000009650b0] bus_for_each_dev+0x90/0x130
> [c0000000feb03a50] [c000000000967dd4] driver_attach+0x34/0x50
> [c0000000feb03a70] [c000000000967068] bus_add_driver+0x1a8/0x360
> [c0000000feb03b00] [c00000000096a498] driver_register+0x108/0x170
> [c0000000feb03b70] [c0000000009a7400] __nd_driver_register+0xd0/0xf0
> [c0000000feb03bd0] [c00000000128aa90] nd_pmem_driver_init+0x34/0x48
> [c0000000feb03bf0] [c000000000010a10] do_one_initcall+0x1e0/0x45c
> [c0000000feb03cd0] [c00000000122462c] kernel_init_freeable+0x540/0x64c
> [c0000000feb03db0] [c00000000001110c] kernel_init+0x2c/0x160
> [c0000000feb03e20] [c00000000000bed4] ret_from_kernel_thread+0x5c/0x68
>
> Fix this issue by
> 1) Requiring all the calls to pseries_lpar_resize_hpt() be made
> with cpu_hotplug_lock held.
>
> 2) In pseries_lpar_resize_hpt() invoke stop_machine_cpuslocked()
> as a consequence of 1)
>
> 3) To satisfy 1), in hpt_order_set(), call mmu_hash_ops.resize_hpt()
> with cpu_hotplug_lock held.
>
> Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> ---
> v2 -> v3 : Updated the comment for pseries_lpar_resize_hpt()
> Updated the commit-log with the full backtrace.
> v1 -> v2 : Rebased against powerpc/next instead of linux/master
>
> arch/powerpc/mm/book3s64/hash_utils.c | 9 ++++++++-
> arch/powerpc/platforms/pseries/lpar.c | 8 ++++++--
> 2 files changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
> index 919a861..d07fcafd 100644
> --- a/arch/powerpc/mm/book3s64/hash_utils.c
> +++ b/arch/powerpc/mm/book3s64/hash_utils.c
> @@ -38,6 +38,7 @@
> #include <linux/libfdt.h>
> #include <linux/pkeys.h>
> #include <linux/hugetlb.h>
> +#include <linux/cpu.h>
>
> #include <asm/debugfs.h>
> #include <asm/processor.h>
> @@ -1928,10 +1929,16 @@ static int hpt_order_get(void *data, u64 *val)
>
> static int hpt_order_set(void *data, u64 val)
> {
> + int ret;
> +
> if (!mmu_hash_ops.resize_hpt)
> return -ENODEV;
>
> - return mmu_hash_ops.resize_hpt(val);
> + cpus_read_lock();
> + ret = mmu_hash_ops.resize_hpt(val);
> + cpus_read_unlock();
> +
> + return ret;
> }
>
> DEFINE_DEBUGFS_ATTRIBUTE(fops_hpt_order, hpt_order_get, hpt_order_set, "%llu\n");
> diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
> index 1034ef1..557d592 100644
> --- a/arch/powerpc/platforms/pseries/lpar.c
> +++ b/arch/powerpc/platforms/pseries/lpar.c
> @@ -859,7 +859,10 @@ static int pseries_lpar_resize_hpt_commit(void *data)
> return 0;
> }
>
> -/* Must be called in user context */
> +/*
> + * Must be called in process context. The caller must hold the
> + * cpus_lock.
> + */
> static int pseries_lpar_resize_hpt(unsigned long shift)
> {
> struct hpt_resize_state state = {
> @@ -913,7 +916,8 @@ static int pseries_lpar_resize_hpt(unsigned long shift)
>
> t1 = ktime_get();
>
> - rc = stop_machine(pseries_lpar_resize_hpt_commit, &state, NULL);
> + rc = stop_machine_cpuslocked(pseries_lpar_resize_hpt_commit,
> + &state, NULL);
>
> t2 = ktime_get();
>
> --
> 1.9.4
>
^ permalink raw reply
* Re: [PATCH v3 1/3] PCI: Introduce pcibios_ignore_alignment_request
From: Shawn Anastasio @ 2019-06-03 9:12 UTC (permalink / raw)
To: Alexey Kardashevskiy
Cc: Sam Bobroff, linux-pci, Linux Kernel Mailing List, rppt, Oliver,
Bjorn Helgaas, Paul Mackerras, xyjxie, linuxppc-dev
In-Reply-To: <bab59b8e-f1fc-f92a-36bb-4ff471e6da24@ozlabs.ru>
On 6/3/19 3:35 AM, Alexey Kardashevskiy wrote:
>
>
> On 03/06/2019 15:02, Alexey Kardashevskiy wrote:
>>
>>
>> On 03/06/2019 12:23, Shawn Anastasio wrote:
>>>
>>>
>>> On 5/30/19 10:56 PM, Alexey Kardashevskiy wrote:
>>>>
>>>>
>>>> On 31/05/2019 08:49, Shawn Anastasio wrote:
>>>>> On 5/29/19 10:39 PM, Alexey Kardashevskiy wrote:
>>>>>>
>>>>>>
>>>>>> On 28/05/2019 17:39, Shawn Anastasio wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 5/28/19 1:27 AM, Alexey Kardashevskiy wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 28/05/2019 15:36, Oliver wrote:
>>>>>>>>> On Tue, May 28, 2019 at 2:03 PM Shawn Anastasio <shawn@anastas.io>
>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> Introduce a new pcibios function pcibios_ignore_alignment_request
>>>>>>>>>> which allows the PCI core to defer to platform-specific code to
>>>>>>>>>> determine whether or not to ignore alignment requests for PCI
>>>>>>>>>> resources.
>>>>>>>>>>
>>>>>>>>>> The existing behavior is to simply ignore alignment requests when
>>>>>>>>>> PCI_PROBE_ONLY is set. This is behavior is maintained by the
>>>>>>>>>> default implementation of pcibios_ignore_alignment_request.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Shawn Anastasio <shawn@anastas.io>
>>>>>>>>>> ---
>>>>>>>>>> drivers/pci/pci.c | 9 +++++++--
>>>>>>>>>> include/linux/pci.h | 1 +
>>>>>>>>>> 2 files changed, 8 insertions(+), 2 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>>>>>>>>>> index 8abc843b1615..8207a09085d1 100644
>>>>>>>>>> --- a/drivers/pci/pci.c
>>>>>>>>>> +++ b/drivers/pci/pci.c
>>>>>>>>>> @@ -5882,6 +5882,11 @@ resource_size_t __weak
>>>>>>>>>> pcibios_default_alignment(void)
>>>>>>>>>> return 0;
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> +int __weak pcibios_ignore_alignment_request(void)
>>>>>>>>>> +{
>>>>>>>>>> + return pci_has_flag(PCI_PROBE_ONLY);
>>>>>>>>>> +}
>>>>>>>>>> +
>>>>>>>>>> #define RESOURCE_ALIGNMENT_PARAM_SIZE COMMAND_LINE_SIZE
>>>>>>>>>> static char
>>>>>>>>>> resource_alignment_param[RESOURCE_ALIGNMENT_PARAM_SIZE] = {0};
>>>>>>>>>> static DEFINE_SPINLOCK(resource_alignment_lock);
>>>>>>>>>> @@ -5906,9 +5911,9 @@ static resource_size_t
>>>>>>>>>> pci_specified_resource_alignment(struct pci_dev *dev,
>>>>>>>>>> p = resource_alignment_param;
>>>>>>>>>> if (!*p && !align)
>>>>>>>>>> goto out;
>>>>>>>>>> - if (pci_has_flag(PCI_PROBE_ONLY)) {
>>>>>>>>>> + if (pcibios_ignore_alignment_request()) {
>>>>>>>>>> align = 0;
>>>>>>>>>> - pr_info_once("PCI: Ignoring requested alignments
>>>>>>>>>> (PCI_PROBE_ONLY)\n");
>>>>>>>>>> + pr_info_once("PCI: Ignoring requested
>>>>>>>>>> alignments\n");
>>>>>>>>>> goto out;
>>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> I think the logic here is questionable to begin with. If the user
>>>>>>>>> has
>>>>>>>>> explicitly requested re-aligning a resource via the command line
>>>>>>>>> then
>>>>>>>>> we should probably do it even if PCI_PROBE_ONLY is set. When it
>>>>>>>>> breaks
>>>>>>>>> they get to keep the pieces.
>>>>>>>>>
>>>>>>>>> That said, the real issue here is that PCI_PROBE_ONLY probably
>>>>>>>>> shouldn't be set under qemu/kvm. Under the other hypervisor
>>>>>>>>> (PowerVM)
>>>>>>>>> hotplugged devices are configured by firmware before it's passed to
>>>>>>>>> the guest and we need to keep the FW assignments otherwise things
>>>>>>>>> break. QEMU however doesn't do any BAR assignments and relies on
>>>>>>>>> that
>>>>>>>>> being handled by the guest. At boot time this is done by SLOF, but
>>>>>>>>> Linux only keeps SLOF around until it's extracted the device-tree.
>>>>>>>>> Once that's done SLOF gets blown away and the kernel needs to do
>>>>>>>>> it's
>>>>>>>>> own BAR assignments. I'm guessing there's a hack in there to make it
>>>>>>>>> work today, but it's a little surprising that it works at all...
>>>>>>>>
>>>>>>>>
>>>>>>>> The hack is to run a modified qemu-aware "/usr/sbin/rtas_errd" in the
>>>>>>>> guest which receives an event from qemu (RAS_EPOW from
>>>>>>>> /proc/interrupts), fetches device tree chunks (and as I understand
>>>>>>>> it -
>>>>>>>> they come with BARs from phyp but without from qemu) and writes
>>>>>>>> "1" to
>>>>>>>> "/sys/bus/pci/rescan" which calls pci_assign_resource() eventually:
>>>>>>>
>>>>>>> Interesting. Does this mean that the PHYP hotplug path doesn't
>>>>>>> call pci_assign_resource?
>>>>>>
>>>>>>
>>>>>> I'd expect dlpar_add_slot() to be called under phyp and eventually
>>>>>> pci_device_add() which (I think) may or may not trigger later
>>>>>> reassignment.
>>>>>>
>>>>>>
>>>>>>> If so it means the patch may not
>>>>>>> break that platform after all, though it still may not be
>>>>>>> the correct way of doing things.
>>>>>>
>>>>>>
>>>>>> We should probably stop enforcing the PCI_PROBE_ONLY flag - it seems
>>>>>> that (unless resource_alignment= is used) the pseries guest should just
>>>>>> walk through all allocated resources and leave them unchanged.
>>>>>
>>>>> If we add a pcibios_default_alignment() implementation like was
>>>>> suggested earlier, then it will behave as if the user has
>>>>> specified resource_alignment= by default and SLOF's assignments
>>>>> won't be honored (I think).
>>>>
>>>>
>>>> I removed pci_add_flags(PCI_PROBE_ONLY) from pSeries_setup_arch and
>>>> tried booting with and without pci=resource_alignment= and I can see no
>>>> difference - BARs are still aligned to 64K as programmed in SLOF; if I
>>>> hack SLOF to align to 4K or 32K - BARs get packed and the guest leaves
>>>> them unchanged.
>>>>
>>>>
>>>>> I guess it boils down to one question - is it important that we
>>>>> observe SLOF's initial BAR assignments?
>>>>
>>>> It isn't if it's SLOF but it is if it's phyp. It used to not
>>>> allow/support BAR reassignment and even if it does not, I'd rather avoid
>>>> touching them.
>>>
>>> A quick update. I tried removing pci_add_flags(PCI_PROBE_ONLY) which
>>> worked, but if I add an implementation of pcibios_default_alignment
>>> which simply returns PAGE_SIZE, my VM fails to boot and many errors
>>> from the virtio disk driver are printed to the console.
>>>
>>> After some investigation, it seems that with pcibios_default_alignment
>>> present, Linux will reallocate all resources provided by SLOF on
>>> boot. I'm still not sure why exactly this causes the virtio driver
>>> to fail, but it does indicate that there is a reason to keep
>>> SLOF's initial assignments.
>>>
>>> Anybody have an idea what's causing this?
>>
>> With your changes the guest feels the urge to reassign bars (no idea why
>> but ok), when it does so, it puts both BARs (one is prefetchable) into
>> the 32bit non-prefetchable window of the PHB (SLOF puts the prefetchable
>> bar to a 64bit prefetchable window, I have no idea why the guest does it
>> different either but this must still work) and then qemu does not
>> emulate something properly - unassigned_mem_accepts() is triggered on
>> the bar access - no idea why - I am debugging it right now.
>
>
> Sooo the problem is that resouce::flags has 2 bits to describe 64bit
> BARs - PCI_BASE_ADDRESS_MEM_TYPE_64 and IORESOURCE_MEM_64 - and we don't
> set IORESOURCE_MEM_64 for 64bit BARs when parsing the fdt.
>
> So the BAR reallocator moves the BAR to 32bit window (which is not
> desirable but permitted, I still have to chase it) and then
> pci_std_update_resource() writes BAR back but since now it is 32bit BAR,
> it does not write to the upper 32bits so that half remains 0x2100, QEMU
> does not move BAR to the right window and the MMIO stops working.
>
> Try this in the guest kernel, it seems to keep bars where they were
> after slof.
Nice debugging work! With your patch the VM does boot. I'm not sure
if SLOF's original allocations are being kept or if Linux is redoing
them (how do you check?), but MMIO works and the system boots anyways.
I've also tested hotplug and the BAR allocations are page-aligned too,
as expected.
^ permalink raw reply
* Re: [PATCH v3 14/16] powerpc/32: implement fast entry for syscalls on BOOKE
From: Christophe Leroy @ 2019-06-03 9:16 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev, linux-kernel, Nicholas Piggin
In-Reply-To: <20190528190341.Horde.nTXOule-IO2ReXFiNIqNbg8@messagerie.si.c-s.fr>
On 05/28/2019 05:03 PM, Christophe Leroy wrote:
> Michael Ellerman <mpe@ellerman.id.au> a écrit :
>
>> Christophe Leroy <christophe.leroy@c-s.fr> writes:
>>> Le 23/05/2019 à 09:00, Christophe Leroy a écrit :
>>>
>>> [...]
>>>
>>>>> arch/powerpc/kernel/head_fsl_booke.o: In function `SystemCall':
>>>>> arch/powerpc/kernel/head_fsl_booke.S:416: undefined reference to
>>>>> `kvmppc_handler_BOOKE_INTERRUPT_SYSCALL_SPRN_SRR1'
>>>>> Makefile:1052: recipe for target 'vmlinux' failed
>>>>>
>>>>>> +.macro SYSCALL_ENTRY trapno intno
>>>>>> + mfspr r10, SPRN_SPRG_THREAD
>>>>>> +#ifdef CONFIG_KVM_BOOKE_HV
>>>>>> +BEGIN_FTR_SECTION
>>>>>> + mtspr SPRN_SPRG_WSCRATCH0, r10
>>>>>> + stw r11, THREAD_NORMSAVE(0)(r10)
>>>>>> + stw r13, THREAD_NORMSAVE(2)(r10)
>>>>>> + mfcr r13 /* save CR in r13 for now */
>>>>>> + mfspr r11, SPRN_SRR1
>>>>>> + mtocrf 0x80, r11 /* check MSR[GS] without clobbering
>>>>>> reg */
>>>>>> + bf 3, 1975f
>>>>>> + b kvmppc_handler_BOOKE_INTERRUPT_\intno\()_SPRN_SRR1
>>>>>
>>>>> It seems to me that the "_SPRN_SRR1" on the end of this line
>>>>> isn't meant to be there... However, it still fails to link with that
>>>>> removed.
>>>
>>> It looks like I missed the macro expansion.
>>>
>>> The called function should be kvmppc_handler_8_0x01B
>>>
>>> Seems like kisskb doesn't build any config like this.
>>
>> I thought we did, ie:
>>
>> http://kisskb.ellerman.id.au/kisskb/buildresult/13817941/
>
> That's a ppc64 config it seems. The problem was on booke32.
>
> Christophe
>
>>
>> But clearly something is missing to trigger the bug.
I was able to trigger the bug with mpc85xx_defconfig +
CONFIG_VIRTUALIZATION + CONFIG_PPC_E500MC
The bug pops up when CONFIG_KVM_BOOKE_HV is set.
Christophe
>>
>> cheers
>
>
^ permalink raw reply
* Re: [PATCH 22/22] docs: fix broken documentation links
From: Christophe Leroy @ 2019-06-03 7:34 UTC (permalink / raw)
To: Mauro Carvalho Chehab, Linux Doc Mailing List
Cc: Andrew Lunn, Thomas Preston, Wolfram Sang, Catalin Marinas,
Linus Walleij, Will Deacon, Russell King, Paul Mackerras,
Alessia Mantegazza, Jakub Wilk, AKASHI Takahiro, Kevin Hilman,
James Morris, linux-acpi, Andy Gross, xen-devel, Jason Wang,
Alexander Popov, Qian Cai, Al Viro, Andy Lutomirski,
Thomas Gleixner, Kairui Song, Quentin Perret, Greg Kroah-Hartman,
Rafael J. Wysocki, linux-kernel, Paul Burton, Jiri Kosina,
Casey Schaufler, Andrew Morton, Lu Baolu, Mark Rutland, Feng Tang,
Dave Hansen, Mimi Zohar, Kamalesh Babulal, Masahiro Yamada,
Yannik Sembritzki, Harry Wei, linux-i2c, Shuah Khan,
Stephen Rothwell, Paul E. McKenney, Alexandre Ghiti, YueHaibing,
Robert Moore, Bartosz Golaszewski, Len Brown, David Brown,
Joerg Roedel, linux-arm-msm, Mauro Carvalho Chehab, linux-gpio,
Claudiu Manoil, Florian Fainelli, Jacek Anaszewski, Bjorn Helgaas,
linux-amlogic, Boris Ostrovsky, Mika Westerberg, linux-arm-kernel,
Tony Luck, Sean Christopherson, Rob Herring, James Morse,
Robin Murphy, Samuel Mendoza-Jonas, linux-pci, Bhupesh Sharma,
Josh Poimboeuf, platform-driver-x86, Ding Xiang, linux-kselftest,
Alex Shi, Lorenzo Pieralisi, Baoquan He, Jonathan Corbet,
Raphael Gault, Joel Stanley, Federico Vaga, Darren Hart,
Erik Schmauss, Serge E. Hallyn, Palmer Dabbelt, Kees Cook,
Bartlomiej Zolnierkiewicz, Jonathan Neuschäfer,
SeongJae Park, Mark Brown, Borislav Petkov, Sunil Muthuswamy,
virtualization, devel, Ard Biesheuvel, Liam Girdwood,
Sakari Ailus, Olof Johansson, Logan Gunthorpe, David S. Miller,
Kirill A. Shutemov, Sven Van Asbroeck, Michal Hocko, kvm,
Michael S. Tsirkin, Peter Zijlstra, Thorsten Leemhuis,
David Howells, linux-mm, H. Peter Anvin, devel, Manfred Spraul,
Luis Chamberlain, x86, Pavel Tatashin, Mike Rapoport, Ingo Molnar,
Dave Young, devicetree, Arnaldo Carvalho de Melo, Jerome Glisse,
Stefano Stabellini, Jonathan Cameron, Dmitry Vyukov, linux-edac,
Juergen Gross, Denis Efremov, netdev, Nicolas Ferre, Changbin Du,
linux-security-module, linuxppc-dev, Andy Shevchenko
In-Reply-To: <f9fecacbe4ce0b2b3aed38d71ae3753f2daf3ce3.1559171394.git.mchehab+samsung@kernel.org>
Le 30/05/2019 à 01:23, Mauro Carvalho Chehab a écrit :
> Mostly due to x86 and acpi conversion, several documentation
> links are still pointing to the old file. Fix them.
>
> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
> ---
> Documentation/acpi/dsd/leds.txt | 2 +-
> Documentation/admin-guide/kernel-parameters.rst | 6 +++---
> Documentation/admin-guide/kernel-parameters.txt | 16 ++++++++--------
> Documentation/admin-guide/ras.rst | 2 +-
> .../devicetree/bindings/net/fsl-enetc.txt | 7 +++----
> .../bindings/pci/amlogic,meson-pcie.txt | 2 +-
> .../bindings/regulator/qcom,rpmh-regulator.txt | 2 +-
> Documentation/devicetree/booting-without-of.txt | 2 +-
> Documentation/driver-api/gpio/board.rst | 2 +-
> Documentation/driver-api/gpio/consumer.rst | 2 +-
> .../firmware-guide/acpi/enumeration.rst | 2 +-
> .../firmware-guide/acpi/method-tracing.rst | 2 +-
> Documentation/i2c/instantiating-devices | 2 +-
> Documentation/sysctl/kernel.txt | 4 ++--
> .../translations/it_IT/process/howto.rst | 2 +-
> .../it_IT/process/stable-kernel-rules.rst | 4 ++--
> .../translations/zh_CN/process/4.Coding.rst | 2 +-
> Documentation/x86/x86_64/5level-paging.rst | 2 +-
> Documentation/x86/x86_64/boot-options.rst | 4 ++--
> .../x86/x86_64/fake-numa-for-cpusets.rst | 2 +-
> MAINTAINERS | 6 +++---
> arch/arm/Kconfig | 2 +-
> arch/arm64/kernel/kexec_image.c | 2 +-
> arch/powerpc/Kconfig | 2 +-
> arch/x86/Kconfig | 16 ++++++++--------
> arch/x86/Kconfig.debug | 2 +-
> arch/x86/boot/header.S | 2 +-
> arch/x86/entry/entry_64.S | 2 +-
> arch/x86/include/asm/bootparam_utils.h | 2 +-
> arch/x86/include/asm/page_64_types.h | 2 +-
> arch/x86/include/asm/pgtable_64_types.h | 2 +-
> arch/x86/kernel/cpu/microcode/amd.c | 2 +-
> arch/x86/kernel/kexec-bzimage64.c | 2 +-
> arch/x86/kernel/pci-dma.c | 2 +-
> arch/x86/mm/tlb.c | 2 +-
> arch/x86/platform/pvh/enlighten.c | 2 +-
> drivers/acpi/Kconfig | 10 +++++-----
> drivers/net/ethernet/faraday/ftgmac100.c | 2 +-
> .../fieldbus/Documentation/fieldbus_dev.txt | 4 ++--
> drivers/vhost/vhost.c | 2 +-
> include/acpi/acpi_drivers.h | 2 +-
> include/linux/fs_context.h | 2 +-
> include/linux/lsm_hooks.h | 2 +-
> mm/Kconfig | 2 +-
> security/Kconfig | 2 +-
> tools/include/linux/err.h | 2 +-
> tools/objtool/Documentation/stack-validation.txt | 4 ++--
> tools/testing/selftests/x86/protection_keys.c | 2 +-
> 48 files changed, 77 insertions(+), 78 deletions(-)
[...]
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 8c1c636308c8..e868d2bd48b8 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -898,7 +898,7 @@ config PPC_MEM_KEYS
> page-based protections, but without requiring modification of the
> page tables when an application changes protection domains.
>
> - For details, see Documentation/vm/protection-keys.rst
> + For details, see Documentation/x86/protection-keys.rst
It looks strange to reference an x86 file, for powerpc arch.
Christophe
^ permalink raw reply
* Re: [PATCH v8 3/7] s390/pci: add support for IOMMU default DMA mode build options
From: Sebastian Ott @ 2019-06-03 11:51 UTC (permalink / raw)
To: Zhen Lei
Cc: linux-ia64, linux-doc, Hanjun Guo, Heiko Carstens, Paul Mackerras,
H . Peter Anvin, linux-s390, Jonathan Corbet,
Jean-Philippe Brucker, Joerg Roedel, x86, Ingo Molnar, Fenghua Yu,
Will Deacon, John Garry, linuxppc-dev, Borislav Petkov,
Thomas Gleixner, Gerald Schaefer, Tony Luck, David Woodhouse,
linux-kernel, iommu, Martin Schwidefsky, Robin Murphy
In-Reply-To: <20190530034831.4184-4-thunder.leizhen@huawei.com>
On Thu, 30 May 2019, Zhen Lei wrote:
> The default DMA mode is LAZY on s390, this patch make it can be set to
> STRICT at build time. It can be overridden by boot option.
>
> There is no functional change.
>
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Sebastian Ott <sebott@linux.ibm.com>
^ permalink raw reply
* Re: [PATCH] powerpc/powernv/npu: Fix reference leak
From: Michael Ellerman @ 2019-06-03 12:32 UTC (permalink / raw)
To: Greg Kurz, linux-kernel
Cc: Alexey Kardashevskiy, Alistair Popple, linuxppc-dev
In-Reply-To: <155568805354.600470.13376593185688810607.stgit@bahia.lan>
On Fri, 2019-04-19 at 15:34:13 UTC, Greg Kurz wrote:
> Since 902bdc57451c, get_pci_dev() calls pci_get_domain_bus_and_slot(). This
> has the effect of incrementing the reference count of the PCI device, as
> explained in drivers/pci/search.c:
>
> * Given a PCI domain, bus, and slot/function number, the desired PCI
> * device is located in the list of PCI devices. If the device is
> * found, its reference count is increased and this function returns a
> * pointer to its data structure. The caller must decrement the
> * reference count by calling pci_dev_put(). If no device is found,
> * %NULL is returned.
>
> Nothing was done to call pci_dev_put() and the reference count of GPU and
> NPU PCI devices rockets up.
>
> A natural way to fix this would be to teach the callers about the change,
> so that they call pci_dev_put() when done with the pointer. This turns
> out to be quite intrusive, as it affects many paths in npu-dma.c,
> pci-ioda.c and vfio_pci_nvlink2.c. Also, the issue appeared in 4.16 and
> some affected code got moved around since then: it would be problematic
> to backport the fix to stable releases.
>
> All that code never cared for reference counting anyway. Call pci_dev_put()
> from get_pci_dev() to revert to the previous behavior.
>
> Fixes: 902bdc57451c ("powerpc/powernv/idoa: Remove unnecessary pcidev from pci_dn")
> Cc: stable@vger.kernel.org # v4.16
> Signed-off-by: Greg Kurz <groug@kaod.org>
> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/02c5f5394918b9b47ff4357b1b183357
cheers
^ permalink raw reply
* Re: [PATCH -next] powerpc/book3s64: Make some symbols static
From: Michael Ellerman @ 2019-06-03 12:32 UTC (permalink / raw)
To: YueHaibing, benh, paulus, christophe.leroy, aneesh.kumar
Cc: linuxppc-dev, YueHaibing, linux-kernel
In-Reply-To: <20190504102427.12332-1-yuehaibing@huawei.com>
On Sat, 2019-05-04 at 10:24:27 UTC, YueHaibing wrote:
> Fix sparse warnings:
>
> arch/powerpc/mm/book3s64/radix_pgtable.c:326:13: warning: symbol 'radix_init_pgtable' was not declared. Should it be static?
> arch/powerpc/mm/book3s64/hash_native.c:48:1: warning: symbol 'native_tlbie_lock' was not declared. Should it be static?
> arch/powerpc/mm/book3s64/hash_utils.c:988:24: warning: symbol 'init_hash_mm_context' was not declared. Should it be static?
>
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/d667edc01bedcd23988ef69f851e7cc2
cheers
^ permalink raw reply
* Re: [PATCH -next] misc: ocxl: Make ocxl_remove static
From: Michael Ellerman @ 2019-06-03 12:32 UTC (permalink / raw)
To: YueHaibing, fbarrat, ajd, arnd, gregkh
Cc: linuxppc-dev, YueHaibing, linux-kernel
In-Reply-To: <20190504102720.42220-1-yuehaibing@huawei.com>
On Sat, 2019-05-04 at 10:27:20 UTC, YueHaibing wrote:
> Fix sparse warning:
>
> drivers/misc/ocxl/pci.c:44:6: warning:
> symbol 'ocxl_remove' was not declared. Should it be static?
>
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>
> Acked-by: Andrew Donnellan <ajd@linux.ibm.com>
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/00b0cdbbc87fb4b531a0d75f4004ed3d
cheers
^ permalink raw reply
* Re: [PATCH 1/2] powerpc/lib: fix redundant inclusion of quad.o
From: Michael Ellerman @ 2019-06-03 12:32 UTC (permalink / raw)
To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <7496da89e027e563cb8e62dc89548525cf53b57e.1557741292.git.christophe.leroy@c-s.fr>
On Mon, 2019-05-13 at 10:00:14 UTC, Christophe Leroy wrote:
> quad.o is only for PPC64, and already included in obj64-y,
> so it doesn't have to be in obj-y
>
> Fixes: 31bfdb036f12 ("powerpc: Use instruction emulation infrastructure to handle alignment faults")
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Series applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/f8e0d0fddf87f26aed576983f752329d
cheers
^ permalink raw reply
* Re: [PATCH] powerpc/pseries: Fix xive=off command line
From: Michael Ellerman @ 2019-06-03 12:32 UTC (permalink / raw)
To: Greg Kurz, Paul Mackerras, Benjamin Herrenschmidt
Cc: pavrampu, linuxppc-dev, linux-kernel, stable,
Cédric Le Goater
In-Reply-To: <155791470178.432724.8008395673479905061.stgit@bahia.lan>
On Wed, 2019-05-15 at 10:05:01 UTC, Greg Kurz wrote:
> On POWER9, if the hypervisor supports XIVE exploitation mode, the guest OS
> will unconditionally requests for the XIVE interrupt mode even if XIVE was
> deactivated with the kernel command line xive=off. Later on, when the spapr
> XIVE init code handles xive=off, it disables XIVE and tries to fall back on
> the legacy mode XICS.
>
> This discrepency causes a kernel panic because the hypervisor is configured
> to provide the XIVE interrupt mode to the guest :
>
> [ 0.008837] kernel BUG at arch/powerpc/sysdev/xics/xics-common.c:135!
> [ 0.008877] Oops: Exception in kernel mode, sig: 5 [#1]
> [ 0.008908] LE SMP NR_CPUS=1024 NUMA pSeries
> [ 0.008939] Modules linked in:
> [ 0.008964] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 5.0.13-200.fc29.ppc64le #1
> [ 0.009018] NIP: c000000001029ab8 LR: c000000001029aac CTR: c0000000018e0000
> [ 0.009065] REGS: c0000007f96d7900 TRAP: 0700 Tainted: G W (5.0.13-200.fc29.ppc64le)
> [ 0.009119] MSR: 8000000002029033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR: 28000222 XER: 20040000
> [ 0.009168] CFAR: c0000000001b1e28 IRQMASK: 0
> [ 0.009168] GPR00: c000000001029aac c0000007f96d7b90 c0000000015e8600 0000000000000000
> [ 0.009168] GPR04: 0000000000000001 0000000000000000 0000000000000061 646f6d61696e0d0a
> [ 0.009168] GPR08: 00000007fd8f0000 0000000000000001 c0000000014c44c0 c0000007f96d76cf
> [ 0.009168] GPR12: 0000000000000000 c0000000018e0000 0000000000000001 0000000000000000
> [ 0.009168] GPR16: 0000000000000000 0000000000000001 c0000007f96d7c08 c0000000016903d0
> [ 0.009168] GPR20: c0000007fffe04e8 ffffffffffffffea c000000001620164 c00000000161fe58
> [ 0.009168] GPR24: c000000000ea6c88 c0000000011151a8 00000000006000c0 c0000007f96d7c34
> [ 0.009168] GPR28: 0000000000000000 c0000000014b286c c000000001115180 c00000000161dc70
> [ 0.009558] NIP [c000000001029ab8] xics_smp_probe+0x38/0x98
> [ 0.009590] LR [c000000001029aac] xics_smp_probe+0x2c/0x98
> [ 0.009622] Call Trace:
> [ 0.009639] [c0000007f96d7b90] [c000000001029aac] xics_smp_probe+0x2c/0x98 (unreliable)
> [ 0.009687] [c0000007f96d7bb0] [c000000001033404] pSeries_smp_probe+0x40/0xa0
> [ 0.009734] [c0000007f96d7bd0] [c0000000010212a4] smp_prepare_cpus+0x62c/0x6ec
> [ 0.009782] [c0000007f96d7cf0] [c0000000010141b8] kernel_init_freeable+0x148/0x448
> [ 0.009829] [c0000007f96d7db0] [c000000000010ba4] kernel_init+0x2c/0x148
> [ 0.009870] [c0000007f96d7e20] [c00000000000bdd4] ret_from_kernel_thread+0x5c/0x68
> [ 0.009916] Instruction dump:
> [ 0.009940] 7c0802a6 60000000 7c0802a6 38800002 f8010010 f821ffe1 3c62001c e863b9a0
> [ 0.009988] 4b1882d1 60000000 7c690034 5529d97e <0b090000> 3d22001c e929b998 3ce2ff8f
>
> Look for xive=off during prom_init and don't ask for XIVE in this case. One
> exception though: if the host only supports XIVE, we still want to boot so
> we ignore xive=off.
>
> Similarly, have the spapr XIVE init code to looking at the interrupt mode
> negociated during CAS, and ignore xive=off if the hypervisor only supports
> XIVE.
>
> Fixes: eac1e731b59e ("powerpc/xive: guest exploitation of the XIVE interrupt controller")
> Cc: stable@vger.kernel.org # v4.20
> Reported-by: Pavithra R. Prakash <pavrampu@in.ibm.com>
> Signed-off-by: Greg Kurz <groug@kaod.org>
> Reviewed-by: Cédric Le Goater <clg@kaod.org>
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/a3bf9fbdad600b1e4335dd90979f8d60
cheers
^ permalink raw reply
* Re: [PATCH] powerpc: Remove variable ‘path’ since not used
From: Michael Ellerman @ 2019-06-03 12:32 UTC (permalink / raw)
To: Mathieu Malaterre
Cc: Mathieu Malaterre, linuxppc-dev, Paul Mackerras, linux-kernel
In-Reply-To: <20190523102520.20585-1-malat@debian.org>
On Thu, 2019-05-23 at 10:25:20 UTC, Mathieu Malaterre wrote:
> In commit eab00a208eb6 ("powerpc: Move `path` variable inside
> DEBUG_PROM") DEBUG_PROM sentinels were added to silence a warning
> (treated as error with W=1):
>
> arch/powerpc/kernel/prom_init.c:1388:8: error: variable ‘path’ set but not used [-Werror=unused-but-set-variable]
>
> Rework the original patch and simplify the code, by removing the
> variable ‘path’ completely. Fix line over 90 characters.
>
> Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
> Signed-off-by: Mathieu Malaterre <malat@debian.org>
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/c806a6fde1c29e7419afcf94d761827a
cheers
^ permalink raw reply
* Re: [PATCH] powerpc/powernv: Show checkstop reason for NPU2 HMIs
From: Michael Ellerman @ 2019-06-03 12:32 UTC (permalink / raw)
To: Frederic Barrat, linuxppc-dev, ajd, arbab, aik; +Cc: clombard, groug
In-Reply-To: <20190523122804.4801-1-fbarrat@linux.ibm.com>
On Thu, 2019-05-23 at 12:28:04 UTC, Frederic Barrat wrote:
> If the kernel is notified of an HMI caused by the NPU2, it's currently
> not being recognized and it logs the default message:
>
> Unknown Malfunction Alert of type 3
>
> The NPU on Power 9 has 3 Fault Isolation Registers, so that's a lot of
> possible causes, but we should at least log that it's an NPU problem
> and report which FIR and which bit were raised if opal gave us the
> information.
>
> Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/89d87bcba2874d824affb7842bb3960c
cheers
^ permalink raw reply
* Re: [PATCH] powerpc/powernv: Update firmware archaeology around OPAL_HANDLE_HMI
From: Michael Ellerman @ 2019-06-03 12:32 UTC (permalink / raw)
To: Stewart Smith, linuxppc-dev; +Cc: Stewart Smith
In-Reply-To: <20190524050956.14114-1-stewart@linux.ibm.com>
On Fri, 2019-05-24 at 05:09:56 UTC, Stewart Smith wrote:
> The first machines to ship with OPAL firmware all got firmware updates
> that have the new call, but just in case someone is foolish enough to
> believe the first 4 months of firmware is the best, we keep this code
> around.
>
> Comment is updated to not refer to late 2014 as recent or the future.
>
> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/1549c42deff5f3ffff326ae295ae5816
cheers
^ permalink raw reply
* Re: [PATCH] dlpar: Fix a missing-check bug in dlpar_parse_cc_property()
From: Michael Ellerman @ 2019-06-03 12:32 UTC (permalink / raw)
To: Gen Zhang, benh, paulus; +Cc: nfont, linuxppc-dev, linux-kernel
In-Reply-To: <20190526024240.GA14546@zhanggen-UX430UQ>
On Sun, 2019-05-26 at 02:42:40 UTC, Gen Zhang wrote:
> In dlpar_parse_cc_property(), 'prop->name' is allocated by kstrdup().
> kstrdup() may return NULL, so it should be checked and handle error.
> And prop should be freed if 'prop->name' is NULL.
>
> Signed-off-by: Gen Zhang <blackgod016574@gmail.com>
> Acked-by: Nathan Lynch <nathanl@linux.ibm.com>
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/efa9ace68e487ddd29c2b4d6dd232421
cheers
^ permalink raw reply
* [PATCH] powerpc/32s: fix booting with CONFIG_PPC_EARLY_DEBUG_BOOTX
From: Christophe Leroy @ 2019-06-03 13:00 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
Mathieu Malaterre
Cc: linuxppc-dev, linux-kernel
When booting through OF, setup_disp_bat() does nothing because
disp_BAT are not set. By change, it used to work because BOOTX
buffer is mapped 1:1 at address 0x81000000 by the bootloader, and
btext_setup_display() sets virt addr same as phys addr.
But since commit 215b823707ce ("powerpc/32s: set up an early static
hash table for KASAN."), a temporary page table overrides the
bootloader mapping.
This 0x81000000 is also problematic with the newly implemented
Kernel Userspace Access Protection (KUAP) because it is within user
address space.
This patch fixes those issues by properly setting disp_BAT through
a call to btext_prepare_BAT(), allowing setup_disp_bat() to
properly setup BAT3 for early bootx screen buffer access.
Reported-by: Mathieu Malaterre <malat@debian.org>
Fixes: 215b823707ce ("powerpc/32s: set up an early static hash table for KASAN.")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
arch/powerpc/include/asm/btext.h | 4 ++++
arch/powerpc/kernel/prom_init.c | 1 +
arch/powerpc/kernel/prom_init_check.sh | 2 +-
3 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/include/asm/btext.h b/arch/powerpc/include/asm/btext.h
index 3ffad030393c..461b0f193864 100644
--- a/arch/powerpc/include/asm/btext.h
+++ b/arch/powerpc/include/asm/btext.h
@@ -13,7 +13,11 @@ extern void btext_update_display(unsigned long phys, int width, int height,
int depth, int pitch);
extern void btext_setup_display(int width, int height, int depth, int pitch,
unsigned long address);
+#ifdef CONFIG_PPC32
extern void btext_prepare_BAT(void);
+#else
+static inline void btext_prepare_BAT(void) { }
+#endif
extern void btext_map(void);
extern void btext_unmap(void);
diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
index 3555cad7bdde..ed446b7ea164 100644
--- a/arch/powerpc/kernel/prom_init.c
+++ b/arch/powerpc/kernel/prom_init.c
@@ -2336,6 +2336,7 @@ static void __init prom_check_displays(void)
prom_printf("W=%d H=%d LB=%d addr=0x%x\n",
width, height, pitch, addr);
btext_setup_display(width, height, 8, pitch, addr);
+ btext_prepare_BAT();
}
#endif /* CONFIG_PPC_EARLY_DEBUG_BOOTX */
}
diff --git a/arch/powerpc/kernel/prom_init_check.sh b/arch/powerpc/kernel/prom_init_check.sh
index 518d416971c1..160bef0d553d 100644
--- a/arch/powerpc/kernel/prom_init_check.sh
+++ b/arch/powerpc/kernel/prom_init_check.sh
@@ -24,7 +24,7 @@ fi
WHITELIST="add_reloc_offset __bss_start __bss_stop copy_and_flush
_end enter_prom $MEM_FUNCS reloc_offset __secondary_hold
__secondary_hold_acknowledge __secondary_hold_spinloop __start
-logo_linux_clut224
+logo_linux_clut224 btext_prepare_BAT
reloc_got2 kernstart_addr memstart_addr linux_banner _stext
__prom_init_toc_start __prom_init_toc_end btext_setup_display TOC."
--
2.13.3
^ permalink raw reply related
* Re: [PATCH 01/16] uaccess: add untagged_addr definition for other arches
From: Khalid Aziz @ 2019-06-03 15:16 UTC (permalink / raw)
To: Christoph Hellwig, Linus Torvalds, Paul Burton, James Hogan,
Yoshinori Sato, Rich Felker, David S. Miller, Andrey Konovalov
Cc: Catalin Marinas, linux-sh, x86, linux-mips, Nicholas Piggin,
linux-kernel, linux-mm, Paul Mackerras, sparclinux, linuxppc-dev
In-Reply-To: <20190601074959.14036-2-hch@lst.de>
On 6/1/19 1:49 AM, Christoph Hellwig wrote:
> From: Andrey Konovalov <andreyknvl@google.com>
>
> To allow arm64 syscalls to accept tagged pointers from userspace, we must
> untag them when they are passed to the kernel. Since untagging is done in
> generic parts of the kernel, the untagged_addr macro needs to be defined
> for all architectures.
>
> Define it as a noop for architectures other than arm64.
Could you reword above sentence? We are already starting off with
untagged_addr() not being no-op for arm64 and sparc64. It will expand
further potentially. So something more along the lines of "Define it as
noop for architectures that do not support memory tagging". The first
paragraph in the log can also be rewritten to be not specific to arm64.
--
Khalid
>
> Acked-by: Catalin Marinas <catalin.marinas@arm.com>
> Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
> include/linux/mm.h | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 0e8834ac32b7..949d43e9c0b6 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -99,6 +99,10 @@ extern int mmap_rnd_compat_bits __read_mostly;
> #include <asm/pgtable.h>
> #include <asm/processor.h>
>
> +#ifndef untagged_addr
> +#define untagged_addr(addr) (addr)
> +#endif
> +
> #ifndef __pa_symbol
> #define __pa_symbol(x) __pa(RELOC_HIDE((unsigned long)(x), 0))
> #endif
>
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox