public inbox for linux-parisc@vger.kernel.org
 help / color / mirror / Atom feed
From: Helge Deller <deller@gmx.de>
To: John David Anglin <dave.anglin@bell.net>
Cc: Guenter Roeck <linux@roeck-us.net>, Helge Deller <deller@gmx.de>,
	linux-parisc@vger.kernel.org,
	Matthew Wilcox <willy@infradead.org>
Subject: Re: [PATCH] parisc: Optimize per-pagetable spinlocks (v11)
Date: Fri, 12 Feb 2021 00:14:12 +0100	[thread overview]
Message-ID: <YCW6RF/JiD/ezW3q@ls3530> (raw)
In-Reply-To: <06b31cea-61ce-a3a2-8abe-48aa5ab9dafb@bell.net>

* John David Anglin <dave.anglin@bell.net>:
> On 2021-02-11 4:51 p.m., Guenter Roeck wrote:
> > On Thu, Feb 11, 2021 at 09:38:25AM -0500, John David Anglin wrote:
> >> On 2021-02-10 8:20 p.m., Guenter Roeck wrote:
> >>> On Wed, Feb 10, 2021 at 01:57:42PM -0500, John David Anglin wrote:
> >>>> On 2021-02-10 12:23 p.m., Helge Deller wrote:
> >>>>> On 2/10/21 3:52 PM, Guenter Roeck wrote:
> >>>>>> On Wed, Jan 27, 2021 at 10:18:51PM +0100, Helge Deller wrote:
> >>>>>>> On parisc a spinlock is stored in the next page behind the pgd which
> >>>>>>> protects against parallel accesses to the pgd. That's why one additional
> >>>>>>> page (PGD_ALLOC_ORDER) is allocated for the pgd.
> >>>>>>>
> >>>>>>> Matthew Wilcox suggested that we instead should use a pointer in the
> >>>>>>> struct page table for this spinlock and noted, that the comments for the
> >>>>>>> PGD_ORDER and PMD_ORDER defines were wrong.
> >>>>>>>
> >>>>>>> Both suggestions are addressed in this patch. The pgd spinlock
> >>>>>>> (parisc_pgd_lock) is stored in the struct page table. In
> >>>>>>> switch_mm_irqs_off() the physical address of this lock is loaded into
> >>>>>>> cr28 (tr4) and the pgd into cr25, so that the fault handlers can
> >>>>>>> directly access the lock.
> >>>>>>>
> >>>>>>> The currently implemened Hybrid L2/L3 page table scheme (where the pmd
> >>>>>>> is adjacent to the pgd) is dropped now too.
> >>>>>>>
> >>>>>>> Suggested-by: Matthew Wilcox <willy@infradead.org>
> >>>>>>> Fixes: b37d1c1898b2 ("parisc: Use per-pagetable spinlock")
> >>>>>>> Signed-off-by: Helge Deller <deller@gmx.de>
> >>>>>>> Signed-off-by: John David Anglin <dave.anglin@bell.net>
> >>>>>> This patch results in:
> >>>>>>
> >>>>>> BUG: spinlock recursion on CPU#0, swapper/0/1
> >>>>>>   lock: 0x12226d14, .magic: dead4ead, .owner: swapper/0/1, .owner_cpu: 0
> >>>>>> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.11.0-rc7-next-20210209-32bit #1
> >>>>>> Hardware name: 9000/778/B160L
> >>>>>> Backtrace:
> >>>>>>   [<1019f9bc>] show_stack+0x34/0x48
> >>>>>>   [<10a65278>] dump_stack+0x94/0x114
> >>>>>>   [<10219f4c>] spin_dump+0x8c/0xb8
> >>>>>>   [<1021a0b4>] do_raw_spin_lock+0xdc/0x108
> >>>>>>   [<10a7367c>] _raw_spin_lock_irqsave+0x30/0x48
> >>>>>>   [<102bf41c>] handle_mm_fault+0x5e8/0xdb0
> >>>>>>   [<102b813c>] __get_user_pages.part.0+0x1b0/0x3d4
> >>>>>>   [<102b8900>] __get_user_pages_remote+0x134/0x34c
> >>>>>>   [<102b8b80>] get_user_pages_remote+0x68/0x90
> >>>>>>   [<102fccb0>] get_arg_page+0x94/0xd8
> >>>>>>   [<102fdd84>] copy_string_kernel+0xc4/0x234
> >>>>>>   [<102fe70c>] kernel_execve+0xcc/0x1a4
> >>>>>>   [<10a58d94>] run_init_process+0xbc/0xe0
> >>>>>>   [<10a70d50>] kernel_init+0x98/0x13c
> >>>>>>   [<1019a01c>] ret_from_kernel_thread+0x1c/0x24
> >>>>>>
> >>>>>> when trying to boot parisc/hppa images in qemu. Reverting this patch fixes
> >>>>>> the problem.
> >>>>> True, I can reproduce the problem.
> >>>>> With CONFIG_DEBUG_SPINLOCK=y you get the backtrace above.
> >>>>> With  CONFIG_DEBUG_SPINLOCK=n it just hangs.
> >>>>> Happenes with 32-bit kernel with SMP kernel, even if only one virtual CPU is started.
> >>>> Which is quite puzzling since most spin locks are optimized in this case case.  Strikes me we
> >>>> have a lock that's not initialized.
> >>>>
> >>>> It's not obvious how this relates to the patch.  get_arg_page() calls get_user_pages_remote() with
> >>> The fact that reverting it fixes the problem is a good indication
> >>> that the problem does relate to this patch.
> >>>
> >>> As for how - no idea. That is not my area of expertise.
> >> I built Helge's for-next tree both with CONFIG_DEBUG_SPINLOCK=y and CONFIG_DEBUG_SPINLOCK=n.  Both
> >> builds work fine on c8000.
> >>
> >> The only thing that might have changed in the patch is the alignment of the lock used for page table updates.
> >> Qemu only support PA 1.x instructions.  The ldcw instruction needs 16-byte alignment on real hardware and
> >> there is code to dynamically align lock pointers to 16-byte alignment.  The c8000 supports PA 2.0 instructions
> >> and the ldcw,co instruction only needs 4-byte alignment.  Perhaps there is an issue with the dynamic alignment
> >> of the lock pointer or the lock initialization in the PA 1.x build for qemu.
> >>
> > The first lock is acquired in mm/memory.c from line 3565:
> >
> >         vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address,
> >                         &vmf->ptl);
> >
> > The second (recursive) lock is acquired from line 3587 in the same
> > function:
> >
> >         set_pte_at(vma->vm_mm, vmf->address, vmf->pte, entry);
> >
> > Source code lines are from next-20210211. I confirmed with debug code
> > that the lock address passed to do_raw_spin_lock() is the same in both
> > calls.
> Thanks Guenter.  I assume this is with v15 of the patch?

Yes, happens with latest version too.

> It looks as if we might not need the ptl lock in set_pte_at() and probably elsewhere.

Just removing the locks from set_pte_at() didn't solved it.
After removing the others too, it works.
Patch is attached below.
I added a memory-barrier to set_pte() too.

> But I'm still puzzled as to why this doesn't happen when different locks are used as in your
> report with the earlier patch. 

I think it happened earlier too. Would need to test though.

Helge

diff --git a/arch/parisc/include/asm/pgtable.h b/arch/parisc/include/asm/pgtable.h
index 2e1873cd4877..2c68010d896a 100644
--- a/arch/parisc/include/asm/pgtable.h
+++ b/arch/parisc/include/asm/pgtable.h
@@ -82,17 +82,14 @@ static inline void purge_tlb_entries(struct mm_struct *mm, unsigned long addr)
  */
 #define set_pte(pteptr, pteval)                                 \
         do{                                                     \
-                *(pteptr) = (pteval);                           \
+		*(pteptr) = (pteval);				\
+		mb();						\
         } while(0)

 #define set_pte_at(mm, addr, ptep, pteval)			\
 	do {							\
-		unsigned long flags;				\
-		spinlock_t *pgd_lock = &(mm)->page_table_lock;	\
-		spin_lock_irqsave(pgd_lock, flags);		\
 		set_pte(ptep, pteval);				\
 		purge_tlb_entries(mm, addr);			\
-		spin_unlock_irqrestore(pgd_lock, flags);	\
 	} while (0)

 #endif /* !__ASSEMBLY__ */
@@ -431,22 +428,15 @@ extern void update_mmu_cache(struct vm_area_struct *, unsigned long, pte_t *);
 static inline int ptep_test_and_clear_young(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep)
 {
 	pte_t pte;
-	unsigned long flags;
-	spinlock_t *pgd_lock;

 	if (!pte_young(*ptep))
 		return 0;

-	pgd_lock = &vma->vm_mm->page_table_lock;
-	spin_lock_irqsave(pgd_lock, flags);
 	pte = *ptep;
 	if (!pte_young(pte)) {
-		spin_unlock_irqrestore(pgd_lock, flags);
 		return 0;
 	}
-	set_pte(ptep, pte_mkold(pte));
-	purge_tlb_entries(vma->vm_mm, addr);
-	spin_unlock_irqrestore(pgd_lock, flags);
+	set_pte_at(vma->vm_mm, addr, ptep, pte_mkold(pte));
 	return 1;
 }

@@ -454,29 +444,16 @@ struct mm_struct;
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
 {
 	pte_t old_pte;
-	unsigned long flags;
-	spinlock_t *pgd_lock;

-	pgd_lock = &mm->page_table_lock;
-	spin_lock_irqsave(pgd_lock, flags);
 	old_pte = *ptep;
-	set_pte(ptep, __pte(0));
-	purge_tlb_entries(mm, addr);
-	spin_unlock_irqrestore(pgd_lock, flags);
+	set_pte_at(mm, addr, ptep, __pte(0));

 	return old_pte;
 }

 static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
 {
-	unsigned long flags;
-	spinlock_t *pgd_lock;
-
-	pgd_lock = &mm->page_table_lock;
-	spin_lock_irqsave(pgd_lock, flags);
-	set_pte(ptep, pte_wrprotect(*ptep));
-	purge_tlb_entries(mm, addr);
-	spin_unlock_irqrestore(pgd_lock, flags);
+	set_pte_at(mm, addr, ptep, pte_wrprotect(*ptep));
 }

 #define pte_same(A,B)	(pte_val(A) == pte_val(B))

  parent reply	other threads:[~2021-02-11 23:16 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <c696f95d-a5ba-1f73-fbe9-a5d3f25e79c0@bell.net>
     [not found] ` <2786b971-254a-ae07-ea24-38e57bd29892@bell.net>
     [not found]   ` <d4751664-6627-920e-9c44-7f9b7e287431@bell.net>
     [not found]     ` <6fb36e0e-62f5-68c7-92ec-c6dd16841813@bell.net>
     [not found]       ` <44ee7e09-90e7-0766-f0e4-bde2d3cdc2ec@bell.net>
     [not found]         ` <5100eb80-975f-d77d-846a-5aabc25d0f95@bell.net>
     [not found]           ` <e023991b-ba2e-f6da-94fb-0988ad70e717@bell.net>
     [not found]             ` <9b9c6446-365f-9ca6-b89f-c330fca11952@bell.net>
     [not found]               ` <94210da5-5642-82ef-85ae-688e1c07473d@gmx.de>
     [not found]                 ` <4f76001d-f050-286f-4b6f-790554583eea@bell.net>
2021-01-27 21:18                   ` [PATCH] parisc: Optimize per-pagetable spinlocks (v11) Helge Deller
2021-01-28  8:36                     ` Rolf Eike Beer
2021-01-28 15:24                       ` [PATCH] parisc: Optimize per-pagetable spinlocks (v12) Helge Deller
2021-02-10 14:52                     ` [PATCH] parisc: Optimize per-pagetable spinlocks (v11) Guenter Roeck
2021-02-10 17:23                       ` Helge Deller
2021-02-10 18:57                         ` John David Anglin
2021-02-11  1:20                           ` Guenter Roeck
2021-02-11 14:38                             ` John David Anglin
2021-02-11 21:51                               ` Guenter Roeck
2021-02-11 22:16                                 ` John David Anglin
2021-02-11 23:12                                   ` Guenter Roeck
2021-02-11 23:14                                   ` Helge Deller [this message]
2021-02-11 23:22                                     ` Helge Deller
2021-02-11 23:23                                     ` John David Anglin
2021-02-11 23:34                                     ` John David Anglin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YCW6RF/JiD/ezW3q@ls3530 \
    --to=deller@gmx.de \
    --cc=dave.anglin@bell.net \
    --cc=linux-parisc@vger.kernel.org \
    --cc=linux@roeck-us.net \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox