linux-kernel.vger.kernel.org archive mirror
* [PATCH v3] mm: Give kmap_lock before call flush_tlb_kernel_rang,avoid kmap_high deadlock.
@ 2024-07-29 12:13 zhangchun
  2024-08-08 10:17 ` zhangchun
  0 siblings, 1 reply; 9+ messages in thread
From: zhangchun @ 2024-07-29 12:13 UTC (permalink / raw)
  To: akpm
  Cc: jiaoxupo, linux-kernel, linux-mm, shaohaojize, zhang.chuna,
	zhang.zhansheng, zhang.zhengming

 CPU 0:                                                 CPU 1:
 kmap_high(){                                           kmap_xxx() {
               ...                                        irq_disable();
        spin_lock(&kmap_lock)
               ...
        map_new_virtual                                     ...
           flush_all_zero_pkmaps
              flush_tlb_kernel_range         /* CPU0 holds the kmap_lock */
                      smp_call_function_many         spin_lock(&kmap_lock)
                      ...                                   ....
        spin_unlock(&kmap_lock)
               ...

CPU 0 holds kmap_lock while waiting for CPU 1 to respond to the IPI. But
CPU 1 has disabled irqs and is spinning on kmap_lock, so it cannot answer
the IPI. Fix this by releasing kmap_lock before calling
flush_tlb_kernel_range(), which avoids the kmap_lock deadlock:

        if (need_flush) {
            unlock_kmap();
            flush_tlb_kernel_range(PKMAP_ADDR(0), PKMAP_ADDR(LAST_PKMAP));
            lock_kmap();
        }
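
The wait-for dependency described above can be modelled in user space. This
is only a minimal sketch under stated assumptions, not kernel code: a pthread
mutex stands in for kmap_lock, and the names demo_lock,
responder_can_progress() and flush_would_complete() are hypothetical. The
"responder" models CPU 1, which can only get back to answering the IPI once
it has taken the lock; pthread_mutex_trylock() is used so the blocked case is
reported instead of hanging.

```c
#include <errno.h>
#include <pthread.h>

/* Stands in for kmap_lock (default, non-recursive mutex assumed). */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Models CPU 1: irqs are off, so it can only answer the IPI after it
 * has obtained the lock. Returns 1 if the lock was available (it can
 * make progress), 0 if it would keep spinning.
 */
static int responder_can_progress(void)
{
    int rc = pthread_mutex_trylock(&demo_lock);

    if (rc == EBUSY)
        return 0;
    if (rc == 0)
        pthread_mutex_unlock(&demo_lock);
    return rc == 0;
}

/*
 * Models CPU 0 issuing the flush. drop_lock == 0 is the unpatched
 * behaviour (lock held across the flush); drop_lock == 1 is the
 * unlock/flush/lock sequence from the patch.
 */
static int flush_would_complete(int drop_lock)
{
    int acked;

    pthread_mutex_lock(&demo_lock);
    if (drop_lock)
        pthread_mutex_unlock(&demo_lock);

    acked = responder_can_progress();   /* the "IPI wait" window */

    if (drop_lock)
        pthread_mutex_lock(&demo_lock);
    pthread_mutex_unlock(&demo_lock);
    return acked;
}
```

Calling flush_would_complete(0) reports that the responder cannot progress
while the lock is held, whereas flush_would_complete(1) reports that it can.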

Dropping the lock like this is safe. kmap_lock protects pkmap_count,
pkmap_page_table and last_pkmap_nr (a static variable). The call
flush_tlb_kernel_range(PKMAP_ADDR(0), PKMAP_ADDR(LAST_PKMAP)) neither
reads nor modifies these variables, so leaving that data unprotected
for the duration of the flush is safe.

map_new_virtual() looks for a usable entry pkmap_count[last_pkmap_nr].
kmap_lock is held whenever pkmap_count[last_pkmap_nr] is read or
modified. The check "if (!pkmap_count[last_pkmap_nr])" determines whether
the entry is usable. If it is not, the search tries again.

Furthermore, the value of the static variable last_pkmap_nr is copied
into a local variable of the same name while kmap_lock is held, so this
is thread-safe.

In an extreme case, Thread A and Thread B may work on the same
last_pkmap_nr. Thread A calls flush_tlb_kernel_range() and releases
kmap_lock; Thread B then acquires kmap_lock and modifies
pkmap_count[last_pkmap_nr]. After Thread A returns from
flush_tlb_kernel_range() and reacquires kmap_lock, it checks
pkmap_count[last_pkmap_nr] again before using the entry.
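
The re-check argument above can be illustrated with a small user-space
sketch. It is a simplified model under stated assumptions, not the kernel
implementation: slot_lock, slot_count[], NSLOTS, flush_slots(), claim_slot()
and racer_takes_slot0() are hypothetical names, and the racing "Thread B" is
modelled by a callback that runs in the window where the lock is dropped.

```c
#include <pthread.h>
#include <stddef.h>

#define NSLOTS 4    /* hypothetical, stands in for LAST_PKMAP */

static pthread_mutex_t slot_lock = PTHREAD_MUTEX_INITIALIZER; /* plays kmap_lock */
static int slot_count[NSLOTS];                                 /* plays pkmap_count[] */

/*
 * Models the patched flush_all_zero_pkmaps(): entered with slot_lock
 * held, drops it for the slow flush, retakes it before returning.
 * during_flush (may be NULL) models another thread racing in.
 */
static void flush_slots(void (*during_flush)(void))
{
    pthread_mutex_unlock(&slot_lock);
    if (during_flush)
        during_flush();
    pthread_mutex_lock(&slot_lock);
}

/*
 * Models map_new_virtual(): every read and write of slot_count[]
 * happens with slot_lock held, and the free-entry check runs after
 * the flush, so a slot taken during the flush is simply skipped.
 * Returns the claimed slot index, or -1 if none is free.
 */
static int claim_slot(void (*during_flush)(void))
{
    int claimed = -1;

    pthread_mutex_lock(&slot_lock);
    flush_slots(during_flush);
    for (int i = 0; i < NSLOTS; i++) {
        if (!slot_count[i]) {   /* re-checked under the lock */
            slot_count[i] = 1;
            claimed = i;
            break;
        }
    }
    pthread_mutex_unlock(&slot_lock);
    return claimed;
}

/* Models Thread B: grabs slot 0 while Thread A is flushing. */
static void racer_takes_slot0(void)
{
    pthread_mutex_lock(&slot_lock);
    slot_count[0] = 1;
    pthread_mutex_unlock(&slot_lock);
}
```

With an empty table, claim_slot(racer_takes_slot0) skips slot 0, which the
racer took during the flush, and claims slot 1 instead; no entry is handed
out twice.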

static inline unsigned long map_new_virtual(struct page *page)
{
        unsigned long vaddr;
        int count;
        unsigned int last_pkmap_nr; // local copy of the static variable last_pkmap_nr
        unsigned int color = get_pkmap_color(page);

start:
      ...
                        flush_all_zero_pkmaps(); // releases kmap_lock, then reacquires it
                        count = get_pkmap_entries_count(color);
                }
                ...
                if (!pkmap_count[last_pkmap_nr]) // is pkmap_count[last_pkmap_nr] in use or not?
                        break;  /* Found a usable entry */
                if (--count)
                        continue;

               ...
        vaddr = PKMAP_ADDR(last_pkmap_nr);
        set_pte_at(&init_mm, vaddr,
                   &(pkmap_page_table[last_pkmap_nr]), mk_pte(page, kmap_prot));

        pkmap_count[last_pkmap_nr] = 1;
        ...
        return vaddr;
}

Fixes: 3297e760776a ("highmem: atomic highmem kmap page pinning")
Signed-off-by: zhangchun <zhang.chuna@h3c.com>
Co-developed-by: zhangzhansheng <zhang.zhansheng@h3c.com>
Signed-off-by: zhangzhansheng <zhang.zhansheng@h3c.com>
Suggested-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: zhangzhengming <zhang.zhengming@h3c.com>
---
 mm/highmem.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/mm/highmem.c b/mm/highmem.c
index ef3189b..07f2c67 100644
--- a/mm/highmem.c
+++ b/mm/highmem.c
@@ -231,8 +231,18 @@ static void flush_all_zero_pkmaps(void)
 		set_page_address(page, NULL);
 		need_flush = 1;
 	}
-	if (need_flush)
+	if (need_flush) {
+		/*
+		 * On a multi-core system, one CPU may hold kmap_lock while
+		 * waiting for other CPUs to respond to the IPI. If another
+		 * CPU has disabled irqs and is waiting for kmap_lock, it
+		 * cannot answer the IPI. Release kmap_lock before calling
+		 * flush_tlb_kernel_range() to avoid this deadlock.
+		 */
+		unlock_kmap();
 		flush_tlb_kernel_range(PKMAP_ADDR(0), PKMAP_ADDR(LAST_PKMAP));
+		lock_kmap();
+	}
 }
 
 void __kmap_flush_unused(void)
-- 
1.8.3.1



* [PATCH v3] mm: Give kmap_lock before call flush_tlb_kernel_rang,avoid kmap_high deadlock.
  2024-07-29 12:13 zhangchun
@ 2024-08-08 10:17 ` zhangchun
  0 siblings, 0 replies; 9+ messages in thread
From: zhangchun @ 2024-08-08 10:17 UTC (permalink / raw)
  To: zhang.chuna
  Cc: akpm, jiaoxupo, linux-kernel, linux-mm, shaohaojize,
	zhang.zhansheng, zhang.zhengming

Very sorry to disturb! Just a friendly ping!
-- 
1.8.3.1



* [PATCH v3] mm: Give kmap_lock before call flush_tlb_kernel_rang,avoid kmap_high deadlock.
  2024-07-10 17:36 [PATCH v2] " Andrew Morton
@ 2024-08-08 10:32 ` zhangchun
  0 siblings, 0 replies; 9+ messages in thread
From: zhangchun @ 2024-08-08 10:32 UTC (permalink / raw)
  To: akpm
  Cc: jiaoxupo, linux-kernel, linux-mm, shaohaojize, zhang.chuna,
	zhang.zhansheng, zhang.zhengming

Very sorry to disturb! Just a friendly ping!
-- 
1.8.3.1



* [PATCH v3] mm: Give kmap_lock before call flush_tlb_kernel_rang,avoid kmap_high deadlock.
  2024-07-24  0:26 [PATCH v2] " Andrew Morton
@ 2024-08-19 16:10 ` zhangchun
  2024-09-03 11:52   ` zhangchun
                     ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: zhangchun @ 2024-08-19 16:10 UTC (permalink / raw)
  To: akpm
  Cc: jiaoxupo, linux-kernel, linux-mm, shaohaojize, zhang.chuna,
	zhang.zhansheng, zhang.zhengming

 CPU 0:                                                 CPU 1:
 kmap_high(){                                           kmap_xxx() {
               ...                                        irq_disable();
        spin_lock(&kmap_lock)
               ...
        map_new_virtual                                     ...
           flush_all_zero_pkmaps
              flush_tlb_kernel_range         /* CPU0 holds the kmap_lock */
                      smp_call_function_many         spin_lock(&kmap_lock)
                      ...                                   ....
        spin_unlock(&kmap_lock)
               ...

CPU 0 holds kmap_lock while waiting for CPU 1 to respond to the IPI. But
CPU 1 has disabled irqs and is spinning on kmap_lock, so it cannot answer
the IPI. Fix this by releasing kmap_lock before calling
flush_tlb_kernel_range(), which avoids the kmap_lock deadlock, like this:

        if (need_flush) {
            unlock_kmap();
            flush_tlb_kernel_range(PKMAP_ADDR(0), PKMAP_ADDR(LAST_PKMAP));
            lock_kmap();
        }

Dropping the lock is safe. kmap_lock protects pkmap_count,
pkmap_page_table and last_pkmap_nr (a static variable). The call
flush_tlb_kernel_range(PKMAP_ADDR(0), PKMAP_ADDR(LAST_PKMAP)) neither
reads nor modifies these variables, so leaving that data unprotected
for the duration of the flush is safe.

map_new_virtual() looks for a usable entry pkmap_count[last_pkmap_nr].
kmap_lock is held whenever pkmap_count[last_pkmap_nr] is read or
modified. The check "if (!pkmap_count[last_pkmap_nr])" determines whether
the entry is usable. If it is not, the search tries again.

Furthermore, the value of the static variable last_pkmap_nr is copied
into a local variable of the same name while kmap_lock is held, so this
is thread-safe.

In an extreme case, Thread A and Thread B may work on the same
last_pkmap_nr. Thread A calls flush_tlb_kernel_range() and releases
kmap_lock; Thread B then acquires kmap_lock and modifies
pkmap_count[last_pkmap_nr]. After Thread A returns from
flush_tlb_kernel_range() and reacquires kmap_lock, it checks
pkmap_count[last_pkmap_nr] again before using the entry.

static inline unsigned long map_new_virtual(struct page *page)
{
        unsigned long vaddr;
        int count;
        unsigned int last_pkmap_nr; // local copy of the static variable last_pkmap_nr
        unsigned int color = get_pkmap_color(page);

start:
        ...
                        flush_all_zero_pkmaps(); // releases kmap_lock, then reacquires it
                        count = get_pkmap_entries_count(color);
                }
                ...
                if (!pkmap_count[last_pkmap_nr]) // is pkmap_count[last_pkmap_nr] in use or not?
                        break;  /* Found a usable entry */
                if (--count)
                        continue;

               ...
        vaddr = PKMAP_ADDR(last_pkmap_nr);
        set_pte_at(&init_mm, vaddr,
                   &(pkmap_page_table[last_pkmap_nr]), mk_pte(page, kmap_prot));

        pkmap_count[last_pkmap_nr] = 1;
        ...
        return vaddr;
}

Fixes: 3297e760776a ("highmem: atomic highmem kmap page pinning")
Signed-off-by: zhangchun <zhang.chuna@h3c.com>
Co-developed-by: zhangzhansheng <zhang.zhansheng@h3c.com>
Signed-off-by: zhangzhansheng <zhang.zhansheng@h3c.com>
Suggested-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: zhangzhengming <zhang.zhengming@h3c.com>
---
 mm/highmem.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/mm/highmem.c b/mm/highmem.c
index ef3189b..07f2c67 100644
--- a/mm/highmem.c
+++ b/mm/highmem.c
@@ -231,8 +231,18 @@ static void flush_all_zero_pkmaps(void)
 		set_page_address(page, NULL);
 		need_flush = 1;
 	}
-	if (need_flush)
+	if (need_flush) {
+		/*
+		 * On a multi-core system, one CPU may hold kmap_lock while
+		 * waiting for other CPUs to respond to the IPI. If another
+		 * CPU has disabled irqs and is waiting for kmap_lock, it
+		 * cannot answer the IPI. Release kmap_lock before calling
+		 * flush_tlb_kernel_range() to avoid this deadlock.
+		 */
+		unlock_kmap();
 		flush_tlb_kernel_range(PKMAP_ADDR(0), PKMAP_ADDR(LAST_PKMAP));
+		lock_kmap();
+	}
 }
 
 void __kmap_flush_unused(void)
--
1.8.3.1



* [PATCH v3] mm: Give kmap_lock before call flush_tlb_kernel_rang,avoid kmap_high deadlock.
  2024-08-19 16:10 ` [PATCH v3] " zhangchun
@ 2024-09-03 11:52   ` zhangchun
  2024-10-08  3:23   ` zhangchun
  2024-10-14  7:41   ` zhangchun
  2 siblings, 0 replies; 9+ messages in thread
From: zhangchun @ 2024-09-03 11:52 UTC (permalink / raw)
  To: akpm
  Cc: jiaoxupo, linux-kernel, linux-mm, shaohaojize, zhang.zhansheng,
	zhang.zhengming, zhangchun

Very sorry to disturb! Just a friendly ping!
-- 
1.8.3.1



* [PATCH v3] mm: Give kmap_lock before call flush_tlb_kernel_rang,avoid kmap_high deadlock.
       [not found] <1724083806-21956-1-git-send-email-akpm@linux-foundation.org>
@ 2024-10-08  3:19 ` zhangchun
  2024-10-08  3:20 ` zhangchun
  1 sibling, 0 replies; 9+ messages in thread
From: zhangchun @ 2024-10-08  3:19 UTC (permalink / raw)
  To: zhang.chuna
  Cc: akpm, jiaoxupo, linux-kernel, linux-mm, shaohaojize,
	zhang.zhansheng, zhang.zhengming

Very sorry to disturb! Just a friendly ping! This deadlock bug needs to be fixed!
If any additional information is needed, please contact me. I look forward to your reply!

-- 
1.8.3.1



* [PATCH v3] mm: Give kmap_lock before call flush_tlb_kernel_rang,avoid kmap_high deadlock.
       [not found] <1724083806-21956-1-git-send-email-akpm@linux-foundation.org>
  2024-10-08  3:19 ` [PATCH v3] mm: Give kmap_lock before call flush_tlb_kernel_rang,avoid kmap_high deadlock zhangchun
@ 2024-10-08  3:20 ` zhangchun
  1 sibling, 0 replies; 9+ messages in thread
From: zhangchun @ 2024-10-08  3:20 UTC (permalink / raw)
  To: zhang.chuna
  Cc: akpm, jiaoxupo, linux-kernel, linux-mm, shaohaojize,
	zhang.zhansheng, zhang.zhengming

Very sorry to disturb! Just a friendly ping! This deadlock bug needs to be fixed!
If any additional information is needed, please contact me. I look forward to your reply!

-- 
1.8.3.1



* [PATCH v3] mm: Give kmap_lock before call flush_tlb_kernel_rang,avoid kmap_high deadlock.
  2024-08-19 16:10 ` [PATCH v3] " zhangchun
  2024-09-03 11:52   ` zhangchun
@ 2024-10-08  3:23   ` zhangchun
  2024-10-14  7:41   ` zhangchun
  2 siblings, 0 replies; 9+ messages in thread
From: zhangchun @ 2024-10-08  3:23 UTC (permalink / raw)
  To: akpm
  Cc: jiaoxupo, linux-kernel, linux-mm, shaohaojize, zhang.zhansheng,
	zhang.zhengming, zhangchun

Very sorry to disturb! Just a friendly ping! This deadlock bug needs to be fixed!
If any additional information is needed, please contact me. I look forward to your reply!

-- 
1.8.3.1



* [PATCH v3] mm: Give kmap_lock before call flush_tlb_kernel_rang,avoid kmap_high deadlock.
  2024-08-19 16:10 ` [PATCH v3] " zhangchun
  2024-09-03 11:52   ` zhangchun
  2024-10-08  3:23   ` zhangchun
@ 2024-10-14  7:41   ` zhangchun
  2 siblings, 0 replies; 9+ messages in thread
From: zhangchun @ 2024-10-14  7:41 UTC (permalink / raw)
  To: akpm
  Cc: jiaoxupo, linux-kernel, linux-mm, shaohaojize, zhang.zhansheng,
	zhang.zhengming, zhangchun

Very sorry to disturb! Just a friendly ping! This deadlock bug needs to be fixed!
If any additional information is needed, please contact me. I look forward to your reply!

-- 
1.8.3.1


