Building the Linux kernel with Clang and LLVM
 help / color / mirror / Atom feed
* Re: [PATCH V2 7/7] mm: Use pgdp_get() for accessing PGD entries
       [not found] <20240917073117.1531207-8-anshuman.khandual@arm.com>
@ 2024-09-18 20:30 ` kernel test robot
  2024-09-19  7:55   ` Anshuman Khandual
  0 siblings, 1 reply; 10+ messages in thread
From: kernel test robot @ 2024-09-18 20:30 UTC (permalink / raw)
  To: Anshuman Khandual, linux-mm
  Cc: llvm, oe-kbuild-all, Anshuman Khandual, Andrew Morton,
	Linux Memory Management List, David Hildenbrand, Ryan Roberts,
	Mike Rapoport (IBM), Arnd Bergmann, x86, linux-m68k,
	linux-fsdevel, kasan-dev, linux-kernel, linux-perf-users,
	Dimitri Sivanich, Alexander Viro, Muchun Song, Andrey Ryabinin,
	Miaohe Lin, Dennis Zhou, Tejun Heo, Christoph Lameter,
	Uladzislau Rezki, Christoph Hellwig

Hi Anshuman,

kernel test robot noticed the following build errors:

[auto build test ERROR on char-misc/char-misc-testing]
[also build test ERROR on char-misc/char-misc-next char-misc/char-misc-linus brauner-vfs/vfs.all dennis-percpu/for-next linus/master v6.11]
[cannot apply to akpm-mm/mm-everything next-20240918]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Anshuman-Khandual/m68k-mm-Change-pmd_val/20240917-153331
base:   char-misc/char-misc-testing
patch link:    https://lore.kernel.org/r/20240917073117.1531207-8-anshuman.khandual%40arm.com
patch subject: [PATCH V2 7/7] mm: Use pgdp_get() for accessing PGD entries
config: arm-footbridge_defconfig (https://download.01.org/0day-ci/archive/20240919/202409190310.ViHBRe12-lkp@intel.com/config)
compiler: clang version 20.0.0git (https://github.com/llvm/llvm-project 8663a75fa2f31299ab8d1d90288d9df92aadee88)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240919/202409190310.ViHBRe12-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202409190310.ViHBRe12-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from arch/arm/kernel/asm-offsets.c:12:
   In file included from include/linux/mm.h:30:
>> include/linux/pgtable.h:1245:18: error: use of undeclared identifier 'pgdp'; did you mean 'pgd'?
    1245 |         pgd_t old_pgd = pgdp_get(pgd);
         |                         ^
   arch/arm/include/asm/pgtable.h:154:36: note: expanded from macro 'pgdp_get'
     154 | #define pgdp_get(pgpd)          READ_ONCE(*pgdp)
         |                                            ^
   include/linux/pgtable.h:1243:48: note: 'pgd' declared here
    1243 | static inline int pgd_none_or_clear_bad(pgd_t *pgd)
         |                                                ^
>> include/linux/pgtable.h:1245:18: error: use of undeclared identifier 'pgdp'; did you mean 'pgd'?
    1245 |         pgd_t old_pgd = pgdp_get(pgd);
         |                         ^
   arch/arm/include/asm/pgtable.h:154:36: note: expanded from macro 'pgdp_get'
     154 | #define pgdp_get(pgpd)          READ_ONCE(*pgdp)
         |                                            ^
   include/linux/pgtable.h:1243:48: note: 'pgd' declared here
    1243 | static inline int pgd_none_or_clear_bad(pgd_t *pgd)
         |                                                ^
>> include/linux/pgtable.h:1245:18: error: use of undeclared identifier 'pgdp'; did you mean 'pgd'?
    1245 |         pgd_t old_pgd = pgdp_get(pgd);
         |                         ^
   arch/arm/include/asm/pgtable.h:154:36: note: expanded from macro 'pgdp_get'
     154 | #define pgdp_get(pgpd)          READ_ONCE(*pgdp)
         |                                            ^
   include/linux/pgtable.h:1243:48: note: 'pgd' declared here
    1243 | static inline int pgd_none_or_clear_bad(pgd_t *pgd)
         |                                                ^
>> include/linux/pgtable.h:1245:18: error: use of undeclared identifier 'pgdp'; did you mean 'pgd'?
    1245 |         pgd_t old_pgd = pgdp_get(pgd);
         |                         ^
   arch/arm/include/asm/pgtable.h:154:36: note: expanded from macro 'pgdp_get'
     154 | #define pgdp_get(pgpd)          READ_ONCE(*pgdp)
         |                                            ^
   include/linux/pgtable.h:1243:48: note: 'pgd' declared here
    1243 | static inline int pgd_none_or_clear_bad(pgd_t *pgd)
         |                                                ^
>> include/linux/pgtable.h:1245:18: error: use of undeclared identifier 'pgdp'; did you mean 'pgd'?
    1245 |         pgd_t old_pgd = pgdp_get(pgd);
         |                         ^
   arch/arm/include/asm/pgtable.h:154:36: note: expanded from macro 'pgdp_get'
     154 | #define pgdp_get(pgpd)          READ_ONCE(*pgdp)
         |                                            ^
   include/linux/pgtable.h:1243:48: note: 'pgd' declared here
    1243 | static inline int pgd_none_or_clear_bad(pgd_t *pgd)
         |                                                ^
>> include/linux/pgtable.h:1245:18: error: use of undeclared identifier 'pgdp'; did you mean 'pgd'?
    1245 |         pgd_t old_pgd = pgdp_get(pgd);
         |                         ^
   arch/arm/include/asm/pgtable.h:154:36: note: expanded from macro 'pgdp_get'
     154 | #define pgdp_get(pgpd)          READ_ONCE(*pgdp)
         |                                            ^
   include/linux/pgtable.h:1243:48: note: 'pgd' declared here
    1243 | static inline int pgd_none_or_clear_bad(pgd_t *pgd)
         |                                                ^
>> include/linux/pgtable.h:1245:18: error: use of undeclared identifier 'pgdp'; did you mean 'pgd'?
    1245 |         pgd_t old_pgd = pgdp_get(pgd);
         |                         ^
   arch/arm/include/asm/pgtable.h:154:36: note: expanded from macro 'pgdp_get'
     154 | #define pgdp_get(pgpd)          READ_ONCE(*pgdp)
         |                                            ^
   include/linux/pgtable.h:1243:48: note: 'pgd' declared here
    1243 | static inline int pgd_none_or_clear_bad(pgd_t *pgd)
         |                                                ^
>> include/linux/pgtable.h:1245:18: error: use of undeclared identifier 'pgdp'; did you mean 'pgd'?
    1245 |         pgd_t old_pgd = pgdp_get(pgd);
         |                         ^
   arch/arm/include/asm/pgtable.h:154:36: note: expanded from macro 'pgdp_get'
     154 | #define pgdp_get(pgpd)          READ_ONCE(*pgdp)
         |                                            ^
   include/linux/pgtable.h:1243:48: note: 'pgd' declared here
    1243 | static inline int pgd_none_or_clear_bad(pgd_t *pgd)
         |                                                ^
>> include/linux/pgtable.h:1245:8: error: array initializer must be an initializer list or wide string literal
    1245 |         pgd_t old_pgd = pgdp_get(pgd);
         |               ^
   In file included from arch/arm/kernel/asm-offsets.c:12:
   In file included from include/linux/mm.h:1131:
   In file included from include/linux/huge_mm.h:8:
   In file included from include/linux/fs.h:33:
   In file included from include/linux/percpu-rwsem.h:7:
   In file included from include/linux/rcuwait.h:6:
   In file included from include/linux/sched/signal.h:6:
   include/linux/signal.h:98:11: warning: array index 3 is past the end of the array (that has type 'unsigned long[2]') [-Warray-bounds]
      98 |                 return (set->sig[3] | set->sig[2] |
         |                         ^        ~
   arch/arm/include/asm/signal.h:17:2: note: array 'sig' declared here
      17 |         unsigned long sig[_NSIG_WORDS];
         |         ^
   In file included from arch/arm/kernel/asm-offsets.c:12:
   In file included from include/linux/mm.h:1131:
   In file included from include/linux/huge_mm.h:8:
   In file included from include/linux/fs.h:33:
   In file included from include/linux/percpu-rwsem.h:7:
   In file included from include/linux/rcuwait.h:6:
   In file included from include/linux/sched/signal.h:6:
   include/linux/signal.h:98:25: warning: array index 2 is past the end of the array (that has type 'unsigned long[2]') [-Warray-bounds]
      98 |                 return (set->sig[3] | set->sig[2] |
         |                                       ^        ~
   arch/arm/include/asm/signal.h:17:2: note: array 'sig' declared here
      17 |         unsigned long sig[_NSIG_WORDS];
         |         ^
   In file included from arch/arm/kernel/asm-offsets.c:12:
   In file included from include/linux/mm.h:1131:
   In file included from include/linux/huge_mm.h:8:
   In file included from include/linux/fs.h:33:
   In file included from include/linux/percpu-rwsem.h:7:
   In file included from include/linux/rcuwait.h:6:
   In file included from include/linux/sched/signal.h:6:
   include/linux/signal.h:114:11: warning: array index 3 is past the end of the array (that has type 'const unsigned long[2]') [-Warray-bounds]
     114 |                 return  (set1->sig[3] == set2->sig[3]) &&
         |                          ^         ~
   arch/arm/include/asm/signal.h:17:2: note: array 'sig' declared here
      17 |         unsigned long sig[_NSIG_WORDS];
         |         ^
   In file included from arch/arm/kernel/asm-offsets.c:12:
   In file included from include/linux/mm.h:1131:
   In file included from include/linux/huge_mm.h:8:
   In file included from include/linux/fs.h:33:
   In file included from include/linux/percpu-rwsem.h:7:
   In file included from include/linux/rcuwait.h:6:
   In file included from include/linux/sched/signal.h:6:
   include/linux/signal.h:114:27: warning: array index 3 is past the end of the array (that has type 'const unsigned long[2]') [-Warray-bounds]
     114 |                 return  (set1->sig[3] == set2->sig[3]) &&
         |                                          ^         ~
   arch/arm/include/asm/signal.h:17:2: note: array 'sig' declared here
      17 |         unsigned long sig[_NSIG_WORDS];
         |         ^
   In file included from arch/arm/kernel/asm-offsets.c:12:
   In file included from include/linux/mm.h:1131:
   In file included from include/linux/huge_mm.h:8:
   In file included from include/linux/fs.h:33:
   In file included from include/linux/percpu-rwsem.h:7:
   In file included from include/linux/rcuwait.h:6:
   In file included from include/linux/sched/signal.h:6:
   include/linux/signal.h:115:5: warning: array index 2 is past the end of the array (that has type 'const unsigned long[2]') [-Warray-bounds]
     115 |                         (set1->sig[2] == set2->sig[2]) &&
         |                          ^         ~
   arch/arm/include/asm/signal.h:17:2: note: array 'sig' declared here
      17 |         unsigned long sig[_NSIG_WORDS];
         |         ^
   In file included from arch/arm/kernel/asm-offsets.c:12:
   In file included from include/linux/mm.h:1131:
   In file included from include/linux/huge_mm.h:8:
   In file included from include/linux/fs.h:33:
   In file included from include/linux/percpu-rwsem.h:7:
   In file included from include/linux/rcuwait.h:6:
   In file included from include/linux/sched/signal.h:6:
   include/linux/signal.h:115:21: warning: array index 2 is past the end of the array (that has type 'const unsigned long[2]') [-Warray-bounds]
     115 |                         (set1->sig[2] == set2->sig[2]) &&
         |                                          ^         ~
   arch/arm/include/asm/signal.h:17:2: note: array 'sig' declared here
      17 |         unsigned long sig[_NSIG_WORDS];
         |         ^
   In file included from arch/arm/kernel/asm-offsets.c:12:
   In file included from include/linux/mm.h:1131:
   In file included from include/linux/huge_mm.h:8:
   In file included from include/linux/fs.h:33:
   In file included from include/linux/percpu-rwsem.h:7:
   In file included from include/linux/rcuwait.h:6:
   In file included from include/linux/sched/signal.h:6:
   include/linux/signal.h:157:1: warning: array index 3 is past the end of the array (that has type 'const unsigned long[2]') [-Warray-bounds]
     157 | _SIG_SET_BINOP(sigorsets, _sig_or)
         | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/signal.h:138:8: note: expanded from macro '_SIG_SET_BINOP'
     138 |                 a3 = a->sig[3]; a2 = a->sig[2];                         \
         |                      ^      ~
   arch/arm/include/asm/signal.h:17:2: note: array 'sig' declared here
      17 |         unsigned long sig[_NSIG_WORDS];
         |         ^
   In file included from arch/arm/kernel/asm-offsets.c:12:
   In file included from include/linux/mm.h:1131:
   In file included from include/linux/huge_mm.h:8:
   In file included from include/linux/fs.h:33:
--
     163 | _SIG_SET_BINOP(sigandnsets, _sig_andn)
         | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/signal.h:140:3: note: expanded from macro '_SIG_SET_BINOP'
     140 |                 r->sig[3] = op(a3, b3);                                 \
         |                 ^      ~
   arch/arm/include/asm/signal.h:17:2: note: array 'sig' declared here
      17 |         unsigned long sig[_NSIG_WORDS];
         |         ^
   In file included from arch/arm/kernel/asm-offsets.c:12:
   In file included from include/linux/mm.h:1131:
   In file included from include/linux/huge_mm.h:8:
   In file included from include/linux/fs.h:33:
   In file included from include/linux/percpu-rwsem.h:7:
   In file included from include/linux/rcuwait.h:6:
   In file included from include/linux/sched/signal.h:6:
   include/linux/signal.h:163:1: warning: array index 2 is past the end of the array (that has type 'unsigned long[2]') [-Warray-bounds]
     163 | _SIG_SET_BINOP(sigandnsets, _sig_andn)
         | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/signal.h:141:3: note: expanded from macro '_SIG_SET_BINOP'
     141 |                 r->sig[2] = op(a2, b2);                                 \
         |                 ^      ~
   arch/arm/include/asm/signal.h:17:2: note: array 'sig' declared here
      17 |         unsigned long sig[_NSIG_WORDS];
         |         ^
   In file included from arch/arm/kernel/asm-offsets.c:12:
   In file included from include/linux/mm.h:1131:
   In file included from include/linux/huge_mm.h:8:
   In file included from include/linux/fs.h:33:
   In file included from include/linux/percpu-rwsem.h:7:
   In file included from include/linux/rcuwait.h:6:
   In file included from include/linux/sched/signal.h:6:
   include/linux/signal.h:187:1: warning: array index 3 is past the end of the array (that has type 'unsigned long[2]') [-Warray-bounds]
     187 | _SIG_SET_OP(signotset, _sig_not)
         | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/signal.h:174:27: note: expanded from macro '_SIG_SET_OP'
     174 |         case 4: set->sig[3] = op(set->sig[3]);                          \
         |                                  ^        ~
   include/linux/signal.h:186:24: note: expanded from macro '_sig_not'
     186 | #define _sig_not(x)     (~(x))
         |                            ^
   arch/arm/include/asm/signal.h:17:2: note: array 'sig' declared here
      17 |         unsigned long sig[_NSIG_WORDS];
         |         ^
   In file included from arch/arm/kernel/asm-offsets.c:12:
   In file included from include/linux/mm.h:1131:
   In file included from include/linux/huge_mm.h:8:
   In file included from include/linux/fs.h:33:
   In file included from include/linux/percpu-rwsem.h:7:
   In file included from include/linux/rcuwait.h:6:
   In file included from include/linux/sched/signal.h:6:
   include/linux/signal.h:187:1: warning: array index 3 is past the end of the array (that has type 'unsigned long[2]') [-Warray-bounds]
     187 | _SIG_SET_OP(signotset, _sig_not)
         | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/signal.h:174:10: note: expanded from macro '_SIG_SET_OP'
     174 |         case 4: set->sig[3] = op(set->sig[3]);                          \
         |                 ^        ~
   arch/arm/include/asm/signal.h:17:2: note: array 'sig' declared here
      17 |         unsigned long sig[_NSIG_WORDS];
         |         ^
   In file included from arch/arm/kernel/asm-offsets.c:12:
   In file included from include/linux/mm.h:1131:
   In file included from include/linux/huge_mm.h:8:
   In file included from include/linux/fs.h:33:
   In file included from include/linux/percpu-rwsem.h:7:
   In file included from include/linux/rcuwait.h:6:
   In file included from include/linux/sched/signal.h:6:
   include/linux/signal.h:187:1: warning: array index 2 is past the end of the array (that has type 'unsigned long[2]') [-Warray-bounds]
     187 | _SIG_SET_OP(signotset, _sig_not)
         | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/signal.h:175:20: note: expanded from macro '_SIG_SET_OP'
     175 |                 set->sig[2] = op(set->sig[2]);                          \
         |                                  ^        ~
   include/linux/signal.h:186:24: note: expanded from macro '_sig_not'
     186 | #define _sig_not(x)     (~(x))
         |                            ^
   arch/arm/include/asm/signal.h:17:2: note: array 'sig' declared here
      17 |         unsigned long sig[_NSIG_WORDS];
         |         ^
   In file included from arch/arm/kernel/asm-offsets.c:12:
   In file included from include/linux/mm.h:1131:
   In file included from include/linux/huge_mm.h:8:
   In file included from include/linux/fs.h:33:
   In file included from include/linux/percpu-rwsem.h:7:
   In file included from include/linux/rcuwait.h:6:
   In file included from include/linux/sched/signal.h:6:
   include/linux/signal.h:187:1: warning: array index 2 is past the end of the array (that has type 'unsigned long[2]') [-Warray-bounds]
     187 | _SIG_SET_OP(signotset, _sig_not)
         | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/signal.h:175:3: note: expanded from macro '_SIG_SET_OP'
     175 |                 set->sig[2] = op(set->sig[2]);                          \
         |                 ^        ~
   arch/arm/include/asm/signal.h:17:2: note: array 'sig' declared here
      17 |         unsigned long sig[_NSIG_WORDS];
         |         ^
   In file included from arch/arm/kernel/asm-offsets.c:12:
   In file included from include/linux/mm.h:2232:
   include/linux/vmstat.h:517:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
     517 |         return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
         |                               ~~~~~~~~~~~ ^ ~~~
   In file included from arch/arm/kernel/asm-offsets.c:12:
>> include/linux/mm.h:2822:28: error: use of undeclared identifier 'pgdp'; did you mean 'pgd'?
    2822 |         return (unlikely(pgd_none(pgdp_get(pgd))) && __p4d_alloc(mm, pgd, address)) ?
         |                                   ^
   arch/arm/include/asm/pgtable.h:154:36: note: expanded from macro 'pgdp_get'
     154 | #define pgdp_get(pgpd)          READ_ONCE(*pgdp)
         |                                            ^
   include/linux/mm.h:2819:61: note: 'pgd' declared here
    2819 | static inline p4d_t *p4d_alloc(struct mm_struct *mm, pgd_t *pgd,
         |                                                             ^
>> include/linux/mm.h:2822:28: error: use of undeclared identifier 'pgdp'; did you mean 'pgd'?
    2822 |         return (unlikely(pgd_none(pgdp_get(pgd))) && __p4d_alloc(mm, pgd, address)) ?
         |                                   ^
   arch/arm/include/asm/pgtable.h:154:36: note: expanded from macro 'pgdp_get'
     154 | #define pgdp_get(pgpd)          READ_ONCE(*pgdp)
         |                                            ^
   include/linux/mm.h:2819:61: note: 'pgd' declared here
    2819 | static inline p4d_t *p4d_alloc(struct mm_struct *mm, pgd_t *pgd,
         |                                                             ^
>> include/linux/mm.h:2822:28: error: use of undeclared identifier 'pgdp'; did you mean 'pgd'?
    2822 |         return (unlikely(pgd_none(pgdp_get(pgd))) && __p4d_alloc(mm, pgd, address)) ?
         |                                   ^
   arch/arm/include/asm/pgtable.h:154:36: note: expanded from macro 'pgdp_get'
     154 | #define pgdp_get(pgpd)          READ_ONCE(*pgdp)
         |                                            ^
   include/linux/mm.h:2819:61: note: 'pgd' declared here
    2819 | static inline p4d_t *p4d_alloc(struct mm_struct *mm, pgd_t *pgd,
         |                                                             ^
>> include/linux/mm.h:2822:28: error: use of undeclared identifier 'pgdp'; did you mean 'pgd'?
    2822 |         return (unlikely(pgd_none(pgdp_get(pgd))) && __p4d_alloc(mm, pgd, address)) ?
         |                                   ^
   arch/arm/include/asm/pgtable.h:154:36: note: expanded from macro 'pgdp_get'
     154 | #define pgdp_get(pgpd)          READ_ONCE(*pgdp)
         |                                            ^
   include/linux/mm.h:2819:61: note: 'pgd' declared here
    2819 | static inline p4d_t *p4d_alloc(struct mm_struct *mm, pgd_t *pgd,
         |                                                             ^
>> include/linux/mm.h:2822:28: error: use of undeclared identifier 'pgdp'; did you mean 'pgd'?
    2822 |         return (unlikely(pgd_none(pgdp_get(pgd))) && __p4d_alloc(mm, pgd, address)) ?
         |                                   ^
   arch/arm/include/asm/pgtable.h:154:36: note: expanded from macro 'pgdp_get'
     154 | #define pgdp_get(pgpd)          READ_ONCE(*pgdp)
         |                                            ^
   include/linux/mm.h:2819:61: note: 'pgd' declared here
    2819 | static inline p4d_t *p4d_alloc(struct mm_struct *mm, pgd_t *pgd,
         |                                                             ^
>> include/linux/mm.h:2822:28: error: use of undeclared identifier 'pgdp'; did you mean 'pgd'?
    2822 |         return (unlikely(pgd_none(pgdp_get(pgd))) && __p4d_alloc(mm, pgd, address)) ?
         |                                   ^
   arch/arm/include/asm/pgtable.h:154:36: note: expanded from macro 'pgdp_get'
     154 | #define pgdp_get(pgpd)          READ_ONCE(*pgdp)
         |                                            ^
   include/linux/mm.h:2819:61: note: 'pgd' declared here
    2819 | static inline p4d_t *p4d_alloc(struct mm_struct *mm, pgd_t *pgd,
         |                                                             ^
>> include/linux/mm.h:2822:28: error: use of undeclared identifier 'pgdp'; did you mean 'pgd'?
    2822 |         return (unlikely(pgd_none(pgdp_get(pgd))) && __p4d_alloc(mm, pgd, address)) ?
         |                                   ^
   arch/arm/include/asm/pgtable.h:154:36: note: expanded from macro 'pgdp_get'
     154 | #define pgdp_get(pgpd)          READ_ONCE(*pgdp)
         |                                            ^
   include/linux/mm.h:2819:61: note: 'pgd' declared here
    2819 | static inline p4d_t *p4d_alloc(struct mm_struct *mm, pgd_t *pgd,
         |                                                             ^
>> include/linux/mm.h:2822:28: error: use of undeclared identifier 'pgdp'; did you mean 'pgd'?
    2822 |         return (unlikely(pgd_none(pgdp_get(pgd))) && __p4d_alloc(mm, pgd, address)) ?
         |                                   ^
   arch/arm/include/asm/pgtable.h:154:36: note: expanded from macro 'pgdp_get'
     154 | #define pgdp_get(pgpd)          READ_ONCE(*pgdp)
         |                                            ^
   include/linux/mm.h:2819:61: note: 'pgd' declared here
    2819 | static inline p4d_t *p4d_alloc(struct mm_struct *mm, pgd_t *pgd,
         |                                                             ^
>> include/linux/mm.h:2822:28: error: passing 'const volatile pmdval_t *' (aka 'const volatile unsigned int *') to parameter of type 'pmdval_t *' (aka 'unsigned int *') discards qualifiers [-Werror,-Wincompatible-pointer-types-discards-qualifiers]
    2822 |         return (unlikely(pgd_none(pgdp_get(pgd))) && __p4d_alloc(mm, pgd, address)) ?
         |                 ~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~
   arch/arm/include/asm/pgtable.h:154:25: note: expanded from macro 'pgdp_get'
     154 | #define pgdp_get(pgpd)          READ_ONCE(*pgdp)
         |                                 ^
   include/asm-generic/rwonce.h:47:28: note: expanded from macro 'READ_ONCE'
      47 | #define READ_ONCE(x)                                                    \
         |                                                                         ^
   include/linux/compiler.h:77:42: note: expanded from macro 'unlikely'
      77 | # define unlikely(x)    __builtin_expect(!!(x), 0)
         |                                             ^
   include/asm-generic/pgtable-nop4d.h:21:34: note: passing argument to parameter 'pgd' here
      21 | static inline int pgd_none(pgd_t pgd)           { return 0; }
         |                                  ^
   29 warnings and 18 errors generated.
   make[3]: *** [scripts/Makefile.build:117: arch/arm/kernel/asm-offsets.s] Error 1
   make[3]: Target 'prepare' not remade because of errors.
   make[2]: *** [Makefile:1194: prepare0] Error 2
   make[2]: Target 'prepare' not remade because of errors.
   make[1]: *** [Makefile:224: __sub-make] Error 2
   make[1]: Target 'prepare' not remade because of errors.
   make: *** [Makefile:224: __sub-make] Error 2
   make: Target 'prepare' not remade because of errors.


vim +1245 include/linux/pgtable.h

  1242	
  1243	static inline int pgd_none_or_clear_bad(pgd_t *pgd)
  1244	{
> 1245		pgd_t old_pgd = pgdp_get(pgd);
  1246	
  1247		if (pgd_none(old_pgd))
  1248			return 1;
  1249		if (unlikely(pgd_bad(old_pgd))) {
  1250			pgd_clear_bad(pgd);
  1251			return 1;
  1252		}
  1253		return 0;
  1254	}
  1255	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH V2 7/7] mm: Use pgdp_get() for accessing PGD entries
  2024-09-18 20:30 ` [PATCH V2 7/7] mm: Use pgdp_get() for accessing PGD entries kernel test robot
@ 2024-09-19  7:55   ` Anshuman Khandual
  2024-09-19  9:11     ` Russell King (Oracle)
  0 siblings, 1 reply; 10+ messages in thread
From: Anshuman Khandual @ 2024-09-19  7:55 UTC (permalink / raw)
  To: kernel test robot, linux-mm, Russell King (Oracle)
  Cc: llvm, oe-kbuild-all, Andrew Morton, David Hildenbrand,
	Ryan Roberts, Mike Rapoport (IBM), Arnd Bergmann, x86, linux-m68k,
	linux-fsdevel, kasan-dev, linux-kernel, linux-perf-users,
	Dimitri Sivanich, Alexander Viro, Muchun Song, Andrey Ryabinin,
	Miaohe Lin, Dennis Zhou, Tejun Heo, Christoph Lameter,
	Uladzislau Rezki, Christoph Hellwig

On 9/19/24 02:00, kernel test robot wrote:
> Hi Anshuman,
> 
> kernel test robot noticed the following build errors:
> 
> [auto build test ERROR on char-misc/char-misc-testing]
> [also build test ERROR on char-misc/char-misc-next char-misc/char-misc-linus brauner-vfs/vfs.all dennis-percpu/for-next linus/master v6.11]
> [cannot apply to akpm-mm/mm-everything next-20240918]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
> 
> url:    https://github.com/intel-lab-lkp/linux/commits/Anshuman-Khandual/m68k-mm-Change-pmd_val/20240917-153331
> base:   char-misc/char-misc-testing
> patch link:    https://lore.kernel.org/r/20240917073117.1531207-8-anshuman.khandual%40arm.com
> patch subject: [PATCH V2 7/7] mm: Use pgdp_get() for accessing PGD entries
> config: arm-footbridge_defconfig (https://download.01.org/0day-ci/archive/20240919/202409190310.ViHBRe12-lkp@intel.com/config)
> compiler: clang version 20.0.0git (https://github.com/llvm/llvm-project 8663a75fa2f31299ab8d1d90288d9df92aadee88)
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240919/202409190310.ViHBRe12-lkp@intel.com/reproduce)
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202409190310.ViHBRe12-lkp@intel.com/
> 
> All errors (new ones prefixed by >>):
> 
>    In file included from arch/arm/kernel/asm-offsets.c:12:
>    In file included from include/linux/mm.h:30:
>>> include/linux/pgtable.h:1245:18: error: use of undeclared identifier 'pgdp'; did you mean 'pgd'?
>     1245 |         pgd_t old_pgd = pgdp_get(pgd);
>          |                         ^
>    arch/arm/include/asm/pgtable.h:154:36: note: expanded from macro 'pgdp_get'
>      154 | #define pgdp_get(pgpd)          READ_ONCE(*pgdp)
>          |                                            ^
>    include/linux/pgtable.h:1243:48: note: 'pgd' declared here
>     1243 | static inline int pgd_none_or_clear_bad(pgd_t *pgd)
>          |                                                ^

arm (32) platform currently overrides pgdp_get() helper in the platform but
defines that like the exact same version as the generic one, albeit with a
typo which can be fixed with something like this.

diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index be91e376df79..aedb32d49c2a 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -151,7 +151,7 @@ extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
 
 extern pgd_t swapper_pg_dir[PTRS_PER_PGD];
 
-#define pgdp_get(pgpd)         READ_ONCE(*pgdp)
+#define pgdp_get(pgdp)         READ_ONCE(*pgdp)
 
 #define pud_page(pud)          pmd_page(__pmd(pud_val(pud)))
 #define pud_write(pud)         pmd_write(__pmd(pud_val(pud)))

Regardless there is another problem here. On arm platform there are multiple
pgd_t definitions available depending on various configs but some are arrays
instead of a single data element, although platform pgdp_get() helper remains
the same for all.

arch/arm/include/asm/page-nommu.h:typedef unsigned long pgd_t[2];
arch/arm/include/asm/pgtable-2level-types.h:typedef struct { pmdval_t pgd[2]; } pgd_t;
arch/arm/include/asm/pgtable-2level-types.h:typedef pmdval_t pgd_t[2];
arch/arm/include/asm/pgtable-3level-types.h:typedef struct { pgdval_t pgd; } pgd_t;
arch/arm/include/asm/pgtable-3level-types.h:typedef pgdval_t pgd_t;

I guess it might need different pgdp_get() variants depending applicable pgd_t
definition. Will continue looking into this further but meanwhile copied Russel
King in case he might be able to give some direction.

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH V2 7/7] mm: Use pgdp_get() for accessing PGD entries
  2024-09-19  7:55   ` Anshuman Khandual
@ 2024-09-19  9:11     ` Russell King (Oracle)
  2024-09-19 15:48       ` Ryan Roberts
  0 siblings, 1 reply; 10+ messages in thread
From: Russell King (Oracle) @ 2024-09-19  9:11 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: kernel test robot, linux-mm, llvm, oe-kbuild-all, Andrew Morton,
	David Hildenbrand, Ryan Roberts, Mike Rapoport (IBM),
	Arnd Bergmann, x86, linux-m68k, linux-fsdevel, kasan-dev,
	linux-kernel, linux-perf-users, Dimitri Sivanich, Alexander Viro,
	Muchun Song, Andrey Ryabinin, Miaohe Lin, Dennis Zhou, Tejun Heo,
	Christoph Lameter, Uladzislau Rezki, Christoph Hellwig

On Thu, Sep 19, 2024 at 01:25:08PM +0530, Anshuman Khandual wrote:
> arm (32) platform currently overrides pgdp_get() helper in the platform but
> defines that like the exact same version as the generic one, albeit with a
> typo which can be fixed with something like this.

pgdp_get() was added to arm in eba2591d99d1 ("mm: Introduce
pudp/p4dp/pgdp_get() functions") with the typo you've spotted. It seems
it was added with no users, otherwise the error would have been spotted
earlier. I'm not a fan of adding dead code to the kernel for this
reason.

> Regardless there is another problem here. On arm platform there are multiple
> pgd_t definitions available depending on various configs but some are arrays
> instead of a single data element, although platform pgdp_get() helper remains
> the same for all.
> 
> arch/arm/include/asm/page-nommu.h:typedef unsigned long pgd_t[2];
> arch/arm/include/asm/pgtable-2level-types.h:typedef struct { pmdval_t pgd[2]; } pgd_t;
> arch/arm/include/asm/pgtable-2level-types.h:typedef pmdval_t pgd_t[2];
> arch/arm/include/asm/pgtable-3level-types.h:typedef struct { pgdval_t pgd; } pgd_t;
> arch/arm/include/asm/pgtable-3level-types.h:typedef pgdval_t pgd_t;
> 
> I guess it might need different pgdp_get() variants depending applicable pgd_t
> definition. Will continue looking into this further but meanwhile copied Russel
> King in case he might be able to give some direction.

That's Russel*L*, thanks.

32-bit arm uses, in some circumstances, an array because each level 1
page table entry is actually two descriptors. It needs to be this way
because each level 2 table pointed to by each level 1 entry has 256
entries, meaning it only occupies 1024 bytes in a 4096 byte page.

In order to cut down on the wastage, treat the level 1 page table as
groups of two entries, which point to two consecutive 1024 byte tables
in the level 2 page.

The level 2 entry isn't suitable for the kernel's use cases (there are
no bits to represent accessed/dirty and other important stuff that the
Linux MM wants) so we maintain the hardware page tables and a separate
set that Linux uses in the same page. Again, the software tables are
consecutive, so from Linux's perspective, the level 2 page tables
have 512 entries in them and occupy one full page.

This is documented in arch/arm/include/asm/pgtable-2level.h

However, what this means is that from the software perspective, the
level 1 page table descriptors are an array of two entries, both of
which need to be setup when creating a level 2 page table, but only
the first one should ever be dereferenced when walking the tables,
otherwise the code that walks the second level of page table entries
will walk off the end of the software table into the actual hardware
descriptors.

I've no idea what the idea is behind introducing pgd_get() and what
it's semantics are, so I can't comment further.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH V2 7/7] mm: Use pgdp_get() for accessing PGD entries
  2024-09-19  9:11     ` Russell King (Oracle)
@ 2024-09-19 15:48       ` Ryan Roberts
  2024-09-19 17:06         ` Russell King (Oracle)
  0 siblings, 1 reply; 10+ messages in thread
From: Ryan Roberts @ 2024-09-19 15:48 UTC (permalink / raw)
  To: Russell King (Oracle), Anshuman Khandual
  Cc: kernel test robot, linux-mm, llvm, oe-kbuild-all, Andrew Morton,
	David Hildenbrand, Mike Rapoport (IBM), Arnd Bergmann, x86,
	linux-m68k, linux-fsdevel, kasan-dev, linux-kernel,
	linux-perf-users, Dimitri Sivanich, Alexander Viro, Muchun Song,
	Andrey Ryabinin, Miaohe Lin, Dennis Zhou, Tejun Heo,
	Christoph Lameter, Uladzislau Rezki, Christoph Hellwig

On 19/09/2024 10:11, Russell King (Oracle) wrote:
> On Thu, Sep 19, 2024 at 01:25:08PM +0530, Anshuman Khandual wrote:
>> arm (32) platform currently overrides pgdp_get() helper in the platform but
>> defines that like the exact same version as the generic one, albeit with a
>> typo which can be fixed with something like this.
> 
> pgdp_get() was added to arm in eba2591d99d1 ("mm: Introduce
> pudp/p4dp/pgdp_get() functions") with the typo you've spotted. It seems
> it was added with no users, otherwise the error would have been spotted
> earlier. I'm not a fan of adding dead code to the kernel for this
> reason.
> 
>> Regardless there is another problem here. On arm platform there are multiple
>> pgd_t definitions available depending on various configs but some are arrays
>> instead of a single data element, although platform pgdp_get() helper remains
>> the same for all.
>>
>> arch/arm/include/asm/page-nommu.h:typedef unsigned long pgd_t[2];
>> arch/arm/include/asm/pgtable-2level-types.h:typedef struct { pmdval_t pgd[2]; } pgd_t;
>> arch/arm/include/asm/pgtable-2level-types.h:typedef pmdval_t pgd_t[2];
>> arch/arm/include/asm/pgtable-3level-types.h:typedef struct { pgdval_t pgd; } pgd_t;
>> arch/arm/include/asm/pgtable-3level-types.h:typedef pgdval_t pgd_t;
>>
>> I guess it might need different pgdp_get() variants depending applicable pgd_t
>> definition. Will continue looking into this further but meanwhile copied Russel
>> King in case he might be able to give some direction.
> 
> That's Russel*L*, thanks.
> 
> 32-bit arm uses, in some circumstances, an array because each level 1
> page table entry is actually two descriptors. It needs to be this way
> because each level 2 table pointed to by each level 1 entry has 256
> entries, meaning it only occupies 1024 bytes in a 4096 byte page.
> 
> In order to cut down on the wastage, treat the level 1 page table as
> groups of two entries, which point to two consecutive 1024 byte tables
> in the level 2 page.
> 
> The level 2 entry isn't suitable for the kernel's use cases (there are
> no bits to represent accessed/dirty and other important stuff that the
> Linux MM wants) so we maintain the hardware page tables and a separate
> set that Linux uses in the same page. Again, the software tables are
> consecutive, so from Linux's perspective, the level 2 page tables
> have 512 entries in them and occupy one full page.
> 
> This is documented in arch/arm/include/asm/pgtable-2level.h
> 
> However, what this means is that from the software perspective, the
> level 1 page table descriptors are an array of two entries, both of
> which need to be setup when creating a level 2 page table, but only
> the first one should ever be dereferenced when walking the tables,
> otherwise the code that walks the second level of page table entries
> will walk off the end of the software table into the actual hardware
> descriptors.
> 
> I've no idea what the idea is behind introducing pgd_get() and what
> it's semantics are, so I can't comment further.

The helper is intended to read the value of the entry pointed to by the passed
in pointer. And it shoiuld be read in a "single copy atomic" manner, meaning no
tearing. Further, the PTL is expected to be held when calling the getter. If the
HW can write to the entry such that its racing with the lock holder (i.e. HW
update of access/dirty) then READ_ONCE() should be suitable for most
architectures. If there is no possibility of racing (because HW doesn't write to
the entry), then a simple dereference would be sufficient, I think (which is
what the core code was already doing in most cases).

There is additional benefit that the architecture can hook this function if it
has exotic use cases (see contpte feature on arm64 as an example, which hooks
ptep_get()).

It sounds to me like the arm (32) implementation of pgdp_get() could just
continue to do a direct dereference and this should be safe? I don't think it
supports HW update of access/dirty?

Thanks,
Ryan



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH V2 7/7] mm: Use pgdp_get() for accessing PGD entries
  2024-09-19 15:48       ` Ryan Roberts
@ 2024-09-19 17:06         ` Russell King (Oracle)
  2024-09-19 17:49           ` Ryan Roberts
  0 siblings, 1 reply; 10+ messages in thread
From: Russell King (Oracle) @ 2024-09-19 17:06 UTC (permalink / raw)
  To: Ryan Roberts
  Cc: Anshuman Khandual, kernel test robot, linux-mm, llvm,
	oe-kbuild-all, Andrew Morton, David Hildenbrand,
	Mike Rapoport (IBM), Arnd Bergmann, x86, linux-m68k,
	linux-fsdevel, kasan-dev, linux-kernel, linux-perf-users,
	Dimitri Sivanich, Alexander Viro, Muchun Song, Andrey Ryabinin,
	Miaohe Lin, Dennis Zhou, Tejun Heo, Christoph Lameter,
	Uladzislau Rezki, Christoph Hellwig

On Thu, Sep 19, 2024 at 05:48:58PM +0200, Ryan Roberts wrote:
> > 32-bit arm uses, in some circumstances, an array because each level 1
> > page table entry is actually two descriptors. It needs to be this way
> > because each level 2 table pointed to by each level 1 entry has 256
> > entries, meaning it only occupies 1024 bytes in a 4096 byte page.
> > 
> > In order to cut down on the wastage, treat the level 1 page table as
> > groups of two entries, which point to two consecutive 1024 byte tables
> > in the level 2 page.
> > 
> > The level 2 entry isn't suitable for the kernel's use cases (there are
> > no bits to represent accessed/dirty and other important stuff that the
> > Linux MM wants) so we maintain the hardware page tables and a separate
> > set that Linux uses in the same page. Again, the software tables are
> > consecutive, so from Linux's perspective, the level 2 page tables
> > have 512 entries in them and occupy one full page.
> > 
> > This is documented in arch/arm/include/asm/pgtable-2level.h
> > 
> > However, what this means is that from the software perspective, the
> > level 1 page table descriptors are an array of two entries, both of
> > which need to be setup when creating a level 2 page table, but only
> > the first one should ever be dereferenced when walking the tables,
> > otherwise the code that walks the second level of page table entries
> > will walk off the end of the software table into the actual hardware
> > descriptors.
> > 
> > I've no idea what the idea is behind introducing pgd_get() and what
> > it's semantics are, so I can't comment further.
> 
> The helper is intended to read the value of the entry pointed to by the passed
> in pointer. And it shoiuld be read in a "single copy atomic" manner, meaning no
> tearing. Further, the PTL is expected to be held when calling the getter. If the
> HW can write to the entry such that its racing with the lock holder (i.e. HW
> update of access/dirty) then READ_ONCE() should be suitable for most
> architectures. If there is no possibility of racing (because HW doesn't write to
> the entry), then a simple dereference would be sufficient, I think (which is
> what the core code was already doing in most cases).

The core code should be making no access to the PGD entries on 32-bit
ARM since the PGD level does not exist. Writes are done at PMD level
in arch code. Reads are done by core code at PMD level.

It feels to me like pgd_get() just doesn't fit the model to which 32-bit
ARM was designed to use decades ago, so I want full details about what
pgd_get() is going to be used for and how it is going to be used,
because I feel completely in the dark over this new development. I fear
that someone hasn't understood the Linux page table model if they're
wanting to access stuff at levels that effectively "aren't implemented"
in the architecture specific kernel model of the page tables.

Essentially, on 32-bit 2-level ARM, the PGD is merely indexed by the
virtual address. As far as the kernel is concerned, each entry is
64-bit, and the generic kernel code has no business accessing that
through the pgd pointer.

The pgd pointer is passed through the PUD and PMD levels, where it is
typecast down through the kernel layers to a pmd_t pointer, where it
becomes a 32-bit quantity. This results in only the _first_ level 1
pointer being dereferenced by kernel code to a 32-bit pmd_t quantity.
pmd_page_vaddr() converts this pmd_t quantity to a pte pointer (which
points at the software level 2 page tables, not the hardware page
tables.)

So, as I'm now being told that the kernel wants to dereference the
pgd level despite the model I describe above, alarm bells are ringing.
I want full information please.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH V2 7/7] mm: Use pgdp_get() for accessing PGD entries
  2024-09-19 17:06         ` Russell King (Oracle)
@ 2024-09-19 17:49           ` Ryan Roberts
  2024-09-19 20:25             ` Russell King (Oracle)
  0 siblings, 1 reply; 10+ messages in thread
From: Ryan Roberts @ 2024-09-19 17:49 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Anshuman Khandual, kernel test robot, linux-mm, llvm,
	oe-kbuild-all, Andrew Morton, David Hildenbrand,
	Mike Rapoport (IBM), Arnd Bergmann, x86, linux-m68k,
	linux-fsdevel, kasan-dev, linux-kernel, linux-perf-users,
	Dimitri Sivanich, Alexander Viro, Muchun Song, Andrey Ryabinin,
	Miaohe Lin, Dennis Zhou, Tejun Heo, Christoph Lameter,
	Uladzislau Rezki, Christoph Hellwig

On 19/09/2024 18:06, Russell King (Oracle) wrote:
> On Thu, Sep 19, 2024 at 05:48:58PM +0200, Ryan Roberts wrote:
>>> 32-bit arm uses, in some circumstances, an array because each level 1
>>> page table entry is actually two descriptors. It needs to be this way
>>> because each level 2 table pointed to by each level 1 entry has 256
>>> entries, meaning it only occupies 1024 bytes in a 4096 byte page.
>>>
>>> In order to cut down on the wastage, treat the level 1 page table as
>>> groups of two entries, which point to two consecutive 1024 byte tables
>>> in the level 2 page.
>>>
>>> The level 2 entry isn't suitable for the kernel's use cases (there are
>>> no bits to represent accessed/dirty and other important stuff that the
>>> Linux MM wants) so we maintain the hardware page tables and a separate
>>> set that Linux uses in the same page. Again, the software tables are
>>> consecutive, so from Linux's perspective, the level 2 page tables
>>> have 512 entries in them and occupy one full page.
>>>
>>> This is documented in arch/arm/include/asm/pgtable-2level.h
>>>
>>> However, what this means is that from the software perspective, the
>>> level 1 page table descriptors are an array of two entries, both of
>>> which need to be setup when creating a level 2 page table, but only
>>> the first one should ever be dereferenced when walking the tables,
>>> otherwise the code that walks the second level of page table entries
>>> will walk off the end of the software table into the actual hardware
>>> descriptors.
>>>
>>> I've no idea what the idea is behind introducing pgd_get() and what
>>> it's semantics are, so I can't comment further.
>>
>> The helper is intended to read the value of the entry pointed to by the passed
>> in pointer. And it shoiuld be read in a "single copy atomic" manner, meaning no
>> tearing. Further, the PTL is expected to be held when calling the getter. If the
>> HW can write to the entry such that its racing with the lock holder (i.e. HW
>> update of access/dirty) then READ_ONCE() should be suitable for most
>> architectures. If there is no possibility of racing (because HW doesn't write to
>> the entry), then a simple dereference would be sufficient, I think (which is
>> what the core code was already doing in most cases).
> 
> The core code should be making no access to the PGD entries on 32-bit
> ARM since the PGD level does not exist. Writes are done at PMD level
> in arch code. Reads are done by core code at PMD level.
> 
> It feels to me like pgd_get() just doesn't fit the model to which 32-bit
> ARM was designed to use decades ago, so I want full details about what
> pgd_get() is going to be used for and how it is going to be used,
> because I feel completely in the dark over this new development. I fear
> that someone hasn't understood the Linux page table model if they're
> wanting to access stuff at levels that effectively "aren't implemented"
> in the architecture specific kernel model of the page tables.

This change isn't as big and scary as I think you fear. The core-mm today
dereferences pgd pointers (and p4d, pud, pmd pointers) directly in its code. See
follow_pfnmap_start(), gup_fast_pgd_leaf(), and many other sites. These changes
aim to abstract those dereferences into an inline function that the architecture
can override and implement if it so wishes.

The core-mm implements default versions of these helper functions which do
READ_ONCE(), but does not currently use them consistently.

From Anshuman's comments earlier in this thread, it looked to me like the arm
pgd_t type is too big to read with READ_ONCE() - it can't be atomically read on
that arch. So my proposal was to implement the override for arm to do exactly
what the core-mm used to do, which is a pointer dereference. So that would
result in exact same behaviour for the arm arch.

> 
> Essentially, on 32-bit 2-level ARM, the PGD is merely indexed by the
> virtual address. As far as the kernel is concerned, each entry is
> 64-bit, and the generic kernel code has no business accessing that
> through the pgd pointer.
> 
> The pgd pointer is passed through the PUD and PMD levels, where it is
> typecast down through the kernel layers to a pmd_t pointer, where it
> becomes a 32-bit quantity. This results in only the _first_ level 1
> pointer being dereferenced by kernel code to a 32-bit pmd_t quantity.
> pmd_page_vaddr() converts this pmd_t quantity to a pte pointer (which
> points at the software level 2 page tables, not the hardware page
> tables.)

As an aside, my understanding of Linux's pgtable model differs from what you
describe. As I understand it, Linux's logical page table model has 5 levels
(pgd, p4d, pud, pmd, pte). If an arch doesn't support all 5 levels, then the
middle levels can be folded away (p4d first, then pud, then pmd). But the
core-mm still logically walks all 5 levels. So if the HW supports 2 levels,
those levels are (pgd, pte). But you are suggesting that arm exposes pmd and
pte, which is not what Linux expects? (Perhaps you call it the pmd in the arch,
but that is being folded and accessed through the pgd helpers in core code, I
believe?

> 
> So, as I'm now being told that the kernel wants to dereference the
> pgd level despite the model I describe above, alarm bells are ringing.
> I want full information please.
> 

This is not new; the kernel already dereferences the pgd pointers.

Thanks,
Ryan


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH V2 7/7] mm: Use pgdp_get() for accessing PGD entries
  2024-09-19 17:49           ` Ryan Roberts
@ 2024-09-19 20:25             ` Russell King (Oracle)
  2024-09-20  6:57               ` Ryan Roberts
  0 siblings, 1 reply; 10+ messages in thread
From: Russell King (Oracle) @ 2024-09-19 20:25 UTC (permalink / raw)
  To: Ryan Roberts
  Cc: Anshuman Khandual, kernel test robot, linux-mm, llvm,
	oe-kbuild-all, Andrew Morton, David Hildenbrand,
	Mike Rapoport (IBM), Arnd Bergmann, x86, linux-m68k,
	linux-fsdevel, kasan-dev, linux-kernel, linux-perf-users,
	Dimitri Sivanich, Alexander Viro, Muchun Song, Andrey Ryabinin,
	Miaohe Lin, Dennis Zhou, Tejun Heo, Christoph Lameter,
	Uladzislau Rezki, Christoph Hellwig

On Thu, Sep 19, 2024 at 07:49:09PM +0200, Ryan Roberts wrote:
> On 19/09/2024 18:06, Russell King (Oracle) wrote:
> > On Thu, Sep 19, 2024 at 05:48:58PM +0200, Ryan Roberts wrote:
> >>> 32-bit arm uses, in some circumstances, an array because each level 1
> >>> page table entry is actually two descriptors. It needs to be this way
> >>> because each level 2 table pointed to by each level 1 entry has 256
> >>> entries, meaning it only occupies 1024 bytes in a 4096 byte page.
> >>>
> >>> In order to cut down on the wastage, treat the level 1 page table as
> >>> groups of two entries, which point to two consecutive 1024 byte tables
> >>> in the level 2 page.
> >>>
> >>> The level 2 entry isn't suitable for the kernel's use cases (there are
> >>> no bits to represent accessed/dirty and other important stuff that the
> >>> Linux MM wants) so we maintain the hardware page tables and a separate
> >>> set that Linux uses in the same page. Again, the software tables are
> >>> consecutive, so from Linux's perspective, the level 2 page tables
> >>> have 512 entries in them and occupy one full page.
> >>>
> >>> This is documented in arch/arm/include/asm/pgtable-2level.h
> >>>
> >>> However, what this means is that from the software perspective, the
> >>> level 1 page table descriptors are an array of two entries, both of
> >>> which need to be setup when creating a level 2 page table, but only
> >>> the first one should ever be dereferenced when walking the tables,
> >>> otherwise the code that walks the second level of page table entries
> >>> will walk off the end of the software table into the actual hardware
> >>> descriptors.
> >>>
> >>> I've no idea what the idea is behind introducing pgd_get() and what
> >>> it's semantics are, so I can't comment further.
> >>
> >> The helper is intended to read the value of the entry pointed to by the passed
> >> in pointer. And it shoiuld be read in a "single copy atomic" manner, meaning no
> >> tearing. Further, the PTL is expected to be held when calling the getter. If the
> >> HW can write to the entry such that its racing with the lock holder (i.e. HW
> >> update of access/dirty) then READ_ONCE() should be suitable for most
> >> architectures. If there is no possibility of racing (because HW doesn't write to
> >> the entry), then a simple dereference would be sufficient, I think (which is
> >> what the core code was already doing in most cases).
> > 
> > The core code should be making no access to the PGD entries on 32-bit
> > ARM since the PGD level does not exist. Writes are done at PMD level
> > in arch code. Reads are done by core code at PMD level.
> > 
> > It feels to me like pgd_get() just doesn't fit the model to which 32-bit
> > ARM was designed to use decades ago, so I want full details about what
> > pgd_get() is going to be used for and how it is going to be used,
> > because I feel completely in the dark over this new development. I fear
> > that someone hasn't understood the Linux page table model if they're
> > wanting to access stuff at levels that effectively "aren't implemented"
> > in the architecture specific kernel model of the page tables.
> 
> This change isn't as big and scary as I think you fear.

The situation is as I state above. Core code must _not_ dereference pgd
pointers on 32-bit ARM.

> The core-mm today
> dereferences pgd pointers (and p4d, pud, pmd pointers) directly in its code. See
> follow_pfnmap_start(),

Doesn't seem to exist at least not in 6.11.

> gup_fast_pgd_leaf(), and many other sites.

Only built when CONFIG_HAVE_GUP_FAST is set, which 32-bit ARM doesn't
set because its meaningless there, except when LPAE is in use (which is
basically the situation I'm discussing.)

> These changes
> aim to abstract those dereferences into an inline function that the architecture
> can override and implement if it so wishes.
> 
> The core-mm implements default versions of these helper functions which do
> READ_ONCE(), but does not currently use them consistently.
> 
> From Anshuman's comments earlier in this thread, it looked to me like the arm
> pgd_t type is too big to read with READ_ONCE() - it can't be atomically read on
> that arch. So my proposal was to implement the override for arm to do exactly
> what the core-mm used to do, which is a pointer dereference. So that would
> result in exact same behaviour for the arm arch.

Let me say this again: core code must NOT dereference pgds on 32-bit
non-LPAE ARM. They are meaningless to core code. A pgd_t does not
reference a single entry in hardware. It references two entries.

> > Essentially, on 32-bit 2-level ARM, the PGD is merely indexed by the
> > virtual address. As far as the kernel is concerned, each entry is
> > 64-bit, and the generic kernel code has no business accessing that
> > through the pgd pointer.
> > 
> > The pgd pointer is passed through the PUD and PMD levels, where it is
> > typecast down through the kernel layers to a pmd_t pointer, where it
> > becomes a 32-bit quantity. This results in only the _first_ level 1
> > pointer being dereferenced by kernel code to a 32-bit pmd_t quantity.
> > pmd_page_vaddr() converts this pmd_t quantity to a pte pointer (which
> > points at the software level 2 page tables, not the hardware page
> > tables.)
> 
> As an aside, my understanding of Linux's pgtable model differs from what you
> describe. As I understand it, Linux's logical page table model has 5 levels
> (pgd, p4d, pud, pmd, pte). If an arch doesn't support all 5 levels, then the
> middle levels can be folded away (p4d first, then pud, then pmd). But the
> core-mm still logically walks all 5 levels. So if the HW supports 2 levels,
> those levels are (pgd, pte). But you are suggesting that arm exposes pmd and
> pte, which is not what Linux expects? (Perhaps you call it the pmd in the arch,
> but that is being folded and accessed through the pgd helpers in core code, I
> believe?

What ARM does dates from before the Linux MM invented the current
"folding" method when we had three page table levels - pgd, pmd
and pte. The current folding techniques were invented well after
32-bit ARM was implemented, which was using the original idea of
how to fold the page tables.

The new folding came up with a totally different way of doing it,
and I looked into converting 32-bit ARM over to it, but it wasn't
possible to do so with the need for two level-1 entries to be
managed for each level-2 page table.

> > So, as I'm now being told that the kernel wants to dereference the
> > pgd level despite the model I describe above, alarm bells are ringing.
> > I want full information please.
> > 
> 
> This is not new; the kernel already dereferences the pgd pointers.

Consider that 32-bit ARM has been this way for decades (Linux was ported
to 32-bit ARM by me back in the 1990s - so it's about 30 years old.)
Compare that to what you're stating is "not new"... I beg to differ with
your opinion on what is new and what isn't. It's all about the relative
time.

This is how the page tables are walked:

static inline pgd_t *pgd_offset_pgd(pgd_t *pgd, unsigned long address)
{
        return (pgd + pgd_index(address));
}

#define pgd_offset(mm, address)         pgd_offset_pgd((mm)->pgd, (address))

This returns a pointer to the pgd. This is then used with p4d_offset()
when walking the next level, and this is defined on 32-bit ARM from
include/asm-generic/pgtable-nop4d.h:

static inline p4d_t *p4d_offset(pgd_t *pgd, unsigned long address)
{
        return (p4d_t *)pgd;
}

Then from include/asm-generic/pgtable-nopud.h:

static inline pud_t *pud_offset(p4d_t *p4d, unsigned long address)
{
        return (pud_t *)p4d;
}

Then from arch/arm/include/asm/pgtable-2level.h:

static inline pmd_t *pmd_offset(pud_t *pud, unsigned long addr)
{
        return (pmd_t *)pud;
}

All of the above casts result in the pgd_t pointer being cast down
to a pmd_t pointer.

Now, looking at stuff in mm/memory.c such as unmap_page_range().

        pgd = pgd_offset(vma->vm_mm, addr);

This gets the pgd pointer into the level 1 page tables associated
with addr, and passes it down to zap_p4d_range().

That passes it to p4d_offset() without dereferencing it, which on
32-bit ARM, merely casts the pgd_t pointer to a p4d_t pointer. Since
a p4d_t is defined to be a struct of a pgd_t, this also points at an
array of two 32-bit quantities. This pointer is passed down to
zap_pud_range().

zap_pud_range() passes this pointer to pud_offset(), again without
dereferencing it, and we end up with a pud_t pointer. Since pud_t is
defined to be a struct of p4d_t, this also points to an array of two
32-bit quantities.

We then have:

                if (pud_trans_huge(*pud) || pud_devmap(*pud)) {

These is an implicit memory copy/access between the memory pointed to
by pud, and their destination (which might be a register). However,
these are optimised away because 32-bit ARM doesn't set
HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD nor ARCH_HAS_PTE_DEVMAP (as
neither inline function make use of their argument.)

NOTE: If making these use READ_ONCE results in an access that can not
be optimised away, that is a bug that needs to be addressed.

zap_pud_range() then passes the pud pointer to zap_pmd_range().

zap_pmd_range() passes this pointer to pud_offset() with no further
dereferences, and this gets cast to a pmd_t pointer, which is a
pointer to the first 32-bit quantity pointed to by the pgd_t pointer.

All the dereferences from this point on are 32-bit which can be done
as single-copy atomic accesses. This will be the first real access
to the level-1 page tables in this code path as the code stands today,
and from this point on, accesses to the page tables are as the
architecture intends them to be.


Now, realise that for all of the accesses above that have all been
optimised away, none of that code even existed when 32-bit ARM was
using this method. The addition of these features not intefering
with the way 32-bit non-LPAE ARM works relies on all of those
accesses being optimised away, and they need to continue to be so
going forward.


Maybe that means that this new (and I mean new in relative terms
compared to the age of the 32-bit ARM code) pgdp_get() accessor
needs to be a non-dereferencing operation, so something like:

#define pgdp_get(pgdp)		((pgd_t){ })

in arch/arm/include/asm/pgtable-2level.h (note the corrected
spelling of pgdp), and the existing pgdp_get() moved to
arch/arm/include/asm/pgtable-3level.h. This isn't tested.

However, let me say this again... without knowing exactly how
and where pgdp_get() is intended to be used, I'm clutching at
straws here. Even looking at Linus' tree, there's very little in
evidence there to suggest how pgdp_get() is intended to be used.
For example, there's no references to it in mm/.


Please realise that I have _no_ _clue_ what "[PATCH V2 7/7] mm: Use
pgdp_get() for accessing PGD entries" is proposing. I wasn't on its
Cc list. I haven't seen the patch. The first I knew anything about
this was with the email that Anshuman Khandual sent in response to
the kernel build bot's build error.

I'm afraid that the kernel build bot's build error means that this
patch:

commit eba2591d99d1f14a04c8a8a845ab0795b93f5646
Author: Alexandre Ghiti <alexghiti@rivosinc.com>
Date:   Wed Dec 13 21:29:59 2023 +0100

    mm: Introduce pudp/p4dp/pgdp_get() functions

is actually broken. I'm sorry that I didn't review that, but how the
series looked when it landed in my mailbox, it looked like it was
specific to RISC-V and of no interest to me, so I didn't bother
reading it (I get _lots_ of email, I can't read everything.) This
is how it looks like in my mailbox (and note that they're marked
as new to this day):

3218 N T Dec 13 Alexandre Ghiti (   0) [PATCH v2 0/4] riscv: Use READ_ONCE()/WRI
3219 N T Dec 13 Alexandre Ghiti (   0) ├─>[PATCH v2 1/4] riscv: Use WRITE_ONCE()
3220 N T Dec 13 Alexandre Ghiti (   0) ├─>[PATCH v2 2/4] mm: Introduce pudp/p4dp
3221 N T Dec 13 Alexandre Ghiti (   0) ├─>[PATCH v2 3/4] riscv: mm: Only compile
3222 N T Dec 13 Alexandre Ghiti (   0) └─>[PATCH v2 4/4] riscv: Use accessors to
3223 N C Dec 14 Anup Patel      (   0)   └─>

Sorry, but I'm not even going to look at something like that when it
looks like it's for RISC-V and nothing else.

One final point... because I'm sure someone's going to say "but you
were in the To: header". I've long since given up using "am I in the
Cc/To header" to carry any useful or meaningful information to
indicate whether it's something I should read. I'm afraid that the
kernel community has long since taught me that is of no value what
so ever, so I merely go by "does this look of any interest". If not,
I don't bother even _opening_ the email.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH V2 7/7] mm: Use pgdp_get() for accessing PGD entries
  2024-09-19 20:25             ` Russell King (Oracle)
@ 2024-09-20  6:57               ` Ryan Roberts
  2024-09-20  9:47                 ` Russell King (Oracle)
  0 siblings, 1 reply; 10+ messages in thread
From: Ryan Roberts @ 2024-09-20  6:57 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Anshuman Khandual, kernel test robot, linux-mm, llvm,
	oe-kbuild-all, Andrew Morton, David Hildenbrand,
	Mike Rapoport (IBM), Arnd Bergmann, x86, linux-m68k,
	linux-fsdevel, kasan-dev, linux-kernel, linux-perf-users,
	Dimitri Sivanich, Alexander Viro, Muchun Song, Andrey Ryabinin,
	Miaohe Lin, Dennis Zhou, Tejun Heo, Christoph Lameter,
	Uladzislau Rezki, Christoph Hellwig

On 19/09/2024 21:25, Russell King (Oracle) wrote:
> On Thu, Sep 19, 2024 at 07:49:09PM +0200, Ryan Roberts wrote:
>> On 19/09/2024 18:06, Russell King (Oracle) wrote:
>>> On Thu, Sep 19, 2024 at 05:48:58PM +0200, Ryan Roberts wrote:
>>>>> 32-bit arm uses, in some circumstances, an array because each level 1
>>>>> page table entry is actually two descriptors. It needs to be this way
>>>>> because each level 2 table pointed to by each level 1 entry has 256
>>>>> entries, meaning it only occupies 1024 bytes in a 4096 byte page.
>>>>>
>>>>> In order to cut down on the wastage, treat the level 1 page table as
>>>>> groups of two entries, which point to two consecutive 1024 byte tables
>>>>> in the level 2 page.
>>>>>
>>>>> The level 2 entry isn't suitable for the kernel's use cases (there are
>>>>> no bits to represent accessed/dirty and other important stuff that the
>>>>> Linux MM wants) so we maintain the hardware page tables and a separate
>>>>> set that Linux uses in the same page. Again, the software tables are
>>>>> consecutive, so from Linux's perspective, the level 2 page tables
>>>>> have 512 entries in them and occupy one full page.
>>>>>
>>>>> This is documented in arch/arm/include/asm/pgtable-2level.h
>>>>>
>>>>> However, what this means is that from the software perspective, the
>>>>> level 1 page table descriptors are an array of two entries, both of
>>>>> which need to be setup when creating a level 2 page table, but only
>>>>> the first one should ever be dereferenced when walking the tables,
>>>>> otherwise the code that walks the second level of page table entries
>>>>> will walk off the end of the software table into the actual hardware
>>>>> descriptors.
>>>>>
>>>>> I've no idea what the idea is behind introducing pgd_get() and what
>>>>> it's semantics are, so I can't comment further.
>>>>
>>>> The helper is intended to read the value of the entry pointed to by the passed
>>>> in pointer. And it shoiuld be read in a "single copy atomic" manner, meaning no
>>>> tearing. Further, the PTL is expected to be held when calling the getter. If the
>>>> HW can write to the entry such that its racing with the lock holder (i.e. HW
>>>> update of access/dirty) then READ_ONCE() should be suitable for most
>>>> architectures. If there is no possibility of racing (because HW doesn't write to
>>>> the entry), then a simple dereference would be sufficient, I think (which is
>>>> what the core code was already doing in most cases).
>>>
>>> The core code should be making no access to the PGD entries on 32-bit
>>> ARM since the PGD level does not exist. Writes are done at PMD level
>>> in arch code. Reads are done by core code at PMD level.
>>>
>>> It feels to me like pgd_get() just doesn't fit the model to which 32-bit
>>> ARM was designed to use decades ago, so I want full details about what
>>> pgd_get() is going to be used for and how it is going to be used,
>>> because I feel completely in the dark over this new development. I fear
>>> that someone hasn't understood the Linux page table model if they're
>>> wanting to access stuff at levels that effectively "aren't implemented"
>>> in the architecture specific kernel model of the page tables.
>>
>> This change isn't as big and scary as I think you fear.
> 
> The situation is as I state above. Core code must _not_ dereference pgd
> pointers on 32-bit ARM.

Let's just rewind a bit. This thread exists because the kernel test robot failed
to compile pgd_none_or_clear_bad() (a core-mm function) for the arm architecture
after Anshuman changed the direct pgd dereference to pgdp_get(). The reason
compilation failed is because arm defines its own pgdp_get() override, but it is
broken (there is a typo).

Code before Anshuman's change:

static inline int pgd_none_or_clear_bad(pgd_t *pgd)
{
	if (pgd_none(*pgd))
		return 1;
	if (unlikely(pgd_bad(*pgd))) {
		pgd_clear_bad(pgd);
		return 1;
	}
	return 0;
}

Code after Anshuman's change:

static inline int pgd_none_or_clear_bad(pgd_t *pgd)
{
	pgd_t old_pgd = pgdp_get(pgd);

	if (pgd_none(old_pgd))
		return 1;
	if (unlikely(pgd_bad(old_pgd))) {
		pgd_clear_bad(pgd);
		return 1;
	}
	return 0;
}

So the kernel _is_ alreday dereferencing pgd pointers for the arm arch, and has
been since the beginning of (git) time. Note that pgd_none_or_clear_bad() is
called from core code and from arm arch code.

As an aside, the kernel also dereferences p4d, pud, pmd and pte pointers in
various circumstances. And other changes in this series are also replacing those
direct dereferences with calls to similar helpers. The fact that these are all
folded (by a custom arm implementation if I've understood the below correctly)
just means that each dereference is returning what you would call the pmd from
the HW perspective, I think?

> 
>> The core-mm today
>> dereferences pgd pointers (and p4d, pud, pmd pointers) directly in its code. See
>> follow_pfnmap_start(),
> 
> Doesn't seem to exist at least not in 6.11.

Appologies, I'm on mm-unstable and that isn't upstream yet. See follow_pte() in
v6.11 or __apply_to_page_range(), or pgd_none_or_clear_bad() as per above.

> 
>> gup_fast_pgd_leaf(), and many other sites.
> 
> Only built when CONFIG_HAVE_GUP_FAST is set, which 32-bit ARM doesn't
> set because its meaningless there, except when LPAE is in use (which is
> basically the situation I'm discussing.)
> 
>> These changes
>> aim to abstract those dereferences into an inline function that the architecture
>> can override and implement if it so wishes.
>>
>> The core-mm implements default versions of these helper functions which do
>> READ_ONCE(), but does not currently use them consistently.
>>
>> From Anshuman's comments earlier in this thread, it looked to me like the arm
>> pgd_t type is too big to read with READ_ONCE() - it can't be atomically read on
>> that arch. So my proposal was to implement the override for arm to do exactly
>> what the core-mm used to do, which is a pointer dereference. So that would
>> result in exact same behaviour for the arm arch.
> 
> Let me say this again: core code must NOT dereference pgds on 32-bit
> non-LPAE ARM. They are meaningless to core code. A pgd_t does not
> reference a single entry in hardware. It references two entries.

OK, so there are 3 options; either I have misunderstood what the core code is
doing (because as per above, I'm asserting that core code _is_ dereferencing pgd
pointers), or the core code is dereferencing and that is buggy, or the core code
is derefencing and its working as designed. I believe its the latter, but am
willing to be proved wrong.

> 
>>> Essentially, on 32-bit 2-level ARM, the PGD is merely indexed by the
>>> virtual address. As far as the kernel is concerned, each entry is
>>> 64-bit, and the generic kernel code has no business accessing that
>>> through the pgd pointer.
>>>
>>> The pgd pointer is passed through the PUD and PMD levels, where it is
>>> typecast down through the kernel layers to a pmd_t pointer, where it
>>> becomes a 32-bit quantity. This results in only the _first_ level 1
>>> pointer being dereferenced by kernel code to a 32-bit pmd_t quantity.
>>> pmd_page_vaddr() converts this pmd_t quantity to a pte pointer (which
>>> points at the software level 2 page tables, not the hardware page
>>> tables.)
>>
>> As an aside, my understanding of Linux's pgtable model differs from what you
>> describe. As I understand it, Linux's logical page table model has 5 levels
>> (pgd, p4d, pud, pmd, pte). If an arch doesn't support all 5 levels, then the
>> middle levels can be folded away (p4d first, then pud, then pmd). But the
>> core-mm still logically walks all 5 levels. So if the HW supports 2 levels,
>> those levels are (pgd, pte). But you are suggesting that arm exposes pmd and
>> pte, which is not what Linux expects? (Perhaps you call it the pmd in the arch,
>> but that is being folded and accessed through the pgd helpers in core code, I
>> believe?
> 
> What ARM does dates from before the Linux MM invented the current
> "folding" method when we had three page table levels - pgd, pmd
> and pte. The current folding techniques were invented well after
> 32-bit ARM was implemented, which was using the original idea of
> how to fold the page tables.
> 
> The new folding came up with a totally different way of doing it,
> and I looked into converting 32-bit ARM over to it, but it wasn't
> possible to do so with the need for two level-1 entries to be
> managed for each level-2 page table.
> 
>>> So, as I'm now being told that the kernel wants to dereference the
>>> pgd level despite the model I describe above, alarm bells are ringing.
>>> I want full information please.
>>>
>>
>> This is not new; the kernel already dereferences the pgd pointers.
> 
> Consider that 32-bit ARM has been this way for decades (Linux was ported
> to 32-bit ARM by me back in the 1990s - so it's about 30 years old.)
> Compare that to what you're stating is "not new"... I beg to differ with
> your opinion on what is new and what isn't. It's all about the relative
> time.

By "not new" I meant that it's not introduced by this series. The kernel's
dereferencing of pgd pointers was present before this series came along.

> 
> This is how the page tables are walked:
> 
> static inline pgd_t *pgd_offset_pgd(pgd_t *pgd, unsigned long address)
> {
>         return (pgd + pgd_index(address));
> }
> 
> #define pgd_offset(mm, address)         pgd_offset_pgd((mm)->pgd, (address))
> 
> This returns a pointer to the pgd. This is then used with p4d_offset()
> when walking the next level, and this is defined on 32-bit ARM from
> include/asm-generic/pgtable-nop4d.h:
> 
> static inline p4d_t *p4d_offset(pgd_t *pgd, unsigned long address)
> {
>         return (p4d_t *)pgd;
> }
> 
> Then from include/asm-generic/pgtable-nopud.h:
> 
> static inline pud_t *pud_offset(p4d_t *p4d, unsigned long address)
> {
>         return (pud_t *)p4d;
> }
> 
> Then from arch/arm/include/asm/pgtable-2level.h:
> 
> static inline pmd_t *pmd_offset(pud_t *pud, unsigned long addr)
> {
>         return (pmd_t *)pud;
> }
> 
> All of the above casts result in the pgd_t pointer being cast down
> to a pmd_t pointer.
> 
> Now, looking at stuff in mm/memory.c such as unmap_page_range().
> 
>         pgd = pgd_offset(vma->vm_mm, addr);
> 
> This gets the pgd pointer into the level 1 page tables associated
> with addr, and passes it down to zap_p4d_range().
> 
> That passes it to p4d_offset() without dereferencing it, which on
> 32-bit ARM, merely casts the pgd_t pointer to a p4d_t pointer. Since
> a p4d_t is defined to be a struct of a pgd_t, this also points at an
> array of two 32-bit quantities. This pointer is passed down to
> zap_pud_range().
> 
> zap_pud_range() passes this pointer to pud_offset(), again without
> dereferencing it, and we end up with a pud_t pointer. Since pud_t is
> defined to be a struct of p4d_t, this also points to an array of two
> 32-bit quantities.
> 
> We then have:
> 
>                 if (pud_trans_huge(*pud) || pud_devmap(*pud)) {
> 
> These is an implicit memory copy/access between the memory pointed to
> by pud, and their destination (which might be a register). However,
> these are optimised away because 32-bit ARM doesn't set
> HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD nor ARCH_HAS_PTE_DEVMAP (as
> neither inline function make use of their argument.)
> 
> NOTE: If making these use READ_ONCE results in an access that can not
> be optimised away, that is a bug that needs to be addressed.
> 
> zap_pud_range() then passes the pud pointer to zap_pmd_range().
> 
> zap_pmd_range() passes this pointer to pud_offset() with no further
> dereferences, and this gets cast to a pmd_t pointer, which is a
> pointer to the first 32-bit quantity pointed to by the pgd_t pointer.
> 
> All the dereferences from this point on are 32-bit which can be done
> as single-copy atomic accesses. This will be the first real access
> to the level-1 page tables in this code path as the code stands today,
> and from this point on, accesses to the page tables are as the
> architecture intends them to be.
> 
> 
> Now, realise that for all of the accesses above that have all been
> optimised away, none of that code even existed when 32-bit ARM was
> using this method. The addition of these features not intefering
> with the way 32-bit non-LPAE ARM works relies on all of those
> accesses being optimised away, and they need to continue to be so
> going forward.
> 
> 
> Maybe that means that this new (and I mean new in relative terms
> compared to the age of the 32-bit ARM code) pgdp_get() accessor
> needs to be a non-dereferencing operation, so something like:
> 
> #define pgdp_get(pgdp)		((pgd_t){ })

I'm afraid I haven't digested all these arm-specific details. But if I'm right
that the core kernel does and is correct to dereference pgd pointers for these
non-LPAE arm builds, then I think you at least need arm's implementation to be:

#define pgdp_get(pgdp)		(*pgdp)

Thanks,
Ryan

> 
> in arch/arm/include/asm/pgtable-2level.h (note the corrected
> spelling of pgdp), and the existing pgdp_get() moved to
> arch/arm/include/asm/pgtable-3level.h. This isn't tested.
> 
> However, let me say this again... without knowing exactly how
> and where pgdp_get() is intended to be used, I'm clutching at
> straws here. Even looking at Linus' tree, there's very little in
> evidence there to suggest how pgdp_get() is intended to be used.
> For example, there's no references to it in mm/.
> 
> 
> Please realise that I have _no_ _clue_ what "[PATCH V2 7/7] mm: Use
> pgdp_get() for accessing PGD entries" is proposing. I wasn't on its
> Cc list. I haven't seen the patch. The first I knew anything about
> this was with the email that Anshuman Khandual sent in response to
> the kernel build bot's build error.

Here is the full series for context:

https://lore.kernel.org/linux-mm/20240917073117.1531207-1-anshuman.khandual@arm.com/

> 
> I'm afraid that the kernel build bot's build error means that this
> patch:
> 
> commit eba2591d99d1f14a04c8a8a845ab0795b93f5646
> Author: Alexandre Ghiti <alexghiti@rivosinc.com>
> Date:   Wed Dec 13 21:29:59 2023 +0100
> 
>     mm: Introduce pudp/p4dp/pgdp_get() functions
> 
> is actually broken. I'm sorry that I didn't review that, but how the
> series looked when it landed in my mailbox, it looked like it was
> specific to RISC-V and of no interest to me, so I didn't bother
> reading it (I get _lots_ of email, I can't read everything.) This
> is how it looks like in my mailbox (and note that they're marked
> as new to this day):
> 
> 3218 N T Dec 13 Alexandre Ghiti (   0) [PATCH v2 0/4] riscv: Use READ_ONCE()/WRI
> 3219 N T Dec 13 Alexandre Ghiti (   0) ├─>[PATCH v2 1/4] riscv: Use WRITE_ONCE()
> 3220 N T Dec 13 Alexandre Ghiti (   0) ├─>[PATCH v2 2/4] mm: Introduce pudp/p4dp
> 3221 N T Dec 13 Alexandre Ghiti (   0) ├─>[PATCH v2 3/4] riscv: mm: Only compile
> 3222 N T Dec 13 Alexandre Ghiti (   0) └─>[PATCH v2 4/4] riscv: Use accessors to
> 3223 N C Dec 14 Anup Patel      (   0)   └─>
> 
> Sorry, but I'm not even going to look at something like that when it
> looks like it's for RISC-V and nothing else.
> 
> One final point... because I'm sure someone's going to say "but you
> were in the To: header". I've long since given up using "am I in the
> Cc/To header" to carry any useful or meaningful information to
> indicate whether it's something I should read. I'm afraid that the
> kernel community has long since taught me that is of no value what
> so ever, so I merely go by "does this look of any interest". If not,
> I don't bother even _opening_ the email.
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH V2 7/7] mm: Use pgdp_get() for accessing PGD entries
  2024-09-20  6:57               ` Ryan Roberts
@ 2024-09-20  9:47                 ` Russell King (Oracle)
  2024-09-23 15:21                   ` Ryan Roberts
  0 siblings, 1 reply; 10+ messages in thread
From: Russell King (Oracle) @ 2024-09-20  9:47 UTC (permalink / raw)
  To: Ryan Roberts
  Cc: Anshuman Khandual, kernel test robot, linux-mm, llvm,
	oe-kbuild-all, Andrew Morton, David Hildenbrand,
	Mike Rapoport (IBM), Arnd Bergmann, x86, linux-m68k,
	linux-fsdevel, kasan-dev, linux-kernel, linux-perf-users,
	Dimitri Sivanich, Alexander Viro, Muchun Song, Andrey Ryabinin,
	Miaohe Lin, Dennis Zhou, Tejun Heo, Christoph Lameter,
	Uladzislau Rezki, Christoph Hellwig

On Fri, Sep 20, 2024 at 08:57:23AM +0200, Ryan Roberts wrote:
> On 19/09/2024 21:25, Russell King (Oracle) wrote:
> > On Thu, Sep 19, 2024 at 07:49:09PM +0200, Ryan Roberts wrote:
> >> On 19/09/2024 18:06, Russell King (Oracle) wrote:
> >>> On Thu, Sep 19, 2024 at 05:48:58PM +0200, Ryan Roberts wrote:
> >>>>> 32-bit arm uses, in some circumstances, an array because each level 1
> >>>>> page table entry is actually two descriptors. It needs to be this way
> >>>>> because each level 2 table pointed to by each level 1 entry has 256
> >>>>> entries, meaning it only occupies 1024 bytes in a 4096 byte page.
> >>>>>
> >>>>> In order to cut down on the wastage, treat the level 1 page table as
> >>>>> groups of two entries, which point to two consecutive 1024 byte tables
> >>>>> in the level 2 page.
> >>>>>
> >>>>> The level 2 entry isn't suitable for the kernel's use cases (there are
> >>>>> no bits to represent accessed/dirty and other important stuff that the
> >>>>> Linux MM wants) so we maintain the hardware page tables and a separate
> >>>>> set that Linux uses in the same page. Again, the software tables are
> >>>>> consecutive, so from Linux's perspective, the level 2 page tables
> >>>>> have 512 entries in them and occupy one full page.
> >>>>>
> >>>>> This is documented in arch/arm/include/asm/pgtable-2level.h
> >>>>>
> >>>>> However, what this means is that from the software perspective, the
> >>>>> level 1 page table descriptors are an array of two entries, both of
> >>>>> which need to be setup when creating a level 2 page table, but only
> >>>>> the first one should ever be dereferenced when walking the tables,
> >>>>> otherwise the code that walks the second level of page table entries
> >>>>> will walk off the end of the software table into the actual hardware
> >>>>> descriptors.
> >>>>>
> >>>>> I've no idea what the idea is behind introducing pgd_get() and what
> >>>>> it's semantics are, so I can't comment further.
> >>>>
> >>>> The helper is intended to read the value of the entry pointed to by the passed
> >>>> in pointer. And it shoiuld be read in a "single copy atomic" manner, meaning no
> >>>> tearing. Further, the PTL is expected to be held when calling the getter. If the
> >>>> HW can write to the entry such that its racing with the lock holder (i.e. HW
> >>>> update of access/dirty) then READ_ONCE() should be suitable for most
> >>>> architectures. If there is no possibility of racing (because HW doesn't write to
> >>>> the entry), then a simple dereference would be sufficient, I think (which is
> >>>> what the core code was already doing in most cases).
> >>>
> >>> The core code should be making no access to the PGD entries on 32-bit
> >>> ARM since the PGD level does not exist. Writes are done at PMD level
> >>> in arch code. Reads are done by core code at PMD level.
> >>>
> >>> It feels to me like pgd_get() just doesn't fit the model to which 32-bit
> >>> ARM was designed to use decades ago, so I want full details about what
> >>> pgd_get() is going to be used for and how it is going to be used,
> >>> because I feel completely in the dark over this new development. I fear
> >>> that someone hasn't understood the Linux page table model if they're
> >>> wanting to access stuff at levels that effectively "aren't implemented"
> >>> in the architecture specific kernel model of the page tables.
> >>
> >> This change isn't as big and scary as I think you fear.
> > 
> > The situation is as I state above. Core code must _not_ dereference pgd
> > pointers on 32-bit ARM.
> 
> Let's just rewind a bit. This thread exists because the kernel test robot failed
> to compile pgd_none_or_clear_bad() (a core-mm function) for the arm architecture
> after Anshuman changed the direct pgd dereference to pgdp_get(). The reason
> compilation failed is because arm defines its own pgdp_get() override, but it is
> broken (there is a typo).

Let's not rewind, because had you fully read and digested my reply, you
would have seen why this isn't a problem... but let me spell it out.

> 
> Code before Anshuman's change:
> 
> static inline int pgd_none_or_clear_bad(pgd_t *pgd)
> {
> 	if (pgd_none(*pgd))
> 		return 1;
> 	if (unlikely(pgd_bad(*pgd))) {
> 		pgd_clear_bad(pgd);
> 		return 1;
> 	}
> 	return 0;
> }

This isn't a problem as the code stands. While there is a dereference
in C, that dereference is a simple struct copy, something that we use
everywhere in the kernel. However, that is as far as it goes, because
neither pgd_none() and pgd_bad() make use of their argument, and thus
the compiler will optimise it away, resulting in no actual access to
the page tables - _as_ _intended_.

If these are going to be converted to pgd_get(), then we need pgd_get()
to _also_ be optimised away, and if e.g. this is the only place that
pgd_get() is going to be used, the suggestion I made in my previous
email is entirely reasonable, since we know that the result of pgd_get()
will not actually be used.

> As an aside, the kernel also dereferences p4d, pud, pmd and pte pointers in
> various circumstances.

I already covered these in my previous reply.

> And other changes in this series are also replacing those
> direct dereferences with calls to similar helpers. The fact that these are all
> folded (by a custom arm implementation if I've understood the below correctly)
> just means that each dereference is returning what you would call the pmd from
> the HW perspective, I think?

It'll "return" the first of each pair of level-1 page table entries,
which is pgd[0] or *p4d, *pud, *pmd - but all of these except *pmd
need to be optimised away, so throwing lots of READ_ONCE() around
this code without considering this is certainly the wrong approach.

> >> The core-mm today
> >> dereferences pgd pointers (and p4d, pud, pmd pointers) directly in its code. See
> >> follow_pfnmap_start(),
> > 
> > Doesn't seem to exist at least not in 6.11.
> 
> Appologies, I'm on mm-unstable and that isn't upstream yet. See follow_pte() in
> v6.11 or __apply_to_page_range(), or pgd_none_or_clear_bad() as per above.

Looking at follow_pte(), it's not a problem.

I think we wouldn't be having this conversation before:

commit a32618d28dbe6e9bf8ec508ccbc3561a7d7d32f0
Author: Russell King <rmk+kernel@arm.linux.org.uk>
Date:   Tue Nov 22 17:30:28 2011 +0000

    ARM: pgtable: switch to use pgtable-nopud.h

where:
-#define pgd_none(pgd)          (0)
-#define pgd_bad(pgd)           (0)

existed before this commit - and thus the dereference in things like:

	pgd_none(*pgd)

wouldn't even be visible to beyond the preprocessor step.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH V2 7/7] mm: Use pgdp_get() for accessing PGD entries
  2024-09-20  9:47                 ` Russell King (Oracle)
@ 2024-09-23 15:21                   ` Ryan Roberts
  0 siblings, 0 replies; 10+ messages in thread
From: Ryan Roberts @ 2024-09-23 15:21 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Anshuman Khandual, kernel test robot, linux-mm, llvm,
	oe-kbuild-all, Andrew Morton, David Hildenbrand,
	Mike Rapoport (IBM), Arnd Bergmann, x86, linux-m68k,
	linux-fsdevel, kasan-dev, linux-kernel, linux-perf-users,
	Dimitri Sivanich, Alexander Viro, Muchun Song, Andrey Ryabinin,
	Miaohe Lin, Dennis Zhou, Tejun Heo, Christoph Lameter,
	Uladzislau Rezki, Christoph Hellwig

>> Let's just rewind a bit. This thread exists because the kernel test robot failed
>> to compile pgd_none_or_clear_bad() (a core-mm function) for the arm architecture
>> after Anshuman changed the direct pgd dereference to pgdp_get(). The reason
>> compilation failed is because arm defines its own pgdp_get() override, but it is
>> broken (there is a typo).
> 
> Let's not rewind, because had you fully read and digested my reply, you
> would have seen why this isn't a problem... but let me spell it out.
> 
>>
>> Code before Anshuman's change:
>>
>> static inline int pgd_none_or_clear_bad(pgd_t *pgd)
>> {
>> 	if (pgd_none(*pgd))
>> 		return 1;
>> 	if (unlikely(pgd_bad(*pgd))) {
>> 		pgd_clear_bad(pgd);
>> 		return 1;
>> 	}
>> 	return 0;
>> }
> 
> This isn't a problem as the code stands. While there is a dereference
> in C, that dereference is a simple struct copy, something that we use
> everywhere in the kernel. However, that is as far as it goes, because
> neither pgd_none() and pgd_bad() make use of their argument, and thus
> the compiler will optimise it away, resulting in no actual access to
> the page tables - _as_ _intended_.

Right. Are you saying you depend upon those loads being optimized away for
correctness or performance reasons?

> 
> If these are going to be converted to pgd_get(), then we need pgd_get()
> to _also_ be optimised away, 

OK, agreed.

So perhaps the best approach is to modify the existing default pxdp_get()
implementations to just do a C dereference. That will ensure that there are no
intended consequences, unlike moving to READ_ONCE() by default. Then riscv
(which I think is the only arch to actually use pxdp_get() currently?) will need
its own pxdp_get() overrides, which use READ_ONCE(). arm64 would also define its
own overrides in terms of READ_ONCE() to ensure single copy atomicity in the
presence of HW updates.

How does that sound to you?

> and if e.g. this is the only place that
> pgd_get() is going to be used, the suggestion I made in my previous
> email is entirely reasonable, since we know that the result of pgd_get()
> will not actually be used.

I guess you could do that as an arm-specific override, but I don't think it adds
anything over using my proposed reworked default? Your call.

> 
>> As an aside, the kernel also dereferences p4d, pud, pmd and pte pointers in
>> various circumstances.
> 
> I already covered these in my previous reply.
> 
>> And other changes in this series are also replacing those
>> direct dereferences with calls to similar helpers. The fact that these are all
>> folded (by a custom arm implementation if I've understood the below correctly)
>> just means that each dereference is returning what you would call the pmd from
>> the HW perspective, I think?
> 
> It'll "return" the first of each pair of level-1 page table entries,
> which is pgd[0] or *p4d, *pud, *pmd - but all of these except *pmd
> need to be optimised away, so throwing lots of READ_ONCE() around
> this code without considering this is certainly the wrong approach.

Yep, got it.

> 
>>>> The core-mm today
>>>> dereferences pgd pointers (and p4d, pud, pmd pointers) directly in its code. See
>>>> follow_pfnmap_start(),
>>>
>>> Doesn't seem to exist at least not in 6.11.
>>
>> Appologies, I'm on mm-unstable and that isn't upstream yet. See follow_pte() in
>> v6.11 or __apply_to_page_range(), or pgd_none_or_clear_bad() as per above.
> 
> Looking at follow_pte(), it's not a problem.
> 
> I think we wouldn't be having this conversation before:
> 
> commit a32618d28dbe6e9bf8ec508ccbc3561a7d7d32f0
> Author: Russell King <rmk+kernel@arm.linux.org.uk>
> Date:   Tue Nov 22 17:30:28 2011 +0000
> 
>     ARM: pgtable: switch to use pgtable-nopud.h
> 
> where:
> -#define pgd_none(pgd)          (0)
> -#define pgd_bad(pgd)           (0)
> 
> existed before this commit - and thus the dereference in things like:
> 
> 	pgd_none(*pgd)
> 
> wouldn't even be visible to beyond the preprocessor step.
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2024-09-23 15:21 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20240917073117.1531207-8-anshuman.khandual@arm.com>
2024-09-18 20:30 ` [PATCH V2 7/7] mm: Use pgdp_get() for accessing PGD entries kernel test robot
2024-09-19  7:55   ` Anshuman Khandual
2024-09-19  9:11     ` Russell King (Oracle)
2024-09-19 15:48       ` Ryan Roberts
2024-09-19 17:06         ` Russell King (Oracle)
2024-09-19 17:49           ` Ryan Roberts
2024-09-19 20:25             ` Russell King (Oracle)
2024-09-20  6:57               ` Ryan Roberts
2024-09-20  9:47                 ` Russell King (Oracle)
2024-09-23 15:21                   ` Ryan Roberts

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox