From: Greg Ungerer <gerg@kernel.org>
To: Serge Semin <fancer.lancer@gmail.com>,
Florian Fainelli <f.fainelli@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
"open list:MIPS" <linux-mips@vger.kernel.org>,
Jiaxun Yang <jiaxun.yang@flygoat.com>
Subject: Re: "virt_to_phys used for non-linear address" warnings
Date: Fri, 8 Jul 2022 12:06:10 +1000 [thread overview]
Message-ID: <532a9e40-e4ea-c898-d05a-4ddcc0740f2a@kernel.org> (raw)
In-Reply-To: <20220707114047.o4wrbab2laoka3nb@mobilestation>
Hi Sergey,
On 7/7/22 21:40, Serge Semin wrote:
> !To+= Florian
>
> On Thu, Jul 07, 2022 at 02:49:36PM +1000, Greg Ungerer wrote:
>> Hi Serge,
>>
>> On 6/7/22 22:05, Serge Semin wrote:
>>> On Wed, Jul 06, 2022 at 03:47:41PM +1000, Greg Ungerer wrote:
>>>> Hi Serge,
>>>>
>>>> On 6/7/22 03:42, Serge Semin wrote:
>>>>> !To += Thomas
>>>>> !Cc += Jiaxun
>>>>>
>>>>> Hi Greg,
>>>>>
>>>>> On Fri, Jul 01, 2022 at 05:24:22PM +1000, Greg Ungerer wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I am debugging a strange memory problem on a Mediatek MT7621 SoC based
>>>>>> hardware platform. That problem leads to rare and somewhat random
>>>>>> oops that are mostly within vma and generic memory functions (often in
>>>>>> kmem_cache_alloc(), but sometimes other places like unlink_anon_vmas() or
>>>>>> anon_vma_interval_tree_remove() or vma_interval_tree_remove()).
>>>>>>
>>>>>> To help track this down I enabled a few of the kernels hacking memory
>>>>>> debug config options. Now this immediately leads to getting this on
>>>>>> every process fork/exec:
>>>>>>
>>>>>> ------------[ cut here ]------------
>>>>>> WARNING: CPU: 0 PID: 1 at arch/mips/mm/physaddr.c:38 __virt_to_phys+0x50/0x84
>>>>>> virt_to_phys used for non-linear address: c443e370 (0xbfbd0000)
>>>>>> Modules linked in:
>>>>>> CPU: 0 PID: 1 Comm: init Not tainted 5.17.0-ac0 #1
>>>>>> Stack : 81c70000 7ffbd000 77de5000 81086784 00000000 00000004 00000000 d95b60c8
>>>>>> 80441c84 81c43654 81b70000 81b60000 804583d8 00000001 80441c28 8045cd00
>>>>>> 00000000 00000000 81aaf3b4 80441a70 00000187 80441adc 00000000 20306361
>>>>>> 203a6d6d 81c4d2f5 81c4d31c 74696e69 81b60000 00000001 80441d3c 81b6dbf0
>>>>>> 82070040 8066f240 81c70000 7ffbd000 00000000 814ff0f4 00000000 81c40000
>>>>>> ...
>>>>>> Call Trace:
>>>>>> [<81008ed0>] show_stack+0x38/0x118
>>>>>> [<8198f9e4>] dump_stack_lvl+0x5c/0x7c
>>>>>> [<81989300>] __warn+0xc0/0xf4
>>>>>> [<819893c0>] warn_slowpath_fmt+0x8c/0xb8
>>>>>> [<81025480>] __virt_to_phys+0x50/0x84
>>>>>> [<8100bb30>] arch_setup_additional_pages+0x120/0x230
>>>>>> [<81239280>] load_elf_binary+0xacc/0x14e0
>>>>>> [<811d6788>] bprm_execve+0x288/0x5dc
>>>>>> [<811d7240>] kernel_execve+0x130/0x1b4
>>>>>> [<81988154>] try_to_run_init_process+0x14/0x4c
>>>>>> [<81995e40>] kernel_init+0xe4/0x118
>>>>>> [<81003398>] ret_from_kernel_thread+0x14/0x1c
>>>>>> ---[ end trace 0000000000000000 ]---
>>>>>>
>>>>>> This is caused by this code in arch/mips/kernel/vdso.c, function
>>>>>> arch_setup_additional_pages():
>>>>>>
>>>>>> /* Map GIC user page. */
>>>>>> if (gic_size) {
>>>>>> gic_base = (unsigned long)mips_gic_base + MIPS_GIC_USER_OFS;
>>>>>> gic_pfn = virt_to_phys((void *)gic_base) >> PAGE_SHIFT;
>>>>>>
>>>>>> ret = io_remap_pfn_range(vma, base, gic_pfn, gic_size,
>>>>>> pgprot_noncached(vma->vm_page_prot));
>>>>>> if (ret)
>>>>>> goto out;
>>>>>> }
>>>>>>
>>>>>
>>>>>> That virt_to_phys() is being passed a value assigned from an ioremap().
>>>>>> According to the comments in io.h, this is not a correct usage of it:
>>>>>>
>>>>>> * The returned physical address is the physical (CPU) mapping for
>>>>>> * the memory address given. It is only valid to use this function on
>>>>>> * addresses directly mapped or allocated via kmalloc.
>>>>>
>>>>> Physical address you are trying to retrieve is directly mapped. It's
>>>>> 0xbfbd0000, which belong to the UCAC kseg1 MIPS space:
>>>>> https://johnloomis.org/microchip/pic32/memory/memory.html
>>>>> So virt_to_phys() shall work for it with no problem.
>>>>
>>>> Yeah, I can see that in this case it always ended up with the same pfn.
>>>>
>>>>
>>>>> Anyway IIUC from the __debug_virt_addr_valid() implementation you've
>>>>> got the "high_memory" variable initialized with inaccurate value. At
>>>>> very least it causes your directly mapped IO-address to cause the
>>>>> warning printed. In some circumstance it may lead to more complex
>>>>> problems. I've got a patch created some time ago, which fixes that
>>>>> misconfiguration. Could you try it out and see whether it solves your
>>>>> problems?
>>>>
>>>
>>>> Tried the patch. Does not fix the issue. Still get the warning and dump
>>>> on every process startup.
>>>
>>> It's pity. I was hoping it would work and we wouldn't need to debug the
>>> problem remotely. See my next guess then.
>>>
>>>>
>>>>
>>>>> Could you also send out a log with "Zone ranges:" info retrieved on
>>>>> the kernel without my patch applied?
>>>>
>>>> Before patch (original code):
>>>>
>>>> Zone ranges:
>>>> Normal [mem 0x0000000000000000-0x000000000fffffff]
>>>> Movable zone start for each node
>>>> Early memory node ranges
>>>> node 0: [mem 0x0000000000000000-0x000000000fffffff]
>>>> Initmem setup node 0 [mem 0x0000000000000000-0x000000000fffffff]
>>>
>>> Here is what I can see from the arch code and the info above:
>>> 1). Mediatek MT7621 SoC is based on the MIPS32r2 arch revision. So it
>>> supports 32bit kernel with highmem used to access the memory above
>>> HIGHMEM_START = _AC(0x20000000, UL). // 512MB+
>>> Most likely CONFIG_HIGHMEM is enabled in your kernel.
>>
>> No, HIGHMEM is disabled in my kernel:
>>
>> # CONFIG_HIGHMEM is not set
>> CONFIG_CPU_SUPPORTS_HIGHMEM=y
>> CONFIG_SYS_SUPPORTS_HIGHMEM=y
>>
>> Yes, the MT7621 is a MIPS32r2 based CPU:
>>
>> # CONFIG_CPU_MIPS32_R1 is not set
>> CONFIG_CPU_MIPS32_R2=y
>> CONFIG_SYS_SUPPORTS_ZBOOT=y
>> CONFIG_SYS_HAS_CPU_MIPS32_R1=y
>> CONFIG_SYS_HAS_CPU_MIPS32_R2=y
>> CONFIG_WEAK_ORDERING=y
>> # end of CPU selection
>>
>> CONFIG_CPU_MIPS32=y
>> CONFIG_CPU_MIPSR2=y
>> CONFIG_TARGET_ISA_REV=2
>> CONFIG_SYS_SUPPORTS_32BIT_KERNEL=y
>> CONFIG_CPU_SUPPORTS_32BIT_KERNEL=y
>>
>>
>>> 2). Judging by the Zone ranges log above your platform have only 256MB
>>> of memory detected. Thus
>>> max_pfn = 256MB >> PAGE_SHIFT
>>> max_low_pfn = max_pfn (see arch/mips/kernel/setup.c: bootmem_init)
>>> high_memory = (void *) __va(max_low_pfn << PAGE_SHIFT) (see arch/mips/mm/init.c: mem_init())
>>> All of that memory is accessible over the kseg0 segment, which
>>> reflects the directly mapped memory. So no highmem really needed.
>>> That's why there is no highmem zone printed in the log. It also caused
>>> my patch not working for you.
>>
>> Yep, that makes sense.
>> My platform does have 256MB of RAM.
>>
>>
>>> 3) There is still IO-memory available in the directly mapped range:
>>> [0x1e000000 0x1fffffff] (arch/mips/boot/dts/ralink/mt7621.dtsi), which
>>> due to too small RAM detected turned to be out of the memory
>>> PFN/page-based access.
>>>
>>> Anyway I am not absolutely sure, but still inclined to thinking that
>>> the problem is in the inaccurate/invalid max_low_pfn value, which
>>> consequently causes having invalid 'high_memory' variable content.
>>> Most likely the problem has been caused by the commit a94e4f24ec83
>>> ("MIPS: init: Drop boot_mem_map"), but it's too complicated to revert.
>>> So at this stage I would focus on hacking the bootmem_init() method
>>> defined in arch/mips/kernel/setup.c. As a dirty hack I would try the
>>> next fixup:
>>>
>>> --- a/arch/mips/kernel/setup.c
>>> +++ b/arch/mips/kernel/setup.c
>>> if (min_low_pfn >= max_low_pfn)
>>> panic("Incorrect memory mapping !!!");
>>> + max_pfn = max_low_pfn = PFN_DOWN(HIGHMEM_START);
>>> if (max_pfn > PFN_DOWN(HIGHMEM_START)) {
>>> #ifdef CONFIG_HIGHMEM
>>> highstart_pfn = PFN_DOWN(HIGHMEM_START);
>>>
>>> If it helps then the problem most likely is in the
>>> max_pfn/min_low_pfn/max_low_pfn setup algorithm implemented in the
>>> bootmem_init() method. I would think of refactoring it then.
>>
>
>> No, still makes no difference. As soon as init runs the warnings start:
>>
>> ...
>> Freeing unused kernel image (initmem) memory: 288K
>> This architecture does not have kernel memory protection.
>> Run /sbin/init as init process
>> Run /etc/init as init process
>> Run /bin/init as init process
>> ------------[ cut here ]------------
>> WARNING: CPU: 3 PID: 1 at arch/mips/mm/physaddr.c:38 __virt_to_phys+0x50/0x84
>> virt_to_phys used for non-linear address: (ptrval) (0xbfbd0000)
>> Modules linked in:
>> CPU: 3 PID: 1 Comm: init Not tainted 5.17.0-ac0 #8
>> Stack : 80c20000 7ffd6000 77de7000 80086784 00000000 00000004 00000000 f881894e
>> 82c41c84 80bf3654 80b20000 80b10000 82c583d8 00000001 82c41c28 82c9bf00
>> 00000000 00000000 80a64648 82c41a70 00000149 82c41adc 00000000 20306361
>> 203a6d6d 80bfd185 80bfd1ac 74696e69 80b10000 00000001 82c41d3c 80b1bcb0
>> 82688040 85257240 80c20000 7ffd6000 00000000 804fc754 0000000c 80bf000c
>> ...
>> Call Trace:
>> [<80008ed0>] show_stack+0x38/0x118
>> [<80957c14>] dump_stack_lvl+0x5c/0x7c
>> [<80951530>] __warn+0xc0/0xf4
>> [<809515f0>] warn_slowpath_fmt+0x8c/0xb8
>> [<80025480>] __virt_to_phys+0x50/0x84
>> [<8000bb30>] arch_setup_additional_pages+0x120/0x230
>> [<802392a0>] load_elf_binary+0xacc/0x14e0
>> [<801d67a8>] bprm_execve+0x288/0x5dc
>> [<801d7260>] kernel_execve+0x130/0x1b4
>> [<80950384>] try_to_run_init_process+0x14/0x4c
>> [<8095df18>] kernel_init+0xe4/0x118
>> [<80003398>] ret_from_kernel_thread+0x14/0x1c
>>
>> ---[ end trace 0000000000000000 ]---
>
> Finally I think I've got it. Your high_memory value still doesn't permit the
> IO-addresses. It is initialized to support the kseg0 virtual space only.
> [0x80000000; 0xa0000000) = [CAC_BASE; CAC_BASE + 512MB): kseg0 - Cached;
> [0xa0000000; 0xc0000000) = [UCAC_BASE; UCAC_BASE + 512MB]: kseg1 - Uncached;
> In your case high_memory ~ (0x90000000|0xa0000000) (without and with
> my last suggestion applied). You keep getting the warning printed because the
> MIPS GIC virtual base address is 0xbfbd0000 and it belongs to kseg1,
> which is higher than the defined by the high_memory limit.
>
> So to speak the warning&trace is misleading in this case because you
> do access the linearly mapped address, but it's just the uncached
> mirror of the low memory. Most likely using the high_memory variable
> as the upper limit of the linearly mapped space wasn't quite correct,
> because the __pa() macro supports both Cached and Uncached addresses
> thus returning correct physical address for you. (The modification
> was introduced in commit dfad83cb7193 ("MIPS: Add support for
> CONFIG_DEBUG_VIRTUAL"), @Florian please join the discussion.) But I
> currently can't suggest better alternative.
>
> Getting back to what you asked in the original email:
>
>> It is trivial to fix this by using __pa() directly here instead of
>> virt_to_phys(). So simply doing this fixes it:
>> ...
>> I am wondering if that is the right approach or if there is a more correct way?
>
> It's OK to use __pa() in this particular case. But in general IMO it
> would be more useful to fix the __virt_addr_valid() method
> implementation somehow. At least taking the Uncached addresses into
> account there would be required.
>
> In your first email you also said:
>
>> That problem leads to rare and somewhat random
>> oops that are mostly within vma and generic memory functions (often in
>> kmem_cache_alloc(), but sometimes other places like unlink_anon_vmas() or
>> anon_vma_interval_tree_remove() or vma_interval_tree_remove()).
>
> Alas I don't really see how the denoted warning is connected with the
> random oopses you described, because the warning turn to be
> misleading. Most likely the oopses are caused by another problem.
Totally agree. I suspect they are in no way related.
I was enabling the kernel's memory debug options to see if anything
tripped. And hit this specific warning strait away. With it fixed I don't
get any other warnings during test runs - but the system does eventually
invoke the OOM after a couple of hours running. It doesn't do this
with none of the kernel memory debug options enabled.
Given we have essentially solved the issue of the virt_to_phys() warnings
I will start another thread with more detail about the oops and other
odd behavior I see. There is a bunch of detail I want to give on that.
Thanks
Greg
> -Sergey
>
>>
>> Regards
>> Greg
>>
>>
>>> -Sergey
>>>
>>>>
>>>> Regards
>>>> Greg
>>>>
>>>>
>>>>
>>>>> -Sergey
>>>>>
>>>>>>
>>>>>> So the debug warning is consistent with this comment.
>>>>>>
>>>>>> It is trivial to fix this by using __pa() directly here instead of
>>>>>> virt_to_phys(). So simply doing this fixes it:
>>>>>>
>>>>>> --- a/arch/mips/kernel/vdso.c
>>>>>> +++ b/arch/mips/kernel/vdso.c
>>>>>> @@ -159,7 +159,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
>>>>>> /* Map GIC user page. */
>>>>>> if (gic_size) {
>>>>>> gic_base = (unsigned long)mips_gic_base + MIPS_GIC_USER_OFS;
>>>>>> - gic_pfn = virt_to_phys((void *)gic_base) >> PAGE_SHIFT;
>>>>>> + gic_pfn = __pa(gic_base) >> PAGE_SHIFT;
>>>>>> ret = io_remap_pfn_range(vma, base, gic_pfn, gic_size,
>>>>>> pgprot_noncached(vma->vm_page_prot));
>>>>>>
>>>>>> I am wondering if that is the right approach or if there is a more correct way?
>>>>>>
>>>>>> Regards
>>>>>> Greg
>>>>>>
prev parent reply other threads:[~2022-07-08 2:06 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-07-01 7:24 "virt_to_phys used for non-linear address" warnings Greg Ungerer
2022-07-05 17:42 ` Serge Semin
2022-07-06 5:47 ` Greg Ungerer
2022-07-06 12:05 ` Serge Semin
2022-07-07 4:49 ` Greg Ungerer
2022-07-07 11:40 ` Serge Semin
2022-07-07 16:59 ` Florian Fainelli
2022-07-07 19:42 ` Serge Semin
2022-07-07 20:53 ` Florian Fainelli
2022-07-08 2:06 ` Greg Ungerer [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=532a9e40-e4ea-c898-d05a-4ddcc0740f2a@kernel.org \
--to=gerg@kernel.org \
--cc=f.fainelli@gmail.com \
--cc=fancer.lancer@gmail.com \
--cc=jiaxun.yang@flygoat.com \
--cc=linux-mips@vger.kernel.org \
--cc=tsbogend@alpha.franken.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox