From: Andrea Arcangeli <aarcange@redhat.com>
To: Robin Murphy <robin.murphy@arm.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Roman Gushchin <guro@fb.com>, Linux MM <linux-mm@kvack.org>,
iommu@lists.linux-foundation.org,
Wei Yang <richardw.yang@linux.intel.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Andrew Morton <akpm@linux-foundation.org>,
Michal Hocko <mhocko@kernel.org>, Will Deacon <will@kernel.org>,
Yang Shi <shy828301@gmail.com>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: Re: kernel BUG at mm/huge_memory.c:2613!
Date: Mon, 22 Jun 2020 17:56:58 -0400 [thread overview]
Message-ID: <20200622215658.GC12414@redhat.com> (raw)
In-Reply-To: <e31308f7-4e3c-b6bc-7201-3861b062d257@arm.com>
Hello,
On Mon, Jun 22, 2020 at 04:30:41PM +0100, Robin Murphy wrote:
> On 2020-06-22 13:46, Joerg Roedel wrote:
> > + Robin
> >
> > Robin, any idea on this?
>
> After a bit of archaeology, this dates back to the original review:
>
> https://lore.kernel.org/linux-arm-kernel/54C285D4.3070802@arm.com/
> https://lore.kernel.org/linux-arm-kernel/54DA2666.9030003@arm.com/
>
> In summary: originally this inherited from other arch code that did
> simply strip __GFP_COMP; that was deemed questionable because of the
> nonsensical comment about CONFIG_HUGETLBFS that was stuck to it; the
> current code is like it is because in 5 and a half years nobody said
> that it's wrong :)
>
> If there actually *are* good reasons for stripping __GFP_COMP, then I've
> certainly no objection to doing so.
The main question is if there's any good reasons for not forbidding
__GFP_COMP to be specified in the callers. The reason given in the
comment isn't convincing.
I don't see how a caller that gets a pointer can care about how the
page structure looks like and in turn why it's asking for __GFP_COMP.
As far as I can tell there are two orthogonal issues in play here:
1) The comment about __GFP_COMP facilitating the sound driver to do
partial mapping doesn't make much sense. It's probably best to
WARN_ON immediately in dma_alloc_coherent if __GFP_COMP is
specified, not only down the call stack in the
__iommu_dma_alloc_pages() path.
Note: the CMA paths would already ignore __GFP_COMP if it's
specified so that __GFP_COMP request can already be ignored. It
sounds preferable to warn the caller it's asking something it can't
get, than to silently ignore __GFP_COMP.
On a side note: hugetlbfs/THP pages can only be allocated with
__GFP_COMP because for example put_page() must work on all tail
pages (you can't call compound_head() unless the tail page is part
of a compound page). But for private driver pages mapped by
remap_pfn_range, any full or partial mapping is done manually and
nobody can call GUP on VM_PFNMAP|VM_IO anyway (there's not even the
requirement of a page struct backing those mappings in fact).
2) __iommu_dma_alloc_pages cannot use __GFP_COMP if it intends to
return an array of small pages, which is the only thing that the
current sg_alloc_table_from_pages() supports in input. split_page
will work as expected to generate small pages from non-compound
order>0 pages, incidentally it's implement on mm/page_alloc.c, not
in huge_memory.c.
split_huge_page as opposed is not intended to be used on newly
allocated compound page. Maybe we should renamed it to
split_trans_huge_page to make it more explicit, since it won't even
work on hugetlbfs (compound) pages.
Thanks,
Andrea
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
WARNING: multiple messages have this Message-ID (diff)
From: Andrea Arcangeli <aarcange@redhat.com>
To: Robin Murphy <robin.murphy@arm.com>
Cc: Joerg Roedel <joro@8bytes.org>, Roman Gushchin <guro@fb.com>,
Yang Shi <shy828301@gmail.com>,
iommu@lists.linux-foundation.org,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Linux MM <linux-mm@kvack.org>, Michal Hocko <mhocko@kernel.org>,
Johannes Weiner <hannes@cmpxchg.org>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Andrew Morton <akpm@linux-foundation.org>,
Wei Yang <richardw.yang@linux.intel.com>,
Will Deacon <will@kernel.org>
Subject: Re: kernel BUG at mm/huge_memory.c:2613!
Date: Mon, 22 Jun 2020 17:56:58 -0400 [thread overview]
Message-ID: <20200622215658.GC12414@redhat.com> (raw)
In-Reply-To: <e31308f7-4e3c-b6bc-7201-3861b062d257@arm.com>
Hello,
On Mon, Jun 22, 2020 at 04:30:41PM +0100, Robin Murphy wrote:
> On 2020-06-22 13:46, Joerg Roedel wrote:
> > + Robin
> >
> > Robin, any idea on this?
>
> After a bit of archaeology, this dates back to the original review:
>
> https://lore.kernel.org/linux-arm-kernel/54C285D4.3070802@arm.com/
> https://lore.kernel.org/linux-arm-kernel/54DA2666.9030003@arm.com/
>
> In summary: originally this inherited from other arch code that did
> simply strip __GFP_COMP; that was deemed questionable because of the
> nonsensical comment about CONFIG_HUGETLBFS that was stuck to it; the
> current code is like it is because in 5 and a half years nobody said
> that it's wrong :)
>
> If there actually *are* good reasons for stripping __GFP_COMP, then I've
> certainly no objection to doing so.
The main question is if there's any good reasons for not forbidding
__GFP_COMP to be specified in the callers. The reason given in the
comment isn't convincing.
I don't see how a caller that gets a pointer can care about how the
page structure looks like and in turn why it's asking for __GFP_COMP.
As far as I can tell there are two orthogonal issues in play here:
1) The comment about __GFP_COMP facilitating the sound driver to do
partial mapping doesn't make much sense. It's probably best to
WARN_ON immediately in dma_alloc_coherent if __GFP_COMP is
specified, not only down the call stack in the
__iommu_dma_alloc_pages() path.
Note: the CMA paths would already ignore __GFP_COMP if it's
specified so that __GFP_COMP request can already be ignored. It
sounds preferable to warn the caller it's asking something it can't
get, than to silently ignore __GFP_COMP.
On a side note: hugetlbfs/THP pages can only be allocated with
__GFP_COMP because for example put_page() must work on all tail
pages (you can't call compound_head() unless the tail page is part
of a compound page). But for private driver pages mapped by
remap_pfn_range, any full or partial mapping is done manually and
nobody can call GUP on VM_PFNMAP|VM_IO anyway (there's not even the
requirement of a page struct backing those mappings in fact).
2) __iommu_dma_alloc_pages cannot use __GFP_COMP if it intends to
return an array of small pages, which is the only thing that the
current sg_alloc_table_from_pages() supports in input. split_page
will work as expected to generate small pages from non-compound
order>0 pages, incidentally it's implement on mm/page_alloc.c, not
in huge_memory.c.
split_huge_page as opposed is not intended to be used on newly
allocated compound page. Maybe we should renamed it to
split_trans_huge_page to make it more explicit, since it won't even
work on hugetlbfs (compound) pages.
Thanks,
Andrea
next prev parent reply other threads:[~2020-06-22 21:57 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-06-19 0:19 kernel BUG at mm/huge_memory.c:2613! Roman Gushchin via iommu
2020-06-19 0:19 ` Roman Gushchin
2020-06-19 0:46 ` Yang Shi
2020-06-19 0:46 ` Yang Shi
2020-06-19 1:14 ` Roman Gushchin via iommu
2020-06-19 1:14 ` Roman Gushchin
2020-06-19 1:20 ` Yang Shi
2020-06-19 1:20 ` Yang Shi
2020-06-19 2:40 ` Andrea Arcangeli
2020-06-19 2:40 ` Andrea Arcangeli
2020-06-19 18:55 ` Roman Gushchin via iommu
2020-06-19 18:55 ` Roman Gushchin
2020-06-19 20:56 ` David Rientjes via iommu
2020-06-19 20:56 ` David Rientjes
2020-06-19 22:57 ` Roman Gushchin via iommu
2020-06-19 22:57 ` Roman Gushchin
2020-06-21 20:05 ` David Rientjes via iommu
2020-06-21 20:05 ` David Rientjes
2020-06-22 12:46 ` Joerg Roedel
2020-06-22 12:46 ` Joerg Roedel
2020-06-22 15:30 ` Robin Murphy
2020-06-22 15:30 ` Robin Murphy
2020-06-22 21:56 ` Andrea Arcangeli [this message]
2020-06-22 21:56 ` Andrea Arcangeli
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200622215658.GC12414@redhat.com \
--to=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=guro@fb.com \
--cc=hannes@cmpxchg.org \
--cc=iommu@lists.linux-foundation.org \
--cc=kirill.shutemov@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@kernel.org \
--cc=richardw.yang@linux.intel.com \
--cc=robin.murphy@arm.com \
--cc=shy828301@gmail.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.