From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Mel Gorman <mgorman@suse.de>, Rik van Riel <riel@redhat.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Christoph Lameter <cl@gentwo.org>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	Steve Capper <steve.capper@linaro.org>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@suse.cz>,
	Jerome Marchand <jmarchan@redhat.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Jaroslav Kysela <perex@perex.cz>, Takashi Iwai <tiwai@suse.de>,
	alsa-devel@alsa-project.org
Subject: Re: [PATCH 05/16] page-flags: define behavior of FS/IO-related flags on compound pages
Date: Mon, 23 Mar 2015 14:17:26 +0200
Message-ID: <20150323121726.GB30088@node.dhcp.inet.fi>
In-Reply-To: <alpine.LSU.2.11.1503221613280.2680@eggly.anvils>

On Sun, Mar 22, 2015 at 05:02:58PM -0700, Hugh Dickins wrote:
> On Thu, 19 Mar 2015, Kirill A. Shutemov wrote:
> > On Thu, Mar 19, 2015 at 11:29:52AM -0700, Dave Hansen wrote:
> > > On 03/19/2015 10:08 AM, Kirill A. Shutemov wrote:
> > > > The odd exception is PG_dirty: sound uses compound pages and maps them
> > > > with PTEs. NO_COMPOUND triggers VM_BUG_ON() in set_page_dirty() on
> > > > handling shared fault. Let's use HEAD for PG_dirty.
> 
> It really depends on what you do with PageDirty of the head, when you
> get to support 4k pagecache with subpages of a huge compound page.
> 
> HEAD will be fine, so long as PageDirty on the head means the whole
> huge page must be written back.  I expect that's what you will choose;
> but one could consider that if a huge page is only mapped read-only,
> but a few subpages of it writable, then only the few need be written
> back, in which case ANY would be more appropriate.  NO_COMPOUND is
> certainly wrong.
> 
> But that does illustrate that I consider this patch series premature:
> it belongs with your huge pagecache implementation.  You seem to be
> "tidying up" and adding overhead to things that are fine as they are.

I agree, it can be ANY too, since we don't use PG_dirty anywhere at the
moment. My first thought was that it's better to match PG_dirty behaviour
with that of the LRU-related flags, but that need not be the case.
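
To make the difference between the three policies concrete, here is a toy
userspace model. It is only a sketch: struct toy_page, toy_flag_target() and
toy_set_dirty() are hypothetical names made up for illustration, not kernel
code, and a NULL return stands in for the VM_BUG_ON() mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model of the policies named in this thread; not kernel code. */
enum toy_policy { TOY_ANY, TOY_HEAD, TOY_NO_COMPOUND };

struct toy_page {
	bool dirty;             /* stands in for PG_dirty */
	struct toy_page *head;  /* NULL for an order-0 page; otherwise the head page */
};

/* Build one compound page out of n consecutive toy pages; pages[0] is the head. */
static void toy_init_compound(struct toy_page *pages, size_t n)
{
	assert(n > 0);
	for (size_t i = 0; i < n; i++) {
		pages[i].dirty = false;
		pages[i].head = &pages[0];
	}
}

/*
 * Pick the page whose flag bit is touched under a given policy.
 * NULL models the assertion that NO_COMPOUND would trigger.
 */
static struct toy_page *toy_flag_target(struct toy_page *page, enum toy_policy pol)
{
	switch (pol) {
	case TOY_ANY:
		return page;                            /* the page itself, tail or not */
	case TOY_HEAD:
		return page->head ? page->head : page;  /* redirect to the head */
	case TOY_NO_COMPOUND:
		return page->head ? NULL : page;        /* compound pages are rejected */
	}
	return NULL;
}

/* Returns false where the modelled assertion would fire. */
static bool toy_set_dirty(struct toy_page *page, enum toy_policy pol)
{
	struct toy_page *target = toy_flag_target(page, pol);

	if (!target)
		return false;
	target->dirty = true;
	return true;
}
```

Dirtying subpage 2 of a four-page compound page marks only the head under
TOY_HEAD (whole page needs writeback), only subpage 2 under TOY_ANY, and is
refused under TOY_NO_COMPOUND.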

BTW, do we make any use of PG_dirty on pages with ->mapping == NULL?
Should we avoid dirtying them in the first place?

> > > Can we get the sound guys to look at this, btw?  It seems like an odd
> > > thing that we probably don't want to keep around, right?
> > 
> > CC: +sound guys
> 
> I don't think this is peculiar to sound at all: there are other users
> of __GFP_COMP in the tree, aren't there?  And although some of them
> might turn out not to need it any more, I expect most of them still
> need it for the same reason they did originally.

I haven't seen any other __GFP_COMP user that gets its pages mapped to
user-space with PTEs. Have you? Probably I just haven't stepped on one.

... looking into the code a bit more: at least one fb driver has compound
pages mapped with PTEs.

> > I'm not sure what is right fix here. At the time adding __GFP_COMP was a
> > fix: see f3d48f0373c1.
> 
> The only thing special about this one, was that I failed to add
> __GFP_COMP at first.
> 
> The purpose of __GFP_COMP is to allow a >0-order page (originally, just
> a hugetlb page: see 2.5.60) to be mapped into userspace, and parts of it
> then subjected to get_user_pages (ptrace, futex, direct I/O, infiniband
> etc), and now even munmap, without destroying the integrity of the
> underlying >0-order page.
> 
> We don't bother with __GFP_COMP when a >0-order page cannot be mapped
> into userspace (except through /dev/mem or suchlike); we add __GFP_COMP
> when it might be, to get the right reference counting.

Wouldn't non-compound >0-order page allocation + split_page() work too?

> It's normal for set_page_dirty() to be called in the course of
> get_user_pages(), and it's normal for set_page_dirty() to be called
> when releasing the get_user_pages() references, and it's normal for
> set_page_dirty() to be called when munmap'ing a pte_dirty().
> 
> > 
> > Other odd part about __GFP_COMP here is that we have ->_mapcount in tail
> > pages to be used for both: mapcount of the individual page and for gup
> > pins. __compound_tail_refcounted() doesn't recognize that we don't need
> > tail page accounting for these pages.
> 
> So page->_mapcount of the tails is being used for both their mapcount
> and their reference count: that's certainly funny, and further reason
> to pursue your aim of simplifying the way THPs are refcounted.  But
> not responsible for any actual bug, I think?

A GUP pin would screw up page_mapcount() on these pages. That would affect
memory stats for the process and probably something else.

I think we can get __compound_tail_refcounted() to ignore these pages by
checking if page->mapping is NULL.
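
A toy userspace sketch of the idea, under stated assumptions: struct toy_tail
and the toy_* helpers are hypothetical, the counter starts at 0 here (the real
->_mapcount starts at -1), and the real get_page()/GUP paths are far more
involved. It only shows how one counter shared between PTE mappings and GUP
pins inflates the reported mapcount, and how a ->mapping == NULL check would
avoid that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a tail page; not kernel code. */
struct toy_tail {
	int mapcount;   /* shared counter: PTE mappings, plus GUP pins if tail-accounted */
	void *mapping;  /* stands in for ->mapping; NULL for the driver pages discussed */
};

/*
 * Hypothetical predicate modelling the proposed check: when check_mapping
 * is set, pages with a NULL mapping are excluded from tail accounting.
 */
static bool toy_tail_refcounted(const struct toy_tail *tail, bool check_mapping)
{
	if (check_mapping && tail->mapping == NULL)
		return false;
	return true;
}

/* Mapping the tail page with a PTE bumps its mapcount. */
static void toy_map_pte(struct toy_tail *tail)
{
	tail->mapcount++;
}

/* A pin is always taken on the head; tail accounting also records it in the tail. */
static void toy_gup_pin(struct toy_tail *tail, int *head_refcount, bool check_mapping)
{
	assert(head_refcount != NULL);
	(*head_refcount)++;
	if (toy_tail_refcounted(tail, check_mapping))
		tail->mapcount++;  /* the pin leaks into the mapcount */
}

/* What a page_mapcount()-style reader would see in this model. */
static int toy_page_mapcount(const struct toy_tail *tail)
{
	return tail->mapcount;
}
```

With one PTE mapping and one pin, the model reports a mapcount of 2 without
the check and 1 with it; the head reference is taken in both cases.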

> > Hugh, I tried to ask you about the situation several times (last time on
> > the summit). Any comments?
> 
> I do remember we began a curtailed conversation about this at LSF/MM.
> I do not remember you asking about it earlier: when was that?

http://lkml.kernel.org/g/20141217004734.GA23150@node.dhcp.inet.fi

-- 
 Kirill A. Shutemov
