From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Hugh Dickins <hughd@google.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Andrea Arcangeli <aarcange@redhat.com>,
Ning Qu <quning@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 11/24] huge tmpfs: shrinker to migrate and free underused holes
Date: Tue, 24 Mar 2015 14:57:52 +0200
Message-ID: <20150324125752.GA4642@node.dhcp.inet.fi>
In-Reply-To: <alpine.LSU.2.11.1503222046510.5278@eggly.anvils>
On Sun, Mar 22, 2015 at 09:40:02PM -0700, Hugh Dickins wrote:
> (I think Kirill has a problem of that kind in his page_remove_rmap scan).
>
> It will be interesting to see what Kirill does to maintain the stats
> for huge pagecache: but he will have no difficulty in finding fields
> to store counts, because he's got lots of spare fields in those 511
> tail pages - that's a useful benefit of the compound page, but does
> prevent the tails from being used in ordinary ways. (I did try using
> team_head[1].team_usage for more, but atomicity needs prevented it.)
The patch below should address the race you pointed out, if I've got it
all right. I'm not hugely happy with the change, though.
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 435c90f59227..a3e6b35520f8 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -423,8 +423,17 @@ static inline void page_mapcount_reset(struct page *page)
static inline int page_mapcount(struct page *page)
{
+ int ret;
VM_BUG_ON_PAGE(PageSlab(page), page);
- return atomic_read(&page->_mapcount) + compound_mapcount(page) + 1;
+ ret = atomic_read(&page->_mapcount) + 1;
+ if (compound_mapcount(page)) {
+ /*
+ * positive compound_mapcount() offsets ->_mapcount by one --
+ * subtract here.
+ */
+ ret += compound_mapcount(page) - 1;
+ }
+ return ret;
}
static inline int page_count(struct page *page)
diff --git a/mm/rmap.c b/mm/rmap.c
index fc6eee4ed476..f4ab976276e7 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1066,9 +1066,17 @@ void do_page_add_anon_rmap(struct page *page,
* disabled.
*/
if (compound) {
+ int i;
VM_BUG_ON_PAGE(!PageTransHuge(page), page);
__inc_zone_page_state(page,
NR_ANON_TRANSPARENT_HUGEPAGES);
+ /*
+ * While compound_mapcount() is positive we keep *one*
+ * mapcount reference in all subpages. It's required
+ * for atomic removal from rmap.
+ */
+ for (i = 0; i < nr; i++)
+ atomic_set(&page[i]._mapcount, 0);
}
__mod_zone_page_state(page_zone(page), NR_ANON_PAGES, nr);
}
@@ -1103,10 +1111,19 @@ void page_add_new_anon_rmap(struct page *page,
VM_BUG_ON_VMA(address < vma->vm_start || address >= vma->vm_end, vma);
SetPageSwapBacked(page);
if (compound) {
+ int i;
+
VM_BUG_ON_PAGE(!PageTransHuge(page), page);
/* increment count (starts at -1) */
atomic_set(compound_mapcount_ptr(page), 0);
__inc_zone_page_state(page, NR_ANON_TRANSPARENT_HUGEPAGES);
+ /*
+ * While compound_mapcount() is positive we keep *one* mapcount
+ * reference in all subpages. It's required for atomic removal
+ * from rmap.
+ */
+ for (i = 0; i < nr; i++)
+ atomic_set(&page[i]._mapcount, 0);
} else {
/* Anon THP always mapped first with PMD */
VM_BUG_ON_PAGE(PageTransCompound(page), page);
@@ -1174,9 +1191,6 @@ out:
*/
void page_remove_rmap(struct page *page, bool compound)
{
- int nr = compound ? hpage_nr_pages(page) : 1;
- bool partial_thp_unmap;
-
if (!PageAnon(page)) {
VM_BUG_ON_PAGE(compound && !PageHuge(page), page);
page_remove_file_rmap(page);
@@ -1184,10 +1198,20 @@ void page_remove_rmap(struct page *page, bool compound)
}
/* page still mapped by someone else? */
- if (!atomic_add_negative(-1, compound ?
- compound_mapcount_ptr(page) :
- &page->_mapcount))
+ if (compound) {
+ int i;
+
+ VM_BUG_ON_PAGE(!PageTransHuge(page), page);
+ if (!atomic_add_negative(-1, compound_mapcount_ptr(page)))
+ return;
+ __dec_zone_page_state(page, NR_ANON_TRANSPARENT_HUGEPAGES);
+ for (i = 0; i < hpage_nr_pages(page); i++)
+ page_remove_rmap(page + i, false);
return;
+ } else {
+ if (!atomic_add_negative(-1, &page->_mapcount))
+ return;
+ }
/* Hugepages are not counted in NR_ANON_PAGES for now. */
if (unlikely(PageHuge(page)))
@@ -1198,26 +1222,12 @@ void page_remove_rmap(struct page *page, bool compound)
* these counters are not modified in interrupt context, and
* pte lock(a spinlock) is held, which implies preemption disabled.
*/
- if (compound) {
- int i;
- VM_BUG_ON_PAGE(!PageTransHuge(page), page);
- __dec_zone_page_state(page, NR_ANON_TRANSPARENT_HUGEPAGES);
- /* The page can be mapped with ptes */
- for (i = 0; i < hpage_nr_pages(page); i++)
- if (page_mapcount(page + i))
- nr--;
- partial_thp_unmap = nr != hpage_nr_pages(page);
- } else if (PageTransCompound(page)) {
- partial_thp_unmap = !compound_mapcount(page);
- } else
- partial_thp_unmap = false;
-
- __mod_zone_page_state(page_zone(page), NR_ANON_PAGES, -nr);
+ __dec_zone_page_state(page, NR_ANON_PAGES);
if (unlikely(PageMlocked(page)))
clear_page_mlock(page);
- if (partial_thp_unmap)
+ if (PageTransCompound(page))
deferred_split_huge_page(compound_head(page));
/*
--
Kirill A. Shutemov
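
For readers following the accounting, here is a minimal userspace sketch
of the scheme the patch describes. It is not kernel code and makes
several assumptions: struct model_thp, NR_SUBPAGES and the model_*
helpers are hypothetical names, plain ints stand in for atomic_t (so
nothing here is concurrency-safe), and only the fresh-THP path of
page_add_new_anon_rmap() is modelled.

```c
#include <assert.h>

/* Hypothetical toy size; a real x86-64 THP has 512 subpages. */
#define NR_SUBPAGES 4

/*
 * Toy stand-in for a compound page. Both counters are stored biased
 * by -1, as in the kernel: -1 means "not mapped".
 */
struct model_thp {
	int compound_mapcount;
	int mapcount[NR_SUBPAGES];
};

/* Mirrors compound_mapcount(): the unbiased number of PMD mappings. */
static int model_compound_mapcount(const struct model_thp *t)
{
	return t->compound_mapcount + 1;
}

/*
 * Fresh THP mapped by its first PMD: the compound count goes to 0
 * (one mapping) and every subpage gets *one* held reference.
 */
static void model_map_new_pmd(struct model_thp *t)
{
	int i;

	t->compound_mapcount = 0;
	for (i = 0; i < NR_SUBPAGES; i++)
		t->mapcount[i] = 0;
}

/* Map subpage i with an additional PTE. */
static void model_add_pte(struct model_thp *t, int i)
{
	assert(i >= 0 && i < NR_SUBPAGES);
	t->mapcount[i]++;
}

/* Mirrors the patched page_mapcount(): do not count the held reference twice. */
static int model_page_mapcount(const struct model_thp *t, int i)
{
	int ret;

	assert(i >= 0 && i < NR_SUBPAGES);
	ret = t->mapcount[i] + 1;
	if (model_compound_mapcount(t))
		ret += model_compound_mapcount(t) - 1;
	return ret;
}

/*
 * Drop one PMD mapping. Only when the last one goes away is the held
 * reference released from every subpage, one subpage at a time.
 */
static void model_remove_pmd(struct model_thp *t)
{
	int i;

	if (--t->compound_mapcount >= 0)
		return;	/* still mapped by another PMD */
	for (i = 0; i < NR_SUBPAGES; i++)
		t->mapcount[i]--;
}
```

In this model, after model_map_new_pmd() every subpage reports a
mapcount of 1. Adding a PTE mapping of subpage 2 raises its mapcount to
2. After model_remove_pmd(), subpage 0 reports 0 and subpage 2 reports
1, so the PTE-mapped subpage stays accounted for without rescanning the
other subpages.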