From: Andrea Arcangeli <aarcange@redhat.com>
To: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Hillf Danton <dhillf@gmail.com>, Hugh Dickins <hughd@google.com>,
	Dave Jones <davej@redhat.com>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mgorman@suse.de>, Linux-MM <linux-mm@kvack.org>,
	Rik van Riel <riel@redhat.com>
Subject: Re: oops in copy_page_rep()
Date: Tue, 8 Jan 2013 18:49:51 +0100
Message-ID: <20130108174951.GG9163@redhat.com>
In-Reply-To: <20130108173058.GA27727@shutemov.name>

Hi Kirill,

On Tue, Jan 08, 2013 at 07:30:58PM +0200, Kirill A. Shutemov wrote:
> Merged patch is obviously broken: huge_pmd_set_accessed() can be called
> only if the pmd is under splitting.

Of course I assume you meant "only if the pmd is not under splitting".

But no, setting a bitflag like the young bit or clearing or setting
the numa bit won't screw with split_huge_page and it's safe even if
the pmd is under splitting.

Those bits are only checked here at the last stage of
split_huge_page_map after taking the PT lock:

	spin_lock(&mm->page_table_lock);
	pmd = page_check_address_pmd(page, mm, address,
				     PAGE_CHECK_ADDRESS_PMD_SPLITTING_FLAG);
	if (pmd) {
		pgtable = pgtable_trans_huge_withdraw(mm);
		pmd_populate(mm, &_pmd, pgtable);

		haddr = address;
		for (i = 0; i < HPAGE_PMD_NR; i++, haddr += PAGE_SIZE) {
			pte_t *pte, entry;
			BUG_ON(PageCompound(page+i));
			entry = mk_pte(page + i, vma->vm_page_prot);
			entry = maybe_mkwrite(pte_mkdirty(entry), vma);
			if (!pmd_write(*pmd))
				entry = pte_wrprotect(entry);
			else
				BUG_ON(page_mapcount(page) != 1);
			if (!pmd_young(*pmd))
				entry = pte_mkold(entry);
			if (pmd_numa(*pmd))
				entry = pte_mknuma(entry);
			pte = pte_offset_map(&_pmd, haddr);
			BUG_ON(!pte_none(*pte));
			set_pte_at(mm, haddr, pte, entry);
			pte_unmap(pte);
		}

If the "young" or "numa" bitflags change on the original *pmd during
the earlier part of split_huge_page, nothing will go wrong by the time
we get to split_huge_page_map (the same is not true if the pfn changes!).
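
To make the point above concrete, here is a small user-space model of
the loop, not kernel code. All model_* names, the bit layout and
MODEL_NR_PTES are made up for illustration, and the pfn check only
stands in for what page_check_address_pmd is assumed to verify. It
shows that the young/numa bits are sampled from the pmd only when the
ptes are built, so an earlier flip just changes what the ptes inherit,
while a changed pfn makes the lookup fail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout for this model only; not the real x86 bits. */
typedef uint64_t model_entry_t;
#define MODEL_WRITE     ((model_entry_t)1 << 0)
#define MODEL_YOUNG     ((model_entry_t)1 << 1)
#define MODEL_NUMA      ((model_entry_t)1 << 2)
#define MODEL_PFN_SHIFT 12
#define MODEL_NR_PTES   8	/* stands in for HPAGE_PMD_NR */

static model_entry_t model_mk_entry(uint64_t pfn, model_entry_t flags)
{
	return (pfn << MODEL_PFN_SHIFT) | flags;
}

static uint64_t model_pfn(model_entry_t e)
{
	return e >> MODEL_PFN_SHIFT;
}

/*
 * Model of the final mapping stage: returns false if the pmd no longer
 * points at the page being split (assumed role of the address check),
 * otherwise fills ptes[] with flags inherited from the pmd as it is now.
 */
static bool model_split_map(model_entry_t pmd, uint64_t page_pfn,
			    model_entry_t ptes[MODEL_NR_PTES])
{
	assert(ptes);
	if (model_pfn(pmd) != page_pfn)
		return false;

	for (int i = 0; i < MODEL_NR_PTES; i++) {
		/* Start writable and young, then narrow from the pmd. */
		model_entry_t entry =
			model_mk_entry(page_pfn + i, MODEL_WRITE | MODEL_YOUNG);

		if (!(pmd & MODEL_WRITE))
			entry &= ~MODEL_WRITE;
		if (!(pmd & MODEL_YOUNG))
			entry &= ~MODEL_YOUNG;
		if (pmd & MODEL_NUMA)
			entry |= MODEL_NUMA;
		ptes[i] = entry;
	}
	return true;
}
```

Calling model_split_map() once with the young bit clear and once with
it set gives old and young ptes respectively, both perfectly valid
results. Only the pfn mismatch is rejected.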

If you think this is too tricky, we could also decide to forbid
huge_pmd_set_accessed if the pmd is in splitting state, but I don't
think that flipping young/numa bits while in splitting state can
cause any problem (if done correctly with PT lock + pmd_same).
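
As a rough illustration of what "PT lock + pmd_same" means here, the
following is a user-space sketch and not the kernel implementation.
struct model_mm, the spin helpers, MODEL_PMD_YOUNG and
model_set_accessed are all hypothetical names, and the C11 atomic_flag
merely stands in for page_table_lock. The idea is that the fault path
takes the lock, rechecks that the pmd still equals the value it read
earlier, and only then flips the young bit, so nothing but that
bitflag can differ afterwards.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t model_pmd_t;
#define MODEL_PMD_YOUNG ((model_pmd_t)1 << 5)	/* made-up bit position */

struct model_mm {
	atomic_flag page_table_lock;	/* stand-in for mm->page_table_lock */
	model_pmd_t pmd;
};

static void model_lock(struct model_mm *mm)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&mm->page_table_lock,
						 memory_order_acquire))
		;
}

static void model_unlock(struct model_mm *mm)
{
	atomic_flag_clear_explicit(&mm->page_table_lock,
				   memory_order_release);
}

/*
 * orig_pmd is the snapshot read before taking the lock. Returns true
 * if the young bit was set, false if the pmd changed in the meantime.
 */
static bool model_set_accessed(struct model_mm *mm, model_pmd_t orig_pmd)
{
	bool updated = false;

	model_lock(mm);
	if (mm->pmd == orig_pmd) {	/* models pmd_same() */
		mm->pmd |= MODEL_PMD_YOUNG;
		/* Everything except the young bit is untouched. */
		assert((mm->pmd & ~MODEL_PMD_YOUNG) ==
		       (orig_pmd & ~MODEL_PMD_YOUNG));
		updated = true;
	}
	model_unlock(mm);
	return updated;
}
```

A caller with a stale snapshot simply gets false back and leaves the
pmd alone, which is the behaviour the recheck under the lock is meant
to provide.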

Thanks!
