From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Jerome Glisse" <j.glisse@gmail.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>, "Joerg Roedel" <joro@8bytes.org>,
	"Mel Gorman" <mgorman@suse.de>, "H. Peter Anvin" <hpa@zytor.com>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Andrea Arcangeli" <aarcange@redhat.com>,
	"Johannes Weiner" <jweiner@redhat.com>,
	"Larry Woodman" <lwoodman@redhat.com>,
	"Rik van Riel" <riel@redhat.com>,
	"Dave Airlie" <airlied@redhat.com>,
	"Brendan Conoboy" <blc@redhat.com>,
	"Joe Donohue" <jdonohue@redhat.com>,
	"Duncan Poole" <dpoole@nvidia.com>,
	"Sherry Cheung" <SCheung@nvidia.com>,
	"Subhash Gutti" <sgutti@nvidia.com>,
	"John Hubbard" <jhubbard@nvidia.com>,
	"Mark Hairgrove" <mhairgrove@nvidia.com>,
	"Lucien Dunning" <ldunning@nvidia.com>,
	"Cameron Buschardt" <cabuschardt@nvidia.com>,
	"Arvind Gopalakrishnan" <arvindg@nvidia.com>,
	"Shachar Raindel" <raindel@mellanox.com>,
	"Liran Liss" <liranl@mellanox.com>,
	"Roland Dreier" <roland@purestorage.com>,
	"Ben Sander" <ben.sander@amd.com>,
	"Greg Stoner" <Greg.Stoner@amd.com>,
	"John Bridgman" <John.Bridgman@amd.com>,
	"Michael Mantor" <Michael.Mantor@amd.com>,
	"Paul Blinzer" <Paul.Blinzer@amd.com>,
	"Laurent Morichetti" <Laurent.Morichetti@amd.com>,
	"Alexander Deucher" <Alexander.Deucher@amd.com>,
	"Oded Gabbay" <Oded.Gabbay@amd.com>,
	"Jérôme Glisse" <jglisse@redhat.com>
Subject: Re: [PATCH 3/5] lib: lockless generic and arch independent page table (gpt) v2.
Date: Fri, 14 Nov 2014 02:58:33 +0200
Message-ID: <20141114005833.GA1572@node.dhcp.inet.fi>
In-Reply-To: <CA+55aFxYnBxGZr3ed0i46SpSdOj+3VSVBZiqRbdJuwFMuTmxDw@mail.gmail.com>

On Thu, Nov 13, 2014 at 03:50:02PM -0800, Linus Torvalds wrote:
> +/*
> + * The 'tree_level' data only describes one particular level
> + * of the tree. The upper levels are totally invisible to the
> + * user of the tree walker, since the tree walker will walk
> + * those using the tree definitions.
> + *
> + * NOTE! "struct tree_entry" is an opaque type, and is just a
> + * used as a pointer to the particular level. You can figure
> + * out which level you are at by looking at the "tree_level",
> + * but even better is to just use different "lookup()"
> + * functions for different levels, at which point the
> + * function is inherent to the level.

Please, don't.

We will end up with the same last-level-centric code as we have now in the
mm subsystem: all code only cares about pte. That makes implementing
variable page size support really hard and leads to a copy-paste approach.
And to the hugetlb parallel world...

It would be nice to have the tree_level description generic enough to get
rid of pte_present()/pte_dirty()/pte_* and implement generic helpers instead.
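
To make the idea concrete, here is a rough sketch of what level-agnostic
helpers could look like. It is only an illustration, not code from the
patch: struct tree_level_desc, the entry_*() helpers and the demo_*
descriptors are all hypothetical names, and the bit positions are
x86-64-like values (present = bit 0, dirty = bit 6, page-size = bit 7)
chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-level description; not an existing kernel structure. */
struct tree_level_desc {
	uint64_t present_mask;	/* set when the entry is valid */
	uint64_t dirty_mask;	/* set when the mapping has been written */
	uint64_t leaf_mask;	/* set when the entry maps memory directly */
	bool always_leaf;	/* last level: every present entry is a leaf */
	unsigned int shift;	/* log2 of the bytes covered by one entry */
};

/* Illustrative x86-64-like values, assumed for this sketch only. */
static const struct tree_level_desc demo_pte_level = {
	.present_mask = 1ULL << 0, .dirty_mask = 1ULL << 6,
	.leaf_mask = 0, .always_leaf = true, .shift = 12,
};

static const struct tree_level_desc demo_pmd_level = {
	.present_mask = 1ULL << 0, .dirty_mask = 1ULL << 6,
	.leaf_mask = 1ULL << 7, .always_leaf = false, .shift = 21,
};

/* One helper per property, parameterised by level instead of pte_*(). */
static inline bool entry_present(const struct tree_level_desc *lvl,
				 uint64_t entry)
{
	return (entry & lvl->present_mask) != 0;
}

static inline bool entry_dirty(const struct tree_level_desc *lvl,
			       uint64_t entry)
{
	return (entry & lvl->dirty_mask) != 0;
}

/* A leaf maps memory itself instead of pointing to a lower-level table. */
static inline bool entry_is_leaf(const struct tree_level_desc *lvl,
				 uint64_t entry)
{
	return lvl->always_leaf || (entry & lvl->leaf_mask) != 0;
}

/* Bytes mapped by one leaf entry at this level. */
static inline uint64_t entry_map_size(const struct tree_level_desc *lvl)
{
	return 1ULL << lvl->shift;
}
```

With something along these lines, a walker that finds a leaf at the
demo_pmd_level descriptor would handle a 2M mapping with the same helpers
it uses for a 4k one, rather than through a separate code path.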

Apart from the variable page size problem, we could one day support
different CPU page table formats selected at runtime: PAE/non-PAE on
32-bit x86 or LPAE/non-LPAE on ARM in one binary kernel image.
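
As a rough sketch of the runtime-format idea, the table geometry could be
data rather than compile-time macros. Everything named demo_* below is
hypothetical and exists only for this example. The geometry values are the
usual 32-bit x86 ones: non-PAE uses two levels of 10 index bits with 4-byte
entries, PAE uses three levels of 2/9/9 index bits with 8-byte entries.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_LEVELS 4

/* Which virtual address bits index one level of the table. */
struct demo_level_geom {
	unsigned int shift;	/* lowest address bit used by this level */
	unsigned int bits;	/* number of index bits at this level */
};

/* Hypothetical runtime description of one page table format. */
struct demo_pt_format {
	const char *name;
	unsigned int nr_levels;
	unsigned int entry_bytes;
	struct demo_level_geom level[DEMO_MAX_LEVELS];	/* [0] is the top */
};

static const struct demo_pt_format demo_x86_nonpae = {
	.name = "x86 non-PAE", .nr_levels = 2, .entry_bytes = 4,
	.level = { { 22, 10 }, { 12, 10 } },
};

static const struct demo_pt_format demo_x86_pae = {
	.name = "x86 PAE", .nr_levels = 3, .entry_bytes = 8,
	.level = { { 30, 2 }, { 21, 9 }, { 12, 9 } },
};

/* Pick the format once at boot; cpu_has_pae stands in for a CPU probe. */
static const struct demo_pt_format *demo_select_format(int cpu_has_pae)
{
	return cpu_has_pae ? &demo_x86_pae : &demo_x86_nonpae;
}

/* Index into the table at 'level' for a 32-bit virtual address. */
static unsigned int demo_index(const struct demo_pt_format *fmt,
			       unsigned int level, uint32_t vaddr)
{
	assert(level < fmt->nr_levels);
	return (vaddr >> fmt->level[level].shift) &
	       ((1u << fmt->level[level].bits) - 1);
}
```

The shift and mask in demo_index() are loaded from memory on every step of
a walk instead of being folded into immediates by the compiler, which is
one place where the extra runtime cost would come from.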

The big topic is how to get it done without significant runtime cost :-/

-- 
 Kirill A. Shutemov
