From mboxrd@z Thu Jan 1 00:00:00 1970
From: Shao Miller
Subject: Shared Memory Pages for Same Base Device
Date: Sat, 20 Sep 2014 11:50:23 -0400
Message-ID: <541DA23F.6000003@treefrog.ca>
Reply-To: device-mapper development
Sender: dm-devel-bounces@redhat.com
To: dm-devel@redhat.com
List-Id: dm-devel.ids

Good day to all.

If I've some block device "base" and I've two derived device-mapper
devices "derived1" and "derived2" that are copy-on-write layers over
top of that base, and I mount the filesystems on those derived block
devices and run the same program "foo" from both, does the "foo"
running from "derived1" share any [read and execute] memory pages
with the "foo" running from "derived2", since the underlying sectors
are both from the same position on "base"?

The more general question would be about mmap, but I hope this
example is clear. "Docker"[1] uses device-mapper in scenarios like
the above example and I'm curious if they benefit from shared pages.

[1] https://www.docker.com/

--
Shao Miller
Network Technician
905-836-4442 ext: 112
www.treefrog.ca/shao-miller

Treefrog Inc.
905-836-4442
567 Davis Drive, Newmarket, ON
www.treefrog.ca - @Treefrog


From mboxrd@z Thu Jan 1 00:00:00 1970
From: Mike Snitzer
Subject: Re: Shared Memory Pages for Same Base Device
Date: Mon, 22 Sep 2014 09:08:03 -0400
Message-ID: <20140922130803.GA5343@redhat.com>
References: <541DA23F.6000003@treefrog.ca>
In-Reply-To: <541DA23F.6000003@treefrog.ca>
To: Shao Miller
Cc: dm-devel@redhat.com

On Sat, Sep 20 2014 at 11:50am -0400, Shao Miller wrote:

> Good day to all.
>
> If I've some block device "base" and I've two derived device-mapper
> devices "derived1" and "derived2" that are copy-on-write layers over
> top of that base, and I mount the filesystems on those derived block
> devices and run the same program "foo" from both, does the "foo"
> running from "derived1" share any [read and execute] memory pages
> with the "foo" running from "derived2", since the underlying sectors
> are both from the same position on "base"?
>
> The more general question would be about mmap, but I hope this
> example is clear. "Docker"[1] uses device-mapper in scenarios like
> the above example and I'm curious if they benefit from shared pages.
>
> [1] https://www.docker.com/

Unfortunately device-mapper thin provisioning doesn't offer shared
pagecache pages across snapshot volumes. This is a block layer
limitation (the block layer doesn't allow pages to be shared across
block devices, and dm-thinp snapshot volumes are each a block device).
Modifying the VM, block and DM subsystems to provide this capability is
not an easy task and as such is really not a near-term priority.

Interestingly BTRFS does _not_ offer this page sharing either. I'm told
that the only emerging solution for this is overlayfs.

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Shao Miller
Subject: Re: Shared Memory Pages for Same Base Device
Date: Mon, 22 Sep 2014 10:08:13 -0400
Message-ID: <54202D4D.1030901@treefrog.ca>
References: <541DA23F.6000003@treefrog.ca> <20140922130803.GA5343@redhat.com>
In-Reply-To: <20140922130803.GA5343@redhat.com>
To: dm-devel@redhat.com

On 9/22/2014 9:08 AM, Mike Snitzer wrote:
>
> Unfortunately device-mapper thin provisioning doesn't offer shared
> pagecache pages across snapshot volumes. This is a block layer
> limitation (the block layer doesn't allow pages to be shared across
> block devices, and dm-thinp snapshot volumes are each a block device).
> Modifying the VM, block and DM subsystems to provide this capability is
> not an easy task and as such is really not a near-term priority.

I sincerely appreciate your response, Mike. I figured as much, but
hopefully this thread will be useful to future searchers.

> Interestingly BTRFS does _not_ offer this page sharing either. I'm told
> that the only emerging solution for this is overlayfs.

This is very surprising. I've just done some testing and
/proc/PID/maps for the same binary from two different snapshots
deriving from the same base do show the same block-device major and
minor, with the same offset.
Do you happen to have a reference for
that detail about BTRFS, or perhaps an idea for the right venue to
learn more about that? Unfortunately, I guess it's off-topic for
device-mapper.

Also unfortunately, it seems that overlayfs isn't mainline Linux nor
out-of-the-box for some distributions. (Like CentOS 7.)

I enjoy device-mapper for CoW iSCSI and AoE. Keep up the great work, folks!

--
Shao Miller

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Mike Snitzer
Subject: Re: Shared Memory Pages for Same Base Device
Date: Mon, 22 Sep 2014 10:26:27 -0400
Message-ID: <20140922142626.GB5698@redhat.com>
References: <541DA23F.6000003@treefrog.ca> <20140922130803.GA5343@redhat.com> <54202D4D.1030901@treefrog.ca>
In-Reply-To: <54202D4D.1030901@treefrog.ca>
To: Shao Miller
Cc: clm@fb.com, dm-devel@redhat.com

On Mon, Sep 22 2014 at 10:08am -0400, Shao Miller wrote:

> On 9/22/2014 9:08 AM, Mike Snitzer wrote:
> >
> > Unfortunately device-mapper thin provisioning doesn't offer shared
> > pagecache pages across snapshot volumes. This is a block layer
> > limitation (the block layer doesn't allow pages to be shared across
> > block devices, and dm-thinp snapshot volumes are each a block device).
> > Modifying the VM, block and DM subsystems to provide this capability is
> > not an easy task and as such is really not a near-term priority.
>
> I sincerely appreciate your response, Mike. I figured as much, but
> hopefully this thread will be useful to future searchers.
>
> > Interestingly BTRFS does _not_ offer this page sharing either. I'm told
> > that the only emerging solution for this is overlayfs.
>
> This is very surprising. I've just done some testing and
> /proc/PID/maps for the same binary from two different snapshots
> deriving from the same base do show the same block-device major and
> minor, with the same offset. Do you happen to have a reference for
> that detail about BTRFS, or perhaps an idea for the right venue to
> learn more about that? Unfortunately, I guess it's off-topic for
> device-mapper.

I learned as much from Chris Mason (cc'd) in a FB thread I started on
this very topic. Chris said:

"It would be awesome if we had a way to share page cache pages cow style
across filesystem snapshots. Every file ends up with its own address
space, so we really need to just share the pages. It's complex though"

> Also unfortunately, it seems that overlayfs isn't mainline Linux nor
> out-of-the-box for some distributions. (Like CentOS 7.)
>
> I enjoy device-mapper for CoW iSCSI and AoE. Keep up the great work, folks!

Thanks
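
For future searchers who want to repeat the /proc/PID/maps comparison
described earlier in the thread, here is a minimal Python sketch. It
assumes the usual maps line layout (address range, permissions, offset,
device as hex major:minor, inode, pathname). The names Mapping,
parse_maps_line, exec_mappings and read_maps are illustrative and do
not come from the thread. Note that identical device, inode and offset
fields are only a hint: as Chris Mason's remark about every file having
its own address space suggests, matching fields do not by themselves
show that page cache pages are shared.

```python
from typing import NamedTuple, Optional, Set, Tuple


class Mapping(NamedTuple):
    perms: str    # e.g. "r-xp"
    offset: int   # file offset of the mapping, in bytes
    dev: str      # "major:minor" in hex, exactly as the kernel prints it
    inode: int
    path: str


def parse_maps_line(line: str) -> Optional[Mapping]:
    """Parse one /proc/PID/maps line; return None if it has no pathname."""
    fields = line.split(None, 5)
    if len(fields) < 6:
        return None  # anonymous mapping: nothing to compare
    _addr, perms, offset, dev, inode, path = fields
    return Mapping(perms, int(offset, 16), dev, int(inode), path.strip())


def exec_mappings(maps_text: str, suffix: str) -> Set[Tuple[str, int, int]]:
    """Collect (dev, inode, offset) for executable mappings of a file.

    `suffix` is matched against the end of the pathname, so the same
    binary can be found under two different mount points.
    """
    found = set()
    for line in maps_text.splitlines():
        m = parse_maps_line(line)
        if m and "x" in m.perms and m.path.endswith(suffix):
            found.add((m.dev, m.inode, m.offset))
    return found


def read_maps(pid: int) -> str:
    """Read the raw maps file for a process (Linux only)."""
    with open(f"/proc/{pid}/maps") as f:
        return f.read()
```

To compare two running copies of "foo", call
exec_mappings(read_maps(pid1), "/foo") and
exec_mappings(read_maps(pid2), "/foo") and compare the two sets. Equal
sets reproduce the observation reported above; different device fields
are what one would expect for two separate dm-thinp snapshot volumes,
since each is its own block device.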