xen-devel.lists.xenproject.org archive mirror
* windows tmem
@ 2013-05-25  4:57 James Harper
  2013-05-28  8:35 ` Paul Durrant
  0 siblings, 1 reply; 17+ messages in thread
From: James Harper @ 2013-05-25  4:57 UTC (permalink / raw)
  To: xen-devel@lists.xen.org

Do any of the Windows PV drivers make use of tmem in any way? I'm exploring how I could integrate this into GPLPV... so far I'm thinking of a filesystem filter that detects pagefile reads and writes and redirects them to tmem where appropriate.
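
To make that concrete, the shape of what I have in mind is roughly the sketch below. TmemPutPage/TmemGetPage are purely hypothetical wrappers around the tmem hypercall (nothing like this exists in GPLPV yet); the key is just the page's offset within the pagefile.

/* Sketch only, not GPLPV code: map a pagefile byte offset to a tmem key and
 * divert writes/reads accordingly.  TmemPutPage()/TmemGetPage() stand in for
 * wrappers around the Xen tmem hypercall; pool_id would come from creating a
 * pool at driver load. */
#include <stdint.h>

#define PAGEFILE_OID   0ULL   /* one tmem object per pagefile (arbitrary choice) */
#define PAGE_SHIFT_4K  12

extern int TmemPutPage(int pool_id, uint64_t oid, uint32_t index, const void *page);
extern int TmemGetPage(int pool_id, uint64_t oid, uint32_t index, void *page);

/* A pagefile write for one 4k page: copy it into tmem as well. */
static void PagefileWriteSeen(int pool_id, uint64_t byte_offset, const void *page)
{
    TmemPutPage(pool_id, PAGEFILE_OID, (uint32_t)(byte_offset >> PAGE_SHIFT_4K), page);
}

/* A pagefile read for one 4k page: 0 = satisfied from tmem, nonzero = fall
 * through to the normal disk read. */
static int PagefileReadSeen(int pool_id, uint64_t byte_offset, void *page)
{
    return TmemGetPage(pool_id, PAGEFILE_OID, (uint32_t)(byte_offset >> PAGE_SHIFT_4K), page);
}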
 
Thanks

James

* Re: windows tmem
  2013-05-25  4:57 windows tmem James Harper
@ 2013-05-28  8:35 ` Paul Durrant
  2013-05-28  8:57   ` James Harper
  0 siblings, 1 reply; 17+ messages in thread
From: Paul Durrant @ 2013-05-28  8:35 UTC (permalink / raw)
  To: James Harper, xen-devel@lists.xen.org

> 
> Do any of the Windows PV drivers make use of tmem in any way? I'm
> exploring how I could integrate this into GPLPV... so far I'm thinking of a
> filesystem filter that detects pagefile reads and writes and redirects them to
> tmem where appropriate.
> 

I'd mulled it over a while ago for the Citrix PV drivers but failed to really find a use case. I agree that pagefile *would* be a good use case, but to detect what was pagefile I suspect we'd need a filter driver somewhere fairly high up the storage stack, and I didn't really fancy getting into that at the time.

    Paul

* Re: windows tmem
  2013-05-28  8:35 ` Paul Durrant
@ 2013-05-28  8:57   ` James Harper
  2013-05-28  9:30     ` Paul Durrant
  0 siblings, 1 reply; 17+ messages in thread
From: James Harper @ 2013-05-28  8:57 UTC (permalink / raw)
  To: Paul Durrant, xen-devel@lists.xen.org

> >
> > Do any of the Windows PV drivers make use of tmem in any way? I'm
> > exploring how I could integrate this into GPLPV... so far I'm thinking of a
> > filesystem filter that detects pagefile reads and writes and redirects them
> > to tmem where appropriate.
> >
> 
> I'd mulled it over a while ago for the Citrix PV drivers but failed to really find a
> usecase. I agree that pagefile *would* be a good usecase but to detect what
> was pagefile I suspect we'd need a filter driver somewhere fairly high up the
> storage stack and I didn't really fancy get into that at the time.
> 

I've just been dipping my toe into the Windows FS filters and am finding them quite simple to deal with, so a straightforward implementation could be very easy. Detecting whether the file is a pagefile is easy enough; there is a call to query that (although it has some IRQL and errata limitations).
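
The check itself is tiny; assuming the call I mean is FsRtlIsPagingFile, it would be something like this in the pre-operation callback:

/* Sketch: identify paging I/O against a pagefile in a minifilter pre-op.
 * Assumes FsRtlIsPagingFile() is the query referred to above (it is the call
 * with the IRQL/errata caveats mentioned). */
#include <fltKernel.h>

static BOOLEAN
IsPagefilePagingIo(PFLT_CALLBACK_DATA Data, PCFLT_RELATED_OBJECTS FltObjects)
{
    /* Paging I/O flag on the operation, plus the pagefile query on the
     * file object itself. */
    return (BOOLEAN)((Data->Iopb->IrpFlags & IRP_PAGING_IO) != 0 &&
                     FsRtlIsPagingFile(FltObjects->FileObject));
}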

The one thing I'm not sure about is whether Windows provides a way to know that a page is done with. In theory I would divert a write to tmem, then divert the subsequent read from tmem, and the read would signify that the page of memory could be removed from tmem, but that doesn't necessarily follow. I don't have solutions for the following:

. differentiating pagefile metadata writes from actual page writes. E.g. if the swapfile is grown on the fly (Windows does this) then Windows would presumably update the signature of the swapfile, and that signature is likely undocumented. I suspect actual page writes will have particular buffer characteristics, so maybe this isn't going to be too difficult.

. Windows could optimistically write a page out during periods of idle I/O, even though the page is still in use, and then later throw the memory page out if required; but the application the memory belongs to could be unloaded before then, so there may be a write but never a read.

. Windows could read a page back into memory, later throw the memory page out, and then still expect the pagefile to hold the data. I imagine this sort of behaviour is documented nowhere, and even if I prove that a write is always later followed by exactly one read, that isn't guaranteed for the future.

A less risky implementation would just use tmem as a write-through cache and then throw out old pages on an LRU basis or something, or discard the pages from tmem on read but still write them through to disk. It kind of sucks the usefulness out of it if you can't avoid the writes, though, and if Windows is doing some trickery to page out during periods of low I/O then I'd be upsetting that too.
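
In minifilter terms the write-through variant would look something like the sketch below: the write is copied into tmem and still passed down to disk, and a read is completed from tmem only on a hit. TmemPutPage/TmemGetPage are the same hypothetical hypercall wrappers as before, and only the aligned, MDL-backed paging case is handled.

/* Write-through sketch: put on write (the write still goes to disk), get on
 * read with fall-through to disk on a miss.  TmemPutPage()/TmemGetPage() are
 * hypothetical hypercall wrappers; PoolId/PAGEFILE_OID are assumed to be set
 * up at pool-creation time. */
#include <fltKernel.h>

extern int TmemPutPage(int pool_id, unsigned long long oid, ULONG index, const void *page);
extern int TmemGetPage(int pool_id, unsigned long long oid, ULONG index, void *page);
extern int PoolId;
#define PAGEFILE_OID 0ULL

FLT_PREOP_CALLBACK_STATUS
PagefilePreWrite(PFLT_CALLBACK_DATA Data, PCFLT_RELATED_OBJECTS FltObjects, PVOID *CompletionContext)
{
    ULONGLONG offset = (ULONGLONG)Data->Iopb->Parameters.Write.ByteOffset.QuadPart;
    ULONG len = Data->Iopb->Parameters.Write.Length;
    PMDL mdl = Data->Iopb->Parameters.Write.MdlAddress;
    PUCHAR buf;
    ULONG i;

    UNREFERENCED_PARAMETER(FltObjects);
    UNREFERENCED_PARAMETER(CompletionContext);

    if (mdl != NULL &&
        (buf = MmGetSystemAddressForMdlSafe(mdl, NormalPagePriority)) != NULL) {
        for (i = 0; i + PAGE_SIZE <= len; i += PAGE_SIZE)
            TmemPutPage(PoolId, PAGEFILE_OID,
                        (ULONG)((offset + i) >> PAGE_SHIFT), buf + i);
    }
    return FLT_PREOP_SUCCESS_NO_CALLBACK;    /* the write still goes to disk */
}

FLT_PREOP_CALLBACK_STATUS
PagefilePreRead(PFLT_CALLBACK_DATA Data, PCFLT_RELATED_OBJECTS FltObjects, PVOID *CompletionContext)
{
    ULONGLONG offset = (ULONGLONG)Data->Iopb->Parameters.Read.ByteOffset.QuadPart;
    ULONG len = Data->Iopb->Parameters.Read.Length;
    PMDL mdl = Data->Iopb->Parameters.Read.MdlAddress;
    PUCHAR buf;

    UNREFERENCED_PARAMETER(FltObjects);
    UNREFERENCED_PARAMETER(CompletionContext);

    if (len == PAGE_SIZE && mdl != NULL &&
        (buf = MmGetSystemAddressForMdlSafe(mdl, NormalPagePriority)) != NULL &&
        TmemGetPage(PoolId, PAGEFILE_OID, (ULONG)(offset >> PAGE_SHIFT), buf) == 0) {
        Data->IoStatus.Status = STATUS_SUCCESS;  /* hit: complete from tmem */
        Data->IoStatus.Information = len;
        return FLT_PREOP_COMPLETE;
    }
    return FLT_PREOP_SUCCESS_NO_CALLBACK;        /* miss: fall through to disk */
}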

Anyway, I have written a skeleton FS filter so I can monitor what is going on in better detail when I get a few minutes. Later versions of Windows might make use of discard (trim/unmap), which would solve most of the above problems.

There do seem to be some (windows equivalent of) page cache operations that could be hooked too... or else the api callback naming is leading me astray.

Thanks

James

* Re: windows tmem
  2013-05-28  8:57   ` James Harper
@ 2013-05-28  9:30     ` Paul Durrant
  2013-05-28  9:53       ` James Harper
  0 siblings, 1 reply; 17+ messages in thread
From: Paul Durrant @ 2013-05-28  9:30 UTC (permalink / raw)
  To: James Harper, xen-devel@lists.xen.org

> -----Original Message-----
> From: James Harper [mailto:james.harper@bendigoit.com.au]
> Sent: 28 May 2013 09:58
> To: Paul Durrant; xen-devel@lists.xen.org
> Subject: RE: windows tmem
> 
> > >
> > > Do any of the Windows PV drivers make use of tmem in any way? I'm
> > > exploring how I could integrate this into GPLPV... so far I'm thinking of a
> > > filesystem filter that detects pagefile reads and writes and redirects them
> > > to tmem where appropriate.
> > >
> >
> > I'd mulled it over a while ago for the Citrix PV drivers but failed to really find
> a
> > usecase. I agree that pagefile *would* be a good usecase but to detect
> what
> > was pagefile I suspect we'd need a filter driver somewhere fairly high up
> the
> > storage stack and I didn't really fancy get into that at the time.
> >
> 
> I've just been dipping my toe in the windows fs filters and am finding them
> quite simple to deal with, so a straightforward implementation could be very
> easy. Detecting if the file is a pagefile is easy enough, there is a call to query
> that (although it has some IRQL and errata limitations).
> 
> The one thing I'm not sure about is if windows provides a way to know that
> the page is done with. In theory, I would divert a write to tmem, then divert
> the subsequent read from tmem, and the read would signify that the page of
> memory could be removed from tmem, but that doesn't necessarily follow. I
> haven't solutions for the following:
> 
> . differentiating pagefile metadata writes from actual page writes. Eg if the
> swapfile was grown on the fly (windows does this) then windows would
> presumably update the signature of the swapfile, and this signature is likely
> undocumented. I suspect actual page writes will have particular buffer
> characteristics so maybe this isn't going to be too difficult.
> 
> . windows could optimistically write a page out during periods of idle io, even
> though the page is still in use, and then later throw the memory page out if
> required, but the application the memory belongs to could be unloaded
> before then so there may be a write but never a read.
> 
> . windows could read a page back into memory, later throw the memory
> page out, and then still expect the pagefile to hold the data. I imagine this
> sort of behaviour is documented nowhere and even if I prove that a write is
> always later followed by exactly one read, this isn't guaranteed for the
> future.
> 
> A less risky implementation would just use tmem as a write-thru cache and
> then just throw out old pages on an LRU basis or something. Or discard the
> pages from tmem on read but write them back to disk. It kind of sucks the
> usefulness out of it though if you can't avoid the writes, and if windows is
> doing some trickery to page out during periods of low io then I'd be upsetting
> that too.
> 

This sounds a lot less fragile and the saving on reads to the storage backend could still be significant.
 
> Anyway I have written a skeleton fs filter so I can monitor what is going on in
> better detail when I get a few minutes. Later versions of Windows might
> make use of discard (trim/unmap) which would solve most of the above
> problems.
> 
> There do seem to be some (windows equivalent of) page cache operations
> that could be hooked too... or else the api callback naming is leading me
> astray.
> 

Sounds interesting. Presumably, if you can reliably intercept all IO on a pagefile then I guess you could use tmem as a write-back cache in front of doing your own file i/o down the storage stack, as long as you could reliably flush it out when necessary. E.g. does windows assume anything about the pagefile content on resume from S3 or S4?

  Paul

* Re: windows tmem
  2013-05-28  9:30     ` Paul Durrant
@ 2013-05-28  9:53       ` James Harper
  2013-05-28 14:17         ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 17+ messages in thread
From: James Harper @ 2013-05-28  9:53 UTC (permalink / raw)
  To: Paul Durrant, xen-devel@lists.xen.org

> > A less risky implementation would just use tmem as a write-thru cache and
> > then just throw out old pages on an LRU basis or something. Or discard the
> > pages from tmem on read but write them back to disk. It kind of sucks the
> > usefulness out of it though if you can't avoid the writes, and if windows is
> > doing some trickery to page out during periods of low io then I'd be
> > upsetting that too.
> >
> 
> This sounds a lot less fragile and the saving on reads to the storage backend
> could still be significant.

Should be easy enough to test I guess.

> > Anyway I have written a skeleton fs filter so I can monitor what is going on
> > in
> > better detail when I get a few minutes. Later versions of Windows might
> > make use of discard (trim/unmap) which would solve most of the above
> > problems.
> >
> > There do seem to be some (windows equivalent of) page cache operations
> > that could be hooked too... or else the api callback naming is leading me
> > astray.
> >
> 
> Sounds interesting. Presumably, if you can reliably intercept all IO on a
> pagefile then I guess you could use tmem as a write-back cache in front of
> doing your own file i/o down the storage stack, as long as you could reliably
> flush it out when necessary. E.g. does windows assume anything about the
> pagefile content on resume from S3 or S4?
> 

I need to look up whether an FS filter is notified about power state transitions. There may be a FLUSH of some sort that happens at that time. Newer versions of Windows have a thing called 'hybrid suspend', where the hibernate file is written out as if Windows were about to hibernate, but it then goes to sleep instead; if power is lost a resume is still possible. It may be acceptable to say that tmem = no hibernate. Migrate should be easy enough as I have direct control over that and can make tmem be written back out to the pagefile first.

This all assumes that write back is possible too...

James

* Re: windows tmem
  2013-05-28  9:53       ` James Harper
@ 2013-05-28 14:17         ` Konrad Rzeszutek Wilk
  2013-05-28 14:17           ` Konrad Rzeszutek Wilk
  2013-05-29  0:19           ` James Harper
  0 siblings, 2 replies; 17+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-05-28 14:17 UTC (permalink / raw)
  To: James Harper; +Cc: Paul Durrant, xen-devel@lists.xen.org

On Tue, May 28, 2013 at 09:53:23AM +0000, James Harper wrote:
> > > A less risky implementation would just use tmem as a write-thru cache and
> > > then just throw out old pages on an LRU basis or something. Or discard the
> > > pages from tmem on read but write them back to disk. It kind of sucks the
> > > usefulness out of it though if you can't avoid the writes, and if windows is
> > > doing some trickery to page out during periods of low io then I'd be
> > > upsetting that too.
> > >
> > 
> > This sounds a lot less fragile and the saving on reads to the storage backend
> > could still be significant.
> 
> Should be easy enough to test I guess.
> 
> > > Anyway I have written a skeleton fs filter so I can monitor what is going on
> > > in
> > > better detail when I get a few minutes. Later versions of Windows might
> > > make use of discard (trim/unmap) which would solve most of the above
> > > problems.
> > >
> > > There do seem to be some (windows equivalent of) page cache operations
> > > that could be hooked too... or else the api callback naming is leading me
> > > astray.
> > >
> > 
> > Sounds interesting. Presumably, if you can reliably intercept all IO on a
> > pagefile then I guess you could use tmem as a write-back cache in front of
> > doing your own file i/o down the storage stack, as long as you could reliably
> > flush it out when necessary. E.g. does windows assume anything about the
> > pagefile content on resume from S3 or S4?
> > 
> 
> I need to look up if FS filter is notified about power state transitions. There may be a FLUSH of some sort that happens at that time. Newer versions of windows have a thing called 'hybrid suspend', where the hibernate file is written out as if windows were about to be hibernated, but it goes to sleep instead of hibernating but if power is lost a resume is still possible. It may be acceptable to say that tmem = no hibernate. Migrate should be easy enough as I have direct control over that and can make tmem be written back out to the pagefile first.
> 
> This all assumes that write back is possible too...

I am not familiar with the Windows APIs, but it sounds like you
want to use the tmem ephemeral disk cache as a secondary cache
(which is BTW what Linux does too).

That is OK; the only thing you need to keep in mind is that the
hypervisor might flush said cache out if it decides to do so
(say a new guest is launched and it needs the memory that
said cache is using).

So the tmem_get might tell you that it does not have the page anymore.
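
For reference, the guest-side contract is roughly the sketch below; TmemNewPool/TmemGetPage are just illustrative wrappers around the tmem hypercall, and TMEM_POOL_PERSIST is the flag from the public tmem interface that you would deliberately *not* set for this use.

/* Sketch of the ephemeral-pool contract from the guest's point of view.
 * TmemNewPool()/TmemGetPage() are illustrative wrappers around the tmem
 * hypercall.  Passing flags == 0 (i.e. without TMEM_POOL_PERSIST from the
 * public tmem interface) creates an ephemeral pool, so the hypervisor is
 * free to drop its contents whenever it wants the memory back. */
#include <stdint.h>

extern int TmemNewPool(const uint64_t uuid[2], uint32_t flags);  /* returns pool id, < 0 on error */
extern int TmemGetPage(int pool_id, uint64_t oid, uint32_t index, void *page);

int pagefile_cache_init(const uint64_t uuid[2])
{
    return TmemNewPool(uuid, 0);   /* ephemeral: best-effort only */
}

int pagefile_cache_read(int pool_id, uint64_t oid, uint32_t index, void *page)
{
    /* A nonzero return just means the page is gone (reclaimed by the
     * hypervisor, or never put); the caller must fall back to the real
     * pagefile read. */
    return TmemGetPage(pool_id, oid, index, page);
}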
> 
> James

* Re: windows tmem
  2013-05-28 14:17         ` Konrad Rzeszutek Wilk
@ 2013-05-28 14:17           ` Konrad Rzeszutek Wilk
  2013-05-29  0:19           ` James Harper
  1 sibling, 0 replies; 17+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-05-28 14:17 UTC (permalink / raw)
  To: James Harper; +Cc: Paul Durrant, xen-devel@lists.xen.org

On Tue, May 28, 2013 at 10:17:00AM -0400, Konrad Rzeszutek Wilk wrote:
> On Tue, May 28, 2013 at 09:53:23AM +0000, James Harper wrote:
> > > > A less risky implementation would just use tmem as a write-thru cache and
> > > > then just throw out old pages on an LRU basis or something. Or discard the
> > > > pages from tmem on read but write them back to disk. It kind of sucks the
> > > > usefulness out of it though if you can't avoid the writes, and if windows is
> > > > doing some trickery to page out during periods of low io then I'd be
> > > > upsetting that too.
> > > >
> > > 
> > > This sounds a lot less fragile and the saving on reads to the storage backend
> > > could still be significant.
> > 
> > Should be easy enough to test I guess.
> > 
> > > > Anyway I have written a skeleton fs filter so I can monitor what is going on
> > > > in
> > > > better detail when I get a few minutes. Later versions of Windows might
> > > > make use of discard (trim/unmap) which would solve most of the above
> > > > problems.
> > > >
> > > > There do seem to be some (windows equivalent of) page cache operations
> > > > that could be hooked too... or else the api callback naming is leading me
> > > > astray.
> > > >
> > > 
> > > Sounds interesting. Presumably, if you can reliably intercept all IO on a
> > > pagefile then I guess you could use tmem as a write-back cache in front of
> > > doing your own file i/o down the storage stack, as long as you could reliably
> > > flush it out when necessary. E.g. does windows assume anything about the
> > > pagefile content on resume from S3 or S4?
> > > 
> > 
> > I need to look up if FS filter is notified about power state transitions. There may be a FLUSH of some sort that happens at that time. Newer versions of windows have a thing called 'hybrid suspend', where the hibernate file is written out as if windows were about to be hibernated, but it goes to sleep instead of hibernating but if power is lost a resume is still possible. It may be acceptable to say that tmem = no hibernate. Migrate should be easy enough as I have direct control over that and can make tmem be written back out to the pagefile first.
> > 
> > This all assumes that write back is possible too...
> 
> I am not familiar with the Windows APIs, but it sounds like you
> want to use the tmem ephermeal disk cache as an secondary cache
> (which is BTW what Linux does too).
> 
> That is OK the only thing you need to keep in mind that the
> hypervisor might flush said cache out if it decides to do it
> (say a new guest is launched and it needs the memory that
> said cache is using).
> 
> So the tmem_get might tell that it does not have the page anymore.

Oh and I should mention that I would be more than thrilled to try
this out and see how it works.

* Re: windows tmem
  2013-05-28 14:17         ` Konrad Rzeszutek Wilk
  2013-05-28 14:17           ` Konrad Rzeszutek Wilk
@ 2013-05-29  0:19           ` James Harper
  2013-05-29 15:42             ` Konrad Rzeszutek Wilk
  1 sibling, 1 reply; 17+ messages in thread
From: James Harper @ 2013-05-29  0:19 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: Paul Durrant, xen-devel@lists.xen.org

> 
> I am not familiar with the Windows APIs, but it sounds like you
> want to use the tmem ephermeal disk cache as an secondary cache
> (which is BTW what Linux does too).
> 
> That is OK the only thing you need to keep in mind that the
> hypervisor might flush said cache out if it decides to do it
> (say a new guest is launched and it needs the memory that
> said cache is using).
> 
> So the tmem_get might tell that it does not have the page anymore.

Yes I've read the brief :)

I originally wanted to implement the equivalent of 'frontswap' by trapping writes to the pagefile. A bit of digging and testing suggests it may not be possible to determine when a page written to the pagefile is discarded, meaning that tmem use would just grow until full and then stop being useful unless I eject pages on an LRU basis or something. So ephemeral tmem as a best-effort write-through cache might be the best and easiest starting point.

James

* Re: windows tmem
  2013-05-29  0:19           ` James Harper
@ 2013-05-29 15:42             ` Konrad Rzeszutek Wilk
  2013-05-30  2:51               ` James Harper
  0 siblings, 1 reply; 17+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-05-29 15:42 UTC (permalink / raw)
  To: James Harper; +Cc: Paul Durrant, xen-devel@lists.xen.org

On Wed, May 29, 2013 at 12:19:25AM +0000, James Harper wrote:
> > 
> > I am not familiar with the Windows APIs, but it sounds like you
> > want to use the tmem ephermeal disk cache as an secondary cache
> > (which is BTW what Linux does too).
> > 
> > That is OK the only thing you need to keep in mind that the
> > hypervisor might flush said cache out if it decides to do it
> > (say a new guest is launched and it needs the memory that
> > said cache is using).
> > 
> > So the tmem_get might tell that it does not have the page anymore.
> 
> Yes I've read the brief :)
> 
> I actually wanted to implement the equivalent of 'frontswap' originally by trapping writes to the pagefile. A bit of digging and testing suggests it may not be possible to determine when a page written to the pagefile is discarded, meaning that tmem use would just grow until fill and then stop being useful unless I eject pages on an LRU basis or something, so ephemeral tmem as a best-effort write-through cache might be the best and easiest starting point.
> 

<nods>
> James

* Re: windows tmem
  2013-05-29 15:42             ` Konrad Rzeszutek Wilk
@ 2013-05-30  2:51               ` James Harper
  2013-06-02  7:36                 ` James Harper
  0 siblings, 1 reply; 17+ messages in thread
From: James Harper @ 2013-05-30  2:51 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: Paul Durrant, xen-devel@lists.xen.org

> 
> On Wed, May 29, 2013 at 12:19:25AM +0000, James Harper wrote:
> > >
> > > I am not familiar with the Windows APIs, but it sounds like you
> > > want to use the tmem ephermeal disk cache as an secondary cache
> > > (which is BTW what Linux does too).
> > >
> > > That is OK the only thing you need to keep in mind that the
> > > hypervisor might flush said cache out if it decides to do it
> > > (say a new guest is launched and it needs the memory that
> > > said cache is using).
> > >
> > > So the tmem_get might tell that it does not have the page anymore.
> >
> > Yes I've read the brief :)
> >
> > I actually wanted to implement the equivalent of 'frontswap' originally by
> > trapping writes to the pagefile. A bit of digging and testing suggests it may
> > not be possible to determine when a page written to the pagefile is
> > discarded, meaning that tmem use would just grow until fill and then stop
> > being useful unless I eject pages on an LRU basis or something, so ephemeral
> > tmem as a best-effort write-through cache might be the best and easiest
> > starting point.
> >
> 
> <nods>

Unfortunately it gets worse... I'm testing on windows 2003 at the moment, and it seems to always write out data in 64k chunks, which are aligned to a 4k boundary. Then it reads in one or more of those pages, and maybe later re-uses the same part of the swapfile for something else. It seems that all reads are 4k in size, but there may be some grouping of those requests at a lower layer.

So I would end up caching up to 16x the actual data, with no way of knowing which of those 16 pages are actually being swapped out and which are just optimistically being written to disk without actually being paged out.

I'll do a bit of analysis of the MDL being written as that may give me some more information but it's not looking as good as I'd hoped.
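
The MDL poking I mean is just diagnostic, along these lines (a sketch only; it dumps the PFNs backing each paging write so I can see how the 64k chunks are composed):

/* Diagnostic sketch: dump the PFNs backing a paging write so the composition
 * of the 64k chunks can be seen.  Not intended as production code. */
#include <fltKernel.h>

static VOID DumpPagingWriteMdl(PFLT_CALLBACK_DATA Data)
{
    PMDL mdl = Data->Iopb->Parameters.Write.MdlAddress;
    PPFN_NUMBER pfns;
    ULONG pages, i;

    if (mdl == NULL)
        return;

    pages = ADDRESS_AND_SIZE_TO_SPAN_PAGES(MmGetMdlVirtualAddress(mdl),
                                           MmGetMdlByteCount(mdl));
    pfns = MmGetMdlPfnArray(mdl);

    DbgPrint("pagefile write at %I64x, %u pages:\n",
             Data->Iopb->Parameters.Write.ByteOffset.QuadPart, pages);
    for (i = 0; i < pages; i++)
        DbgPrint("  page %u -> pfn %p\n", i, (PVOID)pfns[i]);
}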

James

* Re: windows tmem
  2013-05-30  2:51               ` James Harper
@ 2013-06-02  7:36                 ` James Harper
  2013-06-03  8:54                   ` Paul Durrant
  0 siblings, 1 reply; 17+ messages in thread
From: James Harper @ 2013-06-02  7:36 UTC (permalink / raw)
  To: James Harper, Konrad Rzeszutek Wilk; +Cc: Paul Durrant, xen-devel@lists.xen.org

> 
> Unfortunately it gets worse... I'm testing on windows 2003 at the moment,
> and it seems to always write out data in 64k chunks, which are aligned to a 4k
> boundary. Then it reads in one or more of those pages, and maybe later re-
> uses the same part of the swapfile for something else. It seems that all reads
> are 4k in size, but there may be some grouping of those requests at a lower
> layer.
> 
> So I would end up caching up to 16x the actual data, with no way of knowing
> which of those 16 pages are actually being swapped out and which are just
> optimistically being written to disk without actually being paged out.
> 
> I'll do a bit of analysis of the MDL being written as that may give me some
> more information but it's not looking as good as I'd hoped.
> 

I now have a working implementation that does write-through caching of pagefile writes to ephemeral tmem. It keeps some counters on get and put operations, and on a Windows 2003 server with 256MB memory assigned, after a bit of running and flipping between applications I get:

put_success_count = 96565
put_fail_count    = 0
get_success_count = 34514
get_fail_count    = 5369

which works out to about an 87% hit rate (get_success out of all gets). That seems pretty good, except that Windows is quite aggressive about paging out, so there are a lot of unused writes (and therefore tmem usage) and I'm not sure if it's a net win.

Subjectively, Windows does seem faster with my driver active, but I'm using it over a crappy ADSL connection so it's hard to measure in any precise way.

I'm trying to see if it is possible to use tmem as a cache for the page cache, which would be more useful, but I'm not yet sure whether the required hooks exist in FS minifilters.

James

* Re: windows tmem
  2013-06-02  7:36                 ` James Harper
@ 2013-06-03  8:54                   ` Paul Durrant
  2013-06-03 12:49                     ` James Harper
                                       ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Paul Durrant @ 2013-06-03  8:54 UTC (permalink / raw)
  To: James Harper, Konrad Rzeszutek Wilk; +Cc: xen-devel@lists.xen.org

> -----Original Message-----
> From: James Harper [mailto:james.harper@bendigoit.com.au]
> Sent: 02 June 2013 08:37
> To: James Harper; Konrad Rzeszutek Wilk
> Cc: Paul Durrant; xen-devel@lists.xen.org
> Subject: RE: [Xen-devel] windows tmem
> 
> >
> > Unfortunately it gets worse... I'm testing on windows 2003 at the moment,
> > and it seems to always write out data in 64k chunks, which are aligned to a
> 4k
> > boundary. Then it reads in one or more of those pages, and maybe later re-
> > uses the same part of the swapfile for something else. It seems that all
> reads
> > are 4k in size, but there may be some grouping of those requests at a lower
> > layer.
> >
> > So I would end up caching up to 16x the actual data, with no way of
> knowing
> > which of those 16 pages are actually being swapped out and which are just
> > optimistically being written to disk without actually being paged out.
> >
> > I'll do a bit of analysis of the MDL being written as that may give me some
> > more information but it's not looking as good as I'd hoped.
> >
> 
> I now have a working implementation that does write-through caching of
> pagefile writes to ephemeral tmem. It keeps some counters on get and put
> operations, and on a Windows 2003 server with 256MB memory assigned,
> after a bit of running and flipping between applications I get:
> 
> put_success_count = 96565
> put_fail_count    = 0
> get_success_count = 34514
> get_fail_count    = 5369
> 
> which is somewhere around 85% hit rate vs misses. That seems pretty good
> except that windows is quite aggressive about paging out, so there are a lot
> of unused writes (and therefore tmem usage) and I'm not sure if it's a net
> win.
> 
> Subjectively, windows does seem faster with my driver active, but I'm using
> it over a crappy adsl connection so it's hard to measure in any precise way.
> 
> I'm trying to see if it is possible to use tmem as a page cache cache which
> would be more useful but I'm not yet sure if the required hooks exist in fs
> minifilters.
> 

Do you have any numbers for a more recent version of Windows? At least a 6.x kernel. Perhaps the pageout characteristics are better?

  Paul

* Re: windows tmem
  2013-06-03  8:54                   ` Paul Durrant
@ 2013-06-03 12:49                     ` James Harper
  2013-06-03 12:56                     ` James Harper
  2013-06-04  3:09                     ` James Harper
  2 siblings, 0 replies; 17+ messages in thread
From: James Harper @ 2013-06-03 12:49 UTC (permalink / raw)
  To: Paul Durrant, Konrad Rzeszutek Wilk; +Cc: xen-devel@lists.xen.org

> 
> Do you have any numbers for a more recent version of Windows? At least a
> 6.x kernel. Perhaps the pageout characteristics are better?
> 

Building a 2008R2 server now. It's on an older AMD server (Dual-Core AMD Opteron(tm) Processor 1210) and seems to be installing really slowly...

James

* Re: windows tmem
  2013-06-03  8:54                   ` Paul Durrant
  2013-06-03 12:49                     ` James Harper
@ 2013-06-03 12:56                     ` James Harper
  2013-06-04  3:09                     ` James Harper
  2 siblings, 0 replies; 17+ messages in thread
From: James Harper @ 2013-06-03 12:56 UTC (permalink / raw)
  To: Paul Durrant, Konrad Rzeszutek Wilk; +Cc: xen-devel@lists.xen.org

> >
> > Do you have any numbers for a more recent version of Windows? At least a
> > 6.x kernel. Perhaps the pageout characteristics are better?
> >
> 
> Building a 2008R2 server now. It's on an older AMD server (Dual-Core AMD
> Opteron(tm) Processor 1210) and seems to be installing really slowly...
> 

[117301.344358] powernow-k8: fid trans failed, fid 0xa, curr 0x0
[117301.344417] powernow-k8: transition frequency failed
[117301.348266] powernow-k8: fid trans failed, fid 0x2, curr 0x0
[117301.348326] powernow-k8: transition frequency failed

I suspect that's the reason...  I normally disable the powernow-k8 module but forgot after the latest update.

James

* Re: windows tmem
  2013-06-03  8:54                   ` Paul Durrant
  2013-06-03 12:49                     ` James Harper
  2013-06-03 12:56                     ` James Harper
@ 2013-06-04  3:09                     ` James Harper
  2013-06-04  8:24                       ` Paul Durrant
  2 siblings, 1 reply; 17+ messages in thread
From: James Harper @ 2013-06-04  3:09 UTC (permalink / raw)
  To: Paul Durrant, Konrad Rzeszutek Wilk; +Cc: xen-devel@lists.xen.org

> 
> Do you have any numbers for a more recent version of Windows? At least a
> 6.x kernel. Perhaps the pageout characteristics are better?
> 

Fresh install of 2008R2 with 512MB of memory and tmem active, with updates installing for the last 30 minutes:

put_success_count = 1286906
put_fail_count    = 0
get_success_count = 511937
get_fail_count    = 286789

a 'get fail' is a 'miss'.

James

* Re: windows tmem
  2013-06-04  3:09                     ` James Harper
@ 2013-06-04  8:24                       ` Paul Durrant
  2013-06-04 10:54                         ` James Harper
  0 siblings, 1 reply; 17+ messages in thread
From: Paul Durrant @ 2013-06-04  8:24 UTC (permalink / raw)
  To: James Harper, Konrad Rzeszutek Wilk; +Cc: xen-devel@lists.xen.org

> -----Original Message-----
> From: James Harper [mailto:james.harper@bendigoit.com.au]
> Sent: 04 June 2013 04:09
> To: Paul Durrant; Konrad Rzeszutek Wilk
> Cc: xen-devel@lists.xen.org
> Subject: RE: [Xen-devel] windows tmem
> 
> >
> > Do you have any numbers for a more recent version of Windows? At least a
> > 6.x kernel. Perhaps the pageout characteristics are better?
> >
> 
> Fresh install of 2008r2 with 512kb of memory and tmem active, with updates
> installing for the last 30 minutes:
> 
> put_success_count = 1286906
> put_fail_count    = 0
> get_success_count = 511937
> get_fail_count    = 286789
> 
> a 'get fail' is a 'miss'.
> 

Hmm. On the face of it that's a much higher miss rate than 2K3 (roughly 36% of gets failing), but the workload is different so it's hard to tell how comparable the numbers are. I wonder whether the use of ephemeral tmem is an issue because of the get-implies-flush characteristic. I guess you'd always expect a put between gets for a pagefile, but it might be interesting to see what miss rate you get with persistent tmem.

  Paul

* Re: windows tmem
  2013-06-04  8:24                       ` Paul Durrant
@ 2013-06-04 10:54                         ` James Harper
  0 siblings, 0 replies; 17+ messages in thread
From: James Harper @ 2013-06-04 10:54 UTC (permalink / raw)
  To: Paul Durrant, Konrad Rzeszutek Wilk; +Cc: xen-devel@lists.xen.org


> > Fresh install of 2008r2 with 512kb of memory and tmem active, with
> > updates
> > installing for the last 30 minutes:
> >
> > put_success_count = 1286906
> > put_fail_count    = 0
> > get_success_count = 511937
> > get_fail_count    = 286789
> >
> > a 'get fail' is a 'miss'.
> >
> 
> Hmm. On the face of it a much higher miss rate than 2K3, but the workload is
> different so it's hard to tell how comparable the numbers are. I wonder
> whether use of ephemeral tmem is an issue because of the get-implies-flush
> characteristic. I guess you'd always expect a put between gets for a pagefile
> but it might be interesting to see what miss rate you get with persistent
> tmem.
> 

After running for a while longer:

put_success_count = 15732240
put_fail_count    = 0
get_success_count = 10330032
get_fail_count    = 4460352

which is a similar hit rate for gets (~70% get_success vs get_fail), but a much better ratio of gets to puts. If ephemeral pages are discarded on read, then this tells me that around 66% of the pages I put into tmem were read back in, vs around 40% in my first sample.

For persistent tmem to work I'd need to know when Windows will not need the memory again, which is information I don't have access to, or alternatively maintain my own LRU structure. What I really need to know is when Windows discards a page from memory, but all I know so far is when it writes a page of memory out to disk, which only tells me that at some future time it might discard the page from memory.
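
If I did maintain my own LRU, the bookkeeping would be something like the sketch below (TmemFlushPage being a hypothetical wrapper around the tmem flush operation, and MAX_TRACKED_PAGES an arbitrary budget); whether the extra tracking is worth it is another question.

/* Sketch of guest-side LRU tracking for pages put into a persistent pool.
 * TmemFlushPage() is a hypothetical wrapper around the tmem flush operation;
 * PoolId/PAGEFILE_OID are assumed set up elsewhere.  LruList/LruLock would be
 * initialised at driver load with InitializeListHead()/KeInitializeSpinLock(). */
#include <fltKernel.h>

#define MAX_TRACKED_PAGES 8192          /* arbitrary budget */
#define PAGEFILE_OID 0ULL

extern int PoolId;
extern void TmemFlushPage(int pool_id, unsigned long long oid, ULONG index);

typedef struct _TRACKED_PAGE {
    LIST_ENTRY ListEntry;
    ULONG Index;                        /* pagefile page index == tmem key */
} TRACKED_PAGE, *PTRACKED_PAGE;

static LIST_ENTRY LruList;              /* head = most recently put */
static ULONG LruCount;
static KSPIN_LOCK LruLock;

static VOID LruNotePut(PTRACKED_PAGE Page)
{
    PTRACKED_PAGE victim = NULL;
    KIRQL irql;

    KeAcquireSpinLock(&LruLock, &irql);
    InsertHeadList(&LruList, &Page->ListEntry);
    if (++LruCount > MAX_TRACKED_PAGES) {
        /* Over budget: evict the least recently put page from tmem. */
        victim = CONTAINING_RECORD(RemoveTailList(&LruList), TRACKED_PAGE, ListEntry);
        LruCount--;
    }
    KeReleaseSpinLock(&LruLock, irql);

    if (victim != NULL) {
        TmemFlushPage(PoolId, PAGEFILE_OID, victim->Index);
        ExFreePoolWithTag(victim, 'gpTx');   /* tag is arbitrary */
    }
}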

I'm only testing this one VM on a physical machine, so Xen isn't trying to do any balancing of tmem pools against other VMs. Assigning 384MB (I said 512MB before but I was mistaken) to a Windows 2008R2 server isn't even close to a realistic scenario, and with a bunch of VMs all competing for ephemeral tmem, pages might mostly be discarded before they need to be retrieved.

James
