* Re: [PATCH 00/31] Swap over NFS -v20
2009-10-01 17:42 ` Christoph Hellwig
@ 2009-10-02 5:52 ` Neil Brown
2009-10-02 8:21 ` Suresh Jayaraman
2009-10-04 21:41 ` Peter Zijlstra
2 siblings, 0 replies; 8+ messages in thread
From: Neil Brown @ 2009-10-02 5:52 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Suresh Jayaraman, Linus Torvalds, Andrew Morton, linux-kernel,
linux-mm, netdev, Miklos Szeredi, Wouter Verhelst, Peter Zijlstra,
trond.myklebust
On Thursday October 1, hch@infradead.org wrote:
>
> The other really big one is adding a proper method for safe, page-backed
> kernelspace I/O on files. That is not something like the grotty
> swap-tied address_space operations in this patch, but more something in
> the direction of the kernel direct I/O patches that Jens Axboe did
> for use in the loop driver. But even those aren't complete as they
> don't touch the locking issue yet.
Do you have a problem with the proposed address_space operations apart
from their names including the word "swap"? Would something like:
direct_on, direct_off, direct_read, direct_write
be better?
Semantics being that the read and write:
- bypass the page cache (invalidation is up to caller)
- must not make a blocking non-emergency memory allocation
direct_on does any pre-allocation and pre-reading needed to ensure those
semantics can be provided.
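To make the proposed semantics concrete, here is a minimal userspace sketch of how the four operations might fit together. It is an illustration only: the struct layout, the `sketch_` function names, the bounce buffer, and the in-memory backing store are all assumptions and do not come from the patch set or from any kernel API. The point it tries to show is that `direct_on` does all the allocation up front, so that `direct_read` and `direct_write` never need to allocate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096

/* Hypothetical model of a file opened for "direct" kernel I/O. */
struct direct_file {
    unsigned char *backing; /* stands in for the file's storage */
    size_t nr_pages;        /* size of the backing store in pages */
    void *bounce;           /* scratch page reserved by direct_on */
    int enabled;            /* set between direct_on and direct_off */
};

/* Hypothetical operations table, named as suggested in the mail. */
struct direct_ops {
    int (*direct_on)(struct direct_file *f);
    void (*direct_off)(struct direct_file *f);
    int (*direct_read)(struct direct_file *f, size_t index, void *page);
    int (*direct_write)(struct direct_file *f, size_t index, const void *page);
};

static int sketch_direct_on(struct direct_file *f)
{
    /* The only place that allocates; blocking is acceptable here. */
    f->bounce = malloc(SKETCH_PAGE_SIZE);
    if (!f->bounce)
        return -1;
    f->enabled = 1;
    return 0;
}

static void sketch_direct_off(struct direct_file *f)
{
    free(f->bounce);
    f->bounce = NULL;
    f->enabled = 0;
}

static int sketch_direct_write(struct direct_file *f, size_t index,
                               const void *page)
{
    if (!f->enabled || index >= f->nr_pages)
        return -1;
    assert(f->bounce != NULL);
    /* No allocation: stage through the reserved page, then store it. */
    memcpy(f->bounce, page, SKETCH_PAGE_SIZE);
    memcpy(f->backing + index * SKETCH_PAGE_SIZE, f->bounce, SKETCH_PAGE_SIZE);
    return 0;
}

static int sketch_direct_read(struct direct_file *f, size_t index, void *page)
{
    if (!f->enabled || index >= f->nr_pages)
        return -1;
    assert(f->bounce != NULL);
    /* No allocation and no page cache: copy straight from storage. */
    memcpy(f->bounce, f->backing + index * SKETCH_PAGE_SIZE, SKETCH_PAGE_SIZE);
    memcpy(page, f->bounce, SKETCH_PAGE_SIZE);
    return 0;
}

static const struct direct_ops sketch_ops = {
    .direct_on = sketch_direct_on,
    .direct_off = sketch_direct_off,
    .direct_read = sketch_direct_read,
    .direct_write = sketch_direct_write,
};
```

In this model a caller would invoke `sketch_ops.direct_on()` once (the swapon equivalent), then read and write whole pages by index, and finally call `direct_off()`. Cache invalidation is left to the caller, as in the semantics listed above.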
I have wondered if an extra flag along the lines of "I don't care
about this data after a crash" would be useful.
It would be set for swap, but not set for other users. Thus
e.g. RAID1 could easily avoid resyncing an area that was used only for
swap.
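As a rough illustration of the RAID1 idea, the sketch below models a per-region record that remembers whether every write to the region carried the "don't care after a crash" flag. Everything here is hypothetical: the names, the fixed region count, and the simplification that dirty state is never cleared are assumptions made for the example and do not reflect how the MD bitmap code actually works.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_REGIONS 8

/* Hypothetical per-region state kept by a mirroring layer. */
struct region_state {
    bool dirty;         /* at least one write has touched the region */
    bool volatile_only; /* every such write carried the flag */
};

struct mirror_bitmap {
    struct region_state region[NR_REGIONS];
};

/* Record a write; dont_care is the hypothetical "not needed after crash" flag. */
static void note_write(struct mirror_bitmap *bm, size_t r, bool dont_care)
{
    struct region_state *s;

    assert(r < NR_REGIONS);
    s = &bm->region[r];
    if (!s->dirty) {
        s->dirty = true;
        s->volatile_only = dont_care;
    } else if (!dont_care) {
        /* One durable write is enough to require a resync. */
        s->volatile_only = false;
    }
}

/* After a crash, count the regions that still need resyncing. */
static size_t regions_to_resync(const struct mirror_bitmap *bm)
{
    size_t i, n = 0;

    for (i = 0; i < NR_REGIONS; i++)
        if (bm->region[i].dirty && !bm->region[i].volatile_only)
            n++;
    return n;
}
```

With this model, a region written only by swap is skipped at resync time, while a region that saw even one ordinary write is resynced as usual.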
The only thing of Jens' that I could find used bmap - is there
something more recent I should look for?
>
> Especially the latter is an absolutely essential step to make any
> progress here, and an excellent patch series of its own as there are
> multiple users for this, like making swap safe on btrfs files, making
> the MD bitmap code actually safe or improving the loop driver.
100% agree.
Thanks,
NeilBrown
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 00/31] Swap over NFS -v20
2009-10-01 17:42 ` Christoph Hellwig
2009-10-02 5:52 ` Neil Brown
@ 2009-10-02 8:21 ` Suresh Jayaraman
2009-10-04 21:41 ` Peter Zijlstra
2 siblings, 0 replies; 8+ messages in thread
From: Suresh Jayaraman @ 2009-10-02 8:21 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Linus Torvalds, Andrew Morton, linux-kernel, linux-mm, netdev,
Neil Brown, Miklos Szeredi, Wouter Verhelst, Peter Zijlstra,
trond.myklebust
Christoph Hellwig wrote:
> On Thu, Oct 01, 2009 at 07:34:18PM +0530, Suresh Jayaraman wrote:
>
> The other really big one is adding a proper method for safe, page-backed
> kernelspace I/O on files. That is not something like the grotty
> swap-tied address_space operations in this patch, but more something in
I'm not sure I understand what problems you see with the proposed
address_space operations. Could you please elaborate a bit more?
> the direction of the kernel direct I/O patches that Jens Axboe did
> for use in the loop driver. But even those aren't complete as they
> don't touch the locking issue yet.
>
Thanks,
--
Suresh Jayaraman
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 00/31] Swap over NFS -v20
2009-10-01 17:42 ` Christoph Hellwig
2009-10-02 5:52 ` Neil Brown
2009-10-02 8:21 ` Suresh Jayaraman
@ 2009-10-04 21:41 ` Peter Zijlstra
2009-10-10 12:06 ` Pavel Machek
2 siblings, 1 reply; 8+ messages in thread
From: Peter Zijlstra @ 2009-10-04 21:41 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Suresh Jayaraman, Linus Torvalds, Andrew Morton, linux-kernel,
linux-mm, netdev, Neil Brown, Miklos Szeredi, Wouter Verhelst,
trond.myklebust
On Thu, 2009-10-01 at 13:42 -0400, Christoph Hellwig wrote:
> One of them
> would be the whole VM/net work to just make swap over nbd/iscsi safe.
Getting those two 'fixed' is going to be tons of interesting work
because they involve interaction with userspace daemons.
NBD has fairly simple userspace, but iSCSI has a rather large userspace
footprint and a rather complicated user/kernel interaction which will be
mighty interesting to get allocation safe.
Ideally the swap-over-$foo bits have no userspace component.
That said, Wouter is the NBD userspace maintainer and has expressed
interest in looking at making that work, but it's sure going to be
non-trivial, esp. since exposing PF_MEMALLOC to userspace is an
"over my dead body" kind of thing.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 00/31] Swap over NFS -v20
2009-10-04 21:41 ` Peter Zijlstra
@ 2009-10-10 12:06 ` Pavel Machek
2009-10-10 12:23 ` Peter Zijlstra
0 siblings, 1 reply; 8+ messages in thread
From: Pavel Machek @ 2009-10-10 12:06 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Christoph Hellwig, Suresh Jayaraman, Linus Torvalds,
Andrew Morton, linux-kernel, linux-mm, netdev, Neil Brown,
Miklos Szeredi, Wouter Verhelst, trond.myklebust
Hi!
> > One of them
> > would be the whole VM/net work to just make swap over nbd/iscsi safe.
>
> Getting those two 'fixed' is going to be tons of interesting work
> because they involve interaction with userspace daemons.
>
> NBD has fairly simple userspace, but iSCSI has a rather large userspace
> footprint and a rather complicated user/kernel interaction which will be
> mighty interesting to get allocation safe.
>
> Ideally the swap-over-$foo bits have no userspace component.
>
> That said, Wouter is the NBD userspace maintainer and has expressed
> interest in looking at making that work, but it's sure going to be
> non-trivial, esp. since exposing PF_MEMALLOC to userspace is an
> "over my dead body" kind of thing.
Well, as long as nbd-server is on a separate machine (with real swap),
safe swapping over the network should be OK, without PF_MEMALLOC for
userspace or similar nightmares, right?
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 00/31] Swap over NFS -v20
2009-10-10 12:06 ` Pavel Machek
@ 2009-10-10 12:23 ` Peter Zijlstra
2009-10-10 21:10 ` Pavel Machek
0 siblings, 1 reply; 8+ messages in thread
From: Peter Zijlstra @ 2009-10-10 12:23 UTC (permalink / raw)
To: Pavel Machek
Cc: Christoph Hellwig, Suresh Jayaraman, Linus Torvalds,
Andrew Morton, linux-kernel, linux-mm, netdev, Neil Brown,
Miklos Szeredi, Wouter Verhelst, trond.myklebust
On Sat, 2009-10-10 at 14:06 +0200, Pavel Machek wrote:
> Hi!
>
> > > One of them
> > > would be the whole VM/net work to just make swap over nbd/iscsi safe.
> >
> > Getting those two 'fixed' is going to be tons of interesting work
> > because they involve interaction with userspace daemons.
> >
> > NBD has fairly simple userspace, but iSCSI has a rather large userspace
> > footprint and a rather complicated user/kernel interaction which will be
> > mighty interesting to get allocation safe.
> >
> > Ideally the swap-over-$foo bits have no userspace component.
> >
> > That said, Wouter is the NBD userspace maintainer and has expressed
> > interest in looking at making that work, but it's sure going to be
> > non-trivial, esp. since exposing PF_MEMALLOC to userspace is an
> > "over my dead body" kind of thing.
>
> Well, as long as nbd-server is on a separate machine (with real swap),
> safe swapping over the network should be OK, without PF_MEMALLOC for
> userspace or similar nightmares, right?
Nope, as soon as the nbd-client loses its connection you're up shit
creek.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 00/31] Swap over NFS -v20
2009-10-10 12:23 ` Peter Zijlstra
@ 2009-10-10 21:10 ` Pavel Machek
0 siblings, 0 replies; 8+ messages in thread
From: Pavel Machek @ 2009-10-10 21:10 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Christoph Hellwig, Suresh Jayaraman, Linus Torvalds,
Andrew Morton, linux-kernel, linux-mm, netdev, Neil Brown,
Miklos Szeredi, Wouter Verhelst, trond.myklebust
On Sat 2009-10-10 14:23:41, Peter Zijlstra wrote:
> On Sat, 2009-10-10 at 14:06 +0200, Pavel Machek wrote:
> > Hi!
> >
> > > > One of them
> > > > would be the whole VM/net work to just make swap over nbd/iscsi safe.
> > >
> > > Getting those two 'fixed' is going to be tons of interesting work
> > > because they involve interaction with userspace daemons.
> > >
> > > NBD has fairly simple userspace, but iSCSI has a rather large userspace
> > > footprint and a rather complicated user/kernel interaction which will be
> > > mighty interesting to get allocation safe.
> > >
> > > Ideally the swap-over-$foo bits have no userspace component.
> > >
> > > That said, Wouter is the NBD userspace maintainer and has expressed
> > > interest in looking at making that work, but it's sure going to be
> > > non-trivial, esp. since exposing PF_MEMALLOC to userspace is an
> > > "over my dead body" kind of thing.
> >
> > Well, as long as nbd-server is on a separate machine (with real swap),
> > safe swapping over the network should be OK, without PF_MEMALLOC for
> > userspace or similar nightmares, right?
>
> Nope, as soon as the nbd-client loses its connection you're up shit
> creek.
Oops, right. Putting reconnect logic into the kernel would make sense.
I misunderstood your proposal. I thought you wanted to put
nbd-_server_ into the kernel too. I guess we violently agree that
that's unnecessary.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
^ permalink raw reply [flat|nested] 8+ messages in thread