* [RFC] shmgetfd idea @ 2014-01-28 1:37 John Stultz 2014-01-28 1:53 ` Kay Sievers ` (2 more replies) 0 siblings, 3 replies; 25+ messages in thread
From: John Stultz @ 2014-01-28 1:37 UTC
To: linux-mm@kvack.org
Cc: Greg KH, Kay Sievers, Android Kernel Team, Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel, Michel Lespinasse, Johannes Weiner, H. Peter Anvin, Neil Brown, Andrea Arcangeli, Takahiro Akashi, Minchan Kim, Lennart Poettering

In working with ashmem and looking briefly at kdbus' memfd ideas, there's a commonality: both basically act as a method to provide applications with unlinked tmpfs/shmem fds.

In the Android case, it's important to have this interface to atomically provide these unlinked tmpfs fds, because they'd like to avoid having tmpfs mounts that are writable by applications (since that creates a potential DoS on the system by applications writing random files that persist after the process has been killed). It also provides better life-cycle management for resources: since the fds never have named links in the filesystem, their resources are automatically cleaned up when the last process with the fd dies, and there are no potential races between create and unlink with processes being terminated, which avoids the need for cleanup management.

I won't speak for the kdbus use, but my understanding is memfds address similar needs along with being something to connect with other features.

So one idea was maybe we need a new interface. Something like:

    int shmgetfd(char* name, size_t size, int shmflg);

Basically this would be very similar to shmget, but would return a file descriptor which could be mapped and passed to other processes to map. Basically very similar to the in-kernel shmem_file_setup() interface.

(Thanks to Akashi-san for initially pointing out the similarity to shmget.)

Of course, shmgetfd on its own wouldn't address the quota issue right away, but it would be fairly easy to have a limit for the total number of bytes a process could generate, or some other limiting mechanism.

The probably more major drawback here is that both ashmem and memfd tack on additional features that can be applied to the fds.

In ashmem's case it allows for changing the segment's name, and unpinning regions which can then be lazily discarded by the kernel.

For memfd, the extra feature is sealing, which prevents modification of the file when it's shared.

In ashmem's case, both vma-naming and volatile ranges are trying to address how the needed features would be generically applied to tmpfs fds (and potentially wider uses) - so combined with something like shmgetfd, that would provide all the functionality needed. I am not aware of any current plans for memfd's sealing to be similarly worked into a generic concept - the code hasn't even been submitted, so this is too early - but in any case, it's important to note that none of these plans for generic functionality have been merged or even received with much interest, so I do understand how a proposal for a new interface that only solves half of the needed infrastructure may not be particularly welcome.

So while I do understand the difficulty of trying to create more generic interfaces rather than just creating a new chardev/ioctl interface to a more limited subset of functionality, I do think it's worth exploring if we can find a way to share infrastructure at some level (even if it's just due diligence to establish whether the more limited-scope chardev/ioctl interfaces are widely agreed to be better).

Anyway, I just wanted to submit this sketched-out idea as food for thought to see if there was any objection or interest (I've got a draft patch I'll send out once I get a chance to test it). So let me know if you have any feedback or comments.

thanks
-john

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
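For illustration only (this is not the draft patch mentioned above): a minimal userspace sketch of what applications have to do today without something like shmgetfd(), assuming a directory the caller can write to. The helper name unlinked_shm_fd() and the "shmfd-XXXXXX" template are made up for this example. It shows the create-then-unlink window that an atomic interface would close.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical userspace stand-in for the proposed shmgetfd(): create a
 * file in `dir` (ideally a tmpfs mount such as /dev/shm), unlink it, and
 * size it. Returns an fd with no name in the filesystem, or -1 on error.
 *
 * Unlike the proposed syscall this is NOT atomic: if the process is
 * killed between mkstemp() and unlink(), the named file is left behind.
 * It also needs `dir` to be writable by the caller.
 */
int unlinked_shm_fd(const char *dir, size_t size)
{
    char path[PATH_MAX];
    int fd, n;

    assert(dir != NULL);
    n = snprintf(path, sizeof(path), "%s/shmfd-XXXXXX", dir);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    fd = mkstemp(path);              /* step 1: a named file now exists */
    if (fd < 0)
        return -1;

    /* race window: a SIGKILL here leaks the named file */

    if (unlink(path) < 0 ||          /* step 2: drop the name */
        ftruncate(fd, (off_t)size) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller would use it as unlinked_shm_fd("/dev/shm", 4096) and then mmap() the result or pass it over a unix socket. The two drawbacks named in the comment (the race and the writable mount) are the ones the message above wants to avoid.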
* Re: [RFC] shmgetfd idea 2014-01-28 1:37 [RFC] shmgetfd idea John Stultz @ 2014-01-28 1:53 ` Kay Sievers 2014-01-28 19:47 ` John Stultz 2014-01-28 3:52 ` H. Peter Anvin 2014-01-30 8:46 ` Christoph Hellwig 2 siblings, 1 reply; 25+ messages in thread
From: Kay Sievers @ 2014-01-28 1:53 UTC
To: John Stultz
Cc: linux-mm@kvack.org, Greg KH, Android Kernel Team, Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel, Michel Lespinasse, Johannes Weiner, H. Peter Anvin, Neil Brown, Andrea Arcangeli, Takahiro Akashi, Minchan Kim, Lennart Poettering

On Tue, Jan 28, 2014 at 2:37 AM, John Stultz <john.stultz@linaro.org> wrote:
> In working with ashmem and looking briefly at kdbus' memfd ideas,
> there's a commonality that both basically act as a method to provide
> applications with unlinked tmpfs/shmem fds.
>
> In the Android case, its important to have this interface to atomically
> provide these unlinked tmpfs fds, because they'd like to avoid having
> tmpfs mounts that are writable by applications (since that creates a
> potential DOS on the system by applications writing random files that
> persist after the process has been killed). It also provides better
> life-cycle management for resources, since as the fds never have named
> links in the filesystem, their resources are automatically cleaned up
> when the last process with the fd dies, and there's no potential races
> between create and unlink with processes being terminated, which avoids
> the need for cleanup management.
>
> I won't speak for the kdbus use, but my understanding is memfds address
> similar needs along with being something to connect with other features.
>
>
> So one idea was maybe we need a new interface. Something like:
>
> int shmgetfd(char* name, size_t size, int shmflg);
>
>
> Basically this would be very similar to shmget, but would return a file
> descriptor which could be mapped and passed to other processes to map.
> Basically very similar to the in-kernel shmem_file_setup() interface.
>
> (Thanks to Akashi-san for initially pointing out the similarity to shmget.)
>
> Of course, shmgetfd on its own wouldn't address the quota issue right
> away, but it would be fairly easy have a limit for the total number of
> bytes a process could generate, or some other limiting mechanism.
>
>
> The probably more major drawback here is that both ashmem and memfd tack
> on additional features that can be done to the fds.
>
> In ashmems' case it allows for changing the segment's name, and
> unpinning regions which can then be lazily discarded by the kernel.
>
> For memfd, the extra feature is sealing, which prevents modification of
> the file when its shared.
>
> In ashmem's case, both vma-naming and volatile ranges are trying to
> address how the needed features would be generically applied to tmpfs
> fds (as well as potentially wider uses as well) - so with something like
> shmgetfd it would provide all the functionality needed. I am not aware
> of any current plans for memfd's sealing to be similarly worked into a
> generic concept - the code hasn't even been submitted, so this is too
> early - but in any case, its important to note none of these plans for
> generic functionality have been merged or even received with much
> interest, so I do understand how a proposal for a new interface that
> only solves half of the needed infrastructure may not be particularly
> welcome.
>
> So while I do understand the difficulty of trying to create more generic
> interfaces rather then just creating a new chardev/ioctl interface to a
> more limited subset of functionality, I do think its worth exploring if
> we can find a way to share infrastructure at some level (even if its
> just due-diligence to prove if the more limited scope chardev/ioctl
> interfaces are widely agreed to be better).
>
> Anyway, I just wanted to submit this sketched out idea as food for
> thought to see if there was any objection or interest (I've got a draft
> patch I'll send out once I get a chance to test it). So let me know if
> you have any feedback or comments.

The reason "kdbus-memfd" exists is primarily the sealing.

We need a way to pass possibly large areas of memory from one process to another without requiring any trust relation between the two processes. There cannot be an assumption about trusted vs. untrusted or creator vs. consumer; all variations must be able to mix in all combinations and still be safe.

A sender of the message must be sure that the receiver cannot alter the message, the same way the receiver must be sure that the sender cannot alter the message content it just sent.

It would be nice if we could generalize the whole memfd logic, but the shmem allocation facility alone, without the sealing function, cannot replace kdbus-memfd.

We would need secure sealing right from the start for the kdbus use case; other than that, there are no specific requirements from the kdbus side.

Kay

* Re: [RFC] shmgetfd idea 2014-01-28 1:53 ` Kay Sievers @ 2014-01-28 19:47 ` John Stultz 0 siblings, 0 replies; 25+ messages in thread From: John Stultz @ 2014-01-28 19:47 UTC (permalink / raw) To: Kay Sievers Cc: linux-mm@kvack.org, Greg KH, Android Kernel Team, Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel, Michel Lespinasse, Johannes Weiner, H. Peter Anvin, Neil Brown, Andrea Arcangeli, Takahiro Akashi, Minchan Kim, Lennart Poettering On 01/27/2014 05:53 PM, Kay Sievers wrote: > On Tue, Jan 28, 2014 at 2:37 AM, John Stultz <john.stultz@linaro.org> wrote: >> Anyway, I just wanted to submit this sketched out idea as food for >> thought to see if there was any objection or interest (I've got a draft >> patch I'll send out once I get a chance to test it). So let me know if >> you have any feedback or comments. > The reason "kdbus-memfd" exists is primarily the sealing. [snip] > It would be nice if we can generalize the whole memfd logic, but the > shmem allocation facility alone, without the sealing function cannot > replace kdbus-memfd. Yes. Quite understood. And I too hope to discuss how the sealing feature could be generalized when the code is submitted for review. I just figured I'd start here, so when that time comes we have a sketch for what the rest of the parts that would be needed are. > We would need secure sealing right from the start for the kdbus use > case; other than that, there are no specific requirements from the > kdbus side. Thanks for the clarifications! -john -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [RFC] shmgetfd idea 2014-01-28 1:37 [RFC] shmgetfd idea John Stultz 2014-01-28 1:53 ` Kay Sievers @ 2014-01-28 3:52 ` H. Peter Anvin 2014-01-28 19:56 ` John Stultz 2014-01-30 8:46 ` Christoph Hellwig 2 siblings, 1 reply; 25+ messages in thread
From: H. Peter Anvin @ 2014-01-28 3:52 UTC
To: John Stultz, linux-mm@kvack.org
Cc: Greg KH, Kay Sievers, Android Kernel Team, Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel, Michel Lespinasse, Johannes Weiner, Neil Brown, Andrea Arcangeli, Takahiro Akashi, Minchan Kim, Lennart Poettering

On 01/27/2014 05:37 PM, John Stultz wrote:
>
> In the Android case, its important to have this interface to atomically
> provide these unlinked tmpfs fds, because they'd like to avoid having
> tmpfs mounts that are writable by applications (since that creates a
> potential DOS on the system by applications writing random files that
> persist after the process has been killed). It also provides better
> life-cycle management for resources, since as the fds never have named
> links in the filesystem, their resources are automatically cleaned up
> when the last process with the fd dies, and there's no potential races
> between create and unlink with processes being terminated, which avoids
> the need for cleanup management.
>

What if tmpfs could be restricted to only O_TMPFILE open()s? This pretty much amounts to an option to prevent tmpfs from creating new directory entries.

-hpa

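As a rough sketch of the application side of this suggestion, assuming a kernel with O_TMPFILE (Linux 3.11 or later) and a tmpfs mount that supports it: the helper name tmpfs_tmpfile_fd() is invented for this example, and the mount option that would restrict the tmpfs to such opens is hypothetical and not shown. As far as I know, today's O_TMPFILE still needs write permission on the directory, so this alone does not remove the writable mount.

```c
#define _GNU_SOURCE   /* for O_TMPFILE */
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Open an unnamed file inside the filesystem mounted at `mountpoint`
 * using O_TMPFILE and give it `size` bytes. No directory entry is ever
 * created, so there is no create/unlink window. O_EXCL additionally
 * prevents the file from being linked into the filesystem later via
 * linkat(). Returns the fd, or -1 on error (for example when the
 * filesystem or kernel does not support O_TMPFILE).
 */
int tmpfs_tmpfile_fd(const char *mountpoint, size_t size)
{
    int fd;

    assert(mountpoint != NULL);
    fd = open(mountpoint, O_TMPFILE | O_RDWR | O_EXCL, 0600);
    if (fd < 0)
        return -1;

    if (ftruncate(fd, (off_t)size) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

On a typical system this would be called as tmpfs_tmpfile_fd("/dev/shm", 4096). The path names the tmpfs instance to populate, which is the property pointed out above.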
* Re: [RFC] shmgetfd idea 2014-01-28 3:52 ` H. Peter Anvin @ 2014-01-28 19:56 ` John Stultz 2014-01-28 20:37 ` H. Peter Anvin 0 siblings, 1 reply; 25+ messages in thread From: John Stultz @ 2014-01-28 19:56 UTC (permalink / raw) To: H. Peter Anvin, linux-mm@kvack.org Cc: Greg KH, Kay Sievers, Android Kernel Team, Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel, Michel Lespinasse, Johannes Weiner, Neil Brown, Andrea Arcangeli, Takahiro Akashi, Minchan Kim, Lennart Poettering On 01/27/2014 07:52 PM, H. Peter Anvin wrote: > On 01/27/2014 05:37 PM, John Stultz wrote: >> In the Android case, its important to have this interface to atomically >> provide these unlinked tmpfs fds, because they'd like to avoid having >> tmpfs mounts that are writable by applications (since that creates a >> potential DOS on the system by applications writing random files that >> persist after the process has been killed). It also provides better >> life-cycle management for resources, since as the fds never have named >> links in the filesystem, their resources are automatically cleaned up >> when the last process with the fd dies, and there's no potential races >> between create and unlink with processes being terminated, which avoids >> the need for cleanup management. >> > What about if tmpfs could be restricted to only only O_TMPFILE open()s? > This pretty much amounts to an option to prevent tmpfs from creating > new directory entries. Thanks for reminding me about O_TMPFILE.. I have it on my list to look into how it could be used. As for the O_TMPFILE only tmpfs option, it seems maybe a little clunky to me, but possible. If others think this would be preferred over a new syscall, I'll dig in deeper. thanks -john -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
* Re: [RFC] shmgetfd idea 2014-01-28 19:56 ` John Stultz @ 2014-01-28 20:37 ` H. Peter Anvin 2014-01-28 20:58 ` John Stultz 0 siblings, 1 reply; 25+ messages in thread
From: H. Peter Anvin @ 2014-01-28 20:37 UTC
To: John Stultz, linux-mm@kvack.org
Cc: Greg KH, Kay Sievers, Android Kernel Team, Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel, Michel Lespinasse, Johannes Weiner, Neil Brown, Andrea Arcangeli, Takahiro Akashi, Minchan Kim, Lennart Poettering

On 01/28/2014 11:56 AM, John Stultz wrote:
>
> Thanks for reminding me about O_TMPFILE.. I have it on my list to look
> into how it could be used.
>
> As for the O_TMPFILE only tmpfs option, it seems maybe a little clunky
> to me, but possible. If others think this would be preferred over a new
> syscall, I'll dig in deeper.
>

What is clunky about it? It reuses an existing interface and still points to the specific tmpfs instance that should be populated.

-hpa

* Re: [RFC] shmgetfd idea 2014-01-28 20:37 ` H. Peter Anvin @ 2014-01-28 20:58 ` John Stultz 2014-01-28 21:01 ` Kay Sievers 0 siblings, 1 reply; 25+ messages in thread
From: John Stultz @ 2014-01-28 20:58 UTC
To: H. Peter Anvin, linux-mm@kvack.org
Cc: Greg KH, Kay Sievers, Android Kernel Team, Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel, Michel Lespinasse, Johannes Weiner, Neil Brown, Andrea Arcangeli, Takahiro Akashi, Minchan Kim, Lennart Poettering

On 01/28/2014 12:37 PM, H. Peter Anvin wrote:
> On 01/28/2014 11:56 AM, John Stultz wrote:
>> Thanks for reminding me about O_TMPFILE.. I have it on my list to look
>> into how it could be used.
>>
>> As for the O_TMPFILE only tmpfs option, it seems maybe a little clunky
>> to me, but possible. If others think this would be preferred over a new
>> syscall, I'll dig in deeper.
>>
> What is clunky about it? It reuses an existing interface and still
> points to the specific tmpfs instance that should be populated.

It would require a new mount point convention that userland would have to standardize. To me (and admittedly it's a matter of taste), a new O_TMPFILE-only tmpfs mount point seems to be a bigger interface change from an application writer's perspective than a new syscall.

But maybe I'm misunderstanding your suggestion?

thanks
-john

* Re: [RFC] shmgetfd idea 2014-01-28 20:58 ` John Stultz @ 2014-01-28 21:01 ` Kay Sievers 2014-01-28 21:05 ` John Stultz 0 siblings, 1 reply; 25+ messages in thread From: Kay Sievers @ 2014-01-28 21:01 UTC (permalink / raw) To: John Stultz Cc: H. Peter Anvin, linux-mm@kvack.org, Greg KH, Android Kernel Team, Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel, Michel Lespinasse, Johannes Weiner, Neil Brown, Andrea Arcangeli, Takahiro Akashi, Minchan Kim, Lennart Poettering On Tue, Jan 28, 2014 at 9:58 PM, John Stultz <john.stultz@linaro.org> wrote: > On 01/28/2014 12:37 PM, H. Peter Anvin wrote: >> On 01/28/2014 11:56 AM, John Stultz wrote: >>> Thanks for reminding me about O_TMPFILE.. I have it on my list to look >>> into how it could be used. >>> >>> As for the O_TMPFILE only tmpfs option, it seems maybe a little clunky >>> to me, but possible. If others think this would be preferred over a new >>> syscall, I'll dig in deeper. >>> >> What is clunky about it? It reuses an existing interface and still >> points to the specific tmpfs instance that should be populated. > > It would require new mount point convention that userland would have to > standardize. To me (and admittedly its a taste thing), a new > O_TMPFILE-only tmpfs mount point seems to be to be a bigger interface > change from an application writers perspective then a new syscall. > > But maybe I'm misunderstanding your suggestion? General purpose Linux has /dev/shm/ for that already, which will not go away anytime soon.. Kay -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [RFC] shmgetfd idea 2014-01-28 21:01 ` Kay Sievers @ 2014-01-28 21:05 ` John Stultz 2014-01-28 21:10 ` H. Peter Anvin 2014-01-28 21:28 ` Kay Sievers 0 siblings, 2 replies; 25+ messages in thread From: John Stultz @ 2014-01-28 21:05 UTC (permalink / raw) To: Kay Sievers Cc: H. Peter Anvin, linux-mm@kvack.org, Greg KH, Android Kernel Team, Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel, Michel Lespinasse, Johannes Weiner, Neil Brown, Andrea Arcangeli, Takahiro Akashi, Minchan Kim, Lennart Poettering On 01/28/2014 01:01 PM, Kay Sievers wrote: > On Tue, Jan 28, 2014 at 9:58 PM, John Stultz <john.stultz@linaro.org> wrote: >> On 01/28/2014 12:37 PM, H. Peter Anvin wrote: >>> On 01/28/2014 11:56 AM, John Stultz wrote: >>>> Thanks for reminding me about O_TMPFILE.. I have it on my list to look >>>> into how it could be used. >>>> >>>> As for the O_TMPFILE only tmpfs option, it seems maybe a little clunky >>>> to me, but possible. If others think this would be preferred over a new >>>> syscall, I'll dig in deeper. >>>> >>> What is clunky about it? It reuses an existing interface and still >>> points to the specific tmpfs instance that should be populated. >> It would require new mount point convention that userland would have to >> standardize. To me (and admittedly its a taste thing), a new >> O_TMPFILE-only tmpfs mount point seems to be to be a bigger interface >> change from an application writers perspective then a new syscall. >> >> But maybe I'm misunderstanding your suggestion? > General purpose Linux has /dev/shm/ for that already, which will not > go away anytime soon.. Right, though making /dev/shm/ O_TMPFILE only would likely break things, no? thanks -john -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [RFC] shmgetfd idea 2014-01-28 21:05 ` John Stultz @ 2014-01-28 21:10 ` H. Peter Anvin 2014-01-28 21:54 ` John Stultz 2014-01-28 21:28 ` Kay Sievers 1 sibling, 1 reply; 25+ messages in thread
From: H. Peter Anvin @ 2014-01-28 21:10 UTC
To: John Stultz, Kay Sievers
Cc: linux-mm@kvack.org, Greg KH, Android Kernel Team, Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel, Michel Lespinasse, Johannes Weiner, Neil Brown, Andrea Arcangeli, Takahiro Akashi, Minchan Kim, Lennart Poettering

On 01/28/2014 01:05 PM, John Stultz wrote:
>> General purpose Linux has /dev/shm/ for that already, which will not
>> go away anytime soon..
>
> Right, though making /dev/shm/ O_TMPFILE only would likely break things, no?

If it isn't, then you already have a writable tmpfs, which is what you said you didn't want.

-hpa

* Re: [RFC] shmgetfd idea 2014-01-28 21:10 ` H. Peter Anvin @ 2014-01-28 21:54 ` John Stultz 2014-01-28 22:14 ` Kay Sievers 0 siblings, 1 reply; 25+ messages in thread
From: John Stultz @ 2014-01-28 21:54 UTC
To: H. Peter Anvin, Kay Sievers
Cc: linux-mm@kvack.org, Greg KH, Android Kernel Team, Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel, Michel Lespinasse, Johannes Weiner, Neil Brown, Andrea Arcangeli, Takahiro Akashi, Minchan Kim, Lennart Poettering

On 01/28/2014 01:10 PM, H. Peter Anvin wrote:
> On 01/28/2014 01:05 PM, John Stultz wrote:
>>> General purpose Linux has /dev/shm/ for that already, which will not
>>> go away anytime soon..
>> Right, though making /dev/shm/ O_TMPFILE only would likely break things, no?
> If it isn't, then you already have a writable tmpfs, which is what you
> said you didn't want.

Well, rather than finding a solution exclusively for Android, I'm trying to find an approach that would work more generically.

While classic Linux systems do have a writable /dev/shm/, which we *have* to preserve, it seems to me that classic Linux systems may some day want to deal with the issues with writable tmpfs that Android has intentionally avoided.

For examples of grumblings on these issues see: https://bugzilla.redhat.com/show_bug.cgi?id=693253 (and its dup)

Requiring a binary on/off flag for /dev/shm makes it so you have to choose if you are a classic or new-style (Android-like) system. By avoiding re-using existing convention via providing a new syscall (or alternatively with your approach, a new yet-to-be-standardized mount point convention), it would allow best practices to be updated, and allow for a slow deprecation of the writable /dev/shm, possibly by limiting permissions to /dev/shm to only legacy applications, etc.

But yes, alternatively classic systems may be able to get around the issues via tmpfs quotas and convincing applications to use O_TMPFILE there. But to me this seems less ideal than the Android approach, where the lifecycle of the tmpfs fds is more limited and clear.

And my main point being: Android's ashmem and kdbus' memfds are both utilizing these semantics (though maybe they aren't as important/intentional for kdbus?), so it seems like some generic method (which would work in both environments) would be generally useful.

Again, I really do appreciate your feedback here, and I don't mean to be panning your idea (I'm quite willing to look further into it if others think it's the right way)! I just want to explain my point of view and motivations a bit better.

thanks!
-john

* Re: [RFC] shmgetfd idea 2014-01-28 21:54 ` John Stultz @ 2014-01-28 22:14 ` Kay Sievers 2014-01-28 23:02 ` H. Peter Anvin 2014-01-28 23:14 ` John Stultz 0 siblings, 2 replies; 25+ messages in thread
From: Kay Sievers @ 2014-01-28 22:14 UTC
To: John Stultz
Cc: H. Peter Anvin, linux-mm@kvack.org, Greg KH, Android Kernel Team, Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel, Michel Lespinasse, Johannes Weiner, Neil Brown, Andrea Arcangeli, Takahiro Akashi, Minchan Kim, Lennart Poettering

On Tue, Jan 28, 2014 at 10:54 PM, John Stultz <john.stultz@linaro.org> wrote:
> On 01/28/2014 01:10 PM, H. Peter Anvin wrote:
>> On 01/28/2014 01:05 PM, John Stultz wrote:
>>>> General purpose Linux has /dev/shm/ for that already, which will not
>>>> go away anytime soon..
>>> Right, though making /dev/shm/ O_TMPFILE only would likely break things, no?
>> If it isn't, then you already have a writable tmpfs, which is what you
>> said you didn't want.
>
> Well, rather then finding a solution exclusively for Android, I'm trying
> to find an approach that would work more generically.
>
> While classic Linux systems do have writable /dev/shm/, which we *have*
> to preserve, it seem to me that classic linux systems may some day want
> to deal with the issues with writable tmpfs that Android has
> intentionally avoided.
>
> For examples of grumblings on these issues see:
> https://bugzilla.redhat.com/show_bug.cgi?id=693253 (and its dup)
>
> Requiring a binary on/off flag for /dev/shm makes it so you have to
> choose if you are a classic or new-style (android-like) system. By
> avoiding re-using existing convention via providing a new syscall (or
> alternatively with your approach, a new yet to be standardized mount
> point convention), it would allow best practices to be updated, and
> allow for a slow deprecation of the writable /dev/shm, possibly by
> limiting permissions to /dev/shm to only legacy applications, etc.
>
> But yes, alternatively classic systems may be able to get around the
> issues via tmpfs quotas and convincing applications to use O_TMPFILE
> there. But to me this seems less ideal then the Android approach, where
> the lifecycle of the tmpfs fds more limited and clear.

Tmpfs supports no quota; it's a huge hole, and unsafe in that regard on every system today. But ashmem and kdbus, as they are today, are no better.

> And my main point being: Both Android's ashmem and kdbus' memfds are
> both utilizing these semantics (though maybe they aren't as
> important/intentional for kdbus?),

We need a way to securely identify an fd that is a memfd in the kernel and in userspace, and we need to be able to seal it. The rest does not really matter; we could use O_TMPFILE if we need to, but it still lacks all the other features.

> so it seems like some generic method
> (which would work in both environments) would generally useful.

Sure, would be nice. There are people from the Wayland and X camps who asked for secure semantics and sharing of shmfds too.

> Again, I really do appreciate your feedback here, and I don't mean to be
> panning your idea (I'm quite willing to look further into it if others
> think its the right way)! I just want to explain my point of view and
> motivations a bit better.

I think the most convincing option right now is a new memfd() syscall or a character device.

We would need more than a create syscall for the sealing/unsealing; not sure if fcntl() could be (mis-)used/extended for the sealing interface.

A new character device with ioctls, replacing the current ashmem and the kdbus memfd part, could also work. It has the advantage that it would just be an optional device driver and is not a primary API with all the promises, and it would provide us with all we need; just the creation part with the involved ioctl struct definitions is not really pretty.

Kay

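For readers coming to this thread later: nothing like the following existed when these messages were written. Linux 3.17 later added memfd_create() together with sealing through fcntl() (F_ADD_SEALS), which is close to the "create syscall plus fcntl" combination considered above. The sketch below uses those real calls; the helper name sealed_message_fd() and the "sealed-msg" debug name are invented for the example, and a glibc that provides the memfd_create() wrapper (2.27 or later) is assumed.

```c
#define _GNU_SOURCE   /* for memfd_create() and F_ADD_SEALS */
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Copy `len` bytes of `msg` into an anonymous shmem-backed fd and seal
 * it so that neither the sender nor any receiver can modify, grow or
 * shrink it afterwards, or remove the seals. Returns the fd, or -1.
 */
int sealed_message_fd(const void *msg, size_t len)
{
    int fd;

    assert(msg != NULL);
    fd = memfd_create("sealed-msg", MFD_CLOEXEC | MFD_ALLOW_SEALING);
    if (fd < 0)
        return -1;

    if (write(fd, msg, len) != (ssize_t)len ||
        fcntl(fd, F_ADD_SEALS,
              F_SEAL_SHRINK | F_SEAL_GROW | F_SEAL_WRITE | F_SEAL_SEAL) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A receiver that does not trust the sender can call fcntl(fd, F_GET_SEALS) and refuse the fd unless the expected seals are present, which matches the "no trust relation in either direction" requirement stated earlier in the thread.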
* Re: [RFC] shmgetfd idea 2014-01-28 22:14 ` Kay Sievers @ 2014-01-28 23:02 ` H. Peter Anvin 2014-01-28 23:14 ` Kay Sievers 2014-01-28 23:14 ` John Stultz 1 sibling, 1 reply; 25+ messages in thread
From: H. Peter Anvin @ 2014-01-28 23:02 UTC
To: Kay Sievers, John Stultz
Cc: linux-mm@kvack.org, Greg KH, Android Kernel Team, Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel, Michel Lespinasse, Johannes Weiner, Neil Brown, Andrea Arcangeli, Takahiro Akashi, Minchan Kim, Lennart Poettering

On 01/28/2014 02:14 PM, Kay Sievers wrote:
>>
>> But yes, alternatively classic systems may be able to get around the
>> issues via tmpfs quotas and convincing applications to use O_TMPFILE
>> there. But to me this seems less ideal then the Android approach, where
>> the lifecycle of the tmpfs fds more limited and clear.
>
> Tmpfs supports no quota, it's all a huge hole and unsafe in that
> regard on every system today. But ashmem and kdbus, as they are today,
> are not better.
>

We can fix that aspect in tmpfs. Creating new file objects outside of filesystems really doesn't make things any better, since our toolbox around this stuff largely revolves around filesystems.

-hpa

* Re: [RFC] shmgetfd idea 2014-01-28 23:02 ` H. Peter Anvin @ 2014-01-28 23:14 ` Kay Sievers 2014-01-28 23:19 ` H. Peter Anvin 0 siblings, 1 reply; 25+ messages in thread
From: Kay Sievers @ 2014-01-28 23:14 UTC
To: H. Peter Anvin
Cc: John Stultz, linux-mm@kvack.org, Greg KH, Android Kernel Team, Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel, Michel Lespinasse, Johannes Weiner, Neil Brown, Andrea Arcangeli, Takahiro Akashi, Minchan Kim, Lennart Poettering

On Wed, Jan 29, 2014 at 12:02 AM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 01/28/2014 02:14 PM, Kay Sievers wrote:
>>>
>>> But yes, alternatively classic systems may be able to get around the
>>> issues via tmpfs quotas and convincing applications to use O_TMPFILE
>>> there. But to me this seems less ideal then the Android approach, where
>>> the lifecycle of the tmpfs fds more limited and clear.
>>
>> Tmpfs supports no quota, it's all a huge hole and unsafe in that
>> regard on every system today. But ashmem and kdbus, as they are today,
>> are not better.
>
> We can fix that aspect in tmpfs. Creating new file objcts outside of
> filesystems really doesn't make things any better, since our toolbox
> around this stuff largely revolves around filesystems.

Sure, it should be fixed, no doubt; even outside this context, it's something that we should have.

Back to the topic: let's say we would require a tmpfs mount to get to an unlinked shmemfd, which sounds acceptable if we can solve the other features in a nice way.

What would be the interface for additional functionality like sealing/unsealing that thing, so that no operation can destroy its content as long as there is more than a single owner? Would that be a new syscall, or fcntl() with specific shmemfd options?

We also need to solve the problem that the inode does not show up in /proc/$PID/fd/, so that nothing can create a new file for it which we don't catch with the "single owner" logic. Or we could determine the "single owner" state from the inode itself?

Kay

* Re: [RFC] shmgetfd idea
From: H. Peter Anvin @ 2014-01-28 23:19 UTC
To: Kay Sievers
Cc: John Stultz, linux-mm@kvack.org, Greg KH, Android Kernel Team, Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel, Michel Lespinasse, Johannes Weiner, Neil Brown, Andrea Arcangeli, Takahiro Akashi, Minchan Kim, Lennart Poettering

On 01/28/2014 03:14 PM, Kay Sievers wrote:
>
> What would be the interface for additional functionality like
> sealing/unsealing that thing, so that no operation can destruct its
> content as long as there is more than a single owner? Would that be a
> new syscall or fcntl() with specific shmemfd options?
>
> We also need to solve the problem that the inode does not show up in
> /proc/$PID/fd/, so that nothing can create a new file for it which we
> don't catch with the "single owner" logic. Or we could determine the
> "single owner" state from the inode itself?
>

If the "single owner" is determined by the file structure (e.g. via an fcntl as opposed to an ioctl), then presumably we would simply deny an attempt to open the inode and create a new file structure for it.

On Linux, /proc/$PID/fd is an open as opposed to a dup (as much as I personally don't like those semantics, they are well set in stone at this point) so it satisfies your requirements.

-hpa
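The open-versus-dup distinction described above can be observed from userspace. The following is a minimal sketch, assuming a Linux system with /proc mounted and a writable /tmp; the function name `compare_dup_and_proc_reopen` is hypothetical and not part of any proposal in this thread. It writes through one descriptor, then compares the file offset seen through a dup() (same file structure, shared offset) with the offset seen through a fresh open of /proc/self/fd/<fd> (new file structure, private offset).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write 5 bytes through one descriptor, then report
 * the file offset seen through a dup() of it and through a fresh open()
 * of /proc/self/fd/<fd>. Returns 0 on success, -1 on error.
 */
int compare_dup_and_proc_reopen(off_t *dup_off, off_t *reopen_off)
{
    char tmpl[] = "/tmp/fd-demo-XXXXXX";
    char path[64];
    int fd, dupfd = -1, refd = -1, rc = -1;

    assert(dup_off != NULL && reopen_off != NULL);

    fd = mkstemp(tmpl);
    if (fd < 0)
        return -1;
    unlink(tmpl);                 /* the file lives on through the open fd */

    if (write(fd, "hello", 5) != 5)
        goto out;

    dupfd = dup(fd);              /* shares the file structure with fd */
    snprintf(path, sizeof(path), "/proc/self/fd/%d", fd);
    refd = open(path, O_RDONLY);  /* creates a new file structure */
    if (dupfd < 0 || refd < 0)
        goto out;

    *dup_off = lseek(dupfd, 0, SEEK_CUR);
    *reopen_off = lseek(refd, 0, SEEK_CUR);
    rc = 0;
out:
    if (refd >= 0)
        close(refd);
    if (dupfd >= 0)
        close(dupfd);
    close(fd);
    return rc;
}
```

After a successful call, the dup() offset reflects the 5 bytes already written while the re-opened descriptor starts at offset 0, which is why a "single owner" check tied to the file structure could catch such a re-open.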
* Re: [RFC] shmgetfd idea
From: Kay Sievers @ 2014-01-29 0:14 UTC
To: H. Peter Anvin
Cc: John Stultz, linux-mm@kvack.org, Greg KH, Android Kernel Team, Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel, Michel Lespinasse, Johannes Weiner, Neil Brown, Andrea Arcangeli, Takahiro Akashi, Minchan Kim, Lennart Poettering

On Wed, Jan 29, 2014 at 12:19 AM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 01/28/2014 03:14 PM, Kay Sievers wrote:
>>
>> What would be the interface for additional functionality like
>> sealing/unsealing that thing, so that no operation can destruct its
>> content as long as there is more than a single owner? Would that be a
>> new syscall or fcntl() with specific shmemfd options?
>>
>> We also need to solve the problem that the inode does not show up in
>> /proc/$PID/fd/, so that nothing can create a new file for it which we
>> don't catch with the "single owner" logic. Or we could determine the
>> "single owner" state from the inode itself?
>>
>
> If the "single owner" is determined by the file structure (e.g. via an
> fcntl as opposed to an ioctl), then presumably we would simply deny an
> attempt to open the inode and create a new file structure for it.
>
> On Linux, /proc/$PID/fd is an open as opposed to a dup (as much as I
> personally don't like those semantics, they are well set in stone at
> this point) so it satisfies your requirements.

If that could all be made to work, for the kdbus case we would be fine with requiring *any* tmpfs mount, creating a new memfd from there with O_TMPFILE, and using new fcntl() definitions to protect/seal/unseal and identify that fd.

For the more restricted cases like Android, that tmpfs mount could get a mount option to not allow the creation of any non-unlinked file, I guess.

Kay
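The creation step of this scheme, getting an unlinked file from a tmpfs mount with O_TMPFILE, can be sketched with interfaces that already exist. This is a minimal sketch under stated assumptions: the directory passed in is on a filesystem that supports O_TMPFILE (for example a tmpfs mount such as /dev/shm), and the helper names `open_unlinked_shmem` and `fd_link_count` are hypothetical. The protect/seal/unseal fcntl() operations discussed in the thread are only a proposal here, so none are shown.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create an unnamed file of `size` bytes inside the
 * directory `dir`. Returns a file descriptor, or -1 with errno set.
 */
int open_unlinked_shmem(const char *dir, off_t size)
{
    int fd;

    assert(dir != NULL);

    /* O_TMPFILE takes the containing directory, not a file name. */
    fd = open(dir, O_TMPFILE | O_RDWR | O_CLOEXEC, 0600);
    if (fd < 0)
        return -1;

    if (ftruncate(fd, size) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Link count of the inode behind fd, or -1 on error. */
long fd_link_count(int fd)
{
    struct stat st;

    if (fstat(fd, &st) < 0)
        return -1;
    return (long)st.st_nlink;
}
```

A caller would typically mmap() the returned descriptor and pass it to other processes over a Unix socket; because the inode never has a link in the filesystem, its memory is released once the last descriptor and mapping go away.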
* Re: [RFC] shmgetfd idea
From: H. Peter Anvin @ 2014-01-29 0:20 UTC
To: Kay Sievers
Cc: John Stultz, linux-mm@kvack.org, Greg KH, Android Kernel Team, Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel, Michel Lespinasse, Johannes Weiner, Neil Brown, Andrea Arcangeli, Takahiro Akashi, Minchan Kim, Lennart Poettering

On 01/28/2014 04:14 PM, Kay Sievers wrote:
>>
>> If the "single owner" is determined by the file structure (e.g. via an
>> fcntl as opposed to an ioctl), then presumably we would simply deny an
>> attempt to open the inode and create a new file structure for it.
>>
>> On Linux, /proc/$PID/fd is an open as opposed to a dup (as much as I
>> personally don't like those semantics, they are well set in stone at
>> this point) so it satisfies your requirements.
>
> If that could all be made to work, for the kdbus case we would be fine
> with requiring *any* tmpfs mount, creating a new memfd from there with
> O_TMPFILE, and using new fcntl() definitions to protect/seal/unseal and
> identify that fd.
>
> For the more restricted cases like Android, that tmpfs mount could get
> a mount option to not allow the creation of any non-unlinked file, I
> guess.
>

Right, that would be the idea.

-hpa
* Re: [RFC] shmgetfd idea
From: Kay Sievers @ 2014-01-29 0:49 UTC
To: H. Peter Anvin
Cc: John Stultz, linux-mm@kvack.org, Greg KH, Android Kernel Team, Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel, Michel Lespinasse, Johannes Weiner, Neil Brown, Andrea Arcangeli, Takahiro Akashi, Minchan Kim, Lennart Poettering

On Wed, Jan 29, 2014 at 1:20 AM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 01/28/2014 04:14 PM, Kay Sievers wrote:
>>>
>>> If the "single owner" is determined by the file structure (e.g. via an
>>> fcntl as opposed to an ioctl), then presumably we would simply deny an
>>> attempt to open the inode and create a new file structure for it.
>>>
>>> On Linux, /proc/$PID/fd is an open as opposed to a dup (as much as I
>>> personally don't like those semantics, they are well set in stone at
>>> this point) so it satisfies your requirements.
>>
>> If that could all be made to work, for the kdbus case we would be fine
>> with requiring *any* tmpfs mount, creating a new memfd from there with
>> O_TMPFILE, and using new fcntl() definitions to protect/seal/unseal and
>> identify that fd.
>>
>> For the more restricted cases like Android, that tmpfs mount could get
>> a mount option to not allow the creation of any non-unlinked file, I
>> guess.
>>
>
> Right, that would be the idea.

I like your idea. Sounds worth trying, if you think we can make the protection/sealing work without too many ugly workarounds.

With the filesystem as a "domain" / the root for all the unlinked shmem files, we could even mount a separate tmpfs for every logged-in user, and put the quota on the user that way. It will still not solve the /dev/shm/ or /tmp quota problem, but it would at least not get bigger with every new shmem user we invent. :)

Kay
* Re: [RFC] shmgetfd idea
From: John Stultz @ 2014-01-28 23:14 UTC
To: Kay Sievers
Cc: H. Peter Anvin, linux-mm@kvack.org, Greg KH, Android Kernel Team, Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel, Michel Lespinasse, Johannes Weiner, Neil Brown, Andrea Arcangeli, Takahiro Akashi, Minchan Kim, Lennart Poettering

On 01/28/2014 02:14 PM, Kay Sievers wrote:
> On Tue, Jan 28, 2014 at 10:54 PM, John Stultz <john.stultz@linaro.org> wrote:
>> But yes, alternatively classic systems may be able to get around the
>> issues via tmpfs quotas and convincing applications to use O_TMPFILE
>> there. But to me this seems less ideal than the Android approach, where
>> the lifecycle of the tmpfs fds is more limited and clear.
> Tmpfs supports no quota, it's all a huge hole and unsafe in that
> regard on every system today. But ashmem and kdbus, as they are today,
> are not better.

While it's true ashmem and kdbus currently have no limitation on the amount of memory an application can consume via the unlinked tmpfs fds, they both do have the benefit that those unlinked files are cleaned up when the last user dies (or is killed).

While adding quota to these approaches would improve things, tmpfs quota alone on writable tmpfs mounts only limits the DOS to the user (i.e., one bad application could fill up the user's tmpfs and quit, and then other applications would either fail to work or need some sort of logic to figure out which tmpfs files could safely be cleaned up).

Other than this minor point, I think I'm in agreement with the other points in your mail.

thanks
-john
* Re: [RFC] shmgetfd idea
From: Kay Sievers @ 2014-01-28 21:28 UTC
To: John Stultz
Cc: H. Peter Anvin, linux-mm@kvack.org, Greg KH, Android Kernel Team, Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel, Michel Lespinasse, Johannes Weiner, Neil Brown, Andrea Arcangeli, Takahiro Akashi, Minchan Kim, Lennart Poettering

On Tue, Jan 28, 2014 at 10:05 PM, John Stultz <john.stultz@linaro.org> wrote:
> On 01/28/2014 01:01 PM, Kay Sievers wrote:
>> On Tue, Jan 28, 2014 at 9:58 PM, John Stultz <john.stultz@linaro.org> wrote:
>>> On 01/28/2014 12:37 PM, H. Peter Anvin wrote:
>>>> On 01/28/2014 11:56 AM, John Stultz wrote:
>>>>> Thanks for reminding me about O_TMPFILE.. I have it on my list to look
>>>>> into how it could be used.
>>>>>
>>>>> As for the O_TMPFILE only tmpfs option, it seems maybe a little clunky
>>>>> to me, but possible. If others think this would be preferred over a new
>>>>> syscall, I'll dig in deeper.
>>>>>
>>>> What is clunky about it? It reuses an existing interface and still
>>>> points to the specific tmpfs instance that should be populated.
>>> It would require a new mount point convention that userland would have to
>>> standardize. To me (and admittedly it's a taste thing), a new
>>> O_TMPFILE-only tmpfs mount point seems to be a bigger interface
>>> change from an application writer's perspective than a new syscall.
>>>
>>> But maybe I'm misunderstanding your suggestion?
>> General purpose Linux has /dev/shm/ for that already, which will not
>> go away anytime soon..
>
> Right, though making /dev/shm/ O_TMPFILE only would likely break things, no?

Right, general purpose Linux could not mount with that option without expecting major breakage, see: man shm_overview. But a custom OS could just define that, I guess.

The current /dev/shm/ semantics and the shm APIs in general have been kind of a broken idea from the very beginning.

Kay
* Re: [RFC] shmgetfd idea
From: Christoph Hellwig @ 2014-01-30 8:46 UTC
To: John Stultz
Cc: linux-mm@kvack.org, Greg KH, Kay Sievers, Android Kernel Team, Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel, Michel Lespinasse, Johannes Weiner, H. Peter Anvin, Neil Brown, Andrea Arcangeli, Takahiro Akashi, Minchan Kim, Lennart Poettering

On Mon, Jan 27, 2014 at 05:37:04PM -0800, John Stultz wrote:
> In working with ashmem and looking briefly at kdbus' memfd ideas,
> there's a commonality that both basically act as a method to provide
> applications with unlinked tmpfs/shmem fds.

Just use O_TMPFILE on a tmpfs file and you're done.
* Re: [RFC] shmgetfd idea
From: Kay Sievers @ 2014-01-30 16:02 UTC
To: Christoph Hellwig
Cc: John Stultz, linux-mm@kvack.org, Greg KH, Android Kernel Team, Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel, Michel Lespinasse, Johannes Weiner, H. Peter Anvin, Neil Brown, Andrea Arcangeli, Takahiro Akashi, Minchan Kim, Lennart Poettering

On Thu, Jan 30, 2014 at 9:46 AM, Christoph Hellwig <hch@infradead.org> wrote:
> On Mon, Jan 27, 2014 at 05:37:04PM -0800, John Stultz wrote:
>> In working with ashmem and looking briefly at kdbus' memfd ideas,
>> there's a commonality that both basically act as a method to provide
>> applications with unlinked tmpfs/shmem fds.
>
> Just use O_TMPFILE on a tmpfs file and you're done.

Ashmem and kdbus can name the deleted files, which is useful for debugging and for tools to show the associated name for the file descriptor. They also show up in /proc/$PID/maps and possibly in /proc/$PID/fd/.

O_TMPFILE always creates files with just the name "/". Unless that is changed we wouldn't want to switch over to O_TMPFILE, because we would lose that nice feature.

Is there a way to "fix" O_TMPFILE to accept the name of the file to be created, instead of insisting on taking only the leading directory as the argument?

Kay
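The name that debugging tools show for a descriptor is the target of its /proc/$PID/fd/ link, so the naming concern above can be inspected directly. The following is a minimal sketch, assuming a Linux system with /proc mounted and a writable /tmp; the helper names `fd_display_name`, `make_unlinked_named_file` and `name_marks_deleted` are hypothetical. It reads back the reported name of a file that was created with a real name and then unlinked, the idiom that predates O_TMPFILE.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy the name the kernel reports for fd (the
 * target of /proc/self/fd/<fd>) into buf, NUL-terminated.
 * Returns the name length, or -1 on error.
 */
ssize_t fd_display_name(int fd, char *buf, size_t len)
{
    char path[64];
    ssize_t n;

    assert(buf != NULL && len > 0);

    snprintf(path, sizeof(path), "/proc/self/fd/%d", fd);
    n = readlink(path, buf, len - 1);
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return n;
}

/* Create a named file and unlink it right away; returns the fd or -1. */
int make_unlinked_named_file(void)
{
    char tmpl[] = "/tmp/name-demo-XXXXXX";
    int fd = mkstemp(tmpl);

    if (fd >= 0)
        unlink(tmpl);
    return fd;
}

/* Returns 1 if the reported name carries the " (deleted)" marker. */
int name_marks_deleted(const char *name)
{
    return strstr(name, " (deleted)") != NULL;
}
```

Calling `fd_display_name()` on a descriptor obtained with O_TMPFILE instead shows what name, if any, that path leaves for tools to display; the result depends on the kernel version, so no particular output is assumed here.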
* Re: [RFC] shmgetfd idea
From: John Stultz @ 2014-01-30 21:42 UTC
To: Kay Sievers, Christoph Hellwig
Cc: linux-mm@kvack.org, Greg KH, Android Kernel Team, Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel, Michel Lespinasse, Johannes Weiner, H. Peter Anvin, Neil Brown, Andrea Arcangeli, Takahiro Akashi, Minchan Kim, Lennart Poettering

On 01/30/2014 08:02 AM, Kay Sievers wrote:
> On Thu, Jan 30, 2014 at 9:46 AM, Christoph Hellwig <hch@infradead.org> wrote:
>> On Mon, Jan 27, 2014 at 05:37:04PM -0800, John Stultz wrote:
>>> In working with ashmem and looking briefly at kdbus' memfd ideas,
>>> there's a commonality that both basically act as a method to provide
>>> applications with unlinked tmpfs/shmem fds.
>> Just use O_TMPFILE on a tmpfs file and you're done.
> Ashmem and kdbus can name the deleted files, which is useful for
> debugging and for tools to show the associated name for the file
> descriptor. They also show up in /proc/$PID/maps and possibly in
> /proc/$PID/fd/.
>
> O_TMPFILE always creates files with just the name "/". Unless that is
> changed we wouldn't want to switch over to O_TMPFILE, because we would
> lose that nice feature.

Not sure, but would Colin's vma-naming patch (or something like it) help address this?
https://lkml.org/lkml/2013/10/30/518

thanks
-john
* Re: [RFC] shmgetfd idea
From: Kay Sievers @ 2014-01-31 0:01 UTC
To: John Stultz
Cc: Christoph Hellwig, linux-mm@kvack.org, Greg KH, Android Kernel Team, Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel, Michel Lespinasse, Johannes Weiner, H. Peter Anvin, Neil Brown, Andrea Arcangeli, Takahiro Akashi, Minchan Kim, Lennart Poettering

On Thu, Jan 30, 2014 at 10:42 PM, John Stultz <john.stultz@linaro.org> wrote:
> On 01/30/2014 08:02 AM, Kay Sievers wrote:
>> On Thu, Jan 30, 2014 at 9:46 AM, Christoph Hellwig <hch@infradead.org> wrote:
>>> On Mon, Jan 27, 2014 at 05:37:04PM -0800, John Stultz wrote:
>>>> In working with ashmem and looking briefly at kdbus' memfd ideas,
>>>> there's a commonality that both basically act as a method to provide
>>>> applications with unlinked tmpfs/shmem fds.
>>> Just use O_TMPFILE on a tmpfs file and you're done.
>> Ashmem and kdbus can name the deleted files, which is useful for
>> debugging and for tools to show the associated name for the file
>> descriptor. They also show up in /proc/$PID/maps and possibly in
>> /proc/$PID/fd/.
>>
>> O_TMPFILE always creates files with just the name "/". Unless that is
>> changed we wouldn't want to switch over to O_TMPFILE, because we would
>> lose that nice feature.
>
> Not sure, but would Colin's vma-naming patch (or something like it) help
> address this?
> https://lkml.org/lkml/2013/10/30/518

Hmm, I don't think so. This seems to be about anonymous memory only, but shmem files are not anonymous.

We really just want the actual file names (ashmem too), the same way shmem_file_setup() accepts the name for the unlinked file to create.

Kay
* Re: [RFC] shmgetfd idea
From: Christoph Hellwig @ 2014-02-03 15:03 UTC
To: Kay Sievers
Cc: Christoph Hellwig, John Stultz, linux-mm@kvack.org, Greg KH, Android Kernel Team, Andrew Morton, Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel, Michel Lespinasse, Johannes Weiner, H. Peter Anvin, Neil Brown, Andrea Arcangeli, Takahiro Akashi, Minchan Kim, Lennart Poettering, Al Viro, Michael Kerrisk

On Thu, Jan 30, 2014 at 05:02:40PM +0100, Kay Sievers wrote:
> Ashmem and kdbus can name the deleted files, which is useful for
> debugging and for tools to show the associated name for the file
> descriptor. They also show up in /proc/$PID/maps and possibly in
> /proc/$PID/fd/.
>
> O_TMPFILE always creates files with just the name "/". Unless that is
> changed we wouldn't want to switch over to O_TMPFILE, because we would
> lose that nice feature.
>
> Is there a way to "fix" O_TMPFILE to accept the name of the file to
> be created, instead of insisting on taking only the leading directory as
> the argument?

As far as the VFS is concerned this should be fairly easily doable; we'd just have to switch O_TMPFILE to the same lookup-parent-first algorithm used for O_CREAT. The filesystems shouldn't really care at all, as the name will never be stored on disk.

In fact, such a full-path O_TMPFILE would be much nicer than the current one, as its arguments are more similar to those of the normal O_CREAT open. I would document it as the default one, even if the old semantics still had to be supported for backwards compatibility.
end of thread, other threads: [~2014-02-03 15:03 UTC]

Thread overview: 25+ messages
2014-01-28 1:37 [RFC] shmgetfd idea John Stultz
2014-01-28 1:53 ` Kay Sievers
2014-01-28 19:47 ` John Stultz
2014-01-28 3:52 ` H. Peter Anvin
2014-01-28 19:56 ` John Stultz
2014-01-28 20:37 ` H. Peter Anvin
2014-01-28 20:58 ` John Stultz
2014-01-28 21:01 ` Kay Sievers
2014-01-28 21:05 ` John Stultz
2014-01-28 21:10 ` H. Peter Anvin
2014-01-28 21:54 ` John Stultz
2014-01-28 22:14 ` Kay Sievers
2014-01-28 23:02 ` H. Peter Anvin
2014-01-28 23:14 ` Kay Sievers
2014-01-28 23:19 ` H. Peter Anvin
2014-01-29 0:14 ` Kay Sievers
2014-01-29 0:20 ` H. Peter Anvin
2014-01-29 0:49 ` Kay Sievers
2014-01-28 23:14 ` John Stultz
2014-01-28 21:28 ` Kay Sievers
2014-01-30 8:46 ` Christoph Hellwig
2014-01-30 16:02 ` Kay Sievers
2014-01-30 21:42 ` John Stultz
2014-01-31 0:01 ` Kay Sievers
2014-02-03 15:03 ` Christoph Hellwig