From: Jens Axboe
Subject: Re: [patch] pipe: add support for shrinking and growing pipes
Date: Sun, 23 May 2010 19:47:06 +0200
Message-ID: <20100523174706.GP23411@kernel.dk>
References: <20100522223838.ebca396a.akpm@linux-foundation.org> <20100523070917.GO23411@kernel.dk>
To: mtk.manpages@gmail.com
Cc: Andrew Morton, Linus Torvalds, Miklos Szeredi, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org

On Sun, May 23 2010, Michael Kerrisk wrote:
> On Sun, May 23, 2010 at 9:09 AM, Jens Axboe wrote:
> > On Sun, May 23 2010, Michael Kerrisk wrote:
> >> On Sun, May 23, 2010 at 4:38 AM, Andrew Morton wrote:
> >> > On Sun, 23 May 2010 07:30:01 +0200 Michael Kerrisk wrote:
> >> >
> >> >> Hi all,
> >> >>
> >> >> I see that this patch has hit Linus's git, so some questions
> >> >>
> >> >> On Wed, May 19, 2010 at 6:49 PM, Linus Torvalds wrote:
> >> >> >
> >> >> >
> >> >> > On Wed, 19 May 2010, Miklos Szeredi wrote:
> >> >> >>
> >> >> >> One issue I see is that it's possible to grow pipes indefinitely.
> >> >> >> Should this be restricted to privileged users?
> >> >> >
> >> >> > Yes. But perhaps only if it grows past the default (or perhaps "default*2"
> >> >> > or similar). That way a normal user could shrink the pipe buffers, and
> >> >> > then grow them again if he wants to.
> >> >> >
> >> >> > Oh, and I think you need to also require that there be at least two
> >> >> > buffers. Otherwise we can't guarantee POSIX behavior, I think.
> >> >>
> >> >> Is there any documentation (e.g., a man-pages patch) for these changes?
> >> >>
> >> >> The argument of the fcntl() operations is expressed in pages.
> >> >> I take it that this means that the semantics of the argument will vary
> >> >> depending on the system page size? So for example, 2 on x86 will mean
> >> >> 8192 bytes, but will mean 32768 on ia64? That seems very weird. (And
> >> >> what about architectures where the page size is switchable?) Such
> >> >> changes in semantics should not be silent for the user, IMO.
> >> >
> >> > Well, there is getpagesize(). But I agree - this interface is just
> >> > asking (x86) people to write non-portable code.
> >> >
> >> > otoh, if the arg was in bytes, they'd just hard-code "8192". They're
> >> > clever like that.
> >> >
> >> > But we have gone to some lengths to avoid exposing things like
> >> > PAGE_SIZE and HZ in procfs, so it makes sense to take the same approach
> >> > to syscalls.
> >>
> >> Quite. All of the other memory-related APIs that I can think of
> >> require the user to express the info in bytes. (mlock(),
> >> remap_file_pages(), mmap(), mremap(), mprotect(), shmget(), and so
> >> on). Not doing the same for this interface is needlessly inconsistent.
> >> And while there will be the silly users you mention above, smart users
> >> will know how to do the right thing with a consistently designed
> >> interface.
> >
> > We can easily make F_GETPIPE_SZ return bytes, but I don't think passing
> > in bytes to F_SETPIPE_SZ makes a lot of sense. The pipe array must be a
> > power of 2 in pages. So the question is if that makes the API cleaner,
> > passing in number of pages but returning bytes? Or pass in bytes all
> > around, but have F_SETPIPE_SZ round to the nearest multiple of pow2 in
> > pages if need be. Then it would return a size at least what was passed
> > in, or error.
>
> I'd recommend this: Pass it in and out in bytes. Don't round to a
> power of 2. Require the user to know what they are doing. Give an
> error if the user doesn't supply a power-of-2 * page-size for
> F_SETPIPE_SZ.
> (Again, consider the case of architectures with
> switchable page sizes.)

But is there much point in erroring on an incorrect size? If the
application says "I need at least 120kb of space in there", kernel
returns "OK, you got 128kb". Would returning -1/EINVAL for that case
really make a better API? Doesn't seem like it to me.

-- 
Jens Axboe
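
For illustration only, the "pass in bytes, round up" behaviour argued for above could be sketched roughly as below. This is not the kernel code from the patch. The function name round_pipe_size() and its page_size parameter are hypothetical, the two-page minimum is an assumption taken from the "at least two buffers" remark quoted earlier, and overflow for very large requests is ignored.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not taken from the patch: round a requested pipe
 * size in bytes up to a power-of-2 number of pages and return the result
 * in bytes. Overflow for huge requests is not handled in this sketch.
 */
size_t round_pipe_size(size_t bytes, size_t page_size)
{
    assert(page_size > 0);

    /* Number of whole pages needed to hold the request, rounding up. */
    size_t pages = (bytes + page_size - 1) / page_size;

    /* Assumed minimum of two buffers, per the remark quoted in the thread. */
    size_t rounded = 2;

    /* Double until the power-of-2 page count covers the request. */
    while (rounded < pages)
        rounded *= 2;

    return rounded * page_size;
}
```

With 4096-byte pages, a request for 120kb (122880 bytes) needs 30 pages, which rounds up to 32 pages, so the caller would get 131072 bytes back, i.e. the "OK, you got 128kb" case. The caller never has to know the page size to make the request; it only learns the size actually granted.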