public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Expose SHM_HUGETLB in shmctl(id, IPC_STAT, ...)
@ 2005-11-09 18:46 Arun Sharma
  2005-11-10  6:22 ` Andrew Morton
  0 siblings, 1 reply; 11+ messages in thread
From: Arun Sharma @ 2005-11-09 18:46 UTC (permalink / raw)
  To: akpm; +Cc: rohit.seth, linux-kernel

Allow shmctl to find out if a shmid corresponds to a HUGETLB segment

Signed-off-by: Arun Sharma <arun.sharma@google.com>
Acked-by: Rohit Seth <rohit.seth@intel.com>

--- a/ipc/shm.c	Tue Nov  8 20:58:38 2005
+++ b/ipc/shm.c	Wed Nov  9 10:26:37 2005
@@ -197,7 +197,7 @@
 		return -ENOMEM;
 
 	shp->shm_perm.key = key;
-	shp->shm_flags = (shmflg & S_IRWXUGO);
+	shp->shm_flags = (shmflg & (S_IRWXUGO | SHM_HUGETLB));
 	shp->mlock_user = NULL;
 
 	shp->shm_perm.security = NULL;

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Expose SHM_HUGETLB in shmctl(id, IPC_STAT, ...)
  2005-11-09 18:46 [PATCH] Expose SHM_HUGETLB in shmctl(id, IPC_STAT, ...) Arun Sharma
@ 2005-11-10  6:22 ` Andrew Morton
  2005-11-10 18:35   ` Arun Sharma
  0 siblings, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2005-11-10  6:22 UTC (permalink / raw)
  To: Arun Sharma; +Cc: rohit.seth, linux-kernel

Arun Sharma <arun.sharma@google.com> wrote:
>
> Allow shmctl to find out if a shmid corresponds to a HUGETLB segment
> 
>  Signed-off-by: Arun Sharma <arun.sharma@google.com>
>  Acked-by: Rohit Seth <rohit.seth@intel.com>
> 
>  --- a/ipc/shm.c	Tue Nov  8 20:58:38 2005
>  +++ b/ipc/shm.c	Wed Nov  9 10:26:37 2005
>  @@ -197,7 +197,7 @@
>   		return -ENOMEM;
>   
>   	shp->shm_perm.key = key;
>  -	shp->shm_flags = (shmflg & S_IRWXUGO);
>  +	shp->shm_flags = (shmflg & (S_IRWXUGO | SHM_HUGETLB));
>   	shp->mlock_user = NULL;
>   
>   	shp->shm_perm.security = NULL;

I dunno.  The manpage says:

       The highlighted fields in the member shm_perm can be set:

           struct ipc_perm {
       ...
               ushort mode;  /* lower 9 bits of access modes */
       ...
           };

So if an application used to do:

	if (perm.mode == 0666)

it will now break, because we've gone and set bit 9 on hugetlb segments.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Expose SHM_HUGETLB in shmctl(id, IPC_STAT, ...)
  2005-11-10  6:22 ` Andrew Morton
@ 2005-11-10 18:35   ` Arun Sharma
  2005-11-10 19:59     ` Andrew Morton
  0 siblings, 1 reply; 11+ messages in thread
From: Arun Sharma @ 2005-11-10 18:35 UTC (permalink / raw)
  To: Andrew Morton; +Cc: rohit.seth, linux-kernel

Andrew Morton wrote:

>> +	shp->shm_flags = (shmflg & (S_IRWXUGO | SHM_HUGETLB));
> [...]
> I dunno.  The manpage says:
> 
>        The highlighted fields in the member shm_perm can be set:
> 
>            struct ipc_perm {
>        ...
>                ushort mode;  /* lower 9 bits of access modes */
>        ...
>            };
> 
> So if an application used to do:
> 
> 	if (perm.mode == 0666)
> 
> it will now break, because we've gone and set bit 9 on hugetlb segments.

The man page on my system says:

               unsigned short mode;  /* Permissions + SHM_DEST and
                                         SHM_LOCKED flags */

I looked for a precendent before sending the patch and thought that 
SHM_LOCKED was a good one.

	-Arun

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Expose SHM_HUGETLB in shmctl(id, IPC_STAT, ...)
  2005-11-10 18:35   ` Arun Sharma
@ 2005-11-10 19:59     ` Andrew Morton
  2005-11-10 21:41       ` Arun Sharma
  0 siblings, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2005-11-10 19:59 UTC (permalink / raw)
  To: Arun Sharma; +Cc: rohit.seth, linux-kernel

Arun Sharma <arun.sharma@google.com> wrote:
>
> Andrew Morton wrote:
> 
> >> +	shp->shm_flags = (shmflg & (S_IRWXUGO | SHM_HUGETLB));
> > [...]
> > I dunno.  The manpage says:
> > 
> >        The highlighted fields in the member shm_perm can be set:
> > 
> >            struct ipc_perm {
> >        ...
> >                ushort mode;  /* lower 9 bits of access modes */
> >        ...
> >            };
> > 
> > So if an application used to do:
> > 
> > 	if (perm.mode == 0666)
> > 
> > it will now break, because we've gone and set bit 9 on hugetlb segments.
> 
> The man page on my system says:
> 
>                unsigned short mode;  /* Permissions + SHM_DEST and
>                                          SHM_LOCKED flags */
> 
> I looked for a precendent before sending the patch and thought that 
> SHM_LOCKED was a good one.
> 

hm, OK.   But an app could still do

	if (mode == 0666|SHM_LOCKED)


How important is this feature?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Expose SHM_HUGETLB in shmctl(id, IPC_STAT, ...)
  2005-11-10 19:59     ` Andrew Morton
@ 2005-11-10 21:41       ` Arun Sharma
  2005-11-10 22:06         ` Andrew Morton
  0 siblings, 1 reply; 11+ messages in thread
From: Arun Sharma @ 2005-11-10 21:41 UTC (permalink / raw)
  To: Andrew Morton; +Cc: rohit.seth, linux-kernel

Andrew Morton wrote:

>>The man page on my system says:
>>
>>               unsigned short mode;  /* Permissions + SHM_DEST and
>>                                         SHM_LOCKED flags */
>>
>>I looked for a precendent before sending the patch and thought that 
>>SHM_LOCKED was a good one.
>>
> 
> 
> hm, OK.   But an app could still do
> 
> 	if (mode == 0666|SHM_LOCKED)
> 

I'd argue that the app should really be doing (perm.mode & 0777 = 0666)

> 
> How important is this feature?

Without this feature, an application has no way to figure out if a given 
segment is hugetlb or not. Applications need to know this to be able to 
handle alignment issues properly.

Also, if the flag is exported via ipcs, the system administrator would 
have a better idea about how the hugetlb pages she configured on the 
system are getting used.

	-Arun

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Expose SHM_HUGETLB in shmctl(id, IPC_STAT, ...)
  2005-11-10 21:41       ` Arun Sharma
@ 2005-11-10 22:06         ` Andrew Morton
  2005-11-11  2:49           ` Arun Sharma
  0 siblings, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2005-11-10 22:06 UTC (permalink / raw)
  To: Arun Sharma; +Cc: rohit.seth, linux-kernel

Arun Sharma <arun.sharma@google.com> wrote:
>
> Andrew Morton wrote:
> 
> >>The man page on my system says:
> >>
> >>               unsigned short mode;  /* Permissions + SHM_DEST and
> >>                                         SHM_LOCKED flags */
> >>
> >>I looked for a precendent before sending the patch and thought that 
> >>SHM_LOCKED was a good one.
> >>
> > 
> > 
> > hm, OK.   But an app could still do
> > 
> > 	if (mode == 0666|SHM_LOCKED)
> > 
> 
> I'd argue that the app should really be doing (perm.mode & 0777 = 0666)

I find your faith in your fellow programmers to be trluy inspirational ;)

> > 
> > How important is this feature?
> 
> Without this feature, an application has no way to figure out if a given 
> segment is hugetlb or not. Applications need to know this to be able to 
> handle alignment issues properly.
> 
> Also, if the flag is exported via ipcs, the system administrator would 
> have a better idea about how the hugetlb pages she configured on the 
> system are getting used.
> 

I'd suggest that any API which allows us to query the hugeness of a piece
of memory should also work for mmap(hugetld_fd...).  IOW: this capability
shouldn't be restricted to sysv shm areas.

I can't think of any syscall which can be sanely overloaded for this.

The most general way of exposing this info would be to export it in
/proc/pid/maps in some back-compatible manner.

And once I've lost the "oh oh we'd need to write 100 lines of userspace
code for that" bunfight I'd say add a new syscall sys_query_pagesize(void
*addr) which returns the size of the page which backs `addr'.

But then again, if it was possible to write 100 lines of userspace code, we
wouldn't need this capability at all.  I bet if the userspace guys tried a
bit harder they'd work out a way of teaching their applications to remember
what they did.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Expose SHM_HUGETLB in shmctl(id, IPC_STAT, ...)
  2005-11-10 22:06         ` Andrew Morton
@ 2005-11-11  2:49           ` Arun Sharma
  2005-11-11  3:12             ` Andrew Morton
  2005-11-11  7:30             ` William Lee Irwin III
  0 siblings, 2 replies; 11+ messages in thread
From: Arun Sharma @ 2005-11-11  2:49 UTC (permalink / raw)
  To: Andrew Morton; +Cc: rohit.seth, linux-kernel

Andrew Morton wrote:

>>>How important is this feature?
>>
>>Without this feature, an application has no way to figure out if a given 
>>segment is hugetlb or not. Applications need to know this to be able to 
>>handle alignment issues properly.
>>
>>Also, if the flag is exported via ipcs, the system administrator would 
>>have a better idea about how the hugetlb pages she configured on the 
>>system are getting used.
>>
> 
> 
> I'd suggest that any API which allows us to query the hugeness of a piece
> of memory should also work for mmap(hugetld_fd...).  IOW: this capability
> shouldn't be restricted to sysv shm areas.

The capability I was talking about was the ability to figure out where 
the configured hugetlb pages are going (vs is this a hugetlb page?).

I suspect that one can use lsof+/proc/pid/maps and look for hugetlbfs 
mount points to gather that data. But for shared memory hugepages, we 
don't have a way.

> But then again, if it was possible to write 100 lines of userspace code, we
> wouldn't need this capability at all.  I bet if the userspace guys tried a
> bit harder they'd work out a way of teaching their applications to remember
> what they did.

Why do we need shmctl(IPC_STAT) then? Applications should remember what 
they did :)

	-Arun

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Expose SHM_HUGETLB in shmctl(id, IPC_STAT, ...)
  2005-11-11  2:49           ` Arun Sharma
@ 2005-11-11  3:12             ` Andrew Morton
  2005-11-12  7:19               ` Michael Kerrisk
  2005-11-11  7:30             ` William Lee Irwin III
  1 sibling, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2005-11-11  3:12 UTC (permalink / raw)
  To: Arun Sharma; +Cc: rohit.seth, linux-kernel

Arun Sharma <arun.sharma@google.com> wrote:
>
>  Andrew Morton wrote:
> 
>  >>>How important is this feature?
>  >>
>  >>Without this feature, an application has no way to figure out if a given 
>  >>segment is hugetlb or not. Applications need to know this to be able to 
>  >>handle alignment issues properly.
>  >>
>  >>Also, if the flag is exported via ipcs, the system administrator would 
>  >>have a better idea about how the hugetlb pages she configured on the 
>  >>system are getting used.
>  >>
>  > 
>  > 
>  > I'd suggest that any API which allows us to query the hugeness of a piece
>  > of memory should also work for mmap(hugetld_fd...).  IOW: this capability
>  > shouldn't be restricted to sysv shm areas.
> 
>  The capability I was talking about was the ability to figure out where 
>  the configured hugetlb pages are going (vs is this a hugetlb page?).

Well, please figure out a way which has less risk of breaking userspace.

Bear in mind that the sort of apps we're talking about here are
dubiously-written monsters with long and costly upgrade cycles and we tend
to not get any reports until many many months after we made a kernel
change.  It's very costly all round and we need to be cautious.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Expose SHM_HUGETLB in shmctl(id, IPC_STAT, ...)
  2005-11-11  2:49           ` Arun Sharma
  2005-11-11  3:12             ` Andrew Morton
@ 2005-11-11  7:30             ` William Lee Irwin III
  1 sibling, 0 replies; 11+ messages in thread
From: William Lee Irwin III @ 2005-11-11  7:30 UTC (permalink / raw)
  To: Arun Sharma; +Cc: Andrew Morton, rohit.seth, linux-kernel

Andrew Morton wrote:
>> But then again, if it was possible to write 100 lines of userspace code, we
>> wouldn't need this capability at all.  I bet if the userspace guys tried a
>> bit harder they'd work out a way of teaching their applications to remember
>> what they did.

On Thu, Nov 10, 2005 at 06:49:56PM -0800, Arun Sharma wrote:
> Why do we need shmctl(IPC_STAT) then? Applications should remember what 
> they did :)

atime, dtime, ctime, lpid, and nattch are not "rememberable" this way.


-- wli

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Expose SHM_HUGETLB in shmctl(id, IPC_STAT, ...)
  2005-11-11  3:12             ` Andrew Morton
@ 2005-11-12  7:19               ` Michael Kerrisk
  2005-11-12  7:38                 ` Andrew Morton
  0 siblings, 1 reply; 11+ messages in thread
From: Michael Kerrisk @ 2005-11-12  7:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: rohit.seth, linux-kernel, Arun Sharma, mtk-lkml, mtk-manpages

> Arun Sharma <arun.sharma@google.com> wrote:
> >
> >  Andrew Morton wrote:
> > 
> >  >>>How important is this feature?
> >  >>
> >  >>Without this feature, an application has no way to figure out if a given 
> >  >>segment is hugetlb or not. Applications need to know this to be able to 
> >  >>handle alignment issues properly.
> >  >>
> >  >>Also, if the flag is exported via ipcs, the system administrator would 
> >  >>have a better idea about how the hugetlb pages she configured on the 
> >  >>system are getting used.
> >  >>
> >  > 
> >  > 
> >  > I'd suggest that any API which allows us to query the hugeness of a piece
> >  > of memory should also work for mmap(hugetld_fd...).  IOW: this capability
> >  > shouldn't be restricted to sysv shm areas.
> > 
> >  The capability I was talking about was the ability to figure out where 
> >  the configured hugetlb pages are going (vs is this a hugetlb page?).
> 
> Well, please figure out a way which has less risk of breaking userspace.
> 
> Bear in mind that the sort of apps we're talking about here are
> dubiously-written monsters with long and costly upgrade cycles and we tend
> to not get any reports until many many months after we made a kernel
> change.  It's very costly all round and we need to be cautious.

Andrew,

I am late to this discussion, but for what it's worth, a 
portable application really must use checks of the like 
(perm.mode & 0777 = 0666), because many implementations 
define additional read-only flags for perm.mode:

Tru64 5.1
#define	SHM_LOCKED	01000	/* segment locked in memory */
#define	SHM_REMOVED	02000	/* already removed */

Linux
#define	SHM_DEST	01000	/* segment will be destroyed on last detach */
#define SHM_LOCKED      02000   /* segment will not be swapped */

HP-UX 11
#  define SHM_CLEAR    01000	/* clear segment on next attach */
#  define SHM_DEST     02000	/* destroy segment when # attached = 0 */
#  define SHM_NOSWAP  010000	/* region for shared memory is memory locked */
			      /* (or should be when the region is allocated) */

AIX 5.1
#define	SHM_DEST	02000	/* destroy segment when # attached = 0 */

So the chances are probably good that portable applications 
wouldn't break with Arun's proposal.  Of course applications 
that were written just for Linux, and don't take care, might 
also be at risk, but I think the risk is probably low.  
A check of the form:

if (mode == 0666|SHM_LOCKED)

instead of:

if (mode & SHM_LOCKED)

is very obtuse.

This might not change your point of view (there is a theoretical risk 
after all), but I thought it worth mentioning.

Cheers,

Michael

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Expose SHM_HUGETLB in shmctl(id, IPC_STAT, ...)
  2005-11-12  7:19               ` Michael Kerrisk
@ 2005-11-12  7:38                 ` Andrew Morton
  0 siblings, 0 replies; 11+ messages in thread
From: Andrew Morton @ 2005-11-12  7:38 UTC (permalink / raw)
  To: Michael Kerrisk
  Cc: rohit.seth, linux-kernel, arun.sharma, mtk-lkml, mtk-manpages

"Michael Kerrisk" <michael.kerrisk@gmx.net> wrote:
>
> > Bear in mind that the sort of apps we're talking about here are
> > dubiously-written monsters with long and costly upgrade cycles and we tend
> > to not get any reports until many many months after we made a kernel
> > change.  It's very costly all round and we need to be cautious.
> 
> Andrew,
> 
> I am late to this discussion, but for what it's worth, a 
> portable application really must use checks of the like 
> (perm.mode & 0777 = 0666), because many implementations 
> define additional read-only flags for perm.mode:
> 
> Tru64 5.1
> #define	SHM_LOCKED	01000	/* segment locked in memory */
> #define	SHM_REMOVED	02000	/* already removed */
> 
> Linux
> #define	SHM_DEST	01000	/* segment will be destroyed on last detach */
> #define SHM_LOCKED      02000   /* segment will not be swapped */
> 
> HP-UX 11
> #  define SHM_CLEAR    01000	/* clear segment on next attach */
> #  define SHM_DEST     02000	/* destroy segment when # attached = 0 */
> #  define SHM_NOSWAP  010000	/* region for shared memory is memory locked */
> 			      /* (or should be when the region is allocated) */
> 
> AIX 5.1
> #define	SHM_DEST	02000	/* destroy segment when # attached = 0 */
> 
> So the chances are probably good that portable applications 
> wouldn't break with Arun's proposal.

The chances of breakage I agree are very low.  But non-zero.  I'd still
like us to find a way which is completely safe.

>  Of course applications 
> that were written just for Linux, and don't take care, might 
> also be at risk, but I think the risk is probably low.  
> A check of the form:
> 
> if (mode == 0666|SHM_LOCKED)
> 
> instead of:
> 
> if (mode & SHM_LOCKED)
> 
> is very obtuse.

Yes, but

	if ((mode & ~(SHM_LOCKED|SHM_REMOVED)) == 0666)

is pretty perverse, but more likely.  Stranger things have
been seen ;)

> This might not change your point of view (there is a theoretical risk 
> after all), but I thought it worth mentioning.

Thanks.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2005-11-12  7:38 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-11-09 18:46 [PATCH] Expose SHM_HUGETLB in shmctl(id, IPC_STAT, ...) Arun Sharma
2005-11-10  6:22 ` Andrew Morton
2005-11-10 18:35   ` Arun Sharma
2005-11-10 19:59     ` Andrew Morton
2005-11-10 21:41       ` Arun Sharma
2005-11-10 22:06         ` Andrew Morton
2005-11-11  2:49           ` Arun Sharma
2005-11-11  3:12             ` Andrew Morton
2005-11-12  7:19               ` Michael Kerrisk
2005-11-12  7:38                 ` Andrew Morton
2005-11-11  7:30             ` William Lee Irwin III

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox