From: Davidlohr Bueso <dave@stgolabs.net>
To: "Eric W. Biederman" <ebiederm@xmission.com>, manfred@colorfullife.com
Cc: Linux Containers <containers@lists.linux-foundation.org>,
linux-kernel@vger.kernel.org, linux-api@vger.kernel.org,
khlebnikov@yandex-team.ru, prakash.sangappa@oracle.com,
luto@kernel.org, akpm@linux-foundation.org, oleg@redhat.com,
serge.hallyn@ubuntu.com, esyr@redhat.com, jannh@google.com,
linux-security-module@vger.kernel.org,
Pavel Emelyanov <xemul@openvz.org>,
Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com>
Subject: Re: [REVIEW][PATCH 11/11] ipc/sem: Fix semctl(..., GETPID, ...) between pid namespaces
Date: Fri, 30 Mar 2018 12:09:51 -0700
Message-ID: <20180330190951.nfcdwuzp42bl2lfy@linux-n805>
In-Reply-To: <20180329005209.fnzr3hzvyr4oy3wi@linux-n805>
On Wed, 28 Mar 2018, Davidlohr Bueso wrote:
>On Fri, 23 Mar 2018, Eric W. Biederman wrote:
>
>>Today the last process to update a semaphore is remembered and
>>reported in the pid namespace of that process. If there are processes
>>in any other pid namespace querying that process id with GETPID the
>>result will be unusable nonsense as it does not make any
>>sense in your own pid namespace.
>
>Yeah that sounds pretty wrong.
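
To make that concrete, here is a minimal userspace sketch of the GETPID
query in question (my own illustration, not code from the patch; the
cross-namespace part only lives in the comments):

#include <stdio.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

int main(void)
{
	struct sembuf op = { .sem_num = 0, .sem_op = 1, .sem_flg = 0 };
	int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);

	if (semid < 0)
		return 1;

	/* Imagine this semop() is issued by a task living in a child
	 * pid namespace (e.g. one cloned with CLONE_NEWPID). */
	semop(semid, &op, 1);

	/* GETPID reports the pid of the last task that changed the
	 * semaphore; without the fix a caller in another pid namespace
	 * gets a number that means nothing in its own namespace. */
	printf("last updater pid: %d\n", semctl(semid, 0, GETPID));

	semctl(semid, 0, IPC_RMID);
	return 0;
}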
>
>>
>>Due to ipc_update_pid I don't think you will be able to get System V
>>ipc semaphores into a troublesome cache line ping-pong. Using struct
>>pids from separate processes is not a problem because they do not share
>>a cache line. Using struct pid from different threads of the same
>>process is unlikely to be a problem as the reference count update
>>can be avoided.
>>
>>Further, Linux futexes are a much better tool for the job of mutual
>>exclusion between processes than System V semaphores. So I expect
>>programs that are performance limited by their interprocess mutual
>>exclusion primitive will be using futexes.
>
>You would be wrong. There are plenty of real workloads out there
>that do not use futexes and do care about performance; in the end
>futexes are only good for the uncontended case, and they can also
>destroy NUMA boxes if you consider the global hash table. Experience
>has shown me that sysvipc sems are still quite widely used.
>
>>
>>So while it is possible that enhancing the storage of the last
>>process of a System V semaphore from an integer to a struct pid
>>will cause a performance regression because of the effect
>>of frequently updating the pid reference count, I don't expect
>>that to happen in practice.
>
>How's that? Now, thanks to ipc_update_pid(), for each semop the user
>passes, perform_atomic_semop() will do two atomic updates in the
>cases where there are multiple processes updating the sem. This is
>not uncommon.
>
>Could you please provide some numbers.
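For reference, the refcount churn comes from a helper along these lines
(a rough sketch of what I understand ipc_update_pid() to do, not
necessarily a verbatim copy of what the series adds):

static inline void ipc_update_pid(struct pid **pos, struct pid *pid)
{
	struct pid *old = *pos;

	if (pid != old) {
		*pos = get_pid(pid);	/* atomic refcount increment */
		put_pid(old);		/* atomic refcount decrement */
	}
}

So the same task updating repeatedly takes no extra atomics (pid == old),
but distinct processes ping-ponging on one semaphore pay the
get_pid()/put_pid() pair on every update.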
I ran this on a 40-core (no HT) Westmere with two benchmarks. The first
is Manfred's sysvsem lockunlock[1] program, which uses _processes_ to,
well, lock and unlock the semaphore. The options are a little
unconventional: to keep the critical region small and the lock+unlock
frequency high, I set busy_in=busy_out=10. Similarly, to get the
worst-case scenario and have everyone update the same semaphore, a single
semaphore is used. Here are the results (pretty low stddev from run to
run) for 100,000 lock+unlock iterations.
- 1 proc:
   * vanilla
	total execution time: 0.110638 seconds for 100000 loops
   * dirty
	total execution time: 0.120144 seconds for 100000 loops

- 2 proc:
   * vanilla
	total execution time: 0.379756 seconds for 100000 loops
   * dirty
	total execution time: 0.477778 seconds for 100000 loops

- 4 proc:
   * vanilla
	total execution time: 6.749710 seconds for 100000 loops
   * dirty
	total execution time: 4.651872 seconds for 100000 loops

- 8 proc:
   * vanilla
	total execution time: 5.558404 seconds for 100000 loops
   * dirty
	total execution time: 7.143329 seconds for 100000 loops

- 16 proc:
   * vanilla
	total execution time: 9.016398 seconds for 100000 loops
   * dirty
	total execution time: 9.412055 seconds for 100000 loops

- 32 proc:
   * vanilla
	total execution time: 9.694451 seconds for 100000 loops
   * dirty
	total execution time: 9.990451 seconds for 100000 loops

- 64 proc:
   * vanilla
	total execution time: 9.844984 seconds for 100032 loops
   * dirty
	total execution time: 10.016464 seconds for 100032 loops
Lower task counts show pretty massive performance hits of ~9%, ~25%
and ~30% for single, two and four/eight processes. As more processes
are added, I guess the overhead tends to disappear since, for one,
there is a lot more lock contention going on to begin with.
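
(Deriving those percentages from the runtimes above: 0.120144 / 0.110638
is roughly 1.09, 0.477778 / 0.379756 roughly 1.26, and 7.143329 / 5.558404
roughly 1.29 -- the last assuming the ~30% figure refers to the 8-proc
case, since the 4-proc run actually came out faster with the patch.)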
The second workload I ran this patch through was Chris Mason's
sem-scalebench[2] program, which uses _threads_ for the sysvsem option
(this benchmark is more about semaphores as a concept than about sysvsem
specifically). Dealing with a single semaphore and increasing thread
counts, we get:
sembench-sem
                                  vanilla                  dirty
Hmean     sembench-sem-2      286272.00 (   0.00%)     288232.00 (   0.68%)
Hmean     sembench-sem-8      510966.00 (   0.00%)     494375.00 (  -3.25%)
Hmean     sembench-sem-12     435753.00 (   0.00%)     465328.00 (   6.79%)
Hmean     sembench-sem-21     448144.00 (   0.00%)     462091.00 (   3.11%)
Hmean     sembench-sem-30     479519.00 (   0.00%)     471295.00 (  -1.72%)
Hmean     sembench-sem-48     533270.00 (   0.00%)     542525.00 (   1.74%)
Hmean     sembench-sem-79     510218.00 (   0.00%)     528392.00 (   3.56%)
Unsurprisingly, the thread case shows no overhead (and yes, it is even
better at times, but that is still noise). Similarly, when completely
abusing the system and running 64*NCPUS threads, there is pretty much
no difference:
              vanilla        dirty
User          1865.99      1819.75
System       35080.97     35396.34
Elapsed       3602.03      3560.50
So at least on a large box this patch hurts the cases where there is low
to medium cpu usage (no more than ~8 processes on a 40-core box) in a
non-trivial way. For more processes it doesn't matter. We can also
confirm that the thread case is unaffected. While I'm not happy about
the ~30% regression, I guess we can live with this.
Manfred, any thoughts?
Thanks
Davidlohr
[1] https://github.com/manfred-colorfu/ipcscale/blob/master/sem-lockunlock.c
[2] https://github.com/davidlohr/sembench-ng/blob/master/sembench.c