Date: Mon, 4 Jan 2016 05:02:31 -0800
From: Davidlohr Bueso
To: Manfred Spraul
Cc: Andrew Morton, LKML, 1vier1@web.de, Ingo Molnar, felixh@informatik.uni-bremen.de, stable@vger.kernel.org
Subject: Re: [PATCH] ipc/sem.c: Fix complex_count vs. simple op race
Message-ID: <20160104130231.GA3013@linux-uzut.site>
In-Reply-To: <1451736291-8115-1-git-send-email-manfred@colorfullife.com>

On Sat, 02 Jan 2016, Manfred Spraul wrote:

>Commit 6d07b68ce16a ("ipc/sem.c: optimize sem_lock()") introduced a race:
>
>sem_lock has a fast path that allows parallel simple operations.
>There are two reasons why a simple operation cannot run in parallel:
>- a non-simple operation is ongoing (sma->sem_perm.lock held)
>- a complex operation is sleeping (sma->complex_count != 0)
>
>As both facts are stored independently, a thread can bypass the current
>checks by sleeping in the right positions. See below for more details
>(or kernel bugzilla 105651).
>
>The patch fixes that by creating one variable (complex_mode)
>that tracks both reasons why parallel operations are not possible.
>
>The patch also updates stale documentation regarding the locking.
>
>With regards to stable kernels:
>The patch is required for all kernels that include commit 6d07b68ce16a
>("ipc/sem.c: optimize sem_lock()") (3.10?)
>
>The alternative is to revert the patch that introduced the race.

I am just now catching up with this, but a quick thought is that we
probably want to keep 6d07b68ce16a, as waiting for all sem->lock
spinlocks to be released should be much worse for performance than
keeping track of the complex 'mode', especially if we have a large
array. Also, any idea what workload exposed this race?

Anyway, will take a closer look at the patch/issue.

Thanks,
Davidlohr
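
P.S. For anyone skimming the thread, below is a minimal userspace model
(C11 atomics) of the single-flag idea the commit message describes. It
is not the actual ipc/sem.c code: the names (sma_model,
simple_op_may_run, complex_op_enter) are illustrative, and the step
where a complex op waits for in-flight simple ops to drain is elided.
The point is that the fast path tests one authoritative flag instead of
two independently updated facts.

#include <assert.h>
#include <stdbool.h>
#include <stdatomic.h>

/* Toy stand-in for struct sem_array; only the merged flag is modeled. */
struct sma_model {
	atomic_bool complex_mode;	/* set before taking the global lock
					   or sleeping as a complex op */
	/* ... per-semaphore locks, complex_count, etc. ... */
};

/* Fast path: a simple op may run in parallel only while no complex
 * operation is pending or in flight.  One acquire-load replaces the
 * racy pair of independent checks (perm lock held, complex_count). */
static bool simple_op_may_run(struct sma_model *sma)
{
	return !atomic_load_explicit(&sma->complex_mode,
				     memory_order_acquire);
}

/* Slow path entry: a complex op announces itself before doing anything
 * else, closing the window where a simple op could slip past. */
static void complex_op_enter(struct sma_model *sma)
{
	atomic_store_explicit(&sma->complex_mode, true,
			      memory_order_release);
	/* ... here the real code would wait until every per-semaphore
	 * lock is free before proceeding ... */
}

int main(void)
{
	struct sma_model sma;

	atomic_init(&sma.complex_mode, false);

	assert(simple_op_may_run(&sma));	/* fast path available */
	complex_op_enter(&sma);
	assert(!simple_op_may_run(&sma));	/* simple ops now fall back */
	return 0;
}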