From: Jason Baron
Date: Wed, 25 Feb 2015 11:27:04 -0500
To: Ingo Molnar
CC: peterz@infradead.org, mingo@redhat.com, viro@zeniv.linux.org.uk,
    akpm@linux-foundation.org, normalperson@yhbt.net,
    davidel@xmailserver.org, mtk.manpages@gmail.com, luto@amacapital.net,
    linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    linux-api@vger.kernel.org, Linus Torvalds, Alexander Viro
Subject: Re: [PATCH v3 0/3] epoll: introduce round robin wakeup mode
Message-ID: <54EDF7D8.60201@akamai.com>
In-Reply-To: <20150225073814.GA14558@gmail.com>
References: <20150225073814.GA14558@gmail.com>

On 02/25/2015 02:38 AM, Ingo Molnar wrote:
> * Jason Baron wrote:
>
>> Hi,
>>
>> When we are sharing a wakeup source among multiple epoll
>> fds, we end up with thundering herd wakeups, since there
>> is currently no way to add to the wakeup source
>> exclusively. This series introduces a new EPOLL_ROTATE
>> flag to allow for round robin exclusive wakeups.
>>
>> I believe this patch series addresses the two main
>> concerns that were raised in prior postings. Namely, that
>> it affected code (and potentially performance) of the
>> core kernel wakeup functions, even in cases where it was
>> not strictly needed, and that it could lead to wakeup
>> starvation (since we are no longer waking up all
>> waiters). It does so by adding an extra layer of
>> indirection, whereby waiters are attached to a 'pseudo'
>> epoll fd, which in turn is attached directly to the
>> wakeup source.
>
>> sched/wait: add __wake_up_rotate()
>>
>> include/linux/wait.h | 1 +
>> kernel/sched/wait.c | 27 ++++++++++++++++++++++
>
> So the scheduler bits are looking good to me in principle,
> because they just add a new round-robin-rotating wakeup
> variant and don't disturb the others.
>
> Is there consensus on the epoll ABI changes?

I'm not sure there is a clear consensus on this change, but I'm
hoping that I've addressed the outstanding concerns in this latest
version. I also think the addition of a way to do a 'wakeup policy'
here will open up other 'policies', such as taking into account cpu
affinity as you suggested. So, I think it's potentially an interesting
direction for this code.

> With Davide Libenzi inactive, eventpoll appears to be without a
> dedicated maintainer since 2011 or so. Is there anyone who
> knows the code and its usages in detail and does final ABI
> decisions on eventpoll - Andrew, Al or Linus?
>

Generally, Andrew and Al do more 'final' reviews here, and a lot of
others on lkml are always very helpful in looking at this code.
However, it's not always clear, at least to me, who I should pester.

Thanks,

-Jason
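
For readers following the thread, the difference between waking every waiter and a round robin exclusive wakeup can be sketched with a small user-space model. This is only an illustration under stated assumptions: the `WaitQueue` class, its `wake_all` and `wake_rotate` methods, the `simulate` helper, and the `epfdN` names are all hypothetical, and none of it is the code from the patch series or the kernel's `__wake_up_rotate()`.

```python
from collections import deque


class WaitQueue:
    """Toy user-space model of a wait queue (hypothetical, not kernel code)."""

    def __init__(self, waiters):
        self.waiters = deque(waiters)

    def wake_all(self):
        # Thundering herd: a single event wakes every waiter.
        return list(self.waiters)

    def wake_rotate(self):
        # Round robin: wake only the head waiter, then move it to the
        # tail so the next event is delivered to a different waiter.
        if not self.waiters:
            return []
        woken = self.waiters[0]
        self.waiters.rotate(-1)
        return [woken]


def simulate(policy, n_waiters=3, n_events=6):
    """Count how often each waiter is woken over n_events events."""
    queue = WaitQueue(f"epfd{i}" for i in range(n_waiters))
    counts = {w: 0 for w in queue.waiters}
    for _ in range(n_events):
        for w in getattr(queue, policy)():
            counts[w] += 1
    return counts


if __name__ == "__main__":
    print("wake_all:   ", simulate("wake_all"))
    print("wake_rotate:", simulate("wake_rotate"))
```

With three waiters and six events, the wake-all policy produces 18 wakeups in this model (every waiter woken six times), while the rotating policy produces six wakeups spread evenly, two per waiter. Spreading them evenly is how rotation avoids starving any one waiter.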