From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752894AbbBRRva (ORCPT );
	Wed, 18 Feb 2015 12:51:30 -0500
Received: from mail-wi0-f171.google.com ([209.85.212.171]:54289 "EHLO
	mail-wi0-f171.google.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1751952AbbBRRv2 (ORCPT );
	Wed, 18 Feb 2015 12:51:28 -0500
Date: Wed, 18 Feb 2015 18:51:23 +0100
From: Ingo Molnar
To: Jason Baron
Cc: peterz@infradead.org, mingo@redhat.com, viro@zeniv.linux.org.uk,
	akpm@linux-foundation.org, normalperson@yhbt.net,
	davidel@xmailserver.org, mtk.manpages@gmail.com,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-api@vger.kernel.org, Thomas Gleixner, Linus Torvalds,
	Peter Zijlstra
Subject: Re: [PATCH v2 2/2] epoll: introduce EPOLLEXCLUSIVE and EPOLLROUNDROBIN
Message-ID: <20150218175123.GA31878@gmail.com>
References: <7956874bfdc7403f37afe8a75e50c24221039bd2.1424200151.git.jbaron@akamai.com>
	<20150218080740.GA10199@gmail.com>
	<54E4B2D0.8020706@akamai.com>
	<20150218163300.GA28007@gmail.com>
	<54E4CE14.5010708@akamai.com>
	<20150218174533.GB31566@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150218174533.GB31566@gmail.com>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

* Ingo Molnar wrote:

> > [...] However, I think the userspace API change is less
> > clear since epoll_wait() doesn't currently have an
> > 'input' events argument as epoll_ctl() does.
>
> ... but the change would be a bit clearer and somewhat
> more flexible: LIFO or FIFO queueing, right?
>
> But having the queueing model as part of the epoll
> context is a legitimate approach as well.

Btw., there's another optimization that the networking code
already does when processing incoming packets: waking up a
thread on the local CPU, where the wakeup is running.
Doing the same on epoll would have real scalability
advantages where incoming events are IRQ driven and are
distributed amongst multiple CPUs.

Where events are task driven the scheduler will already try
to pair up waker and wakee so it might not show up in
measurements that markedly.

Thanks,

	Ingo
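
For reference, the API asymmetry mentioned in the quoted text can be
seen with a small userspace sketch. It is only an illustration under
stated assumptions: a Linux system with eventfd support, and a helper
name, epoll_roundtrip(), that is hypothetical and not part of the
patch set. It shows that the interest mask goes in through
epoll_ctl(), while epoll_wait() takes only an output array for the
kernel to fill in.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): registers an eventfd with
 * the given interest mask, bumps its counter so it becomes readable,
 * and does one non-blocking epoll_wait().
 *
 * Returns the epoll_wait() result, or -1 on setup failure. On a return
 * of 1, *reported holds the event mask the kernel handed back.
 */
static int epoll_roundtrip(uint32_t interest, uint32_t *reported)
{
	struct epoll_event in = { .events = interest };
	struct epoll_event out = { 0 };
	uint64_t one = 1;
	int ret = -1;
	int epfd, efd;

	assert(reported != NULL);
	*reported = 0;

	epfd = epoll_create1(0);
	if (epfd < 0)
		return -1;

	efd = eventfd(0, EFD_NONBLOCK);
	if (efd < 0)
		goto out_close_ep;

	/* epoll_ctl() has an input argument: the interest mask in 'in'. */
	in.data.fd = efd;
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, efd, &in) < 0)
		goto out_close_efd;

	/* Bump the eventfd counter so the descriptor becomes readable. */
	if (write(efd, &one, sizeof(one)) != (ssize_t)sizeof(one))
		goto out_close_efd;

	/* epoll_wait() has outputs only: an array the kernel fills in. */
	ret = epoll_wait(epfd, &out, 1, 0);
	if (ret == 1)
		*reported = out.events;

out_close_efd:
	close(efd);
out_close_ep:
	close(epfd);
	return ret;
}
```

Calling epoll_roundtrip(EPOLLIN, &ev) should report the eventfd as
readable. The point of the sketch is that any per-wait policy, such as
LIFO vs. FIFO queueing, has no existing argument to travel in on the
epoll_wait() side, which is why the thread weighs an epoll_ctl() flag
against an API change.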