From: Oleg Nesterov <oleg@redhat.com>
To: Kent Overstreet <koverstreet@google.com>
Cc: linux-kernel@vger.kernel.org, Tejun Heo <tj@kernel.org>,
	Christoph Lameter <cl@linux-foundation.org>,
	Ingo Molnar <mingo@redhat.com>, Andi Kleen <andi@firstfloor.org>,
	Jens Axboe <axboe@kernel.dk>,
	"Nicholas A. Bellinger" <nab@linux-iscsi.org>
Subject: Re: [PATCH] Percpu tag allocator
Date: Wed, 12 Jun 2013 21:14:25 +0200
Message-ID: <20130612191425.GA7098@redhat.com>
In-Reply-To: <20130612175915.GB6151@google.com>

On 06/12, Kent Overstreet wrote:
>
> So we'd need at least an atomic counter, but a bitmap isn't really any
> more trouble and it lets us skip most of the percpu lists that are empty

Yes, yes, I understand.

> - which should make a real difference in scalability to huge nr_cpus.

But this is not obvious to me. I mean, I am not sure I understand why
all of this is "optimal". In particular, I do not really understand
"while (cpus_have_tags-- * TAG_CPU_SIZE > pool->nr_tags / 2)" in
steal_tags(), even if "the workload should not span more cpus than
nr_tags / 128" is true. I guess this connects to "we guarantee that
nr_tags / 2 can always be allocated" and we do not want to call
steal_tags() too often/otherwise, but cpus_have_tags * TAG_CPU_SIZE
can easily overestimate the number of free tags.
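
To make the overestimate concrete, here is a minimal user-space sketch.
It is not code from the patch: TAG_CPU_SIZE = 64 is an assumed value,
and estimated_free(), actual_free() and count_cpus_with_tags() are
hypothetical helpers that only model the arithmetic, not steal_tags()
itself.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value for illustration only; the patch defines its own. */
#define TAG_CPU_SIZE 64u

/* Upper bound: treats every cpu that has any tags as if it were full. */
static unsigned estimated_free(unsigned cpus_have_tags)
{
	return cpus_have_tags * TAG_CPU_SIZE;
}

/* Real number of free tags: the sum of the per-cpu freelist lengths. */
static unsigned actual_free(const unsigned *nr_free, size_t nr_cpus)
{
	unsigned sum = 0;

	for (size_t i = 0; i < nr_cpus; i++) {
		assert(nr_free[i] <= TAG_CPU_SIZE);
		sum += nr_free[i];
	}
	return sum;
}

/* Number of cpus with a non-empty freelist. */
static unsigned count_cpus_with_tags(const unsigned *nr_free, size_t nr_cpus)
{
	unsigned count = 0;

	for (size_t i = 0; i < nr_cpus; i++)
		count += nr_free[i] != 0;
	return count;
}
```

With five cpus that each hold a single free tag, estimated_free() gives
320 while actual_free() gives 5, so a comparison against nr_tags / 2
can be satisfied by the estimate long after the real count has dropped
below it.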

But I didn't read the patch carefully, and it is not that I think I
can suggest something better.

In short: please ignore ;)

> > > +enum {
> > > +	TAG_FAIL	= -1U,
> > > +	TAG_MAX		= TAG_FAIL - 1,
> > > +};
> >
> > This can probably go to .c, and it seems that TAG_MAX is not needed.
> > tag_init() can check "nr_tags >= TAG_FAIL" instead.
>
> Users need TAG_FAIL to check for allocation failure.

Ah, indeed, !__GFP_WAIT...

Hmm, but why gfp_t? Why not "bool atomic"?

Probably this is because you expect that most callers will have a
gfp mask anyway. It looks a bit strange, but I won't argue.
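
For reference, a minimal user-space sketch of the failure check a
non-sleeping caller would need. Everything here is hypothetical:
struct mock_pool, mock_pool_init(), mock_tag_alloc_nowait() and
submit_one() are stand-ins rather than the patch's API, and TAG_FAIL
is written as a macro instead of the quoted enum so the sketch stays
within strict ISO C.

```c
#include <assert.h>
#include <stdbool.h>

/* Same value as the quoted enum (-1U), written as a macro here. */
#define TAG_FAIL (~0u)

/* Hypothetical pool that hands out tags 0..nr_tags-1 and never frees. */
struct mock_pool {
	unsigned nr_tags;
	unsigned next;
};

/* Reject sizes that would collide with the sentinel, as suggested above. */
static int mock_pool_init(struct mock_pool *pool, unsigned nr_tags)
{
	if (nr_tags >= TAG_FAIL)
		return -1;
	pool->nr_tags = nr_tags;
	pool->next = 0;
	return 0;
}

/* Non-sleeping allocation: returns TAG_FAIL instead of waiting. */
static unsigned mock_tag_alloc_nowait(struct mock_pool *pool)
{
	if (pool->next >= pool->nr_tags)
		return TAG_FAIL;
	return pool->next++;
}

/* Caller side: an atomic-context user must compare against TAG_FAIL. */
static bool submit_one(struct mock_pool *pool, unsigned *tag_out)
{
	unsigned tag = mock_tag_alloc_nowait(pool);

	if (tag == TAG_FAIL)
		return false;
	*tag_out = tag;
	return true;
}
```

With a pool of two tags, the first two submit_one() calls succeed and
the third returns false, which is the case TAG_FAIL has to be visible
to users for.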

> > > +		if (nr_free) {
> > > +			memcpy(tags->freelist,
> > > +			       remote->freelist,
> > > +			       sizeof(unsigned) * nr_free);
> > > +			smp_mb();
> > > +			remote->nr_free = 0;
> > > +			tags->nr_free = nr_free;
> > > +			return true;
> > > +		} else {
> > > +			remote->nr_free = 0;
> > > +		}
> >
> > Both branches clear remote->nr_free.
>
> Yeah, but clearing it has to happen after the memcpy() and the smp_mb().

Yes, yes, we need mb() between memcpy() and "remote->nr_free = 0".

> I couldn't find a way to combine them that I liked, but if you've got
> any suggestions I'm all ears.

Please ignore. Somehow I missed the fact that we need to return or
continue, so we need "else" or another check.
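
A user-space sketch of why the two stores cannot simply be merged: the
successful branch returns, the empty branch moves on to the next
candidate. This is an assumption-laden illustration, not the patch:
struct cpu_freelist, FREELIST_MAX and steal_any() are hypothetical,
how nr_free is actually read from the remote list is not shown in the
quoted hunk, and atomic_thread_fence() only marks where smp_mb() sits
(the sketch is single-threaded).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define FREELIST_MAX 64u	/* assumed capacity, for illustration */

struct cpu_freelist {
	unsigned nr_free;
	unsigned freelist[FREELIST_MAX];
};

/* Try each remote list in turn; take the first non-empty one whole. */
static bool steal_any(struct cpu_freelist *tags,
		      struct cpu_freelist *remotes, size_t nr_remotes)
{
	for (size_t i = 0; i < nr_remotes; i++) {
		struct cpu_freelist *remote = &remotes[i];
		unsigned nr_free = remote->nr_free;

		if (nr_free) {
			memcpy(tags->freelist, remote->freelist,
			       sizeof(unsigned) * nr_free);
			/* Placeholder for smp_mb(): copy first, then clear. */
			atomic_thread_fence(memory_order_seq_cst);
			remote->nr_free = 0;
			tags->nr_free = nr_free;
			return true;
		} else {
			/* Mirrors the quoted else branch, then continues. */
			remote->nr_free = 0;
		}
	}
	return false;
}
```

Hoisting "remote->nr_free = 0" above the if would put it before the
memcpy() and the barrier, and hoisting it below would need a second
check to decide between returning and continuing.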

Oleg.

