From: NeilBrown
Subject: Re: [patch 2/2 v3]raid5: create multiple threads to handle stripes
Date: Thu, 28 Mar 2013 17:47:44 +1100
Message-ID: <20130328174744.79b04058@notabene.brown>
References: <20120809085808.GB30111@kernel.org>
 <20130307073123.GA5819@kernel.org>
 <20130312123927.16e49ea9@notabene.brown>
 <513FCBE3.90205@hardwarefreak.com>
In-Reply-To: <513FCBE3.90205@hardwarefreak.com>
To: stan@hardwarefreak.com
Cc: Shaohua Li, linux-raid@vger.kernel.org, dan.j.williams@gmail.com
List-Id: linux-raid.ids

On Tue, 12 Mar 2013 19:44:19 -0500 Stan Hoeppner wrote:

> On 3/11/2013 8:39 PM, NeilBrown wrote:
> > On Thu, 7 Mar 2013 15:31:23 +0800 Shaohua Li wrote:
> ...
> >>> #echo 1-3 > /sys/block/md0/md/auxth0/cpulist
> >>> This will bind auxiliary thread 0 to cpu 1-3, and this thread will only handle
> >>> stripes produced by cpu 1-3. User tool can further change the thread's
> >>> affinity, but the thread can only handle stripes produced by cpu 1-3 till the
> >>> sysfs entry is changed again.
>
> Would it not be better to use the existing cpusets infrastructure for
> this, instead of manually binding threads to specific cores or sets of
> cores?
>
> Also, I understand the hot-cache issue driving the desire to have a raid
> thread only process stripes created by its CPU.  But what of the
> scenario where an HPC user pins application threads to cores and needs
> all the L1/L2 cache?  Say this user has a dual-socket 24-core NUMA
> system with 2 NUMA nodes per socket, 4 nodes total.  Each NUMA node has
> 6 cores and shared L3 cache.
> The user pins 5 processes to 5 cores in
> each node, and wants to pin a raid thread to the remaining core in each
> node to handle the write IO generated by the 5 user threads on the node.
>
> Does your patch series allow this?  Using the above example, if the user
> creates 4 cpusets, can he assign a raid thread to that set, so that the
> thread will execute on any core in the set, and only that set, on any
> stripes created by any CPU in that set, and only that set?
>
> The infrastructure for this already exists, has since 2004.  And it
> seems more flexible than what you've implemented here.  I suggest we
> make use of it, as it is the kernel standard for doing such things.
>
> See: http://man7.org/linux/man-pages/man7/cpuset.7.html
>
> > Hi Shaohua,
> >  I still have this sitting in my queue, but I haven't had a chance to look at
> >  it properly yet - I'm sorry about that.  I'll try to get to it soon.
>

Thanks for this feedback.  The interface is the thing I am most concerned
about getting right at this stage, and is exactly what you are commenting on.

The current code allows you to request N separate raid threads, and to tie
each one to a subset of processors.  This tying is in two senses: the
thread can only run on cpus in the subset, and the requests queued by any
given processor will preferentially be processed by threads tied to that
processor.

It does sound a lot like cpusets could be used instead of lists of CPUs.
However it does merge the two different concepts, which you seem to suggest
might not be ideal, and maybe it isn't.  A completely general solution
might be to allow each thread to handle requests from one cpuset, and run
on any processor in another cpuset.  Would that be too much flexibility?

cpusets are a config option, so we would need to only enable multiple
threads if CONFIG_CPUSETS were set.  Is this unnecessarily restrictive?

Are there any other cases of kernel threads binding to cpusets?
If there aren't, I'd be a bit cautious about being the first, as I have very
little familiarity with this stuff.

I still like the idea of an 'ioctl' which a process can call and which will
cause it to start handling requests.  The process could bind itself to
whatever cpu or cpuset it wanted to, then call the ioctl on the relevant md
array, passing in a bitmap of cpus which indicates which requests it wants
to be responsible for.  The current kernel thread would then only handle
requests that no-one else has put their hand up for.  This leaves all the
details of configuration in user-space (where I think it belongs).

Shaohua - have you given any thought to that approach?

If anyone else has any familiarity with multithreading and numa and cpu
affinity and would like to share some thoughts, I am all ears.

NeilBrown