All of lore.kernel.org
 help / color / mirror / Atom feed
diff for duplicates of <20100324193156.GH5021@think>

diff --git a/a/1.txt b/N1/1.txt
index af537ad..454bf12 100644
--- a/a/1.txt
+++ b/N1/1.txt
@@ -1,28 +1,21 @@
 On Wed, Mar 24, 2010 at 11:06:40AM -0700, Dan Williams wrote:
-> On Wed, Mar 24, 2010 at 8:51 AM, Chris Mason <chris.mason@oracle.com>=
- wrote:
+> On Wed, Mar 24, 2010 at 8:51 AM, Chris Mason <chris.mason@oracle.com> wrote:
 > > On Wed, Mar 24, 2010 at 07:53:10AM -0700, Dan Williams wrote:
-> >> The current implementation with the async thread pool ends up spre=
-ading
-> >> the work over too many threads. =A0The btrfs workqueue is targeted=
- at high
-> >> cpu utilization works and has a threshold mechanism to limit threa=
-d
-> >> spawning. =A0Unfortunately it still ends up increasing cpu utiliza=
-tion
-> >> without a comparable improvement in throughput. =A0Here are the nu=
-mbers
+> >> The current implementation with the async thread pool ends up spreading
+> >> the work over too many threads.  The btrfs workqueue is targeted at high
+> >> cpu utilization works and has a threshold mechanism to limit thread
+> >> spawning.  Unfortunately it still ends up increasing cpu utilization
+> >> without a comparable improvement in throughput.  Here are the numbers
 > >> relative to the multicore disabled case:
 > >>
-> >> idle_thresh =A0 throughput =A0 =A0 =A0cycles
-> >> 4 =A0 =A0 =A0 =A0 =A0 =A0 +0% =A0 =A0 =A0 =A0 =A0 =A0 +102%
-> >> 64 =A0 =A0 =A0 =A0 =A0 =A0+4% =A0 =A0 =A0 =A0 =A0 =A0 +63%
-> >> 128 =A0 =A0 =A0 =A0 =A0 +1% =A0 =A0 =A0 =A0 =A0 =A0 +45%
+> >> idle_thresh   throughput      cycles
+> >> 4             +0%             +102%
+> >> 64            +4%             +63%
+> >> 128           +1%             +45%
 > >
-> > Interesting, do the btrfs workqueues improve things? =A0Or do you t=
-hink they are
+> > Interesting, do the btrfs workqueues improve things?  Or do you think they are
 > > just a better base for more tuning?
->=20
+> 
 > Both, throughput falls off a cliff with the async thread pool, and
 > there are more knobs to turn in this implementation.
 
@@ -31,56 +24,38 @@ at 63% more cycles.  That looks like something more than just cache
 affinity, but at least for btrfs I always had bigger problems than
 affinity, so I haven't dug into that part yet.
 
->=20
-> > I had always hoped to find more users for the work queues and tried=
- to
-> > keep btrfs specific features out of them. =A0The place I didn't ent=
-irely
+> 
+> > I had always hoped to find more users for the work queues and tried to
+> > keep btrfs specific features out of them.  The place I didn't entirely
 > > succeed was in the spin locks, the ordered queues take regular spin
-> > locks to avoid turning off irqs where btrfs always does things outs=
-ide
-> > of interrupt time. =A0Doesn't look like raid needs the ordered queu=
-es so
+> > locks to avoid turning off irqs where btrfs always does things outside
+> > of interrupt time.  Doesn't look like raid needs the ordered queues so
 > > this should work pretty well.
 > >
 > >>
-> >> This appears to show that something more fundamental needs to happ=
-en to
-> >> take advantage of percpu raid processing. =A0More profiling is nee=
-ded, but
-> >> the suspects in my mind are conf->device_lock contention and the f=
-act
-> >> that all work is serialized through conf->handle_list with no meth=
-od for
+> >> This appears to show that something more fundamental needs to happen to
+> >> take advantage of percpu raid processing.  More profiling is needed, but
+> >> the suspects in my mind are conf->device_lock contention and the fact
+> >> that all work is serialized through conf->handle_list with no method for
 > >> encouraging stripe_head to thread affinity.
 > >
-> > The big place I'd suggest to look inside the btrfs async-thread.c f=
-or
-> > optimization is the worker_loop(). =A0For work that tends to be bur=
-sty and
-> > relatively short, we can have worker threads finish their work fair=
-ly
-> > quickly and go to sleep, only to be woken up very quickly again wit=
-h
-> > another big batch of work. =A0The worker_loop code tries to wait ar=
-ound
+> > The big place I'd suggest to look inside the btrfs async-thread.c for
+> > optimization is the worker_loop().  For work that tends to be bursty and
+> > relatively short, we can have worker threads finish their work fairly
+> > quickly and go to sleep, only to be woken up very quickly again with
+> > another big batch of work.  The worker_loop code tries to wait around
 > > for a while, but the tuning here was btrfs specific.
 > >
-> > It might also help to tune the find_worker and next_worker code to =
-prefer
+> > It might also help to tune the find_worker and next_worker code to prefer
 > > giving work to threads that are running but almost done with their
-> > queue. =A0Maybe they can be put onto a special hot list as they get=
- near
+> > queue.  Maybe they can be put onto a special hot list as they get near
 > > the end of their queue.
 > >
->=20
-> Thanks I'll take a look at these suggestions.  For these optimization=
-s
-> to have a chance I think we need stripes to maintain affinity with th=
-e
+> 
+> Thanks I'll take a look at these suggestions.  For these optimizations
+> to have a chance I think we need stripes to maintain affinity with the
 > first core that picks up the work.  Currently all stripes take a trip
-> through the single-threaded raid5d when their reference count drops t=
-o
+> through the single-threaded raid5d when their reference count drops to
 > zero, only to be immediately reissued to the thread pool potentially
 > on a different core (but I need to back this assumption up with more
 > profiling).
@@ -88,23 +63,18 @@ o
 Well, I'd definitely say the pool is going to give you a random
 core.
 
->=20
-> > There's a rule somewhere that patches renaming things must have rep=
-lies
-> > questioning the new name. =A0The first reply isn't actually allowed=
- to
+> 
+> > There's a rule somewhere that patches renaming things must have replies
+> > questioning the new name.  The first reply isn't actually allowed to
 > > suggest a better name, which is good because I'm not very good at
 > > that kind of thing.
 > >
-> > Really though, btr_queue is fine by me, but don't feel obligated to=
- keep
+> > Really though, btr_queue is fine by me, but don't feel obligated to keep
 > > some variation of btrfs in the name ;)
->=20
-> btr_queue seemed to make sense since it's spreading work like "butter=
-" :-).
+> 
+> btr_queue seemed to make sense since it's spreading work like "butter" :-).
 
-Grin, it doesn't take many butter jokes to sell me.  Careful though, yo=
-u
+Grin, it doesn't take many butter jokes to sell me.  Careful though, you
 can't have just one buttery thread pool.  Just wait, you'll think of
 good reasons to add another real soon.
 
@@ -112,14 +82,12 @@ Speaking of which, you want to use the atomic thread starting helper.
 Basically kthread_run does GFP allocations that can wait on disk, and
 if you're the disk it is waiting on things can go badly in a hurry.
 
-We should probably integrate a single kthread run helper thread into th=
-e
+We should probably integrate a single kthread run helper thread into the
 pools instead of forcing people to make their own.  I'll take a look.
 
 -chris
 
 --
-To unsubscribe from this list: send the line "unsubscribe linux-raid" i=
-n
+To unsubscribe from this list: send the line "unsubscribe linux-raid" in
 the body of a message to majordomo@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff --git a/a/content_digest b/N1/content_digest
index 62f0af4..039d80a 100644
--- a/a/content_digest
+++ b/N1/content_digest
@@ -10,30 +10,23 @@
  "\00:1\0"
  "b\0"
  "On Wed, Mar 24, 2010 at 11:06:40AM -0700, Dan Williams wrote:\n"
- "> On Wed, Mar 24, 2010 at 8:51 AM, Chris Mason <chris.mason@oracle.com>=\n"
- " wrote:\n"
+ "> On Wed, Mar 24, 2010 at 8:51 AM, Chris Mason <chris.mason@oracle.com> wrote:\n"
  "> > On Wed, Mar 24, 2010 at 07:53:10AM -0700, Dan Williams wrote:\n"
- "> >> The current implementation with the async thread pool ends up spre=\n"
- "ading\n"
- "> >> the work over too many threads. =A0The btrfs workqueue is targeted=\n"
- " at high\n"
- "> >> cpu utilization works and has a threshold mechanism to limit threa=\n"
- "d\n"
- "> >> spawning. =A0Unfortunately it still ends up increasing cpu utiliza=\n"
- "tion\n"
- "> >> without a comparable improvement in throughput. =A0Here are the nu=\n"
- "mbers\n"
+ "> >> The current implementation with the async thread pool ends up spreading\n"
+ "> >> the work over too many threads. \302\240The btrfs workqueue is targeted at high\n"
+ "> >> cpu utilization works and has a threshold mechanism to limit thread\n"
+ "> >> spawning. \302\240Unfortunately it still ends up increasing cpu utilization\n"
+ "> >> without a comparable improvement in throughput. \302\240Here are the numbers\n"
  "> >> relative to the multicore disabled case:\n"
  "> >>\n"
- "> >> idle_thresh =A0 throughput =A0 =A0 =A0cycles\n"
- "> >> 4 =A0 =A0 =A0 =A0 =A0 =A0 +0% =A0 =A0 =A0 =A0 =A0 =A0 +102%\n"
- "> >> 64 =A0 =A0 =A0 =A0 =A0 =A0+4% =A0 =A0 =A0 =A0 =A0 =A0 +63%\n"
- "> >> 128 =A0 =A0 =A0 =A0 =A0 +1% =A0 =A0 =A0 =A0 =A0 =A0 +45%\n"
+ "> >> idle_thresh \302\240 throughput \302\240 \302\240 \302\240cycles\n"
+ "> >> 4 \302\240 \302\240 \302\240 \302\240 \302\240 \302\240 +0% \302\240 \302\240 \302\240 \302\240 \302\240 \302\240 +102%\n"
+ "> >> 64 \302\240 \302\240 \302\240 \302\240 \302\240 \302\240+4% \302\240 \302\240 \302\240 \302\240 \302\240 \302\240 +63%\n"
+ "> >> 128 \302\240 \302\240 \302\240 \302\240 \302\240 +1% \302\240 \302\240 \302\240 \302\240 \302\240 \302\240 +45%\n"
  "> >\n"
- "> > Interesting, do the btrfs workqueues improve things? =A0Or do you t=\n"
- "hink they are\n"
+ "> > Interesting, do the btrfs workqueues improve things? \302\240Or do you think they are\n"
  "> > just a better base for more tuning?\n"
- ">=20\n"
+ "> \n"
  "> Both, throughput falls off a cliff with the async thread pool, and\n"
  "> there are more knobs to turn in this implementation.\n"
  "\n"
@@ -42,56 +35,38 @@
  "affinity, but at least for btrfs I always had bigger problems than\n"
  "affinity, so I haven't dug into that part yet.\n"
  "\n"
- ">=20\n"
- "> > I had always hoped to find more users for the work queues and tried=\n"
- " to\n"
- "> > keep btrfs specific features out of them. =A0The place I didn't ent=\n"
- "irely\n"
+ "> \n"
+ "> > I had always hoped to find more users for the work queues and tried to\n"
+ "> > keep btrfs specific features out of them. \302\240The place I didn't entirely\n"
  "> > succeed was in the spin locks, the ordered queues take regular spin\n"
- "> > locks to avoid turning off irqs where btrfs always does things outs=\n"
- "ide\n"
- "> > of interrupt time. =A0Doesn't look like raid needs the ordered queu=\n"
- "es so\n"
+ "> > locks to avoid turning off irqs where btrfs always does things outside\n"
+ "> > of interrupt time. \302\240Doesn't look like raid needs the ordered queues so\n"
  "> > this should work pretty well.\n"
  "> >\n"
  "> >>\n"
- "> >> This appears to show that something more fundamental needs to happ=\n"
- "en to\n"
- "> >> take advantage of percpu raid processing. =A0More profiling is nee=\n"
- "ded, but\n"
- "> >> the suspects in my mind are conf->device_lock contention and the f=\n"
- "act\n"
- "> >> that all work is serialized through conf->handle_list with no meth=\n"
- "od for\n"
+ "> >> This appears to show that something more fundamental needs to happen to\n"
+ "> >> take advantage of percpu raid processing. \302\240More profiling is needed, but\n"
+ "> >> the suspects in my mind are conf->device_lock contention and the fact\n"
+ "> >> that all work is serialized through conf->handle_list with no method for\n"
  "> >> encouraging stripe_head to thread affinity.\n"
  "> >\n"
- "> > The big place I'd suggest to look inside the btrfs async-thread.c f=\n"
- "or\n"
- "> > optimization is the worker_loop(). =A0For work that tends to be bur=\n"
- "sty and\n"
- "> > relatively short, we can have worker threads finish their work fair=\n"
- "ly\n"
- "> > quickly and go to sleep, only to be woken up very quickly again wit=\n"
- "h\n"
- "> > another big batch of work. =A0The worker_loop code tries to wait ar=\n"
- "ound\n"
+ "> > The big place I'd suggest to look inside the btrfs async-thread.c for\n"
+ "> > optimization is the worker_loop(). \302\240For work that tends to be bursty and\n"
+ "> > relatively short, we can have worker threads finish their work fairly\n"
+ "> > quickly and go to sleep, only to be woken up very quickly again with\n"
+ "> > another big batch of work. \302\240The worker_loop code tries to wait around\n"
  "> > for a while, but the tuning here was btrfs specific.\n"
  "> >\n"
- "> > It might also help to tune the find_worker and next_worker code to =\n"
- "prefer\n"
+ "> > It might also help to tune the find_worker and next_worker code to prefer\n"
  "> > giving work to threads that are running but almost done with their\n"
- "> > queue. =A0Maybe they can be put onto a special hot list as they get=\n"
- " near\n"
+ "> > queue. \302\240Maybe they can be put onto a special hot list as they get near\n"
  "> > the end of their queue.\n"
  "> >\n"
- ">=20\n"
- "> Thanks I'll take a look at these suggestions.  For these optimization=\n"
- "s\n"
- "> to have a chance I think we need stripes to maintain affinity with th=\n"
- "e\n"
+ "> \n"
+ "> Thanks I'll take a look at these suggestions.  For these optimizations\n"
+ "> to have a chance I think we need stripes to maintain affinity with the\n"
  "> first core that picks up the work.  Currently all stripes take a trip\n"
- "> through the single-threaded raid5d when their reference count drops t=\n"
- "o\n"
+ "> through the single-threaded raid5d when their reference count drops to\n"
  "> zero, only to be immediately reissued to the thread pool potentially\n"
  "> on a different core (but I need to back this assumption up with more\n"
  "> profiling).\n"
@@ -99,23 +74,18 @@
  "Well, I'd definitely say the pool is going to give you a random\n"
  "core.\n"
  "\n"
- ">=20\n"
- "> > There's a rule somewhere that patches renaming things must have rep=\n"
- "lies\n"
- "> > questioning the new name. =A0The first reply isn't actually allowed=\n"
- " to\n"
+ "> \n"
+ "> > There's a rule somewhere that patches renaming things must have replies\n"
+ "> > questioning the new name. \302\240The first reply isn't actually allowed to\n"
  "> > suggest a better name, which is good because I'm not very good at\n"
  "> > that kind of thing.\n"
  "> >\n"
- "> > Really though, btr_queue is fine by me, but don't feel obligated to=\n"
- " keep\n"
+ "> > Really though, btr_queue is fine by me, but don't feel obligated to keep\n"
  "> > some variation of btrfs in the name ;)\n"
- ">=20\n"
- "> btr_queue seemed to make sense since it's spreading work like \"butter=\n"
- "\" :-).\n"
+ "> \n"
+ "> btr_queue seemed to make sense since it's spreading work like \"butter\" :-).\n"
  "\n"
- "Grin, it doesn't take many butter jokes to sell me.  Careful though, yo=\n"
- "u\n"
+ "Grin, it doesn't take many butter jokes to sell me.  Careful though, you\n"
  "can't have just one buttery thread pool.  Just wait, you'll think of\n"
  "good reasons to add another real soon.\n"
  "\n"
@@ -123,16 +93,14 @@
  "Basically kthread_run does GFP allocations that can wait on disk, and\n"
  "if you're the disk it is waiting on things can go badly in a hurry.\n"
  "\n"
- "We should probably integrate a single kthread run helper thread into th=\n"
- "e\n"
+ "We should probably integrate a single kthread run helper thread into the\n"
  "pools instead of forcing people to make their own.  I'll take a look.\n"
  "\n"
  "-chris\n"
  "\n"
  "--\n"
- "To unsubscribe from this list: send the line \"unsubscribe linux-raid\" i=\n"
- "n\n"
+ "To unsubscribe from this list: send the line \"unsubscribe linux-raid\" in\n"
  "the body of a message to majordomo@vger.kernel.org\n"
  More majordomo info at  http://vger.kernel.org/majordomo-info.html
 
-c048c3f2123735ee9e9bc3abb26ceb0fe8f2b9382d5d3f14f1b8e9221e4821e6
+5774136127d6c5ab54b05301f762422e7762fbfc0523a72c5014725e7939310b

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.