public inbox for linux-kernel@vger.kernel.org
* NFS livelock / starvation ?
@ 2007-04-16  8:24 Zhou Yingchao
  2007-04-16  9:20 ` Peter Zijlstra
From: Zhou Yingchao @ 2007-04-16  8:24 UTC
  To: akpm, trond.myklebust, neilb, nfs; +Cc: linux-kernel

When we run two NFS clients and one NFS server in the following layout,
we hit a livelock / starvation condition.

MachineA        MachineB
 Client1             Client2
 Server

As the figure shows, one client and the server run on the same machine,
and a second client runs on another machine. When Client1 and Client2
both write heavily at the same time, Client1's requests are blocked
until Client2's writes have finished.

We checked the code: Client1 is blocked in generic_file_write -> ... ->
balance_dirty_pages. balance_dirty_pages calls writeback_inodes to
flush (only) the data of the related fs.

In the NFS case, we found that the Server runs with a raised
dirty_thresh. So in the loop around writeback_inodes, Client1 has no
data to write out, and the condition
"nr_reclaimable + wbs.nr_writeback <= dirty_thresh" does not become true
until Client2 finishes its write requests to the Server. The loop
therefore ends only after Client2 has finished its write job.

The problem in this path is: why does writeback_inodes write only the
pages of the related fs, while the dirty threshold is checked against
the total number of dirty pages?
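
The mismatch can be reproduced in a toy model. The following is a minimal
Python sketch, not kernel code: Device, flush, throttle_client1, the page
counts, and the 25% raise of the Server's threshold are all illustrative
assumptions. It only models the shape of the loop described above, where
the exit test counts dirty pages on every device but the flush touches
only the caller's own device. The per_device_check flag is a hypothetical
variant added for contrast, not something the kernel offers.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Hypothetical stand-in for one backing device and its dirty page count."""
    name: str
    dirty: int = 0


def flush(dev, chunk):
    """Write back up to `chunk` dirty pages of `dev`; return pages written."""
    written = min(dev.dirty, chunk)
    dev.dirty -= written
    return written


def throttle_client1(client2_pages, dirty_thresh=100, chunk=10,
                     per_device_check=False, max_rounds=100_000):
    """Return (rounds Client1 waited, Client2 pages still unsent at exit)."""
    nfs = Device("nfs")    # Client1's device: it has nothing dirty to flush
    disk = Device("disk")  # the Server's local disk, fed by Client2
    # Assumption: the Server may dirty 25% more pages than Client1's limit.
    server_thresh = dirty_thresh + dirty_thresh // 4

    rounds = 0
    while rounds < max_rounds:
        # Server side: accept Client2's data up to the raised threshold.
        accepted = min(client2_pages, max(0, server_thresh - disk.dirty))
        disk.dirty += accepted
        client2_pages -= accepted

        # Client1 side: the exit test of the throttling loop.
        counted = nfs.dirty if per_device_check else nfs.dirty + disk.dirty
        if counted <= dirty_thresh:
            break

        flush(nfs, chunk)   # Client1 flushes only its own device: a no-op
        flush(disk, chunk)  # the disk drains at its own pace
        rounds += 1
    return rounds, client2_pages
```

With the global check, throttle_client1(1000) leaves the loop only once
Client2 has no pages left to send. With per_device_check=True it returns
at once, while Client2 still has data outstanding.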

-- 
Yingchao Zhou
***********************************************
 Institute Of Computing Technology
 Chinese Academy of Sciences
***********************************************


* Re: NFS livelock / starvation ?
  2007-04-16  8:24 NFS livelock / starvation ? Zhou Yingchao
@ 2007-04-16  9:20 ` Peter Zijlstra
From: Peter Zijlstra @ 2007-04-16  9:20 UTC
  To: Zhou Yingchao; +Cc: akpm, trond.myklebust, neilb, nfs, linux-kernel

On Mon, 2007-04-16 at 16:24 +0800, Zhou Yingchao wrote:
> The problem in this path is: why we write only pages of the related fs
> in writeback_inodes but check the dirty thresh for total pages?

I am working on patches to fix this.

Current version at (against -mm):
  http://programming.kicks-ass.net/kernel-patches/balance_dirty_pages/

However, after a rewrite of the BDI statistics work there are some
funnies, which I haven't had time to analyse yet :-/

I hope to post a new version soonish...

