All of lore.kernel.org
 help / color / mirror / Atom feed
* Review of dm-block-manager.c
@ 2011-08-01 21:00 Mikulas Patocka
  2011-08-01 21:17 ` Mike Snitzer
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Mikulas Patocka @ 2011-08-01 21:00 UTC (permalink / raw)
  To: Alasdair G. Kergon, Edward Thornber; +Cc: dm-devel

Hi

This is review of dm-block-manager.c:


char buffer_cache_name[32];
sprintf(bm->buffer_cache_name, "dm_block_buffer-%d:%d",
--- it may not fit in 32 bytes.


__wait_block uses TASK_INTERRUPTIBLE sleep and returns error code 
-ERESTARTSYS if interrupted by a signal. But this error code is never 
checked. Consequently, if the process receives a signal, this signal will 
interrupt waiting, and the rest of the buffer management code will 
mistakenly think that the event to wait for happened.
This should be replaced by TASK_UNINTERRUPTIBLE sleep and functions 
__wait_io, __wait_unlocked, __wait_read_lockable, __wait_all_writes, 
__wait_all_io, __wait_clean be changed to return void (because their 
return code is never checked anyway).


The code uses only a spinlock to protect it state. When the spinlock is 
dropped (for example during wait), the buffer may have been reused for 
other purposes, but it is not checked. There is a comment "/* FIXME: Can b 
have been recycled between io completion and here? */" indicating that Joe 
is aware of the problem.


b->write_lock_pending++;
__wait_unlocked(b, &flags);
b->write_lock_pending--;
if (b->where != block)
        goto retry;
If the buffer was reused while we were waiting, b->write_lock_pending was 
already reset to zero (in __transition BS_EMPTY). We decrement it to 
0xffffffff.


Error buffers are linked in error_list and this list is only flushed at a 
specific case (in __wait_flush). If there are many i/o errors (for 
example, the disk is unplugged) and __wait_flush is not called 
sufficiently often, all existing buffers will be moved to error_list and 
then the code deadlocks as there would be no empty or clean buffers.


The code uses fixed-size cache of 4096 buffers and a single process may 
hold more than one buffer. This may deadlock in case of massive 
parallelism --- for example, imagine that 4096 processes come 
concurrently, each process requesting two buffers --- each process 
allocates one buffer and then a deadlock happens, each process is waiting 
for some free buffer that never comes. (this bug existed already the last 
year when I looked at the code)


Mikulas

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2011-08-04  9:06 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-08-01 21:00 Review of dm-block-manager.c Mikulas Patocka
2011-08-01 21:17 ` Mike Snitzer
2011-08-02  0:15   ` Mike Snitzer
2011-08-02  0:30   ` Mike Snitzer
2011-08-02 13:07 ` Joe Thornber
2011-08-02 13:29   ` Joe Thornber
2011-08-02 14:36 ` [PATCH 1/4] The return code from the various wait functions is never acted upon. So change to uninterrupible waits and change the return type to void Joe Thornber
2011-08-02 14:36   ` [PATCH 2/4] Fix a race between reading a new block and having it recycled Joe Thornber
2011-08-03 14:53     ` Mikulas Patocka
2011-08-02 14:36   ` [PATCH 3/4] [block-manager] remove spurious decrement of write_lock_pending in the case of a recycled block Joe Thornber
2011-08-03 14:50     ` Mikulas Patocka
2011-08-04  9:06       ` Joe Thornber
2011-08-02 14:36   ` [PATCH 4/4] Track errored blocks Joe Thornber
2011-08-03 15:00     ` Mikulas Patocka
2011-08-03 14:42   ` [PATCH 1/4] The return code from the various wait functions is never acted upon. So change to uninterrupible waits and change the return type to void Mikulas Patocka

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.