All of lore.kernel.org
 help / color / mirror / Atom feed
* [JFFS2] Commit "jffs2: Fix lock acquisition order bug in jffs2_write_begin" introduces another dead lock.
@ 2013-08-22 11:18 Ming Liu
  0 siblings, 0 replies; only message in thread
From: Ming Liu @ 2013-08-22 11:18 UTC (permalink / raw)
  To: linux-mtd, deng.chao1, thomas.betker

Hi, all:

I've been working with 2.6.34 stable kernel and recently encountered a 
AB-BA dead lock issue with jffs2, the scenario is:

Run two scripts at the same time:

Script 1:

#!/bin/bash

while [ 1 ]

do

cp /mnt/mtd-folder/region_a/xxx.tar.gz /mnt/mtd-folder/region_b

usleep 10

done


Script 2:

#!/bin/bash

while [ 1 ]

do

tar -zxvf /mnt/mtd-folder/region_b/.tar.gz -C /dev/shm

done

In several hours, the processes "cp", "tar" and "jffs2_gcd_mtd" all turn 
to "D" state. After some investigation, I found that it's introduced by 
commit "jffs2: Fix lock acquisition order bug in jffs2_write_begin", 
which tried to fix a AB-BA dead lock as:

jffs2_garbage_collect_live

mutex_lock(&f->sem) (A)

jffs2_garbage_collect_dnode

     jffs2_gc_fetch_page

         read_cache_page_async

             do_read_cache_page

lock_page(page)             (B)

jffs2_write_begin

grab_cache_page_write_begin

     find_lock_page

lock_page(page)                     (B)

mutex_lock(&f->sem) (A)


But for do_generic_file_read()  first acquires the page lock, then 
f->sem,causes another AB-BA deadlock with jffs2_write_begin(), which 
firstacquires f->sem, then the page lock:

jffs2_write_begin

mutex_lock(&f->sem) (A)

grab_cache_page_write_begin

     find_lock_page

lock_page(page)                     (B)

do_generic_file_read

lock_page_killable(page) (B)

     jffs2_readpage

mutex_lock(&f->sem)                     (A)


I also noticed there was another thread discussed a similar deadlock 
also related to the same commit, with the title: "[JFFS2]The patch 
"jffs2: Fix lock acquisition order bug in jffs2_write_begin" introduces 
another dead lock bug.", posted by Deng Chao. And Deng had proposed a 
idea that involving in a function "read_cache_page_async_trylock" 
instead of "read_cache_page_async", is there anybody has implement that?

the best,
thank you

^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2013-08-22 11:19 UTC | newest]

Thread overview: (only message) (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-08-22 11:18 [JFFS2] Commit "jffs2: Fix lock acquisition order bug in jffs2_write_begin" introduces another dead lock Ming Liu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.