* Re: Copying to loop device hangs up everything
@ 2001-12-20 21:05 Momchil Velikov
0 siblings, 0 replies; 17+ messages in thread
From: Momchil Velikov @ 2001-12-20 21:05 UTC (permalink / raw)
To: linux-kernel
Ok, I'm convinced, given that writers are throttled above the loopback
thread. Andrea's patch follows, rediffed against 2.4.17-rc2, with the
unused gfp_mask parameter of sync_page_buffers removed.
Regards,
-velco
--- 1.5/fs/buffer.c Tue Dec 18 15:40:18 2001
+++ edited/fs/buffer.c Thu Dec 20 22:45:36 2001
@@ -2432,7 +2432,7 @@
return 1;
}
-static int sync_page_buffers(struct buffer_head *head, unsigned int gfp_mask)
+static int sync_page_buffers(struct buffer_head *head)
{
struct buffer_head * bh = head;
int tryagain = 0;
@@ -2533,9 +2533,10 @@
/* Uhhuh, start writeback so that we don't end up with all dirty pages */
write_unlock(&hash_table_lock);
spin_unlock(&lru_list_lock);
+ gfp_mask = pf_gfp_mask(gfp_mask);
if (gfp_mask & __GFP_IO) {
if ((gfp_mask & __GFP_HIGHIO) || !PageHighMem(page)) {
- if (sync_page_buffers(bh, gfp_mask)) {
+ if (sync_page_buffers(bh)) {
/* no IO or waiting next time */
gfp_mask = 0;
goto cleaned_buffers_try_again;
--- 1.2/include/linux/mm.h Sat Dec 8 02:36:12 2001
+++ edited/include/linux/mm.h Thu Dec 20 22:49:04 2001
@@ -547,6 +547,14 @@
platforms, used as appropriate on others */
#define GFP_DMA __GFP_DMA
+static inline unsigned int pf_gfp_mask(unsigned int gfp_mask)
+{
+ /* avoid all memory balancing I/O methods if this task cannot block on I/O */
+ if (current->flags & PF_NOIO)
+ gfp_mask &= ~(__GFP_IO | __GFP_HIGHIO | __GFP_FS);
+
+ return gfp_mask;
+}
/* vma is the first one with address < vma->vm_end,
* and even address < vma->vm_start. Have to extend vma. */
--- 1.2/mm/vmscan.c Tue Dec 18 15:40:23 2001
+++ edited/mm/vmscan.c Thu Dec 20 22:49:47 2001
@@ -588,6 +588,8 @@
int priority = DEF_PRIORITY;
int nr_pages = SWAP_CLUSTER_MAX;
+ gfp_mask = pf_gfp_mask(gfp_mask);
+
do {
nr_pages = shrink_caches(classzone, priority, gfp_mask, nr_pages);
if (nr_pages <= 0)
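The masking done by pf_gfp_mask() in the patch above can be tried out in userspace. What follows is only a minimal sketch, not the kernel code: struct task_model and every MODEL_* name are hypothetical, the PF_NOIO value is the one from the 2001-12-16 patch in this thread, and the GFP bit values are illustrative assumptions rather than values copied from the tree.

```c
/* Userspace model of the pf_gfp_mask() helper shown in the patch.
 * MODEL_PF_NOIO uses the value from the 2001-12-16 patch in this thread;
 * the MODEL_GFP_* bit values are illustrative assumptions. */
#define MODEL_PF_NOIO     0x00004000UL
#define MODEL_GFP_WAIT    0x10u
#define MODEL_GFP_IO      0x40u
#define MODEL_GFP_HIGHIO  0x80u
#define MODEL_GFP_FS      0x100u

/* Hypothetical stand-in for the task structure; only the flags matter here. */
struct task_model {
    unsigned long flags;
};

/* Strip every allocation flag that could start I/O when the task is
 * marked as unable to block on I/O; leave the mask untouched otherwise. */
static unsigned int pf_gfp_mask_model(const struct task_model *task,
                                      unsigned int gfp_mask)
{
    if (task->flags & MODEL_PF_NOIO)
        gfp_mask &= ~(MODEL_GFP_IO | MODEL_GFP_HIGHIO | MODEL_GFP_FS);

    return gfp_mask;
}
```

For a task with MODEL_PF_NOIO set, a mask of MODEL_GFP_WAIT | MODEL_GFP_IO | MODEL_GFP_HIGHIO | MODEL_GFP_FS comes back as MODEL_GFP_WAIT alone; for any other task the mask is returned unchanged.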
^ permalink raw reply [flat|nested] 17+ messages in thread
* Copying to loop device hangs up everything
@ 2001-12-16 3:40 David Gomez
2001-12-16 4:00 ` Dave Jones
0 siblings, 1 reply; 17+ messages in thread
From: David Gomez @ 2001-12-16 3:40 UTC (permalink / raw)
To: Linux-kernel
Hi,
I'm using kernel 2.4.17-rc1 and found what i think is a bug, maybe related
to the loop device. This is the situation:
I've created an ext2 image (around 550 MB) and mounted it via loopback.
Then I tried to copy some files from another ext2 image, also mounted on
a loop device, with 'cp -a'. After some data had been copied, I/O
stopped but the system was still usable; the loop and cp processes were in D
state. The loop devices couldn't be unmounted, so I rebooted the computer,
ran e2fsck on the images because of the reboot, and tried again to copy the
data, this time successfully.
Next, I had some more data in my root partition to add to the ext2 images,
so I did another cp -a of a directory (around 200 MB of data) to the
ext2 image mounted as loop. This time I got a 'full hang' ;): I couldn't
log in, alt+sysrq+t showed that cp and loop were again in D state, and
syncing/unmounting with the magic key didn't work at all. I can always
reproduce this hang by copying the data to the mounted loop device.
All the data is on the same disk (hda1), which is an ext2 partition.
Any ideas about what is causing this?
David Gómez
"The question of whether computers can think is just like the question of
whether submarines can swim." -- Edsger W. Dijkstra
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Copying to loop device hangs up everything
2001-12-16 3:40 David Gomez
@ 2001-12-16 4:00 ` Dave Jones
2001-12-16 11:41 ` David Gomez
0 siblings, 1 reply; 17+ messages in thread
From: Dave Jones @ 2001-12-16 4:00 UTC (permalink / raw)
To: David Gomez; +Cc: Linux-kernel
On Sun, 16 Dec 2001, David Gomez wrote:
> I'm using kernel 2.4.17-rc1 and found what i think is a bug, maybe related
> to the loop device. This is the situation:
Can you repeat it with this applied ?
ftp://ftp.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.17rc1aa1/00_loop-deadlock-1
regards,
Dave.
--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Copying to loop device hangs up everything
2001-12-16 4:00 ` Dave Jones
@ 2001-12-16 11:41 ` David Gomez
2001-12-16 16:53 ` Momchil Velikov
0 siblings, 1 reply; 17+ messages in thread
From: David Gomez @ 2001-12-16 11:41 UTC (permalink / raw)
To: Dave Jones; +Cc: Linux-kernel
On Sun, 16 Dec 2001, Dave Jones wrote:
> > I'm using kernel 2.4.17-rc1 and found what i think is a bug, maybe related
> > to the loop device. This is the situation:
>
> Can you repeat it with this applied ?
> ftp://ftp.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.17rc1aa1/00_loop-deadlock-1
Thanks ;), this patch solves the problem and copying a lot of data to the
loop device now doesn't hang the computer.
Is this patch going to be applied to the stable kernel ? Marcelo ?
David Gómez
"The question of whether computers can think is just like the question of
whether submarines can swim." -- Edsger W. Dijkstra
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Copying to loop device hangs up everything
2001-12-16 11:41 ` David Gomez
@ 2001-12-16 16:53 ` Momchil Velikov
2001-12-16 19:42 ` David Gomez
2001-12-17 3:30 ` Dave Jones
0 siblings, 2 replies; 17+ messages in thread
From: Momchil Velikov @ 2001-12-16 16:53 UTC (permalink / raw)
To: David Gomez; +Cc: Dave Jones, Linux-kernel
>>>>> "David" == David Gomez <davidge@jazzfree.com> writes:
David> On Sun, 16 Dec 2001, Dave Jones wrote:
>> > I'm using kernel 2.4.17-rc1 and found what i think is a bug, maybe related
>> > to the loop device. This is the situation:
>>
>> Can you repeat it with this applied ?
>> ftp://ftp.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.17rc1aa1/00_loop-deadlock-1
David> Thanks ;), this patch solves the problem and copying a lot of data to the
David> loop device now doesn't hang the computer.
David> Is this patch going to be applied to the stable kernel ? Marcelo ?
I've had exactly the same hangups with or without the patch.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Copying to loop device hangs up everything
2001-12-16 16:53 ` Momchil Velikov
@ 2001-12-16 19:42 ` David Gomez
2001-12-16 19:50 ` Momchil Velikov
2001-12-17 3:30 ` Dave Jones
1 sibling, 1 reply; 17+ messages in thread
From: David Gomez @ 2001-12-16 19:42 UTC (permalink / raw)
To: Momchil Velikov; +Cc: David Gomez, Dave Jones, Linux-kernel
On 16 Dec 2001, Momchil Velikov wrote:
> [...]
>
> David> Thanks ;), this patch solves the problem and copying a lot of data to the
> David> loop device now doesn't hang the computer.
>
> David> Is this patch going to be applied to the stable kernel ? Marcelo ?
>
> I've had exactly the same hangups with or without the patch.
I've tested several times after applying the loop-deadlock patch and the
bug seems to be fixed. No more hangups while copying a lot of data to
loopback devices. Post more info about your hangups, maybe is another
different loop device deadlock.
David Gómez
"The question of whether computers can think is just like the question of
whether submarines can swim." -- Edsger W. Dijkstra
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Copying to loop device hangs up everything
2001-12-16 19:42 ` David Gomez
@ 2001-12-16 19:50 ` Momchil Velikov
2001-12-16 21:52 ` Momchil Velikov
0 siblings, 1 reply; 17+ messages in thread
From: Momchil Velikov @ 2001-12-16 19:50 UTC (permalink / raw)
To: David Gomez; +Cc: Dave Jones, Linux-kernel
>>>>> "David" == David Gomez <davidge@jazzfree.com> writes:
David> On 16 Dec 2001, Momchil Velikov wrote:
>> [...]
>>
David> Thanks ;), this patch solves the problem and copying a lot of data to the
David> loop device now doesn't hang the computer.
>>
David> Is this patch going to be applied to the stable kernel ? Marcelo ?
>>
>> I've had exactly the same hangups with or without the patch.
David> I've tested several times after applying the loop-deadlock patch and the
David> bug seems to be fixed. No more hangups while copying a lot of data to
David> loopback devices. Post more info about your hangups, maybe is another
David> different loop device deadlock.
Maybe it's different I don't know. Looks like I've found a fix and in
a minute I'll test _without_ the Andrea's patch and post whatever
comes out of it.
Regards,
-velco
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Copying to loop device hangs up everything
2001-12-16 19:50 ` Momchil Velikov
@ 2001-12-16 21:52 ` Momchil Velikov
2001-12-18 19:46 ` Marcelo Tosatti
0 siblings, 1 reply; 17+ messages in thread
From: Momchil Velikov @ 2001-12-16 21:52 UTC (permalink / raw)
To: David Gomez; +Cc: Dave Jones, Linux-kernel, Marcelo Tosatti
>>>>> "Momchil" == Momchil Velikov <velco@fadata.bg> writes:
>>>>> "David" == David Gomez <davidge@jazzfree.com> writes:
David> On 16 Dec 2001, Momchil Velikov wrote:
>>> [...]
>>>
David> Thanks ;), this patch solves the problem and copying a lot of data to the
David> loop device now doesn't hang the computer.
>>>
David> Is this patch going to be applied to the stable kernel ? Marcelo ?
>>>
>>> I've had exactly the same hangups with or without the patch.
David> I've tested several times after applying the loop-deadlock patch and the
David> bug seems to be fixed. No more hangups while copying a lot of data to
David> loopback devices. Post more info about your hangups, maybe is another
David> different loop device deadlock.
Momchil> Maybe it's different I don't know. Looks like I've found a fix and in
Momchil> a minute I'll test _without_ the Andrea's patch and post whatever
Momchil> comes out of it.
It turned out that Andrea's patch is needed and it needs to be
augmented slightly. The loop_thread can do the following:
loop_thread
-> do_bh_filebacked
-> lo_send
-> ...
-> kmem_cache_alloc
-> ...
-> shrink_cache
-> try_to_release_page
-> try_to_free_buffers
-> sync_page_buffers
-> __wait_on_buffer
And if the buffer must be flushed to the loopback device we deadlock.
The following patch is the Andrea's one + one additional change -- we
don't allow the loop_thread to wait in sync_page_buffers.
Regards,
-velco
diff -Nru a/drivers/block/loop.c b/drivers/block/loop.c
--- a/drivers/block/loop.c Sun Dec 16 23:50:25 2001
+++ b/drivers/block/loop.c Sun Dec 16 23:50:25 2001
@@ -578,6 +578,8 @@
atomic_inc(&lo->lo_pending);
spin_unlock_irq(&lo->lo_lock);
+ current->flags |= PF_NOIO;
+
/*
* up sem, we are running
*/
diff -Nru a/fs/buffer.c b/fs/buffer.c
--- a/fs/buffer.c Sun Dec 16 23:50:25 2001
+++ b/fs/buffer.c Sun Dec 16 23:50:25 2001
@@ -1045,7 +1045,7 @@
/* First, check for the "real" dirty limit. */
if (dirty > soft_dirty_limit) {
- if (dirty > hard_dirty_limit)
+ if (dirty > hard_dirty_limit && !(current->flags & PF_NOIO))
return 1;
return 0;
}
@@ -2448,6 +2448,8 @@
/* Second time through we start actively writing out.. */
if (test_and_set_bit(BH_Lock, &bh->b_state)) {
if (!test_bit(BH_launder, &bh->b_state))
+ continue;
+ if (current->flags & PF_NOIO)
continue;
wait_on_buffer(bh);
tryagain = 1;
diff -Nru a/include/linux/sched.h b/include/linux/sched.h
--- a/include/linux/sched.h Sun Dec 16 23:50:25 2001
+++ b/include/linux/sched.h Sun Dec 16 23:50:25 2001
@@ -426,6 +426,7 @@
#define PF_MEMALLOC 0x00000800 /* Allocating memory */
#define PF_MEMDIE 0x00001000 /* Killed for out-of-memory */
#define PF_FREE_PAGES 0x00002000 /* per process page freeing */
+#define PF_NOIO 0x00004000 /* avoid generating further I/O */
#define PF_USEDFPU 0x00100000 /* task used FPU this quantum (SMP) */
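The effect of the first fs/buffer.c hunk above can be sketched in userspace. This is only a model under stated assumptions: dirty_state_model and its parameters are hypothetical names, the -1 result for the case below the soft limit is an assumption (that part of the function is not shown in the hunk), and the reading of 1 as "caller must write out and wait" versus 0 as "only kick background writeout" is likewise an assumption about the surrounding code.

```c
#include <stdbool.h>

/* Model of the dirty-limit check touched by the patch.
 * Return values (interpretation is an assumption, see the text above):
 *    1  over the hard limit: the caller should write out and wait
 *    0  over the soft limit: background writeout is enough
 *   -1  below the soft limit (assumed; not visible in the hunk)
 */
static int dirty_state_model(unsigned long dirty,
                             unsigned long soft_dirty_limit,
                             unsigned long hard_dirty_limit,
                             bool task_noio)
{
    if (dirty > soft_dirty_limit) {
        /* A PF_NOIO task is never told to block, even over the hard limit. */
        if (dirty > hard_dirty_limit && !task_noio)
            return 1;
        return 0;
    }
    return -1;
}
```

With limits of 40 and 60, a dirty count of 90 yields 1 for an ordinary task but 0 for a task flagged as no-I/O, which is the case the loop thread falls into once it sets PF_NOIO.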
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Copying to loop device hangs up everything
2001-12-16 21:52 ` Momchil Velikov
@ 2001-12-18 19:46 ` Marcelo Tosatti
2001-12-18 20:54 ` Momchil Velikov
0 siblings, 1 reply; 17+ messages in thread
From: Marcelo Tosatti @ 2001-12-18 19:46 UTC (permalink / raw)
To: Momchil Velikov; +Cc: lkml
Momchil,
Your fix does not look right. We _have_ to sync pages at
sync_page_buffers(), we cannot "ignore" them.
On 16 Dec 2001, Momchil Velikov wrote:
> >>>>> "Momchil" == Momchil Velikov <velco@fadata.bg> writes:
>
> >>>>> "David" == David Gomez <davidge@jazzfree.com> writes:
> David> On 16 Dec 2001, Momchil Velikov wrote:
>
> >>> [...]
> >>>
> David> Thanks ;), this patch solves the problem and copying a lot of data to the
> David> loop device now doesn't hang the computer.
> >>>
> David> Is this patch going to be applied to the stable kernel ? Marcelo ?
> >>>
> >>> I've had exactly the same hangups with or without the patch.
>
> David> I've tested several times after applying the loop-deadlock patch and the
> David> bug seems to be fixed. No more hangups while copying a lot of data to
> David> loopback devices. Post more info about your hangups, maybe is another
> David> different loop device deadlock.
>
> Momchil> Maybe it's different I don't know. Looks like I've found a fix and in
> Momchil> a minute I'll test _without_ the Andrea's patch and post whatever
> Momchil> comes out of it.
>
> It turned out that Andrea's patch is needed and it needs to be
> augmented slightly. The loop_thread can do the following:
>
> loop_thread
> -> do_bh_filebacked
> -> lo_send
> -> ...
> -> kmem_cache_alloc
> -> ...
> -> shrink_cache
> -> try_to_release_page
> -> try_to_free_buffers
> -> sync_page_buffers
> -> __wait_on_buffer
>
> And if the buffer must be flushed to the loopback device we deadlock.
>
> The following patch is the Andrea's one + one additional change -- we
> don't allow the loop_thread to wait in sync_page_buffers.
>
> Regards,
> -velco
>
> diff -Nru a/drivers/block/loop.c b/drivers/block/loop.c
> --- a/drivers/block/loop.c Sun Dec 16 23:50:25 2001
> +++ b/drivers/block/loop.c Sun Dec 16 23:50:25 2001
> @@ -578,6 +578,8 @@
> atomic_inc(&lo->lo_pending);
> spin_unlock_irq(&lo->lo_lock);
>
> + current->flags |= PF_NOIO;
> +
> /*
> * up sem, we are running
> */
> diff -Nru a/fs/buffer.c b/fs/buffer.c
> --- a/fs/buffer.c Sun Dec 16 23:50:25 2001
> +++ b/fs/buffer.c Sun Dec 16 23:50:25 2001
> @@ -1045,7 +1045,7 @@
>
> /* First, check for the "real" dirty limit. */
> if (dirty > soft_dirty_limit) {
> - if (dirty > hard_dirty_limit)
> + if (dirty > hard_dirty_limit && !(current->flags & PF_NOIO))
> return 1;
> return 0;
> }
> @@ -2448,6 +2448,8 @@
> /* Second time through we start actively writing out.. */
> if (test_and_set_bit(BH_Lock, &bh->b_state)) {
> if (!test_bit(BH_launder, &bh->b_state))
> + continue;
> + if (current->flags & PF_NOIO)
> continue;
> wait_on_buffer(bh);
> tryagain = 1;
> diff -Nru a/include/linux/sched.h b/include/linux/sched.h
> --- a/include/linux/sched.h Sun Dec 16 23:50:25 2001
> +++ b/include/linux/sched.h Sun Dec 16 23:50:25 2001
> @@ -426,6 +426,7 @@
> #define PF_MEMALLOC 0x00000800 /* Allocating memory */
> #define PF_MEMDIE 0x00001000 /* Killed for out-of-memory */
> #define PF_FREE_PAGES 0x00002000 /* per process page freeing */
> +#define PF_NOIO 0x00004000 /* avoid generating further I/O */
>
> #define PF_USEDFPU 0x00100000 /* task used FPU this quantum (SMP) */
>
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Copying to loop device hangs up everything
2001-12-18 19:46 ` Marcelo Tosatti
@ 2001-12-18 20:54 ` Momchil Velikov
2001-12-18 19:57 ` Marcelo Tosatti
0 siblings, 1 reply; 17+ messages in thread
From: Momchil Velikov @ 2001-12-18 20:54 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: lkml
>>>>> "Marcelo" == Marcelo Tosatti <marcelo@conectiva.com.br> writes:
Marcelo> Momchil,
Marcelo> Your fix does not look right. We _have_ to sync pages at
Marcelo> sync_page_buffers(), we cannot "ignore" them.
Sure, we don't ignore them, we just don't _wait_ for them, because
maybe _we_ are the one to write them.
Regards,
-velco
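The distinction between not ignoring a buffer and not waiting for it can be illustrated with a small userspace model of the second pass in sync_page_buffers as changed by the 2001-12-16 patch. It is a sketch only: struct buf_model and buffers_waited_on are hypothetical names, and the two booleans stand in for the BH_Lock and BH_launder bits.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer head: only the two state bits
 * that the patched branch looks at. */
struct buf_model {
    bool locked;   /* models BH_Lock already being set */
    bool launder;  /* models BH_launder */
};

/* Count the buffers the task would sleep on during the second pass.
 * Unlocked buffers are not counted: the real code locks them and
 * starts writeout instead of waiting. */
static size_t buffers_waited_on(const struct buf_model *bufs, size_t n,
                                bool task_noio)
{
    size_t waits = 0;

    for (size_t i = 0; i < n; i++) {
        if (!bufs[i].locked)
            continue;   /* we would submit this one ourselves */
        if (!bufs[i].launder)
            continue;   /* locked, but not under laundering: skip */
        if (task_noio)
            continue;   /* PF_NOIO task: never sleep on the buffer */
        waits++;        /* ordinary task: wait_on_buffer() */
    }
    return waits;
}
```

Given four buffers of which two are both locked and under laundering, an ordinary task waits on two of them, while a task flagged as no-I/O waits on none and simply moves on to the next buffer.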
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Copying to loop device hangs up everything
2001-12-18 20:54 ` Momchil Velikov
@ 2001-12-18 19:57 ` Marcelo Tosatti
2001-12-18 21:26 ` Momchil Velikov
[not found] ` <3C1FC254.525B9108@zip.com.au>
0 siblings, 2 replies; 17+ messages in thread
From: Marcelo Tosatti @ 2001-12-18 19:57 UTC (permalink / raw)
To: Momchil Velikov; +Cc: lkml
On 18 Dec 2001, Momchil Velikov wrote:
> >>>>> "Marcelo" == Marcelo Tosatti <marcelo@conectiva.com.br> writes:
>
> Marcelo> Momchil,
>
> Marcelo> Your fix does not look right. We _have_ to sync pages at
> Marcelo> sync_page_buffers(), we cannot "ignore" them.
>
> Sure, we don't ignore them, we just don't _wait_ for them, because
> maybe _we_ are the one to write them.
What if we are not ?
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Copying to loop device hangs up everything
2001-12-18 19:57 ` Marcelo Tosatti
@ 2001-12-18 21:26 ` Momchil Velikov
[not found] ` <3C1FC254.525B9108@zip.com.au>
1 sibling, 0 replies; 17+ messages in thread
From: Momchil Velikov @ 2001-12-18 21:26 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: lkml
>>>>> "Marcelo" == Marcelo Tosatti <marcelo@conectiva.com.br> writes:
Marcelo> Momchil,
Marcelo> Your fix does not look right. We _have_ to sync pages at
Marcelo> sync_page_buffers(), we cannot "ignore" them.
>> Sure, we don't ignore them, we just don't _wait_ for them, because
>> maybe _we_ are the one to write them.
Marcelo> What if we are not ?
Hmm, it looks like we then hope to find another immediately usable page so
that we can finish _this_ request first; after that we will ``loop_get_bh''
the buffer we just avoided waiting on and sync it.
Hmm, _maybe_ it would be a good idea for buffers that the loopback threads
submit for I/O to themselves to go to the _front_ of the loopback queue?
Regards,
-velco
^ permalink raw reply [flat|nested] 17+ messages in thread
[parent not found: <3C1FC254.525B9108@zip.com.au>]
* Re: Copying to loop device hangs up everything
2001-12-16 16:53 ` Momchil Velikov
2001-12-16 19:42 ` David Gomez
@ 2001-12-17 3:30 ` Dave Jones
1 sibling, 0 replies; 17+ messages in thread
From: Dave Jones @ 2001-12-17 3:30 UTC (permalink / raw)
To: Momchil Velikov; +Cc: Linux-kernel
On 16 Dec 2001, Momchil Velikov wrote:
> >> Can you repeat it with this applied ?
> >> ftp://ftp.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.17rc1aa1/00_loop-deadlock-1
> I've had exactly the same hangups with or without the patch.
You could be hitting a different bug.. highmem box ?
Dave.
--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2001-12-20 21:21 UTC | newest]
Thread overview: 17+ messages
2001-12-20 21:05 Copying to loop device hangs up everything Momchil Velikov
-- strict thread matches above, loose matches on Subject: below --
2001-12-16 3:40 David Gomez
2001-12-16 4:00 ` Dave Jones
2001-12-16 11:41 ` David Gomez
2001-12-16 16:53 ` Momchil Velikov
2001-12-16 19:42 ` David Gomez
2001-12-16 19:50 ` Momchil Velikov
2001-12-16 21:52 ` Momchil Velikov
2001-12-18 19:46 ` Marcelo Tosatti
2001-12-18 20:54 ` Momchil Velikov
2001-12-18 19:57 ` Marcelo Tosatti
2001-12-18 21:26 ` Momchil Velikov
[not found] ` <3C1FC254.525B9108@zip.com.au>
[not found] ` <3C1FCB96.83E49ECB@zip.com.au>
[not found] ` <3C204C4F.C989AD71@zip.com.au>
2001-12-19 13:42 ` Andrea Arcangeli
2001-12-20 7:41 ` Andrew Morton
2001-12-20 11:27 ` Andrea Arcangeli
2001-12-20 11:34 ` Andrea Arcangeli
2001-12-17 3:30 ` Dave Jones