From: Helge Deller <deller@gmx.de>
To: John David Anglin <dave.anglin@nrc-cnrc.gc.ca>
Cc: John David Anglin <dave@hiauly1.hia.nrc.ca>,
Carlos O'Donell <carlos@systemhalted.org>,
gniibe@fsij.org, linux-parisc@vger.kernel.org
Subject: Re: threads and fork on machine with VIPT-WB cache
Date: Sun, 11 Apr 2010 20:50:31 +0200
Message-ID: <4BC219F7.5020204@gmx.de>
In-Reply-To: <20100410225355.GA2812@hiauly1.hia.nrc.ca>
On 04/11/2010 12:53 AM, John David Anglin wrote:
> On Sat, 10 Apr 2010, Helge Deller wrote:
>
>> Nevertheless, on my B2000 (32bit, SMP, 2.6.32.2 kernel) I still do see the minifail bug.
>> The only difference seems to be, that the minifail3 program doesn't get stuck any
>> more. It still crashes though from time to time...
>
> There are some issues with your minifail3.c testcase. The fork'd child
> shouldn't do any I/O and it should exit using _exit(0). Otherwise, it
> can corrupt the I/O structures of the parent. I'm not sure that this
> is the issue on your B2000, but it's worth a try.
>
> The testcase when modified as above doesn't crash on my c3750 (32bit, UP,
> 2.6.32.2 kernel).
>
> I found in debugging this testcase that the crash was always associated
> with the stack region for thread_run. I put a big loop in thread_run.
> The index for the loop when compiled at -O0 is constantly being saved
> and restored on the stack. I found that crashes occurred after many
> iterations of the loop. Nothing else was going on.
>
> The COW discussion convinced me that cache flushing was the problem.
> The fork (clone) syscall causes the stack region used by thread_run
> to become COW'd. When thread_run is scheduled, the loop caused an
> instant COW break and stack corruption. The state of the stack region
> generally returned to its state before the fork.
>
> If the above doesn't fix the testcase on your B2000, there must be
> some difference and other PA8000 machines.
Hi Dave,
I tested the attached testcase. I think this is the version you sent last
time, the one that has the _exit(0).
Nevertheless, I still see the crashes with all kernel patches applied.
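For reference, the reasoning behind using _exit(0) in the fork'd child can be illustrated with a small standalone program. This is only a sketch and not part of minifail: the helper name copies_written and the "pending" marker line are made up for illustration. It buffers one line in a stdio stream, forks, and counts how many copies reach a pipe depending on whether the child leaves via exit() or _exit().

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (not from the testcase): returns how many copies of a
// buffered line end up in a pipe, or -1 if the setup fails.
int copies_written(bool child_uses_plain_exit) {
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    FILE* out = fdopen(fds[1], "w");   // a pipe is fully buffered by default
    if (out == NULL) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    fflush(NULL);                      // drain unrelated streams before forking
    fputs("pending\n", out);           // stays in the user-space stdio buffer

    pid_t pid = fork();
    if (pid == -1) {
        fclose(out);
        close(fds[0]);
        return -1;
    }
    if (pid == 0) {
        if (child_uses_plain_exit)
            exit(0);                   // flushes the inherited copy of the buffer
        _exit(0);                      // leaves the stdio state untouched
    }

    waitpid(pid, NULL, 0);
    fclose(out);                       // parent flushes its own copy, closes the write end

    std::string data;
    char buf[64];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0)
        data.append(buf, static_cast<size_t>(n));
    close(fds[0]);

    int count = 0;
    const std::string marker = "pending\n";
    for (size_t pos = 0; (pos = data.find(marker, pos)) != std::string::npos;
         pos += marker.size())
        ++count;
    return count;
}
```

Calling copies_written(false) exercises the _exit() path and copies_written(true) the exit() path. The difference between the two counts is the duplicated stdio output that the _exit(0) in the testcase is meant to avoid.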
What I usually do is start more than 8 screen sessions. In each
session I start this bash loop:
-> i=0; while true; do i=$(($i+1)); echo Run $i; ./minifail; done;
and detach from the screen sessions.
After some time, the load goes up to 8-16 and a few crashes show up in the syslog.
I'm sure the crashes are related to how much load the machine is under and how
often process switches happen.
How many minifail testcases do you run in parallel?
ls3017:/scratch/linux-git# uname -a
Linux ls3017 2.6.33.2-32bit #31 SMP Fri Apr 9 12:36:49 CEST 2010 parisc GNU/Linux
ls3017:/scratch/linux-git# cat /proc/cpuinfo
cpu family : PA-RISC 2.0
cpu : PA8500 (PCX-W)
cpu MHz : 440.000000
model : 9000/785/J5000
model name : Forte W 2-way
I-cache : 512 KB
D-cache : 1024 KB (WB, direct mapped)
ITLB entries : 160
DTLB entries : 160 - shared with ITLB
Helge
[-- Attachment #2: minifail_dave.cpp --]
[-- Type: text/plain, Size: 1062 bytes --]
#include <pthread.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
/*
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=561203
clone(child_stack=0x4088d040, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x4108c4e8, tls=0x4108c900, child_tidptr=0x4108c4e8) = 14819
[pid 14819] set_robust_list(0x4108c4f0, 0xc) = 0
[pid 14818] clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x40002028) = 14820
g++ minifail.cpp -o minifail -O0 -pthread -g
i=0; while true; do i=$(($i+1)); echo Run $i; ./minifail; done;
*/
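void* thread_run(void* arg) {
    (void)arg;  // unused
    write(1, "Thread OK.\n", 11);
    return NULL;  // a non-void function must return a value
}

int pure_test() {
    pthread_t thread;
    pthread_create(&thread, NULL, thread_run, NULL);
    switch (fork()) {
    case -1:
        perror("fork() failed");
        break;  // do not fall through into the child branch
    case 0:
        // Child: one raw write(), then leave without touching stdio state.
        write(1, "Child OK.\n", 10);
        _exit(0);
    default:
        break;
    }
    pthread_join(thread, NULL);
    return 0;
}

int main(int argc, char** argv) {
    return pure_test();
}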
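The debugging variant Dave describes in the quoted text (a big loop in thread_run whose index lives on the thread stack) might look roughly like the sketch below. This is not his actual code: the names LoopResult, thread_loop and run_loop_test, the mismatch counter, and the use of volatile to approximate -O0 stack traffic are all assumptions made for illustration.

```cpp
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical result record shared between the thread and its creator.
struct LoopResult {
    long iterations;   // how many loop steps to run
    long mismatches;   // how often the stack-resident index was unexpected
};

// Assumed stand-in for the modified thread_run: the loop index is volatile,
// so every access goes through its stack slot, similar to code built at -O0.
static void* thread_loop(void* arg) {
    LoopResult* res = static_cast<LoopResult*>(arg);
    volatile long i = 0;
    long expected = 0;
    for (i = 0; i < res->iterations; i = i + 1) {
        if (i != expected)
            res->mismatches++;   // the stack slot did not hold the value just stored
        expected++;
    }
    return NULL;
}

// Starts the loop thread, forks while it may still be running, and returns
// the number of mismatches seen by the thread (or -1 on setup failure).
long run_loop_test(long iterations) {
    LoopResult res = { iterations, 0 };
    pthread_t thread;
    if (pthread_create(&thread, NULL, thread_loop, &res) != 0)
        return -1;

    pid_t pid = fork();          // the thread's stack pages become COW here
    if (pid == 0)
        _exit(0);                // child: no I/O, no stdio teardown

    pthread_join(thread, NULL);
    if (pid == -1)
        return -1;
    waitpid(pid, NULL, 0);
    return res.mismatches;
}
```

On a machine with coherent caches run_loop_test() should report no mismatches. A non-zero result would point to the stack region reverting to an older state after the fork, which is the symptom described in the quoted text.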