linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* PROBLEM: pthread-safety bug in write(2) on Linux 2.6.x
@ 2006-04-13  1:45 Dan Bonachea
  2006-04-13  2:10 ` Alistair John Strachan
                   ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Dan Bonachea @ 2006-04-13  1:45 UTC (permalink / raw)
  To: linux-kernel

Hi - I believe I've discovered a thread-safety bug in the Linux 2.6.x kernel 
implementation of write(2).

The small C program below the problem - in a nutshell, if multiple threads 
write() to STDOUT_FILENO, and stdout has been redirected to a file, then some 
of the output lines get lost. The actual result is non-deterministic (even in 
a "correct" run) - however the expected correct behavior is 10 lines of output 
(in some non-deterministic order). However, the test program reproducibly 
generates some lost output (less than 10 lines of total output) on every run 
where the output is redirected to a new file. This appears to be a violation 
of the POSIX spec (POSIX 1003.1-2001:2.9.1 requires thread-safety of write()). 


The problem does not appear to occur if output goes to the console, or is 
redirected to append to an existing file, only when stdout is redirected to a 
new file.

---------------------------------------------------------------------------
/* Instructions:
  * compile as: gcc write-bug.c -lpthread
  * run with: ./a.out
  *   note that lines from all 10 threads are visible
  * run with: ./a.out > output
  *   note that lines from several threads are missing from the output file
  */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>
volatile int flag = 0;
void *start(void *arg) {
   char msg[255];
   int res;
   sprintf(msg,"hi from %i\n",(int)arg);
   if ((int)arg == 9) flag = 1;
   while (!flag) ; /* thread barrier */
   #if 1
     res = write(STDOUT_FILENO,msg,strlen(msg));
     if (res != strlen(msg)) fprintf(stderr,"Failure: %i 
%s\n",res,strerror(res));
   #else
     fputs(msg,stdout);
   #endif
   #if 1 /* work extra hard to flush output (makes no difference) */
     fflush(NULL);
     fsync(STDOUT_FILENO);
     sync();
   #endif
   return NULL;
}
int main() {
   int i;
   /* create 10 pthreads */
   pthread_t id[10];
   for (i =0 ; i < 10; i++) {
     int ret = pthread_create(&(id[i]),NULL,&start,(void*)i);
     if (ret) printf("pthread_create: %s\n",strerror(ret));
   }
   /* join 10 pthreads */
   for (i =0 ; i < 10; i++) {
     pthread_join(id[i], NULL);
   }
   sleep(1);
   return 0;
}
---------------------------------------------------------------------------

The problem occurs on every Linux 2.6 machine I've tried, including ones with 
Itanium-2, Athlon, Opteron and Pentium 4 chips, both SMP's and uniprocessors, 
using a variety of recent gcc versions. The problem does *not* appear to occur 
on any Linux 2.4 machine I've tried, even if I use the same executable which 
fails on the Linux 2.6 machine. Finally, replacing write(STDOUT_FILENO) with 
fputs(stdout) makes the problem disappear (presumably due to locking in libc).

I don't have administrative access to upgrade the kernel on these machines, 
however below is the full machine info for the most recent installed kernel 
version I have access to.

Any suggestions are appreciated.
Thanks,
Dan

# gcc --version
gcc (GCC) 3.4.4 20050721 (Red Hat 3.4.4-2)
Copyright (C) 2004 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
# cat /proc/version
Linux version 2.6.12-1.1376_FC3smp (bhcompile@tweety.build.redhat.com) (gcc 
version 3.4.4 20050721 (Red Hat 3.4.4-2)) #1 SMP Fri Aug 26 23:50:33 EDT 2005
# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 2
model name      : Intel(R) Pentium(R) 4 CPU 2.40GHz
stepping        : 9
cpu MHz         : 2395.041
cache size      : 512 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 1
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov 
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe cid xtpr
bogomips        : 4734.97

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 15
model           : 2
model name      : Intel(R) Pentium(R) 4 CPU 2.40GHz
stepping        : 9
cpu MHz         : 2395.041
cache size      : 512 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 1
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov 
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe cid xtpr
bogomips        : 4784.12
# cat /proc/modules
ipt_multiport 6465 2 - Live 0xe0a1d000
ipt_ULOG 11973 6 - Live 0xe0aff000
ipt_REJECT 9537 5 - Live 0xe0b03000
ipt_limit 6465 7 - Live 0xe0a2c000
ipt_state 5825 9 - Live 0xe0a20000
ip_conntrack 46249 1 ipt_state, Live 0xe0b54000
iptable_filter 6849 1 - Live 0xe09e3000
ip_tables 25025 6 
ipt_multiport,ipt_ULOG,ipt_REJECT,ipt_limit,ipt_state,iptable_filter, Live 
0xe0b3d000
parport_pc 30981 1 - Live 0xe0b0b000
lp 16713 0 - Live 0xe0a23000
parport 38793 2 parport_pc,lp, Live 0xe0af4000
autofs4 22725 11 - Live 0xe09ed000
nfs 196745 7 - Live 0xe0b77000
lockd 64745 2 nfs, Live 0xe09fa000
sunrpc 138629 3 nfs,lockd, Live 0xe0b1a000
dm_mod 59109 0 - Live 0xe0a0d000
video 19653 0 - Live 0xe09f4000
button 8001 0 - Live 0xe09aa000
battery 13253 0 - Live 0xe09e8000
ac 8773 0 - Live 0xe09a6000
md5 8001 1 - Live 0xe099d000
ipv6 262913 20 - Live 0xe0a30000
uhci_hcd 35665 0 - Live 0xe09d9000
ehci_hcd 37709 0 - Live 0xe0965000
hw_random 9429 0 - Live 0xe0961000
i2c_i801 12493 0 - Live 0xe089b000
i2c_core 25025 1 i2c_i801, Live 0xe0991000
snd_intel8x0 35585 0 - Live 0xe0935000
snd_ac97_codec 77497 1 snd_intel8x0, Live 0xe09c5000
snd_pcm_oss 53745 0 - Live 0xe090a000
snd_mixer_oss 21697 1 snd_pcm_oss, Live 0xe0894000
snd_pcm 91845 3 snd_intel8x0,snd_ac97_codec,snd_pcm_oss, Live 0xe09ad000
snd_timer 28357 1 snd_pcm, Live 0xe0861000
snd 58405 6 
snd_intel8x0,snd_ac97_codec,snd_pcm_oss,snd_mixer_oss,snd_pcm,snd_timer, Live 
0xe08fa000
soundcore 13217 1 snd, Live 0xe087a000
snd_page_alloc 13765 2 snd_intel8x0,snd_pcm, Live 0xe0875000
e1000 105517 0 - Live 0xe091a000
ext3 131529 2 - Live 0xe08d8000
jbd 61913 1 ext3, Live 0xe08a0000
ata_piix 13381 0 - Live 0xe0870000
libata 49093 1 ata_piix, Live 0xe087f000
sd_mod 22721 0 - Live 0xe0869000
scsi_mod 135561 2 libata,sd_mod, Live 0xe081a000
# cat /proc/ioports
0000-001f : dma1
0020-0021 : pic1
0040-0043 : timer0
0050-0053 : timer1
0060-006f : keyboard
0070-0077 : rtc
0080-008f : dma page reg
00a0-00a1 : pic2
00c0-00df : dma2
00f0-00ff : fpu
0170-0177 : ide1
01f0-01f7 : ide0
02f8-02ff : serial
0376-0376 : ide1
0378-037a : parport0
03c0-03df : vga+
03f6-03f6 : ide0
03f8-03ff : serial
0800-087f : 0000:00:1f.0
   0800-0803 : PM1a_EVT_BLK
   0804-0805 : PM1a_CNT_BLK
   0808-080b : PM_TMR
   0810-0815 : ACPI CPU throttle
   0828-082f : GPE0_BLK
0880-08bf : 0000:00:1f.0
0c00-0c7f : pnp 00:0a
0cf8-0cff : PCI conf1
df40-df7f : 0000:02:0c.0
   df40-df7f : e1000
eda0-edbf : 0000:00:1f.3
   eda0-edaf : i801-smbus
edc0-edff : 0000:00:1f.5
   edc0-edff : Intel ICH5
ee00-eeff : 0000:00:1f.5
   ee00-eeff : Intel ICH5
fe00-fe07 : 0000:00:1f.2
   fe00-fe07 : libata
fe10-fe13 : 0000:00:1f.2
   fe10-fe13 : libata
fe20-fe27 : 0000:00:1f.2
   fe20-fe27 : libata
fe30-fe33 : 0000:00:1f.2
   fe30-fe33 : libata
fea0-feaf : 0000:00:1f.2
   fea0-feaf : libata
ff20-ff3f : 0000:00:1d.3
   ff20-ff3f : uhci_hcd
ff40-ff5f : 0000:00:1d.2
   ff40-ff5f : uhci_hcd
ff60-ff7f : 0000:00:1d.1
   ff60-ff7f : uhci_hcd
ff80-ff9f : 0000:00:1d.0
   ff80-ff9f : uhci_hcd
ffa0-ffaf : 0000:00:1f.1
   ffa0-ffa7 : ide0
   ffa8-ffaf : ide1
# cat /proc/iomem
00000000-0009ffff : System RAM
000a0000-000bffff : Video RAM area
000c0000-000cf7ff : Video ROM
000cf800-000cffff : Adapter ROM
000f0000-000fffff : System ROM
00100000-1ff73fff : System RAM
   00100000-003027ae : Kernel code
   003027af-003c2b7f : Kernel data
1ff74000-1ff75fff : ACPI Non-volatile Storage
1ff76000-1ff96fff : ACPI Tables
1ff97000-1fffffff : reserved
e8000000-efffffff : 0000:00:00.0
f0000000-f7ffffff : PCI Bus #01
   f0000000-f7ffffff : 0000:01:00.0
fcfe0000-fcffffff : 0000:02:0c.0
   fcfe0000-fcffffff : e1000
fd000000-feafffff : PCI Bus #01
   fd000000-fdffffff : 0000:01:00.0
febff900-febff9ff : 0000:00:1f.5
   febff900-febff9ff : Intel ICH5
febffa00-febffbff : 0000:00:1f.5
   febffa00-febffbff : Intel ICH5
febffc00-febfffff : 0000:00:1f.1
fec00000-fec0ffff : reserved
fecf0000-fecf0fff : reserved
fed20000-fed8ffff : reserved
fee00000-fee0ffff : reserved
ffa80800-ffa80bff : 0000:00:1d.7
   ffa80800-ffa80bff : ehci_hcd
ffb00000-ffffffff : reserved
# cat /proc/scsi/scsi
Attached devices:
# lspci -vvv
00:00.0 Host bridge: Intel Corporation 82875P/E7210 Memory Controller Hub (rev 
02)
         Subsystem: Dell: Unknown device 0156
         Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR+ FastB2B-
         Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- 
<TAbort- <MAbort+ >SERR- <PERR-
         Latency: 0
         Region 0: Memory at e8000000 (32-bit, prefetchable) [size=128M]
         Capabilities: <available only to root>

00:01.0 PCI bridge: Intel Corporation 82875P Processor to AGP Controller (rev 
02) (prog-if 00 [Normal decode])
         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR+ FastB2B-
         Status: Cap- 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- 
<TAbort- <MAbort- >SERR- <PERR-
         Latency: 64
         Bus: primary=00, secondary=01, subordinate=01, sec-latency=64
         I/O behind bridge: 0000f000-00000fff
         Memory behind bridge: fd000000-feafffff
         Prefetchable memory behind bridge: f0000000-f7ffffff
         Secondary status: 66Mhz+ FastB2B+ ParErr- DEVSEL=medium >TAbort- 
<TAbort- <MAbort+ <SERR- <PERR-
         BridgeCtl: Parity- SERR- NoISA- VGA+ MAbort- >Reset- FastB2B-

00:1d.0 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI 
Controller #1 (rev 02) (prog-if 00 [UHCI])
         Subsystem: Dell: Unknown device 0156
         Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B-
         Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- 
<TAbort- <MAbort- >SERR- <PERR-
         Latency: 0
         Interrupt: pin A routed to IRQ 193
         Region 4: I/O ports at ff80 [size=32]

00:1d.1 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI 
Controller #2 (rev 02) (prog-if 00 [UHCI])
         Subsystem: Dell: Unknown device 0156
         Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B-
         Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- 
<TAbort- <MAbort- >SERR- <PERR-
         Latency: 0
         Interrupt: pin B routed to IRQ 201
         Region 4: I/O ports at ff60 [size=32]

00:1d.2 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI #3 
(rev 02) (prog-if 00 [UHCI])
         Subsystem: Dell: Unknown device 0156
         Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B-
         Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- 
<TAbort- <MAbort- >SERR- <PERR-
         Latency: 0
         Interrupt: pin C routed to IRQ 169
         Region 4: I/O ports at ff40 [size=32]

00:1d.3 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI 
Controller #4 (rev 02) (prog-if 00 [UHCI])
         Subsystem: Dell: Unknown device 0156
         Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B-
         Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- 
<TAbort- <MAbort- >SERR- <PERR-
         Latency: 0
         Interrupt: pin A routed to IRQ 193
         Region 4: I/O ports at ff20 [size=32]

00:1d.7 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB2 EHCI 
Controller (rev 02) (prog-if 20 [EHCI])
         Subsystem: Dell: Unknown device 0156
         Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR+ FastB2B-
         Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- 
<TAbort- <MAbort- >SERR- <PERR-
         Latency: 0
         Interrupt: pin D routed to IRQ 185
         Region 0: Memory at ffa80800 (32-bit, non-prefetchable) [size=1K]
         Capabilities: <available only to root>

00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev c2) (prog-if 00 
[Normal decode])
         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR+ FastB2B-
         Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- 
<TAbort- <MAbort- >SERR- <PERR-
         Latency: 0
         Bus: primary=00, secondary=02, subordinate=02, sec-latency=32
         I/O behind bridge: 0000d000-0000dfff
         Memory behind bridge: fcf00000-fcffffff
         Prefetchable memory behind bridge: fff00000-000fffff
         Secondary status: 66Mhz- FastB2B+ ParErr- DEVSEL=medium >TAbort- 
<TAbort- <MAbort+ <SERR- <PERR-
         BridgeCtl: Parity- SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-

00:1f.0 ISA bridge: Intel Corporation 82801EB/ER (ICH5/ICH5R) LPC Interface 
Bridge (rev 02)
         Control: I/O+ Mem+ BusMaster+ SpecCycle+ MemWINV- VGASnoop- ParErr- 
Stepping- SERR+ FastB2B-
         Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- 
<TAbort- <MAbort- >SERR- <PERR-
         Latency: 0

00:1f.1 IDE interface: Intel Corporation 82801EB/ER (ICH5/ICH5R) IDE 
Controller (rev 02) (prog-if 8a [Master SecP PriP])
         Subsystem: Dell: Unknown device 0156
         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B-
         Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- 
<TAbort- <MAbort- >SERR- <PERR-
         Latency: 0
         Interrupt: pin A routed to IRQ 169
         Region 0: I/O ports at <ignored>
         Region 1: I/O ports at <ignored>
         Region 2: I/O ports at <ignored>
         Region 3: I/O ports at <ignored>
         Region 4: I/O ports at ffa0 [size=16]
         Region 5: Memory at febffc00 (32-bit, non-prefetchable) [size=1K]

00:1f.2 IDE interface: Intel Corporation 82801EB (ICH5) SATA Controller (rev 
02) (prog-if 8f [Master SecP SecO PriP PriO])
         Subsystem: Dell: Unknown device 0156
         Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B-
         Status: Cap- 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- 
<TAbort- <MAbort- >SERR- <PERR-
         Latency: 0
         Interrupt: pin A routed to IRQ 169
         Region 0: I/O ports at fe00 [size=8]
         Region 1: I/O ports at fe10 [size=4]
         Region 2: I/O ports at fe20 [size=8]
         Region 3: I/O ports at fe30 [size=4]
         Region 4: I/O ports at fea0 [size=16]

00:1f.3 SMBus: Intel Corporation 82801EB/ER (ICH5/ICH5R) SMBus Controller (rev 
02)
         Subsystem: Dell: Unknown device 0156
         Control: I/O+ Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B-
         Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- 
<TAbort- <MAbort- >SERR- <PERR-
         Interrupt: pin B routed to IRQ 10
         Region 4: I/O ports at eda0 [size=32]

00:1f.5 Multimedia audio controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) 
AC'97 Audio Controller (rev 02)
         Subsystem: Dell: Unknown device 0156
         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B-
         Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- 
<TAbort- <MAbort- >SERR- <PERR-
         Latency: 0
         Interrupt: pin B routed to IRQ 177
         Region 0: I/O ports at ee00 [size=256]
         Region 1: I/O ports at edc0 [size=64]
         Region 2: Memory at febffa00 (32-bit, non-prefetchable) [size=512]
         Region 3: Memory at febff900 (32-bit, non-prefetchable) [size=256]
         Capabilities: <available only to root>

01:00.0 VGA compatible controller: nVidia Corporation NV34GL [Quadro FX 
500/600 PCI] (rev a1) (prog-if 00 [VGA])
         Subsystem: nVidia Corporation: Unknown device 01ba
         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B-
         Status: Cap+ 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- 
<TAbort- <MAbort- >SERR- <PERR-
         Latency: 64 (1250ns min, 250ns max)
         Interrupt: pin A routed to IRQ 11
         Region 0: Memory at fd000000 (32-bit, non-prefetchable) [size=16M]
         Region 1: Memory at f0000000 (32-bit, prefetchable) [size=128M]
         Expansion ROM at fea00000 [disabled] [size=128K]
         Capabilities: <available only to root>

02:0c.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet 
Controller (rev 02)
         Subsystem: Dell: Unknown device 0156
         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- 
Stepping- SERR+ FastB2B-
         Status: Cap+ 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- 
<TAbort- <MAbort- >SERR- <PERR-
         Latency: 64 (63750ns min), Cache Line Size 10
         Interrupt: pin A routed to IRQ 169
         Region 0: Memory at fcfe0000 (32-bit, non-prefetchable) [size=128K]
         Region 2: I/O ports at df40 [size=64]
         Capabilities: <available only to root>

# 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: PROBLEM: pthread-safety bug in write(2) on Linux 2.6.x
  2006-04-13  1:45 Dan Bonachea
@ 2006-04-13  2:10 ` Alistair John Strachan
  2006-04-13  4:46 ` Andrew Morton
  2010-12-19 22:45 ` Jens Moser
  2 siblings, 0 replies; 24+ messages in thread
From: Alistair John Strachan @ 2006-04-13  2:10 UTC (permalink / raw)
  To: Dan Bonachea; +Cc: linux-kernel

On Thursday 13 April 2006 02:45, Dan Bonachea wrote:
> Hi - I believe I've discovered a thread-safety bug in the Linux 2.6.x
> kernel implementation of write(2).

Hi, let's get the obvious out of the way first.

Is this reproducible on a current -stable, non-vendor kernel?

-- 
Cheers,
Alistair.

'No sense being pessimistic, it probably wouldn't work anyway.'
Third year Computer Science undergraduate.
1F2 55 South Clerk Street, Edinburgh, UK.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: PROBLEM: pthread-safety bug in write(2) on Linux 2.6.x
  2006-04-13  1:45 Dan Bonachea
  2006-04-13  2:10 ` Alistair John Strachan
@ 2006-04-13  4:46 ` Andrew Morton
  2006-04-13  5:33   ` Nick Piggin
  2006-04-13  9:18   ` Dan Bonachea
  2010-12-19 22:45 ` Jens Moser
  2 siblings, 2 replies; 24+ messages in thread
From: Andrew Morton @ 2006-04-13  4:46 UTC (permalink / raw)
  To: Dan Bonachea; +Cc: linux-kernel

Dan Bonachea <bonachead@comcast.net> wrote:
>
> Hi - I believe I've discovered a thread-safety bug in the Linux 2.6.x kernel 
>  implementation of write(2).
> 
>  The small C program below the problem - in a nutshell, if multiple threads 
>  write() to STDOUT_FILENO, and stdout has been redirected to a file, then some 
>  of the output lines get lost. The actual result is non-deterministic (even in 
>  a "correct" run) - however the expected correct behavior is 10 lines of output 
>  (in some non-deterministic order). However, the test program reproducibly 
>  generates some lost output (less than 10 lines of total output) on every run 
>  where the output is redirected to a new file. This appears to be a violation 
>  of the POSIX spec (POSIX 1003.1-2001:2.9.1 requires thread-safety of write()). 
> 
> 
>  The problem does not appear to occur if output goes to the console, or is 
>  redirected to append to an existing file, only when stdout is redirected to a 
>  new file.

This comes up occasionally.  Is very simple:

asmlinkage ssize_t sys_write(unsigned int fd, const char __user * buf, size_t count)
{
	struct file *file;
	ssize_t ret = -EBADF;
	int fput_needed;

	file = fget_light(fd, &fput_needed);
	if (file) {
		loff_t pos = file_pos_read(file);
		ret = vfs_write(file, buf, count, &pos);
		file_pos_write(file, pos);
		fput_light(file, fput_needed);
	}

	return ret;
}

we can have multiple threads in vfs_write() using the same `pos'.

Locking for file.f_pos is generally file->f_dentry->d_inode->i_mutex.  We
could use that if we were to restructure the code a lot.  Or we could add a
new lock to `struct file'.

Or we could do nothing, because a) the application is going to produce
inderterminate output anyway and b) because it only affects silly testcases
and not real-world apps.

OK, there _might_ be a real-world case: threads appending logging
information to a flat file.  Trivially workable-around with a userspace
lock, or by switching to stdio (same thing).

Yes, really we should fix it.  But it's not worth adding more overhead to
do so.  So the fix would involve widespread (but simple) change, to draw
that f_pos update inside i_mutex.

(We could pseudo-fix it by updating f_pos _before_ entering vfs_write(), too).

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: PROBLEM: pthread-safety bug in write(2) on Linux 2.6.x
  2006-04-13  4:46 ` Andrew Morton
@ 2006-04-13  5:33   ` Nick Piggin
  2006-04-13 15:01     ` Linus Torvalds
  2006-04-13  9:18   ` Dan Bonachea
  1 sibling, 1 reply; 24+ messages in thread
From: Nick Piggin @ 2006-04-13  5:33 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Dan Bonachea, linux-kernel, Linus Torvalds

Andrew Morton wrote:

> OK, there _might_ be a real-world case: threads appending logging
> information to a flat file.  Trivially workable-around with a userspace
> lock, or by switching to stdio (same thing).
> 
> Yes, really we should fix it.  But it's not worth adding more overhead to
> do so.  So the fix would involve widespread (but simple) change, to draw
> that f_pos update inside i_mutex.

Didn't Linus explicitly made the decision not to add synchronisation for
writes with the same file?

http://groups.google.com/group/fa.linux.kernel/msg/ef66c762e737bab7?hl=en&
Is the closest I could find, but I'm sure he said something similar,
specifically about write(2) vs write(2).

-- 
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: PROBLEM: pthread-safety bug in write(2) on Linux 2.6.x
  2006-04-13  4:46 ` Andrew Morton
  2006-04-13  5:33   ` Nick Piggin
@ 2006-04-13  9:18   ` Dan Bonachea
  2006-04-13  9:56     ` Andrew Morton
  2006-04-13 14:14     ` Jan Engelhardt
  1 sibling, 2 replies; 24+ messages in thread
From: Dan Bonachea @ 2006-04-13  9:18 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Nick Piggin, Linus Torvalds

At 09:46 PM 4/12/2006, Andrew Morton wrote:
>Locking for file.f_pos is generally file->f_dentry->d_inode->i_mutex.  We
>could use that if we were to restructure the code a lot.  Or we could add a
>new lock to `struct file'.
>
>Or we could do nothing, because a) the application is going to produce
>inderterminate output anyway and b) because it only affects silly testcases
>and not real-world apps.
>
>OK, there _might_ be a real-world case: threads appending logging
>information to a flat file.  Trivially workable-around with a userspace
>lock, or by switching to stdio (same thing).
>
>Yes, really we should fix it.  But it's not worth adding more overhead to
>do so.  So the fix would involve widespread (but simple) change, to draw
>that f_pos update inside i_mutex.

Hi Andrew - thanks for the detailed response.

I don't know enough about the kernel implementation to comment on your 
proposed fixes.

However, I should clarify that this problem definitely affects more than just 
"silly testcases", and the fact that a program generates non-deterministically 
ordered output does not necessarily make it erroneous, invalid or unuseful.

This problem arose in the parallel runtime system for a scientific language 
compiler (nearly a million lines of code total - definitely a "real-world" 
program) - the example code is merely a pared-down demonstration of the 
problem. In parallel scientific computing, it's very common for many threads 
to be writing to stdout (usually for monitoring purposes) and it's expected 
and normal for output from separate threads to be arbitrarily interleaved, but 
it's *not* ok for output to be lost entirely. This is essentially equivalent 
to the real-world example you gave of many threads logging to a file.

We've worked around the problem in Linux 2.6 by adding locking at user-level 
around our writes, as you suggest, although this of course penalizes our 
performance on kernels that already correctly implement the thread-safety 
required by the POSIX spec. In any case it seemed like a problem that we 
should report, to be good open-source citizens - especially given that it 
appears to be a regression with respect to the Linux 2.4 kernel. How you 
choose to handle the report is of course your decision.

Thanks for your time.
Dan


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: PROBLEM: pthread-safety bug in write(2) on Linux 2.6.x
  2006-04-13  9:18   ` Dan Bonachea
@ 2006-04-13  9:56     ` Andrew Morton
  2006-04-13 10:28       ` Kyle Moffett
  2006-04-13 14:14     ` Jan Engelhardt
  1 sibling, 1 reply; 24+ messages in thread
From: Andrew Morton @ 2006-04-13  9:56 UTC (permalink / raw)
  To: Dan Bonachea; +Cc: linux-kernel, nickpiggin, torvalds

Dan Bonachea <bonachead@comcast.net> wrote:
>
> This problem arose in the parallel runtime system for a scientific language 
>  compiler (nearly a million lines of code total - definitely a "real-world" 
>  program) - the example code is merely a pared-down demonstration of the 
>  problem. In parallel scientific computing, it's very common for many threads 
>  to be writing to stdout (usually for monitoring purposes) and it's expected 
>  and normal for output from separate threads to be arbitrarily interleaved, but 
>  it's *not* ok for output to be lost entirely. This is essentially equivalent 
>  to the real-world example you gave of many threads logging to a file.

Interesting - afaik that's the first time this has been hit in a real
application.

>  We've worked around the problem in Linux 2.6 by adding locking at user-level 
>  around our writes, as you suggest, although this of course penalizes our 
>  performance on kernels that already correctly implement the thread-safety 
>  required by the POSIX spec. In any case it seemed like a problem that we 
>  should report, to be good open-source citizens - especially given that it 
>  appears to be a regression with respect to the Linux 2.4 kernel. How you 
>  choose to handle the report is of course your decision.

yup, thanks.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: PROBLEM: pthread-safety bug in write(2) on Linux 2.6.x
  2006-04-13  9:56     ` Andrew Morton
@ 2006-04-13 10:28       ` Kyle Moffett
  0 siblings, 0 replies; 24+ messages in thread
From: Kyle Moffett @ 2006-04-13 10:28 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Dan Bonachea, linux-kernel, nickpiggin, torvalds

On Apr 13, 2006, at 05:56:08, Andrew Morton wrote:
> Dan Bonachea <bonachead@comcast.net> wrote:
>>
>> This problem arose in the parallel runtime system for a scientific  
>> language compiler (nearly a million lines of code total -  
>> definitely a "real-world" program) - the example code is merely a  
>> pared-down demonstration of the problem. In parallel scientific  
>> computing, it's very common for many threads to be writing to  
>> stdout (usually for monitoring purposes) and it's expected and  
>> normal for output from separate threads to be arbitrarily  
>> interleaved, but it's *not* ok for output to be lost entirely.  
>> This is essentially equivalent to the real-world example you gave  
>> of many threads logging to a file.
>
> Interesting - afaik that's the first time this has been hit in a  
> real application.

I would guess that it could also be a problem with a wide variety of  
Perl CGIs running under Apache2+mod_perl with the worker threading  
model.  In fact this may actually explain some odd behavior I got  
from one such module that I "fixed" by switching to logging via  
syslog.  I don't remember which module it was or what the exact  
problem was, but I think it seemed similar in nature.  That is  
somewhat of a surprising and nonintuitive failure mode for logging,  
IMHO it would be nice to get fixed.  I would imagine that this has  
probably been hit a number of times before, mysteriously fixed by  
changed userspace locking or subtle thread ordering, and written off  
as the mysterious effects of cosmic rays on RAM.

Cheers,
Kyle Moffett


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: PROBLEM: pthread-safety bug in write(2) on Linux 2.6.x
  2006-04-13  9:18   ` Dan Bonachea
  2006-04-13  9:56     ` Andrew Morton
@ 2006-04-13 14:14     ` Jan Engelhardt
  1 sibling, 0 replies; 24+ messages in thread
From: Jan Engelhardt @ 2006-04-13 14:14 UTC (permalink / raw)
  To: Dan Bonachea; +Cc: Andrew Morton, linux-kernel, Nick Piggin, Linus Torvalds

>
> I don't know enough about the kernel implementation to comment on your proposed
> fixes.
>
> However, I should clarify that this problem definitely affects more than just
> "silly testcases", and the fact that a program generates non-deterministically
> ordered output does not necessarily make it erroneous, invalid or unuseful.
>
For example syslogd (in the rare event that it does not use stdio). Or any 
kind of logfile daemon that just dumps incoming events to a file that needs 
user analysis afterwards anyway.


Jan Engelhardt
-- 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: PROBLEM: pthread-safety bug in write(2) on Linux 2.6.x
  2006-04-13  5:33   ` Nick Piggin
@ 2006-04-13 15:01     ` Linus Torvalds
  2006-04-13 15:28       ` Linus Torvalds
  2006-04-13 21:50       ` Alan Cox
  0 siblings, 2 replies; 24+ messages in thread
From: Linus Torvalds @ 2006-04-13 15:01 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Andrew Morton, Dan Bonachea, linux-kernel



On Thu, 13 Apr 2006, Nick Piggin wrote:
> 
> Didn't Linus explicitly made the decision not to add synchronisation for
> writes with the same file?

I would be _very_ nervous to do it, yes. In particular, I do not consider 
it at all unlikely to have something database-like that does concurrent 
writes to the same file at different offsets, and serializing them on the 
VFS layer seems pretty broken. 

Also, doing it unconditionally in the VFS layer is actually pretty 
seriously broken: the VFS layer doesn't even know what kind of file it is, 
and locking on writes can be literally the wrong thing for some file 
descriptors (think pipes or sockets: if we have one blocking write holding 
the lock, that should _not_ imply that other - possibly nonblocking - 
writes shouldn't be able to call in to the low-level write handler).

That said, I wouldn't be 100% against it, especially under certain 
circumstances. However, the circumstances when I think it might be 
acceptable are fairly specific:

 - when we use f_pos (ie we'd synchronize write against write, but only 
   per "struct file", not on an inode basis)
 - only for files that are marked seekable with FMODE_LSEEK (thus avoiding 
   the stream-like objects like pipes and sockets)

Under those two circumstances, I'd certainly be ok with it, and I think we 
could argue that it is a "good thing". It would be a "f_pos" lock (so we 
migt do it for reads too), not a "data lock".

Comments?

		Linus

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: PROBLEM: pthread-safety bug in write(2) on Linux 2.6.x
  2006-04-13 15:01     ` Linus Torvalds
@ 2006-04-13 15:28       ` Linus Torvalds
  2006-04-14 10:20         ` Nikita Danilov
  2006-04-13 21:50       ` Alan Cox
  1 sibling, 1 reply; 24+ messages in thread
From: Linus Torvalds @ 2006-04-13 15:28 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Andrew Morton, Dan Bonachea, linux-kernel



On Thu, 13 Apr 2006, Linus Torvalds wrote:
> 
> That said, I wouldn't be 100% against it, especially under certain 
> circumstances. However, the circumstances when I think it might be 
> acceptable are fairly specific:
> 
>  - when we use f_pos (ie we'd synchronize write against write, but only 
>    per "struct file", not on an inode basis)
>  - only for files that are marked seekable with FMODE_LSEEK (thus avoiding 
>    the stream-like objects like pipes and sockets)
> 
> Under those two circumstances, I'd certainly be ok with it, and I think we 
> could argue that it is a "good thing". It would be a "f_pos" lock (so we 
> migt do it for reads too), not a "data lock".
> 
> Comments?

Something like this.

NOTE NOTE NOTE! Untested!

		Linus
---
diff --git a/fs/file_table.c b/fs/file_table.c
index bcea199..48728cc 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -120,6 +120,7 @@ struct file *get_empty_filp(void)
 	f->f_uid = tsk->fsuid;
 	f->f_gid = tsk->fsgid;
 	eventpoll_init_file(f);
+	mutex_init(&f->f_pos_lock);
 	/* f->f_version: 0 */
 	return f;
 
diff --git a/fs/read_write.c b/fs/read_write.c
index 5bc0e92..ad9db26 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -329,14 +329,23 @@ ssize_t vfs_write(struct file *file, con
 
 EXPORT_SYMBOL(vfs_write);
 
-static inline loff_t file_pos_read(struct file *file)
+/*
+ * FMODE_LSEEK will never change for a file during its lifetime,
+ * so it's ok to test it independently on the lock/unlock path
+ * rather than explicitly remembering whether we locked it.
+ */
+static inline loff_t file_pos_read_lock(struct file *file)
 {
+	if (file->f_mode & FMODE_LSEEK)
+		mutex_lock(&file->f_pos_lock);
 	return file->f_pos;
 }
 
-static inline void file_pos_write(struct file *file, loff_t pos)
+static inline void file_pos_write_unlock(struct file *file, loff_t pos)
 {
 	file->f_pos = pos;
+	if (file->f_mode & FMODE_LSEEK)
+		mutex_unlock(&file->f_pos_lock);
 }
 
 asmlinkage ssize_t sys_read(unsigned int fd, char __user * buf, size_t count)
@@ -347,9 +356,9 @@ asmlinkage ssize_t sys_read(unsigned int
 
 	file = fget_light(fd, &fput_needed);
 	if (file) {
-		loff_t pos = file_pos_read(file);
+		loff_t pos = file_pos_read_lock(file);
 		ret = vfs_read(file, buf, count, &pos);
-		file_pos_write(file, pos);
+		file_pos_write_unlock(file, pos);
 		fput_light(file, fput_needed);
 	}
 
@@ -365,9 +374,9 @@ asmlinkage ssize_t sys_write(unsigned in
 
 	file = fget_light(fd, &fput_needed);
 	if (file) {
-		loff_t pos = file_pos_read(file);
+		loff_t pos = file_pos_read_lock(file);
 		ret = vfs_write(file, buf, count, &pos);
-		file_pos_write(file, pos);
+		file_pos_write_unlock(file, pos);
 		fput_light(file, fput_needed);
 	}
 
@@ -603,9 +612,9 @@ sys_readv(unsigned long fd, const struct
 
 	file = fget_light(fd, &fput_needed);
 	if (file) {
-		loff_t pos = file_pos_read(file);
+		loff_t pos = file_pos_read_lock(file);
 		ret = vfs_readv(file, vec, vlen, &pos);
-		file_pos_write(file, pos);
+		file_pos_write_unlock(file, pos);
 		fput_light(file, fput_needed);
 	}
 
@@ -624,9 +633,9 @@ sys_writev(unsigned long fd, const struc
 
 	file = fget_light(fd, &fput_needed);
 	if (file) {
-		loff_t pos = file_pos_read(file);
+		loff_t pos = file_pos_read_lock(file);
 		ret = vfs_writev(file, vec, vlen, &pos);
-		file_pos_write(file, pos);
+		file_pos_write_unlock(file, pos);
 		fput_light(file, fput_needed);
 	}
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 162c6e5..1d58cd0 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -643,6 +643,7 @@ struct file {
 	loff_t			f_pos;
 	struct fown_struct	f_owner;
 	unsigned int		f_uid, f_gid;
+	struct mutex		f_pos_lock;
 	struct file_ra_state	f_ra;
 
 	unsigned long		f_version;

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: PROBLEM: pthread-safety bug in write(2) on Linux 2.6.x
  2006-04-13 15:01     ` Linus Torvalds
  2006-04-13 15:28       ` Linus Torvalds
@ 2006-04-13 21:50       ` Alan Cox
  2006-04-13 22:06         ` Dan Bonachea
  2006-04-13 22:40         ` Linus Torvalds
  1 sibling, 2 replies; 24+ messages in thread
From: Alan Cox @ 2006-04-13 21:50 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Nick Piggin, Andrew Morton, Dan Bonachea, linux-kernel

On Iau, 2006-04-13 at 08:01 -0700, Linus Torvalds wrote:
> That said, I wouldn't be 100% against it, especially under certain 
> circumstances. However, the circumstances when I think it might be 
> acceptable are fairly specific:
> 
>  - when we use f_pos (ie we'd synchronize write against write, but only 
>    per "struct file", not on an inode basis)
>  - only for files that are marked seekable with FMODE_LSEEK (thus avoiding 
>    the stream-like objects like pipes and sockets)
> 
> Under those two circumstances, I'd certainly be ok with it, and I think we 
> could argue that it is a "good thing". It would be a "f_pos" lock (so we 
> migt do it for reads too), not a "data lock".

The only serious case historically has been O_APPEND which does have
pretty precise semantics. Nowdays we also have pread/pwrite which have
pretty clear semantics and deal with threading. The O_APPEND case is
very important to get correct and 2.4 certainly did so.


Looking at the spec it says the following

"If the O_APPEND flag of the file status flags is set, the file offset
shall be set to the end of the file prior to each write and no
intervening file modification operation shall occur between changing the
file offset and the write operation."

This is what 2.4 took great paints to guarantee for file writes. Now in
actual fact no OS I know of implements this to the extreme (mmap) but
does for the other cases.

Outside of O_APPEND the specification says only that
- The write starts at the file position
- The file position is updated before the syscall returns

It makes no other guarantee I can see.


As such I belive that the O_APPEND case must be kept locked properly and
the non O_APPEND cases are already correctly handled by the kernel. That
seems to argue for f_pos serialization on O_APPEND only.

Alan









^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: PROBLEM: pthread-safety bug in write(2) on Linux 2.6.x
  2006-04-13 21:50       ` Alan Cox
@ 2006-04-13 22:06         ` Dan Bonachea
  2006-04-13 23:03           ` Linus Torvalds
  2006-04-13 23:11           ` Alan Cox
  2006-04-13 22:40         ` Linus Torvalds
  1 sibling, 2 replies; 24+ messages in thread
From: Dan Bonachea @ 2006-04-13 22:06 UTC (permalink / raw)
  To: Alan Cox, Linus Torvalds; +Cc: Nick Piggin, Andrew Morton, linux-kernel

At 02:50 PM 4/13/2006, Alan Cox wrote:
>The only serious case historically has been O_APPEND which does have
>pretty precise semantics. Nowdays we also have pread/pwrite which have
>pretty clear semantics and deal with threading. The O_APPEND case is
>very important to get correct and 2.4 certainly did so.
...
>As such I belive that the O_APPEND case must be kept locked properly and
>the non O_APPEND cases are already correctly handled by the kernel. That
>seems to argue for f_pos serialization on O_APPEND only.

I agree that O_APPEND is important to get correct. However, would that also 
handle the specific case of stdout redirection to a file? (ie "a.out > 
result", not just "a.out >> result" - the latter incidentally does not 
currently seem to fail, at least with my tests)

>Looking at the spec it says the following
>
>"If the O_APPEND flag of the file status flags is set, the file offset
>shall be set to the end of the file prior to each write and no
>intervening file modification operation shall occur between changing the
>file offset and the write operation."
>
>This is what 2.4 took great paints to guarantee for file writes. Now in
>actual fact no OS I know of implements this to the extreme (mmap) but
>does for the other cases.
>
>Outside of O_APPEND the specification says only that
>- The write starts at the file position
>- The file position is updated before the syscall returns
>
>It makes no other guarantee I can see.

The POSIX 1003.1-2001 spec seems to provide a very clear guarantee of 
thread-safety behavior wrt threads:

2.9.1 Thread-Safety
All functions defined by this volume of IEEE Std 1003.1-2001 shall be 
thread-safe, except that the following functions need not be thread-safe.
    <omitted list of functions, which does not include write>
Implementations shall provide internal synchronization as necessary in order 
to satisfy this
requirement.
...
2.9.7 Thread Interactions with Regular File Operations
All of the functions chmod( ), close( ), fchmod( ), fcntl( ), fstat( ), 
ftruncate( ), lseek( ), open( ), read( ), readlink( ), stat( ), symlink( ), 
and write( ) shall be atomic with respect to each other in the effects 
specified in IEEE Std 1003.1-2001 when they operate on regular files. If two 
threads each call one of these functions, each call shall either see all of 
the specified effects of the other call, or none of them.

Unless I'm missing something, that doesn't leave much ambiguity regarding 
what's required for POSIX compliance on this issue (although I'm not sure 
POSIX compliance is the right metric).

Dan


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: PROBLEM: pthread-safety bug in write(2) on Linux 2.6.x
  2006-04-13 21:50       ` Alan Cox
  2006-04-13 22:06         ` Dan Bonachea
@ 2006-04-13 22:40         ` Linus Torvalds
  2006-04-13 23:05           ` Alan Cox
  1 sibling, 1 reply; 24+ messages in thread
From: Linus Torvalds @ 2006-04-13 22:40 UTC (permalink / raw)
  To: Alan Cox; +Cc: Nick Piggin, Andrew Morton, Dan Bonachea, linux-kernel



On Thu, 13 Apr 2006, Alan Cox wrote:
> 
> The only serious case historically has been O_APPEND which does have
> pretty precise semantics. Nowdays we also have pread/pwrite which have
> pretty clear semantics and deal with threading. The O_APPEND case is
> very important to get correct and 2.4 certainly did so.

pread/pwrite automatically is safe.

Our O_APPEND handling should be safe - although since we do it at a FS 
level it actually depends on the filesystem itself. Most (all that use the 
generic routines at least) filesystems will get the inode semaphore for 
writing, and do the position handling within that semaphore.

So we follow the specs, but..

> Outside of O_APPEND the specification says only that
> - The write starts at the file position
> - The file position is updated before the syscall returns
> 
> It makes no other guarantee I can see.

Right. I think this is purely a "quality of implementation" issue. We 
already follow the spec, the question is whether we want to be better than 
that.

> As such I belive that the O_APPEND case must be kept locked properly and
> the non O_APPEND cases are already correctly handled by the kernel. That
> seems to argue for f_pos serialization on O_APPEND only.

f_pos doesn't really matter for O_APPEND, since we'll ignore it, and use 
the file size as the position. Which is why the patch I sent out doesn't 
matter (and which is why we already get O_APPEND right - we check the file 
size within the inode semaphore/mutex).

		Linus

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: PROBLEM: pthread-safety bug in write(2) on Linux 2.6.x
  2006-04-13 22:06         ` Dan Bonachea
@ 2006-04-13 23:03           ` Linus Torvalds
  2006-04-13 23:11           ` Alan Cox
  1 sibling, 0 replies; 24+ messages in thread
From: Linus Torvalds @ 2006-04-13 23:03 UTC (permalink / raw)
  To: Dan Bonachea; +Cc: Alan Cox, Nick Piggin, Andrew Morton, linux-kernel



On Thu, 13 Apr 2006, Dan Bonachea wrote:
>
> 2.9.7 Thread Interactions with Regular File Operations
> All of the functions chmod( ), close( ), fchmod( ), fcntl( ), fstat( ),
> ftruncate( ), lseek( ), open( ), read( ), readlink( ), stat( ), symlink( ),
> and write( ) shall be atomic with respect to each other in the effects
> specified in IEEE Std 1003.1-2001 when they operate on regular files. If two
> threads each call one of these functions, each call shall either see all of
> the specified effects of the other call, or none of them.
> 
> Unless I'm missing something, that doesn't leave much ambiguity regarding
> what's required for POSIX compliance on this issue (although I'm not sure
> POSIX compliance is the right metric).

Interesting. That's certainly not something we've guaranteed. My suggested 
patch makes read/write on the same file descriptor atomic wrt each other, 
but does not serialize them wrt different file descriptors (even if it's 
the same file). So you could see "half a write" from another read, for 
example, or fstat() could easily return the half-way file size for a write 
that hasn't completed.

Quite frankly, being totally atomic wrt each other seems unreasonable. 
Having fstat take the inode lock between threads is insane. It's 
definitely not even something you'd _want_ between processes (ie if 
somebody is doing a 2GB write, you want to be able to do an "ls" to see 
the file grow!), doing it between threads would just be insane.

Or maybe the "effects specified in IEEE Std 1003.1-2001" is really about 
the _other_ effects (ie things like write clearing SUID bits, atime/mtime 
etc).

So it smells like POSIX is broken here, in that it expects bad behaviour 
that isn't sane. But a subset of it would potentially be sane (ie exactly 
that write/write or write/read atomicity on the same file _descriptor_, 
exactly because they share f_pos).

		Linus

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: PROBLEM: pthread-safety bug in write(2) on Linux 2.6.x
  2006-04-13 22:40         ` Linus Torvalds
@ 2006-04-13 23:05           ` Alan Cox
  2006-04-13 23:06             ` Linus Torvalds
  0 siblings, 1 reply; 24+ messages in thread
From: Alan Cox @ 2006-04-13 23:05 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Nick Piggin, Andrew Morton, Dan Bonachea, linux-kernel

On Iau, 2006-04-13 at 15:40 -0700, Linus Torvalds wrote:
> > Outside of O_APPEND the specification says only that
> > - The write starts at the file position
> > - The file position is updated before the syscall returns
> > 
> > It makes no other guarantee I can see.
> 
> Right. I think this is purely a "quality of implementation" issue. We 
> already follow the spec, the question is whether we want to be better than 
> that.

Quality for whom ? There is a measurable cost to all that extra locking
which will hurt everyone. Given existing kernels don't make the
guarantee and SuS v3 does not make the guarantee the apps that need it
will continue to do the extra work themselves anyway.

I'd say the existing approach is the best quality of implementation for
those needing performance and that the cost for those needing ordering
guarantees in Linux is already astoundingly low thanks to the excellent
work done on futex based posix locking in glibc. I can choose to pay the
costs today, if we do extra locking I cannot opt out.

And of course I too would like to know if anyone is hitting O_APPEND
examples of this problem and if so on what fs ....

Alan


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: PROBLEM: pthread-safety bug in write(2) on Linux 2.6.x
  2006-04-13 23:05           ` Alan Cox
@ 2006-04-13 23:06             ` Linus Torvalds
  2006-04-13 23:11               ` Linus Torvalds
  0 siblings, 1 reply; 24+ messages in thread
From: Linus Torvalds @ 2006-04-13 23:06 UTC (permalink / raw)
  To: Alan Cox; +Cc: Nick Piggin, Andrew Morton, Dan Bonachea, linux-kernel



On Fri, 14 Apr 2006, Alan Cox wrote:
> 
> Quality for whom ? There is a measurable cost to all that extra locking
> which will hurt everyone. Given existing kernels don't make the
> guarantee and SuS v3 does not make the guarantee the apps that need it
> will continue to do the extra work themselves anyway.

True. 

> And of course I too would like to know if anyone is hitting O_APPEND
> examples of this problem and if so on what fs ....

Well, Dan already said that he doesn't see it with O_APPEND (aka ">>"), 
but yes, other filesystems may not be that lucky.

I'd actually not be surprised if the O_APPEND thing is tested by some of 
the FS stress tools, so I suspect all the regular filesystems get it 
right. Of course, since most of them - at least the local ones - use the 
same generic framework, that's an even safer bet.

NFS etc is very special, of course, especially in the presense of clients 
on different machines. At that point, I wouldn't guarantee a whole lot.

		Linus

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: PROBLEM: pthread-safety bug in write(2) on Linux 2.6.x
  2006-04-13 23:06             ` Linus Torvalds
@ 2006-04-13 23:11               ` Linus Torvalds
  0 siblings, 0 replies; 24+ messages in thread
From: Linus Torvalds @ 2006-04-13 23:11 UTC (permalink / raw)
  To: Alan Cox; +Cc: Nick Piggin, Andrew Morton, Dan Bonachea, linux-kernel



On Thu, 13 Apr 2006, Linus Torvalds wrote:

> 
> 
> On Fri, 14 Apr 2006, Alan Cox wrote:
> > 
> > Quality for whom ? There is a measurable cost to all that extra locking
> > which will hurt everyone. Given existing kernels don't make the
> > guarantee and SuS v3 does not make the guarantee the apps that need it
> > will continue to do the extra work themselves anyway.
> 
> True. 

Side note: if you want the strange POSIX guarantees that Dan quoted 
(atomicity between write and fstat), you'd almost have to do it with 
user-space crapola magic locks. Doing it in the kernel would just be 
insane, you'd have to have some per-process "IO semaphore".

(Doing it in user space _also_ sounds insane, but then you could have a 
switch like "POSIX me harder, and do all the really stupid things" at 
compile time or something)

			Linus

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: PROBLEM: pthread-safety bug in write(2) on Linux 2.6.x
  2006-04-13 22:06         ` Dan Bonachea
  2006-04-13 23:03           ` Linus Torvalds
@ 2006-04-13 23:11           ` Alan Cox
  2006-04-13 23:19             ` Linus Torvalds
  1 sibling, 1 reply; 24+ messages in thread
From: Alan Cox @ 2006-04-13 23:11 UTC (permalink / raw)
  To: Dan Bonachea; +Cc: Linus Torvalds, Nick Piggin, Andrew Morton, linux-kernel

On Iau, 2006-04-13 at 15:06 -0700, Dan Bonachea wrote:
> Unless I'm missing something, that doesn't leave much ambiguity regarding 
> what's required for POSIX compliance on this issue (although I'm not sure 
> POSIX compliance is the right metric).

Interesting. That pretty much conflicts with what write(2) itself is
defined as in the same specification, and means that the locking is
specific to posix thread groups not to processes. Well we've always
known that pthreads was a brain dead screw-up of a specification so I
guess that should be no suprise.

If the locking is thread group specific it may actually be best to
handle that one in glibc with a futex lock, as only glibc really knows
what is a posix pthread app, and it would avoid the idiocy escaping into
normal applications (which are 95+% of cases)

What does Ulrich think ?


Alan


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: PROBLEM: pthread-safety bug in write(2) on Linux 2.6.x
  2006-04-13 23:11           ` Alan Cox
@ 2006-04-13 23:19             ` Linus Torvalds
  0 siblings, 0 replies; 24+ messages in thread
From: Linus Torvalds @ 2006-04-13 23:19 UTC (permalink / raw)
  To: Alan Cox; +Cc: Dan Bonachea, Nick Piggin, Andrew Morton, linux-kernel



On Fri, 14 Apr 2006, Alan Cox wrote:
> 
> Interesting. That pretty much conflicts with what write(2) itself is
> defined as in the same specification

They may mean that writes get "as much atomicity" between threads as 
specified in other places (which is not a whole lot). In which case we're 
certainly totally according to spec (since Linux has exactly the same 
guarantees for threads as for anything else - since we just don't even 
_care_ if it's a thread or not).

It may be that the extra POSIX wording comes from user-space thread 
libraries that did "magic things" with select loops etc for IO using 
non-blocking file descriptors (which, together with some latency 
guarantees, could turn a single write into a series of smaller blocked 
writes).

That would explain the POSIX wording - that they are supposed to be "as 
thread safe" as a native write, even when they are wrapped inside a magic 
threaded IO library. Maybe the "in the effects specified in IEEE Std 
1003.1-2001" part is exactly about the fact that write is _not_ actually 
specified to be totally atomic by the _normal_ POSIX stuff, but that they 
wanted to make it clear that it's supposed to be "as atomic" as it's 
supposed to be.

Hmm? Trying to be a language lawyer over a spec is always painful. I'd 
suspect that the people who wrote that part didn't even really think about 
it a lot, they just meant that they were "thread safe" in the sense that 
you can call them concurrently without the system blowing up.

		Linus

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: PROBLEM: pthread-safety bug in write(2) on Linux 2.6.x
       [not found]   ` <616k2-6Xz-27@gated-at.bofh.it>
@ 2006-04-14  9:14     ` Kai Henningsen
  0 siblings, 0 replies; 24+ messages in thread
From: Kai Henningsen @ 2006-04-14  9:14 UTC (permalink / raw)
  To: linux-kernel

bonachead@comcast.net (Dan Bonachea)  wrote on 13.04.06 in <616k2-6Xz-27@gated-at.bofh.it>:

> At 09:46 PM 4/12/2006, Andrew Morton wrote:
> >Locking for file.f_pos is generally file->f_dentry->d_inode->i_mutex.  We
> >could use that if we were to restructure the code a lot.  Or we could add a
> >new lock to `struct file'.
> >
> >Or we could do nothing, because a) the application is going to produce
> >inderterminate output anyway and b) because it only affects silly testcases
> >and not real-world apps.
> >
> >OK, there _might_ be a real-world case: threads appending logging
> >information to a flat file.  Trivially workable-around with a userspace
> >lock, or by switching to stdio (same thing).
> >
> >Yes, really we should fix it.  But it's not worth adding more overhead to
> >do so.  So the fix would involve widespread (but simple) change, to draw
> >that f_pos update inside i_mutex.
>
> Hi Andrew - thanks for the detailed response.
>
> I don't know enough about the kernel implementation to comment on your
> proposed fixes.
>
> However, I should clarify that this problem definitely affects more than
> just "silly testcases", and the fact that a program generates
> non-deterministically ordered output does not necessarily make it erroneous,
> invalid or unuseful.
>
> This problem arose in the parallel runtime system for a scientific language
> compiler (nearly a million lines of code total - definitely a "real-world"
> program) - the example code is merely a pared-down demonstration of the
> problem. In parallel scientific computing, it's very common for many threads
> to be writing to stdout (usually for monitoring purposes) and it's expected
> and normal for output from separate threads to be arbitrarily interleaved,
> but it's *not* ok for output to be lost entirely. This is essentially
> equivalent to the real-world example you gave of many threads logging to a
> file.
>
> We've worked around the problem in Linux 2.6 by adding locking at user-level
> around our writes, as you suggest, although this of course penalizes our
> performance on kernels that already correctly implement the thread-safety
> required by the POSIX spec. In any case it seemed like a problem that we
> should report, to be good open-source citizens - especially given that it
> appears to be a regression with respect to the Linux 2.4 kernel. How you
> choose to handle the report is of course your decision.

So, considering the rest of the thread, shouldn't it be enough to simply  
use O_APPEND for that use case - and, for that matter, isn't using  
O_APPEND actually completely natural there, and isn't that (modulo the  
process-thread difference) exactly what O_APPEND was invented for?

MfG Kai

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: PROBLEM: pthread-safety bug in write(2) on Linux 2.6.x
  2006-04-13 15:28       ` Linus Torvalds
@ 2006-04-14 10:20         ` Nikita Danilov
  0 siblings, 0 replies; 24+ messages in thread
From: Nikita Danilov @ 2006-04-14 10:20 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Andrew Morton, Dan Bonachea, linux-kernel

Linus Torvalds writes:
 > 

[...]

 > + * so it's ok to test it independently on the lock/unlock path
 > + * rather than explicitly remembering whether we locked it.
 > + */
 > +static inline loff_t file_pos_read_lock(struct file *file)
 >  {
 > +	if (file->f_mode & FMODE_LSEEK)
 > +		mutex_lock(&file->f_pos_lock);
 >  	return file->f_pos;
 >  }
 >  
 > -static inline void file_pos_write(struct file *file, loff_t pos)
 > +static inline void file_pos_write_unlock(struct file *file, loff_t pos)
 >  {
 >  	file->f_pos = pos;
 > +	if (file->f_mode & FMODE_LSEEK)
 > +		mutex_unlock(&file->f_pos_lock);
 >  }

Naming of two functions above is confusing: it looks like read/write
lock is taken. Maybe file_pos_lock_and_get() and
file_pos_set_and_unlock()?

Nikita.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: PROBLEM: pthread-safety bug in write(2) on Linux 2.6.x
@ 2006-04-27  9:06 Samuel Thibault
  2006-04-27 15:23 ` Linus Torvalds
  0 siblings, 1 reply; 24+ messages in thread
From: Samuel Thibault @ 2006-04-27  9:06 UTC (permalink / raw)
  To: linux-kernel, bonachead, torvalds

Hi,

I'm getting a bit late in the discussion, but this is a bug that we've
here known for quite some time now. The easy fix for getting correct
concurrent writes was to add a pipe: instead of calling

my_parallel_program > log

just call

my_parallel_program | tee log

the pipe guarantees atomicity.

Another testcase was the following program:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
  int n, num = 0, lus = 0;
  char c;

  if(argc >= 2) num = atoi(argv[1]) - 1;

  while (num) {
    if(fork() == 0) break;
    num--;
  }

  for(;;) {
    n = read(STDIN_FILENO, &c, 1);
    if(n == 1) lus++;
    else break;
  }

  printf("Proc %d read %d bytes\n", num, lus);
  return 0;
}

Running it with no parameter runs fine:

$ ls -l foo
-rwxr-xr-x 1 samy samy 8174 2006-04-07 11:21 testlocale
$ ./lire < foo
Proc 0 read 8174 bytes

Running it with a given number of processes shows the non-atomicity of
offset update, even on UP:

$ ./lire 2 < foo
Proc 0 read 3903 bytes
Proc 1 read 8174 bytes

And it is getting worse as the number of processes increase. When we
discovered this, we checked such atomicity (i.e. each byte of the file
is read exactly by one process) on several systems:

- As shown, linux completely fails.
- OS X almost succeeds: on a dual SMP system, with 10 processes on a
900KB file, 68 bytes got read by more that one process.
- OSF succeeds: on a 8-way SMP system, with 10 processes on a 50MB
file, _no_ byte got read by more than one process.
- Solaris completely fails.
- AIX completely fails.

So I guess that people should just _not_ rely on such atomicity.

Regards,
Samuel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: PROBLEM: pthread-safety bug in write(2) on Linux 2.6.x
  2006-04-27  9:06 PROBLEM: pthread-safety bug in write(2) on Linux 2.6.x Samuel Thibault
@ 2006-04-27 15:23 ` Linus Torvalds
  0 siblings, 0 replies; 24+ messages in thread
From: Linus Torvalds @ 2006-04-27 15:23 UTC (permalink / raw)
  To: Samuel Thibault; +Cc: linux-kernel, bonachead



On Thu, 27 Apr 2006, Samuel Thibault wrote:
> 
> I'm getting a bit late in the discussion, but this is a bug that we've
> here known for quite some time now. The easy fix for getting correct
> concurrent writes was to add a pipe: instead of calling
> 
> my_parallel_program > log
> 
> just call
> 
> my_parallel_program | tee log
> 
> the pipe guarantees atomicity.

Side note: the pipe only guarantees atomicity for writes smaller than 
PIPE_BUF (normally 4kB on linux).

For larger writes, you'll still see all the data exactly once, but 
individual write() calls may have their contents mixed up. Ie if two 
processes or threads each do a 8kB write, the reader might see 4kB from 
the first thread, 4kB from the second, and then the remaining 4kB from the 
first again (or any other pattern - you could see alternating bytes, 
although in practice that is obviously not what is going to happen).

And yes, that atomicity is actually guaranteed by standards, so this is 
portable, and how you should do things if you care.

		Linus

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: PROBLEM: pthread-safety bug in write(2) on Linux 2.6.x
  2006-04-13  1:45 Dan Bonachea
  2006-04-13  2:10 ` Alistair John Strachan
  2006-04-13  4:46 ` Andrew Morton
@ 2010-12-19 22:45 ` Jens Moser
  2 siblings, 0 replies; 24+ messages in thread
From: Jens Moser @ 2010-12-19 22:45 UTC (permalink / raw)
  To: linux-kernel

Dan Bonachea <bonachead <at> comcast.net> writes:

> 
> Hi - I believe I've discovered a thread-safety bug in the Linux 2.6.x kernel 
> implementation of write(2).
> 
> The small C program below the problem - in a nutshell, if multiple threads 
> write() to STDOUT_FILENO, and stdout has been redirected to a file, then some 
> of the output lines get lost. The actual result is non-deterministic (even in 
> a "correct" run) - however the expected correct behavior is 10 lines of output 
> (in some non-deterministic order). However, the test program reproducibly 
> generates some lost output (less than 10 lines of total output) on every run 
> where the output is redirected to a new file. This appears to be a violation 
> of the POSIX spec (POSIX 1003.1-2001:2.9.1 requires thread-safety of write()). 
> 
> The problem does not appear to occur if output goes to the console, or is 
> redirected to append to an existing file, only when stdout is redirected to a 
> new file.
> 
> ---------------------------------------------------------------------------
> /* Instructions:
>   * compile as: gcc write-bug.c -lpthread
>   * run with: ./a.out
>   *   note that lines from all 10 threads are visible
>   * run with: ./a.out > output
>   *   note that lines from several threads are missing from the output file
>   */
> #include <pthread.h>
> #include <stdio.h>
> #include <unistd.h>
> volatile int flag = 0;
> void *start(void *arg) {
>    char msg[255];
>    int res;
>    sprintf(msg,"hi from %i\n",(int)arg);
>    if ((int)arg == 9) flag = 1;
>    while (!flag) ; /* thread barrier */
>    #if 1
>      res = write(STDOUT_FILENO,msg,strlen(msg));
>      if (res != strlen(msg)) fprintf(stderr,"Failure: %i 
> %s\n",res,strerror(res));
>    #else
>      fputs(msg,stdout);
>    #endif
>    #if 1 /* work extra hard to flush output (makes no difference) */
>      fflush(NULL);
>      fsync(STDOUT_FILENO);
>      sync();
>    #endif
>    return NULL;
> }
> int main() {
>    int i;
>    /* create 10 pthreads */
>    pthread_t id[10];
>    for (i =0 ; i < 10; i++) {
>      int ret = pthread_create(&(id[i]),NULL,&start,(void*)i);
>      if (ret) printf("pthread_create: %s\n",strerror(ret));
>    }
>    /* join 10 pthreads */
>    for (i =0 ; i < 10; i++) {
>      pthread_join(id[i], NULL);
>    }
>    sleep(1);
>    return 0;
> }
> ---------------------------------------------------------------------------
> 
> The problem occurs on every Linux 2.6 machine I've tried, including ones with 
> Itanium-2, Athlon, Opteron and Pentium 4 chips, both SMP's and uniprocessors, 
> using a variety of recent gcc versions. The problem does *not* appear to occur 
> on any Linux 2.4 machine I've tried, even if I use the same executable which 
> fails on the Linux 2.6 machine. Finally, replacing write(STDOUT_FILENO) with 
> fputs(stdout) makes the problem disappear (presumably due to locking in libc).
> 
> I don't have administrative access to upgrade the kernel on these machines, 
> however below is the full machine info for the most recent installed kernel 
> version I have access to.
> 
> Any suggestions are appreciated.
> Thanks,
> Dan


Hello all,

it's a long time since above message was posted and I wondered if Linus' patch
found it's way into the mainline kernel. A quick check with Dan Bonachea's test
program below showed that this is probably not the case.

We are currently porting a program to Linux (x86 and x86-64) which suffers from
the exact same problem. In contrast to Dan's scientific program our application
could live with occasional losses of messages as long as the output of different
concurrent write(2)s won't be interleaved.

So if we had a write(2) of "abcdef" and a second concurrent write(2) of "uvwxyz"
it would be tolerable if the output of one of them was lost, but it would be
catastrophic for our application if the output was interleaved, e.g. in the form
"abcuvwdefxyz".

I checked for that case with a modified version of Dan's test program (with
write(2)s of up to 100 MB
per thread) and under no circumstances did I get interleaved output (but only
losses of output).

So my question is, does Linux do any synchronisation to prevent interleaving of
output in conjuction with concurrent write(2)s to the same file descriptor
(regular files only). If not, the test results would be a surprise to me, but
maybe someone with more knowledge in regards to the Linux kernel could shed some
light on that.

Any answers are higly appreciated!

Many thanks in advance,

Jens Moser






^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2010-12-29 22:20 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-04-27  9:06 PROBLEM: pthread-safety bug in write(2) on Linux 2.6.x Samuel Thibault
2006-04-27 15:23 ` Linus Torvalds
     [not found] <60Z8f-4QA-25@gated-at.bofh.it>
     [not found] ` <611Wl-u5-1@gated-at.bofh.it>
     [not found]   ` <616k2-6Xz-27@gated-at.bofh.it>
2006-04-14  9:14     ` Kai Henningsen
  -- strict thread matches above, loose matches on Subject: below --
2006-04-13  1:45 Dan Bonachea
2006-04-13  2:10 ` Alistair John Strachan
2006-04-13  4:46 ` Andrew Morton
2006-04-13  5:33   ` Nick Piggin
2006-04-13 15:01     ` Linus Torvalds
2006-04-13 15:28       ` Linus Torvalds
2006-04-14 10:20         ` Nikita Danilov
2006-04-13 21:50       ` Alan Cox
2006-04-13 22:06         ` Dan Bonachea
2006-04-13 23:03           ` Linus Torvalds
2006-04-13 23:11           ` Alan Cox
2006-04-13 23:19             ` Linus Torvalds
2006-04-13 22:40         ` Linus Torvalds
2006-04-13 23:05           ` Alan Cox
2006-04-13 23:06             ` Linus Torvalds
2006-04-13 23:11               ` Linus Torvalds
2006-04-13  9:18   ` Dan Bonachea
2006-04-13  9:56     ` Andrew Morton
2006-04-13 10:28       ` Kyle Moffett
2006-04-13 14:14     ` Jan Engelhardt
2010-12-19 22:45 ` Jens Moser

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).