* [linux-lvm] Creating snapshot causes processes hang?
From: Satoshi Nagayasu @ 2005-11-01 2:09 UTC
To: linux-lvm
Hi all,
I'm testing PostgreSQL (RDBMS) backup using an LVM2 snapshot on RHEL4.
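For reference, a snapshot backup cycle of this kind looks roughly like the sketch below. It is not the actual pgbench_lvm.sh, only an illustration. The origin /dev/vg0/pgdata is inferred from the /dev/mapper/vg0-pgdata entry in the df output; the snapshot name, size, mount point and archive path are all made-up placeholders. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch of a snapshot-based backup cycle. Everything here is an assumption
# except the origin, which is inferred from /dev/mapper/vg0-pgdata in df.
ORIGIN=/dev/vg0/pgdata
SNAP_NAME=pgsnap            # hypothetical snapshot LV name
SNAP_SIZE=200M              # hypothetical copy-on-write area size
MNT=/mnt/pgsnap             # hypothetical mount point
OUT=/backup/pgdata.tar.gz   # hypothetical archive path

# With DRY_RUN=1 (the default) commands are only printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

snapshot_backup() {
    # Create the snapshot; give up if that fails.
    run lvcreate --snapshot --size "$SNAP_SIZE" --name "$SNAP_NAME" "$ORIGIN" || return 1
    run mkdir -p "$MNT"
    # Archive the frozen copy of the filesystem from the snapshot.
    run mount -o ro "/dev/vg0/$SNAP_NAME" "$MNT" &&
        run tar -czf "$OUT" -C "$MNT" .
    rc=$?
    # Always try to clean up, even if mount or tar failed.
    run umount "$MNT"
    run lvremove -f "/dev/vg0/$SNAP_NAME"
    return $rc
}
```

Calling `snapshot_backup` prints the six commands in order; setting DRY_RUN=0 would actually run them as root.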
After creating a snapshot, some processes (kjournald,
one PostgreSQL backend and others) go into iowait state
and never come back (see PIDs 8105 and 2973 below).
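To pick the stuck processes out of a long listing, a small filter like the following can help. It is a minimal sketch that assumes the `ps ax` column layout shown in this message (STAT as the third field); the function name is made up.

```shell
# Keep the header line plus every row whose STAT column (3rd field of
# `ps ax` output) starts with D, i.e. uninterruptible sleep.
dstate_only() {
    awk 'NR == 1 || $3 ~ /^D/'
}
```

Run it as `ps ax | dstate_only`.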
In this situation, one PostgreSQL backend process is waiting
in COMMIT processing (which issues fsync() on the logical volume),
and kjournald is also waiting for something.
There is no kernel oops, and the processors are idle.
When I set a PostgreSQL option so that COMMIT does not issue fsync,
it seems to work well. No process hangs.
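For anyone reproducing this, a quick way to check which mode a server is configured for is sketched below. It assumes the option in question is the `fsync` parameter in postgresql.conf and that the file lives in the data directory (/pgdata/data, per the postmaster command line in the listing); the function name is made up.

```shell
# Print the fsync line(s) from a postgresql.conf, whether commented out or not.
# $1 is the path to the config file.
show_fsync() {
    grep -E '^[[:space:]]*#?[[:space:]]*fsync[[:space:]]*=' "$1"
}
```

For example, `show_fsync /pgdata/data/postgresql.conf` prints the active or commented-out `fsync` line, and prints nothing if the parameter is not mentioned at all.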
I guess some race condition is occurring around kjournald.
Any comments and suggestions?
Thanks.
------------------------------------
# uname -a
Linux st17 2.6.9-5.ELsmp #1 SMP Wed Jan 5 19:30:39 EST 2005 i686 i686 i386 GNU/Linux
# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda5 20161172 6717648 12419384 36% /
/dev/sda1 202219 11423 180356 6% /boot
none 1037448 0 1037448 0% /dev/shm
/dev/mapper/vg0-pgdata
1032088 515532 464128 53% /pgdata
# ps ax
PID TTY STAT TIME COMMAND
1 ? S 0:00 init [3]
2 ? S 0:00 [migration/0]
3 ? SN 0:00 [ksoftirqd/0]
4 ? S 0:00 [migration/1]
5 ? SN 0:00 [ksoftirqd/1]
6 ? S< 0:00 [events/0]
7 ? S< 0:00 [events/1]
8 ? S< 0:00 [khelper]
9 ? S< 0:00 [kacpid]
30 ? S< 0:00 [kblockd/0]
31 ? S< 0:00 [kblockd/1]
44 ? S< 0:00 [aio/0]
45 ? S< 0:00 [aio/1]
32 ? S 0:00 [khubd]
43 ? S 0:11 [kswapd0]
118 ? S 0:00 [kseriod]
186 ? S 0:00 [scsi_eh_0]
201 ? S 0:17 [kjournald]
1157 ? S<s 0:00 udevd
1342 ? S 0:00 [kjournald]
1761 ? Ss 0:00 syslogd -m 0
1765 ? Ss 0:00 klogd -x
1776 ? Ss 0:00 irqbalance
1794 ? Ss 0:00 portmap
1814 ? Ss 0:00 rpc.statd
1913 ? Ss 0:00 rpc.idmapd
1990 ? Ss 0:00 /usr/sbin/acpid
2003 ? Ss 0:00 cupsd
2042 ? Ss 0:00 /usr/sbin/sshd
2116 ? Ss 0:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
2126 ? Ss 0:00 gpm -m /dev/input/mice -t imps2
2169 ? Ss 0:00 /usr/sbin/htt -retryonerror 0
2170 ? S 0:00 htt_server -nodaemon
2182 ? Ss 0:00 /usr/sbin/cannaserver -syslog -u canna
2233 ? Ss 0:00 crond
2274 ? Ss 0:00 xfs -droppriv -daemon
2293 ? Ss 0:00 /usr/sbin/atd
2303 ? Ssl 0:00 dbus-daemon-1 --system
2317 ? Ss 0:00 cups-config-daemon
2328 ? Ss 0:01 hald
2338 tty1 Ss+ 0:00 /sbin/mingetty tty1
2339 tty2 Ss+ 0:00 /sbin/mingetty tty2
2340 tty3 Ss+ 0:00 /sbin/mingetty tty3
2341 tty4 Ss+ 0:00 /sbin/mingetty tty4
2342 tty5 Ss+ 0:00 /sbin/mingetty tty5
2343 tty6 Ss+ 0:00 /sbin/mingetty tty6
2885 ? Ss 0:00 sshd: snaga [priv]
2887 ? S 0:00 sshd: snaga@pts/0
2888 pts/0 Ss 0:00 -bash
2908 ? Ss 0:00 sshd: snaga [priv]
2910 ? S 0:00 sshd: snaga@pts/1
2911 pts/1 Ss 0:00 -bash
2931 pts/1 S 0:00 su
2932 pts/1 S 0:00 bash
2973 ? D 0:31 [kjournald]
2980 ? S 0:54 [rpciod]
2981 ? S 0:00 [lockd]
3016 pts/0 S 0:00 su
3017 pts/0 S 0:00 bash
3035 pts/0 S 0:00 su postgres
3036 pts/0 S 0:00 bash
7866 pts/0 S+ 0:00 /bin/sh ./pgbench_lvm.sh
7906 pts/0 S+ 0:00 /bin/sh ./pgbench_lvm.sh
7907 pts/0 S+ 0:00 sed -e s/^/lvm:/
7915 pts/0 S+ 0:00 /usr/local/pgsql81b3/bin/postmaster -D /pgdata/data
7917 pts/0 S+ 0:03 postgres: writer process
7918 pts/0 S+ 0:00 postgres: archiver process
7919 pts/0 S+ 0:00 postgres: stats buffer process
7920 pts/0 S+ 0:00 postgres: stats collector process
8015 ? S 0:01 [pdflush]
8094 pts/0 S+ 0:01 pgbench -s 10 -t 1000 -c 16 pgbench
8096 pts/0 S+ 0:00 postgres: postgres pgbench [local] COMMIT
8097 pts/0 S+ 0:00 postgres: postgres pgbench [local] UPDATE waiting
8098 pts/0 S+ 0:00 postgres: postgres pgbench [local] UPDATE waiting
8099 pts/0 S+ 0:00 postgres: postgres pgbench [local] UPDATE waiting
8100 pts/0 S+ 0:00 postgres: postgres pgbench [local] UPDATE waiting
8101 pts/0 S+ 0:00 postgres: postgres pgbench [local] UPDATE waiting
8102 pts/0 S+ 0:00 postgres: postgres pgbench [local] UPDATE waiting
8103 pts/0 S+ 0:00 postgres: postgres pgbench [local] COMMIT
8104 pts/0 S+ 0:00 postgres: postgres pgbench [local] UPDATE waiting
8105 pts/0 D+ 0:00 postgres: postgres pgbench [local] COMMIT
8106 pts/0 S+ 0:00 postgres: postgres pgbench [local] COMMIT
8107 pts/0 S+ 0:00 postgres: postgres pgbench [local] COMMIT
8108 pts/0 S+ 0:00 postgres: postgres pgbench [local] UPDATE waiting
8109 pts/0 S+ 0:00 postgres: postgres pgbench [local] UPDATE waiting
8110 pts/0 S+ 0:00 postgres: postgres pgbench [local] UPDATE waiting
8111 pts/0 S+ 0:00 postgres: postgres pgbench [local] COMMIT
8119 ? S 0:00 [pdflush]
8147 ? S< 0:00 [kcopyd]
8179 ? S 0:00 [kjournald]
8188 pts/1 R+ 0:00 ps ax
# cat /etc/issue
Red Hat Enterprise Linux ES release 4 (Nahant)
Kernel \r on an \m
#
------------------------------------
--
NAGAYASU Satoshi <nagayasus@nttdata.co.jp>
* Re: [linux-lvm] Creating snapshot causes processes hang?
From: Kelly Sauke @ 2005-11-01 14:28 UTC
To: LVM general discussion and development
Snapshots in RHEL 4 are broken. We've had to move to Veritas after Red Hat told
us that they would not support snapshots in RHEL 4.
* Re: [linux-lvm] Creating snapshot causes processes hang?
From: Mike Snitzer @ 2005-11-01 21:24 UTC
To: LVM general discussion and development
On 11/1/05, Kelly Sauke <ksauke@fastenal.com> wrote:
>
> Snapshots in RHEL 4 are broken. We've had to move to veritas after Redhat told
> us that they would not support snapshots in RHEL 4.
FUD? I find it hard to believe Red Hat formally said they wouldn't support
LVM2 snapshots in RHEL4. Please elaborate/advise (Red Hat?).
Mike
* Re: [linux-lvm] Creating snapshot causes processes hang?
From: Kelly Sauke @ 2005-11-01 22:02 UTC
To: LVM general discussion and development
http://www.redhat.com/archives/linux-lvm/2005-July/msg00117.html
RHEL 4 U3 is due in Jan '06 I believe.
* Re: [linux-lvm] Creating snapshot causes processes hang?
From: James G. Sack (jim) @ 2005-11-01 23:20 UTC
To: LVM LIST linux-lvm@redhat.com
> Date: Tue, 01 Nov 2005 16:02:55 -0600
> From: Kelly Sauke <ksauke@fastenal.com>
> Subject: Re: [linux-lvm] Creating snapshot causes processes hang?
> To: LVM general discussion and development <linux-lvm@redhat.com>
> Message-ID: <4367E60F.8020004@fastenal.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> http://www.redhat.com/archives/linux-lvm/2005-July/msg00117.html
>
AHA! -- could this be a confirmation of what I have seen where lvremove
triggers the kcopyd BUG and consequent system deadlock?
(ref: my recent posting --
https://www.redhat.com/archives/linux-lvm/2005-November/msg00002.html,
where I report breakage upon lvremove
when there are 2 snapshots and i/o is occurring)
If so, do I also understand correctly that using unreleased patches and
some manual dmsetup commands might constitute a workaround?
I don't mind living with manual workarounds until snapshots are fixed,
but it would be wonderful to see an explicit recipe [for dummies, please
<grin>] for following the suggestions in the AGK July posting (referenced
by ksauke, above), namely:
"Further patches are needed, but those two plus correct dmsetup use
should avoid machine lockups."
In particular:
have any of the 18 patches shown at
ftp://sources.redhat.com/pub/dm/patches/2.6-unstable/2.6.12-rc2/2.6.12-rc2-udm1/
gotten into 2.6.13 or 2.6.14, or into kernels blessed by Fedora? (e.g.,
FC4-1526, maybe)
And, could someone give me an example of "correct dmsetup" use? I
presume it involves suspend and resume -- but what else?
.. and in my case, what if there is an existing snapshot when I need to
create/remove a second one?
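This is not an answer to the dmsetup question, but for checking which snapshot devices already exist before creating or removing another one, something like the following might help. It is a minimal sketch that assumes `dmsetup table`, run with no device name, prints one `name: start length target args...` line per device; the function name is made up. It deliberately does not attempt the suspend/resume sequence, since the thread does not spell that out.

```shell
# Filter `dmsetup table` output (all devices) down to snapshot-related targets.
# Each input line is expected to look like:
#   <name>: <start> <length> <target> <args...>
# Prints "<name> <target>" for snapshot and snapshot-origin devices only.
snapshot_targets() {
    awk '$4 == "snapshot" || $4 == "snapshot-origin" {
        sub(/:$/, "", $1)   # drop the trailing colon from the device name
        print $1, $4
    }'
}
```

Run as root with `dmsetup table | snapshot_targets`. An origin with one snapshot should show up as one `snapshot-origin` line and one `snapshot` line.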
..jim