linux-lvm.redhat.com archive mirror
* [linux-lvm] Creating snapshot causes processes hang?
From: Satoshi Nagayasu @ 2005-11-01  2:09 UTC
  To: linux-lvm

Hi all,

I'm testing PostgreSQL(RDBMS) backup using LVM2 snapshot on RHEL4.
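
A minimal sketch of this kind of snapshot backup, for readers following
along. The pgbench_lvm.sh script used in this test is not shown in the
thread, so this is an assumed generic procedure, not what that script
does. The volume vg0/pgdata is taken from the df output below; the
snapshot name, snapshot size, mount point and archive path are made-up
placeholders, as are the helper names run and snapshot_backup. Commands
are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch of a snapshot-based backup of the /pgdata volume.
VG=vg0                      # from df: /dev/mapper/vg0-pgdata
LV=pgdata
SNAP=pgsnap                 # hypothetical snapshot name
SNAP_SIZE=256M              # hypothetical copy-on-write space
MNT=/mnt/pgsnap             # hypothetical mount point
OUT=/backup/pgdata.tar.gz   # hypothetical archive path

# With DRY_RUN=1 (the default) commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

snapshot_backup() {
    # Create the snapshot; give up if that fails.
    run lvcreate --snapshot --size "$SNAP_SIZE" --name "$SNAP" "/dev/$VG/$LV" || return 1
    run mkdir -p "$MNT"
    # Archive the frozen view of the data directory.
    run mount -o ro "/dev/$VG/$SNAP" "$MNT" &&
        run tar -czf "$OUT" -C "$MNT" .
    status=$?
    # Always try to clean up, even if mount or tar failed.
    run umount "$MNT"
    run lvremove -f "/dev/$VG/$SNAP"
    return "$status"
}
```

Calling snapshot_backup with the default DRY_RUN=1 just lists the six
commands in order, which is a safe way to review them before running
anything as root.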

After creating a snapshot, some processes (kjournald,
one PostgreSQL backend and others) go into iowait state
and never come back (see pids 8105 and 2973 below).

In this situation, one PostgreSQL backend process is waiting
in COMMIT processing (which issues fsync() on the logical volume),
and kjournald is also waiting for something.

There is no kernel oops, and the CPUs are idle.

When I set a PostgreSQL option not to issue fsync
on COMMIT, it seems to work well. No process hangs.

I guess some race condition is occurring around kjournald.

Any comments or suggestions?

Thanks.
------------------------------------
# uname -a
Linux st17 2.6.9-5.ELsmp #1 SMP Wed Jan 5 19:30:39 EST 2005 i686 i686 i386 GNU/Linux
# df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda5             20161172   6717648  12419384  36% /
/dev/sda1               202219     11423    180356   6% /boot
none                   1037448         0   1037448   0% /dev/shm
/dev/mapper/vg0-pgdata
                       1032088    515532    464128  53% /pgdata
# ps ax
  PID TTY      STAT   TIME COMMAND
    1 ?        S      0:00 init [3]
    2 ?        S      0:00 [migration/0]
    3 ?        SN     0:00 [ksoftirqd/0]
    4 ?        S      0:00 [migration/1]
    5 ?        SN     0:00 [ksoftirqd/1]
    6 ?        S<     0:00 [events/0]
    7 ?        S<     0:00 [events/1]
    8 ?        S<     0:00 [khelper]
    9 ?        S<     0:00 [kacpid]
   30 ?        S<     0:00 [kblockd/0]
   31 ?        S<     0:00 [kblockd/1]
   44 ?        S<     0:00 [aio/0]
   45 ?        S<     0:00 [aio/1]
   32 ?        S      0:00 [khubd]
   43 ?        S      0:11 [kswapd0]
  118 ?        S      0:00 [kseriod]
  186 ?        S      0:00 [scsi_eh_0]
  201 ?        S      0:17 [kjournald]
 1157 ?        S<s    0:00 udevd
 1342 ?        S      0:00 [kjournald]
 1761 ?        Ss     0:00 syslogd -m 0
 1765 ?        Ss     0:00 klogd -x
 1776 ?        Ss     0:00 irqbalance
 1794 ?        Ss     0:00 portmap
 1814 ?        Ss     0:00 rpc.statd
 1913 ?        Ss     0:00 rpc.idmapd
 1990 ?        Ss     0:00 /usr/sbin/acpid
 2003 ?        Ss     0:00 cupsd
 2042 ?        Ss     0:00 /usr/sbin/sshd
 2116 ?        Ss     0:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
 2126 ?        Ss     0:00 gpm -m /dev/input/mice -t imps2
 2169 ?        Ss     0:00 /usr/sbin/htt -retryonerror 0
 2170 ?        S      0:00 htt_server -nodaemon
 2182 ?        Ss     0:00 /usr/sbin/cannaserver -syslog -u canna
 2233 ?        Ss     0:00 crond
 2274 ?        Ss     0:00 xfs -droppriv -daemon
 2293 ?        Ss     0:00 /usr/sbin/atd
 2303 ?        Ssl    0:00 dbus-daemon-1 --system
 2317 ?        Ss     0:00 cups-config-daemon
 2328 ?        Ss     0:01 hald
 2338 tty1     Ss+    0:00 /sbin/mingetty tty1
 2339 tty2     Ss+    0:00 /sbin/mingetty tty2
 2340 tty3     Ss+    0:00 /sbin/mingetty tty3
 2341 tty4     Ss+    0:00 /sbin/mingetty tty4
 2342 tty5     Ss+    0:00 /sbin/mingetty tty5
 2343 tty6     Ss+    0:00 /sbin/mingetty tty6
 2885 ?        Ss     0:00 sshd: snaga [priv]
 2887 ?        S      0:00 sshd: snaga@pts/0
 2888 pts/0    Ss     0:00 -bash
 2908 ?        Ss     0:00 sshd: snaga [priv]
 2910 ?        S      0:00 sshd: snaga@pts/1
 2911 pts/1    Ss     0:00 -bash
 2931 pts/1    S      0:00 su
 2932 pts/1    S      0:00 bash
 2973 ?        D      0:31 [kjournald]
 2980 ?        S      0:54 [rpciod]
 2981 ?        S      0:00 [lockd]
 3016 pts/0    S      0:00 su
 3017 pts/0    S      0:00 bash
 3035 pts/0    S      0:00 su postgres
 3036 pts/0    S      0:00 bash
 7866 pts/0    S+     0:00 /bin/sh ./pgbench_lvm.sh
 7906 pts/0    S+     0:00 /bin/sh ./pgbench_lvm.sh
 7907 pts/0    S+     0:00 sed -e s/^/lvm:/
 7915 pts/0    S+     0:00 /usr/local/pgsql81b3/bin/postmaster -D /pgdata/data
 7917 pts/0    S+     0:03 postgres: writer process
 7918 pts/0    S+     0:00 postgres: archiver process
 7919 pts/0    S+     0:00 postgres: stats buffer process
 7920 pts/0    S+     0:00 postgres: stats collector process
 8015 ?        S      0:01 [pdflush]
 8094 pts/0    S+     0:01 pgbench -s 10 -t 1000 -c 16 pgbench
 8096 pts/0    S+     0:00 postgres: postgres pgbench [local] COMMIT
 8097 pts/0    S+     0:00 postgres: postgres pgbench [local] UPDATE waiting
 8098 pts/0    S+     0:00 postgres: postgres pgbench [local] UPDATE waiting
 8099 pts/0    S+     0:00 postgres: postgres pgbench [local] UPDATE waiting
 8100 pts/0    S+     0:00 postgres: postgres pgbench [local] UPDATE waiting
 8101 pts/0    S+     0:00 postgres: postgres pgbench [local] UPDATE waiting
 8102 pts/0    S+     0:00 postgres: postgres pgbench [local] UPDATE waiting
 8103 pts/0    S+     0:00 postgres: postgres pgbench [local] COMMIT
 8104 pts/0    S+     0:00 postgres: postgres pgbench [local] UPDATE waiting
 8105 pts/0    D+     0:00 postgres: postgres pgbench [local] COMMIT
 8106 pts/0    S+     0:00 postgres: postgres pgbench [local] COMMIT
 8107 pts/0    S+     0:00 postgres: postgres pgbench [local] COMMIT
 8108 pts/0    S+     0:00 postgres: postgres pgbench [local] UPDATE waiting
 8109 pts/0    S+     0:00 postgres: postgres pgbench [local] UPDATE waiting
 8110 pts/0    S+     0:00 postgres: postgres pgbench [local] UPDATE waiting
 8111 pts/0    S+     0:00 postgres: postgres pgbench [local] COMMIT
 8119 ?        S      0:00 [pdflush]
 8147 ?        S<     0:00 [kcopyd]
 8179 ?        S      0:00 [kjournald]
 8188 pts/1    R+     0:00 ps ax
# cat /etc/issue
Red Hat Enterprise Linux ES release 4 (Nahant)
Kernel \r on an \m

#
------------------------------------
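
A small sketch for picking the blocked processes out of a listing like
the one above. State D in the STAT column means uninterruptible sleep,
usually waiting on I/O. The function name filter_dstate and its optional
column argument are made up for illustration; it only filters text and
assumes the ps ax column layout shown above (STAT in the third column).

```shell
#!/bin/sh
# filter_dstate [COLUMN]
# Read ps output on stdin and print the header line plus every row
# whose state column starts with "D". COLUMN defaults to 3, which is
# where "ps ax" puts STAT.
filter_dstate() {
    awk -v col="${1:-3}" 'NR == 1 || $col ~ /^D/'
}
```

On a live system this would be used as "ps ax | filter_dstate". With
"ps -eo pid,stat,wchan,args | filter_dstate 2" the output also names the
kernel function each blocked task is sleeping in, which may help narrow
down where kjournald is stuck.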

-- 
NAGAYASU Satoshi <nagayasus@nttdata.co.jp>

* Re: [linux-lvm] Creating snapshot causes processes hang?
From: Kelly Sauke @ 2005-11-01 14:28 UTC
  To: LVM general discussion and development

Snapshots in RHEL 4 are broken.  We've had to move to veritas after Redhat told
us that they would not support snapshots in RHEL 4.

* Re: [linux-lvm] Creating snapshot causes processes hang?
From: Mike Snitzer @ 2005-11-01 21:24 UTC
  To: LVM general discussion and development

On 11/1/05, Kelly Sauke <ksauke@fastenal.com> wrote:
> Snapshots in RHEL 4 are broken. We've had to move to veritas after Redhat
> told us that they would not support snapshots in RHEL 4.


FUD? I find it hard to believe RedHat formally said they wouldn't support
LVM2 snapshots in RHEL4. Please elaborate/advise (RedHat?).

Mike

* Re: [linux-lvm] Creating snapshot causes processes hang?
From: Kelly Sauke @ 2005-11-01 22:02 UTC
  To: LVM general discussion and development

http://www.redhat.com/archives/linux-lvm/2005-July/msg00117.html

RHEL 4 U3 is due in Jan '06 I believe.

* Re: [linux-lvm] Creating snapshot causes processes hang?
From: James G. Sack (jim) @ 2005-11-01 23:20 UTC
  To: LVM LIST linux-lvm@redhat.com

> Date: Tue, 01 Nov 2005 16:02:55 -0600
> From: Kelly Sauke <ksauke@fastenal.com>
> Subject: Re: [linux-lvm] Creating snapshot causes processes hang?
> To: LVM general discussion and development <linux-lvm@redhat.com>
> Message-ID: <4367E60F.8020004@fastenal.com>
> Content-Type: text/plain; charset=ISO-8859-1
> 
> http://www.redhat.com/archives/linux-lvm/2005-July/msg00117.html
> 

AHA! -- could this be a confirmation of what I have seen where lvremove
triggers the kcopyd BUG and consequent system deadlock?

(ref: my recent posting --
https://www.redhat.com/archives/linux-lvm/2005-November/msg00002.html,
where I report breakage upon lvremove when there are 2 snapshots and
I/O is occurring)


If so, do I also understand correctly that using unreleased patches and
some manual dmsetup commands might constitute a workaround?

I don't mind living with manual workarounds until snapshots are fixed,
but it would be wonderful to see an explicit recipe [for dummies, please
<grin>] for following the suggestions in the AGK July posting (referenced
by ksauke, above), namely:

  "Further patches are needed, but those two plus correct dmsetup use
should avoid machine lockups."

In particular:
have any of the 18 patches shown at
  ftp://sources.redhat.com/pub/dm/patches/2.6-unstable/2.6.12-rc2/2.6.12-rc2-udm1/
gotten into 2.6.13 or 2.6.14, or into kernels blessed by Fedora? (e.g.,
FC4-1526, maybe)

And, could someone give me an example of "correct dmsetup" use? I
presume it involves suspend and resume -- but what else?

.. and in my case, what if there is an existing snapshot when I need to
create/remove a second one?
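
To make the question concrete, here is a sketch of what a suspend/resume
pair around some other command looks like. This is not the recipe from
the referenced July posting, which is not reproduced in this thread, and
which command (if any) belongs between the two calls is exactly the open
question. The helper names run and with_suspended are made up; the
device name vg0-pgdata is only an example taken from the first message.
dmsetup suspend and dmsetup resume are existing dmsetup commands. Note
that a wrapped command that itself does I/O to the suspended device will
block until the resume. Commands are only printed unless DRY_RUN=0 is
set.

```shell
#!/bin/sh
# With DRY_RUN=1 (the default) commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# with_suspended DEVICE COMMAND [ARGS...]
# Suspend a device-mapper device, run COMMAND, then resume the device
# even if COMMAND failed. Returns COMMAND's exit status.
with_suspended() {
    dev=$1
    shift
    run dmsetup suspend "$dev" || return 1
    run "$@"
    status=$?
    run dmsetup resume "$dev"
    return "$status"
}
```

For example, "with_suspended vg0-pgdata true" in dry-run mode prints the
suspend, the placeholder command, and the resume, in that order.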



..jim
