From: Lukas Kolbe <lkolbe@techfak.uni-bielefeld.de>
To: linux-scsi@vger.kernel.org
Subject: After memory pressure: can't read from tape anymore
Date: Sun, 28 Nov 2010 20:15:29 +0100
Message-ID: <1290971729.2814.13.camel@larosa>

Hi, 

On our backup system (2 LTO4 drives/Tandberg library via LSISAS1068E,
kernel 2.6.36 with the stock Fusion MPT SAS Host driver 3.04.17 on
Debian squeeze), we see reproducible tape read and write failures after
the system has been under memory pressure:

[342567.297152] st0: Can't allocate 2097152 byte tape buffer.
[342569.316099] st0: Can't allocate 2097152 byte tape buffer.
[342570.805164] st0: Can't allocate 2097152 byte tape buffer.
[342571.958331] st0: Can't allocate 2097152 byte tape buffer.
[342572.704264] st0: Can't allocate 2097152 byte tape buffer.
[342873.737130] st: from_buffer offset overflow.

Bacula is spewing this message every time it tries to access the tape
drive:
28-Nov 19:58 sd1.techfak JobId 2857: Error: block.c:1002 Read error on fd=10 at file:blk 0:0 on device "drv2" (/dev/nst0). ERR=Input/output error

By memory pressure, I mean that the KVM processes containing the
postgres-db (~20 million files) and the bacula director had used all
available RAM; one of them used ~4GiB of its 12GiB swap for an hour or
so (when selecting a full restore, it seems that the whole directory
tree of the 15-million-file backup gets read into memory). After this,
I wasn't able to read from the second tape drive (/dev/st0) anymore,
whereas the first tape drive kept restoring data happily (it is
currently about halfway through a 3TiB restore from 5 tapes).

The same behaviour appears when we're doing a few incremental backups:
after a while, it just isn't possible to use the tape drives anymore.
Every I/O operation gives an I/O error, even a simple dd bs=64k
count=10. After a restart, the system behaves correctly until,
seemingly, another memory pressure situation occurs.
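
For anyone who wants to repeat the dd-style read test without dd, it
amounts to roughly the following Python sketch. The function name
read_blocks and its defaults are only illustrative assumptions, not
something we run in production; /dev/nst0 is the device from the Bacula
message above, and 64k x 10 matches the dd parameters.

```python
import os


def read_blocks(path="/dev/nst0", block_size=64 * 1024, count=10):
    """Read up to `count` blocks of `block_size` bytes from `path`.

    Returns the total number of bytes read. A failing read raises
    OSError (on the broken drive this would be errno EIO).
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        for _ in range(count):
            data = os.read(fd, block_size)
            if not data:  # zero-length read: end of data
                break
            total += len(data)
    finally:
        os.close(fd)
    return total


if __name__ == "__main__":
    try:
        print("read", read_blocks(), "bytes")
    except OSError as exc:
        # Report the error text instead of a traceback.
        print("read failed:", os.strerror(exc.errno))
```

On a healthy drive this should simply report the number of bytes read;
in the failing state described here I would expect it to report the
same Input/output error that Bacula sees.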

I'd be delighted if somebody could help me debug this; my systemtap
skills are unfortunately non-existent.

Kind regards,
Lukas Kolbe




Thread overview: 38+ messages
2010-11-28 19:15 Lukas Kolbe [this message]
2010-11-29 17:09 ` After memory pressure: can't read from tape anymore Kai Makisara
2010-11-30 13:31   ` Lukas Kolbe
2010-11-30 16:10     ` Boaz Harrosh
2010-11-30 16:23       ` Kai Makisara
2010-11-30 16:44         ` Boaz Harrosh
2010-11-30 17:04           ` Kai Makisara
2010-11-30 17:24             ` Boaz Harrosh
2010-11-30 19:53               ` Kai Makisara
2010-12-01  9:40                 ` Lukas Kolbe
2010-12-02 11:17                   ` Desai, Kashyap
2010-12-02 16:22                     ` Kai Makisara
2010-12-02 18:14                       ` Desai, Kashyap
2010-12-02 20:25                         ` Kai Makisara
2010-12-05 10:44                           ` Lukas Kolbe
2010-12-03 10:13                       ` FUJITA Tomonori
2010-12-03 10:45                         ` Desai, Kashyap
2010-12-03 11:11                           ` FUJITA Tomonori
2010-12-02 10:01                 ` Lukas Kolbe
2010-12-03  9:44               ` FUJITA Tomonori
2010-11-30 16:20     ` Kai Makisara
2010-12-01 17:06       ` Lukas Kolbe
2010-12-02 16:41         ` Kai Makisara
2010-12-06  7:59           ` Kai Makisara
2010-12-06  8:50             ` FUJITA Tomonori
2010-12-06  9:36             ` Lukas Kolbe
2010-12-06 11:34               ` Bjørn Mork
2010-12-08 14:19               ` Lukas Kolbe
2010-12-03 12:27   ` FUJITA Tomonori
2010-12-03 14:59     ` Kai Mäkisara
2010-12-03 15:06       ` James Bottomley
2010-12-03 17:03         ` Lukas Kolbe
2010-12-03 18:10           ` James Bottomley
2010-12-05 10:53             ` Lukas Kolbe
2010-12-05 12:16               ` FUJITA Tomonori
2010-12-14 20:35             ` Vladislav Bolkhovitin
2010-12-14 22:23               ` Stephen Hemminger
2010-12-15 16:27                 ` Vladislav Bolkhovitin
