From: Andy Isaacson <adi@hexapodia.org>
To: David Woodhouse <dwmw2@infradead.org>
Cc: Chris Wright <chrisw@sous-sol.org>,
	iommu@lists.linux-foundation.org, linux-ext4@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: DMAR regression in 2.6.31 leads to ext4 corruption?
Date: Wed, 14 Oct 2009 10:52:14 -0700
Message-ID: <20091014175214.GD6827@hexapodia.org>
In-Reply-To: <1255522166.4523.238.camel@macbook.infradead.org>

On Wed, Oct 14, 2009 at 01:09:26PM +0100, David Woodhouse wrote:
> On Fri, 2009-10-09 at 18:47 -0700, Andy Isaacson wrote:
> > Well, we don't know for sure what happened on the previous boot where
> > the filesystem corruption occurred.  I'm imagining a nightmare scenario
> > where GPU erroneous writes cause DMAR faults and handling them somehow
> > causes AHCI DMA requests to get lost.
> 
> Seems unlikely. The GPU faults happen whenever the GATT changes, because
> it translates _every_ address in the GATT through the IOMMU right there
> and then -- so if parts of the table are uninitialised, they'll cause
> stray write faults. But no writes are actually _happening_.
> 
> > I'm going to go ahead on the theory that the BIOS needs an update.
> 
> I can't really imagine how that would help; how the BIOS would be
> responsible for this. I'm more inclined to blame the drive. It's not an
> SSD, is it?

It's a Fujitsu (now serviced by Toshiba?) MHZ2160BH.  smartctl says:

Device Model:     FUJITSU MHZ2160BH G1
Serial Number:    K60WT8C2HHRS
Firmware Version: 0084000A
User Capacity:    160,041,885,696 bytes
...
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   100   046    Pre-fail  Always       -       219593
  2 Throughput_Performance  0x0005   100   100   030    Pre-fail  Offline      -       27721728
  3 Spin_Up_Time            0x0003   100   100   025    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   099   099   000    Old_age   Always       -       406
  5 Reallocated_Sector_Ct   0x0033   100   100   024    Pre-fail  Always       -       8589934592000
  7 Seek_Error_Rate         0x000f   100   100   047    Pre-fail  Always       -       112
  8 Seek_Time_Performance   0x0005   100   100   019    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0032   097   097   000    Old_age   Always       -       1598
 10 Spin_Retry_Count        0x0013   100   100   020    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       284
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       78
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       1216
194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always       -       38 (Lifetime Min/Max 21/46)
195 Hardware_ECC_Recovered  0x001a   100   100   000    Old_age   Always       -       247
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       457965568
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   253   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x000f   100   100   060    Pre-fail  Always       -       10448
203 Run_Out_Cancel          0x0002   100   100   000    Old_age   Always       -       1529011503750
240 Head_Flying_Hours       0x003e   200   200   000    Old_age   Always       -       0
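
Several of the raw values above (attributes 2, 5, 196 and 203) are too large to be plain counters. One possible reading, and it is only an assumption, is that the 48-bit raw field packs several 16-bit words; what each word would mean on this Fujitsu firmware is not stated anywhere in this thread. A minimal Python sketch that splits the values that way (the helper name split_raw48 is made up for illustration):

```python
def split_raw48(raw):
    """Split a 48-bit SMART raw value into three 16-bit words (high, mid, low).

    Assumption: the drive packs sub-fields on 16-bit boundaries. This is a
    guess about the vendor encoding, not something the smartctl output proves.
    """
    if not 0 <= raw < (1 << 48):
        raise ValueError("raw value does not fit in 48 bits")
    return (raw >> 32) & 0xFFFF, (raw >> 16) & 0xFFFF, raw & 0xFFFF


# Raw values copied from the smartctl output above.
RAW_VALUES = {
    "Throughput_Performance": 27721728,
    "Reallocated_Sector_Ct": 8589934592000,
    "Reallocated_Event_Count": 457965568,
    "Run_Out_Cancel": 1529011503750,
}

for name, raw in RAW_VALUES.items():
    high, mid, low = split_raw48(raw)
    print(f"{name:<24} {raw:>14} -> {high} {mid} {low}")
```

Under that assumption Reallocated_Sector_Ct splits into 2000/0/0, so the low word (the part that would normally be read as the count) is 0, in line with the normalized VALUE of 100 for that attribute.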

-andy
