* Re: scsi: sg: assorted memory corruptions
From: Dmitry Vyukov @ 2018-02-01 16:21 UTC
To: Ben Hutchings, Tejun Heo, linux-ide
Cc: Doug Gilbert, Bart Van Assche, jejb@linux.vnet.ibm.com,
linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org,
martin.petersen@oracle.com, syzkaller@googlegroups.com
On Thu, Feb 1, 2018 at 5:17 PM, Ben Hutchings
<ben.hutchings@codethink.co.uk> wrote:
> On Thu, 2018-02-01 at 08:04 +0100, Dmitry Vyukov wrote:
>> On Thu, Feb 1, 2018 at 7:03 AM, Douglas Gilbert <dgilbert@interlog.com> wrote:
>> > On 2018-01-30 07:22 AM, Dmitry Vyukov wrote:
> [...]
>> > > [1:0:0:0] cd/dvd QEMU QEMU DVD-ROM 2.0. /dev/sr0 /dev/sg1
>> > >
>> > > # readlink /sys/class/scsi_generic/sg0
>> > >
>> > > ../../devices/pci0000:00/0000:00:01.1/ata1/host0/target0:0:0/0:0:0:0/scsi_generic/sg0
>> > >
>> > > # cat /sys/class/scsi_generic/sg0/device/vendor
>> > > ATA
>> >
>> >
>> > ^^^^^
>> > That subsystem is the culprit IMO, most likely libata.
>> >
>> > Until you can show this test failing on something other than an
>> > ATA disk, then I will treat this issue as closed.
>>
>> Hi Doug,
>>
>> Why is bug in ATA not a bug? Is it long unused by everybody? I've got
>> it by running qemu with default flags...
>
> If the bug is in libata then it's not on Doug to fix it since he's only
> maintaining sg.

Then I think we need to CC ata maintainers rather than treat it as closed.

+Tejun, linux-ide@, you can see full thread here:
https://groups.google.com/forum/#!topic/syzkaller/9RNr9Gu0MyY
* Re: scsi: sg: assorted memory corruptions
From: Eric Biggers @ 2018-02-04 9:07 UTC
To: Dmitry Vyukov
Cc: Ben Hutchings, Tejun Heo, linux-ide, Doug Gilbert,
Bart Van Assche, jejb@linux.vnet.ibm.com,
linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org,
martin.petersen@oracle.com, syzkaller@googlegroups.com
On Thu, Feb 01, 2018 at 05:21:12PM +0100, 'Dmitry Vyukov' via syzkaller wrote:
> On Thu, Feb 1, 2018 at 5:17 PM, Ben Hutchings
> <ben.hutchings@codethink.co.uk> wrote:
> > On Thu, 2018-02-01 at 08:04 +0100, Dmitry Vyukov wrote:
> >> On Thu, Feb 1, 2018 at 7:03 AM, Douglas Gilbert <dgilbert@interlog.com> wrote:
> >> > On 2018-01-30 07:22 AM, Dmitry Vyukov wrote:
> > [...]
> >> > > [1:0:0:0] cd/dvd QEMU QEMU DVD-ROM 2.0. /dev/sr0 /dev/sg1
> >> > >
> >> > > # readlink /sys/class/scsi_generic/sg0
> >> > >
> >> > > ../../devices/pci0000:00/0000:00:01.1/ata1/host0/target0:0:0/0:0:0:0/scsi_generic/sg0
> >> > >
> >> > > # cat /sys/class/scsi_generic/sg0/device/vendor
> >> > > ATA
> >> >
> >> >
> >> > ^^^^^
> >> > That subsystem is the culprit IMO, most likely libata.
> >> >
> >> > Until you can show this test failing on something other than an
> >> > ATA disk, then I will treat this issue as closed.
> >>
> >> Hi Doug,
> >>
> >> Why is bug in ATA not a bug? Is it long unused by everybody? I've got
> >> it by running qemu with default flags...
> >
> > If the bug is in libata then it's not on Doug to fix it since he's only
> > maintaining sg.
>
>
> Then I think we need to CC ata maintainers rather than treat it as closed.
> +Tejun, linux-ide@, you can see full thread here:
> https://groups.google.com/forum/#!topic/syzkaller/9RNr9Gu0MyY
>
To get memory corruption it's actually sufficient just to submit "1-byte" reads;
there's no need for the SG_NEXT_CMD_LEN ioctl or anything:

#include <fcntl.h>
#include <unistd.h>

int main()
{
	int fd = open("/dev/sg0", O_RDWR);
	char buf[43] = { [36] = 0x08 /* READ_6 */ };

	for (;;)
		write(fd, buf, sizeof(buf));
}

(where /dev/sg0 is the default QEMU disk type, "82371SB PIIX3 IDE")

The SCSI command descriptor block is the 6 bytes at indices 36-41, so index 42
is the only data byte.
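For illustration only, the buffer layout described above can be spelled out
with named offsets. This is a minimal sketch, not part of the reproducer: the
names SG_HDR_LEN, CDB_LEN, READ_6_OP and build_read6_request() are
hypothetical, and the 36-byte header length is an assumption inferred from the
indices quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed layout, taken from the indices quoted in the message. */
#define SG_HDR_LEN 36u   /* bytes 0-35: header, left zeroed by the reproducer */
#define CDB_LEN    6u    /* bytes 36-41: 6-byte command descriptor block */
#define READ_6_OP  0x08  /* opcode stored in the first CDB byte */

/*
 * Zero buf and place a READ_6 opcode at the start of the CDB, as the
 * reproducer's initializer does. Returns the number of bytes left after
 * the CDB (the data length), or -1 if buf cannot hold header + CDB.
 */
static long build_read6_request(unsigned char *buf, size_t buf_len)
{
	assert(buf != NULL);

	if (buf_len < SG_HDR_LEN + CDB_LEN)
		return -1;

	memset(buf, 0, buf_len);
	buf[SG_HDR_LEN] = READ_6_OP;

	return (long)(buf_len - SG_HDR_LEN - CDB_LEN);
}
```

With a 43-byte buffer the helper reports a data length of 1, which is the
"1-byte" read the reproducer submits.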

Also this is a different bug from the crash in ata_bmdma_fill_sg() which is
fixed by "libata: fix length validation of ATAPI-relayed SCSI commands".

I'm guessing the driver is DMA'ing to somewhere it shouldn't be...

Eric
* Re: scsi: sg: assorted memory corruptions
From: Dmitry Vyukov @ 2018-02-04 11:10 UTC
To: Eric Biggers
Cc: Ben Hutchings, Tejun Heo, linux-ide, Doug Gilbert,
Bart Van Assche, jejb@linux.vnet.ibm.com,
linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org,
martin.petersen@oracle.com, syzkaller@googlegroups.com
On Sun, Feb 4, 2018 at 10:07 AM, Eric Biggers <ebiggers3@gmail.com> wrote:
> On Thu, Feb 01, 2018 at 05:21:12PM +0100, 'Dmitry Vyukov' via syzkaller wrote:
>> On Thu, Feb 1, 2018 at 5:17 PM, Ben Hutchings
>> <ben.hutchings@codethink.co.uk> wrote:
>> > On Thu, 2018-02-01 at 08:04 +0100, Dmitry Vyukov wrote:
>> >> On Thu, Feb 1, 2018 at 7:03 AM, Douglas Gilbert <dgilbert@interlog.com> wrote:
>> >> > On 2018-01-30 07:22 AM, Dmitry Vyukov wrote:
>> > [...]
>> >> > > [1:0:0:0] cd/dvd QEMU QEMU DVD-ROM 2.0. /dev/sr0 /dev/sg1
>> >> > >
>> >> > > # readlink /sys/class/scsi_generic/sg0
>> >> > >
>> >> > > ../../devices/pci0000:00/0000:00:01.1/ata1/host0/target0:0:0/0:0:0:0/scsi_generic/sg0
>> >> > >
>> >> > > # cat /sys/class/scsi_generic/sg0/device/vendor
>> >> > > ATA
>> >> >
>> >> >
>> >> > ^^^^^
>> >> > That subsystem is the culprit IMO, most likely libata.
>> >> >
>> >> > Until you can show this test failing on something other than an
>> >> > ATA disk, then I will treat this issue as closed.
>> >>
>> >> Hi Doug,
>> >>
>> >> Why is bug in ATA not a bug? Is it long unused by everybody? I've got
>> >> it by running qemu with default flags...
>> >
>> > If the bug is in libata then it's not on Doug to fix it since he's only
>> > maintaining sg.
>>
>>
>> Then I think we need to CC ata maintainers rather than treat it as closed.
>> +Tejun, linux-ide@, you can see full thread here:
>> https://groups.google.com/forum/#!topic/syzkaller/9RNr9Gu0MyY
>>
>
> To get memory corruption it's actually sufficient just to submit "1-byte" reads;
> there's no need for the SG_NEXT_CMD_LEN ioctl or anything:
>
> #include <fcntl.h>
> #include <unistd.h>
>
> int main()
> {
> 	int fd = open("/dev/sg0", O_RDWR);
> 	char buf[43] = { [36] = 0x08 /* READ_6 */ };
>
> 	for (;;)
> 		write(fd, buf, sizeof(buf));
> }
>
> (where /dev/sg0 is the default QEMU disk type, "82371SB PIIX3 IDE")
>
> The SCSI command descriptor block is the 6 bytes at indices 36-41, so index 42
> is the only data byte.
>
> Also this is a different bug from the crash in ata_bmdma_fill_sg() which is
> fixed by "libata: fix length validation of ATAPI-relayed SCSI commands".
>
> I'm guessing the driver is DMA'ing to somewhere it shouldn't be...

It would be good to add KASAN checks to the DMA code that issues
transfers. This is another case where a silent memory corruption
causes dozens of assorted crashes all over the kernel. If we add
checks, KASAN would pinpoint the exact stack that issues the bad
command. This may be the simplest way to debug this bug as well. I've
filed https://bugzilla.kernel.org/show_bug.cgi?id=198661 for this.
* Re: scsi: sg: assorted memory corruptions
From: Eric Biggers @ 2018-02-10 19:13 UTC
To: Dmitry Vyukov
Cc: Ben Hutchings, Tejun Heo, linux-ide, Doug Gilbert,
Bart Van Assche, jejb@linux.vnet.ibm.com,
linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org,
martin.petersen@oracle.com, syzkaller@googlegroups.com
On Sun, Feb 04, 2018 at 12:10:58PM +0100, Dmitry Vyukov wrote:
> >
> > To get memory corruption it's actually sufficient just to submit "1-byte" reads;
> > there's no need for the SG_NEXT_CMD_LEN ioctl or anything:
> >
> > #include <fcntl.h>
> > #include <unistd.h>
> >
> > int main()
> > {
> > 	int fd = open("/dev/sg0", O_RDWR);
> > 	char buf[43] = { [36] = 0x08 /* READ_6 */ };
> >
> > 	for (;;)
> > 		write(fd, buf, sizeof(buf));
> > }
> >
> > (where /dev/sg0 is the default QEMU disk type, "82371SB PIIX3 IDE")
> >
> > The SCSI command descriptor block is the 6 bytes at indices 36-41, so index 42
> > is the only data byte.
> >
> > Also this is a different bug from the crash in ata_bmdma_fill_sg() which is
> > fixed by "libata: fix length validation of ATAPI-relayed SCSI commands".
> >
> > I'm guessing the driver is DMA'ing to somewhere it shouldn't be...
>
> It would be good to add KASAN checks to the DMA code that issues
> transfers. This is another case where a silent memory corruption
> causes dozens of assorted crashes all over the kernel. If we add
> checks, KASAN would pinpoint the exact stack that issues the bad
> command. This may be the simplest way to debug this bug as well. I've
> filed https://bugzilla.kernel.org/show_bug.cgi?id=198661 for this.

It seems the problem is that, for "BMDMA" ATA disks, the disk (emulated in QEMU
here: https://github.com/qemu/qemu/blob/master/hw/ide/pci.c#L89) ignores the low
bit of each length in the PRD (Physical Region Descriptor) list for the DMA
transfer, so a length of 1 (byte) is interpreted as 0. At the same time, there
is also a special case where a length of 0 is interpreted as 65536 bytes.

So the disk will DMA up to 65536 bytes into a 1-byte buffer, causing massive
memory corruption.
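The arithmetic described above can be modelled in a few lines. This is only a
sketch of the behaviour as described in this message, not code taken from QEMU
or libata: the function names are hypothetical, and it is an assumption that
the driver stores the low 16 bits of the length in the PRD byte-count field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed: value a driver stores in the 16-bit PRD byte-count field. */
static uint16_t prd_count_field(size_t len)
{
	assert(len > 0 && len <= 65536);
	return (uint16_t)(len & 0xffff); /* 65536 is encoded as 0 */
}

/* Bytes the device transfers for a field value, per the description above. */
static uint32_t prd_device_len(uint16_t field)
{
	uint32_t len = field & 0xfffeu; /* low bit of the length is ignored */

	return len ? len : 65536;       /* 0 is the special case for 64 KiB */
}

/* Bytes written past the end of a len-byte buffer (0 if none). */
static uint32_t prd_overrun(size_t len)
{
	uint32_t xfer = prd_device_len(prd_count_field(len));

	return xfer > len ? xfer - (uint32_t)len : 0;
}
```

Under this model a 1-byte buffer yields a 65536-byte transfer, that is, 65535
bytes written past the end of the buffer, while an even length such as 512 is
transferred exactly.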

I'm not sure what the best fix is, but the lengths in the sglist probably need
to be required to have the alignment the disk needs.
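One way to picture such a requirement is a check that runs over the segment
lengths before the transfer is issued. The sketch below is purely illustrative:
sg_lengths_aligned() is a hypothetical helper operating on a plain array of
lengths, not a kernel API, and it does not represent the fix that was
eventually applied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical pre-flight check: returns true only if every segment length
 * is a non-zero multiple of align (2 for the case discussed here).
 */
static bool sg_lengths_aligned(const size_t *lens, size_t n, size_t align)
{
	assert(align > 0);

	for (size_t i = 0; i < n; i++) {
		if (lens[i] == 0 || lens[i] % align != 0)
			return false;
	}
	return true;
}
```

A list such as { 512, 4096, 2 } passes with align set to 2, while any list
containing a 1-byte segment is rejected.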

KASAN would not have helped here, unfortunately. Even if there were KASAN
checks when mapping the sglist for DMA or when filling in the PRD list, the
kernel would not have known that the disk would actually interpret "1" as
"65536".

- Eric