From: Dominique Martinet <asmadeus@codewreck.org>
To: syzbot <syzbot+2222c34dc40b515f30dc@syzkaller.appspotmail.com>
Cc: davem@davemloft.net, ericvh@gmail.com,
	linux-kernel@vger.kernel.org, lucho@ionkov.net,
	netdev@vger.kernel.org, rminnich@sandia.gov,
	syzkaller-bugs@googlegroups.com,
	v9fs-developer@lists.sourceforge.net
Subject: Re: BUG: corrupted list in p9_read_work
Date: Tue, 9 Oct 2018 04:09:49 +0200
Message-ID: <20181009020949.GA29622@nautica>
In-Reply-To: <000000000000fddb150577c15af6@google.com>

syzbot wrote on Mon, Oct 08, 2018:
> syzbot has found a reproducer for the following crash on:
> 
> HEAD commit:    0854ba5ff5c9 Merge git://git.kernel.org/pub/scm/linux/kern..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1514ec06400000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=88e9a8a39dc0be2d
> dashboard link: https://syzkaller.appspot.com/bug?extid=2222c34dc40b515f30dc
> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=10b91685400000
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+2222c34dc40b515f30dc@syzkaller.appspotmail.com
> 
> list_del corruption, ffff88019ae36ee8->next is LIST_POISON1
> (dead000000000100)
> ------------[ cut here ]------------
> [...]
>  list_del include/linux/list.h:125 [inline]
>  p9_read_work+0xab6/0x10e0 net/9p/trans_fd.c:379

Hmm this looks very much like the report from
syzbot+735d926e9d1317c3310c@syzkaller.appspotmail.com 
which should have been fixed by Tomas in 9f476d7c540cb
("net/9p/trans_fd.c: fix race by holding the lock")...

It looks like another double list_del. Looking at the code again, there
actually are other ways this could happen around connection errors.
For example,
 - p9_read_work receives something and lookup works... meanwhile
 - p9_write_work fails to write and calls p9_conn_cancel, which deletes
from the req_list without waiting for other works to finish (could also
happen in p9_poll_mux)
 - p9_read_work finishes processing the read and deletes from list again

For this one the simplest fix would probably be to just not
list_del/call p9_client_cb at all if m->r?req->status isn't
REQ_STATUS_ERROR in p9_read_work after the "got new packet" debug print.
Frankly I think that's saner, so I'll send a patch shortly doing that,
but I have zero confidence there aren't similar bugs around; the tcp
code is so messy... Most of the recent syzbot reports have been around
trans_fd, which I don't think is used much in real life, and this is not
really motivating (i.e. I think it would probably need a more extensive
rewrite, but nobody cares) :/
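
A minimal userspace sketch of the sequence above, not the actual
net/9p/trans_fd.c code. All names here (struct req, conn_cancel,
read_complete, the POISON values) are hypothetical stand-ins, and the
guard reads the proposal as "leave the request alone once the error path
has already marked it as errored", which is an assumption. The real code
would also need to do this under the client lock; the sketch is
single-threaded and only shows the ordering.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types, for illustration only. */
enum req_status { REQ_SENT, REQ_RCVD, REQ_ERROR };

struct list_node { struct list_node *next, *prev; };

/* Illustrative poison markers, loosely modelled on LIST_POISON1/2. */
#define POISON1 ((struct list_node *)0x100)
#define POISON2 ((struct list_node *)0x200)

struct req {
	enum req_status status;
	struct list_node req_list;
};

static void list_init(struct list_node *head)
{
	head->next = head->prev = head;
}

static void list_add_tail(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink n; return false if it was already unlinked (a double delete). */
static bool list_del_checked(struct list_node *n)
{
	if (n->next == POISON1 || n->prev == POISON2)
		return false;
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = POISON1;
	n->prev = POISON2;
	return true;
}

/* Error path: drops the request from the list and marks it errored. */
static void conn_cancel(struct req *r)
{
	if (list_del_checked(&r->req_list))
		r->status = REQ_ERROR;
}

/* Read path with the guard: skip requests the error path already took. */
static bool read_complete(struct req *r)
{
	if (r->status == REQ_ERROR)
		return false;	/* already removed from the list elsewhere */
	(void)list_del_checked(&r->req_list);
	r->status = REQ_RCVD;
	return true;
}
```

With the guard, calling conn_cancel() and then read_complete() on the
same request leaves the list untouched the second time; without it, the
second list_del would hit the poisoned pointers as in the report.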


Dmitry, on that note, do you think syzbot could possibly test other
transports somehow? rdma or virtio cannot be faked as easily as passing
a fd around, but I'd be very interested in seeing these flayed a bit.

(I'm also curious what logic is used to generate the syz tests. The
write$P9_Rxx replies have nothing to do with what the client would
expect, so it probably doesn't test very far; this test in particular
does not even get past the initial P9_TVERSION that the client would
expect immediately after mount, so it's basically only testing logic
around packet handling on error... Or if we're accepting a RREADDIR in
reply to TVERSION we have bigger problems, and now that I'm looking at
it I think we just might never check that... I'll look at that for the
next cycle)
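
For illustration, a reply-type check along these lines could look like
the sketch below. p9_reply_type_ok is a hypothetical helper, not an
existing kernel function; it relies on the 9P convention that a reply's
type number is the request's type plus one, and treats the two error
replies as always acceptable. Where such a check would belong in the
client is not something this sketch tries to answer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Subset of the 9P message type numbers. */
enum {
	P9_RLERROR  = 7,
	P9_TREADDIR = 40,
	P9_RREADDIR = 41,
	P9_TVERSION = 100,
	P9_RVERSION = 101,
	P9_RERROR   = 107,
};

/*
 * Hypothetical helper: accept a reply only if it is an error reply or
 * the R-message paired with the T-message that was sent (type + 1).
 */
static bool p9_reply_type_ok(uint8_t t_type, uint8_t r_type)
{
	if (r_type == P9_RERROR || r_type == P9_RLERROR)
		return true;
	return r_type == (uint8_t)(t_type + 1);
}
```

Under this check an RREADDIR received in reply to TVERSION would be
rejected, while RVERSION or an error reply would be accepted.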


Back to the current patch: since, as I said, I am not confident this is
a good enough fix for the current bug, will I get notified if the bug
happens again once the patch hits linux-next with the Reported-by tag?
(I don't have the setup necessary to run a syz repro as there is no C
repro, and won't have much time to do that setup, sorry)


> FS-Cache: N-cookie d=000000000a092700 n=00000000d8ee0022
> FS-Cache: N-key=[10] '34323935303034313132'
> FS-Cache: Duplicate cookie detected
> FS-Cache: O-cookie c=00000000911358e4 [p=000000006545c95d fl=222 nc=0 na=1]
> FS-Cache: O-cookie d=000000000a092700 n=000000007635356b
> FS-Cache: O-key=[10] '
> [...]

(on an unrelated topic, I got these FS-Cache warnings quite often when
testing with cache enabled and have no idea what they mean. I don't
normally use cache so haven't spent time looking at it, but I find these
rather worrying... If someone having a clue reads this, I'd love to hear
what they could mean and what we should look at)

Thanks,
-- 
Dominique