Re: [PATCH] e1000e: using bottom half to send packets

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Jason Wang <jasowang@redhat.com>
To: Li Qiang <liq3ea@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Dmitry Fleytman <dmitry.fleytman@gmail.com>,
	Li Qiang <liq3ea@163.com>,
	Qemu Developers <qemu-devel@nongnu.org>,
	"Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [PATCH] e1000e: using bottom half to send packets
Date: Fri, 17 Jul 2020 13:38:44 +0800	[thread overview]
Message-ID: <d9c249a1-5f63-7497-3783-3a3e8cf7b2da@redhat.com> (raw)
In-Reply-To: <CAKXe6SLe_ZRqQQMi2hPFBkauWnbaOPKN27fwrdaTOymb-fTrFg@mail.gmail.com>


On 2020/7/17 下午12:46, Li Qiang wrote:
> Jason Wang <jasowang@redhat.com> 于2020年7月17日周五 上午11:10写道：
>>
>> On 2020/7/17 上午12:14, Li Qiang wrote:
>>> Alexander Bulekov reported a UAF bug related e1000e packets send.
>>>
>>> -->https://bugs.launchpad.net/qemu/+bug/1886362
>>>
>>> This is because the guest trigger a e1000e packet send and set the
>>> data's address to e1000e's MMIO address. So when the e1000e do DMA
>>> it will write the MMIO again and trigger re-entrancy and finally
>>> causes this UAF.
>>>
>>> Paolo suggested to use a bottom half whenever MMIO is doing complicate
>>> things in here:
>>> -->https://lists.nongnu.org/archive/html/qemu-devel/2020-07/msg03342.html
>>>
>>> Reference here:
>>> 'The easiest solution is to delay processing of descriptors to a bottom
>>> half whenever MMIO is doing something complicated.  This is also better
>>> for latency because it will free the vCPU thread more quickly and leave
>>> the work to the I/O thread.'
>>
>> I think several things were missed in this patch (take virtio-net as a
>> reference), do we need the following things:
>>
> Thanks Jason,
> In fact I know this, I'm scared for touching this but I want to try.
> Thanks for your advice.
>
>> - Cancel the bh when VM is stopped.
> Ok. I think add a vm state change notifier for e1000e can address this.
>
>> - A throttle to prevent bh from executing too much timer?
> Ok, I think add a config timeout and add a timer in e1000e can address this.


Sorry, a typo. I meant we probably need a tx_burst as what virtio-net did.


>
>> - A flag to record whether or not this a pending tx (and migrate it?)
> Is just a flag enough? Could you explain more about the idea behind
> processing the virtio-net/e1000e using bh like this?


Virtio-net use a tx_waiting variable to record whether or not there's a 
pending bh. (E.g bh is cancelled due to vmstop, we need reschedule it 
after vmresume). Maybe we can do something simpler by just schecule bh 
unconditionally during vm resuming.


> For example, if the guest trigger a lot of packets send and if the bh
> is scheduled in IO thread. So will we lost packets?


We don't since we don't populate virtqueue which means packets are 
queued there.

Thanks


> How we avoid this in virtio-net.
>
> Thanks,
> Li Qiang
>
>
>
>> Thanks
>>
>>
>>> This patch fixes this UAF.
>>>
>>> Signed-off-by: Li Qiang <liq3ea@163.com>
>>> ---
>>>    hw/net/e1000e_core.c | 25 +++++++++++++++++--------
>>>    hw/net/e1000e_core.h |  2 ++
>>>    2 files changed, 19 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
>>> index bcd186cac5..6165b04b68 100644
>>> --- a/hw/net/e1000e_core.c
>>> +++ b/hw/net/e1000e_core.c
>>> @@ -2423,32 +2423,27 @@ e1000e_set_dbal(E1000ECore *core, int index, uint32_t val)
>>>    static void
>>>    e1000e_set_tctl(E1000ECore *core, int index, uint32_t val)
>>>    {
>>> -    E1000E_TxRing txr;
>>>        core->mac[index] = val;
>>>
>>>        if (core->mac[TARC0] & E1000_TARC_ENABLE) {
>>> -        e1000e_tx_ring_init(core, &txr, 0);
>>> -        e1000e_start_xmit(core, &txr);
>>> +        qemu_bh_schedule(core->tx[0].tx_bh);
>>>        }
>>>
>>>        if (core->mac[TARC1] & E1000_TARC_ENABLE) {
>>> -        e1000e_tx_ring_init(core, &txr, 1);
>>> -        e1000e_start_xmit(core, &txr);
>>> +        qemu_bh_schedule(core->tx[1].tx_bh);
>>>        }
>>>    }
>>>
>>>    static void
>>>    e1000e_set_tdt(E1000ECore *core, int index, uint32_t val)
>>>    {
>>> -    E1000E_TxRing txr;
>>>        int qidx = e1000e_mq_queue_idx(TDT, index);
>>>        uint32_t tarc_reg = (qidx == 0) ? TARC0 : TARC1;
>>>
>>>        core->mac[index] = val & 0xffff;
>>>
>>>        if (core->mac[tarc_reg] & E1000_TARC_ENABLE) {
>>> -        e1000e_tx_ring_init(core, &txr, qidx);
>>> -        e1000e_start_xmit(core, &txr);
>>> +        qemu_bh_schedule(core->tx[qidx].tx_bh);
>>>        }
>>>    }
>>>
>>> @@ -3322,6 +3317,16 @@ e1000e_vm_state_change(void *opaque, int running, RunState state)
>>>        }
>>>    }
>>>
>>> +static void e1000e_core_tx_bh(void *opaque)
>>> +{
>>> +    struct e1000e_tx *tx = opaque;
>>> +    E1000ECore *core = tx->core;
>>> +    E1000E_TxRing txr;
>>> +
>>> +    e1000e_tx_ring_init(core, &txr, tx - &core->tx[0]);
>>> +    e1000e_start_xmit(core, &txr);
>>> +}
>>> +
>>>    void
>>>    e1000e_core_pci_realize(E1000ECore     *core,
>>>                            const uint16_t *eeprom_templ,
>>> @@ -3340,6 +3345,8 @@ e1000e_core_pci_realize(E1000ECore     *core,
>>>        for (i = 0; i < E1000E_NUM_QUEUES; i++) {
>>>            net_tx_pkt_init(&core->tx[i].tx_pkt, core->owner,
>>>                            E1000E_MAX_TX_FRAGS, core->has_vnet);
>>> +        core->tx[i].core = core;
>>> +        core->tx[i].tx_bh = qemu_bh_new(e1000e_core_tx_bh, &core->tx[i]);
>>>        }
>>>
>>>        net_rx_pkt_init(&core->rx_pkt, core->has_vnet);
>>> @@ -3367,6 +3374,8 @@ e1000e_core_pci_uninit(E1000ECore *core)
>>>        for (i = 0; i < E1000E_NUM_QUEUES; i++) {
>>>            net_tx_pkt_reset(core->tx[i].tx_pkt);
>>>            net_tx_pkt_uninit(core->tx[i].tx_pkt);
>>> +        qemu_bh_delete(core->tx[i].tx_bh);
>>> +        core->tx[i].tx_bh = NULL;
>>>        }
>>>
>>>        net_rx_pkt_uninit(core->rx_pkt);
>>> diff --git a/hw/net/e1000e_core.h b/hw/net/e1000e_core.h
>>> index aee32f7e48..94ddc6afc2 100644
>>> --- a/hw/net/e1000e_core.h
>>> +++ b/hw/net/e1000e_core.h
>>> @@ -77,6 +77,8 @@ struct E1000Core {
>>>            unsigned char sum_needed;
>>>            bool cptse;
>>>            struct NetTxPkt *tx_pkt;
>>> +        QEMUBH *tx_bh;
>>> +        E1000ECore *core;
>>>        } tx[E1000E_NUM_QUEUES];
>>>
>>>        struct NetRxPkt *rx_pkt;

next prev parent reply	other threads:[~2020-07-17  5:39 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-16 16:14 [PATCH] e1000e: using bottom half to send packets Li Qiang
2020-07-17  3:10 ` Jason Wang
2020-07-17  4:46   ` Li Qiang
2020-07-17  5:38     ` Jason Wang [this message]
2020-07-17 15:46       ` Li Qiang
2020-07-20  3:59         ` Jason Wang
2020-07-20  4:41           ` Li Qiang
2020-07-17 15:52   ` Peter Maydell
2020-07-20  4:00     ` Jason Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d9c249a1-5f63-7497-3783-3a3e8cf7b2da@redhat.com \
    --to=jasowang@redhat.com \
    --cc=dmitry.fleytman@gmail.com \
    --cc=liq3ea@163.com \
    --cc=liq3ea@gmail.com \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).