All of lore.kernel.org
 help / color / mirror / Atom feed
From: Takashi Iwai <tiwai@suse.de>
To: Curtis Malainey <cujomalainey@google.com>
Cc: alsa-devel@alsa-project.org,
	Curtis Malainey <cujomalainey@chromium.org>,
	Stephen Boyd <swboyd@google.com>
Subject: Re: [PATCH RFC 0/6] ALSA: Fix UAF with delayed kobj release
Date: Wed, 16 Aug 2023 07:35:52 +0200	[thread overview]
Message-ID: <878raby2fb.wl-tiwai@suse.de> (raw)
In-Reply-To: <CAOReqxiq86kxj1HEmjYPUZ5gW49y2x9ZFYa8mAH1VWFAF7dEuw@mail.gmail.com>

On Tue, 15 Aug 2023 23:32:31 +0200,
Curtis Malainey wrote:
> 
> On Tue, Aug 15, 2023 at 9:07 AM Takashi Iwai <tiwai@suse.de> wrote:
> >
> > On Mon, 14 Aug 2023 22:20:29 +0200,
> > Curtis Malainey wrote:
> > >
> > > On Sun, Aug 13, 2023 at 1:08 AM Takashi Iwai <tiwai@suse.de> wrote:
> > > >
> > > > On Wed, 09 Aug 2023 23:11:45 +0200,
> > > > Curtis Malainey wrote:
> > > > >
> > > > > >
> > > > > > And now looking back at kobj code and device code, they do refcount
> > > > > > parent objects.  Maybe the problem is in our side -- the all devices
> > > > > > are created with the original real device as the parent, including the
> > > > > > card_dev, while there are some dependencies among children.  So, if we
> > > > > > build up a proper tree, pci_dev -> card_dev -> ctl_dev, pcm_dev, etc,
> > > > > > one of the problems could be solved.  It's more or less similar as
> > > > > > what I suggested initially (referring card_dev at pcm), while changing
> > > > > > the parent would make it implicitly.
> > > > >
> > > > > Yes I think this would be the long term proper way to go, that way
> > > > > parents just put children and remove their reference, then they
> > > > > cleanup on their own time making everyone happy. My first patch was a
> > > > > very lazy attempt that, if we wanted to do the right thing we would
> > > > > obviously have to split the structs and free functions to operate in
> > > > > their own release. If you have time feel free to take another swing at
> > > > > the patches, otherwise I won't be able to start until next week.
> > > >
> > > > Now looking back at the problem again, I noticed that actually my
> > > > previous comment was wrong: as default, the device dependencies aren't
> > > > kept at the release time, but it's already cleared at device_del()
> > > > call.  The device_del() calls kobject_del() and put_device(parent).
> > > > So after this moment, both device releases become independent, and
> > > > it'll hit a problem if the released object has still some dependency
> > > > (such as the case of card vs ctl_dev in our case).
> > > >
> > > > An extra dependency to card_dev as I put in my early patch would "fix"
> > > > it.  But, there is yet another problem: the call of dev_free call for
> > > > snd_device object with SNDRV_DEV_LOWLEVEL can happen before releasing
> > > > PCM and other devices when the delayed kobj release is enabled.  And,
> > > > usually this callback does release the top-level resources, which
> > > > might be still accessed during the other releases.
> > >
> > >
> > > I was doing some testing late last week that confirms these fears
> > > actually. Basically the rig was opening a non-blocking PCM and just
> > > never sending data, removing the hw sound device (in this case a USB
> > > peripheral) but not closing the userspace app. As a result kobjects
> > > were released out of order with the top level "sound" being released
> > > at disconnect of the USB device, but the actual card not being
> > > released until the app closed. See annotated logs below
> > >
> > > [  690.528577] kobject: 'sound' (00000000266cd308):
> > > kobject_add_internal: parent: '3-9:1.0', set: '(null)' <----- plug in
> > > device
> > > [  690.528607] kobject: 'card1' (00000000f7fa0903):
> > > kobject_add_internal: parent: 'sound', set: 'devices'
> > > [  690.528732] kobject: 'card1' (00000000f7fa0903): kobject_uevent_env
> > > [  690.528756] kobject: 'card1' (00000000f7fa0903): fill_kobj_path:
> > > path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1'
> > > [  690.528988] kobject: 'pcmC1D0p' (0000000095ff4473):
> > > kobject_add_internal: parent: 'card1', set: 'devices'
> > > [  690.529640] kobject: 'pcmC1D0p' (0000000095ff4473): fill_kobj_path:
> > > path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/pcmC1D0p'
> > > [  690.529778] kobject: 'pcmC1D0c' (00000000c4879d24):
> > > kobject_add_internal: parent: 'card1', set: 'devices'
> > > [  690.530006] kobject: 'pcmC1D0c' (00000000c4879d24): fill_kobj_path:
> > > path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/pcmC1D0c'
> > > [  690.530108] kobject: 'controlC1' (00000000a0a25449):
> > > kobject_add_internal: parent: 'card1', set: 'devices'
> > > [  690.530373] kobject: 'controlC1' (00000000a0a25449):
> > > fill_kobj_path: path =
> > > '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/controlC1'
> > > [  690.671889] kobject: 'card1' (00000000f7fa0903): kobject_uevent_env
> > > [  690.671906] kobject: 'card1' (00000000f7fa0903): fill_kobj_path:
> > > path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1'
> > > [  700.009244] kobject: 'controlC1' (00000000a0a25449):
> > > fill_kobj_path: path =
> > > '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/controlC1'
> > > [  700.010131] kobject: 'pcmC1D0p' (0000000095ff4473): fill_kobj_path:
> > > path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/pcmC1D0p'
> > > [  700.011344] kobject: 'pcmC1D0c' (00000000c4879d24): fill_kobj_path:
> > > path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/pcmC1D0c'
> > > [  700.012916] kobject: 'card1' (00000000f7fa0903): kobject_uevent_env
> > > [  700.012951] kobject: 'card1' (00000000f7fa0903): fill_kobj_path:
> > > path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1'
> > >  <---- start blocked playback here
> > > [  700.013057] kobject: 'sound' (00000000266cd308): kobject_release,
> > > parent 0000000000000000 (delayed 1000) <--- unplug usb device
> > > [  701.054221] kobject: 'sound' (00000000266cd308): kobject_cleanup,
> > > parent 0000000000000000
> > > [  701.054236] kobject: 'sound' (00000000266cd308): calling ktype release
> > > [  701.054257] kobject: 'sound': free name
> > > [  713.639843] kobject: 'card1' (00000000f7fa0903): kobject_release,
> > > parent 0000000000000000 (delayed 3000) <--- Send EOF to playback
> > > stream
> > > [  716.669776] kobject: 'card1' (00000000f7fa0903): kobject_cleanup,
> > > parent 0000000000000000
> > > [  716.669810] kobject: 'card1' (00000000f7fa0903): calling ktype release
> > > [  716.670834] kobject: 'card1': free name
> > >
> > > >
> > > >
> > > > So, if we tie the object resource with each struct device release, we
> > > > have a lot of works:
> > > > 1. Add extra dependencies among device hierarchy
> > > > 2. Don't use card_dev refcount for managing the sync to device closes,
> > > >    introduce another kref instead; otherwise card_dev refcount would
> > > >    never reach to zero
> > > > 3. Fix race of devres vs card_dev release
> > > > 4. Move the second half part of snd_card_do_free() to the release
> > > >    callback of card_dev itself to sync with the top-level release
> > > > 5. Rewrite all SNDRV_DEV_LOWLEVEL usages to be called via
> > > >    card->private_free or such;
> > > >    maybe the only problem is hda_intel.c and hda_tegra.c that need
> > > >    some work at the disconnection time, and we may introduce another
> > > >    hook in the card object to replace that
> > > >
> > > > And, at this moment, I feel that it'd be easier to go back to the
> > > > early way of device management, i.e. it'll be just like your patch,
> > > > managing the device object independently from the rest resources.
> > > > (This means also that the way freeing the resource for hwdep and
> > > >  rawmidi will go back again without the embedded device, too; they
> > > >  also suffer from the same problem of .)
> > >
> > > I agree, I think as a simple stopgap, my earlier patch would at least
> > > appease the test until we can figure out the best way to do some
> > > heavier work on the kobj. I think the proxy pointer for devres would
> > > also be the best short term for 3.
> > >
> > > >
> > > > The change 2 and 3 above can be still applied with your change, which
> > > > will fix the remaining devres-vs-card_dev problem.
> > >
> > > I am not sure I follow the need for 2. If we broke ctl_dev out into
> > > its own memory region and structured everything as a proper tree we
> > > would have a proper cleanup and be able to use the refcounting
> > > properly.
> >
> > My thought was about the devres release that does kfree() of the card
> > while the card's card_dev release itself is still delayed.
> > This might be a needless fear, though, as snd_card_free() should sync
> > with the actual card_dev release.
> >
> > But, splitting the release-trigger and the actual memory release could
> > be still worth.
> >
> > > > Once after fixing the current problem, we may work further on other
> > > > stuff (e.g. item 5), so that we can switch again to the device-release
> > > > model eventually later, too.
> > >
> > > Agreed, I don't have any experience with SNDRV_DEV_LOWLEVEL but I am
> > > happy to help out here where I can.
> > >
> > > I am going to see if I can split the release card as mentioned but
> > > also have refcount work as expected and have the release calls roll up
> > > the tree.
> >
> > I quickly worked on and made a patch series.
> > It's put in topic/dev-split branch of sound git tree
> >   https://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git/log/?h=topic/dev-split
> >
> > It'd be appreciated if you can review / test it.
> 
> Took a look and ran it through the tests
> 
> You need to apply this diff

Thanks, I'll fix it up.

> 
> diff --git a/sound/usb/media.c b/sound/usb/media.c
> index 6d11fedb46326..d48db6f3ae659 100644
> --- a/sound/usb/media.c
> +++ b/sound/usb/media.c
> @@ -35,7 +35,7 @@ int snd_media_stream_init(struct snd_usb_substream
> *subs, struct snd_pcm *pcm,
>  {
>         struct media_device *mdev;
>         struct media_ctl *mctl;
> -       struct device *pcm_dev = &pcm->streams[stream].dev;
> +       struct device *pcm_dev = pcm->streams[stream].dev;
>         u32 intf_type;
>         int ret = 0;
>         u16 mixer_pad;
> 
> Hammering probe and remove appears to be fine. Went 45min without issue.
> 
> Userspace holding references past hw removal appears to still be
> broken as sound is released while the app is still running.
> 
> -- remove usb device --
> [ 4819.827476] kobject: 'controlC1' (00000000255a51c8):
> fill_kobj_path: path =
> '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/controlC1'
> [ 4819.828114] kobject: 'pcmC1D0p' (00000000f0627532): kobject_uevent_env
> [ 4819.828145] kobject: 'pcmC1D0p' (00000000f0627532): fill_kobj_path:
> path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/pcmC1D0p'
> [ 4819.828822] kobject: 'pcmC1D0c' (000000001b707a15): kobject_uevent_env
> [ 4819.828850] kobject: 'pcmC1D0c' (000000001b707a15): fill_kobj_path:
> path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/pcmC1D0c'
> [ 4819.829405] kobject: 'card1' (000000005bce975e): kobject_uevent_env
> [ 4819.829428] kobject: 'card1' (000000005bce975e): fill_kobj_path:
> path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1'
> [ 4819.829516] kobject: 'sound' (000000000bb52434): kobject_release,
> parent 0000000000000000 (delayed 4000)
> [ 4823.873625] kobject: 'sound' (000000000bb52434): kobject_cleanup,
> parent 0000000000000000
> [ 4823.873645] kobject: 'sound' (000000000bb52434): calling ktype release
> [ 4823.873654] kobject: 'sound': free name
> 
> -- end app --
> [ 4849.581815] kobject: 'pcmC1D0p' (00000000f0627532):
> kobject_release, parent 0000000000000000 (delayed 2000)
> [ 4849.581960] kobject: 'pcmC1D0c' (000000001b707a15):
> kobject_release, parent 0000000000000000 (delayed 2000)
> [ 4849.582626] kobject: 'card1' (000000005bce975e): kobject_release,
> parent 0000000000000000 (delayed 1000)
> [ 4850.625615] kobject: 'card1' (000000005bce975e): kobject_cleanup,
> parent 0000000000000000
> [ 4850.625638] kobject: 'card1' (000000005bce975e): calling ktype release
> [ 4850.625663] kobject: 'card1': free name
> [ 4851.585647] kobject: 'pcmC1D0c' (000000001b707a15):
> kobject_cleanup, parent 0000000000000000
> [ 4851.585672] kobject: 'pcmC1D0c' (000000001b707a15): calling ktype release
> [ 4851.585708] kobject: 'pcmC1D0c': free name
> [ 4851.585727] kobject: 'pcmC1D0p' (00000000f0627532):
> kobject_cleanup, parent 0000000000000000
> [ 4851.585737] kobject: 'pcmC1D0p' (00000000f0627532): calling ktype release
> [ 4851.585752] kobject: 'pcmC1D0p': free name

It's the designed behavior.  Those are device *files* that are deleted
immediately at the disconnection while the application is still
running.  It's for avoiding a new application to be started after the
disconnect.  That is, only the device files in /dev/snd/* become
invisible.  Meanwhile, the already opened objects are still handled
internally.

> I still don't understand why you need the kref. The devices are
> already reference counting, why not use them? If we split them up into
> their own structs we could then just device_put everything on removal
> and let it roll up the tree with releases automatically, blocking
> where userspace is still holding references. I will share a patches
> sometime this week of what I mean. They will probably be a bit bigger
> blast radius but I think its what is needed here.

We want to trigger the top-level release free procedure once when all
files are closed.  This top-level release does put_device() of all
belonging devices.

The card_dev device refcount was used for this purpose.  OTOH, if we
want to construct the topology of the devices until the actual
deletion (i.e. keep card_dev until pcm and others are really
released/deleted), the card_dev refcount will be used for managing
the topology, too.  So, it'll get a side-effect side-effect that the
card_dev refcount won't be zero even after all files are closed (it's
referred from the children).

So, it's a kind of preparation for the future.


Takashi

  reply	other threads:[~2023-08-16  5:41 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-07 13:52 [PATCH RFC 0/6] ALSA: Fix UAF with delayed kobj release Takashi Iwai
2023-08-07 13:52 ` [PATCH RFC 1/6] ALSA: core: Introduced referenced memory allocator Takashi Iwai
2023-08-07 13:52 ` [PATCH RFC 2/6] ALSA: core: Fix potential UAF by delayed kobject release of card_dev Takashi Iwai
2023-08-07 13:52 ` [PATCH 2/6] ALSA: core: Fix race between devres and delayed kobject release for card_dev Takashi Iwai
2023-08-07 13:56   ` Takashi Iwai
2023-08-07 13:52 ` [PATCH RFC 3/6] ALSA: core: Associate memory reference with device initialization Takashi Iwai
2023-08-07 13:52 ` [PATCH RFC 4/6] ALSA: pcm: Release memory with reference Takashi Iwai
2023-08-07 13:52 ` [PATCH RFC 5/6] ALSA: control: Reference card by ctl_dev Takashi Iwai
2023-08-07 13:52 ` [PATCH RFC 6/6] ALSA: compress: Reference card by the device Takashi Iwai
2023-08-07 22:34 ` [PATCH RFC 0/6] ALSA: Fix UAF with delayed kobj release Curtis Malainey
2023-08-08 19:26   ` Curtis Malainey
2023-08-09  8:10     ` Takashi Iwai
2023-08-09 13:27       ` Takashi Iwai
2023-08-09 21:11         ` Curtis Malainey
2023-08-13  8:08           ` Takashi Iwai
2023-08-14 20:20             ` Curtis Malainey
2023-08-15 16:07               ` Takashi Iwai
2023-08-15 21:32                 ` Curtis Malainey
2023-08-16  5:35                   ` Takashi Iwai [this message]
2023-08-16  5:47                     ` Takashi Iwai
2023-08-16 21:46                     ` Curtis Malainey
2023-08-17  6:21                       ` Takashi Iwai
2023-08-17 17:25                         ` Curtis Malainey
2023-08-18  0:41                           ` Curtis Malainey

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=878raby2fb.wl-tiwai@suse.de \
    --to=tiwai@suse.de \
    --cc=alsa-devel@alsa-project.org \
    --cc=cujomalainey@chromium.org \
    --cc=cujomalainey@google.com \
    --cc=swboyd@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.