* Memory corruption in algif_skciper AIO sendpage with multiple iocb @ 2016-09-12 12:43 Stephan Mueller 2016-09-13 8:18 ` [PATCH v3] crypto: only call put_page on referenced and used pages Stephan Mueller 0 siblings, 1 reply; 5+ messages in thread From: Stephan Mueller @ 2016-09-12 12:43 UTC (permalink / raw) To: herbert; +Cc: linux-crypto Hi Herbert, after getting the AIO code working on sendmsg, tried it with vmsplice/splice and I get a memory corruption. Interestingly, the stack trace is partially garbled too. Thus, tracking this one down may be a bit of a challenge. Ciao Stephan ^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH v3] crypto: only call put_page on referenced and used pages 2016-09-12 12:43 Memory corruption in algif_skciper AIO sendpage with multiple iocb Stephan Mueller @ 2016-09-13 8:18 ` Stephan Mueller 2016-09-13 10:08 ` Herbert Xu 0 siblings, 1 reply; 5+ messages in thread From: Stephan Mueller @ 2016-09-13 8:18 UTC (permalink / raw) To: herbert; +Cc: linux-crypto Am Montag, 12. September 2016, 14:43:45 CEST schrieb Stephan Mueller: Hi Herbert, > Hi Herbert, > > after getting the AIO code working on sendmsg, tried it with vmsplice/splice > and I get a memory corruption. Interestingly, the stack trace is partially > garbled too. Thus, tracking this one down may be a bit of a challenge. The issue is a NULL pointer dereference in skcipher_free_async_sgls. The issue is that SGs may not have even a page mapped to them and thus the page entry is NULL. The following patch fixes the issue and replaces the patch I sent earlier. ---8<--- For asynchronous operation, SGs are allocated without a page mapped to them or with a page that is not used (ref-counted). If the SGL is freed, the code must only call put_page for an SG if there was a page assigned and ref-counted in the first place. This fixes a kernel crash when using io_submit with more than one iocb using the sendmsg and sendpage (vmsplice/splice) interface Signed-off-by: Stephan Mueller <smueller@chronox.de> --- crypto/algif_skcipher.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c index 28556fc..45af0fe 100644 --- a/crypto/algif_skcipher.c +++ b/crypto/algif_skcipher.c @@ -86,8 +86,13 @@ static void skcipher_free_async_sgls(struct skcipher_async_req *sreq) } sgl = sreq->tsg; n = sg_nents(sgl); - for_each_sg(sgl, sg, n, i) - put_page(sg_page(sg)); + for_each_sg(sgl, sg, n, i) { + struct page *page = sg_page(sg); + + /* some SGs may not have a page mapped */ + if (page && page_ref_count(page)) + put_page(page); + } kfree(sreq->tsg); } -- 2.7.4 ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH v3] crypto: only call put_page on referenced and used pages 2016-09-13 8:18 ` [PATCH v3] crypto: only call put_page on referenced and used pages Stephan Mueller @ 2016-09-13 10:08 ` Herbert Xu 2016-09-13 11:27 ` Stephan Mueller 0 siblings, 1 reply; 5+ messages in thread From: Herbert Xu @ 2016-09-13 10:08 UTC (permalink / raw) To: Stephan Mueller; +Cc: linux-crypto On Tue, Sep 13, 2016 at 10:18:54AM +0200, Stephan Mueller wrote: > Am Montag, 12. September 2016, 14:43:45 CEST schrieb Stephan Mueller: > > Hi Herbert, > > > Hi Herbert, > > > > after getting the AIO code working on sendmsg, tried it with vmsplice/splice > > and I get a memory corruption. Interestingly, the stack trace is partially > > garbled too. Thus, tracking this one down may be a bit of a challenge. > > The issue is a NULL pointer dereference in skcipher_free_async_sgls. The issue is that SGs may not have even a page mapped to them and thus the page entry is NULL. > > The following patch fixes the issue and replaces the patch I sent earlier. This patch appears to be papering over a real bug. The async path should be exactly the same as the sync path, except that we don't wait for completion. So the question is why are we getting this crash here for async but not sync? Cheers, -- Email: Herbert Xu <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v3] crypto: only call put_page on referenced and used pages 2016-09-13 10:08 ` Herbert Xu @ 2016-09-13 11:27 ` Stephan Mueller 2016-11-11 14:28 ` Stephan Mueller 0 siblings, 1 reply; 5+ messages in thread From: Stephan Mueller @ 2016-09-13 11:27 UTC (permalink / raw) To: Herbert Xu; +Cc: linux-crypto Am Dienstag, 13. September 2016, 18:08:16 CEST schrieb Herbert Xu: Hi Herbert, > This patch appears to be papering over a real bug. > > The async path should be exactly the same as the sync path, except > that we don't wait for completion. So the question is why are we > getting this crash here for async but not sync? At least one reason is found in skcipher_recvmsg_async with the following code path: if (txbufs == tx_nents) { struct scatterlist *tmp; int x; /* Ran out of tx slots in async request * need to expand */ tmp = kcalloc(tx_nents * 2, sizeof(*tmp), GFP_KERNEL); if (!tmp) goto free; sg_init_table(tmp, tx_nents * 2); for (x = 0; x < tx_nents; x++) sg_set_page(&tmp[x], sg_page(&sreq->tsg[x]), sreq->tsg[x].length, sreq->tsg[x].offset); kfree(sreq->tsg); sreq->tsg = tmp; tx_nents *= 2; mark = true; } ==> the code allocates twice the amount of the previously existing memory, copies the existing SGs over, but does not set the remaining SGs to anything. If the caller provides less pages than the number of allocated SGs, some SGs are unset. Hence, the deallocation must not do anything with the yet uninitialized SGs. Ciao Stephan ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v3] crypto: only call put_page on referenced and used pages 2016-09-13 11:27 ` Stephan Mueller @ 2016-11-11 14:28 ` Stephan Mueller 0 siblings, 0 replies; 5+ messages in thread From: Stephan Mueller @ 2016-11-11 14:28 UTC (permalink / raw) To: Herbert Xu; +Cc: linux-crypto Am Dienstag, 13. September 2016, 13:27:34 CET schrieb Stephan Mueller: Hi Herbert, > Am Dienstag, 13. September 2016, 18:08:16 CEST schrieb Herbert Xu: > > Hi Herbert, > > > This patch appears to be papering over a real bug. > > > > The async path should be exactly the same as the sync path, except > > that we don't wait for completion. So the question is why are we > > getting this crash here for async but not sync? > > At least one reason is found in skcipher_recvmsg_async with the following > code path: > > if (txbufs == tx_nents) { > struct scatterlist *tmp; > int x; > /* Ran out of tx slots in async request > * need to expand */ > tmp = kcalloc(tx_nents * 2, sizeof(*tmp), > GFP_KERNEL); > if (!tmp) > goto free; > > sg_init_table(tmp, tx_nents * 2); > for (x = 0; x < tx_nents; x++) > sg_set_page(&tmp[x], sg_page(&sreq->tsg[x]), > sreq->tsg[x].length, > sreq->tsg[x].offset); > kfree(sreq->tsg); > sreq->tsg = tmp; > tx_nents *= 2; > mark = true; > } > > > ==> the code allocates twice the amount of the previously existing memory, > copies the existing SGs over, but does not set the remaining SGs to > anything. If the caller provides less pages than the number of allocated > SGs, some SGs are unset. Hence, the deallocation must not do anything with > the yet uninitialized SGs. I looked into the issue a bit deeper. In addition to the aforementioned code, the following code seems to be a second culprit: tx_nents = skcipher_all_sg_nents(ctx); sreq->tsg = kcalloc(tx_nents, sizeof(*sg), GFP_KERNEL); if (unlikely(!sreq->tsg)) goto unlock; sg_init_table(sreq->tsg, tx_nents); Here again, an SGL is initialized, but there are no pages mapped to the SGs. May I ask you to reconsider this patch as well as the patch "[PATCH] crypto: call put_page on used pages only" from September 10 since the current code of libkcapi can easily trigger these bugs and lead to a kernel crash. If you consider the patches papering over the heart of the problem, may I ask for suggestions on how the mentioned code should be changed such that the issues are removed? If the suggestion is to re-architect the memory handling in the async part, may I ask to at least apply the patches for now with the goal to have time for re-architecting the async code and yet have no open holes that lead to crashes? Thanks. Ciao Stephan ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2016-11-11 14:28 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2016-09-12 12:43 Memory corruption in algif_skciper AIO sendpage with multiple iocb Stephan Mueller 2016-09-13 8:18 ` [PATCH v3] crypto: only call put_page on referenced and used pages Stephan Mueller 2016-09-13 10:08 ` Herbert Xu 2016-09-13 11:27 ` Stephan Mueller 2016-11-11 14:28 ` Stephan Mueller
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox