From: Jens Axboe <axboe@kernel.dk>
mainline inclusion
from mainline-5.1-rc6
commit b19062a567266ee1f10f6709325f766bbcc07d1c
category: feature
bugzilla: https://bugzilla.openeuler.org/show_bug.cgi?id=27
CVE: NA

---------------------------
If we have multiple threads, one doing io_uring_enter() while the other is doing io_uring_register(), we can run into a deadlock between the two. io_uring_register() must wait for existing users of the io_uring instance to exit. But it does so while holding the io_uring mutex. Callers of io_uring_enter() may need this mutex to make progress (and eventually exit). If we wait for users to exit in io_uring_register(), we can't do so with the io_uring mutex held without potentially risking a deadlock.
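Schematically, one problematic interleaving looks like this (a simplified sketch, not the exact call chains; the lock and reference names are the ones touched by the patch below):

    task A: io_uring_register()            task B: io_uring_enter()
    ---------------------------            ------------------------
    mutex_lock(&ctx->uring_lock);
    percpu_ref_kill(&ctx->refs);
                                           mutex_lock(&ctx->uring_lock);
                                             blocks on A, while still
                                             holding its ctx->refs reference
    wait_for_completion(&ctx->ctx_done);
      never completes: B cannot drop its
      reference until it gets the lock
      that A is holding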
Drop the io_uring mutex while waiting for existing callers to exit. This is safe and guaranteed to make forward progress, since we already killed the percpu ref before doing so. Hence later callers of io_uring_enter() will be rejected.
Reported-by: syzbot+16dc03452dee970a0c3e@syzkaller.appspotmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: yangerkun <yangerkun@huawei.com>
Reviewed-by: zhangyi (F) <yi.zhang@huawei.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
---
 fs/io_uring.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 77aea48e1e61..742f541ccf09 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2926,11 +2926,23 @@ SYSCALL_DEFINE2(io_uring_setup, u32, entries,
 static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode,
 			       void __user *arg, unsigned nr_args)
+	__releases(ctx->uring_lock)
+	__acquires(ctx->uring_lock)
 {
 	int ret;
 	percpu_ref_kill(&ctx->refs);
+
+	/*
+	 * Drop uring mutex before waiting for references to exit. If another
+	 * thread is currently inside io_uring_enter() it might need to grab
+	 * the uring_lock to make progress. If we hold it here across the drain
+	 * wait, then we can deadlock. It's safe to drop the mutex here, since
+	 * no new references will come in after we've killed the percpu ref.
+	 */
+	mutex_unlock(&ctx->uring_lock);
 	wait_for_completion(&ctx->ctx_done);
+	mutex_lock(&ctx->uring_lock);
 	switch (opcode) {
 	case IORING_REGISTER_BUFFERS: