android_kernel_asus_sm8350/fs
Filipe Manana eb576fc43a btrfs: fix space cache memory leak after transaction abort
commit bbc37d6e475eee8ffa2156ec813efc6bbb43c06d upstream.

If a transaction aborts it can cause a memory leak of the pages array of
a block group's io_ctl structure. The following steps explain how that can
happen:

1) Transaction N is committing, currently in state TRANS_STATE_UNBLOCKED
   and it's about to start writing out dirty extent buffers;

2) Transaction N + 1 already started and another task, task A, just called
   btrfs_commit_transaction() on it;

3) Block group B was dirtied (extents allocated from it) by transaction
   N + 1, so when task A calls btrfs_start_dirty_block_groups(), at the
   very beginning of the transaction commit, it starts writeback for the
   block group's space cache by calling btrfs_write_out_cache(), which
   allocates the pages array for the block group's io_ctl with a call to
   io_ctl_init(). Block group A is added to the io_list of transaction
   N + 1 by btrfs_start_dirty_block_groups();

4) While transaction N's commit is writing out the extent buffers, it gets
   an IO error and aborts transaction N, also setting the file system to
   RO mode;

5) Task A has already returned from btrfs_start_dirty_block_groups(), is at
   btrfs_commit_transaction() and has set transaction N + 1 state to
   TRANS_STATE_COMMIT_START. Immediately after that it checks that the
   filesystem was turned to RO mode, due to transaction N's abort, and
   jumps to the "cleanup_transaction" label. After that we end up at
   btrfs_cleanup_one_transaction() which calls btrfs_cleanup_dirty_bgs().
   That helper finds block group B in the transaction's io_list but it
   never releases the pages array of the block group's io_ctl, resulting in
   a memory leak.

In fact at the point when we are at btrfs_cleanup_dirty_bgs(), the pages
array points to pages that were already released by us at
__btrfs_write_out_cache() through the call to io_ctl_drop_pages(). We end
up freeing the pages array only after waiting for the ordered extent to
complete through btrfs_wait_cache_io(), which calls io_ctl_free() to do
that. But in the transaction abort case we don't wait for the space cache's
ordered extent to complete through a call to btrfs_wait_cache_io(), so
that's why we end up with a memory leak - we wait for the ordered extent
to complete indirectly by shutting down the work queues and waiting for
any jobs in them to complete before returning from close_ctree().

We can solve the leak simply by freeing the pages array right after
releasing the pages (with the call to io_ctl_drop_pages()) at
__btrfs_write_out_cache(), since we will never use it anymore after that
and the pages array points to already released pages at that point, which
is currently not a problem since no one will use it after that, but not a
good practice anyway since it can easily lead to use-after-free issues.

So fix this by freeing the pages array right after releasing the pages at
__btrfs_write_out_cache().

This issue can often be reproduced with test case generic/475 from fstests
and kmemleak can detect it and reports it with the following trace:

unreferenced object 0xffff9bbf009fa600 (size 512):
  comm "fsstress", pid 38807, jiffies 4298504428 (age 22.028s)
  hex dump (first 32 bytes):
    00 a0 7c 4d 3d ed ff ff 40 a0 7c 4d 3d ed ff ff  ..|M=...@.|M=...
    80 a0 7c 4d 3d ed ff ff c0 a0 7c 4d 3d ed ff ff  ..|M=.....|M=...
  backtrace:
    [<00000000f4b5cfe2>] __kmalloc+0x1a8/0x3e0
    [<0000000028665e7f>] io_ctl_init+0xa7/0x120 [btrfs]
    [<00000000a1f95b2d>] __btrfs_write_out_cache+0x86/0x4a0 [btrfs]
    [<00000000207ea1b0>] btrfs_write_out_cache+0x7f/0xf0 [btrfs]
    [<00000000af21f534>] btrfs_start_dirty_block_groups+0x27b/0x580 [btrfs]
    [<00000000c3c23d44>] btrfs_commit_transaction+0xa6f/0xe70 [btrfs]
    [<000000009588930c>] create_subvol+0x581/0x9a0 [btrfs]
    [<000000009ef2fd7f>] btrfs_mksubvol+0x3fb/0x4a0 [btrfs]
    [<00000000474e5187>] __btrfs_ioctl_snap_create+0x119/0x1a0 [btrfs]
    [<00000000708ee349>] btrfs_ioctl_snap_create_v2+0xb0/0xf0 [btrfs]
    [<00000000ea60106f>] btrfs_ioctl+0x12c/0x3130 [btrfs]
    [<000000005c923d6d>] __x64_sys_ioctl+0x83/0xb0
    [<0000000043ace2c9>] do_syscall_64+0x33/0x80
    [<00000000904efbce>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

CC: stable@vger.kernel.org # 4.9+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-03 11:27:02 +02:00
..
9p 9p: Fix memory leak in v9fs_mount 2020-08-19 08:16:24 +02:00
adfs
affs affs: fix a memory leak in affs_remount 2020-01-17 19:48:50 +01:00
afs afs: Fix NULL deref in afs_dynroot_depopulate() 2020-08-26 10:41:05 +02:00
autofs autofs: fix a leak in autofs_expire_indirect() 2019-10-25 00:03:11 -04:00
befs fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
bfs fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
btrfs btrfs: fix space cache memory leak after transaction abort 2020-09-03 11:27:02 +02:00
cachefiles cachefiles: Fix race between read_waiter and read_copier involving op->to_do 2020-06-03 08:21:11 +02:00
ceph ceph: do not access the kiocb after aio requests 2020-09-03 11:26:47 +02:00
cifs cifs: Fix leak when handling lease break for cached root fid 2020-08-21 13:05:24 +02:00
coda y2038: add inode timestamp clamping 2019-09-19 09:42:37 -07:00
configfs configfs: fix config_item refcnt leak in configfs_rmdir() 2020-05-27 17:46:30 +02:00
cramfs cramfs: fix usage on non-MTD device 2019-11-23 21:44:49 -05:00
crypto fscrypt: don't evict dirty inodes after removing key 2020-03-18 07:17:53 +01:00
debugfs debugfs: Check module state before warning in {full/open}_proxy_open() 2020-04-17 10:50:02 +02:00
devpts devpts_pty_kill(): don't bother with d_delete() 2019-09-03 09:30:56 -04:00
dlm dlm: Fix kobject memleak 2020-08-19 08:16:21 +02:00
ecryptfs ecryptfs: replace BUG_ON with error handling code 2020-02-28 17:22:26 +01:00
efivarfs
efs fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
erofs erofs: fix extended inode could cross boundary 2020-08-19 08:16:26 +02:00
exportfs exportfs_decode_fh(): negative pinned may become positive without the parent locked 2019-11-10 11:56:05 -05:00
ext2 ext2: fix missing percpu_counter_inc 2020-08-21 13:05:26 +02:00
ext4 fs: prevent BUG_ON in submit_bh_wbc() 2020-09-03 11:26:57 +02:00
f2fs f2fs: fix use-after-free issue 2020-09-03 11:26:46 +02:00
fat fat: don't allow to mount if the FAT length == 0 2020-06-17 16:40:36 +02:00
freevxfs fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
fscache
fuse fuse: fix weird page warning 2020-07-29 10:18:28 +02:00
gfs2 gfs2: Never call gfs2_block_zero_range with an open transaction 2020-08-26 10:40:48 +02:00
hfs
hfsplus hfsplus: fix crash and filesystem corruption when deleting files 2020-04-17 10:50:22 +02:00
hostfs
hpfs fs: hpfs: Initialize filesystem timestamp ranges 2019-08-30 08:11:25 -07:00
hugetlbfs hugetlbfs: prevent filesystem stacking of hugetlbfs 2020-09-03 11:26:48 +02:00
iomap iomap: fix return value of iomap_dio_bio_actor on 32bit systems 2020-01-04 19:17:31 +01:00
isofs y2038: add inode timestamp clamping 2019-09-19 09:42:37 -07:00
jbd2 jbd2: abort journal if free a async write error metadata buffer 2020-09-03 11:26:56 +02:00
jffs2 jffs2: fix UAF problem 2020-08-26 10:40:56 +02:00
jfs y2038: add inode timestamp clamping 2019-09-19 09:42:37 -07:00
kernfs kernfs: do not call fsnotify() with name without a parent 2020-08-19 08:16:12 +02:00
lockd
minix fs/minix: remove expected error message in block_to_path() 2020-08-21 13:05:37 +02:00
nfs nfs: Fix getxattr kernel panic and memory overflow 2020-08-21 13:05:37 +02:00
nfs_common
nfsd nfsd: Fix NFSv4 READ on RDMA when using readv 2020-08-11 15:33:42 +02:00
nilfs2 nilfs2: fix null pointer dereference at nilfs_segctor_do_construct() 2020-06-17 16:40:29 +02:00
nls
notify fanotify: fix ignore mask logic for events on child and on dir 2020-06-17 16:40:24 +02:00
ntfs utimes: Clamp the timestamps in notify_change() 2020-02-11 04:35:12 -08:00
ocfs2 ocfs2: change slot number type s16 to u16 2020-08-21 13:05:26 +02:00
omfs fs: omfs: Initialize filesystem timestamp ranges 2019-08-30 08:11:25 -07:00
openpromfs
orangefs orangefs: get rid of knob code... 2020-08-21 13:05:29 +02:00
overlayfs ovl: fix unneeded call to ovl_change_flags() 2020-07-22 09:33:12 +02:00
proc proc: Use new_inode not new_inode_pseudo 2020-06-17 16:40:33 +02:00
pstore pstore: Fix linking when crypto API disabled 2020-08-19 08:16:27 +02:00
qnx4 fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
qnx6 fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
quota fs: avoid softlockups in s_inodes iterators 2020-01-12 12:21:37 +01:00
ramfs vfs: Convert ramfs, shmem, tmpfs, devtmpfs, rootfs to use the new mount API 2019-09-12 21:05:34 -04:00
reiserfs reiserfs: prevent NULL pointer dereference in reiserfs_insert_item() 2020-02-24 08:37:00 +01:00
romfs romfs: fix uninitialized memory leak in romfs_dev_read() 2020-08-26 10:40:51 +02:00
squashfs Merge branch 'work.mount2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-09-19 10:06:57 -07:00
sysfs
sysv fs: sysv: Initialize filesystem timestamp ranges 2019-08-30 07:27:18 -07:00
tracefs tracing: Do not create tracefs files if tracefs lockdown is in effect 2019-10-12 20:49:07 -04:00
ubifs ubifs: Fix wrong orphan node deletion in ubifs_jnl_update|rename 2020-08-21 13:05:35 +02:00
udf udf: Fix free space reporting for metadata and virtual partitions 2020-02-24 08:36:44 +01:00
ufs fs/ufs: avoid potential u32 multiplication overflow 2020-08-21 13:05:37 +02:00
unicode unicode: make array 'token' static const, makes object smaller 2019-09-17 11:48:24 -04:00
verity
xfs xfs: Don't allow logging of XFS_ISTALE inodes 2020-09-03 11:26:44 +02:00
aio.c aio: fix async fsync creds 2020-06-17 16:40:24 +02:00
anon_inodes.c
attr.c utimes: Clamp the timestamps in notify_change() 2020-02-11 04:35:12 -08:00
bad_inode.c
binfmt_aout.c
binfmt_elf_fdpic.c
binfmt_elf.c fs/binfmt_elf.c: allocate initialized memory in fill_thread_core_info() 2020-06-03 08:21:27 +02:00
binfmt_em86.c
binfmt_flat.c binfmt_flat: revert "binfmt_flat: don't offset the data start" 2020-09-03 11:26:39 +02:00
binfmt_misc.c
binfmt_script.c
block_dev.c block: Fix use-after-free in blkdev_get() 2020-06-24 17:50:47 +02:00
buffer.c fs: prevent BUG_ON in submit_bh_wbc() 2020-09-03 11:26:57 +02:00
char_dev.c chardev: Avoid potential use-after-free in 'chrdev_open()' 2020-01-14 20:08:18 +01:00
compat_binfmt_elf.c
compat_ioctl.c fix compat handling of FICLONERANGE, FIDEDUPERANGE and FS_IOC_FIEMAP 2020-01-09 10:20:05 +01:00
compat.c
coredump.c coredump: fix crash when umh is disabled 2020-05-14 07:58:27 +02:00
d_path.c [PATCH] fix d_absolute_path() interplay with fsmount() 2019-08-30 19:31:09 -04:00
dax.c dax: pass NOWAIT flag to iomap_apply 2020-03-05 16:43:36 +01:00
dcache.c
dcookies.c
direct-io.c fs/direct-io.c: fix kernel-doc warning 2019-10-14 15:04:01 -07:00
drop_caches.c fs: avoid softlockups in s_inodes iterators 2020-01-12 12:21:37 +01:00
eventfd.c eventfd: track eventfd_signal() recursion depth 2020-02-11 04:35:37 -08:00
eventpoll.c do_epoll_ctl(): clean the failure exits up a bit 2020-08-26 10:41:07 +02:00
exec.c exec: Move would_dump into flush_old_exec 2020-05-20 08:20:34 +02:00
fcntl.c
fhandle.c
file_table.c
file.c fix multiplication overflow in copy_fdtable() 2020-05-27 17:46:12 +02:00
filesystems.c fs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once() 2020-04-17 10:50:21 +02:00
fs_context.c vfs: subtype handling moved to fuse 2019-09-06 21:28:49 +02:00
fs_parser.c vfs: Make fs_parse() handle fs_param_is_fd-type params better 2019-09-12 21:06:14 -04:00
fs_pin.c
fs_struct.c
fs_types.c
fs-writeback.c memcg: fix a crash in wb_workfn when a device disappears 2020-02-11 04:35:11 -08:00
fsopen.c
inode.c futex: Fix inode life-time issue 2020-03-25 08:25:58 +01:00
internal.h fs: move guard_bio_eod() after bio_set_op_attrs 2020-01-17 19:48:21 +01:00
io_uring.c io_uring: Fix NULL pointer dereference in loop_rw_iter() 2020-08-19 08:16:29 +02:00
ioctl.c compat_ioctl: add compat_ptr_ioctl() 2019-12-17 19:55:30 +01:00
Kconfig fs-verity for 5.4 2019-09-18 16:59:14 -07:00
Kconfig.binfmt
libfs.c libfs: fix infoleak in simple_attr_read() 2020-04-01 11:02:17 +02:00
locks.c locks: reinstate locks_delete_block optimization 2020-03-25 08:25:41 +01:00
Makefile fs-verity for 5.4 2019-09-18 16:59:14 -07:00
mbcache.c
mount.h
mpage.c fs: move guard_bio_eod() after bio_set_op_attrs 2020-01-17 19:48:21 +01:00
namei.c namei: only return -ECHILD from follow_dotdot_rcu() 2020-03-05 16:43:48 +01:00
namespace.c fs/namespace.c: fix use-after-free of mount in mnt_warn_timestamp_expiry() 2019-10-16 23:15:09 -04:00
no-block.c
nsfs.c
open.c cifs_atomic_open(): fix double-put on late allocation failure 2020-03-18 07:17:51 +01:00
pipe.c
pnode.c propagate_one(): mnt_set_mountpoint() needs mount_lock 2020-05-02 08:48:44 +02:00
pnode.h
posix_acl.c
proc_namespace.c vfs: subtype handling moved to fuse 2019-09-06 21:28:49 +02:00
read_write.c fs: allow deduplication of eof block into the end of the destination file 2020-02-11 04:35:23 -08:00
readdir.c readdir: be more conservative with directory entry names 2020-01-29 16:45:31 +01:00
select.c
seq_file.c
signalfd.c fs/signalfd.c: fix inconsistent return codes for signalfd4 2020-08-26 10:40:58 +02:00
splice.c splice: only read in as much information as there is pipe buffer space 2019-12-17 19:56:52 +01:00
stack.c
stat.c
statfs.c vfs: Fix EOVERFLOW testing in put_compat_statfs64 2019-10-03 14:21:35 -07:00
super.c Fix use after free in get_tree_bdev() 2020-05-06 08:15:15 +02:00
sync.c
timerfd.c
userfaultfd.c userfaultfd: require CAP_SYS_PTRACE for UFFD_FEATURE_EVENT_FORK 2020-01-04 19:18:32 +01:00
utimes.c utimes: Clamp the timestamps in notify_change() 2020-02-11 04:35:12 -08:00
xattr.c xattr: break delegations in {set,remove}xattr 2020-08-11 15:33:39 +02:00