android11-5.4
5974 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
|
c7f89f1b6b |
This is the 5.4.249 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmSb7V4ACgkQONu9yGCS aT5vLxAA0yhg7h210wyMLrPNgQHrIItxkvcosoAG04WziImnvTT84XYpvthKlQrZ jzLGwdrH8ggdZIq+jPblmGvfvpGuM7MjKw1F8tgmviMnMyfKziGO/kIEzkNPaHSt sRFuGniXx2Q/m2IVblhC8pqJG6SRgkBbNgg3by7SpTRSEHBjpxaOVxvGC53Bdlkb ep90ox3iVbA4Q45rGCn5UfJM22wEnUYbzRv04085fzWaPDEZyHi5S6a3rHepVbrq 7ElDQgUgHKlLm7rd1ngB8Ac+EdfavVcPok789pbEmQwf6jsAetl43yPUSEE6xFXb 5FZAA7uUUa+E7P+140+iWBCZwQX9g+WglEkOxJV8gOMtWoiFZjpPcJxyWnvz/7ch XFz88WW/Ub4+bpg62TJ2F3dboeF0x1rN5kB8/ylb+Gf9vACT2gPLDbFaeG24DZEr s1hdsRx1Q3m8ffOYbsuTTn3bfGv8TfycV4Cwy+v+QPwJF/WPdMUnIDRY7VgWJ6fO scRdhkgMer9MLDrcSwxgS3tyn6JObQMp5A40H1Yb6ZVwN+q2BRC/B4Gqi6BmUNKr uU0BRMeyExyyQfKYCgvcf0M23qUf5L4PDpk1MX38pU+AHm8rPHlE36/pNFG4PG0g p6vBTlKzYeHKh12VAdPJjiWICloaz2ixf3K85xJ+vH56jXfjbSY= =3Pqk -----END PGP SIGNATURE----- Merge 5.4.249 into android11-5.4-lts Changes in 5.4.249 nilfs2: reject devices with insufficient block count mm: rewrite wait_on_page_bit_common() logic list: add "list_del_init_careful()" to go with "list_empty_careful()" epoll: ep_autoremove_wake_function should use list_del_init_careful tracing: Add tracing_reset_all_online_cpus_unlocked() function x86/purgatory: remove PGO flags tick/common: Align tick period during sched_timer setup media: dvbdev: Fix memleak in dvb_register_device media: dvbdev: fix error logic at dvb_register_device() media: dvb-core: Fix use-after-free due to race at dvb_register_device() nilfs2: fix buffer corruption due to concurrent device reads Drivers: hv: vmbus: Fix vmbus_wait_for_unload() to scan present CPUs PCI: hv: Fix a race condition bug in hv_pci_query_relations() cgroup: Do not corrupt task iteration when rebinding subsystem mmc: meson-gx: remove redundant mmc_request_done() call from irq context ip_tunnels: allow VXLAN/GENEVE to inherit TOS/TTL from VLAN writeback: fix dereferencing NULL mapping->host on writeback_page_template nilfs2: prevent general protection fault in nilfs_clear_dirty_page() cifs: Clean up DFS referral cache cifs: Get rid of kstrdup_const()'d paths cifs: Introduce helpers for finding TCP connection cifs: Merge is_path_valid() into get_normalized_path() cifs: Fix potential deadlock when updating vol in cifs_reconnect() x86/mm: Avoid using set_pgd() outside of real PGD pages rcu: Upgrade rcu_swap_protected() to rcu_replace_pointer() ieee802154: hwsim: Fix possible memory leaks xfrm: Linearize the skb after offloading if needed. net: qca_spi: Avoid high load if QCA7000 is not available mmc: mtk-sd: fix deferred probing mmc: mvsdio: convert to devm_platform_ioremap_resource mmc: mvsdio: fix deferred probing mmc: omap: fix deferred probing mmc: omap_hsmmc: fix deferred probing mmc: sdhci-acpi: fix deferred probing mmc: sh_mmcif: fix deferred probing mmc: usdhi60rol0: fix deferred probing ipvs: align inner_mac_header for encapsulation net: dsa: mt7530: fix trapping frames on non-MT7621 SoC MT7530 switch be2net: Extend xmit workaround to BE3 chip netfilter: nf_tables: disallow element updates of bound anonymous sets netfilter: nfnetlink_osf: fix module autoload Revert "net: phy: dp83867: perform soft reset and retain established link" sch_netem: acquire qdisc lock in netem_change() scsi: target: iscsi: Prevent login threads from racing between each other HID: wacom: Add error check to wacom_parse_and_register() arm64: Add missing Set/Way CMO encodings media: cec: core: don't set last_initiator if tx in progress nfcsim.c: Fix error checking for debugfs_create_dir usb: gadget: udc: fix NULL dereference in remove() s390/cio: unregister device when the only path is gone ASoC: nau8824: Add quirk to active-high jack-detect ARM: dts: Fix erroneous ADS touchscreen polarities drm/exynos: vidi: fix a wrong error return drm/exynos: fix race condition UAF in exynos_g2d_exec_ioctl drm/radeon: fix race condition UAF in radeon_gem_set_domain_ioctl x86/apic: Fix kernel panic when booting with intremap=off and x2apic_phys i2c: imx-lpi2c: fix type char overflow issue when calculating the clock cycle mm: fix VM_BUG_ON(PageTail) and BUG_ON(PageWriteback) mm: make wait_on_page_writeback() wait for multiple pending writebacks xfs: verify buffer contents when we skip log replay Linux 5.4.249 Change-Id: I3f7cf3804fddac70b4c1accef1c7374b184b1ea3 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
|
c874390551 |
xfs: verify buffer contents when we skip log replay
commit 22ed903eee23a5b174e240f1cdfa9acf393a5210 upstream. syzbot detected a crash during log recovery: XFS (loop0): Mounting V5 Filesystem bfdc47fc-10d8-4eed-a562-11a831b3f791 XFS (loop0): Torn write (CRC failure) detected at log block 0x180. Truncating head block from 0x200. XFS (loop0): Starting recovery (logdev: internal) ================================================================== BUG: KASAN: slab-out-of-bounds in xfs_btree_lookup_get_block+0x15c/0x6d0 fs/xfs/libxfs/xfs_btree.c:1813 Read of size 8 at addr ffff88807e89f258 by task syz-executor132/5074 CPU: 0 PID: 5074 Comm: syz-executor132 Not tainted 6.2.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x1b1/0x290 lib/dump_stack.c:106 print_address_description+0x74/0x340 mm/kasan/report.c:306 print_report+0x107/0x1f0 mm/kasan/report.c:417 kasan_report+0xcd/0x100 mm/kasan/report.c:517 xfs_btree_lookup_get_block+0x15c/0x6d0 fs/xfs/libxfs/xfs_btree.c:1813 xfs_btree_lookup+0x346/0x12c0 fs/xfs/libxfs/xfs_btree.c:1913 xfs_btree_simple_query_range+0xde/0x6a0 fs/xfs/libxfs/xfs_btree.c:4713 xfs_btree_query_range+0x2db/0x380 fs/xfs/libxfs/xfs_btree.c:4953 xfs_refcount_recover_cow_leftovers+0x2d1/0xa60 fs/xfs/libxfs/xfs_refcount.c:1946 xfs_reflink_recover_cow+0xab/0x1b0 fs/xfs/xfs_reflink.c:930 xlog_recover_finish+0x824/0x920 fs/xfs/xfs_log_recover.c:3493 xfs_log_mount_finish+0x1ec/0x3d0 fs/xfs/xfs_log.c:829 xfs_mountfs+0x146a/0x1ef0 fs/xfs/xfs_mount.c:933 xfs_fs_fill_super+0xf95/0x11f0 fs/xfs/xfs_super.c:1666 get_tree_bdev+0x400/0x620 fs/super.c:1282 vfs_get_tree+0x88/0x270 fs/super.c:1489 do_new_mount+0x289/0xad0 fs/namespace.c:3145 do_mount fs/namespace.c:3488 [inline] __do_sys_mount fs/namespace.c:3697 [inline] __se_sys_mount+0x2d3/0x3c0 fs/namespace.c:3674 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f89fa3f4aca Code: 83 c4 08 5b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007fffd5fb5ef8 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5 RAX: ffffffffffffffda RBX: 00646975756f6e2c RCX: 00007f89fa3f4aca RDX: 0000000020000100 RSI: 0000000020009640 RDI: 00007fffd5fb5f10 RBP: 00007fffd5fb5f10 R08: 00007fffd5fb5f50 R09: 000000000000970d R10: 0000000000200800 R11: 0000000000000206 R12: 0000000000000004 R13: 0000555556c6b2c0 R14: 0000000000200800 R15: 00007fffd5fb5f50 </TASK> The fuzzed image contains an AGF with an obviously garbage agf_refcount_level value of 32, and a dirty log with a buffer log item for that AGF. The ondisk AGF has a higher LSN than the recovered log item. xlog_recover_buf_commit_pass2 reads the buffer, compares the LSNs, and decides to skip replay because the ondisk buffer appears to be newer. Unfortunately, the ondisk buffer is corrupt, but recovery just read the buffer with no buffer ops specified: error = xfs_buf_read(mp->m_ddev_targp, buf_f->blf_blkno, buf_f->blf_len, buf_flags, &bp, NULL); Skipping the buffer leaves its contents in memory unverified. This sets us up for a kernel crash because xfs_refcount_recover_cow_leftovers reads the buffer (which is still around in XBF_DONE state, so no read verification) and creates a refcountbt cursor of height 32. This is impossible so we run off the end of the cursor object and crash. Fix this by invoking the verifier on all skipped buffers and aborting log recovery if the ondisk buffer is corrupt. It might be smarter to force replay the log item atop the buffer and then see if it'll pass the write verifier (like ext4 does) but for now let's go with the conservative option where we stop immediately. Link: https://syzkaller.appspot.com/bug?extid=7e9494b8b399902e994e Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
6d6982b563 |
Merge 5.4.246 into android11-5.4-lts
Changes in 5.4.246 RDMA/efa: Fix unsupported page sizes in device RDMA/bnxt_re: Enable SRIOV VF support on Broadcom's 57500 adapter series RDMA/bnxt_re: Refactor queue pair creation code RDMA/bnxt_re: Fix return value of bnxt_re_process_raw_qp_pkt_rx iommu/rockchip: Fix unwind goto issue iommu/amd: Don't block updates to GATag if guest mode is on dmaengine: pl330: rename _start to prevent build error net/mlx5: fw_tracer, Fix event handling netrom: fix info-leak in nr_write_internal() af_packet: Fix data-races of pkt_sk(sk)->num. amd-xgbe: fix the false linkup in xgbe_phy_status mtd: rawnand: ingenic: fix empty stub helper definitions af_packet: do not use READ_ONCE() in packet_bind() tcp: deny tcp_disconnect() when threads are waiting tcp: Return user_mss for TCP_MAXSEG in CLOSE/LISTEN state if user_mss set net/sched: sch_ingress: Only create under TC_H_INGRESS net/sched: sch_clsact: Only create under TC_H_CLSACT net/sched: Reserve TC_H_INGRESS (TC_H_CLSACT) for ingress (clsact) Qdiscs net/sched: Prohibit regrafting ingress or clsact Qdiscs net: sched: fix NULL pointer dereference in mq_attach ocfs2/dlm: move BITS_TO_BYTES() to bitops.h for wider use net/netlink: fix NETLINK_LIST_MEMBERSHIPS length report udp6: Fix race condition in udp6_sendmsg & connect net/sched: flower: fix possible OOB write in fl_set_geneve_opt() net: dsa: mv88e6xxx: Increase wait after reset deactivation mtd: rawnand: marvell: ensure timing values are written mtd: rawnand: marvell: don't set the NAND frequency select watchdog: menz069_wdt: fix watchdog initialisation mailbox: mailbox-test: Fix potential double-free in mbox_test_message_write() ARM: 9295/1: unwind:fix unwind abort for uleb128 case media: rcar-vin: Select correct interrupt mode for V4L2_FIELD_ALTERNATE fbdev: modedb: Add 1920x1080 at 60 Hz video mode fbdev: stifb: Fix info entry in sti_struct on error path nbd: Fix debugfs_create_dir error checking ASoC: dwc: limit the number of overrun messages xfrm: Check if_id in inbound policy/secpath match ASoC: ssm2602: Add workaround for playback distortions media: dvb_demux: fix a bug for the continuity counter media: dvb-usb: az6027: fix three null-ptr-deref in az6027_i2c_xfer() media: dvb-usb-v2: ec168: fix null-ptr-deref in ec168_i2c_xfer() media: dvb-usb-v2: ce6230: fix null-ptr-deref in ce6230_i2c_master_xfer() media: dvb-usb-v2: rtl28xxu: fix null-ptr-deref in rtl28xxu_i2c_xfer media: dvb-usb: digitv: fix null-ptr-deref in digitv_i2c_xfer() media: dvb-usb: dw2102: fix uninit-value in su3000_read_mac_address media: netup_unidvb: fix irq init by register it at the end of probe media: dvb_ca_en50221: fix a size write bug media: ttusb-dec: fix memory leak in ttusb_dec_exit_dvb() media: mn88443x: fix !CONFIG_OF error by drop of_match_ptr from ID table media: dvb-core: Fix use-after-free due on race condition at dvb_net media: dvb-core: Fix kernel WARNING for blocking operation in wait_event*() media: dvb-core: Fix use-after-free due to race condition at dvb_ca_en50221 wifi: rtl8xxxu: fix authentication timeout due to incorrect RCR value ARM: dts: stm32: add pin map for CAN controller on stm32f7 arm64/mm: mark private VM_FAULT_X defines as vm_fault_t scsi: core: Decrease scsi_device's iorequest_cnt if dispatch failed wifi: b43: fix incorrect __packed annotation netfilter: conntrack: define variables exp_nat_nla_policy and any_addr with CONFIG_NF_NAT ALSA: oss: avoid missing-prototype warnings atm: hide unused procfs functions mailbox: mailbox-test: fix a locking issue in mbox_test_message_write() iio: adc: mxs-lradc: fix the order of two cleanup operations HID: google: add jewel USB id HID: wacom: avoid integer overflow in wacom_intuos_inout() iio: light: vcnl4035: fixed chip ID check iio: dac: mcp4725: Fix i2c_master_send() return value handling iio: dac: build ad5758 driver when AD5758 is selected net: usb: qmi_wwan: Set DTR quirk for BroadMobi BM818 usb: gadget: f_fs: Add unbind event before functionfs_unbind misc: fastrpc: return -EPIPE to invocations on device removal misc: fastrpc: reject new invocations during device removal scsi: stex: Fix gcc 13 warnings ata: libata-scsi: Use correct device no in ata_find_dev() flow_dissector: work around stack frame size warning x86/boot: Wrap literal addresses in absolute_pointer() ACPI: thermal: drop an always true check gcc-12: disable '-Wdangling-pointer' warning for now eth: sun: cassini: remove dead code kernel/extable.c: use address-of operator on section symbols treewide: Remove uninitialized_var() usage lib/dynamic_debug.c: use address-of operator on section symbols wifi: rtlwifi: remove always-true condition pointed out by GCC 12 mmc: vub300: fix invalid response handling tty: serial: fsl_lpuart: use UARTCTRL_TXINV to send break instead of UARTCTRL_SBK selinux: don't use make's grouped targets feature yet tracing/probe: trace_probe_primary_from_call(): checked list_first_entry ext4: add EA_INODE checking to ext4_iget() ext4: set lockdep subclass for the ea_inode in ext4_xattr_inode_cache_find() ext4: disallow ea_inodes with extended attributes ext4: add lockdep annotations for i_data_sem for ea_inode's fbcon: Fix null-ptr-deref in soft_cursor test_firmware: fix the memory leak of the allocated firmware buffer regmap: Account for register length when chunking scsi: dpt_i2o: Remove broken pass-through ioctl (I2OUSERCMD) scsi: dpt_i2o: Do not process completions with invalid addresses RDMA/bnxt_re: Remove set but not used variable 'dev_attr' RDMA/bnxt_re: Remove the qp from list only if the qp destroy succeeds drm/edid: Fix uninitialized variable in drm_cvt_modes() wifi: rtlwifi: 8192de: correct checking of IQK reload drm/edid: fix objtool warning in drm_cvt_modes() Linux 5.4.246 Change-Id: I8721e40543af31c56dbbd47910dd3b474e3a79ab Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
|
0638dcc7e7 |
treewide: Remove uninitialized_var() usage
commit 3f649ab728cda8038259d8f14492fe400fbab911 upstream. Using uninitialized_var() is dangerous as it papers over real bugs[1] (or can in the future), and suppresses unrelated compiler warnings (e.g. "unused variable"). If the compiler thinks it is uninitialized, either simply initialize the variable or make compiler changes. In preparation for removing[2] the[3] macro[4], remove all remaining needless uses with the following script: git grep '\buninitialized_var\b' | cut -d: -f1 | sort -u | \ xargs perl -pi -e \ 's/\buninitialized_var\(([^\)]+)\)/\1/g; s:\s*/\* (GCC be quiet|to make compiler happy) \*/$::g;' drivers/video/fbdev/riva/riva_hw.c was manually tweaked to avoid pathological white-space. No outstanding warnings were found building allmodconfig with GCC 9.3.0 for x86_64, i386, arm64, arm, powerpc, powerpc64le, s390x, mips, sparc64, alpha, and m68k. [1] https://lore.kernel.org/lkml/20200603174714.192027-1-glider@google.com/ [2] https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1TGqCR5vQkCzWJ0QxK6CernOU6eedsudAixw@mail.gmail.com/ [3] https://lore.kernel.org/lkml/CA+55aFwgbgqhbp1fkxvRKEpzyR5J8n1vKT1VZdz9knmPuXhOeg@mail.gmail.com/ [4] https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yVJu65TpLgN_ybYNv0VEOKA@mail.gmail.com/ Reviewed-by: Leon Romanovsky <leonro@mellanox.com> # drivers/infiniband and mlx4/mlx5 Acked-by: Jason Gunthorpe <jgg@mellanox.com> # IB Acked-by: Kalle Valo <kvalo@codeaurora.org> # wireless drivers Reviewed-by: Chao Yu <yuchao0@huawei.com> # erofs Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
734577d21e |
This is the 5.4.242 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmRI7cMACgkQONu9yGCS aT4KYhAA11Pj8JrPzyk0LSUHlFmZ6LRUA1/0aAog2V4wIY0tSODHiYMqcqsgiay9 SGg+eGlHU5+6FL36BGHneg8iLVwX/HHERPdRrOrTJO8YUe/GumFkgz0h2O7+ILOa QAao9FB9OBCNaR0guQjTZk+8QQtgfj3X381mBbzoFIIprNUw9/P3lYowyVPSWjHZ Ax+ptL2oxfBTGylrU543q7Vfzvy56DSNeDnCNyAHHVAXpQU3YhsWqW5pXcxQhtDh aEt21YB0CVBs4BF2sdSpUSXlQYS9NsH814lK3T1pEYONRJ/Osxl9QpdZ077avhTA 7pS9oKUMNfak8BoK4aRsSy51UrKBnb6IOflJVQiXYRft0GvF630qra47hwZLeZOf Sl1tIW0cwlzJcv7H5tsg2eWXsZBCfmvd2MtDHjdYD/fxED7NSvPXjCyO4Qs1UqB9 vq2cTU0CiCBQPL0UzKeuhPXfSwpbDXYmCMpH2GfHVtlYA1g+zPOlLA9HfGBR0pzH X1tdZ9hjUYSBdD5PoaTJzMcg+cME+tPq2kN09dECM+cTdlB/s5cKj8OcNLIv6jVD PfFXjJCL1CJk93C4UaHlO7efdQbferEVViu6hmfhGIAYOd3S400/5p71b22rfViH 9CwkhhlafZ25/AF2NCfhvK5gjRTFH+gOtw0JmG2AQ4f1z44joWw= =4sA8 -----END PGP SIGNATURE----- Merge 5.4.242 into android11-5.4-lts Changes in 5.4.242 ARM: dts: rockchip: fix a typo error for rk3288 spdif node arm64: dts: meson-g12-common: specify full DMC range netfilter: br_netfilter: fix recent physdev match breakage regulator: fan53555: Explicitly include bits header net: sched: sch_qfq: prevent slab-out-of-bounds in qfq_activate_agg virtio_net: bugfix overflow inside xdp_linearize_page() netfilter: nf_tables: fix ifdef to also consider nf_tables=m i40e: fix accessing vsi->active_filters without holding lock i40e: fix i40e_setup_misc_vector() error handling mlxfw: fix null-ptr-deref in mlxfw_mfa2_tlv_next() bpf: Fix incorrect verifier pruning due to missing register precision taints e1000e: Disable TSO on i219-LM card to increase speed f2fs: Fix f2fs_truncate_partial_nodes ftrace event Input: i8042 - add quirk for Fujitsu Lifebook A574/H selftests: sigaltstack: fix -Wuninitialized scsi: megaraid_sas: Fix fw_crash_buffer_show() scsi: core: Improve scsi_vpd_inquiry() checks net: dsa: b53: mmap: add phy ops s390/ptrace: fix PTRACE_GET_LAST_BREAK error handling nvme-tcp: fix a possible UAF when failing to allocate an io queue xen/netback: use same error messages for same errors iio: light: tsl2772: fix reading proximity-diodes from device tree nilfs2: initialize unused bytes in segment summary blocks memstick: fix memory leak if card device is never registered mmc: sdhci_am654: Set HIGH_SPEED_ENA for SDR12 and SDR25 MIPS: Define RUNTIME_DISCARD_EXIT in LD script x86/purgatory: Don't generate debug info for purgatory.ro Revert "ext4: fix use-after-free in ext4_xattr_set_entry" ext4: remove duplicate definition of ext4_xattr_ibody_inline_set() ext4: fix use-after-free in ext4_xattr_set_entry udp: Call inet6_destroy_sock() in setsockopt(IPV6_ADDRFORM). tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct(). inet6: Remove inet6_destroy_sock() in sk->sk_prot->destroy(). dccp: Call inet6_destroy_sock() via sk->sk_destruct(). sctp: Call inet6_destroy_sock() via sk->sk_destruct(). xfs: fix forkoff miscalculation related to XFS_LITINO(mp) pwm: meson: Explicitly set .polarity in .get_state() iio: adc: at91-sama5d2_adc: fix an error code in at91_adc_allocate_trigger() ASN.1: Fix check for strdup() success Linux 5.4.242 Change-Id: Id72de17865bf93d2a6d39009fbc46cb3f8f7b6ca Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
|
7f2b8046da |
xfs: fix forkoff miscalculation related to XFS_LITINO(mp)
commit ada49d64fb3538144192181db05de17e2ffc3551 upstream. Currently, commit e9e2eae89ddb dropped a (int) decoration from XFS_LITINO(mp), and since sizeof() expression is also involved, the result of XFS_LITINO(mp) is simply as the size_t type (commonly unsigned long). Considering the expression in xfs_attr_shortform_bytesfit(): offset = (XFS_LITINO(mp) - bytes) >> 3; let "bytes" be (int)340, and "XFS_LITINO(mp)" be (unsigned long)336. on 64-bit platform, the expression is offset = ((unsigned long)336 - (int)340) >> 3 = (int)(0xfffffffffffffffcUL >> 3) = -1 but on 32-bit platform, the expression is offset = ((unsigned long)336 - (int)340) >> 3 = (int)(0xfffffffcUL >> 3) = 0x1fffffff instead. so offset becomes a large positive number on 32-bit platform, and cause xfs_attr_shortform_bytesfit() returns maxforkoff rather than 0. Therefore, one result is "ASSERT(new_size <= XFS_IFORK_SIZE(ip, whichfork));" assertion failure in xfs_idata_realloc(), which was also the root cause of the original bugreport from Dennis, see: https://bugzilla.redhat.com/show_bug.cgi?id=1894177 And it can also be manually triggered with the following commands: $ touch a; $ setfattr -n user.0 -v "`seq 0 80`" a; $ setfattr -n user.1 -v "`seq 0 80`" a on 32-bit platform. Fix the case in xfs_attr_shortform_bytesfit() by bailing out "XFS_LITINO(mp) < bytes" in advance suggested by Eric and a misleading comment together with this bugfix suggested by Darrick. It seems the other users of XFS_LITINO(mp) are not impacted. Fixes: e9e2eae89ddb ("xfs: only check the superblock version for dinode size calculation") Cc: <stable@vger.kernel.org> # 5.7+ Reported-and-tested-by: Dennis Gilmore <dgilmore@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
da8b283c08 |
This is the 5.4.241 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmRBDvQACgkQONu9yGCS aT58+g/+PmTvsCcjSlTvpHw+kFsIu2F2pIjlJNlX+YHSzEYMG/G2mn7XD1rUE7JB mQQ7hFXEeQZTuN4zsFnviotzn1xd6bV26Od598m+yFZ3Fty07NXy3OD1/08ZdP0G k+U9u4TDr47BV3WdokGWIdXAjnlvC0kwjc+EWznOmXquhW8hSrGvz0IUJQWgDue1 IpOkuRs60T85dUHoPYEL8T23XRyhFhdeIrlx6Hqzlp1fGg7dSpfh0yYJmDLF6Y21 CXO2WcADhMMszk+CFEXgfWVem5mheUuVsfmBUg34ZGzzNos0rPyHG+VmwZxnsPYE SVLVxLL649CzAuJdpXflTghdBUU/WKT/ZZ0aQddohPjij55My2oCV6/FpHXg7kuo ZTXN0ByyeAbV4DlY2+ooY6Vhr5lzQG1l15Ap9xj4ioQO8U3jeCCQ5wT9L01ig6a9 /9U9wb6CYp3thrY94x1WapJF5utiIigbiS+rioRfAwHAzj5JiLfQvdH1nVgBNz+9 DWCEGIiUONmHMRrt+X6Nnu8KkHWOkDFF2lisphXUsaY140gFG0+d3xnRArsU4ZDi j1zT0ErqigV6vzA39ic898EW8wkNqVCWPwYRLVRSBuPCDKK7SjnOuStGi7eIoYp2 zI2NV9UUMm5Qxub5zUNktpWDFFn9knN3E7Gc4wAKWkHkWc/DJ+U= =ffeb -----END PGP SIGNATURE----- Merge 5.4.241 into android11-5.4-lts Changes in 5.4.241 scsi: ses: Handle enclosure with just a primary component gracefully x86/PCI: Add quirk for AMD XHCI controller that loses MSI-X state in D3hot cgroup/cpuset: Wake up cpuset_attach_wq tasks in cpuset_cancel_attach() Revert "treewide: Replace DECLARE_TASKLET() with DECLARE_TASKLET_OLD()" treewide: Replace DECLARE_TASKLET() with DECLARE_TASKLET_OLD() smb3: fix problem with null cifs super block with previous patch pinctrl: amd: Use irqchip template pinctrl: amd: disable and mask interrupts on probe pinctrl: amd: Disable and mask interrupts on resume pwm: cros-ec: Explicitly set .polarity in .get_state() pwm: sprd: Explicitly set .polarity in .get_state() wifi: mac80211: fix invalid drv_sta_pre_rcu_remove calls for non-uploaded sta icmp: guard against too small mtu net: don't let netpoll invoke NAPI if in xmit context sctp: check send stream number after wait_for_sndbuf ipv6: Fix an uninit variable access bug in __ip6_make_skb() gpio: davinci: Add irq chip flag to skip set wake sunrpc: only free unix grouplist after RCU settles NFSD: callback request does not use correct credential for AUTH_SYS xhci: also avoid the XHCI_ZERO_64B_REGS quirk with a passthrough iommu USB: serial: cp210x: add Silicon Labs IFS-USB-DATACABLE IDs usb: typec: altmodes/displayport: Fix configure initial pin assignment USB: serial: option: add Telit FE990 compositions USB: serial: option: add Quectel RM500U-CN modem iio: adc: ti-ads7950: Set `can_sleep` flag for GPIO chip iio: dac: cio-dac: Fix max DAC write value check for 12-bit tty: serial: sh-sci: Fix transmit end interrupt handler tty: serial: sh-sci: Fix Rx on RZ/G2L SCI tty: serial: fsl_lpuart: avoid checking for transfer complete when UARTCTRL_SBK is asserted in lpuart32_tx_empty nilfs2: fix potential UAF of struct nilfs_sc_info in nilfs_segctor_thread() nilfs2: fix sysfs interface lifetime ALSA: hda/realtek: Add quirk for Clevo X370SNW perf/core: Fix the same task check in perf_event_set_output ftrace: Mark get_lock_parent_ip() __always_inline can: j1939: j1939_tp_tx_dat_new(): fix out-of-bounds memory access tracing: Free error logs of tracing instances net_sched: prevent NULL dereference if default qdisc setup failed drm/panfrost: Fix the panfrost_mmu_map_fault_addr() error path ring-buffer: Fix race while reader and writer are on the same page mm/swap: fix swap_info_struct race between swapoff and get_swap_pages() irqdomain: Look for existing mapping only once irqdomain: Refactor __irq_domain_alloc_irqs() irqdomain: Fix mapping-creation race Revert "pinctrl: amd: Disable and mask interrupts on resume" ALSA: emu10k1: fix capture interrupt handler unlinking ALSA: hda/sigmatel: add pin overrides for Intel DP45SG motherboard ALSA: i2c/cs8427: fix iec958 mixer control deactivation ALSA: firewire-tascam: add missing unwind goto in snd_tscm_stream_start_duplex() ALSA: hda/sigmatel: fix S/PDIF out on Intel D*45* motherboards Bluetooth: L2CAP: Fix use-after-free in l2cap_disconnect_{req,rsp} Bluetooth: Fix race condition in hidp_session_thread btrfs: print checksum type and implementation at mount time btrfs: fix fast csum implementation detection mtdblock: tolerate corrected bit-flips mtd: rawnand: meson: fix bitmask for length in command word mtd: rawnand: stm32_fmc2: remove unsupported EDO mode 9p/xen : Fix use after free bug in xen_9pfs_front_remove due to race condition niu: Fix missing unwind goto in niu_alloc_channels() qlcnic: check pci_reset_function result sctp: fix a potential overflow in sctp_ifwdtsn_skip RDMA/core: Fix GID entry ref leak when create_ah fails udp6: fix potential access to stale information net: macb: fix a memory corruption in extended buffer descriptor mode power: supply: cros_usbpd: reclassify "default case!" as debug i2c: imx-lpi2c: clean rx/tx buffers upon new message efi: sysfb_efi: Add quirk for Lenovo Yoga Book X91F/L drm: panel-orientation-quirks: Add quirk for Lenovo Yoga Book X90F verify_pefile: relax wrapper length check asymmetric_keys: log on fatal failures in PE/pkcs7 ubi: Fix failure attaching when vid_hdr offset equals to (sub)page size mtd: ubi: wl: Fix a couple of kernel-doc issues ubi: Fix deadlock caused by recursively holding work_sem i2c: ocores: generate stop condition after timeout in polling mode watchdog: sbsa_wdog: Make sure the timeout programming is within the limits coresight-etm4: Fix for() loop drvdata->nr_addr_cmp range bug xfs: show the proper user quota options xfs: merge the projid fields in struct xfs_icdinode xfs: ensure that the inode uid/gid match values match the icdinode ones xfs: remove the icdinode di_uid/di_gid members xfs: remove the kuid/kgid conversion wrappers xfs: add a new xfs_sb_version_has_v3inode helper xfs: only check the superblock version for dinode size calculation xfs: simplify di_flags2 inheritance in xfs_ialloc xfs: simplify a check in xfs_ioctl_setattr_check_cowextsize xfs: remove the di_version field from struct icdinode xfs: fix up non-directory creation in SGID directories xfs: set inode size after creating symlink xfs: report corruption only as a regular error xfs: shut down the filesystem if we screw up quota reservation xfs: consider shutdown in bmapbt cursor delete assert xfs: don't reuse busy extents on extent trim xfs: force log and push AIL to clear pinned inodes when aborting mount Linux 5.4.241 Change-Id: I428eec45c4ac9796104683d40b7cb0d38d4c8015 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
|
8795936437 |
xfs: force log and push AIL to clear pinned inodes when aborting mount
commit d336f7ebc65007f5831e2297e6f3383ae8dbf8ed upstream. [ Slightly modify fs/xfs/xfs_mount.c to resolve merge conflicts ] If we allocate quota inodes in the process of mounting a filesystem but then decide to abort the mount, it's possible that the quota inodes are sitting around pinned by the log. Now that inode reclaim relies on the AIL to flush inodes, we have to force the log and push the AIL in between releasing the quota inodes and kicking off reclaim to tear down all the incore inodes. Do this by extracting the bits we need from the unmount path and reusing them. As an added bonus, failed writes during a failed mount will not retry forever now. This was originally found during a fuzz test of metadata directories (xfs/1546), but the actual symptom was that reclaim hung up on the quota inodes. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
c76dd36875 |
xfs: don't reuse busy extents on extent trim
commit 06058bc40534530e617e5623775c53bb24f032cb upstream.
Freed extents are marked busy from the point the freeing transaction
commits until the associated CIL context is checkpointed to the log.
This prevents reuse and overwrite of recently freed blocks before
the changes are committed to disk, which can lead to corruption
after a crash. The exception to this rule is that metadata
allocation is allowed to reuse busy extents because metadata changes
are also logged.
As of commit
|
||
|
4679b73a8e |
xfs: consider shutdown in bmapbt cursor delete assert
commit 1cd738b13ae9b29e03d6149f0246c61f76e81fcf upstream. [ Slightly modify fs/xfs/libxfs/xfs_btree.c to resolve merge conflicts ] The assert in xfs_btree_del_cursor() checks that the bmapbt block allocation field has been handled correctly before the cursor is freed. This field is used for accurate calculation of indirect block reservation requirements (for delayed allocations), for example. generic/019 reproduces a scenario where this assert fails because the filesystem has shutdown while in the middle of a bmbt record insertion. This occurs after a bmbt block has been allocated via the cursor but before the higher level bmap function (i.e. xfs_bmap_add_extent_hole_real()) completes and resets the field. Update the assert to accommodate the transient state if the filesystem has shutdown. While here, clean up the indentation and comments in the function. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
9355fd118b |
xfs: shut down the filesystem if we screw up quota reservation
commit 2a4bdfa8558ca2904dc17b83497dc82aa7fc05e9 upstream. If we ever screw up the quota reservations enough to trip the assertions, something's wrong with the quota code. Shut down the filesystem when this happens, because this is corruption. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
48f75df5b3 |
xfs: report corruption only as a regular error
commit 6519f708cc355c4834edbe1885c8542c0e7ef907 uptream. [ Slightly modify fs/xfs/xfs_linux.h to resolve merge conflicts ] Redefine XFS_IS_CORRUPT so that it reports corruptions only via xfs_corruption_report. Since these are on-disk contents (and not checks of internal state), we don't ever want to panic the kernel. This also amends the corruption report to recommend unmounting and running xfs_repair. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
3cce34ceb2 |
xfs: set inode size after creating symlink
commit 8aa921a95335d0a8c8e2be35a44467e7c91ec3e4 upstream. When XFS creates a new symlink, it writes its size to disk but not to the VFS inode. This causes i_size_read() to return 0 for that symlink until it is re-read from disk, for example when the system is rebooted. I found this inconsistency while protecting directories with eCryptFS. The command "stat path/to/symlink/in/ecryptfs" will report "Size: 0" if the symlink was created after the last reboot on an XFS root. Call i_size_write() in xfs_symlink() Signed-off-by: Jeffrey Mitchell <jeffrey.mitchell@starlab.io> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
e76bd6da51 |
xfs: fix up non-directory creation in SGID directories
commit 01ea173e103edd5ec41acec65b9261b87e123fc2 upstream. XFS always inherits the SGID bit if it is set on the parent inode, while the generic inode_init_owner does not do this in a few cases where it can create a possible security problem, see commit |
||
|
ad6613c984 |
xfs: remove the di_version field from struct icdinode
commit 6471e9c5e7a109a952be8e3e80b8d9e262af239d upstream. We know the version is 3 if on a v5 file system. For earlier file systems formats we always upgrade the remaining v1 inodes to v2 and thus only use v2 inodes. Use the xfs_sb_version_has_large_dinode helper to check if we deal with small or large dinodes, and thus remove the need for the di_version field in struct icdinode. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Chandan Rajendra <chandanrlinux@gmail.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
ca4533c951 |
xfs: simplify a check in xfs_ioctl_setattr_check_cowextsize
commit 5e28aafe708ba3e388f92a7148093319d3521c2f upstream. Only v5 file systems can have the reflink feature, and those will always use the large dinode format. Remove the extra check for the inode version. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Chandan Rajendra <chandanrlinux@gmail.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
e078b3de3e |
xfs: simplify di_flags2 inheritance in xfs_ialloc
commit b3d1d37544d8c98be610df0ed66c884ff18748d5 upstream. di_flags2 is initialized to zero for v4 and earlier file systems. This means di_flags2 can only be non-zero for a v5 file systems, in which case both the parent and child inodes can store the field. Remove the extra di_version check, and also remove the rather pointless local di_flags2 variable while at it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Chandan Rajendra <chandanrlinux@gmail.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
0c553917b6 |
xfs: only check the superblock version for dinode size calculation
commit e9e2eae89ddb658ea332295153fdca78c12c1e0d upstream. The size of the dinode structure is only dependent on the file system version, so instead of checking the individual inode version just use the newly added xfs_sb_version_has_large_dinode helper, and simplify various calling conventions. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Chandan Rajendra <chandanrlinux@gmail.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
90aab52d06 |
xfs: add a new xfs_sb_version_has_v3inode helper
commit b81b79f4eda2ea98ae5695c0b6eb384c8d90b74d upstream. Add a new wrapper to check if a file system supports the v3 inode format with a larger dinode core. Previously we used xfs_sb_version_hascrc for that, which is technically correct but a little confusing to read. Also move xfs_dinode_good_version next to xfs_sb_version_has_v3inode so that we have one place that documents the superblock version to inode version relationship. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Chandan Rajendra <chandanrlinux@gmail.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
edd36a57b4 |
xfs: remove the kuid/kgid conversion wrappers
commit ba8adad5d036733d240fa8a8f4d055f3d4490562 upstream. Remove the XFS wrappers for converting from and to the kuid/kgid types. Mostly this means switching to VFS i_{u,g}id_{read,write} helpers, but in a few spots the calls to the conversion functions is open coded. To match the use of sb->s_user_ns in the helpers and other file systems, sb->s_user_ns is also used in the quota code. The ACL code already does the conversion in a grotty layering violation in the VFS xattr code, so it keeps using init_user_ns for the identity mapping. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
3ef81874f7 |
xfs: remove the icdinode di_uid/di_gid members
commit 542951592c99ff7b15c050954c051dd6dd6c0f97 upstream. Use the Linux inode i_uid/i_gid members everywhere and just convert from/to the scalar value when reading or writing the on-disk inode. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
cc508a41ae |
xfs: ensure that the inode uid/gid match values match the icdinode ones
commit 3d8f2821502d0b60bac2789d0bea951fda61de0c upstream. Instead of only synchronizing the uid/gid values in xfs_setup_inode, ensure that they always match to prepare for removing the icdinode fields. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
7a9dc79771 |
xfs: merge the projid fields in struct xfs_icdinode
commit de7a866fd41b227b0aa6e9cbeb0dae221c12f542 upstream. There is no point in splitting the fields like this in an purely in-memory structure. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
4f3252e7e1 |
xfs: show the proper user quota options
commit 237d7887ae723af7d978e8b9a385fdff416f357b upstream. The quota option 'usrquota' should be shown if both the XFS_UQUOTA_ACCT and XFS_UQUOTA_ENFD flags are set. The option 'uqnoenforce' should be shown when only the XFS_UQUOTA_ACCT flag is set. The current code logic seems wrong, Fix it and show proper options. Signed-off-by: Kaixu Xia <kaixuxia@tencent.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
abc4ede193 |
This is the 5.4.232 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmP2AZoACgkQONu9yGCS aT4Kog//cMOCPvc+9yam5NCZj76k9jzIfteKMZzvSyxjV/ShPGynLIwcR26vE4j1 CtEB0aknuxgpqthfCahjjf51POhYLJD62H62UtTfxgIkWxnETd8F6y2xvuVXsds5 mC0LUzQ9md6slgTTIobQF9ilIGAt/yKPOg89fUXNYzNsO2us46XZCmZOXg5MVwlI hXYQuVBze1VhWt40J8TYDFckjoQLUgH6lBawWHC8/r2MBBydzX1cZEyL2TXhDfFS 7t9gWXKteAFE6GWfgAY6MrtqGx+X25Xe7qds4V8v6FgxR2MFeo94+k3DbhXRnjjY gA6czJBurGzhiXWo2E4laYlEMfsY0qkl17M49C/LwkJhZCSjF60b0Vo0dNfLLogZ cWsG6qcrfV8/js5h97kFSluWZ4VM7xTgcJQ/qtU/O8IprRQioCERjvm4Dl+/emXI ycFaiZOP3RvYdHxADIsItm46C7WzpzqZpjjs+9jHEaACrnOQfepGGFmgImMd9P8r kkU5KUtPQoAgSFfPz1tvJgyQiazONRAKtg1UprPOnLN0PsBQrE8ekCOk9lDoW60l t+G2lC0dJBYkcKC+4jHa9y18U7wz/eYdYE+K/u8kUENYFLSBfYxIqbXPxQZcq6aO TnyVr1n+Dd3HXtLX58+vDE2RUjosvCXctBGrE6Q56d8AKXh6FvM= =rk4j -----END PGP SIGNATURE----- Merge 5.4.232 into android11-5.4-lts Changes in 5.4.232 firewire: fix memory leak for payload of request subaction to IEC 61883-1 FCP region bus: sunxi-rsb: Fix error handling in sunxi_rsb_init() ASoC: Intel: bytcr_rt5651: Drop reference count of ACPI device after use ALSA: hda/via: Avoid potential array out-of-bound in add_secret_dac_path() arm64: dts: imx8mm: Fix pad control for UART1_DTE_RX scsi: Revert "scsi: core: map PQ=1, PDT=other values to SCSI_SCAN_TARGET_PRESENT" WRITE is "data source", not destination... fix iov_iter_bvec() "direction" argument fix "direction" argument of iov_iter_kvec() netrom: Fix use-after-free caused by accept on already connected socket netfilter: br_netfilter: disable sabotage_in hook after first suppression squashfs: harden sanity check in squashfs_read_xattr_id_table net: phy: meson-gxl: Add generic dummy stubs for MMD register access can: j1939: fix errant WARN_ON_ONCE in j1939_session_deactivate ata: libata: Fix sata_down_spd_limit() when no link speed is reported selftests: net: udpgso_bench_rx: Fix 'used uninitialized' compiler warning selftests: net: udpgso_bench_rx/tx: Stop when wrong CLI args are provided selftests: net: udpgso_bench: Fix racing bug between the rx/tx programs selftests: net: udpgso_bench_tx: Cater for pending datagrams zerocopy benchmarking virtio-net: Keep stop() to follow mirror sequence of open() net: openvswitch: fix flow memory leak in ovs_flow_cmd_new efi: fix potential NULL deref in efi_mem_reserve_persistent scsi: target: core: Fix warning on RT kernels scsi: iscsi_tcp: Fix UAF during login when accessing the shost ipaddress i2c: rk3x: fix a bunch of kernel-doc warnings net/x25: Fix to not accept on connected socket iio: adc: stm32-dfsdm: fill module aliases usb: dwc3: dwc3-qcom: Fix typo in the dwc3 vbus override API usb: dwc3: qcom: enable vbus override when in OTG dr-mode usb: gadget: f_fs: Fix unbalanced spinlock in __ffs_ep0_queue_wait vc_screen: move load of struct vc_data pointer in vcs_read() to avoid UAF Input: i8042 - move __initconst to fix code styling warning Input: i8042 - merge quirk tables Input: i8042 - add TUXEDO devices to i8042 quirk tables Input: i8042 - add Clevo PCX0DX to i8042 quirk table fbcon: Check font dimension limits watchdog: diag288_wdt: do not use stack buffers for hardware data watchdog: diag288_wdt: fix __diag288() inline assembly efi: Accept version 2 of memory attributes table iio: hid: fix the retval in accel_3d_capture_sample iio: adc: berlin2-adc: Add missing of_node_put() in error path iio:adc:twl6030: Enable measurements of VUSB, VBAT and others parisc: Fix return code of pdc_iodc_print() parisc: Wire up PTRACE_GETREGS/PTRACE_SETREGS for compat case riscv: disable generation of unwind tables mm: hugetlb: proc: check for hugetlb shared PMD in /proc/PID/smaps fpga: stratix10-soc: Fix return value check in s10_ops_write_init() mm/swapfile: add cond_resched() in get_swap_pages() Squashfs: fix handling and sanity checking of xattr_ids count nvmem: core: fix cell removal on error mm: swap: properly update readahead statistics in unuse_pte_range() xprtrdma: Fix regbuf data not freed in rpcrdma_req_create() serial: 8250_dma: Fix DMA Rx completion race serial: 8250_dma: Fix DMA Rx rearm race powerpc/imc-pmu: Revert nest_init_lock to being a mutex fbdev: smscufx: fix error handling code in ufx_usb_probe f2fs: fix to do sanity check on i_extra_isize in is_alive() wifi: brcmfmac: Check the count value of channel spec to prevent out-of-bounds reads iio:adc:twl6030: Enable measurement of VAC btrfs: limit device extents to the device size btrfs: zlib: zero-initialize zlib workspace ALSA: emux: Avoid potential array out-of-bound in snd_emux_xg_control() tracing: Fix poll() and select() do not work on per_cpu trace_pipe and trace_pipe_raw can: j1939: do not wait 250 ms if the same addr was already claimed IB/hfi1: Restore allocated resources on failed copyout IB/IPoIB: Fix legacy IPoIB due to wrong number of queues iommu: Add gfp parameter to iommu_ops::map RDMA/usnic: use iommu_map_atomic() under spin_lock() xfrm: fix bug with DSCP copy to v6 from v4 tunnel bonding: fix error checking in bond_debug_reregister() net: phy: meson-gxl: use MMD access dummy stubs for GXL, internal PHY ionic: clean interrupt before enabling queue to avoid credit race ice: Do not use WQ_MEM_RECLAIM flag for workqueue rds: rds_rm_zerocopy_callback() use list_first_entry() selftests: forwarding: lib: quote the sysctl values ALSA: pci: lx6464es: fix a debug loop pinctrl: aspeed: Fix confusing types in return value pinctrl: single: fix potential NULL dereference pinctrl: intel: Restore the pins that used to be in Direct IRQ mode net: USB: Fix wrong-direction WARNING in plusb.c usb: core: add quirk for Alcor Link AK9563 smartcard reader usb: typec: altmodes/displayport: Fix probe pin assign check ceph: flush cap releases when the session is flushed riscv: Fixup race condition on PG_dcache_clean in flush_icache_pte arm64: dts: meson-gx: Make mmc host controller interrupts level-sensitive arm64: dts: meson-g12-common: Make mmc host controller interrupts level-sensitive arm64: dts: meson-axg: Make mmc host controller interrupts level-sensitive nvme-pci: Move enumeration by class to be last in the table bpf: Always return target ifindex in bpf_fib_lookup migrate: hugetlb: check for hugetlb shared PMD in node migration selftests/bpf: Verify copy_register_state() preserves parent/live fields ASoC: cs42l56: fix DT probe tools/virtio: fix the vringh test for virtio ring changes net/rose: Fix to not accept on connected socket net: stmmac: do not stop RX_CLK in Rx LPI state for qcs404 SoC net: sched: sch: Bounds check priority s390/decompressor: specify __decompress() buf len to avoid overflow nvme-fc: fix a missing queue put in nvmet_fc_ls_create_association aio: fix mremap after fork null-deref btrfs: free device in btrfs_close_devices for a single device filesystem netfilter: nft_tproxy: restrict to prerouting hook xfs: remove the xfs_efi_log_item_t typedef xfs: remove the xfs_efd_log_item_t typedef xfs: remove the xfs_inode_log_item_t typedef xfs: factor out a xfs_defer_create_intent helper xfs: merge the ->log_item defer op into ->create_intent xfs: merge the ->diff_items defer op into ->create_intent xfs: turn dfp_intent into a xfs_log_item xfs: refactor xfs_defer_finish_noroll xfs: log new intent items created as part of finishing recovered intent items xfs: fix finobt btree block recovery ordering xfs: proper replay of deferred ops queued during log recovery xfs: xfs_defer_capture should absorb remaining block reservations xfs: xfs_defer_capture should absorb remaining transaction reservation xfs: clean up bmap intent item recovery checking xfs: clean up xfs_bui_item_recover iget/trans_alloc/ilock ordering xfs: fix an incore inode UAF in xfs_bui_recover xfs: change the order in which child and parent defer ops are finished xfs: periodically relog deferred intent items xfs: expose the log push threshold xfs: only relog deferred intent items if free space in the log gets low xfs: fix missing CoW blocks writeback conversion retry xfs: ensure inobt record walks always make forward progress xfs: fix the forward progress assertion in xfs_iwalk_run_callbacks xfs: prevent UAF in xfs_log_item_in_current_chkpt xfs: sync lazy sb accounting on quiesce of read-only mounts Revert "ipv4: Fix incorrect route flushing when source address is deleted" ipv4: Fix incorrect route flushing when source address is deleted mmc: sdio: fix possible resource leaks in some error paths mmc: mmc_spi: fix error handling in mmc_spi_probe() ALSA: hda/conexant: add a new hda codec SN6180 ALSA: hda/realtek - fixed wrong gpio assigned sched/psi: Fix use-after-free in ep_remove_wait_queue() hugetlb: check for undefined shift on 32 bit architectures Revert "mm: Always release pages to the buddy allocator in memblock_free_late()." net: Fix unwanted sign extension in netdev_stats_to_stats64() revert "squashfs: harden sanity check in squashfs_read_xattr_id_table" ixgbe: allow to increase MTU to 3K with XDP enabled i40e: add double of VLAN header when computing the max MTU net: bgmac: fix BCM5358 support by setting correct flags sctp: sctp_sock_filter(): avoid list_entry() on possibly empty list dccp/tcp: Avoid negative sk_forward_alloc by ipv6_pinfo.pktoptions. net/usb: kalmia: Don't pass act_len in usb_bulk_msg error path net: stmmac: fix order of dwmac5 FlexPPS parametrization sequence bnxt_en: Fix mqprio and XDP ring checking logic net: stmmac: Restrict warning on disabling DMA store and fwd mode net: mpls: fix stale pointer if allocation fails during device rename ixgbe: add double of VLAN header when computing the max MTU ipv6: Fix datagram socket connection with DSCP. ipv6: Fix tcp socket connection with DSCP. i40e: Add checking for null for nlmsg_find_attr() kvm: initialize all of the kvm_debugregs structure before sending it to userspace nilfs2: fix underflow in second superblock position calculations ASoC: SOF: Intel: hda-dai: fix possible stream_tag leak net: sched: sch: Fix off by one in htb_activate_prios() iommu/amd: Pass gfp flags to iommu_map_page() in amd_iommu_map() Linux 5.4.232 Change-Id: I607aaac0f8477eb9a0f059e0a9d2f5c037fb19fc Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
|
85eda80883 |
xfs: sync lazy sb accounting on quiesce of read-only mounts
commit 50d25484bebe94320c49dd1347d3330c7063bbdb upstream. [ Modify xfs_log_unmount_write() to return zero when the log is in a read-only state ] xfs_log_sbcount() syncs the superblock specifically to accumulate the in-core percpu superblock counters and commit them to disk. This is required to maintain filesystem consistency across quiesce (freeze, read-only mount/remount) or unmount when lazy superblock accounting is enabled because individual transactions do not update the superblock directly. This mechanism works as expected for writable mounts, but xfs_log_sbcount() skips the update for read-only mounts. Read-only mounts otherwise still allow log recovery and write out an unmount record during log quiesce. If a read-only mount performs log recovery, it can modify the in-core superblock counters and write an unmount record when the filesystem unmounts without ever syncing the in-core counters. This leaves the filesystem with a clean log but in an inconsistent state with regard to lazy sb counters. Update xfs_log_sbcount() to use the same logic xfs_log_unmount_write() uses to determine when to write an unmount record. This ensures that lazy accounting is always synced before the log is cleaned. Refactor this logic into a new helper to distinguish between a writable filesystem and a writable log. Specifically, the log is writable unless the filesystem is mounted with the norecovery mount option, the underlying log device is read-only, or the filesystem is shutdown. Drop the freeze state check because the update is already allowed during the freezing process and no context calls this function on an already frozen fs. Also, retain the shutdown check in xfs_log_unmount_write() to catch the case where the preceding log force might have triggered a shutdown. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Gao Xiang <hsiangkao@redhat.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Bill O'Donnell <billodo@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
fb8ee907c1 |
xfs: prevent UAF in xfs_log_item_in_current_chkpt
commit f8d92a66e810acbef6ddbc0bd0cbd9b117ce8acd upstream. [ Continue to interpret xfs_log_item->li_seq as an LSN rather than a CIL sequence number. ] While I was running with KASAN and lockdep enabled, I stumbled upon an KASAN report about a UAF to a freed CIL checkpoint. Looking at the comment for xfs_log_item_in_current_chkpt, it seems pretty obvious to me that the original patch to xfs_defer_finish_noroll should have done something to lock the CIL to prevent it from switching the CIL contexts while the predicate runs. For upper level code that needs to know if a given log item is new enough not to need relogging, add a new wrapper that takes the CIL context lock long enough to sample the current CIL context. This is kind of racy in that the CIL can switch the contexts immediately after sampling, but that's ok because the consequence is that the defer ops code is a little slow to relog items. ================================================================== BUG: KASAN: use-after-free in xfs_log_item_in_current_chkpt+0x139/0x160 [xfs] Read of size 8 at addr ffff88804ea5f608 by task fsstress/527999 CPU: 1 PID: 527999 Comm: fsstress Tainted: G D 5.16.0-rc4-xfsx #rc4 Call Trace: <TASK> dump_stack_lvl+0x45/0x59 print_address_description.constprop.0+0x1f/0x140 kasan_report.cold+0x83/0xdf xfs_log_item_in_current_chkpt+0x139/0x160 xfs_defer_finish_noroll+0x3bb/0x1e30 __xfs_trans_commit+0x6c8/0xcf0 xfs_reflink_remap_extent+0x66f/0x10e0 xfs_reflink_remap_blocks+0x2dd/0xa90 xfs_file_remap_range+0x27b/0xc30 vfs_dedupe_file_range_one+0x368/0x420 vfs_dedupe_file_range+0x37c/0x5d0 do_vfs_ioctl+0x308/0x1260 __x64_sys_ioctl+0xa1/0x170 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f2c71a2950b Code: 0f 1e fa 48 8b 05 85 39 0d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 55 39 0d 00 f7 d8 64 89 01 48 RSP: 002b:00007ffe8c0e03c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00005600862a8740 RCX: 00007f2c71a2950b RDX: 00005600862a7be0 RSI: 00000000c0189436 RDI: 0000000000000004 RBP: 000000000000000b R08: 0000000000000027 R09: 0000000000000003 R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000005a R13: 00005600862804a8 R14: 0000000000016000 R15: 00005600862a8a20 </TASK> Allocated by task 464064: kasan_save_stack+0x1e/0x50 __kasan_kmalloc+0x81/0xa0 kmem_alloc+0xcd/0x2c0 [xfs] xlog_cil_ctx_alloc+0x17/0x1e0 [xfs] xlog_cil_push_work+0x141/0x13d0 [xfs] process_one_work+0x7f6/0x1380 worker_thread+0x59d/0x1040 kthread+0x3b0/0x490 ret_from_fork+0x1f/0x30 Freed by task 51: kasan_save_stack+0x1e/0x50 kasan_set_track+0x21/0x30 kasan_set_free_info+0x20/0x30 __kasan_slab_free+0xed/0x130 slab_free_freelist_hook+0x7f/0x160 kfree+0xde/0x340 xlog_cil_committed+0xbfd/0xfe0 [xfs] xlog_cil_process_committed+0x103/0x1c0 [xfs] xlog_state_do_callback+0x45d/0xbd0 [xfs] xlog_ioend_work+0x116/0x1c0 [xfs] process_one_work+0x7f6/0x1380 worker_thread+0x59d/0x1040 kthread+0x3b0/0x490 ret_from_fork+0x1f/0x30 Last potentially related work creation: kasan_save_stack+0x1e/0x50 __kasan_record_aux_stack+0xb7/0xc0 insert_work+0x48/0x2e0 __queue_work+0x4e7/0xda0 queue_work_on+0x69/0x80 xlog_cil_push_now.isra.0+0x16b/0x210 [xfs] xlog_cil_force_seq+0x1b7/0x850 [xfs] xfs_log_force_seq+0x1c7/0x670 [xfs] xfs_file_fsync+0x7c1/0xa60 [xfs] __x64_sys_fsync+0x52/0x80 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae The buggy address belongs to the object at ffff88804ea5f600 which belongs to the cache kmalloc-256 of size 256 The buggy address is located 8 bytes inside of 256-byte region [ffff88804ea5f600, ffff88804ea5f700) The buggy address belongs to the page: page:ffffea00013a9780 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88804ea5ea00 pfn:0x4ea5e head:ffffea00013a9780 order:1 compound_mapcount:0 flags: 0x4fff80000010200(slab|head|node=1|zone=1|lastcpupid=0xfff) raw: 04fff80000010200 ffffea0001245908 ffffea00011bd388 ffff888004c42b40 raw: ffff88804ea5ea00 0000000000100009 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88804ea5f500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88804ea5f580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff88804ea5f600: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff88804ea5f680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88804ea5f700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ================================================================== Fixes: 4e919af7827a ("xfs: periodically relog deferred intent items") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
7c07806ab0 |
xfs: fix the forward progress assertion in xfs_iwalk_run_callbacks
commit a5336d6bb2d02d0e9d4d3c8be04b80b8b68d56c8 upstream. In commit 27c14b5daa82 we started tracking the last inode seen during an inode walk to avoid infinite loops if a corrupt inobt record happens to have a lower ir_startino than the record preceeding it. Unfortunately, the assertion trips over the case where there are completely empty inobt records (which can happen quite easily on 64k page filesystems) because we advance the tracking cursor without actually putting the empty record into the processing buffer. Fix the assert to allow for this case. Reported-by: zlang@redhat.com Fixes: 27c14b5daa82 ("xfs: ensure inobt record walks always make forward progress") Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Zorro Lang <zlang@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
313699d505 |
xfs: ensure inobt record walks always make forward progress
commit 27c14b5daa82861220d6fa6e27b51f05f21ffaa7 upstream.
[ In xfs_iwalk_ag(), Replace a call to XFS_IS_CORRUPT() with a call to
ASSERT() ]
The aim of the inode btree record iterator function is to call a
callback on every record in the btree. To avoid having to tear down and
recreate the inode btree cursor around every callback, it caches a
certain number of records in a memory buffer. After each batch of
callback invocations, we have to perform a btree lookup to find the
next record after where we left off.
However, if the keys of the inode btree are corrupt, the lookup might
put us in the wrong part of the inode btree, causing the walk function
to loop forever. Therefore, we add extra cursor tracking to make sure
that we never go backwards neither when performing the lookup nor when
jumping to the next inobt record. This also fixes an off by one error
where upon resume the lookup should have been for the inode /after/ the
point at which we stopped.
Found by fuzzing xfs/460 with keys[2].startino = ones causing bulkstat
and quotacheck to hang.
Fixes:
|
||
|
7f9309a9f5 |
xfs: fix missing CoW blocks writeback conversion retry
commit c2f09217a4305478c55adc9a98692488dd19cd32 upstream. [ Set xfs_writepage_ctx->fork to XFS_DATA_FORK since 5.4.y tracks current extent's fork in this variable ] In commit |
||
|
6246b3a18f |
xfs: only relog deferred intent items if free space in the log gets low
commit 74f4d6a1e065c92428c5b588099e307a582d79d9 upstream. Now that we have the ability to ask the log how far the tail needs to be pushed to maintain its free space targets, augment the decision to relog an intent item so that we only do it if the log has hit the 75% full threshold. There's no point in relogging an intent into the same checkpoint, and there's no need to relog if there's plenty of free space in the log. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
09d6181447 |
xfs: expose the log push threshold
commit ed1575daf71e4e21d8ae735b6e687c95454aaa17 upstream. Separate the computation of the log push threshold and the push logic in xlog_grant_push_ail. This enables higher level code to determine (for example) that it is holding on to a logged intent item and the log is so busy that it is more than 75% full. In that case, it would be desirable to move the log item towards the head to release the tail, which we will cover in the next patch. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
5d711e4136 |
xfs: periodically relog deferred intent items
commit 4e919af7827a6adfc28e82cd6c4ffcfcc3dd6118 upstream. [ Modify xfs_{bmap|extfree|refcount|rmap}_item.c to fix merge conflicts ] There's a subtle design flaw in the deferred log item code that can lead to pinning the log tail. Taking up the defer ops chain examples from the previous commit, we can get trapped in sequences like this: Caller hands us a transaction t0 with D0-D3 attached. The defer ops chain will look like the following if the transaction rolls succeed: t1: D0(t0), D1(t0), D2(t0), D3(t0) t2: d4(t1), d5(t1), D1(t0), D2(t0), D3(t0) t3: d5(t1), D1(t0), D2(t0), D3(t0) ... t9: d9(t7), D3(t0) t10: D3(t0) t11: d10(t10), d11(t10) t12: d11(t10) In transaction 9, we finish d9 and try to roll to t10 while holding onto an intent item for D3 that we logged in t0. The previous commit changed the order in which we place new defer ops in the defer ops processing chain to reduce the maximum chain length. Now make xfs_defer_finish_noroll capable of relogging the entire chain periodically so that we can always move the log tail forward. Most chains will never get relogged, except for operations that generate very long chains (large extents containing many blocks with different sharing levels) or are on filesystems with small logs and a lot of ongoing metadata updates. Callers are now required to ensure that the transaction reservation is large enough to handle logging done items and new intent items for the maximum possible chain length. Most callers are careful to keep the chain lengths low, so the overhead should be minimal. The decision to relog an intent item is made based on whether the intent was logged in a previous checkpoint, since there's no point in relogging an intent into the same checkpoint. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
870e7d7108 |
xfs: change the order in which child and parent defer ops are finished
commit 27dada070d59c28a441f1907d2cec891b17dcb26 upstream. The defer ops code has been finishing items in the wrong order -- if a top level defer op creates items A and B, and finishing item A creates more defer ops A1 and A2, we'll put the new items on the end of the chain and process them in the order A B A1 A2. This is kind of weird, since it's convenient for programmers to be able to think of A and B as an ordered sequence where all the sub-tasks for A must finish before we move on to B, e.g. A A1 A2 D. Right now, our log intent items are not so complex that this matters, but this will become important for the atomic extent swapping patchset. In order to maintain correct reference counting of extents, we have to unmap and remap extents in that order, and we want to complete that work before moving on to the next range that the user wants to swap. This patch fixes defer ops to satsify that requirement. The primary symptom of the incorrect order was noticed in an early performance analysis of the atomic extent swap code. An astonishingly large number of deferred work items accumulated when userspace requested an atomic update of two very fragmented files. The cause of this was traced to the same ordering bug in the inner loop of xfs_defer_finish_noroll. If the ->finish_item method of a deferred operation queues new deferred operations, those new deferred ops are appended to the tail of the pending work list. To illustrate, say that a caller creates a transaction t0 with four deferred operations D0-D3. The first thing defer ops does is roll the transaction to t1, leaving us with: t1: D0(t0), D1(t0), D2(t0), D3(t0) Let's say that finishing each of D0-D3 will create two new deferred ops. After finish D0 and roll, we'll have the following chain: t2: D1(t0), D2(t0), D3(t0), d4(t1), d5(t1) d4 and d5 were logged to t1. Notice that while we're about to start work on D1, we haven't actually completed all the work implied by D0 being finished. So far we've been careful (or lucky) to structure the dfops callers such that D1 doesn't depend on d4 or d5 being finished, but this is a potential logic bomb. There's a second problem lurking. Let's see what happens as we finish D1-D3: t3: D2(t0), D3(t0), d4(t1), d5(t1), d6(t2), d7(t2) t4: D3(t0), d4(t1), d5(t1), d6(t2), d7(t2), d8(t3), d9(t3) t5: d4(t1), d5(t1), d6(t2), d7(t2), d8(t3), d9(t3), d10(t4), d11(t4) Let's say that d4-d11 are simple work items that don't queue any other operations, which means that we can complete each d4 and roll to t6: t6: d5(t1), d6(t2), d7(t2), d8(t3), d9(t3), d10(t4), d11(t4) t7: d6(t2), d7(t2), d8(t3), d9(t3), d10(t4), d11(t4) ... t11: d10(t4), d11(t4) t12: d11(t4) <done> When we try to roll to transaction #12, we're holding defer op d11, which we logged way back in t4. This means that the tail of the log is pinned at t4. If the log is very small or there are a lot of other threads updating metadata, this means that we might have wrapped the log and cannot get roll to t11 because there isn't enough space left before we'd run into t4. Let's shift back to the original failure. I mentioned before that I discovered this flaw while developing the atomic file update code. In that scenario, we have a defer op (D0) that finds a range of file blocks to remap, creates a handful of new defer ops to do that, and then asks to be continued with however much work remains. So, D0 is the original swapext deferred op. The first thing defer ops does is rolls to t1: t1: D0(t0) We try to finish D0, logging d1 and d2 in the process, but can't get all the work done. We log a done item and a new intent item for the work that D0 still has to do, and roll to t2: t2: D0'(t1), d1(t1), d2(t1) We roll and try to finish D0', but still can't get all the work done, so we log a done item and a new intent item for it, requeue D0 a second time, and roll to t3: t3: D0''(t2), d1(t1), d2(t1), d3(t2), d4(t2) If it takes 48 more rolls to complete D0, then we'll finally dispense with D0 in t50: t50: D<fifty primes>(t49), d1(t1), ..., d102(t50) We then try to roll again to get a chain like this: t51: d1(t1), d2(t1), ..., d101(t50), d102(t50) ... t152: d102(t50) <done> Notice that in rolling to transaction #51, we're holding on to a log intent item for d1 that was logged in transaction #1. This means that the tail of the log is pinned at t1. If the log is very small or there are a lot of other threads updating metadata, this means that we might have wrapped the log and cannot roll to t51 because there isn't enough space left before we'd run into t1. This is of course problem #2 again. But notice the third problem with this scenario: we have 102 defer ops tied to this transaction! Each of these items are backed by pinned kernel memory, which means that we risk OOM if the chains get too long. Yikes. Problem #1 is a subtle logic bomb that could hit someone in the future; problem #2 applies (rarely) to the current upstream, and problem This is not how incremental deferred operations were supposed to work. The dfops design of logging in the same transaction an intent-done item and a new intent item for the work remaining was to make it so that we only have to juggle enough deferred work items to finish that one small piece of work. Deferred log item recovery will find that first unfinished work item and restart it, no matter how many other intent items might follow it in the log. Therefore, it's ok to put the new intents at the start of the dfops chain. For the first example, the chains look like this: t2: d4(t1), d5(t1), D1(t0), D2(t0), D3(t0) t3: d5(t1), D1(t0), D2(t0), D3(t0) ... t9: d9(t7), D3(t0) t10: D3(t0) t11: d10(t10), d11(t10) t12: d11(t10) For the second example, the chains look like this: t1: D0(t0) t2: d1(t1), d2(t1), D0'(t1) t3: d2(t1), D0'(t1) t4: D0'(t1) t5: d1(t4), d2(t4), D0''(t4) ... t148: D0<50 primes>(t147) t149: d101(t148), d102(t148) t150: d102(t148) <done> This actually sucks more for pinning the log tail (we try to roll to t10 while holding an intent item that was logged in t1) but we've solved problem #1. We've also reduced the maximum chain length from: sum(all the new items) + nr_original_items to: max(new items that each original item creates) + nr_original_items This solves problem #3 by sharply reducing the number of defer ops that can be attached to a transaction at any given time. The change makes the problem of log tail pinning worse, but is improvement we need to solve problem #2. Actually solving #2, however, is left to the next patch. Note that a subsequent analysis of some hard-to-trigger reflink and COW livelocks on extremely fragmented filesystems (or systems running a lot of IO threads) showed the same symptoms -- uncomfortably large numbers of incore deferred work items and occasional stalls in the transaction grant code while waiting for log reservations. I think this patch and the next one will also solve these problems. As originally written, the code used list_splice_tail_init instead of list_splice_init, so change that, and leave a short comment explaining our actions. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
f5af1d5c2d |
xfs: fix an incore inode UAF in xfs_bui_recover
commit ff4ab5e02a0447dd1e290883eb6cd7d94848e590 upstream. In xfs_bui_item_recover, there exists a use-after-free bug with regards to the inode that is involved in the bmap replay operation. If the mapping operation does not complete, we call xfs_bmap_unmap_extent to create a deferred op to finish the unmapping work, and we retain a pointer to the incore inode. Unfortunately, the very next thing we do is commit the transaction and drop the inode. If reclaim tears down the inode before we try to finish the defer ops, we dereference garbage and blow up. Therefore, create a way to join inodes to the defer ops freezer so that we can maintain the xfs_inode reference until we're done with the inode. Note: This imposes the requirement that there be enough memory to keep every incore inode in memory throughout recovery. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
efcdc2e70e |
xfs: clean up xfs_bui_item_recover iget/trans_alloc/ilock ordering
commit 64a3f3315bc60f710a0a25c1798ac0ea58c6fa1f upstream. In most places in XFS, we have a specific order in which we gather resources: grab the inode, allocate a transaction, then lock the inode. xfs_bui_item_recover doesn't do it in that order, so fix it to be more consistent. This also makes the error bailout code a bit less weird. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
abad319dee |
xfs: clean up bmap intent item recovery checking
commit 919522e89f8e71fc6a8f8abe17be4011573c6ea0 upstream. The bmap intent item checking code in xfs_bui_item_recover is spread all over the function. We should check the recovered log item at the top before we allocate any resources or do anything else, so do that. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
6601531db8 |
xfs: xfs_defer_capture should absorb remaining transaction reservation
commit 929b92f64048d90d23e40a59c47adf59f5026903 upstream. When xfs_defer_capture extracts the deferred ops and transaction state from a transaction, it should record the transaction reservation type from the old transaction so that when we continue the dfops chain, we still use the same reservation parameters. Doing this means that the log item recovery functions get to determine the transaction reservation instead of abusing tr_itruncate in yet another part of xfs. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
411b14e68c |
xfs: xfs_defer_capture should absorb remaining block reservations
commit 4f9a60c48078c0efa3459678fa8d6e050e8ada5d upstream. When xfs_defer_capture extracts the deferred ops and transaction state from a transaction, it should record the remaining block reservations so that when we continue the dfops chain, we can reserve the same number of blocks to use. We capture the reservations for both data and realtime volumes. This adds the requirement that every log intent item recovery function must be careful to reserve enough blocks to handle both itself and all defer ops that it can queue. On the other hand, this enables us to do away with the handwaving block estimation nonsense that was going on in xlog_finish_defer_ops. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
3324249e6e |
xfs: proper replay of deferred ops queued during log recovery
commit e6fff81e487089e47358a028526a9f63cdbcd503 upstream.
When we replay unfinished intent items that have been recovered from the
log, it's possible that the replay will cause the creation of more
deferred work items. As outlined in commit
|
||
|
1c89c04305 |
xfs: fix finobt btree block recovery ordering
commit 671459676ab0e1d371c8d6b184ad1faa05b6941e upstream.
[ In 5.4.y, xlog_recover_get_buf_lsn() is defined inside
fs/xfs/xfs_log_recover.c ]
Nathan popped up on #xfs and pointed out that we fail to handle
finobt btree blocks in xlog_recover_get_buf_lsn(). This means they
always fall through the entire magic number matching code to "recover
immediately". Whilst most of the time this is the correct behaviour,
occasionally it will be incorrect and could potentially overwrite
more recent metadata because we don't check the LSN in the on disk
metadata at all.
This bug has been present since the finobt was first introduced, and
is a potential cause of the occasional xfs_iget_check_free_state()
failures we see that indicate that the inode btree state does not
match the on disk inode state.
Fixes:
|
||
|
6678b2787b |
xfs: log new intent items created as part of finishing recovered intent items
commit 93293bcbde93567efaf4e6bcd58cad270e1fcbf5 upstream. [Slightly edit fs/xfs/xfs_bmap_item.c & fs/xfs/xfs_refcount_item.c to resolve merge conflicts] During a code inspection, I found a serious bug in the log intent item recovery code when an intent item cannot complete all the work and decides to requeue itself to get that done. When this happens, the item recovery creates a new incore deferred op representing the remaining work and attaches it to the transaction that it allocated. At the end of _item_recover, it moves the entire chain of deferred ops to the dummy parent_tp that xlog_recover_process_intents passed to it, but fail to log a new intent item for the remaining work before committing the transaction for the single unit of work. xlog_finish_defer_ops logs those new intent items once recovery has finished dealing with the intent items that it recovered, but this isn't sufficient. If the log is forced to disk after a recovered log item decides to requeue itself and the system goes down before we call xlog_finish_defer_ops, the second log recovery will never see the new intent item and therefore has no idea that there was more work to do. It will finish recovery leaving the filesystem in a corrupted state. The same logic applies to /any/ deferred ops added during intent item recovery, not just the one handling the remaining work. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
562da8e704 |
xfs: refactor xfs_defer_finish_noroll
commit bb47d79750f1a68a75d4c7defc2da934ba31de14 upstream. Split out a helper that operates on a single xfs_defer_pending structure to untangle the code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
42a2406f90 |
xfs: turn dfp_intent into a xfs_log_item
commit 13a8333339072b8654c1d2c75550ee9f41ee15de upstream. All defer op instance place their own extension of the log item into the dfp_intent field. Replace that with a xfs_log_item to improve type safety and make the code easier to follow. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
e11f1516fc |
xfs: merge the ->diff_items defer op into ->create_intent
commit d367a868e46b025a8ced8e00ef2b3a3c2f3bf732 upstream. This avoids a per-item indirect call, and also simplifies the interface a bit. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
e84096edf8 |
xfs: merge the ->log_item defer op into ->create_intent
commit c1f09188e8de0ae65433cb9c8ace4feb66359bcc upstream. These are aways called together, and my merging them we reduce the amount of indirect calls, improve type safety and in general clean up the code a bit. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
64b21eaa33 |
xfs: factor out a xfs_defer_create_intent helper
commit e046e949486ec92d83b2ccdf0e7e9144f74ef028 upstream. Create a helper that encapsulates the whole logic to create a defer intent. This reorders some of the work that was done, but none of that has an affect on the operation as only fields that don't directly interact are affected. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
d24633f3c2 |
xfs: remove the xfs_inode_log_item_t typedef
commit fd9cbe51215198ccffa64169c98eae35b0916088 upstream. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
e0373eeaaa |
xfs: remove the xfs_efd_log_item_t typedef
commit c84e819090f39e96e4d432c9047a50d2424f99e0 upstream. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
94e0639992 |
xfs: remove the xfs_efi_log_item_t typedef
commit 82ff450b2d936d778361a1de43eb078cc043c7fe upstream. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |