Commit Graph

120 Commits

Author SHA1 Message Date
Greg Kroah-Hartman
2ebd481b31 Merge 5.10.221 into android12-5.10-lts
Changes in 5.10.221
	tracing/selftests: Fix kprobe event name test for .isra. functions
	null_blk: Print correct max open zones limit in null_init_zoned_dev()
	wifi: mac80211: mesh: Fix leak of mesh_preq_queue objects
	wifi: mac80211: Fix deadlock in ieee80211_sta_ps_deliver_wakeup()
	wifi: cfg80211: pmsr: use correct nla_get_uX functions
	wifi: iwlwifi: mvm: revert gen2 TX A-MPDU size to 64
	wifi: iwlwifi: dbg_ini: move iwl_dbg_tlv_free outside of debugfs ifdef
	wifi: iwlwifi: mvm: check n_ssids before accessing the ssids
	wifi: iwlwifi: mvm: don't read past the mfuart notifcation
	wifi: mac80211: correctly parse Spatial Reuse Parameter Set element
	net/ncsi: add NCSI Intel OEM command to keep PHY up
	net/ncsi: Simplify Kconfig/dts control flow
	net/ncsi: Fix the multi thread manner of NCSI driver
	ipv6: sr: block BH in seg6_output_core() and seg6_input_core()
	net: sched: sch_multiq: fix possible OOB write in multiq_tune()
	vxlan: Fix regression when dropping packets due to invalid src addresses
	tcp: count CLOSE-WAIT sockets for TCP_MIB_CURRESTAB
	net/sched: taprio: always validate TCA_TAPRIO_ATTR_PRIOMAP
	ptp: Fix error message on failed pin verification
	af_unix: Annotate data-race of sk->sk_state in unix_inq_len().
	af_unix: Annotate data-races around sk->sk_state in unix_write_space() and poll().
	af_unix: Annotate data-races around sk->sk_state in sendmsg() and recvmsg().
	af_unix: Annotate data-races around sk->sk_state in UNIX_DIAG.
	af_unix: Annotate data-race of net->unx.sysctl_max_dgram_qlen.
	af_unix: Use unix_recvq_full_lockless() in unix_stream_connect().
	af_unix: Use skb_queue_len_lockless() in sk_diag_show_rqlen().
	af_unix: Annotate data-race of sk->sk_shutdown in sk_diag_fill().
	ipv6: fix possible race in __fib6_drop_pcpu_from()
	usb: gadget: f_fs: Fix race between aio_cancel() and AIO request complete
	drm/amd/display: Handle Y carry-over in VCP X.Y calculation
	serial: sc16is7xx: replace hardcoded divisor value with BIT() macro
	serial: sc16is7xx: fix bug in sc16is7xx_set_baud() when using prescaler
	mmc: davinci: Don't strip remove function when driver is builtin
	selftests/mm: compaction_test: fix incorrect write of zero to nr_hugepages
	selftests/mm: conform test to TAP format output
	selftests/mm: compaction_test: fix bogus test success on Aarch64
	btrfs: fix leak of qgroup extent records after transaction abort
	nilfs2: Remove check for PageError
	nilfs2: return the mapped address from nilfs_get_page()
	nilfs2: fix nilfs_empty_dir() misjudgment and long loop on I/O errors
	USB: class: cdc-wdm: Fix CPU lockup caused by excessive log messages
	mei: me: release irq in mei_me_pci_resume error path
	jfs: xattr: fix buffer overflow for invalid xattr
	xhci: Set correct transferred length for cancelled bulk transfers
	xhci: Apply reset resume quirk to Etron EJ188 xHCI host
	xhci: Apply broken streams quirk to Etron EJ188 xHCI host
	scsi: mpt3sas: Avoid test/set_bit() operating in non-allocated memory
	powerpc/uaccess: Fix build errors seen with GCC 13/14
	Input: try trimming too long modalias strings
	SUNRPC: return proper error from gss_wrap_req_priv
	gpio: tqmx86: fix typo in Kconfig label
	HID: core: remove unnecessary WARN_ON() in implement()
	gpio: tqmx86: store IRQ trigger type and unmask status separately
	iommu/amd: Introduce pci segment structure
	iommu/amd: Fix sysfs leak in iommu init
	iommu: Return right value in iommu_sva_bind_device()
	HID: logitech-dj: Fix memory leak in logi_dj_recv_switch_to_dj_mode()
	drm/vmwgfx: 3D disabled should not effect STDU memory limits
	net: sfp: Always call `sfp_sm_mod_remove()` on remove
	net: hns3: add cond_resched() to hns3 ring buffer init process
	liquidio: Adjust a NULL pointer handling path in lio_vf_rep_copy_packet
	drm/komeda: check for error-valued pointer
	drm/bridge/panel: Fix runtime warning on panel bridge release
	tcp: fix race in tcp_v6_syn_recv_sock()
	net/mlx5e: Fix features validation check for tunneled UDP (non-VXLAN) packets
	Bluetooth: L2CAP: Fix rejecting L2CAP_CONN_PARAM_UPDATE_REQ
	netfilter: ipset: Fix race between namespace cleanup and gc in the list:set type
	net: stmmac: replace priv->speed with the portTransmitRate from the tc-cbs parameters
	net/ipv6: Fix the RT cache flush via sysctl using a previous delay
	ionic: fix use after netif_napi_del()
	iio: adc: ad9467: fix scan type sign
	iio: dac: ad5592r: fix temperature channel scaling value
	iio: imu: inv_icm42600: delete unneeded update watermark call
	drivers: core: synchronize really_probe() and dev_uevent()
	drm/exynos/vidi: fix memory leak in .get_modes()
	drm/exynos: hdmi: report safe 640x480 mode as a fallback when no EDID found
	vmci: prevent speculation leaks by sanitizing event in event_deliver()
	fs/proc: fix softlockup in __read_vmcore
	ocfs2: use coarse time for new created files
	ocfs2: fix races between hole punching and AIO+DIO
	PCI: rockchip-ep: Remove wrong mask on subsys_vendor_id
	dmaengine: axi-dmac: fix possible race in remove()
	remoteproc: k3-r5: Do not allow core1 to power up before core0 via sysfs
	intel_th: pci: Add Granite Rapids support
	intel_th: pci: Add Granite Rapids SOC support
	intel_th: pci: Add Sapphire Rapids SOC support
	intel_th: pci: Add Meteor Lake-S support
	intel_th: pci: Add Lunar Lake support
	nilfs2: fix potential kernel bug due to lack of writeback flag waiting
	tick/nohz_full: Don't abuse smp_call_function_single() in tick_setup_device()
	serial: 8250_pxa: Configure tx_loadsz to match FIFO IRQ level
	hugetlb_encode.h: fix undefined behaviour (34 << 26)
	mptcp: ensure snd_una is properly initialized on connect
	mptcp: pm: inc RmAddr MIB counter once per RM_ADDR ID
	mptcp: pm: update add_addr counters after connect
	remoteproc: k3-r5: Jump to error handling labels in start/stop errors
	greybus: Fix use-after-free bug in gb_interface_release due to race condition.
	usb-storage: alauda: Check whether the media is initialized
	i2c: at91: Fix the functionality flags of the slave-only interface
	i2c: designware: Fix the functionality flags of the slave-only interface
	zap_pid_ns_processes: clear TIF_NOTIFY_SIGNAL along with TIF_SIGPENDING
	padata: Disable BH when taking works lock on MT path
	rcutorture: Fix rcu_torture_one_read() pipe_count overflow comment
	rcutorture: Fix invalid context warning when enable srcu barrier testing
	block/ioctl: prefer different overflow check
	selftests/bpf: Prevent client connect before server bind in test_tc_tunnel.sh
	selftests/bpf: Fix flaky test btf_map_in_map/lookup_update
	batman-adv: bypass empty buckets in batadv_purge_orig_ref()
	wifi: ath9k: work around memset overflow warning
	af_packet: avoid a false positive warning in packet_setsockopt()
	drop_monitor: replace spin_lock by raw_spin_lock
	scsi: qedi: Fix crash while reading debugfs attribute
	kselftest: arm64: Add a null pointer check
	netpoll: Fix race condition in netpoll_owner_active
	HID: Add quirk for Logitech Casa touchpad
	ACPI: video: Add backlight=native quirk for Lenovo Slim 7 16ARH7
	Bluetooth: ath3k: Fix multiple issues reported by checkpatch.pl
	drm/amd/display: Exit idle optimizations before HDCP execution
	ASoC: Intel: sof_sdw: add JD2 quirk for HP Omen 14
	drm/lima: add mask irq callback to gp and pp
	drm/lima: mask irqs in timeout path before hard reset
	powerpc/pseries: Enforce hcall result buffer validity and size
	powerpc/io: Avoid clang null pointer arithmetic warnings
	power: supply: cros_usbpd: provide ID table for avoiding fallback match
	iommu/arm-smmu-v3: Free MSIs in case of ENOMEM
	f2fs: remove clear SB_INLINECRYPT flag in default_options
	usb: misc: uss720: check for incompatible versions of the Belkin F5U002
	udf: udftime: prevent overflow in udf_disk_stamp_to_time()
	PCI/PM: Avoid D3cold for HP Pavilion 17 PC/1972 PCIe Ports
	MIPS: Octeon: Add PCIe link status check
	serial: exar: adding missing CTI and Exar PCI ids
	MIPS: Routerboard 532: Fix vendor retry check code
	mips: bmips: BCM6358: make sure CBR is correctly set
	tracing: Build event generation tests only as modules
	cipso: fix total option length computation
	netrom: Fix a memory leak in nr_heartbeat_expiry()
	ipv6: prevent possible NULL deref in fib6_nh_init()
	ipv6: prevent possible NULL dereference in rt6_probe()
	xfrm6: check ip6_dst_idev() return value in xfrm6_get_saddr()
	netns: Make get_net_ns() handle zero refcount net
	qca_spi: Make interrupt remembering atomic
	net/sched: act_api: rely on rcu in tcf_idr_check_alloc
	net/sched: act_api: fix possible infinite loop in tcf_idr_check_alloc()
	tipc: force a dst refcount before doing decryption
	net/sched: act_ct: set 'net' pointer when creating new nf_flow_table
	sched: act_ct: add netns into the key of tcf_ct_flow_table
	net: stmmac: No need to calculate speed divider when offload is disabled
	virtio_net: checksum offloading handling fix
	netfilter: ipset: Fix suspicious rcu_dereference_protected()
	net: usb: rtl8150 fix unintiatilzed variables in rtl8150_get_link_ksettings
	regulator: core: Fix modpost error "regulator_get_regmap" undefined
	dmaengine: ioat: switch from 'pci_' to 'dma_' API
	dmaengine: ioat: Drop redundant pci_enable_pcie_error_reporting()
	dmaengine: ioatdma: Fix leaking on version mismatch
	dmaengine: ioat: use PCI core macros for PCIe Capability
	dmaengine: ioatdma: Fix error path in ioat3_dma_probe()
	dmaengine: ioatdma: Fix kmemleak in ioat_pci_probe()
	dmaengine: ioatdma: Fix missing kmem_cache_destroy()
	ACPICA: Revert "ACPICA: avoid Info: mapping multiple BARs. Your kernel is fine."
	RDMA/mlx5: Add check for srq max_sge attribute
	ALSA: hda/realtek: Limit mic boost on N14AP7
	drm/radeon: fix UBSAN warning in kv_dpm.c
	gcov: add support for GCC 14
	kcov: don't lose track of remote references during softirqs
	i2c: ocores: set IACK bit after core is enabled
	dt-bindings: i2c: google,cros-ec-i2c-tunnel: correct path to i2c-controller schema
	drm/amd/display: revert Exit idle optimizations before HDCP execution
	ARM: dts: samsung: smdkv310: fix keypad no-autorepeat
	ARM: dts: samsung: exynos4412-origen: fix keypad no-autorepeat
	ARM: dts: samsung: smdk4412: fix keypad no-autorepeat
	rtlwifi: rtl8192de: Style clean-ups
	wifi: rtlwifi: rtl8192de: Fix 5 GHz TX power
	pmdomain: ti-sci: Fix duplicate PD referrals
	knfsd: LOOKUP can return an illegal error value
	spmi: hisi-spmi-controller: Do not override device identifier
	bcache: fix variable length array abuse in btree_iter
	tracing: Add MODULE_DESCRIPTION() to preemptirq_delay_test
	x86/cpu/vfm: Add new macros to work with (vendor/family/model) values
	x86/cpu: Fix x86_match_cpu() to match just X86_VENDOR_INTEL
	r8169: remove unneeded memory barrier in rtl_tx
	r8169: improve rtl_tx
	r8169: improve rtl8169_start_xmit
	r8169: remove nr_frags argument from rtl_tx_slots_avail
	r8169: remove not needed check in rtl8169_start_xmit
	r8169: Fix possible ring buffer corruption on fragmented Tx packets.
	Revert "kheaders: substituting --sort in archive creation"
	kheaders: explicitly define file modes for archived headers
	perf/core: Fix missing wakeup when waiting for context reference
	PCI: Add PCI_ERROR_RESPONSE and related definitions
	x86/amd_nb: Check for invalid SMN reads
	cifs: missed ref-counting smb session in find
	smb: client: fix deadlock in smb2_find_smb_tcon()
	ACPI: Add quirks for AMD Renoir/Lucienne CPUs to force the D3 hint
	ACPI: x86: Add a quirk for Dell Inspiron 14 2-in-1 for StorageD3Enable
	ACPI: x86: Add another system to quirk list for forcing StorageD3Enable
	ACPI: x86: utils: Add Cezanne to the list for forcing StorageD3Enable
	ACPI: x86: utils: Add Picasso to the list for forcing StorageD3Enable
	ACPI: x86: Force StorageD3Enable on more products
	Input: ili210x - fix ili251x_read_touch_data() return value
	pinctrl: fix deadlock in create_pinctrl() when handling -EPROBE_DEFER
	pinctrl: rockchip: fix pinmux bits for RK3328 GPIO2-B pins
	pinctrl: rockchip: fix pinmux bits for RK3328 GPIO3-B pins
	pinctrl/rockchip: separate struct rockchip_pin_bank to a head file
	pinctrl: rockchip: use dedicated pinctrl type for RK3328
	pinctrl: rockchip: fix pinmux reset in rockchip_pmx_set
	drm/amdgpu: fix UBSAN warning in kv_dpm.c
	netfilter: nf_tables: validate family when identifying table via handle
	SUNRPC: Fix null pointer dereference in svc_rqst_free()
	SUNRPC: Fix a NULL pointer deref in trace_svc_stats_latency()
	SUNRPC: Fix svcxdr_init_decode's end-of-buffer calculation
	SUNRPC: Fix svcxdr_init_encode's buflen calculation
	nfsd: hold a lighter-weight client reference over CB_RECALL_ANY
	ASoC: fsl-asoc-card: set priv->pdev before using it
	net: dsa: microchip: fix initial port flush problem
	net: phy: micrel: add Microchip KSZ 9477 to the device table
	xdp: Move the rxq_info.mem clearing to unreg_mem_model()
	xdp: Allow registering memory model without rxq reference
	xdp: Remove WARN() from __xdp_reg_mem_model()
	sparc: fix old compat_sys_select()
	sparc: fix compat recv/recvfrom syscalls
	parisc: use correct compat recv/recvfrom syscalls
	netfilter: nf_tables: fully validate NFT_DATA_VALUE on store to data registers
	drm/panel: ilitek-ili9881c: Fix warning with GPIO controllers that sleep
	mtd: partitions: redboot: Added conversion of operands to a larger type
	bpf: Add a check for struct bpf_fib_lookup size
	net/iucv: Avoid explicit cpumask var allocation on stack
	net/dpaa2: Avoid explicit cpumask var allocation on stack
	ALSA: emux: improve patch ioctl data validation
	media: dvbdev: Initialize sbuf
	soc: ti: wkup_m3_ipc: Send NULL dummy message instead of pointer message
	drm/radeon/radeon_display: Decrease the size of allocated memory
	nvme: fixup comment for nvme RDMA Provider Type
	drm/panel: simple: Add missing display timing flags for KOE TX26D202VM0BWA
	gpio: davinci: Validate the obtained number of IRQs
	gpiolib: cdev: Disallow reconfiguration without direction (uAPI v1)
	x86: stop playing stack games in profile_pc()
	ocfs2: fix DIO failure due to insufficient transaction credits
	mmc: sdhci-pci: Convert PCIBIOS_* return codes to errnos
	mmc: sdhci: Do not invert write-protect twice
	mmc: sdhci: Do not lock spinlock around mmc_gpio_get_ro()
	counter: ti-eqep: enable clock at probe
	iio: adc: ad7266: Fix variable checking bug
	iio: chemical: bme680: Fix pressure value output
	iio: chemical: bme680: Fix calibration data variable
	iio: chemical: bme680: Fix overflows in compensate() functions
	iio: chemical: bme680: Fix sensor data read operation
	net: usb: ax88179_178a: improve link status logs
	usb: gadget: printer: SS+ support
	usb: gadget: printer: fix races against disable
	usb: musb: da8xx: fix a resource leak in probe()
	usb: atm: cxacru: fix endpoint checking in cxacru_bind()
	serial: 8250_omap: Implementation of Errata i2310
	tty: mcf: MCF54418 has 10 UARTS
	net: can: j1939: Initialize unused data in j1939_send_one()
	net: can: j1939: recover socket queue on CAN bus error during BAM transmission
	net: can: j1939: enhanced error handling for tightly received RTS messages in xtp_rx_rts_session_new
	kbuild: Install dtb files as 0644 in Makefile.dtbinst
	csky, hexagon: fix broken sys_sync_file_range
	hexagon: fix fadvise64_64 calling conventions
	drm/nouveau/dispnv04: fix null pointer dereference in nv17_tv_get_ld_modes
	drm/i915/gt: Fix potential UAF by revoke of fence registers
	drm/nouveau/dispnv04: fix null pointer dereference in nv17_tv_get_hd_modes
	batman-adv: Don't accept TT entries for out-of-spec VIDs
	ata: ahci: Clean up sysfs file on error
	ata: libata-core: Fix double free on error
	ftruncate: pass a signed offset
	syscalls: fix compat_sys_io_pgetevents_time64 usage
	mtd: spinand: macronix: Add support for serial NAND flash
	pwm: stm32: Refuse too small period requests
	nfs: Leave pages in the pagecache if readpage failed
	ipv6: annotate some data-races around sk->sk_prot
	ipv6: Fix data races around sk->sk_prot.
	tcp: Fix data races around icsk->icsk_af_ops.
	drivers: fix typo in firmware/efi/memmap.c
	efi: Correct comment on efi_memmap_alloc
	efi: memmap: Move manipulation routines into x86 arch tree
	efi: xen: Set EFI_PARAVIRT for Xen dom0 boot on all architectures
	efi/x86: Free EFI memory map only when installing a new one.
	KVM: arm64: vgic-v4: Make the doorbell request robust w.r.t preemption
	ARM: dts: rockchip: rk3066a: add #sound-dai-cells to hdmi node
	arm64: dts: rockchip: Add sound-dai-cells for RK3368
	xdp: xdp_mem_allocator can be NULL in trace_mem_connect().
	serial: 8250_omap: Fix Errata i2310 with RX FIFO level check
	tracing/net_sched: NULL pointer dereference in perf_trace_qdisc_reset()
	Linux 5.10.221

Change-Id: Icac1c62fcbda5102be7ea031121f28d6fee36875
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2024-07-17 09:08:09 +00:00
Justin Stitt
58706e482b block/ioctl: prefer different overflow check
[ Upstream commit ccb326b5f9e623eb7f130fbbf2505ec0e2dcaff9 ]

Running syzkaller with the newly reintroduced signed integer overflow
sanitizer shows this report:

[   62.982337] ------------[ cut here ]------------
[   62.985692] cgroup: Invalid name
[   62.986211] UBSAN: signed-integer-overflow in ../block/ioctl.c:36:46
[   62.989370] 9pnet_fd: p9_fd_create_tcp (7343): problem connecting socket to 127.0.0.1
[   62.992992] 9223372036854775807 + 4095 cannot be represented in type 'long long'
[   62.997827] 9pnet_fd: p9_fd_create_tcp (7345): problem connecting socket to 127.0.0.1
[   62.999369] random: crng reseeded on system resumption
[   63.000634] GUP no longer grows the stack in syz-executor.2 (7353): 20002000-20003000 (20001000)
[   63.000668] CPU: 0 PID: 7353 Comm: syz-executor.2 Not tainted 6.8.0-rc2-00035-gb3ef86b5a957 #1
[   63.000677] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[   63.000682] Call Trace:
[   63.000686]  <TASK>
[   63.000731]  dump_stack_lvl+0x93/0xd0
[   63.000919]  __get_user_pages+0x903/0xd30
[   63.001030]  __gup_longterm_locked+0x153e/0x1ba0
[   63.001041]  ? _raw_read_unlock_irqrestore+0x17/0x50
[   63.001072]  ? try_get_folio+0x29c/0x2d0
[   63.001083]  internal_get_user_pages_fast+0x1119/0x1530
[   63.001109]  iov_iter_extract_pages+0x23b/0x580
[   63.001206]  bio_iov_iter_get_pages+0x4de/0x1220
[   63.001235]  iomap_dio_bio_iter+0x9b6/0x1410
[   63.001297]  __iomap_dio_rw+0xab4/0x1810
[   63.001316]  iomap_dio_rw+0x45/0xa0
[   63.001328]  ext4_file_write_iter+0xdde/0x1390
[   63.001372]  vfs_write+0x599/0xbd0
[   63.001394]  ksys_write+0xc8/0x190
[   63.001403]  do_syscall_64+0xd4/0x1b0
[   63.001421]  ? arch_exit_to_user_mode_prepare+0x3a/0x60
[   63.001479]  entry_SYSCALL_64_after_hwframe+0x6f/0x77
[   63.001535] RIP: 0033:0x7f7fd3ebf539
[   63.001551] Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 14 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
[   63.001562] RSP: 002b:00007f7fd32570c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[   63.001584] RAX: ffffffffffffffda RBX: 00007f7fd3ff3f80 RCX: 00007f7fd3ebf539
[   63.001590] RDX: 4db6d1e4f7e43360 RSI: 0000000020000000 RDI: 0000000000000004
[   63.001595] RBP: 00007f7fd3f1e496 R08: 0000000000000000 R09: 0000000000000000
[   63.001599] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[   63.001604] R13: 0000000000000006 R14: 00007f7fd3ff3f80 R15: 00007ffd415ad2b8
...
[   63.018142] ---[ end trace ]---

Historically, the signed integer overflow sanitizer did not work in the
kernel due to its interaction with `-fwrapv` but this has since been
changed [1] in the newest version of Clang; It was re-enabled in the
kernel with Commit 557f8c582a9ba8ab ("ubsan: Reintroduce signed overflow
sanitizer").

Let's rework this overflow checking logic to not actually perform an
overflow during the check itself, thus avoiding the UBSAN splat.

[1]: https://github.com/llvm/llvm-project/pull/82432

Signed-off-by: Justin Stitt <justinstitt@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20240507-b4-sio-block-ioctl-v3-1-ba0c2b32275e@google.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-05 09:12:33 +02:00
Greg Kroah-Hartman
9100d24dfd This is the 5.10.215 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmYaZdgACgkQONu9yGCS
 aT4oMxAA0pATFAq8RN5f9CmYlMg5HqHgzZ8lJv8P0/reOINhUa+F5sJb1n+x+Ch4
 WQbmiFeZRzfsKZ2qKhIdNR0Lg+9JOr/DtYXdSBZ6InfSWrTAIrQ9fjl5Warkmcgg
 O4WbgF5BVgU3vGFATgxLvnUZwhR1D7WK93oMDunzrT7+OqyncU3f1Uj53ZAu9030
 z18UNqnTxDLYH/CMGwAeRkaZqBev9gZ1HdgQWA27SVLqWQwZq0al81Cmlo+ECVmk
 5dF6V2pid4qfKGJjDDfx1NS0PVnoP68iK4By1SXyoFV9VBiSwp77nUUyDr7YsHsT
 u8GpZHr9jZvSO5/xtKv20NPLejTPCRKc06CbkwpikDRtGOocBL8em0GuVqlf8hMs
 KwDb6ZEzYhXZGPJHbJM+aRD1tq/KHw9X7TrldOszMQPr6lubBtscPbg1FCg3OlcC
 HUrtub0i275x7TH0dJeRTD8TRE9jRmF+tl7KQytEJM3JRrquFjLyhDj+/VJnZkiB
 lzj3FRf4zshzgz4+CAeqXO/8Lu8b3fGYmcW1acCmk7emjDcXUKojPj/Aig6T4l7P
 oCWDY3+w1E6eiyE8BazxY1KUa/41ld0VJnlW5JWGRaDFTJwrk0h6/rvf9qImSckw
 IGx24UezRyp6NS1op3Qm2iwHLr41pFRfKxNm9ppgH9iBPzOhe38=
 =pkLL
 -----END PGP SIGNATURE-----

Merge 5.10.215 into android12-5.10-lts

Changes in 5.10.215
	amdkfd: use calloc instead of kzalloc to avoid integer overflow
	Documentation/hw-vuln: Update spectre doc
	x86/cpu: Support AMD Automatic IBRS
	x86/bugs: Use sysfs_emit()
	timers: Update kernel-doc for various functions
	timers: Use del_timer_sync() even on UP
	timers: Rename del_timer_sync() to timer_delete_sync()
	wifi: brcmfmac: Fix use-after-free bug in brcmf_cfg80211_detach
	media: staging: ipu3-imgu: Set fields before media_entity_pads_init()
	clk: qcom: gcc-sdm845: Add soft dependency on rpmhpd
	smack: Set SMACK64TRANSMUTE only for dirs in smack_inode_setxattr()
	smack: Handle SMACK64TRANSMUTE in smack_inode_setsecurity()
	arm: dts: marvell: Fix maxium->maxim typo in brownstone dts
	drm/vmwgfx: stop using ttm_bo_create v2
	drm/vmwgfx: switch over to the new pin interface v2
	drm/vmwgfx/vmwgfx_cmdbuf_res: Remove unused variable 'ret'
	drm/vmwgfx: Fix some static checker warnings
	drm/vmwgfx: Fix possible null pointer derefence with invalid contexts
	serial: max310x: fix NULL pointer dereference in I2C instantiation
	media: xc4000: Fix atomicity violation in xc4000_get_frequency
	KVM: Always flush async #PF workqueue when vCPU is being destroyed
	sparc64: NMI watchdog: fix return value of __setup handler
	sparc: vDSO: fix return value of __setup handler
	crypto: qat - fix double free during reset
	crypto: qat - resolve race condition during AER recovery
	selftests/mqueue: Set timeout to 180 seconds
	ext4: correct best extent lstart adjustment logic
	block: introduce zone_write_granularity limit
	block: Clear zone limits for a non-zoned stacked queue
	bounds: support non-power-of-two CONFIG_NR_CPUS
	fat: fix uninitialized field in nostale filehandles
	ubifs: Set page uptodate in the correct place
	ubi: Check for too small LEB size in VTBL code
	ubi: correct the calculation of fastmap size
	mtd: rawnand: meson: fix scrambling mode value in command macro
	parisc: Avoid clobbering the C/B bits in the PSW with tophys and tovirt macros
	parisc: Fix ip_fast_csum
	parisc: Fix csum_ipv6_magic on 32-bit systems
	parisc: Fix csum_ipv6_magic on 64-bit systems
	parisc: Strip upper 32 bit of sum in csum_ipv6_magic for 64-bit builds
	PM: suspend: Set mem_sleep_current during kernel command line setup
	clk: qcom: gcc-ipq6018: fix terminating of frequency table arrays
	clk: qcom: gcc-ipq8074: fix terminating of frequency table arrays
	clk: qcom: mmcc-apq8084: fix terminating of frequency table arrays
	clk: qcom: mmcc-msm8974: fix terminating of frequency table arrays
	powerpc/fsl: Fix mfpmr build errors with newer binutils
	USB: serial: ftdi_sio: add support for GMC Z216C Adapter IR-USB
	USB: serial: add device ID for VeriFone adapter
	USB: serial: cp210x: add ID for MGP Instruments PDS100
	USB: serial: option: add MeiG Smart SLM320 product
	USB: serial: cp210x: add pid/vid for TDK NC0110013M and MM0110113M
	PM: sleep: wakeirq: fix wake irq warning in system suspend
	mmc: tmio: avoid concurrent runs of mmc_request_done()
	fuse: fix root lookup with nonzero generation
	fuse: don't unhash root
	usb: typec: ucsi: Clean up UCSI_CABLE_PROP macros
	printk/console: Split out code that enables default console
	serial: Lock console when calling into driver before registration
	btrfs: fix off-by-one chunk length calculation at contains_pending_extent()
	PCI: Drop pci_device_remove() test of pci_dev->driver
	PCI/PM: Drain runtime-idle callbacks before driver removal
	PCI/ERR: Cache RCEC EA Capability offset in pci_init_capabilities()
	PCI: Cache PCIe Device Capabilities register
	PCI: Work around Intel I210 ROM BAR overlap defect
	PCI/ASPM: Make Intel DG2 L1 acceptable latency unlimited
	PCI/DPC: Quirk PIO log size for certain Intel Root Ports
	PCI/DPC: Quirk PIO log size for Intel Raptor Lake Root Ports
	Revert "Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d""
	dm-raid: fix lockdep waring in "pers->hot_add_disk"
	mac802154: fix llsec key resources release in mac802154_llsec_key_del
	mm: swap: fix race between free_swap_and_cache() and swapoff()
	mmc: core: Fix switch on gp3 partition
	drm/etnaviv: Restore some id values
	hwmon: (amc6821) add of_match table
	ext4: fix corruption during on-line resize
	nvmem: meson-efuse: fix function pointer type mismatch
	slimbus: core: Remove usage of the deprecated ida_simple_xx() API
	phy: tegra: xusb: Add API to retrieve the port number of phy
	usb: gadget: tegra-xudc: Use dev_err_probe()
	usb: gadget: tegra-xudc: Fix USB3 PHY retrieval logic
	speakup: Fix 8bit characters from direct synth
	PCI/ERR: Clear AER status only when we control AER
	PCI/AER: Block runtime suspend when handling errors
	nfs: fix UAF in direct writes
	kbuild: Move -Wenum-{compare-conditional,enum-conversion} into W=1
	PCI: dwc: endpoint: Fix advertised resizable BAR size
	vfio/platform: Disable virqfds on cleanup
	ring-buffer: Fix waking up ring buffer readers
	ring-buffer: Do not set shortest_full when full target is hit
	ring-buffer: Fix resetting of shortest_full
	ring-buffer: Fix full_waiters_pending in poll
	soc: fsl: qbman: Always disable interrupts when taking cgr_lock
	soc: fsl: qbman: Add helper for sanity checking cgr ops
	soc: fsl: qbman: Add CGR update function
	soc: fsl: qbman: Use raw spinlock for cgr_lock
	s390/zcrypt: fix reference counting on zcrypt card objects
	drm/panel: do not return negative error codes from drm_panel_get_modes()
	drm/exynos: do not return negative values from .get_modes()
	drm/imx/ipuv3: do not return negative values from .get_modes()
	drm/vc4: hdmi: do not return negative values from .get_modes()
	memtest: use {READ,WRITE}_ONCE in memory scanning
	nilfs2: fix failure to detect DAT corruption in btree and direct mappings
	nilfs2: prevent kernel bug at submit_bh_wbc()
	cpufreq: dt: always allocate zeroed cpumask
	x86/CPU/AMD: Update the Zenbleed microcode revisions
	net: hns3: tracing: fix hclgevf trace event strings
	wireguard: netlink: check for dangling peer via is_dead instead of empty list
	wireguard: netlink: access device through ctx instead of peer
	ahci: asm1064: correct count of reported ports
	ahci: asm1064: asm1166: don't limit reported ports
	drm/amd/display: Return the correct HDCP error code
	drm/amd/display: Fix noise issue on HDMI AV mute
	dm snapshot: fix lockup in dm_exception_table_exit
	vxge: remove unnecessary cast in kfree()
	x86/stackprotector/32: Make the canary into a regular percpu variable
	x86/pm: Work around false positive kmemleak report in msr_build_context()
	scripts: kernel-doc: Fix syntax error due to undeclared args variable
	comedi: comedi_test: Prevent timers rescheduling during deletion
	cpufreq: brcmstb-avs-cpufreq: fix up "add check for cpufreq_cpu_get's return value"
	netfilter: nf_tables: mark set as dead when unbinding anonymous set with timeout
	netfilter: nf_tables: disallow anonymous set with timeout flag
	netfilter: nf_tables: reject constant set with timeout
	Drivers: hv: vmbus: Calculate ring buffer size for more efficient use of memory
	xfrm: Avoid clang fortify warning in copy_to_user_tmpl()
	KVM: SVM: Flush pages under kvm->lock to fix UAF in svm_register_enc_region()
	ALSA: hda/realtek - Fix headset Mic no show at resume back for Lenovo ALC897 platform
	USB: usb-storage: Prevent divide-by-0 error in isd200_ata_command
	usb: gadget: ncm: Fix handling of zero block length packets
	usb: port: Don't try to peer unused USB ports based on location
	tty: serial: fsl_lpuart: avoid idle preamble pending if CTS is enabled
	mei: me: add arrow lake point S DID
	mei: me: add arrow lake point H DID
	vt: fix unicode buffer corruption when deleting characters
	fs/aio: Check IOCB_AIO_RW before the struct aio_kiocb conversion
	tee: optee: Fix kernel panic caused by incorrect error handling
	xen/events: close evtchn after mapping cleanup
	printk: Update @console_may_schedule in console_trylock_spinning()
	btrfs: allocate btrfs_ioctl_defrag_range_args on stack
	x86/asm: Add _ASM_RIP() macro for x86-64 (%rip) suffix
	x86/bugs: Add asm helpers for executing VERW
	x86/entry_64: Add VERW just before userspace transition
	x86/entry_32: Add VERW just before userspace transition
	x86/bugs: Use ALTERNATIVE() instead of mds_user_clear static key
	KVM/VMX: Use BT+JNC, i.e. EFLAGS.CF to select VMRESUME vs. VMLAUNCH
	KVM/VMX: Move VERW closer to VMentry for MDS mitigation
	x86/mmio: Disable KVM mitigation when X86_FEATURE_CLEAR_CPU_BUF is set
	Documentation/hw-vuln: Add documentation for RFDS
	x86/rfds: Mitigate Register File Data Sampling (RFDS)
	KVM/x86: Export RFDS_NO and RFDS_CLEAR to guests
	perf/core: Fix reentry problem in perf_output_read_group()
	efivarfs: Request at most 512 bytes for variable names
	powerpc: xor_vmx: Add '-mhard-float' to CFLAGS
	serial: sc16is7xx: convert from _raw_ to _noinc_ regmap functions for FIFO
	mm/memory-failure: fix an incorrect use of tail pages
	mm/migrate: set swap entry values of THP tail pages properly.
	init: open /initrd.image with O_LARGEFILE
	wifi: mac80211: check/clear fast rx for non-4addr sta VLAN changes
	exec: Fix NOMMU linux_binprm::exec in transfer_args_to_stack()
	hexagon: vmlinux.lds.S: handle attributes section
	mmc: core: Initialize mmc_blk_ioc_data
	mmc: core: Avoid negative index with array access
	net: ll_temac: platform_get_resource replaced by wrong function
	usb: cdc-wdm: close race between read and workqueue
	ALSA: sh: aica: reorder cleanup operations to avoid UAF bugs
	scsi: core: Fix unremoved procfs host directory regression
	staging: vc04_services: changen strncpy() to strscpy_pad()
	staging: vc04_services: fix information leak in create_component()
	USB: core: Add hub_get() and hub_put() routines
	usb: dwc2: host: Fix remote wakeup from hibernation
	usb: dwc2: host: Fix hibernation flow
	usb: dwc2: host: Fix ISOC flow in DDMA mode
	usb: dwc2: gadget: LPM flow fix
	usb: udc: remove warning when queue disabled ep
	usb: typec: ucsi: Ack unsupported commands
	usb: typec: ucsi: Clear UCSI_CCI_RESET_COMPLETE before reset
	scsi: qla2xxx: Split FCE|EFT trace control
	scsi: qla2xxx: Fix command flush on cable pull
	scsi: qla2xxx: Delay I/O Abort on PCI error
	x86/cpu: Enable STIBP on AMD if Automatic IBRS is enabled
	PCI/DPC: Quirk PIO log size for Intel Ice Lake Root Ports
	scsi: lpfc: Correct size for wqe for memset()
	USB: core: Fix deadlock in usb_deauthorize_interface()
	nfc: nci: Fix uninit-value in nci_dev_up and nci_ntf_packet
	ixgbe: avoid sleeping allocation in ixgbe_ipsec_vf_add_sa()
	tcp: properly terminate timers for kernel sockets
	ACPICA: debugger: check status of acpi_evaluate_object() in acpi_db_walk_for_fields()
	bpf: Protect against int overflow for stack access size
	Octeontx2-af: fix pause frame configuration in GMP mode
	dm integrity: fix out-of-range warning
	r8169: fix issue caused by buggy BIOS on certain boards with RTL8168d
	x86/cpufeatures: Add new word for scattered features
	Bluetooth: hci_event: set the conn encrypted before conn establishes
	Bluetooth: Fix TOCTOU in HCI debugfs implementation
	netfilter: nf_tables: disallow timeout for anonymous sets
	net/rds: fix possible cp null dereference
	vfio/pci: Disable auto-enable of exclusive INTx IRQ
	vfio/pci: Lock external INTx masking ops
	vfio: Introduce interface to flush virqfd inject workqueue
	vfio/pci: Create persistent INTx handler
	vfio/platform: Create persistent IRQ handlers
	vfio/fsl-mc: Block calling interrupt handler without trigger
	io_uring: ensure '0' is returned on file registration success
	Revert "x86/mm/ident_map: Use gbpages only where full GB page should be mapped."
	mm, vmscan: prevent infinite loop for costly GFP_NOIO | __GFP_RETRY_MAYFAIL allocations
	x86/srso: Add SRSO mitigation for Hygon processors
	block: add check that partition length needs to be aligned with block size
	netfilter: nf_tables: reject new basechain after table flag update
	netfilter: nf_tables: flush pending destroy work before exit_net release
	netfilter: nf_tables: Fix potential data-race in __nft_flowtable_type_get()
	netfilter: validate user input for expected length
	vboxsf: Avoid an spurious warning if load_nls_xxx() fails
	bpf, sockmap: Prevent lock inversion deadlock in map delete elem
	net/sched: act_skbmod: prevent kernel-infoleak
	net: stmmac: fix rx queue priority assignment
	erspan: make sure erspan_base_hdr is present in skb->head
	selftests: reuseaddr_conflict: add missing new line at the end of the output
	ipv6: Fix infinite recursion in fib6_dump_done().
	udp: do not transition UDP GRO fraglist partial checksums to unnecessary
	octeontx2-pf: check negative error code in otx2_open()
	i40e: fix i40e_count_filters() to count only active/new filters
	i40e: fix vf may be used uninitialized in this function warning
	scsi: qla2xxx: Update manufacturer details
	scsi: qla2xxx: Update manufacturer detail
	Revert "usb: phy: generic: Get the vbus supply"
	udp: do not accept non-tunnel GSO skbs landing in a tunnel
	net: ravb: Always process TX descriptor ring
	arm64: dts: qcom: sc7180: Remove clock for bluetooth on Trogdor
	arm64: dts: qcom: sc7180-trogdor: mark bluetooth address as broken
	ASoC: ops: Fix wraparound for mask in snd_soc_get_volsw
	ata: sata_sx4: fix pdc20621_get_from_dimm() on 64-bit
	scsi: mylex: Fix sysfs buffer lengths
	ata: sata_mv: Fix PCI device ID table declaration compilation warning
	ALSA: hda/realtek: Update Panasonic CF-SZ6 quirk to support headset with microphone
	driver core: Introduce device_link_wait_removal()
	of: dynamic: Synchronize of_changeset_destroy() with the devlink removals
	x86/mce: Make sure to grab mce_sysfs_mutex in set_bank()
	s390/entry: align system call table on 8 bytes
	riscv: Fix spurious errors from __get/put_kernel_nofault
	x86/bugs: Fix the SRSO mitigation on Zen3/4
	x86/retpoline: Do the necessary fixup to the Zen3/4 srso return thunk for !SRSO
	mptcp: don't account accept() of non-MPC client as fallback to TCP
	x86/cpufeatures: Add CPUID_LNX_5 to track recently added Linux-defined word
	objtool: Add asm version of STACK_FRAME_NON_STANDARD
	wifi: ath9k: fix LNA selection in ath_ant_try_scan()
	VMCI: Fix memcpy() run-time warning in dg_dispatch_as_host()
	panic: Flush kernel log buffer at the end
	arm64: dts: rockchip: fix rk3328 hdmi ports node
	arm64: dts: rockchip: fix rk3399 hdmi ports node
	ionic: set adminq irq affinity
	pstore/zone: Add a null pointer check to the psz_kmsg_read
	tools/power x86_energy_perf_policy: Fix file leak in get_pkg_num()
	btrfs: handle chunk tree lookup error in btrfs_relocate_sys_chunks()
	btrfs: export: handle invalid inode or root reference in btrfs_get_parent()
	btrfs: send: handle path ref underflow in header iterate_inode_ref()
	net/smc: reduce rtnl pressure in smc_pnet_create_pnetids_list()
	Bluetooth: btintel: Fix null ptr deref in btintel_read_version
	Input: synaptics-rmi4 - fail probing if memory allocation for "phys" fails
	pinctrl: renesas: checker: Limit cfg reg enum checks to provided IDs
	sysv: don't call sb_bread() with pointers_lock held
	scsi: lpfc: Fix possible memory leak in lpfc_rcv_padisc()
	isofs: handle CDs with bad root inode but good Joliet root directory
	media: sta2x11: fix irq handler cast
	ext4: add a hint for block bitmap corrupt state in mb_groups
	ext4: forbid commit inconsistent quota data when errors=remount-ro
	drm/amd/display: Fix nanosec stat overflow
	SUNRPC: increase size of rpc_wait_queue.qlen from unsigned short to unsigned int
	Revert "ACPI: PM: Block ASUS B1400CEAE from suspend to idle by default"
	libperf evlist: Avoid out-of-bounds access
	block: prevent division by zero in blk_rq_stat_sum()
	RDMA/cm: add timeout to cm_destroy_id wait
	Input: allocate keycode for Display refresh rate toggle
	platform/x86: touchscreen_dmi: Add an extra entry for a variant of the Chuwi Vi8 tablet
	ktest: force $buildonly = 1 for 'make_warnings_file' test type
	ring-buffer: use READ_ONCE() to read cpu_buffer->commit_page in concurrent environment
	tools: iio: replace seekdir() in iio_generic_buffer
	usb: typec: tcpci: add generic tcpci fallback compatible
	usb: sl811-hcd: only defined function checkdone if QUIRK2 is defined
	fbdev: viafb: fix typo in hw_bitblt_1 and hw_bitblt_2
	drivers/nvme: Add quirks for device 126f:2262
	fbmon: prevent division by zero in fb_videomode_from_videomode()
	netfilter: nf_tables: release batch on table validation from abort path
	netfilter: nf_tables: release mutex after nft_gc_seq_end from abort path
	netfilter: nf_tables: discard table flag update with pending basechain deletion
	tty: n_gsm: require CAP_NET_ADMIN to attach N_GSM0710 ldisc
	virtio: reenable config if freezing device failed
	x86/mm/pat: fix VM_PAT handling in COW mappings
	drm/i915/gt: Reset queue_priority_hint on parking
	Bluetooth: btintel: Fixe build regression
	VMCI: Fix possible memcpy() run-time warning in vmci_datagram_invoke_guest_handler()
	kbuild: dummy-tools: adjust to stricter stackprotector check
	scsi: sd: Fix wrong zone_write_granularity value during revalidate
	x86/retpoline: Add NOENDBR annotation to the SRSO dummy return thunk
	x86/head/64: Re-enable stack protection
	Linux 5.10.215

Change-Id: I45a0a9c4a0683ff5ef97315690f1f884f666e1b5
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2024-06-01 11:03:55 +00:00
Greg Kroah-Hartman
f21d21f05e Revert "block: add a new set_read_only method"
This reverts commit 2a0f8202f7 which is
commit e00adcadf3af7a8335026d71ab9f0e0a922191ac upstream.

It breaks the Android kernel abi and can be brought back in the future
in an abi-safe way if it is really needed.

Bug: 161946584
Change-Id: I20342421827a743bfb159577aa151698b101d17d
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2024-05-20 10:43:46 +00:00
Min Li
8f6dfa1f1e block: add check that partition length needs to be aligned with block size
commit 6f64f866aa1ae6975c95d805ed51d7e9433a0016 upstream.

Before calling add partition or resize partition, there is no check
on whether the length is aligned with the logical block size.
If the logical block size of the disk is larger than 512 bytes,
then the partition size maybe not the multiple of the logical block size,
and when the last sector is read, bio_truncate() will adjust the bio size,
resulting in an IO error if the size of the read command is smaller than
the logical block size.If integrity data is supported, this will also
result in a null pointer dereference when calling bio_integrity_free.

Cc:  <stable@vger.kernel.org>
Signed-off-by: Min Li <min15.li@samsung.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20230629142517.121241-1-min15.li@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Ashwin Dayanand Kamat <ashwin.kamat@broadcom.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-13 12:59:23 +02:00
Christoph Hellwig
2a0f8202f7 block: add a new set_read_only method
[ Upstream commit e00adcadf3af7a8335026d71ab9f0e0a922191ac ]

Add a new method to allow for driver-specific processing when setting or
clearing the block device read-only state.  This allows to replace the
cumbersome and error-prone override of the whole ioctl implementation.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Stable-dep-of: 9674f54e41ff ("md: Don't clear MD_CLOSING when the raid is about to stop")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-03-26 18:21:48 -04:00
Khazhismel Kumykov
8bedbc8f7f block/compat_ioctl: fix range check in BLKGETSIZE
commit ccf16413e520164eb718cf8b22a30438da80ff23 upstream.

kernel ulong and compat_ulong_t may not be same width. Use type directly
to eliminate mismatches.

This would result in truncation rather than EFBIG for 32bit mode for
large disks.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20220414224056.2875681-1-khazhy@google.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-27 13:53:57 +02:00
Christoph Hellwig
fc2454cc0c block: return -EBUSY when there are open partitions in blkdev_reread_part
[ Upstream commit 68e6582e8f2dc32fd2458b9926564faa1fb4560e ]

The switch to go through blkdev_get_by_dev means we now ignore the
return value from bdev_disk_changed in __blkdev_get.  Add a manual
check to restore the old semantics.

Fixes: 4601b4b130de ("block: reopen the device in blkdev_reread_part")
Reported-by: Karel Zak <kzak@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210421160502.447418-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-04-28 13:39:59 +02:00
Christoph Hellwig
cc88a819a1 block: reopen the device in blkdev_reread_part
[ Upstream commit 4601b4b130de2329fe06df80ed5d77265f2058e5 ]

Historically the BLKRRPART ioctls called into the now defunct ->revalidate
method, which caused the sd driver to check if any media is present.
When the ->revalidate method was removed this revalidation was lost,
leading to lots of I/O errors when using the eject command.  Fix this by
reopening the device to rescan the partitions, and thus calling the
revalidation logic in the sd driver.

Fixes: 471bd0af54 ("sd: use bdev_check_media_change")
Reported--by: Tom Seewald <tseewald@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Tom Seewald <tseewald@gmail.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04 11:38:21 +01:00
Christoph Hellwig
fa01b1e973 block: add a bdev_is_partition helper
Add a littler helper to make the somewhat arcane bd_contains checks a
little more obvious.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-25 08:18:57 -06:00
Christoph Hellwig
478162821d block: cleanup blkdev_bszset
Use blkdev_get_by_dev instead of bdgrab + blkdev_get.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-23 10:43:19 -06:00
Jan Kara
384d87ef2c block: Do not discard buffers under a mounted filesystem
Discarding blocks and buffers under a mounted filesystem is hardly
anything admin wants to do. Usually it will confuse the filesystem and
sometimes the loss of buffer_head state (including b_private field) can
even cause crashes like:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
PGD 0 P4D 0
Oops: 0002 [#1] SMP PTI
CPU: 4 PID: 203778 Comm: jbd2/dm-3-8 Kdump: loaded Tainted: G O     --------- -  - 4.18.0-147.5.0.5.h126.eulerosv2r9.x86_64 #1
Hardware name: Huawei RH2288H V3/BC11HGSA0, BIOS 1.57 08/11/2015
RIP: 0010:jbd2_journal_grab_journal_head+0x1b/0x40 [jbd2]
...
Call Trace:
 __jbd2_journal_insert_checkpoint+0x23/0x70 [jbd2]
 jbd2_journal_commit_transaction+0x155f/0x1b60 [jbd2]
 kjournald2+0xbd/0x270 [jbd2]

So if we don't have block device open with O_EXCL already, claim the
block device while we truncate buffer cache. This makes sure any
exclusive block device user (such as filesystem) cannot operate on the
device while we are discarding buffer cache.

Reported-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
[axboe: fix !CONFIG_BLOCK error in truncate_bdev_range()]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-07 20:10:55 -06:00
Bart Van Assche
c8210a5765 block: Fix type of first compat_put_{,u}long() argument
This patch fixes the following sparse warnings:

block/ioctl.c:209:16: warning: incorrect type in argument 1 (different address spaces)
block/ioctl.c:209:16:    expected void const volatile [noderef] <asn:1> *
block/ioctl.c:209:16:    got signed int [usertype] *argp
block/ioctl.c:214:16: warning: incorrect type in argument 1 (different address spaces)
block/ioctl.c:214:16:    expected void const volatile [noderef] <asn:1> *
block/ioctl.c:214:16:    got unsigned int [usertype] *argp
block/ioctl.c:666:40: warning: incorrect type in argument 1 (different address spaces)
block/ioctl.c:666:40:    expected signed int [usertype] *argp
block/ioctl.c:666:40:    got void [noderef] <asn:1> *argp
block/ioctl.c:672:41: warning: incorrect type in argument 1 (different address spaces)
block/ioctl.c:672:41:    expected unsigned int [usertype] *argp
block/ioctl.c:672:41:    got void [noderef] <asn:1> *argp

Fixes: 9b81648cb5 ("compat_ioctl: simplify up block/ioctl.c")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-19 09:40:29 -06:00
Christoph Hellwig
fa9156ae59 block: refactor blkpg_ioctl
Split each sub-command out into a separate helper, and move those helpers
to block/partitions/core.c instead of having a lot of partition
manipulation logic open coded in block/ioctl.c.

Signed-off-by: Christoph Hellwig <hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-04-20 11:32:59 -06:00
Christoph Hellwig
581e26004a block: move block layer internals out of include/linux/genhd.h
None of this needs to be exposed to drivers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-03-25 09:50:08 -06:00
Arnd Bergmann
9b81648cb5 compat_ioctl: simplify up block/ioctl.c
Having separate implementations of blkdev_ioctl() often leads to these
getting out of sync, despite the comment at the top.

Since most of the ioctl commands are compatible, and we try very hard
not to add any new incompatible ones, move all the common bits into a
shared function and leave only the ones that are historically different
in separate functions for native/compat mode.

To deal with the compat_ptr() conversion, pass both the integer
argument and the pointer argument into the new blkdev_common_ioctl()
and make sure to always use the correct one of these.

blkdev_ioctl() is now only kept as a separate exported interfact
for drivers/char/raw.c, which lacks a compat_ioctl variant.
We should probably either move raw.c to staging if there are no
more users, or export blkdev_compat_ioctl() as well.

Reviewed-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2020-01-03 09:42:52 +01:00
Arnd Bergmann
5fb889f587 compat_ioctl: block: simplify compat_blkpg_ioctl()
There is no need to go through a compat_alloc_user_space()
copy any more, just wrap the function in a small helper that
works the same way for native and compat mode.

Reviewed-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2020-01-03 09:42:52 +01:00
Arnd Bergmann
bdc1ddad3e compat_ioctl: block: move blkdev_compat_ioctl() into ioctl.c
Having both in the same file allows a number of simplifications
to the compat path, and makes it more likely that changes to
the native path get applied to the compat version as well.

Reviewed-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2020-01-03 09:42:52 +01:00
Arnd Bergmann
ee6a129dff compat_ioctl: block: add blkdev_compat_ptr_ioctl
A lot of block drivers need only a trivial .compat_ioctl callback.

Add a helper function that can be set as the callback pointer
to only convert the argument using the compat_ptr() conversion
and otherwise assume all input and output data is compatible,
or handled using in_compat_syscall() checks.

This mirrors the compat_ptr_ioctl() helper function used in
character devices.

Reviewed-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2020-01-03 09:32:59 +01:00
Christoph Hellwig
9b38bb4b1e block: simplify blkdev_nr_zones
Simplify the arguments to blkdev_nr_zones by passing a gendisk instead
of the block_device and capacity.  This also removes the need for
__blkdev_nr_zones as all callers are outside the fast path and can
deal with the additional branch.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-12-03 08:51:24 -07:00
Christoph Hellwig
f0b870df80 block: remove (__)blkdev_reread_part as an exported API
In general drivers should never mess with partition tables directly.
Unfortunately s390 and loop do for somewhat historic reasons, but they
can use bdev_disk_changed directly instead when we export it as they
satisfy the sanity checks we have in __blkdev_reread_part.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Stefan Haberland <sth@linux.ibm.com>	[dasd]
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-11-14 07:43:59 -07:00
Christoph Hellwig
142fe8f4bb block: fix bdev_disk_changed for non-partitioned devices
We still have to set the capacity to 0 if invalidating or call
revalidate_disk if not even if the disk has no partitions.  Fix
that by merging rescan_partitions into bdev_disk_changed and just
stubbing out blk_add_partitions and blk_drop_partitions for
non-partitioned devices.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-11-14 07:43:53 -07:00
Christoph Hellwig
6917d06899 block: merge invalidate_partitions into rescan_partitions
A lot of the logic in invalidate_partitions and rescan_partitions is
shared.  Merge the two functions to simplify things.  There is a small
behavior change in that we now send the kevent change notice also if we
were not invalidating but no partitions were found, which seems like
the right thing to do.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-11-14 07:42:41 -07:00
Ajay Joshi
e876df1fe0 block: add zone open, close and finish ioctl support
Introduce three new ioctl commands BLKOPENZONE, BLKCLOSEZONE and
BLKFINISHZONE to allow applications to control the condition of zones
on a zoned block device through the execution of the REQ_OP_ZONE_OPEN,
REQ_OP_ZONE_CLOSE and REQ_OP_ZONE_FINISH operations.

Contains contributions from Matias Bjorling, Hans Holmberg,
Dmitry Fomichev, Keith Busch, Damien Le Moal and Christoph Hellwig.

Reviewed-by: Javier González <javier@javigon.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ajay Joshi <ajay.joshi@wdc.com>
Signed-off-by: Matias Bjorling <matias.bjorling@wdc.com>
Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com>
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-11-07 06:31:50 -07:00
Christoph Hellwig
3dcf60bcb6 block: add SPDX tags to block layer files missing licensing information
Various block layer files do not have any licensing information at all.
Add SPDX tags for the default kernel GPLv2 license to those.

Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-30 16:12:03 -06:00
Damien Le Moal
65e4e3eee8 block: Introduce BLKGETNRZONES ioctl
Get a zoned block device total number of zones. The device can be a
partition of the whole device. The number of zones is always 0 for
regular block devices.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-25 11:17:40 -06:00
Damien Le Moal
72cd87576d block: Introduce BLKGETZONESZ ioctl
Get a zoned block device zone size in number of 512 B sectors.
The zone size is always 0 for regular block devices.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-25 11:17:40 -06:00
Ming Lei
0bd1ed4860 block: pass inclusive 'lend' parameter to truncate_inode_pages_range
The 'lend' parameter of truncate_inode_pages_range is required to be
inclusive, so follow the rule.

This patch fixes one memory corruption triggered by discard.

Cc: <stable@vger.kernel.org>
Cc: Dmitry Monakhov <dmonakhov@openvz.org>
Fixes: 351499a172 ("block: Invalidate cache on discard v2")
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-02-23 15:20:19 -07:00
Ilya Dryomov
bb749b31c2 block: move CAP_SYS_ADMIN check in blkdev_roset()
Check for CAP_SYS_ADMIN before calling into the driver, similar to
blkdev_flushbuf().  This is safer and can spare a check in the driver.

(Currently BLKROSET is overridden by md and rbd, rbd is missing the
check.  md has the check, but it covers a lot more than BLKROSET.)

Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-10-25 12:25:00 -06:00
Dmitry Monakhov
351499a172 block: Invalidate cache on discard v2
It is reasonable drop page cache on discard, otherwise that pages may
be written by writeback second later, so thin provision devices will
not be happy. This seems to be a  security leak in case of secure discard case.

Also add check for queue_discard flag on early stage.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-10-24 18:44:57 -06:00
Christoph Hellwig
48920ff2a5 block: remove the discard_zeroes_data flag
Now that we use the proper REQ_OP_WRITE_ZEROES operation everywhere we can
kill this hack.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-08 11:25:38 -06:00
Christoph Hellwig
ee472d835c block: add a flags argument to (__)blkdev_issue_zeroout
Turn the existing discard flag into a new BLKDEV_ZERO_UNMAP flag with
similar semantics, but without referring to diѕcard.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-08 11:25:38 -06:00
Jan Kara
efa7c9f97e block: Get rid of blk_get_backing_dev_info()
blk_get_backing_dev_info() is now a simple dereference. Remove that
function and simplify some code around that.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-02 08:21:32 -07:00
Linus Torvalds
7c0f6ba682 Replace <asm/uaccess.h> with <linux/uaccess.h> globally
This was entirely automated, using the script by Al:

  PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
  sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
        $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-24 11:46:01 -08:00
Stefan Haberland
633395b67b block: check partition alignment
Partitions that are not aligned to the blocksize of a device may cause
invalid I/O requests because the blocklayer cares only about alignment
within the partition when building requests on partitions.

device
|--------4096--------|--------4096--------|--------4096--------|
partition offset 512byte
|-512-|--------4096--------|--------4096--------|--------4096--------|

When reading/writing one 4k block of the partition this maps to
reading/writing with an offset of 512 byte of the device leading to
unaligned requests for the device which in turn may cause unexpected
behavior of the device driver.

For DASD devices we have to translate the block number into a cylinder,
head, record format. The unaligned requests lead to wrong calculation
and therefore to misdirected I/O. In a "good" case this leads to I/O
errors because the underlying hardware detects the wrong addressing.
In a worst case scenario this might destroy data on the device.

To prevent partitions that are not aligned to the physical blocksize
of a device check for the alignment in the blkpg_ioctl.

Signed-off-by: Stefan Haberland <sth@linux.vnet.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-12-19 09:20:59 -07:00
Shaun Tancheff
3ed05a987e blk-zoned: implement ioctls
Adds the new BLKREPORTZONE and BLKRESETZONE ioctls for respectively
obtaining the zone configuration of a zoned block device and resetting
the write pointer of sequential zones of a zoned block device.

The BLKREPORTZONE ioctl maps directly to a single call of the function
blkdev_report_zones. The zone information result is passed as an array
of struct blk_zone identical to the structure used internally for
processing the REQ_OP_ZONE_REPORT operation.  The BLKRESETZONE ioctl
maps to a single call of the blkdev_reset_zones function.

Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-10-18 10:05:42 -06:00
Darrick J. Wong
22dd6d3566 block: invalidate the page cache when issuing BLKZEROOUT
Patch series "fallocate for block devices", v11.

This is a patchset to fix page cache coherency with BLKZEROOUT and
implement fallocate for block devices.

The first patch is a fix to the existing BLKZEROOUT ioctl to invalidate
the page cache if the zeroing command to the underlying device succeeds.
Without this patch we still have the pagecache coherence bug that's been
in the kernel forever.

The second patch changes the internal block device functions to reject
attempts to discard or zeroout that are not aligned to the logical block
size.  Previously, we only checked that the start/len parameters were
512-byte aligned, which caused kernel BUG_ONs for unaligned IOs to 4k-LBA
devices.

The third patch creates an fallocate handler for block devices, wires up
the FALLOC_FL_PUNCH_HOLE flag to zeroing-discard, and connects
FALLOC_FL_ZERO_RANGE to write-same so that we can have a consistent
fallocate interface between files and block devices.  It also allows the
combination of PUNCH_HOLE and NO_HIDE_STALE to invoke non-zeroing discard.

Test cases for the new block device fallocate are now in xfstests as
generic/349-351.

This patch (of 3):

Invalidate the page cache (as a regular O_DIRECT write would do) to avoid
returning stale cache contents at a later time.

Link: http://lkml.kernel.org/r/147518378313.22791.16649519283678515021.stgit@birch.djwong.org
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Brian Foster <bfoster@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11 15:06:30 -07:00
Linus Torvalds
315227f6da DAX error handling for 4.7
- Until now, dax has been disabled if media errors were found on
   any device. This enables the use of DAX in the presence of these
   errors by making all sector-aligned zeroing go through the driver.
 - The driver (already) has the ability to clear errors on writes that
   are sent through the block layer using 'DSMs' defined in ACPI 6.1.
 
 Other misc changes:
 
 - When mounting DAX filesystems, check to make sure the partition
   is page aligned. This is a requirement for DAX, and previously, we
   allowed such unaligned mounts to succeed, but subsequent reads/writes
   would fail.
 
 - Misc/cleanup fixes from Jan that remove unused code from DAX related to
   zeroing, writeback, and some size checks.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXQ4GKAAoJEHr6Yb6juE3/zowP/iclIhgXXXMQJRUHJlePMXC8
 15sGZ32JS1ak9g7vrsmNVEDNynfNtiMYdBxtUyRuj6xqgwdZvFk3F55KOCPtaeA1
 +yADkgeRkTAcwzmHw9WQVEzBCqyzSisdrwtEfH817qdq9FJdH66x2Kos6i+HeAVr
 5Q/e4gs7lKrjf384/QBl+wxNZOndJaQAPd2VRHQqx2A9F33v0ljdwRaUG1r4fjK2
 dtmhcZCqdQyuAGXW3piTnZc5ZFc3DPqO4FkEfqkEK3lFOflK0fd8wMsAZRp/Jd0j
 GJsgnVSWSqG0Dz476djlG0w8t2p5Jv1g9cKChV+ZZEdFLKWHCOUFqXNj8uI8I4k5
 cOEKCHyJ3IwfSHhNQqktEWrQN4T8ZXhWtuc9GuV4UZYuqJqHci6EdR/YsWsJjV+L
 lm/qvK4ipDS1pivxOy8KX/iN0z7Io8J9GXpStDx3g8iWjLlh4YYlbJLWeeRepo/z
 aPlV/QAKcHiGY6jzLExrZIyCWkzwo6O+0p1Kxerv9/7K/32HWbOodZ+tC8eD+N25
 pV69nCGf+u50T2TtIx1+iann4NC1r7zg5yqnT9AgpyZpiwR5joCDzI5sXW+D0rcS
 vPtfM84Ccdeq/e6mvfIpZgR0/npQapKnrmUest0J7P2BFPHiFPji1KzZ7M+1aFOo
 9R6JdrAj0Sc+FBa+cGzH
 =v6Of
 -----END PGP SIGNATURE-----

Merge tag 'dax-misc-for-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull misc DAX updates from Vishal Verma:
 "DAX error handling for 4.7

   - Until now, dax has been disabled if media errors were found on any
     device.  This enables the use of DAX in the presence of these
     errors by making all sector-aligned zeroing go through the driver.

   - The driver (already) has the ability to clear errors on writes that
     are sent through the block layer using 'DSMs' defined in ACPI 6.1.

  Other misc changes:

   - When mounting DAX filesystems, check to make sure the partition is
     page aligned.  This is a requirement for DAX, and previously, we
     allowed such unaligned mounts to succeed, but subsequent
     reads/writes would fail.

   - Misc/cleanup fixes from Jan that remove unused code from DAX
     related to zeroing, writeback, and some size checks"

* tag 'dax-misc-for-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  dax: fix a comment in dax_zero_page_range and dax_truncate_page
  dax: for truncate/hole-punch, do zeroing through the driver if possible
  dax: export a low-level __dax_zero_page_range helper
  dax: use sb_issue_zerout instead of calling dax_clear_sectors
  dax: enable dax in the presence of known media errors (badblocks)
  dax: fallback from pmd to pte on error
  block: Update blkdev_dax_capable() for consistency
  xfs: Add alignment check for DAX mount
  ext2: Add alignment check for DAX mount
  ext4: Add alignment check for DAX mount
  block: Add bdev_dax_supported() for dax mount checks
  block: Add vfs_msg() interface
  dax: Remove redundant inode size checks
  dax: Remove pointless writeback from dax_do_io()
  dax: Remove zeroing from dax_io()
  dax: Remove dead zeroing code from fault handlers
  ext2: Avoid DAX zeroing to corrupt data
  ext2: Fix block zeroing in ext2_get_blocks() for DAX
  dax: Remove complete_unwritten argument
  DAX: move RADIX_DAX_ definitions to dax.c
2016-05-26 19:34:26 -07:00
Dan Williams
acc93d30d7 Revert "block: enable dax for raw block devices"
This reverts commit 5a023cdba5.

The functionality is superseded by the new "Device DAX" facility.

Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-05-20 22:02:56 -07:00
Toshi Kani
a8078b1fc6 block: Update blkdev_dax_capable() for consistency
blkdev_dax_capable() is similar to bdev_dax_supported(), but needs
to remain as a separate interface for checking dax capability of
a raw block device.

Rename and relocate blkdev_dax_capable() to keep them maintained
consistently, and call bdev_direct_access() for the dax capability
check.

There is no change in the behavior.

Link: https://lkml.org/lkml/2016/5/9/950
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Jens Axboe <axboe@fb.com>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Jan Kara <jack@suse.cz>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Boaz Harrosh <boaz@plexistor.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
2016-05-17 00:44:13 -06:00
Kirill A. Shutemov
09cbfeaf1a mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
ago with promise that one day it will be possible to implement page
cache with bigger chunks than PAGE_SIZE.

This promise never materialized.  And unlikely will.

We have many places where PAGE_CACHE_SIZE assumed to be equal to
PAGE_SIZE.  And it's constant source of confusion on whether
PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
especially on the border between fs and mm.

Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
breakage to be doable.

Let's stop pretending that pages in page cache are special.  They are
not.

The changes are pretty straight-forward:

 - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

 - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

 - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};

 - page_cache_get() -> get_page();

 - page_cache_release() -> put_page();

This patch contains automated changes generated with coccinelle using
script below.  For some reason, coccinelle doesn't patch header files.
I've called spatch for them manually.

The only adjustment after coccinelle is revert of changes to
PAGE_CAHCE_ALIGN definition: we are going to drop it later.

There are few places in the code where coccinelle didn't reach.  I'll
fix them manually in a separate patch.  Comments and documentation also
will be addressed with the separate patch.

virtual patch

@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E

@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E

@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT

@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE

@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK

@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)

@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)

@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-04 10:41:08 -07:00
Dan Williams
9f4736fe7c block: revert runtime dax control of the raw block device
Dynamically enabling DAX requires that the page cache first be flushed
and invalidated.  This must occur atomically with the change of DAX mode
otherwise we confuse the fsync/msync tracking and violate data
durability guarantees.  Eliminate the possibilty of DAX-disabled to
DAX-enabled transitions for now and revisit this for the next cycle.

Cc: Jan Kara <jack@suse.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-01-30 13:35:31 -08:00
Al Viro
5955102c99 wrappers for ->i_mutex access
parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
inode_foo(inode) being mutex_foo(&inode->i_mutex).

Please, use those for access to ->i_mutex; over the coming cycle
->i_mutex will become rwsem, with ->lookup() done with it held
only shared.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-01-22 18:04:28 -05:00
Dan Williams
57f7f317ab pmem, dax: disable dax in the presence of bad blocks
Longer term teach dax to punch "error" holes in mapping requests and
deliver SIGBUS to applications that consume a bad pmem page.  For now,
simply disable the dax performance optimization in the presence of known
errors.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-01-09 22:42:31 -08:00
Dan Williams
5a023cdba5 block: enable dax for raw block devices
If an application wants exclusive access to all of the persistent memory
provided by an NVDIMM namespace it can use this raw-block-dax facility
to forgo establishing a filesystem.  This capability is targeted
primarily to hypervisors wanting to provision persistent memory for
guests.  It can be disabled / enabled dynamically via the new BLKDAXSET
ioctl.

Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Reviewed-by: Jan Kara <jack@suse.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-01-09 06:30:49 -08:00
Christoph Hellwig
bbd3e06436 block: add an API for Persistent Reservations
This commits adds a driver API and ioctls for controlling Persistent
Reservations s/genericly/generically/ at the block layer.  Persistent
Reservations are supported by SCSI and NVMe and allow controlling who gets
access to a device in a shared storage setup.

Note that we add a pr_ops structure to struct block_device_operations
instead of adding the members directly to avoid bloating all instances
of devices that will never support Persistent Reservations.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-10-21 14:46:56 -06:00
Christoph Hellwig
d8e4bb8103 block: cleanup blkdev_ioctl
Split out helpers for all non-trivial ioctls to make this function simpler,
and also start passing around a pointer version of the argument, as that's
what most ioctl handlers actually need.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-10-21 14:46:55 -06:00
Ming Lei
b04a5636a6 block: replace trylock with mutex_lock in blkdev_reread_part()
The only possible problem of using mutex_lock() instead of trylock
is about deadlock.

If there aren't any locks held before calling blkdev_reread_part(),
deadlock can't be caused by this conversion.

If there are locks held before calling blkdev_reread_part(),
and if these locks arn't required in open, close handler and I/O
path, deadlock shouldn't be caused too.

Both user space's ioctl(BLKRRPART) and md_setup_drive() from
init/do_mounts_md.c belongs to the 1st case, so the conversion is safe
for the two cases.

For loop, the previous patches in this pathset has fixed the ABBA lock
dependency, so the conversion is OK.

For nbd, tx_lock is held when calling the function:

	- both open and release won't hold the lock
	- when blkdev_reread_part() is run, I/O thread has been stopped
	already, so tx_lock won't be acquired in I/O path at that time.
	- so the conversion won't cause deadlock for nbd

For dasd, both dasd_open(), dasd_release() and request function don't
acquire any mutex/semphone, so the conversion should be safe.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jarod Wilson <jarod@redhat.com>
Acked-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-05-20 09:05:45 -06:00
Jarod Wilson
be32417796 block: export blkdev_reread_part() and __blkdev_reread_part()
This patch exports blkdev_reread_part() for block drivers, also
introduce __blkdev_reread_part().

For some drivers, such as loop, reread of partitions can be run
from the release path, and bd_mutex may already be held prior to
calling ioctl_by_bdev(bdev, BLKRRPART, 0), so introduce
__blkdev_reread_part for use in such cases.

CC: Christoph Hellwig <hch@lst.de>
CC: Jens Axboe <axboe@kernel.dk>
CC: Tejun Heo <tj@kernel.org>
CC: Alexander Viro <viro@zeniv.linux.org.uk>
CC: Markus Pargmann <mpa@pengutronix.de>
CC: Stefan Weinhuber <wein@de.ibm.com>
CC: Stefan Haberland <stefan.haberland@de.ibm.com>
CC: Sebastian Ott <sebott@linux.vnet.ibm.com>
CC: Fabian Frederick <fabf@skynet.be>
CC: Ming Lei <ming.lei@canonical.com>
CC: David Herrmann <dh.herrmann@gmail.com>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Peter Zijlstra <peterz@infradead.org>
CC: nbd-general@lists.sourceforge.net
CC: linux-s390@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-05-20 09:05:42 -06:00
Martin K. Petersen
d93ba7a5a9 block: Add discard flag to blkdev_issue_zeroout() function
blkdev_issue_discard() will zero a given block range. This is done by
way of explicit writing, thus provisioning or allocating the blocks on
disk.

There are use cases where the desired behavior is to zero the blocks but
unprovision them if possible. The blocks must deterministically contain
zeroes when they are subsequently read back.

This patch adds a flag to blkdev_issue_zeroout() that provides this
variant. If the discard flag is set and a block device guarantees
discard_zeroes_data we will use REQ_DISCARD to clear the block range. If
the device does not support discard_zeroes_data or if the discard
request fails we will fall back to first REQ_WRITE_SAME and then a
regular REQ_WRITE.

Also update the callers of blkdev_issue_zero() to reflect the new flag
and make sb_issue_zeroout() prefer the discard approach.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-01-21 10:41:46 -07:00