Commit Graph

159 Commits

Author SHA1 Message Date
Greg Kroah-Hartman
6136b834d6 Merge cdd86fb75f ("net/mlx5: Added cond_resched() to crdump collection") into android12-5.10-lts
Steps on the way to 5.10.227

Change-Id: I780b041f7c72ac3204110981ba8c0ce36764d971
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2024-11-14 09:58:25 +00:00
David Gow
52f7cab290 mm: only enforce minimum stack gap size if it's sensible
commit 69b50d4351ed924f29e3d46b159e28f70dfc707f upstream.

The generic mmap_base code tries to leave a gap between the top of the
stack and the mmap base address, but enforces a minimum gap size (MIN_GAP)
of 128MB, which is too large on some setups.  In particular, on arm tasks
without ADDR_LIMIT_32BIT, the STACK_TOP value is less than 128MB, so it's
impossible to fit such a gap in.

Only enforce this minimum if MIN_GAP < MAX_GAP, as we'd prefer to honour
MAX_GAP, which is defined proportionally, so scales better and always
leaves us with both _some_ stack space and some room for mmap.

This fixes the usercopy KUnit test suite on 32-bit arm, as it doesn't set
any personality flags so gets the default (in this case 26-bit) task size.
This test can be run with: ./tools/testing/kunit/kunit.py run --arch arm
usercopy --make_options LLVM=1

Link: https://lkml.kernel.org/r/20240803074642.1849623-2-davidgow@google.com
Fixes: dba79c3df4 ("arm: use generic mmap top-down layout and brk randomization")
Signed-off-by: David Gow <davidgow@google.com>
Reviewed-by: Kees Cook <kees@kernel.org>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-17 15:08:05 +02:00
Greg Kroah-Hartman
66e91da883 This is the 5.10.210 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmXYTLkACgkQONu9yGCS
 aT4+fhAAqqR/Cvx53ZKMQ8GZTCudAZnr/Dz6kWYwxhhhIbQjDpCaf9mgsrEDaQS2
 ancSZjzYaOUIXq/IsthXxQIUhiZbuM3iuSEi7+odWgSYdkFyzuUt8MWLBGSaB5Er
 ojn+APtq7vPXTSnp7uMwqMC3/BHCKkeYIjRVevhhHBKG5d3lzkV1xU8NcvMkLaly
 CIRxpWXD3w2b7K0GEbb/zN1GQEHDCQcxjuaJoe/5FKGJkqd3T31eyiJTRumCCMcz
 j8vkGkYmcMJpWf04iLgVA1p13I5/HGrXdEBI/GutN8IABIC3Cp42jW8phHYKW5ZM
 a4R25LZG5buND1Ubpq+EDrYn3EaPek5XRki0w8ZAXfNa3rYc+N6mQjkzNSOzhJ/5
 VNsn3EAE1Dwtar5Z3ASe9ugDbh+0bgx85PbfaADK88V+qWb3DVr1TBWmDNu2vfVP
 rv4I0EKu9r3vOE8aNMEBuhAVkIK3mEQUxwab6RKNrMby/5Uwa+ugrrUtQd8V+T1S
 j6r6v7u7aZ8mhYO7d6WSvAKL85lCWGbs3WRIKCJZmDRyqWrWW9tVWRN9wrZ2QnRr
 iaCQKk8P474P7/j1zwnmih8l4wS1oszveNziWwd0fi1Nn/WQYM+JKYQvpuQijmQ+
 J9jLyWo7a59zffIE6mzJdNwFy9hlw9X+VnJmExk/Q88Z7Bt5wPQ=
 =laYd
 -----END PGP SIGNATURE-----

Merge 5.10.210 into android12-5.10-lts

Changes in 5.10.210
	usb: cdns3: Fixes for sparse warnings
	usb: cdns3: fix uvc failure work since sg support enabled
	usb: cdns3: fix incorrect calculation of ep_buf_size when more than one config
	usb: cdns3: fix iso transfer error when mult is not zero
	usb: cdns3: Fix uvc fail when DMA cross 4k boundery since sg enabled
	PCI: mediatek: Clear interrupt status before dispatching handler
	units: change from 'L' to 'UL'
	units: add the HZ macros
	serial: sc16is7xx: set safe default SPI clock frequency
	spi: introduce SPI_MODE_X_MASK macro
	serial: sc16is7xx: add check for unsupported SPI modes during probe
	iio: adc: ad7091r: Set alert bit in config register
	iio: adc: ad7091r: Allow users to configure device events
	iio: adc: ad7091r: Enable internal vref if external vref is not supplied
	dmaengine: fix NULL pointer in channel unregistration function
	iio:adc:ad7091r: Move exports into IIO_AD7091R namespace.
	ext4: allow for the last group to be marked as trimmed
	crypto: api - Disallow identical driver names
	PM: hibernate: Enforce ordering during image compression/decompression
	hwrng: core - Fix page fault dead lock on mmap-ed hwrng
	crypto: s390/aes - Fix buffer overread in CTR mode
	rpmsg: virtio: Free driver_override when rpmsg_remove()
	bus: mhi: host: Drop chan lock before queuing buffers
	parisc/firmware: Fix F-extend for PDC addresses
	async: Split async_schedule_node_domain()
	async: Introduce async_schedule_dev_nocall()
	arm64: dts: qcom: sdm845: fix USB wakeup interrupt types
	arm64: dts: qcom: sdm845: fix USB DP/DM HS PHY interrupts
	lsm: new security_file_ioctl_compat() hook
	scripts/get_abi: fix source path leak
	mmc: core: Use mrq.sbc in close-ended ffu
	mmc: mmc_spi: remove custom DMA mapped buffers
	rtc: Adjust failure return code for cmos_set_alarm()
	nouveau/vmm: don't set addr on the fail path to avoid warning
	ubifs: ubifs_symlink: Fix memleak of inode->i_link in error path
	rename(): fix the locking of subdirectories
	block: Remove special-casing of compound pages
	stddef: Introduce DECLARE_FLEX_ARRAY() helper
	smb3: Replace smb2pdu 1-element arrays with flex-arrays
	mm: vmalloc: introduce array allocation functions
	KVM: use __vcalloc for very large allocations
	net/smc: fix illegal rmb_desc access in SMC-D connection dump
	tcp: make sure init the accept_queue's spinlocks once
	bnxt_en: Wait for FLR to complete during probe
	vlan: skip nested type that is not IFLA_VLAN_QOS_MAPPING
	llc: make llc_ui_sendmsg() more robust against bonding changes
	llc: Drop support for ETH_P_TR_802_2.
	net/rds: Fix UBSAN: array-index-out-of-bounds in rds_cmsg_recv
	tracing: Ensure visibility when inserting an element into tracing_map
	afs: Hide silly-rename files from userspace
	tcp: Add memory barrier to tcp_push()
	netlink: fix potential sleeping issue in mqueue_flush_file
	ipv6: init the accept_queue's spinlocks in inet6_create
	net/mlx5: DR, Use the right GVMI number for drop action
	net/mlx5e: fix a double-free in arfs_create_groups
	netfilter: nf_tables: restrict anonymous set and map names to 16 bytes
	netfilter: nf_tables: validate NFPROTO_* family
	net: mvpp2: clear BM pool before initialization
	selftests: netdevsim: fix the udp_tunnel_nic test
	fjes: fix memleaks in fjes_hw_setup
	net: fec: fix the unhandled context fault from smmu
	btrfs: ref-verify: free ref cache before clearing mount opt
	btrfs: tree-checker: fix inline ref size in error messages
	btrfs: don't warn if discard range is not aligned to sector
	btrfs: defrag: reject unknown flags of btrfs_ioctl_defrag_range_args
	btrfs: don't abort filesystem when attempting to snapshot deleted subvolume
	rbd: don't move requests to the running list on errors
	exec: Fix error handling in begin_new_exec()
	wifi: iwlwifi: fix a memory corruption
	netfilter: nft_chain_filter: handle NETDEV_UNREGISTER for inet/ingress basechain
	netfilter: nf_tables: reject QUEUE/DROP verdict parameters
	gpiolib: acpi: Ignore touchpad wakeup on GPD G1619-04
	drm: Don't unref the same fb many times by mistake due to deadlock handling
	drm/bridge: nxp-ptn3460: fix i2c_master_send() error checking
	drm/tidss: Fix atomic_flush check
	drm/bridge: nxp-ptn3460: simplify some error checking
	PM: sleep: Use dev_printk() when possible
	PM: sleep: Avoid calling put_device() under dpm_list_mtx
	PM: core: Remove unnecessary (void *) conversions
	PM: sleep: Fix possible deadlocks in core system-wide PM code
	fs/pipe: move check to pipe_has_watch_queue()
	pipe: wakeup wr_wait after setting max_usage
	ARM: dts: samsung: exynos4210-i9100: Unconditionally enable LDO12
	arm64: dts: qcom: sc7180: Use pdc interrupts for USB instead of GIC interrupts
	arm64: dts: qcom: sc7180: fix USB wakeup interrupt types
	media: mtk-jpeg: Fix use after free bug due to error path handling in mtk_jpeg_dec_device_run
	mm: use __pfn_to_section() instead of open coding it
	mm/sparsemem: fix race in accessing memory_section->usage
	btrfs: remove err variable from btrfs_delete_subvolume
	btrfs: avoid copying BTRFS_ROOT_SUBVOL_DEAD flag to snapshot of subvolume being deleted
	drm: panel-simple: add missing bus flags for Tianma tm070jvhg[30/33]
	drm/exynos: fix accidental on-stack copy of exynos_drm_plane
	drm/exynos: gsc: minor fix for loop iteration in gsc_runtime_resume
	gpio: eic-sprd: Clear interrupt after set the interrupt type
	spi: bcm-qspi: fix SFDP BFPT read by usig mspi read
	mips: Call lose_fpu(0) before initializing fcr31 in mips_set_personality_nan
	tick/sched: Preserve number of idle sleeps across CPU hotplug events
	x86/entry/ia32: Ensure s32 is sign extended to s64
	powerpc/mm: Fix null-pointer dereference in pgtable_cache_add
	drivers/perf: pmuv3: don't expose SW_INCR event in sysfs
	powerpc: Fix build error due to is_valid_bugaddr()
	powerpc/mm: Fix build failures due to arch_reserved_kernel_pages()
	x86/boot: Ignore NMIs during very early boot
	powerpc: pmd_move_must_withdraw() is only needed for CONFIG_TRANSPARENT_HUGEPAGE
	powerpc/lib: Validate size for vector operations
	x86/mce: Mark fatal MCE's page as poison to avoid panic in the kdump kernel
	perf/core: Fix narrow startup race when creating the perf nr_addr_filters sysfs file
	debugobjects: Stop accessing objects after releasing hash bucket lock
	regulator: core: Only increment use_count when enable_count changes
	audit: Send netlink ACK before setting connection in auditd_set
	ACPI: video: Add quirk for the Colorful X15 AT 23 Laptop
	PNP: ACPI: fix fortify warning
	ACPI: extlog: fix NULL pointer dereference check
	PM / devfreq: Synchronize devfreq_monitor_[start/stop]
	ACPI: APEI: set memory failure flags as MF_ACTION_REQUIRED on synchronous events
	FS:JFS:UBSAN:array-index-out-of-bounds in dbAdjTree
	UBSAN: array-index-out-of-bounds in dtSplitRoot
	jfs: fix slab-out-of-bounds Read in dtSearch
	jfs: fix array-index-out-of-bounds in dbAdjTree
	jfs: fix uaf in jfs_evict_inode
	pstore/ram: Fix crash when setting number of cpus to an odd number
	crypto: stm32/crc32 - fix parsing list of devices
	afs: fix the usage of read_seqbegin_or_lock() in afs_lookup_volume_rcu()
	afs: fix the usage of read_seqbegin_or_lock() in afs_find_server*()
	rxrpc_find_service_conn_rcu: fix the usage of read_seqbegin_or_lock()
	jfs: fix array-index-out-of-bounds in diNewExt
	s390/ptrace: handle setting of fpc register correctly
	KVM: s390: fix setting of fpc register
	SUNRPC: Fix a suspicious RCU usage warning
	ecryptfs: Reject casefold directory inodes
	ext4: fix inconsistent between segment fstrim and full fstrim
	ext4: unify the type of flexbg_size to unsigned int
	ext4: remove unnecessary check from alloc_flex_gd()
	ext4: avoid online resizing failures due to oversized flex bg
	wifi: rt2x00: restart beacon queue when hardware reset
	selftests/bpf: satisfy compiler by having explicit return in btf test
	selftests/bpf: Fix pyperf180 compilation failure with clang18
	scsi: lpfc: Fix possible file string name overflow when updating firmware
	PCI: Add no PM reset quirk for NVIDIA Spectrum devices
	bonding: return -ENOMEM instead of BUG in alb_upper_dev_walk
	scsi: arcmsr: Support new PCI device IDs 1883 and 1886
	ARM: dts: imx7d: Fix coresight funnel ports
	ARM: dts: imx7s: Fix lcdif compatible
	ARM: dts: imx7s: Fix nand-controller #size-cells
	wifi: ath9k: Fix potential array-index-out-of-bounds read in ath9k_htc_txstatus()
	bpf: Add map and need_defer parameters to .map_fd_put_ptr()
	scsi: libfc: Don't schedule abort twice
	scsi: libfc: Fix up timeout error in fc_fcp_rec_error()
	bpf: Set uattr->batch.count as zero before batched update or deletion
	ARM: dts: rockchip: fix rk3036 hdmi ports node
	ARM: dts: imx25/27-eukrea: Fix RTC node name
	ARM: dts: imx: Use flash@0,0 pattern
	ARM: dts: imx27: Fix sram node
	ARM: dts: imx1: Fix sram node
	ionic: pass opcode to devcmd_wait
	block/rnbd-srv: Check for unlikely string overflow
	ARM: dts: imx25: Fix the iim compatible string
	ARM: dts: imx25/27: Pass timing0
	ARM: dts: imx27-apf27dev: Fix LED name
	ARM: dts: imx23-sansa: Use preferred i2c-gpios properties
	ARM: dts: imx23/28: Fix the DMA controller node name
	net: dsa: mv88e6xxx: Fix mv88e6352_serdes_get_stats error path
	block: prevent an integer overflow in bvec_try_merge_hw_page
	md: Whenassemble the array, consult the superblock of the freshest device
	arm64: dts: qcom: msm8996: Fix 'in-ports' is a required property
	arm64: dts: qcom: msm8998: Fix 'out-ports' is a required property
	wifi: rtl8xxxu: Add additional USB IDs for RTL8192EU devices
	wifi: rtlwifi: rtl8723{be,ae}: using calculate_bit_shift()
	wifi: cfg80211: free beacon_ies when overridden from hidden BSS
	Bluetooth: qca: Set both WIDEBAND_SPEECH and LE_STATES quirks for QCA2066
	Bluetooth: L2CAP: Fix possible multiple reject send
	i40e: Fix VF disable behavior to block all traffic
	f2fs: fix to check return value of f2fs_reserve_new_block()
	ALSA: hda: Refer to correct stream index at loops
	ASoC: doc: Fix undefined SND_SOC_DAPM_NOPM argument
	fast_dput(): handle underflows gracefully
	RDMA/IPoIB: Fix error code return in ipoib_mcast_join
	drm/amd/display: Fix tiled display misalignment
	f2fs: fix write pointers on zoned device after roll forward
	drm/drm_file: fix use of uninitialized variable
	drm/framebuffer: Fix use of uninitialized variable
	drm/mipi-dsi: Fix detach call without attach
	media: stk1160: Fixed high volume of stk1160_dbg messages
	media: rockchip: rga: fix swizzling for RGB formats
	PCI: add INTEL_HDA_ARL to pci_ids.h
	ALSA: hda: Intel: add HDA_ARL PCI ID support
	ALSA: hda: intel-dspcfg: add filters for ARL-S and ARL
	drm/exynos: Call drm_atomic_helper_shutdown() at shutdown/unbind time
	IB/ipoib: Fix mcast list locking
	media: ddbridge: fix an error code problem in ddb_probe
	drm/msm/dpu: Ratelimit framedone timeout msgs
	clk: hi3620: Fix memory leak in hi3620_mmc_clk_init()
	clk: mmp: pxa168: Fix memory leak in pxa168_clk_init()
	watchdog: it87_wdt: Keep WDTCTRL bit 3 unmodified for IT8784/IT8786
	drm/amdgpu: Let KFD sync with VM fences
	drm/amdgpu: Drop 'fence' check in 'to_amdgpu_amdkfd_fence()'
	leds: trigger: panic: Don't register panic notifier if creating the trigger failed
	um: Fix naming clash between UML and scheduler
	um: Don't use vfprintf() for os_info()
	um: net: Fix return type of uml_net_start_xmit()
	i3c: master: cdns: Update maximum prescaler value for i2c clock
	xen/gntdev: Fix the abuse of underlying struct page in DMA-buf import
	mfd: ti_am335x_tscadc: Fix TI SoC dependencies
	PCI: Only override AMD USB controller if required
	PCI: switchtec: Fix stdev_release() crash after surprise hot remove
	usb: hub: Replace hardcoded quirk value with BIT() macro
	tty: allow TIOCSLCKTRMIOS with CAP_CHECKPOINT_RESTORE
	fs/kernfs/dir: obey S_ISGID
	PCI/AER: Decode Requester ID when no error info found
	libsubcmd: Fix memory leak in uniq()
	virtio_net: Fix "‘%d’ directive writing between 1 and 11 bytes into a region of size 10" warnings
	blk-mq: fix IO hang from sbitmap wakeup race
	ceph: fix deadlock or deadcode of misusing dget()
	drm/amd/powerplay: Fix kzalloc parameter 'ATOM_Tonga_PPM_Table' in 'get_platform_power_management_table()'
	drm/amdgpu: Release 'adev->pm.fw' before return in 'amdgpu_device_need_post()'
	perf: Fix the nr_addr_filters fix
	wifi: cfg80211: fix RCU dereference in __cfg80211_bss_update
	drm: using mul_u32_u32() requires linux/math64.h
	scsi: isci: Fix an error code problem in isci_io_request_build()
	scsi: core: Introduce enum scsi_disposition
	scsi: core: Move scsi_host_busy() out of host lock for waking up EH handler
	ip6_tunnel: use dev_sw_netstats_rx_add()
	ip6_tunnel: make sure to pull inner header in __ip6_tnl_rcv()
	net-zerocopy: Refactor frag-is-remappable test.
	tcp: add sanity checks to rx zerocopy
	ixgbe: Remove non-inclusive language
	ixgbe: Refactor returning internal error codes
	ixgbe: Refactor overtemp event handling
	ixgbe: Fix an error handling path in ixgbe_read_iosf_sb_reg_x550()
	ipv6: Ensure natural alignment of const ipv6 loopback and router addresses
	llc: call sock_orphan() at release time
	netfilter: nf_log: replace BUG_ON by WARN_ON_ONCE when putting logger
	netfilter: nft_ct: sanitize layer 3 and 4 protocol number in custom expectations
	net: ipv4: fix a memleak in ip_setup_cork
	af_unix: fix lockdep positive in sk_diag_dump_icons()
	net: sysfs: Fix /sys/class/net/<iface> path
	HID: apple: Add support for the 2021 Magic Keyboard
	HID: apple: Add 2021 magic keyboard FN key mapping
	bonding: remove print in bond_verify_device_path
	uapi: stddef.h: Fix __DECLARE_FLEX_ARRAY for C++
	PM: sleep: Fix error handling in dpm_prepare()
	dmaengine: fsl-dpaa2-qdma: Fix the size of dma pools
	dmaengine: ti: k3-udma: Report short packet errors
	dmaengine: fsl-qdma: Fix a memory leak related to the status queue DMA
	dmaengine: fsl-qdma: Fix a memory leak related to the queue command DMA
	phy: renesas: rcar-gen3-usb2: Fix returning wrong error code
	dmaengine: fix is_slave_direction() return false when DMA_DEV_TO_DEV
	phy: ti: phy-omap-usb2: Fix NULL pointer dereference for SRP
	drm/msm/dp: return correct Colorimetry for DP_TEST_DYNAMIC_RANGE_CEA case
	net: stmmac: xgmac: fix handling of DPP safety error for DMA channels
	selftests: net: avoid just another constant wait
	tunnels: fix out of bounds access when building IPv6 PMTU error
	atm: idt77252: fix a memleak in open_card_ubr0
	hwmon: (aspeed-pwm-tacho) mutex for tach reading
	hwmon: (coretemp) Fix out-of-bounds memory access
	hwmon: (coretemp) Fix bogus core_id to attr name mapping
	inet: read sk->sk_family once in inet_recv_error()
	rxrpc: Fix response to PING RESPONSE ACKs to a dead call
	tipc: Check the bearer type before calling tipc_udp_nl_bearer_add()
	ppp_async: limit MRU to 64K
	netfilter: nft_compat: reject unused compat flag
	netfilter: nft_compat: restrict match/target protocol to u16
	netfilter: nft_ct: reject direction for ct id
	netfilter: nft_set_pipapo: store index in scratch maps
	netfilter: nft_set_pipapo: add helper to release pcpu scratch area
	netfilter: nft_set_pipapo: remove scratch_aligned pointer
	scsi: core: Move scsi_host_busy() out of host lock if it is for per-command
	blk-iocost: Fix an UBSAN shift-out-of-bounds warning
	net/af_iucv: clean up a try_then_request_module()
	USB: serial: qcserial: add new usb-id for Dell Wireless DW5826e
	USB: serial: option: add Fibocom FM101-GL variant
	USB: serial: cp210x: add ID for IMST iM871A-USB
	usb: host: xhci-plat: Add support for XHCI_SG_TRB_CACHE_SIZE_QUIRK
	hrtimer: Report offline hrtimer enqueue
	Input: i8042 - fix strange behavior of touchpad on Clevo NS70PU
	Input: atkbd - skip ATKBD_CMD_SETLEDS when skipping ATKBD_CMD_GETID
	vhost: use kzalloc() instead of kmalloc() followed by memset()
	clocksource: Skip watchdog check for large watchdog intervals
	net: stmmac: xgmac: use #define for string constants
	net: stmmac: xgmac: fix a typo of register name in DPP safety handling
	netfilter: nft_set_rbtree: skip end interval element from gc
	btrfs: forbid creating subvol qgroups
	btrfs: do not ASSERT() if the newly created subvolume already got read
	btrfs: forbid deleting live subvol qgroup
	btrfs: send: return EOPNOTSUPP on unknown flags
	of: unittest: Fix compile in the non-dynamic case
	net: openvswitch: limit the number of recursions from action sets
	spi: ppc4xx: Drop write-only variable
	ASoC: rt5645: Fix deadlock in rt5645_jack_detect_work()
	net: sysfs: Fix /sys/class/net/<iface> path for statistics
	MIPS: Add 'memory' clobber to csum_ipv6_magic() inline assembler
	i40e: Fix waiting for queues of all VSIs to be disabled
	tracing/trigger: Fix to return error if failed to alloc snapshot
	mm/writeback: fix possible divide-by-zero in wb_dirty_limits(), again
	ALSA: hda/realtek: Fix the external mic not being recognised for Acer Swift 1 SF114-32
	ALSA: hda/realtek: Enable Mute LED on HP Laptop 14-fq0xxx
	HID: wacom: generic: Avoid reporting a serial of '0' to userspace
	HID: wacom: Do not register input devices until after hid_hw_start
	usb: ucsi_acpi: Fix command completion handling
	USB: hub: check for alternate port before enabling A_ALT_HNP_SUPPORT
	usb: f_mass_storage: forbid async queue when shutdown happen
	media: ir_toy: fix a memleak in irtoy_tx
	powerpc/kasan: Fix addr error caused by page alignment
	i2c: i801: Remove i801_set_block_buffer_mode
	i2c: i801: Fix block process call transactions
	modpost: trim leading spaces when processing source files list
	scsi: Revert "scsi: fcoe: Fix potential deadlock on &fip->ctlr_lock"
	lsm: fix the logic in security_inode_getsecctx()
	firewire: core: correct documentation of fw_csr_string() kernel API
	kbuild: Fix changing ELF file type for output of gen_btf for big endian
	nfc: nci: free rx_data_reassembly skb on NCI device cleanup
	net: hsr: remove WARN_ONCE() in send_hsr_supervision_frame()
	xen-netback: properly sync TX responses
	ALSA: hda/realtek: Enable headset mic on Vaio VJFE-ADL
	binder: signal epoll threads of self-work
	misc: fastrpc: Mark all sessions as invalid in cb_remove
	ext4: fix double-free of blocks due to wrong extents moved_len
	tracing: Fix wasted memory in saved_cmdlines logic
	staging: iio: ad5933: fix type mismatch regression
	iio: magnetometer: rm3100: add boundary check for the value read from RM3100_REG_TMRC
	iio: accel: bma400: Fix a compilation problem
	media: rc: bpf attach/detach requires write permission
	hv_netvsc: Fix race condition between netvsc_probe and netvsc_remove
	ring-buffer: Clean ring_buffer_poll_wait() error return
	serial: max310x: set default value when reading clock ready bit
	serial: max310x: improve crystal stable clock detection
	x86/Kconfig: Transmeta Crusoe is CPU family 5, not 6
	x86/mm/ident_map: Use gbpages only where full GB page should be mapped.
	mmc: slot-gpio: Allow non-sleeping GPIO ro
	ALSA: hda/conexant: Add quirk for SWS JS201D
	nilfs2: fix data corruption in dsync block recovery for small block sizes
	nilfs2: fix hang in nilfs_lookup_dirty_data_buffers()
	crypto: ccp - Fix null pointer dereference in __sev_platform_shutdown_locked
	nfp: use correct macro for LengthSelect in BAR config
	nfp: flower: prevent re-adding mac index for bonded port
	wifi: mac80211: reload info pointer in ieee80211_tx_dequeue()
	irqchip/irq-brcmstb-l2: Add write memory barrier before exit
	irqchip/gic-v3-its: Fix GICv4.1 VPE affinity update
	s390/qeth: Fix potential loss of L3-IP@ in case of network issues
	ceph: prevent use-after-free in encode_cap_msg()
	of: property: fix typo in io-channels
	can: j1939: Fix UAF in j1939_sk_match_filter during setsockopt(SO_J1939_FILTER)
	pmdomain: core: Move the unused cleanup to a _sync initcall
	tracing: Inform kmemleak of saved_cmdlines allocation
	Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d"
	bus: moxtet: Add spi device table
	PCI: dwc: endpoint: Fix dw_pcie_ep_raise_msix_irq() alignment support
	mips: Fix max_mapnr being uninitialized on early stages
	crypto: lib/mpi - Fix unexpected pointer access in mpi_ec_init
	serial: Add rs485_supported to uart_port
	serial: 8250_exar: Fill in rs485_supported
	serial: 8250_exar: Set missing rs485_supported flag
	scripts/decode_stacktrace.sh: silence stderr messages from addr2line/nm
	scripts/decode_stacktrace.sh: support old bash version
	scripts: decode_stacktrace: demangle Rust symbols
	scripts/decode_stacktrace.sh: optionally use LLVM utilities
	netfilter: ipset: fix performance regression in swap operation
	netfilter: ipset: Missing gc cancellations fixed
	hrtimer: Ignore slack time for RT tasks in schedule_hrtimeout_range()
	Revert "arm64: Stash shadow stack pointer in the task struct on interrupt"
	net: prevent mss overflow in skb_segment()
	sched/membarrier: reduce the ability to hammer on sys_membarrier
	nilfs2: fix potential bug in end_buffer_async_write
	nilfs2: replace WARN_ONs for invalid DAT metadata block requests
	dm: limit the number of targets and parameter size area
	PM: runtime: add devm_pm_runtime_enable helper
	PM: runtime: Have devm_pm_runtime_enable() handle pm_runtime_dont_use_autosuspend()
	drm/msm/dsi: Enable runtime PM
	netfilter: nf_tables: fix pointer math issue in nft_byteorder_eval()
	net: bcmgenet: Fix EEE implementation
	PCI: dwc: Fix a 64bit bug in dw_pcie_ep_raise_msix_irq()
	Linux 5.10.210

Change-Id: I5e7327f58dd6abd26ac2b1e328a81c1010d1147c
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2024-04-10 07:10:03 +00:00
Paolo Bonzini
95670878a6 mm: vmalloc: introduce array allocation functions
commit a8749a35c39903120ec421ef2525acc8e0daa55c upstream.

Linux has dozens of occurrences of vmalloc(array_size()) and
vzalloc(array_size()).  Allow to simplify the code by providing
vmalloc_array and vcalloc, as well as the underscored variants that let
the caller specify the GFP flags.

Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alexander Ofitserov <oficerovas@altlinux.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-23 08:41:55 +01:00
Oven
69a794a283 ANDROID: vendor_hooks: Add hooks to avoid key threads stalled in
memory allocations

We add these hooks to avoid key threads blocked in memory allocation
path.
-android_vh_free_unref_page_bypass  ----We create a memory pool for the key threads. This hook determines whether a page should be free to the pool or to buddy freelist. It works with a existing hook `android_vh_alloc_pages_reclaim_bypass`, which takes pages out of the pool.

-android_vh_kvmalloc_node_use_vmalloc  ----For key threads, we perfer not to run into direct reclaim. So we clear __GFP_DIRECT_RECLAIM flag. For threads which are not that important, we perfer use vmalloc.

-android_vh_should_alloc_pages_retry  ----Before key threads run into direct reclaim, we want to retry with a lower watermark.

-android_vh_unreserve_highatomic_bypass  ----We want to keep more highatomic pages when unreserve them to avoid highatomic allocation failures.

-android_vh_pageset_update  ----We found the default per-cpu pageset is quite few in smartphones with large ram size. This hook is used to increase it to reduce zone->lock contentions.

-android_vh_rmqueue_bulk_bypass  ----We found sometimes when key threads run into rmqueue_bulk,  it took several milliseconds spinning at zone->lock or filling per-cpu pages. We use this hook to take pages from the mempool mentioned above,  rather than grab zone->lock and fill a batch of pages to per-cpu.

Bug: 288216516
Change-Id: I1656032d6819ca627723341987b6094775bc345f
Signed-off-by: Oven <liyangouwen1@oppo.com>
2023-06-30 08:42:41 +00:00
Greg Kroah-Hartman
fbe6a13851 This is the 5.10.137 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmMCMDQACgkQONu9yGCS
 aT6TwRAAvj1dnV1nLVVNET3jcelTO65SVUUpQjiyGD1npZQaQdH5PoGR0VhMWk7y
 mLUIwJyp/rR7+OLD3BMFwxDimDWHviFGdbmm/8fsyDrARuOeRd/M1fvtHXjIRQdb
 nOvfo1yTQWp4xA1k/JwJZslkvRFDsofXWHCRf+ffEryTRanFAVc7u5aFIg92W0b/
 JWYWEFe99C4TJ7LACpDoGaP9gE6WXsupaxSZBIu+Wxa+PfDmIeRRTkQn+j4Khn0h
 I6w+LkLd6ZP3l7sbe9KfS9ZGo1wWLgSng4zz742Z9IaFgxyj2ArS9tNsYCLkkhAM
 gLSXXkiPBAxUvAtDxR1tc0YROHc1bjAttSoxNXcaaacspSo/Vi0VAtp7t6boK0bI
 /8P3dh+Hq9u/Q1ClhZtVoFpp+GVj0fDbDd56qVcr2Cp6IokpqRJog1Jhgj0CVCoG
 iElr3n0+y7/IZfmE6/U1cK00SNcW86e2YduuIy4ifCawRT574zkRiSYZalpaO3qM
 z1lF9p+zUNq3v2q0wxXuBDLi/yPoJzbJgmCGScj4ryjjr6TOvR1udSVWkJ02dR4H
 s9km3lNLgoUPCYCLBMlZl7em4T49E09/+4YCrnj/Ezp+YdImf2+QzZyd/gG3ITl2
 fW7lpbK1dx3d/19JFP6Xkj9PaIlMl9e8Ne04G+Dabv67uN+0U+g=
 =Z4rz
 -----END PGP SIGNATURE-----

Merge 5.10.137 into android12-5.10-lts

Changes in 5.10.137
	Makefile: link with -z noexecstack --no-warn-rwx-segments
	x86: link vdso and boot with -z noexecstack --no-warn-rwx-segments
	Revert "pNFS: nfs3_set_ds_client should set NFS_CS_NOPING"
	scsi: Revert "scsi: qla2xxx: Fix disk failure to rediscover"
	ALSA: bcd2000: Fix a UAF bug on the error path of probing
	ALSA: hda/realtek: Add quirk for Clevo NV45PZ
	ALSA: hda/realtek: Add quirk for HP Spectre x360 15-eb0xxx
	wifi: mac80211_hwsim: fix race condition in pending packet
	wifi: mac80211_hwsim: add back erroneously removed cast
	wifi: mac80211_hwsim: use 32-bit skb cookie
	add barriers to buffer_uptodate and set_buffer_uptodate
	HID: wacom: Only report rotation for art pen
	HID: wacom: Don't register pad_input for touch switch
	KVM: nVMX: Snapshot pre-VM-Enter BNDCFGS for !nested_run_pending case
	KVM: nVMX: Snapshot pre-VM-Enter DEBUGCTL for !nested_run_pending case
	KVM: SVM: Don't BUG if userspace injects an interrupt with GIF=0
	KVM: s390: pv: don't present the ecall interrupt twice
	KVM: nVMX: Let userspace set nVMX MSR to any _host_ supported value
	KVM: x86: Mark TSS busy during LTR emulation _after_ all fault checks
	KVM: x86: Set error code to segment selector on LLDT/LTR non-canonical #GP
	KVM: x86: Tag kvm_mmu_x86_module_init() with __init
	riscv: set default pm_power_off to NULL
	mm: Add kvrealloc()
	xfs: only set IOMAP_F_SHARED when providing a srcmap to a write
	xfs: fix I_DONTCACHE
	mm/mremap: hold the rmap lock in write mode when moving page table entries.
	ALSA: hda/conexant: Add quirk for LENOVO 20149 Notebook model
	ALSA: hda/cirrus - support for iMac 12,1 model
	ALSA: hda/realtek: Add quirk for another Asus K42JZ model
	ALSA: hda/realtek: Add a quirk for HP OMEN 15 (8786) mute LED
	tty: vt: initialize unicode screen buffer
	vfs: Check the truncate maximum size in inode_newsize_ok()
	fs: Add missing umask strip in vfs_tmpfile
	thermal: sysfs: Fix cooling_device_stats_setup() error code path
	fbcon: Fix boundary checks for fbcon=vc:n1-n2 parameters
	fbcon: Fix accelerated fbdev scrolling while logo is still shown
	usbnet: Fix linkwatch use-after-free on disconnect
	ovl: drop WARN_ON() dentry is NULL in ovl_encode_fh()
	parisc: Fix device names in /proc/iomem
	parisc: Check the return value of ioremap() in lba_driver_probe()
	parisc: io_pgetevents_time64() needs compat syscall in 32-bit compat mode
	drm/gem: Properly annotate WW context on drm_gem_lock_reservations() error
	drm/vc4: hdmi: Disable audio if dmas property is present but empty
	drm/nouveau: fix another off-by-one in nvbios_addr
	drm/nouveau: Don't pm_runtime_put_sync(), only pm_runtime_put_autosuspend()
	drm/nouveau/acpi: Don't print error when we get -EINPROGRESS from pm_runtime
	drm/amdgpu: Check BO's requested pinning domains against its preferred_domains
	mtd: rawnand: arasan: Update NAND bus clock instead of system clock
	iio: light: isl29028: Fix the warning in isl29028_remove()
	scsi: sg: Allow waiting for commands to complete on removed device
	scsi: qla2xxx: Fix incorrect display of max frame size
	scsi: qla2xxx: Zero undefined mailbox IN registers
	fuse: limit nsec
	serial: mvebu-uart: uart2 error bits clearing
	md-raid: destroy the bitmap after destroying the thread
	md-raid10: fix KASAN warning
	media: [PATCH] pci: atomisp_cmd: fix three missing checks on list iterator
	ia64, processor: fix -Wincompatible-pointer-types in ia64_get_irr()
	PCI: Add defines for normal and subtractive PCI bridges
	powerpc/fsl-pci: Fix Class Code of PCIe Root Port
	powerpc/ptdump: Fix display of RW pages on FSL_BOOK3E
	powerpc/powernv: Avoid crashing if rng is NULL
	MIPS: cpuinfo: Fix a warning for CONFIG_CPUMASK_OFFSTACK
	coresight: Clear the connection field properly
	usb: typec: ucsi: Acknowledge the GET_ERROR_STATUS command completion
	USB: HCD: Fix URB giveback issue in tasklet function
	ARM: dts: uniphier: Fix USB interrupts for PXs2 SoC
	arm64: dts: uniphier: Fix USB interrupts for PXs3 SoC
	usb: dwc3: gadget: refactor dwc3_repare_one_trb
	usb: dwc3: gadget: fix high speed multiplier setting
	lockdep: Allow tuning tracing capacity constants.
	netfilter: nf_tables: do not allow SET_ID to refer to another table
	netfilter: nf_tables: do not allow CHAIN_ID to refer to another table
	netfilter: nf_tables: do not allow RULE_ID to refer to another chain
	netfilter: nf_tables: fix null deref due to zeroed list head
	epoll: autoremove wakers even more aggressively
	x86: Handle idle=nomwait cmdline properly for x86_idle
	arm64: Do not forget syscall when starting a new thread.
	arm64: fix oops in concurrently setting insn_emulation sysctls
	ext2: Add more validity checks for inode counts
	genirq: Don't return error on missing optional irq_request_resources()
	irqchip/mips-gic: Only register IPI domain when SMP is enabled
	genirq: GENERIC_IRQ_IPI depends on SMP
	irqchip/mips-gic: Check the return value of ioremap() in gic_of_init()
	wait: Fix __wait_event_hrtimeout for RT/DL tasks
	ARM: dts: imx6ul: add missing properties for sram
	ARM: dts: imx6ul: change operating-points to uint32-matrix
	ARM: dts: imx6ul: fix keypad compatible
	ARM: dts: imx6ul: fix csi node compatible
	ARM: dts: imx6ul: fix lcdif node compatible
	ARM: dts: imx6ul: fix qspi node compatible
	ARM: dts: BCM5301X: Add DT for Meraki MR26
	spi: synquacer: Add missing clk_disable_unprepare()
	ARM: OMAP2+: display: Fix refcount leak bug
	ACPI: EC: Remove duplicate ThinkPad X1 Carbon 6th entry from DMI quirks
	ACPI: EC: Drop the EC_FLAGS_IGNORE_DSDT_GPE quirk
	ACPI: PM: save NVS memory for Lenovo G40-45
	ACPI: LPSS: Fix missing check in register_device_clock()
	arm64: dts: qcom: ipq8074: fix NAND node name
	arm64: dts: allwinner: a64: orangepi-win: Fix LED node name
	ARM: shmobile: rcar-gen2: Increase refcount for new reference
	firmware: tegra: Fix error check return value of debugfs_create_file()
	PM: hibernate: defer device probing when resuming from hibernation
	selinux: Add boundary check in put_entry()
	powerpc/64s: Disable stack variable initialisation for prom_init
	spi: spi-rspi: Fix PIO fallback on RZ platforms
	ARM: findbit: fix overflowing offset
	meson-mx-socinfo: Fix refcount leak in meson_mx_socinfo_init
	arm64: dts: renesas: beacon: Fix regulator node names
	ARM: bcm: Fix refcount leak in bcm_kona_smc_init
	ACPI: processor/idle: Annotate more functions to live in cpuidle section
	ARM: dts: imx7d-colibri-emmc: add cpu1 supply
	Input: atmel_mxt_ts - fix up inverted RESET handler
	soc: renesas: r8a779a0-sysc: Fix A2DP1 and A2CV[2357] PDR values
	soc: amlogic: Fix refcount leak in meson-secure-pwrc.c
	arm64: dts: renesas: Fix thermal-sensors on single-zone sensors
	x86/pmem: Fix platform-device leak in error path
	ARM: dts: ast2500-evb: fix board compatible
	ARM: dts: ast2600-evb: fix board compatible
	hexagon: select ARCH_WANT_LD_ORPHAN_WARN
	arm64: cpufeature: Allow different PMU versions in ID_DFR0_EL1
	locking/lockdep: Fix lockdep_init_map_*() confusion
	soc: fsl: guts: machine variable might be unset
	block: fix infinite loop for invalid zone append
	ARM: dts: qcom: mdm9615: add missing PMIC GPIO reg
	ARM: OMAP2+: Fix refcount leak in omapdss_init_of
	ARM: OMAP2+: Fix refcount leak in omap3xxx_prm_late_init
	cpufreq: zynq: Fix refcount leak in zynq_get_revision
	regulator: qcom_smd: Fix pm8916_pldo range
	ACPI: APEI: Fix _EINJ vs EFI_MEMORY_SP
	soc: qcom: ocmem: Fix refcount leak in of_get_ocmem
	soc: qcom: aoss: Fix refcount leak in qmp_cooling_devices_register
	ARM: dts: qcom: pm8841: add required thermal-sensor-cells
	bus: hisi_lpc: fix missing platform_device_put() in hisi_lpc_acpi_probe()
	arm64: dts: mt7622: fix BPI-R64 WPS button
	arm64: tegra: Fix SDMMC1 CD on P2888
	erofs: avoid consecutive detection for Highmem memory
	blk-mq: don't create hctx debugfs dir until q->debugfs_dir is created
	hwmon: (drivetemp) Add module alias
	block: remove the request_queue to argument request based tracepoints
	blktrace: Trace remapped requests correctly
	regulator: of: Fix refcount leak bug in of_get_regulation_constraints()
	soc: qcom: Make QCOM_RPMPD depend on PM
	arm64: dts: qcom: qcs404: Fix incorrect USB2 PHYs assignment
	drivers/perf: arm_spe: Fix consistency of SYS_PMSCR_EL1.CX
	nohz/full, sched/rt: Fix missed tick-reenabling bug in dequeue_task_rt()
	selftests/seccomp: Fix compile warning when CC=clang
	thermal/tools/tmon: Include pthread and time headers in tmon.h
	dm: return early from dm_pr_call() if DM device is suspended
	pwm: sifive: Don't check the return code of pwmchip_remove()
	pwm: sifive: Simplify offset calculation for PWMCMP registers
	pwm: sifive: Ensure the clk is enabled exactly once per running PWM
	pwm: sifive: Shut down hardware only after pwmchip_remove() completed
	pwm: lpc18xx-sct: Convert to devm_platform_ioremap_resource()
	drm/bridge: tc358767: Move (e)DP bridge endpoint parsing into dedicated function
	drm/bridge: tc358767: Make sure Refclk clock are enabled
	ath10k: do not enforce interrupt trigger type
	drm/st7735r: Fix module autoloading for Okaya RH128128T
	wifi: rtlwifi: fix error codes in rtl_debugfs_set_write_h2c()
	ath11k: fix netdev open race
	drm/mipi-dbi: align max_chunk to 2 in spi_transfer
	ath11k: Fix incorrect debug_mask mappings
	drm/radeon: fix potential buffer overflow in ni_set_mc_special_registers()
	drm/mediatek: Modify dsi funcs to atomic operations
	drm/mediatek: Separate poweron/poweroff from enable/disable and define new funcs
	drm/mediatek: Add pull-down MIPI operation in mtk_dsi_poweroff function
	i2c: npcm: Remove own slave addresses 2:10
	i2c: npcm: Correct slave role behavior
	virtio-gpu: fix a missing check to avoid NULL dereference
	drm: adv7511: override i2c address of cec before accessing it
	crypto: sun8i-ss - do not allocate memory when handling hash requests
	crypto: sun8i-ss - fix error codes in allocate_flows()
	net: fix sk_wmem_schedule() and sk_rmem_schedule() errors
	i2c: Fix a potential use after free
	crypto: sun8i-ss - fix infinite loop in sun8i_ss_setup_ivs()
	media: tw686x: Register the irq at the end of probe
	ath9k: fix use-after-free in ath9k_hif_usb_rx_cb
	wifi: iwlegacy: 4965: fix potential off-by-one overflow in il4965_rs_fill_link_cmd()
	drm/radeon: fix incorrrect SPDX-License-Identifiers
	test_bpf: fix incorrect netdev features
	crypto: ccp - During shutdown, check SEV data pointer before using
	drm: bridge: adv7511: Add check for mipi_dsi_driver_register
	drm/mcde: Fix refcount leak in mcde_dsi_bind
	media: hdpvr: fix error value returns in hdpvr_read
	media: v4l2-mem2mem: prevent pollerr when last_buffer_dequeued is set
	media: tw686x: Fix memory leak in tw686x_video_init
	drm/vc4: plane: Remove subpixel positioning check
	drm/vc4: plane: Fix margin calculations for the right/bottom edges
	drm/vc4: dsi: Correct DSI divider calculations
	drm/vc4: dsi: Correct pixel order for DSI0
	drm/vc4: drv: Remove the DSI pointer in vc4_drv
	drm/vc4: dsi: Use snprintf for the PHY clocks instead of an array
	drm/vc4: dsi: Introduce a variant structure
	drm/vc4: dsi: Register dsi0 as the correct vc4 encoder type
	drm/vc4: dsi: Fix dsi0 interrupt support
	drm/vc4: dsi: Add correct stop condition to vc4_dsi_encoder_disable iteration
	drm/vc4: hdmi: Remove firmware logic for MAI threshold setting
	drm/vc4: hdmi: Avoid full hdmi audio fifo writes
	drm/vc4: hdmi: Don't access the connector state in reset if kmalloc fails
	drm/vc4: hdmi: Limit the BCM2711 to the max without scrambling
	drm/vc4: hdmi: Fix timings for interlaced modes
	drm/vc4: hdmi: Correct HDMI timing registers for interlaced modes
	crypto: arm64/gcm - Select AEAD for GHASH_ARM64_CE
	selftests/xsk: Destroy BPF resources only when ctx refcount drops to 0
	drm/rockchip: vop: Don't crash for invalid duplicate_state()
	drm/rockchip: Fix an error handling path rockchip_dp_probe()
	drm/mediatek: dpi: Remove output format of YUV
	drm/mediatek: dpi: Only enable dpi after the bridge is enabled
	drm: bridge: sii8620: fix possible off-by-one
	lib: bitmap: order includes alphabetically
	lib: bitmap: provide devm_bitmap_alloc() and devm_bitmap_zalloc()
	hinic: Use the bitmap API when applicable
	net: hinic: fix bug that ethtool get wrong stats
	net: hinic: avoid kernel hung in hinic_get_stats64()
	drm/msm/mdp5: Fix global state lock backoff
	crypto: hisilicon/sec - fixes some coding style
	crypto: hisilicon/sec - don't sleep when in softirq
	crypto: hisilicon - Kunpeng916 crypto driver don't sleep when in softirq
	media: platform: mtk-mdp: Fix mdp_ipi_comm structure alignment
	mt76: mt76x02u: fix possible memory leak in __mt76x02u_mcu_send_msg
	mediatek: mt76: mac80211: Fix missing of_node_put() in mt76_led_init()
	drm/exynos/exynos7_drm_decon: free resources when clk_set_parent() failed.
	tcp: make retransmitted SKB fit into the send window
	libbpf: Fix the name of a reused map
	selftests: timers: valid-adjtimex: build fix for newer toolchains
	selftests: timers: clocksource-switch: fix passing errors from child
	bpf: Fix subprog names in stack traces.
	fs: check FMODE_LSEEK to control internal pipe splicing
	wifi: wil6210: debugfs: fix info leak in wil_write_file_wmi()
	wifi: p54: Fix an error handling path in p54spi_probe()
	wifi: p54: add missing parentheses in p54_flush()
	selftests/bpf: fix a test for snprintf() overflow
	can: pch_can: do not report txerr and rxerr during bus-off
	can: rcar_can: do not report txerr and rxerr during bus-off
	can: sja1000: do not report txerr and rxerr during bus-off
	can: hi311x: do not report txerr and rxerr during bus-off
	can: sun4i_can: do not report txerr and rxerr during bus-off
	can: kvaser_usb_hydra: do not report txerr and rxerr during bus-off
	can: kvaser_usb_leaf: do not report txerr and rxerr during bus-off
	can: usb_8dev: do not report txerr and rxerr during bus-off
	can: error: specify the values of data[5..7] of CAN error frames
	can: pch_can: pch_can_error(): initialize errc before using it
	Bluetooth: hci_intel: Add check for platform_driver_register
	i2c: cadence: Support PEC for SMBus block read
	i2c: mux-gpmux: Add of_node_put() when breaking out of loop
	wifi: wil6210: debugfs: fix uninitialized variable use in `wil_write_file_wmi()`
	wifi: iwlwifi: mvm: fix double list_add at iwl_mvm_mac_wake_tx_queue
	wifi: libertas: Fix possible refcount leak in if_usb_probe()
	media: cedrus: hevc: Add check for invalid timestamp
	net/mlx5e: Remove WARN_ON when trying to offload an unsupported TLS cipher/version
	net/mlx5e: Fix the value of MLX5E_MAX_RQ_NUM_MTTS
	crypto: hisilicon/hpre - don't use GFP_KERNEL to alloc mem during softirq
	crypto: inside-secure - Add missing MODULE_DEVICE_TABLE for of
	crypto: hisilicon/sec - fix auth key size error
	inet: add READ_ONCE(sk->sk_bound_dev_if) in INET_MATCH()
	tcp: sk->sk_bound_dev_if once in inet_request_bound_dev_if()
	ipv6: add READ_ONCE(sk->sk_bound_dev_if) in INET6_MATCH()
	tcp: Fix data-races around sysctl_tcp_l3mdev_accept.
	net: allow unbound socket for packets in VRF when tcp_l3mdev_accept set
	iavf: Fix max_rate limiting
	netdevsim: Avoid allocation warnings triggered from user space
	net: rose: fix netdev reference changes
	net: ionic: fix error check for vlan flags in ionic_set_nic_features()
	dccp: put dccp_qpolicy_full() and dccp_qpolicy_push() in the same lock
	wireguard: ratelimiter: use hrtimer in selftest
	wireguard: allowedips: don't corrupt stack when detecting overflow
	clk: renesas: r9a06g032: Fix UART clkgrp bitsel
	mtd: maps: Fix refcount leak in of_flash_probe_versatile
	mtd: maps: Fix refcount leak in ap_flash_init
	mtd: rawnand: meson: Fix a potential double free issue
	PCI: tegra194: Fix PM error handling in tegra_pcie_config_ep()
	HID: cp2112: prevent a buffer overflow in cp2112_xfer()
	mtd: sm_ftl: Fix deadlock caused by cancel_work_sync in sm_release
	mtd: partitions: Fix refcount leak in parse_redboot_of
	mtd: st_spi_fsm: Add a clk_disable_unprepare() in .probe()'s error path
	fpga: altera-pr-ip: fix unsigned comparison with less than zero
	usb: host: Fix refcount leak in ehci_hcd_ppc_of_probe
	usb: ohci-nxp: Fix refcount leak in ohci_hcd_nxp_probe
	usb: gadget: tegra-xudc: Fix error check in tegra_xudc_powerdomain_init()
	usb: xhci: tegra: Fix error check
	netfilter: xtables: Bring SPDX identifier back
	iio: accel: bma400: Fix the scale min and max macro values
	platform/chrome: cros_ec: Always expose last resume result
	iio: accel: bma400: Reordering of header files
	clk: mediatek: reset: Fix written reset bit offset
	KVM: Don't set Accessed/Dirty bits for ZERO_PAGE
	mwifiex: Ignore BTCOEX events from the 88W8897 firmware
	mwifiex: fix sleep in atomic context bugs caused by dev_coredumpv
	dmaengine: dw-edma: Fix eDMA Rd/Wr-channels and DMA-direction semantics
	misc: rtsx: Fix an error handling path in rtsx_pci_probe()
	driver core: fix potential deadlock in __driver_attach
	clk: qcom: clk-krait: unlock spin after mux completion
	usb: host: xhci: use snprintf() in xhci_decode_trb()
	clk: qcom: ipq8074: fix NSS core PLL-s
	clk: qcom: ipq8074: SW workaround for UBI32 PLL lock
	clk: qcom: ipq8074: fix NSS port frequency tables
	clk: qcom: ipq8074: set BRANCH_HALT_DELAY flag for UBI clocks
	clk: qcom: camcc-sdm845: Fix topology around titan_top power domain
	PCI: dwc: Add unroll iATU space support to dw_pcie_disable_atu()
	PCI: dwc: Deallocate EPC memory on dw_pcie_ep_init() errors
	PCI: dwc: Always enable CDM check if "snps,enable-cdm-check" exists
	soundwire: bus_type: fix remove and shutdown support
	KVM: arm64: Don't return from void function
	dmaengine: sf-pdma: apply proper spinlock flags in sf_pdma_prep_dma_memcpy()
	dmaengine: sf-pdma: Add multithread support for a DMA channel
	PCI: endpoint: Don't stop controller when unbinding endpoint function
	intel_th: Fix a resource leak in an error handling path
	intel_th: msu-sink: Potential dereference of null pointer
	intel_th: msu: Fix vmalloced buffers
	staging: rtl8192u: Fix sleep in atomic context bug in dm_fsync_timer_callback
	mmc: sdhci-of-esdhc: Fix refcount leak in esdhc_signal_voltage_switch
	memstick/ms_block: Fix some incorrect memory allocation
	memstick/ms_block: Fix a memory leak
	mmc: sdhci-of-at91: fix set_uhs_signaling rewriting of MC1R
	mmc: block: Add single read for 4k sector cards
	KVM: s390: pv: leak the topmost page table when destroy fails
	PCI/portdrv: Don't disable AER reporting in get_port_device_capability()
	PCI: qcom: Set up rev 2.1.0 PARF_PHY before enabling clocks
	scsi: smartpqi: Fix DMA direction for RAID requests
	xtensa: iss/network: provide release() callback
	xtensa: iss: fix handling error cases in iss_net_configure()
	usb: gadget: udc: amd5536 depends on HAS_DMA
	usb: aspeed-vhub: Fix refcount leak bug in ast_vhub_init_desc()
	usb: dwc3: core: Deprecate GCTL.CORESOFTRESET
	usb: dwc3: core: Do not perform GCTL_CORE_SOFTRESET during bootup
	usb: dwc3: qcom: fix missing optional irq warnings
	eeprom: idt_89hpesx: uninitialized data in idt_dbgfs_csr_write()
	interconnect: imx: fix max_node_id
	um: random: Don't initialise hwrng struct with zero
	RDMA/rtrs: Define MIN_CHUNK_SIZE
	RDMA/rtrs: Avoid Wtautological-constant-out-of-range-compare
	RDMA/rtrs-srv: Fix modinfo output for stringify
	RDMA/qedr: Improve error logs for rdma_alloc_tid error return
	RDMA/qedr: Fix potential memory leak in __qedr_alloc_mr()
	RDMA/hns: Fix incorrect clearing of interrupt status register
	RDMA/siw: Fix duplicated reported IW_CM_EVENT_CONNECT_REPLY event
	RDMA/hfi1: fix potential memory leak in setup_base_ctxt()
	gpio: gpiolib-of: Fix refcount bugs in of_mm_gpiochip_add_data()
	HID: mcp2221: prevent a buffer overflow in mcp_smbus_write()
	mmc: cavium-octeon: Add of_node_put() when breaking out of loop
	mmc: cavium-thunderx: Add of_node_put() when breaking out of loop
	HID: alps: Declare U1_UNICORN_LEGACY support
	PCI: tegra194: Fix Root Port interrupt handling
	PCI: tegra194: Fix link up retry sequence
	USB: serial: fix tty-port initialized comments
	usb: cdns3: change place of 'priv_ep' assignment in cdns3_gadget_ep_dequeue(), cdns3_gadget_ep_enable()
	platform/olpc: Fix uninitialized data in debugfs write
	RDMA/srpt: Duplicate port name members
	RDMA/srpt: Introduce a reference count in struct srpt_device
	RDMA/srpt: Fix a use-after-free
	mm/mmap.c: fix missing call to vm_unacct_memory in mmap_region
	selftests: kvm: set rax before vmcall
	RDMA/mlx5: Add missing check for return value in get namespace flow
	RDMA/rxe: Fix error unwind in rxe_create_qp()
	null_blk: fix ida error handling in null_add_dev()
	nvme: use command_id instead of req->tag in trace_nvme_complete_rq()
	jbd2: fix outstanding credits assert in jbd2_journal_commit_transaction()
	ext4: recover csum seed of tmp_inode after migrating to extents
	jbd2: fix assertion 'jh->b_frozen_data == NULL' failure when journal aborted
	usb: cdns3: Don't use priv_dev uninitialized in cdns3_gadget_ep_enable()
	opp: Fix error check in dev_pm_opp_attach_genpd()
	ASoC: cros_ec_codec: Fix refcount leak in cros_ec_codec_platform_probe
	ASoC: samsung: Fix error handling in aries_audio_probe
	ASoC: mediatek: mt8173: Fix refcount leak in mt8173_rt5650_rt5676_dev_probe
	ASoC: mt6797-mt6351: Fix refcount leak in mt6797_mt6351_dev_probe
	ASoC: codecs: da7210: add check for i2c_add_driver
	ASoC: mediatek: mt8173-rt5650: Fix refcount leak in mt8173_rt5650_dev_probe
	serial: 8250: Export ICR access helpers for internal use
	serial: 8250_dw: Store LSR into lsr_saved_flags in dw8250_tx_wait_empty()
	ASoC: codecs: msm8916-wcd-digital: move gains from SX_TLV to S8_TLV
	ASoC: codecs: wcd9335: move gains from SX_TLV to S8_TLV
	rpmsg: mtk_rpmsg: Fix circular locking dependency
	remoteproc: k3-r5: Fix refcount leak in k3_r5_cluster_of_init
	selftests/livepatch: better synchronize test_klp_callbacks_busy
	profiling: fix shift too large makes kernel panic
	ASoC: samsung: h1940_uda1380: include proepr GPIO consumer header
	powerpc/perf: Optimize clearing the pending PMI and remove WARN_ON for PMI check in power_pmu_disable
	ASoC: samsung: change gpiod_speaker_power and rx1950_audio from global to static variables
	tty: n_gsm: Delete gsmtty open SABM frame when config requester
	tty: n_gsm: fix user open not possible at responder until initiator open
	tty: n_gsm: fix wrong queuing behavior in gsm_dlci_data_output()
	tty: n_gsm: fix non flow control frames during mux flow off
	tty: n_gsm: fix packet re-transmission without open control channel
	tty: n_gsm: fix race condition in gsmld_write()
	ASoC: qcom: Fix missing of_node_put() in asoc_qcom_lpass_cpu_platform_probe()
	remoteproc: qcom: wcnss: Fix handling of IRQs
	vfio: Remove extra put/gets around vfio_device->group
	vfio: Simplify the lifetime logic for vfio_device
	vfio: Split creation of a vfio_device into init and register ops
	vfio/mdev: Make to_mdev_device() into a static inline
	vfio/ccw: Do not change FSM state in subchannel event
	tty: n_gsm: fix wrong T1 retry count handling
	tty: n_gsm: fix DM command
	tty: n_gsm: fix missing corner cases in gsmld_poll()
	iommu/exynos: Handle failed IOMMU device registration properly
	rpmsg: qcom_smd: Fix refcount leak in qcom_smd_parse_edge
	kfifo: fix kfifo_to_user() return type
	lib/smp_processor_id: fix imbalanced instrumentation_end() call
	remoteproc: sysmon: Wait for SSCTL service to come up
	mfd: t7l66xb: Drop platform disable callback
	mfd: max77620: Fix refcount leak in max77620_initialise_fps
	iommu/arm-smmu: qcom_iommu: Add of_node_put() when breaking out of loop
	perf tools: Fix dso_id inode generation comparison
	s390/dump: fix old lowcore virtual vs physical address confusion
	s390/zcore: fix race when reading from hardware system area
	ASoC: fsl_easrc: use snd_pcm_format_t type for sample_format
	ASoC: qcom: q6dsp: Fix an off-by-one in q6adm_alloc_copp()
	fuse: Remove the control interface for virtio-fs
	ASoC: audio-graph-card: Add of_node_put() in fail path
	watchdog: armada_37xx_wdt: check the return value of devm_ioremap() in armada_37xx_wdt_probe()
	video: fbdev: amba-clcd: Fix refcount leak bugs
	video: fbdev: sis: fix typos in SiS_GetModeID()
	ASoC: mchp-spdifrx: disable end of block interrupt on failures
	powerpc/32: Do not allow selection of e5500 or e6500 CPUs on PPC32
	powerpc/pci: Prefer PCI domain assignment via DT 'linux,pci-domain' and alias
	f2fs: don't set GC_FAILURE_PIN for background GC
	f2fs: write checkpoint during FG_GC
	f2fs: fix to remove F2FS_COMPR_FL and tag F2FS_NOCOMP_FL at the same time
	powerpc/spufs: Fix refcount leak in spufs_init_isolated_loader
	powerpc/xive: Fix refcount leak in xive_get_max_prio
	powerpc/cell/axon_msi: Fix refcount leak in setup_msi_msg_address
	perf symbol: Fail to read phdr workaround
	kprobes: Forbid probing on trampoline and BPF code areas
	powerpc/pci: Fix PHB numbering when using opal-phbid
	genelf: Use HAVE_LIBCRYPTO_SUPPORT, not the never defined HAVE_LIBCRYPTO
	scripts/faddr2line: Fix vmlinux detection on arm64
	sched/deadline: Merge dl_task_can_attach() and dl_cpu_busy()
	sched, cpuset: Fix dl_cpu_busy() panic due to empty cs->cpus_allowed
	x86/numa: Use cpumask_available instead of hardcoded NULL check
	video: fbdev: arkfb: Fix a divide-by-zero bug in ark_set_pixclock()
	tools/thermal: Fix possible path truncations
	sched: Fix the check of nr_running at queue wakelist
	x86/entry: Build thunk_$(BITS) only if CONFIG_PREEMPTION=y
	video: fbdev: vt8623fb: Check the size of screen before memset_io()
	video: fbdev: arkfb: Check the size of screen before memset_io()
	video: fbdev: s3fb: Check the size of screen before memset_io()
	scsi: zfcp: Fix missing auto port scan and thus missing target ports
	scsi: qla2xxx: Fix discovery issues in FC-AL topology
	scsi: qla2xxx: Turn off multi-queue for 8G adapters
	scsi: qla2xxx: Fix erroneous mailbox timeout after PCI error injection
	scsi: qla2xxx: Fix losing FCP-2 targets on long port disable with I/Os
	scsi: qla2xxx: Fix losing FCP-2 targets during port perturbation tests
	x86/bugs: Enable STIBP for IBPB mitigated RETBleed
	ftrace/x86: Add back ftrace_expected assignment
	x86/olpc: fix 'logical not is only applied to the left hand side'
	posix-cpu-timers: Cleanup CPU timers before freeing them during exec
	Input: gscps2 - check return value of ioremap() in gscps2_probe()
	__follow_mount_rcu(): verify that mount_lock remains unchanged
	spmi: trace: fix stack-out-of-bound access in SPMI tracing functions
	drm/i915/dg1: Update DMC_DEBUG3 register
	drm/mediatek: Allow commands to be sent during video mode
	drm/mediatek: Keep dsi as LP00 before dcs cmds transfer
	HID: Ignore battery for Elan touchscreen on HP Spectre X360 15-df0xxx
	HID: hid-input: add Surface Go battery quirk
	drm/vc4: drv: Adopt the dma configuration from the HVS or V3D component
	mtd: rawnand: Add a helper to clarify the interface configuration
	mtd: rawnand: arasan: Check the proposed data interface is supported
	mtd: rawnand: Add NV-DDR timings
	mtd: rawnand: arasan: Fix a macro parameter
	mtd: rawnand: arasan: Support NV-DDR interface
	mtd: rawnand: arasan: Fix clock rate in NV-DDR
	usbnet: smsc95xx: Don't clear read-only PHY interrupt
	usbnet: smsc95xx: Avoid link settings race on interrupt reception
	firmware: arm_scpi: Ensure scpi_info is not assigned if the probe fails
	intel_th: pci: Add Meteor Lake-P support
	intel_th: pci: Add Raptor Lake-S PCH support
	intel_th: pci: Add Raptor Lake-S CPU support
	KVM: set_msr_mce: Permit guests to ignore single-bit ECC errors
	KVM: x86: Signal #GP, not -EPERM, on bad WRMSR(MCi_CTL/STATUS)
	iommu/vt-d: avoid invalid memory access via node_online(NUMA_NO_NODE)
	PCI/AER: Write AER Capability only when we control it
	PCI/ERR: Bind RCEC devices to the Root Port driver
	PCI/ERR: Rename reset_link() to reset_subordinates()
	PCI/ERR: Simplify by using pci_upstream_bridge()
	PCI/ERR: Simplify by computing pci_pcie_type() once
	PCI/ERR: Use "bridge" for clarity in pcie_do_recovery()
	PCI/ERR: Avoid negated conditional for clarity
	PCI/ERR: Add pci_walk_bridge() to pcie_do_recovery()
	PCI/ERR: Recover from RCEC AER errors
	PCI/AER: Iterate over error counters instead of error strings
	serial: 8250: Dissociate 4MHz Titan ports from Oxford ports
	serial: 8250: Correct the clock for OxSemi PCIe devices
	serial: 8250_pci: Refactor the loop in pci_ite887x_init()
	serial: 8250_pci: Replace dev_*() by pci_*() macros
	serial: 8250: Fold EndRun device support into OxSemi Tornado code
	dm writecache: set a default MAX_WRITEBACK_JOBS
	kexec, KEYS, s390: Make use of built-in and secondary keyring for signature verification
	dm thin: fix use-after-free crash in dm_sm_register_threshold_callback
	timekeeping: contribute wall clock to rng on time change
	um: Allow PM with suspend-to-idle
	btrfs: reject log replay if there is unsupported RO compat flag
	btrfs: reset block group chunk force if we have to wait
	ACPI: CPPC: Do not prevent CPPC from working in the future
	KVM: VMX: Drop guest CPUID check for VMXE in vmx_set_cr4()
	KVM: VMX: Drop explicit 'nested' check from vmx_set_cr4()
	KVM: SVM: Drop VMXE check from svm_set_cr4()
	KVM: x86: Move vendor CR4 validity check to dedicated kvm_x86_ops hook
	KVM: nVMX: Inject #UD if VMXON is attempted with incompatible CR0/CR4
	KVM: x86/pmu: preserve IA32_PERF_CAPABILITIES across CPUID refresh
	KVM: x86/pmu: Use binary search to check filtered events
	KVM: x86/pmu: Use different raw event masks for AMD and Intel
	KVM: x86/pmu: Introduce the ctrl_mask value for fixed counter
	KVM: VMX: Mark all PERF_GLOBAL_(OVF)_CTRL bits reserved if there's no vPMU
	KVM: x86/pmu: Ignore pmu->global_ctrl check if vPMU doesn't support global_ctrl
	xen-blkback: fix persistent grants negotiation
	xen-blkback: Apply 'feature_persistent' parameter when connect
	xen-blkfront: Apply 'feature_persistent' parameter when connect
	KEYS: asymmetric: enforce SM2 signature use pkey algo
	tpm: eventlog: Fix section mismatch for DEBUG_SECTION_MISMATCH
	tracing: Use a struct alignof to determine trace event field alignment
	ext4: check if directory block is within i_size
	ext4: add EXT4_INODE_HAS_XATTR_SPACE macro in xattr.h
	ext4: fix warning in ext4_iomap_begin as race between bmap and write
	ext4: make sure ext4_append() always allocates new block
	ext4: fix use-after-free in ext4_xattr_set_entry
	ext4: update s_overhead_clusters in the superblock during an on-line resize
	ext4: fix extent status tree race in writeback error recovery path
	ext4: correct max_inline_xattr_value_size computing
	ext4: correct the misjudgment in ext4_iget_extra_inode
	dm raid: fix address sanitizer warning in raid_resume
	dm raid: fix address sanitizer warning in raid_status
	net_sched: cls_route: remove from list when handle is 0
	KVM: Add infrastructure and macro to mark VM as bugged
	KVM: x86: Check lapic_in_kernel() before attempting to set a SynIC irq
	KVM: x86: Avoid theoretical NULL pointer dereference in kvm_irq_delivery_to_apic_fast()
	mac80211: fix a memory leak where sta_info is not freed
	tcp: fix over estimation in sk_forced_mem_schedule()
	Revert "mwifiex: fix sleep in atomic context bugs caused by dev_coredumpv"
	drm/bridge: tc358767: Fix (e)DP bridge endpoint parsing in dedicated function
	drm/vc4: change vc4_dma_range_matches from a global to static
	Revert "net: usb: ax88179_178a needs FLAG_SEND_ZLP"
	Bluetooth: L2CAP: Fix l2cap_global_chan_by_psm regression
	mtd: rawnand: arasan: Prevent an unsupported configuration
	kvm: x86/pmu: Fix the compare function used by the pmu event filter
	tee: add overflow check in register_shm_helper()
	net/9p: Initialize the iounit field during fid creation
	net_sched: cls_route: disallow handle of 0
	sched/fair: Fix fault in reweight_entity
	btrfs: only write the sectors in the vertical stripe which has data stripes
	btrfs: raid56: don't trust any cached sector in __raid56_parity_recover()
	Linux 5.10.137

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I5775ddfad6460c5a737b1ad3f8e0b8f798338786
2022-08-29 16:53:14 +02:00
Dave Chinner
f5f3e54f81 mm: Add kvrealloc()
commit de2860f4636256836450c6543be744a50118fc66 upstream.

During log recovery of an XFS filesystem with 64kB directory
buffers, rebuilding a buffer split across two log records results
in a memory allocation warning from krealloc like this:

xfs filesystem being mounted at /mnt/scratch supports timestamps until 2038 (0x7fffffff)
XFS (dm-0): Unmounting Filesystem
XFS (dm-0): Mounting V5 Filesystem
XFS (dm-0): Starting recovery (logdev: internal)
------------[ cut here ]------------
WARNING: CPU: 5 PID: 3435170 at mm/page_alloc.c:3539 get_page_from_freelist+0xdee/0xe40
.....
RIP: 0010:get_page_from_freelist+0xdee/0xe40
Call Trace:
 ? complete+0x3f/0x50
 __alloc_pages+0x16f/0x300
 alloc_pages+0x87/0x110
 kmalloc_order+0x2c/0x90
 kmalloc_order_trace+0x1d/0x90
 __kmalloc_track_caller+0x215/0x270
 ? xlog_recover_add_to_cont_trans+0x63/0x1f0
 krealloc+0x54/0xb0
 xlog_recover_add_to_cont_trans+0x63/0x1f0
 xlog_recovery_process_trans+0xc1/0xd0
 xlog_recover_process_ophdr+0x86/0x130
 xlog_recover_process_data+0x9f/0x160
 xlog_recover_process+0xa2/0x120
 xlog_do_recovery_pass+0x40b/0x7d0
 ? __irq_work_queue_local+0x4f/0x60
 ? irq_work_queue+0x3a/0x50
 xlog_do_log_recovery+0x70/0x150
 xlog_do_recover+0x38/0x1d0
 xlog_recover+0xd8/0x170
 xfs_log_mount+0x181/0x300
 xfs_mountfs+0x4a1/0x9b0
 xfs_fs_fill_super+0x3c0/0x7b0
 get_tree_bdev+0x171/0x270
 ? suffix_kstrtoint.constprop.0+0xf0/0xf0
 xfs_fs_get_tree+0x15/0x20
 vfs_get_tree+0x24/0xc0
 path_mount+0x2f5/0xaf0
 __x64_sys_mount+0x108/0x140
 do_syscall_64+0x3a/0x70
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Essentially, we are taking a multi-order allocation from kmem_alloc()
(which has an open coded no fail, no warn loop) and then
reallocating it out to 64kB using krealloc(__GFP_NOFAIL) and that is
then triggering the above warning.

This is a regression caused by converting this code from an open
coded no fail/no warn reallocation loop to using __GFP_NOFAIL.

What we actually need here is kvrealloc(), so that if contiguous
page allocation fails we fall back to vmalloc() and we don't
get nasty warnings happening in XFS.

Fixes: 771915c4f6 ("xfs: remove kmem_realloc()")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-21 15:15:21 +02:00
Greg Kroah-Hartman
f2eb31a498 Merge 5.10.119 into android12-5.10-lts
Changes in 5.10.119
	lockdown: also lock down previous kgdb use
	staging: rtl8723bs: prevent ->Ssid overflow in rtw_wx_set_scan()
	KVM: x86: Properly handle APF vs disabled LAPIC situation
	KVM: x86/mmu: fix NULL pointer dereference on guest INVPCID
	tcp: change source port randomizarion at connect() time
	secure_seq: use the 64 bits of the siphash for port offset calculation
	media: vim2m: Register video device after setting up internals
	media: vim2m: initialize the media device earlier
	ACPI: sysfs: Make sparse happy about address space in use
	ACPI: sysfs: Fix BERT error region memory mapping
	random: avoid arch_get_random_seed_long() when collecting IRQ randomness
	random: remove dead code left over from blocking pool
	MAINTAINERS: co-maintain random.c
	MAINTAINERS: add git tree for random.c
	crypto: lib/blake2s - Move selftest prototype into header file
	crypto: blake2s - define shash_alg structs using macros
	crypto: x86/blake2s - define shash_alg structs using macros
	crypto: blake2s - remove unneeded includes
	crypto: blake2s - move update and final logic to internal/blake2s.h
	crypto: blake2s - share the "shash" API boilerplate code
	crypto: blake2s - optimize blake2s initialization
	crypto: blake2s - add comment for blake2s_state fields
	crypto: blake2s - adjust include guard naming
	crypto: blake2s - include <linux/bug.h> instead of <asm/bug.h>
	lib/crypto: blake2s: include as built-in
	lib/crypto: blake2s: move hmac construction into wireguard
	lib/crypto: sha1: re-roll loops to reduce code size
	lib/crypto: blake2s: avoid indirect calls to compression function for Clang CFI
	random: document add_hwgenerator_randomness() with other input functions
	random: remove unused irq_flags argument from add_interrupt_randomness()
	random: use BLAKE2s instead of SHA1 in extraction
	random: do not sign extend bytes for rotation when mixing
	random: do not re-init if crng_reseed completes before primary init
	random: mix bootloader randomness into pool
	random: harmonize "crng init done" messages
	random: use IS_ENABLED(CONFIG_NUMA) instead of ifdefs
	random: early initialization of ChaCha constants
	random: avoid superfluous call to RDRAND in CRNG extraction
	random: don't reset crng_init_cnt on urandom_read()
	random: fix typo in comments
	random: cleanup poolinfo abstraction
	random: cleanup integer types
	random: remove incomplete last_data logic
	random: remove unused extract_entropy() reserved argument
	random: rather than entropy_store abstraction, use global
	random: remove unused OUTPUT_POOL constants
	random: de-duplicate INPUT_POOL constants
	random: prepend remaining pool constants with POOL_
	random: cleanup fractional entropy shift constants
	random: access input_pool_data directly rather than through pointer
	random: selectively clang-format where it makes sense
	random: simplify arithmetic function flow in account()
	random: continually use hwgenerator randomness
	random: access primary_pool directly rather than through pointer
	random: only call crng_finalize_init() for primary_crng
	random: use computational hash for entropy extraction
	random: simplify entropy debiting
	random: use linear min-entropy accumulation crediting
	random: always wake up entropy writers after extraction
	random: make credit_entropy_bits() always safe
	random: remove use_input_pool parameter from crng_reseed()
	random: remove batched entropy locking
	random: fix locking in crng_fast_load()
	random: use RDSEED instead of RDRAND in entropy extraction
	random: get rid of secondary crngs
	random: inline leaves of rand_initialize()
	random: ensure early RDSEED goes through mixer on init
	random: do not xor RDRAND when writing into /dev/random
	random: absorb fast pool into input pool after fast load
	random: use simpler fast key erasure flow on per-cpu keys
	random: use hash function for crng_slow_load()
	random: make more consistent use of integer types
	random: remove outdated INT_MAX >> 6 check in urandom_read()
	random: zero buffer after reading entropy from userspace
	random: fix locking for crng_init in crng_reseed()
	random: tie batched entropy generation to base_crng generation
	random: remove ifdef'd out interrupt bench
	random: remove unused tracepoints
	random: add proper SPDX header
	random: deobfuscate irq u32/u64 contributions
	random: introduce drain_entropy() helper to declutter crng_reseed()
	random: remove useless header comment
	random: remove whitespace and reorder includes
	random: group initialization wait functions
	random: group crng functions
	random: group entropy extraction functions
	random: group entropy collection functions
	random: group userspace read/write functions
	random: group sysctl functions
	random: rewrite header introductory comment
	random: defer fast pool mixing to worker
	random: do not take pool spinlock at boot
	random: unify early init crng load accounting
	random: check for crng_init == 0 in add_device_randomness()
	random: pull add_hwgenerator_randomness() declaration into random.h
	random: clear fast pool, crng, and batches in cpuhp bring up
	random: round-robin registers as ulong, not u32
	random: only wake up writers after zap if threshold was passed
	random: cleanup UUID handling
	random: unify cycles_t and jiffies usage and types
	random: do crng pre-init loading in worker rather than irq
	random: give sysctl_random_min_urandom_seed a more sensible value
	random: don't let 644 read-only sysctls be written to
	random: replace custom notifier chain with standard one
	random: use SipHash as interrupt entropy accumulator
	random: make consistent usage of crng_ready()
	random: reseed more often immediately after booting
	random: check for signal and try earlier when generating entropy
	random: skip fast_init if hwrng provides large chunk of entropy
	random: treat bootloader trust toggle the same way as cpu trust toggle
	random: re-add removed comment about get_random_{u32,u64} reseeding
	random: mix build-time latent entropy into pool at init
	random: do not split fast init input in add_hwgenerator_randomness()
	random: do not allow user to keep crng key around on stack
	random: check for signal_pending() outside of need_resched() check
	random: check for signals every PAGE_SIZE chunk of /dev/[u]random
	random: allow partial reads if later user copies fail
	random: make random_get_entropy() return an unsigned long
	random: document crng_fast_key_erasure() destination possibility
	random: fix sysctl documentation nits
	init: call time_init() before rand_initialize()
	ia64: define get_cycles macro for arch-override
	s390: define get_cycles macro for arch-override
	parisc: define get_cycles macro for arch-override
	alpha: define get_cycles macro for arch-override
	powerpc: define get_cycles macro for arch-override
	timekeeping: Add raw clock fallback for random_get_entropy()
	m68k: use fallback for random_get_entropy() instead of zero
	riscv: use fallback for random_get_entropy() instead of zero
	mips: use fallback for random_get_entropy() instead of just c0 random
	arm: use fallback for random_get_entropy() instead of zero
	nios2: use fallback for random_get_entropy() instead of zero
	x86/tsc: Use fallback for random_get_entropy() instead of zero
	um: use fallback for random_get_entropy() instead of zero
	sparc: use fallback for random_get_entropy() instead of zero
	xtensa: use fallback for random_get_entropy() instead of zero
	random: insist on random_get_entropy() existing in order to simplify
	random: do not use batches when !crng_ready()
	random: use first 128 bits of input as fast init
	random: do not pretend to handle premature next security model
	random: order timer entropy functions below interrupt functions
	random: do not use input pool from hard IRQs
	random: help compiler out with fast_mix() by using simpler arguments
	siphash: use one source of truth for siphash permutations
	random: use symbolic constants for crng_init states
	random: avoid initializing twice in credit race
	random: move initialization out of reseeding hot path
	random: remove ratelimiting for in-kernel unseeded randomness
	random: use proper jiffies comparison macro
	random: handle latent entropy and command line from random_init()
	random: credit architectural init the exact amount
	random: use static branch for crng_ready()
	random: remove extern from functions in header
	random: use proper return types on get_random_{int,long}_wait()
	random: make consistent use of buf and len
	random: move initialization functions out of hot pages
	random: move randomize_page() into mm where it belongs
	random: unify batched entropy implementations
	random: convert to using fops->read_iter()
	random: convert to using fops->write_iter()
	random: wire up fops->splice_{read,write}_iter()
	random: check for signals after page of pool writes
	ALSA: ctxfi: Add SB046x PCI ID
	Linux 5.10.119

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I65f898474b7704881a3dd528012e7e91b09b3767
2022-07-14 14:31:17 +02:00
Jason A. Donenfeld
552ae8e484 random: move randomize_page() into mm where it belongs
commit 5ad7dd882e45d7fe432c32e896e2aaa0b21746ea upstream.

randomize_page is an mm function. It is documented like one. It contains
the history of one. It has the naming convention of one. It looks
just like another very similar function in mm, randomize_stack_top().
And it has always been maintained and updated by mm people. There is no
need for it to be in random.c. In the "which shape does not look like
the other ones" test, pointing to randomize_page() is correct.

So move randomize_page() into mm/util.c, right next to the similar
randomize_stack_top() function.

This commit contains no actual code changes.

Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-30 09:33:45 +02:00
Greg Kroah-Hartman
0773736e48 This is the 5.10.104 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmInm+kACgkQONu9yGCS
 aT4/lw/6AoFz3oHbzlft4tCUn4UXQIut8gIiJfrXBnIqw6pa5YQPA1E7hgQBxTnG
 v9llnwDRFR7qouXq1qhoXU01vETiRqZ2ClkT/MnvLfRMQcHgtS9B61VgwnpNuNBg
 qStZORcqFi6rQHXySUgs2ObF6NZQ3BRyHY2LYqZPj0YkuHIdQo48WtQ9cy0XWZLV
 WT49LIVxkTZ6D0fs/k10qRv+M4lmeOzCqEZf4591F0sjoVuTFj3F1xMsSbu8W3xZ
 xXxE0hPbN0If3JFhnb3DqdQ20kRNSmrrV1CzYaN09jyP7KHpdIDVT8R1hSau3TFP
 3zd2fBWmOP0FPzOVkNnqetMuKspKH8p3kKW2rkTyHYcGtUFzh54Hm0QRpA3CVB3L
 JZje9HCkxWiBSl1mwypmBGp88kWOe+n3NRUOhX3yqPoT3R2n45coBV+sfSOakkxv
 K8mUw1FFbJTPjgJtMCs57zzxybInnMrAF5/7XA2MgHCr3SVvYQA7+joSPn3CO+0K
 zKO4kTdEmD9jTT+3vMDL4Z3VSmOJMVcxCHBTUrac/OBIiBz+7y9WQSc7a6aRbfdu
 k3wy7HJ98pmjYB6g73MJcXOtTwXuoTqur4QWU5MCgTw6+qglgRQHr5ILSs35ZeAV
 LO1zvAsklOWFMc/3fD8heLmKGBZ0GHcn+0Y7ZqfFKLmqOTnx7tA=
 =W+lo
 -----END PGP SIGNATURE-----

Merge 5.10.104 into android12-5.10-lts

Changes in 5.10.104
	mac80211_hwsim: report NOACK frames in tx_status
	mac80211_hwsim: initialize ieee80211_tx_info at hw_scan_work
	i2c: bcm2835: Avoid clock stretching timeouts
	ASoC: rt5668: do not block workqueue if card is unbound
	ASoC: rt5682: do not block workqueue if card is unbound
	regulator: core: fix false positive in regulator_late_cleanup()
	Input: clear BTN_RIGHT/MIDDLE on buttonpads
	KVM: arm64: vgic: Read HW interrupt pending state from the HW
	tipc: fix a bit overflow in tipc_crypto_key_rcv()
	cifs: fix double free race when mount fails in cifs_get_root()
	selftests/seccomp: Fix seccomp failure by adding missing headers
	dmaengine: shdma: Fix runtime PM imbalance on error
	i2c: cadence: allow COMPILE_TEST
	i2c: qup: allow COMPILE_TEST
	net: usb: cdc_mbim: avoid altsetting toggling for Telit FN990
	usb: gadget: don't release an existing dev->buf
	usb: gadget: clear related members when goto fail
	exfat: reuse exfat_inode_info variable instead of calling EXFAT_I()
	exfat: fix i_blocks for files truncated over 4 GiB
	tracing: Add test for user space strings when filtering on string pointers
	serial: stm32: prevent TDR register overwrite when sending x_char
	ata: pata_hpt37x: fix PCI clock detection
	drm/amdgpu: check vm ready by amdgpu_vm->evicting flag
	tracing: Add ustring operation to filtering string pointers
	ALSA: intel_hdmi: Fix reference to PCM buffer address
	riscv/efi_stub: Fix get_boot_hartid_from_fdt() return value
	riscv: Fix config KASAN && SPARSEMEM && !SPARSE_VMEMMAP
	riscv: Fix config KASAN && DEBUG_VIRTUAL
	ASoC: ops: Shift tested values in snd_soc_put_volsw() by +min
	iommu/amd: Recover from event log overflow
	drm/i915: s/JSP2/ICP2/ PCH
	xen/netfront: destroy queues before real_num_tx_queues is zeroed
	thermal: core: Fix TZ_GET_TRIP NULL pointer dereference
	ntb: intel: fix port config status offset for SPR
	mm: Consider __GFP_NOWARN flag for oversized kvmalloc() calls
	xfrm: fix MTU regression
	netfilter: fix use-after-free in __nf_register_net_hook()
	bpf, sockmap: Do not ignore orig_len parameter
	xfrm: fix the if_id check in changelink
	xfrm: enforce validity of offload input flags
	e1000e: Correct NVM checksum verification flow
	net: fix up skbs delta_truesize in UDP GRO frag_list
	netfilter: nf_queue: don't assume sk is full socket
	netfilter: nf_queue: fix possible use-after-free
	netfilter: nf_queue: handle socket prefetch
	batman-adv: Request iflink once in batadv-on-batadv check
	batman-adv: Request iflink once in batadv_get_real_netdevice
	batman-adv: Don't expect inter-netns unique iflink indices
	net: ipv6: ensure we call ipv6_mc_down() at most once
	net: dcb: flush lingering app table entries for unregistered devices
	net/smc: fix connection leak
	net/smc: fix unexpected SMC_CLC_DECL_ERR_REGRMB error generated by client
	net/smc: fix unexpected SMC_CLC_DECL_ERR_REGRMB error cause by server
	rcu/nocb: Fix missed nocb_timer requeue
	ice: Fix race conditions between virtchnl handling and VF ndo ops
	ice: fix concurrent reset and removal of VFs
	sched/topology: Make sched_init_numa() use a set for the deduplicating sort
	sched/topology: Fix sched_domain_topology_level alloc in sched_init_numa()
	ia64: ensure proper NUMA distance and possible map initialization
	mac80211: fix forwarded mesh frames AC & queue selection
	net: stmmac: fix return value of __setup handler
	mac80211: treat some SAE auth steps as final
	iavf: Fix missing check for running netdev
	net: sxgbe: fix return value of __setup handler
	ibmvnic: register netdev after init of adapter
	net: arcnet: com20020: Fix null-ptr-deref in com20020pci_probe()
	ixgbe: xsk: change !netif_carrier_ok() handling in ixgbe_xmit_zc()
	efivars: Respect "block" flag in efivar_entry_set_safe()
	firmware: arm_scmi: Remove space in MODULE_ALIAS name
	ASoC: cs4265: Fix the duplicated control name
	can: gs_usb: change active_channels's type from atomic_t to u8
	arm64: dts: rockchip: Switch RK3399-Gru DP to SPDIF output
	igc: igc_read_phy_reg_gpy: drop premature return
	ARM: Fix kgdb breakpoint for Thumb2
	ARM: 9182/1: mmu: fix returns from early_param() and __setup() functions
	selftests: mlxsw: tc_police_scale: Make test more robust
	pinctrl: sunxi: Use unique lockdep classes for IRQs
	igc: igc_write_phy_reg_gpy: drop premature return
	ibmvnic: free reset-work-item when flushing
	memfd: fix F_SEAL_WRITE after shmem huge page allocated
	s390/extable: fix exception table sorting
	ARM: dts: switch timer config to common devkit8000 devicetree
	ARM: dts: Use 32KiHz oscillator on devkit8000
	soc: fsl: guts: Revert commit 3c0d64e867
	soc: fsl: guts: Add a missing memory allocation failure check
	soc: fsl: qe: Check of ioremap return value
	ARM: tegra: Move panels to AUX bus
	ibmvnic: complete init_done on transport events
	net: chelsio: cxgb3: check the return value of pci_find_capability()
	iavf: Refactor iavf state machine tracking
	nl80211: Handle nla_memdup failures in handle_nan_filter
	drm/amdgpu: fix suspend/resume hang regression
	net: dcb: disable softirqs in dcbnl_flush_dev()
	Input: elan_i2c - move regulator_[en|dis]able() out of elan_[en|dis]able_power()
	Input: elan_i2c - fix regulator enable count imbalance after suspend/resume
	Input: samsung-keypad - properly state IOMEM dependency
	HID: add mapping for KEY_DICTATE
	HID: add mapping for KEY_ALL_APPLICATIONS
	tracing/histogram: Fix sorting on old "cpu" value
	tracing: Fix return value of __setup handlers
	btrfs: fix lost prealloc extents beyond eof after full fsync
	btrfs: qgroup: fix deadlock between rescan worker and remove qgroup
	btrfs: add missing run of delayed items after unlink during log replay
	Revert "xfrm: xfrm_state_mtu should return at least 1280 for ipv6"
	hamradio: fix macro redefine warning
	Linux 5.10.104

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I24dabeba483a0b0123a4e8c10d1a568b11dfb9c8
2022-03-12 13:57:09 +01:00
Daniel Borkmann
e93f2be33d mm: Consider __GFP_NOWARN flag for oversized kvmalloc() calls
commit 0708a0afe291bdfe1386d74d5ec1f0c27e8b9168 upstream.

syzkaller was recently triggering an oversized kvmalloc() warning via
xdp_umem_create().

The triggered warning was added back in 7661809d493b ("mm: don't allow
oversized kvmalloc() calls"). The rationale for the warning for huge
kvmalloc sizes was as a reaction to a security bug where the size was
more than UINT_MAX but not everything was prepared to handle unsigned
long sizes.

Anyway, the AF_XDP related call trace from this syzkaller report was:

  kvmalloc include/linux/mm.h:806 [inline]
  kvmalloc_array include/linux/mm.h:824 [inline]
  kvcalloc include/linux/mm.h:829 [inline]
  xdp_umem_pin_pages net/xdp/xdp_umem.c:102 [inline]
  xdp_umem_reg net/xdp/xdp_umem.c:219 [inline]
  xdp_umem_create+0x6a5/0xf00 net/xdp/xdp_umem.c:252
  xsk_setsockopt+0x604/0x790 net/xdp/xsk.c:1068
  __sys_setsockopt+0x1fd/0x4e0 net/socket.c:2176
  __do_sys_setsockopt net/socket.c:2187 [inline]
  __se_sys_setsockopt net/socket.c:2184 [inline]
  __x64_sys_setsockopt+0xb5/0x150 net/socket.c:2184
  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
  do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
  entry_SYSCALL_64_after_hwframe+0x44/0xae

Björn mentioned that requests for >2GB allocation can still be valid:

  The structure that is being allocated is the page-pinning accounting.
  AF_XDP has an internal limit of U32_MAX pages, which is *a lot*, but
  still fewer than what memcg allows (PAGE_COUNTER_MAX is a LONG_MAX/
  PAGE_SIZE on 64 bit systems). [...]

  I could just change from U32_MAX to INT_MAX, but as I stated earlier
  that has a hacky feeling to it. [...] From my perspective, the code
  isn't broken, with the memcg limits in consideration. [...]

Linus says:

  [...] Pretty much every time this has come up, the kernel warning has
  shown that yes, the code was broken and there really wasn't a reason
  for doing allocations that big.

  Of course, some people would be perfectly fine with the allocation
  failing, they just don't want the warning. I didn't want __GFP_NOWARN
  to shut it up originally because I wanted people to see all those
  cases, but these days I think we can just say "yeah, people can shut
  it up explicitly by saying 'go ahead and fail this allocation, don't
  warn about it'".

  So enough time has passed that by now I'd certainly be ok with [it].

Thus allow call-sites to silence such userspace triggered splats if the
allocation requests have __GFP_NOWARN. For xdp_umem_pin_pages()'s call
to kvcalloc() this is already the case, so nothing else needed there.

Fixes: 7661809d493b ("mm: don't allow oversized kvmalloc() calls")
Reported-by: syzbot+11421fbbff99b989670e@syzkaller.appspotmail.com
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: syzbot+11421fbbff99b989670e@syzkaller.appspotmail.com
Cc: Björn Töpel <bjorn@kernel.org>
Cc: Magnus Karlsson <magnus.karlsson@intel.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Link: https://lore.kernel.org/bpf/CAJ+HfNhyfsT5cS_U9EC213ducHs9k9zNxX9+abqC0kTrPbQ0gg@mail.gmail.com
Link: https://lore.kernel.org/bpf/20211201202905.b9892171e3f5b9a60f9da251@linux-foundation.org
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Ackd-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-08 19:09:32 +01:00
Greg Kroah-Hartman
c23269dad5 This is the 5.10.71 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmFdqxMACgkQONu9yGCS
 aT4n8BAAt6WBtGY6OmnqqVDriJQxYPmF5oL+rpREdBRks97sinOCI4sAQS6NRb1T
 J8GUzwv1A2KbDOW+iky+XUhYV6wF6RFaiUnYbEAz0hbg+FEbJYBLcO98naJpReTr
 GnyjVEyMQ/NO/xDuJlguI3+6UHl6LPXmqoYR2XD77cwQiXEZW588VtbhtYoK4M8k
 r/Fh0bIbhS5CkWF7TYnzUD3ceSwHWq7N4yGK86s+yrkaeMJ0BsKeisOe4PW5JI3f
 iiqB4FJMbnNe412SdmYoPKfDcNWQbirJ4UnS1hdVslZMCyPktMiI2sRiVr1Euz45
 zh221ObMIqyFK4attV809C2dtyqdI2Zt3maMCwtJWgOJOrpdeUpjyQ91cZ0WJcW0
 2d0ZW0AqpkMpERFsHtcZNtkCBzLNcIgPu+yYJRlimG/Sh95VQWtMbtFsS0W5ZI5D
 F+2PC8cluXwGFLgHvxfkpas/KXVhv2w3m9x0xEgaWxZis31lKzQ4vRVzLewNqhJ9
 C5S7Qb6qEVjRzY9CzT07AV66+faai2RZp1UtC0Lf+mbh4nW4JN0jDc2uxggZWGMb
 inTxl9LfIFFK0apCt6xvuEDPYvMwySKumeNJK3VMP2F3Py/PuZ4SW5Z/OH09+0/S
 liA2dMFBOp8h/AivWQ7qV7B/qGcpasn5ZRabIkLYiaF6zftpUmo=
 =ZCXg
 -----END PGP SIGNATURE-----

Merge 5.10.71 into android12-5.10-lts

Changes in 5.10.71
	tty: Fix out-of-bound vmalloc access in imageblit
	cpufreq: schedutil: Use kobject release() method to free sugov_tunables
	scsi: qla2xxx: Changes to support kdump kernel for NVMe BFS
	cpufreq: schedutil: Destroy mutex before kobject_put() frees the memory
	usb: cdns3: fix race condition before setting doorbell
	ALSA: hda/realtek: Quirks to enable speaker output for Lenovo Legion 7i 15IMHG05, Yoga 7i 14ITL5/15ITL5, and 13s Gen2 laptops.
	ACPI: NFIT: Use fallback node id when numa info in NFIT table is incorrect
	fs-verity: fix signed integer overflow with i_size near S64_MAX
	hwmon: (tmp421) handle I2C errors
	hwmon: (w83793) Fix NULL pointer dereference by removing unnecessary structure field
	hwmon: (w83792d) Fix NULL pointer dereference by removing unnecessary structure field
	hwmon: (w83791d) Fix NULL pointer dereference by removing unnecessary structure field
	gpio: pca953x: do not ignore i2c errors
	scsi: ufs: Fix illegal offset in UPIU event trace
	mac80211: fix use-after-free in CCMP/GCMP RX
	x86/kvmclock: Move this_cpu_pvti into kvmclock.h
	KVM: x86: Fix stack-out-of-bounds memory access from ioapic_write_indirect()
	KVM: x86: nSVM: don't copy virt_ext from vmcb12
	KVM: nVMX: Filter out all unsupported controls when eVMCS was activated
	KVM: rseq: Update rseq when processing NOTIFY_RESUME on xfer to KVM guest
	media: ir_toy: prevent device from hanging during transmit
	RDMA/cma: Do not change route.addr.src_addr.ss_family
	drm/amd/display: Pass PCI deviceid into DC
	drm/amdgpu: correct initial cp_hqd_quantum for gfx9
	ipvs: check that ip_vs_conn_tab_bits is between 8 and 20
	bpf: Handle return value of BPF_PROG_TYPE_STRUCT_OPS prog
	IB/cma: Do not send IGMP leaves for sendonly Multicast groups
	RDMA/cma: Fix listener leak in rdma_cma_listen_on_all() failure
	bpf, mips: Validate conditional branch offsets
	hwmon: (mlxreg-fan) Return non-zero value when fan current state is enforced from sysfs
	mac80211: Fix ieee80211_amsdu_aggregate frag_tail bug
	mac80211: limit injected vht mcs/nss in ieee80211_parse_tx_radiotap
	mac80211: mesh: fix potentially unaligned access
	mac80211-hwsim: fix late beacon hrtimer handling
	sctp: break out if skb_header_pointer returns NULL in sctp_rcv_ootb
	mptcp: don't return sockets in foreign netns
	hwmon: (tmp421) report /PVLD condition as fault
	hwmon: (tmp421) fix rounding for negative values
	net: enetc: fix the incorrect clearing of IF_MODE bits
	net: ipv4: Fix rtnexthop len when RTA_FLOW is present
	smsc95xx: fix stalled rx after link change
	drm/i915/request: fix early tracepoints
	dsa: mv88e6xxx: 6161: Use chip wide MAX MTU
	dsa: mv88e6xxx: Fix MTU definition
	dsa: mv88e6xxx: Include tagger overhead when setting MTU for DSA and CPU ports
	e100: fix length calculation in e100_get_regs_len
	e100: fix buffer overrun in e100_get_regs
	RDMA/hns: Fix inaccurate prints
	bpf: Exempt CAP_BPF from checks against bpf_jit_limit
	selftests, bpf: Fix makefile dependencies on libbpf
	selftests, bpf: test_lwt_ip_encap: Really disable rp_filter
	net: ks8851: fix link error
	Revert "block, bfq: honor already-setup queue merges"
	scsi: csiostor: Add module softdep on cxgb4
	ixgbe: Fix NULL pointer dereference in ixgbe_xdp_setup
	net: hns3: do not allow call hns3_nic_net_open repeatedly
	net: hns3: keep MAC pause mode when multiple TCs are enabled
	net: hns3: fix mixed flag HCLGE_FLAG_MQPRIO_ENABLE and HCLGE_FLAG_DCB_ENABLE
	net: hns3: fix show wrong state when add existing uc mac address
	net: hns3: fix prototype warning
	net: hns3: reconstruct function hns3_self_test
	net: hns3: fix always enable rx vlan filter problem after selftest
	net: phy: bcm7xxx: Fixed indirect MMD operations
	net: sched: flower: protect fl_walk() with rcu
	af_unix: fix races in sk_peer_pid and sk_peer_cred accesses
	perf/x86/intel: Update event constraints for ICX
	hwmon: (pmbus/mp2975) Add missed POUT attribute for page 1 mp2975 controller
	nvme: add command id quirk for apple controllers
	elf: don't use MAP_FIXED_NOREPLACE for elf interpreter mappings
	debugfs: debugfs_create_file_size(): use IS_ERR to check for error
	ipack: ipoctal: fix stack information leak
	ipack: ipoctal: fix tty registration race
	ipack: ipoctal: fix tty-registration error handling
	ipack: ipoctal: fix missing allocation-failure check
	ipack: ipoctal: fix module reference leak
	ext4: fix loff_t overflow in ext4_max_bitmap_size()
	ext4: limit the number of blocks in one ADD_RANGE TLV
	ext4: fix reserved space counter leakage
	ext4: add error checking to ext4_ext_replay_set_iblocks()
	ext4: fix potential infinite loop in ext4_dx_readdir()
	HID: u2fzero: ignore incomplete packets without data
	net: udp: annotate data race around udp_sk(sk)->corkflag
	ASoC: dapm: use component prefix when checking widget names
	usb: hso: remove the bailout parameter
	crypto: ccp - fix resource leaks in ccp_run_aes_gcm_cmd()
	HID: betop: fix slab-out-of-bounds Write in betop_probe
	netfilter: ipset: Fix oversized kvmalloc() calls
	mm: don't allow oversized kvmalloc() calls
	HID: usbhid: free raw_report buffers in usbhid_stop
	KVM: x86: Handle SRCU initialization failure during page track init
	netfilter: conntrack: serialize hash resizes and cleanups
	netfilter: nf_tables: Fix oversized kvmalloc() calls
	Linux 5.10.71

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I238c3de739c3d4ba0a04a484460356161899f222
2021-10-06 17:33:06 +02:00
Linus Torvalds
57a269a1b1 mm: don't allow oversized kvmalloc() calls
commit 7661809d493b426e979f39ab512e3adf41fbcc69 upstream.

'kvmalloc()' is a convenience function for people who want to do a
kmalloc() but fall back on vmalloc() if there aren't enough physically
contiguous pages, or if the allocation is larger than what kmalloc()
supports.

However, let's make sure it doesn't get _too_ easy to do crazy things
with it.  In particular, don't allow big allocations that could be due
to integer overflow or underflow.  So make sure the allocation size fits
in an 'int', to protect against trivial integer conversion issues.

Acked-by: Willy Tarreau <w@1wt.eu>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-06 15:56:03 +02:00
Greg Kroah-Hartman
d69751309b This is the 5.10.70 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmFVcUcACgkQONu9yGCS
 aT4/Mw//b3IUn6Vy0r8Jc6MsU16U+UY0Rb6o8X6J5V7PXMI2RuHIf6+AXm4CDLPZ
 jpsgaPB3nSYUz63+b699kB6IZiUTbij8r0O/Yjy1p2/Z6HoDgSOX8WvU25kTO697
 MWxZT25Nj8sZzigPuXw1zy1ioZCdeGlRGXrDAoeZt8OL8TMd78eSLISYNQYv38L6
 Sg3TbtumEwjfZe3FeyzPA82Qc1jlsZ2ViKJ+E/BC74TJ9DBS5K+uMUzDwDyJEIaB
 MwswdjvQIbK5cN+uux6Ok3v4/6/bIKeouYkpLnQvnNtIrn8hk8FXO6OamU6XwTGl
 oI26Hu5mjL2WecHvpQJCcn6h8L0w/dMfQPg2b/m1gJ5l58NJobFS3Uy1bMaGlJic
 L1K2ZFPHQd+CR9Lvz/umiXqaBgL2K4QKKi28TrWxMgKatrMeip3Lo8krxNuxm0/Z
 VpJIsOajWkgf3n5HuQ/zfFGl+YUcjtBUqxO+WR3ocTLlN3kcG6ZjEMxHPK8VYmIr
 Yp4s+WyU7uRlGhSy6UpWI78AHcijx5WKS5n25ZI56VJRi38Qxgb3Q+EZ6vlpJuvh
 yTCgvjwi4FzLWXeYRR/RXpwzvwS8t5TKJT355ufjqZaAtQk/vE27deFdQs6B7Hqy
 17KvN8UjycbWKUXX/zM1CcU6ikXgj/h+q3+kAe99kldpEphjpMs=
 =vyz1
 -----END PGP SIGNATURE-----

Merge 5.10.70 into android12-5.10-lts

Changes in 5.10.70
	PCI: aardvark: Increase polling delay to 1.5s while waiting for PIO response
	ocfs2: drop acl cache for directories too
	mm: fix uninitialized use in overcommit_policy_handler
	usb: gadget: r8a66597: fix a loop in set_feature()
	usb: dwc2: gadget: Fix ISOC flow for BDMA and Slave
	usb: dwc2: gadget: Fix ISOC transfer complete handling for DDMA
	usb: musb: tusb6010: uninitialized data in tusb_fifo_write_unaligned()
	cifs: fix incorrect check for null pointer in header_assemble
	xen/x86: fix PV trap handling on secondary processors
	usb-storage: Add quirk for ScanLogic SL11R-IDE older than 2.6c
	USB: serial: cp210x: add ID for GW Instek GDM-834x Digital Multimeter
	USB: cdc-acm: fix minor-number release
	Revert "USB: bcma: Add a check for devm_gpiod_get"
	binder: make sure fd closes complete
	staging: greybus: uart: fix tty use after free
	Re-enable UAS for LaCie Rugged USB3-FW with fk quirk
	usb: dwc3: core: balance phy init and exit
	usb: core: hcd: Add support for deferring roothub registration
	USB: serial: mos7840: remove duplicated 0xac24 device ID
	USB: serial: option: add Telit LN920 compositions
	USB: serial: option: remove duplicate USB device ID
	USB: serial: option: add device id for Foxconn T99W265
	mcb: fix error handling in mcb_alloc_bus()
	erofs: fix up erofs_lookup tracepoint
	btrfs: prevent __btrfs_dump_space_info() to underflow its free space
	xhci: Set HCD flag to defer primary roothub registration
	serial: 8250: 8250_omap: Fix RX_LVL register offset
	serial: mvebu-uart: fix driver's tx_empty callback
	scsi: sd_zbc: Ensure buffer size is aligned to SECTOR_SIZE
	drm/amd/pm: Update intermediate power state for SI
	net: hso: fix muxed tty registration
	comedi: Fix memory leak in compat_insnlist()
	afs: Fix incorrect triggering of sillyrename on 3rd-party invalidation
	afs: Fix updating of i_blocks on file/dir extension
	platform/x86/intel: punit_ipc: Drop wrong use of ACPI_PTR()
	enetc: Fix illegal access when reading affinity_hint
	enetc: Fix uninitialized struct dim_sample field usage
	bnxt_en: Fix TX timeout when TX ring size is set to the smallest
	net: hns3: fix change RSS 'hfunc' ineffective issue
	net: hns3: check queue id range before using
	net/smc: add missing error check in smc_clc_prfx_set()
	net/smc: fix 'workqueue leaked lock' in smc_conn_abort_work
	net: dsa: don't allocate the slave_mii_bus using devres
	net: dsa: realtek: register the MDIO bus under devres
	kselftest/arm64: signal: Add SVE to the set of features we can check for
	kselftest/arm64: signal: Skip tests if required features are missing
	s390/qeth: fix NULL deref in qeth_clear_working_pool_list()
	gpio: uniphier: Fix void functions to remove return value
	qed: rdma - don't wait for resources under hw error recovery flow
	net/mlx4_en: Don't allow aRFS for encapsulated packets
	atlantic: Fix issue in the pm resume flow.
	scsi: iscsi: Adjust iface sysfs attr detection
	scsi: target: Fix the pgr/alua_support_store functions
	tty: synclink_gt, drop unneeded forward declarations
	tty: synclink_gt: rename a conflicting function name
	fpga: machxo2-spi: Return an error on failure
	fpga: machxo2-spi: Fix missing error code in machxo2_write_complete()
	nvme-tcp: fix incorrect h2cdata pdu offset accounting
	treewide: Change list_sort to use const pointers
	nvme: keep ctrl->namespaces ordered
	thermal/core: Potential buffer overflow in thermal_build_list_of_policies()
	cifs: fix a sign extension bug
	scsi: qla2xxx: Restore initiator in dual mode
	scsi: lpfc: Use correct scnprintf() limit
	irqchip/goldfish-pic: Select GENERIC_IRQ_CHIP to fix build
	irqchip/gic-v3-its: Fix potential VPE leak on error
	md: fix a lock order reversal in md_alloc
	x86/asm: Add a missing __iomem annotation in enqcmds()
	x86/asm: Fix SETZ size enqcmds() build failure
	io_uring: put provided buffer meta data under memcg accounting
	blktrace: Fix uaf in blk_trace access after removing by sysfs
	net: phylink: Update SFP selected interface on advertising changes
	net: macb: fix use after free on rmmod
	net: stmmac: allow CSR clock of 300MHz
	blk-mq: avoid to iterate over stale request
	m68k: Double cast io functions to unsigned long
	ipv6: delay fib6_sernum increase in fib6_add
	cpufreq: intel_pstate: Override parameters if HWP forced by BIOS
	bpf: Add oversize check before call kvcalloc()
	xen/balloon: use a kernel thread instead a workqueue
	nvme-multipath: fix ANA state updates when a namespace is not present
	nvme-rdma: destroy cm id before destroy qp to avoid use after free
	sparc32: page align size in arch_dma_alloc
	amd/display: downgrade validation failure log level
	block: check if a profile is actually registered in blk_integrity_unregister
	block: flush the integrity workqueue in blk_integrity_unregister
	blk-cgroup: fix UAF by grabbing blkcg lock before destroying blkg pd
	compiler.h: Introduce absolute_pointer macro
	net: i825xx: Use absolute_pointer for memcpy from fixed memory location
	sparc: avoid stringop-overread errors
	qnx4: avoid stringop-overread errors
	parisc: Use absolute_pointer() to define PAGE0
	arm64: Mark __stack_chk_guard as __ro_after_init
	alpha: Declare virt_to_phys and virt_to_bus parameter as pointer to volatile
	net: 6pack: Fix tx timeout and slot time
	spi: Fix tegra20 build with CONFIG_PM=n
	EDAC/synopsys: Fix wrong value type assignment for edac_mode
	EDAC/dmc520: Assign the proper type to dimm->edac_mode
	thermal/drivers/int340x: Do not set a wrong tcc offset on resume
	USB: serial: cp210x: fix dropped characters with CP2102
	xen/balloon: fix balloon kthread freezing
	qnx4: work around gcc false positive warning bug
	Linux 5.10.70

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I0be3ab08ab5dd724a79c5c5ff8e49c18d2666193
2021-10-01 11:20:43 +02:00
Chen Jun
838297222b mm: fix uninitialized use in overcommit_policy_handler
commit bcbda81020c3ee77e2c098cadf3e84f99ca3de17 upstream.

We get an unexpected value of /proc/sys/vm/overcommit_memory after
running the following program:

  int main()
  {
      int fd = open("/proc/sys/vm/overcommit_memory", O_RDWR);
      write(fd, "1", 1);
      write(fd, "2", 1);
      close(fd);
  }

write(fd, "2", 1) will pass *ppos = 1 to proc_dointvec_minmax.
proc_dointvec_minmax will return 0 without setting new_policy.

  t.data = &new_policy;
  ret = proc_dointvec_minmax(&t, write, buffer, lenp, ppos)
      -->do_proc_dointvec
         -->__do_proc_dointvec
              if (write) {
                if (proc_first_pos_non_zero_ignore(ppos, table))
                  goto out;

  sysctl_overcommit_memory = new_policy;

so sysctl_overcommit_memory will be set to an uninitialized value.

Check whether new_policy has been changed by proc_dointvec_minmax.

Link: https://lkml.kernel.org/r/20210923020524.13289-1-chenjun102@huawei.com
Fixes: 56f3547bfa ("mm: adjust vm_committed_as_batch according to vm overcommit policy")
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Feng Tang <feng.tang@intel.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Rui Xiang <rui.xiang@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-30 10:10:58 +02:00
xieliujie
d80c70d7a8 ANDROID: android: export kernel function arch_mmap_rnd
Vendor module needs arch_mmap_rnd() to generate new mm->mmap_base
when defining a custom mmap_layout.
More details in https://buganizer.corp.google.com/issues/191439466

Bug: 191439466
Signed-off-by: xieliujie <xieliujie@oppo.com>
Change-Id: I37644438b4e170732adc62810388450155c178a4
2021-07-09 20:51:14 +00:00
Kuan-Ying Lee
a5543c9cd7 ANDROID: syscall_check: add vendor hook for mmap syscall
Through this vendor hook, we can get the timing to check
current running task for the validation of its credential
and related operations.

Bug: 191291287

Signed-off-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
Change-Id: If20bd8bb8311ad10a374033734fbdc7ef61a7704
2021-07-09 13:48:33 +00:00
Bartosz Golaszewski
295a173023 mm/util.c: update the kerneldoc for kstrdup_const()
Memory allocated with kstrdup_const() must not be passed to regular
krealloc() as it is not aware of the possibility of the chunk residing in
.rodata.  Since there are no potential users of krealloc_const() at the
moment, let's just update the doc to make it explicit.

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20200817173927.23389-1-brgl@bgdev.pl
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:17 -07:00
Catalin Marinas
4d1a8a2dc0 arm64: mte: Tags-aware aware memcmp_pages() implementation
When the Memory Tagging Extension is enabled, two pages are identical
only if both their data and tags are identical.

Make the generic memcmp_pages() a __weak function and add an
arm64-specific implementation which returns non-zero if any of the two
pages contain valid MTE tags (PG_mte_tagged set). There isn't much
benefit in comparing the tags of two pages since these are normally used
for heap allocations and likely to differ anyway.

Co-developed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
2020-09-04 12:46:07 +01:00
Peter Collingbourne
45e55300f1 mm: remove unnecessary wrapper function do_mmap_pgoff()
The current split between do_mmap() and do_mmap_pgoff() was introduced in
commit 1fcfd8db7f ("mm, mpx: add "vm_flags_t vm_flags" arg to
do_mmap_pgoff()") to support MPX.

The wrapper function do_mmap_pgoff() always passed 0 as the value of the
vm_flags argument to do_mmap().  However, MPX support has subsequently
been removed from the kernel and there were no more direct callers of
do_mmap(); all calls were going via do_mmap_pgoff().

Simplify the code by removing do_mmap_pgoff() and changing all callers to
directly call do_mmap(), which now no longer takes a vm_flags argument.

Signed-off-by: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: http://lkml.kernel.org/r/20200727194109.1371462-1-pcc@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-07 11:33:27 -07:00
Feng Tang
56f3547bfa mm: adjust vm_committed_as_batch according to vm overcommit policy
When checking a performance change for will-it-scale scalability mmap test
[1], we found very high lock contention for spinlock of percpu counter
'vm_committed_as':

    94.14%     0.35%  [kernel.kallsyms]         [k] _raw_spin_lock_irqsave
    48.21% _raw_spin_lock_irqsave;percpu_counter_add_batch;__vm_enough_memory;mmap_region;do_mmap;
    45.91% _raw_spin_lock_irqsave;percpu_counter_add_batch;__do_munmap;

Actually this heavy lock contention is not always necessary.  The
'vm_committed_as' needs to be very precise when the strict
OVERCOMMIT_NEVER policy is set, which requires a rather small batch number
for the percpu counter.

So keep 'batch' number unchanged for strict OVERCOMMIT_NEVER policy, and
lift it to 64X for OVERCOMMIT_ALWAYS and OVERCOMMIT_GUESS policies.  Also
add a sysctl handler to adjust it when the policy is reconfigured.

Benchmark with the same testcase in [1] shows 53% improvement on a 8C/16T
desktop, and 2097%(20X) on a 4S/72C/144T server.  We tested with test
platforms in 0day (server, desktop and laptop), and 80%+ platforms shows
improvements with that test.  And whether it shows improvements depends on
if the test mmap size is bigger than the batch number computed.

And if the lift is 16X, 1/3 of the platforms will show improvements,
though it should help the mmap/unmap usage generally, as Michal Hocko
mentioned:

: I believe that there are non-synthetic worklaods which would benefit from
: a larger batch.  E.g.  large in memory databases which do large mmaps
: during startups from multiple threads.

[1] https://lore.kernel.org/lkml/20200305062138.GI5972@shao2-debian/

Signed-off-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Qian Cai <cai@lca.pw>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Tim Chen <tim.c.chen@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: kernel test robot <rong.a.chen@intel.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/r/1589611660-89854-4-git-send-email-feng.tang@intel.com
Link: http://lkml.kernel.org/r/1592725000-73486-4-git-send-email-feng.tang@intel.com
Link: http://lkml.kernel.org/r/1594389708-60781-5-git-send-email-feng.tang@intel.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-07 11:33:26 -07:00
Feng Tang
4e2ee51e82 mm/util.c: make vm_memory_committed() more accurate
percpu_counter_sum_positive() will provide more accurate info.

As with percpu_counter_read_positive(), in worst case the deviation could
be 'batch * nr_cpus', which is totalram_pages/256 for now, and will be
more when the batch gets enlarged.

Its time cost is about 800 nanoseconds on a 2C/4T platform and 2~3
microseconds on a 2S/36C/72T Skylake server in normal case, and in worst
case where vm_committed_as's spinlock is under severe contention, it costs
30~40 microseconds for the 2S/36C/72T Skylake sever, which should be fine
for its only two users: /proc/meminfo and HyperV balloon driver's status
trace per second.

Signed-off-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Michal Hocko <mhocko@suse.com> # for /proc/meminfo
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Qian Cai <cai@lca.pw>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Tim Chen <tim.c.chen@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: kernel test robot <rong.a.chen@intel.com>
Cc: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/r/1592725000-73486-3-git-send-email-feng.tang@intel.com
Link: http://lkml.kernel.org/r/1594389708-60781-3-git-send-email-feng.tang@intel.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-07 11:33:26 -07:00
Michel Lespinasse
c1e8d7c6a7 mmap locking API: convert mmap_sem comments
Convert comments that reference mmap_sem to reference mmap_lock instead.

[akpm@linux-foundation.org: fix up linux-next leftovers]
[akpm@linux-foundation.org: s/lockaphore/lock/, per Vlastimil]
[akpm@linux-foundation.org: more linux-next fixups, per Michel]

Signed-off-by: Michel Lespinasse <walken@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Laurent Dufour <ldufour@linux.ibm.com>
Cc: Liam Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ying Han <yinghan@google.com>
Link: http://lkml.kernel.org/r/20200520052908.204642-13-walken@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09 09:39:14 -07:00
Michel Lespinasse
42fc541404 mmap locking API: add mmap_assert_locked() and mmap_assert_write_locked()
Add new APIs to assert that mmap_sem is held.

Using this instead of rwsem_is_locked and lockdep_assert_held[_write]
makes the assertions more tolerant of future changes to the lock type.

Signed-off-by: Michel Lespinasse <walken@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Laurent Dufour <ldufour@linux.ibm.com>
Cc: Liam Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ying Han <yinghan@google.com>
Link: http://lkml.kernel.org/r/20200520052908.204642-10-walken@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09 09:39:14 -07:00
Michel Lespinasse
d8ed45c5dc mmap locking API: use coccinelle to convert mmap_sem rwsem call sites
This change converts the existing mmap_sem rwsem calls to use the new mmap
locking API instead.

The change is generated using coccinelle with the following rule:

// spatch --sp-file mmap_lock_api.cocci --in-place --include-headers --dir .

@@
expression mm;
@@
(
-init_rwsem
+mmap_init_lock
|
-down_write
+mmap_write_lock
|
-down_write_killable
+mmap_write_lock_killable
|
-down_write_trylock
+mmap_write_trylock
|
-up_write
+mmap_write_unlock
|
-downgrade_write
+mmap_write_downgrade
|
-down_read
+mmap_read_lock
|
-down_read_killable
+mmap_read_lock_killable
|
-down_read_trylock
+mmap_read_trylock
|
-up_read
+mmap_read_unlock
)
-(&mm->mmap_sem)
+(mm)

Signed-off-by: Michel Lespinasse <walken@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Liam Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ying Han <yinghan@google.com>
Link: http://lkml.kernel.org/r/20200520052908.204642-5-walken@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09 09:39:14 -07:00
Waiman Long
d4eaa28378 mm: add kvfree_sensitive() for freeing sensitive data objects
For kvmalloc'ed data object that contains sensitive information like
cryptographic keys, we need to make sure that the buffer is always cleared
before freeing it.  Using memset() alone for buffer clearing may not
provide certainty as the compiler may compile it away.  To be sure, the
special memzero_explicit() has to be used.

This patch introduces a new kvfree_sensitive() for freeing those sensitive
data objects allocated by kvmalloc().  The relevant places where
kvfree_sensitive() can be used are modified to use it.

Fixes: 4f0882491a ("KEYS: Avoid false positive ENOMEM error on key read")
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Joe Perches <joe@perches.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Uladzislau Rezki <urezki@gmail.com>
Link: http://lkml.kernel.org/r/20200407200318.11711-1-longman@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-04 19:06:22 -07:00
Feng Tang
c571686a92 mm/util.c: remove the VM_WARN_ONCE for vm_committed_as underflow check
This check was added by commit 82f71ae4a2 ("mm: catch memory
commitment underflow") in 2014 to have a safety check for issues which
have been fixed.  And there has been few report caught by it, as
described in its commit log:

: This shouldn't happen any more - the previous two patches fixed
: the committed_as underflow issues.

But it was really found by Qian Cai when he used the LTP memory stress
suite to test a RFC patchset, which tries to improve scalability of
per-cpu counter 'vm_committed_as', by chosing a bigger 'batch' number for
loose overcommit policies (OVERCOMMIT_ALWAYS and OVERCOMMIT_GUESS), while
keeping current number for OVERCOMMIT_NEVER.

With that patchset, when system firstly uses a loose policy, the
'vm_committed_as' count could be a big negative value, as its big 'batch'
number allows a big deviation, then when the policy is changed to
OVERCOMMIT_NEVER, the 'batch' will be decreased to a much smaller value,
thus hits this WARN check.

To mitigate this, one proposed solution is to queue work on all online
CPUs to do a local sync for 'vm_committed_as' when changing policy to
OVERCOMMIT_NEVER, plus some global syncing to garante the case won't be
hit.

But this solution is costy and slow, given this check hasn't shown real
trouble or benefit, simply drop it from one hot path of MM.  And perf
stats does show some tiny saving for removing it.

Reported-by: Qian Cai <cai@lca.pw>
Signed-off-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Qian Cai <cai@lca.pw>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Kees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/20200603094804.GB89848@shbuild999.sh.intel.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-04 19:06:20 -07:00
Linus Torvalds
cb8e59cc87 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from David Miller:

 1) Allow setting bluetooth L2CAP modes via socket option, from Luiz
    Augusto von Dentz.

 2) Add GSO partial support to igc, from Sasha Neftin.

 3) Several cleanups and improvements to r8169 from Heiner Kallweit.

 4) Add IF_OPER_TESTING link state and use it when ethtool triggers a
    device self-test. From Andrew Lunn.

 5) Start moving away from custom driver versions, use the globally
    defined kernel version instead, from Leon Romanovsky.

 6) Support GRO vis gro_cells in DSA layer, from Alexander Lobakin.

 7) Allow hard IRQ deferral during NAPI, from Eric Dumazet.

 8) Add sriov and vf support to hinic, from Luo bin.

 9) Support Media Redundancy Protocol (MRP) in the bridging code, from
    Horatiu Vultur.

10) Support netmap in the nft_nat code, from Pablo Neira Ayuso.

11) Allow UDPv6 encapsulation of ESP in the ipsec code, from Sabrina
    Dubroca. Also add ipv6 support for espintcp.

12) Lots of ReST conversions of the networking documentation, from Mauro
    Carvalho Chehab.

13) Support configuration of ethtool rxnfc flows in bcmgenet driver,
    from Doug Berger.

14) Allow to dump cgroup id and filter by it in inet_diag code, from
    Dmitry Yakunin.

15) Add infrastructure to export netlink attribute policies to
    userspace, from Johannes Berg.

16) Several optimizations to sch_fq scheduler, from Eric Dumazet.

17) Fallback to the default qdisc if qdisc init fails because otherwise
    a packet scheduler init failure will make a device inoperative. From
    Jesper Dangaard Brouer.

18) Several RISCV bpf jit optimizations, from Luke Nelson.

19) Correct the return type of the ->ndo_start_xmit() method in several
    drivers, it's netdev_tx_t but many drivers were using
    'int'. From Yunjian Wang.

20) Add an ethtool interface for PHY master/slave config, from Oleksij
    Rempel.

21) Add BPF iterators, from Yonghang Song.

22) Add cable test infrastructure, including ethool interfaces, from
    Andrew Lunn. Marvell PHY driver is the first to support this
    facility.

23) Remove zero-length arrays all over, from Gustavo A. R. Silva.

24) Calculate and maintain an explicit frame size in XDP, from Jesper
    Dangaard Brouer.

25) Add CAP_BPF, from Alexei Starovoitov.

26) Support terse dumps in the packet scheduler, from Vlad Buslov.

27) Support XDP_TX bulking in dpaa2 driver, from Ioana Ciornei.

28) Add devm_register_netdev(), from Bartosz Golaszewski.

29) Minimize qdisc resets, from Cong Wang.

30) Get rid of kernel_getsockopt and kernel_setsockopt in order to
    eliminate set_fs/get_fs calls. From Christoph Hellwig.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2517 commits)
  selftests: net: ip_defrag: ignore EPERM
  net_failover: fixed rollback in net_failover_open()
  Revert "tipc: Fix potential tipc_aead refcnt leak in tipc_crypto_rcv"
  Revert "tipc: Fix potential tipc_node refcnt leak in tipc_rcv"
  vmxnet3: allow rx flow hash ops only when rss is enabled
  hinic: add set_channels ethtool_ops support
  selftests/bpf: Add a default $(CXX) value
  tools/bpf: Don't use $(COMPILE.c)
  bpf, selftests: Use bpf_probe_read_kernel
  s390/bpf: Use bcr 0,%0 as tail call nop filler
  s390/bpf: Maintain 8-byte stack alignment
  selftests/bpf: Fix verifier test
  selftests/bpf: Fix sample_cnt shared between two threads
  bpf, selftests: Adapt cls_redirect to call csum_level helper
  bpf: Add csum_level helper for fixing up csum levels
  bpf: Fix up bpf_skb_adjust_room helper's skb csum setting
  sfc: add missing annotation for efx_ef10_try_update_nic_stats_vf()
  crypto/chtls: IPv6 support for inline TLS
  Crypto/chcr: Fixes a coccinile check error
  Crypto/chcr: Fixes compilations warnings
  ...
2020-06-03 16:27:18 -07:00
Christoph Hellwig
2b9059489c mm: remove __vmalloc_node_flags_caller
Just use __vmalloc_node instead which gets and extra argument.  To be able
to to use __vmalloc_node in all caller make it available outside of
vmalloc and implement it in nommu.c.

[akpm@linux-foundation.org: fix nommu build]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Airlie <airlied@linux.ie>
Cc: Gao Xiang <xiang@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Kelley <mikelley@microsoft.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Sakari Ailus <sakari.ailus@linux.intel.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Link: http://lkml.kernel.org/r/20200414131348.444715-25-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02 10:59:11 -07:00
Christoph Hellwig
32927393dc sysctl: pass kernel pointers to ->proc_handler
Instead of having all the sysctl handlers deal with user pointers, which
is rather hairy in terms of the BPF interaction, copy the input to and
from  userspace in common code.  This also means that the strings are
always NUL-terminated by the common code, making the API a little bit
safer.

As most handler just pass through the data to one of the common handlers
a lot of the changes are mechnical.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-04-27 02:07:40 -04:00
Wei Yang
aba6dfb75f mm/mmap.c: rb_parent is not necessary in __vma_link_list()
Now we use rb_parent to get next, while this is not necessary.

When prev is NULL, this means vma should be the first element in the list.
Then next should be current first one (mm->mmap), no matter whether we
have parent or not.

After removing it, the code shows the beauty of symmetry.

Link: http://lkml.kernel.org/r/20190813032656.16625-1-richardw.yang@linux.intel.com
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-12-01 06:29:19 -08:00
Wei Yang
1b9fc5b24f mm/mmap.c: extract __vma_unlink_list() as counterpart for __vma_link_list()
Just make the code a little easier to read.

Link: http://lkml.kernel.org/r/20191006012636.31521-3-richardw.yang@linux.intel.com
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-12-01 06:29:19 -08:00
Alexandre Ghiti
e7142bf5d2 arm64, mm: make randomization selected by generic topdown mmap layout
This commits selects ARCH_HAS_ELF_RANDOMIZE when an arch uses the generic
topdown mmap layout functions so that this security feature is on by
default.

Note that this commit also removes the possibility for arm64 to have elf
randomization and no MMU: without MMU, the security added by randomization
is worth nothing.

Link: http://lkml.kernel.org/r/20190730055113.23635-6-alex@ghiti.fr
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:11 -07:00
Alexandre Ghiti
67f3977f80 arm64, mm: move generic mmap layout functions to mm
arm64 handles top-down mmap layout in a way that can be easily reused by
other architectures, so make it available in mm.  It then introduces a new
config ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT that can be set by other
architectures to benefit from those functions.  Note that this new config
depends on MMU being enabled, if selected without MMU support, a warning
will be thrown.

Link: http://lkml.kernel.org/r/20190730055113.23635-5-alex@ghiti.fr
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Suggested-by: Christoph Hellwig <hch@infradead.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: James Hogan <jhogan@kernel.org>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:11 -07:00
Alexandre Ghiti
649775be63 mm, fs: move randomize_stack_top from fs to mm
Patch series "Provide generic top-down mmap layout functions", v6.

This series introduces generic functions to make top-down mmap layout
easily accessible to architectures, in particular riscv which was the
initial goal of this series.  The generic implementation was taken from
arm64 and used successively by arm, mips and finally riscv.

Note that in addition the series fixes 2 issues:

- stack randomization was taken into account even if not necessary.

- [1] fixed an issue with mmap base which did not take into account
  randomization but did not report it to arm and mips, so by moving arm64
  into a generic library, this problem is now fixed for both
  architectures.

This work is an effort to factorize architecture functions to avoid code
duplication and oversights as in [1].

[1]: https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1429066.html

This patch (of 14):

This preparatory commit moves this function so that further introduction
of generic topdown mmap layout is contained only in mm/util.c.

Link: http://lkml.kernel.org/r/20190730055113.23635-2-alex@ghiti.fr
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Acked-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@mips.com>
Cc: James Hogan <jhogan@kernel.org>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:11 -07:00
Song Liu
010c164a5f mm: move memcmp_pages() and pages_identical()
Patch series "THP aware uprobe", v13.

This patchset makes uprobe aware of THPs.

Currently, when uprobe is attached to text on THP, the page is split by
FOLL_SPLIT.  As a result, uprobe eliminates the performance benefit of
THP.

This set makes uprobe THP-aware.  Instead of FOLL_SPLIT, we introduces
FOLL_SPLIT_PMD, which only split PMD for uprobe.

After all uprobes within the THP are removed, the PTE-mapped pages are
regrouped as huge PMD.

This set (plus a few THP patches) is also available at

   https://github.com/liu-song-6/linux/tree/uprobe-thp

This patch (of 6):

Move memcmp_pages() to mm/util.c and pages_identical() to mm.h, so that we
can use them in other files.

Link: http://lkml.kernel.org/r/20190815164525.1848545-2-songliubraving@fb.com
Signed-off-by: Song Liu <songliubraving@fb.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox <matthew.wilcox@oracle.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:11 -07:00
Matthew Wilcox (Oracle)
d8c6546b1a mm: introduce compound_nr()
Replace 1 << compound_order(page) with compound_nr(page).  Minor
improvements in readability.

Link: http://lkml.kernel.org/r/20190721104612.19120-4-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:08 -07:00
Daniel Jordan
79eb597cba mm: add account_locked_vm utility function
locked_vm accounting is done roughly the same way in five places, so
unify them in a helper.

Include the helper's caller in the debug print to distinguish between
callsites.

Error codes stay the same, so user-visible behavior does too.  The one
exception is that the -EPERM case in tce_account_locked_vm is removed
because Alexey has never seen it triggered.

[daniel.m.jordan@oracle.com: v3]
  Link: http://lkml.kernel.org/r/20190529205019.20927-1-daniel.m.jordan@oracle.com
[sfr@canb.auug.org.au: fix mm/util.c]
Link: http://lkml.kernel.org/r/20190524175045.26897-1-daniel.m.jordan@oracle.com
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Cc: Alan Tull <atull@kernel.org>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Moritz Fischer <mdf@kernel.org>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Steve Sistare <steven.sistare@oracle.com>
Cc: Wu Hao <hao.wu@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-16 19:23:25 -07:00
Christoph Hellwig
050a9adc64 mm: consolidate the get_user_pages* implementations
Always build mm/gup.c so that we don't have to provide separate nommu
stubs.  Also merge the get_user_pages_fast and __get_user_pages_fast stubs
when HAVE_FAST_GUP into the main implementations, which will never call
the fast path if HAVE_FAST_GUP is not set.

This also ensures the new put_user_pages* helpers are available for nommu,
as those are currently missing, which would create a problem as soon as we
actually grew users for it.

Link: http://lkml.kernel.org/r/20190625143715.1689-13-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Miller <davem@davemloft.net>
Cc: James Hogan <jhogan@kernel.org>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-12 11:05:45 -07:00
Michal Koutný
bc81426f5b prctl_set_mm: downgrade mmap_sem to read lock
The commit a3b609ef9f ("proc read mm's {arg,env}_{start,end} with mmap
semaphore taken.") added synchronization of reading argument/environment
boundaries under mmap_sem.  Later commit 88aa7cc688 ("mm: introduce
arg_lock to protect arg_start|end and env_start|end in mm_struct") avoided
the coarse use of mmap_sem in similar situations.  But there still
remained two places that (mis)use mmap_sem.

get_cmdline should also use arg_lock instead of mmap_sem when it reads the
boundaries.

The second place that should use arg_lock is in prctl_set_mm.  By
protecting the boundaries fields with the arg_lock, we can downgrade
mmap_sem to reader lock (analogous to what we already do in
prctl_set_mm_map).

[akpm@linux-foundation.org: coding style fixes]
Link: http://lkml.kernel.org/r/20190502125203.24014-3-mkoutny@suse.com
Fixes: 88aa7cc688 ("mm: introduce arg_lock to protect arg_start|end and env_start|end in mm_struct")
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
Co-developed-by: Laurent Dufour <ldufour@linux.ibm.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Mateusz Guzik <mguzik@redhat.com>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-01 15:51:31 -07:00
Thomas Gleixner
457c899653 treewide: Add SPDX license identifier for missed files
Add SPDX license identifiers to all files which:

 - Have no license information of any form

 - Have EXPORT_.*_SYMBOL_GPL inside which was used in the
   initial scan/conversion to ignore the file

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21 10:50:45 +02:00
Johannes Weiner
8c7829b04c mm: fix false-positive OVERCOMMIT_GUESS failures
With the default overcommit==guess we occasionally run into mmap
rejections despite plenty of memory that would get dropped under
pressure but just isn't accounted reclaimable. One example of this is
dying cgroups pinned by some page cache. A previous case was auxiliary
path name memory associated with dentries; we have since annotated
those allocations to avoid overcommit failures (see d79f7aa496 ("mm:
treat indirectly reclaimable memory as free in overcommit logic")).

But trying to classify all allocated memory reliably as reclaimable
and unreclaimable is a bit of a fool's errand. There could be a myriad
of dependencies that constantly change with kernel versions.

It becomes even more questionable of an effort when considering how
this estimate of available memory is used: it's not compared to the
system-wide allocated virtual memory in any way. It's not even
compared to the allocating process's address space. It's compared to
the single allocation request at hand!

So we have an elaborate left-hand side of the equation that tries to
assess the exact breathing room the system has available down to a
page - and then compare it to an isolated allocation request with no
additional context. We could fail an allocation of N bytes, but for
two allocations of N/2 bytes we'd do this elaborate dance twice in a
row and then still let N bytes of virtual memory through. This doesn't
make a whole lot of sense.

Let's take a step back and look at the actual goal of the
heuristic. From the documentation:

   Heuristic overcommit handling. Obvious overcommits of address
   space are refused. Used for a typical system. It ensures a
   seriously wild allocation fails while allowing overcommit to
   reduce swap usage.  root is allowed to allocate slightly more
   memory in this mode. This is the default.

If all we want to do is catch clearly bogus allocation requests
irrespective of the general virtual memory situation, the physical
memory counter-part doesn't need to be that complicated, either.

When in GUESS mode, catch wild allocations by comparing their request
size to total amount of ram and swap in the system.

Link: http://lkml.kernel.org/r/20190412191418.26333-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 09:47:50 -07:00
Ira Weiny
73b0140bf0 mm/gup: change GUP fast to use flags rather than a write 'bool'
To facilitate additional options to get_user_pages_fast() change the
singular write parameter to be gup_flags.

This patch does not change any functionality.  New functionality will
follow in subsequent patches.

Some of the get_user_pages_fast() call sites were unchanged because they
already passed FOLL_WRITE or 0 for the write parameter.

NOTE: It was suggested to change the ordering of the get_user_pages_fast()
arguments to ensure that callers were converted.  This breaks the current
GUP call site convention of having the returned pages be the final
parameter.  So the suggestion was rejected.

Link: http://lkml.kernel.org/r/20190328084422.29911-4-ira.weiny@intel.com
Link: http://lkml.kernel.org/r/20190317183438.2057-4-ira.weiny@intel.com
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Mike Marshall <hubcap@omnibond.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Hogan <jhogan@kernel.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 09:47:46 -07:00
Andrew Morton
e91455217d mm/util.c: fix strndup_user() comment
The kerneldoc misdescribes strndup_user()'s return value.

Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Timur Tabi <timur@freescale.com>
Cc: Mihai Caraman <mihai.caraman@freescale.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-05 16:02:31 -10:00
Mike Rapoport
a862f68a8b docs/core-api/mm: fix return value descriptions in mm/
Many kernel-doc comments in mm/ have the return value descriptions
either misformatted or omitted at all which makes kernel-doc script
unhappy:

$ make V=1 htmldocs
...
./mm/util.c:36: info: Scanning doc for kstrdup
./mm/util.c:41: warning: No description found for return value of 'kstrdup'
./mm/util.c:57: info: Scanning doc for kstrdup_const
./mm/util.c:66: warning: No description found for return value of 'kstrdup_const'
./mm/util.c:75: info: Scanning doc for kstrndup
./mm/util.c:83: warning: No description found for return value of 'kstrndup'
...

Fixing the formatting and adding the missing return value descriptions
eliminates ~100 such warnings.

Link: http://lkml.kernel.org/r/1549549644-4903-4-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:20 -08:00
Daniel Vetter
6c8fcc096b mm: don't let userspace spam allocations warnings
memdump_user usually gets fed unchecked userspace input.  Blasting a
full backtrace into dmesg every time is a bit excessive - I'm not sure
on the kernel rule in general, but at least in drm we're trying not to
let unpriviledge userspace spam the logs freely.  Definitely not entire
warning backtraces.

It also means more filtering for our CI, because our testsuite exercises
these corner cases and so hits these a lot.

Link: http://lkml.kernel.org/r/20190220204058.11676-1-daniel.vetter@ffwll.ch
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Jan Stancek <jstancek@redhat.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Bartosz Golaszewski <brgl@bgdev.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-21 09:01:01 -08:00
Jan Stancek
8ab88c7169 mm: page_mapped: don't assume compound page is huge or THP
LTP proc01 testcase has been observed to rarely trigger crashes
on arm64:
    page_mapped+0x78/0xb4
    stable_page_flags+0x27c/0x338
    kpageflags_read+0xfc/0x164
    proc_reg_read+0x7c/0xb8
    __vfs_read+0x58/0x178
    vfs_read+0x90/0x14c
    SyS_read+0x60/0xc0

The issue is that page_mapped() assumes that if compound page is not
huge, then it must be THP.  But if this is 'normal' compound page
(COMPOUND_PAGE_DTOR), then following loop can keep running (for
HPAGE_PMD_NR iterations) until it tries to read from memory that isn't
mapped and triggers a panic:

        for (i = 0; i < hpage_nr_pages(page); i++) {
                if (atomic_read(&page[i]._mapcount) >= 0)
                        return true;
	}

I could replicate this on x86 (v4.20-rc4-98-g60b548237fed) only
with a custom kernel module [1] which:
 - allocates compound page (PAGEC) of order 1
 - allocates 2 normal pages (COPY), which are initialized to 0xff (to
   satisfy _mapcount >= 0)
 - 2 PAGEC page structs are copied to address of first COPY page
 - second page of COPY is marked as not present
 - call to page_mapped(COPY) now triggers fault on access to 2nd COPY
   page at offset 0x30 (_mapcount)

[1] https://github.com/jstancek/reproducers/blob/master/kernel/page_mapped_crash/repro.c

Fix the loop to iterate for "1 << compound_order" pages.

Kirrill said "IIRC, sound subsystem can producuce custom mapped compound
pages".

Link: http://lkml.kernel.org/r/c440d69879e34209feba21e12d236d06bc0a25db.1543577156.git.jstancek@redhat.com
Fixes: e1534ae950 ("mm: differentiate page_mapped() from page_mapcount() for compound pages")
Signed-off-by: Jan Stancek <jstancek@redhat.com>
Debugged-by: Laszlo Ersek <lersek@redhat.com>
Suggested-by: "Kirill A. Shutemov" <kirill@shutemov.name>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-08 17:15:11 -08:00
Arun KS
ca79b0c211 mm: convert totalram_pages and totalhigh_pages variables to atomic
totalram_pages and totalhigh_pages are made static inline function.

Main motivation was that managed_page_count_lock handling was complicating
things.  It was discussed in length here,
https://lore.kernel.org/patchwork/patch/995739/#1181785 So it seemes
better to remove the lock and convert variables to atomic, with preventing
poteintial store-to-read tearing as a bonus.

[akpm@linux-foundation.org: coding style fixes]
Link: http://lkml.kernel.org/r/1542090790-21750-4-git-send-email-arunks@codeaurora.org
Signed-off-by: Arun KS <arunks@codeaurora.org>
Suggested-by: Michal Hocko <mhocko@suse.com>
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-28 12:11:47 -08:00
Andrey Ryabinin
52414d3302 kvfree(): fix misleading comment
vfree() might sleep if called not in interrupt context.  So does kvfree()
too.  Fix misleading kvfree()'s comment about allowed context.

Link: http://lkml.kernel.org/r/20180914130512.10394-1-aryabinin@virtuozzo.com
Fixes: 04b8e94607 ("mm/util.c: improve kvfree() kerneldoc")
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-26 16:26:33 -07:00
Vlastimil Babka
b29940c1ab mm: rename and change semantics of nr_indirectly_reclaimable_bytes
The vmstat counter NR_INDIRECTLY_RECLAIMABLE_BYTES was introduced by
commit eb59254608 ("mm: introduce NR_INDIRECTLY_RECLAIMABLE_BYTES") with
the goal of accounting objects that can be reclaimed, but cannot be
allocated via a SLAB_RECLAIM_ACCOUNT cache.  This is now possible via
kmalloc() with __GFP_RECLAIMABLE flag, and the dcache external names user
is converted.

The counter is however still useful for accounting direct page allocations
(i.e.  not slab) with a shrinker, such as the ION page pool.  So keep it,
and:

- change granularity to pages to be more like other counters; sub-page
  allocations should be able to use kmalloc
- rename the counter to NR_KERNEL_MISC_RECLAIMABLE
- expose the counter again in vmstat as "nr_kernel_misc_reclaimable"; we can
  again remove the check for not printing "hidden" counters

Link: http://lkml.kernel.org/r/20180731090649.16028-5-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-26 16:26:32 -07:00