Commit Graph

901 Commits

Author SHA1 Message Date
Greg Kroah-Hartman
3ca4271578 Reapply "Merge tag 'android14-6.1.75_r00' into android14-6.1"
This reverts commit 6bad1052c2, reapplying the LTS merge that previously
had to be reverted because it was merged too early.

Cc: Todd Kjos <tkjos@google.com>
Change-Id: I31b7d660bd833cf022ac4870f6d01e723fda5182
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2024-04-02 19:49:12 +00:00
Todd Kjos
6bad1052c2 Revert "Merge tag 'android14-6.1.75_r00' into android14-6.1"
This reverts commit 1dbafe61e3.

Reason for revert: Too early. Needs to wait until 2024-03-27

Change-Id: I769b944bd089aa2278659ec87f7ba4ac4e74ee4a
Signed-off-by: Todd Kjos <tkjos@google.com>
2024-03-07 21:18:27 +00:00
Greg Kroah-Hartman
e1b12db2de Merge 6.1.72 into android14-6.1-lts
Changes in 6.1.72
	keys, dns: Fix missing size check of V1 server-list header
	block: Don't invalidate pagecache for invalid falloc modes
	ALSA: hda/realtek: enable SND_PCI_QUIRK for hp pavilion 14-ec1xxx series
	ALSA: hda/realtek: fix mute/micmute LEDs for a HP ZBook
	ALSA: hda/realtek: Fix mute and mic-mute LEDs for HP ProBook 440 G6
	mptcp: prevent tcp diag from closing listener subflows
	Revert "PCI/ASPM: Remove pcie_aspm_pm_state_change()"
	drm/mgag200: Fix gamma lut not initialized for G200ER, G200EV, G200SE
	cifs: cifs_chan_is_iface_active should be called with chan_lock held
	cifs: do not depend on release_iface for maintaining iface_list
	KVM: x86/pmu: fix masking logic for MSR_CORE_PERF_GLOBAL_CTRL
	wifi: iwlwifi: pcie: don't synchronize IRQs from IRQ
	drm/bridge: ti-sn65dsi86: Never store more than msg->size bytes in AUX xfer
	netfilter: use skb_ip_totlen and iph_totlen
	netfilter: nf_tables: set transport offset from mac header for netdev/egress
	nfc: llcp_core: Hold a ref to llcp_local->dev when holding a ref to llcp_local
	octeontx2-af: Fix marking couple of structure as __packed
	drm/i915/dp: Fix passing the correct DPCD_REV for drm_dp_set_phy_test_pattern
	ice: Fix link_down_on_close message
	ice: Shut down VSI with "link-down-on-close" enabled
	i40e: Fix filter input checks to prevent config with invalid values
	igc: Report VLAN EtherType matching back to user
	igc: Check VLAN TCI mask
	igc: Check VLAN EtherType mask
	ASoC: fsl_rpmsg: Fix error handler with pm_runtime_enable
	ASoC: mediatek: mt8186: fix AUD_PAD_TOP register and offset
	mlxbf_gige: fix receive packet race condition
	net: sched: em_text: fix possible memory leak in em_text_destroy()
	r8169: Fix PCI error on system resume
	can: raw: add support for SO_MARK
	net-timestamp: extend SOF_TIMESTAMPING_OPT_ID to HW timestamps
	net: annotate data-races around sk->sk_tsflags
	net: annotate data-races around sk->sk_bind_phc
	net: Implement missing getsockopt(SO_TIMESTAMPING_NEW)
	selftests: bonding: do not set port down when adding to bond
	ARM: sun9i: smp: Fix array-index-out-of-bounds read in sunxi_mc_smp_init
	sfc: fix a double-free bug in efx_probe_filters
	net: bcmgenet: Fix FCS generation for fragmented skbuffs
	netfilter: nft_immediate: drop chain reference counter on error
	net: Save and restore msg_namelen in sock_sendmsg
	i40e: fix use-after-free in i40e_aqc_add_filters()
	ASoC: meson: g12a-toacodec: Validate written enum values
	ASoC: meson: g12a-tohdmitx: Validate written enum values
	ASoC: meson: g12a-toacodec: Fix event generation
	ASoC: meson: g12a-tohdmitx: Fix event generation for S/PDIF mux
	i40e: Restore VF MSI-X state during PCI reset
	igc: Fix hicredit calculation
	net/qla3xxx: fix potential memleak in ql_alloc_buffer_queues
	net/smc: fix invalid link access in dumping SMC-R connections
	octeontx2-af: Always configure NIX TX link credits based on max frame size
	octeontx2-af: Re-enable MAC TX in otx2_stop processing
	asix: Add check for usbnet_get_endpoints
	net: ravb: Wait for operating mode to be applied
	bnxt_en: Remove mis-applied code from bnxt_cfg_ntp_filters()
	net: Implement missing SO_TIMESTAMPING_NEW cmsg support
	selftests: secretmem: floor the memory size to the multiple of page_size
	cpu/SMT: Create topology_smt_thread_allowed()
	cpu/SMT: Make SMT control more robust against enumeration failures
	srcu: Fix callbacks acceleration mishandling
	bpf, x64: Fix tailcall infinite loop
	bpf, x86: Simplify the parsing logic of structure parameters
	bpf, x86: save/restore regs with BPF_DW size
	net: Declare MSG_SPLICE_PAGES internal sendmsg() flag
	udp: Convert udp_sendpage() to use MSG_SPLICE_PAGES
	splice, net: Add a splice_eof op to file-ops and socket-ops
	ipv4, ipv6: Use splice_eof() to flush
	udp: introduce udp->udp_flags
	udp: move udp->no_check6_tx to udp->udp_flags
	udp: move udp->no_check6_rx to udp->udp_flags
	udp: move udp->gro_enabled to udp->udp_flags
	udp: move udp->accept_udp_{l4|fraglist} to udp->udp_flags
	udp: lockless UDP_ENCAP_L2TPINUDP / UDP_GRO
	udp: annotate data-races around udp->encap_type
	wifi: iwlwifi: yoyo: swap cdb and jacket bits values
	arm64: dts: qcom: sdm845: align RPMh regulator nodes with bindings
	arm64: dts: qcom: sdm845: Fix PSCI power domain names
	fbdev: imsttfb: Release framebuffer and dealloc cmap on error path
	fbdev: imsttfb: fix double free in probe()
	bpf: decouple prune and jump points
	bpf: remove unnecessary prune and jump points
	bpf: Remove unused insn_cnt argument from visit_[func_call_]insn()
	bpf: clean up visit_insn()'s instruction processing
	bpf: Support new 32bit offset jmp instruction
	bpf: handle ldimm64 properly in check_cfg()
	bpf: fix precision backtracking instruction iteration
	blk-mq: make sure active queue usage is held for bio_integrity_prep()
	net/mlx5: Increase size of irq name buffer
	s390/mm: add missing arch_set_page_dat() call to vmem_crst_alloc()
	s390/cpumf: support user space events for counting
	f2fs: clean up i_compress_flag and i_compress_level usage
	f2fs: convert to use bitmap API
	f2fs: assign default compression level
	f2fs: set the default compress_level on ioctl
	selftests: mptcp: fix fastclose with csum failure
	selftests: mptcp: set FAILING_LINKS in run_tests
	media: camss: sm8250: Virtual channels for CSID
	media: qcom: camss: Fix set CSI2_RX_CFG1_VC_MODE when VC is greater than 3
	ext4: convert move_extent_per_page() to use folios
	khugepage: replace try_to_release_page() with filemap_release_folio()
	memory-failure: convert truncate_error_page() to use folio
	mm: merge folio_has_private()/filemap_release_folio() call pairs
	mm, netfs, fscache: stop read optimisation when folio removed from pagecache
	filemap: add a per-mapping stable writes flag
	block: update the stable_writes flag in bdev_add
	smb: client: fix missing mode bits for SMB symlinks
	net: dpaa2-eth: rearrange variable in dpaa2_eth_get_ethtool_stats
	dpaa2-eth: recycle the RX buffer only after all processing done
	ethtool: don't propagate EOPNOTSUPP from dumps
	bpf, sockmap: af_unix stream sockets need to hold ref for pair sock
	firmware: arm_scmi: Fix frequency truncation by promoting multiplier type
	ALSA: hda/realtek: Add quirk for Lenovo Yoga Pro 7
	genirq/affinity: Remove the 'firstvec' parameter from irq_build_affinity_masks
	genirq/affinity: Pass affinity managed mask array to irq_build_affinity_masks
	genirq/affinity: Don't pass irq_affinity_desc array to irq_build_affinity_masks
	genirq/affinity: Rename irq_build_affinity_masks as group_cpus_evenly
	genirq/affinity: Move group_cpus_evenly() into lib/
	lib/group_cpus.c: avoid acquiring cpu hotplug lock in group_cpus_evenly
	mm/memory_hotplug: add missing mem_hotplug_lock
	mm/memory_hotplug: fix error handling in add_memory_resource()
	net: sched: call tcf_ct_params_free to free params in tcf_ct_init
	netfilter: flowtable: allow unidirectional rules
	netfilter: flowtable: cache info of last offload
	net/sched: act_ct: offload UDP NEW connections
	net/sched: act_ct: Fix promotion of offloaded unreplied tuple
	netfilter: flowtable: GC pushes back packets to classic path
	net/sched: act_ct: Take per-cb reference to tcf_ct_flow_table
	octeontx2-af: Fix pause frame configuration
	octeontx2-af: Support variable number of lmacs
	btrfs: fix qgroup_free_reserved_data int overflow
	btrfs: mark the len field in struct btrfs_ordered_sum as unsigned
	ring-buffer: Fix 32-bit rb_time_read() race with rb_time_cmpxchg()
	firewire: ohci: suppress unexpected system reboot in AMD Ryzen machines and ASM108x/VT630x PCIe cards
	x86/kprobes: fix incorrect return address calculation in kprobe_emulate_call_indirect
	i2c: core: Fix atomic xfer check for non-preempt config
	mm: fix unmap_mapping_range high bits shift bug
	drm/amdgpu: skip gpu_info fw loading on navi12
	drm/amd/display: add nv12 bounding box
	mmc: meson-mx-sdhc: Fix initialization frozen issue
	mmc: rpmb: fixes pause retune on all RPMB partitions.
	mmc: core: Cancel delayed work before releasing host
	mmc: sdhci-sprd: Fix eMMC init failure after hw reset
	genirq/affinity: Only build SMP-only helper functions on SMP kernels
	f2fs: compress: fix to assign compress_level for lz4 correctly
	net/sched: act_ct: additional checks for outdated flows
	net/sched: act_ct: Always fill offloading tuple iifidx
	bpf: Fix a verifier bug due to incorrect branch offset comparison with cpu=v4
	bpf: syzkaller found null ptr deref in unix_bpf proto add
	media: qcom: camss: Comment CSID dt_id field
	smb3: Replace smb2pdu 1-element arrays with flex-arrays
	Revert "interconnect: qcom: sm8250: Enable sync_state"
	Linux 6.1.72

Change-Id: Id00eb2ae1159d4d5fa0ef914e672c5669cbf5b0a
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2024-01-14 13:26:13 +00:00
Greg Kroah-Hartman
8eac30b25e This is the 6.1.71 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmWYD8QACgkQONu9yGCS
 aT5cEA//UKwVnselP3QHU6yEm2j8Vuq5IOEIqIeYTDTyS7TGP83SsyM4n2KRlTwC
 /vaY3HWNsZHLqsNICPOPSdQn9STa7MYTnf/ackBbPglDnDz/A6mSB3zkXtCKFm6+
 UBmk6Y8pZwpdvk3aa6Z62Kr5bGGHdzvXdiJitERLlD2PFUOZT9/IHSncGnts3TQv
 PjFXy1KVIGsThKbtjtYPpa100RAti5HeLv/NbsaVbuKYMME/QCFmqyNRAp9k2iHx
 3nkze70aoREShEDjaLkcsirzwRKJu7qqNriYLt+wd7HmcD328R2UlTR8L3ZM0xOq
 qxBHnzbFtQyGR7NAudi2pStqwctPhFP6vRz1aJvt+w9tmbeKAWQWMd2pNvG8GhJm
 nxYFGyPLzTgPifK5SELCNIW4WXf8rnrRNgZ+Ph/JIGuhp+603//ATHRlVEwHcnl+
 M0GRbL06nWFVvfdKCYuu0autb9sW5T/vq02cbE5vRVVaziazry8S8EmxYQyOg9X/
 CBAd1XTybVZki9VkIP5zbdvWJL3LhFfsabBFy7TPZor/YCJQDvxzw1iwtY/BPVDT
 MryHjrYwH/n5RvibANRcTbCamMQY4IrJ4X3afJGgh7BK5N5C5ug4HYJ7oG5QB++x
 xC4A5x3L6D9SE/St8hFWghjYcd6lFcjlz1wJ5MyLImwYqfr8DnY=
 =Vt0s
 -----END PGP SIGNATURE-----

Merge 6.1.71 into android14-6.1-lts

Changes in 6.1.71
	ksmbd: replace one-element arrays with flexible-array members
	ksmbd: set SMB2_SESSION_FLAG_ENCRYPT_DATA when enforcing data encryption for this share
	ksmbd: use F_SETLK when unlocking a file
	ksmbd: Fix resource leak in smb2_lock()
	ksmbd: Convert to use sysfs_emit()/sysfs_emit_at() APIs
	ksmbd: Implements sess->rpc_handle_list as xarray
	ksmbd: fix typo, syncronous->synchronous
	ksmbd: Remove duplicated codes
	ksmbd: update Kconfig to note Kerberos support and fix indentation
	ksmbd: Fix spelling mistake "excceed" -> "exceeded"
	ksmbd: Fix parameter name and comment mismatch
	ksmbd: remove unused is_char_allowed function
	ksmbd: delete asynchronous work from list
	ksmbd: set NegotiateContextCount once instead of every inc
	ksmbd: avoid duplicate negotiate ctx offset increments
	ksmbd: remove unused compression negotiate ctx packing
	fs: introduce lock_rename_child() helper
	ksmbd: fix racy issue from using ->d_parent and ->d_name
	ksmbd: fix uninitialized pointer read in ksmbd_vfs_rename()
	ksmbd: fix uninitialized pointer read in smb2_create_link()
	ksmbd: call putname after using the last component
	ksmbd: fix posix_acls and acls dereferencing possible ERR_PTR()
	ksmbd: add mnt_want_write to ksmbd vfs functions
	ksmbd: remove unused ksmbd_tree_conn_share function
	ksmbd: use kzalloc() instead of __GFP_ZERO
	ksmbd: return a literal instead of 'err' in ksmbd_vfs_kern_path_locked()
	ksmbd: Change the return value of ksmbd_vfs_query_maximal_access to void
	ksmbd: use kvzalloc instead of kvmalloc
	ksmbd: Replace the ternary conditional operator with min()
	ksmbd: Use struct_size() helper in ksmbd_negotiate_smb_dialect()
	ksmbd: Replace one-element array with flexible-array member
	ksmbd: Fix unsigned expression compared with zero
	ksmbd: check if a mount point is crossed during path lookup
	ksmbd: switch to use kmemdup_nul() helper
	ksmbd: add support for read compound
	ksmbd: fix wrong interim response on compound
	ksmbd: fix `force create mode' and `force directory mode'
	ksmbd: Fix one kernel-doc comment
	ksmbd: add missing calling smb2_set_err_rsp() on error
	ksmbd: remove experimental warning
	ksmbd: remove unneeded mark_inode_dirty in set_info_sec()
	ksmbd: fix passing freed memory 'aux_payload_buf'
	ksmbd: return invalid parameter error response if smb2 request is invalid
	ksmbd: check iov vector index in ksmbd_conn_write()
	ksmbd: fix race condition with fp
	ksmbd: fix race condition from parallel smb2 logoff requests
	ksmbd: fix race condition from parallel smb2 lock requests
	ksmbd: fix race condition between tree conn lookup and disconnect
	ksmbd: fix wrong error response status by using set_smb2_rsp_status()
	ksmbd: fix Null pointer dereferences in ksmbd_update_fstate()
	ksmbd: fix potential double free on smb2_read_pipe() error path
	ksmbd: Remove unused field in ksmbd_user struct
	ksmbd: reorganize ksmbd_iov_pin_rsp()
	ksmbd: fix kernel-doc comment of ksmbd_vfs_setxattr()
	ksmbd: fix recursive locking in vfs helpers
	ksmbd: fix missing RDMA-capable flag for IPoIB device in ksmbd_rdma_capable_netdev()
	ksmbd: add support for surrogate pair conversion
	ksmbd: no need to wait for binded connection termination at logoff
	ksmbd: fix kernel-doc comment of ksmbd_vfs_kern_path_locked()
	ksmbd: prevent memory leak on error return
	ksmbd: fix possible deadlock in smb2_open
	ksmbd: separately allocate ci per dentry
	ksmbd: move oplock handling after unlock parent dir
	ksmbd: release interim response after sending status pending response
	ksmbd: move setting SMB2_FLAGS_ASYNC_COMMAND and AsyncId
	ksmbd: don't update ->op_state as OPLOCK_STATE_NONE on error
	ksmbd: set epoch in create context v2 lease
	ksmbd: set v2 lease capability
	ksmbd: downgrade RWH lease caching state to RH for directory
	ksmbd: send v2 lease break notification for directory
	ksmbd: lazy v2 lease break on smb2_write()
	ksmbd: avoid duplicate opinfo_put() call on error of smb21_lease_break_ack()
	ksmbd: fix wrong allocation size update in smb2_open()
	ARM: dts: Fix occasional boot hang for am3 usb
	usb: fotg210-hcd: delete an incorrect bounds test
	spi: Introduce spi_get_device_match_data() helper
	iio: imu: adis16475: add spi_device_id table
	nfsd: separate nfsd_last_thread() from nfsd_put()
	nfsd: call nfsd_last_thread() before final nfsd_put()
	linux/export: Ensure natural alignment of kcrctab array
	spi: Reintroduce spi_set_cs_timing()
	spi: Add APIs in spi core to set/get spi->chip_select and spi->cs_gpiod
	spi: atmel: Fix clock issue when using devices with different polarities
	block: renumber QUEUE_FLAG_HW_WC
	ksmbd: fix slab-out-of-bounds in smb_strndup_from_utf16()
	platform/x86: p2sb: Allow p2sb_bar() calls during PCI device probe
	mm/filemap: avoid buffered read/write race to read inconsistent data
	mm: migrate high-order folios in swap cache correctly
	mm/memory-failure: cast index to loff_t before shifting it
	mm/memory-failure: check the mapcount of the precise page
	ring-buffer: Fix wake ups when buffer_percent is set to 100
	tracing: Fix blocked reader of snapshot buffer
	ring-buffer: Remove useless update to write_stamp in rb_try_to_discard()
	netfilter: nf_tables: skip set commit for deleted/destroyed sets
	ring-buffer: Fix slowpath of interrupted event
	NFSD: fix possible oops when nfsd/pool_stats is closed.
	spi: Constify spi parameters of chip select APIs
	device property: Allow const parameter to dev_fwnode()
	kallsyms: Make module_kallsyms_on_each_symbol generally available
	tracing/kprobes: Fix symbol counting logic by looking at modules as well
	Revert "platform/x86: p2sb: Allow p2sb_bar() calls during PCI device probe"
	Linux 6.1.71

Change-Id: I7bc16d981b90e8e0b633628438f79fce898ad15a
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2024-01-14 11:21:18 +00:00
Greg Kroah-Hartman
bb47960a9d Merge branch 'android14-6.1' into branch 'android14-6.1-lts'
This merges all of the latest changes in 'android14-6.1' into
'android14-6.1-lts' to get it to pass TH again due to new symbols being
added.  Included in here are the following commits:

* a41a4ee370 ANDROID: Update the ABI symbol list
* 0801d8a89d ANDROID: mm: export dump_tasks symbol.
* 7c91752f5d FROMLIST: scsi: ufs: Remove the ufshcd_hba_exit() call from ufshcd_async_scan()
* 28154afe74 FROMLIST: scsi: ufs: Simplify power management during async scan
* febcf1429f ANDROID: gki_defconfig: Set CONFIG_IDLE_INJECT and CONFIG_CPU_IDLE_THERMAL into y
* bc4d82ee40 ANDROID: KMI workaround for CONFIG_NETFILTER_FAMILY_BRIDGE
* 227b55a7a3 ANDROID: dma-buf: don't re-purpose kobject as work_struct
* c1b1201d39 BACKPORT: FROMLIST: dma-buf: Move sysfs work out of DMA-BUF export path
* 928b3b5dde UPSTREAM: netfilter: nf_tables: skip set commit for deleted/destroyed sets
* 031f804149 ANDROID: KVM: arm64: Avoid BUG-ing from the host abort path
* c5dc4b4b3d ANDROID: Update the ABI symbol list
* 5070b3b594 UPSTREAM: ipv4: igmp: fix refcnt uaf issue when receiving igmp query packet
* 02aa72665c UPSTREAM: nvmet-tcp: Fix a possible UAF in queue intialization setup
* d6554d1262 FROMGIT: usb: dwc3: gadget: Handle EP0 request dequeuing properly
* 29544d4157 ANDROID: ABI: Update symbol list for imx
* 02f444ba07 UPSTREAM: io_uring/fdinfo: lock SQ thread while retrieving thread cpu/pid
* ec46fe0ac7 UPSTREAM: bpf: Fix prog_array_map_poke_run map poke update
* 98b0e4cf09 BACKPORT: xhci: track port suspend state correctly in unsuccessful resume cases
* ac90f08292 ANDROID: Update the ABI symbol list
* ef67750d99 ANDROID: sched: Export symbols for vendor modules
* 934a40576e UPSTREAM: usb: dwc3: core: add support for disabling High-speed park mode
* 8a597e7a2d ANDROID: KVM: arm64: Don't prepopulate MMIO regions for host stage-2
* ed9b660cd1 BACKPORT: FROMGIT fork: use __mt_dup() to duplicate maple tree in dup_mmap()
* 3743b40f65 FROMGIT: maple_tree: preserve the tree attributes when destroying maple tree
* 1bec2dd52e FROMGIT: maple_tree: update check_forking() and bench_forking()
* e57d333531 FROMGIT: maple_tree: skip other tests when BENCH is enabled
* c79ca61edc FROMGIT: maple_tree: update the documentation of maple tree
* 7befa7bbc9 FROMGIT: maple_tree: add test for mtree_dup()
* f73f881af4 FROMGIT: radix tree test suite: align kmem_cache_alloc_bulk() with kernel behavior.
* eb5048ea90 FROMGIT: maple_tree: introduce interfaces __mt_dup() and mtree_dup()
* dc9323545b FROMGIT: maple_tree: introduce {mtree,mas}_lock_nested()
* 4ddcdc519b FROMGIT: maple_tree: add mt_free_one() and mt_attr() helpers
* c52d48818b UPSTREAM: maple_tree: introduce __mas_set_range()
* 066d57de87 ANDROID: GKI: Enable symbols for v4l2 in async and fwnode
* e74417834e ANDROID: Update the ABI symbol list
* 15a93de464 ANDROID: KVM: arm64: Fix hyp event alignment
* 717d1f8f91 ANDROID: KVM: arm64: Fix host_smc print typo
* 8fc25d7862 FROMGIT: f2fs: do not return EFSCORRUPTED, but try to run online repair
* 99288e911a ANDROID: KVM: arm64: Document module_change_host_prot_range
* 4d99e41ce1 FROMGIT: PM / devfreq: Synchronize devfreq_monitor_[start/stop]
* 6c8f710857 FROMGIT: arch/mm/fault: fix major fault accounting when retrying under per-VMA lock
* 4a518d8633 UPSTREAM: mm: handle write faults to RO pages under the VMA lock
* c1da94fa44 UPSTREAM: mm: handle read faults under the VMA lock
* 6541fffd92 UPSTREAM: mm: handle COW faults under the VMA lock
* c7fa581a79 UPSTREAM: mm: handle shared faults under the VMA lock
* 95af8a80bb BACKPORT: mm: call wp_page_copy() under the VMA lock
* b43b26b4cd UPSTREAM: mm: make lock_folio_maybe_drop_mmap() VMA lock aware
* 9c4bc457ab UPSTREAM: mm/memory.c: fix mismerge
* 7d50253c27 ANDROID: Export functions to be used with dma_map_ops in modules
* 37e0a5b868 BACKPORT: FROMGIT: erofs: enable sub-page compressed block support
* f466d52164 FROMGIT: erofs: refine z_erofs_transform_plain() for sub-page block support
* a18efa4e4a FROMGIT: erofs: fix ztailpacking for subpage compressed blocks
* 0c6a18c75b BACKPORT: FROMGIT: erofs: fix up compacted indexes for block size < 4096
* d7bb85f1cb FROMGIT: erofs: record `pclustersize` in bytes instead of pages
* 9d259220ac FROMGIT: erofs: support I/O submission for sub-page compressed blocks
* 8a49ea9441 FROMGIT: erofs: fix lz4 inplace decompression
* bdc5d268ba FROMGIT: erofs: fix memory leak on short-lived bounced pages
* 0d329bbe5c BACKPORT: erofs: tidy up z_erofs_do_read_page()
* dc94c3cc6b UPSTREAM: erofs: move preparation logic into z_erofs_pcluster_begin()
* 7751567a71 BACKPORT: erofs: avoid obsolete {collector,collection} terms
* d0dbf74792 BACKPORT: erofs: simplify z_erofs_read_fragment()
* 4067dd9969 UPSTREAM: erofs: get rid of the remaining kmap_atomic()
* 365ca16da2 UPSTREAM: erofs: simplify z_erofs_transform_plain()
* 187d034575 BACKPORT: erofs: adapt managed inode operations into folios
* 3d93182661 UPSTREAM: erofs: avoid on-stack pagepool directly passed by arguments
* 5c1827383a UPSTREAM: erofs: allocate extra bvec pages directly instead of retrying
* bed20ed1d3 UPSTREAM: erofs: clean up z_erofs_pcluster_readmore()
* 5e861fa97e UPSTREAM: erofs: remove the member readahead from struct z_erofs_decompress_frontend
* 66595bb17c UPSTREAM: erofs: fold in z_erofs_decompress()
* 88a1939504 UPSTREAM: erofs: enable large folios for iomap mode
* 2c085909e7 ANDROID: Update the ABI symbol list
* d16a15fde5 UPSTREAM: USB: gadget: core: adjust uevent timing on gadget unbind
* d3006fb944 ANDROID: ABI: Update oplus symbol list
* bc97d5019a ANDROID: vendor_hooks: Add hooks for rt_mutex steal
* 401a2769d9 UPSTREAM: dm verity: don't perform FEC for failed readahead IO
* 30bca9e278 UPSTREAM: netfilter: nft_set_pipapo: skip inactive elements during set walk
* 44702d8fa1 FROMLIST: mm: migrate high-order folios in swap cache correctly
* 613d8368e3 ANDROID: fuse-bpf: Follow mounts in lookups

Change-Id: I49d28ad030d7840490441ce6a7936b5e1047913e
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2024-01-11 08:06:52 +00:00
David Howells
bceff380f3 mm: merge folio_has_private()/filemap_release_folio() call pairs
[ Upstream commit 0201ebf274a306a6ebb95e5dc2d6a0a27c737cac ]

Patch series "mm, netfs, fscache: Stop read optimisation when folio
removed from pagecache", v7.

This fixes an optimisation in fscache whereby we don't read from the cache
for a particular file until we know that there's data there that we don't
have in the pagecache.  The problem is that I'm no longer using PG_fscache
(aka PG_private_2) to indicate that the page is cached and so I don't get
a notification when a cached page is dropped from the pagecache.

The first patch merges some folio_has_private() and
filemap_release_folio() pairs and introduces a helper,
folio_needs_release(), to indicate if a release is required.

The second patch is the actual fix.  Following Willy's suggestions[1], it
adds an AS_RELEASE_ALWAYS flag to an address_space that will make
filemap_release_folio() always call ->release_folio(), even if
PG_private/PG_private_2 aren't set.  folio_needs_release() is altered to
add a check for this.

This patch (of 2):

Make filemap_release_folio() check folio_has_private().  Then, in most
cases, where a call to folio_has_private() is immediately followed by a
call to filemap_release_folio(), we can get rid of the test in the pair.

There are a couple of sites in mm/vmscan.c where this can't so easily be
done.  In shrink_folio_list(), there are actually three cases (something
different is done for incompletely invalidated buffers), but
filemap_release_folio() elides two of them.

In shrink_active_list(), we don't have the folio lock yet, so the
check allows us to avoid locking the page unnecessarily.

A wrapper function to check if a folio needs release is provided for those
places that still need to do it in the mm/ directory.  This will acquire
additional parts to the condition in a future patch.

After this, the only remaining caller of folio_has_private() outside of
mm/ is a check in fuse.
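
The folding of the check into the release call can be sketched in
userspace as below.  This is only an illustrative model, not the kernel
code: struct model_folio, model_release_hook() and the model_* function
names are hypothetical stand-ins, and returning true when there is nothing
to release is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct folio; only the bit we care about. */
struct model_folio {
	bool has_private;	/* models PG_private or PG_private_2 being set */
};

static int model_release_calls;	/* counts calls into the release hook */

/* Hypothetical ->release_folio() hook that always succeeds. */
static bool model_release_hook(struct model_folio *folio)
{
	(void)folio;
	model_release_calls++;
	return true;
}

/* Wrapper in the spirit of folio_needs_release() as of this patch. */
static bool model_folio_needs_release(const struct model_folio *folio)
{
	return folio->has_private;
}

/*
 * Models filemap_release_folio() doing the check itself.  Returning true
 * when there is nothing to release is an assumption of this sketch.
 */
static bool model_filemap_release_folio(struct model_folio *folio)
{
	if (!model_folio_needs_release(folio))
		return true;
	return model_release_hook(folio);
}
```

In this model a caller no longer needs its own has_private test before
calling model_filemap_release_folio(); the hook is simply never reached
for a folio with no private data.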

Link: https://lkml.kernel.org/r/20230628104852.3391651-1-dhowells@redhat.com
Link: https://lkml.kernel.org/r/20230628104852.3391651-2-dhowells@redhat.com
Reported-by: Rohith Surabattula <rohiths.msft@gmail.com>
Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Steve French <sfrench@samba.org>
Cc: Shyam Prasad N <nspmangalore@gmail.com>
Cc: Rohith Surabattula <rohiths.msft@gmail.com>
Cc: Dave Wysochanski <dwysocha@redhat.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Xiubo Li <xiubli@redhat.com>
Cc: Jingbo Xu <jefflexu@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Stable-dep-of: 1898efcdbed3 ("block: update the stable_writes flag in bdev_add")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:31 +01:00
Baokun Li
a8df791470 mm/filemap: avoid buffered read/write race to read inconsistent data
commit e2c27b803bb664748e090d99042ac128b3f88d92 upstream.

The following concurrency may cause the data read to be inconsistent with
the data on disk:

             cpu1                           cpu2
------------------------------|------------------------------
                               // Buffered write 2048 from 0
                               ext4_buffered_write_iter
                                generic_perform_write
                                 copy_page_from_iter_atomic
                                 ext4_da_write_end
                                  ext4_da_do_write_end
                                   block_write_end
                                    __block_commit_write
                                     folio_mark_uptodate
// Buffered read 4096 from 0          smp_wmb()
ext4_file_read_iter                   set_bit(PG_uptodate, folio_flags)
 generic_file_read_iter            i_size_write // 2048
  filemap_read                     unlock_page(page)
   filemap_get_pages
    filemap_get_read_batch
    folio_test_uptodate(folio)
     ret = test_bit(PG_uptodate, folio_flags)
     if (ret)
      smp_rmb();
      // Ensure that the data in page 0-2048 is up-to-date.

                               // New buffered write 2048 from 2048
                               ext4_buffered_write_iter
                                generic_perform_write
                                 copy_page_from_iter_atomic
                                 ext4_da_write_end
                                  ext4_da_do_write_end
                                   block_write_end
                                    __block_commit_write
                                     folio_mark_uptodate
                                      smp_wmb()
                                      set_bit(PG_uptodate, folio_flags)
                                   i_size_write // 4096
                                   unlock_page(page)

   isize = i_size_read(inode) // 4096
   // Read the latest isize 4096, but without smp_rmb(), Load-Load
   // reordering may leave the data in the 2048-4096 range of the page
   // not up-to-date.
   copy_page_to_iter
   // copyout 4096

In the concurrency above, we read the updated i_size, but there is no read
barrier to ensure that the data in the page is consistent with that i_size
at this point, so we may copy out-of-date page contents.  Add the missing
read memory barrier to fix this.

This is a Load-Load reordering issue, which only occurs on some weakly
ordered architectures (e.g. ARM64, ALPHA), but not on strongly ordered
architectures (e.g. X86).  In theory the problem is not limited to ext4:
filesystems that call filemap_read() without holding the inode lock (e.g.
btrfs, f2fs, ubifs ...) will have this problem, while filesystems that hold
the inode lock (e.g. xfs, nfs) won't.
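
The barrier pairing described above can be modelled in userspace with C11
fences.  This is a minimal sketch under stated assumptions, not the kernel
change itself: struct model_file, model_write() and model_read() are
hypothetical names, a single writer is assumed, and the release and
acquire fences stand in for smp_wmb() and smp_rmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096

/* Hypothetical stand-in for one pagecache page plus the inode size. */
struct model_file {
	unsigned char page[MODEL_PAGE_SIZE];
	atomic_size_t isize;
};

/* Writer: copy data in, then publish the new size (single writer assumed). */
static void model_write(struct model_file *f, size_t off,
			const void *src, size_t len)
{
	/* Reject writes that would run past the modelled page. */
	if (off > MODEL_PAGE_SIZE || len > MODEL_PAGE_SIZE - off)
		return;
	memcpy(f->page + off, src, len);
	/* Order the data stores before the size store (role of smp_wmb()). */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&f->isize, off + len, memory_order_relaxed);
}

/* Reader: sample the size, then copy out that many bytes. */
static size_t model_read(struct model_file *f, void *dst)
{
	size_t isize = atomic_load_explicit(&f->isize, memory_order_relaxed);

	/* Order the size load before the data loads (role of smp_rmb()). */
	atomic_thread_fence(memory_order_acquire);
	memcpy(dst, f->page, isize);
	return isize;
}
```

Without the acquire fence in model_read(), a weakly ordered CPU would be
free to load the page bytes before the size, which is the reordering the
diagram above walks through.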

Link: https://lkml.kernel.org/r/20231213062324.739009-1-libaokun1@huawei.com
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: yangerkun <yangerkun@huawei.com>
Cc: Yu Kuai <yukuai3@huawei.com>
Cc: Zhang Yi <yi.zhang@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-01-05 15:18:39 +01:00
Greg Kroah-Hartman
c9b484c69d This is the 6.1.68 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmV57F0ACgkQONu9yGCS
 aT5Ihg//f5xvyjEEbZyE7tFaBBgx8ceQCtteRyi+Jw3Hy65/9neETij0t97IhG37
 I89TIAddzNIl51ifl8UYZMWI780HbnW1YdbVLMElbngbmT5rHzIsGpAVCC+SDmMK
 NPWXrqWIw6yTVSbTwqKIqOLlEiLxGjdWnPxjoMXBVyje+EcmANBe+fe9qkLq98XC
 ZgzrRZyriS8QLMMscy/GmdxIyC32nxebdHDwwE6qgYM8GWNfqLLektX798VGFhra
 ByR9bvsJ0PD5m9siCGcx37lVusJDLMjJp4FtMIFTrH63i0sMQm7HKiggJmbCm4lH
 Sgbo4iwvSVa2xf1glPJagE9tiah5b0feLqgrQf/ONO2PdCjcERN47472IcQgRvQ+
 SDYKScZBSp1/Jd063dHiK/u79uxEBFEdisAkPG2MstjCySEDuhvDrV5R0iKDpQBP
 y2FXb4RArqZFrGwS4Zfxx/EQnj3MYJ11a4AE5I0yUGIj7vrFdddayBDBVdwhog84
 QhHPH0F/eC/zSMATYSQSCZTTSZ2UoR8NODXyOryoH5tmXlgxXWKq1oFi5nUnysoP
 SkGDT0dg+kbReQNA+eyj5qTS4lzincIyP2B4Ple9d75zpx1UENlqVm1xvWLccyFt
 3eV/XNRg8dAapsbqvEtW+iev6izutWgcG6p1hToObnbg5uHy6fI=
 =+iTJ
 -----END PGP SIGNATURE-----

Merge 6.1.68 into android14-6.1-lts

Changes in 6.1.68
	vdpa/mlx5: preserve CVQ vringh index
	hrtimers: Push pending hrtimers away from outgoing CPU earlier
	i2c: designware: Fix corrupted memory seen in the ISR
	netfilter: ipset: fix race condition between swap/destroy and kernel side add/del/test
	zstd: Fix array-index-out-of-bounds UBSAN warning
	tg3: Move the [rt]x_dropped counters to tg3_napi
	tg3: Increment tx_dropped in tg3_tso_bug()
	kconfig: fix memory leak from range properties
	drm/amdgpu: correct chunk_ptr to a pointer to chunk.
	x86: Introduce ia32_enabled()
	x86/coco: Disable 32-bit emulation by default on TDX and SEV
	x86/entry: Convert INT 0x80 emulation to IDTENTRY
	x86/entry: Do not allow external 0x80 interrupts
	x86/tdx: Allow 32-bit emulation by default
	dt: dt-extract-compatibles: Handle cfile arguments in generator function
	dt: dt-extract-compatibles: Don't follow symlinks when walking tree
	platform/x86: asus-wmi: Move i8042 filter install to shared asus-wmi code
	of: dynamic: Fix of_reconfig_get_state_change() return value documentation
	platform/x86: wmi: Skip blocks with zero instances
	ipv6: fix potential NULL deref in fib6_add()
	octeontx2-pf: Add missing mutex lock in otx2_get_pauseparam
	octeontx2-af: Check return value of nix_get_nixlf before using nixlf
	hv_netvsc: rndis_filter needs to select NLS
	r8152: Rename RTL8152_UNPLUG to RTL8152_INACCESSIBLE
	r8152: Add RTL8152_INACCESSIBLE checks to more loops
	r8152: Add RTL8152_INACCESSIBLE to r8156b_wait_loading_flash()
	r8152: Add RTL8152_INACCESSIBLE to r8153_pre_firmware_1()
	r8152: Add RTL8152_INACCESSIBLE to r8153_aldps_en()
	mlxbf-bootctl: correctly identify secure boot with development keys
	platform/mellanox: Add null pointer checks for devm_kasprintf()
	platform/mellanox: Check devm_hwmon_device_register_with_groups() return value
	arcnet: restoring support for multiple Sohard Arcnet cards
	octeontx2-pf: consider both Rx and Tx packet stats for adaptive interrupt coalescing
	net: stmmac: fix FPE events losing
	xsk: Skip polling event check for unbound socket
	octeontx2-af: fix a use-after-free in rvu_npa_register_reporters
	i40e: Fix unexpected MFS warning message
	iavf: validate tx_coalesce_usecs even if rx_coalesce_usecs is zero
	net: bnxt: fix a potential use-after-free in bnxt_init_tc
	tcp: fix mid stream window clamp.
	ionic: fix snprintf format length warning
	ionic: Fix dim work handling in split interrupt mode
	ipv4: ip_gre: Avoid skb_pull() failure in ipgre_xmit()
	net: atlantic: Fix NULL dereference of skb pointer in
	net: hns: fix wrong head when modify the tx feature when sending packets
	net: hns: fix fake link up on xge port
	octeontx2-af: Adjust Tx credits when MCS external bypass is disabled
	octeontx2-af: Fix mcs sa cam entries size
	octeontx2-af: Fix mcs stats register address
	octeontx2-af: Add missing mcs flr handler call
	octeontx2-af: Update Tx link register range
	dt-bindings: interrupt-controller: Allow #power-domain-cells
	netfilter: nft_exthdr: add boolean DCCP option matching
	netfilter: nf_tables: fix 'exist' matching on bigendian arches
	netfilter: nf_tables: bail out on mismatching dynset and set expressions
	netfilter: nf_tables: validate family when identifying table via handle
	netfilter: xt_owner: Fix for unsafe access of sk->sk_socket
	tcp: do not accept ACK of bytes we never sent
	bpf: sockmap, updating the sg structure should also update curr
	psample: Require 'CAP_NET_ADMIN' when joining "packets" group
	drop_monitor: Require 'CAP_SYS_ADMIN' when joining "events" group
	mm/damon/sysfs: eliminate potential uninitialized variable warning
	tee: optee: Fix supplicant based device enumeration
	RDMA/hns: Fix unnecessary err return when using invalid congest control algorithm
	RDMA/irdma: Do not modify to SQD on error
	RDMA/irdma: Add wait for suspend on SQD
	arm64: dts: rockchip: Expand reg size of vdec node for RK3328
	arm64: dts: rockchip: Expand reg size of vdec node for RK3399
	ASoC: fsl_sai: Fix no frame sync clock issue on i.MX8MP
	RDMA/rtrs-srv: Do not unconditionally enable irq
	RDMA/rtrs-clt: Start hb after path_up
	RDMA/rtrs-srv: Check return values while processing info request
	RDMA/rtrs-srv: Free srv_mr iu only when always_invalidate is true
	RDMA/rtrs-srv: Destroy path files after making sure no IOs in-flight
	RDMA/rtrs-clt: Fix the max_send_wr setting
	RDMA/rtrs-clt: Remove the warnings for req in_use check
	RDMA/bnxt_re: Correct module description string
	RDMA/irdma: Refactor error handling in create CQP
	RDMA/irdma: Fix UAF in irdma_sc_ccq_get_cqe_info()
	hwmon: (acpi_power_meter) Fix 4.29 MW bug
	ASoC: codecs: lpass-tx-macro: set active_decimator correct default value
	hwmon: (nzxt-kraken2) Fix error handling path in kraken2_probe()
	ASoC: wm_adsp: fix memleak in wm_adsp_buffer_populate
	RDMA/core: Fix umem iterator when PAGE_SIZE is greater then HCA pgsz
	RDMA/irdma: Avoid free the non-cqp_request scratch
	drm/bridge: tc358768: select CONFIG_VIDEOMODE_HELPERS
	arm64: dts: imx8mq: drop usb3-resume-missing-cas from usb
	arm64: dts: imx8mp: imx8mq: Add parkmode-disable-ss-quirk on DWC3
	ARM: dts: imx6ul-pico: Describe the Ethernet PHY clock
	tracing: Fix a warning when allocating buffered events fails
	scsi: be2iscsi: Fix a memleak in beiscsi_init_wrb_handle()
	ARM: imx: Check return value of devm_kasprintf in imx_mmdc_perf_init
	ARM: dts: imx7: Declare timers compatible with fsl,imx6dl-gpt
	ARM: dts: imx28-xea: Pass the 'model' property
	riscv: fix misaligned access handling of C.SWSP and C.SDSP
	md: introduce md_ro_state
	md: don't leave 'MD_RECOVERY_FROZEN' in error path of md_set_readonly()
	iommu: Avoid more races around device probe
	rethook: Use __rcu pointer for rethook::handler
	kprobes: consistent rcu api usage for kretprobe holder
	ASoC: amd: yc: Fix non-functional mic on ASUS E1504FA
	io_uring/af_unix: disable sending io_uring over sockets
	nvme-pci: Add sleep quirk for Kingston drives
	io_uring: fix mutex_unlock with unreferenced ctx
	ALSA: usb-audio: Add Pioneer DJM-450 mixer controls
	ALSA: pcm: fix out-of-bounds in snd_pcm_state_names
	ALSA: hda/realtek: Enable headset on Lenovo M90 Gen5
	ALSA: hda/realtek: add new Framework laptop to quirks
	ALSA: hda/realtek: Add Framework laptop 16 to quirks
	ring-buffer: Test last update in 32bit version of __rb_time_read()
	nilfs2: fix missing error check for sb_set_blocksize call
	nilfs2: prevent WARNING in nilfs_sufile_set_segment_usage()
	cgroup_freezer: cgroup_freezing: Check if not frozen
	checkstack: fix printed address
	tracing: Always update snapshot buffer size
	tracing: Disable snapshot buffer when stopping instance tracers
	tracing: Fix incomplete locking when disabling buffered events
	tracing: Fix a possible race when disabling buffered events
	packet: Move reference count in packet_sock to atomic_long_t
	r8169: fix rtl8125b PAUSE frames blasting when suspended
	regmap: fix bogus error on regcache_sync success
	platform/surface: aggregator: fix recv_buf() return value
	hugetlb: fix null-ptr-deref in hugetlb_vma_lock_write
	mm: fix oops when filemap_map_pmd() without prealloc_pte
	powercap: DTPM: Fix missing cpufreq_cpu_put() calls
	md/raid6: use valid sector values to determine if an I/O should wait on the reshape
	arm64: dts: mediatek: mt7622: fix memory node warning check
	arm64: dts: mediatek: mt8183-kukui-jacuzzi: fix dsi unnecessary cells properties
	arm64: dts: mediatek: cherry: Fix interrupt cells for MT6360 on I2C7
	arm64: dts: mediatek: mt8173-evb: Fix regulator-fixed node names
	arm64: dts: mediatek: mt8195: Fix PM suspend/resume with venc clocks
	arm64: dts: mediatek: mt8183: Fix unit address for scp reserved memory
	arm64: dts: mediatek: mt8183: Move thermal-zones to the root node
	arm64: dts: mediatek: mt8183-evb: Fix unit_address_vs_reg warning on ntc
	binder: fix memory leaks of spam and pending work
	coresight: etm4x: Make etm4_remove_dev() return void
	coresight: etm4x: Remove bogous __exit annotation for some functions
	hwtracing: hisi_ptt: Add dummy callback pmu::read()
	misc: mei: client.c: return negative error code in mei_cl_write
	misc: mei: client.c: fix problem of return '-EOVERFLOW' in mei_cl_write
	LoongArch: BPF: Don't sign extend memory load operand
	LoongArch: BPF: Don't sign extend function return value
	ring-buffer: Force absolute timestamp on discard of event
	tracing: Set actual size after ring buffer resize
	tracing: Stop current tracer when resizing buffer
	parisc: Reduce size of the bug_table on 64-bit kernel by half
	parisc: Fix asm operand number out of range build error in bug table
	arm64: dts: mediatek: add missing space before {
	arm64: dts: mt8183: kukui: Fix underscores in node names
	perf: Fix perf_event_validate_size()
	x86/sev: Fix kernel crash due to late update to read-only ghcb_version
	gpiolib: sysfs: Fix error handling on failed export
	drm/amdgpu: fix memory overflow in the IB test
	drm/amd/amdgpu: Fix warnings in amdgpu/amdgpu_display.c
	drm/amdgpu: correct the amdgpu runtime dereference usage count
	drm/amdgpu: Update ras eeprom support for smu v13_0_0 and v13_0_10
	drm/amdgpu: Add EEPROM I2C address support for ip discovery
	drm/amdgpu: Remove redundant I2C EEPROM address
	drm/amdgpu: Decouple RAS EEPROM addresses from chips
	drm/amdgpu: Add support for RAS table at 0x40000
	drm/amdgpu: Remove second moot switch to set EEPROM I2C address
	drm/amdgpu: Return from switch early for EEPROM I2C address
	drm/amdgpu: simplify amdgpu_ras_eeprom.c
	drm/amdgpu: Add I2C EEPROM support on smu v13_0_6
	drm/amdgpu: Update EEPROM I2C address for smu v13_0_0
	usb: gadget: f_hid: fix report descriptor allocation
	serial: 8250_dw: Add ACPI ID for Granite Rapids-D UART
	parport: Add support for Brainboxes IX/UC/PX parallel cards
	cifs: Fix non-availability of dedup breaking generic/304
	Revert "xhci: Loosen RPM as default policy to cover for AMD xHC 1.1"
	smb: client: fix potential NULL deref in parse_dfs_referrals()
	usb: typec: class: fix typec_altmode_put_partner to put plugs
	ARM: PL011: Fix DMA support
	serial: sc16is7xx: address RX timeout interrupt errata
	serial: 8250: 8250_omap: Clear UART_HAS_RHR_IT_DIS bit
	serial: 8250: 8250_omap: Do not start RX DMA on THRI interrupt
	serial: 8250_omap: Add earlycon support for the AM654 UART controller
	devcoredump: Send uevent once devcd is ready
	x86/CPU/AMD: Check vendor in the AMD microcode callback
	USB: gadget: core: adjust uevent timing on gadget unbind
	cifs: Fix flushing, invalidation and file size with copy_file_range()
	cifs: Fix flushing, invalidation and file size with FICLONE
	MIPS: kernel: Clear FPU states when setting up kernel threads
	KVM: s390/mm: Properly reset no-dat
	KVM: SVM: Update EFER software model on CR0 trap for SEV-ES
	MIPS: Loongson64: Reserve vgabios memory on boot
	MIPS: Loongson64: Handle more memory types passed from firmware
	MIPS: Loongson64: Enable DMA noncoherent support
	netfilter: nft_set_pipapo: skip inactive elements during set walk
	riscv: Kconfig: Add select ARM_AMBA to SOC_STARFIVE
	drm/i915/display: Drop check for doublescan mode in modevalid
	drm/i915/lvds: Use REG_BIT() & co.
	drm/i915/sdvo: stop caching has_hdmi_monitor in struct intel_sdvo
	drm/i915: Skip some timing checks on BXT/GLK DSI transcoders
	Linux 6.1.68

Change-Id: I0a824071a80b24dc4a2e0077f305b7cac42235b8
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2024-01-05 08:40:52 +00:00
Matthew Wilcox (Oracle)
b43b26b4cd UPSTREAM: mm: make lock_folio_maybe_drop_mmap() VMA lock aware
Patch series "Handle more faults under the VMA lock", v2.

At this point, we're handling the majority of file-backed page faults
under the VMA lock, using the ->map_pages entry point.  This patch set
attempts to expand that for the following situations:

 - We have to do a read.  This could be because we've hit the point in
   the readahead window where we need to kick off the next readahead,
   or because the page is simply not present in cache.
 - We're handling a write fault.  Most applications don't do I/O by writes
   to shared mmaps for very good reasons, but some do, and it'd be nice
   to not make that slow unnecessarily.
 - We're doing a COW of a private mapping (both PTE already present
   and PTE not-present).  These are two different codepaths and I handle
   both of them in this patch set.

There is no support in this patch set for drivers to mark themselves as
being VMA lock friendly; they could implement the ->map_pages
vm_operation, but if they do, they would be the first.  This is probably
something we want to change at some point in the future, and I've marked
where to make that change in the code.

There is very little performance change in the benchmarks we've run;
mostly because the vast majority of page faults are handled through the
other paths.  I still think this patch series is useful for workloads that
may take these paths more often, and just for cleaning up the fault path
in general (it's now clearer why we have to retry in these cases).

This patch (of 6):

Drop the VMA lock instead of the mmap_lock if that's the one which
is held.
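
As a rough illustration of the rule stated above (release the lock the fault was actually taken under), here is a toy userspace model in Python. It is not kernel code: `ToyFault`, `take_fault()` and `drop_fault_lock()` are hypothetical names invented for this sketch, and the flag value is arbitrary; only the idea of a per-fault "VMA lock" flag is taken from the patch description.

```python
import threading
from dataclasses import dataclass, field

# Hypothetical flag value for this sketch; it only mirrors the idea of a
# "this fault runs under the VMA lock" flag and is not a kernel definition.
FAULT_FLAG_VMA_LOCK = 0x1


@dataclass
class ToyFault:
    """Toy stand-in for a page fault context (not the kernel's struct vm_fault)."""
    flags: int
    mmap_lock: threading.Lock = field(default_factory=threading.Lock)
    vma_lock: threading.Lock = field(default_factory=threading.Lock)


def take_fault(flags: int) -> ToyFault:
    """Create a fault context holding the lock that matches its flags."""
    fault = ToyFault(flags=flags)
    if flags & FAULT_FLAG_VMA_LOCK:
        fault.vma_lock.acquire()
    else:
        fault.mmap_lock.acquire()
    return fault


def drop_fault_lock(fault: ToyFault) -> str:
    """Release whichever lock the fault was taken under and report which one."""
    if fault.flags & FAULT_FLAG_VMA_LOCK:
        fault.vma_lock.release()
        return "vma_lock"
    fault.mmap_lock.release()
    return "mmap_lock"
```

Releasing the wrong lock in this model raises `RuntimeError`, because that lock was never acquired; that is the toy analogue of unconditionally dropping the mmap_lock while only the VMA lock is held.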

Link: https://lkml.kernel.org/r/20231006195318.4087158-1-willy@infradead.org
Link: https://lkml.kernel.org/r/20231006195318.4087158-2-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit 5d74b2ab2c15d596c470bae6626f345d5575a9d0)

Bug: 293665307
Change-Id: Ife2d11ab12fb428868cd44751784cf731fbffe62
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-01-03 20:45:51 +00:00
Hugh Dickins
686cc4de09 mm: fix oops when filemap_map_pmd() without prealloc_pte
commit 9aa1345d66b8132745ffb99b348b1492088da9e2 upstream.

syzbot reports oops in lockdep's __lock_acquire(), called from
__pte_offset_map_lock() called from filemap_map_pages(); or when I run the
repro, the oops comes in pmd_install(), called from filemap_map_pmd()
called from filemap_map_pages(), just before the __pte_offset_map_lock().

The problem is that filemap_map_pmd() has been assuming that when it finds
pmd_none(), a page table has already been prepared in prealloc_pte; and
indeed do_fault_around() has been careful to preallocate one there, when
it finds pmd_none(): but what if *pmd became none in between?

My 6.6 mods in mm/khugepaged.c, avoiding mmap_lock for write, have made it
easy for *pmd to be cleared while servicing a page fault; but even before
those, a huge *pmd might be zapped while a fault is serviced.

The difference in symptomatic stack traces comes from the "memory model"
in use: pmd_install() uses pmd_populate() uses page_to_pfn(): in some
models that is strict, and will oops on the NULL prealloc_pte; in other
models, it will construct a bogus value to be populated into *pmd, then
__pte_offset_map_lock() oops when trying to access split ptlock pointer
(or some other symptom in normal case of ptlock embedded not pointer).
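
The race described above can be modelled in a few lines of Python. This is a toy sketch under stated assumptions, not the kernel change itself: `ToyFault`, `install_unchecked()` and `install_checked()` are hypothetical names, `None` stands in for both pmd_none() and a missing prealloc_pte, and the guard shown is only one plausible way to avoid the oops (the commit text here does not include the diff).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyFault:
    """Toy fault context; field names echo the commit text, nothing more."""
    pmd: Optional[str]           # None models pmd_none()
    prealloc_pte: Optional[str]  # None models "no page table was preallocated"


def install_unchecked(fault: ToyFault) -> None:
    """Models the old assumption: an empty pmd implies prealloc_pte is ready."""
    if fault.pmd is None:
        if fault.prealloc_pte is None:
            # Stand-in for the oops on a NULL prealloc_pte.
            raise RuntimeError("oops: installing a NULL page table")
        fault.pmd, fault.prealloc_pte = fault.prealloc_pte, None


def install_checked(fault: ToyFault) -> bool:
    """Install only when the pmd is empty AND a page table was preallocated."""
    if fault.pmd is None and fault.prealloc_pte is not None:
        fault.pmd, fault.prealloc_pte = fault.prealloc_pte, None
        return True
    return False


# The pmd was populated when fault-around looked, so nothing was preallocated;
# it was then cleared before the install step ran.
raced = ToyFault(pmd=None, prealloc_pte=None)
```

Calling `install_unchecked(raced)` raises in this model, while `install_checked(raced)` returns `False` and leaves the caller to handle the still-empty pmd.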

Link: https://lore.kernel.org/linux-mm/20231115065506.19780-1-jose.pekkarinen@foxhound.fi/
Link: https://lkml.kernel.org/r/6ed0c50c-78ef-0719-b3c5-60c0c010431c@google.com
Fixes: f9ce0be71d ("mm: Cleanup faultaround and finish_fault() codepaths")
Signed-off-by: Hugh Dickins <hughd@google.com>
Reported-and-tested-by: syzbot+89edd67979b52675ddec@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-mm/0000000000005e44550608a0806c@google.com/
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: José Pekkarinen <jose.pekkarinen@foxhound.fi>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: <stable@vger.kernel.org>    [5.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-13 18:39:20 +01:00
Greg Kroah-Hartman
2b3ea8bdef This is the 6.1.63 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmVbOmsACgkQONu9yGCS
 aT5m1RAAx7hgbFDnLHCGh4YVBbNy8JngItsUBaJcI/67Mk5toNi0x8pqcS8mq7ED
 GTwRnRcKaIR2bTyco5Ed2OZn4jMCyHC4oiyBZnHWg6AMuQjSCYzIgm7DzlTCVYZ7
 2r8uRbt/uXADTILJ2kwR2mtVpGcwrXa+lsHrMqvt+MvNwRoSVHBHVVYCrAc+JXwR
 GXCopzV/RFGS6w4SBsX0K+8pV7GO+bhpxJ1lPz1T/xeLYfT4C3EwSTWDbUXPbez7
 IpJ+5yKJXXT9Xn9m/pekwZ/aOirLqtEbDxneEctsjvw140lCoQiEZn6ZRscgNEns
 3H+J3Asgc2zXqPzfZFH02TebPj31B8HZ43Upu0okr0hr4A4/4JL9pjXEhm1bON/Z
 x3jlTF4dyay4vOGGIEYOAuJSUbn6AqpZ318uBWCd3BSPocihEDMJz2aoazVHcb6k
 83MVxfFfEL6s9utcoSXB8VjHa4FQmpMYsozegloUSJJCsizgdzmih0buJYhBB9sI
 HbEohW+YAh3cACSn6arXUJIMH5F5xsfD89od2Pj+6UrapdlPz5gCaggA1RZplCho
 bjGc1k61Rp2qSdfMEcx+h4ypgoOdhgqZI0YhYDCgBSRcWOXnGrDjFvnnumatcT+H
 6vqyX6zlNt6U1NpE56Jtf7gt1Ds6PeoadD0L6B8vjXrkdeXOlUU=
 =AZ9s
 -----END PGP SIGNATURE-----

Merge 6.1.63 into android14-6.1-lts

Changes in 6.1.63
	hwmon: (nct6775) Fix incorrect variable reuse in fan_div calculation
	sched/fair: Fix cfs_rq_is_decayed() on !SMP
	iov_iter, x86: Be consistent about the __user tag on copy_mc_to_user()
	sched/uclamp: Set max_spare_cap_cpu even if max_spare_cap is 0
	sched/uclamp: Ignore (util == 0) optimization in feec() when p_util_max = 0
	objtool: Propagate early errors
	sched: Fix stop_one_cpu_nowait() vs hotplug
	vfs: fix readahead(2) on block devices
	writeback, cgroup: switch inodes with dirty timestamps to release dying cgwbs
	x86/srso: Fix SBPB enablement for (possible) future fixed HW
	futex: Don't include process MM in futex key on no-MMU
	x86/numa: Introduce numa_fill_memblks()
	ACPI/NUMA: Apply SRAT proximity domain to entire CFMWS window
	x86/sev-es: Allow copy_from_kernel_nofault() in earlier boot
	x86/boot: Fix incorrect startup_gdt_descr.size
	drivers/clocksource/timer-ti-dm: Don't call clk_get_rate() in stop function
	pstore/platform: Add check for kstrdup
	string: Adjust strtomem() logic to allow for smaller sources
	genirq/matrix: Exclude managed interrupts in irq_matrix_allocated()
	wifi: cfg80211: add flush functions for wiphy work
	wifi: mac80211: move radar detect work to wiphy work
	wifi: mac80211: move scan work to wiphy work
	wifi: mac80211: move offchannel works to wiphy work
	wifi: mac80211: move sched-scan stop work to wiphy work
	wifi: mac80211: fix # of MSDU in A-MSDU calculation
	wifi: iwlwifi: honor the enable_ini value
	i40e: fix potential memory leaks in i40e_remove()
	iavf: Fix promiscuous mode configuration flow messages
	selftests/bpf: Correct map_fd to data_fd in tailcalls
	udp: add missing WRITE_ONCE() around up->encap_rcv
	tcp: call tcp_try_undo_recovery when an RTOd TFO SYNACK is ACKed
	gve: Use size_add() in call to struct_size()
	mlxsw: Use size_mul() in call to struct_size()
	tls: Only use data field in crypto completion function
	tls: Use size_add() in call to struct_size()
	tipc: Use size_add() in calls to struct_size()
	net: spider_net: Use size_add() in call to struct_size()
	net: ethernet: mtk_wed: fix EXT_INT_STATUS_RX_FBUF definitions for MT7986 SoC
	wifi: rtw88: debug: Fix the NULL vs IS_ERR() bug for debugfs_create_file()
	wifi: ath11k: fix boot failure with one MSI vector
	wifi: mt76: mt7603: rework/fix rx pse hang check
	wifi: mt76: mt7603: improve watchdog reset reliablity
	wifi: mt76: mt7603: improve stuck beacon handling
	wifi: mt76: mt7915: fix beamforming availability check
	wifi: ath: dfs_pattern_detector: Fix a memory initialization issue
	tcp_metrics: add missing barriers on delete
	tcp_metrics: properly set tp->snd_ssthresh in tcp_init_metrics()
	tcp_metrics: do not create an entry from tcp_init_metrics()
	wifi: rtlwifi: fix EDCA limit set by BT coexistence
	ACPI: property: Allow _DSD buffer data only for byte accessors
	ACPI: video: Add acpi_backlight=vendor quirk for Toshiba Portégé R100
	wifi: ath11k: fix Tx power value during active CAC
	can: dev: can_restart(): don't crash kernel if carrier is OK
	can: dev: can_restart(): fix race condition between controller restart and netif_carrier_on()
	can: dev: can_put_echo_skb(): don't crash kernel if can_priv::echo_skb is accessed out of bounds
	PM / devfreq: rockchip-dfi: Make pmu regmap mandatory
	wifi: wfx: fix case where rates are out of order
	netfilter: nf_tables: Drop pointless memset when dumping rules
	thermal: core: prevent potential string overflow
	r8169: use tp_to_dev instead of open code
	r8169: fix rare issue with broken rx after link-down on RTL8125
	selftests: netfilter: test for sctp collision processing in nf_conntrack
	net: skb_find_text: Ignore patterns extending past 'to'
	chtls: fix tp->rcv_tstamp initialization
	tcp: fix cookie_init_timestamp() overflows
	wifi: iwlwifi: call napi_synchronize() before freeing rx/tx queues
	wifi: iwlwifi: pcie: synchronize IRQs before NAPI
	wifi: iwlwifi: empty overflow queue during flush
	Bluetooth: hci_sync: Fix Opcode prints in bt_dev_dbg/err
	bpf: Fix unnecessary -EBUSY from htab_lock_bucket
	ACPI: sysfs: Fix create_pnp_modalias() and create_of_modalias()
	ipv6: avoid atomic fragment on GSO packets
	net: add DEV_STATS_READ() helper
	ipvlan: properly track tx_errors
	regmap: debugfs: Fix a erroneous check after snprintf()
	spi: tegra: Fix missing IRQ check in tegra_slink_probe()
	clk: qcom: gcc-msm8996: Remove RPM bus clocks
	clk: qcom: clk-rcg2: Fix clock rate overflow for high parent frequencies
	clk: qcom: mmcc-msm8998: Don't check halt bit on some branch clks
	clk: qcom: mmcc-msm8998: Fix the SMMU GDSC
	clk: qcom: gcc-sm8150: Fix gcc_sdcc2_apps_clk_src
	regulator: mt6358: Fail probe on unknown chip ID
	clk: imx: Select MXC_CLK for CLK_IMX8QXP
	clk: imx: imx8mq: correct error handling path
	clk: imx: imx8qxp: Fix elcdif_pll clock
	clk: renesas: rcar-gen3: Extend SDnH divider table
	clk: renesas: rzg2l: Wait for status bit of SD mux before continuing
	clk: renesas: rzg2l: Lock around writes to mux register
	clk: renesas: rzg2l: Trust value returned by hardware
	clk: renesas: rzg2l: Use FIELD_GET() for PLL register fields
	clk: renesas: rzg2l: Fix computation formula
	clk: linux/clk-provider.h: fix kernel-doc warnings and typos
	spi: nxp-fspi: use the correct ioremap function
	clk: keystone: pll: fix a couple NULL vs IS_ERR() checks
	clk: ti: change ti_clk_register[_omap_hw]() API
	clk: ti: fix double free in of_ti_divider_clk_setup()
	clk: npcm7xx: Fix incorrect kfree
	clk: mediatek: clk-mt6765: Add check for mtk_alloc_clk_data
	clk: mediatek: clk-mt6779: Add check for mtk_alloc_clk_data
	clk: mediatek: clk-mt6797: Add check for mtk_alloc_clk_data
	clk: mediatek: clk-mt7629-eth: Add check for mtk_alloc_clk_data
	clk: mediatek: clk-mt7629: Add check for mtk_alloc_clk_data
	clk: mediatek: clk-mt2701: Add check for mtk_alloc_clk_data
	clk: qcom: config IPQ_APSS_6018 should depend on QCOM_SMEM
	platform/x86: wmi: Fix probe failure when failing to register WMI devices
	platform/x86: wmi: Fix opening of char device
	hwmon: (axi-fan-control) Fix possible NULL pointer dereference
	hwmon: (coretemp) Fix potentially truncated sysfs attribute name
	Revert "hwmon: (sch56xx-common) Add DMI override table"
	Revert "hwmon: (sch56xx-common) Add automatic module loading on supported devices"
	hwmon: (sch5627) Use bit macros when accessing the control register
	hwmon: (sch5627) Disallow write access if virtual registers are locked
	hte: tegra: Fix missing error code in tegra_hte_test_probe()
	drm/rockchip: vop: Fix reset of state in duplicate state crtc funcs
	drm/rockchip: vop: Fix call to crtc reset helper
	drm/rockchip: vop2: Don't crash for invalid duplicate_state
	drm/rockchip: vop2: Add missing call to crtc reset helper
	drm/radeon: possible buffer overflow
	drm: bridge: it66121: Fix invalid connector dereference
	drm/bridge: lt8912b: Add hot plug detection
	drm/bridge: lt8912b: Fix bridge_detach
	drm/bridge: lt8912b: Fix crash on bridge detach
	drm/bridge: lt8912b: Manually disable HPD only if it was enabled
	drm/bridge: lt8912b: Add missing drm_bridge_attach call
	drm/bridge: tc358768: Fix use of uninitialized variable
	drm/bridge: tc358768: Fix bit updates
	drm/bridge: tc358768: remove unused variable
	drm/bridge: tc358768: Use struct videomode
	drm/bridge: tc358768: Print logical values, not raw register values
	drm/bridge: tc358768: Use dev for dbg prints, not priv->dev
	drm/bridge: tc358768: Rename dsibclk to hsbyteclk
	drm/bridge: tc358768: Clean up clock period code
	drm/bridge: tc358768: Fix tc358768_ns_to_cnt()
	drm/amdkfd: fix some race conditions in vram buffer alloc/free of svm code
	drm/amd/display: Check all enabled planes in dm_check_crtc_cursor
	drm/amd/display: Refactor dm_get_plane_scale helper
	drm/amd/display: Bail from dm_check_crtc_cursor if no relevant change
	io_uring/kbuf: Fix check of BID wrapping in provided buffers
	io_uring/kbuf: Allow the full buffer id space for provided buffers
	drm/mediatek: Fix iommu fault by swapping FBs after updating plane state
	drm/mediatek: Fix iommu fault during crtc enabling
	drm/rockchip: cdn-dp: Fix some error handling paths in cdn_dp_probe()
	gpu: host1x: Correct allocated size for contexts
	drm/bridge: lt9611uxc: fix the race in the error path
	arm64/arm: xen: enlighten: Fix KPTI checks
	drm/rockchip: Fix type promotion bug in rockchip_gem_iommu_map()
	xenbus: fix error exit in xenbus_init()
	xen-pciback: Consider INTx disabled when MSI/MSI-X is enabled
	drm/msm/dsi: use msm_gem_kernel_put to free TX buffer
	drm/msm/dsi: free TX buffer in unbind
	clocksource/drivers/arm_arch_timer: limit XGene-1 workaround
	drm: mediatek: mtk_dsi: Fix NO_EOT_PACKET settings/handling
	drivers/perf: hisi: use cpuhp_state_remove_instance_nocalls() for hisi_hns3_pmu uninit process
	perf/arm-cmn: Revamp model detection
	perf/arm-cmn: Fix DTC domain detection
	drivers/perf: hisi_pcie: Check the type first in pmu::event_init()
	perf: hisi: Fix use-after-free when register pmu fails
	ARM: dts: renesas: blanche: Fix typo in GP_11_2 pin name
	arm64: dts: qcom: sdm845: cheza doesn't support LMh node
	arm64: dts: qcom: sc7280: link usb3_phy_wrapper_gcc_usb30_pipe_clk
	arm64: dts: qcom: msm8916: Fix iommu local address range
	arm64: dts: qcom: msm8992-libra: drop duplicated reserved memory
	arm64: dts: qcom: sc7280: Add missing LMH interrupts
	arm64: dts: qcom: sm8150: add ref clock to PCIe PHYs
	arm64: dts: qcom: sm8350: fix pinctrl for UART18
	arm64: dts: qcom: sdm845-mtp: fix WiFi configuration
	ARM64: dts: marvell: cn9310: Use appropriate label for spi1 pins
	arm64: dts: qcom: apq8016-sbc: Add missing ADV7533 regulators
	ARM: dts: qcom: mdm9615: populate vsdcc fixed regulator
	soc: qcom: llcc: Handle a second device without data corruption
	kunit: Fix missed memory release in kunit_free_suite_set()
	firmware: ti_sci: Mark driver as non removable
	arm64: dts: ti: k3-am62a7-sk: Drop i2c-1 to 100Khz
	firmware: arm_ffa: Assign the missing IDR allocation ID to the FFA device
	firmware: arm_ffa: Allow the FF-A drivers to use 32bit mode of messaging
	ARM: dts: am3517-evm: Fix LED3/4 pinmux
	clk: scmi: Free scmi_clk allocated when the clocks with invalid info are skipped
	arm64: dts: imx8qm-ss-img: Fix jpegenc compatible entry
	arm64: dts: imx8mm: Add sound-dai-cells to micfil node
	arm64: dts: imx8mn: Add sound-dai-cells to micfil node
	arm64: tegra: Use correct interrupts for Tegra234 TKE
	selftests/pidfd: Fix ksft print formats
	selftests/resctrl: Ensure the benchmark commands fits to its array
	module/decompress: use vmalloc() for gzip decompression workspace
	ASoC: cs35l41: Verify PM runtime resume errors in IRQ handler
	ASoC: cs35l41: Undo runtime PM changes at driver exit time
	ALSA: hda: cs35l41: Fix unbalanced pm_runtime_get()
	ALSA: hda: cs35l41: Undo runtime PM changes at driver exit time
	KEYS: Include linux/errno.h in linux/verification.h
	crypto: hisilicon/hpre - Fix a erroneous check after snprintf()
	hwrng: bcm2835 - Fix hwrng throughput regression
	hwrng: geode - fix accessing registers
	RDMA/core: Use size_{add,sub,mul}() in calls to struct_size()
	crypto: qat - ignore subsequent state up commands
	crypto: qat - relocate bufferlist logic
	crypto: qat - rename bufferlist functions
	crypto: qat - change bufferlist logic interface
	crypto: qat - generalize crypto request buffers
	crypto: qat - extend buffer list interface
	crypto: qat - fix unregistration of crypto algorithms
	scsi: ibmvfc: Fix erroneous use of rtas_busy_delay with hcall return code
	libnvdimm/of_pmem: Use devm_kstrdup instead of kstrdup and check its return value
	nd_btt: Make BTT lanes preemptible
	crypto: caam/qi2 - fix Chacha20 + Poly1305 self test failure
	crypto: caam/jr - fix Chacha20 + Poly1305 self test failure
	crypto: qat - increase size of buffers
	PCI: vmd: Correct PCI Header Type Register's multi-function check
	hid: cp2112: Fix duplicate workqueue initialization
	crypto: hisilicon/qm - delete redundant null assignment operations
	crypto: hisilicon/qm - modify the process of regs dfx
	crypto: hisilicon/qm - split a debugfs.c from qm
	crypto: hisilicon/qm - fix PF queue parameter issue
	ARM: 9321/1: memset: cast the constant byte to unsigned char
	ext4: move 'ix' sanity check to corrent position
	ASoC: fsl: mpc5200_dma.c: Fix warning of Function parameter or member not described
	IB/mlx5: Fix rdma counter binding for RAW QP
	RDMA/hns: Fix printing level of asynchronous events
	RDMA/hns: Fix uninitialized ucmd in hns_roce_create_qp_common()
	RDMA/hns: Fix signed-unsigned mixed comparisons
	RDMA/hns: Add check for SL
	RDMA/hns: The UD mode can only be configured with DCQCN
	ASoC: SOF: core: Ensure sof_ops_free() is still called when probe never ran.
	ASoC: fsl: Fix PM disable depth imbalance in fsl_easrc_probe
	scsi: ufs: core: Leave space for '\0' in utf8 desc string
	RDMA/hfi1: Workaround truncation compilation error
	HID: cp2112: Make irq_chip immutable
	hid: cp2112: Fix IRQ shutdown stopping polling for all IRQs on chip
	sh: bios: Revive earlyprintk support
	Revert "HID: logitech-hidpp: add a module parameter to keep firmware gestures"
	HID: logitech-hidpp: Remove HIDPP_QUIRK_NO_HIDINPUT quirk
	HID: logitech-hidpp: Don't restart IO, instead defer hid_connect() only
	HID: logitech-hidpp: Revert "Don't restart communication if not necessary"
	HID: logitech-hidpp: Move get_wireless_feature_index() check to hidpp_connect_event()
	ASoC: Intel: Skylake: Fix mem leak when parsing UUIDs fails
	padata: Fix refcnt handling in padata_free_shell()
	crypto: qat - fix deadlock in backlog processing
	ASoC: ams-delta.c: use component after check
	IB/mlx5: Fix init stage error handling to avoid double free of same QP and UAF
	mfd: core: Un-constify mfd_cell.of_reg
	mfd: core: Ensure disabled devices are skipped without aborting
	mfd: dln2: Fix double put in dln2_probe
	dt-bindings: mfd: mt6397: Add binding for MT6357
	dt-bindings: mfd: mt6397: Split out compatible for MediaTek MT6366 PMIC
	mfd: arizona-spi: Set pdata.hpdet_channel for ACPI enumerated devs
	leds: turris-omnia: Drop unnecessary mutex locking
	leds: turris-omnia: Do not use SMBUS calls
	leds: pwm: Don't disable the PWM when the LED should be off
	leds: trigger: ledtrig-cpu:: Fix 'output may be truncated' issue for 'cpu'
	kunit: add macro to allow conditionally exposing static symbols to tests
	apparmor: test: make static symbols visible during kunit testing
	apparmor: fix invalid reference on profile->disconnected
	perf stat: Fix aggr mode initialization
	iio: frequency: adf4350: Use device managed functions and fix power down issue.
	perf kwork: Fix incorrect and missing free atom in work_push_atom()
	perf kwork: Add the supported subcommands to the document
	perf kwork: Set ordered_events to true in 'struct perf_tool'
	filemap: add filemap_get_folios_tag()
	f2fs: convert f2fs_write_cache_pages() to use filemap_get_folios_tag()
	f2fs: compress: fix deadloop in f2fs_write_cache_pages()
	f2fs: compress: fix to avoid use-after-free on dic
	f2fs: compress: fix to avoid redundant compress extension
	tty: tty_jobctrl: fix pid memleak in disassociate_ctty()
	livepatch: Fix missing newline character in klp_resolve_symbols()
	pinctrl: renesas: rzg2l: Make reverse order of enable() for disable()
	perf record: Fix BTF type checks in the off-cpu profiling
	dmaengine: idxd: Register dsa_bus_type before registering idxd sub-drivers
	usb: dwc2: fix possible NULL pointer dereference caused by driver concurrency
	usb: chipidea: Fix DMA overwrite for Tegra
	usb: chipidea: Simplify Tegra DMA alignment code
	dmaengine: ti: edma: handle irq_of_parse_and_map() errors
	misc: st_core: Do not call kfree_skb() under spin_lock_irqsave()
	tools: iio: iio_generic_buffer ensure alignment
	USB: usbip: fix stub_dev hub disconnect
	dmaengine: pxa_dma: Remove an erroneous BUG_ON() in pxad_free_desc()
	f2fs: fix to initialize map.m_pblk in f2fs_precache_extents()
	interconnect: qcom: sc7180: Retire DEFINE_QBCM
	interconnect: qcom: sc7180: Set ACV enable_mask
	interconnect: qcom: sc7280: Set ACV enable_mask
	interconnect: qcom: sc8180x: Set ACV enable_mask
	interconnect: qcom: sc8280xp: Set ACV enable_mask
	interconnect: qcom: sdm845: Retire DEFINE_QBCM
	interconnect: qcom: sdm845: Set ACV enable_mask
	interconnect: qcom: sm6350: Retire DEFINE_QBCM
	interconnect: qcom: sm6350: Set ACV enable_mask
	interconnect: move ignore_list out of of_count_icc_providers()
	interconnect: qcom: sm8150: Drop IP0 interconnects
	interconnect: qcom: sm8150: Retire DEFINE_QBCM
	interconnect: qcom: sm8150: Set ACV enable_mask
	interconnect: qcom: sm8350: Retire DEFINE_QBCM
	interconnect: qcom: sm8350: Set ACV enable_mask
	powerpc: Only define __parse_fpscr() when required
	modpost: fix tee MODULE_DEVICE_TABLE built on big-endian host
	modpost: fix ishtp MODULE_DEVICE_TABLE built on big-endian host
	powerpc/40x: Remove stale PTE_ATOMIC_UPDATES macro
	powerpc/xive: Fix endian conversion size
	powerpc/vas: Limit open window failure messages in log bufffer
	powerpc/imc-pmu: Use the correct spinlock initializer.
	powerpc/pseries: fix potential memory leak in init_cpu_associativity()
	xhci: Loosen RPM as default policy to cover for AMD xHC 1.1
	usb: host: xhci-plat: fix possible kernel oops while resuming
	perf machine: Avoid out of bounds LBR memory read
	perf hist: Add missing puts to hist__account_cycles
	9p/net: fix possible memory leak in p9_check_errors()
	i3c: Fix potential refcount leak in i3c_master_register_new_i3c_devs
	cxl/mem: Fix shutdown order
	crypto: ccp - Name -1 return value as SEV_RET_NO_FW_CALL
	x86/sev: Change snp_guest_issue_request()'s fw_err argument
	virt: sevguest: Fix passing a stack buffer as a scatterlist target
	rtc: pcf85363: fix wrong mask/val parameters in regmap_update_bits call
	pcmcia: cs: fix possible hung task and memory leak pccardd()
	pcmcia: ds: fix refcount leak in pcmcia_device_add()
	pcmcia: ds: fix possible name leak in error path in pcmcia_device_add()
	media: hantro: Check whether reset op is defined before use
	media: verisilicon: Do not enable G2 postproc downscale if source is narrower than destination
	media: ov5640: Drop dead code using frame_interval
	media: ov5640: fix vblank unchange issue when work at dvp mode
	media: i2c: max9286: Fix some redundant of_node_put() calls
	media: ov5640: Fix a memory leak when ov5640_probe fails
	media: bttv: fix use after free error due to btv->timeout timer
	media: amphion: handle firmware debug message
	media: mtk-jpegenc: Fix bug in JPEG encode quality selection
	media: s3c-camif: Avoid inappropriate kfree()
	media: vidtv: psi: Add check for kstrdup
	media: vidtv: mux: Add check and kfree for kstrdup
	media: cedrus: Fix clock/reset sequence
	media: cadence: csi2rx: Unregister v4l2 async notifier
	media: dvb-usb-v2: af9035: fix missing unlock
	media: cec: meson: always include meson sub-directory in Makefile
	regmap: prevent noinc writes from clobbering cache
	pwm: sti: Reduce number of allocations and drop usage of chip_data
	pwm: brcmstb: Utilize appropriate clock APIs in suspend/resume
	Input: synaptics-rmi4 - fix use after free in rmi_unregister_function()
	watchdog: ixp4xx: Make sure restart always works
	llc: verify mac len before reading mac header
	hsr: Prevent use after free in prp_create_tagged_frame()
	tipc: Change nla_policy for bearer-related names to NLA_NUL_STRING
	bpf: Check map->usercnt after timer->timer is assigned
	inet: shrink struct flowi_common
	octeontx2-pf: Fix error codes
	octeontx2-pf: Fix holes in error code
	net: page_pool: add missing free_percpu when page_pool_init fail
	dccp: Call security_inet_conn_request() after setting IPv4 addresses.
	dccp/tcp: Call security_inet_conn_request() after setting IPv6 addresses.
	net: r8169: Disable multicast filter for RTL8168H and RTL8107E
	Fix termination state for idr_for_each_entry_ul()
	net: stmmac: xgmac: Enable support for multiple Flexible PPS outputs
	selftests: pmtu.sh: fix result checking
	octeontx2-pf: Rename tot_tx_queues to non_qos_queues
	octeontx2-pf: qos send queues management
	octeontx2-pf: Free pending and dropped SQEs
	net/smc: fix dangling sock under state SMC_APPFINCLOSEWAIT
	net/smc: allow cdc msg send rather than drop it with NULL sndbuf_desc
	net/smc: put sk reference if close work was canceled
	nvme: fix error-handling for io_uring nvme-passthrough
	tg3: power down device only on SYSTEM_POWER_OFF
	nbd: fix uaf in nbd_open
	blk-core: use pr_warn_ratelimited() in bio_check_ro()
	virtio/vsock: replace virtio_vsock_pkt with sk_buff
	vsock/virtio: remove socket from connected/bound list on shutdown
	r8169: respect userspace disabling IFF_MULTICAST
	i2c: iproc: handle invalid slave state
	netfilter: xt_recent: fix (increase) ipv6 literal buffer length
	netfilter: nft_redir: use `struct nf_nat_range2` throughout and deduplicate eval call-backs
	netfilter: nat: fix ipv6 nat redirect with mapped and scoped addresses
	RISC-V: Don't fail in riscv_of_parent_hartid() for disabled HARTs
	drm/syncobj: fix DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE
	ASoC: mediatek: mt8186_mt6366_rt1019_rt5682s: trivial: fix error messages
	ASoC: hdmi-codec: register hpd callback on component probe
	ASoC: dapm: fix clock get name
	spi: spi-zynq-qspi: add spi-mem to driver kconfig dependencies
	fbdev: imsttfb: Fix error path of imsttfb_probe()
	fbdev: imsttfb: fix a resource leak in probe
	fbdev: fsl-diu-fb: mark wr_reg_wa() static
	tracing/kprobes: Fix the order of argument descriptions
	io_uring/net: ensure socket is marked connected on connect retry
	x86/amd_nb: Use Family 19h Models 60h-7Fh Function 4 IDs
	Revert "mmc: core: Capture correct oemid-bits for eMMC cards"
	btrfs: use u64 for buffer sizes in the tree search ioctls
	wifi: cfg80211: fix kernel-doc for wiphy_delayed_work_flush()
	virtio/vsock: don't use skbuff state to account credit
	virtio/vsock: remove redundant 'skb_pull()' call
	virtio/vsock: don't drop skbuff on copy failure
	vsock/loopback: use only sk_buff_head.lock to protect the packet queue
	virtio/vsock: fix leaks due to missing skb owner
	virtio/vsock: Fix uninit-value in virtio_transport_recv_pkt()
	virtio/vsock: fix header length on skb merging
	Linux 6.1.63

Change-Id: I87b7a539b11c90cfaf16edb07d613f74d54458a4
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-11-27 16:59:46 +00:00
Vishal Moola (Oracle)
599befdd79 filemap: add filemap_get_folios_tag()
[ Upstream commit 247f9e1feef4e57911510c8f82348efb4491ea0e ]

This is the equivalent of find_get_pages_range_tag(), except for folios
instead of pages.

One notable difference is that filemap_get_folios_tag() does not take a
maximum pages argument.  It instead tries to fill a folio batch and stops
either once full (15 folios) or on reaching the end of the search range.

The new function supports large folios; the original function did not,
since none of its callers use large folios.

Link: https://lkml.kernel.org/r/20230104211448.4804-3-vishal.moola@gmail.com
Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Stable-dep-of: c5d3f9b7649a ("f2fs: compress: fix deadloop in f2fs_write_cache_pages()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-20 11:52:09 +01:00
Oven
019393a917 ANDROID: vendor_hook: Add hook to tune readaround size
In some situations, we want to decrease readaround size for better
performance. So we add this hook.

Bug: 288216516
Change-Id: If2f5f75976c99ff1f82ce29d370f9216926055ab
Signed-off-by: Oven <liyangouwen1@oppo.com>
2023-11-06 23:07:00 +00:00
Chiawei Wang
401b78ce87 ANDROID: mm: Add vendor hook in filemap_get_folio()
Add a vendor hook for pagecache hit/miss and other
vendor specific functions.

Bug: 174088128
Bug: 172987241
Signed-off-by: Chiawei Wang <chiaweiwang@google.com>
Change-Id: Ie9f14a69a86b8ed81de766e44e30f2eba1d9bd84
Signed-off-by: Richard Chang <richardycc@google.com>
(cherry picked from commit db158b4ae0543446d38313c3da942afee9947267)
Signed-off-by: Jack Lee <liangjlee@google.com>
2023-10-24 17:34:00 +00:00
Suren Baghdasaryan
e704d0e4f9 FROMGIT: mm: handle swap page faults under per-VMA lock
When a page fault is handled under per-VMA lock protection, all swap page
faults are retried with mmap_lock because folio_lock_or_retry has to drop
and reacquire mmap_lock if the folio could not be immediately locked.
Follow the same pattern as mmap_lock: drop the per-VMA lock when waiting
for the folio and retry once the folio is available.

With this obstacle removed, enable do_swap_page to operate under per-VMA
lock protection.  Drivers implementing ops->migrate_to_ram might still
rely on mmap_lock, therefore we have to fall back to mmap_lock in that
particular case.

Note that the only time do_swap_page calls synchronous swap_readpage is
when SWP_SYNCHRONOUS_IO is set, which is only set for
QUEUE_FLAG_SYNCHRONOUS devices: brd, zram and nvdimms (both btt and pmem).
Therefore we don't sleep in this path, and there's no need to drop the
mmap or per-VMA lock.

Link: https://lkml.kernel.org/r/20230630211957.1341547-6-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Tested-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Acked-by: Peter Xu <peterx@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Hillf Danton <hdanton@sina.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Laurent Dufour <ldufour@linux.ibm.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Michel Lespinasse <michel@lespinasse.org>
Cc: Minchan Kim <minchan@google.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Punit Agrawal <punit.agrawal@bytedance.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(cherry picked from commit cc989adb5544594d8c12893eda3c6df8682de11b
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-unstable)
Bug: 161210518
Change-Id: I5d80f435b2dbdc3f3d02be056e893f6fedbc7a98
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2023-08-16 09:59:37 -07:00
Suren Baghdasaryan
f8a65b694b FROMGIT: mm: change folio_lock_or_retry to use vm_fault directly
Change folio_lock_or_retry to accept vm_fault struct and return the
vm_fault_t directly.

Link: https://lkml.kernel.org/r/20230630211957.1341547-5-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Suggested-by: Matthew Wilcox <willy@infradead.org>
Acked-by: Peter Xu <peterx@redhat.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Hillf Danton <hdanton@sina.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Laurent Dufour <ldufour@linux.ibm.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Michel Lespinasse <michel@lespinasse.org>
Cc: Minchan Kim <minchan@google.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Punit Agrawal <punit.agrawal@bytedance.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(cherry picked from commit af27bb856a0a29a0673aabe163e4774df67a8bcd
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-unstable)
Bug: 161210518
Change-Id: I9d203e801f0d5517fba8430f9ab82d4063b517f3
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2023-08-16 09:59:35 -07:00
Suren Baghdasaryan
71c7092b68 ANDROID: Revert "mm: remove cleancache"
This reverts commit 0a4ee51818.

Conflicts:
	Documentation/mm/cleancache.rst
	Documentation/vm/index.rst
	arch/arm/configs/bcm2835_defconfig
	arch/arm/configs/qcom_defconfig
	arch/m68k/configs/amiga_defconfig
	arch/m68k/configs/apollo_defconfig
	arch/m68k/configs/atari_defconfig
	arch/m68k/configs/bvme6000_defconfig
	arch/m68k/configs/hp300_defconfig
	arch/m68k/configs/mac_defconfig
	arch/m68k/configs/multi_defconfig
	arch/m68k/configs/mvme147_defconfig
	arch/m68k/configs/mvme16x_defconfig
	arch/m68k/configs/q40_defconfig
	arch/m68k/configs/sun3_defconfig
	arch/m68k/configs/sun3x_defconfig
	arch/s390/configs/debug_defconfig
	arch/s390/configs/defconfig
	fs/f2fs/data.c
	fs/mpage.c

1. Skip documentation which was refactored.
2. Skip defconfigs unused in Android.
3. Replaced deprecated __submit_bio() with f2fs_submit_read_bio()
4. Replaced PageUptodate() with folio_test_uptodate()
5. Replaced SetPageUptodate() with folio_mark_uptodate()
6. Changed cleancache_get_page() call to use folio->page

Bug: 271544708
Change-Id: I93359509f7799de72f31b002a2539565d1bda9d6
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2023-04-26 17:01:50 +00:00
Qian Yingjin
d4d9bdc694 mm/filemap: fix page end in filemap_get_read_batch
commit 5956592ce337330cdff0399a6f8b6a5aea397a8e upstream.

I was running traces of the read code against a RAID storage system to
understand why read requests were being misaligned against the underlying
RAID strips.  I found that the page end offset calculation in
filemap_get_read_batch() was off by one.

When a read is submitted with end offset 1048575, it calculates the end
page of the read as 256 when it should be 255.  "last_index" is the index
of the page beyond the end of the read, and it should be skipped when
getting a batch of pages for the read in filemap_get_read_batch().
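
The arithmetic can be checked with a small userspace sketch.  It assumes
4 KiB pages (a PAGE_SHIFT of 12), and the helper names
demo_last_page_in_read() and demo_index_past_read() are hypothetical; they
are not kernel functions and this is not the code touched by the patch.

```c
#define DEMO_PAGE_SHIFT 12                        /* assumption: 4 KiB pages */
#define DEMO_PAGE_SIZE  (1UL << DEMO_PAGE_SHIFT)

/* Index of the page that holds the last byte of the read. */
unsigned long demo_last_page_in_read(unsigned long end_offset)
{
	return end_offset >> DEMO_PAGE_SHIFT;
}

/* Index of the first page beyond the read: round (end_offset + 1) up. */
unsigned long demo_index_past_read(unsigned long end_offset)
{
	return (end_offset + DEMO_PAGE_SIZE) >> DEMO_PAGE_SHIFT;
}
```

For end offset 1048575 the first helper gives 255 and the second gives
256, so treating the second value as a page to fetch is the off-by-one
described above.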

The below simple patch fixes the problem.  This code was introduced in
kernel 5.12.

Link: https://lkml.kernel.org/r/20230208022400.28962-1-coolqyj@163.com
Fixes: cbd59c48ae ("mm/filemap: use head pages in generic_file_buffered_read")
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-22 12:59:49 +01:00
Linus Torvalds
27bc50fc90 Merge tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in
   linux-next for a couple of months without, to my knowledge, any
   negative reports (or any positive ones, come to that).

 - Also the Maple Tree from Liam Howlett. An overlapping range-based
   tree for vmas. It is apparently slightly more efficient in its own
   right, but is mainly targeted at enabling work to reduce mmap_lock
   contention.

   Liam has identified a number of other tree users in the kernel which
   could be beneficially converted to mapletrees.

   Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
   at [1]. This has yet to be addressed due to Liam's unfortunately
   timed vacation. He is now back and we'll get this fixed up.

 - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses
   clang-generated instrumentation to detect used-uninitialized bugs down
   to the single bit level.

   KMSAN keeps finding bugs. New ones, as well as the legacy ones.

 - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
   memory into THPs.

 - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to
   support file/shmem-backed pages.

 - userfaultfd updates from Axel Rasmussen

 - zsmalloc cleanups from Alexey Romanov

 - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and
   memory-failure

 - Huang Ying adds enhancements to NUMA balancing memory tiering mode's
   page promotion, with a new way of detecting hot pages.

 - memcg updates from Shakeel Butt: charging optimizations and reduced
   memory consumption.

 - memcg cleanups from Kairui Song.

 - memcg fixes and cleanups from Johannes Weiner.

 - Vishal Moola provides more folio conversions

 - Zhang Yi removed ll_rw_block() :(

 - migration enhancements from Peter Xu

 - migration error-path bugfixes from Huang Ying

 - Aneesh Kumar added ability for a device driver to alter the memory
   tiering promotion paths. For optimizations by PMEM drivers, DRM
   drivers, etc.

 - vma merging improvements from Jakub Matěna.

 - NUMA hinting cleanups from David Hildenbrand.

 - xu xin added additional userspace visibility into KSM merging
   activity.

 - THP & KSM code consolidation from Qi Zheng.

 - more folio work from Matthew Wilcox.

 - KASAN updates from Andrey Konovalov.

 - DAMON cleanups from Kaixu Xia.

 - DAMON work from SeongJae Park: fixes, cleanups.

 - hugetlb sysfs cleanups from Muchun Song.

 - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.

Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1]

* tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits)
  hugetlb: allocate vma lock for all sharable vmas
  hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer
  hugetlb: fix vma lock handling during split vma and range unmapping
  mglru: mm/vmscan.c: fix imprecise comments
  mm/mglru: don't sync disk for each aging cycle
  mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol
  mm: memcontrol: use do_memsw_account() in a few more places
  mm: memcontrol: deprecate swapaccounting=0 mode
  mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled
  mm/secretmem: remove reduntant return value
  mm/hugetlb: add available_huge_pages() func
  mm: remove unused inline functions from include/linux/mm_inline.h
  selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory
  selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd
  selftests/vm: add thp collapse shmem testing
  selftests/vm: add thp collapse file and tmpfs testing
  selftests/vm: modularize thp collapse memory operations
  selftests/vm: dedup THP helpers
  mm/khugepaged: add tracepoint to hpage_collapse_scan_file()
  mm/madvise: add file and shmem support to MADV_COLLAPSE
  ...
2022-10-10 17:53:04 -07:00
Alexander Potapenko
1468c6f455 mm: fs: initialize fsdata passed to write_begin/write_end interface
Functions implementing the a_ops->write_end() interface accept the `void
*fsdata` parameter that is supposed to be initialized by the corresponding
a_ops->write_begin() (which accepts `void **fsdata`).

However not all a_ops->write_begin() implementations initialize `fsdata`
unconditionally, so it may get passed uninitialized to a_ops->write_end(),
resulting in undefined behavior.

Fix this by initializing fsdata with NULL before the call to
write_begin(), rather than doing so in all possible a_ops implementations.
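
The caller-side pattern can be sketched in plain C as follows.  This is a
minimal illustration, not the kernel code: demo_write_begin(),
demo_write_end() and demo_perform_write() are hypothetical stand-ins for
the a_ops callbacks and their caller.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for an a_ops->write_begin() implementation that
 * sets *fsdata on only one of its paths.
 */
static int demo_write_begin(int needs_private, void **fsdata)
{
	static int private_state = 42;

	if (needs_private)
		*fsdata = &private_state;
	/* Otherwise *fsdata is left exactly as the caller passed it in. */
	return 0;
}

/* Hypothetical stand-in for a_ops->write_end(): reports whether it got data. */
static int demo_write_end(void *fsdata)
{
	return fsdata != NULL;
}

/* Caller-side pattern: initialize fsdata before calling write_begin(). */
int demo_perform_write(int needs_private)
{
	void *fsdata = NULL;	/* never hand an indeterminate pointer on */

	if (demo_write_begin(needs_private, &fsdata))
		return -1;
	return demo_write_end(fsdata);
}
```

Without the NULL initializer, the needs_private == 0 path would pass an
indeterminate pointer to demo_write_end().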

This patch covers only the following cases found by running x86 KMSAN
under syzkaller:

 - generic_perform_write()
 - cont_expand_zero() and generic_cont_expand_simple()
 - page_symlink()

Other cases of passing uninitialized fsdata may persist in the codebase.

Link: https://lkml.kernel.org/r/20220915150417.722975-43-glider@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Marco Elver <elver@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03 14:03:25 -07:00
Ke Sun
c195c32157 mm/filemap: make folio_put_wait_locked static
It's only used in mm/filemap.c, since commit <ffa65753c431>
("mm/migrate.c: rework migration_entry_wait() to not take a pageref").

Make it static.

Link: https://lkml.kernel.org/r/20220914021738.3228011-1-sunke@kylinos.cn
Signed-off-by: Ke Sun <sunke@kylinos.cn>
Reported-by: k2ci <kernel-bot@kylinos.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03 14:03:15 -07:00
Vishal Moola (Oracle)
b05f41a1aa filemap: convert filemap_range_has_writeback() to use folios
Removes 3 calls to compound_head().

Link: https://lkml.kernel.org/r/20220905214557.868606-1-vishal.moola@gmail.com
Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03 14:02:56 -07:00
Yang Yang
aa1cf99b87 delayacct: support re-entrance detection of thrashing accounting
Once upon a time, we only supported accounting for thrashing of the page
cache.  Then Joonsoo introduced workingset detection for anonymous pages
and we gained the ability to account for their thrashing[1].

For page cache thrashing accounting, there is no suitable place at the fs
level comparable to swap_readpage().  So we have to do it in
folio_wait_bit_common().

For anonymous page thrashing accounting, we have to do it in both
swap_readpage() and folio_wait_bit_common().  This is like PSI, so
thrashing accounting should support re-entrance detection.

This patch prepares for complete thrashing accounting, and is based on
patch "filemap: make the accounting of thrashing more consistent".
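
One way to picture re-entrance detection is the userspace sketch below.
It is an illustration under stated assumptions, not the delayacct code:
struct demo_task and the demo_thrashing_start()/demo_thrashing_end()
helpers are hypothetical names.

```c
#include <stdbool.h>

/* Hypothetical per-task state for the sketch. */
struct demo_task {
	bool in_thrashing;	/* set while an accounting section is open */
	unsigned int sections;	/* number of sections actually accounted */
};

/* Open a section; *nested records whether an outer section already exists. */
void demo_thrashing_start(struct demo_task *t, bool *nested)
{
	*nested = t->in_thrashing;
	if (*nested)
		return;		/* re-entered: the outer caller accounts */
	t->in_thrashing = true;
	t->sections++;
}

/* Close a section; only the outermost caller clears the flag. */
void demo_thrashing_end(struct demo_task *t, const bool *nested)
{
	if (*nested)
		return;
	t->in_thrashing = false;
}
```

Each caller keeps its own nested flag on the stack, so an inner start/end
pair becomes a no-op and the delay is counted once.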

[1] commit aae466b005 ("mm/swap: implement workingset detection for anonymous LRU")

Link: https://lkml.kernel.org/r/20220815071134.74551-1-yang.yang29@zte.com.cn
Signed-off-by: Yang Yang <yang.yang29@zte.com.cn>
Signed-off-by: CGEL ZTE <cgel.zte@gmail.com>
Reviewed-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
Reviewed-by: wangyong <wang.yong12@zte.com.cn>
Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-26 19:46:07 -07:00
Yang Yang
f347c9d269 filemap: make the accounting of thrashing more consistent
Once upon a time, we only supported accounting for thrashing of the page
cache.  Then Joonsoo introduced workingset detection for anonymous pages
and we gained the ability to account for their thrashing[1].

So let delayacct account for the thrashing of both page cache and
anonymous pages.  This makes the code simpler and more consistent.

[1] commit aae466b005 ("mm/swap: implement workingset detection for anonymous LRU")

Link: https://lkml.kernel.org/r/20220805033838.1714674-1-yang.yang29@zte.com.cn
Signed-off-by: Yang Yang <yang.yang29@zte.com.cn>
Signed-off-by: CGEL ZTE <cgel.zte@gmail.com>
Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Yang Yang <yang.yang29@zte.com.cn>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-26 19:46:06 -07:00
Christoph Hellwig
176042404e mm: add PSI accounting around ->read_folio and ->readahead calls
PSI tries to account for the cost of bringing back in pages discarded by
the MM LRU management.  Currently the prime place for that is hooked into
the bio submission path, which is a rather bad place:

 - it does not actually account I/O for non-block file systems, of which
   we have many
 - it adds overhead and a layering violation to the block layer

Add the accounting to the two places in the core MM code that read
pages into an address space by calling into ->read_folio and ->readahead,
so that all file system operations are covered.  This broadens the
coverage and allows removing the accounting in the block layer going
forward.

As psi_memstall_enter can deal with nested calls this will not lead to
double accounting even while the bio annotations are still present.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Link: https://lore.kernel.org/r/20220915094200.139713-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-20 08:24:38 -06:00
Vishal Moola (Oracle)
48658d8509 filemap: remove find_get_pages_contig()
All callers of find_get_pages_contig() have been removed, so it is no
longer needed.

Link: https://lkml.kernel.org/r/20220824004023.77310-8-vishal.moola@gmail.com
Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Chris Mason <clm@fb.com>
Cc: David Sterba <dsterba@suse.com>
Cc: David Sterba <dsterb@suse.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 20:26:03 -07:00
Vishal Moola (Oracle)
35b471467f filemap: add filemap_get_folios_contig()
Patch series "Convert to filemap_get_folios_contig()", v3.

This patch series replaces find_get_pages_contig() with
filemap_get_folios_contig().


This patch (of 7):

This function is meant to replace find_get_pages_contig().

Unlike find_get_pages_contig(), filemap_get_folios_contig() no longer
takes a target number of pages to find; it returns up to 15 contiguous
folios.

To be more consistent with filemap_get_folios(),
filemap_get_folios_contig() now also updates the start index passed in,
and takes an end index.
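
The calling convention described above (a fixed-size batch, an inclusive
end index, and a start index that the callee advances) can be simulated in
userspace as follows.  This is a sketch only: struct demo_batch and
demo_get_contig() are hypothetical, a bool array stands in for the page
cache, and every entry is treated as a single page.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_BATCH_SIZE 15	/* batch capacity quoted in the text above */

struct demo_batch {
	unsigned int nr;
	size_t index[DEMO_BATCH_SIZE];
};

/*
 * Collect a contiguous run of present entries beginning at *start, going
 * no further than end (inclusive) and taking at most DEMO_BATCH_SIZE
 * entries.  On return *start is just past the last entry collected, or
 * unchanged if nothing was collected.
 */
unsigned int demo_get_contig(const bool *present, size_t *start, size_t end,
			     struct demo_batch *batch)
{
	size_t i = *start;

	batch->nr = 0;
	while (i <= end && present[i] && batch->nr < DEMO_BATCH_SIZE)
		batch->index[batch->nr++] = i++;
	*start = i;
	return batch->nr;
}
```

A caller loops until the function returns 0, reusing the same start
variable; a gap in the array ends the run early.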

Link: https://lkml.kernel.org/r/20220824004023.77310-1-vishal.moola@gmail.com
Link: https://lkml.kernel.org/r/20220824004023.77310-2-vishal.moola@gmail.com
Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: David Sterba <dsterba@suse.com>
Cc: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: David Sterba <dsterb@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 20:26:02 -07:00
Shaoqin Huang
223ce4910b mm/filemap.c: convert page_endio() to use a folio
Replace three calls to compound_head() with one.

Link: https://lkml.kernel.org/r/20220809023256.178194-1-shaoqin.huang@intel.com
Signed-off-by: Shaoqin Huang <shaoqin.huang@intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 20:25:48 -07:00
Linus Torvalds
6614a3c316 Merge tag 'mm-stable-2022-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:
 "Most of the MM queue. A few things are still pending.

  Liam's maple tree rework didn't make it. This has resulted in a few
  other minor patch series being held over for next time.

  Multi-gen LRU still isn't merged as we were waiting for mapletree to
  stabilize. The current plan is to merge MGLRU into -mm soon and to
  later reintroduce mapletree, with a view to hopefully getting both
  into 6.1-rc1.

  Summary:

   - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe
     Lin, Yang Shi, Anshuman Khandual and Mike Rapoport

   - Some kmemleak fixes from Patrick Wang and Waiman Long

   - DAMON updates from SeongJae Park

   - memcg debug/visibility work from Roman Gushchin

   - vmalloc speedup from Uladzislau Rezki

   - more folio conversion work from Matthew Wilcox

   - enhancements for coherent device memory mapping from Alex Sierra

   - addition of shared pages tracking and CoW support for fsdax, from
     Shiyang Ruan

   - hugetlb optimizations from Mike Kravetz

   - Mel Gorman has contributed some pagealloc changes to improve
     latency and realtime behaviour.

   - mprotect soft-dirty checking has been improved by Peter Xu

   - Many other singleton patches all over the place"

 [ XFS merge from hell as per Darrick Wong in

   https://lore.kernel.org/all/YshKnxb4VwXycPO8@magnolia/ ]

* tag 'mm-stable-2022-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (282 commits)
  tools/testing/selftests/vm/hmm-tests.c: fix build
  mm: Kconfig: fix typo
  mm: memory-failure: convert to pr_fmt()
  mm: use is_zone_movable_page() helper
  hugetlbfs: fix inaccurate comment in hugetlbfs_statfs()
  hugetlbfs: cleanup some comments in inode.c
  hugetlbfs: remove unneeded header file
  hugetlbfs: remove unneeded hugetlbfs_ops forward declaration
  hugetlbfs: use helper macro SZ_1{K,M}
  mm: cleanup is_highmem()
  mm/hmm: add a test for cross device private faults
  selftests: add soft-dirty into run_vmtests.sh
  selftests: soft-dirty: add test for mprotect
  mm/mprotect: fix soft-dirty check in can_change_pte_writable()
  mm: memcontrol: fix potential oom_lock recursion deadlock
  mm/gup.c: fix formatting in check_and_migrate_movable_page()
  xfs: fail dax mount if reflink is enabled on a partition
  mm/memcontrol.c: remove the redundant updating of stats_flush_threshold
  userfaultfd: don't fail on unrecognized features
  hugetlb_cgroup: fix wrong hugetlb cgroup numa stat
  ...
2022-08-05 16:32:45 -07:00
Linus Torvalds
f00654007f Folio changes for 6.0
Merge tag 'folio-6.0' of git://git.infradead.org/users/willy/pagecache

Pull folio updates from Matthew Wilcox:

 - Fix an accounting bug that made NR_FILE_DIRTY grow without limit
   when running xfstests

 - Convert more of mpage to use folios

 - Remove add_to_page_cache() and add_to_page_cache_locked()

 - Convert find_get_pages_range() to filemap_get_folios()

 - Improvements to the read_cache_page() family of functions

 - Remove a few unnecessary checks of PageError

 - Some straightforward filesystem conversions to use folios

 - Split PageMovable users out from address_space_operations into
   their own movable_operations

 - Convert aops->migratepage to aops->migrate_folio

 - Remove nobh support (Christoph Hellwig)

* tag 'folio-6.0' of git://git.infradead.org/users/willy/pagecache: (78 commits)
  fs: remove the NULL get_block case in mpage_writepages
  fs: don't call ->writepage from __mpage_writepage
  fs: remove the nobh helpers
  jfs: stop using the nobh helper
  ext2: remove nobh support
  ntfs3: refactor ntfs_writepages
  mm/folio-compat: Remove migration compatibility functions
  fs: Remove aops->migratepage()
  secretmem: Convert to migrate_folio
  hugetlb: Convert to migrate_folio
  aio: Convert to migrate_folio
  f2fs: Convert to filemap_migrate_folio()
  ubifs: Convert to filemap_migrate_folio()
  btrfs: Convert btrfs_migratepage to migrate_folio
  mm/migrate: Add filemap_migrate_folio()
  mm/migrate: Convert migrate_page() to migrate_folio()
  nfs: Convert to migrate_folio
  btrfs: Convert btree_migratepage to migrate_folio
  mm/migrate: Convert expected_page_refs() to folio_expected_refs()
  mm/migrate: Convert buffer_migrate_page() to buffer_migrate_folio()
  ...
2022-08-03 10:35:43 -07:00
Miaohe Lin
ccac11da67 filemap: minor cleanup for filemap_write_and_wait_range
Restructure the logic in filemap_write_and_wait_range to simplify the code
and make it more consistent with file_write_and_wait_range. No functional
change intended.

Link: https://lkml.kernel.org/r/20220627132351.55680-1-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-29 18:07:14 -07:00
Jens Axboe
0dd316ba86 mm: honor FGP_NOWAIT for page cache page allocation
If we're creating a page cache page with FGP_CREAT but FGP_NOWAIT is
set, we should dial back the gfp flags to avoid frivolous blocking
which is trivial to hit in low memory conditions:

[   10.117661]  __schedule+0x8c/0x550
[   10.118305]  schedule+0x58/0xa0
[   10.118897]  schedule_timeout+0x30/0xdc
[   10.119610]  __wait_for_common+0x88/0x114
[   10.120348]  wait_for_completion+0x1c/0x24
[   10.121103]  __flush_work.isra.0+0x16c/0x19c
[   10.121896]  flush_work+0xc/0x14
[   10.122496]  __drain_all_pages+0x144/0x218
[   10.123267]  drain_all_pages+0x10/0x18
[   10.123941]  __alloc_pages+0x464/0x9e4
[   10.124633]  __folio_alloc+0x18/0x3c
[   10.125294]  __filemap_get_folio+0x17c/0x204
[   10.126084]  iomap_write_begin+0xf8/0x428
[   10.126829]  iomap_file_buffered_write+0x144/0x24c
[   10.127710]  xfs_file_buffered_write+0xe8/0x248
[   10.128553]  xfs_file_write_iter+0xa8/0x120
[   10.129324]  io_write+0x16c/0x38c
[   10.129940]  io_issue_sqe+0x70/0x1cc
[   10.130617]  io_queue_sqe+0x18/0xfc
[   10.131277]  io_submit_sqes+0x5d4/0x600
[   10.131946]  __arm64_sys_io_uring_enter+0x224/0x600
[   10.132752]  invoke_syscall.constprop.0+0x70/0xc0
[   10.133616]  do_el0_svc+0xd0/0x118
[   10.134238]  el0_svc+0x78/0xa0

Clear IO, FS, and reclaim flags and mark the allocation as GFP_NOWAIT and
add __GFP_NOWARN to avoid polluting dmesg with pointless allocation
failures. A caller with FGP_NOWAIT must be expected to handle the
resulting -EAGAIN return and retry from a suitable context without NOWAIT
set.

Reviewed-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24 18:39:32 -06:00
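
Illustration for 0dd316ba86 above: a minimal userspace sketch of the flag adjustment the commit message describes, assuming made-up bit values. Every SK_-prefixed name and sketch_adjust_gfp() are hypothetical stand-ins rather than kernel identifiers; only the masking logic (clear the IO, FS and reclaim bits, then apply the GFP_NOWAIT bits plus the no-warn bit) is taken from the message.

```c
#include <stdint.h>

/* Illustrative bit values only; the real kernel constants differ. */
typedef uint32_t sk_gfp_t;

#define SK_GFP_IO             0x01u  /* allocation may start physical I/O */
#define SK_GFP_FS             0x02u  /* allocation may call into the filesystem */
#define SK_GFP_DIRECT_RECLAIM 0x04u  /* allocation may block in direct reclaim */
#define SK_GFP_KSWAPD_RECLAIM 0x08u  /* allocation may wake background reclaim */
#define SK_GFP_NOWARN         0x10u  /* do not log a warning on failure */

/* Assumption: modeled after GFP_KERNEL = IO + FS + both reclaim bits. */
#define SK_GFP_KERNEL (SK_GFP_IO | SK_GFP_FS | \
                       SK_GFP_DIRECT_RECLAIM | SK_GFP_KSWAPD_RECLAIM)
/* Assumption: modeled after GFP_NOWAIT = background reclaim only. */
#define SK_GFP_NOWAIT (SK_GFP_KSWAPD_RECLAIM)

#define SK_FGP_CREAT  0x1u  /* create the page cache entry if missing */
#define SK_FGP_NOWAIT 0x2u  /* caller must not block */

/*
 * Return the gfp mask to use for the page cache allocation.
 * Without SK_FGP_NOWAIT the mask is returned unchanged.
 */
sk_gfp_t sketch_adjust_gfp(sk_gfp_t gfp, unsigned int fgp_flags)
{
    if (fgp_flags & SK_FGP_NOWAIT) {
        gfp &= ~SK_GFP_KERNEL;                 /* drop IO, FS and reclaim bits */
        gfp |= SK_GFP_NOWAIT | SK_GFP_NOWARN;  /* fail fast and stay quiet */
    }
    return gfp;
}
```

With these stand-in values, a GFP_KERNEL-style mask passed together with the NOWAIT flag comes back as only the background-reclaim and no-warn bits, so the allocation can fail quickly instead of sleeping. As the commit message says, the caller is then expected to handle -EAGAIN and retry from a context that may block.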
Matthew Wilcox (Oracle)
290e1a3204 filemap: Use filemap_read_folio() in do_read_cache_folio()
By passing ->read_folio to filemap_read_folio(), we can use
filemap_read_folio() in do_read_cache_folio().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-06-29 08:51:06 -04:00
Matthew Wilcox (Oracle)
1dfa24a4bf filemap: Handle AOP_TRUNCATED_PAGE in do_read_cache_folio()
If the call to filler() returns AOP_TRUNCATED_PAGE, we need to
retry the page cache lookup.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-06-29 08:51:06 -04:00
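
Illustration for 1dfa24a4bf above: a rough userspace sketch of the retry pattern the message describes, under the assumption of a one-slot toy cache. The sk_ types and functions are hypothetical and do not reproduce do_read_cache_folio(); the only behaviour taken from the message is that a filler result of AOP_TRUNCATED_PAGE sends the code back to the page cache lookup.

```c
#include <stddef.h>

/* Stand-in for the kernel's AOP_TRUNCATED_PAGE return code. */
#define SK_AOP_TRUNCATED_PAGE 0x80001

struct sk_folio {
    int uptodate;           /* nonzero once the folio holds valid data */
};

/* Toy one-slot "page cache" used only to exercise the retry loop. */
struct sk_cache {
    struct sk_folio slot;
    int truncations_left;   /* how often the filler reports truncation */
    int lookups;            /* number of lookups performed so far */
};

/* Hypothetical lookup: always finds the single slot and counts the call. */
struct sk_folio *sk_lookup(struct sk_cache *c)
{
    c->lookups++;
    return &c->slot;
}

/* Hypothetical filler: reports truncation a few times, then succeeds. */
int sk_filler(struct sk_cache *c, struct sk_folio *folio)
{
    if (c->truncations_left > 0) {
        c->truncations_left--;
        return SK_AOP_TRUNCATED_PAGE;
    }
    folio->uptodate = 1;
    return 0;
}

/* Look up a folio and fill it, retrying the lookup on truncation. */
int sk_read_cache_folio(struct sk_cache *c, struct sk_folio **out)
{
    for (;;) {
        struct sk_folio *folio = sk_lookup(c);
        int err;

        if (folio->uptodate) {
            *out = folio;
            return 0;
        }
        err = sk_filler(c, folio);
        if (err == SK_AOP_TRUNCATED_PAGE)
            continue;       /* folio left the cache: redo the lookup */
        if (err)
            return err;     /* any other error is passed to the caller */
        *out = folio;
        return 0;
    }
}

/* Demo helper: number of lookups needed after n reported truncations. */
int sk_demo_lookups(int truncations)
{
    struct sk_cache c = { { 0 }, truncations, 0 };
    struct sk_folio *folio = NULL;

    if (sk_read_cache_folio(&c, &folio) != 0 || folio == NULL)
        return -1;
    return c.lookups;
}
```

In this sketch each reported truncation costs exactly one extra lookup, so sk_demo_lookups(n) performs n + 1 lookups before the folio becomes up to date.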
Matthew Wilcox (Oracle)
9bc3e86938 filemap: Move 'filler' case to the end of do_read_cache_folio()
No functionality change intended; this simply moves code around to
disentangle the function a little.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-06-29 08:51:06 -04:00
Matthew Wilcox (Oracle)
bb4b42ba92 filemap: Remove find_get_pages_range() and associated functions
All callers of find_get_pages_range(), pagevec_lookup_range() and
pagevec_lookup() have now been removed.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2022-06-29 08:51:06 -04:00
Matthew Wilcox (Oracle)
be0ced5e9c filemap: Add filemap_get_folios()
This is the equivalent of find_get_pages() but fills a folio_batch
instead of an array of pages.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2022-06-29 08:51:05 -04:00
Matthew Wilcox (Oracle)
2bb876b58d filemap: Remove add_to_page_cache() and add_to_page_cache_locked()
These functions have no more users, so delete them.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
2022-06-29 08:51:05 -04:00
Matthew Wilcox (Oracle)
cb995f4eeb filemap: Handle sibling entries in filemap_get_read_batch()
If a read races with an invalidation followed by another read, it is
possible for a folio to be replaced with a higher-order folio.  If that
happens, we'll see a sibling entry for the new folio in the next iteration
of the loop.  This manifests as a NULL pointer dereference while holding
the RCU read lock.

Handle this by simply returning.  The next call will find the new folio
and handle it correctly.  The other ways of handling this rare race are
more complex and it's just not worth it.

Reported-by: Dave Chinner <david@fromorbit.com>
Reported-by: Brian Foster <bfoster@redhat.com>
Debugged-by: Brian Foster <bfoster@redhat.com>
Tested-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Fixes: cbd59c48ae ("mm/filemap: use head pages in generic_file_buffered_read")
Cc: stable@vger.kernel.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-06-20 16:37:45 -04:00
Matthew Wilcox (Oracle)
5ccc944dce filemap: Correct the conditions for marking a folio as accessed
We had an off-by-one error which meant that we never marked the first page
in a read as accessed.  This was visible as a slowdown when re-reading
a file as pages were being evicted from cache too soon.  In reviewing
this code, we noticed a second bug where a multi-page folio would be
marked as accessed multiple times when doing reads that were less than
the size of the folio.

Abstract the comparison of whether two file positions are in the same
folio into a new function, fixing both of these bugs.

Reported-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-06-20 16:37:45 -04:00
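
Illustration for 5ccc944dce above: one way the "are these two file positions in the same folio" comparison could look, written as a standalone sketch. It assumes folios are naturally aligned and have a power-of-two size, so the folio is identified by the position shifted right by log2 of the folio size. The name sk_pos_same_folio() and its folio_shift parameter are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true when both byte offsets fall inside the same folio.
 * folio_shift is log2 of the folio size in bytes, e.g. 12 for a
 * single 4 KiB page or 14 for a 16 KiB multi-page folio.
 */
bool sk_pos_same_folio(int64_t pos1, int64_t pos2, unsigned int folio_shift)
{
    return (pos1 >> folio_shift) == (pos2 >> folio_shift);
}
```

A reader would then mark a folio as accessed only when the current read position is not in the same folio as the position where the previous read ended. Comparing at folio granularity rather than page granularity is what avoids marking a multi-page folio repeatedly during small sequential reads.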
Matthew Wilcox (Oracle)
dcfa24ba68 filemap: Cache the value of vm_flags
After we have unlocked the mmap_lock for I/O, the file is pinned, but
the VMA is not.  Checking this flag after that can be a use-after-free.
It's not a terribly interesting use-after-free as it can only read one
bit, and it's used to decide whether to read 2MB or 4MB.  But it
upsets the automated tools and it's generally bad practice anyway,
so let's fix it.

Reported-by: syzbot+5b96d55e5b54924c77ad@syzkaller.appspotmail.com
Fixes: 4687fdbb80 ("mm/filemap: Support VM_HUGEPAGE for file mappings")
Cc: stable@vger.kernel.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-06-09 16:24:25 -04:00
Linus Torvalds
98931dd95f Yang Shi has improved the behaviour of khugepaged collapsing of readonly
file-backed transparent hugepages.
 
 Johannes Weiner has arranged for zswap memory use to be tracked and
 managed on a per-cgroup basis.
 
 Muchun Song adds a /proc knob ("hugetlb_optimize_vmemmap") for runtime
 enablement of the recent huge page vmemmap optimization feature.
 
 Baolin Wang contributes a series to fix some issues around hugetlb
 pagetable invalidation.
 
 Zhenwei Pi has fixed some interactions between hwpoisoned pages and
 virtualization.
 
 Tong Tiangen has enabled the use of the presently x86-only
 page_table_check debugging feature on arm64 and riscv.
 
 David Vernet has done some fixup work on the memcg selftests.
 
 Peter Xu has taught userfaultfd to handle write protection faults against
 shmem- and hugetlbfs-backed files.
 
 More DAMON development from SeongJae Park - adding online tuning of the
 feature and support for monitoring of fixed virtual address ranges.  Also
 easier discovery of which monitoring operations are available.
 
 Nadav Amit has done some optimization of TLB flushing during mprotect().
 
 Neil Brown continues to labor away at improving our swap-over-NFS support.
 
 David Hildenbrand has some fixes to anon page COWing versus
 get_user_pages().
 
 Peng Liu fixed some errors in the core hugetlb code.
 
 Joao Martins has reduced the amount of memory consumed by device-dax's
 compound devmaps.
 
 Some cleanups of the arch-specific pagemap code from Anshuman Khandual.
 
 Muchun Song has found and fixed some errors in the TLB flushing of
 transparent hugepages.
 
 Roman Gushchin has done more work on the memcg selftests.
 
 And, of course, many smaller fixes and cleanups.  Notably, the customary
 million cleanup serieses from Miaohe Lin.

Merge tag 'mm-stable-2022-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:
 "Almost all of MM here. A few things are still getting finished off,
  reviewed, etc.

   - Yang Shi has improved the behaviour of khugepaged collapsing of
     readonly file-backed transparent hugepages.

   - Johannes Weiner has arranged for zswap memory use to be tracked and
     managed on a per-cgroup basis.

   - Muchun Song adds a /proc knob ("hugetlb_optimize_vmemmap") for
     runtime enablement of the recent huge page vmemmap optimization
     feature.

   - Baolin Wang contributes a series to fix some issues around hugetlb
     pagetable invalidation.

   - Zhenwei Pi has fixed some interactions between hwpoisoned pages and
     virtualization.

   - Tong Tiangen has enabled the use of the presently x86-only
     page_table_check debugging feature on arm64 and riscv.

   - David Vernet has done some fixup work on the memcg selftests.

   - Peter Xu has taught userfaultfd to handle write protection faults
     against shmem- and hugetlbfs-backed files.

   - More DAMON development from SeongJae Park - adding online tuning of
     the feature and support for monitoring of fixed virtual address
     ranges. Also easier discovery of which monitoring operations are
     available.

   - Nadav Amit has done some optimization of TLB flushing during
     mprotect().

   - Neil Brown continues to labor away at improving our swap-over-NFS
     support.

   - David Hildenbrand has some fixes to anon page COWing versus
     get_user_pages().

   - Peng Liu fixed some errors in the core hugetlb code.

   - Joao Martins has reduced the amount of memory consumed by
     device-dax's compound devmaps.

   - Some cleanups of the arch-specific pagemap code from Anshuman
     Khandual.

   - Muchun Song has found and fixed some errors in the TLB flushing of
     transparent hugepages.

   - Roman Gushchin has done more work on the memcg selftests.

  ... and, of course, many smaller fixes and cleanups. Notably, the
  customary million cleanup serieses from Miaohe Lin"

* tag 'mm-stable-2022-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (381 commits)
  mm: kfence: use PAGE_ALIGNED helper
  selftests: vm: add the "settings" file with timeout variable
  selftests: vm: add "test_hmm.sh" to TEST_FILES
  selftests: vm: check numa_available() before operating "merge_across_nodes" in ksm_tests
  selftests: vm: add migration to the .gitignore
  selftests/vm/pkeys: fix typo in comment
  ksm: fix typo in comment
  selftests: vm: add process_mrelease tests
  Revert "mm/vmscan: never demote for memcg reclaim"
  mm/kfence: print disabling or re-enabling message
  include/trace/events/percpu.h: cleanup for "percpu: improve percpu_alloc_percpu event trace"
  include/trace/events/mmflags.h: cleanup for "tracing: incorrect gfp_t conversion"
  mm: fix a potential infinite loop in start_isolate_page_range()
  MAINTAINERS: add Muchun as co-maintainer for HugeTLB
  zram: fix Kconfig dependency warning
  mm/shmem: fix shmem folio swapoff hang
  cgroup: fix an error handling path in alloc_pagecache_max_30M()
  mm: damon: use HPAGE_PMD_SIZE
  tracing: incorrect isolate_mote_t cast in mm_vmscan_lru_isolate
  nodemask.h: fix compilation error with GCC12
  ...
2022-05-26 12:32:41 -07:00
Peter Xu
5c041f5d1f mm: teach core mm about pte markers
This patch still does not use pte marker in any way, however it teaches
the core mm about the pte marker idea.

For example, handle_pte_marker() is introduced that will parse and handle
all the pte marker faults.

Many of the places are more about commenting it up - so that we know
there's the possibility of pte marker showing up, and why we don't need
special code for the cases.

[peterx@redhat.com: userfaultfd.c needs swapops.h]
  Link: https://lkml.kernel.org/r/YmRlVj3cdizYJsr0@xz-m1.local
Link: https://lkml.kernel.org/r/20220405014833.14015-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-13 07:20:09 -07:00
Matthew Wilcox (Oracle)
8560cb1a7d fs: Remove aops->freepage
All implementations now use free_folio so we can delete the callers
and the method.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-05-09 23:12:54 -04:00
Matthew Wilcox (Oracle)
d2329aa0c7 fs: Add free_folio address space operation
Include documentation and convert the callers to use ->free_folio as
well as ->freepage.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-05-09 23:12:52 -04:00
Matthew Wilcox (Oracle)
68189fef88 fs: Change try_to_free_buffers() to take a folio
All but two of the callers already have a folio; pass a folio into
try_to_free_buffers().  This removes the last user of cancel_dirty_page()
so remove that wrapper function too.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2022-05-09 23:12:34 -04:00
Matthew Wilcox (Oracle)
704ead2bed fs: Remove last vestiges of releasepage
All users are now converted to release_folio.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2022-05-09 23:12:33 -04:00
Matthew Wilcox (Oracle)
fa29000b6b fs: Add aops->release_folio
This replaces aops->releasepage.  Update the documentation, and call it
if it exists.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2022-05-09 23:12:30 -04:00
Matthew Wilcox (Oracle)
0795000869 mm/filemap: Hoist filler_t decision to the top of do_read_cache_folio()
Now that filler_t and aops->read_folio() have the same type, we can decide
which one to use at the top of the function, and cache ->read_folio in
the filler parameter.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-05-09 16:36:52 -04:00
Matthew Wilcox (Oracle)
e9b5b23e95 fs: Change the type of filler_t
By making filler_t the same as read_folio, we can use the same function
for both in gfs2.  We can push the use of folios down one more level
in jffs2 and nfs.  We also increase type safety for future users of the
various read_cache_page() family of functions by forcing the parameter
to be a pointer to struct file (or NULL).

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
2022-05-09 16:36:48 -04:00