Commit Graph

161 Commits

Author SHA1 Message Date
Todd Kjos
f27fc6ba23 Merge "Merge tag 'android14-6.1.68_r00' into branch 'android14-6.1'" into android14-6.1 2024-01-24 17:34:59 +00:00
Lianjun Huang
d4db0d5d08 ANDROID: GKI: add vendor hooks for swapping in ahead
Add vendor hooks to capture demand paging during APP launch,
so we can do it in advance in next launch.

Bug: 315913896
Signed-off-by: Lianjun Huang <huanglianjun@xiaomi.com>
Signed-off-by: Lianjun Huang <huanglianjun@xiaomi.corp-partner.google.com>
Change-Id: I2698fefd347745fb4ff84b111caedbb3bb365ce3
2024-01-12 18:47:42 +00:00
Greg Kroah-Hartman
2b3ea8bdef This is the 6.1.63 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmVbOmsACgkQONu9yGCS
 aT5m1RAAx7hgbFDnLHCGh4YVBbNy8JngItsUBaJcI/67Mk5toNi0x8pqcS8mq7ED
 GTwRnRcKaIR2bTyco5Ed2OZn4jMCyHC4oiyBZnHWg6AMuQjSCYzIgm7DzlTCVYZ7
 2r8uRbt/uXADTILJ2kwR2mtVpGcwrXa+lsHrMqvt+MvNwRoSVHBHVVYCrAc+JXwR
 GXCopzV/RFGS6w4SBsX0K+8pV7GO+bhpxJ1lPz1T/xeLYfT4C3EwSTWDbUXPbez7
 IpJ+5yKJXXT9Xn9m/pekwZ/aOirLqtEbDxneEctsjvw140lCoQiEZn6ZRscgNEns
 3H+J3Asgc2zXqPzfZFH02TebPj31B8HZ43Upu0okr0hr4A4/4JL9pjXEhm1bON/Z
 x3jlTF4dyay4vOGGIEYOAuJSUbn6AqpZ318uBWCd3BSPocihEDMJz2aoazVHcb6k
 83MVxfFfEL6s9utcoSXB8VjHa4FQmpMYsozegloUSJJCsizgdzmih0buJYhBB9sI
 HbEohW+YAh3cACSn6arXUJIMH5F5xsfD89od2Pj+6UrapdlPz5gCaggA1RZplCho
 bjGc1k61Rp2qSdfMEcx+h4ypgoOdhgqZI0YhYDCgBSRcWOXnGrDjFvnnumatcT+H
 6vqyX6zlNt6U1NpE56Jtf7gt1Ds6PeoadD0L6B8vjXrkdeXOlUU=
 =AZ9s
 -----END PGP SIGNATURE-----

Merge 6.1.63 into android14-6.1-lts

Changes in 6.1.63
	hwmon: (nct6775) Fix incorrect variable reuse in fan_div calculation
	sched/fair: Fix cfs_rq_is_decayed() on !SMP
	iov_iter, x86: Be consistent about the __user tag on copy_mc_to_user()
	sched/uclamp: Set max_spare_cap_cpu even if max_spare_cap is 0
	sched/uclamp: Ignore (util == 0) optimization in feec() when p_util_max = 0
	objtool: Propagate early errors
	sched: Fix stop_one_cpu_nowait() vs hotplug
	vfs: fix readahead(2) on block devices
	writeback, cgroup: switch inodes with dirty timestamps to release dying cgwbs
	x86/srso: Fix SBPB enablement for (possible) future fixed HW
	futex: Don't include process MM in futex key on no-MMU
	x86/numa: Introduce numa_fill_memblks()
	ACPI/NUMA: Apply SRAT proximity domain to entire CFMWS window
	x86/sev-es: Allow copy_from_kernel_nofault() in earlier boot
	x86/boot: Fix incorrect startup_gdt_descr.size
	drivers/clocksource/timer-ti-dm: Don't call clk_get_rate() in stop function
	pstore/platform: Add check for kstrdup
	string: Adjust strtomem() logic to allow for smaller sources
	genirq/matrix: Exclude managed interrupts in irq_matrix_allocated()
	wifi: cfg80211: add flush functions for wiphy work
	wifi: mac80211: move radar detect work to wiphy work
	wifi: mac80211: move scan work to wiphy work
	wifi: mac80211: move offchannel works to wiphy work
	wifi: mac80211: move sched-scan stop work to wiphy work
	wifi: mac80211: fix # of MSDU in A-MSDU calculation
	wifi: iwlwifi: honor the enable_ini value
	i40e: fix potential memory leaks in i40e_remove()
	iavf: Fix promiscuous mode configuration flow messages
	selftests/bpf: Correct map_fd to data_fd in tailcalls
	udp: add missing WRITE_ONCE() around up->encap_rcv
	tcp: call tcp_try_undo_recovery when an RTOd TFO SYNACK is ACKed
	gve: Use size_add() in call to struct_size()
	mlxsw: Use size_mul() in call to struct_size()
	tls: Only use data field in crypto completion function
	tls: Use size_add() in call to struct_size()
	tipc: Use size_add() in calls to struct_size()
	net: spider_net: Use size_add() in call to struct_size()
	net: ethernet: mtk_wed: fix EXT_INT_STATUS_RX_FBUF definitions for MT7986 SoC
	wifi: rtw88: debug: Fix the NULL vs IS_ERR() bug for debugfs_create_file()
	wifi: ath11k: fix boot failure with one MSI vector
	wifi: mt76: mt7603: rework/fix rx pse hang check
	wifi: mt76: mt7603: improve watchdog reset reliablity
	wifi: mt76: mt7603: improve stuck beacon handling
	wifi: mt76: mt7915: fix beamforming availability check
	wifi: ath: dfs_pattern_detector: Fix a memory initialization issue
	tcp_metrics: add missing barriers on delete
	tcp_metrics: properly set tp->snd_ssthresh in tcp_init_metrics()
	tcp_metrics: do not create an entry from tcp_init_metrics()
	wifi: rtlwifi: fix EDCA limit set by BT coexistence
	ACPI: property: Allow _DSD buffer data only for byte accessors
	ACPI: video: Add acpi_backlight=vendor quirk for Toshiba Portégé R100
	wifi: ath11k: fix Tx power value during active CAC
	can: dev: can_restart(): don't crash kernel if carrier is OK
	can: dev: can_restart(): fix race condition between controller restart and netif_carrier_on()
	can: dev: can_put_echo_skb(): don't crash kernel if can_priv::echo_skb is accessed out of bounds
	PM / devfreq: rockchip-dfi: Make pmu regmap mandatory
	wifi: wfx: fix case where rates are out of order
	netfilter: nf_tables: Drop pointless memset when dumping rules
	thermal: core: prevent potential string overflow
	r8169: use tp_to_dev instead of open code
	r8169: fix rare issue with broken rx after link-down on RTL8125
	selftests: netfilter: test for sctp collision processing in nf_conntrack
	net: skb_find_text: Ignore patterns extending past 'to'
	chtls: fix tp->rcv_tstamp initialization
	tcp: fix cookie_init_timestamp() overflows
	wifi: iwlwifi: call napi_synchronize() before freeing rx/tx queues
	wifi: iwlwifi: pcie: synchronize IRQs before NAPI
	wifi: iwlwifi: empty overflow queue during flush
	Bluetooth: hci_sync: Fix Opcode prints in bt_dev_dbg/err
	bpf: Fix unnecessary -EBUSY from htab_lock_bucket
	ACPI: sysfs: Fix create_pnp_modalias() and create_of_modalias()
	ipv6: avoid atomic fragment on GSO packets
	net: add DEV_STATS_READ() helper
	ipvlan: properly track tx_errors
	regmap: debugfs: Fix a erroneous check after snprintf()
	spi: tegra: Fix missing IRQ check in tegra_slink_probe()
	clk: qcom: gcc-msm8996: Remove RPM bus clocks
	clk: qcom: clk-rcg2: Fix clock rate overflow for high parent frequencies
	clk: qcom: mmcc-msm8998: Don't check halt bit on some branch clks
	clk: qcom: mmcc-msm8998: Fix the SMMU GDSC
	clk: qcom: gcc-sm8150: Fix gcc_sdcc2_apps_clk_src
	regulator: mt6358: Fail probe on unknown chip ID
	clk: imx: Select MXC_CLK for CLK_IMX8QXP
	clk: imx: imx8mq: correct error handling path
	clk: imx: imx8qxp: Fix elcdif_pll clock
	clk: renesas: rcar-gen3: Extend SDnH divider table
	clk: renesas: rzg2l: Wait for status bit of SD mux before continuing
	clk: renesas: rzg2l: Lock around writes to mux register
	clk: renesas: rzg2l: Trust value returned by hardware
	clk: renesas: rzg2l: Use FIELD_GET() for PLL register fields
	clk: renesas: rzg2l: Fix computation formula
	clk: linux/clk-provider.h: fix kernel-doc warnings and typos
	spi: nxp-fspi: use the correct ioremap function
	clk: keystone: pll: fix a couple NULL vs IS_ERR() checks
	clk: ti: change ti_clk_register[_omap_hw]() API
	clk: ti: fix double free in of_ti_divider_clk_setup()
	clk: npcm7xx: Fix incorrect kfree
	clk: mediatek: clk-mt6765: Add check for mtk_alloc_clk_data
	clk: mediatek: clk-mt6779: Add check for mtk_alloc_clk_data
	clk: mediatek: clk-mt6797: Add check for mtk_alloc_clk_data
	clk: mediatek: clk-mt7629-eth: Add check for mtk_alloc_clk_data
	clk: mediatek: clk-mt7629: Add check for mtk_alloc_clk_data
	clk: mediatek: clk-mt2701: Add check for mtk_alloc_clk_data
	clk: qcom: config IPQ_APSS_6018 should depend on QCOM_SMEM
	platform/x86: wmi: Fix probe failure when failing to register WMI devices
	platform/x86: wmi: Fix opening of char device
	hwmon: (axi-fan-control) Fix possible NULL pointer dereference
	hwmon: (coretemp) Fix potentially truncated sysfs attribute name
	Revert "hwmon: (sch56xx-common) Add DMI override table"
	Revert "hwmon: (sch56xx-common) Add automatic module loading on supported devices"
	hwmon: (sch5627) Use bit macros when accessing the control register
	hwmon: (sch5627) Disallow write access if virtual registers are locked
	hte: tegra: Fix missing error code in tegra_hte_test_probe()
	drm/rockchip: vop: Fix reset of state in duplicate state crtc funcs
	drm/rockchip: vop: Fix call to crtc reset helper
	drm/rockchip: vop2: Don't crash for invalid duplicate_state
	drm/rockchip: vop2: Add missing call to crtc reset helper
	drm/radeon: possible buffer overflow
	drm: bridge: it66121: Fix invalid connector dereference
	drm/bridge: lt8912b: Add hot plug detection
	drm/bridge: lt8912b: Fix bridge_detach
	drm/bridge: lt8912b: Fix crash on bridge detach
	drm/bridge: lt8912b: Manually disable HPD only if it was enabled
	drm/bridge: lt8912b: Add missing drm_bridge_attach call
	drm/bridge: tc358768: Fix use of uninitialized variable
	drm/bridge: tc358768: Fix bit updates
	drm/bridge: tc358768: remove unused variable
	drm/bridge: tc358768: Use struct videomode
	drm/bridge: tc358768: Print logical values, not raw register values
	drm/bridge: tc358768: Use dev for dbg prints, not priv->dev
	drm/bridge: tc358768: Rename dsibclk to hsbyteclk
	drm/bridge: tc358768: Clean up clock period code
	drm/bridge: tc358768: Fix tc358768_ns_to_cnt()
	drm/amdkfd: fix some race conditions in vram buffer alloc/free of svm code
	drm/amd/display: Check all enabled planes in dm_check_crtc_cursor
	drm/amd/display: Refactor dm_get_plane_scale helper
	drm/amd/display: Bail from dm_check_crtc_cursor if no relevant change
	io_uring/kbuf: Fix check of BID wrapping in provided buffers
	io_uring/kbuf: Allow the full buffer id space for provided buffers
	drm/mediatek: Fix iommu fault by swapping FBs after updating plane state
	drm/mediatek: Fix iommu fault during crtc enabling
	drm/rockchip: cdn-dp: Fix some error handling paths in cdn_dp_probe()
	gpu: host1x: Correct allocated size for contexts
	drm/bridge: lt9611uxc: fix the race in the error path
	arm64/arm: xen: enlighten: Fix KPTI checks
	drm/rockchip: Fix type promotion bug in rockchip_gem_iommu_map()
	xenbus: fix error exit in xenbus_init()
	xen-pciback: Consider INTx disabled when MSI/MSI-X is enabled
	drm/msm/dsi: use msm_gem_kernel_put to free TX buffer
	drm/msm/dsi: free TX buffer in unbind
	clocksource/drivers/arm_arch_timer: limit XGene-1 workaround
	drm: mediatek: mtk_dsi: Fix NO_EOT_PACKET settings/handling
	drivers/perf: hisi: use cpuhp_state_remove_instance_nocalls() for hisi_hns3_pmu uninit process
	perf/arm-cmn: Revamp model detection
	perf/arm-cmn: Fix DTC domain detection
	drivers/perf: hisi_pcie: Check the type first in pmu::event_init()
	perf: hisi: Fix use-after-free when register pmu fails
	ARM: dts: renesas: blanche: Fix typo in GP_11_2 pin name
	arm64: dts: qcom: sdm845: cheza doesn't support LMh node
	arm64: dts: qcom: sc7280: link usb3_phy_wrapper_gcc_usb30_pipe_clk
	arm64: dts: qcom: msm8916: Fix iommu local address range
	arm64: dts: qcom: msm8992-libra: drop duplicated reserved memory
	arm64: dts: qcom: sc7280: Add missing LMH interrupts
	arm64: dts: qcom: sm8150: add ref clock to PCIe PHYs
	arm64: dts: qcom: sm8350: fix pinctrl for UART18
	arm64: dts: qcom: sdm845-mtp: fix WiFi configuration
	ARM64: dts: marvell: cn9310: Use appropriate label for spi1 pins
	arm64: dts: qcom: apq8016-sbc: Add missing ADV7533 regulators
	ARM: dts: qcom: mdm9615: populate vsdcc fixed regulator
	soc: qcom: llcc: Handle a second device without data corruption
	kunit: Fix missed memory release in kunit_free_suite_set()
	firmware: ti_sci: Mark driver as non removable
	arm64: dts: ti: k3-am62a7-sk: Drop i2c-1 to 100Khz
	firmware: arm_ffa: Assign the missing IDR allocation ID to the FFA device
	firmware: arm_ffa: Allow the FF-A drivers to use 32bit mode of messaging
	ARM: dts: am3517-evm: Fix LED3/4 pinmux
	clk: scmi: Free scmi_clk allocated when the clocks with invalid info are skipped
	arm64: dts: imx8qm-ss-img: Fix jpegenc compatible entry
	arm64: dts: imx8mm: Add sound-dai-cells to micfil node
	arm64: dts: imx8mn: Add sound-dai-cells to micfil node
	arm64: tegra: Use correct interrupts for Tegra234 TKE
	selftests/pidfd: Fix ksft print formats
	selftests/resctrl: Ensure the benchmark commands fits to its array
	module/decompress: use vmalloc() for gzip decompression workspace
	ASoC: cs35l41: Verify PM runtime resume errors in IRQ handler
	ASoC: cs35l41: Undo runtime PM changes at driver exit time
	ALSA: hda: cs35l41: Fix unbalanced pm_runtime_get()
	ALSA: hda: cs35l41: Undo runtime PM changes at driver exit time
	KEYS: Include linux/errno.h in linux/verification.h
	crypto: hisilicon/hpre - Fix a erroneous check after snprintf()
	hwrng: bcm2835 - Fix hwrng throughput regression
	hwrng: geode - fix accessing registers
	RDMA/core: Use size_{add,sub,mul}() in calls to struct_size()
	crypto: qat - ignore subsequent state up commands
	crypto: qat - relocate bufferlist logic
	crypto: qat - rename bufferlist functions
	crypto: qat - change bufferlist logic interface
	crypto: qat - generalize crypto request buffers
	crypto: qat - extend buffer list interface
	crypto: qat - fix unregistration of crypto algorithms
	scsi: ibmvfc: Fix erroneous use of rtas_busy_delay with hcall return code
	libnvdimm/of_pmem: Use devm_kstrdup instead of kstrdup and check its return value
	nd_btt: Make BTT lanes preemptible
	crypto: caam/qi2 - fix Chacha20 + Poly1305 self test failure
	crypto: caam/jr - fix Chacha20 + Poly1305 self test failure
	crypto: qat - increase size of buffers
	PCI: vmd: Correct PCI Header Type Register's multi-function check
	hid: cp2112: Fix duplicate workqueue initialization
	crypto: hisilicon/qm - delete redundant null assignment operations
	crypto: hisilicon/qm - modify the process of regs dfx
	crypto: hisilicon/qm - split a debugfs.c from qm
	crypto: hisilicon/qm - fix PF queue parameter issue
	ARM: 9321/1: memset: cast the constant byte to unsigned char
	ext4: move 'ix' sanity check to corrent position
	ASoC: fsl: mpc5200_dma.c: Fix warning of Function parameter or member not described
	IB/mlx5: Fix rdma counter binding for RAW QP
	RDMA/hns: Fix printing level of asynchronous events
	RDMA/hns: Fix uninitialized ucmd in hns_roce_create_qp_common()
	RDMA/hns: Fix signed-unsigned mixed comparisons
	RDMA/hns: Add check for SL
	RDMA/hns: The UD mode can only be configured with DCQCN
	ASoC: SOF: core: Ensure sof_ops_free() is still called when probe never ran.
	ASoC: fsl: Fix PM disable depth imbalance in fsl_easrc_probe
	scsi: ufs: core: Leave space for '\0' in utf8 desc string
	RDMA/hfi1: Workaround truncation compilation error
	HID: cp2112: Make irq_chip immutable
	hid: cp2112: Fix IRQ shutdown stopping polling for all IRQs on chip
	sh: bios: Revive earlyprintk support
	Revert "HID: logitech-hidpp: add a module parameter to keep firmware gestures"
	HID: logitech-hidpp: Remove HIDPP_QUIRK_NO_HIDINPUT quirk
	HID: logitech-hidpp: Don't restart IO, instead defer hid_connect() only
	HID: logitech-hidpp: Revert "Don't restart communication if not necessary"
	HID: logitech-hidpp: Move get_wireless_feature_index() check to hidpp_connect_event()
	ASoC: Intel: Skylake: Fix mem leak when parsing UUIDs fails
	padata: Fix refcnt handling in padata_free_shell()
	crypto: qat - fix deadlock in backlog processing
	ASoC: ams-delta.c: use component after check
	IB/mlx5: Fix init stage error handling to avoid double free of same QP and UAF
	mfd: core: Un-constify mfd_cell.of_reg
	mfd: core: Ensure disabled devices are skipped without aborting
	mfd: dln2: Fix double put in dln2_probe
	dt-bindings: mfd: mt6397: Add binding for MT6357
	dt-bindings: mfd: mt6397: Split out compatible for MediaTek MT6366 PMIC
	mfd: arizona-spi: Set pdata.hpdet_channel for ACPI enumerated devs
	leds: turris-omnia: Drop unnecessary mutex locking
	leds: turris-omnia: Do not use SMBUS calls
	leds: pwm: Don't disable the PWM when the LED should be off
	leds: trigger: ledtrig-cpu:: Fix 'output may be truncated' issue for 'cpu'
	kunit: add macro to allow conditionally exposing static symbols to tests
	apparmor: test: make static symbols visible during kunit testing
	apparmor: fix invalid reference on profile->disconnected
	perf stat: Fix aggr mode initialization
	iio: frequency: adf4350: Use device managed functions and fix power down issue.
	perf kwork: Fix incorrect and missing free atom in work_push_atom()
	perf kwork: Add the supported subcommands to the document
	perf kwork: Set ordered_events to true in 'struct perf_tool'
	filemap: add filemap_get_folios_tag()
	f2fs: convert f2fs_write_cache_pages() to use filemap_get_folios_tag()
	f2fs: compress: fix deadloop in f2fs_write_cache_pages()
	f2fs: compress: fix to avoid use-after-free on dic
	f2fs: compress: fix to avoid redundant compress extension
	tty: tty_jobctrl: fix pid memleak in disassociate_ctty()
	livepatch: Fix missing newline character in klp_resolve_symbols()
	pinctrl: renesas: rzg2l: Make reverse order of enable() for disable()
	perf record: Fix BTF type checks in the off-cpu profiling
	dmaengine: idxd: Register dsa_bus_type before registering idxd sub-drivers
	usb: dwc2: fix possible NULL pointer dereference caused by driver concurrency
	usb: chipidea: Fix DMA overwrite for Tegra
	usb: chipidea: Simplify Tegra DMA alignment code
	dmaengine: ti: edma: handle irq_of_parse_and_map() errors
	misc: st_core: Do not call kfree_skb() under spin_lock_irqsave()
	tools: iio: iio_generic_buffer ensure alignment
	USB: usbip: fix stub_dev hub disconnect
	dmaengine: pxa_dma: Remove an erroneous BUG_ON() in pxad_free_desc()
	f2fs: fix to initialize map.m_pblk in f2fs_precache_extents()
	interconnect: qcom: sc7180: Retire DEFINE_QBCM
	interconnect: qcom: sc7180: Set ACV enable_mask
	interconnect: qcom: sc7280: Set ACV enable_mask
	interconnect: qcom: sc8180x: Set ACV enable_mask
	interconnect: qcom: sc8280xp: Set ACV enable_mask
	interconnect: qcom: sdm845: Retire DEFINE_QBCM
	interconnect: qcom: sdm845: Set ACV enable_mask
	interconnect: qcom: sm6350: Retire DEFINE_QBCM
	interconnect: qcom: sm6350: Set ACV enable_mask
	interconnect: move ignore_list out of of_count_icc_providers()
	interconnect: qcom: sm8150: Drop IP0 interconnects
	interconnect: qcom: sm8150: Retire DEFINE_QBCM
	interconnect: qcom: sm8150: Set ACV enable_mask
	interconnect: qcom: sm8350: Retire DEFINE_QBCM
	interconnect: qcom: sm8350: Set ACV enable_mask
	powerpc: Only define __parse_fpscr() when required
	modpost: fix tee MODULE_DEVICE_TABLE built on big-endian host
	modpost: fix ishtp MODULE_DEVICE_TABLE built on big-endian host
	powerpc/40x: Remove stale PTE_ATOMIC_UPDATES macro
	powerpc/xive: Fix endian conversion size
	powerpc/vas: Limit open window failure messages in log bufffer
	powerpc/imc-pmu: Use the correct spinlock initializer.
	powerpc/pseries: fix potential memory leak in init_cpu_associativity()
	xhci: Loosen RPM as default policy to cover for AMD xHC 1.1
	usb: host: xhci-plat: fix possible kernel oops while resuming
	perf machine: Avoid out of bounds LBR memory read
	perf hist: Add missing puts to hist__account_cycles
	9p/net: fix possible memory leak in p9_check_errors()
	i3c: Fix potential refcount leak in i3c_master_register_new_i3c_devs
	cxl/mem: Fix shutdown order
	crypto: ccp - Name -1 return value as SEV_RET_NO_FW_CALL
	x86/sev: Change snp_guest_issue_request()'s fw_err argument
	virt: sevguest: Fix passing a stack buffer as a scatterlist target
	rtc: pcf85363: fix wrong mask/val parameters in regmap_update_bits call
	pcmcia: cs: fix possible hung task and memory leak pccardd()
	pcmcia: ds: fix refcount leak in pcmcia_device_add()
	pcmcia: ds: fix possible name leak in error path in pcmcia_device_add()
	media: hantro: Check whether reset op is defined before use
	media: verisilicon: Do not enable G2 postproc downscale if source is narrower than destination
	media: ov5640: Drop dead code using frame_interval
	media: ov5640: fix vblank unchange issue when work at dvp mode
	media: i2c: max9286: Fix some redundant of_node_put() calls
	media: ov5640: Fix a memory leak when ov5640_probe fails
	media: bttv: fix use after free error due to btv->timeout timer
	media: amphion: handle firmware debug message
	media: mtk-jpegenc: Fix bug in JPEG encode quality selection
	media: s3c-camif: Avoid inappropriate kfree()
	media: vidtv: psi: Add check for kstrdup
	media: vidtv: mux: Add check and kfree for kstrdup
	media: cedrus: Fix clock/reset sequence
	media: cadence: csi2rx: Unregister v4l2 async notifier
	media: dvb-usb-v2: af9035: fix missing unlock
	media: cec: meson: always include meson sub-directory in Makefile
	regmap: prevent noinc writes from clobbering cache
	pwm: sti: Reduce number of allocations and drop usage of chip_data
	pwm: brcmstb: Utilize appropriate clock APIs in suspend/resume
	Input: synaptics-rmi4 - fix use after free in rmi_unregister_function()
	watchdog: ixp4xx: Make sure restart always works
	llc: verify mac len before reading mac header
	hsr: Prevent use after free in prp_create_tagged_frame()
	tipc: Change nla_policy for bearer-related names to NLA_NUL_STRING
	bpf: Check map->usercnt after timer->timer is assigned
	inet: shrink struct flowi_common
	octeontx2-pf: Fix error codes
	octeontx2-pf: Fix holes in error code
	net: page_pool: add missing free_percpu when page_pool_init fail
	dccp: Call security_inet_conn_request() after setting IPv4 addresses.
	dccp/tcp: Call security_inet_conn_request() after setting IPv6 addresses.
	net: r8169: Disable multicast filter for RTL8168H and RTL8107E
	Fix termination state for idr_for_each_entry_ul()
	net: stmmac: xgmac: Enable support for multiple Flexible PPS outputs
	selftests: pmtu.sh: fix result checking
	octeontx2-pf: Rename tot_tx_queues to non_qos_queues
	octeontx2-pf: qos send queues management
	octeontx2-pf: Free pending and dropped SQEs
	net/smc: fix dangling sock under state SMC_APPFINCLOSEWAIT
	net/smc: allow cdc msg send rather than drop it with NULL sndbuf_desc
	net/smc: put sk reference if close work was canceled
	nvme: fix error-handling for io_uring nvme-passthrough
	tg3: power down device only on SYSTEM_POWER_OFF
	nbd: fix uaf in nbd_open
	blk-core: use pr_warn_ratelimited() in bio_check_ro()
	virtio/vsock: replace virtio_vsock_pkt with sk_buff
	vsock/virtio: remove socket from connected/bound list on shutdown
	r8169: respect userspace disabling IFF_MULTICAST
	i2c: iproc: handle invalid slave state
	netfilter: xt_recent: fix (increase) ipv6 literal buffer length
	netfilter: nft_redir: use `struct nf_nat_range2` throughout and deduplicate eval call-backs
	netfilter: nat: fix ipv6 nat redirect with mapped and scoped addresses
	RISC-V: Don't fail in riscv_of_parent_hartid() for disabled HARTs
	drm/syncobj: fix DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE
	ASoC: mediatek: mt8186_mt6366_rt1019_rt5682s: trivial: fix error messages
	ASoC: hdmi-codec: register hpd callback on component probe
	ASoC: dapm: fix clock get name
	spi: spi-zynq-qspi: add spi-mem to driver kconfig dependencies
	fbdev: imsttfb: Fix error path of imsttfb_probe()
	fbdev: imsttfb: fix a resource leak in probe
	fbdev: fsl-diu-fb: mark wr_reg_wa() static
	tracing/kprobes: Fix the order of argument descriptions
	io_uring/net: ensure socket is marked connected on connect retry
	x86/amd_nb: Use Family 19h Models 60h-7Fh Function 4 IDs
	Revert "mmc: core: Capture correct oemid-bits for eMMC cards"
	btrfs: use u64 for buffer sizes in the tree search ioctls
	wifi: cfg80211: fix kernel-doc for wiphy_delayed_work_flush()
	virtio/vsock: don't use skbuff state to account credit
	virtio/vsock: remove redundant 'skb_pull()' call
	virtio/vsock: don't drop skbuff on copy failure
	vsock/loopback: use only sk_buff_head.lock to protect the packet queue
	virtio/vsock: fix leaks due to missing skb owner
	virtio/vsock: Fix uninit-value in virtio_transport_recv_pkt()
	virtio/vsock: fix header length on skb merging
	Linux 6.1.63

Change-Id: I87b7a539b11c90cfaf16edb07d613f74d54458a4
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-11-27 16:59:46 +00:00
Reuben Hawkins
bc8e02850a vfs: fix readahead(2) on block devices
[ Upstream commit 7116c0af4b8414b2f19fdb366eea213cbd9d91c2 ]

Readahead was factored to call generic_fadvise.  That refactor added an
S_ISREG restriction which broke readahead on block devices.

In addition to S_ISREG, this change checks S_ISBLK to fix block device
readahead.  There is no change in behavior with any file type besides block
devices in this change.

Fixes: 3d8f761531 ("vfs: implement readahead(2) using POSIX_FADV_WILLNEED")
Signed-off-by: Reuben Hawkins <reubenhwk@gmail.com>
Link: https://lore.kernel.org/r/20231003015704.2415-1-reubenhwk@gmail.com
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-20 11:51:50 +01:00
liang zhang
0c859c2180 ANDROID: add for tuning readahead size
Tune ReadAhead size for better memory usage and performance.
accordding to Read-Ahead Efficiency on Mobile Devices: Observation,
Characterization, and Optimization form IEEE

Bug: 229839032
Change-Id: I91656bde5e616e181fd7557554d55e7ce1858136
Signed-off-by: liang zhang <liang.zhang@transsion.com>
Signed-off-by: Oven <liyangouwen1@oppo.com>
2023-11-06 23:07:00 +00:00
Chris Goldsworthy
342be123fd ANDROID: mm: Create hooks for ZONE_MOVABLE allocs
Create a vendor hook inside of gfp_zone() to modify which allocations
get to enter ZONE_MOVABLE, by zeroing out __GFP_HIGHMEM inside of the
trace hook based on certain conditions.

Separately, create a trace hook in the readahead path to affect the
behavior of the tracehook in gfp_zone().

In 5.15, we had set_skip_swapcache_flags trace-hook in do_swap_page()
but commit ac26e9c7b809 ("ANDROID: cma: allow to use CMA in swap-in path")
added __GFP_CMA explicitly, so the set_skip_swapcache_flags trace hook
is no longer needed.

Note:	To comply with vendor hook guidlines, avoid including types.h in
	trace/hooks/mm.h and use unsigned int for gfp_t.

Bug: 158645321
Change-Id: Idfa6b0b06b1b819d706c847e702bc94ddf7aa55a
Signed-off-by: Chris Goldsworthy <cgoldswo@codeaurora.org>
Signed-off-by: Sukadev Bhattiprolu <quic_sukadev@quicinc.com>
2023-04-26 17:01:52 +00:00
Christoph Hellwig
176042404e mm: add PSI accounting around ->read_folio and ->readahead calls
PSI tries to account for the cost of bringing back in pages discarded by
the MM LRU management.  Currently the prime place for that is hooked into
the bio submission path, which is a rather bad place:

 - it does not actually account I/O for non-block file systems, of which
   we have many
 - it adds overhead and a layering violation to the block layer

Add the accounting into the two places in the core MM code that read
pages into an address space by calling into ->read_folio and ->readahead
so that the entire file system operations are covered, to broaden
the coverage and allow removing the accounting in the block layer going
forward.

As psi_memstall_enter can deal with nested calls this will not lead to
double accounting even while the bio annotations are still present.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Link: https://lore.kernel.org/r/20220915094200.139713-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-20 08:24:38 -06:00
Alistair Popple
00fa15e0d5 filemap: Fix serialization adding transparent huge pages to page cache
Commit 793917d997 ("mm/readahead: Add large folio readahead")
introduced support for using large folios for filebacked pages if the
filesystem supports it.

page_cache_ra_order() was introduced to allocate and add these large
folios to the page cache. However adding pages to the page cache should
be serialized against truncation and hole punching by taking
invalidate_lock. Not doing so can lead to data races resulting in stale
data getting added to the page cache and marked up-to-date. See commit
730633f0b7 ("mm: Protect operations adding pages to page cache with
invalidate_lock") for more details.

This issue was found by inspection but a testcase revealed it was
possible to observe in practice on XFS. Fix this by taking
invalidate_lock in page_cache_ra_order(), to mirror what is done for the
non-thp case in page_cache_ra_unbounded().

Signed-off-by: Alistair Popple <apopple@nvidia.com>
Fixes: 793917d997 ("mm/readahead: Add large folio readahead")
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-06-23 12:22:00 -04:00
Matthew Wilcox (Oracle)
6bf74cddcf filemap: Don't release a locked folio
We must hold a reference over the call to filemap_release_folio(),
otherwise the page cache will put the last reference to the folio
before we unlock it, leading to splats like this:

 BUG: Bad page state in process u8:5  pfn:1ab1f4
 page:ffffea0006ac7d00 refcount:0 mapcount:0 mapping:0000000000000000 index:0x28b1de pfn:0x1ab1f4
 flags: 0x17ff80000040001(locked|reclaim|node=0|zone=2|lastcpupid=0xfff)
 raw: 017ff80000040001 dead000000000100 dead000000000122 0000000000000000
 raw: 000000000028b1de 0000000000000000 00000000ffffffff 0000000000000000
 page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set

It's an error path, so it doesn't see much testing.

Reported-by: Darrick J. Wong <djwong@kernel.org>
Fixes: a42634a6c0 ("readahead: Use a folio in read_pages()")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-06-09 16:24:25 -04:00
Linus Torvalds
35b51afd23 RISC-V Patches for the 5.19 Merge Window, Part 1
* Support for the Svpbmt extension, which allows memory attributes to be
   encoded in pages.
 * Support for the Allwinner D1's implementation of page-based memory
   attributes.
 * Support for running rv32 binaries on rv64 systems, via the compat
   subsystem.
 * Support for kexec_file().
 * Support for the new generic ticket-based spinlocks, which allows us to
   also move to qrwlock.  These should have already gone in through the
   asm-geneic tree as well.
 * A handful of cleanups and fixes, include some larger ones around
   atomics and XIP.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmKWOx8THHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYieAiEADAUdP7ctoaSQwk5skd/fdA3b4KJuKn
 1Zjl+Br32WP0DlbirYBYWRUQZnCCsvABbTiwSJMcG7NBpU5pyQ5XDtB3OA5kJswO
 Fdp8Nd53//+GK1M5zdEM9OdgvT9fbfTZ3qTu8bKsROOQhGwnYL+Csc9KjFRqEmzN
 oQii0jlb3n5PM4FL3GsbV4uMn9zzkP9mnVAPQktcock2EKFEK/Fy3uNYMQiO2KPi
 n8O6bIDaeRdQ6SurzWOuOkt0cro0tEF85ilzT04mynQsOU0el5oGqCxnOhNH3VWg
 ndqPT6Yafw12hZOtbKJeP+nF8IIR6aJLP3jOtRwEVgcfbXYAw4QwbAV8kQZISefN
 ipn8JGY7GX9Y9TYU692OUGkcmAb3/dxb6c0WihBdvJ0M6YyLD5X+YKHNuG2onLgK
 ss43C5Mxsu629rsjdu/PV91B1+pve3rG9siVmF+g4eo0x9rjMq6/JB0Kal/8SLI1
 Je5T55d5ujV1a2XxhZLQOSD5owrK7J1M9owb0bloTnr9nVwFTWDrfEQEU82o3kP+
 Xm+FfXktnz9ai55NjkMbbEur5D++dKJhBavwCTnBcTrJmMtEH0R45GTK9ZehP+WC
 rNVrRXjIsS18wsTfJxnkZeFQA38as6VBKTzvwHvOgzTrrZU1/xk3lpkouYtAO6BG
 gKacHshVilmUuA==
 =Loi6
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.19-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - Support for the Svpbmt extension, which allows memory attributes to
   be encoded in pages

 - Support for the Allwinner D1's implementation of page-based memory
   attributes

 - Support for running rv32 binaries on rv64 systems, via the compat
   subsystem

 - Support for kexec_file()

 - Support for the new generic ticket-based spinlocks, which allows us
   to also move to qrwlock. These should have already gone in through
   the asm-geneic tree as well

 - A handful of cleanups and fixes, include some larger ones around
   atomics and XIP

* tag 'riscv-for-linus-5.19-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (51 commits)
  RISC-V: Prepare dropping week attribute from arch_kexec_apply_relocations[_add]
  riscv: compat: Using seperated vdso_maps for compat_vdso_info
  RISC-V: Fix the XIP build
  RISC-V: Split out the XIP fixups into their own file
  RISC-V: ignore xipImage
  RISC-V: Avoid empty create_*_mapping definitions
  riscv: Don't output a bogus mmu-type on a no MMU kernel
  riscv: atomic: Add custom conditional atomic operation implementation
  riscv: atomic: Optimize dec_if_positive functions
  riscv: atomic: Cleanup unnecessary definition
  RISC-V: Load purgatory in kexec_file
  RISC-V: Add purgatory
  RISC-V: Support for kexec_file on panic
  RISC-V: Add kexec_file support
  RISC-V: use memcpy for kexec_file mode
  kexec_file: Fix kexec_file.c build error for riscv platform
  riscv: compat: Add COMPAT Kbuild skeletal support
  riscv: compat: ptrace: Add compat_arch_ptrace implement
  riscv: compat: signal: Add rt_frame implementation
  riscv: add memory-type errata for T-Head
  ...
2022-05-31 14:10:54 -07:00
Linus Torvalds
fdaf9a5840 Page cache changes for 5.19
- Appoint myself page cache maintainer
 
  - Fix how scsicam uses the page cache
 
  - Use the memalloc_nofs_save() API to replace AOP_FLAG_NOFS
 
  - Remove the AOP flags entirely
 
  - Remove pagecache_write_begin() and pagecache_write_end()
 
  - Documentation updates
 
  - Convert several address_space operations to use folios:
    - is_dirty_writeback
    - readpage becomes read_folio
    - releasepage becomes release_folio
    - freepage becomes free_folio
 
  - Change filler_t to require a struct file pointer be the first argument
    like ->read_folio
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEejHryeLBw/spnjHrDpNsjXcpgj4FAmKNMDUACgkQDpNsjXcp
 gj4/mwf/bpHhXH4ZoNIvtUpTF6rZbqeffmc0VrbxCZDZ6igRnRPglxZ9H9v6L53O
 7B0FBQIfxgNKHZpdqGdOkv8cjg/GMe/HJUbEy5wOakYPo4L9fZpHbDZ9HM2Eankj
 xBqLIBgBJ7doKr+Y62DAN19TVD8jfRfVtli5mqXJoNKf65J7BkxljoTH1L3EXD9d
 nhLAgyQjR67JQrT/39KMW+17GqLhGefLQ4YnAMONtB6TVwX/lZmigKpzVaCi4r26
 bnk5vaR/3PdjtNxIoYvxdc71y2Eg05n2jEq9Wcy1AaDv/5vbyZUlZ2aBSaIVbtKX
 WfrhN9O3L0bU5qS7p9PoyfLc9wpq8A==
 =djLv
 -----END PGP SIGNATURE-----

Merge tag 'folio-5.19' of git://git.infradead.org/users/willy/pagecache

Pull page cache updates from Matthew Wilcox:

 - Appoint myself page cache maintainer

 - Fix how scsicam uses the page cache

 - Use the memalloc_nofs_save() API to replace AOP_FLAG_NOFS

 - Remove the AOP flags entirely

 - Remove pagecache_write_begin() and pagecache_write_end()

 - Documentation updates

 - Convert several address_space operations to use folios:
     - is_dirty_writeback
     - readpage becomes read_folio
     - releasepage becomes release_folio
     - freepage becomes free_folio

 - Change filler_t to require a struct file pointer be the first
   argument like ->read_folio

* tag 'folio-5.19' of git://git.infradead.org/users/willy/pagecache: (107 commits)
  nilfs2: Fix some kernel-doc comments
  Appoint myself page cache maintainer
  fs: Remove aops->freepage
  secretmem: Convert to free_folio
  nfs: Convert to free_folio
  orangefs: Convert to free_folio
  fs: Add free_folio address space operation
  fs: Convert drop_buffers() to use a folio
  fs: Change try_to_free_buffers() to take a folio
  jbd2: Convert release_buffer_page() to use a folio
  jbd2: Convert jbd2_journal_try_to_free_buffers to take a folio
  reiserfs: Convert release_buffer_page() to use a folio
  fs: Remove last vestiges of releasepage
  ubifs: Convert to release_folio
  reiserfs: Convert to release_folio
  orangefs: Convert to release_folio
  ocfs2: Convert to release_folio
  nilfs2: Remove comment about releasepage
  nfs: Convert to release_folio
  jfs: Convert to release_folio
  ...
2022-05-24 19:55:07 -07:00
Linus Torvalds
115cd47132 for-5.19/block-2022-05-22
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmKKrUsQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpgDjD/44hY9h0JsOLoRH1IvFtuaH6n718JXuqG17
 hHCfmnAUVqj2jT00IUbVlUTd905bCGpfrodBL3PAmPev1zZHOUd/MnJKrSynJ+/s
 NJEMZQaHxLmocNDpJ1sZo7UbAFErsZXB0gVYUO8cH2bFYNu84H1mhRCOReYyqmvQ
 aIAASX5qRB/ciBQCivzAJl2jTdn4WOn5hWi9RLidQB7kSbaXGPmgKAuN88WI4H7A
 zQgAkEl2EEquyMI5tV1uquS7engJaC/4PsenF0S9iTyrhJLjneczJBJZKMLeMR8d
 sOm6sKJdpkrfYDyaA4PIkgmLoEGTtwGpqGHl4iXTyinUAxJoca5tmPvBb3wp66GE
 2Mr7pumxc1yJID2VHbsERXlOAX3aZNCowx2gum2MTRIO8g11Eu3aaVn2kv37MBJ2
 4R2a/cJFl5zj9M8536cG+Yqpy0DDVCCQKUIqEupgEu1dyfpznyWH5BTAHXi1E8td
 nxUin7uXdD0AJkaR0m04McjS/Bcmc1dc6I8xvkdUFYBqYCZWpKOTiEpIBlHg0XJA
 sxdngyz5lSYTGVA4o4QCrdR0Tx1n36A1IYFuQj0wzxBJYZ02jEZuII/A3dd+8hiv
 EY+VeUQeVIXFFuOcY+e0ScPpn7Nr17hAd1en/j2Hcoe4ZE8plqG2QTcnwgflcbis
 iomvJ4yk0Q==
 =0Rw1
 -----END PGP SIGNATURE-----

Merge tag 'for-5.19/block-2022-05-22' of git://git.kernel.dk/linux-block

Pull block updates from Jens Axboe:
 "Here are the core block changes for 5.19. This contains:

   - blk-throttle accounting fix (Laibin)

   - Series removing redundant assignments (Michal)

   - Expose bio cache via the bio_set, so that DM can use it (Mike)

   - Finish off the bio allocation interface cleanups by dealing with
     the weirdest member of the family. bio_kmalloc combines a kmalloc
     for the bio and bio_vecs with a hidden bio_init call and magic
     cleanup semantics (Christoph)

   - Clean up the block layer API so that APIs consumed by file systems
     are (almost) only struct block_device based, so that file systems
     don't have to poke into block layer internals like the
     request_queue (Christoph)

   - Clean up the blk_execute_rq* API (Christoph)

   - Clean up various lose end in the blk-cgroup code to make it easier
     to follow in preparation of reworking the blkcg assignment for bios
     (Christoph)

   - Fix use-after-free issues in BFQ when processes with merged queues
     get moved to different cgroups (Jan)

   - BFQ fixes (Jan)

   - Various fixes and cleanups (Bart, Chengming, Fanjun, Julia, Ming,
     Wolfgang, me)"

* tag 'for-5.19/block-2022-05-22' of git://git.kernel.dk/linux-block: (83 commits)
  blk-mq: fix typo in comment
  bfq: Remove bfq_requeue_request_body()
  bfq: Remove superfluous conversion from RQ_BIC()
  bfq: Allow current waker to defend against a tentative one
  bfq: Relax waker detection for shared queues
  blk-cgroup: delete rcu_read_lock_held() WARN_ON_ONCE()
  blk-throttle: Set BIO_THROTTLED when bio has been throttled
  blk-cgroup: Remove unnecessary rcu_read_lock/unlock()
  blk-cgroup: always terminate io.stat lines
  block, bfq: make bfq_has_work() more accurate
  block, bfq: protect 'bfqd->queued' by 'bfqd->lock'
  block: cleanup the VM accounting in submit_bio
  block: Fix the bio.bi_opf comment
  block: reorder the REQ_ flags
  blk-iocost: combine local_stat and desc_stat to stat
  block: improve the error message from bio_check_eod
  block: allow passing a NULL bdev to bio_alloc_clone/bio_init_clone
  block: remove superfluous calls to blkcg_bio_issue_init
  kthread: unexport kthread_blkcg
  blk-cgroup: cleanup blkcg_maybe_throttle_current
  ...
2022-05-23 13:56:39 -07:00
Matthew Wilcox (Oracle)
7e0a126519 mm,fs: Remove aops->readpage
With all implementations of aops->readpage converted to aops->read_folio,
we can stop checking whether it's set and remove the member from aops.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-05-09 16:28:36 -04:00
Matthew Wilcox (Oracle)
5efe7448a1 fs: Introduce aops->read_folio
Change all the callers of ->readpage to call ->read_folio in preference,
if it exists.  This is a transitional duplication, and will be removed
by the end of the series.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-05-09 16:21:40 -04:00
Matthew Wilcox (Oracle)
a42634a6c0 readahead: Use a folio in read_pages()
Handle multi-page folios correctly and removes a few calls to
compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2022-05-08 14:45:56 -04:00
Matthew Wilcox (Oracle)
b9ff43dd27 mm/readahead: Fix readahead with large folios
Reading 100KB chunks from a big file (eg dd bs=100K) leads to poor
readahead behaviour.  Studying the traces in detail, I noticed two
problems.

The first is that we were setting the readahead flag on the folio which
contains the last byte read from the block.  This is wrong because we
will trigger readahead at the end of the read without waiting to see
if a subsequent read is going to use the pages we just read.  Instead,
we need to set the readahead flag on the first folio _after_ the one
which contains the last byte that we're reading.

The second is that we were looking for the index of the folio with the
readahead flag set to exactly match the start + size - async_size.
If we've rounded this, either down (as previously) or up (as now),
we'll think we hit a folio marked as readahead by a different read,
and try to read the wrong pages.  So round the expected index to the
order of the folio we hit.

Reported-by: Guo Xuenan <guoxuenan@huawei.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-05-05 00:47:29 -04:00
Christoph Hellwig
c97ab27157 blk-cgroup: remove unneeded includes from <linux/blk-cgroup.h>
Remove all the includes that aren't actually needed from
<linux/blk-cgroup.h> and push them to the actual source files where
needed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20220420042723.1010598-12-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-02 14:06:20 -06:00
Guo Ren
59c10c52f5
riscv: compat: syscall: Add compat_sys_call_table implementation
Implement compat sys_call_table and some system call functions:
truncate64, ftruncate64, fallocate, pread64, pwrite64,
sync_file_range, readahead, fadvise64_64 which need argument
translation.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/20220405071314.3225832-12-guoren@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-04-26 13:36:25 -07:00
Matthew Wilcox (Oracle)
1e4702806f readahead: Update comments
- Refer to folios where appropriate, not pages (Matthew Wilcox)
 - Eliminate references to the internal PG_readhead
 - Use "readahead" consistently - not "read-ahead" or "read ahead"
   (mostly Neil Brown)
 - Clarify some sections that, on reflection, weren't very clear (Neil
   Brown)
 - Minor punctuation/spelling fixes (Neil Brown)

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-04-01 14:40:42 -04:00
Christoph Hellwig
b4e089d705 mm: remove the skip_page argument to read_pages
The skip_page argument to read_pages controls if rac->_index is
incremented before returning from the function.  Just open code that in
the callers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-04-01 13:45:52 -04:00
Christoph Hellwig
dfd8b4fc76 mm: remove the pages argument to read_pages
This is always an empty list or NULL with the removal of the ->readahead
support, so remove it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-04-01 13:45:43 -04:00
Matthew Wilcox (Oracle)
704528d895 fs: Remove ->readpages address space operation
All filesystems have now been converted to use ->readahead, so
remove the ->readpages operation and fix all the comments that
used to refer to it.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
2022-04-01 13:45:33 -04:00
Matthew Wilcox (Oracle)
ebf921a9fa readahead: Remove read_cache_pages()
With no remaining users, remove this function and the related
infrastructure.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
2022-04-01 13:45:08 -04:00
Linus Torvalds
6b1f86f8e9 Filesystem folio changes for 5.18
Primarily this series converts some of the address_space operations
 to take a folio instead of a page.
 
 ->is_partially_uptodate() takes a folio instead of a page and changes the
 type of the 'from' and 'count' arguments to make it obvious they're bytes.
 ->invalidatepage() becomes ->invalidate_folio() and has a similar type change.
 ->launder_page() becomes ->launder_folio()
 ->set_page_dirty() becomes ->dirty_folio() and adds the address_space as
 an argument.
 
 There are a couple of other misc changes up front that weren't worth
 separating into their own pull request.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEejHryeLBw/spnjHrDpNsjXcpgj4FAmI4hqMACgkQDpNsjXcp
 gj7r7Af/fVJ7m8kKqjP/IayX3HiJRuIDQw+vM++BlRNXdjz+IyED6whdmFGxJeOY
 BMyT+8ApOAz7ErS4G+7fAv4ScJK/aEgFUsnSeAiCp0PliiEJ5NNJzElp6sVmQ7H5
 SX7+Ek444FZUGsQuy0qL7/ELpR3ditnD7x+5U2g0p5TeaHGUQn84crRyfR4xuhNG
 EBD9D71BOb7OxUcOHe93pTkK51QsQ0aCrcIsB1tkK5KR0BAthn1HqF7ehL90Rvrr
 omx5M7aDWGY4oj7IKrhlAs+55Ah2WaOzrZBp0FXNbr4UENDBKWKyUxErwa4xPkf6
 Gm1iQG/CspOHnxN3YWsd5WjtlL3A+A==
 =cOiq
 -----END PGP SIGNATURE-----

Merge tag 'folio-5.18b' of git://git.infradead.org/users/willy/pagecache

Pull filesystem folio updates from Matthew Wilcox:
 "Primarily this series converts some of the address_space operations to
  take a folio instead of a page.

  Notably:

   - a_ops->is_partially_uptodate() takes a folio instead of a page and
     changes the type of the 'from' and 'count' arguments to make it
     obvious they're bytes.

   - a_ops->invalidatepage() becomes ->invalidate_folio() and has a
     similar type change.

   - a_ops->launder_page() becomes ->launder_folio()

   - a_ops->set_page_dirty() becomes ->dirty_folio() and adds the
     address_space as an argument.

  There are a couple of other misc changes up front that weren't worth
  separating into their own pull request"

* tag 'folio-5.18b' of git://git.infradead.org/users/willy/pagecache: (53 commits)
  fs: Remove aops ->set_page_dirty
  fb_defio: Use noop_dirty_folio()
  fs: Convert __set_page_dirty_no_writeback to noop_dirty_folio
  fs: Convert __set_page_dirty_buffers to block_dirty_folio
  nilfs: Convert nilfs_set_page_dirty() to nilfs_dirty_folio()
  mm: Convert swap_set_page_dirty() to swap_dirty_folio()
  ubifs: Convert ubifs_set_page_dirty to ubifs_dirty_folio
  f2fs: Convert f2fs_set_node_page_dirty to f2fs_dirty_node_folio
  f2fs: Convert f2fs_set_data_page_dirty to f2fs_dirty_data_folio
  f2fs: Convert f2fs_set_meta_page_dirty to f2fs_dirty_meta_folio
  afs: Convert afs_dir_set_page_dirty() to afs_dir_dirty_folio()
  btrfs: Convert extent_range_redirty_for_io() to use folios
  fs: Convert trivial uses of __set_page_dirty_nobuffers to filemap_dirty_folio
  btrfs: Convert from set_page_dirty to dirty_folio
  fscache: Convert fscache_set_page_dirty() to fscache_dirty_folio()
  fs: Add aops->dirty_folio
  fs: Remove aops->launder_page
  orangefs: Convert launder_page to launder_folio
  nfs: Convert from launder_page to launder_folio
  fuse: Convert from launder_page to launder_folio
  ...
2022-03-22 18:26:56 -07:00
Linus Torvalds
9030fb0bb9 Folio changes for 5.18
- Rewrite how munlock works to massively reduce the contention
    on i_mmap_rwsem (Hugh Dickins):
    https://lore.kernel.org/linux-mm/8e4356d-9622-a7f0-b2c-f116b5f2efea@google.com/
  - Sort out the page refcount mess for ZONE_DEVICE pages (Christoph Hellwig):
    https://lore.kernel.org/linux-mm/20220210072828.2930359-1-hch@lst.de/
  - Convert GUP to use folios and make pincount available for order-1
    pages. (Matthew Wilcox)
  - Convert a few more truncation functions to use folios (Matthew Wilcox)
  - Convert page_vma_mapped_walk to use PFNs instead of pages (Matthew Wilcox)
  - Convert rmap_walk to use folios (Matthew Wilcox)
  - Convert most of shrink_page_list() to use a folio (Matthew Wilcox)
  - Add support for creating large folios in readahead (Matthew Wilcox)
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEejHryeLBw/spnjHrDpNsjXcpgj4FAmI4ucgACgkQDpNsjXcp
 gj69Wgf6AwqwmO5Tmy+fLScDPqWxmXJofbocae1kyoGHf7Ui91OK4U2j6IpvAr+g
 P/vLIK+JAAcTQcrSCjymuEkf4HkGZOR03QQn7maPIEe4eLrZRQDEsmHC1L9gpeJp
 s/GMvDWiGE0Tnxu0EOzfVi/yT+qjIl/S8VvqtCoJv1HdzxitZ7+1RDuqImaMC5MM
 Qi3uHag78vLmCltLXpIOdpgZhdZexCdL2Y/1npf+b6FVkAJRRNUnA0gRbS7YpoVp
 CbxEJcmAl9cpJLuj5i5kIfS9trr+/QcvbUlzRxh4ggC58iqnmF2V09l2MJ7YU3XL
 v1O/Elq4lRhXninZFQEm9zjrri7LDQ==
 =n9Ad
 -----END PGP SIGNATURE-----

Merge tag 'folio-5.18c' of git://git.infradead.org/users/willy/pagecache

Pull folio updates from Matthew Wilcox:

 - Rewrite how munlock works to massively reduce the contention on
   i_mmap_rwsem (Hugh Dickins):

     https://lore.kernel.org/linux-mm/8e4356d-9622-a7f0-b2c-f116b5f2efea@google.com/

 - Sort out the page refcount mess for ZONE_DEVICE pages (Christoph
   Hellwig):

     https://lore.kernel.org/linux-mm/20220210072828.2930359-1-hch@lst.de/

 - Convert GUP to use folios and make pincount available for order-1
   pages. (Matthew Wilcox)

 - Convert a few more truncation functions to use folios (Matthew
   Wilcox)

 - Convert page_vma_mapped_walk to use PFNs instead of pages (Matthew
   Wilcox)

 - Convert rmap_walk to use folios (Matthew Wilcox)

 - Convert most of shrink_page_list() to use a folio (Matthew Wilcox)

 - Add support for creating large folios in readahead (Matthew Wilcox)

* tag 'folio-5.18c' of git://git.infradead.org/users/willy/pagecache: (114 commits)
  mm/damon: minor cleanup for damon_pa_young
  selftests/vm/transhuge-stress: Support file-backed PMD folios
  mm/filemap: Support VM_HUGEPAGE for file mappings
  mm/readahead: Switch to page_cache_ra_order
  mm/readahead: Align file mappings for non-DAX
  mm/readahead: Add large folio readahead
  mm: Support arbitrary THP sizes
  mm: Make large folios depend on THP
  mm: Fix READ_ONLY_THP warning
  mm/filemap: Allow large folios to be added to the page cache
  mm: Turn can_split_huge_page() into can_split_folio()
  mm/vmscan: Convert pageout() to take a folio
  mm/vmscan: Turn page_check_references() into folio_check_references()
  mm/vmscan: Account large folios correctly
  mm/vmscan: Optimise shrink_page_list for non-PMD-sized folios
  mm/vmscan: Free non-shmem folios without splitting them
  mm/rmap: Constify the rmap_walk_control argument
  mm/rmap: Convert rmap_walk() to take a folio
  mm: Turn page_anon_vma() into folio_anon_vma()
  mm/rmap: Turn page_lock_anon_vma_read() into folio_lock_anon_vma_read()
  ...
2022-03-22 17:03:12 -07:00
NeilBrown
fe55d563d4 remove inode_congested()
inode_congested() reports if the backing-device for the inode is
congested.  No bdi reports congestion any more, so this always returns
'false'.

So remove inode_congested() and related functions, and remove the call
sites, assuming that inode_congested() always returns 'false'.

Link: https://lkml.kernel.org/r/164549983741.9187.2174285592262191311.stgit@noble.brown
Signed-off-by: NeilBrown <neilb@suse.de>
Cc: Anna Schumaker <Anna.Schumaker@Netapp.com>
Cc: Chao Yu <chao@kernel.org>
Cc: Darrick J. Wong <djwong@kernel.org>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Paolo Valente <paolo.valente@linaro.org>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-22 15:57:01 -07:00
NeilBrown
9fd472af84 mm: improve cleanup when ->readpages doesn't process all pages
If ->readpages doesn't process all the pages, then it is best to act as
though they weren't requested so that a subsequent readahead can try
again.

So:

  - remove any 'ahead' pages from the page cache so they can be loaded
    with ->readahead() rather then multiple ->read()s

  - update the file_ra_state to reflect the reads that were actually
    submitted.

This allows ->readpages() to abort early due e.g.  to congestion, which
will then allow us to remove the inode_read_congested() test from
page_Cache_async_ra().

Link: https://lkml.kernel.org/r/164549983736.9187.16755913785880819183.stgit@noble.brown
Signed-off-by: NeilBrown <neilb@suse.de>
Cc: Anna Schumaker <Anna.Schumaker@Netapp.com>
Cc: Chao Yu <chao@kernel.org>
Cc: Darrick J. Wong <djwong@kernel.org>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Paolo Valente <paolo.valente@linaro.org>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-22 15:57:00 -07:00
NeilBrown
84dacdbd53 mm: document and polish read-ahead code
Add some "big-picture" documentation for read-ahead and polish the code
to make it fit this documentation.

The meaning of ->async_size is clarified to match its name.  i.e.  Any
request to ->readahead() has a sync part and an async part.  The caller
will wait for the sync pages to complete, but will not wait for the
async pages.  The first async page is still marked PG_readahead

Note that the current function names page_cache_sync_ra() and
page_cache_async_ra() are misleading.  All ra request are partly sync
and partly async, so either part can be empty.  A page_cache_sync_ra()
request will usually set ->async_size non-zero, implying it is not all
synchronous.

When a non-zero req_count is passed to page_cache_async_ra(), the
implication is that some prefix of the request is synchronous, though
the calculation made there is incorrect - I haven't tried to fix it.

Link: https://lkml.kernel.org/r/164549983734.9187.11586890887006601405.stgit@noble.brown
Signed-off-by: NeilBrown <neilb@suse.de>
Cc: Anna Schumaker <Anna.Schumaker@Netapp.com>
Cc: Chao Yu <chao@kernel.org>
Cc: Darrick J. Wong <djwong@kernel.org>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Paolo Valente <paolo.valente@linaro.org>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-22 15:57:00 -07:00
Matthew Wilcox (Oracle)
56a4d67c26 mm/readahead: Switch to page_cache_ra_order
do_page_cache_ra() was being exposed for the benefit of
do_sync_mmap_readahead().  Switch it over to page_cache_ra_order()
partly because it's a better interface but mostly for the benefit of
the next patch.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-03-21 13:01:36 -04:00
Matthew Wilcox (Oracle)
793917d997 mm/readahead: Add large folio readahead
Allocate large folios in the readahead code when the filesystem supports
them and it seems worth doing.  The heuristic for choosing which folio
sizes will surely need some tuning, but this aggressive ramp-up has been
good for testing.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-03-21 13:01:36 -04:00
Matthew Wilcox (Oracle)
5ad6b2bdaa fs: Turn do_invalidatepage() into folio_invalidate()
Take a folio instead of a page, fix the types of the offset & length,
and export it to filesystems.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Acked-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Tested-by: Mike Marshall <hubcap@omnibond.com> # orangefs
Tested-by: David Howells <dhowells@redhat.com> # afs
2022-03-15 08:23:25 -04:00
Matthew Wilcox (Oracle)
0387df1d1f readahead: Convert page_cache_ra_unbounded to folios
This saves 99 bytes of kernel text.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
2022-01-04 13:15:34 -05:00
Matthew Wilcox (Oracle)
7836d99900 readahead: Convert page_cache_async_ra() to take a folio
Using the folio here avoids checking whether it's a tail page.
This patch mostly just enables some of the following patches.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
2022-01-04 13:15:34 -05:00
Linus Torvalds
512b7931ad Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:
 "257 patches.

  Subsystems affected by this patch series: scripts, ocfs2, vfs, and
  mm (slab-generic, slab, slub, kconfig, dax, kasan, debug, pagecache,
  gup, swap, memcg, pagemap, mprotect, mremap, iomap, tracing, vmalloc,
  pagealloc, memory-failure, hugetlb, userfaultfd, vmscan, tools,
  memblock, oom-kill, hugetlbfs, migration, thp, readahead, nommu, ksm,
  vmstat, madvise, memory-hotplug, rmap, zsmalloc, highmem, zram,
  cleanups, kfence, and damon)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (257 commits)
  mm/damon: remove return value from before_terminate callback
  mm/damon: fix a few spelling mistakes in comments and a pr_debug message
  mm/damon: simplify stop mechanism
  Docs/admin-guide/mm/pagemap: wordsmith page flags descriptions
  Docs/admin-guide/mm/damon/start: simplify the content
  Docs/admin-guide/mm/damon/start: fix a wrong link
  Docs/admin-guide/mm/damon/start: fix wrong example commands
  mm/damon/dbgfs: add adaptive_targets list check before enable monitor_on
  mm/damon: remove unnecessary variable initialization
  Documentation/admin-guide/mm/damon: add a document for DAMON_RECLAIM
  mm/damon: introduce DAMON-based Reclamation (DAMON_RECLAIM)
  selftests/damon: support watermarks
  mm/damon/dbgfs: support watermarks
  mm/damon/schemes: activate schemes based on a watermarks mechanism
  tools/selftests/damon: update for regions prioritization of schemes
  mm/damon/dbgfs: support prioritization weights
  mm/damon/vaddr,paddr: support pageout prioritization
  mm/damon/schemes: prioritize regions within the quotas
  mm/damon/selftests: support schemes quotas
  mm/damon/dbgfs: support quotas of schemes
  ...
2021-11-06 14:08:17 -07:00
Lin Feng
fb25a77dde mm/readahead.c: fix incorrect comments for get_init_ra_size
In fact, formated values returned by get_init_ra_size are not that
intuitive.  This patch make the comments reflect its truth.

Link: https://lkml.kernel.org/r/20211019104812.135602-1-linf@wangsu.com
Signed-off-by: Lin Feng <linf@wangsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:41 -07:00
Christoph Hellwig
518d55051a mm: remove spurious blkdev.h includes
Various files have acquired spurious includes of <linux/blkdev.h> over
time.  Remove them.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20210920123328.1399408-5-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18 06:17:01 -06:00
Jan Kara
730633f0b7 mm: Protect operations adding pages to page cache with invalidate_lock
Currently, serializing operations such as page fault, read, or readahead
against hole punching is rather difficult. The basic race scheme is
like:

fallocate(FALLOC_FL_PUNCH_HOLE)			read / fault / ..
  truncate_inode_pages_range()
						  <create pages in page
						   cache here>
  <update fs block mapping and free blocks>

Now the problem is in this way read / page fault / readahead can
instantiate pages in page cache with potentially stale data (if blocks
get quickly reused). Avoiding this race is not simple - page locks do
not work because we want to make sure there are *no* pages in given
range. inode->i_rwsem does not work because page fault happens under
mmap_sem which ranks below inode->i_rwsem. Also using it for reads makes
the performance for mixed read-write workloads suffer.

So create a new rw_semaphore in the address_space - invalidate_lock -
that protects adding of pages to page cache for page faults / reads /
readahead.

Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
2021-07-13 13:14:27 +02:00
David Howells
3ca2364401 mm: Implement readahead_control pageset expansion
Provide a function, readahead_expand(), that expands the set of pages
specified by a readahead_control object to encompass a revised area with a
proposed size and length.

The proposed area must include all of the old area and may be expanded yet
more by this function so that the edges align on (transparent huge) page
boundaries as allocated.

The expansion will be cut short if a page already exists in either of the
areas being expanded into.  Note that any expansion made in such a case is
not rolled back.

This will be used by fscache so that reads can be expanded to cache granule
boundaries, thereby allowing whole granules to be stored in the cache, but
there are other potential users also.

Changes:
v6:
- Fold in a patch from Matthew Wilcox to tell the ondemand readahead
  algorithm about the expansion so that the next readahead starts at the
  right place[2].

v4:
- Moved the declaration of readahead_expand() to a better place[1].

Suggested-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Tested-by: Jeff Layton <jlayton@kernel.org>
Tested-by: Dave Wysochanski <dwysocha@redhat.com>
Tested-By: Marc Dionne <marc.dionne@auristor.com>
cc: Alexander Viro <viro@zeniv.linux.org.uk>
cc: Christoph Hellwig <hch@lst.de>
cc: Mike Marshall <hubcap@omnibond.com>
cc: linux-mm@kvack.org
cc: linux-cachefs@redhat.com
cc: linux-afs@lists.infradead.org
cc: linux-nfs@vger.kernel.org
cc: linux-cifs@vger.kernel.org
cc: ceph-devel@vger.kernel.org
cc: v9fs-developer@lists.sourceforge.net
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20210217161358.GM2858050@casper.infradead.org/ [1]
Link: https://lore.kernel.org/r/20210407201857.3582797-4-willy@infradead.org/ [2]
Link: https://lore.kernel.org/r/159974633888.2094769.8326206446358128373.stgit@warthog.procyon.org.uk/
Link: https://lore.kernel.org/r/160588479816.3465195.553952688795241765.stgit@warthog.procyon.org.uk/ # rfc
Link: https://lore.kernel.org/r/161118131787.1232039.4863969952441067985.stgit@warthog.procyon.org.uk/ # rfc
Link: https://lore.kernel.org/r/161161028670.2537118.13831420617039766044.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/161340389201.1303470.14353807284546854878.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/161539530488.286939.18085961677838089157.stgit@warthog.procyon.org.uk/ # v4
Link: https://lore.kernel.org/r/161653789422.2770958.2108046612147345000.stgit@warthog.procyon.org.uk/ # v5
Link: https://lore.kernel.org/r/161789069829.6155.4295672417565512161.stgit@warthog.procyon.org.uk/ # v6
2021-04-23 10:14:29 +01:00
Matthew Wilcox (Oracle)
f615bd5c47 mm/readahead: Handle ractl nr_pages being modified
Filesystems are not currently permitted to modify the number of pages
in the ractl.  An upcoming patch to add readahead_expand() changes that
rule, so remove the check and resync the loop counter after every call
to the filesystem.

Tested-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20210420200116.3715790-1-willy@infradead.org/
Link: https://lore.kernel.org/r/20210421170923.4005574-1-willy@infradead.org/ # v2
2021-04-23 10:14:28 +01:00
Matthew Wilcox (Oracle)
fcd9ae4f7f mm/filemap: Pass the file_ra_state in the ractl
For readahead_expand(), we need to modify the file ra_state, so pass it
down by adding it to the ractl.  We have to do this because it's not always
the same as f_ra in the struct file that is already being passed.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Jeff Layton <jlayton@kernel.org>
Tested-by: Dave Wysochanski <dwysocha@redhat.com>
Tested-By: Marc Dionne <marc.dionne@auristor.com>
Link: https://lore.kernel.org/r/20210407201857.3582797-2-willy@infradead.org/
Link: https://lore.kernel.org/r/161789067431.6155.8063840447229665720.stgit@warthog.procyon.org.uk/ # v6
2021-04-23 09:25:00 +01:00
Jens Axboe
324bcf54c4 mm: use limited read-ahead to satisfy read
For the case where read-ahead is disabled on the file, or if the cgroup
is congested, ensure that we can at least do 1 page of read-ahead to
make progress on the read in an async fashion. This could potentially be
larger, but it's not needed in terms of functionality, so let's error on
the side of caution as larger counts of pages may run into reclaim
issues (particularly if we're congested).

This makes sure we're not hitting the potentially sync ->readpage() path
for IO that is marked IOCB_WAITQ, which could cause us to block. It also
means we'll use the same path for IO, regardless of whether or not
read-ahead happens to be disabled on the lower level device.

Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reported-by: Hao_Xu <haoxu@linux.alibaba.com>
[axboe: updated for new ractl API]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-10-17 13:49:08 -06:00
David Howells
b1647dc0de mm/readahead: pass a file_ra_state into force_page_cache_ra
The file_ra_state being passed into page_cache_sync_readahead() was being
ignored in favour of using the one embedded in the struct file.  The only
caller for which this makes a difference is the fsverity code if the file
has been marked as POSIX_FADV_RANDOM, but it's confusing and worth fixing.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Eric Biggers <ebiggers@google.com>
Link: https://lkml.kernel.org/r/20200903140844.14194-10-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:16 -07:00
Matthew Wilcox (Oracle)
fefa7c478f mm/readahead: add page_cache_sync_ra and page_cache_async_ra
Reimplement page_cache_sync_readahead() and page_cache_async_readahead()
as wrappers around versions of the function which take a readahead_control
in preparation for making do_sync_mmap_readahead() pass down an RAC
struct.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Eric Biggers <ebiggers@google.com>
Link: https://lkml.kernel.org/r/20200903140844.14194-8-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:16 -07:00
David Howells
7b3df3b9ac mm/readahead: pass readahead_control to force_page_cache_ra
Reimplement force_page_cache_readahead() as a wrapper around
force_page_cache_ra().  Pass the existing readahead_control from
page_cache_sync_readahead().

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Eric Biggers <ebiggers@google.com>
Link: https://lkml.kernel.org/r/20200903140844.14194-7-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:16 -07:00
David Howells
6e4af69ae9 mm/readahead: make ondemand_readahead take a readahead_control
Make ondemand_readahead() take a readahead_control struct in preparation
for making do_sync_mmap_readahead() pass down an RAC struct.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Eric Biggers <ebiggers@google.com>
Link: https://lkml.kernel.org/r/20200903140844.14194-6-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:16 -07:00
Matthew Wilcox (Oracle)
8238287ead mm/readahead: make do_page_cache_ra take a readahead_control
Rename __do_page_cache_readahead() to do_page_cache_ra() and call it
directly from ondemand_readahead() instead of indirecting via ra_submit().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Eric Biggers <ebiggers@google.com>
Link: https://lkml.kernel.org/r/20200903140844.14194-5-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:16 -07:00
Matthew Wilcox (Oracle)
73bb49da50 mm/readahead: make page_cache_ra_unbounded take a readahead_control
Define it in the callers instead of in page_cache_ra_unbounded().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Eric Biggers <ebiggers@google.com>
Link: https://lkml.kernel.org/r/20200903140844.14194-4-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:16 -07:00
Matthew Wilcox (Oracle)
1aa83cfa5a mm/readahead: add DEFINE_READAHEAD
Patch series "Readahead patches for 5.9/5.10".

These are infrastructure for both the THP patchset and for the fscache
rewrite,

For both pieces of infrastructure being build on top of this patchset, we
want the ractl to be available higher in the call-stack.

For David's work, he wants to add the 'critical page' to the ractl so that
he knows which page NEEDS to be brought in from storage, and which ones
are nice-to-have.  We might want something similar in block storage too.
It used to be simple -- the first page was the critical one, but then mmap
added fault-around and so for that usecase, the middle page is the
critical one.  Anyway, I don't have any code to show that yet, we just
know that the lowest point in the callchain where we have that information
is do_sync_mmap_readahead() and so the ractl needs to start its life
there.

For THP, we havew the code that needs it.  It's actually the apex patch to
the series; the one which finally starts to allocate THPs and present them
to consenting filesystems:
798bcf30ab

This patch (of 8):

Allow for a more concise definition of a struct readahead_control.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Eric Biggers <ebiggers@google.com>
Cc: David Howells <dhowells@redhat.com>
Link: https://lkml.kernel.org/r/20200903140844.14194-1-willy@infradead.org
Link: https://lkml.kernel.org/r/20200903140844.14194-3-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:15 -07:00
Matthew Wilcox (Oracle)
f2c817bed5 mm: use memalloc_nofs_save in readahead path
Ensure that memory allocations in the readahead path do not attempt to
reclaim file-backed pages, which could lead to a deadlock.  It is
possible, though unlikely this is the root cause of a problem observed
by Cong Wang.

Reported-by: Cong Wang <xiyou.wangcong@gmail.com>
Suggested-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
Cc: Chao Yu <yuchao0@huawei.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Gao Xiang <gaoxiang25@huawei.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Link: http://lkml.kernel.org/r/20200414150233.24495-16-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02 10:59:07 -07:00
Matthew Wilcox (Oracle)
2d8163e489 mm: document why we don't set PageReadahead
If the page is already in cache, we don't set PageReadahead on it.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
Cc: Chao Yu <yuchao0@huawei.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Gao Xiang <gaoxiang25@huawei.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Link: http://lkml.kernel.org/r/20200414150233.24495-15-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02 10:59:07 -07:00