Export cgroup_add_legacy_cftypes and a helper function to allow vendor module to expose additional files in the memory cgroup hierarchy.
Bug: 192052083
Signed-off-by: Liujie Xie <xieliujie@oppo.com>
(cherry picked from commit f41a95eadca98506e627b21f5cc73332bba4d95c)
(cherry picked from commit bf24c43b7f90290d2ac6f8163b43ab00f8f820b9)
Change-Id: Ie2b936b3e77c7ab6d740d1bb6d70e03c70a326a7
Signed-off-by: lvwenhuan <lvwenhuan@oppo.com>
Exporting the symbol freezer_cgrp_subsys, in that vendor module can
add can_attach & cancel_attach member function. It is vendor-specific
tuning.
Bug: 182496370
Bug: 281920779
Signed-off-by: Zhuguangqing <zhuguangqing@xiaomi.com>
Change-Id: I153682b9d1015eed3f048b45ea6495ebb8f3c261
(cherry picked from commit ee3f4d2821f5b2a794f0a1f5ed423f561a01adae)
(cherry picked from commit 8a90e4d4e555dd5484213c6fec5061958016a194)
This hook allows us to capture information when a process is forked so
that we can stat and set some key task's CPU affinity in the ko module
later. This patch, along with aosp/2548175, is necessary for our
affinity settings.
Bug: 183674818
Signed-off-by: lijianzhong <lijianzhong@xiaomi.com>
Change-Id: Ib93e05e5f6c338c5f7ada56bfebdd705f87f1f66
(cherry picked from commit a188361628461c58a4dfc72869d9acb1dfa2542f)
Exporting the symbol cpuset_cpus_allowed() so that we can adjust a
certain type of application's CPU affinity in vendor hooks according
to our tuning policy.
Related commit: aosp/2548175
Bug: 189725786
Signed-off-by: lijianzhong <lijianzhong@xiaomi.com>
Change-Id: I7919a893ab64bb441ab43cbb0b16825ed76d802d
(cherry picked from commit 5a7d01ed73e4fc812fda1d7288086dc73a283405)
Current 500ms min window size for psi triggers limits polling interval
to 50ms to prevent polling threads from using too much cpu bandwidth by
polling too frequently. However the number of cgroups with triggers is
unlimited, so this protection can be defeated by creating multiple
cgroups with psi triggers (triggers in each cgroup are served by a single
"psimon" kernel thread).
Instead of limiting min polling period, which also limits the latency of
psi events, it's better to limit psi trigger creation to authorized users
only, like we do for system-wide psi triggers (/proc/pressure/* files can
be written only by processes with CAP_SYS_RESOURCE capability). This also
makes access rules for cgroup psi files consistent with system-wide ones.
Add a CAP_SYS_RESOURCE capability check for cgroup psi file writers and
remove the psi window min size limitation.
Bug: 269247660
Change-Id: I8876aa306cf2ba5acdf4daa5a5eff0665537bfeb
Suggested-by: Sudarshan Rajagopalan <quic_sudaraja@quicinc.com>
Link: https://lore.kernel.org/all/20230303011346.3342233-1-surenb@google.com/
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Sudarshan Rajagopalan <quic_sudaraja@quicinc.com>
Signed-off-by: Georgi Djakov <quic_c_gdjako@quicinc.com>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmRBFW0ACgkQONu9yGCS
aT7Jew//Ytw9+JQ71LT1TuJnQ1GayJOL1BW5hgxoYgnBFasWDwsGA9rzHs6KHqHb
0Vjk7MX7VZB+6zWakOxY5CFVM33J4fS7wY8WE2bj8X3QQhD/J0HQDMdELvSBi3qF
7xI6sghEQEwOuwAj2+CBqm/q7rA5FTnO1QgJuk/AKJ6PHGRiQeZ7q1zGpFvSaj7S
cyKvY99RsjnUN+PYk4LE2+u/6DVCqiWYVDQrdjalb9zsrXg4+nmPH6ZJzZX8+bbM
eM0xAR675V8TXqDi+8bj7tWmiS52XyjYF3Q/bu9BmU67DqslH9FFyVQxhgTHUZpN
qWXkojEU2djIc3qt7T/bpZS/vD8Kg3Px1CgyIRN8Y5SlZfhZyqVdTZ4AQCtJuLQJ
wDIdQCLlGzzDNFvbD+LdfJSjZt7Ig1sM/HwtPZhUA9yF0FN1XV3dcESzCOeI0/S7
ohRh8cs1sidnxrbvVwiVNENSqbJD7G9/9vVjIfyfcnt57q+fs6xCBhpOyNoVOp74
I5i6ALMcVZoAB50vDjnoGZsSRe9W2AmOV6UMIkVCvRCWYFqBpgVftMTAACNyljni
UlXmO7aDQj+nbHD/auclFtU02oHQbk62FSrwoWMFS090zWztQqUhgRY7Qnl13yCM
poEvrKlskXhvunsNtdVmI5O3N2GANWKgGwkyFIiXvgxKkw1qpUo=
=zeN9
-----END PGP SIGNATURE-----
Merge 6.1.25 into android14-6.1
Changes in 6.1.25
Revert "pinctrl: amd: Disable and mask interrupts on resume"
drm/amd/display: Pass the right info to drm_dp_remove_payload
ALSA: emu10k1: fix capture interrupt handler unlinking
ALSA: hda/sigmatel: add pin overrides for Intel DP45SG motherboard
ALSA: i2c/cs8427: fix iec958 mixer control deactivation
ALSA: hda: patch_realtek: add quirk for Asus N7601ZM
ALSA: hda/realtek: Add quirks for Lenovo Z13/Z16 Gen2
ALSA: firewire-tascam: add missing unwind goto in snd_tscm_stream_start_duplex()
ALSA: emu10k1: don't create old pass-through playback device on Audigy
ALSA: hda/sigmatel: fix S/PDIF out on Intel D*45* motherboards
ALSA: hda/hdmi: disable KAE for Intel DG2
Bluetooth: L2CAP: Fix use-after-free in l2cap_disconnect_{req,rsp}
Bluetooth: Fix race condition in hidp_session_thread
bluetooth: btbcm: Fix logic error in forming the board name.
Bluetooth: Free potentially unfreed SCO connection
Bluetooth: hci_conn: Fix possible UAF
btrfs: restore the thread_pool= behavior in remount for the end I/O workqueues
btrfs: fix fast csum implementation detection
fbmem: Reject FB_ACTIVATE_KD_TEXT from userspace
mtdblock: tolerate corrected bit-flips
mtd: rawnand: meson: fix bitmask for length in command word
mtd: rawnand: stm32_fmc2: remove unsupported EDO mode
mtd: rawnand: stm32_fmc2: use timings.mode instead of checking tRC_min
KVM: arm64: PMU: Restore the guest's EL0 event counting after migration
fbcon: Fix error paths in set_con2fb_map
fbcon: set_con2fb_map needs to set con2fb_map!
drm/i915/dsi: fix DSS CTL register offsets for TGL+
clk: sprd: set max_register according to mapping range
RDMA/irdma: Do not generate SW completions for NOPs
RDMA/irdma: Fix memory leak of PBLE objects
RDMA/irdma: Increase iWARP CM default rexmit count
RDMA/irdma: Add ipv4 check to irdma_find_listener()
IB/mlx5: Add support for 400G_8X lane speed
RDMA/erdma: Update default EQ depth to 4096 and max_send_wr to 8192
RDMA/erdma: Inline mtt entries into WQE if supported
RDMA/erdma: Defer probing if netdevice can not be found
clk: rs9: Fix suspend/resume
RDMA/cma: Allow UD qp_type to join multicast only
bpf: tcp: Use sock_gen_put instead of sock_put in bpf_iter_tcp
LoongArch, bpf: Fix jit to skip speculation barrier opcode
dmaengine: apple-admac: Handle 'global' interrupt flags
dmaengine: apple-admac: Set src_addr_widths capability
dmaengine: apple-admac: Fix 'current_tx' not getting freed
9p/xen : Fix use after free bug in xen_9pfs_front_remove due to race condition
bpf, arm64: Fixed a BTI error on returning to patched function
KVM: arm64: Initialise hypervisor copies of host symbols unconditionally
KVM: arm64: Advertise ID_AA64PFR0_EL1.CSV2/3 to protected VMs
niu: Fix missing unwind goto in niu_alloc_channels()
tcp: restrict net.ipv4.tcp_app_win
bonding: fix ns validation on backup slaves
iavf: refactor VLAN filter states
iavf: remove active_cvlans and active_svlans bitmaps
net: openvswitch: fix race on port output
Bluetooth: hci_conn: Fix not cleaning up on LE Connection failure
Bluetooth: Fix printing errors if LE Connection times out
Bluetooth: SCO: Fix possible circular locking dependency sco_sock_getsockopt
Bluetooth: Set ISO Data Path on broadcast sink
drm/armada: Fix a potential double free in an error handling path
qlcnic: check pci_reset_function result
net: wwan: iosm: Fix error handling path in ipc_pcie_probe()
cgroup,freezer: hold cpu_hotplug_lock before freezer_mutex
net: qrtr: Fix an uninit variable access bug in qrtr_tx_resume()
sctp: fix a potential overflow in sctp_ifwdtsn_skip
RDMA/core: Fix GID entry ref leak when create_ah fails
selftests: openvswitch: adjust datapath NL message declaration
udp6: fix potential access to stale information
net: macb: fix a memory corruption in extended buffer descriptor mode
skbuff: Fix a race between coalescing and releasing SKBs
libbpf: Fix single-line struct definition output in btf_dump
ARM: 9290/1: uaccess: Fix KASAN false-positives
ARM: dts: qcom: apq8026-lg-lenok: add missing reserved memory
power: supply: rk817: Fix unsigned comparison with less than zero
power: supply: cros_usbpd: reclassify "default case!" as debug
power: supply: axp288_fuel_gauge: Added check for negative values
selftests/bpf: Fix progs/find_vma_fail1.c build error.
wifi: mwifiex: mark OF related data as maybe unused
i2c: imx-lpi2c: clean rx/tx buffers upon new message
i2c: hisi: Avoid redundant interrupts
efi: sysfb_efi: Add quirk for Lenovo Yoga Book X91F/L
block: ublk_drv: mark device as LIVE before adding disk
ACPI: video: Add backlight=native DMI quirk for Acer Aspire 3830TG
drm: panel-orientation-quirks: Add quirk for Lenovo Yoga Book X90F
hwmon: (peci/cputemp) Fix miscalculated DTS for SKX
hwmon: (xgene) Fix ioremap and memremap leak
verify_pefile: relax wrapper length check
asymmetric_keys: log on fatal failures in PE/pkcs7
nvme: send Identify with CNS 06h only to I/O controllers
wifi: iwlwifi: mvm: fix mvmtxq->stopped handling
wifi: iwlwifi: mvm: protect TXQ list manipulation
drm/amdgpu: add mes resume when do gfx post soft reset
drm/amdgpu: Force signal hw_fences that are embedded in non-sched jobs
drm/amdgpu/gfx: set cg flags to enter/exit safe mode
ACPI: resource: Add Medion S17413 to IRQ override quirk
x86/hyperv: Move VMCB enlightenment definitions to hyperv-tlfs.h
KVM: selftests: Move "struct hv_enlightenments" to x86_64/svm.h
KVM: SVM: Add a proper field for Hyper-V VMCB enlightenments
x86/hyperv: KVM: Rename "hv_enlightenments" to "hv_vmcb_enlightenments"
KVM: SVM: Flush Hyper-V TLB when required
tracing: Add trace_array_puts() to write into instance
tracing: Have tracing_snapshot_instance_cond() write errors to the appropriate instance
maple_tree: fix write memory barrier of nodes once dead for RCU mode
ksmbd: avoid out of bounds access in decode_preauth_ctxt()
riscv: add icache flush for nommu sigreturn trampoline
HID: intel-ish-hid: Fix kernel panic during warm reset
net: sfp: initialize sfp->i2c_block_size at sfp allocation
net: phy: nxp-c45-tja11xx: add remove callback
net: phy: nxp-c45-tja11xx: fix unsigned long multiplication overflow
scsi: ses: Handle enclosure with just a primary component gracefully
x86/PCI: Add quirk for AMD XHCI controller that loses MSI-X state in D3hot
cgroup: fix display of forceidle time at root
cgroup/cpuset: Fix partition root's cpuset.cpus update bug
cgroup/cpuset: Wake up cpuset_attach_wq tasks in cpuset_cancel_attach()
drm/amd/pm: correct SMU13.0.7 pstate profiling clock settings
drm/amd/pm: correct SMU13.0.7 max shader clock reporting
mptcp: use mptcp_schedule_work instead of open-coding it
mptcp: stricter state check in mptcp_worker
ubi: Fix failure attaching when vid_hdr offset equals to (sub)page size
ubi: Fix deadlock caused by recursively holding work_sem
i2c: mchp-pci1xxxx: Update Timing registers
powerpc/papr_scm: Update the NUMA distance table for the target node
sched/fair: Fix imbalance overflow
x86/rtc: Remove __init for runtime functions
i2c: ocores: generate stop condition after timeout in polling mode
cifs: fix negotiate context parsing
nvme-pci: mark Lexar NM760 as IGNORE_DEV_SUBNQN
nvme-pci: add NVME_QUIRK_BOGUS_NID for T-FORCE Z330 SSD
cgroup/cpuset: Skip spread flags update on v2
cgroup/cpuset: Make cpuset_fork() handle CLONE_INTO_CGROUP properly
cgroup/cpuset: Add cpuset_can_fork() and cpuset_cancel_fork() methods
Linux 6.1.25
Change-Id: Ib4d2c49ea9bacb8d8dbdb7b3a4eecce937016427
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
[ Upstream commit eee87853794187f6adbe19533ed79c8b44b36a91 ]
In the case of CLONE_INTO_CGROUP, not all cpusets are ready to accept
new tasks. It is too late to check that in cpuset_fork(). So we need
to add the cpuset_can_fork() and cpuset_cancel_fork() methods to
pre-check it before we can allow attachment to a different cpuset.
We also need to set the attach_in_progress flag to alert other code
that a new task is going to be added to the cpuset.
Fixes: ef2c41cf38 ("clone3: allow spawning processes into cgroups")
Suggested-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Cc: stable@vger.kernel.org # v5.7+
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 42a11bf5c5436e91b040aeb04063be1710bb9f9c ]
By default, the clone(2) syscall spawn a child process into the same
cgroup as its parent. With the use of the CLONE_INTO_CGROUP flag
introduced by commit ef2c41cf38 ("clone3: allow spawning processes
into cgroups"), the child will be spawned into a different cgroup which
is somewhat similar to writing the child's tid into "cgroup.threads".
The current cpuset_fork() method does not properly handle the
CLONE_INTO_CGROUP case where the cpuset of the child may be different
from that of its parent. Update the cpuset_fork() method to treat the
CLONE_INTO_CGROUP case similar to cpuset_attach().
Since the newly cloned task has not been running yet, its actual
memory usage isn't known. So it is not necessary to make change to mm
in cpuset_fork().
Fixes: ef2c41cf38 ("clone3: allow spawning processes into cgroups")
Reported-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Cc: stable@vger.kernel.org # v5.7+
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 18f9a4d47527772515ad6cbdac796422566e6440 ]
Cpuset v2 has no spread flags to set. So we can skip spread
flags update if cpuset v2 is being used. Also change the name to
cpuset_update_task_spread_flags() to indicate that there are multiple
spread flags.
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Stable-dep-of: 42a11bf5c543 ("cgroup/cpuset: Make cpuset_fork() handle CLONE_INTO_CGROUP properly")
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit ba9182a89626d5f83c2ee4594f55cb9c1e60f0e2 upstream.
After a successful cpuset_can_attach() call which increments the
attach_in_progress flag, either cpuset_cancel_attach() or cpuset_attach()
will be called later. In cpuset_attach(), tasks in cpuset_attach_wq,
if present, will be woken up at the end. That is not the case in
cpuset_cancel_attach(). So missed wakeup is possible if the attach
operation is somehow cancelled. Fix that by doing the wakeup in
cpuset_cancel_attach() as well.
Fixes: e44193d39e ("cpuset: let hotplug propagation work wait for task attaching")
Signed-off-by: Waiman Long <longman@redhat.com>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Cc: stable@vger.kernel.org # v3.11+
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 292fd843de26c551856e66faf134512c52dd78b4 upstream.
It was found that commit 7a2127e66a00 ("cpuset: Call
set_cpus_allowed_ptr() with appropriate mask for task") introduced a bug
that corrupted "cpuset.cpus" of a partition root when it was updated.
It is because the tmp->new_cpus field of the passed tmp parameter
of update_parent_subparts_cpumask() should not be used at all as
it contains important cpumask data that should not be overwritten.
Fix it by using tmp->addmask instead.
Also update update_cpumask() to make sure that trialcs->cpu_allowed
will not be corrupted until it is no longer needed.
Fixes: 7a2127e66a00 ("cpuset: Call set_cpus_allowed_ptr() with appropriate mask for task")
Signed-off-by: Waiman Long <longman@redhat.com>
Cc: stable@vger.kernel.org # v6.2+
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fcdb1eda5302599045bb366e679cccb4216f3873 upstream.
We need to reset forceidle_sum to 0 when reading from root, since the
bstat we accumulate into is stack allocated.
To make this more robust, just replace the existing cputime reset with a
memset of the overall bstat.
Signed-off-by: Josh Don <joshdon@google.com>
Fixes: 1fcf54deb7 ("sched/core: add forced idle accounting for cgroups")
Cc: stable@vger.kernel.org # v6.0+
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Changes in 6.1.12
hv_netvsc: Allocate memory in netvsc_dma_map() with GFP_ATOMIC
btrfs: limit device extents to the device size
btrfs: zlib: zero-initialize zlib workspace
ALSA: hda/realtek: Add Positivo N14KP6-TG
ALSA: emux: Avoid potential array out-of-bound in snd_emux_xg_control()
ALSA: hda/realtek: Fix the speaker output on Samsung Galaxy Book2 Pro 360
ALSA: hda/realtek: Enable mute/micmute LEDs on HP Elitebook, 645 G9
ALSA: hda/realtek: Add quirk for ASUS UM3402 using CS35L41
ALSA: hda/realtek: fix mute/micmute LEDs don't work for a HP platform.
Revert "PCI/ASPM: Save L1 PM Substates Capability for suspend/resume"
Revert "PCI/ASPM: Refactor L1 PM Substates Control Register programming"
tracing: Fix poll() and select() do not work on per_cpu trace_pipe and trace_pipe_raw
of/address: Return an error when no valid dma-ranges are found
can: j1939: do not wait 250 ms if the same addr was already claimed
HID: logitech: Disable hi-res scrolling on USB
xfrm: compat: change expression for switch in xfrm_xlate64
IB/hfi1: Restore allocated resources on failed copyout
xfrm/compat: prevent potential spectre v1 gadget in xfrm_xlate32_attr()
IB/IPoIB: Fix legacy IPoIB due to wrong number of queues
xfrm: annotate data-race around use_time
RDMA/irdma: Fix potential NULL-ptr-dereference
RDMA/usnic: use iommu_map_atomic() under spin_lock()
xfrm: fix bug with DSCP copy to v6 from v4 tunnel
of: Make OF framebuffer device names unique
net: phylink: move phy_device_free() to correctly release phy device
bonding: fix error checking in bond_debug_reregister()
net: macb: Perform zynqmp dynamic configuration only for SGMII interface
net: phy: meson-gxl: use MMD access dummy stubs for GXL, internal PHY
ionic: clean interrupt before enabling queue to avoid credit race
ionic: refactor use of ionic_rx_fill()
ionic: missed doorbell workaround
cpufreq: qcom-hw: Fix cpufreq_driver->get() for non-LMH systems
uapi: add missing ip/ipv6 header dependencies for linux/stddef.h
net: microchip: sparx5: fix PTP init/deinit not checking all ports
HID: amd_sfh: if no sensors are enabled, clean up
drm/i915: Don't do the WM0->WM1 copy w/a if WM1 is already enabled
drm/virtio: exbuf->fence_fd unmodified on interrupted wait
cpuset: Call set_cpus_allowed_ptr() with appropriate mask for task
nvidiafb: detect the hardware support before removing console.
ice: Do not use WQ_MEM_RECLAIM flag for workqueue
ice: Fix disabling Rx VLAN filtering with port VLAN enabled
ice: switch: fix potential memleak in ice_add_adv_recipe()
net: dsa: mt7530: don't change PVC_EG_TAG when CPU port becomes VLAN-aware
net: mscc: ocelot: fix VCAP filters not matching on MAC with "protocol 802.1Q"
net/mlx5e: Update rx ring hw mtu upon each rx-fcs flag change
net/mlx5: Bridge, fix ageing of peer FDB entries
net/mlx5e: Fix crash unsetting rx-vlan-filter in switchdev mode
net/mlx5e: IPoIB, Show unknown speed instead of error
net/mlx5: Store page counters in a single array
net/mlx5: Expose SF firmware pages counter
net/mlx5: fw_tracer, Clear load bit when freeing string DBs buffers
net/mlx5: fw_tracer, Zero consumer index when reloading the tracer
net/mlx5: Serialize module cleanup with reload and remove
igc: Add ndo_tx_timeout support
net: ethernet: mtk_eth_soc: fix wrong parameters order in __xdp_rxq_info_reg()
txhash: fix sk->sk_txrehash default
selftests: Fix failing VXLAN VNI filtering test
rds: rds_rm_zerocopy_callback() use list_first_entry()
net: mscc: ocelot: fix all IPv6 getting trapped to CPU when PTP timestamping is used
selftests: forwarding: lib: quote the sysctl values
arm64: dts: rockchip: fix input enable pinconf on rk3399
arm64: dts: rockchip: set sdmmc0 speed to sd-uhs-sdr50 on rock-3a
ALSA: pci: lx6464es: fix a debug loop
riscv: stacktrace: Fix missing the first frame
arm64: dts: mediatek: mt8195: Fix vdosys* compatible strings
ASoC: tas5805m: rework to avoid scheduling while atomic.
ASoC: tas5805m: add missing page switch.
ASoC: fsl_sai: fix getting version from VERID
ASoC: topology: Return -ENOMEM on memory allocation failure
clk: microchip: mpfs-ccc: Use devm_kasprintf() for allocating formatted strings
pinctrl: mediatek: Fix the drive register definition of some Pins
pinctrl: aspeed: Fix confusing types in return value
pinctrl: single: fix potential NULL dereference
spi: dw: Fix wrong FIFO level setting for long xfers
pinctrl: aspeed: Revert "Force to disable the function's signal"
pinctrl: intel: Restore the pins that used to be in Direct IRQ mode
cifs: Fix use-after-free in rdata->read_into_pages()
net: USB: Fix wrong-direction WARNING in plusb.c
mptcp: do not wait for bare sockets' timeout
mptcp: be careful on subflow status propagation on errors
selftests: mptcp: allow more slack for slow test-case
selftests: mptcp: stop tests earlier
btrfs: simplify update of last_dir_index_offset when logging a directory
btrfs: free device in btrfs_close_devices for a single device filesystem
usb: core: add quirk for Alcor Link AK9563 smartcard reader
usb: typec: altmodes/displayport: Fix probe pin assign check
cxl/region: Fix null pointer dereference for resetting decoder
cxl/region: Fix passthrough-decoder detection
clk: ingenic: jz4760: Update M/N/OD calculation algorithm
pinctrl: qcom: sm8450-lpass-lpi: correct swr_rx_data group
drm/amd/pm: add SMU 13.0.7 missing GetPptLimit message mapping
ceph: flush cap releases when the session is flushed
nvdimm: Support sizeof(struct page) > MAX_STRUCT_PAGE_SIZE
riscv: Fixup race condition on PG_dcache_clean in flush_icache_pte
riscv: kprobe: Fixup misaligned load text
powerpc/64s/interrupt: Fix interrupt exit race with security mitigation switch
drm/amdgpu: Use the TGID for trace_amdgpu_vm_update_ptes
tracing: Fix TASK_COMM_LEN in trace event format file
rtmutex: Ensure that the top waiter is always woken up
arm64: dts: meson-gx: Make mmc host controller interrupts level-sensitive
arm64: dts: meson-g12-common: Make mmc host controller interrupts level-sensitive
arm64: dts: meson-axg: Make mmc host controller interrupts level-sensitive
Fix page corruption caused by racy check in __free_pages
arm64: efi: Force the use of SetVirtualAddressMap() on eMAG and Altra Max machines
drm/amd/pm: bump SMU 13.0.0 driver_if header version
drm/amdgpu: Add unique_id support for GC 11.0.1/2
drm/amd/pm: bump SMU 13.0.7 driver_if header version
drm/amdgpu/fence: Fix oops due to non-matching drm_sched init/fini
drm/amdgpu/smu: skip pptable init under sriov
drm/amd/display: properly handling AGP aperture in vm setup
drm/amd/display: fix cursor offset on rotation 180
drm/i915: Move fd_install after last use of fence
drm/i915: Initialize the obj flags for shmem objects
drm/i915: Fix VBT DSI DVO port handling
x86/speculation: Identify processors vulnerable to SMT RSB predictions
KVM: x86: Mitigate the cross-thread return address predictions bug
Documentation/hw-vuln: Add documentation for Cross-Thread Return Predictions
Linux 6.1.12
Change-Id: I4deaf57516f3e7b40e728d473986fa355a11fc37
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
This reverts commit 8ecd88d9d3 as it is
broken with regards to upstream changes made in 6.1.12.
If this is still needed, it can be brought back in a way that works
properly based on the changes made upstream.
Bug: 174125747
Cc: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
Cc: Sai Harshini Nimmala <quic_snimmala@quicinc.com>
Change-Id: Ic3163351faabbecbce688a87215f79ca3b5d6188
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
[ Upstream commit 7a2127e66a00e073db8d90f9aac308f4a8a64226 ]
set_cpus_allowed_ptr() will fail with -EINVAL if the requested
affinity mask is not a subset of the task_cpu_possible_mask() for the
task being updated. Consequently, on a heterogeneous system with cpusets
spanning the different CPU types, updates to the cgroup hierarchy can
silently fail to update task affinities when the effective affinity
mask for the cpuset is expanded.
For example, consider an arm64 system with 4 CPUs, where CPUs 2-3 are
the only cores capable of executing 32-bit tasks. Attaching a 32-bit
task to a cpuset containing CPUs 0-2 will correctly affine the task to
CPU 2. Extending the cpuset to CPUs 0-3, however, will fail to extend
the affinity mask of the 32-bit task because update_tasks_cpumask() will
pass the full 0-3 mask to set_cpus_allowed_ptr().
Extend update_tasks_cpumask() to take a temporary 'cpumask' paramater
and use it to mask the 'effective_cpus' mask with the possible mask for
each task being updated.
Fixes: 431c69fac0 ("cpuset: Honour task_cpu_possible_mask() in guarantee_online_cpus()")
Signed-off-by: Will Deacon <will@kernel.org>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmPkywAACgkQONu9yGCS
aT42Kw/9FFrdwv29yND651dPIglYKgO0Oz27/LFNGqst1A/G1ITzfs/94NSRr+9j
uvwmBLbC+n/OXYavliBVWlPaYUCLqoFSfR+q953yz/UT0803E8BUvQ8NN8O7lsg7
hfbWJaASxt5puy2pBFypeWM+OXoVOvUBj3VhbgtUwwcYLPuYafj9rCAytdIIf5fr
RKWBLfx7As4OJ+Hb3KNkolTkFDTfV5+zqCAc9Ko474d1bpRnF15UdQN8Kkinr2+O
YNGTvDT8jR8eAk/9PiCNrG7DEMSKaczP8n/ap6PikD/KnK7ShtCLwZztLnmu65g1
vZG+cnEda8FuY3Ms03UrHhKqzMzBY/vslzBNMBTNmDsr+b7ilhffAYXPKS8s7xrg
bJjmfzfITFAjXrml25enVO0V9RtTxv6E07U7SnDrLsvE2KBFZfUR/3Xl70bVBb0S
db60kmEoq3XHHtoVySOHlfihVHSy02V9dlFcLOYMQsDHsGVsRXOR87g6d7+rJS3h
hYWz5YxMLJUr2qn2836DPBnX9Ix0VjDx+X2fB4bNYzKc1dMlgzbpYrhk9LEOUDsx
emJuqZskjkLby9Bw36N3eHW3fKPOFrwpYwPWYJHdWx1mmFSNdV6MdfEtZXpuEkFJ
iFyJPeeODGadoiznnXTaBFfhozRj+B6FXrY6pkF+WMoSt8ZlZpM=
=vu7j
-----END PGP SIGNATURE-----
Merge 6.1.11 into android14-6.1
Changes in 6.1.11
firewire: fix memory leak for payload of request subaction to IEC 61883-1 FCP region
bus: sunxi-rsb: Fix error handling in sunxi_rsb_init()
arm64: dts: imx8m-venice: Remove incorrect 'uart-has-rtscts'
arm64: dts: freescale: imx8dxl: fix sc_pwrkey's property name linux,keycode
ASoC: amd: acp-es8336: Drop reference count of ACPI device after use
ASoC: Intel: bytcht_es8316: Drop reference count of ACPI device after use
ASoC: Intel: bytcr_rt5651: Drop reference count of ACPI device after use
ASoC: Intel: bytcr_rt5640: Drop reference count of ACPI device after use
ASoC: Intel: bytcr_wm5102: Drop reference count of ACPI device after use
ASoC: Intel: sof_es8336: Drop reference count of ACPI device after use
ASoC: Intel: avs: Implement PCI shutdown
bpf: Fix off-by-one error in bpf_mem_cache_idx()
bpf: Fix a possible task gone issue with bpf_send_signal[_thread]() helpers
ALSA: hda/via: Avoid potential array out-of-bound in add_secret_dac_path()
bpf: Fix to preserve reg parent/live fields when copying range info
selftests/filesystems: grant executable permission to run_fat_tests.sh
ASoC: SOF: ipc4-mtrace: prevent underflow in sof_ipc4_priority_mask_dfs_write()
bpf: Add missing btf_put to register_btf_id_dtor_kfuncs
media: v4l2-ctrls-api.c: move ctrl->is_new = 1 to the correct line
bpf, sockmap: Check for any of tcp_bpf_prots when cloning a listener
arm64: dts: imx8mm: Fix pad control for UART1_DTE_RX
arm64: dts: imx8mm-verdin: Do not power down eth-phy
drm/vc4: hdmi: make CEC adapter name unique
drm/ssd130x: Init display before the SSD130X_DISPLAY_ON command
scsi: Revert "scsi: core: map PQ=1, PDT=other values to SCSI_SCAN_TARGET_PRESENT"
bpf: Fix the kernel crash caused by bpf_setsockopt().
ALSA: memalloc: Workaround for Xen PV
vhost/net: Clear the pending messages when the backend is removed
copy_oldmem_kernel() - WRITE is "data source", not destination
WRITE is "data source", not destination...
READ is "data destination", not source...
zcore: WRITE is "data source", not destination...
memcpy_real(): WRITE is "data source", not destination...
fix iov_iter_bvec() "direction" argument
fix 'direction' argument of iov_iter_{init,bvec}()
fix "direction" argument of iov_iter_kvec()
use less confusing names for iov_iter direction initializers
vhost-scsi: unbreak any layout for response
ice: Prevent set_channel from changing queues while RDMA active
qede: execute xdp_do_flush() before napi_complete_done()
virtio-net: execute xdp_do_flush() before napi_complete_done()
dpaa_eth: execute xdp_do_flush() before napi_complete_done()
dpaa2-eth: execute xdp_do_flush() before napi_complete_done()
skb: Do mix page pool and page referenced frags in GRO
sfc: correctly advertise tunneled IPv6 segmentation
net: phy: dp83822: Fix null pointer access on DP83825/DP83826 devices
net: wwan: t7xx: Fix Runtime PM initialization
block, bfq: replace 0/1 with false/true in bic apis
block, bfq: fix uaf for bfqq in bic_set_bfqq()
netrom: Fix use-after-free caused by accept on already connected socket
fscache: Use wait_on_bit() to wait for the freeing of relinquished volume
platform/x86/amd/pmf: update to auto-mode limits only after AMT event
platform/x86/amd/pmf: Add helper routine to update SPS thermals
platform/x86/amd/pmf: Fix to update SPS default pprof thermals
platform/x86/amd/pmf: Add helper routine to check pprof is balanced
platform/x86/amd/pmf: Fix to update SPS thermals when power supply change
platform/x86/amd/pmf: Ensure mutexes are initialized before use
platform/x86: thinkpad_acpi: Fix thinklight LED brightness returning 255
drm/i915/guc: Fix locking when searching for a hung request
drm/i915: Fix request ref counting during error capture & debugfs dump
drm/i915: Fix up locking around dumping requests lists
drm/i915/adlp: Fix typo for reference clock
net/tls: tls_is_tx_ready() checked list_entry
ALSA: firewire-motu: fix unreleased lock warning in hwdep device
netfilter: br_netfilter: disable sabotage_in hook after first suppression
block: ublk: extending queue_size to fix overflow
kunit: fix kunit_test_init_section_suites(...)
squashfs: harden sanity check in squashfs_read_xattr_id_table
maple_tree: should get pivots boundary by type
sctp: do not check hb_timer.expires when resetting hb_timer
net: phy: meson-gxl: Add generic dummy stubs for MMD register access
drm/panel: boe-tv101wum-nl6: Ensure DSI writes succeed during disable
ip/ip6_gre: Fix changing addr gen mode not generating IPv6 link local address
ip/ip6_gre: Fix non-point-to-point tunnel not generating IPv6 link local address
riscv: kprobe: Fixup kernel panic when probing an illegal position
igc: return an error if the mac type is unknown in igc_ptp_systim_to_hwtstamp()
octeontx2-af: Fix devlink unregister
can: j1939: fix errant WARN_ON_ONCE in j1939_session_deactivate
can: raw: fix CAN FD frame transmissions over CAN XL devices
can: mcp251xfd: mcp251xfd_ring_set_ringparam(): assign missing tx_obj_num_coalesce_irq
ata: libata: Fix sata_down_spd_limit() when no link speed is reported
selftests: net: udpgso_bench_rx: Fix 'used uninitialized' compiler warning
selftests: net: udpgso_bench_rx/tx: Stop when wrong CLI args are provided
selftests: net: udpgso_bench: Fix racing bug between the rx/tx programs
selftests: net: udpgso_bench_tx: Cater for pending datagrams zerocopy benchmarking
virtio-net: Keep stop() to follow mirror sequence of open()
net: openvswitch: fix flow memory leak in ovs_flow_cmd_new
efi: fix potential NULL deref in efi_mem_reserve_persistent
rtc: sunplus: fix format string for printing resource
certs: Fix build error when PKCS#11 URI contains semicolon
kbuild: modinst: Fix build error when CONFIG_MODULE_SIG_KEY is a PKCS#11 URI
i2c: designware-pci: Add new PCI IDs for AMD NAVI GPU
i2c: mxs: suppress probe-deferral error message
scsi: target: core: Fix warning on RT kernels
x86/aperfmperf: Erase stale arch_freq_scale values when disabling frequency invariance readings
perf/x86/intel: Add Emerald Rapids
perf/x86/intel/cstate: Add Emerald Rapids
scsi: iscsi_tcp: Fix UAF during logout when accessing the shost ipaddress
scsi: iscsi_tcp: Fix UAF during login when accessing the shost ipaddress
i2c: rk3x: fix a bunch of kernel-doc warnings
Revert "gfs2: stop using generic_writepages in gfs2_ail1_start_one"
x86/build: Move '-mindirect-branch-cs-prefix' out of GCC-only block
platform/x86: dell-wmi: Add a keymap for KEY_MUTE in type 0x0010 table
platform/x86: hp-wmi: Handle Omen Key event
platform/x86: gigabyte-wmi: add support for B450M DS3H WIFI-CF
platform/x86/amd: pmc: Disable IRQ1 wakeup for RN/CZN
net/x25: Fix to not accept on connected socket
drm/amd/display: Fix timing not changning when freesync video is enabled
bcache: Silence memcpy() run-time false positive warnings
iio: adc: stm32-dfsdm: fill module aliases
usb: dwc3: qcom: enable vbus override when in OTG dr-mode
usb: gadget: f_fs: Fix unbalanced spinlock in __ffs_ep0_queue_wait
vc_screen: move load of struct vc_data pointer in vcs_read() to avoid UAF
fbcon: Check font dimension limits
cgroup/cpuset: Fix wrong check in update_parent_subparts_cpumask()
hv_netvsc: Fix missed pagebuf entries in netvsc_dma_map/unmap()
ARM: dts: imx7d-smegw01: Fix USB host over-current polarity
net: qrtr: free memory on error path in radix_tree_insert()
can: isotp: split tx timer into transmission and timeout
can: isotp: handle wait_event_interruptible() return values
watchdog: diag288_wdt: do not use stack buffers for hardware data
watchdog: diag288_wdt: fix __diag288() inline assembly
ALSA: hda/realtek: Add Acer Predator PH315-54
ALSA: hda/realtek: fix mute/micmute LEDs, speaker don't work for a HP platform
ASoC: codecs: wsa883x: correct playback min/max rates
ASoC: SOF: sof-audio: unprepare when swidget->use_count > 0
ASoC: SOF: sof-audio: skip prepare/unprepare if swidget is NULL
ASoC: SOF: keep prepare/unprepare widgets in sink path
efi: Accept version 2 of memory attributes table
rtc: efi: Enable SET/GET WAKEUP services as optional
iio: hid: fix the retval in accel_3d_capture_sample
iio: hid: fix the retval in gyro_3d_capture_sample
iio: adc: xilinx-ams: fix devm_krealloc() return value check
iio: adc: berlin2-adc: Add missing of_node_put() in error path
iio: imx8qxp-adc: fix irq flood when call imx8qxp_adc_read_raw()
iio:adc:twl6030: Enable measurements of VUSB, VBAT and others
iio: light: cm32181: Fix PM support on system with 2 I2C resources
iio: imu: fxos8700: fix ACCEL measurement range selection
iio: imu: fxos8700: fix incomplete ACCEL and MAGN channels readback
iio: imu: fxos8700: fix IMU data bits returned to user space
iio: imu: fxos8700: fix map label of channel type to MAGN sensor
iio: imu: fxos8700: fix swapped ACCEL and MAGN channels readback
iio: imu: fxos8700: fix incorrect ODR mode readback
iio: imu: fxos8700: fix failed initialization ODR mode assignment
iio: imu: fxos8700: remove definition FXOS8700_CTRL_ODR_MIN
iio: imu: fxos8700: fix MAGN sensor scale and unit
nvmem: brcm_nvram: Add check for kzalloc
nvmem: sunxi_sid: Always use 32-bit MMIO reads
nvmem: qcom-spmi-sdam: fix module autoloading
parisc: Fix return code of pdc_iodc_print()
parisc: Replace hardcoded value with PRIV_USER constant in ptrace.c
parisc: Wire up PTRACE_GETREGS/PTRACE_SETREGS for compat case
riscv: disable generation of unwind tables
Revert "mm: kmemleak: alloc gray object for reserved region with direct map"
mm: multi-gen LRU: fix crash during cgroup migration
mm: hugetlb: proc: check for hugetlb shared PMD in /proc/PID/smaps
mm: memcg: fix NULL pointer in mem_cgroup_track_foreign_dirty_slowpath()
usb: gadget: f_uac2: Fix incorrect increment of bNumEndpoints
usb: typec: ucsi: Don't attempt to resume the ports before they exist
usb: gadget: udc: do not clear gadget driver.bus
kernel/irq/irqdomain.c: fix memory leak with using debugfs_lookup()
HV: hv_balloon: fix memory leak with using debugfs_lookup()
x86/debug: Fix stack recursion caused by wrongly ordered DR7 accesses
fpga: m10bmc-sec: Fix probe rollback
fpga: stratix10-soc: Fix return value check in s10_ops_write_init()
mm/uffd: fix pte marker when fork() without fork event
mm/swapfile: add cond_resched() in get_swap_pages()
mm/khugepaged: fix ->anon_vma race
mm, mremap: fix mremap() expanding for vma's with vm_ops->close()
mm/MADV_COLLAPSE: catch !none !huge !bad pmd lookups
highmem: round down the address passed to kunmap_flush_on_unmap()
ia64: fix build error due to switch case label appearing next to declaration
Squashfs: fix handling and sanity checking of xattr_ids count
maple_tree: fix mas_empty_area_rev() lower bound validation
migrate: hugetlb: check for hugetlb shared PMD in node migration
dma-buf: actually set signaling bit for private stub fences
serial: stm32: Merge hard IRQ and threaded IRQ handling into single IRQ handler
drm/i915: Avoid potential vm use-after-free
drm/i915: Fix potential bit_17 double-free
drm/amd: Fix initialization for nbio 4.3.0
drm/amd/pm: drop unneeded dpm features disablement for SMU 13.0.4/11
drm/amdgpu: update wave data type to 3 for gfx11
nvmem: core: initialise nvmem->id early
nvmem: core: remove nvmem_config wp_gpio
nvmem: core: fix cleanup after dev_set_name()
nvmem: core: fix registration vs use race
nvmem: core: fix device node refcounting
nvmem: core: fix cell removal on error
nvmem: core: fix return value
phy: qcom-qmp-combo: fix runtime suspend
serial: 8250_dma: Fix DMA Rx completion race
serial: 8250_dma: Fix DMA Rx rearm race
platform/x86/amd: pmc: add CONFIG_SERIO dependency
ASoC: SOF: sof-audio: prepare_widgets: Check swidget for NULL on sink failure
iio:adc:twl6030: Enable measurement of VAC
powerpc/64s/radix: Fix crash with unaligned relocated kernel
powerpc/64s: Fix local irq disable when PMIs are disabled
powerpc/imc-pmu: Revert nest_init_lock to being a mutex
fs/ntfs3: Validate attribute data and valid sizes
ovl: Use "buf" flexible array for memcpy() destination
f2fs: initialize locks earlier in f2fs_fill_super()
fbdev: smscufx: fix error handling code in ufx_usb_probe
f2fs: fix to do sanity check on i_extra_isize in is_alive()
wifi: brcmfmac: Check the count value of channel spec to prevent out-of-bounds reads
gfs2: Cosmetic gfs2_dinode_{in,out} cleanup
gfs2: Always check inode size of inline inodes
bpf: Skip invalid kfunc call in backtrack_insn
Linux 6.1.11
Change-Id: I69722bc9711b91f2fca18de59746ada373f64c5e
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
commit e5ae8803847b80fe9d744a3174abe2b7bfed222a upstream.
It was found that the check to see if a partition could use up all
the cpus from the parent cpuset in update_parent_subparts_cpumask()
was incorrect. As a result, it is possible to leave parent with no
effective cpu left even if there are tasks in the parent cpuset. This
can lead to system panic as reported in [1].
Fix this probem by updating the check to fail the enabling the partition
if parent's effective_cpus is a subset of the child's cpus_allowed.
Also record the error code when an error happens in update_prstate()
and add a test case where parent partition and child have the same cpu
list and parent has task. Enabling partition in the child will fail in
this case.
[1] https://www.spinics.net/lists/cgroups/msg36254.html
Fixes: f0af1bfc27 ("cgroup/cpuset: Relax constraints to partition & cpus changes")
Cc: stable@vger.kernel.org # v6.1
Reported-by: Srinivas Pandruvada <srinivas.pandruvada@intel.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Vendors might want to change tasks affinity settings when they are
moving from one cpuset into the other. Add vendor hook to give control
to vendor to implement what they need.
This feature is necessary to control hotplug behaviour within Qualcomm's
proprietary load tracking scheme, WALT.
This reverts commit 034ddf86f7 ("Revert "ANDROID: sched/cpuset: Add
vendor hook to change tasks affinity"") to effectively bring back the
original change.
Bug: 174125747
Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
Signed-off-by: Sai Harshini Nimmala <quic_snimmala@quicinc.com>
Change-Id: Id4e9c3e47e3b4e041804bdf10cbd9e36179bc172
This deliberately changes the behavior of the per-cpuset
cpus file to not be effected by hotplug. When a cpu is offlined, it will
be removed from the cpuset/cpus file. When a cpu is onlined, if the cpuset
originally requested that cpu be a part of the cpuset, that cpu will be
restored to the cpuset. The cpus files still have to be hierachical, but
the ranges no longer have to be out of the currently online cpus, just the
physically present cpus.
This reverts commit 3fc3fe757f ("Revert "ANDROID: cpuset: Make cpusets
restore on hotplug""). Reverting the revert effectively bringing back
the original change.
Bug: 174125747
Bug: 120444281
Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>
[AmitP: Refactored original changes to align with upstream commit
201af4c0fa ("cgroup: move cgroup files under kernel/cgroup/")]
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
[satyap@codeaurora.org: port to android-mainline kernel]
Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
[SaiHarshiniN: Resolved merge conflict post 6.1-rc1]
Signed-off-by: Sai Harshini Nimmala <quic_snimmala@quicinc.com>
Change-Id: I588f6172c15b48ecadb85f161dae948ce9aeca93
memcg_write_event_control() accesses the dentry->d_name of the specified
control fd to route the write call. As a cgroup interface file can't be
renamed, it's safe to access d_name as long as the specified file is a
regular cgroup file. Also, as these cgroup interface files can't be
removed before the directory, it's safe to access the parent too.
Prior to 347c4a8747 ("memcg: remove cgroup_event->cft"), there was a
call to __file_cft() which verified that the specified file is a regular
cgroupfs file before further accesses. The cftype pointer returned from
__file_cft() was no longer necessary and the commit inadvertently
dropped the file type check with it allowing any file to slip through.
With the invarients broken, the d_name and parent accesses can now race
against renames and removals of arbitrary files and cause
use-after-free's.
Fix the bug by resurrecting the file type check in __file_cft(). Now
that cgroupfs is implemented through kernfs, checking the file
operations needs to go through a layer of indirection. Instead, let's
check the superblock and dentry type.
Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: 347c4a8747 ("memcg: remove cgroup_event->cft")
Cc: stable@kernel.org # v3.14+
Reported-by: Jann Horn <jannh@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vendor module needs to rebuild sched domains at boot, in the
event that cpufreq initializes the energy model too late.
Bug: 242898038
Change-Id: Ifaf1223366ac81c3f3c382dd0f61110fce9c1b20
Signed-off-by: Stephen Dickey <quic_dickey@quicinc.com>
Steps on the way to 6.1-rc1
Resolves conflicts in:
fs/exec.c
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ia649390f891780ed03530adcf0fac9560635e58f
* Fix a recent regression where a sleeping kernfs function is called with
css_set_lock (spinlock) held.
* Revert the commit to enable cgroup1 support for cgroup_get_from_fd/file().
Multiple users assume that the lookup only works for cgroup2 and breaks
when fed a cgroup1 file. Instead, introduce a separate set of functions to
lookup both v1 and v2 and use them where the user explicitly wants to
support both versions.
* Compat update for tools/perf/util/bpf_skel/bperf_cgroup.bpf.c.
* Add Josef Bacik as a blkcg maintainer.
-----BEGIN PGP SIGNATURE-----
iIQEABYIACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCY03MlA4cdGpAa2VybmVs
Lm9yZwAKCRCxYfJx3gVYGTkUAQD7fNcSPuc2m/BvW+gySKQkp9PZMA6E6yOIqirc
QKmIgwEAwWECW7RR1alhOGD50RtYkuYVsLD1+6Ka4eMHe+EhwA4=
=kGLI
-----END PGP SIGNATURE-----
Merge tag 'cgroup-for-6.1-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:
- Fix a recent regression where a sleeping kernfs function is called
with css_set_lock (spinlock) held
- Revert the commit to enable cgroup1 support for cgroup_get_from_fd/file()
Multiple users assume that the lookup only works for cgroup2 and
breaks when fed a cgroup1 file. Instead, introduce a separate set of
functions to lookup both v1 and v2 and use them where the user
explicitly wants to support both versions.
- Compat update for tools/perf/util/bpf_skel/bperf_cgroup.bpf.c.
- Add Josef Bacik as a blkcg maintainer.
* tag 'cgroup-for-6.1-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
blkcg: Update MAINTAINERS entry
mm: cgroup: fix comments for get from fd/file helpers
perf stat: Support old kernels for bperf cgroup counting
bpf: cgroup_iter: support cgroup1 using cgroup fd
cgroup: add cgroup_v1v2_get_from_[fd/file]()
Revert "cgroup: enable cgroup_get_from_file() on cgroup1"
cgroup: Reorganize css_set_lock and kernfs path processing
- Various performance optimizations, resulting in a 4%-9% speedup
in the mmtests/config-scheduler-perfpipe micro-benchmark.
- New interface to turn PSI on/off on a per cgroup level.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmNJKPsRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1iPmg//aovCitAQX2lLoHJDIgdQibU40oaEpKTX
wM549EGz3Dr6qmwF8+qT1U2Ge6af/hHQc5G/ZqDpKbuTjUIc3RmBkqX80dNKFLuH
uyi9UtfsSriw+ks8fWuDdjr+S4oppwW9ZoIXvK8v4bisd3F31DNGvKPTayNxt73m
lExfzJiD1oJixDxGX8MGO9QpcoywmjWjzjrB2P+J8hnTpArouHx/HOKdQOpG6wXq
ZRr9kZvju6ucDpXCTa1HJrfVRxNAh35tx/b4cDtXbBFifVAeKaPOrHapMTVsqfel
Z7T+2DymhidNYK0hrRJoGUwa/vkz+2Sm1ZLG9LlgUCXVco/9S1zw1ZuQakVvzPen
wriuxRaAkR+szCP0L8js5+/DAkGa43MjKsvQHmDVnetQtlsAD4eYnn+alQ837SXv
MP3jwFqF+e4mcWdoQcfh0OWUgGec5XZzdgRYrFkBKyTWGLB2iPivcAMNf0X/h82Q
xxv4DQJIIJ017GOQ/ho2saq+GbtFCvX8YnGYas9T47Bjjluhjo7jgTVtPTo+mhtN
RfwMdG718Ap/gvnAX7wMe/t+L/4AP8AIgDRi5L35dTRqETwOjH+LAvOYjleQFYgu
kMVtLMyzU+TGwHscuzPFRh7TnvSJ4sD48Ll1BPnyZsh3SS9u0gAs1bml7Cu7JbmW
SIZD/S/hzdI=
=91tB
-----END PGP SIGNATURE-----
Merge tag 'sched-psi-2022-10-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull PSI updates from Ingo Molnar:
- Various performance optimizations, resulting in a 4%-9% speedup in
the mmtests/config-scheduler-perfpipe micro-benchmark.
- New interface to turn PSI on/off on a per cgroup level.
* tag 'sched-psi-2022-10-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/psi: Per-cgroup PSI accounting disable/re-enable interface
sched/psi: Cache parent psi_group to speed up group iteration
sched/psi: Consolidate cgroup_psi()
sched/psi: Add PSI_IRQ to track IRQ/SOFTIRQ pressure
sched/psi: Remove NR_ONCPU task accounting
sched/psi: Optimize task switch inside shared cgroups again
sched/psi: Move private helpers to sched/stats.h
sched/psi: Save percpu memory when !psi_cgroups_enabled
sched/psi: Don't create cgroup PSI files when psi_disabled
sched/psi: Fix periodic aggregation shut off
This reverts commit 4d1ac6a160 as it
causes merge conflicts, and build conflicts in 6.1-rc1 due to upstream
cpuset changes. If this is needed in 6.1, it will have to be reworked
and forward ported in a different way as the structure information has
been changed in 6.1.
Bug: 174125747
Cc: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Iedc328c351830b8b338efb4642aac656b00c5a2a
This reverts commit c8dc4422c6. It causes
merge conflicts with 6.1-rc1. If this is still needed in 6.1, it can
come back after 6.1-rc1 is merged.
Bug: 174125747
Bug: 120444281
Cc: Dmitry Shmidt <dimitrysh@google.com>
Cc: Amit Pundir <amit.pundir@linaro.org>
Cc: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I8b040cf6ba360aad710eb462aabf76d87e803b23
Fix the documentation comments for cgroup_[v1v2_]get_from_[fd/file]().
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Add cgroup_v1v2_get_from_fd() and cgroup_v1v2_get_from_file() that
support both cgroup1 and cgroup2.
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
linux-next for a couple of months without, to my knowledge, any negative
reports (or any positive ones, come to that).
- Also the Maple Tree from Liam R. Howlett. An overlapping range-based
tree for vmas. It it apparently slight more efficient in its own right,
but is mainly targeted at enabling work to reduce mmap_lock contention.
Liam has identified a number of other tree users in the kernel which
could be beneficially onverted to mapletrees.
Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
(https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com).
This has yet to be addressed due to Liam's unfortunately timed
vacation. He is now back and we'll get this fixed up.
- Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses
clang-generated instrumentation to detect used-unintialized bugs down to
the single bit level.
KMSAN keeps finding bugs. New ones, as well as the legacy ones.
- Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
memory into THPs.
- Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to support
file/shmem-backed pages.
- userfaultfd updates from Axel Rasmussen
- zsmalloc cleanups from Alexey Romanov
- cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and memory-failure
- Huang Ying adds enhancements to NUMA balancing memory tiering mode's
page promotion, with a new way of detecting hot pages.
- memcg updates from Shakeel Butt: charging optimizations and reduced
memory consumption.
- memcg cleanups from Kairui Song.
- memcg fixes and cleanups from Johannes Weiner.
- Vishal Moola provides more folio conversions
- Zhang Yi removed ll_rw_block() :(
- migration enhancements from Peter Xu
- migration error-path bugfixes from Huang Ying
- Aneesh Kumar added ability for a device driver to alter the memory
tiering promotion paths. For optimizations by PMEM drivers, DRM
drivers, etc.
- vma merging improvements from Jakub Matěn.
- NUMA hinting cleanups from David Hildenbrand.
- xu xin added aditional userspace visibility into KSM merging activity.
- THP & KSM code consolidation from Qi Zheng.
- more folio work from Matthew Wilcox.
- KASAN updates from Andrey Konovalov.
- DAMON cleanups from Kaixu Xia.
- DAMON work from SeongJae Park: fixes, cleanups.
- hugetlb sysfs cleanups from Muchun Song.
- Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY0HaPgAKCRDdBJ7gKXxA
joPjAQDZ5LlRCMWZ1oxLP2NOTp6nm63q9PWcGnmY50FjD/dNlwEAnx7OejCLWGWf
bbTuk6U2+TKgJa4X7+pbbejeoqnt5QU=
=xfWx
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- Yu Zhao's Multi-Gen LRU patches are here. They've been under test in
linux-next for a couple of months without, to my knowledge, any
negative reports (or any positive ones, come to that).
- Also the Maple Tree from Liam Howlett. An overlapping range-based
tree for vmas. It it apparently slightly more efficient in its own
right, but is mainly targeted at enabling work to reduce mmap_lock
contention.
Liam has identified a number of other tree users in the kernel which
could be beneficially onverted to mapletrees.
Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
at [1]. This has yet to be addressed due to Liam's unfortunately
timed vacation. He is now back and we'll get this fixed up.
- Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses
clang-generated instrumentation to detect used-unintialized bugs down
to the single bit level.
KMSAN keeps finding bugs. New ones, as well as the legacy ones.
- Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
memory into THPs.
- Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to
support file/shmem-backed pages.
- userfaultfd updates from Axel Rasmussen
- zsmalloc cleanups from Alexey Romanov
- cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and
memory-failure
- Huang Ying adds enhancements to NUMA balancing memory tiering mode's
page promotion, with a new way of detecting hot pages.
- memcg updates from Shakeel Butt: charging optimizations and reduced
memory consumption.
- memcg cleanups from Kairui Song.
- memcg fixes and cleanups from Johannes Weiner.
- Vishal Moola provides more folio conversions
- Zhang Yi removed ll_rw_block() :(
- migration enhancements from Peter Xu
- migration error-path bugfixes from Huang Ying
- Aneesh Kumar added ability for a device driver to alter the memory
tiering promotion paths. For optimizations by PMEM drivers, DRM
drivers, etc.
- vma merging improvements from Jakub Matěn.
- NUMA hinting cleanups from David Hildenbrand.
- xu xin added aditional userspace visibility into KSM merging
activity.
- THP & KSM code consolidation from Qi Zheng.
- more folio work from Matthew Wilcox.
- KASAN updates from Andrey Konovalov.
- DAMON cleanups from Kaixu Xia.
- DAMON work from SeongJae Park: fixes, cleanups.
- hugetlb sysfs cleanups from Muchun Song.
- Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.
Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1]
* tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits)
hugetlb: allocate vma lock for all sharable vmas
hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer
hugetlb: fix vma lock handling during split vma and range unmapping
mglru: mm/vmscan.c: fix imprecise comments
mm/mglru: don't sync disk for each aging cycle
mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol
mm: memcontrol: use do_memsw_account() in a few more places
mm: memcontrol: deprecate swapaccounting=0 mode
mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled
mm/secretmem: remove reduntant return value
mm/hugetlb: add available_huge_pages() func
mm: remove unused inline functions from include/linux/mm_inline.h
selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory
selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd
selftests/vm: add thp collapse shmem testing
selftests/vm: add thp collapse file and tmpfs testing
selftests/vm: modularize thp collapse memory operations
selftests/vm: dedup THP helpers
mm/khugepaged: add tracepoint to hpage_collapse_scan_file()
mm/madvise: add file and shmem support to MADV_COLLAPSE
...
This reverts commit f3a2aebdd6.
The commit enabled looking up v1 cgroups via cgroup_get_from_file().
However, there are multiple users, including CLONE_INTO_CGROUP, which have
been assuming that it would only look up v2 cgroups. Returning v1 cgroups
breaks them.
Let's revert the commit and retry later with a separate lookup interface
which allows both v1 and v2.
Signed-off-by: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/r/000000000000385cbf05ea3f1862@google.com
Cc: Yosry Ahmed <yosryahmed@google.com>
The commit 74e4b956eb incorrectly wrapped kernfs_walk_and_get
(might_sleep) under css_set_lock (spinlock). css_set_lock is needed by
__cset_cgroup_from_root to ensure stable cset->cgrp_links but not for
kernfs_walk_and_get.
We only need to make sure that the returned root_cgrp won't be freed
under us. This is given in the case of global root because it is static
(cgrp_dfl_root.cgrp). When the root_cgrp is lower in the hierarchy, it
is pinned by cgroup_ns->root_cset (and `current` task cannot switch
namespace asynchronously so ns_proxy pins cgroup_ns).
Note this reasoning won't hold for root cgroups in v1 hierarchies,
therefore create a special-cased helper function just for the default
hierarchy.
Fixes: 74e4b956eb ("cgroup: Honor caller's cgroup NS when resolving path")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
* cpuset now support isolated cpus.partition type, which will enable dynamic
CPU isolation.
* pids.peak added to remember the max number of pids used.
* Holes in cgroup namespace plugged.
* Internal cleanups.
Note that for-6.1-fixes was pulled into for-6.1 twice. Both were for
follow-up cleanups and each merge commit has details.
Also, 8a693f7766 ("cgroup: Remove CFTYPE_PRESSURE") removes the flag used
by PSI changes in the tip tree and the merged result won't compile due to
the missing flag. Simply removing the struct init lines specifying the flag
is the correct resolution. linux-next already contains the correct fix:
https://lkml.kernel.org/r/20220912161812.072aaa3b@canb.auug.org.au
-----BEGIN PGP SIGNATURE-----
iIQEABYIACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCYzsl7w4cdGpAa2VybmVs
Lm9yZwAKCRCxYfJx3gVYGYsxAP4kad4YPw+CueLyyEMiYgBHouqDt8cG0+FJWK3X
svTC7wD/eCLfxZM8TjjSrMmvaMrml586mr3NoQaFeW0x3twptQQ=
=LERu
-----END PGP SIGNATURE-----
Merge tag 'cgroup-for-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:
- cpuset now support isolated cpus.partition type, which will enable
dynamic CPU isolation
- pids.peak added to remember the max number of pids used
- holes in cgroup namespace plugged
- internal cleanups
* tag 'cgroup-for-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (25 commits)
cgroup: use strscpy() is more robust and safer
iocost_monitor: reorder BlkgIterator
cgroup: simplify code in cgroup_apply_control
cgroup: Make cgroup_get_from_id() prettier
cgroup/cpuset: remove unreachable code
cgroup: Remove CFTYPE_PRESSURE
cgroup: Improve cftype add/rm error handling
kselftest/cgroup: Add cpuset v2 partition root state test
cgroup/cpuset: Update description of cpuset.cpus.partition in cgroup-v2.rst
cgroup/cpuset: Make partition invalid if cpumask change violates exclusivity rule
cgroup/cpuset: Relocate a code block in validate_change()
cgroup/cpuset: Show invalid partition reason string
cgroup/cpuset: Add a new isolated cpus.partition type
cgroup/cpuset: Relax constraints to partition & cpus changes
cgroup/cpuset: Allow no-task partition to have empty cpuset.cpus.effective
cgroup/cpuset: Miscellaneous cleanups & add helper functions
cgroup/cpuset: Enable update_tasks_cpumask() on top_cpuset
cgroup: add pids.peak interface for pids controller
cgroup: Remove data-race around cgrp_dfl_visible
cgroup: Fix build failure when CONFIG_SHRINKER_DEBUG
...
- Debuggability:
- Change most occurances of BUG_ON() to WARN_ON_ONCE()
- Reorganize & fix TASK_ state comparisons, turn it into a bitmap
- Update/fix misc scheduler debugging facilities
- Load-balancing & regular scheduling:
- Improve the behavior of the scheduler in presence of lot of
SCHED_IDLE tasks - in particular they should not impact other
scheduling classes.
- Optimize task load tracking, cleanups & fixes
- Clean up & simplify misc load-balancing code
- Freezer:
- Rewrite the core freezer to behave better wrt thawing and be simpler
in general, by replacing PF_FROZEN with TASK_FROZEN & fixing/adjusting
all the fallout.
- Deadline scheduler:
- Fix the DL capacity-aware code
- Factor out dl_task_is_earliest_deadline() & replenish_dl_new_period()
- Relax/optimize locking in task_non_contending()
- Cleanups:
- Factor out the update_current_exec_runtime() helper
- Various cleanups, simplifications
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmM/01cRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1geZA/+PB4KC1T9aVxzaTHI36R03YgJYZmIdtxw
wTf02MixePmz+gQCbepJbempGOh5ST28aOcI0xhdYOql5B63MaUBBMlB0HvGUyDG
IU3zETqLMRtAbnSTdQFv8m++ECUtZYp8/x1FCel4WO7ya4ETkRu1NRfCoUepEhpZ
aVAlae9LH3NBaF9t7s0PT2lTjf3pIzMFRkddJ0ywJhbFR3VnWat05fAK+J6fGY8+
LS54coefNlJD4oDh5TY8uniL1j5SmWmmwbk9Cdj7bLU5P3dFSS0/+5FJNHJPVGDE
srGT7wstRUcDrN0CnZo48VIUBiApJCCDqTfJYi9wNYd0NAHvwY6MIJJgEIY8mKsI
L/qH26H81Wt+ezSZ/5JIlGlZ/LIeNaa6OO/fbWEYABBQogvvx3nxsRNUYKSQzumH
CnSBasBjLnjWyLlK4qARM9cI7NFSEK6NUigrEx/7h8JFu/8T4DlSy6LsF1HUyKgq
4+FJLAqG6cL0tcwB/fHYd0oRESN8dStnQhGxSojgufwLc7dlFULvCYF5JM/dX+/V
IKwbOfIOeOn6ViMtSOXAEGdII+IQ2/ZFPwr+8Z5JC7NzvTVL6xlu/3JXkLZR3L7o
yaXTSaz06h1vil7Z+GRf7RHc+wUeGkEpXh5vnarGZKXivhFdWsBdROIJANK+xR0i
TeSLCxQxXlU=
=KjMD
-----END PGP SIGNATURE-----
Merge tag 'sched-core-2022-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
"Debuggability:
- Change most occurances of BUG_ON() to WARN_ON_ONCE()
- Reorganize & fix TASK_ state comparisons, turn it into a bitmap
- Update/fix misc scheduler debugging facilities
Load-balancing & regular scheduling:
- Improve the behavior of the scheduler in presence of lot of
SCHED_IDLE tasks - in particular they should not impact other
scheduling classes.
- Optimize task load tracking, cleanups & fixes
- Clean up & simplify misc load-balancing code
Freezer:
- Rewrite the core freezer to behave better wrt thawing and be
simpler in general, by replacing PF_FROZEN with TASK_FROZEN &
fixing/adjusting all the fallout.
Deadline scheduler:
- Fix the DL capacity-aware code
- Factor out dl_task_is_earliest_deadline() &
replenish_dl_new_period()
- Relax/optimize locking in task_non_contending()
Cleanups:
- Factor out the update_current_exec_runtime() helper
- Various cleanups, simplifications"
* tag 'sched-core-2022-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits)
sched: Fix more TASK_state comparisons
sched: Fix TASK_state comparisons
sched/fair: Move call to list_last_entry() in detach_tasks
sched/fair: Cleanup loop_max and loop_break
sched/fair: Make sure to try to detach at least one movable task
sched: Show PF_flag holes
freezer,sched: Rewrite core freezer logic
sched: Widen TAKS_state literals
sched/wait: Add wait_event_state()
sched/completion: Add wait_for_completion_state()
sched: Add TASK_ANY for wait_task_inactive()
sched: Change wait_task_inactive()s match_state
freezer,umh: Clean up freezer/initrd interaction
freezer: Have {,un}lock_system_sleep() save/restore flags
sched: Rename task_running() to task_on_cpu()
sched/fair: Cleanup for SIS_PROP
sched/fair: Default to false in test_idle_cores()
sched/fair: Remove useless check in select_idle_core()
sched/fair: Avoid double search on same cpu
sched/fair: Remove redundant check in select_idle_smt()
...
Here is the big set of driver core and debug printk changes for 6.1-rc1.
Included in here is:
- dynamic debug updates for the core and the drm subsystem. The
drm changes have all been acked by the relevant maintainers.
- kernfs fixes for syzbot reported problems
- kernfs refactors and updates for cgroup requirements
- magic number cleanups and removals from the kernel tree (they
were not being used and they really did not actually do
anything.)
- other tiny cleanups
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCY0BYUA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ylozwCdFRlcghaf7XBUyNgRZRwMC+oQI8EAn1G/nEDE
6aFd2er41uK0IGQnSmYO
=OK0k
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the big set of driver core and debug printk changes for
6.1-rc1. Included in here is:
- dynamic debug updates for the core and the drm subsystem. The drm
changes have all been acked by the relevant maintainers
- kernfs fixes for syzbot reported problems
- kernfs refactors and updates for cgroup requirements
- magic number cleanups and removals from the kernel tree (they were
not being used and they really did not actually do anything)
- other tiny cleanups
All of these have been in linux-next for a while with no reported
issues"
* tag 'driver-core-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (74 commits)
docs: filesystems: sysfs: Make text and code for ->show() consistent
Documentation: NBD_REQUEST_MAGIC isn't a magic number
a.out: restore CMAGIC
device property: Add const qualifier to device_get_match_data() parameter
drm_print: add _ddebug descriptor to drm_*dbg prototypes
drm_print: prefer bare printk KERN_DEBUG on generic fn
drm_print: optimize drm_debug_enabled for jump-label
drm-print: add drm_dbg_driver to improve namespace symmetry
drm-print.h: include dyndbg header
drm_print: wrap drm_*_dbg in dyndbg descriptor factory macro
drm_print: interpose drm_*dbg with forwarding macros
drm: POC drm on dyndbg - use in core, 2 helpers, 3 drivers.
drm_print: condense enum drm_debug_category
debugfs: use DEFINE_SHOW_ATTRIBUTE to define debugfs_regset32_fops
driver core: use IS_ERR_OR_NULL() helper in device_create_groups_vargs()
Documentation: ENI155_MAGIC isn't a magic number
Documentation: NBD_REPLY_MAGIC isn't a magic number
nbd: remove define-only NBD_MAGIC, previously magic number
Documentation: FW_HEADER_MAGIC isn't a magic number
Documentation: EEPROM_MAGIC_VALUE isn't a magic number
...
Add /sys/kernel/mm/lru_gen/enabled as a kill switch. Components that
can be disabled include:
0x0001: the multi-gen LRU core
0x0002: walking page table, when arch_has_hw_pte_young() returns
true
0x0004: clearing the accessed bit in non-leaf PMD entries, when
CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG=y
[yYnN]: apply to all the components above
E.g.,
echo y >/sys/kernel/mm/lru_gen/enabled
cat /sys/kernel/mm/lru_gen/enabled
0x0007
echo 5 >/sys/kernel/mm/lru_gen/enabled
cat /sys/kernel/mm/lru_gen/enabled
0x0005
NB: the page table walks happen on the scale of seconds under heavy memory
pressure, in which case the mmap_lock contention is a lesser concern,
compared with the LRU lock contention and the I/O congestion. So far the
only well-known case of the mmap_lock contention happens on Android, due
to Scudo [1] which allocates several thousand VMAs for merely a few
hundred MBs. The SPF and the Maple Tree also have provided their own
assessments [2][3]. However, if walking page tables does worsen the
mmap_lock contention, the kill switch can be used to disable it. In this
case the multi-gen LRU will suffer a minor performance degradation, as
shown previously.
Clearing the accessed bit in non-leaf PMD entries can also be disabled,
since this behavior was not tested on x86 varieties other than Intel and
AMD.
[1] https://source.android.com/devices/tech/debug/scudo
[2] https://lore.kernel.org/r/20220128131006.67712-1-michel@lespinasse.org/
[3] https://lore.kernel.org/r/20220426150616.3937571-1-Liam.Howlett@oracle.com/
Link: https://lkml.kernel.org/r/20220918080010.2920238-11-yuzhao@google.com
Signed-off-by: Yu Zhao <yuzhao@google.com>
Acked-by: Brian Geffon <bgeffon@google.com>
Acked-by: Jan Alexander Steffens (heftig) <heftig@archlinux.org>
Acked-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Acked-by: Steven Barrett <steven@liquorix.net>
Acked-by: Suleiman Souhlal <suleiman@google.com>
Tested-by: Daniel Byrne <djbyrne@mtu.edu>
Tested-by: Donald Carr <d@chaos-reins.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Tested-by: Konstantin Kharlamov <Hi-Angel@yandex.ru>
Tested-by: Shuang Zhai <szhai2@cs.rochester.edu>
Tested-by: Sofia Trinh <sofia.trinh@edi.works>
Tested-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Michael Larabel <Michael@MichaelLarabel.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The implementation of strscpy() is more robust and safer.
That's now the recommended way to copy NUL terminated strings.
Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn>
Signed-off-by: Tejun Heo <tj@kernel.org>
It could directly return 'cgroup_update_dfl_csses' to simplify code.
Signed-off-by: William Dean <williamsukatube@163.com>
Reviewed-by: Mukesh Ojha <quic_mojha@quicinc.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
After merging 836ac87d ("cgroup: fix cgroup_get_from_id") into for-6.1, its
combination with two commits in for-6.1 - 4534dee9 ("cgroup: cgroup: Honor
caller's cgroup NS when resolving cgroup id") and fa7e439c ("cgroup:
Homogenize cgroup_get_from_id() return value") - makes the gotos in the
error handling path too ugly while not adding anything of value.
All that the gotos are saving is one extra kernfs_put() call. Let's remove
the gotos and perform error returns directly.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Michal Koutný <mkoutny@suse.com>
for-6.0 has the following fix for cgroup_get_from_id().
836ac87d ("cgroup: fix cgroup_get_from_id")
which conflicts with the following two commits in for-6.1.
4534dee9 ("cgroup: cgroup: Honor caller's cgroup NS when resolving cgroup id")
fa7e439c ("cgroup: Homogenize cgroup_get_from_id() return value")
While the resolution is straightforward, the code ends up pretty ugly
afterwards. Let's pull for-6.0-fixes into for-6.1 so that the code can be
fixed up there.
Signed-off-by: Tejun Heo <tj@kernel.org>