Commit Graph

1157742 Commits

Author SHA1 Message Date
Chungkai Mei
d0e2d333f9 ANDROID: Update the ABI symbol list
Adding the following symbols:
  - update_misfit_status

Bug: 318526590
Change-Id: I6bd490130932021298b4c72ee68725998ff2fb69
Signed-off-by: Chungkai Mei <chungkai@google.com>
2024-01-30 18:46:55 +00:00
Chungkai Mei
10558542a1 ANDROID: sched: export update_misfit_status symbol
Current scheduler cannot update misfit status immediately when we set uclamp min for some latency-sensitive tasks, it may cause some latency for these tasks so we may need to update misfit status in vendor kernel.

Bug: 318526590
Change-Id: I0f03d2e52588822d1a9ef9a5f24944dff4f4e4a0
Signed-off-by: Chungkai Mei <chungkai@google.com>
2024-01-30 18:46:55 +00:00
meitaogao
a0b3b39898 ANDROID: GKI: Add ASR KMI symbol list
INFO: 4 function symbol(s) added
  'void clk_rate_exclusive_put(struct clk*)'
  'int clk_set_rate_exclusive(struct clk*, unsigned long)'
  'void sdhci_enable_sdio_irq(struct mmc_host*, int)'
  'void sdhci_send_tuning(struct sdhci_host*, u32)'

Bug: 322838719
Change-Id: Icd2e4f245fd146c065e8192a6ceb9dc2171dadb0
Signed-off-by: meitaogao <meitaogao@asrmicro.com>
2024-01-30 18:40:36 +00:00
Uttkarsh Aggarwal
599710db0f FROMGIT: usb: dwc3: gadget: Fix NULL pointer dereference in dwc3_gadget_suspend
In current scenario if Plug-out and Plug-In performed continuously
there could be a chance while checking for dwc->gadget_driver in
dwc3_gadget_suspend, a NULL pointer dereference may occur.

Call Stack:

	CPU1:                           CPU2:
	gadget_unbind_driver            dwc3_suspend_common
	dwc3_gadget_stop                dwc3_gadget_suspend
                                        dwc3_disconnect_gadget

CPU1 basically clears the variable and CPU2 checks the variable.
Consider CPU1 is running and right before gadget_driver is cleared
and in parallel CPU2 executes dwc3_gadget_suspend where it finds
dwc->gadget_driver which is not NULL and resumes execution and then
CPU1 completes execution. CPU2 executes dwc3_disconnect_gadget where
it checks dwc->gadget_driver is already NULL because of which the
NULL pointer deference occur.

Cc: <stable@vger.kernel.org>
Fixes: 9772b47a4c ("usb: dwc3: gadget: Fix suspend/resume during device mode")
Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Signed-off-by: Uttkarsh Aggarwal <quic_uaggarwa@quicinc.com>

(cherry picked from commit 61a348857e869432e6a920ad8ea9132e8d44c316 )

Bug: 322899161
Link: https://lore.kernel.org/all/20240119094825.26530-1-quic_uaggarwa@quicinc.com/
Change-Id: I2f1663f19ebdd6c6b5b1874a66c81fd3f75b0e9a
Signed-off-by: Rajashekar kuruva <quic_kuruva@quicinc.com>
2024-01-30 16:48:10 +00:00
Udipto Goswami
9265fa90c1 FROMLIST: usb: core: Prevent null pointer dereference in update_port_device_state
Currently, the function update_port_device_state gets the usb_hub from
udev->parent by calling usb_hub_to_struct_hub.
However, in case the actconfig or the maxchild is 0, the usb_hub would
be NULL and upon further accessing to get port_dev would result in null
pointer dereference.

Fix this by introducing an if check after the usb_hub is populated.

Fixes: 83cb2604f641 ("usb: core: add sysfs entry for usb device state")
Cc: stable@vger.kernel.org
Signed-off-by: Udipto Goswami <quic_ugoswami@quicinc.com>
Reviewed-by: Alan Stern <stern@rowland.harvard.edu>

Bug: 321600650
Link: https://lore.kernel.org/all/20240110095814.7626-1-quic_ugoswami@quicinc.com/
Change-Id: I3fef553dce36a7ec2d335008fe8d51d848d6abd2
Signed-off-by: Rajashekar kuruva <quic_kuruva@quicinc.com>
2024-01-30 10:56:58 +00:00
Daniel Mentz
2730733d54 ANDROID: gki_defconfig: Enable CONFIG_NVME_MULTIPATH
Enable NVMe multipath support to get access to /dev/nvmeXnY block
devices.

Bug: 318459546
Change-Id: Id452462b4dbb474f1e3a53f5010f09edf63642bc
Signed-off-by: Daniel Mentz <danielmentz@google.com>
2024-01-26 18:05:27 +00:00
zhengyan
4f668f5682 BACKPORT: irqchip/gic-v3: Work around affinity issues on ASR8601
The ASR8601 SoC combines ARMv8.2 CPUs from ARM with a GIC-500,
also from ARM. However, the two are incompatible as the former
expose an affinity in the form of (cluster, core, thread),
while the latter can only deal with (cluster, core). If nothing
is done, the GIC simply cannot route interrupts to the CPUs.

Implement a workaround that shifts the affinity down by a level,
ensuring the delivery of interrupts despite the implementation
mismatch.

Signed-off-by: zhengyan <zhengyan@asrmicro.com>
[maz: rewrote commit message, reimplemented the workaround
 in a manageable way]
Signed-off-by: Marc Zyngier <maz@kernel.org>

Bug: 282025214
Change-Id: Id62a4f45ec52c1de543bbd712879dc34688d7904
(cherry picked from commit b4d81fab1ed0b302c71a869e5b93d81dfbfd3175)
[meitao: Resolved minor conflict in drivers/irqchip/irq-gic-v3.c ]
Signed-off-by: meitaogao <meitaogao@asrmicro.com>
(cherry picked from commit f17cd56e4e4273eef892e424adb030ec8e96b095)
2024-01-26 10:14:07 +00:00
Marc Zyngier
473a871315 BACKPORT: irqchip/gic-v3: Improve affinity helper
The GICv3 driver uses multiple formats for the affinity, all
derived from a reading of MPDR_EL1 on one CPU or another.

Simplify the handling of these affinity by moving the access
to the CPU affinity via cpu_logical_map() inside the helper,
and rename it accordingly.

This will be helpful to support some more broken hardware.

Signed-off-by: Marc Zyngier <maz@kernel.org>

Bug: 282025214
Change-Id: I2e6b9861d20336bec689a2e704b7fc50035841e7
(cherry picked from commit 3c65cbb7c5ebb4247968936899580c7f508ed223)
[meitao: Resolved minor conflict in drivers/irqchip/irq-gic-v3.c ]
Signed-off-by: meitaogao <meitaogao@asrmicro.com>
(cherry picked from commit 035e150e1af7221255b952865aaf80a4c1c6d96d)
2024-01-26 10:14:07 +00:00
Vincent Guittot
6c32acf537 UPSTREAM: sched/fair: Limit sched slice duration
In presence of a lot of small weight tasks like sched_idle tasks, normal
or high weight tasks can see their ideal runtime (sched_slice) to increase
to hundreds ms whereas it normally stays below sysctl_sched_latency.

2 normal tasks running on a CPU will have a max sched_slice of 12ms
(half of the sched_period). This means that they will make progress
every sysctl_sched_latency period.

If we now add 1000 idle tasks on the CPU, the sched_period becomes
3006 ms and the ideal runtime of the normal tasks becomes 609 ms.
It will even become 1500ms if the idle tasks belongs to an idle cgroup.
This means that the scheduler will look for picking another waiting task
after 609ms running time (1500ms respectively). The idle tasks change
significantly the way the 2 normal tasks interleave their running time
slot whereas they should have a small impact.

Such long sched_slice can delay significantly the release of resources
as the tasks can wait hundreds of ms before the next running slot just
because of idle tasks queued on the rq.

Cap the ideal_runtime to sysctl_sched_latency to make sure that tasks will
regularly make progress and will not be significantly impacted by
idle/background tasks queued on the rq.

Bug: 315185352
Bug: 269111781
Change-Id: I27f956ee275d17ef708d8d27dc082c66ed5a5275
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://lore.kernel.org/r/20230113133613.257342-1-vincent.guittot@linaro.org
(cherry picked from commit 79ba1e607d68178db7d3fe4f6a4aa38f06805e7b)
Signed-off-by: Qais Yousef <qyousef@google.com>
(cherry picked from commit e32aeb03b9c6b1b625ff0248b6d5670aa74e783b)
Signed-off-by: Qais Yousef <qyousef@google.com>
2024-01-25 21:17:02 +00:00
Qais Yousef
7088d250bf ANDROID: Update the ABI symbol list
Adding the following symbols:
  - idle_inject_get_duration
  - idle_inject_register
  - idle_inject_set_duration
  - idle_inject_set_latency
  - idle_inject_start
  - idle_inject_stop

Bug: 316903397
Change-Id: I528b90dd34fe0cd2b64b2b615029152d9a3bce60
Signed-off-by: Qais Yousef <qyousef@google.com>
2024-01-25 19:43:25 +00:00
Qais Yousef
c249740414 ANDROID: idle_inject: Export function symbols
To enable out of tree drivers that are based on top of this
functionality.

Bug: 316903397
Change-Id: I96bd84b805b984ebbc3fe0ac4badcd62bb00418b
Signed-off-by: Qais Yousef <qyousef@google.com>
2024-01-25 19:43:25 +00:00
Qais Yousef
990d341477 ANDROID: Update the ABI symbol list
Adding the following symbols:
  - max_load_balance_interval
  - static_key_count

Bug: 269111781
Change-Id: Iebb995e32afbdca06c1634ee75eccbfe579aa16e
Signed-off-by: Qais Yousef <qyousef@google.com>
2024-01-25 19:43:22 +00:00
James Tai
be92a6a1b4 ANDROID: GKI: Remove CONFIG_MEDIA_CEC_RC
This config will cause the 'CtsHdmiCecHostTestCases' test case to fail.
According to the discussion in bug 309377116, it is recommended to remove this config.

Bug: 322143898
Change-Id: Ied37a6c55f4198dbb9dbb9b6c3156a8a7a0bd945
Signed-off-by: James Tai <james.tai@realtek.com>
2024-01-25 18:19:21 +00:00
Wesley Cheng
fa9ac43f16 BACKPORT: usb: host: xhci: Avoid XHCI resume delay if SSUSB device is not present
There is a 120ms delay implemented for allowing the XHCI host controller to
detect a U3 wakeup pulse.  The intention is to wait for the device to retry
the wakeup event if the USB3 PORTSC doesn't reflect the RESUME link status
by the time it is checked.  As per the USB3 specification:

  tU3WakeupRetryDelay ("Table 7-12. LTSSM State Transition Timeouts")

This would allow the XHCI resume sequence to determine if the root hub
needs to be also resumed.  However, in case there is no device connected,
or if there is only a HSUSB device connected, this delay would still affect
the overall resume timing.

Since this delay is solely for detecting U3 wake events (USB3 specific)
then ignore this delay for the disconnected case and the HSUSB connected
only case.

[skip helper function, rename usb3_connected variable -Mathias ]

Signed-off-by: Wesley Cheng <quic_wcheng@quicinc.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20231019102924.2797346-20-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 200589374
(cherry picked from commit 6add6dd345cb754ce18ff992c7264cabf31e59f6 https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git usb-next)
[wcheng: removed the need to check for resume type]
Change-Id: I242a426ab0de40fd77705aaef57d228b8721d701
Signed-off-by: Wesley Cheng <quic_wcheng@quicinc.com>
2024-01-25 10:16:49 +00:00
Todd Kjos
f27fc6ba23 Merge "Merge tag 'android14-6.1.68_r00' into branch 'android14-6.1'" into android14-6.1 2024-01-24 17:34:59 +00:00
Jacky Liu
c96cea1a3c ANDROID: Update the ABI symbol list
Adding the following symbols:
  - gpiod_set_debounce

Bug: 316820336
Change-Id: I5f89b5ac4f52a05d1e29e3ff90abf6506551ef23
Signed-off-by: Jacky Liu <qsliu@google.com>
2024-01-24 17:10:28 +00:00
John Stultz
c2fbc12180 ANDROID: uid_sys_stats: Drop CONFIG_UID_SYS_STATS_DEBUG logic
It was pointed out that since commit b6115e140102 ("ANDROID:
uid_sys_stat: split the global lock uid_lock to the fine-grained locks
for each hlist in hash_table") taking a spin_lock in uid_lock()
causes a scheduling while atomic error if CONFIG_UID_SYS_STATS_DEBUG
is enabled, as get_full_task_comm() takes the mmap_write_lock()
which is a semaphore, breaking the proper ordering.

In the GKI CONFIG_UID_SYS_STATS_DEBUG is disabled, so this went
unnoticed.

The uid_sys_stats logic isn't ever going to go upstream (it depends
on reverting upstream logic) and will hopefully be replaced eventually.
So there's not much reason to drag around this debug logic that is
unused.

So drop it. Less code to schlep forward.

Bug: 320184870
Change-Id: I2cfce79d5a25a3eba11a5509444c07b4642ef2de
Signed-off-by: John Stultz <jstultz@google.com>
2024-01-23 17:07:00 +00:00
Ryan Huang
90bd30bdef ANDROID: Update the ABI symbol list
Adding the following symbols:
  - __traceiter_android_rvh_iommu_alloc_insert_iova
  - __traceiter_android_rvh_iommu_iovad_init_alloc_algo
  - __traceiter_android_rvh_iommu_limit_align_shift
  - __tracepoint_android_rvh_iommu_alloc_insert_iova
  - __tracepoint_android_rvh_iommu_iovad_init_alloc_algo
  - __tracepoint_android_rvh_iommu_limit_align_shift

Bug: 321292231
Change-Id: I06bc89027ffd05c43de2cfce67dc3ca0440bce05
Signed-off-by: Ryan Huang <tzukui@google.com>
2024-01-23 17:05:26 +00:00
Qian-Hao Huang
3280560843 ANDROID: Update the ABI symbol list
Adding the following symbols:
  - regulator_get_voltage
  - send_sig_info

Bug: 321669930
Change-Id: I3cf5e5a7b37b5d1837ab7cbf151b7aabbaced504
Signed-off-by: Qian-Hao Huang <qhhuang@google.com>
2024-01-23 17:00:54 +00:00
Avichal Rakesh
427210e440 UPSTREAM: usb: gadget: uvc: Remove nested locking
When handling error status from uvcg_video_usb_req_queue,
uvc_video_complete currently calls uvcg_queue_cancel with
video->req_lock held. uvcg_queue_cancel internally locks
queue->irqlock, which nests queue->irqlock inside
video->req_lock. This isn't a functional bug at the
moment, but does open up possibilities for ABBA
deadlocks in the future.

This patch fixes the accidental nesting by dropping
video->req_lock before calling uvcg_queue_cancel.

Fixes: 6acba0345b68 ("usb:gadget:uvc Do not use worker thread to pump isoc usb requests")
Signed-off-by: Avichal Rakesh <arakesh@google.com>
Link: https://lore.kernel.org/r/20240104215009.2252452-2-arakesh@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 314338409
(cherry picked from commit 9866dc4314c6c858e451933f965d64532aec00a9)
Change-Id: If25fba6661d55cd972d76068750f3b445c8360aa
Signed-off-by: Avichal Rakesh <arakesh@google.com>
2024-01-23 16:48:53 +00:00
John Stultz
9267e267be ANDROID: uid_sys_stats: Fully initialize uid_entry_tmp value
Amit Pundir at Linaro reported seeing crashes in uid_sys_stats
driver when building with GCC.

Looking into it, it seems the uid_entry_tmp value is used
while only partially initialized, causing potential out of bound
access on the uid_entry io arrays.

This likely has gone unnoticed with clang as I believe we're
using the zero initialization for stack variables security
feature.

So change the logic to fully initialize the uid_entry_tmp
value.

Fixes: f68d4f3c3b53 ("ANDROID: uid_sys_stat: instead update_io_stats_uid_locked to update_io_stats_uid")
Reported-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: John Stultz <jstultz@google.com>
Change-Id: I78de245e80ef60aabec78a615c7ba582ab5a2242
2024-01-23 01:28:49 +00:00
Hailong.Liu
2d3f0c9d41 ANDROID: Roll back some code to fix system_server registers psi trigger failed.
the commit 2c1e89916b
revert part of
https://android-review.googlesource.com/c/kernel/common/+/2199758
causing system_server registers psi trigger failed due to lack of
CAP_SYS_RESOURCE capability.

Bug: 243781242
Bug: 244148051
Signed-off-by: Hailong.Liu <liuhailong@oppo.com>
Change-Id: Ie22ea6f7a7dc848fa8307e6f4e8223779367df31
2024-01-22 23:38:15 +00:00
Avichal Rakesh
bd77c97c76 UPSTREAM: usb: gadget: uvc: Fix use are free during STREAMOFF
There is a path that may lead to freed memory being referenced,
causing kernel panics.

The kernel panic has the following stack trace:

Workqueue: uvcgadget uvcg_video_pump.c51fb85fece46625450f86adbf92c56c.cfi_jt
pstate: 60c00085 (nZCv daIf +PAN +UAO -TCO BTYPE=--)
pc : __list_del_entry_valid+0xc0/0xd4
lr : __list_del_entry_valid+0xc0/0xd4
Call trace:
  __list_del_entry_valid+0xc0/0xd4
  uvc_video_free_request+0x60/0x98
  uvcg_video_pump+0x1cc/0x204
  process_one_work+0x21c/0x4b8
  worker_thread+0x29c/0x574
  kthread+0x158/0x1b0
  ret_from_fork+0x10/0x30

The root cause is that uvcg_video_usb_req_queue frees the uvc_request
if is_enabled is false and returns an error status. video_pump also
frees the associated request if uvcg_video_usb_req_queue returns an
error status, leading to double free and accessing garbage memory.

To fix the issue, this patch removes freeing logic from
uvcg_video_usb_req_queue, and lets the callers to the function handle
queueing errors as they see fit.

Fixes: 6acba0345b68 ("usb:gadget:uvc Do not use worker thread to pump isoc usb requests")
Tested-by: Avichal Rakesh <arakesh@google.com>
Signed-off-by: Avichal Rakesh <arakesh@google.com>
Link: https://lore.kernel.org/r/20240104215009.2252452-1-arakesh@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 314338409
(cherry picked from commit fe814b5b0f3042f1a583734497e726ee53783cc1)
Change-Id: Id13dea3a37e37a79cff3719ced449f0d1902ebd6
Signed-off-by: Avichal Rakesh <arakesh@google.com>
2024-01-22 16:58:33 +00:00
Dylan Chang
21c71a7d0e ANDROID: GKI: Add symbol list for Nothing
Add symbol list for Nothing at the first time

2 function symbol(s) added
  'struct file_system_type* get_fs_type(const char*)'
  'void iterate_supers_type(struct file_system_type*, void(*)(struct super_block*, void*), void*)'

Bug: 321604034
Change-Id: I3cdf16cf21bf04df2c0ab10358e7e7da4e99ccd3
Signed-off-by: Dylan Chang <dylan.chang@nothing.tech>
2024-01-22 03:35:38 +00:00
Qais Yousef
aba5a3fe09 ANDROID: Enable CONFIG_LAZY_RCU in x86 gki_defconfig
It is still disabled by default. Must specify
rcutree.android_enable_rcu_lazy and rcu_nocbs=all in boot time parameter
to actually enable it.

Bug: 258241771
Change-Id: Ic9e15b846d58ffa3d5dd81842c568da79352ff2d
Signed-off-by: Qais Yousef <qyousef@google.com>
2024-01-20 02:45:24 +00:00
Paul Lawrence
204160394a ANDROID: fuse-bpf: Fix the issue of abnormal lseek system calls
fuse_lseek_backing was returning the offset as an int, which would then
be treated as an ERR if in the range 4G-4096 and 4G.

Although the call would appear to work correctly, the file position
would be incorrect according to a subsequent fseek with SEEK_CUR.

Based on a change by chenyuwen <chenyuwen1@meizu.com> who found and
fixed this issue.

Bug: 319219307
Change-Id: I3aef5fb22751a72ce2bd7674ee081956a89fc752
Signed-off-by: chenyuwen <chenyuwen1@meizu.com>
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2024-01-19 17:32:20 +00:00
Zhipeng Wang
947708f1ff ANDROID: ABI: Update symbol list for imx
INFO: 1 function symbol(s) added
  'int v4l2_fwnode_device_parse(struct device*, struct v4l2_fwnode_device_properties*)'

Bug: 320539650

Change-Id: Id75312751f4832f1459387bd11c0583749d4fa4d
Signed-off-by: Zhipeng Wang <zhipeng.wang_1@nxp.com>
2024-01-19 10:38:15 +00:00
Rafael J. Wysocki
7eedea7abf BACKPORT: PM: sleep: Fix possible deadlocks in core system-wide PM code
It is reported that in low-memory situations the system-wide resume core
code deadlocks, because async_schedule_dev() executes its argument
function synchronously if it cannot allocate memory (and not only in
that case) and that function attempts to acquire a mutex that is already
held.  Executing the argument function synchronously from within
dpm_async_fn() may also be problematic for ordering reasons (it may
cause a consumer device's resume callback to be invoked before a
requisite supplier device's one, for example).

Address this by changing the code in question to use
async_schedule_dev_nocall() for scheduling the asynchronous
execution of device suspend and resume functions and to directly
run them synchronously if async_schedule_dev_nocall() returns false.


Link: https://lore.kernel.org/linux-pm/ZYvjiqX6EsL15moe@perf/
Reported-by: Youngmin Nam <youngmin.nam@samsung.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Tested-by: Youngmin Nam <youngmin.nam@samsung.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Cc: 5.7+ <stable@vger.kernel.org> # 5.7+: 6aa09a5bccd8 async: Split async_schedule_node_domain()
Cc: 5.7+ <stable@vger.kernel.org> # 5.7+: 7d4b5d7a37bd async: Introduce async_schedule_dev_nocall()
Cc: 5.7+ <stable@vger.kernel.org> # 5.7+
Bug: 319759660
Change-Id: I1164a6a0b9899ab2f01d5efb413827b9d0983d98
(cherry picked from commit 7839d0078e0d5e6cc2fa0b0dfbee71de74f1e557)
[Youngmin: Resolved minor conflict in drivers/base/power/main.c]
Signed-off-by: Youngmin Nam <youngmin.nam@samsung.com>
2024-01-19 09:02:31 +00:00
Rafael J. Wysocki
e1a20dd9ff UPSTREAM: async: Introduce async_schedule_dev_nocall()
In preparation for subsequent changes, introduce a specialized variant
of async_schedule_dev() that will not invoke the argument function
synchronously when it cannot be scheduled for asynchronous execution.

The new function, async_schedule_dev_nocall(), will be used for fixing
possible deadlocks in the system-wide power management core code.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> for the series.
Tested-by: Youngmin Nam <youngmin.nam@samsung.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Bug: 319759660
Change-Id: I497f1a9655d80c2d9710c3c814f6a99a31bcf019
(cherry picked from commit 7d4b5d7a37bdd63a5a3371b988744b060d5bb86f)
Signed-off-by: Youngmin Nam <youngmin.nam@samsung.com>
2024-01-19 09:02:31 +00:00
Rafael J. Wysocki
e4b0e14f83 UPSTREAM: async: Split async_schedule_node_domain()
In preparation for subsequent changes, split async_schedule_node_domain()
in two pieces so as to allow the bottom part of it to be called from a
somewhat different code path.

No functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Tested-by: Youngmin Nam <youngmin.nam@samsung.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Bug: 319759660
Change-Id: I6405b388d9a0286208b48f7a321b0042d85abb4b
(cherry picked from commit 6aa09a5bccd8e224d917afdb4c278fc66aacde4d)
Signed-off-by: Youngmin Nam <youngmin.nam@samsung.com>
2024-01-19 09:02:31 +00:00
Carlos Galo
6b4c816d17 FROMGIT: BACKPORT: mm: update mark_victim tracepoints fields
The current implementation of the mark_victim tracepoint provides only the
process ID (pid) of the victim process.  This limitation poses challenges
for userspace tools that need additional information about the OOM victim.
The association between pid and the additional data may be lost after the
kill, making it difficult for userspace to correlate the OOM event with
the specific process.

In order to mitigate this limitation, add the following fields:

- UID
   In Android each installed application has a unique UID. Including
   the `uid` assists in correlating OOM events with specific apps.

- Process Name (comm)
   Enables identification of the affected process.

- OOM Score
   Allows userspace to get additional insights of the relative kill
   priority of the OOM victim.

Link: https://lkml.kernel.org/r/20240111210539.636607-1-carlosgalo@google.com
Change-Id: Icc3ed013a9dfff9bb09f1d7588757e6028c17069
Signed-off-by: Carlos Galo <carlosgalo@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit 649ffb4cbb90a7f60f17dd74e57d814e762ea01d mm-unstable)
[ carlosgalo: Manually added struct cred change in mark_oom_victim function ]

Bug: 315560026
Change-Id: I81fb6f3447f432100ad4cd25e22db23768003388
Signed-off-by: Carlos Galo <carlosgalo@google.com>
2024-01-19 00:27:34 +00:00
Qais Yousef
d97ea65296 ANDROID: Enable CONFIG_LAZY_RCU in arm64 gki_defconfig
It is still disabled by default. Must specify
rcutree.android_enable_rcu_lazy and rcu_nocbs=all in boot time parameter
to actually enable it.

Bug: 258241771
Change-Id: I11c920aa5edde2fc42ab54245cd198eb8cb47616
Signed-off-by: Qais Yousef <qyousef@google.com>
2024-01-19 00:10:44 +00:00
Qais Yousef
90d68cedd1 FROMLIST: rcu: Provide a boot time parameter to control lazy RCU
To allow more flexible arrangements while still provide a single kernel
for distros, provide a boot time parameter to enable/disable lazy RCU.

Specify:

	rcutree.enable_rcu_lazy=[y|1|n|0]

Which also requires

	rcu_nocbs=all

at boot time to enable/disable lazy RCU.

To disable it by default at build time when CONFIG_RCU_LAZY=y, the new
CONFIG_RCU_LAZY_DEFAULT_OFF can be used.

Bug: 258241771
Signed-off-by: Qais Yousef (Google) <qyousef@layalina.io>
Tested-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/lkml/20231203011252.233748-1-qyousef@layalina.io/
[Fix trivial conflicts rejecting newer code that doesn't exist on 6.1]
Signed-off-by: Qais Yousef <qyousef@google.com>
Change-Id: Ib5585ae717a2ba7749f2802101b785c4e5de8a90
2024-01-19 00:10:44 +00:00
Joel Fernandes (Google)
a079cc5876 ANDROID: rcu: Add a minimum time for marking boot as completed
On many systems, a great deal of boot (in userspace) happens after the
kernel thinks the boot has completed. It is difficult to determine if
the system has really booted from the kernel side. Some features like
lazy-RCU can risk slowing down boot time if, say, a callback has been
added that the boot synchronously depends on. Further expedited callbacks
can get unexpedited way earlier than it should be, thus slowing down
boot (as shown in the data below).

For these reasons, this commit adds a config option
'CONFIG_RCU_BOOT_END_DELAY' and a boot parameter rcupdate.boot_end_delay.
Userspace can also make RCU's view of the system as booted, by writing the
time in milliseconds to: /sys/module/rcupdate/parameters/rcu_boot_end_delay
Or even just writing a value of 0 to this sysfs node.
However, under no circumstance will the boot be allowed to end earlier
than just before init is launched.

The default value of CONFIG_RCU_BOOT_END_DELAY is chosen as 15s. This
suites ChromeOS and also a PREEMPT_RT system below very well, which need
no config or parameter changes, and just a simple application of this patch. A
system designer can also choose a specific value here to keep RCU from marking
boot completion.  As noted earlier, RCU's perspective of the system as booted
will not be marker until at least rcu_boot_end_delay milliseconds have passed
or an update is made via writing a small value (or 0) in milliseconds to:
/sys/module/rcupdate/parameters/rcu_boot_end_delay.

One side-effect of this patch is, there is a risk that a real-time workload
launched just after the kernel boots will suffer interruptions due to expedited
RCU, which previous ended just before init was launched. However, to mitigate
such an issue (however unlikely), the user should either tune
CONFIG_RCU_BOOT_END_DELAY to a smaller value than 15 seconds or write a value
of 0 to /sys/module/rcupdate/parameters/rcu_boot_end_delay, once userspace
boots, and before launching the real-time workload.

Qiuxu also noted impressive boot-time improvements with earlier version
of patch. An excerpt from the data he shared:

1) Testing environment:
    OS            : CentOS Stream 8 (non-RT OS)
    Kernel     : v6.2
    Machine : Intel Cascade Lake server (2 sockets, each with 44 logical threads)
    Qemu  args  : -cpu host -enable-kvm, -smp 88,threads=2,sockets=2, …

2) OS boot time definition:
    The time from the start of the kernel boot to the shell command line
    prompt is shown from the console. [ Different people may have
    different OS boot time definitions. ]

3) Measurement method (very rough method):
    A timer in the kernel periodically prints the boot time every 100ms.
    As soon as the shell command line prompt is shown from the console,
    we record the boot time printed by the timer, then the printed boot
    time is the OS boot time.

4) Measured OS boot time (in seconds)
   a) Measured 10 times w/o this patch:
        8.7s, 8.4s, 8.6s, 8.2s, 9.0s, 8.7s, 8.8s, 9.3s, 8.8s, 8.3s
        The average OS boot time was: ~8.7s

   b) Measure 10 times w/ this patch:
        8.5s, 8.2s, 7.6s, 8.2s, 8.7s, 8.2s, 7.8s, 8.2s, 9.3s, 8.4s
        The average OS boot time was: ~8.3s.

(CHROMIUM tag rationale: Submitted upstream but got lots of pushback as
it may harm a PREEMPT_RT system -- the concern is VERY theoretical and
this improves things for ChromeOS. Plus we are not a PREEMPT_RT system.
So I am strongly suggesting this mostly simple change for ChromeOS.)

Bug: 258241771
Tested-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4350228
Commit-Queue: Joel Fernandes <joelaf@google.com>
Commit-Queue: Vineeth Pillai <vineethrp@google.com>
Tested-by: Vineeth Pillai <vineethrp@google.com>
Tested-by: Joel Fernandes <joelaf@google.com>
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4909180
Signed-off-by: Qais Yousef <qyousef@google.com>
Change-Id: Ibd262189d7f92dbcc57f1508efe90fcfba95a6cc
2024-01-19 00:10:44 +00:00
Joel Fernandes (Google)
ffe09c06a8 UPSTREAM: rcu: Disable laziness if lazy-tracking says so
During suspend, we see failures to suspend 1 in 300-500 suspends.
Looking closer, it appears that asynchronous RCU callbacks are being
queued as lazy even though synchronous callbacks are expedited. These
delays appear to not be very welcome by the suspend/resume code as
evidenced by these occasional suspend failures.

This commit modifies call_rcu() to check if rcu_async_should_hurry(),
which will return true if we are in suspend or in-kernel boot.

[ paulmck: Alphabetize local variables. ]

Ignoring the lazy hint makes the 3000 suspend/resume cycles pass
reliably on a 12th gen 12-core Intel CPU, and there is some evidence
that it also slightly speeds up boot performance.

Fixes: 3cb278e73be5 ("rcu: Make call_rcu() lazy to save power")
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
(cherry picked from commit cf7066b97e27b2319af1ae2ef6889c4a1704312d)

Bug: 258241771
Signed-off-by: Joel Fernandes <joelaf@google.com>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4909179
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Reviewed-by: Ross Zwisler <zwisler@google.com>
Reviewed-by: Sean Paul <sean@poorly.run>
Signed-off-by: Qais Yousef <qyousef@google.com>
Change-Id: I4cfe6f43de8bae9a6c034831c79d9773199d6d29
2024-01-19 00:10:44 +00:00
Joel Fernandes (Google)
d07488d26e UPSTREAM: rcu: Track laziness during boot and suspend
Boot and suspend/resume should not be slowed down in kernels built with
CONFIG_RCU_LAZY=y.  In particular, suspend can sometimes fail in such
kernels.

This commit therefore adds rcu_async_hurry(), rcu_async_relax(), and
rcu_async_should_hurry() functions that track whether or not either
a boot or a suspend/resume operation is in progress.  This will
enable a later commit to refrain from laziness during those times.

Export rcu_async_should_hurry(), rcu_async_hurry(), and rcu_async_relax()
for later use by rcutorture.

[ paulmck: Apply feedback from Steve Rostedt. ]

Fixes: 3cb278e73be5 ("rcu: Make call_rcu() lazy to save power")
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
(cherry picked from commit 6efdda8bec2900ce5166ee4ff4b1844b47b529cd)

Bug: 258241771
Signed-off-by: Joel Fernandes <joelaf@google.com>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4909178
Reviewed-by: Ross Zwisler <zwisler@google.com>
Reviewed-by: Sean Paul <sean@poorly.run>
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Signed-off-by: Qais Yousef <qyousef@google.com>
Change-Id: Ieb2f2d484a33cfbd71f71c8e3dbcfc05cd7efe8c
2024-01-19 00:10:44 +00:00
Joel Fernandes (Google)
4316bd568b UPSTREAM: net: Use call_rcu_hurry() for dst_release()
In a networking test on ChromeOS, kernels built with the new
CONFIG_RCU_LAZY=y Kconfig option fail a networking test in the teardown
phase.

This failure may be reproduced as follows: ip netns del <name>

The CONFIG_RCU_LAZY=y Kconfig option was introduced by earlier commits
in this series for the benefit of certain battery-powered systems.
This Kconfig option causes call_rcu() to delay its callbacks in order
to batch them.  This means that a given RCU grace period covers more
callbacks, thus reducing the number of grace periods, in turn reducing
the amount of energy consumed, which increases battery lifetime which
can be a very good thing.  This is not a subtle effect: In some important
use cases, the battery lifetime is increased by more than 10%.

This CONFIG_RCU_LAZY=y option is available only for CPUs that offload
callbacks, for example, CPUs mentioned in the rcu_nocbs kernel boot
parameter passed to kernels built with CONFIG_RCU_NOCB_CPU=y.

Delaying callbacks is normally not a problem because most callbacks do
nothing but free memory.  If the system is short on memory, a shrinker
will kick all currently queued lazy callbacks out of their laziness,
thus freeing their memory in short order.  Similarly, the rcu_barrier()
function, which blocks until all currently queued callbacks are invoked,
will also kick lazy callbacks, thus enabling rcu_barrier() to complete
in a timely manner.

However, there are some cases where laziness is not a good option.
For example, synchronize_rcu() invokes call_rcu(), and blocks until
the newly queued callback is invoked.  It would not be a good for
synchronize_rcu() to block for ten seconds, even on an idle system.
Therefore, synchronize_rcu() invokes call_rcu_hurry() instead of
call_rcu().  The arrival of a non-lazy call_rcu_hurry() callback on a
given CPU kicks any lazy callbacks that might be already queued on that
CPU.  After all, if there is going to be a grace period, all callbacks
might as well get full benefit from it.

Yes, this could be done the other way around by creating a
call_rcu_lazy(), but earlier experience with this approach and
feedback at the 2022 Linux Plumbers Conference shifted the approach
to call_rcu() being lazy with call_rcu_hurry() for the few places
where laziness is inappropriate.

Returning to the test failure, use of ftrace showed that this failure
cause caused by the aadded delays due to this new lazy behavior of
call_rcu() in kernels built with CONFIG_RCU_LAZY=y.

Therefore, make dst_release() use call_rcu_hurry() in order to revert
to the old test-failure-free behavior.

[ paulmck: Apply s/call_rcu_flush/call_rcu_hurry/ feedback from Tejun Heo. ]

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: David Ahern <dsahern@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: <netdev@vger.kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
(cherry picked from commit 483c26ff63f42e8898ed43aca0b9953bc91f0cd4)

Bug: 258241771
Signed-off-by: Joel Fernandes <joelaf@google.com>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4909041
Reviewed-by: Sean Paul <sean@poorly.run>
Reviewed-by: Ross Zwisler <zwisler@google.com>
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Signed-off-by: Qais Yousef <qyousef@google.com>
Change-Id: Ifd64083bd210a9dfe94c179152f27d310c179507
2024-01-19 00:10:44 +00:00
Uladzislau Rezki
b9427245f0 UPSTREAM: workqueue: Make queue_rcu_work() use call_rcu_hurry()
Earlier commits in this series allow battery-powered systems to build
their kernels with the default-disabled CONFIG_RCU_LAZY=y Kconfig option.
This Kconfig option causes call_rcu() to delay its callbacks in order
to batch them.  This means that a given RCU grace period covers more
callbacks, thus reducing the number of grace periods, in turn reducing
the amount of energy consumed, which increases battery lifetime which
can be a very good thing.  This is not a subtle effect: In some important
use cases, the battery lifetime is increased by more than 10%.

This CONFIG_RCU_LAZY=y option is available only for CPUs that offload
callbacks, for example, CPUs mentioned in the rcu_nocbs kernel boot
parameter passed to kernels built with CONFIG_RCU_NOCB_CPU=y.

Delaying callbacks is normally not a problem because most callbacks do
nothing but free memory.  If the system is short on memory, a shrinker
will kick all currently queued lazy callbacks out of their laziness,
thus freeing their memory in short order.  Similarly, the rcu_barrier()
function, which blocks until all currently queued callbacks are invoked,
will also kick lazy callbacks, thus enabling rcu_barrier() to complete
in a timely manner.

However, there are some cases where laziness is not a good option.
For example, synchronize_rcu() invokes call_rcu(), and blocks until
the newly queued callback is invoked.  It would not be a good for
synchronize_rcu() to block for ten seconds, even on an idle system.
Therefore, synchronize_rcu() invokes call_rcu_hurry() instead of
call_rcu().  The arrival of a non-lazy call_rcu_hurry() callback on a
given CPU kicks any lazy callbacks that might be already queued on that
CPU.  After all, if there is going to be a grace period, all callbacks
might as well get full benefit from it.

Yes, this could be done the other way around by creating a
call_rcu_lazy(), but earlier experience with this approach and
feedback at the 2022 Linux Plumbers Conference shifted the approach
to call_rcu() being lazy with call_rcu_hurry() for the few places
where laziness is inappropriate.

And another call_rcu() instance that cannot be lazy is the one
in queue_rcu_work(), given that callers to queue_rcu_work() are
not necessarily OK with long delays.

Therefore, make queue_rcu_work() use call_rcu_hurry() in order to revert
to the old behavior.

[ paulmck: Apply s/call_rcu_flush/call_rcu_hurry/ feedback from Tejun Heo. ]

Signed-off-by: Uladzislau Rezki <urezki@gmail.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
(cherry picked from commit a7e30c0e9a5f95b7f74e6272d9c75fd65c897721)

Bug: 258241771
Signed-off-by: Joel Fernandes <joelaf@google.com>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4909040
Reviewed-by: Sean Paul <sean@poorly.run>
Reviewed-by: Ross Zwisler <zwisler@google.com>
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Signed-off-by: Qais Yousef <qyousef@google.com>
Change-Id: I1dd4cedd1fb02626fa47f88a7fbaa7cacfa95d11
2024-01-19 00:10:44 +00:00
Joel Fernandes (Google)
72fdf7f606 UPSTREAM: percpu-refcount: Use call_rcu_hurry() for atomic switch
Earlier commits in this series allow battery-powered systems to build
their kernels with the default-disabled CONFIG_RCU_LAZY=y Kconfig option.
This Kconfig option causes call_rcu() to delay its callbacks in order to
batch callbacks.  This means that a given RCU grace period covers more
callbacks, thus reducing the number of grace periods, in turn reducing
the amount of energy consumed, which increases battery lifetime which
can be a very good thing.  This is not a subtle effect: In some important
use cases, the battery lifetime is increased by more than 10%.

This CONFIG_RCU_LAZY=y option is available only for CPUs that offload
callbacks, for example, CPUs mentioned in the rcu_nocbs kernel boot
parameter passed to kernels built with CONFIG_RCU_NOCB_CPU=y.

Delaying callbacks is normally not a problem because most callbacks do
nothing but free memory.  If the system is short on memory, a shrinker
will kick all currently queued lazy callbacks out of their laziness,
thus freeing their memory in short order.  Similarly, the rcu_barrier()
function, which blocks until all currently queued callbacks are invoked,
will also kick lazy callbacks, thus enabling rcu_barrier() to complete
in a timely manner.

However, there are some cases where laziness is not a good option.
For example, synchronize_rcu() invokes call_rcu(), and blocks until
the newly queued callback is invoked.  It would not be a good for
synchronize_rcu() to block for ten seconds, even on an idle system.
Therefore, synchronize_rcu() invokes call_rcu_hurry() instead of
call_rcu().  The arrival of a non-lazy call_rcu_hurry() callback on a
given CPU kicks any lazy callbacks that might be already queued on that
CPU.  After all, if there is going to be a grace period, all callbacks
might as well get full benefit from it.

Yes, this could be done the other way around by creating a
call_rcu_lazy(), but earlier experience with this approach and
feedback at the 2022 Linux Plumbers Conference shifted the approach
to call_rcu() being lazy with call_rcu_hurry() for the few places
where laziness is inappropriate.

And another call_rcu() instance that cannot be lazy is the one on the
percpu refcounter's "per-CPU to atomic switch" code path, which
uses RCU when switching to atomic mode.  The enqueued callback
wakes up waiters waiting in the percpu_ref_switch_waitq.  Allowing
this callback to be lazy would result in unacceptable slowdowns for
users of per-CPU refcounts, such as blk_pre_runtime_suspend().

Therefore, make __percpu_ref_switch_to_atomic() use call_rcu_hurry()
in order to revert to the old behavior.

[ paulmck: Apply s/call_rcu_flush/call_rcu_hurry/ feedback from Tejun Heo. ]

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: <linux-mm@kvack.org>
(cherry picked from commit 343a72e5e37d380b70534fae3acd7e5e39adb769)

Bug: 258241771
Signed-off-by: Joel Fernandes <joelaf@google.com>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4909039
Reviewed-by: Sean Paul <sean@poorly.run>
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Reviewed-by: Ross Zwisler <zwisler@google.com>
Signed-off-by: Qais Yousef <qyousef@google.com>
Change-Id: Icc325f69d0df1a37b6f1de02a284e1fabf20e366
2024-01-19 00:10:44 +00:00
Dylan Yudaken
ced65a053b UPSTREAM: io_uring: use call_rcu_hurry if signaling an eventfd
io_uring uses call_rcu in the case it needs to signal an eventfd as a
result of an eventfd signal, since recursing eventfd signals are not
allowed. This should be calling the new call_rcu_hurry API to not delay
the signal.

Signed-off-by: Dylan Yudaken <dylany@meta.com>

Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Acked-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Link: https://lore.kernel.org/r/20221215184138.795576-1-dylany@meta.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry picked from commit 44a84da45272b3f4beb90025a64cfbde18f1aef0)

Bug: 258241771
Signed-off-by: Joel Fernandes <joelaf@google.com>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4909038
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Reviewed-by: Ross Zwisler <zwisler@google.com>
Reviewed-by: Sean Paul <sean@poorly.run>
Signed-off-by: Qais Yousef <qyousef@google.com>
Change-Id: Iec189c9ce0a95ccacda81f58bf7d49a575a6ab3f
2024-01-19 00:10:44 +00:00
Paul E. McKenney
84c8157d06 UPSTREAM: rcu: Update synchronize_rcu_mult() comment for call_rcu_hurry()
Those who have worked with RCU for some time will naturally think in
terms of the long-standing call_rcu() API rather than the much newer
call_rcu_hurry() API.  But it is call_rcu_hurry() that you should normally
pass to synchronize_rcu_mult().  This commit therefore updates the header
comment to point this out.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
(cherry picked from commit 6716f4d39c17febf7aa4fa5f5923da67a8d10e85)

Bug: 258241771
Signed-off-by: Joel Fernandes <joelaf@google.com>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4909037
Reviewed-by: Ross Zwisler <zwisler@google.com>
Reviewed-by: Sean Paul <sean@poorly.run>
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Signed-off-by: Qais Yousef <qyousef@google.com>
Change-Id: I0d701825ddd7e15cebd92190388fbf78c04d26fb
2024-01-19 00:10:44 +00:00
Uladzislau Rezki
3751416eeb UPSTREAM: scsi/scsi_error: Use call_rcu_hurry() instead of call_rcu()
Earlier commits in this series allow battery-powered systems to build
their kernels with the default-disabled CONFIG_RCU_LAZY=y Kconfig option.
This Kconfig option causes call_rcu() to delay its callbacks in order
to batch them.  This means that a given RCU grace period covers more
callbacks, thus reducing the number of grace periods, in turn reducing
the amount of energy consumed, which increases battery lifetime which
can be a very good thing.  This is not a subtle effect: In some important
use cases, the battery lifetime is increased by more than 10%.

This CONFIG_RCU_LAZY=y option is available only for CPUs that offload
callbacks, for example, CPUs mentioned in the rcu_nocbs kernel boot
parameter passed to kernels built with CONFIG_RCU_NOCB_CPU=y.

Delaying callbacks is normally not a problem because most callbacks do
nothing but free memory.  If the system is short on memory, a shrinker
will kick all currently queued lazy callbacks out of their laziness,
thus freeing their memory in short order.  Similarly, the rcu_barrier()
function, which blocks until all currently queued callbacks are invoked,
will also kick lazy callbacks, thus enabling rcu_barrier() to complete
in a timely manner.

However, there are some cases where laziness is not a good option.
For example, synchronize_rcu() invokes call_rcu(), and blocks until
the newly queued callback is invoked.  It would not be a good for
synchronize_rcu() to block for ten seconds, even on an idle system.
Therefore, synchronize_rcu() invokes call_rcu_hurry() instead of
call_rcu().  The arrival of a non-lazy call_rcu_hurry() callback on a
given CPU kicks any lazy callbacks that might be already queued on that
CPU.  After all, if there is going to be a grace period, all callbacks
might as well get full benefit from it.

Yes, this could be done the other way around by creating a
call_rcu_lazy(), but earlier experience with this approach and
feedback at the 2022 Linux Plumbers Conference shifted the approach
to call_rcu() being lazy with call_rcu_hurry() for the few places
where laziness is inappropriate.

And another call_rcu() instance that cannot be lazy is the one in the
scsi_eh_scmd_add() function.  Leaving this instance lazy results in
unacceptably slow boot times.

Therefore, make scsi_eh_scmd_add() use call_rcu_hurry() in order to
revert to the old behavior.

[ paulmck: Apply s/call_rcu_flush/call_rcu_hurry/ feedback from Tejun Heo. ]

Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Uladzislau Rezki <urezki@gmail.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: <linux-scsi@vger.kernel.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
(cherry picked from commit 54d87b0a0c19bc3f740e4cd4b87ba14ce2e4ea73)

Bug: 258241771
Signed-off-by: Joel Fernandes <joelaf@google.com>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4909036
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Reviewed-by: Sean Paul <sean@poorly.run>
Signed-off-by: Qais Yousef <qyousef@google.com>
Change-Id: I95bba865e582b0a12b1c09ba1f0bd4f897401c07
2024-01-19 00:10:44 +00:00
Joel Fernandes (Google)
52193e9489 UPSTREAM: rcu/rcutorture: Use call_rcu_hurry() where needed
call_rcu() changes to save power will change the behavior of rcutorture
tests. Use the call_rcu_hurry() API instead which reverts to the old
behavior.

[ paulmck: Apply s/call_rcu_flush/call_rcu_hurry/ feedback from Tejun Heo. ]

Reported-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
(cherry picked from commit 405d8e91f0a99777d61f6b0ddc3484d8ea7ca393)

Bug: 258241771
Signed-off-by: Joel Fernandes <joelaf@google.com>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4909035
Reviewed-by: Sean Paul <sean@poorly.run>
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Signed-off-by: Qais Yousef <qyousef@google.com>
Change-Id: I8008990dfe7e64f511aada006e736b15cdd0d61e
2024-01-19 00:10:44 +00:00
Joel Fernandes (Google)
83f8ba569f UPSTREAM: rcu/rcuscale: Use call_rcu_hurry() for async reader test
rcuscale uses call_rcu() to queue async readers. With recent changes to
save power, the test will have fewer async readers in flight. Use the
call_rcu_hurry() API instead to revert to the old behavior.

[ paulmck: Apply s/call_rcu_flush/call_rcu_hurry/ feedback from Tejun Heo. ]

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
(cherry picked from commit 723df859d8bba948ff2eb08eba32ab433acf7c9c)

Bug: 258241771
Signed-off-by: Joel Fernandes <joelaf@google.com>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4909034
Reviewed-by: Sean Paul <sean@poorly.run>
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Signed-off-by: Qais Yousef <qyousef@google.com>
Change-Id: I680dacb44e81e210e2e4455f28e50b9b516222a8
2024-01-19 00:10:44 +00:00
Joel Fernandes (Google)
9b625f4978 UPSTREAM: rcu/sync: Use call_rcu_hurry() instead of call_rcu
call_rcu() changes to save power will slow down rcu sync. Use the
call_rcu_hurry() API instead which reverts to the old behavior.

[ paulmck: Apply s/call_rcu_flush/call_rcu_hurry/ feedback from Tejun Heo. ]

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
(cherry picked from commit 7651d6b25086656eacfdd8356bfe3a21c0c2d79d)

Bug: 258241771
Signed-off-by: Joel Fernandes <joelaf@google.com>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4909033
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Reviewed-by: Sean Paul <sean@poorly.run>
Signed-off-by: Qais Yousef <qyousef@google.com>
Change-Id: I5123ba52f47676305dbcfa1233bf3b41f140766c
2024-01-19 00:10:44 +00:00
Vineeth Pillai
c570c8fea3 BACKPORT: rcu: Shrinker for lazy rcu
The shrinker is used to speed up the free'ing of memory potentially held
by RCU lazy callbacks. RCU kernel module test cases show this to be
effective. Test is introduced in a later patch.

Signed-off-by: Vineeth Pillai <vineeth@bitbyteword.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
(cherry picked from commit c945b4da7a448a9a56becc5a8745d942b2b83d3c)

Conflicts:
   kernel/rcu/tree_nocb.h

Trivial conflict due to: "rcu/nocb: Add an option to offload all CPUs on boot"

Bug: 258241771
Signed-off-by: Joel Fernandes <joelaf@google.com>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4909032
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Signed-off-by: Qais Yousef <qyousef@google.com>
Change-Id: I6a73a9dae79ff35feca37abe2663e55a0f46dda8
2024-01-19 00:10:44 +00:00
Joel Fernandes (Google)
4957579439 UPSTREAM: rcu: Refactor code a bit in rcu_nocb_do_flush_bypass()
This consolidates the code a bit and makes it cleaner. Functionally it
is the same.

Reported-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
(cherry picked from commit 3d222a0c0cfef85bad2c9cff5d541836cb81cfbd)

Bug: 258241771
Signed-off-by: Joel Fernandes <joelaf@google.com>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4909031
Reviewed-by: Sean Paul <sean@poorly.run>
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Signed-off-by: Qais Yousef <qyousef@google.com>
Change-Id: I8422c7138edd6a476fc46374beefdf46dd76b8b0
2024-01-19 00:10:44 +00:00
Joel Fernandes (Google)
66a832fe38 UPSTREAM: rcu: Make call_rcu() lazy to save power
Implement timer-based RCU callback batching (also known as lazy
callbacks). With this we save about 5-10% of power consumed due
to RCU requests that happen when system is lightly loaded or idle.

By default, all async callbacks (queued via call_rcu) are marked
lazy. An alternate API call_rcu_hurry() is provided for the few users,
for example synchronize_rcu(), that need the old behavior.

The batch is flushed whenever a certain amount of time has passed, or
the batch on a particular CPU grows too big. Also memory pressure will
flush it in a future patch.

To handle several corner cases automagically (such as rcu_barrier() and
hotplug), we re-use bypass lists which were originally introduced to
address lock contention, to handle lazy CBs as well. The bypass list
length has the lazy CB length included in it. A separate lazy CB length
counter is also introduced to keep track of the number of lazy CBs.

[ paulmck: Fix formatting of inline call_rcu_lazy() definition. ]
[ paulmck: Apply Zqiang feedback. ]
[ paulmck: Apply s/call_rcu_flush/call_rcu_hurry/ feedback from Tejun Heo. ]

Suggested-by: Paul McKenney <paulmck@kernel.org>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
(cherry picked from commit 3cb278e73be58bfb780ecd55129296d2f74c1fb7)

Bug: 258241771
Signed-off-by: Joel Fernandes <joelaf@google.com>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4909030
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Signed-off-by: Qais Yousef <qyousef@google.com>
Change-Id: I557d5af2a5d317bd66e9ec55ed40822bb5c54390
2024-01-19 00:10:44 +00:00
Frederic Weisbecker
4fb09fb4f7 UPSTREAM: rcu: Fix missing nocb gp wake on rcu_barrier()
In preparation for RCU lazy changes, wake up the RCU nocb gp thread if
needed after an entrain.  This change prevents the RCU barrier callback
from waiting in the queue for several seconds before the lazy callbacks
in front of it are serviced.

Reported-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
(cherry picked from commit b8f7aca3f0e0e6223094ba2662bac90353674b04)

Bug: 258241771
Signed-off-by: Joel Fernandes <joelaf@google.com>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4909029
Reviewed-by: Sean Paul <sean@poorly.run>
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Signed-off-by: Qais Yousef <qyousef@google.com>
Change-Id: Ib55c5886764b74df22531eca35f076ef7acc08dd
2024-01-19 00:10:44 +00:00
Joel Fernandes (Google)
64c59ad2c3 UPSTREAM: rcu: Fix late wakeup when flush of bypass cblist happens
When the bypass cblist gets too big or its timeout has occurred, it is
flushed into the main cblist. However, the bypass timer is still running
and the behavior is that it would eventually expire and wake the GP
thread.

Since we are going to use the bypass cblist for lazy CBs, do the wakeup
soon as the flush for "too big or too long" bypass list happens.
Otherwise, long delays can happen for callbacks which get promoted from
lazy to non-lazy.

This is a good thing to do anyway (regardless of future lazy patches),
since it makes the behavior consistent with behavior of other code paths
where flushing into the ->cblist makes the GP kthread into a
non-sleeping state quickly.

[ Frederic Weisbecker: Changes to avoid unnecessary GP-thread wakeups plus
		    comment changes. ]

Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
(cherry picked from commit b50606f35f4b73c8e4c6b9c64fe7ba72ea919134)

Bug: 258241771
Signed-off-by: Joel Fernandes <joelaf@google.com>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4909028
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Reviewed-by: Sean Paul <sean@poorly.run>
Signed-off-by: Qais Yousef <qyousef@google.com>
Change-Id: If8da96d7ba6ed90a2a70f7d56f7bb03af44fd649
2024-01-19 00:10:44 +00:00