Maintain statistics per cgroup and export these to user space. These
statistics are essential for verifying whether the proper I/O priorities
have been assigned to requests. An example of the statistics data with
this patch applied:
$ cat /sys/fs/cgroup/io.stat
11:2 rbytes=0 wbytes=0 rios=3 wios=0 dbytes=0 dios=0 [NONE] dispatched=0 inserted=0 merged=171 [RT] dispatched=0 inserted=0 merged=0 [BE] dispatched=0 inserted=0 merged=0 [IDLE] dispatched=0 inserted=0 merged=0
8:32 rbytes=2142720 wbytes=0 rios=105 wios=0 dbytes=0 dios=0 [NONE] dispatched=0 inserted=0 merged=171 [RT] dispatched=0 inserted=0 merged=0 [BE] dispatched=0 inserted=0 merged=0 [IDLE] dispatched=0 inserted=0 merged=0
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
BUG: 187357408
Change-Id: I8d976c62ba2c0397cbb18076f3e61d5ab246cbcf
(cherry picked from commit f5dc926252cb31739809f7d27a8cbc9941b4d36d git://git.kernel.dk/linux-block/ for-5.14/block)
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Track I/O statistics per I/O priority and export these statistics to
debugfs. These statistics help developers of the deadline scheduler.
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
BUG: 187357408
Change-Id: I8e91693dc1d015060737fa2fc15f5f2ebee2530c
(cherry picked from commit 9dc236caf2518c1e434be7a4f8fae60fb0be506a git://git.kernel.dk/linux-block/ for-5.14/block)
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Maintain one dispatch list and one FIFO list per I/O priority class: RT, BE
and IDLE. Maintain statistics for each priority level. Split the debugfs
attributes per priority level as follows:
$ ls /sys/kernel/debug/block/.../sched/
async_depth dispatch2 read_next_rq write2_fifo_list
batching read0_fifo_list starved write_next_rq
dispatch0 read1_fifo_list write0_fifo_list
dispatch1 read2_fifo_list write1_fifo_list
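A rough sketch of the data structures this implies (illustrative only; the
field and constant names are assumptions based on the description above, not
necessarily what the patch uses):

enum dd_prio { DD_RT_PRIO = 0, DD_BE_PRIO = 1, DD_IDLE_PRIO = 2 };
#define DD_PRIO_COUNT 3

struct dd_per_prio {
	struct list_head dispatch;     /* per-priority dispatch list */
	struct list_head fifo_list[2]; /* one FIFO list per data direction */
	struct request *next_rq[2];    /* cached next request per direction */
};

struct deadline_data {
	struct dd_per_prio per_prio[DD_PRIO_COUNT];
	/* remaining fields unchanged */
};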
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
BUG: 187357408
Change-Id: I60451cfdb416ad27601dc3ffb4eb307fa6ff783f
(cherry picked from commit 5b701a6e040ff8626ecf29ac06de9689efc00754 git://git.kernel.dk/linux-block/ for-5.14/block)
Signed-off-by: Bart Van Assche <bvanassche@google.com>
When dispatching the first request of a batch, the deadline_move_request()
call clears .next_rq[] for the opposite data direction. .next_rq[] is not
restored when changing data direction. Fix this by not clearing .next_rq[]
and by keeping track of the data direction of a batch in a variable instead.
This patch is a micro-optimization because:
- The number of deadline_next_request() calls for the read direction is
halved.
- The number of times that deadline_next_request() returns NULL is reduced.
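A minimal sketch of the idea (the last_dir field name is illustrative; the
actual patch may differ):

struct deadline_data {
	/* existing fields */
	struct request *next_rq[2];
	unsigned int batching;     /* number of sequential requests made */
	enum dd_data_dir last_dir; /* data direction of the current batch */
};

static void
deadline_move_request(struct deadline_data *dd, struct request *rq)
{
	const enum dd_data_dir data_dir = rq_data_dir(rq);

	dd->next_rq[data_dir] = deadline_latter_request(rq);
	/* Previously both dd->next_rq[READ] and dd->next_rq[WRITE] were
	 * cleared here. Now dd->last_dir, set when a batch is started,
	 * tracks the batch direction, so the cached next request of the
	 * other direction survives a direction change. */
	deadline_remove_request(rq->q, rq);
}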
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
BUG: 187357408
Change-Id: I582e99603a5443d75cf2b18a5daa2c93b5c66de3
(cherry picked from commit ea0fd2a525436ab5b9ada0f1953b0c0a29357311 git://git.kernel.dk/linux-block/ for-5.14/block)
Signed-off-by: Bart Van Assche <bvanassche@google.com>
For interactive workloads it is important that synchronous requests are
not delayed. Hence reserve 25% of scheduler tags for synchronous requests.
This patch still allows asynchronous requests to fill the hardware queues
since blk_mq_init_sched() makes sure that the number of scheduler requests
is double the hardware queue depth. From blk_mq_init_sched():
q->nr_requests = 2 * min_t(unsigned int, q->tag_set->queue_depth,
BLKDEV_MAX_RQ);
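A sketch of how such a reservation can be wired up through the elevator's
limit_depth hook (illustrative; the async_depth field and its initialization
are assumptions based on the description above):

/* Called before a scheduler tag is allocated for a request. */
static void dd_limit_depth(unsigned int op, struct blk_mq_alloc_data *data)
{
	struct deadline_data *dd = data->q->elevator->elevator_data;

	/* Synchronous reads may use the full scheduler tag depth. */
	if (op_is_sync(op) && !op_is_write(op))
		return;

	/* Asynchronous requests are limited to roughly 75% of the scheduler
	 * tags, e.g. dd->async_depth = max(1UL, 3 * q->nr_requests / 4),
	 * reserving the remaining ~25% for synchronous requests. */
	data->shallow_depth = dd->async_depth;
}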
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
BUG: 187357408
Change-Id: Ib9cd753a39c8e5f5c45908001d69334130ef2067
(cherry picked from commit c970bc8292aaaf6f2d333d612e657df3a99f417c git://git.kernel.dk/linux-block/ for-5.14/block)
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Define separate macros for integers and jiffies to improve readability.
Use sysfs_emit() and kstrtoint() instead of sprintf() and simple_strtol().
The former are the recommended functions.
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
BUG: 187357408
Change-Id: I4e0fd35124cd0319fcace0d1d5e3c113b60a213c
(cherry picked from commit d9baee13f8cf66a8fac9ec67fdb85ce419fcce3a git://git.kernel.dk/linux-block/ for-5.14/block)
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Modern compilers complain if an out-of-range value is passed to a function
argument that has an enumeration type. Let the compiler detect out-of-range
data direction arguments instead of verifying the data_dir argument at
runtime.
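A minimal sketch of what using an enumeration type for the data direction
looks like (names follow the existing read/write convention and should be
treated as illustrative):

enum dd_data_dir {
	DD_READ  = READ,  /* 0 */
	DD_WRITE = WRITE, /* 1 */
};

/* The compiler now warns if anything other than DD_READ or DD_WRITE is
 * passed, so no runtime check of data_dir is needed. */
static struct request *
deadline_next_request(struct deadline_data *dd, enum dd_data_dir data_dir)
{
	return dd->next_rq[data_dir];
}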
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
BUG: 187357408
Change-Id: I4ad8c106a86d17f3010e12e172702e77eca61e80
(cherry picked from commit d9baee13f8cf66a8fac9ec67fdb85ce419fcce3a git://git.kernel.dk/linux-block/ for-5.14/block)
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Change "queue" into "sched" to make the function names reflect better the
purpose of these functions.
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
BUG: 187357408
Change-Id: I30825b379146dbaef4ff3f85148b2e788667a77c
(cherry picked from commit a6e57fe5ab09c250fc741294e6321270a4364fec git://git.kernel.dk/linux-block/ for-5.14/block)
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Make __dd_dispatch_request() easier to read by removing two local
variables.
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
BUG: 187357408
Change-Id: I5567f7d02a2c628efb437058a1c103c7b123747a
(cherry picked from commit f005b6ff19d2a961a2c3ae9c5f49d48fda143469 git://git.kernel.dk/linux-block/ for-5.14/block)
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Document the locking strategy by adding two lockdep_assert_held()
statements.
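For illustration, the pattern looks roughly like this (a sketch; the patch
itself determines which functions receive the assertions):

static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
			      bool at_head)
{
	struct request_queue *q = hctx->queue;
	struct deadline_data *dd = q->elevator->elevator_data;

	/* Document the locking strategy: the caller must hold dd->lock. */
	lockdep_assert_held(&dd->lock);

	/* existing insert logic is unchanged */
}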
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
BUG: 187357408
Change-Id: Ie8cf0b0ae208c9cc87731a9c6d7df5e5e59332d5
(cherry picked from commit 91831ddfd7c6e3df9857526a76cfa88673ec0637 git://git.kernel.dk/linux-block/ for-5.14/block)
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Make the code easier to read by adding more comments.
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
BUG: 187357408
Change-Id: If62eb600614d2883d72ee3bd7e7859ae66b24512
(cherry picked from commit 16c3afdb127bbff7d3552e076e568281765674b7 git://git.kernel.dk/linux-block/ for-5.14/block)
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Introduce an rq-qos policy that assigns an I/O priority to requests based
on blk-cgroup configuration settings. This policy has the following
advantages over the ioprio_set() system call:
- This policy is cgroup based so it has all the advantages of cgroups.
- While ioprio_set() does not affect page cache writeback I/O, this rq-qos
controller affects page cache writeback I/O for filesystems that support
associating a cgroup with writeback I/O. See also
Documentation/admin-guide/cgroup-v2.rst.
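Conceptually, the policy boils down to stamping each bio with the priority
class configured for its cgroup. A rough sketch of such an rq-qos track hook
(blkcg_ioprio_class() is a hypothetical helper standing in for the per-cgroup
configuration lookup):

static void ioprio_blkcg_track(struct rq_qos *rqos, struct request *rq,
			       struct bio *bio)
{
	/* hypothetical helper returning the class configured for this cgroup */
	int prio_class = blkcg_ioprio_class(bio->bi_blkg);

	if (prio_class != IOPRIO_CLASS_NONE)
		bio->bi_ioprio = IOPRIO_PRIO_VALUE(prio_class, 0);
}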
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
BUG: 187357408
Change-Id: If51e608ad37ee7a3f57b507bb17900dcfcb263ed
(cherry picked from commit ee9d2a55c960f152b5710078bbe399a4c51eb0a9 git://git.kernel.dk/linux-block/ for-5.14/block)
Signed-off-by: Bart Van Assche <bvanassche@google.com>
rq_qos_id_to_name() is only used in blk-mq-debugfs.c, so move that function
into blk-mq-debugfs.c.
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
BUG: 187357408
Change-Id: If03083a13917bc2f88b6df7151e033a11ab1bc50
(cherry picked from commit f1a7f539c2720906fb10be0af3514b034e1a9fee git://git.kernel.dk/linux-block/ for-5.14/block)
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Before adding more calls in this function, simplify the error path.
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
BUG: 187357408
Change-Id: I8568b87d1bebbd3841e42a79b7efe2d0a1bff2bc
(cherry picked from commit f1a7f539c2720906fb10be0af3514b034e1a9fee git://git.kernel.dk/linux-block/ for-5.14/block)
Signed-off-by: Bart Van Assche <bvanassche@google.com>
These entries were consecutive at the time of their introduction but are no
longer consecutive. Make them consecutive again. Additionally, modify the
help text since it refers to blk-mq and since the legacy block layer has
been removed.
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
BUG: 187357408
Change-Id: I568383377a3244efba9748adf0a2e90bd7660bb2
(cherry picked from commit fdc250ea26e44066d690bbe65a03fab512af0699 git://git.kernel.dk/linux-block/ for-5.14/block)
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Since commit 01e99aeca3 ("blk-mq: insert passthrough request into
hctx->dispatch directly"), passthrough requests should not appear in the
IO scheduler any more, so the blk_rq_is_passthrough() checks in the add-on
IO schedulers are redundant.
(Notes: this patch passes a generic IO load test with HDDs behind a SAS
controller and HDDs behind an AHCI controller, but obviously does not cover
everything. It is not certain whether passthrough requests can still escape
into the IO scheduler via blk_mq_sched_insert_requests(), which is used by
blk_mq_flush_plug_list() and has lots of indirect callers.)
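The gist of the change in, for example, mq-deadline's insert path, sketched
from the description above (simplified; the actual diff may differ):

	/* before */
	if (at_head || blk_rq_is_passthrough(rq)) {
		list_add(&rq->queuelist, &dd->dispatch);
		return;
	}

	/* after: passthrough requests can no longer reach the scheduler,
	 * so only the at_head case needs the dispatch-list shortcut */
	if (at_head) {
		list_add(&rq->queuelist, &dd->dispatch);
		return;
	}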
Signed-off-by: Lin Feng <linf@wangsu.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
BUG: 187357408
Change-Id: I97d85c38e584add44399295f3839994b694bc9ca
(cherry picked from commit 0856faaa220759a4fe4334f5c57a8661c94c14ce git://git.kernel.dk/linux-block/ for-5.14/block)
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Currently, when a non-mq-aware IO scheduler (BFQ, mq-deadline) is used for
a queue with multiple HW queues, the performance is rather bad. The
problem is that these IO schedulers use queue-wide locking and their
dispatch function does not respect the hctx it is passed in and returns
any request it finds appropriate. Thus locality of request access is
broken and dispatch from multiple CPUs just contends on IO scheduler
locks. For these IO schedulers there's little point in dispatching from
multiple CPUs. Instead, always dispatch from only a single CPU to limit
contention.
Below is a comparison of dbench runs on XFS filesystem where the storage
is a raid card with 64 HW queues and to it attached a single rotating
disk. BFQ is used as IO scheduler:
              clients           MQ                    SQ             MQ-Patched
Amean               1       39.12 (0.00%)      43.29 * -10.67%*      36.09 *   7.74%*
Amean               2      128.58 (0.00%)     101.30 *  21.22%*      96.14 *  25.23%*
Amean               4      577.42 (0.00%)     494.47 *  14.37%*     508.49 *  11.94%*
Amean               8      610.95 (0.00%)     363.86 *  40.44%*     362.12 *  40.73%*
Amean              16      391.78 (0.00%)     261.49 *  33.25%*     282.94 *  27.78%*
Amean              32      324.64 (0.00%)     267.71 *  17.54%*     233.00 *  28.23%*
Amean              64      295.04 (0.00%)     253.02 *  14.24%*     242.37 *  17.85%*
Amean             512    10281.61 (0.00%)   10211.16 *   0.69%*   10447.53 *  -1.61%*
Numbers are times so lower is better. MQ is stock 5.10-rc6 kernel. SQ is
the same kernel with megaraid_sas.host_tagset_enable=0 so that the card
advertises just a single HW queue. MQ-Patched is a kernel with this
patch applied.
You can see multiple hardware queues heavily hurt performance in
combination with BFQ. The patch restores the performance.
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
BUG: 187357408
Change-Id: I53645eb48cb308cd3af81a1c5e718a6abec6a1f9
(cherry picked from commit fa56cac78af68bd93734c290a0ffd0716e871dba git://git.kernel.dk/linux-block/ for-5.14/block)
Signed-off-by: Bart Van Assche <bvanassche@google.com>
This reverts commit b445547ec1.
Since both mq-deadline and BFQ completely ignore the hctx they are passed to
their dispatch function and dispatch whatever request they deem fit, checking
whether any request for a particular hctx is queued is just pointless since
we'll very likely get a request from a different hctx anyway. In the following
commit we'll deal with lock contention in these IO schedulers in the presence
of multiple HW queues in a different way.
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Change-Id: Ibd7dbe69ae1799f2efce5788986e2f1aad88f66d
BUG: 187357408
(cherry picked from commit 2490aeca0081bb168e96fb7b1746d676be84369f git://git.kernel.dk/linux-block/ for-5.14/block)
Signed-off-by: Bart Van Assche <bvanassche@google.com>
This patch adds back the host_lock that previously existed around
ufshcd_vops_setup_xfer_req().
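A sketch of the change, assuming the current vops signature (the wrapper name
is hypothetical; the exact context in ufshcd_send_command() may differ):

static inline void ufshcd_setup_xfer_req_locked(struct ufs_hba *hba,
						int task_tag, bool is_scsi)
{
	unsigned long flags;

	spin_lock_irqsave(hba->host->host_lock, flags);
	ufshcd_vops_setup_xfer_req(hba, task_tag, is_scsi);
	spin_unlock_irqrestore(hba->host->host_lock, flags);
}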
Bug: 190637035
Cc: Stanley Chu <stanley.chu@mediatek.com>
Cc: Can Guo <cang@codeaurora.org>
Cc: Bean Huo <beanhuo@micron.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Asutosh Das <asutoshd@codeaurora.org>
Link: https://lore.kernel.org/linux-scsi/20210701005117.3846179-1-jaegeuk@kernel.org/T/#u
Fixes: 7613068f95 ("BACKPORT: FROMGIT: scsi: ufs: Optimize host lock on transfer requests send/compl paths")
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
Change-Id: I0e5f9ec11fa62a074bca5feb5638e8d04cf858ee
This reverts commit 83d653257a.
We need to go back to the upstream version with the right fix.
Bug: 192088222
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
Change-Id: I7a52e161e5c82a13304fb5ba96bb6a5c6dacd06a
This reverts commit 46575badbb.
We need to go back to the upstream version with the right fix.
Bug: 192095860
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
Change-Id: I3dd1eb638bb3a95b3c8d40673f0821afdeb74f96
This reverts commit 850f11aa85.
We need to go back to the upstream version with the right fix.
Bug: 192095860
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
Change-Id: I26bf924125f06e97c1262578c99a2dbb58394235
Export these tracepoint functions to track USB data flow
for performance tuning.
Bug: 192512300
Signed-off-by: chihhao.chen <chihhao.chen@mediatek.com>
Change-Id: I37ae07e87b5b2d0fb24c1e0a2e83954ceb4aa4f9
Pass the track action type to save_track_hash(); otherwise we do not know whether it is an allocation stack or a free stack.
Fixes: 8bc6337823 ("ANDROID: vendor_hooks: add hooks for slab memory leak debugging")
Bug: 184928480
Signed-off-by: Liujie Xie <xieliujie@oppo.com>
Change-Id: I74c50c02cfb4ebbf3e9fecdf125e76946ff4e7d1
Update ufs tracepoint symbol list for QCOM.
Bug: 191951106
Signed-off-by: Asutosh Das <asutoshd@codeaurora.org>
Change-Id: Ia95f3bc6d02775fb435e5fd854e355838e8500b1
For memory analysis, we need to know the total memory consumption of the
dma-buf heaps. Currently, other modules cannot get the defer-free list size.
Export get_freelist_nr_pages() so that other modules can get the total size
of the defer-free list.
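A minimal sketch of the export, assuming the helper keeps its existing
signature and lives in the deferred-free helper of the dma-buf heaps (the
file paths are assumptions):

/* deferred-free-helper.h */
long get_freelist_nr_pages(void);

/* deferred-free-helper.c */
EXPORT_SYMBOL_GPL(get_freelist_nr_pages);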
Bug: 192041645
Change-Id: Icaa1b98e9ab7e330141a92ad147a4e2150c2534b
Signed-off-by: Guangming Cao <Guangming.Cao@mediatek.com>
This patch is based on 1699785. It extends the related interfaces
to support request-based operations. We use extension fields in the
parameters of VIDIOC_SUBDEV_S_SELECTION, VIDIOC_SUBDEV_S_FMT and
VIDIOC_SUBDEV_S_FRAME_INTERVAL as the request fd.
The driver uses media_request_get_by_fd() to retrieve the media request and
saves the pending change in it, so that the pending change can then be
applied in the req_queue() callback.
Bug: 191903073
CR-Id:
Signed-off-by: Louis Kuo <louis.kuo@mediatek.com>
Change-Id: Idb7921724cf8febc44b01880a4ad8b7c9272ba6a
When CONFIG_TRACEPOINTS or CONFIG_ANDROID_VENDOR_HOOKS is not set, there
is a build error after commit 01f2392e13 ("ANDROID: logbuf: Add new
logbuf vendor hook to support pr_cont()"):
kernel/printk/printk.c:1962:4: error: implicit declaration of function
'trace_android_vh_logbuf_pr_cont'
[-Werror,-Wimplicit-function-declaration]
trace_android_vh_logbuf_pr_cont(&r, text_len);
^
1 error generated.
Remove the #if directive so that this code always builds properly, which
is possible after commit ba75b92fef ("ANDROID: simplify vendor hooks
for non-GKI builds").
Change-Id: Icc7f55af1becab5a8833b0651402845559b6b56f
Fixes: 01f2392e13 ("ANDROID: logbuf: Add new logbuf vendor hook to support pr_cont()")
Link: https://github.com/ClangBuiltLinux/continuous-integration2/runs/2948254099
Suggested-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Allow io-coherent devices to use an inner writeback read/write allocate,
outer writeback read allocate, no-write allocate cache policy. The outer
cache policy affects the behavior of a system cache, at least on qcom
boards which have one.
The rationale follows that of IOMMU_SYS_CACHE_ONLY_NWA. Certain GPU
use cases perform better when using a no-write-allocate policy.
Rename the IOMMU_SYS_CACHE_* flags to better reflect that they are not
exclusive with IOMMU_CACHE.
Bug: 191811876
Change-Id: Ic91616a148f39fead008a5b87a54ffd781fee734
Signed-off-by: Patrick Daly <pdaly@codeaurora.org>
This patch uploads an initial symbol list for the Exynosauto SoC.
To make it easy to see what has been updated relative to the GKI symbols,
this list does not include the full list of symbols. Hence, nothing has
been added to the GKI ABI symbols.
Bug: 192103187
Signed-off-by: Chanho Park <chanho61.park@samsung.com>
Change-Id: Iae46da79e06d1081199a8db014b892c74887cbf8
There are debugging modules that monitor the memory usage of tasks
and report memory parameters in some situations (such as OOM).
Bug: 189595202
Change-Id: I6cc405b0f4cbe1706857fc3b2f8da83ea981818d
Signed-off-by: Georgi Djakov <quic_c_gdjako@quicinc.com>
The remoteproc coredump APIs are currently only part of the internal
remoteproc header. This prevents the remoteproc platform drivers from
using these APIs when needed. This change moves the rproc_coredump()
and rproc_coredump_cleanup() APIs to the linux header and marks them
as exported symbols.
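The resulting interface, sketched with the current signatures (whether the
exports are GPL-only is an assumption):

/* include/linux/remoteproc.h */
void rproc_coredump(struct rproc *rproc);
void rproc_coredump_cleanup(struct rproc *rproc);

/* drivers/remoteproc/remoteproc_coredump.c */
EXPORT_SYMBOL_GPL(rproc_coredump);
EXPORT_SYMBOL_GPL(rproc_coredump_cleanup);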
Signed-off-by: Siddharth Gupta <sidgup@codeaurora.org>
Bug: 188764827
Link: https://lore.kernel.org/linux-remoteproc/1623722930-29354-2-git-send-email-sidgup@codeaurora.org/
Change-Id: I8333774acb748fae10e0fd5146b747c4cf2ea6c7
Signed-off-by: Siddharth Gupta <quic_sidgup@quicinc.com>
select_fallback_rq() must return a cpu that is valid for the task.
However, when nid is not -1, it skips checking for
task_cpu_possible_mask().
This causes a problem when execve-ing 32 bit apps on an asymmetric
system where not all cpus are 32 bit capable. During execve-ing
the task is marked as 32 bit long before its affinity mask is
restricted.
If the cpu goes offline during this time, select_fallback_rq()
could return a 64 bit only cpu, which __migrate_tasks()/
is_cpu_allowed() rejects.
migrate_tasks() will therefore continue to pick the same task
repeatedly, where __migrate_tasks() rejects the cpu chosen
by select_fallback_rq() every time, leading to an infinite loop.
Correct the issue by updating select_fallback_rq() for the case
where nid is not -1, ensuring that the returned cpu is always
valid for this task.
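A sketch of the described fix in the nid-based search of select_fallback_rq()
(the surrounding code is paraphrased; task_cpu_possible() stands in for the
task_cpu_possible_mask() check):

	if (nid != -1) {
		nodemask = cpumask_of_node(nid);

		/* Look for allowed, online CPUs in the preferred node. */
		for_each_cpu(dest_cpu, nodemask) {
			if (!cpu_active(dest_cpu))
				continue;
			/* New: skip CPUs this task can never run on. */
			if (!task_cpu_possible(dest_cpu, p))
				continue;
			if (cpumask_test_cpu(dest_cpu, p->cpus_ptr))
				return dest_cpu;
		}
	}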
Bug: 192050156
Change-Id: Ia073a8395a02485f6d1c1daa0f3ce9e2029cb1f4
Signed-off-by: Stephen Dickey <dickey@codeaurora.org>
In order to update cpufreq, vendor modules invoke cpufreq_update_util(),
but when we build our modules, the build reports an error:
ERROR: modpost: "cpufreq_update_util_data" [xxx.ko] undefined!
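The export itself is a one-liner next to the existing per-CPU definition in
kernel/sched/cpufreq.c (a sketch; the exact export macro used by the patch is
assumed):

DEFINE_PER_CPU(struct update_util_data __rcu *, cpufreq_update_util_data);
EXPORT_PER_CPU_SYMBOL_GPL(cpufreq_update_util_data);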
Bug: 192218676
Signed-off-by: Liujie Xie <xieliujie@oppo.com>
Change-Id: Ib1da70229f04b08d8d812d065021dec0bf891e0e
This is technically a backwards incompatible change in behaviour, but I'm
going to argue that it is very unlikely to break things, and likely to fix
*far* more than it breaks.
In no particular order, various reasons follow:
(a) I've long had a bug assigned to myself to debug a super rare kernel crash
on Android Pixel phones which can (per stacktrace) be traced back to BPF clat
IPv6 to IPv4 protocol conversion causing some sort of ugly failure much later
on during transmit deep in the GSO engine, AFAICT precisely because of this
change to gso_size, though I've never been able to manually reproduce it. I
believe it may be related to the particular network offload support of attached
USB ethernet dongle being used for tethering off of an IPv6-only cellular
connection. The reason might be we end up with more segments than max permitted,
or with a GSO packet with only one segment... (either way we break some
assumption and hit a BUG_ON)
(b) There is no check that the gso_size is > 20 when reducing it by 20, so we
might end up with a negative (or underflowing) gso_size or a gso_size of 0.
This can't possibly be good. Indeed this is probably somehow exploitable (or
at least can result in a kernel crash) by delivering crafted packets and perhaps
triggering an infinite loop or a divide by zero... As a reminder: gso_size (MSS)
is related to MTU, but not directly derived from it: gso_size/MSS may be
significantly smaller than one would get by deriving from the local MTU. And on
some NICs (which do loose MTU checking on receive) it may even potentially be
larger, for example my work pc with 1500 MTU can receive 1520 byte frames [and
sometimes does due to bugs in a vendor plat46 implementation]. Indeed even just
going from 21 to 1 is potentially problematic because it increases the number
of segments by a factor of 21 (think DoS, or some other crash due to too many
segments).
(c) It's always safe to not increase the gso_size, because it doesn't result in
the max packet size increasing. So the skb_increase_gso_size() call was always
unnecessary for correctness (and outright undesirable, see later). As such the
only part which is potentially dangerous (ie. could cause backwards compatibility
issues) is the removal of the skb_decrease_gso_size() call.
(d) If the packets are ultimately destined to the local device, then there is
absolutely no benefit to playing around with gso_size. It only matters if the
packets will egress the device. ie. we're either forwarding, or transmitting
from the device.
(e) This logic only triggers for packets which are GSO. It does not trigger for
skbs which are not GSO. It will not convert a non-GSO MTU sized packet into a
GSO packet (and you don't even know what the MTU is, so you can't even fix it).
As such your transmit path must *already* be able to handle an MTU 20 bytes
larger than your receive path (for IPv4 to IPv6 translation) - and indeed 28
bytes larger due to IPv4 fragments. Thus removing the skb_decrease_gso_size()
call doesn't actually increase the size of the packets your transmit side must
be able to handle. ie. to handle non-GSO max-MTU packets, the IPv4/IPv6 device/
route MTUs must already be set correctly. Since for example with an IPv4 egress
MTU of 1500, IPv4 to IPv6 translation will already build 1520 byte IPv6 frames,
so you need a 1520 byte device MTU. This means if your IPv6 device's egress
MTU is 1280, your IPv4 route must be 1260 (and actually 1252, because of the
need to handle fragments). This is to handle normal non-GSO packets. Thus the
reduction is simply not needed for GSO packets, because when they're correctly
built, they will already be the right size.
(f) TSO/GSO should be able to exactly undo GRO: the number of packets (TCP
segments) should not be modified, so that TCP's MSS counting works correctly
(this matters for congestion control). If protocol conversion changes the
gso_size, then the number of TCP segments may increase or decrease. Packet loss
after protocol conversion can result in partial loss of MSS segments that the
sender sent. How's the sending TCP stack going to react to receiving ACKs/SACKs
in the middle of the segments it sent?
(g) skb_{decrease,increase}_gso_size() are already no-ops for GSO_BY_FRAGS
case (besides triggering WARN_ON_ONCE). This means you already cannot guarantee
that gso_size (and thus resulting packet MTU) is changed. ie. you must assume
it won't be changed.
(h) changing gso_size is outright buggy for UDP GSO packets, where framing
matters (I believe that's also the case for SCTP, but it's already excluded
by [g]). So the only remaining case is TCP, which also doesn't want it
(see [f]).
(i) see also the reasoning on the previous attempt at fixing this
(commit fa7b83bf3b156c767f3e4a25bbf3817b08f3ff8e) which shows that the current
behaviour causes TCP packet loss:
In the forwarding path GRO -> BPF 6 to 4 -> GSO for TCP traffic, the
coalesced packet payload can be > MSS, but < MSS + 20.
bpf_skb_proto_6_to_4() will upgrade the MSS and it can be > the payload
length. tcp_gso_segment() then checks whether the payload length is
<= MSS. The condition is causing the packet to be dropped.
tcp_gso_segment():
[...]
mss = skb_shinfo(skb)->gso_size;
if (unlikely(skb->len <= mss)) goto out;
[...]
Thus changing the gso_size is simply a very bad idea. Increasing is unnecessary
and buggy, and decreasing can go negative.
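For reference, the gist of the change in bpf_skb_proto_6_to_4(), sketched
(bpf_skb_proto_4_to_6() gets the mirrored treatment, dropping the
skb_decrease_gso_size() call; exact context may differ):

	if (skb_is_gso(skb)) {
		struct skb_shared_info *shinfo = skb_shinfo(skb);

		/* Removed: skb_increase_gso_size(shinfo, len_diff);
		 * gso_size/MSS is left untouched by the protocol conversion. */

		/* Header must be checked, and gso_segs recomputed. */
		shinfo->gso_type |= SKB_GSO_DODGY;
		shinfo->gso_segs = 0;
	}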
Fixes: 6578171a7f ("bpf: add bpf_skb_change_proto helper")
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Dongseok Yi <dseok.yi@samsung.com>
Cc: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/bpf/CANP3RGfjLikQ6dg=YpBU0OeHvyv7JOki7CyOUS9modaXAi-9vQ@mail.gmail.com
Link: https://lore.kernel.org/bpf/20210617000953.2787453-2-zenczykowski@gmail.com
(cherry picked from commit 364745fbe981a4370f50274475da4675661104df https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=364745fbe981a4370f50274475da4675661104df )
Test: builds, TreeHugger
Bug: 188690383
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Change-Id: I0ef3174cbd3caaa42d5779334a9c0bfdc9ab81f5
This patch avoids a KMI change, so it should later be reverted to sync with
upstream in the next KMI update.
Bug: 192095860
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
Change-Id: If861ecde381f23b5d5f18005063ec55356673fdb
This reverts commit ae618c699c.
It causes the device to get stuck early in boot.
Bug: 192088222
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
Change-Id: I5117e7f8aaa9433b74f1531619cdc1b687d02b41