4986 Commits

Author SHA1 Message Date
c004a6fe0c block/xen-blkfront: Make it running on 64KB page granularity
The PV block protocol uses 4KB page granularity. The goal of this
patch is to allow a Linux guest using 64KB page granularity to use
block devices on an unmodified Xen.

The block API uses segments, which are at least the size of a Linux
page. Therefore, the driver has to break each page into 4KB chunks
before handing it to the backend.

When breaking a 64KB segment into 4KB chunks, some chunks may be
empty. As the PV protocol always requires data in each chunk, we have
to count the number of Xen pages that will actually be in use and
avoid sending empty chunks.

Note that a pre-defined number of grants is reserved before preparing
the request. This number is based on the number and the maximum size
of the segments. If each segment contains only a small amount of data,
the driver may reserve too many grants (16 grants are reserved per
segment with 64KB page granularity).

Furthermore, in the case of persistent grants we allocate one Linux page
per grant although only the first 4KB of the page will effectively be
in use. This could be improved by sharing the page among multiple grants.
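
For illustration, a rough sketch of the chunk accounting described above
(XEN_PAGE_SIZE and the helper name are illustrative, not the driver's
actual code):

	/* Count the 4KB Xen-page chunks of one segment that actually carry
	 * data, so that no empty chunk is sent to the backend.  offset and
	 * len describe the data within a 64KB Linux page. */
	static unsigned int count_used_xen_pages(unsigned int offset,
						 unsigned int len)
	{
		unsigned int first = offset / XEN_PAGE_SIZE;
		unsigned int last = (offset + len - 1) / XEN_PAGE_SIZE;

		return len ? last - first + 1 : 0;
	}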

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-10-23 14:20:39 +01:00
4f503fbdf3 block/xen-blkfront: split get_grant in 2
Prepare the code to support 64KB page granularity. The first
implementation will use a full Linux page per indirect and persistent
grant. When non-persistent grants are used, each page of a bio request
may be split into multiple grants.

Furthermore, the page field of the grant structure is only used to copy
data from a persistent or indirect grant. Avoid setting it for other
use cases, as it has no meaning once the page is split into multiple
grants.

Provide 2 functions: one to set up indirect grants, the other for bio
pages.
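
A hedged sketch of the resulting split (helper names below are
illustrative, not necessarily the final ones):

	/* bio pages: grant the page supplied by the caller */
	static struct grant *get_grant(struct blkfront_info *info,
				       struct page *page)
	{
		struct grant *gnt = get_free_grant(info);	/* assumed helper */

		grant_foreign_access(gnt, info, page);		/* assumed helper */
		return gnt;
	}

	/* indirect/persistent grants: only these keep their own page */
	static struct grant *get_indirect_grant(struct blkfront_info *info)
	{
		struct grant *gnt = get_free_grant(info);

		grant_foreign_access(gnt, info, gnt->page);
		return gnt;
	}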

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-10-23 14:20:36 +01:00
a7a6df2223 block/xen-blkfront: Store a page rather a pfn in the grant structure
All uses of the pfn field follow the same idiom:

pfn_to_page(grant->pfn)

This always returns the same page. Store the page directly in the
grant to clean up the code.
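
Illustrative before/after of the idiom being removed:

	/* before: the pfn is stored and converted on every use */
	struct page *page = pfn_to_page(grant->pfn);

	/* after: the page itself is stored in the grant */
	struct page *page = grant->page;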

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-10-23 14:20:35 +01:00
33204663ef block/xen-blkfront: Split blkif_queue_request in 2
Currently, blkif_queue_request has 2 distinct execution paths:
    - Send a discard request
    - Send a read/write request

The function also allocates grants to use for generating the request,
although these are only used for read/write requests.

Rather than having one function with 2 distinct execution paths, split
it in 2. This also removes one level of indentation.
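
A hedged sketch of the resulting shape (details omitted):

	static int blkif_queue_request(struct request *req)
	{
		/* discard requests take their own, simpler path */
		if (req->cmd_flags & REQ_DISCARD)
			return blkif_queue_discard_req(req);

		/* grants are only allocated on the read/write path */
		return blkif_queue_rw_req(req);
	}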

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-10-23 14:20:34 +01:00
e30b7577bf rbd: use writefull op for object size writes
This covers only the simplest case - an object-sized write - but it's
still useful in tiering setups where EC is used for the base tier, as
the writefull op can be proxied, saving an object promotion.

Even though updating ceph_osdc_new_request() to allow writefull should
just be a matter of fixing an assert, I didn't do it because its only
user is cephfs.  All other sites were updated.
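
Roughly, the op selection amounts to the following (a sketch, not the
exact rbd code):

	/* a write covering a whole object can be sent as WRITEFULL */
	if (offset == 0 && length == rbd_obj_bytes(&rbd_dev->header))
		opcode = CEPH_OSD_OP_WRITEFULL;
	else
		opcode = CEPH_OSD_OP_WRITE;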

Reflects ceph.git commit 7bfb7f9025a8ee0d2305f49bf0336d2424da5b5b.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Alex Elder <elder@linaro.org>
2015-10-16 16:49:01 +02:00
0d9fde4fc8 rbd: set max_sectors explicitly
Commit 30e2bc08b2bb ("Revert "block: remove artifical max_hw_sectors
cap"") restored a clamp on max_sectors.  It's now 2560 sectors instead
of 1024, but it's still not enough: we set max_hw_sectors to the rbd
object size because we don't want object-sized I/Os to be split, and
the default object size is 4M.

So, set max_sectors to max_hw_sectors in rbd at queue init time.
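
The queue setup then looks roughly like this (sketch; segment_size is
the rbd object size in bytes):

	blk_queue_max_hw_sectors(q, segment_size / SECTOR_SIZE);
	q->limits.max_sectors = queue_max_hw_sectors(q);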

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Alex Elder <elder@linaro.org>
2015-10-16 16:48:36 +02:00
0dfc70c334 NVMe: Fix memory leak on retried commands
Resources are reallocated for requeued commands, so unmap and release
the iod for the failed command.

It's a pretty bad memory leak and, because of a busy dma pool, causes a
kernel hang if you remove a drive. You'll get messages spewing like this:

  nvme 0000:xx:xx.x: dma_pool_destroy prp list 256, ffff880420dec000 busy

and PCI and the driver lock up, since removal never completes while
holding a lock.
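
A hedged sketch of the shape of the fix (helper names are illustrative):

	if (requeue) {
		/* the requeue path allocates a fresh iod, so unmap and
		 * release the old one here or its dma pool entries leak */
		nvme_unmap_iod(dev, iod);		/* assumed helpers */
		nvme_free_iod(dev, iod);
		blk_mq_requeue_request(req);
		return;
	}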

Cc: stable@vger.kernel.org
Cc: <stable@vger.kernel.org> # 4.0.x-
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-10-15 13:38:48 -06:00
81c04b9438 nvme: use an integer value to Linux errno values
Use a separate integer variable to hold the signed Linux errno
values we pass back to the block layer.  Note that for pass through
commands those might still be NVMe values, but those fit into the
int as well.
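
Illustrative shape of the change (the mapping helper is an assumption,
not necessarily the driver's exact code):

	u16 status = le16_to_cpup(&cqe->status) >> 1;	/* NVMe status */
	int error = nvme_error_status(status);		/* signed Linux errno */

	blk_mq_complete_request(req, error);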

Fixes: f4829a9b7a61: ("blk-mq: fix racy updates of rq->errors")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-10-15 09:49:20 -06:00
dbcbdc432b drbd: stop including <asm-generic/kmap_types.h>
<linux/highmem.h> is the place to get the kmap type flags; asm-generic
files are generic implementations only to be used by architecture code.
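
In other words, drivers that need the kmap type flags should do:

	#include <linux/highmem.h>	/* not <asm-generic/kmap_types.h> */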

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2015-10-15 00:21:08 +02:00
2f8e2c8777 move io-64-nonatomic*.h out of asm-generic
These are not implementations of default architecture code but helpers
for drivers. Move them to the place they belong.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Darren Hart <dvhart@linux.intel.com>
Acked-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2015-10-15 00:21:07 +02:00
835da3f99d nvme: fix 32-bit build warning
Compiling the nvme driver on 32-bit warns about a cast from a __u64
variable to a pointer:

drivers/block/nvme-core.c: In function 'nvme_submit_io':
drivers/block/nvme-core.c:1847:4: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
    (void __user *)io.addr, length, NULL, 0);

The cast here is intentional and safe, so we can shut up the
gcc warning by adding an intermediate cast to 'uintptr_t'.

I had previously submitted a patch to fix this problem in the
nvme driver, but it was accepted on the same day that two new
warnings got added.

For clarity, I also change the third instance of this cast
to use uintptr_t instead of unsigned long.
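
The fix is the usual intermediate cast:

	/* before: warns on 32-bit, pointer and __u64 differ in size */
	(void __user *)io.addr
	/* after: cast through uintptr_t first */
	(void __user *)(uintptr_t)io.addr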

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: d29ec8241c10e ("nvme: submit internal commands through the block layer")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-10-12 13:09:40 -06:00
57dacad5f2 nvme: move to a new drivers/nvme/host directory
This patch moves the NVMe driver from drivers/block/ to its own new
drivers/nvme/host/ directory.  This is in preparation for splitting the
current monolithic driver up and adding support for the upcoming NVMe
over Fabrics standard.  drivers/nvme/host/ is chosen to leave space
for an NVMe target implementation in addition to this host-side driver.

Signed-off-by: Jay Sternberg <jay.e.sternberg@intel.com>
[hch: rebased, renamed core.c to pci.c, slight tweaks]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-10-09 10:40:37 -06:00
9d99a8dda1 nvme: move hardware structures out of the uapi version of nvme.h
Currently all NVMe command and completion structures are exposed to userspace
through the uapi version of nvme.h.  They are not an ABI between the kernel
and userspace, and will change in C-incompatible ways for future versions of
the spec.  Move them to the kernel version of the file and rename the uapi
header to nvme_ioctl.h so that userspace can easily detect the presence of
the new clean header.  Nvme-cli already carries a local copy of the header,
so it won't be affected by this move.
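
For userspace, the ioctl definitions now come from the renamed uapi
header:

	#include <linux/nvme_ioctl.h>	/* ioctl structs and command numbers */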

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-10-09 10:40:37 -06:00
f11bb3e244 nvme: add a local nvme.h header
Add a new drivers/block/nvme.h which contains all the driver-internal
interfaces.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-10-09 10:40:37 -06:00
2659e57b90 nvme: properly handle partially initialized queues in nvme_create_io_queues
This avoids having to clean up later in a seemingly unrelated place.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-10-09 10:40:37 -06:00
3cf519b5a8 nvme: merge nvme_dev_start, nvme_dev_resume and nvme_async_probe
And give the resulting function a sensible name.  This keeps all the
error handling in a single place and will allow for further improvements
to it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-10-09 10:40:36 -06:00
90667892c5 nvme: factor reset code into a common helper
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-10-09 10:40:36 -06:00
77b50d9e15 nvme: merge nvme_dev_reset into nvme_reset_failed_dev
And give the resulting function a more descriptive name.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-10-09 10:40:36 -06:00
201cf1ecdf nvme: delete dev from dev_list in nvme_reset
Device resets need to delete the device from the device list before
kicking off the reset and re-probe, otherwise the device gets added
to the list twice.  nvme_reset is the only path missing this deletion
at the moment, and this patch adds it.
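
A hedged sketch of the added step (lock and list member names follow
the rest of the driver at the time):

	spin_lock(&dev_list_lock);
	list_del_init(&dev->node);
	spin_unlock(&dev_list_lock);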

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-10-09 10:40:36 -06:00
0a7385ad69 NVMe: Simplify device resume on io queue failure
Releasing IO queues and disks was done in a work queue outside the
controller resume context to delete namespaces if the controller failed
after a resume from suspend. This is unnecessary since we can resume
a device asynchronously.

This patch makes resume use probe_work so it can directly remove
namespaces if the device is manageable but not IO capable. Since
deleting disks was the only reason we had the convoluted "reset_workfn",
this patch removes that unnecessary indirection.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-10-09 10:40:36 -06:00
5105aa555c NVMe: Namespace removal simplifications
This liberates namespace removal from the device, allowing gendisk
references to be closed independent of the nvme controller reference
count.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-10-09 10:40:36 -06:00
188c3568f8 NVMe: Reference count open namespaces
Dynamic namespace attachment means the namespace may be removed at any
time, so the namespace reference count can not be tied to the device
reference count. This fixes a NULL dereference if an opened namespace
is detached from a controller.
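
A hedged sketch of the pattern (names are illustrative):

	/* open pins the namespace, not the controller */
	kref_get(&ns->kref);

	/* release/detach drops it; the gendisk may outlive the controller */
	kref_put(&ns->kref, nvme_free_ns);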

Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-10-09 10:40:36 -06:00
54ef2b9687 Merge branch 'for-4.4/core' into for-4.4/drivers
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-10-09 10:40:29 -06:00
dcc909d90c nbd: Add locking for tasks
The timeout handling introduced in
	7e2893a16d3e (nbd: Fix timeout detection)
introduces a race condition which may lead to killing tasks that are
no longer in nbd context. This has not been observed or reproduced yet.

This patch adds locking around the critical uses of task_recv and
task_send to avoid killing tasks that have already left the NBD thread
functions. The lock is only acquired if a timeout occurs or the nbd
device starts/stops.
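
A hedged sketch of the locking shape (the lock name is illustrative):

	spin_lock(&nbd->tasks_lock);
	/* only signal tasks that are still registered as nbd threads */
	if (nbd->task_recv)
		force_sig(SIGKILL, nbd->task_recv);
	if (nbd->task_send)
		force_sig(SIGKILL, nbd->task_send);
	spin_unlock(&nbd->tasks_lock);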

Reported-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
Reviewed-by: Ben Hutchings <ben@decadent.org.uk>
Fixes: 7e2893a16d3e ("nbd: Fix timeout detection")
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-10-08 14:21:24 -06:00
c3984cc994 Merge branch 'stable/for-jens-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-linus
Konrad writes:

Please git pull an update branch to your 'for-4.3/drivers' branch (which,
oddly, does not seem to have the previous pull?)

 git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git stable/for-jens-4.3

which has two fixes - one where we use the Xen blockfront driver under
EFI and don't release all the requests; the other where, if the allocation
of resources for a particular state failed, we would go back to 'Closing'
and assume that a structure had been allocated while in fact it may not
have been - and crash.
2015-10-07 13:50:17 -06:00
a54c8f0f2d xen-blkfront: check for null drvdata in blkback_changed (XenbusStateClosing)
xen-blkfront will crash if the call to talk_to_blkback() in
blkback_changed() (XenbusStateInitWait) returns an error. The driver
data is freed and info is set to NULL. Later, during the close process
triggered by talk_to_blkback's call to xenbus_dev_fatal(), the NULL
pointer is passed to and dereferenced in blkfront_closing().
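
The guard is essentially (a sketch of the shape, not the exact diff):

	case XenbusStateClosing:
		if (!info)	/* drvdata already freed by an earlier failure */
			break;
		blkfront_closing(info);
		break;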

CC: stable@vger.kernel.org
Signed-off-by: Cathy Avery <cathy.avery@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2015-10-07 15:15:18 -04:00
f4829a9b7a blk-mq: fix racy updates of rq->errors
blk_mq_complete_request may be a no-op if the request has already
been completed by other means (e.g. a timeout or cancellation), but
currently drivers have to set rq->errors before calling
blk_mq_complete_request, which might leave us with the wrong error value.

Add an error parameter to blk_mq_complete_request so that we can
defer setting rq->errors until we know we have won the race to complete
the request.
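
For drivers the conversion looks like this:

	/* before: the error is set even if this completion loses the race */
	rq->errors = error;
	blk_mq_complete_request(rq);

	/* after: the error is only applied if this completion wins */
	blk_mq_complete_request(rq, error);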

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-10-01 10:10:55 +02:00
0220531a13 pktcdvd: drop null test before destroy functions
Remove unneeded NULL test.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@ expression x; @@
-if (x != NULL)
  \(kmem_cache_destroy\|mempool_destroy\|dma_pool_destroy\)(x);
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2015-09-29 15:01:47 +02:00
bda4e0fb31 NVMe: Set affinity after allocating request queues
The asynchronous namespace scanning caused affinity hints to be set before
the tagset was initialized, so there was no cpu mask to use for the hint.
This patch moves the affinity hint setting to after namespaces are scanned.

Reported-by: 김경산 <ks0204.kim@samsung.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-09-23 14:45:57 -06:00
bc07c10a36 block: loop: support DIO & AIO
There are at least 3 advantages to using direct I/O and AIO when
reading from and writing to loop's backing file:

1) double caching can be avoided, so memory usage decreases a lot

2) unlike user-space direct I/O, there is no cost for pinning pages

3) context switches can be avoided while keeping good throughput
- with buffered file reads, top random I/O throughput is often reached
only if requests are submitted concurrently from lots of tasks; but for
sequential I/O, most requests are served from the page cache, so
concurrent submission often just introduces unnecessary context
switches without improving throughput much. There was a discussion[1]
about using non-blocking I/O to address this problem for applications.
- with direct I/O and AIO, concurrent submissions can be avoided while
random read throughput is not affected

xfstests (-g auto, ext4) basically passes when running with direct
I/O (AIO); the one exception is generic/232, but that also fails with
loop buffered I/O (4.2-rc6-next-20150814).

The fio test results for performance follow:
	4 jobs fio test inside ext4 file system over loop block

1) How to run
	- KVM: 4 VCPUs, 2G RAM
	- linux kernel: 4.2-rc6-next-20150814(base) with the patchset
	- the loop block is over one image on SSD.
	- linux psync, 4 jobs, size 1500M, ext4 over loop block
	- test result: IOPS from fio output

2) Throughput(IOPS) becomes a bit better with direct I/O(aio)
        -------------------------------------------------------------
        test cases          |randread   |read   |randwrite  |write  |
        -------------------------------------------------------------
        base                |8015       |113811 |67442      |106978
        -------------------------------------------------------------
        base+loop aio       |8136       |125040 |67811      |111376
        -------------------------------------------------------------

- presumably this is because more page cache is available to the
application, or because one extra page copy is avoided with direct I/O

3) context switch
        - context switches decreased by ~50% with loop direct I/O (AIO)
	compared with loop buffered I/O (4.2-rc6-next-20150814)

4) memory usage from /proc/meminfo
        -------------------------------------------------------------
                                   | Buffers       | Cached
        -------------------------------------------------------------
        base                       | > 760MB       | ~950MB
        -------------------------------------------------------------
        base+loop direct I/O(aio)  | < 5MB         | ~1.6GB
        -------------------------------------------------------------

- so there is much more page cache available to the application with
direct I/O

[1] https://lwn.net/Articles/612483/

Signed-off-by: Ming Lei <ming.lei@canonical.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-09-23 11:01:16 -06:00
ab1cb278bc block: loop: introduce ioctl command of LOOP_SET_DIRECT_IO
If a loop device is set up via 'mount -o loop', it isn't easy to pass
a file descriptor opened with O_DIRECT, so this patch introduces a new
ioctl command to support direct IO for this case.
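
A minimal userspace sketch of using the new command (error handling
trimmed):

	#include <fcntl.h>
	#include <unistd.h>
	#include <sys/ioctl.h>
	#include <linux/loop.h>

	/* enable (1) or disable (0) direct I/O on an existing loop device */
	static int loop_set_dio(const char *loopdev, unsigned long on)
	{
		int fd = open(loopdev, O_RDWR);
		int ret = fd < 0 ? -1 : ioctl(fd, LOOP_SET_DIRECT_IO, on);

		if (fd >= 0)
			close(fd);
		return ret;
	}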

Cc: linux-api@vger.kernel.org
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-09-23 11:01:16 -06:00
2e5ab5f379 block: loop: prepare for supporing direct IO
This patch provides one interface for enabling direct IO
from user space:

	- userspace (such as losetup) can pass a 'file' which has been
	opened or fcntl'd with O_DIRECT

Also, __loop_update_dio() is introduced to check whether direct I/O
can be used with the current loop settings.

The last big change is the introduction of the LO_FLAGS_DIRECT_IO
flag, which lets userspace know whether direct IO is used to access
the backing file.
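
A hedged userspace sketch of checking the new flag (assumes fd is an
open loop device descriptor):

	#include <sys/ioctl.h>
	#include <linux/loop.h>

	static int loop_uses_dio(int fd)
	{
		struct loop_info64 info;

		if (ioctl(fd, LOOP_GET_STATUS64, &info) < 0)
			return 0;
		return !!(info.lo_flags & LO_FLAGS_DIRECT_IO);
	}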

Cc: linux-api@vger.kernel.org
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-09-23 11:01:16 -06:00
e03a3d7a94 block: loop: use kthread_work
The following patch will use dio/aio to submit IO to the backing file,
so there is no longer a need to schedule IO concurrently from the
workqueue; use kthread_work instead to greatly decrease the context
switch cost.

For the non-AIO case, a single thread had been used for a very long
time; it was only converted to a workqueue in v4.0, which has already
caused a performance regression for Fedora live booting. As discussed
in [1], even though submitting I/O concurrently via the workqueue can
improve random read IO throughput, it may hurt sequential read IO
performance, so it is better to restore the single-thread behaviour.

For the upcoming AIO support, if loop ever has such high performance
requirements, using multiple hw-queues with a per-hwq kthread would be
better than the current work approach.

[1] http://marc.info/?t=143082678400002&r=1&w=2
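
A hedged sketch of the kthread_worker pattern being switched to (API
names as of this kernel generation; error handling omitted):

	struct kthread_worker worker;
	struct kthread_work work;
	struct task_struct *task;

	init_kthread_worker(&worker);
	task = kthread_run(kthread_worker_fn, &worker, "loop%d", lo->lo_number);

	init_kthread_work(&work, loop_queue_work);	/* work fn handles the IO */
	queue_kthread_work(&worker, &work);		/* single thread, in order */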

Signed-off-by: Ming Lei <ming.lei@canonical.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-09-23 11:01:16 -06:00
5b5e20f421 block: loop: set QUEUE_FLAG_NOMERGES for request queue of loop
It doesn't make sense to enable merging because the I/O submitted to
the backing file is handled page by page.
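
The change boils down to a single flag at queue setup time (sketch):

	queue_flag_set_unlocked(QUEUE_FLAG_NOMERGES, lo->lo_queue);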

Signed-off-by: Ming Lei <ming.lei@canonical.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-09-23 11:01:16 -06:00
adbe734b2a Merge branch 'stable/for-jens-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-linus
Konrad writes:

It has one fix that should go in and also be put in the stable tree
(I've added the CC already).

It is a fix for a memory leak that can be exposed by using the
xen-blkfront driver under UEFI.
2015-09-23 10:59:44 -06:00
f929d42ceb xen/blkback: free requests on disconnection
This is due to  commit 86839c56dee28c315a4c19b7bfee450ccd84cd25
"xen/block: add multi-page ring support"

When running a guest under UEFI, the following warning comes from
blkback after the domain is destroyed.

------------[ cut here ]------------
WARNING: CPU: 2 PID: 95 at
/home/julien/works/linux/drivers/block/xen-blkback/xenbus.c:274
xen_blkif_deferred_free+0x1f4/0x1f8()
Modules linked in:
CPU: 2 PID: 95 Comm: kworker/2:1 Tainted: G        W       4.2.0 #85
Hardware name: APM X-Gene Mustang board (DT)
Workqueue: events xen_blkif_deferred_free
Call trace:
[<ffff8000000890a8>] dump_backtrace+0x0/0x124
[<ffff8000000891dc>] show_stack+0x10/0x1c
[<ffff8000007653bc>] dump_stack+0x78/0x98
[<ffff800000097e88>] warn_slowpath_common+0x9c/0xd4
[<ffff800000097f80>] warn_slowpath_null+0x14/0x20
[<ffff800000557a0c>] xen_blkif_deferred_free+0x1f0/0x1f8
[<ffff8000000ad020>] process_one_work+0x160/0x3b4
[<ffff8000000ad3b4>] worker_thread+0x140/0x494
[<ffff8000000b2e34>] kthread+0xd8/0xf0
---[ end trace 6f859b7883c88cdd ]---

Request allocation has been moved to connect_ring, which is called every
time blkback connects to the frontend (this can happen multiple times during
a blkback instance life cycle). On the other hand, request freeing has not
been moved, so it's only called when destroying the backend instance. Due to
this mismatch, blkback can allocate the request pool multiple times, without
freeing it.

In order to fix it, move the freeing of requests to xen_blkif_disconnect to
restore the symmetry between request allocation and freeing.
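
A hedged sketch of where the freeing now lives (list and member names
follow the existing pending_req handling in blkback):

	static int xen_blkif_disconnect(struct xen_blkif *blkif)
	{
		struct pending_req *req, *n;

		/* ... existing ring/irq teardown ... */

		/* free the pool allocated by connect_ring, on every disconnect */
		list_for_each_entry_safe(req, n, &blkif->pending_free, free_list) {
			list_del(&req->free_list);
			kfree(req);
		}
		return 0;
	}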

Reported-by: Julien Grall <julien.grall@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Tested-by: Julien Grall <julien.grall@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: xen-devel@lists.xenproject.org
CC: stable@vger.kernel.org # 4.2
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2015-09-23 12:09:19 -04:00
133bb59585 Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block updates from Jens Axboe:
 "This is a bit bigger than it should be, but I could (did) not want to
  send it off last week due to both wanting extra testing, and expecting
  a fix for the bounce regression as well.  In any case, this contains:

   - Fix for the blk-merge.c compilation warning on gcc 5.x from me.

   - A set of back/front SG gap merge fixes, from me and from Sagi.
     This ensures that we honor SG gapping for integrity payloads as
     well.

   - Two small fixes for null_blk from Matias, fixing a leak and a
     capacity propagation issue.

   - A blkcg fix from Tejun, fixing a NULL dereference.

   - A fast clone optimization from Ming, fixing a performance
     regression since the arbitrarily sized bio's were introduced.

   - Also from Ming, a regression fix for bouncing IOs"

* 'for-linus' of git://git.kernel.dk/linux-block:
  block: fix bounce_end_io
  block: blk-merge: fast-clone bio when splitting rw bios
  block: blkg_destroy_all() should clear q->root_blkg and ->root_rl.blkg
  block: Copy a user iovec if it includes gaps
  block: Refuse adding appending a gapped integrity page to a bio
  block: Refuse request/bio merges with gaps in the integrity payload
  block: Check for gaps on front and back merges
  null_blk: fix wrong capacity when bs is not 512 bytes
  null_blk: fix memory leak on cleanup
  block: fix bogus compiler warnings in blk-merge.c
2015-09-19 18:57:09 -07:00
3aaf14da80 zram: fix possible use after free in zcomp_create()
zcomp_create() verifies the success of zcomp_strm_{multi,single}_create()
through comp->stream, which can potentially be pointing to memory that
was freed if these functions returned an error.

While at it, replace an 'ERR_PTR(-ENOMEM)' with the more generic
'ERR_PTR(error)', as in the future zcomp_strm_{multi,single}_create()
could return other error codes.  Function documentation updated
accordingly.
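
A hedged sketch of the resulting error path in zcomp_create() (shape
only):

	error = zcomp_strm_multi_create(comp, max_strm);
	if (error) {
		kfree(comp);
		return ERR_PTR(error);	/* propagate the real error code */
	}
	return comp;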

Fixes: beca3ec71fe5 ("zram: add multi stream functionality")
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-17 21:16:07 -07:00
e013f74b60 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull Ceph update from Sage Weil:
 "There are a few fixes for snapshot behavior with CephFS and support
  for the new keepalive protocol from Zheng, a libceph fix that affects
  both RBD and CephFS, a few bug fixes and cleanups for RBD from Ilya,
  and several small fixes and cleanups from Jianpeng and others"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: improve readahead for file holes
  ceph: get inode size for each append write
  libceph: check data_len in ->alloc_msg()
  libceph: use keepalive2 to verify the mon session is alive
  rbd: plug rbd_dev->header.object_prefix memory leak
  rbd: fix double free on rbd_dev->header_name
  libceph: set 'exists' flag for newly up osd
  ceph: cleanup use of ceph_msg_get
  ceph: no need to get parent inode in ceph_open
  ceph: remove the useless judgement
  ceph: remove redundant test of head->safe and silence static analysis warnings
  ceph: fix queuing inode to mdsdir's snaprealm
  libceph: rename con_work() to ceph_con_workfn()
  libceph: Avoid holding the zero page on ceph_msgr_slab_init errors
  libceph: remove the unused macro AES_KEY_SIZE
  ceph: invalidate dirty pages after forced umount
  ceph: EIO all operations after forced umount
2015-09-11 12:33:03 -07:00
06ab838c20 xen: MFN/GFN/BFN terminology changes for 4.3-rc0
- Use the correct GFN/BFN terms more consistently.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJV8VRMAAoJEFxbo/MsZsTRiGQH/i/jrAJUJfrFC2PINaA2gDwe
 O0dlrkCiSgAYChGmxxxXZQSPM5Po5+EbT/dLjZ/uvSooeorM9RYY/mFo7ut/qLep
 4pyQUuwGtebWGBZTrj9sygUVXVhgJnyoZxskNUbhj9zvP7hb9++IiI78mzne6cpj
 lCh/7Z2dgpfRcKlNRu+qpzP79Uc7OqIfDK+IZLrQKlXa7IQDJTQYoRjbKpfCtmMV
 BEG3kN9ESx5tLzYiAfxvaxVXl9WQFEoktqe9V8IgOQlVRLgJ2DQWS6vmraGrokWM
 3HDOCHtRCXlPhu1Vnrp0R9OgqWbz8FJnmVAndXT8r3Nsjjmd0aLwhJx7YAReO/4=
 =JDia
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.3-rc0b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen terminology fixes from David Vrabel:
 "Use the correct GFN/BFN terms more consistently"

* tag 'for-linus-4.3-rc0b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/xenbus: Rename the variable xen_store_mfn to xen_store_gfn
  xen/privcmd: Further s/MFN/GFN/ clean-up
  hvc/xen: Further s/MFN/GFN clean-up
  video/xen-fbfront: Further s/MFN/GFN clean-up
  xen/tmem: Use xen_page_to_gfn rather than pfn_to_gfn
  xen: Use correctly the Xen memory terminologies
  arm/xen: implement correctly pfn_to_mfn
  xen: Make clear that swiotlb and biomerge are dealing with DMA address
2015-09-10 16:21:11 -07:00
daf0e1ed57 virtio: fixes and features 4.3
virtio-mmio can now be auto-loaded through acpi.
 virtio blk supports extended partitions.
 total memory is better reported when using virtio balloon with auto-deflate.
 cache control is re-enabled when using virtio-blk in modern mode.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJV7/iUAAoJECgfDbjSjVRpzRIIAMJz5enk2FbyiOsky5THASqK
 rWAEx+Jz5Av3FVM99W+Fs8dmXrBGkCTyRgmX3mZx+pGNhOsLjc6xHElTekzwQBas
 18oFsQDTWk0rPZkO7X6/OY3GgHyq8VSI/2kW+yKMYj4k/gAaGV+Azf0m5DGMp4wn
 rngevvcDSkFEN2Gjxbi3MWT8sJ9Br/EFg6KzKUOiOycHh/rQZsaQK6tq4kpUZkil
 2wvXspQY6ix2fD558gsiqUv6l8yZ9mXVgxtyRcnxjfM+8jr+qtqai8yVGCAC5krh
 50RXrR22tN/6aLYuvtPj46+Y0dleCMEwW2rSmcw5h5yr3B/0ylnZ1YNp6vlFyv0=
 =8xQ2
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio updates from Michael Tsirkin:
 "Virtio fixes and features for 4.3:

   - virtio-mmio can now be auto-loaded through acpi.
   - virtio blk supports extended partitions.
   - total memory is better reported when using virtio balloon with
     auto-deflate.
   - cache control is re-enabled when using virtio-blk in modern mode"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio_balloon: do not change memory amount visible via /proc/meminfo
  virtio_ballon: change stub of release_pages_by_pfn
  virtio-blk: Allow extended partitions
  virtio_mmio: add ACPI probing
  virtio-blk: use VIRTIO_BLK_F_WCE and VIRTIO_BLK_F_CONFIG_WCE in virtio1
2015-09-09 10:37:41 -07:00
f6f7a63692 Merge branch 'akpm' (patches from Andrew)
Merge second patch-bomb from Andrew Morton:
 "Almost all of the rest of MM.  There was an unusually large amount of
  MM material this time"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (141 commits)
  zpool: remove no-op module init/exit
  mm: zbud: constify the zbud_ops
  mm: zpool: constify the zpool_ops
  mm: swap: zswap: maybe_preload & refactoring
  zram: unify error reporting
  zsmalloc: remove null check from destroy_handle_cache()
  zsmalloc: do not take class lock in zs_shrinker_count()
  zsmalloc: use class->pages_per_zspage
  zsmalloc: consider ZS_ALMOST_FULL as migrate source
  zsmalloc: partial page ordering within a fullness_list
  zsmalloc: use shrinker to trigger auto-compaction
  zsmalloc: account the number of compacted pages
  zsmalloc/zram: introduce zs_pool_stats api
  zsmalloc: cosmetic compaction code adjustments
  zsmalloc: introduce zs_can_compact() function
  zsmalloc: always keep per-class stats
  zsmalloc: drop unused variable `nr_to_migrate'
  mm/memblock.c: fix comment in __next_mem_range()
  mm/page_alloc.c: fix type information of memoryless node
  memory-hotplug: fix comments in zone_spanned_pages_in_node() and zone_spanned_pages_in_node()
  ...
2015-09-08 17:52:23 -07:00
708649694a zram: unify error reporting
Make zram syslog error reporting more consistent. We have random
error levels in some places. For example, critical errors like
  "Error allocating memory for compressed page"
and
  "Unable to allocate temp memory"
are reported as KERN_INFO messages.

a) Reassign error levels

Error messages that directly affect zram
functionality -- pr_err():

 Error allocating zram address table
 Error creating memory pool
 Decompression failed! err=%d, page=%u
 Unable to allocate temp memory
 Compression failed! err=%d
 Error allocating memory for compressed page: %u, size=%zu
 Cannot initialise %s compressing backend
 Error allocating disk queue for device %d
 Error allocating disk structure for device %d
 Error creating sysfs group for device %d
 Unable to register zram-control class
 Unable to get major number

Messages that do not affect functionality, but the user
must be warned (because sysfs attrs will be removed in
this particular case) -- pr_warn():

 %d (%s) Attribute %s (and others) will be removed. %s

Messages that do not affect functionality and mostly are
informative -- pr_info():

 Cannot change max compression streams
 Can't change algorithm for initialized device
 Cannot change disksize for initialized device
 Added device: %s
 Removed device: %s

b) Update sysfs_create_group() error message

First, it lacks a trailing new line; add it.  Second, every error message
in zram_add() has a "for device %d" part, which makes errors more
informative.  Add the missing part to the "Error creating sysfs group" message.
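
Two of the messages above, shown with their new levels for illustration:

	pr_err("Error creating sysfs group for device %d\n", device_id);
	pr_info("Added device: %s\n", zram->disk->disk_name);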

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
860c707dca zsmalloc: account the number of compacted pages
Compaction returns to zram the number of migrated objects, which is
quite uninformative -- we have objects of different sizes, so user space
cannot obtain any valuable data from that number.  Change compaction to
operate in terms of pages and return to the compaction issuer the
number of pages that were freed during compaction.  So from now on we
will export a more meaningful value in zram<id>/mm_stat -- the number of
freed (compacted) pages.

This requires:
 (a) a rename of `num_migrated' to 'pages_compacted'
 (b) an internal API change -- return first_page's fullness_group from
     putback_zspage(), so we know when putback_zspage() did
     free_zspage().  It helps us to account compaction stats correctly.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
7d3f393823 zsmalloc/zram: introduce zs_pool_stats api
`zs_compact_control' accounts the number of migrated objects but it has
a limited lifespan -- we lose it as soon as zs_compaction() returns
to zram.  It worked fine, because (a) zram had its own counter of
migrated objects and (b) only zram could trigger compaction.  However,
this does not work for automatic pool compaction (not issued by zram).
To account objects migrated during auto-compaction (issued by the
shrinker) we need to store this number in zs_pool.

Define a new `struct zs_pool_stats' structure to keep zs_pool's stats
there.  It provides only `num_migrated', as of this writing, but it
surely can be extended.

A new zsmalloc zs_pool_stats() symbol exports zs_pool's stats back to
caller.

Use zs_pool_stats() in zram and remove `num_migrated' from zram_stats.
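
A sketch of the new interface described above (the field set may grow
over time, as noted):

	struct zs_pool_stats {
		/* number of objects migrated by compaction so far */
		u64 num_migrated;
	};

	void zs_pool_stats(struct zs_pool *pool, struct zs_pool_stats *stats);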

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Suggested-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
12f03ee606 libnvdimm for 4.3:
1/ Introduce ZONE_DEVICE and devm_memremap_pages() as a generic
    mechanism for adding device-driver-discovered memory regions to the
    kernel's direct map.  This facility is used by the pmem driver to
    enable pfn_to_page() operations on the page frames returned by DAX
    ('direct_access' in 'struct block_device_operations'). For now, the
    'memmap' allocation for these "device" pages comes from "System
    RAM".  Support for allocating the memmap from device memory will
    arrive in a later kernel.
 
 2/ Introduce memremap() to replace usages of ioremap_cache() and
    ioremap_wt().  memremap() drops the __iomem annotation for these
    mappings to memory that do not have i/o side effects.  The
    replacement of ioremap_cache() with memremap() is limited to the
    pmem driver to ease merging the api change in v4.3.  Completion of
    the conversion is targeted for v4.4.
 
 3/ Similar to the usage of memcpy_to_pmem() + wmb_pmem() in the pmem
    driver, update the VFS DAX implementation and PMEM api to provide
    persistence guarantees for kernel operations on a DAX mapping.
 
 4/ Convert the ACPI NFIT 'BLK' driver to map the block apertures as
    cacheable to improve performance.
 
 5/ Miscellaneous updates and fixes to libnvdimm including support
    for issuing "address range scrub" commands, clarifying the optimal
    'sector size' of pmem devices, a clarification of the usage of the
    ACPI '_STA' (status) property for DIMM devices, and other minor
    fixes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJV6Nx7AAoJEB7SkWpmfYgCWyYQAI5ju6Gvw27RNFtPovHcZUf5
 JGnxXejI6/AqeTQ+IulgprxtEUCrXOHjCDA5dkjr1qvsoqK1qxug+vJHOZLgeW0R
 OwDtmdW4Qrgeqm+CPoxETkorJ8wDOc8mol81kTiMgeV3UqbYeeHIiTAmwe7VzZ0C
 nNdCRDm5g8dHCjTKcvK3rvozgyoNoWeBiHkPe76EbnxDICxCB5dak7XsVKNMIVFQ
 NuYlnw6IYN7+rMHgpgpRux38NtIW8VlYPWTmHExejc2mlioWMNBG/bmtwLyJ6M3e
 zliz4/cnonTMUaizZaVozyinTa65m7wcnpjK+vlyGV2deDZPJpDRvSOtB0lH30bR
 1gy+qrKzuGKpaN6thOISxFLLjmEeYwzYd7SvC9n118r32qShz+opN9XX0WmWSFlA
 sajE1ehm4M7s5pkMoa/dRnAyR8RUPu4RNINdQ/Z9jFfAOx+Q26rLdQXwf9+uqbEb
 bIeSQwOteK5vYYCstvpAcHSMlJAglzIX5UfZBvtEIJN7rlb0VhmGWfxAnTu+ktG1
 o9cqAt+J4146xHaFwj5duTsyKhWb8BL9+xqbKPNpXEp+PbLsrnE/+WkDLFD67jxz
 dgIoK60mGnVXp+16I2uMqYYDgAyO5zUdmM4OygOMnZNa1mxesjbDJC6Wat1Wsndn
 slsw6DkrWT60CRE42nbK
 =o57/
 -----END PGP SIGNATURE-----

Merge tag 'libnvdimm-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm updates from Dan Williams:
 "This update has successfully completed a 0day-kbuild run and has
  appeared in a linux-next release.  The changes outside of the typical
  drivers/nvdimm/ and drivers/acpi/nfit.[ch] paths are related to the
  removal of IORESOURCE_CACHEABLE, the introduction of memremap(), and
  the introduction of ZONE_DEVICE + devm_memremap_pages().

  Summary:

   - Introduce ZONE_DEVICE and devm_memremap_pages() as a generic
     mechanism for adding device-driver-discovered memory regions to the
     kernel's direct map.

     This facility is used by the pmem driver to enable pfn_to_page()
     operations on the page frames returned by DAX ('direct_access' in
     'struct block_device_operations').

     For now, the 'memmap' allocation for these "device" pages comes
     from "System RAM".  Support for allocating the memmap from device
     memory will arrive in a later kernel.

   - Introduce memremap() to replace usages of ioremap_cache() and
     ioremap_wt().  memremap() drops the __iomem annotation for these
     mappings to memory that do not have i/o side effects.  The
     replacement of ioremap_cache() with memremap() is limited to the
     pmem driver to ease merging the api change in v4.3.

     Completion of the conversion is targeted for v4.4.

   - Similar to the usage of memcpy_to_pmem() + wmb_pmem() in the pmem
     driver, update the VFS DAX implementation and PMEM api to provide
     persistence guarantees for kernel operations on a DAX mapping.

   - Convert the ACPI NFIT 'BLK' driver to map the block apertures as
     cacheable to improve performance.

   - Miscellaneous updates and fixes to libnvdimm including support for
     issuing "address range scrub" commands, clarifying the optimal
     'sector size' of pmem devices, a clarification of the usage of the
     ACPI '_STA' (status) property for DIMM devices, and other minor
     fixes"

* tag 'libnvdimm-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (34 commits)
  libnvdimm, pmem: direct map legacy pmem by default
  libnvdimm, pmem: 'struct page' for pmem
  libnvdimm, pfn: 'struct page' provider infrastructure
  x86, pmem: clarify that ARCH_HAS_PMEM_API implies PMEM mapped WB
  add devm_memremap_pages
  mm: ZONE_DEVICE for "device memory"
  mm: move __phys_to_pfn and __pfn_to_phys to asm/generic/memory_model.h
  dax: drop size parameter to ->direct_access()
  nd_blk: change aperture mapping from WC to WB
  nvdimm: change to use generic kvfree()
  pmem, dax: have direct_access use __pmem annotation
  dax: update I/O path to do proper PMEM flushing
  pmem: add copy_from_iter_pmem() and clear_pmem()
  pmem, x86: clean up conditional pmem includes
  pmem: remove layer when calling arch_has_wmb_pmem()
  pmem, x86: move x86 PMEM API to new pmem.h header
  libnvdimm, e820: make CONFIG_X86_PMEM_LEGACY a tristate option
  pmem: switch to devm_ allocations
  devres: add devm_memremap
  libnvdimm, btt: write and validate parent_uuid
  ...
2015-09-08 14:35:59 -07:00
d194cd1dd1 rbd: plug rbd_dev->header.object_prefix memory leak
Need to free object_prefix when rbd_dev_v2_snap_context() fails, but
only if this is the first time we are reading in the header.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Alex Elder <elder@linaro.org>
2015-09-08 23:14:30 +03:00
3ebe138ac6 rbd: fix double free on rbd_dev->header_name
If rbd_dev_image_probe() in rbd_dev_probe_parent() fails, header_name
is freed twice: once in rbd_dev_probe_parent() and then in its caller
rbd_dev_image_probe() (rbd_dev_image_probe() is called recursively to
handle parent images).

rbd_dev_probe_parent() is responsible for probing the parent, so it
shouldn't muck with clone's fields.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Alex Elder <elder@linaro.org>
2015-09-08 23:14:29 +03:00
752240e74d xen: features and fixes for 4.3-rc0
- Convert xen-blkfront to the multiqueue API
 - [arm] Support binding event channels to different VCPUs.
 - [x86] Support > 512 GiB in a PV guests (off by default as such a
   guest cannot be migrated with the current toolstack).
 - [x86] PMU support for PV dom0 (limited support for using perf with
   Xen and other guests).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJV7wIdAAoJEFxbo/MsZsTR0hEH/04HTKLKGnSJpZ5WbMPxqZxE
 UqGlvhvVWNAmFocZmbPcEi9T1qtcFrX5pM55JQr6UmAp3ovYsT2q1Q1kKaOaawks
 pSfc/YEH3oQW5VUQ9Lm9Ru5Z8Btox0WrzRREO92OF36UOgUOBOLkGsUfOwDinNIM
 lSk2djbYwDYAsoeC3PHB32wwMI//Lz6B/9ZVXcyL6ULynt1ULdspETjGnptRPZa7
 JTB5L4/soioKOn18HDwwOhKmvaFUPQv9Odnv7dc85XwZreajhM/KMu3qFbMDaF/d
 WVB1NMeCBdQYgjOrUjrmpyr5uTMySiQEG54cplrEKinfeZgKlEyjKvjcAfJfiac=
 =Ktjl
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.3-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen updates from David Vrabel:
 "Xen features and fixes for 4.3:

   - Convert xen-blkfront to the multiqueue API
   - [arm] Support binding event channels to different VCPUs.
   - [x86] Support > 512 GiB in a PV guests (off by default as such a
     guest cannot be migrated with the current toolstack).
   - [x86] PMU support for PV dom0 (limited support for using perf with
     Xen and other guests)"

* tag 'for-linus-4.3-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (33 commits)
  xen: switch extra memory accounting to use pfns
  xen: limit memory to architectural maximum
  xen: avoid another early crash of memory limited dom0
  xen: avoid early crash of memory limited dom0
  arm/xen: Remove helpers which are PV specific
  xen/x86: Don't try to set PCE bit in CR4
  xen/PMU: PMU emulation code
  xen/PMU: Intercept PMU-related MSR and APIC accesses
  xen/PMU: Describe vendor-specific PMU registers
  xen/PMU: Initialization code for Xen PMU
  xen/PMU: Sysfs interface for setting Xen PMU mode
  xen: xensyms support
  xen: remove no longer needed p2m.h
  xen: allow more than 512 GB of RAM for 64 bit pv-domains
  xen: move p2m list if conflicting with e820 map
  xen: add explicit memblock_reserve() calls for special pages
  mm: provide early_memremap_ro to establish read-only mapping
  xen: check for initrd conflicting with e820 map
  xen: check pre-allocated page tables for conflict with memory map
  xen: check for kernel memory conflicting with memory layout
  ...
2015-09-08 11:46:48 -07:00
0df4f266b3 xen: Use correctly the Xen memory terminologies
Based on include/xen/mm.h [1], Linux mistakenly uses MFN when GFN is
meant; I suspect this is because the first Xen support was for PV.
This resulted in some misimplemented helpers on ARM and confused
developers about the expected behavior.

For instance, with pfn_to_mfn we expect to get an MFN, based on the name.
However, if we look at the implementation on x86, it actually returns a GFN.

For clarity and to avoid new confusion, replace any reference to mfn
with gfn in all helpers used by PV drivers. The x86 code will still
keep some references to pfn_to_mfn, which may be used by all kinds of
guests. No changes have been made to the hypercall fields, even though
they may be invalid, in order to stay in line with the definitions in
the Xen repo.

Note that page_to_mfn has been renamed to xen_page_to_gfn to avoid a
name too close to the KVM function gfn_to_page.

Also take the opportunity to simplify constructions such as
pfn_to_mfn(page_to_pfn(page)) into xen_page_to_gfn. More complex
clean-ups will come in follow-up patches.

[1] http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=e758ed14f390342513405dd766e874934573e6cb
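
Illustrative before/after of the simplification mentioned above:

	/* before */
	frame = pfn_to_mfn(page_to_pfn(page));
	/* after */
	frame = xen_page_to_gfn(page);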

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-09-08 18:03:49 +01:00