docs/devel: Document VFIO device dirty page tracking

Adjust the VFIO dirty page tracking documentation and add a section to
describe device dirty page tracking.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20230307125450.62409-16-joao.m.martins@oracle.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
This commit is contained in:
Avihai Horon 2023-03-07 12:54:50 +00:00 committed by Alex Williamson
parent 95b29658b6
commit 333f988d19

View file

@ -59,22 +59,37 @@ System memory dirty pages tracking
----------------------------------
A ``log_global_start`` and ``log_global_stop`` memory listener callback informs
the VFIO IOMMU module to start and stop dirty page tracking. A ``log_sync``
memory listener callback marks those system memory pages as dirty which are
used for DMA by the VFIO device. The dirty pages bitmap is queried per
container. All pages pinned by the vendor driver through external APIs have to
be marked as dirty during migration. When there are CPU writes, CPU dirty page
tracking can identify dirtied pages, but any page pinned by the vendor driver
can also be written by the device. There is currently no device or IOMMU
support for dirty page tracking in hardware.
the VFIO dirty tracking module to start and stop dirty page tracking. A
``log_sync`` memory listener callback queries the dirty page bitmap from the
dirty tracking module and marks system memory pages which were DMA-ed by the
VFIO device as dirty. The dirty page bitmap is queried per container.
Currently there are two ways dirty page tracking can be done:
(1) Device dirty tracking:
In this method the device is responsible to log and report its DMAs. This
method can be used only if the device is capable of tracking its DMAs.
Discovering device capability, starting and stopping dirty tracking, and
syncing the dirty bitmaps from the device are done using the DMA logging uAPI.
More info about the uAPI can be found in the comments of the
``vfio_device_feature_dma_logging_control`` and
``vfio_device_feature_dma_logging_report`` structures in the header file
linux-headers/linux/vfio.h.
(2) VFIO IOMMU module:
In this method dirty tracking is done by IOMMU. However, there is currently no
IOMMU support for dirty page tracking. For this reason, all pages are
perpetually marked dirty, unless the device driver pins pages through external
APIs in which case only those pinned pages are perpetually marked dirty.
If the above two methods are not supported, all pages are perpetually marked
dirty by QEMU.
By default, dirty pages are tracked during pre-copy as well as stop-and-copy
phase. So, a page pinned by the vendor driver will be copied to the destination
in both phases. Copying dirty pages in pre-copy phase helps QEMU to predict if
it can achieve its downtime tolerances. If QEMU during pre-copy phase keeps
finding dirty pages continuously, then it understands that even in stop-and-copy
phase, it is likely to find dirty pages and can predict the downtime
accordingly.
phase. So, a page marked as dirty will be copied to the destination in both
phases. Copying dirty pages in pre-copy phase helps QEMU to predict if it can
achieve its downtime tolerances. If QEMU during pre-copy phase keeps finding
dirty pages continuously, then it understands that even in stop-and-copy phase,
it is likely to find dirty pages and can predict the downtime accordingly.
QEMU also provides a per device opt-out option ``pre-copy-dirty-page-tracking``
which disables querying the dirty bitmap during pre-copy phase. If it is set to
@ -89,7 +104,8 @@ phase of migration. In that case, the unmap ioctl returns any dirty pages in
that range and QEMU reports corresponding guest physical pages dirty. During
stop-and-copy phase, an IOMMU notifier is used to get a callback for mapped
pages and then dirty pages bitmap is fetched from VFIO IOMMU modules for those
mapped ranges.
mapped ranges. If device dirty tracking is enabled with vIOMMU, live migration
will be blocked.
Flow of state changes during Live migration
===========================================