Currently we get the l1 entry and l2 entry for every cluster, and get the host offset of the cluster and the status (allocated, compressed, zero). Then we read each cluster into the user buffer. This is repeated until the user buffer is full.
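A minimal sketch of this per-cluster loop, just to make the cost concrete; the image struct, the lookup helper, and the status constants are hypothetical stand-ins, not the actual API:

```go
import "io"

// Hypothetical stand-ins for the reader's internals.
type image struct {
	file        io.ReaderAt
	clusterSize int64
	// lookup walks the l1 and l2 tables for one guest offset and
	// returns the cluster's host offset and status.
	lookup func(off int64) (hostOff int64, status int)
}

const (
	unallocated = iota
	allocated
	zeroCluster
	compressed
)

// readAtNaive reads one cluster at a time: an l1/l2 walk and a
// separate ReadAt (or zero fill) per cluster, even when many
// consecutive clusters share the same status and host location.
func (img *image) readAtNaive(buf []byte, off int64) error {
	for len(buf) > 0 {
		hostOff, status := img.lookup(off)
		in := off % img.clusterSize // offset inside the cluster
		n := int(min(img.clusterSize-in, int64(len(buf))))
		switch status {
		case allocated:
			if _, err := img.file.ReadAt(buf[:n], hostOff+in); err != nil {
				return err
			}
		case compressed:
			// read and decompress one cluster (omitted here)
		default: // zero or unallocated: cleared cluster by cluster
			clear(buf[:n])
		}
		buf = buf[n:]
		off += int64(n)
	}
	return nil
}
```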
Issues
This is not efficient for reading data - we want to do one ReadAt for multiple clusters when the cluster host offsets are consecutive.
This is not efficient for zero clusters - we want to get the status once for multiple clusters.
This is extremely inefficient for unallocated clusters, which have no l2 entry. We can do one status check per l1 entry (512 MiB with the default cluster size).
Better implementation
l1 entry without an l2 offset
There is no l2 table, so we know that we have 8192 unallocated clusters: with the default 64 KiB cluster size an l2 table holds 65536/8 = 8192 entries, one per cluster, covering 512 MiB.
Example:
l1index = 0
offset = 0
length = 512 MiB
When reading, we can get the status once for the entire buffer and fill it with zeros - no need to loop over clusters.
When getting extents, perform one l1 entry check instead of 8192. When using a 32 MiB segment, perform one l1 entry check per segment instead of 512 checks.
When we have a backing file, we need to delegate the call to the backing file. If the backing file's l1 entry is also unallocated, this adds one extra l1 entry check for every backing file in the chain.
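A sketch of this fast path; the l1 entry layout (l2 table offset in bits 9-55) follows the qcow2 spec, while the function names are made up for illustration:

```go
// Bits 9-55 of an l1 entry hold the host offset of the l2 table
// (qcow2 spec); zero means the l2 table is not allocated.
const l1TableOffsetMask = uint64(0x00fffffffffffe00)

// l1Coverage returns how many guest bytes one l1 entry maps: an l2
// table holds clusterSize/8 entries of one cluster each, so 64 KiB
// clusters give 8192 * 64 KiB = 512 MiB.
func l1Coverage(clusterSize int64) int64 {
	return clusterSize * (clusterSize / 8)
}

// unallocatedL1 reports whether the l1 entry has no l2 table. The
// caller can then emit one extent of l1Coverage bytes (clamped to the
// requested range) instead of checking 8192 clusters, delegating to
// the backing file if there is one.
func unallocatedL1(l1Entry uint64) bool {
	return l1Entry&l1TableOffsetMask == 0
}
```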
l2 entry - unallocated
We have an l2 table, but this cluster is unallocated.
If we have a backing file, get the status from the backing file.
Iterate over the next l2 entries and the backing file result, stopping at the next entry with a different status or at the end of the backing file result.
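A sketch of the run detection; the bit layout (compressed flag in bit 62, host offset in bits 9-55, all-zeros flag in bit 0) follows the qcow2 spec, and clamping the run against the backing file result is left to the caller:

```go
const (
	l2CompressedFlag = uint64(1) << 62              // qcow2: compressed cluster
	l2ZeroFlag       = uint64(1)                    // qcow2: cluster reads as zeros
	l2HostOffsetMask = uint64(0x00fffffffffffe00)   // qcow2: bits 9-55
)

// unallocatedRun counts consecutive l2 entries that are unallocated
// (no host offset, not compressed, not zero). The caller resolves the
// run against the backing file once, truncating it where the backing
// file extent ends or its status changes.
func unallocatedRun(l2 []uint64) int {
	n := 0
	for _, e := range l2 {
		if e&(l2CompressedFlag|l2ZeroFlag) != 0 || e&l2HostOffsetMask != 0 {
			break
		}
		n++
	}
	return n
}
```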
l2 entry - allocated
Iterate over the next l2 entries with the same status bits and extract the host offset. Stop when a cluster's host offset is in a different location on disk.
When reading, do one ReadAt for multiple clusters. With a 1 MiB buffer, read up to 16 clusters in the same call.
OS images are never fragmented, but they are usually compressed. Real images created with qemu-img convert without the -W option are never fragmented. Real images written by a guest are likely to be fragmented.
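A sketch of the allocated run detection, reusing the masks from the previous sketch; the caller then issues a single ReadAt of run * clusterSize bytes at the first host offset:

```go
// contiguousRun counts how many l2 entries, starting from l2[0]
// (which must be an allocated, uncompressed entry), are allocated and
// physically consecutive on disk, so one ReadAt can serve all of them.
func contiguousRun(l2 []uint64, clusterSize int64) int {
	first := int64(l2[0] & l2HostOffsetMask)
	n := 1
	for _, e := range l2[1:] {
		if e&(l2CompressedFlag|l2ZeroFlag) != 0 {
			break // different status: stop the run
		}
		host := int64(e & l2HostOffsetMask)
		if host == 0 || host != first+int64(n)*clusterSize {
			break // unallocated, or allocated elsewhere on disk
		}
		n++
	}
	return n
}
```

With a 1 MiB user buffer the run is capped at 16 clusters anyway, so a single call covers the whole buffer whenever the image is not fragmented.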
l2 entry - zero
Iterate over the next l2 entries with the same status bits. Stop when the status bits are different.
When reading, get the status once for up to 16 clusters with a 1 MiB buffer.
When getting extents, get the status once per segment (32 MiB).
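The zero run is the same idea without the host offset check, again sketched on top of the masks above:

```go
// zeroRun counts consecutive l2 entries with the zero flag set, so
// the reader clears up to run*clusterSize bytes of the buffer in one
// pass instead of cluster by cluster.
func zeroRun(l2 []uint64) int {
	n := 0
	for _, e := range l2 {
		if e&l2ZeroFlag == 0 || e&l2CompressedFlag != 0 {
			break
		}
		n++
	}
	return n
}
```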
l2 entry - compressed
No way to optimize this, every cluster must be read and decompressed separately.
l2 extended entry
More complicated: currently we don't even cache extended l2 entries, and we read one sub-cluster per read, so this is extremely slow.
Hopefully this can work like standard clusters, but we need to iterate over sub-clusters instead of clusters.
When reading, we can do one status check per buffer of 16 clusters, and read up to 16 clusters in one read.
We can work on this later.
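When we do, the sub-cluster bitmaps should allow the same run-based approach one level down. Per the qcow2 spec, the second 64-bit word of an extended l2 entry holds the allocation status of the 32 sub-clusters in bits 0-31 and the zero status in bits 32-63; the helper below is a hypothetical sketch on top of that:

```go
// subclusterRun returns how many of the 32 sub-clusters, starting at
// index start, share the allocation and zero status of the first one,
// given the second 64-bit word of an extended l2 entry.
func subclusterRun(bitmap uint64, start int) int {
	alloc := func(i int) bool { return bitmap&(1<<uint(i)) != 0 }
	zero := func(i int) bool { return bitmap&(1<<uint(32+i)) != 0 }
	n := 1
	for i := start + 1; i < 32; i++ {
		if alloc(i) != alloc(start) || zero(i) != zero(start) {
			break
		}
		n++
	}
	return n
}
```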