Ray Of Hope

QEMU + CPU Hotunplug design

The basic design decision we took while designing CPU hot-unplug for QEMU is that the kernel-mode fd (in KVM) associated with the user-mode vcpu is not deleted. It remains in place. Only the mmapped memory associated with the user-mode thread (vcpu) is removed from the kernel.

When we get a CPU-deletion command, pc.c:pc_hot_del_cpu(const int64_t id, Error **errp) receives the deletion request, with the APIC ID as one of its parameters. pc_hot_del_cpu looks up the device class for that specific CPU and calls parent_unrealize.

qom/cpu.c:cpu_common_unrealizefn() is then called, and it fires a notify event for UNPLUG. The piix4 virtual motherboard has registered for hotplug/hot-unplug notification events for the virtual devices connected to it (i.e. connected to the chip). As a result, it receives the notification event for the CPU unplug.

The piix4_cpu_hotplug_req callback is called when the event arrives. In this function, AcpuHotplug_Handle is called first; it updates the GPE associated with the CPU. Here we are basically updating the ACPI table. (Earlier the ACPI tables were part of the BIOS; now they are part of QEMU.) The GPE is an 8-byte bitmap where each bit represents a CPU. The 8-byte GPE area is registered as an I/O-mapped region, and any reads and writes to it are trapped and redirected to cpu_hotplug.c. After the GPE area is updated, acpi_update_sci is called, which sends an interrupt request to the guest asking it to release the vcpu and stop using it.
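To make the "one bit per CPU" layout concrete, here is a minimal sketch of such a bitmap. It is not QEMU code: the type CpuStatusMap, the helpers cpu_map_init/cpu_map_set/cpu_map_test and the constant CPU_STS_LEN are hypothetical names, and the 8-byte length simply follows the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Assumption: 8 bytes, as described in the text above. */
#define CPU_STS_LEN 8

/* Hypothetical model of the status area: one bit per CPU, 1 = present. */
typedef struct {
    uint8_t sts[CPU_STS_LEN];
} CpuStatusMap;

static inline void cpu_map_init(CpuStatusMap *m)
{
    assert(m != NULL);
    memset(m->sts, 0, sizeof(m->sts));
}

/* Mark a CPU present or absent; returns false if the id does not fit. */
static inline bool cpu_map_set(CpuStatusMap *m, unsigned id, bool present)
{
    assert(m != NULL);
    if (id >= CPU_STS_LEN * 8) {
        return false;
    }
    if (present) {
        m->sts[id / 8] |= (uint8_t)(1u << (id % 8));
    } else {
        m->sts[id / 8] &= (uint8_t)~(1u << (id % 8));
    }
    return true;
}

/* Report whether the bit for this CPU is currently set. */
static inline bool cpu_map_test(const CpuStatusMap *m, unsigned id)
{
    assert(m != NULL);
    return id < CPU_STS_LEN * 8 && ((m->sts[id / 8] >> (id % 8)) & 1u);
}
```

With this layout, CPU 9 lives in byte 1, bit 1, so unplugging it means clearing that single bit before the interrupt is raised.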

As I mentioned, when the GPE is updated, the write is trapped in cpu_hotplug.c by the callback function cpu_status_write. cpu_hotplug.c:cpu_status_write checks whether the write relates to CPU deletion or CPU addition. If it is a CPU deletion, it calls acpi_eject_vcpu.
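One way such a write handler could tell addition from deletion is to compare the old and new value of the byte being written. The sketch below assumes exactly that (a bit going from 1 to 0 means "eject this CPU"); the real cpu_status_write may decide differently, and classify_bit, cpus_to_eject and CpuEvent are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { CPU_EV_NONE, CPU_EV_ADD, CPU_EV_DEL } CpuEvent;

/* Hypothetical helper: classify what happened to one bit of a byte write. */
static inline CpuEvent classify_bit(uint8_t old_val, uint8_t new_val,
                                    unsigned bit)
{
    assert(bit < 8);
    unsigned was = (old_val >> bit) & 1u;
    unsigned now = (new_val >> bit) & 1u;

    if (was == now) {
        return CPU_EV_NONE;
    }
    return now ? CPU_EV_ADD : CPU_EV_DEL;
}

/*
 * Collect the ids of all CPUs whose bit was cleared by a write to the
 * byte at byte_offset. Returns how many ids were stored in out_ids.
 */
static inline unsigned cpus_to_eject(unsigned byte_offset, uint8_t old_val,
                                     uint8_t new_val, unsigned out_ids[8])
{
    unsigned n = 0;

    for (unsigned bit = 0; bit < 8; bit++) {
        if (classify_bit(old_val, new_val, bit) == CPU_EV_DEL) {
            out_ids[n++] = byte_offset * 8 + bit;
        }
    }
    return n;
}
```

Under this assumption, the handler would call the eject path once per id returned by cpus_to_eject and ignore bits that did not change.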

acpi_eject_vcpu: sets the cpu->stop and cpu->exit flags to true and calls qemu_cpu_kick().

In qemu_cpu_kick, a halt message is broadcast and the user-mode thread (vcpu) is signalled with pthread_kill (which delivers a signal to the thread rather than terminating it outright). As a result of the halt broadcast, i.e. the release of the halt event, the thread's entry function (qemu_kvm_cpu_thread_fn), which was waiting in an endless while loop, gets the halt event.

qemu_kvm_cpu_thread_fn receives the halt event and calls qemu_kvm_destroy_vcpu and qemu_mutex_unlock.

qemu_kvm_destroy_vcpu: calls kvm_destroy_vcpu, which unmaps the kernel memory associated with this user-mode thread and inserts the kernel-mode fd that was associated with the thread into the kvm_parked_vcpus linked list. This kernel fd can be used again later, and we save some context switches by avoiding the ioctls needed to create a new kernel-mode fd.
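The parking idea can be sketched with a plain singly linked list. This is only an illustration of the technique, not the actual kvm_parked_vcpus code: ParkedVcpu, park_vcpu and take_parked_vcpu are hypothetical names, and the fds are treated as opaque integers.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical list node: remembers which fd belonged to which vcpu id. */
typedef struct ParkedVcpu {
    unsigned long vcpu_id;
    int kvm_fd;
    struct ParkedVcpu *next;
} ParkedVcpu;

/* On unplug: keep the fd around instead of closing it. Returns 0 on success. */
static inline int park_vcpu(ParkedVcpu **head, unsigned long vcpu_id,
                            int kvm_fd)
{
    assert(head != NULL);
    ParkedVcpu *p = malloc(sizeof(*p));
    if (p == NULL) {
        return -1;
    }
    p->vcpu_id = vcpu_id;
    p->kvm_fd = kvm_fd;
    p->next = *head;
    *head = p;
    return 0;
}

/*
 * On a later hotplug: reuse the parked fd for this vcpu id if there is
 * one. Returns the fd and removes the entry, or -1 if nothing is parked,
 * in which case the caller would have to create a fresh vcpu fd.
 */
static inline int take_parked_vcpu(ParkedVcpu **head, unsigned long vcpu_id)
{
    assert(head != NULL);
    for (ParkedVcpu **pp = head; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->vcpu_id == vcpu_id) {
            ParkedVcpu *hit = *pp;
            int fd = hit->kvm_fd;
            *pp = hit->next;
            free(hit);
            return fd;
        }
    }
    return -1;
}
```

The lookup on re-plug is a short list walk in user space, which is the saving the design is after: no extra round trip into the kernel to create a new fd for a vcpu id that has been seen before.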

Anshul Makkar
mail query: anshul_makkar@justkernel.com
