Ray Of Hope
brain-dump: ballooning bug.
For the past 3-4 days I have been working on a problem where, after a series of ballooning operations in the range of 512 MiB to 30 GiB, the guest (Ubuntu 16.04) crashes.
[ 193.432063] Freezing remaining freezable tasks ...
[ 198.032804] ata2.01: qc timeout (cmd 0xa1)
[ 198.032815] ata2.01: failed to IDENTIFY (I/O error, err_mask=0x4)
[ 198.032819] ata2.01: revalidation failed (errno=-5)
[ 198.033571] ata2: soft resetting link
[ 208.188659] ata2.01: qc timeout (cmd 0xa1)
[ 208.188670] ata2.01: failed to IDENTIFY (I/O error, err_mask=0x4)
[ 208.188674] ata2.01: revalidation failed (errno=-5)
[ 208.189367] ata2: soft resetting link
[ 213.436080] Freezing of tasks failed after 20.003 seconds (0 tasks refusing to freeze, wq_busy=1):
[ 213.436097] Restarting kernel threads ... done.
[ 213.437318] xen:manage: do_suspend: freeze kernel threads failed -16
Looking at the logs, I got two important hints.
1) The ata driver is missing some commands or interrupts.
2) Tasks are failing to freeze during the suspend phase, so something is hogging the CPUs.
Following these hints, I first looked at the ata driver in the guest and the QEMU emulation on the host. With debug traces I found that the ata driver in the guest is returning sense 0x3a and asc = 2, meaning the DVD-ROM was not found. I also found that after missing the I/O commands/IRQs and returning an error status, the guest kernel retries the command to identify the DVD-ROM. It tries 2-3 times and then gives up. So this doesn’t look like the reason for the crash.
I also instrumented QEMU to check whether it was sending the interrupt/command at the point of the crash, and it was.
With that analysis in mind, I went after the second hint, where the tasks are failing to freeze. Using top and perf, I found that the kernel worker handling the balloon operation is hogging 100% of the guest CPU and keeping everything else off it.
Points that I noted:
1) The Ubuntu kernel is configured for voluntary preemption.
2) The balloon_process function in drivers/xen/balloon.c runs with max_retries set to UNLIMITED.
3) balloon_process calls cond_resched (which lets the currently running thread give other runnable threads a chance to run), but the other threads are still not getting scheduled. This looks like a guest kernel issue.
One proposed solution was to limit max_retries or to return after ballooning for 10 ms. I rejected both because they would be hacks. cond_resched should work and should schedule other threads if any are waiting. In balloon_process we also release the spinlock before calling cond_resched, so this should not be a deadlock scenario either.
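To make the two rejected workarounds concrete, here is a rough userspace model of a worker loop that stops on a retry cap or a time budget. It is only a sketch and not the code in drivers/xen/balloon.c: run_bounded, step_fn, clock_fn and the sim helpers are hypothetical names made up for illustration, and the clock is simulated so the behaviour is deterministic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Why the loop stopped. */
enum stop_reason { STOP_DONE, STOP_RETRIES, STOP_BUDGET };

struct run_result {
    enum stop_reason reason;
    unsigned long done;     /* steps that succeeded */
    unsigned long retries;  /* steps that failed */
};

/* Hypothetical callbacks: one unit of balloon work, and a monotonic clock. */
typedef bool (*step_fn)(void *ctx);
typedef int64_t (*clock_fn)(void *ctx);

/*
 * Run `step` until `target` steps have succeeded, giving up early when a
 * bound is hit. max_retries == 0 means unlimited retries and budget_ns == 0
 * means no time budget. With both set to 0 and a step that never succeeds,
 * the loop never returns, which is the CPU-hogging case.
 */
struct run_result run_bounded(step_fn step, clock_fn now, void *ctx,
                              unsigned long target,
                              unsigned long max_retries, int64_t budget_ns)
{
    assert(step != NULL && now != NULL);

    struct run_result r = { STOP_DONE, 0, 0 };
    const int64_t start = now(ctx);

    while (r.done < target) {
        /* Workaround 2: return once the time budget is used up. */
        if (budget_ns > 0 && now(ctx) - start >= budget_ns) {
            r.reason = STOP_BUDGET;
            break;
        }
        if (step(ctx)) {
            r.done++;
            continue;
        }
        /* Workaround 1: cap the number of failed attempts. */
        r.retries++;
        if (max_retries > 0 && r.retries >= max_retries) {
            r.reason = STOP_RETRIES;
            break;
        }
    }
    return r;
}

/* Simulated environment: every step costs step_cost_ns of fake time and
 * succeeds only while `available` units of work remain. */
struct sim {
    int64_t now_ns;
    int64_t step_cost_ns;
    unsigned long available;
};

bool sim_step(void *ctx)
{
    struct sim *s = ctx;
    s->now_ns += s->step_cost_ns;
    if (s->available == 0)
        return false;
    s->available--;
    return true;
}

int64_t sim_clock(void *ctx)
{
    const struct sim *s = ctx;
    return s->now_ns;
}
```

Calling run_bounded(sim_step, sim_clock, &s, 100, 0, 10000000) with 1 ms steps and only 5 units available stops with STOP_BUDGET after 10 ms of simulated time instead of spinning. Either bound only hides the symptom, though: the worker still never finishes, and the question of why cond_resched does not let other threads run stays open.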
Let me see how time permits. Either I will dig into this further and try to fix the Ubuntu kernel, or I will file a bug with Ubuntu.