Information to collect when debugging nested KVM

Debugging nested KVM problems involves going through log files across multiple levels, and it can get tedious with NEEDINFOs (on Bugzilla) bouncing back and forth. So here is some information that will hopefully be useful.

Terminology

  • L0: bare-metal host
  • L1: level-1 guest (or guest hypervisor)
  • L2: level-2 guest (or nested guest)

Spell out that you are actually “nesting” guests

First, if you’re running any kind of “nesting” at all, please spell that out explicitly (bonus marks if you show evidence; more on that below). Unfortunately, this needs to be spelled out because, when filing bugs, people often don’t even _mention_ that they’re using nested virt.

Ensure you are actually running KVM on KVM

Sometimes people don’t have KVM enabled in their L1 guest, so they run with pure emulation (what QEMU calls “TCG”) while thinking they’re running nested KVM. This confuses “nested virt” (QEMU on KVM) with “nested KVM” (KVM on KVM).

You can confirm that you are _actually_ running KVM on KVM by checking for the presence of the file /dev/kvm in your L1.
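
As a minimal sketch of that check, the shell function below tests for the device node and reports the result. The function name kvm_available and its optional path argument are hypothetical, introduced here only for illustration; the only fact taken from the text above is that /dev/kvm should exist in L1.

```shell
#!/bin/sh
# Report whether the KVM device node exists. Run this inside the L1 guest.
# kvm_available is a hypothetical helper name; the optional first argument
# overrides the device path and defaults to /dev/kvm.
kvm_available() {
    dev="${1:-/dev/kvm}"
    if [ -e "$dev" ]; then
        echo "KVM device present: $dev"
        return 0
    fi
    echo "KVM device missing: $dev (no KVM acceleration for L2 guests)"
    return 1
}
```

Calling kvm_available with no arguments inside L1 checks /dev/kvm; a non-zero exit status suggests that any L2 guest is running under TCG rather than KVM.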

For notes on configuring nested KVM, refer to the “Procedure to configure nested virtualization” document.

What information to collect

Some of this information is available via ‘sosreport’, but it is spelled out here for clarity. Collecting the following details, saving them in a text file, and posting it somewhere would be a good starting point:

  • Kernel, libvirt, and QEMU versions from L0
  • Kernel, libvirt, and QEMU versions from L1
  • QEMU command-line of L1 – preferably full log from /var/log/libvirt/qemu/instance.log
  • QEMU command-line of L2 – preferably full log from /var/log/libvirt/qemu/instance.log
  • Full dmesg output from L0
  • Full dmesg output from L1
  • Output of: x86info -a (and lscpu) from L0
  • Output of: x86info -a (and lscpu) from L1
  • Output of: dmidecode from L0
  • Output of: dmidecode from L1
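
The list above could be gathered with a small script along the following lines, run once on L0 and once inside L1. This is only a sketch under stated assumptions: the helper names capture and collect_level and the output file names are hypothetical, and the binaries used to print versions (libvirtd and qemu-system-x86_64) are assumptions that vary by distribution. dmesg and dmidecode usually need root. A command that is missing or fails is noted in its output file instead of stopping the run.

```shell
#!/bin/sh
# capture DIR NAME CMD [ARGS...]
# Run CMD and save its stdout and stderr to DIR/NAME.txt. A missing or
# failing command is recorded in the file rather than aborting the script.
capture() {
    cap_dir="$1"; cap_name="$2"; shift 2
    {
        printf '$ %s\n' "$*"
        "$@" 2>&1 || echo "(command failed or not installed: $1)"
    } > "$cap_dir/$cap_name.txt"
}

# collect_level LEVEL [OUTDIR] [GUEST_LOG]
# LEVEL is just a label ("L0" or "L1"). GUEST_LOG is an optional path to a
# libvirt QEMU log, which contains the guest's QEMU command line.
collect_level() {
    level="$1"
    outdir="${2:-nested-kvm-$level}"
    mkdir -p "$outdir"
    capture "$outdir" kernel-version  uname -r
    # Assumption: the binary names below differ between distributions.
    capture "$outdir" libvirt-version libvirtd --version
    capture "$outdir" qemu-version    qemu-system-x86_64 --version
    capture "$outdir" dmesg           dmesg
    capture "$outdir" x86info         x86info -a
    capture "$outdir" lscpu           lscpu
    capture "$outdir" dmidecode       dmidecode
    # Copy the guest log only if a readable path was supplied.
    if [ -n "${3:-}" ] && [ -r "$3" ]; then
        cp "$3" "$outdir/"
    fi
    echo "$outdir"
}
```

For example, running collect_level L0 on the bare-metal host and collect_level L1 inside the guest hypervisor leaves one directory per level that can be archived and attached to the bug. Passing the path of the relevant log under /var/log/libvirt/qemu/ as the third argument also copies the QEMU command line of the guest started from that level.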