Test selective enabling/disabling of CPU flags in Nova
Setup
My host processor is "Intel(R) Xeon(R) CPU E5-2609 v3 @ 1.90GHz": https://ark.intel.com/content/www/us/en/ark/products/81897/intel-xeon-processor-e5-2609-v3-15m-cache-1-90-ghz.html
The hardware itself does not support Intel TSX. (Refer to annex at the bottom for `lscpu` output.)
My Compute "host" is an all-in-one DevStack (Fedora 32), running in a level-1 VM with 'host-passthrough'; so a Nova instance will be a nested (i.e. level-2) guest.
And Nova is running with this patch: https://review.opendev.org/c/openstack/nova/+/774240. The git describe output:
[nova] $> git describe
22.0.0-483-gd897ce2a54
Test procedure
Assuming you've got default DevStack configured with:
ENABLED_SERVICES=key,n-api,n-cpu,n-cond,n-sch,n-novnc,n-api-meta,placement-api,placement-client,g-api,g-reg,q-svc,q-dhcp,q-meta,q-agt,q-l3,horizon,rabbit,mysql
(And an SSH keypair generated, and a CirrOS image imported into Glance.)
- Edit nova.conf as shown in one of the several test variants
- Restart Nova service: sudo systemctl restart "devstack@n-cpu"
- Launch a "nano" instance: openstack server create test_vm1 --flavor m1.nano --key-name mykey1 --image cirros-0.5.1-x86_64-disk --net private
- Observe the CPU flags the guest gets
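The last step can be scripted. The following is a minimal sketch, assuming the live guest XML has already been captured as a string (for example from `virsh dumpxml` on the compute node). The helper name `feature_policies` is hypothetical and is not part of Nova or libvirt.

```python
import xml.etree.ElementTree as ET


def feature_policies(domain_xml: str) -> dict:
    """Map each <feature> name under <cpu> to its libvirt policy."""
    root = ET.fromstring(domain_xml)
    # Accept either a full <domain> document or a bare <cpu> element.
    cpu = root if root.tag == "cpu" else root.find("cpu")
    if cpu is None:
        return {}
    return {f.get("name"): f.get("policy") for f in cpu.findall("feature")}


if __name__ == "__main__":
    # Hypothetical, trimmed-down guest XML used only for illustration.
    sample = """
    <domain type='kvm'>
      <cpu mode='custom' match='exact' check='full'>
        <model fallback='forbid'>Nehalem-IBRS</model>
        <feature policy='require' name='pcid'/>
        <feature policy='disable' name='ssbd'/>
      </cpu>
    </domain>
    """
    for name, policy in feature_policies(sample).items():
        print(f"{name}: {policy}")
```

Comparing the returned mapping against the flags requested in nova.conf gives a quick pass/fail check for each test variant.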
Test-1: Enable PCID; disable SSBD
nova.conf was configured with:
[libvirt]
cpu_models = Nehalem-IBRS
cpu_model_extra_flags = +pcid,-ssbd
cpu_mode = custom
virt_type = kvm
NOTE: The Nehalem-IBRS model also automatically includes tow
Resulting (live) guest XML:
[...]
<cpu mode='custom' match='exact' check='full'>
  <model fallback='forbid'>Nehalem-IBRS</model>
  <topology sockets='1' dies='1' cores='1' threads='1'/>
  <feature policy='require' name='pcid'/>
  <feature policy='disable' name='ssbd'/>
  [...]
</cpu>
[...]
Result: The PCID flag is enabled ('require' in libvirt parlance), and the SSBD flag is disabled.
Test-2: Enable three flags ('md-clear', 'pcid', and 'ssbd') but disable two ('pdpe1gb' and 'mtrr')
nova.conf was configured with:
$ grep "\[libvirt\]" -A5 /etc/nova/nova-cpu.conf
[libvirt]
cpu_models = Nehalem-IBRS
cpu_model_extra_flags = +md-clear,+pcid,ssbd,-pdpe1gb,-mtrr
cpu_mode = custom
virt_type = kvm
Resulting (live) guest XML:
[...]
<cpu mode='custom' match='exact' check='full'>
  <model fallback='forbid'>Nehalem-IBRS</model>
  <topology sockets='1' dies='1' cores='1' threads='1'/>
  <feature policy='require' name='md-clear'/>
  <feature policy='require' name='pcid'/>
  <feature policy='require' name='ssbd'/>
  <feature policy='disable' name='pdpe1gb'/>
  <feature policy='disable' name='mtrr'/>
  [...]
</cpu>
Observe:
- The guest correctly gets (as seen from the 'require' XML attribute) the three CPU flags 'md-clear', 'pcid', and 'ssbd'. The last one was specified with neither a "+" nor a "-" prefix, so it gets enabled.
- Both the "pdpe1gb" and "mtrr" CPU flags are marked as 'disable'.
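The prefix rule observed here can be summarised in a few lines. This is an illustrative sketch of the observed mapping only, not Nova's actual implementation; the function name `parse_extra_flags` is hypothetical.

```python
def parse_extra_flags(value: str) -> dict:
    """Map each flag in a cpu_model_extra_flags-style string to a policy.

    '+flag' or a bare 'flag' maps to 'require'; '-flag' maps to 'disable'.
    """
    policies = {}
    for raw in value.split(","):
        flag = raw.strip()
        if not flag:
            continue  # tolerate stray commas
        if flag.startswith("-"):
            policies[flag[1:]] = "disable"
        else:
            # A leading '+' is optional; a bare name is treated the same way.
            policies[flag.lstrip("+")] = "require"
    return policies


if __name__ == "__main__":
    print(parse_extra_flags("+md-clear,+pcid,ssbd,-pdpe1gb,-mtrr"))
```

For the Test-2 value, the sketch yields 'require' for 'md-clear', 'pcid', and 'ssbd', and 'disable' for 'pdpe1gb' and 'mtrr', which matches the guest XML.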
And the QEMU command-line (notice the -cpu bit):
[...]
libvirt version: 6.1.0, package: 4.fc32 (Fedora Project, 2020-06-02-17:50:10, ), qemu version: 4.2.1qemu-4.2.1-1.fc32, kernel: 5.10.13-100.fc32.x86_64, hostname: dstack-f32
LC_ALL=C \
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin \
HOME=/var/lib/libvirt/qemu/domain-5-instance-00000005 \
XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-5-instance-00000005/.local/share \
XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-5-instance-00000005/.cache \
XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-5-instance-00000005/.config \
QEMU_AUDIO_DRV=none \
/usr/bin/qemu-system-x86_64 \
-name guest=instance-00000005,debug-threads=on \
-S \
-object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-5-instance-00000005/master-key.aes \
-machine pc-i440fx-4.2,accel=kvm,usb=off,dump-guest-core=off \
-cpu Nehalem-IBRS,md-clear=on,pcid=on,ssbd=on,pdpe1gb=off,mtrr=off \
-m 64 \
-overcommit mem-lock=off \
-smp 1,sockets=1,dies=1,cores=1,threads=1 \
-uuid c69e8c13-6b84-4347-8661-89ee03527af5 \
-smbios 'type=1,manufacturer=OpenStack Foundation,product=OpenStack Nova,version=22.1.0,serial=c69e8c13-6b84-4347-8661-89ee03527af5,uuid=c69e8c13-6b84-4347-8661-89ee03527af5,family=Virtual Machine' \
-no-user-config \
-nodefaults \
-chardev socket,id=charmonitor,fd=46,server,nowait \
-mon chardev=charmonitor,id=monitor,mode=control \
-rtc base=utc,driftfix=slew \
-global kvm-pit.lost_tick_policy=delay \
-no-hpet \
-no-shutdown \
-boot strict=on \
-blockdev '{"driver":"file","filename":"/home/stack/src/cloud/data/nova/instances/_base/8e147458643d240e4b578acf7e84b6785aa4225c","node-name":"libvirt-2-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-2-format","read-only":true,"cache":{"direct":true,"no-flush":false},"driver":"raw","file":"libvirt-2-storage"}' \
-blockdev '{"driver":"file","filename":"/home/stack/src/cloud/data/nova/instances/c69e8c13-6b84-4347-8661-89ee03527af5/disk","node-name":"libvirt-1-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-1-format","read-only":false,"cache":{"direct":true,"no-flush":false},"driver":"qcow2","file":"libvirt-1-storage","backing":"libvirt-2-format"}' \
-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=libvirt-1-format,id=virtio-disk0,bootindex=1,write-cache=on \
-netdev tap,fd=50,id=hostnet0,vhost=on,vhostfd=51 \
-device virtio-net-pci,host_mtu=1450,netdev=hostnet0,id=net0,mac=fa:16:3e:91:bf:c4,bus=pci.0,addr=0x3 \
-add-fd set=3,fd=53 \
-chardev pty,id=charserial0,logfile=/dev/fdset/3,logappend=on \
-device isa-serial,chardev=charserial0,id=serial0 \
-vnc 0.0.0.0:4 \
-device cirrus-vga,id=video0,bus=pci.0,addr=0x2 \
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 \
-object rng-random,id=objrng0,filename=/dev/urandom \
-device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x6 \
-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \
-msg timestamp=on
char device redirected to /dev/pts/6 (label charserial0)
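Picking the -cpu option out of such a long command line by eye is error-prone. A minimal sketch, assuming the command line has been copied from the libvirt QEMU log into a string; the function name `parse_cpu_arg` is hypothetical, and the sample below is a shortened version of the command line from this test.

```python
import shlex


def parse_cpu_arg(cmdline: str):
    """Return (model, {feature: bool}) from the -cpu option of a QEMU command line.

    Raises ValueError if the command line has no -cpu option.
    """
    # Join backslash-continued lines before tokenising.
    args = shlex.split(cmdline.replace("\\\n", " "))
    spec = args[args.index("-cpu") + 1]
    model, *props = spec.split(",")
    features = {}
    for prop in props:
        name, _, state = prop.partition("=")
        features[name] = (state == "on")
    return model, features


if __name__ == "__main__":
    sample = r"""/usr/bin/qemu-system-x86_64 \
-machine pc-i440fx-4.2,accel=kvm,usb=off,dump-guest-core=off \
-cpu Nehalem-IBRS,md-clear=on,pcid=on,ssbd=on,pdpe1gb=off,mtrr=off \
-m 64"""
    model, features = parse_cpu_arg(sample)
    print(model)
    for name, enabled in features.items():
        print(f"  {name}: {'on' if enabled else 'off'}")
```

The on/off values here should line up with the 'require' and 'disable' policies in the guest XML.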
Annex: `lscpu` from host (level-0) and compute node (level-1)
Baremetal host (L0)
[root@taroxhost ~]# lscpu
Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   46 bits physical, 48 bits virtual
CPU(s):                          12
On-line CPU(s) list:             0-11
Thread(s) per core:              1
Core(s) per socket:              6
Socket(s):                       2
NUMA node(s):                    2
Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           63
Model name:                      Intel(R) Xeon(R) CPU E5-2609 v3 @ 1.90GHz
Stepping:                        2
CPU MHz:                         1200.001
CPU max MHz:                     1900.0000
CPU min MHz:                     1200.0000
BogoMIPS:                        3800.10
Virtualization:                  VT-x
L1d cache:                       384 KiB
L1i cache:                       384 KiB
L2 cache:                        3 MiB
L3 cache:                        30 MiB
NUMA node0 CPU(s):               0-5
NUMA node1 CPU(s):               6-11
Vulnerability Itlb multihit:     KVM: Mitigation: Split huge pages
Vulnerability L1tf:              Mitigation; PTE Inversion; VMX conditional cache flushes, SMT disabled
Vulnerability Mds:               Mitigation; Clear CPU buffers; SMT disabled
Vulnerability Meltdown:          Mitigation; PTI
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; Full generic retpoline, IBPB conditional, IBRS_FW, STIBP disabled, RSB filling
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected
Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer xsave avx f16c rdrand lahf_lm abm cpuid_fault epb invpcid_single pti intel_ppin ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm xsaveopt cqm_llc cqm_occup_llc dtherm arat pln pts md_clear flush_l1d
Compute node (L1)
NB: This VM is using "host-passthrough".
[stack@dstack-f32 devstack]$ lscpu
Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   40 bits physical, 48 bits virtual
CPU(s):                          2
On-line CPU(s) list:             0,1
Thread(s) per core:              1
Core(s) per socket:              1
Socket(s):                       2
NUMA node(s):                    1
Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           63
Model name:                      Intel(R) Xeon(R) CPU E5-2609 v3 @ 1.90GHz
Stepping:                        2
CPU MHz:                         1899.999
BogoMIPS:                        3799.99
Virtualization:                  VT-x
Hypervisor vendor:               KVM
Virtualization type:             full
L1d cache:                       32 KiB
L1i cache:                       32 KiB
L2 cache:                        256 KiB
L3 cache:                        15 MiB
NUMA node0 CPU(s):               0,1
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Mitigation; PTE Inversion; VMX flush not necessary, SMT disabled
Vulnerability Mds:               Mitigation; Clear CPU buffers; SMT Host state unknown
Vulnerability Meltdown:          Mitigation; PTI
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; Full generic retpoline, IBPB conditional, IBRS_FW, STIBP disabled, RSB filling
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected
Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq vmx ssse3 fma cx16 pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer xsave avx f16c rdrand hypervisor lahf_lm abm cpuid_fault invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt arat umip md_clear arch_capabilities