..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

==========================================
Example Spec - The title of your blueprint
==========================================

Include the URL of your launchpad blueprint:

https://blueprints.launchpad.net/nova/+spec/handle-default-machine-type-as-q35

Problem description
===================

Background: QEMU supports two main variants of "machine type" (think of
it as a virtual chipset that provides certain default devices) for x86
hosts: (a) 'pc', which corresponds to Intel's 'I440FX' chipset, which is
twenty-two years old as of this writing; and (b) 'q35', which
corresponds to Intel's 82Q35 chipset (released in 2007; a relatively
modern chipset). For AArch64 hosts, the machine type is called 'virt'.

The 'pc' machine type is considered "legacy", and does not support some
of the modern features (refer to the `Use Cases`_ section). Upstream
QEMU is considering no longer adding new variants of the legacy 'pc'
machine type.

This spec aims to rework Nova's libvirt driver to use the 'q35' machine
type, which enables a few advanced features, based on certain criteria.

Use Cases
---------

The 'q35' machine type brings several advantages:

- Native PCIe support. This allows the use of PCI "Extended Config
  Space", which *cannot* be used by legacy PCI. Use case: sometimes
  certain PCIe devices probe for 'extended' features that determine the
  support and/or operation of the device. Also, native PCIe hotplug is
  more effective and "cleaner" than the ACPI-based hotplug that is used
  by the legacy 'pc' machine type.

- vIOMMU emulation. This has a few use cases [2]_, namely:
  (a) protecting the guest memory from untrusted devices that are
  directly assigned to the guest; (b) protecting guests from untrusted
  user space drivers (e.g. DPDK); (c) assigning devices to nested
  virtual guests.
- Faster SATA emulation, in comparison to the IDE emulation that the
  legacy 'pc' machine type uses. Note that this is useful only when a
  guest OS doesn't support 'virtio' devices (which is what any modern
  guest should be using, and this is what Nova configures).

- The 'q35' machine type makes Secure Boot (in combination with OVMF,
  the project that enables UEFI support for QEMU / KVM guests)
  *actually* secure. The low-level explanation: a malicious guest kernel
  might attempt to tamper with the emulated 'pflash' chip (which stores
  Secure Boot related persistent UEFI variables) directly, skipping the
  UEFI runtime variable service altogether. To prevent this, QEMU and
  KVM emulate SMM (System Management Mode), and restrict 'pflash'
  hardware access to code that runs in SMM. SMM emulation in QEMU/KVM is
  only provided by 'q35'; 'i440fx' does not have the necessary (virtual)
  hardware. (Thanks to Laszlo Ersek, OVMF maintainer, for this
  explanation.)

.. [1] https://wiki.osdev.org/PCI_Express#Extended_Configuration_Space
.. [2] https://wiki.qemu.org/Features/VT-d#Use_Case_1:_Guest_Device_Assignment_with_vIOMMU

Proposed change
===============

Linux distributions might change the default machine type to 'q35', and
the legacy 'pc' machine type will most likely not get any fixes besides
critical security patches. Regardless of upstream QEMU's plans, Nova
should be prepared not to break when that happens. (Refer to the "What
will break?" section below.)

(1) Use the Nova metadata property 'hw_machine_type' to set the machine
    type on the guest.

(2) Ask 'libosinfo', and pick 'q35' if it says the guest can do both
    'pc' and 'q35'.

(3) Or operators can just use 'q35' (this shouldn't need code changes,
    [...]

If this is one part of a larger effort make it clear where this piece
ends. In other words, what's the scope of this effort?
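As a rough illustration of the three options listed under Proposed
change, the selection could be sketched as below. This is only a sketch
and not Nova's code: the function name, its parameters, and the
precedence (image property first, then the 'libosinfo' answer, then an
operator default) are assumptions made for the example, and the way the
list of machine types is obtained from 'libosinfo' is left open.

```python
from typing import Mapping, Optional, Sequence

# Hypothetical constants and helper for illustration; not part of Nova.
LEGACY_MACHINE_TYPE = "pc"
MODERN_MACHINE_TYPE = "q35"


def choose_machine_type(
    image_props: Mapping[str, str],
    osinfo_machine_types: Optional[Sequence[str]] = None,
    operator_default: Optional[str] = None,
) -> Optional[str]:
    """Return the machine type to use, or None to keep the QEMU default.

    Assumed precedence (not confirmed by the spec):
      1. the 'hw_machine_type' image metadata property,
      2. what 'libosinfo' reports the guest OS can run on,
      3. a default chosen by the operator.
    """
    # (1) An explicit request on the image always wins.
    requested = image_props.get("hw_machine_type")
    if requested:
        return requested

    # (2) Prefer 'q35' whenever the guest OS is reported to support it.
    if osinfo_machine_types:
        if MODERN_MACHINE_TYPE in osinfo_machine_types:
            return MODERN_MACHINE_TYPE
        if LEGACY_MACHINE_TYPE in osinfo_machine_types:
            return LEGACY_MACHINE_TYPE

    # (3) Fall back to whatever the operator configured, if anything.
    return operator_default
```

With this sketch, an image that sets 'hw_machine_type' to 'pc' keeps
'pc' even when the guest OS could run on 'q35', which keeps existing
images from changing behaviour.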
At this point, if you would like to just get feedback on whether the
problem and proposed change fit in nova, you can stop here and post this
for review to get preliminary feedback. If so please say:
Posting to get preliminary feedback on the scope of this spec.

Alternatives
------------

None

Data model impact
-----------------

None

REST API impact
---------------

None

Security impact
---------------

FIXME: Since Q35 indirectly enables Secure Boot, I wonder whether that
should be mentioned here. My guess: "no", because the change itself
isn't introducing any security-sensitive code.

Notifications impact
--------------------

None

Other end user impact
---------------------

FIXME: Talk about guests

Performance Impact
------------------

FIXME:

Other deployer impact
---------------------

FIXME: Consider where to mention this: Any Linux distribution that was
released earlier than 2007 should use the 'i440fx' (or 'pc') machine
type, and those released in 2007 (the year when Intel introduced the Q35
chipset) or newer should use 'q35'.

Developer impact
----------------

Discuss things that will affect other developers working on OpenStack,
such as:

* If the blueprint proposes a change to the driver API, discussion of
  how other hypervisors would implement the feature is required.

Upgrade impact
--------------

FIXME: [...]

Implementation
==============

Assignee(s)
-----------

Primary assignee:
  kashyapc

Work Items
----------

* Work items or tasks: break the feature up into the things that need to
  be done to implement it. Those parts might end up being done by
  different people, but we're mostly trying to understand the timeline
  for implementation.

Dependencies
============

[FIXME]

* Work out how Nova should use 'libosinfo' to extract machine type
  recommendations.

* Include specific references to specs and/or blueprints in nova, or in
  other projects, that this one either depends on or is related to.
* Does this feature require any new library dependencies or code
  otherwise not included in OpenStack? Or does it depend on a specific
  version of a library?

Testing
=======

FIXME

- Testing in context of migration

- Is this untestable in gate given current limitations (specific
  hardware / software configurations available)? If so, are there
  mitigation plans (3rd party testing, gate enhancements, etc.)?

Documentation Impact
====================

FIXME: Operations Guide (and potentially End User Guide) will be
impacted.

Which audiences are affected most by this change, and which
documentation titles on docs.openstack.org should be updated because of
this change? Don't repeat details discussed above, but reference them
here in the context of documentation for multiple audiences. For
example, the Operations Guide targets cloud operators, and the End User
Guide would need to be updated if the change offers a new feature
available through the CLI or dashboard. If a config option changes or is
deprecated, note here that the documentation needs to be updated to
reflect this specification's change.

References
==========

Please add any useful references here. You are not required to have any
reference. Moreover, this specification should still make sense when
your references are unavailable.
Examples of what you could include are:

[1] An overview of Q35 machine type:
https://wiki.qemu.org/images/4/4e/Q35.pdf

[*] Emulated Q35 config:
https://git.qemu.org/?p=qemu.git;a=blob;f=docs/config/q35-emulated.cfg

[*] libosinfo: https://bugzilla.redhat.com/show_bug.cgi?id=1623501
(RFE: provide machine-type information)

[*] Upstream discussion on libvirt and QEMU lists about supporting OSes:
https://www.redhat.com/archives/libvir-list/2018-August/msg01073.html:
"[libvirt] clean/simple Q35 support in libvirt+QEMU for guest OSes that
don't support virtio-1.0"

History
=======

Optional section intended to be used each time the spec is updated to
describe a new design, API, or any database schema update. Useful to let
the reader understand what has happened over time.

.. list-table:: Revisions
   :header-rows: 1

   * - Release Name
     - Description
   * - Train
     - Introduced