Problem
-------

Running `nova rebuild` on an instance with an attached Cinder volume
fails with:

    "Failed to terminate process 8164 with SIGKILL: Device or resource busy"

Because the Cinder volume is removed during the rebuild, re-spawning the
Nova instance fails and the instance ends up in ERROR state.


Fix
---

https://review.openstack.org/#/c/176891/


Environment
-----------

DevStack, upstream stable/kilo:

    $ git branch
    * (HEAD detached at origin/stable/kilo)

Commit IDs:

    cinder:       b3a301b (Support host type specific block volume attachment, 2015-08-04)
    devstack:     4935abb (Merge "Hardcode the extension lists by default for tempest" into stable/kilo, 2015-08-04)
    glance:       c062362 (Merge "Allow ramdisk_id, kernel_id to be null on schema" into stable/kilo, 2015-07-31)
    keystone:     220eed8 (Merge "Add openstack_user_domain to assertion" into stable/kilo, 2015-07-31)
    neutron:      62b7103 (Merge "lb-agent: ensure tap mtu is the same as physical device" into stable/kilo, 2015-08-03)
    nova:         90e1eac (Bump stable/kilo next version to 2015.1.2, 2015-07-28)
    requirements: d337e5a (Raise minimum required neutronclient to >=2.4.0, 2015-08-10)


Test
----

(1) Boot a Nova instance:

    $ . openrc admin
    $ nova boot --flavor 1 --key_name oskey1 --image cirros-0.3.3-x86_64-disk vm1

(2) Create a Cinder volume:

    $ cinder create --display_name=test_volume 1

    $ nova list && cinder list
    +--------------------------------------+------+--------+------------+-------------+------------------+
    | ID                                   | Name | Status | Task State | Power State | Networks         |
    +--------------------------------------+------+--------+------------+-------------+------------------+
    | 09089b38-7076-40f9-b53b-2037f86dd975 | vm1  | ACTIVE | -          | Running     | private=10.1.0.3 |
    +--------------------------------------+------+--------+------------+-------------+------------------+
    +--------------------------------------+-----------+-------------+------+-------------+----------+-------------+
    | ID                                   | Status    | Name        | Size | Volume Type | Bootable | Attached to |
    +--------------------------------------+-----------+-------------+------+-------------+----------+-------------+
    | 7df07142-4d95-4bab-8287-1072f1833a6b | available | test_volume | 1    | lvmdriver-1 | false    |             |
    +--------------------------------------+-----------+-------------+------+-------------+----------+-------------+

(3) Attach it to the Nova instance:

    $ nova volume-attach 09089b38-7076-40f9-b53b-2037f86dd975 7df07142-4d95-4bab-8287-1072f1833a6b
    +----------+--------------------------------------+
    | Property | Value                                |
    +----------+--------------------------------------+
    | device   | /dev/vdb                             |
    | id       | 7df07142-4d95-4bab-8287-1072f1833a6b |
    | serverId | 09089b38-7076-40f9-b53b-2037f86dd975 |
    | volumeId | 7df07142-4d95-4bab-8287-1072f1833a6b |
    +----------+--------------------------------------+

(4) Run a couple of inspection commands:

    $ sudo virsh list
     Id    Name                           State
    ----------------------------------------------------
     2     instance-00000001              running

    $ sudo virsh domblklist instance-00000001
    Target     Source
    ------------------------------------------------
    vda        /home/stack/src/cloud/data/nova/instances/09089b38-7076-40f9-b53b-2037f86dd975/disk
    vdb        /dev/disk/by-path/ip-192.169.143.120:3260-iscsi-iqn.2010-10.org.openstack:volume-7df07142-4d95-4bab-8287-1072f1833a6b-lun-0

    $ cat /proc/scsi/scsi
    Attached devices:
    Host: scsi4 Channel: 00 Id: 00 Lun: 00
      Vendor: LIO-ORG  Model: IBLOCK           Rev: 4.0
      Type:   Direct-Access                    ANSI  SCSI revision: 05

    $ iscsiadm -m session
    tcp: [3] 192.169.143.120:3260,1 iqn.2010-10.org.openstack:volume-7df07142-4d95-4bab-8287-1072f1833a6b (non-flash)

    $ sudo targetcli
    targetcli shell version 2.1.fb39
    Copyright 2011-2013 by Datera, Inc and others.
    For help on commands, type 'help'.

    /> ls
    o- / ......................................................................................................................... [...]
      o- backstores .............................................................................................................. [...]
      | o- block .................................................................................................. [Storage Objects: 1]
      | | o- iqn.2010-10.org.openstack:volume-7df07142-4d95-4bab-8287-1072f1833a6b [/dev/stack-volumes-lvmdriver-1/volume-7df07142-4d95-4bab-8287-1072f1833a6b (1.0GiB) write-thru activated]
      | o- fileio ................................................................................................. [Storage Objects: 0]
      | o- pscsi .................................................................................................. [Storage Objects: 0]
      | o- ramdisk ................................................................................................ [Storage Objects: 0]
      | o- user ................................................................................................... [Storage Objects: 0]
      o- iscsi ............................................................................................................ [Targets: 1]
      | o- iqn.2010-10.org.openstack:volume-7df07142-4d95-4bab-8287-1072f1833a6b ............................................. [TPGs: 1]
      |   o- tpg1 .......................................................................................... [no-gen-acls, auth per-acl]
      |     o- acls .......................................................................................................... [ACLs: 1]
      |     | o- iqn.1994-05.com.redhat:02eb8f0da82 ....................................................... [1-way auth, Mapped LUNs: 1]
      |     |   o- mapped_lun0 ................. [lun0 block/iqn.2010-10.org.openstack:volume-7df07142-4d95-4bab-8287-1072f1833a6b (rw)]
      |     o- luns .......................................................................................................... [LUNs: 1]
      |     | o- lun0 [block/iqn.2010-10.org.openstack:volume-7df07142-4d95-4bab-8287-1072f1833a6b (/dev/stack-volumes-lvmdriver-1/volume-7df07142-4d95-4bab-8287-1072f1833a6b)]
      |     o- portals .................................................................................................... [Portals: 1]
      |       o- 192.169.143.120:3260 ............................................................................................. [OK]
      o- loopback ......................................................................................................... [Targets: 0]
      o- vhost ............................................................................................................ [Targets: 0]
    />

(5) Perform `nova rebuild`:

    $ nova rebuild vm1 cirros-0.3.3-x86_64-disk

    $ nova list
    +--------------------------------------+------+--------+------------+-------------+------------------+
    | ID                                   | Name | Status | Task State | Power State | Networks         |
    +--------------------------------------+------+--------+------------+-------------+------------------+
    | 09089b38-7076-40f9-b53b-2037f86dd975 | vm1  | ERROR  | -          | Running     | private=10.1.0.3 |
    +--------------------------------------+------+--------+------------+-------------+------------------+

(6) Run the same inspection commands


Post `nova rebuild`
-------------------

When you inspect in `targetcli`, the Cinder volume is no longer present
(because it was deleted):

    $ sudo targetcli
    targetcli shell version 2.1.fb39
    Copyright 2011-2013 by Datera, Inc and others.
    For help on commands, type 'help'.

    /> ls
    o- / ......................................................................................................................... [...]
      o- backstores .............................................................................................................. [...]
      | o- block .................................................................................................. [Storage Objects: 0]
      | o- fileio ................................................................................................. [Storage Objects: 0]
      | o- pscsi .................................................................................................. [Storage Objects: 0]
      | o- ramdisk ................................................................................................ [Storage Objects: 0]
      | o- user ................................................................................................... [Storage Objects: 0]
      o- iscsi ............................................................................................................ [Targets: 0]
      o- loopback ......................................................................................................... [Targets: 0]
      o- vhost ............................................................................................................ [Targets: 0]
    />

    $ cat /proc/scsi/scsi
    Attached devices:
    Host: scsi4 Channel: 00 Id: 00 Lun: 00
      Vendor: LIO-ORG  Model: IBLOCK           Rev: 4.0
      Type:   Direct-Access                    ANSI  SCSI revision: 05

    $ iscsiadm -m session
    tcp: [3] 192.169.143.120:3260,1 iqn.2010-10.org.openstack:volume-7df07142-4d95-4bab-8287-1072f1833a6b (non-flash)


Logs
----

From n-cpu.log:

    [. . .]
    2015-08-12 18:53:28.321 WARNING nova.virt.libvirt.driver [req-5184c944-575c-48f4-b990-1c9ba081be45 admin demo] [instance: 09089b38-7076-40f9-b53b-2037f86dd975] Error from libvirt during destroy. Code=38 Error=Failed to terminate process 22191 with SIGKILL: Device or resource busy; attempt 1 of 3
    2015-08-12 18:53:28.325 INFO nova.compute.resource_tracker [req-3cd42fd4-e693-4a7b-9aa2-1826452c7bf1 None None] Auditing locally available compute resources for node devstack1
    2015-08-12 18:53:43.342 WARNING nova.virt.libvirt.driver [req-5184c944-575c-48f4-b990-1c9ba081be45 admin demo] [instance: 09089b38-7076-40f9-b53b-2037f86dd975] Error from libvirt during destroy. Code=38 Error=Failed to terminate process 22191 with SIGKILL: Device or resource busy; attempt 2 of 3
    2015-08-12 18:53:43.344 WARNING nova.virt.libvirt.driver [req-3cd42fd4-e693-4a7b-9aa2-1826452c7bf1 None None] couldn't obtain the vpu count from domain id: 09089b38-7076-40f9-b53b-2037f86dd975, exception: cannot get CPU affinity of process 22209: No such process
    2015-08-12 18:53:58.365 WARNING nova.virt.libvirt.driver [req-5184c944-575c-48f4-b990-1c9ba081be45 admin demo] [instance: 09089b38-7076-40f9-b53b-2037f86dd975] Error from libvirt during destroy. Code=38 Error=Failed to terminate process 22191 with SIGKILL: Device or resource busy; attempt 3 of 3
    2015-08-12 18:53:58.366 ERROR nova.compute.manager [req-5184c944-575c-48f4-b990-1c9ba081be45 admin demo] [instance: 09089b38-7076-40f9-b53b-2037f86dd975] Setting instance vm_state to ERROR
    2015-08-12 18:53:58.366 TRACE nova.compute.manager [instance: 09089b38-7076-40f9-b53b-2037f86dd975] Traceback (most recent call last):
    [. . .]
    2015-08-12 18:53:58.366 TRACE nova.compute.manager [instance: 09089b38-7076-40f9-b53b-2037f86dd975] libvirtError: Failed to terminate process 22191 with SIGKILL: Device or resource busy
    2015-08-12 18:53:58.366 TRACE nova.compute.manager [instance: 09089b38-7076-40f9-b53b-2037f86dd975]
    [. . .]
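
The retry pattern in the n-cpu.log excerpt can also be pulled out mechanically, which may help when checking whether other instances hit the same failure. What follows is a minimal sketch, not part of Nova: the names `summarize_destroy_retries` and `exhausted` are hypothetical, and the regular expression is inferred only from the log lines quoted in this report, so other Nova releases may word the message differently. It assumes entries are read straight from the log file, one entry per line, without terminal wrapping.

```python
import re

# Pattern inferred from the n-cpu.log excerpt in this report (an assumption,
# not an official Nova log format).
RETRY_RE = re.compile(
    r"\[instance: (?P<uuid>[0-9a-f-]{36})\] "
    r"Error from libvirt during destroy\. "
    r"Code=(?P<code>\d+) Error=(?P<error>.*?); "
    r"attempt (?P<attempt>\d+) of (?P<total>\d+)"
)


def summarize_destroy_retries(lines):
    """Map instance UUID -> highest retry attempt seen, retry limit, last error."""
    summary = {}
    for line in lines:
        match = RETRY_RE.search(line)
        if match is None:
            continue  # not a libvirt destroy-retry warning
        entry = summary.setdefault(
            match.group("uuid"),
            {"attempts": 0, "total": int(match.group("total")), "error": ""},
        )
        # Keep the highest attempt number seen and the most recent error text.
        entry["attempts"] = max(entry["attempts"], int(match.group("attempt")))
        entry["error"] = match.group("error")
    return summary


def exhausted(summary):
    """Return the UUIDs of instances that used up every destroy attempt."""
    return [
        uuid
        for uuid, entry in summary.items()
        if entry["attempts"] >= entry["total"]
    ]
```

For example, passing the lines of n-cpu.log to `summarize_destroy_retries()` and the result to `exhausted()` would list the instances whose destroy retries were all used up, which in this report is the point at which the instance was set to ERROR.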