Provide Clean OS to Unbootable VM

This document describes how to handle a Virtual Machine that is not able to boot properly by creating a new boot disk. Before resorting to this method make sure to:

  1. Try to enter the GRUB menu of the broken VM by pressing Esc or Shift early in the boot process.

  2. Try to fix it with Rescue Mode as described in https://warrenio.atlassian.net/wiki/spaces/WARP/pages/216760329 .

If these methods fail, the next option is to create a new disk with a clean OS install as the primary boot disk and make the current boot disk a secondary disk. The Admin API can be used to achieve this.

The result is a clean OS with nothing extra installed; the problematic, unbootable disk becomes the VM’s secondary disk. Extra steps are needed to restore, reinstall, or reconfigure the applications that were running in the VM.

Prerequisites

  • As admin user, get an API Token to use the Admin API.

  • Find out the VM UUID of the broken VM.

  • Find out the Disk UUID of the broken VM.

    • Only the boot disk is relevant; other disks can stay as they are.

  • Find out the Billing account ID that this VM is assigned to.

  • Find out the User ID of the owner of the VM.

  • Take a note of the OS name and version.

  • Take a note of the location, if present.

  • Make sure the VM is stopped.
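
Before making any calls, it can help to sanity-check the identifiers collected above, since a mistyped UUID would send a request against the wrong resource. The following is a minimal sketch: is_uuid and require_uuid are hypothetical helper names, and the check assumes that VM and disk identifiers are canonical 36-character UUIDs, which this document does not state explicitly.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if $1 looks like a canonical UUID
# (8-4-4-4-12 hexadecimal digits). The format itself is an assumption.
is_uuid() {
  printf '%s\n' "$1" \
    | grep -Eq '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'
}

# Hypothetical helper: reports a labelled error ($1) when the value ($2)
# does not look like a UUID, and returns a non-zero status.
require_uuid() {
  is_uuid "$2" || { echo "$1 does not look like a UUID: $2" >&2; return 1; }
}
```

A possible use is require_uuid "VM UUID" "$VM_UUID" && require_uuid "Disk UUID" "$DISK_UUID", where both variable names are placeholders for the values noted in the prerequisites.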

Switching Boot Disk

1. Detach disk from VM

Use the Admin API endpoint for detaching a disk from a VM:

curl 'https://api.DOMAIN/v1/LOCATION_SLUG/user-resource/admin/vm/storage/detach' \
  -X POST \
  --header 'apikey: [API Token]' \
  --data-urlencode 'uuid=[VM UUID]' \
  --data-urlencode 'storage_uuid=[Disk UUID]'

2. Create a new clean OS disk

The new boot disk can use any OS, but it is probably safest to use the same OS as before. The source_image parameter has the format [os-name]_[version]; it must match the name of the OS base image on Ceph.

Decide how big the new boot disk should be. If data will be copied to the new disk then it should be at least as big as the old one. If the old main disk will remain as secondary disk for data then the new disk can be smaller, just big enough to accommodate the OS.

curl 'https://api.DOMAIN/v1/LOCATION_SLUG/storage/admin/disks' \
  -X POST \
  --header 'apikey: [API Token]' \
  --data-urlencode 'user_id=[User ID]' \
  --data-urlencode 'billing_account_id=[Billing account ID]' \
  --data-urlencode 'size_gb=20' \
  --data-urlencode 'source_image_type=OS_BASE' \
  --data-urlencode 'source_image=ubuntu_20.04'

The request takes a minute or two. Take note of the Clean OS Disk UUID, which is returned in the uuid field of the response.

{
  "uuid": "d399d668-a1da-...",
  "status": "Active",
  "user_id": [User ID],
  "billing_account_id": [Billing account ID],
  "size_gb": 20,
  "source_image_type": "OS_BASE",
  "source_image": "ubuntu_20.04",
  "image_path": "vm-disk-images/d399d668-a1da-....img",
  "created_at": "2023-03-21T12:40:01.209+0000",
  "updated_at": "2023-03-21T12:40:57.585+0000",
  "acting_user_id": ...,
  "storage_pool_name": "vm-disk-images"
}
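
To avoid copying the UUID by hand, the response can be piped through a small filter. This is a sketch only: extract_uuid is a hypothetical helper name, it relies on nothing but grep, head and sed, and it assumes the response contains a "uuid" key formatted as in the sample above. A JSON-aware tool such as jq would be more robust where it is available.

```shell
#!/bin/sh
# Hypothetical helper: reads an API response on stdin and prints the value
# of the first "uuid" key. This is plain text matching, not real JSON
# parsing, so it assumes the key and value appear as "uuid": "value".
extract_uuid() {
  grep -o '"uuid"[[:space:]]*:[[:space:]]*"[^"]*"' \
    | head -n 1 \
    | sed 's/.*"\([^"]*\)"$/\1/'
}
```

One way to use it is to append | extract_uuid to the disk creation request and store the output in a variable for the attach step.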

3. Attach clean OS disk as boot disk

The first disk attached to a VM becomes its boot disk, so attach the new clean OS disk first, using the Clean OS Disk UUID as storage_uuid.
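
The attach request itself is not reproduced in this document. As a sketch only, and assuming the Admin API exposes an attach endpoint that mirrors the detach endpoint from step 1 (the same path ending in attach instead of detach, with the same uuid and storage_uuid parameters), the call might look like the following. The endpoint path, the attach_disk function and its argument order are all assumptions; verify them against the Admin API reference before use.

```shell
#!/bin/sh
# ASSUMPTION: the attach path mirrors the documented detach path.
# It is not confirmed by this document.
# Arguments: DOMAIN LOCATION_SLUG API_TOKEN VM_UUID DISK_UUID
# Set CURL=echo to print the command instead of sending the request.
attach_disk() {
  "${CURL:-curl}" "https://api.$1/v1/$2/user-resource/admin/vm/storage/attach" \
    -X POST \
    --header "apikey: $3" \
    --data-urlencode "uuid=$4" \
    --data-urlencode "storage_uuid=$5"
}
```

Running it with CURL=echo first gives a preview of the request, which is a cheap way to confirm that the clean OS disk, and not the old disk, is the one being attached first.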

4. Attach previous main disk as secondary disk

Repeat the attach call from step 3, replacing storage_uuid with the UUID of the previous boot disk.

The disk that previously failed to boot is now the secondary disk, most probably vdb.

Follow-up actions

The VM now has two disks. (Known bug: the admin view does not always refresh properly and may still show only one disk. If in doubt, check by impersonating the user.)

Start the VM. See that it has a secondary disk with all the data.

Either copy the data from vdb1 to vda1 or keep the additional disk as data disk and configure fstab accordingly.
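
For the second option, the old disk can be mounted persistently through /etc/fstab. The sketch below only prints a candidate fstab line and changes nothing on the system. fstab_line is a hypothetical helper, and the ext4 filesystem type, the nofail option and the example mount point are assumptions to adjust to the actual disk.

```shell
#!/bin/sh
# Hypothetical helper: prints an fstab entry for a filesystem UUID ($1)
# and a mount point ($2). The filesystem type defaults to ext4, which is
# an assumption; pass a third argument to override it.
# nofail lets the VM boot even if the secondary disk is missing.
fstab_line() {
  printf 'UUID=%s %s %s defaults,nofail 0 2\n' "$1" "$2" "${3:-ext4}"
}
```

Inside the VM, the filesystem UUID can be read with blkid -s UUID -o value /dev/vdb1 and passed as the first argument, with a mount point such as /mnt/old-disk as the second. Review the printed line before appending it to /etc/fstab.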

Because this is a clean OS, all packages have to be reinstalled and the systems that were running in the VM before have to be set up again.

Alternatively, if the system that is now on vdb can be fixed, both disks can be detached again and the disk with the old system attached again, so that it once again becomes the boot disk.