Bosh
Bosh is a tool created by Cloud Foundry to deploy and manage software on virtual machines, either on private vSphere clouds or on public clouds.
See Bosh's website for more information at https://bosh.io/.
Usage
Installation
Obtain the binaries from https://github.com/cloudfoundry/bosh-cli/releases.
If using PKS, Bosh is already installed as part of the Ops Manager deployment.
Obtaining Credentials
Before using any of the commands below, you will need to authenticate your bosh client against the Bosh Director. This can be done using environment variables or by using the bosh login command.
With PKS, the Bosh credentials can be obtained from Ops Manager. Log in to Ops Manager, then navigate to Bosh Tile -> Credentials -> Bosh Commandline Credentials.
Set the following environment variables in a .bash_profile file:
export BOSH_CLIENT=ops_manager
export BOSH_CLIENT_SECRET=xxxxxxxxxxxxxxxx
export BOSH_CA_CERT=/var/tempest/workspaces/default/root_ca_certificate
export BOSH_ENVIRONMENT=172.31.0.3
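A quick way to confirm that the four variables above are present in the current shell is sketched below. The function name bosh_env_check is hypothetical and not part of the bosh CLI; it only reports which variables are empty and never prints their values.

```shell
# Hypothetical helper: verify the Bosh credentials are exported.
# Prints the name of each empty variable to stderr and returns non-zero
# if any is missing. Uses the plain variables status, name and value.
bosh_env_check() {
  status=0
  for name in BOSH_CLIENT BOSH_CLIENT_SECRET BOSH_CA_CERT BOSH_ENVIRONMENT; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "not set: $name" >&2
      status=1
    fi
  done
  return "$status"
}
```

Running bosh_env_check && bosh deployments then stops early with a list of missing variables rather than an authentication error.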
Commands
Deployments
bosh deployments shows the deployments tracked by the Bosh Director.
Each deployment has a name, a set of Bosh releases (i.e. components required by the cloud VMs), a stemcell (i.e. the operating system), and the team it belongs to (i.e. the parent deployment).
# bosh deployments
Using environment '172.31.0.3' as client 'ops_manager'
Name Release(s) Stemcell(s) Team(s)
harbor-container-registry-b987f1f1a01cf539193c bosh-dns/1.10.0 bosh-vsphere-esxi-ubuntu-xenial-go_agent/250.93 -
harbor-container-registry/1.7.4-build.42
pivotal-container-service-ad2c46d3833f5f4ea239 backup-and-restore-sdk/1.8.0 bosh-vsphere-esxi-ubuntu-xenial-go_agent/250.93 -
bosh-dns/1.10.0
bpm/1.0.4
cf-mysql/36.14.0.1
cfcr-etcd/1.10.0
docker/35.1.0
harbor-container-registry/1.7.4-build.42
kubo/0.31.7
kubo-service-adapter/1.4.0-build.230
nsx-cf-cni/2.4.1.13515827
on-demand-service-broker/0.26.0
pks-api/1.4.0-build.230
pks-nsx-t/1.30.0
pks-telemetry/2.0.0-build.201
pks-vrli/0.9.0
pks-vrops/0.13.0
pxc/0.14.0
sink-resources-release/0.1.32
syslog/11.4.0
uaa/71.2
wavefront-proxy/0.14.0
service-instance_06d9a723-24d8-477e-8e7c-dcbf31c84e96 bosh-dns/1.10.0 bosh-vsphere-esxi-ubuntu-xenial-go_agent/250.93 pivotal-container-service-ad2c46d3833f5f4ea239
bpm/1.0.4
cfcr-etcd/1.10.0
docker/35.1.0
harbor-container-registry/1.7.4-build.42
kubo/0.31.7
nsx-cf-cni/2.4.1.13515827
pks-api/1.4.0-build.230
pks-nsx-t/1.30.0
pks-telemetry/2.0.0-build.201
pks-vrli/0.9.0
pks-vrops/0.13.0
sink-resources-release/0.1.32
syslog/11.4.0
wavefront-proxy/0.14.0
bosh delete-deployment -d deployment_name deletes a deployment by name. Use the --force flag to force the deletion and ignore any script or job errors.
ubuntu@ITSOOPSMAN01:~$ bosh delete-deployment -d service-instance_92975a8c-be8f-4db4-8731-8a6c3aab51d7
Using environment '172.31.0.3' as client 'ops_manager'
Using deployment 'service-instance_92975a8c-be8f-4db4-8731-8a6c3aab51d7'
Continue? [yN]: y
Task 189372
Task 189372 | 20:36:37 | Deleting instances: master/3d9a60e5-0f27-40c2-933f-fef5f9d874e0 (2)
Task 189372 | 20:36:37 | Deleting instances: worker/80adeff6-9017-47cc-b76e-853944b71a89 (3)
Task 189372 | 20:36:37 | Deleting instances: apply-addons/7942478a-ec42-4bf0-a1ea-dbe9cd2bdd09 (0)
Task 189372 | 20:36:37 | Deleting instances: worker/a18216cb-253f-4067-8f94-5e1e099a94a7 (1)
Task 189372 | 20:36:37 | Deleting instances: master/5dffde0c-2890-4a6f-8b94-d6c7f6089cac (1)
Task 189372 | 20:36:37 | Deleting instances: worker/8cc69743-84ec-4862-aabd-9277769e7332 (0)
Task 189372 | 20:36:37 | Deleting instances: master/aa5ff8b9-57b4-4c2e-aa50-8dcece512e22 (0)
Task 189372 | 20:36:37 | Deleting instances: worker/516b4580-b910-45ec-aa24-dc8b112d7f00 (4)
Task 189372 | 20:36:37 | Deleting instances: worker/5dacad58-b77b-42ed-8d86-f2451eb67db7 (2)
Task 189372 | 20:36:37 | Deleting instances: apply-addons/7942478a-ec42-4bf0-a1ea-dbe9cd2bdd09 (0) (00:00:00)
Task 189409 | 20:38:09 | Deleting instances: worker/80adeff6-9017-47cc-b76e-853944b71a89 (3) (00:00:11)
L Error: Action Failed get_task: Task 0803ac73-6db2-4145-7160-ceaddba77230 result: 1 of 3 drain scripts failed. Failed Jobs: kubelet. Successful Jobs: syslog_forwarder, openvswitch.
Task 189409 | 20:38:09 | Deleting instances: worker/a18216cb-253f-4067-8f94-5e1e099a94a7 (1) (00:00:11)
L Error: Action Failed get_task: Task 6ea57ea7-c5b9-4778-5947-73be0bb50579 result: 1 of 3 drain scripts failed. Failed Jobs: kubelet. Successful Jobs: syslog_forwarder, openvswitch.
Task 189409 | 20:38:09 | Deleting instances: worker/516b4580-b910-45ec-aa24-dc8b112d7f00 (4) (00:00:11)
L Error: Action Failed get_task: Task f76f0e3d-0d23-4434-7955-5e13778fc5ba result: 1 of 3 drain scripts failed. Failed Jobs: kubelet. Successful Jobs: syslog_forwarder, openvswitch.
Task 189409 | 20:38:09 | Error: Action Failed get_task: Task 0803ac73-6db2-4145-7160-ceaddba77230 result: 1 of 3 drain scripts failed. Failed Jobs: kubelet. Successful Jobs: syslog_forwarder, openvswitch.
Task 189409 Started Tue Aug 27 20:37:58 UTC 2019
Task 189409 Finished Tue Aug 27 20:38:09 UTC 2019
Task 189409 Duration 00:00:11
Task 189409 error
Deleting deployment 'service-instance_92975a8c-be8f-4db4-8731-8a6c3aab51d7':
Expected task '189409' to succeed but state is 'error'
Exit code 1
Errors such as the failed drain scripts above can be ignored by using the --force option.
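One way to reserve --force for the cases that need it is to attempt a normal delete first, as sketched here. The function name delete_deployment is hypothetical; -n is the CLI's non-interactive flag, which skips the Continue? prompt. Forcing a delete skips failed drain scripts, so treat this as a convenience under those assumptions rather than a recommended default.

```shell
# Hypothetical wrapper: try a normal delete, fall back to --force on failure.
# $1 is the deployment name.
delete_deployment() {
  if bosh -n -d "$1" delete-deployment; then
    return 0
  fi
  echo "normal delete of $1 failed, retrying with --force" >&2
  bosh -n -d "$1" delete-deployment --force
}
```

For example, delete_deployment service-instance_92975a8c-be8f-4db4-8731-8a6c3aab51d7 would only force the delete after the first attempt has returned a non-zero exit code.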
VMs
bosh vms shows the VMs that are running for every deployment managed by the Bosh Director. A specific deployment can be selected with the -d option.
Common options:
- Show stats using --vitals
- Cloud properties with --cloud-properties
# bosh vms -d service-instance_62e82b91-cabb-40f0-842c-e1c10a98f2f7 --vitals
Using environment '172.31.0.3' as client 'ops_manager'
Task 164927. Done
Deployment 'service-instance_62e82b91-cabb-40f0-842c-e1c10a98f2f7'
Instance | Process State | AZ | IPs | VM CID | VM Type | Active | VM Created At | Uptime | Load (1m, 5m, 15m) | CPU Total | CPU User | CPU Sys | CPU Wait | Memory Usage | Swap Usage | System Disk Usage | Ephemeral Disk Usage | Persistent Disk Usage
master/74eb8b50-da8f-4d51-9a26-c8ab63c2bc70 running pks-compute 172.16.14.3 vm-ec465136-0887-4a17-9aa0-3b9be8a33469 xlarge true Fri Aug 23 22:04:07 UTC 2019 2d 22h 27m 35s 0.15, 0.10, 0.08 - 1.0% 0.7% 4.6% 9% (1.5 GB) 0% (0 B) 46% (33i%) 13% (3i%) 4% (0i%)
master/c8cd7497-78bb-49a8-a092-264cbad73100 running pks-compute 172.16.14.2 vm-f82f0fe9-cd61-40cd-86bb-ac60c88143ca xlarge true Fri Aug 23 22:04:07 UTC 2019 2d 22h 27m 34s 0.10, 0.14, 0.15 - 0.9% 0.6% 7.5% 10% (1.6 GB) 0% (0 B) 46% (33i%) 13% (3i%) 4% (0i%)
master/d50104c4-759f-40e7-b55f-15aa93083272 running pks-compute 172.16.14.4 vm-1ffa5f44-2d08-4080-808c-4733e98bc954 xlarge true Fri Aug 23 22:04:06 UTC 2019 2d 22h 27m 34s 0.04, 0.08, 0.08 - 0.6% 0.2% 0.3% 9% (1.5 GB) 0% (0 B) 46% (33i%) 13% (3i%) 4% (0i%)
worker/6f3ec8e8-8976-449d-851b-83803a67d348 running pks-compute 172.16.14.5 vm-4eccd396-a040-44ac-8fdb-61bbf87ef024 large.disk true Fri Aug 23 22:04:08 UTC 2019 2d 22h 27m 34s 2.04, 2.82, 2.10 - 5.7% 2.7% 44.7% 40% (3.3 GB) 0% (4.1 MB) 46% (33i%) 7% (1i%) 12% (6i%)
4 vms
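When a deployment has many VMs, it can help to filter the listing down to the instances that are not healthy. The sketch below assumes the column order shown in the output above (instance first, process state second) and instance names of the form group/uuid; non_running is a hypothetical name.

```shell
# Hypothetical filter: reads `bosh vms` output on stdin and prints the
# instance and the first word of its state for every VM whose process
# state is not "running". Header and summary lines are skipped because
# their first field does not look like group/uuid.
non_running() {
  awk '$1 ~ /^[a-z0-9_-]+\/[0-9a-f-]+$/ && $2 != "running" { print $1, $2 }'
}
```

Usage would look like bosh vms -d deployment | non_running, which prints nothing when every VM is running.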
Tasks
bosh tasks shows the tasks and errands that the Bosh Director is currently running. Show recently completed tasks with the -r[=n] or --recent[=n] flag, where the optional value n limits the output to n results.
bosh task n shows the logs for task n.
Common flags:
- --debug to get debug logs
- --cpi to show cloud provider interface logs
bosh cancel-task n cancels task n.
root@ITSOOPSMAN01:~# bosh tasks
Using environment '172.31.0.3' as client 'ops_manager'
ID State Started At Last Activity At User Deployment Description Result
164924 processing Mon Aug 26 20:28:45 UTC 2019 Mon Aug 26 20:28:45 UTC 2019 pivotal-container-service-ad2c46d3833f5f4ea239 service-instance_92975a8c-be8f-4db4-8731-8a6c3aab51d7 create deployment -
164034 processing Mon Aug 26 19:37:16 UTC 2019 Mon Aug 26 19:37:16 UTC 2019 ops_manager pivotal-container-service-ad2c46d3833f5f4ea239 run errand upgrade-all-service-instances from deployment pivotal-container-service-ad2c46d3833f5f4ea239 -
2 tasks
Succeeded
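To feed task IDs into bosh task or bosh cancel-task, the IDs can be pulled out of the listing. This is a minimal sketch that assumes the layout shown above, with the numeric ID in the first column and the state in the second; processing_task_ids is a hypothetical name.

```shell
# Hypothetical filter: reads `bosh tasks` output on stdin and prints the
# ID of every task whose state is "processing", one per line.
processing_task_ids() {
  awk '$1 ~ /^[0-9]+$/ && $2 == "processing" { print $1 }'
}
```

For example, bosh tasks | processing_task_ids prints 164924 and 164034 for the listing above; each ID can then be passed to bosh task n --debug.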
Instances
bosh instances shows the instances in deployments.
Common flags:
- --details to show details
- --ps to show the processes for each instance, which reflects the Monit status in each VM
SSH
bosh ssh -d deployment vm creates an SSH connection to the specified VM.
# bosh ssh -d service-instance_92975a8c-be8f-4db4-8731-8a6c3aab51d7 worker/8cc69743-84ec-4862-aabd-9277769e7332
Using environment '172.31.0.3' as client 'ops_manager'
Using deployment 'service-instance_92975a8c-be8f-4db4-8731-8a6c3aab51d7'
Task 164932. Done
Unauthorized use is strictly prohibited. All access and activity
is subject to logging and monitoring.
Welcome to Ubuntu 16.04.6 LTS (GNU/Linux 4.15.0-52-generic x86_64)
* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/advantage
The programs included with the Ubuntu system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.
Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
applicable law.
Last login: Mon Aug 26 20:48:23 2019 from 172.31.0.2
To run a command as administrator (user "root"), use "sudo <command>".
See "man sudo_root" for details.
worker/8cc69743-84ec-4862-aabd-9277769e7332:~$
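For one-off checks, an interactive session is not always needed: bosh ssh accepts a -c option that runs a single command on the VM and returns. The sketch below uses it to print the Monit status of a VM; the function name vm_monit_summary is hypothetical, and it assumes monit is on the PATH for sudo on the target VM.

```shell
# Hypothetical wrapper: print the Monit summary of one VM without opening
# an interactive shell. $1 is the deployment name, $2 the instance.
vm_monit_summary() {
  bosh -d "$1" ssh "$2" -c 'sudo monit summary'
}
```

A call would look like vm_monit_summary deployment worker/8cc69743-84ec-4862-aabd-9277769e7332.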
Cleanup
bosh clean-up removes unused releases and stemcells. Without the --all flag, the most recent unused versions are kept.
# bosh clean-up
Using environment '172.31.0.3' as client 'ops_manager'
Continue? [yN]: y
Task 164925
Task 164925 | 20:29:18 | Deleting stemcells: bosh-vsphere-esxi-ubuntu-xenial-go_agent/250.29 (00:00:12)
Task 164925 | 20:29:30 | Deleting dns blobs: DNS blobs (00:00:00)
Task 164925 Started Mon Aug 26 20:29:18 UTC 2019
Task 164925 Finished Mon Aug 26 20:29:30 UTC 2019
Task 164925 Duration 00:00:12
Task 164925 done
Succeeded
Locks
bosh locks lists the current locks held by different tasks or deployments.
Errands
bosh errands -d deployment lists all errands defined by a deployment.
bosh run-errand -d deployment errand runs an errand, identified by its job name, in a particular deployment.
# bosh run-errand -d pivotal-container-service-ad2c46d3833f5f4ea239 upgrade-all-service-instances
Using environment '172.31.0.3' as client 'ops_manager'
Using deployment 'pivotal-container-service-ad2c46d3833f5f4ea239'
Task 164997
Task 164997 | 21:06:08 | Preparing deployment: Preparing deployment (00:00:01)
Task 164997 | 21:06:09 | Running errand: pivotal-container-service/6b58ba3e-be95-43e7-a9f5-57e8812c4826 (0)
Logs
bosh logs -d deployment downloads logs from the instances of a particular deployment. Use --follow to stream them instead.
Troubleshooting
Restarting Bosh Director VM Breaks Bosh
After rebooting the Bosh Director VM, subsequent bosh calls result in:
Using environment '172.31.0.3' as client 'ops_manager'
Finding current tasks:
Performing request GET 'https://172.31.0.3:25555/tasks?state=processing%!C(MISSING)cancelling%!C(MISSING)queued&verbose=2':
Performing GET request:
Requesting token via client credentials grant: UAA responded with non-successful status code '503' response 'FAILURE'
Exit code 1
This is a known issue; see:
- https://github.com/cloudfoundry/bosh/issues/2131
- https://www.pivotaltracker.com/n/projects/956238/stories/163642242
The fix is to run monit restart all once all services except credhub have started. After a few minutes the credhub service should reach a running state, and bosh should function correctly again.
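The waiting step can be scripted instead of checked by hand. This is a minimal sketch that assumes monit summary prints one line per service in the form Process 'name' status; the function name ready_for_restart is hypothetical, and the exact output format may differ between Monit versions.

```shell
# Hypothetical check: reads `monit summary` output on stdin and succeeds
# only when at least one process is listed and every process other than
# credhub reports "running".
ready_for_restart() {
  awk -v q="'" '
    $1 == "Process" {
      seen++
      if ($2 != (q "credhub" q) && $3 != "running") bad = 1
    }
    END { exit (bad || seen == 0) }
  '
}
```

Run as root on the Bosh Director VM, a loop such as until monit summary | ready_for_restart; do sleep 10; done followed by monit restart all would apply the fix once the other services are up.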