Tools for Troubleshooting Application Deployment Issues in Cloud Foundry
Our standard demo for Cloud Foundry has us in a directory where either some source code or an application package (war, zip, etc.) is sitting, and then we do a push with the cf cli.
A handful of messages will appear on the screen, like:
    Uploading hello... OK
    Preparing to start hello... OK
    -----> Downloaded app package (4.0K)
    -----> Using Ruby version: ruby-1.9.3
    -----> Installing dependencies using Bundler version 1.3.2
    Running: bundle install --without development:test --path vendor/bundle --binstubs vendor/bundle/bin --deployment
    Fetching gem metadata from http://rubygems.org/..........
    Fetching gem metadata from http://rubygems.org/..
    Installing rack (1.5.2)
    Installing rack-protection (1.5.0)
    Installing tilt (1.4.1)
    Installing sinatra (1.4.3)
    Using bundler (1.3.2)
    Your bundle is complete! It was installed into ./vendor/bundle
    Cleaning up the bundler cache.
    -----> Uploading droplet (23M)
    Checking status of app 'hello'....
    1 of 1 instances running (1 running)
    Push successful! App 'hello' available at http://hello.cdavisafc.cf-app.com
This shows that the application was uploaded, dependencies were downloaded, a droplet was uploaded and the application was started. And that is all fine and good, but what happens when something goes wrong? How can the application developer troubleshoot this?
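Before getting to the tools, it helps to be able to say how far a push actually got. Here is a minimal sketch, in Python, that scans captured push output for the messages shown above and reports the first stage that never appeared. The stage names and the helper functions are my own invention for illustration, and the marker strings are simply copied from this one sample run; other buildpacks or cli versions may print different text.

```python
# Hypothetical helper: stage names are made up for illustration, and the
# marker strings come from the sample push output above, not from any
# documented cf interface.
STAGES = [
    ("app upload", lambda line: line.startswith("Uploading ") and line.endswith("OK")),
    ("staging", lambda line: line.startswith("-----> Downloaded app package")),
    ("droplet upload", lambda line: line.startswith("-----> Uploading droplet")),
    ("app start", lambda line: line.startswith("Push successful!")),
]


def stages_reached(push_output):
    """Return, in lifecycle order, the stages whose marker line appears."""
    lines = [line.strip() for line in push_output.splitlines()]
    return [name for name, matches in STAGES
            if any(matches(line) for line in lines)]


def first_missing_stage(push_output):
    """Return the first stage with no marker in the output, or None."""
    reached = set(stages_reached(push_output))
    for name, _ in STAGES:
        if name not in reached:
            return name
    return None
```

If the output stops after the "Downloaded app package" line, `first_missing_stage` returns "droplet upload", which suggests the problem sits somewhere in staging. That is only a hint about where to look, not a diagnosis.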
The answer is multi-faceted and in this note I will try to organize things a bit.
First, let me list the different tools someone might have at their disposal, with a brief note on what each one offers for app troubleshooting:
- the cf cli
  - cf apps command – This should be very familiar; it simply shows you the apps you have deployed and an indication of their health.
  - cf logs command – This will show you the contents of the files found in the logs directory of the warden container. These contents will vary depending on where in the app deployment process you are when investigating.
  - cf files command – This will show you the filesystem contents of the warden container. These contents will also vary depending on where in the app deployment process you are when investigating.
- the bosh cli
  - bosh logs command – This will tar up and download the files found in the /var/vcap/sys/logs directory on the targeted VM. In general, the logs from the dea will probably be the most helpful (dea logs and warden logs), with perhaps something of note in the cloud controller logs.
- ssh into CF VMs
  - There is a trick to this when running in the AWS VPC; see this thread: https://groups.google.com/a/cloudfoundry.org/forum/#!topic/bosh-users/Zc0IHbPC47k
  - In most cases this probably won’t bring you anything that the bosh logs command doesn’t already, except for this next thing…
- wsh (warden shell) into the warden container for the application
  - This is only possible if the application was entirely staged and is up and running. In the event that the application is “flapping,” the warden containers are likely getting killed and recreated on a pretty short interval, and it will be hard to get much from wsh-ing in.
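Since the cf cli is the one tool every developer has, it can be handy to grab all three views in one go. The following Python sketch runs cf apps, cf logs and cf files and collects whatever they print. Treat it as an illustration only: the function names are hypothetical, and passing the app name as the sole argument to cf logs and cf files is an assumption that may not hold for every version of the cli.

```python
import shutil
import subprocess


def cf_diagnostic_commands(app_name):
    """Build the three cf commands from the list above.

    Assumption: cf logs and cf files accept the app name as their argument.
    """
    return [
        ["cf", "apps"],
        ["cf", "logs", app_name],
        ["cf", "files", app_name],
    ]


def collect_diagnostics(app_name, runner=subprocess.run,
                        which=shutil.which, timeout=60):
    """Run each command; return a dict of command string -> captured output."""
    results = {}
    for cmd in cf_diagnostic_commands(app_name):
        key = " ".join(cmd)
        if which(cmd[0]) is None:
            results[key] = "cf cli not found on PATH"
            continue
        try:
            # The timeout guards against a command that never returns.
            completed = runner(cmd, capture_output=True, text=True,
                               timeout=timeout)
            results[key] = completed.stdout + completed.stderr
        except subprocess.TimeoutExpired:
            results[key] = "timed out"
    return results
```

Calling `collect_diagnostics("hello")` against the demo app would return one entry per command. The `runner` and `which` parameters exist only so the function can be exercised without a real cf installation.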
Here’s the thing… ultimately your application developer will only have access to the first of these things (the cf cli) and once your cloud is stable, this should be sufficient. While you are getting the kinks worked out of your PaaS deployment, however, the other tools can be very helpful. One other thing to note is that if your developers are enabled with some type of micro-cloud foundry on their workstations, then while they may not have bosh, they would be able to ssh into that machine and poke around, for example, getting to the dea logs directly. I do this all the time on my laptop.
Okay, so now with this list of tools, I’ve crafted the following diagram to give some guidance on what tools will help when investigating things during different stages of the application deployment process. There is definitely a bit of a trick to figuring out where in the lifecycle something went wrong, but even trying to use a prescribed tool for something will give you a hint.