To the Cloud! Well… the SDSC “Cloud” Anyway

About a year ago, SDSC deployed an object store based on OpenStack Swift. I really haven’t had much of a use for it, but it just seems like such a neat idea, not to mention the price. I noticed that Mathias Meyer (@roidrage) published a neat little app on GitHub called S3itch. After looking at the code I decided I could make this work with the storage at SDSC. I forked the code, which can be found at https://github.com/cwebberOps/s3itch, reworked it to use the Rackspace provider, and pointed it at SDSC. After a little digging around for the right URL I managed to get it up and running. I still need to update the README.md to reflect my changes, but here is the rundown on how UC and UC-affiliated users can get started with Skitch and the Swift storage at SDSC.

  1. Before we begin, this tutorial assumes that you are running on OS X
  2. Sign up for an account with SDSC at https://cloud.sdsc.edu/hp/request.php
  3. Once you have the account info, create a folder for this project. You will need to make it world readable.
  4. Sign up for an account with Heroku and follow the Getting Started with Ruby Apps document. (You don’t actually NEED Heroku, it just means you don’t have to run the server on your local box)
  5. Download Skitch
  6. Clone the Repo from GitHub
  7. Change directories to the newly cloned repo.
  8. Copy the .env.sample file to .env
  9. Update the .env file.
    • The HTTP variables are used for Skitch to auth to your app, so use something random.
    • The BUCKET is the directory that you created in the Cloud Explorer.
  10. Run heroku plugins:install git://github.com/ddollar/heroku-config.git to install the heroku-config plugin.
  11. Run heroku create --stack cedar to create an app on Heroku
  12. Run git push heroku master to push the app code up to Heroku
  13. Run heroku config:push to push the ENV variables in .env up to Heroku
  14. Open Skitch and the Sharing Preferences tab.
  15. Create a new WebDAV Account
    • The Server field is the name of your Heroku app
    • The User and Password fields correspond to the HTTP variables you set in the .env file
    • The Directory is left empty
    • The Base URL is the public link from the folder that you are uploading to. This can be found in the Cloud Explorer. Please note that it cannot have a trailing slash.
  16. You should now be able to Share your screenshots and Copy the links as you go.

Don’t Miss Your Life

I heard an interesting song this weekend, “Don’t Miss Your Life” by Phil Vassar, and it really got me thinking about life. A little bit about me first. I am naturally a workaholic. If it were not for my wife keeping me in check, I would probably work 80+ hours a week without thinking about it. I take tremendous pride in my work, but I really used to overdo it.

As the week started, there was a tweet by @GeorgeReese that got me thinking about this again.

If it’s socially awkward for you to leave work at 5:30, you are working for the wrong company.

While my original thought on the matter was a little different, it took me back to the song and thoughts about community and my job. As a father and a husband, I really appreciate the flexibility that UCR and my boss give me in taking care of family. I am able to work from home, leave a little early and come in a little late when I need to. I know that I am not going to miss those important moments. I have tremendous support to be a good father and husband, which really does make me a better SysAdmin.

Then there is the community aspect. I LOVE the group of people I have come to know as part of the DevOps movement. When we were at SCaLE this year it was demonstrated time and time again how important family is. One night while we were at dinner, my daughter called and I snuck away to spend some time on the phone with her. A side of me expected the guys to give me a hard time. Instead, I got nothing but excitement from the likes of @lusis, @drawks and @mattray. And, instead of feeling like I missed the conversation, we moved into talking about how awesome our kids are and all the neat things about being Dads.

Enter #dadops. While there are way too many people to list, many, many of the people involved with DevOps on twitter are also all about their families. It is awesome to be part of a group that values family so much.

So, if you aren’t getting out of the office in time to spend time with your family, you should do something about it, so you don’t miss your life.


Yo Dawg, I Heard You Like Puppet, So I Put Some Puppet In Your Puppet

I am a huge fan of the configuration management space. Whether you are running Chef or Puppet, you should be running one of them. Occasionally, though, I find myself in predicaments where there are resources I need to manage that don’t quite fit the standard configuration management paradigm. For example, I use the puppet-virt module to generate OpenVZ containers. But I need an easy way to bootstrap where the container looks to find its puppetmaster. In my case it is determined by an entry in the hosts file. Now, I could fairly easily write some bash or ruby to parse the hosts file and point it where I would like, but why not just use a language that is purpose built for making those kinds of things happen?

So there is a puppetmaster.pp that gets dropped into the new container. Once that is there, I exec a puppet apply against this manifest. This is a bit better mocked up in puppet code:

file {"/var/lib/vz/private/${id}/etc/puppet/puppetmaster.pp":
  ensure  => present,
  owner   => "root",
  group   => "root",
  mode    => "0644",
  source  => "puppet:///modules/openvz/puppetmaster.pp",
  require => Virt[$title]
}

exec {"/usr/sbin/vzctl exec ${id} puppet apply /etc/puppet/puppetmaster.pp":
  refreshonly => true,
  require     => File["/var/lib/vz/private/${id}/etc/puppet/puppetmaster.pp"],
  subscribe   => Exec["${title}-network"]
}

The puppetmaster.pp looks something like:

Exec {
  path => "/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin"
}

$puppetmaster = "puppet.fqdn"

# Set up the correct puppetmaster
case $interfaces {
  /infra/: {
    host { $puppetmaster:
      ensure => present,
      ip     => "192.168.0.8",
      alias  => "puppetmaster"
    }
  }
  /clust/: {
    host { $puppetmaster:
      ensure => present,
      ip     => "192.168.2.2",
      alias  => "puppetmaster"
    }
  }
  /web/: {
    host { $puppetmaster:
      ensure => present,
      ip     => "192.168.1.2",
      alias  => "puppetmaster"
    }
  }
  default: {
    host { $puppetmaster:
      ensure => present,
      ip     => "<public ip address>",
      alias  => "puppetmaster"
    }
  }
}

exec {"echo server = ${puppetmaster} >> /etc/puppet/puppet.conf":
  unless      => "grep master /etc/puppet/puppet.conf",
  refreshonly => true,
  alias       => "master",
  subscribe   => Host['puppetmaster']
}

service {'puppet':
  ensure     => running,
  enable     => true,
  hasrestart => true,
  hasstatus  => true,
  subscribe  => Exec['master']
}

This makes it easy to get the host entry set up correctly and saves me from spending time parsing files. While I usually work towards using a puppetmaster, it is nice to be able to arbitrarily apply basic settings.
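
The four host blocks in puppetmaster.pp only differ by IP address, so a more compact variant is possible. What follows is only a sketch, not what I actually run: it uses a selector instead of the case statement, reuses the same $interfaces patterns and addresses, and swaps the alias for the host type's host_aliases parameter. That last swap is my assumption; with it, anything that subscribes to Host['puppetmaster'] would need to reference Host[$puppetmaster] instead.

```puppet
$puppetmaster = 'puppet.fqdn'

# Pick the puppetmaster IP once, based on which interface names are present.
# The patterns and addresses are the ones from the manifest above.
$puppetmaster_ip = $interfaces ? {
  /infra/ => '192.168.0.8',
  /clust/ => '192.168.2.2',
  /web/   => '192.168.1.2',
  default => '<public ip address>',
}

# One host resource instead of four near-identical copies.
host { $puppetmaster:
  ensure       => present,
  ip           => $puppetmaster_ip,
  host_aliases => ['puppetmaster'],
}
```

The trade-off is that each branch can no longer carry extra resources of its own, which the case statement allows if a network ever needs more than a host entry.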


NoOps? Wuh? Huh?

In the circles I run with on Twitter there has been a huge discussion about why the idea of NoOps is just flat out wrong. Suffice it to say, I tend to agree with the camp that opposes the idea of NoOps. But really, it got me thinking about why so many of us take issue with the idea.

The Writing On The Wall

As this discussion unfolded, my mind kept going back to the problems we see in and around the idea of security. There are many places that believe that security is taken care of by the Security group. It has been proven time and time again that this doesn’t work. Everyone has a responsibility for security. Developers need to write their code so that it is secure. Ops needs to ensure things are patched and locked down. The security team needs to know how both of the other groups do their job so they can provide suggestions and help with configuration along with normal duties of auditing and forensic analysis. Let’s face it, a NoSec approach is gonna get you in trouble.

Ops is Everyone’s Responsibility (So is Development)

There, I said it. Everyone has Ops responsibilities, just like everyone has Security responsibilities, just like everyone has Dev responsibilities. This is at the core of the DevOps movement. Make Dev responsible for the way their apps work in production and make Ops responsible for making sure that the environment is suitable for Dev to get work done. Sure, my work today looks a lot more like development than it did a few years back, but it has not fundamentally changed what I am responsible for making happen; I just happen to be leveraging different tools. Ops has nothing to do with lighting up DNS and DHCP services. It is about ensuring that the infrastructure and other components that are needed for the app to operate well are there. Now, that may very well mean writing code to manage the number of EC2 instances running at a given time or polling the SaaS we are using to verify SLAs. Either way, it is still Ops.

Fundamentally, I think Mark Imbriaco (@markimbriaco) hit the nail on the head when he said, “I object to NoOps because it is a lie. Somebody always does Ops.” There will always be Ops, it may just be done by someone that calls themselves a Developer.


CERT and Operations: A Disaster Made in Heaven

As operations folks, dealing with emergencies and disasters is in our blood. I have yet to meet an Ops person that couldn’t recount some war stories about a large scale outage or failure in an environment that they have worked in. In addition to the large scale failures, we generally spend a fair amount of our time fighting fires on a regular basis. As the DevOps movement has taken off, one of the bigger things that I have taken away is that we should be looking at other professions and disciplines that have already solved the issues we are facing. This is where CERT, and in many ways, more importantly, ICS, come into play.

When I say CERT, many in the community think of US-CERT, the United States Computer Emergency Readiness Team. While that is an important CERT, it is not the CERT I will be referring to today. CERT, or Community Emergency Response Team, falls under the FEMA (Federal Emergency Management Agency) program called Citizen Corps. Semantics aside, CERT is there to empower citizens to be able to sustain themselves after a large scale disaster where they may be without government support for possibly 72 hrs or more. At CERT’s core there is a 20 hr course that teaches:

  • Disaster Preparedness
  • Fire Safety (which includes actual training and usage of a fire extinguisher)
  • Disaster Medical Operations
    • First Aid
    • Triage
  • Light Search and Rescue
    • How to assess whether or not to enter a building
    • Cribbing
    • Basic search techniques
  • Incident Command System
  • Disaster Psychology

Enough of the sales pitch, why does this matter to the average Ops person? First off, many of us, while not first responders, are responsible for the critical infrastructure that makes things work. Whether it is the networking that makes everything possible, or the web presence that needs to be maintained so municipalities can order the supplies needed to begin dealing with the disaster, there are all sorts of ways that we as Ops folks are needed in a disaster. We do what we can to prep for disasters at work, why not do the same at home?

Then there is the way this applies in the office. Beyond the basics of knowing how to use the fire extinguisher or knowing how to get that row of servers off your buddy when they fell over during an earthquake, there are some real lessons to be learned from the Incident Command System, or ICS. ICS was developed by the fire service to handle scaling resources during an incident. ICS scales from a small brush fire all the way to incidents as large as Hurricane Katrina. The Incident Command System lays the foundation for being able to deal with large scale and small scale issues. It deals with the questions of:

  • Who’s in charge?
  • How does interaction between different teams happen?
  • How do we get more resources?

In his talk, GameDay: Creating Resiliency Through Destruction at LISA 2011, Jesse Robbins (@jesserobbins) talks about how during his time at Amazon, he essentially took the Incident Command System and substituted Management for Command to come up with their procedures on how to deal with incidents. CERT gives the basic intro to ICS and then you can add on (or start with if you like) courses from the FEMA Emergency Management Institute Independent Study Program. In general, IS-100 and IS-200 are the suggested starting points.

Hopefully I have convinced you that CERT is a worthwhile investment of 20 hrs. In case I haven’t, there are a few other things worth knowing. This is generally free and usually you walk away with a bag of personal gear to get you started. That’s right, there is SWAG involved and it’s not just a t-shirt. But really, this is something we should all be doing to be good citizens. The career building aspects are just an awesome win.



SCaLE 10x: Recap

The annual Southern California Linux Expo was this past weekend. Put simply, it rocked. Everything about it this year was awesome. As usual the talks were great and the hallway track was phenomenal. That said, I want to highlight some of the people and talks/software that stuck out to me.

The People

I met a number of new people in the “hallway track” this year. Many of them were doing the same things as I am, usually at larger scale, but there were a few that I got a chance to meet in real life that have had a large impact on our community and profession. Those people are listed below, in no particular order.

John Vincent (@lusis)

I got a chance to spend quite a bit of time with John this weekend and he is just as fun in person as he is on twitter. It was awesome to get a chance to hear his feelings on different aspects of cloud computing, operations and the ideas of DevOps. After spending a few days with John, I am even more excited to see the amazing things he is going to do in the coming years.

John Willis (@botchagalupe)

Meeting this John was amazing as well. With his years of experience he definitely is asking the hard questions and sees some of the bigger picture issues with the industry. I am looking forward to the next time I get a chance to sit down and have a beer with this man and talk shop.

Matt Ray (@mattray)

Matt is just one of those awesome guys that is a part of some of the largest projects shaping the future of our industry. His work with Chef, OpenStack and Crowbar is just amazing. I loved the back and forth about Puppet and Chef. I feel like a better Ops guy just by getting a chance to spend time talking to him.

Carl Caum (@ccaum)

So while I definitely showed Chef some love this weekend, I am still a Puppet guy at heart. Getting a chance to meet Carl and talk a little about Puppet was great. Even better was watching Carl explain Puppet to people.

Brendan Gregg (@brendangregg)

Brendan Gregg falls directly into the celebrity category for me. He is someone that has just done amazing work and been directly responsible for making things that have made my life easier. It was great to see his talk (which, consequently, is going to cause me to go back and look at things on my file servers) and then get a chance to sit and talk to him at the meetup later that night. On a side note, I am super excited that I got a SIGNED copy of the DTrace book.

Deirdré Straughan (@DeirdreS)

Deirdré plays an important role in the Illumos and Joyent community. Whether it be making sure that videos are made available or facilitating the conversations that take place or need to take place, she is out there keeping us all aware of what is going on.

The Talks and Software

nVentory (http://nventory.sourceforge.net)

This just looks awesome. It does the work of inventorying systems and gives me the physical datacenter management tools that I have been looking for along with direct Puppet integration. This looks like it is just going to be one big win.

HAProxy and haproxyctl

I had heard about HAProxy before but hadn’t really looked much at it. The talk by @lolcatstevens was a great introduction to HAProxy and I can’t wait to give it a spin.

Chef

This was the first time I got a chance to really understand the fundamental differences between Chef and Puppet. As I said many times this weekend, I truly believe that over the next few years, being a good Ops person will require knowing both at least well enough to read them. Chef is doing some amazing things and has some really neat features. I am looking forward to the new concepts that come out of Opscode.

DTrace and Performance

Putting aside the fact that Brendan Gregg himself gave this talk, it was just awesome. If you are running Solaris/OpenIndiana/NexentaStor, you need to go look at the slide deck. I can’t wait to take the commands to my NFS servers. Here is a link to the slide deck. http://dtrace.org/blogs/brendan/files/2012/01/scale10x-performance.pdf

Monitoring Sucks

The panel was very interesting and really highlighted the frustrations we all have. The responses and later discussions with Simon from Zenoss and James from PagerDuty were great. I think my biggest takeaway is that monitoring is really a number of things and we need to focus on the smaller aspects of it to solve problems.

Additional Reading

As I see blog posts recapping SCaLE I will try to add them here.


The End of an Era

Today marks the end of an era for me. At lunch I am handing off all the old personal Sun SPARC gear that I have collected in years past. For those that lived it, it is an Ultra 10, an Ultra 60 and a StorEdge MultiPack. Until now, as a Sysadmin, I have always been in a Solaris environment, and up until recently that was almost entirely SPARC. But now, I have moved on. I now work in an x86 Linux world. Cloud, virtualization and commodity hardware have changed the game. Illumos is certain to continue to be a player but Sun, SPARC and Solaris proper have met their fate. So as I see those systems go, here is to the great memories and the beginnings of my career as a Sysadmin.


New Year’s Resolutions 2012

One of the coolest things about maintaining a blog has been the ability to go back and look at things. I have spent a few minutes this morning going back and looking at my resolutions for 2011. I am happy to report that I achieved all of the professional things that I set out to do. Unfortunately, I didn’t achieve a single one of the personal things.

So where to from here? Here is my list of resolutions for the upcoming year:

Professional

  • Speak at UCCSC (the UC internal IT conference)
  • Find project management strategies that work for my facility
  • Learn Chef
  • Develop a proof of concept for cluster users to be able to take advantage of elastic compute services

Personal

  • Drink at least 24oz of water a day
  • Take a long walk at least once a week
  • Cook at home more frequently

Here is to a 2012 that is filled with new adventures, new opportunities and lots of time spent with friends and family.


Sharing the Joy

Vacation is one of the most important things we can do in our professional lives. It gives us a chance to get out of the normal routine and clear our minds, etc. For me vacation provides opportunities to spend time with family and reflect on one of my greatest passions, systems administration. While that is a bit of a paradox, I really do enjoy being a sysadmin, and even more so the systems engineering tasks that come along with that.

When my wife and I travel we really enjoy walking the nearby college campuses and taking a stroll through the bookstore. This year while in Phoenix, AZ I took that one step further. After determining that I had no contacts with the HPC group at Arizona State University, I took a leap of faith and called the number listed on their site. Once I explained that I was a sysadmin from UC Riverside, the coordinator took down my info so she could arrange for a tour. A few hours later I had a call back and an appointment to not only tour the facility, but a chance to sit down and talk shop with the sysadmin team at the ASU Advanced Computing Center.

Minus the pain that is parking on just about every college campus I have ever been on, the visit was awesome. They are doing some neat things at ASU including supporting an active space flight mission. The chance to talk vendors, queuing systems, software installs and even how we are all dealing with budget cutbacks was good. The team seemed very knowledgeable and dedicated to providing an excellent end user experience. More than anything, I want to thank the team at ASU for taking time out of their day to talk shop.

In closing, I want to invite all of you involved in systems administration to swing on by when you are in town. The opportunity to talk shop and learn from other sysadmins is always welcomed and appreciated.

Check out the ASU Advanced Computing Center at http://a2c2.asu.edu


Configuration Management in Small Environments

Configuration management “just makes sense” in large environments where there are lots of machines that all do the same thing. But it is just as important, albeit in slightly different ways, in small environments. Configuration management provides solutions for a number of issues that are exacerbated in small environments.

Documentation

The configuration management code documents the infrastructure. As long as you understand the lingua franca (which happens to be Puppet and Chef these days) you can fairly easily figure out how a system is configured. Not only that, but if you want to do testing before making changes, you now have an easy way to reproduce that system. This also means that the one-off system that you spent weeks configuring can easily be set up again if something goes wrong.

Disaster Recovery

In most environments, but especially in smaller environments, it is not the system that is of any value, it is the data. If you get the configuration to a point where it is trivial to reproduce systems, you can focus on the bigger task at hand, the data. With the configuration properly managed, you can do full test restores of entire components of your environment to make sure things are as expected. This also means less downtime for your environment in the case of a disaster. Now instead of spending days or weeks rebuilding systems before you can even begin restoring the data, you have a simple way to get the systems configured and can focus on the data restore.

Hiring

Employee movement is another huge win for configuration management. Whether it be a new hire or that single admin leaving for greener pastures, there is now a place where all the config is defined, and how the environment works at a basic level can be discerned from some manifests or recipes. Instead of training on all the specifics of the environment, you can ensure that the new hire understands the lingua franca (Puppet or Chef) and then they will be ready to start figuring it out for themselves.

Collaborating

With a configuration management system it becomes easy to share experiences and entirely configured environments. In these days of complex systems, I can give a colleague a sanitized, but complete, copy of how I am building out a new system and they can help to troubleshoot not only problems with specific config files but also interactions between applications in the stacks that seem to frequently be deployed. Additionally, I can publish my sanitized config to something like GitHub or Puppet Forge and share the things I have done with others that need to do the same. Likewise, instead of spending weeks trying to figure out how to get something set up, I can pull down someone else’s module or cookbook and get a running start.

Is it really that small?

Even in small environments, there are usually a fair number of systems. In many cases there will be 5-10 or more systems just to provide basic functionality. While each of those systems is likely providing unique services, there are some things that are likely to be the same. Whether it be the NTP configuration, name services or SSH, that config is going to be the same on all of the systems or only vary on one or two. Why not move these config files into a place where you have 3-5 config files instead of 30+ to manage?
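
As a rough sketch of what that looks like in Puppet, here is one class that every node pulls in, so the NTP config lives in exactly one place. The class name site_ntp, the pool servers and the ntpd service name are all assumptions for illustration; the package and service names in particular vary by distro.

```puppet
# Hypothetical shared class: one definition of the NTP setup for every system.
class site_ntp {
  package { 'ntp':
    ensure => installed,
  }

  # The single copy of ntp.conf that replaces one hand-edited file per host.
  file { '/etc/ntp.conf':
    ensure  => file,
    owner   => 'root',
    group   => 'root',
    mode    => '0644',
    content => "server 0.pool.ntp.org\nserver 1.pool.ntp.org\n",
    require => Package['ntp'],
    notify  => Service['ntpd'],
  }

  # Restarted automatically whenever the config file changes.
  service { 'ntpd':
    ensure => running,
    enable => true,
  }
}

# Every node without a more specific definition gets the shared config.
node default {
  include site_ntp
}
```

The one or two systems that need something different can get their own node definition, while the other systems keep sharing this one.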

At first, configuration management takes a bit of extra effort, but over time the wins gained by deploying a configuration management system dwarf the initial cost. Not only can you sleep better at night as an admin of these small environments, you end up with more time to do cool things, and it is easier to do those cool things because of the configuration management system.
