Kubernetes pre-101 (brief introduction)

Before a better blog entry detailing a real, production-ready example of dockerized micro-services (using free/open tools; most of the work and design could be reused if we moved to external cloud providers like GCP or AWS ECS), I'll introduce another key technology in the ecosystem: the container management engine.

From a previous post: Docker: production usefulness

Running Docker in a safe, robust way in a typical multi-host production environment requires very careful management of many variables:

  • secured private image repository (index)
  • orchestrating container deploys with zero downtime
  • orchestrating container deploy roll-backs
  • networking between containers on multiple hosts
  • managing container logs
  • managing container data (db, etc)
  • creating images that properly handle init, logs, etc
  • much much more…

Time to eat my words (or my quotes). Let's present Kubernetes:

A brief summary of what it is: https://github.com/GoogleCloudPlatform/kubernetes/blob/master/DESIGN.md

Kubernetes is a system for managing containerized applications across multiple hosts, providing basic mechanisms for deployment, maintenance, and scaling of applications.

Kubernetes uses Docker to package, instantiate, and run containerized applications.

Kubernetes enables users to ask a cluster to run a set of containers. The system automatically chooses hosts to run those containers on.

The scheduler needs to take into account individual and collective resource requirements, quality of service requirements, hardware/software/policy constraints, affinity and anti-affinity specifications, data locality, inter-workload interference, deadlines, and so on.

The atomic element for Kubernetes is the pod.

Pods simplify application deployment and management by providing a higher-level abstraction than the raw, low-level container interface. Pods serve as units of deployment and horizontal scaling/replication. Co-location, fate sharing, coordinated replication, resource sharing, and dependency management are handled automatically.

A pod corresponds to a colocated group of Docker containers with shared volumes.

Pods facilitate data sharing and communication among their constituents.

Their use:

Pods can be used to host vertically integrated application stacks, but their primary motivation is to support co-located, co-managed helper programs, such as:

  • content management systems, file and data loaders, local cache managers, etc.
  • log and checkpoint backup, compression, rotation, snapshotting, etc.
  • data change watchers, log tailers, logging and monitoring adapters, event publishers, etc.
  • proxies, bridges, and adapters
  • controllers, managers, configurators, and updaters
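
To make the pod concept concrete, here is a minimal sketch of what a pod definition looks like as a YAML manifest (the exact field names vary across API versions, and the names and images here are purely illustrative). It co-locates a web server with a log-tailing helper, both sharing a volume:

apiVersion: v1
kind: Pod
metadata:
  name: web-with-log-tailer            # illustrative name
spec:
  volumes:
    - name: logs                       # volume shared by both containers
      emptyDir: {}
  containers:
    - name: web                        # the main application container
      image: nginx
      volumeMounts:
        - name: logs
          mountPath: /var/log/nginx
    - name: log-tailer                 # co-located helper container
      image: busybox
      command: ["sh", "-c", "tail -F /logs/access.log"]
      volumeMounts:
        - name: logs
          mountPath: /logs

Both containers are scheduled on the same host, live and die together, and see the same files through the shared volume.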

Barcelona AWS Interact 2014

Last week (12/2/2014) Amazon presented in Barcelona some of the sessions from their AWS re:Invent 2014.

The agenda: http://awsinteract-idgtv.webcasting-studio.net/#/agenda

Living here, I went to the event to take some notes and ask some questions.

9:00 Biking around looking for the site

It's difficult to check Waze or Google Maps on a bike. I spent 30 minutes looking for the street without finding it because it was closed: there was a f*cking circus in front of the event.

10:00 Entrance

Queues at the entrance; people not too young, not too old, few girls, few suits; the venue an old warehouse reinvented as a conference building, minimalistic style, all a bit hipster. (There are photos under the Twitter hashtag #AWSInteract.)

10:30 Queues at the expert spot

Luckily I'd gravitated there early and was able to ask some questions about extending our datacenter, VPC, and extracting data from our AS/400 for processing with Hadoop, MapR or EMR.

14:15 Welcome and Keynote (by Carlos Conde  https://twitter.com/caarlco )

Private clouds are a thing of the past; they won't exist by 2020 (a Forrester or Gartner quote; like all forecasts, sometimes they guess right and sometimes they don't, and the trick is guessing both ways depending on your audience. With AWS, today, private clouds meet the dodo).

Their reasoning is that the innovation rate of the big three can't be matched by companies whose core business isn't IT. The innovation-rate idea has some weight: private clouds don't have the dynamism and services catalog of AWS or GCP and will always be somewhat behind; whether that is critical or not is another matter (we run production code that is 30 years old!). On the other hand, innovation happens everywhere.

The core-business argument is funny coming from “a book-selling company”. I remembered (maybe a bit later, when talking about agility) an article by the CEO of BBVA (one of the biggest banks in Europe): Banks need to take on Amazon and Google or die – FT.com

Aurora (>2015), a new DB service, MySQL compatible, with better performance than MySQL (5x), HA and fault tolerance. http://aws.amazon.com/rds/aurora/

KMS  (key management service) http://aws.amazon.com/kms/

A new centralized key management service.

A change of paradigm in production roll-outs 

(explained and commented below in “Container management”)

Lambda  (in beta, early-mid 2015, available for preview) http://aws.amazon.com/lambda/

It's an engine for executing stateless functions in reaction to events. Those functions can be provided by AWS (its API and library/software functions for ETL) or programmed by the customer (in JavaScript using node.js; other languages like Python, Java or Scala are in the pipeline).

The example was a lambda function executed when an image was dropped into S3 storage: it fetched the image, made a thumbnail and stored the reduced image in a DB table, without any extra plumbing code. They compared it to bringing DB-trigger functionality to all the AWS services.

The pricing model is per execution ($0.0000002 per request), memory allocation and time spent (100 ms quantum). Better check http://aws.amazon.com/lambda/pricing/
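
A quick back-of-the-envelope sketch of how that adds up (the per-GB-second compute rate is the one on the pricing page, so check it there; the numbers here are only illustrative): one million invocations of a 128 MB function running about 200 ms each cost 1,000,000 × $0.0000002 = $0.20 for the requests, plus 1,000,000 × 0.2 s × 0.125 GB = 25,000 GB-seconds of compute, which at roughly $0.0000167 per GB-second is around $0.42 more, before the free tier.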

15:00 Customer Experiences

Two Spanish customer experiences: the migration/setup in AWS of segundamano.es (a second-hand sales portal) by Schibsted, and an online game (Golden Manager, a free-to-play soccer manager game).

Nothing revolutionary, but well planned: open source tools, low costs, run by few people.

16:00 Coffee

Nespresso capsules

17:00 Infrastructure as Code (by Carlos Conde)

AWS Elastic Beanstalk http://aws.amazon.com/elasticbeanstalk/

Their original service (maybe their classic cloud model). A simple way to script the deployment and scaling of simple apps (web applications, typical 3-tier apps).

AWS CloudFormation  http://aws.amazon.com/cloudformation/

This is Infrastructure as Code, or IaC++, whatever you want to call it.

The whole platform is modeled in JSON templates/syntax. Changes to those templates are changes to the infrastructure.

The creation of all the services (provisioning, apps, DB, networking, balancers… everything in AWS could be called a service) can be declared and scripted.

The atomic unit is called a stack (and it reminds me a lot of Kubernetes's pods).
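
To give an idea of what those templates look like, here is a hypothetical minimal one (not from the talk) declaring a stack with a single EC2 instance; the AMI id is a placeholder:

{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "Minimal stack: one web server",
  "Resources": {
    "WebServer": {
      "Type": "AWS::EC2::Instance",
      "Properties": {
        "ImageId": "ami-xxxxxxxx",
        "InstanceType": "t2.micro",
        "Tags": [ { "Key": "Name", "Value": "cfn-demo" } ]
      }
    }
  }
}

Creating the stack from the template creates the instance; deleting the stack deletes everything it declared.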

The goal is: NO SSH. (A nice goal, by the way; I fully agree. It's also related to the micro-services paradigm and the Docker container model.)

DR (Disaster recovery environment)

Easy to set up and clone.

Documentation

No more Visios or outdated PowerPoints: the documentation is in the code. (They didn't give examples, just a slide and a brief comment. The image that comes to my mind is GitHub repos and their .md text files; the templates are going to be versioned in GitHub or some clone, so it makes sense to document them there too, tying code and documentation together.)

Game Days

motto: if you didn't measure it, it didn't happen (rule #32 for Amazon?)

It's a service offered by AWS for training customer staff. They clone the customer's environments and, during a day or a weekend, they play BOFH so the customer's staff learns to detect and respond to incidents (among their favorites: changing the hosts files but also changing IPs, halting DBs, putting servers on other networks, corrupting running services…).

The load-testing methods are interesting (well, the first one is; the second one is meh).

1) Replay network load (they record the network traffic of a typical day and rerun it, so the traffic is real, cool!). A minimal sketch with standard tools follows below.

2) Synthetic load tests (JMeter, etc.).
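
As an illustration of the first method with standard off-the-shelf tools (this is just the idea, not how AWS described doing it; for HTTP services you would normally replay at the application level rather than the packet level):

$ sudo tcpdump -i eth0 -w typical-day.pcap 'port 80'       # record a typical day of real traffic
$ sudo tcpreplay --intf1=eth0 --mbps=100 typical-day.pcap  # replay it later at a controlled rate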

17:15 Container Management

motto: Agility is Key

“Speed of iteration > iteration quality” or, put another way, “fast releases in small units > one big release”

(from the keynote)

Micro-services model: most of the items on the Amazon sites are micro-services. Amazon rolls out 95 production changes per minute (AWS included, I think; they didn't specify). Each micro-service is maintained by a team of 5 people, and the motto here is “you build it, you run it”: those teams not only code but also maintain and run their services.

The most successful start-ups using AWS do a production roll-out once every 20 minutes.

Amazon EC2 Container Service (+2015)  https://aws.amazon.com/ecs/  and http://aws.amazon.com/blogs/aws/cloud-container-management/

Container (Docker) management on EC2 instances.

Task = set of containers.

JSON files defining container relationships.

Still in beta (open for preview, not for production)

The look, the functionality, the syntax: it's a Kubernetes clone integrated with AWS.
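
To give an idea of the syntax, a task definition is a JSON file listing the containers, their resources and the links between them. A hypothetical sketch (field names may differ in the preview), with a wildfly container linked to a database one:

{
  "family": "wildfly-app",
  "containerDefinitions": [
    {
      "name": "wildfly",
      "image": "ackward/wildfly",
      "cpu": 512,
      "memory": 1024,
      "portMappings": [ { "containerPort": 8080, "hostPort": 80 } ],
      "links": [ "db" ],
      "essential": true
    },
    {
      "name": "db",
      "image": "postgres",
      "cpu": 256,
      "memory": 512,
      "essential": true
    }
  ]
}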

17:30 Security Operations AWS (Enrique Lopez)

Security is shared with the client: securing the content and the services inside the instances is the customer's responsibility.

Data doesn't leave the region that holds it. In Europe there are 2 regions: Ireland and Frankfurt (Germany).

3 types of data encryption:

  • managed by aws (keys and software)
  • shared (software by aws, keys by customers)
  • managed by client (keys and non aws software)

CloudHSM http://aws.amazon.com/cloudhsm/

A dedicated HW appliance for each client; the keys are exclusive to the client and AWS doesn't know them. KMS, on the other hand, would be the shared option.

VPC (Virtual Private Cloud) http://aws.amazon.com/vpc/

VPC lets you configure a set of AWS instances as a private cloud, an extension of your datacenter, using VPN/IPsec.

AWS Config http://aws.amazon.com/config/

Still in preview, it's an inventory-management service for tracking and monitoring changes in the configuration of the services, access permissions, code, etc.

Trusted Advisor https://aws.amazon.com/premiumsupport/trustedadvisor/

It's an automated cost optimizer and it has security features: it checks all your services and lists your configuration and its risks (open permissions, open fw rules, read-write permissions, …).

It’s a “Best Practices” automation tool.

Penetration testings https://aws.amazon.com/security/

There's no problem: clients can arrange them whenever they want, but AWS must be notified beforehand using a form on the web.

18:00 Mobile support / Amazon App Store

Skipped (it was late and I wasn't very interested; I can always look it up later if I need that intel).

Docker: Attack on Wildfly

Somehow in my previous blog entry, Docker: production usefulness, I gave a negative impression (I don't know why; it isn't as if I were answering the main questions with NOs in red, bold and supersized). But Docker really is a disruptive technology that works, so let's use it for a near-future platform change: the migration of our app servers to Red Hat JBoss 7 (now WildFly), from a classic infrastructure to a cloud/dockerized one.

Docker Images

The goal is an image that can be instantiated as a domain controller (dc) or as a node, each execution adding its service/container to the JBoss domain seamlessly. This domain and its services must meet production quality.

The problem with the existing Docker images.

JBoss (Red Hat) has an official Docker image (jboss/wildfly) and Dockerfile, with documentation on how to extend it.

The first problem comes from its origin: “jboss/wildfly” depends on “jboss/base-jdk:7”, which depends on “jboss/base:latest”, which depends on “fedora:20”. Fedora isn't production software (at least not here in Spain); those images are tainted, and we must run our services on supported systems. For our research we'll use CentOS, so we can change it to RHEL later.

In any case, this means we have to recreate a new Dockerfile, not sharing images with the official repo but importing the inner workings of all those Dockerfiles. There's another option: fork all those repos changing only their FROM; that way we'd keep the advantage of pulling and merging updates. For simplicity, and to avoid creating/uploading anything to GitHub, I chose the former.

FROM centos:centos7

# ENV http_proxy <PROXY_HOST>:<PROXY_PORT>
# ENV https_proxy <PROXY_HOST>:<PROXY_PORT>

# imported from jboss/base
# Execute system update
RUN yum -y update

# Install packages necessary to run EAP (from jboss/base)
RUN yum -y install xmlstarlet saxon augeas unzip tar bzip2 xz

# Imported from jboss/jdk7
# Install necessary packages (from JDK7 Base)
RUN yum -y install java-1.7.0-openjdk-devel
RUN yum clean all

# Imported from jboss/base
# Create a user and group used to launch processes
# The user ID 1000 is the default for the first "regular" user on Fedora/RHEL,
# so there is a high chance that this ID will be equal to the current user
# making it easier to use volumes (no permission issues)
RUN groupadd -r jboss -g 1000 && useradd -u 1000 -r -g jboss -m -d /opt/jboss -s /sbin/nologin -c "JBoss user" jboss

# Set the working directory to jboss' user home directory
WORKDIR /opt/jboss

# Specify the user which should be used to execute all commands below
USER jboss

# Imported from jboss/jdk7
# Set the JAVA_HOME variable to make it clear where Java is located
ENV JAVA_HOME /usr/lib/jvm/java

# Imported from jboss/wildfly
# Set the WILDFLY_VERSION env variable
ENV WILDFLY_VERSION 8.1.0.Final

# Add the WildFly distribution to /opt, and make wildfly the owner of the extracted tar content
# Make sure the distribution is available from a well-known place
RUN cd $HOME; curl http://download.jboss.org/wildfly/$WILDFLY_VERSION/wildfly-$WILDFLY_VERSION.tar.gz \
    | tar zx && mv $HOME/wildfly-$WILDFLY_VERSION $HOME/wildfly

# Set the JBOSS_HOME env variable
ENV JBOSS_HOME /opt/jboss/wildfly

# Expose the ports we're interested in
EXPOSE 8080 9990

# Set the default command to run on boot
# This will boot WildFly in standalone mode and bind to all interfaces
CMD ["/opt/jboss/wildfly/bin/standalone.sh", "-b", "0.0.0.0"]

 

This Dockerfile will build us an image similar to the official one, but using CentOS instead of Fedora.

$ sudo docker build --tag ackward/wildfly .
$ sudo docker images
REPOSITORY           TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
ackward/wildfly       latest              1c4572ac308d        4 days ago          680.8 MB
<none>               <none>              102ff2048023        4 days ago          467.7 MB
centos               centos7             ae0c2d0bdc10        2 weeks ago         224 MB

Executing them is as easy as (following the documentation):

$ sudo docker run -it ackward/wildfly /opt/jboss/wildfly/bin/domain.sh -b 0.0.0.0 -bmanagement 0.0.0.0

or for a standalone instance:

$ sudo docker run -it ackward/wildfly

They execute and work, kinda: nothing is configured yet, but that's expected. So we continue to the next step: configuring the JBoss domain service.

Configuring JBoss Domain Service.

For the neophytes: JBoss App Server has two operation modes, standalone, where each server manages itself, and domain mode, where a domain controller manages several nodes, similar to how WebSphere Deployment Manager or WebLogic are implemented.

The first thing to consider is how we are going to manage the configuration files: how they are going to be accessed, tracked and versioned, how the logs can be accessed, how the apps are deployed, etc.

It's important because the Docker paradigm tries to achieve independence from the data: each execution of an image is fresh, and containers should be as easy as possible to create and destroy on the fly (tying them to data isn't the best way to achieve that).

The solution is already implemented in our environments (prod included). When I designed the migration (or runaway) from WebSphere to JBoss, I separated the configuration files from the application files (for accountability, and because all the configuration files are text files they can be put under git and kept in a remote/central repo; an upgrade doesn't touch the config files, and they can be merged and diffed). This works here because we can pass the config dir (jboss.<>.base.dir) as an external volume: all the changes are persistent, we have direct access to the log files and the deployment directory, everything can be versioned with git and synced with a master/central repo, and creating new nodes is dead easy (let's see it).

Instead of modifying our wildfly Dockerfile we extend it; this way we customize the image while maintaining backward compatibility.

FROM ackward/wildfly
RUN mkdir -p /opt/jboss/LOCAL_JB/node
VOLUME ["/opt/jboss/LOCAL_JB/node"]
EXPOSE 8080 9990 9999
CMD ["/opt/jboss/wildfly/bin/domain.sh", "-b", "0.0.0.0", "-bmanagement",  "0.0.0.0",
"-Djboss.domain.base.dir=/opt/jboss/LOCAL_JB/node"]

To build it:

$ sudo docker build --tag ackward/dc-wildfly .
$ sudo docker images
REPOSITORY           TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
ackward/dc-wildfly   latest              1a9de7c0734f        3 days ago          680.8 MB
ackward/wildfly      latest              1c4572ac308d        5 days ago          680.8 MB
<none>               <none>              102ff2048023        5 days ago          467.7 MB
centos               centos7             ae0c2d0bdc10        2 weeks ago         224 MB

There are some issues with SELinux: writes from inside the container to the mounted volume fail until we change the context of the host dirs we're exposing to the containers:

touch: cannot touch '/opt/jboss/LOCAL_JB/node/tt': Permission denied
$ sudo chcon -Rt svirt_sandbox_file_t /opt/jboss/

We aren't going to work with data containers for the moment, just the plain old local fs (I've also created the jboss user on the host with the same uid/gid as in the images; I don't know if it's needed, just as a precaution):

$ ls -l /opt/jboss/
total 0
drwxr-xr-x. 7 jboss jboss 71 nov 21 08:17 dc1
drwxr-xr-x. 7 jboss jboss 71 nov 24 11:06 node-tplt
drwxr-xr-x. 7 jboss jboss 71 nov 21 08:40 node1
drwxr-xr-x. 7 jboss jboss 71 nov 21 11:51 node2
$ ls -l /opt/jboss/dc1/
total 4
drwxr-xr-x. 4 jboss jboss 4096 nov 21 10:59 configuration
drwxr-xr-x. 3 jboss jboss   20 nov 21 08:17 data
drwxr-xr-x. 2 jboss jboss   61 nov 21 09:23 log
drwxr-xr-x. 4 jboss jboss   40 nov 21 08:17 servers
drwxr-xr-x. 3 jboss jboss   17 nov 21 08:17 tmp
$ ls -l /opt/jboss/node1
total 4
drwxr-xr-x. 3 jboss jboss 4096 nov 21 11:16 configuration
drwxr-xr-x. 3 jboss jboss   20 nov 21 08:22 data
drwxr-xr-x. 2 jboss jboss   61 nov 21 10:50 log
drwxr-xr-x. 4 jboss jboss   40 nov 21 11:12 servers
drwxr-xr-x. 3 jboss jboss   17 nov 21 08:22 tmp

How to configure WildFly as a domain controller and its nodes is out of the scope of this entry; it's well documented by RH. Running the dc is as easy as:

$ sudo docker run -d -P -p 9990:9990 -v /opt/jboss/dc1:/opt/jboss/LOCAL_JB/node -h dc1 --name=wf-dc1 ackward/dc-wildfly

If we want to create a new node, we have an empty template (no servers, no groups, just the slave service account needed to join the domain) in “node-tplt”, and a cp does the rest:

$ sudo cp -r /opt/jboss/node-tplt /opt/jboss/node2
$ sudo chown -R jboss:jboss /opt/jboss/node2
$ sudo docker run -P -d --link=wf-dc1:ldc1 -v /opt/jboss/node2:/opt/jboss/LOCAL_JB/node --name=node2 \
  -h node2 ackward/dc-wildfly /opt/jboss/wildfly/bin/domain.sh -b 0.0.0.0 -Djboss.domain.master.address=ldc1 \
  -Djboss.domain.base.dir=/opt/jboss/LOCAL_JB/node
$ sudo docker ps
CONTAINER ID        IMAGE                       COMMAND                CREATED             STATUS
PORTS                                                                       NAMES
665e986ad585        ackward/dc-wildfly:latest   "/opt/jboss/wildfly/   2 days ago          Up 2 days
0.0.0.0:49187->8080/tcp, 0.0.0.0:49188->9990/tcp, 0.0.0.0:49189->9999/tcp   node2
1229904dd9d2        ackward/dc-wildfly:latest   "/opt/jboss/wildfly/   2 days ago          Up 2 days
0.0.0.0:49184->8080/tcp, 0.0.0.0:49185->9990/tcp, 0.0.0.0:49186->9999/tcp   node1
d0cdba99520b        ackward/dc-wildfly:latest   "/opt/jboss/wildfly/   3 days ago          Up 3 days
0.0.0.0:9990->9990/tcp, 0.0.0.0:49170->9999/tcp, 0.0.0.0:49171->8080/tcp    node1/ldc1,node2/ldc1,wf-dc1

Where are we?

There's a long road ahead, but the foundations look solid, and this post is already long enough that most people will just TL;DR it.

We have:

  • A way to create nodes and to access/modify their config and their logs easily.
  • Images and containers that can be easily upgraded or customized.
  • Images and containers with zero configuration. (GREAT!)
  • Git repos for accountability of our configuration files and for creating new nodes.
  • With a private Docker hub (registry) we could share the images across all the Docker nodes (a minimal sketch follows below, after the Ansible example).
  • A way to execute containers on remote servers using Ansible; we could create a role with some Docker tasks. 
    A simple Ansible task for executing a container from an image locally:
- hosts: localhost
  tasks:
    - name: run wildfly dc
      docker: >
        image=ackward/wildfly name=wf-dc1
        command="/opt/jboss/wildfly/bin/domain.sh -b 0.0.0.0 -bmanagement 0.0.0.0"
        hostname=dc1 publish_all_ports=yes
        volumes=/opt/jboss/dc1:/opt/jboss/LOCAL_JB/node ports=9990
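
About the private hub bullet above, a minimal sketch of how the sharing could work with the stock registry image (the registry hostname is made up, and a real setup needs TLS, or the daemons' --insecure-registry flag, plus authentication in front of it):

$ sudo docker run -d -p 5000:5000 --name registry registry
$ sudo docker tag ackward/dc-wildfly registry.internal:5000/ackward/dc-wildfly
$ sudo docker push registry.internal:5000/ackward/dc-wildfly
# and on any other docker host:
$ sudo docker pull registry.internal:5000/ackward/dc-wildfly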

We don't have:

  • An infrastructure for provisioning Docker hosts (physical/virtual nodes), or at least a full Ansible configuration for managing the underlying servers and their services.
    • Including dependencies and orchestration of those micro-services.
  • A map of the external dependencies and how they must be managed in a cloud paradigm (with floating services).
    • External resources like access to a mainframe, SAS, fw rules, access control permissions, etc.
  • The whole deployment (continuous delivery) platform: how we are going to connect, build and deploy apps to this dockerized JBoss domain.
  • The whole end2end experience!!!:
    • How the services are accessed. This one is big: right now each container exposes a dynamically mapped port and the only well-known exposed port is 9990 (the dc admin console), but these are public services with known URLs and ports.
    • How developers interact with these environments: how they publish, debug, etc. the applications that run on them.
  • How the platform is monitored (nodes, services, apps, …).
  • Backups.
  • Contingency and HA plans.
  • A business case! What happens in 1, 3 or 5 years, the associated costs, etc. Do we really need to implement it? Are there any advantages or benefits in doing so rather than not?