Kubernetes pre-101 (brief introduction)

Before a better blog entry detailing a real, production-ready example of dockerized micro-services (using free/open tools; most of the work and design would be reusable if we used external cloud providers like GCP or AWS ECS), I'll introduce another key technology in the ecosystem: the container management engine.

From a previous post: Docker: production usefulness

Running Docker in a safe, robust way for a typical multi-host production environment requires very careful management of many variables:

  • secured private image repository (index)
  • orchestrating container deploys with zero downtime
  • orchestrating container deploy roll-backs
  • networking between containers on multiple hosts
  • managing container logs
  • managing container data (db, etc)
  • creating images that properly handle init, logs, etc
  • much much more…

Time to eat my words (or my quotes); let me present Kubernetes:

A brief summary of what it is: https://github.com/GoogleCloudPlatform/kubernetes/blob/master/DESIGN.md

Kubernetes is a system for managing containerized applications across multiple hosts, providing basic mechanisms for deployment, maintenance, and scaling of applications.

Kubernetes uses Docker to package, instantiate, and run containerized applications.

Kubernetes enables users to ask a cluster to run a set of containers. The system automatically chooses hosts to run those containers on.

The scheduler needs to take into account individual and collective resource requirements, quality of service requirements, hardware/software/policy constraints, affinity and anti-affinity specifications, data locality, inter-workload interference, deadlines, and so on.

The atomic element for Kubernetes is the pod.

Pods simplify application deployment and management by providing a higher-level abstraction than the raw, low-level container interface. Pods serve as units of deployment and horizontal scaling/replication. Co-location, fate sharing, coordinated replication, resource sharing, and dependency management are handled automatically.

A pod corresponds to a colocated group of Docker containers with shared volumes.

Pods facilitate data sharing and communication among their constituents.

Their use:

Pods can be used to host vertically integrated application stacks, but their primary motivation is to support co-located, co-managed helper programs, such as:

  • content management systems, file and data loaders, local cache managers, etc.
  • log and checkpoint backup, compression, rotation, snapshotting, etc.
  • data change watchers, log tailers, logging and monitoring adapters, event publishers, etc.
  • proxies, bridges, and adapters
  • controllers, managers, configurators, and updaters
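As a sketch of the idea, a pod co-locating an app container with a log-tailing helper could be declared like this (hypothetical names and images; this uses the current-style manifest syntax, which differs from the v1beta1 API of the time):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-with-log-tailer     # hypothetical pod name
spec:
  volumes:
    - name: logs                # shared volume, visible to both containers
      emptyDir: {}
  containers:
    - name: web
      image: nginx
      volumeMounts:
        - name: logs
          mountPath: /var/log/nginx
    - name: log-tailer          # co-located helper: tails the app's logs
      image: busybox
      command: ["sh", "-c", "tail -F /logs/access.log"]
      volumeMounts:
        - name: logs
          mountPath: /logs
```

Both containers are scheduled onto the same host and share the `logs` volume, which is what makes the helper patterns in the list above (log rotation, tailers, watchers) practical.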

Barcelona AWS Interact 2014

Last week (12/2/2014) Amazon presented in Barcelona some sessions from their AWS re:Invent 2014.

The agenda: http://awsinteract-idgtv.webcasting-studio.net/#/agenda

Living here, I went to the event to take some notes and ask some questions.

9:00 Biking around looking for the site

It’s difficult to check Waze or Google Maps on a bike. I spent 30 minutes looking for the street without finding it because it was closed: there was a f*cking circus in front of the event!

10:00 Entrance

Queues at the entrance; people not too young, not too old, few girls, few suits; the venue an old warehouse reinvented as a conference building, minimalistic style, all a bit hipster. (There are photos under the Twitter hashtag #AWSInteract.)

10:30 Queues at the expert spot

Luckily I’d gravitated there early and was able to ask some questions about extending our datacenter, VPC, and extracting data from our AS/400 for processing with Hadoop, MapR or EMR.

14:15 Welcome and Keynote (by Carlos Conde  https://twitter.com/caarlco )

Private clouds are a thing of the past; they won’t exist by 2020. (A Forrester or Gartner quote; like all forecasts, sometimes they guess right and sometimes they don’t, the trick is selling both depending on your target. According to AWS, today, private clouds meet the dodo.)

Their reasoning is that the innovation rate of the big three can’t be matched by companies whose core business isn’t IT. The innovation-rate idea has some weight: private clouds don’t have the dynamism and services catalog of AWS or GCP and will always be somewhat behind. Whether that is critical or not is another matter (we run production code which is 30 years old!); on the other hand, innovation happens everywhere.

The core business idea is funny coming from “a book selling company”. I remembered (maybe a bit later, when talking about agility) an article from the CEO of BBVA (one of the biggest banks in Europe): Banks need to take on Amazon and Google or die – FT.com

Aurora (>2015), a new DB service, MySQL compatible, better performance than MySQL (5x), HA and fault tolerance. http://aws.amazon.com/rds/aurora/

KMS  (key management service) http://aws.amazon.com/kms/

A new centralized key mgmt service.

A change of paradigm in production roll-outs 

(explained and commented below in “Container management”)

Lambda  (in beta, early-mid 2015, available for preview) http://aws.amazon.com/lambda/

It’s an engine for executing stateless functions that react to events. Those functions can be provided by AWS (its API and library/software functions for ETL) or programmed by the customer (in JavaScript using Node.js; other languages like Python, Java or Scala are in the pipeline).

The example was a Lambda function executed whenever an image was dropped in an S3 bucket: it fetched the image, made a thumbnail, and stored the reduced image in a DB table, without needing extra infrastructure code. They compared it to bringing DB-trigger functionality to all the AWS services.

The pricing model charges per execution ($0.0000002 per request) plus memory allocated and time spent (billed in 100 ms quanta). Better check http://aws.amazon.com/lambda/pricing/
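Only the per-request figure and the 100 ms quantum come from the talk; the per-GB-second rate below is an assumption for illustration (check the pricing page above). A back-of-the-envelope estimator:

```python
import math

def lambda_cost(requests, avg_ms, memory_mb,
                per_request=0.0000002,
                per_gb_second=0.00001667):  # assumed rate, not from the talk
    """Rough Lambda cost: per-request charge plus memory * duration,
    with duration rounded up to the next 100 ms quantum."""
    billed_seconds = math.ceil(avg_ms / 100) * 0.1
    gb = memory_mb / 1024
    request_charge = requests * per_request
    compute_charge = requests * billed_seconds * gb * per_gb_second
    return request_charge + compute_charge

# e.g. one million 120 ms invocations at 128 MB
print(lambda_cost(1_000_000, 120, 128))
```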

15:00 Customer Experiences

2 Spanish customer experiences: the migration/setup in AWS of segundamano.es (a second-hand sales portal) by Schibsted, and an online game (Golden Manager, a free-to-play soccer manager game).

Nothing revolutionary, but well planned: open source tools, low costs, run by few people.

16:00 Coffee

Nespresso capsules

17:00 Infrastructure as Code (by Carlos Conde)

AWS Elastic Beanstalk http://aws.amazon.com/elasticbeanstalk/

Their original service (maybe their classic cloud model). A simple way to script the deployment and scaling of simple apps (web applications, typical 3-tier apps).

AWS CloudFormation  http://aws.amazon.com/cloudformation/

This is IaaS or IaaS++ , whatever you want to name it.

The whole platform is modeled in JSON templates; changes to those templates are changes to the infrastructure.

The creation of all the services (provisioning, apps, DB, networking, balancers… everything in AWS could be called a service) can be declared and scripted.

The atomic unit is called a stack (and reminds me a lot of Kubernetes’s pods).
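As an illustrative sketch (hypothetical resource name and AMI id), a minimal CloudFormation template declaring a stack with a single EC2 instance:

```json
{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "Minimal example stack: one EC2 instance",
  "Resources": {
    "WebServer": {
      "Type": "AWS::EC2::Instance",
      "Properties": {
        "InstanceType": "t2.micro",
        "ImageId": "ami-12345678"
      }
    }
  }
}
```

Feeding this file to CloudFormation creates the stack; editing the template and updating the stack changes the infrastructure, which is the whole point.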

The goal is: NO SSH. (Nice goal, by the way; I fully agree. It’s also related to the micro-services paradigm and the Docker container model.)

DR (Disaster recovery environment)

Easy to set up and clone.


No more Visios or outdated PowerPoints: the documentation is in the code. (He didn’t give examples, just a slide and a brief comment. The image that comes to my mind is GitHub repos and their .md text files. The templates are going to be versioned in GitHub or some clone, so it’s logical to document them there too, tying code and documentation together.)

Game Days

motto: if you didn’t measure it, it didn’t happen (rule #32 for Amazon?)

It’s a service offered by AWS for training customer staff. They clone the customer’s environments and during a day/weekend they play BOFH (Bastard Operator From Hell) so the staff learns to detect and respond to incidents (some of their favorites: changing the hosts files but also changing IPs, halting DBs, moving servers to other networks, corrupting running services…).

The load test methods are interesting (well, the first one; the second one is meh).

1) Replay network load (they record the network traffic of a typical day and replay it, so the traffic is real. Cool!)

2) Synthetic load tests (jmeter, etc).

17:15 Container Management

motto: Agility is Key

“Speed of iteration > iteration quality”, or put another way, “fast releases in small units > one big release”

(from the keynote)

Micro-services model: most of the items on the Amazon sites are micro-services. Amazon rolls out 95 production changes per minute (AWS included, I think; they didn’t specify). Each micro-service is maintained by a team of 5 people, and the motto here is “you build it, you run it”. Those teams not only code but also maintain and run their services.

The most successful start-ups using AWS have a production roll-out every 20 minutes.

Amazon EC2 Container Service (+2015)  https://aws.amazon.com/ecs/  and http://aws.amazon.com/blogs/aws/cloud-container-management/

Container (docker) management in EC2 instances.

Task = set of containers.

JSON files defining container relationships.
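A task definition sketch along those lines (hypothetical names and values, in the style of the preview-era JSON format): two linked containers grouped as one task.

```json
{
  "family": "web-task",
  "containerDefinitions": [
    {
      "name": "web",
      "image": "nginx",
      "cpu": 256,
      "memory": 512,
      "portMappings": [
        { "containerPort": 80, "hostPort": 80 }
      ],
      "links": ["cache"]
    },
    {
      "name": "cache",
      "image": "redis",
      "cpu": 128,
      "memory": 256
    }
  ]
}
```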

Still in beta (open for preview, not for production)

The looks, the functionality, the syntax: it’s a Kubernetes clone integrated with AWS.

17:30 Security Operations AWS (Enrique Lopez)

Security is shared with the client: securing the content and services inside instances is the responsibility of the customer.

Data doesn’t leave the region that holds it. In Europe there are 2 regions: Ireland and Frankfurt (Germany).

3 types of data encryption:

  • managed by aws (keys and software)
  • shared (software by aws, keys by customers)
  • managed by client (keys and non aws software)

CloudHSM http://aws.amazon.com/cloudhsm/

A dedicated HW appliance for each client; keys are exclusive to the client and AWS doesn’t know them. KMS, on the other hand, would be the shared option.

VPC (Virtual Private Cloud) http://aws.amazon.com/vpc/

VPC lets you configure a set of AWS instances as a private cloud, an extension of your datacenter over VPN/IPsec.

AWS Config http://aws.amazon.com/config/

Still in preview, it’s an inventory management service for tracking and monitoring changes in the configuration of services, access permissions, code, etc.

Trusted Advisor https://aws.amazon.com/premiumsupport/trustedadvisor/

It’s an automated cost optimizer with security features: it checks all your services and lists your configurations and their risks (open permissions, open firewall rules, read-write permissions, …).

It’s a “Best Practices” automation tool.

Penetration testings https://aws.amazon.com/security/

No problem here: clients can arrange them whenever they want, but AWS must be notified beforehand using a form on the web.

18:00 Mobile support / Amazon App Store

Skipped (it was late and I wasn’t very interested; I could always look it up later if I needed that intel).