Docker: Attack on Wildfly

Somehow, in my previous blog entry, Docker: production usefulness, I gave a negative impression (I don’t know why; it isn’t as if I were answering the main questions with NOs in red, bold and supersized). But Docker really is a disruptive technology that works, so let’s use it for a near-future platform change: the migration of our app servers to Red Hat JBoss 7 (now WildFly), from a classic infrastructure to a cloud/dockerized one.

Docker Images

The goal is an image that can be instantiated as a domain controller (dc) or as a node, each execution seamlessly adding its service/container to the JBoss domain. This domain and its services must meet production quality.

The problem with the existing Docker images.

JBoss (Red Hat) has an official Docker image (jboss/wildfly) and Dockerfile, with documentation on how to extend it.

The first problem comes from its origin: “jboss/wildfly” depends on “jboss/base-jdk:7”, which depends on “jboss/base:latest”, which depends on “fedora:20“. Fedora isn’t production software (at least not here in Spain), so those images are tainted; we must run our services on supported systems. For our research we’ll use CentOS, so we can swap it for RHEL later.

In any case, this means we have to recreate a new Dockerfile that doesn’t share images with the official repo but imports the inner workings of all those Dockerfiles. There’s another option: fork all those repos changing only their FROM; that way we would keep the advantage of pulling and merging updates. For simplicity, and to avoid creating/uploading anything to GitHub, I chose the former.

FROM centos:centos7

# ENV http_proxy <PROXY_HOST>:<PROXY_PORT>
# ENV https_proxy <PROXY_HOST>:<PROXY_PORT>

# imported from jboss/base
# Execute system update
RUN yum -y update

# Install packages necessary to run EAP (from jboss/base)
RUN yum -y install xmlstarlet saxon augeas unzip tar bzip2 xz

# Imported from jboss/jdk7
# Install necessary packages (from JDK7 Base)
RUN yum -y install java-1.7.0-openjdk-devel
RUN yum clean all

# Imported from jboss/base
# Create a user and group used to launch processes
# The user ID 1000 is the default for the first "regular" user on Fedora/RHEL,
# so there is a high chance that this ID will be equal to the current user
# making it easier to use volumes (no permission issues)
RUN groupadd -r jboss -g 1000 && useradd -u 1000 -r -g jboss -m -d /opt/jboss -s /sbin/nologin -c "JBoss user" jboss

# Set the working directory to jboss' user home directory
WORKDIR /opt/jboss

# Specify the user which should be used to execute all commands below
USER jboss

# Imported from jboss/jdk7
# Set the JAVA_HOME variable to make it clear where Java is located
ENV JAVA_HOME /usr/lib/jvm/java

# Imported from jboss/wildfly
# Set the WILDFLY_VERSION env variable
ENV WILDFLY_VERSION 8.1.0.Final

# Add the WildFly distribution to /opt, and make wildfly the owner of the extracted tar content
# Make sure the distribution is available from a well-known place
RUN cd $HOME && curl http://download.jboss.org/wildfly/$WILDFLY_VERSION/wildfly-$WILDFLY_VERSION.tar.gz \
    | tar zx && mv $HOME/wildfly-$WILDFLY_VERSION $HOME/wildfly

# Set the JBOSS_HOME env variable
ENV JBOSS_HOME /opt/jboss/wildfly

# Expose the ports we're interested in
EXPOSE 8080 9990

# Set the default command to run on boot
# This will boot WildFly in standalone mode and bind to all interfaces
CMD ["/opt/jboss/wildfly/bin/standalone.sh", "-b", "0.0.0.0"]

 

This Dockerfile will build an image similar to the official one, but using CentOS instead of Fedora.

$ sudo docker build --tag ackward/wildfly .
$ sudo docker images
REPOSITORY           TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
ackward/wildfly       latest              1c4572ac308d        4 days ago          680.8 MB
<none>               <none>              102ff2048023        4 days ago          467.7 MB
centos               centos7             ae0c2d0bdc10        2 weeks ago         224 MB

Executing it is as easy as (following the documentation):

$ sudo docker run -it ackward/wildfly /opt/jboss/wildfly/bin/domain.sh -b 0.0.0.0 -bmanagement 0.0.0.0

or for a standalone instance:

$ sudo docker run -it ackward/wildfly

They execute and work... kinda, because nothing is configured yet, but that’s expected. So we continue to the next step: configuring the JBoss domain service.

Configuring JBoss Domain Service.

For the neophytes: JBoss App Server has two operation modes, standalone, where each server manages itself, and domain mode, where a domain controller manages several nodes, similar to how WebSphere Deployment Manager or WebLogic work.

The first thing to consider is how we are going to manage the configuration files: how they will be accessed, tracked and versioned, how logs can be accessed, how apps will be deployed, etc.

It’s important because the Docker paradigm tries to achieve independence from the data: each execution of an image starts fresh, and containers should be as easy as possible to create and destroy on the fly (tying them to data isn’t the best way to achieve that).

The solution is already implemented in our environments (prod included). When I designed the migration (or runaway) from WebSphere to JBoss, I separated the configuration files from the application files (for accountability, and because all the configuration files are text files: they can be kept in git in a remote/central repo, an upgrade doesn’t touch them, and they can be diffed and merged). This works here because we can pass the config dir (jboss.<>.base.dir) as an external volume: all changes are persistent, we have direct access to the log files and the deployment directory, everything can be versioned with git and synced with a master/central repo, and creating new nodes is dead easy (let’s see it).
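As a rough sketch of that git workflow (the remote repo URL is hypothetical, and the path matches the dc1 directory used below):

$ cd /opt/jboss/dc1
$ git init && git add configuration/
$ git commit -m "dc1 baseline configuration"
$ git remote add origin git@<GIT_SERVER>:jboss-configs/dc1.git
$ git push -u origin master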

Instead of modifying our wildfly Dockerfile we extend it; this way we customize the image while maintaining backward compatibility.

FROM ackward/wildfly
RUN mkdir -p /opt/jboss/LOCAL_JB/node
VOLUME ["/opt/jboss/LOCAL_JB/node"]
EXPOSE 8080 9990 9999
CMD ["/opt/jboss/wildfly/bin/domain.sh", "-b", "0.0.0.0", "-bmanagement", "0.0.0.0", \
     "-Djboss.domain.base.dir=/opt/jboss/LOCAL_JB/node"]

To build it:

$ sudo docker build --tag ackward/dc-wildfly .
$ sudo docker images
REPOSITORY           TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
ackward/dc-wildfly   latest              1a9de7c0734f        3 days ago          680.8 MB
ackward/wildfly      latest              1c4572ac308d        5 days ago          680.8 MB
<none>               <none>              102ff2048023        5 days ago          467.7 MB
centos               centos7             ae0c2d0bdc10        2 weeks ago         224 MB

There are some issues with SELinux: writes to the mounted volume fail with a permission error until we change the context of the host directory we’re exposing to the containers:

touch: cannot touch '/opt/jboss/LOCAL_JB/node/tt': Permission denied
$ sudo chcon -Rt svirt_sandbox_file_t /opt/jboss/

We aren’t going to work with data containers at the moment, just the plain old local filesystem (I’ve also created the jboss user on the host with the same uid/gid as in the images; I don’t know if it’s strictly needed, just a precaution):

$ ls -l /opt/jboss/
total 0
drwxr-xr-x. 7 jboss jboss 71 nov 21 08:17 dc1
drwxr-xr-x. 7 jboss jboss 71 nov 24 11:06 node-tplt
drwxr-xr-x. 7 jboss jboss 71 nov 21 08:40 node1
drwxr-xr-x. 7 jboss jboss 71 nov 21 11:51 node2
$ ls -l /opt/jboss/dc1/
total 4
drwxr-xr-x. 4 jboss jboss 4096 nov 21 10:59 configuration
drwxr-xr-x. 3 jboss jboss   20 nov 21 08:17 data
drwxr-xr-x. 2 jboss jboss   61 nov 21 09:23 log
drwxr-xr-x. 4 jboss jboss   40 nov 21 08:17 servers
drwxr-xr-x. 3 jboss jboss   17 nov 21 08:17 tmp
$ ls -l /opt/jboss/node1
total 4
drwxr-xr-x. 3 jboss jboss 4096 nov 21 11:16 configuration
drwxr-xr-x. 3 jboss jboss   20 nov 21 08:22 data
drwxr-xr-x. 2 jboss jboss   61 nov 21 10:50 log
drwxr-xr-x. 4 jboss jboss   40 nov 21 11:12 servers
drwxr-xr-x. 3 jboss jboss   17 nov 21 08:22 tmp
How to configure WildFly as a domain controller and its nodes is out of the scope of this entry; it’s well documented by RH. Running the dc is as easy as:

$ sudo docker run -d -P -p 9990:9990 -v /opt/jboss/dc1:/opt/jboss/LOCAL_JB/node -h dc1 --name=wf-dc1 ackward/dc-wildfly
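
A quick sanity check (a sketch; the container name comes from the run command above):

$ sudo docker logs wf-dc1 | grep -i started     # WildFly logs a "started" message once the host controller is up
$ sudo docker port wf-dc1                       # shows the host ports mapped by -P/-p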

If we want to create a new node, we have an empty template in “node-tplt” (no servers, no groups, just the slave service account needed to join the domain), and it’s a cp away:

$ sudo cp -r /opt/jboss/node-tplt /opt/jboss/node2
$ sudo chown -R jboss:jboss /opt/jboss/node2
$ sudo docker run -P -d --link=wf-dc1:ldc1 -v /opt/jboss/node2:/opt/jboss/LOCAL_JB/node --name=node2 \
    -h node2 ackward/dc-wildfly /opt/jboss/wildfly/bin/domain.sh -b 0.0.0.0 -Djboss.domain.master.address=ldc1 \
    -Djboss.domain.base.dir=/opt/jboss/LOCAL_JB/node
$ sudo docker ps
CONTAINER ID        IMAGE                       COMMAND                CREATED             STATUS
PORTS                                                                       NAMES
665e986ad585        ackward/dc-wildfly:latest   "/opt/jboss/wildfly/   2 days ago          Up 2 days
0.0.0.0:49187->8080/tcp, 0.0.0.0:49188->9990/tcp, 0.0.0.0:49189->9999/tcp   node2
1229904dd9d2        ackward/dc-wildfly:latest   "/opt/jboss/wildfly/   2 days ago          Up 2 days
0.0.0.0:49184->8080/tcp, 0.0.0.0:49185->9990/tcp, 0.0.0.0:49186->9999/tcp   node1
d0cdba99520b        ackward/dc-wildfly:latest   "/opt/jboss/wildfly/   3 days ago          Up 3 days
0.0.0.0:9990->9990/tcp, 0.0.0.0:49170->9999/tcp, 0.0.0.0:49171->8080/tcp    node1/ldc1,node2/ldc1,wf-dc1
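
To confirm that the new node actually joined the domain, we can check the dc log for the slave registration message (the exact wording depends on the WildFly version), or simply look at the topology in the admin console on port 9990:

$ sudo docker logs wf-dc1 | grep -i slave     # the dc logs a registration message for each slave host controller that joins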

Where are we?

There is still a long road ahead, but the foundations look solid, and this post is already long enough that most people will just TL;DR it.

We have:

  • A way to create nodes and to access/modify their config and their logs easily.
  • The images and containers can be easily upgraded or customized.
  • The images and containers have zero configuration. (GREAT!)
  • We can use git repos for accountability of our configuration files and creating new nodes.
  • With a private docker hub we could share the images on all the docker nodes.
  • A way to execute containers on remote servers using Ansible; we could create a role with some docker tasks.
    A simple Ansible task for running a container from an image locally:
- hosts: localhost
  tasks:
    - name: run wildfly dc
      docker: >
        image=ackward/wildfly name=wf-dc1
        command="/opt/jboss/wildfly/bin/domain.sh -b 0.0.0.0 -bmanagement 0.0.0.0"
        hostname=dc1 publish_all_ports=yes
        volumes=/opt/jboss/dc1:/opt/jboss/LOCAL_JB/node ports=9990
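
To run it locally (a sketch: the playbook filename is made up, and it assumes Ansible’s docker module and docker-py are installed on the host):

$ ansible-playbook -i "localhost," -c local run-wildfly-dc.yml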

We don’t have:

  • An infrastructure for provisioning Docker hosts (physical/virtual nodes), or at least a full Ansible configuration for managing the underlying servers and their services.
    • Including dependencies and orchestration of those micro-services.
  • A map of external dependencies and how they must be managed in a cloud paradigm (with floating services).
    • External resources like accessing a mainframe, SAS, fw rules, access control permissions, etc.
  • The whole deployment (continuous delivery) platform: how are we going to connect, build and deploy apps in this dockerized JBoss domain?
  • All the end2end experience!!!:
    • How the services are accessed. And this one is big: right now they expose dynamically mapped ports, and the only fixed, well-known port is 9990 (the dc admin console), but these are public services with known URLs and ports.
    • How developers interact with these environments, how they publish, debug, etc… the applications that run on them.
  • How the platform is monitored (nodes, services, apps,…)
  • Backups
  • Contingency and HA plans.
  • A business case! What would happen in 1, 3, 5 years, the associated costs, etc. Do we really need to implement it? Are there any advantages or benefits of doing so rather than otherwise?

Docker: production usefulness

Goal: answer the question of whether Docker is viable and ready for production service.

Short Answer: NO

Not by a long shot. Here it’s explained perfectly: Docker Misconceptions.

Just this excerpt carries a lot of weight:

To run Docker in a safe robust way for a typical multi-host production environment requires very careful management of many variables:

  • secured private image repository (index)
  • orchestrating container deploys with zero downtime
  • orchestrating container deploy roll-backs
  • networking between containers on multiple hosts
  • managing container logs
  • managing container data (db, etc)
  • creating images that properly handle init, logs, etc
  • much much more…

Long Answer: Not alone.

The short answer still applies; it’s just that I wanted to confirm these points myself to get a broad picture of where Docker is and what its capabilities are: what it is and what it lacks.

How:

After the migration of our servers to the Poland RDC, we are decommissioning most of the servers in Local Country. Some of the remaining services can be consolidated onto fewer servers, so I’ve dockerized as microservices the few ones I’m still maintaining here: a Zabbix proxy server, its PostgreSQL RDBMS, and an R development environment for daily statistics of our mainframe and backend app.

Why:

Well, LI/LM will eventually remove our root access to our servers, red line… whatever… and I also don’t like our Linux and Solaris patching process. A Docker image provisioning service looks like a solution: wouldn’t these just be applications and country business services?

Also, someone mentioning Docker as a replacement for configuration management software (the blog entry ‘Why Docker? Why not Chef?’ in particular) piqued my interest; a solution able to replace Puppet or Chef needs to be considered.

Never forget:

Our goal is to deploy applications in production, flawlessly, with accountability: the same application that was developed, tested and approved, and later to maintain them in an HA environment. Continuous delivery aims at perfecting the change management process, and every new tool in the chain should move toward that goal.

Where is Docker:

  1. It’s a clever idea to separate the application template (a Docker image), the application instance, configured and running (a container), and its data (a data container or volume fs).
  2. The Dockerfile is an easy and quick way to bootstrap an image and a container (the final product).
  3. The Docker registry and GitHub are big repositories of images and Dockerfiles, and private registries are easy to set up. A nice idea for accessing, sharing and improving images.
  4. It’s self-contained and minimalist, performs well, and is easy to launch even on Windows or Mac OS X with boot2docker.

Where isn’t Docker:

  1. It doesn’t scale beyond a developer box and is not designed for automated operation. It needs orchestration, configuration and discovery services to be able to roll containers, configure them dynamically and know how and where to call their linked services. All that information isn’t in Docker and needs to be provided before trying to implement anything minimally serious.
  2. There is no change management accountability: a Dockerfile builds an image; modifying a container and committing it upgrades an image. There are some tools and commands to review the commits and differences (docker history and docker diff mainly, see the examples after this list), but they are very limited. Two pulled images are identical, but there is no way to reproduce two identical images from scratch from the same Dockerfile. This for me is a NO-NO: the environments, their containers, the runtime must be reproducible from step 1 to N. The images could be compromised, the images can be out of sync with the containers, and a container looks like a black box after several iterations or if multiple people upgrade it. There is no documentation of the configuration inside them. It looks backwards, but it makes sense to use images that launch a puppet/chef/ansible process for their configuration (tackling the first point, in a way).
  3. The image/Docker registries are (almost) a joke. Those images and the classic install, next, next, next are usually the same thing, and looking for the image that fits your needs isn’t viable. Dockerfiles look easier to set up than cookbooks/playbooks, but they are also chaotic and lack a framework behind them (there is a reason for Spring’s existence, isn’t there?). Containers/images don’t match the “one size fits all” idea: internationalization is a must for us, and support for locales and timezones needs to be addressed. That means forking and/or a private hub. So the idea that I can pull an image from the official Docker Hub and I’m ready, and that it will stay updated and patched just by syncing, doesn’t ring true (except in a trivial/demo scenario). UPDATE: Attack on Wildfly shows how to extend an image.
  4. The persistence data model comes from an evil, twisted, tortured mind. The volume syntax, the mount mapping, the recursive idea of data containers for the portability holy grail: it all looks like a hack. Actually it’s a nice, well-thought-out concept if there already exists a provisioning service that will provide that storage, IOPS, redundancy, backups, etc., and a configuration service for inventorying and handling the storage needs… the problem is that Docker doesn’t come with (or expect) one. It could be enough (and even perfect) from a developer’s POV, but if our POV is focused on HA and performance, then Docker alone isn’t enough.
  5. Monitoring and logging those microservices is another issue. It’s a question of how those images are built, but the minimalist approach of some official repos, without sysvinit/systemd, syslog, supervisors or monitoring agent processes (aren’t they micro?), needs a rethink for a production environment. That functionality must be provided somehow, balancing the needs of portability, performance, and integration/management of all of them in a central hub, while maintaining the microservices concept and its performance.
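
The change-tracking commands mentioned in point 2, for reference (they only show layer metadata and filesystem deltas, not why something changed):

$ sudo docker history ackward/wildfly     # layer-by-layer history of how the image was built
$ sudo docker diff wf-dc1                 # files added (A), changed (C) or deleted (D) in the container vs its image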

Where to go:

At least to another blog post; there’s enough material for another entry. Docker is a cog in all the projects involving ?aaS (which already gives a sense of the Docker project’s dimension: it’s a key technology). The ecosystem is boiling with projects and potential solutions, more or less polished, although all of them need heavy integration/tailoring: OpenStack, Heat, Mesos, ZooKeeper, Marathon, Puppet/Chef/Ansible, Jenkins, Kubernetes, OpenShift, Fig, Orchard, and so on…

My next blog entry, Attack on Wildfly, shows some answers to these problems and a lot of Docker’s virtues, and there is a third one in the works that tackles many of these problems using Kubernetes.

 

Logging for the masses

(I really need to update the blog template 🙁 )

Problem: there are several sources of logs that you want to consult and search in a centralized way. Those logs should also be correlated for events and raise alerts.

At first glance there are two alternatives: Splunk, maybe the leader among logging systems, and ArcSight Logger, already installed in the Poland RDC.

The former is ridiculously expensive (at least for my miserable budget) and the latter is a bureaucratic hell.

Both are expensive, proprietary and closed solutions, so sometimes it pays to look for an inexpensive and free (as in speech) alternative.

The free solution involves using Logstash, Elasticsearch and Kibana for logging, storing and presentation.

Web Server Logging

We have about 80 log feeds from 15 web applications and 30 servers; the goal is to log everything and be able to search by app, server, date, IP, etc.

The good news is that all those logs follow the same pattern.

The architecture follows this scheme (the configuration files are sanitized):

[Diagram: logstash-infr — the logging infrastructure described below]

Logstash-forwarder: formerly known as Lumberjack. An application that tails logs, sends them over a secure channel to Logstash on a TCP port, and maintains an offset for each log.

Logstash (as shipper): receives all the log streams and stores them in a Redis data store.

Redis: works here as a message queue between shipper and indexer. A thousand times easier to set up than ActiveMQ.

Logstash (as indexer): pulls from the Redis queue and processes the data: parses, maps and stores it in an Elasticsearch DB.

ElasticSearch: the database where logs are stored, indexed and made searchable.

Kibana: a web frontend for ES; allows the creation and customization of dashboards, queries and filters.

 

Logstash works as both shipper and indexer, so why split those functions into two different processes?

  • Because we don’t want to lose data.
  • Because the indexer can do some serious, CPU intensive tasks per entry.
  • Because the shipper and indexer throughput are different and not synchronized.
  • Because the logs can be unstructured and the match model could have errors, reporting null pointers and finally running out of memory, killing the process or turning it into a zombie (as happened when I tried to add some JBoss log4j logs).

For those reasons there is a queue between shipper and indexer, so the infrastructure is resilient to downtime and the indexer isn’t saturated by the shipper’s throughput.
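
A handy side effect: the backlog is easy to watch. Since the shipper pushes into the Redis list named "logstash" (see the configs below), its length tells us how far behind the indexer is:

$ redis-cli -h <REDIS_HOST> llen logstash     # number of log events waiting to be indexed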

Logstash-forwarder configuration

A JSON config file declaring the shipper host, a certificate (shared with the shipper) and which paths are being forwarded.

One instance runs per server.

{
  "network": {
    "servers": [ "<SHIPPER>:5000" ],
    "ssl certificate": "/opt/logstash-forwarder/logstash.pub",
    "ssl key": "/opt/logstash-forwarder/logstash.key",
    "ssl ca": "/opt/logstash-forwarder/logstash.pub",
    "timeout": 15
  },
  "files": [
    {
      "paths": [
        "/opt/httpd/logs/App1/access.log", "/opt/httpd-sites/logs/App2/access_ssl.log"
      ],
      "fields": { "type": "apache", "app": "App1" }
    },
    {
      "paths": [
        "/opt/httpd/logs/App2/access.log", "/opt/httpd/logs/App2/access_ssl.log"
      ],
      "fields": { "type": "apache", "app": "App2" }
    }
  ]
}
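
To run the forwarder with that file (a sketch; the install path and config filename are assumptions):

$ /opt/logstash-forwarder/bin/logstash-forwarder -config /opt/logstash-forwarder/logstash-forwarder.conf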

Logstash as shipper

Another config file (Logstash’s own format this time); it accepts log streams and stores them in a Redis datastore.

input {
  lumberjack {
    port => 5000
    ssl_certificate => "/etc/ssl/logstash.pub"
    ssl_key => "/etc/ssl/logstash.key"
    codec => json
  }
}

output {
  stdout { codec => rubydebug }
  redis { host => "localhost" data_type => "list" key => "logstash" }
}
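
To start Logstash with the shipper configuration (again a sketch; the paths and filenames are assumptions, and the indexer below is started the same way with its own config file):

$ /opt/logstash/bin/logstash agent -f /etc/logstash/shipper.conf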

 

Redis

I think it’s out of the scope of this blog entry; it’s really dead easy, and a default config was enough. It would need scaling depending on the throughput.

Logstash as indexer

Here the input is the output of the shipper and the output is the ES database; in between there is the matching section where we filter the entries (we map them, dropping the health checks from the F5 balancers and tagging entries with 503 errors). Yes, the output can be multiple here too: not only do we store those matches, the 503s are also sent to a zabbix output, which in turn sends them to our Zabbix server.

input {
  redis {
    host => "<REDIS_HOST>"
    type => "redis"
    data_type => "list"
    key => "logstash"
  }
}

filter {
  grok {
    match => [ "message", "%{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response:int} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent} %{QS:jsessionid} %{QS:bigippool} %{NUMBER:reqtimes:int}/%{NUMBER:reqtimems:int}" ]
  }
}

filter {
  if [request] == "/f5.txt" {
    drop { }
  }
}

filter {
  if [response] == "503" {
    alter {
      add_tag => [ "zabbix-sender" ]
    }
  }
}

output {
  stdout { }
  elasticsearch {
    cluster => "ES_WEB_DMZ"
  }
  zabbix {
    # only process events with this tag
    tags => "zabbix-sender"
    # specify the hostname or ip of your zabbix server (defaults to localhost)
    host => "<ZABBIX_SERVER>"
    # specify the port to connect to (default 10051)
    port => "10051"
    # specify the path to zabbix_sender (defaults to "/usr/local/bin/zabbix_sender")
    zabbix_sender => "/usr/bin/zabbix_sender"
  }
}
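
Once the indexer is running, a quick way to verify that events are reaching Elasticsearch is a simple URI search against any indexed field (hypothetical query, just for checking):

$ curl 'http://<ES_HOST>:9200/_search?q=response:503&size=1&pretty'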

 

ElasticSearch

The configuration file for a basic service is easy. Depending on the needs, the throughput and how many searches per second, it gets complicated (shards, masters, nodes,…), but for very occasional use this single line is enough:

cluster.name: ES_WEB_DMZ
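
A quick health check against the cluster (standard ES API):

$ curl 'http://<ES_HOST>:9200/_cluster/health?pretty'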

Kibana

Another easy configuration: it only needs to know the ES address, “http://<ES_HOST>:9200”, and that’s all. Dashboards and queries are saved in the ES database. The frontend files and directories can be read-only.

This post was originally published on my company intranet and showed two dashboards/screenshots that I can’t reproduce here:

  1. A simple dashboard showing how the logs are distributed per application and server, how many entries there are, and their response times. Each facet can be inspected to drill deeper.
  2. A dashboard showing the application errors (error codes 5XX).