
Stream Data Platform or Global Logging with Apache Kafka

A logging platform is something I’ve been looking for for some time. In Logging for the masses I explained how I built an ELK platform for accessing and searching our web logs. Elasticsearch and Kibana are great, but Logstash is the weak link: it isn’t well designed for parallel processing (cloud/multiple nodes). I had to split the Logstash service in two and add a Redis server just to get some HA and not lose logs.

Logging is also a gap, and a requirement, for any dockerized app. Most of the issues I talked about in Docker: production usefulness are still valid. Some have been tackled by Kubernetes, OpenShift v3,… (those related to managing Docker images and to fleet/project management), but on monitoring and logging the jury is still out.

Apache Kafka is a solution to both. Actually, it’s a solution for a lot of things:

  • Messaging: Kafka works well as a replacement for a more traditional message broker. In this domain Kafka is comparable to traditional messaging systems such as ActiveMQ or RabbitMQ.

  • Website Activity Tracking: The original use case for Kafka was to be able to rebuild a user activity tracking pipeline as a set of real-time publish-subscribe feeds. This means site activity (page views, searches, or other actions users may take) is published to central topics with one topic per activity type. These feeds are available for subscription for a range of use cases including real-time processing, real-time monitoring, and loading into Hadoop or offline data warehousing systems for offline processing and reporting.

  • Metrics: Kafka is often used for operational monitoring data. This involves aggregating statistics from distributed applications to produce centralized feeds of operational data.

  • Log Aggregation: Many people use Kafka as a replacement for a log aggregation solution. Kafka abstracts away the details of files and gives a cleaner abstraction of log or event data as a stream of messages. This allows for lower-latency processing and easier support for multiple data sources and distributed data consumption.

  • Stream Processing: Many users end up doing stage-wise processing of data where data is consumed from topics of raw data and then aggregated, enriched, or otherwise transformed into new Kafka topics for further consumption. Storm and Samza are popular frameworks for implementing these kinds of transformations.

  • Event Sourcing: Event sourcing is a style of application design where state changes are logged as a time-ordered sequence of records. Kafka’s support for very large stored log data makes it an excellent backend for an application built in this style.

  • Commit Log: Kafka can serve as a kind of external commit-log for a distributed system. The log helps replicate data between nodes and acts as a re-syncing mechanism for failed nodes to restore their data.
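The log aggregation and commit-log uses above share one idea: an append-only sequence of messages that each reader walks at its own pace. Here is a minimal in-memory sketch of that idea. It is not Kafka's API; `TopicLog`, `Consumer` and the sample log lines are hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TopicLog:
    """Toy stand-in for one topic partition: an append-only list."""
    records: list = field(default_factory=list)

    def append(self, message):
        """Store a message and return the offset it was written at."""
        self.records.append(message)
        return len(self.records) - 1

    def read(self, offset, max_records=10):
        """Return up to max_records messages starting at offset."""
        return self.records[offset:offset + max_records]


@dataclass
class Consumer:
    """Each consumer keeps its own offset, so readers are independent."""
    log: TopicLog
    offset: int = 0

    def poll(self, max_records=10):
        batch = self.log.read(self.offset, max_records)
        self.offset += len(batch)
        return batch


log = TopicLog()
for line in ["GET /index 200", "GET /login 500", "POST /cart 200"]:
    log.append(line)

indexer = Consumer(log)  # e.g. feeds a search index
alerter = Consumer(log)  # e.g. watches for server errors

indexed = indexer.poll()
errors = [m for m in alerter.poll() if m.endswith(" 500")]

# A consumer that failed can re-sync by rewinding its offset and replaying.
indexer.offset = 0
replayed = indexer.poll()
```

Because the log keeps the messages after they are read, adding a new consumer or replaying an old one doesn't affect the producers or the other consumers.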

What is Kafka? Where does the name come from?
It’s explained in a great blog entry, by the way, a must-read for understanding Kafka.

We built Apache Kafka at LinkedIn with a specific purpose in mind: to serve as a central repository of data streams.

For a long time we didn’t really have a name for what we were doing (we just called it “Kafka stuff” or “the global commit log thingy”) but over time we came to call this kind of data “stream data”, and the concept of managing this centrally a “stream data platform”


LinkedIn platform before and after developing and implementing Kafka.


In this blog entry from “”, there is another technical explanation:


The Log: What every software engineer should know about real-time data’s unifying abstraction



I learnt about it thanks to Javi Roman (@javiromanrh), a Red Hat engineer who talks about Big Data; for several weeks his tweets always had some Kafka in them. It was so appealing that I had to research it myself to check whether it really belongs on my priority list.

Some links tweeted by Javi Roman to get a glimpse of Apache Kafka:

Potential and imaginary roadmap for 2015/16

I’ll talk about our Spanish middleware platform, dedicated to offering web services (and sites, not just ws).

It’s an exercise in sketching a potential roadmap for the next year.
Several RHEL servers in DEV, UAT and PROD, LAN, ExternalDevices (DMZ),…
Several JBoss app servers, Apache servers, F5 VIPs and support services (LDAP, MQ,…)
All the infrastructure that supports CI/CD: Subversion servers, Artifactory,…

Migrate Spanish CI/CD to X-Forge

That would mean migrating our source code from our Subversion servers to our overlords’ SVN repository (let’s call it X-Forge, a dev portal built on Jira, Confluence, etc.), our Artifactory server to Nexus, our build and deploy scripts to Bamboo, Sonar to SonarQube,… It seems viable because the technology involved is compatible. The build (Maven + headless Eclipse) and deploy (ssh+perl+bash) scripts are invoked from QuickBuild, but it would make no difference if it were Jenkins: we ought to be able to migrate them to Bamboo.

It isn’t in the plans for 2015, so it won’t happen before 2016 at the earliest. Although the technical part seems viable at first glance, there would still be a lot of details to iron out with our overlords. The underlying organization of the code (as Eclipse PSF) and the inter-dependencies between repositories make phasing the project quite difficult (if not impossible). In fact, seamlessly migrating our CI/CD infrastructure from Spain to Poland (while maintaining both for almost 2 years) was maybe one of the major feats of that project (in terms of complexity, of the 200 servers I built in the new data center, the SVN server ranked second and the configuration manager first). Nobody was impacted, so it went unnoticed (and nobody assessed the difficulty of getting it right at the first try).

Revamp J2EE platform

The platform design is showing its age. I designed it 4 or 5 years ago and it hasn’t been updated since then. It isn’t keeping pace. Recent moves to agile procedures have exposed a lack of flexibility that the powers that be have decided to overcome with what is called “multicontext”. Instead of creating several environments on demand (changing the platform), the solution is to install the same J2EE apps multiple times in the same environments, changing their contexts (I’m a ‘rara avis’: I’m the only one who thinks this is a problem and should be considered heresy).

Although most or all of our apps could run on a Tomcat/Jetty server, I’d keep the JBoss App Server due to its modularity, lightweight footprint, excellent support and synergies with other Red Hat products like FUSE (although we use another service bus as ESB, JBoss Fuse is elastic and looks great for cloud platforms) or BRMS (Drools).

Solution 1: Migration of our applications to Pivotal CF/CloudFoundry.

Overall, maybe the best solution as it’s being deployed by our overlords. It’d allow the developers to have control of their environments and create sandboxes/environments on demand.
Of course there are problems to be addressed. It’s a prerequisite that our CI/CD has already been migrated to X-Forge, and the applications should be reviewed and refactored to eliminate some tightly coupled functionality. Performance and stress tests are critical, since it’s a shared platform. Maybe the solution is to use it only for development in a first phase.

Solution 2: OpenShift V3

Maybe the best technical solution (IMHO and without knowing Pivotal in detail) for a J2EE platform like ours.

“OpenShift adds developer and operational centric tools on top of Kubernetes and Docker”.

It can be used for dev and prod environments and gives a lot of elasticity. Our apps will need to be dockerized, but that is also a plus. The drawbacks: as with Pivotal, the apps would need to be refactored, and it isn’t a technology that is on our overlords’ roadmap/elements (while Pivotal CF seems to be the option of choice). Dockerizing the apps will also need some kind of registry, access permissions and change management procedures of its own.
Dockerizing JBoss will also involve upgrading the JBoss Domain Controller and a better workflow (a preliminary POC was tested in Docker: Attack on Wildfly).
This solution doesn’t require the migration to X-Forge.

Solution 3: Upgrading our JBoss servers

Upgrading our JBoss servers and changing the devops procedures for the JBoss domain. It’s solution 2 without OpenShift, Kubernetes and Docker: no cloud ability, no elasticity, etc., just the improvements of a better workflow and a better refactoring (with Ansible or Salt) of the devops procedures (creating environments, creating apps, deleting, cloning,…)
This solution doesn’t require the migration to X-Forge.

Solution 4: Refactoring the devops procedures.

Most of the complaints are related to the lack of flexibility for creating new apps and environments. That’s really an automation problem that can be resolved anytime (if I had the time). The current design can be scaled just by adding extra capacity (and maybe licenses). I’ve already done preliminary work in Ansible and Salt; our overlords’ red line is a concern. This solution doesn’t require the migration to X-Forge.

Solution 5a: Not doing anything.

Keep multicontext and the same workflow (I don’t know why I detailed the other options… they have no chance against this one!)

Solution 5b:  Run.

Barcelona AWS Interact 2014

Last week (12/2/2014) Amazon presented some sessions from their AWS re:Invent 2014 in Barcelona.

The agenda:

Living here, I went to the event to take some notes and ask some questions.

9:00 Biking around looking for the site

It’s difficult to check Waze or Google Maps on a bike. I spent 30 minutes looking for the street without finding it because it was closed: there was a f*cking circus in front of the event!

10:00 Entrance

Queues at the entrance, people not too young, not too old, few girls, few suits, the venue an old warehouse reinvented as a conference building, minimalist style, all a bit hipster. (There are photos under the Twitter hashtag #AWSInteract.)

10:30 Queues at the expert spot

Luckily I’d gravitated there early and was able to ask some questions about extending our datacenter, VPC and extracting data from our AS/400 for processing with Hadoop, MapR or EMR.

14:15 Welcome and Keynote (by Carlos Conde )

Private clouds are a thing of the past; they won’t exist by 2020 (a Forrester or Gartner quote; like all forecasts, sometimes they guess right and sometimes they don’t, the trick is to predict both, depending on your target. With AWS, today, private clouds go the way of the dodo).

Their reasoning is that the innovation rate of the big three can’t be matched by companies whose core business isn’t IT. The innovation rate idea has some weight: private clouds don’t have the dynamism and service catalog of AWS or GCP, and they will always be behind in some way. Whether this is critical or not is another matter (we run production code that is 30 years old!). On the other hand, innovation happens everywhere.

The core business idea is funny coming from “a book-selling company”. It reminded me (maybe a bit later, when they talked about agility) of an article by the CEO of BBVA (one of the biggest banks in Europe): Banks need to take on Amazon and Google or die

Aurora (>2015): a new DB service, MySQL-compatible, with better performance than MySQL (x5), HA and fault tolerance.

KMS  (key management service)

A new centralized key mgmt service.

A change of paradigm in production roll-outs 

(explained and commented below in “Container management”)

Lambda  (in beta, early-mid 2015, available for preview)

It’s an engine for executing stateless functions that react to events. Those functions can be provided by AWS (its API and library/software functions for ETL) or programmed by the customer (in JavaScript using Node.js; other languages like Python, Java or Scala are in the pipeline).

The example was a Lambda function that was executed when an image was dropped into an S3 bucket: it fetched the image, made a thumbnail and stored that reduced image in a DB table, without needing extra code. They compared it to adding DB trigger functionality to all the AWS services.

The pricing model is for execution ($0.0000002 per request), memory allocation and time spent (100ms quantum). Better check
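To get a feel for that pricing model, here is a rough cost estimator. Only the per-request price and the 100ms quantum come from the talk. That durations are rounded up to the next quantum is my assumption, and `price_per_gb_second` is a placeholder parameter with no real value behind it, so check the official pricing before trusting any number. Any free tier is ignored.

```python
import math

REQUEST_PRICE_USD = 0.0000002  # per request, as quoted in the talk
BILLING_QUANTUM_MS = 100       # time is billed in 100ms slices


def billed_duration_ms(actual_ms):
    """Round a duration up to the next billing quantum (assumed behaviour)."""
    return math.ceil(actual_ms / BILLING_QUANTUM_MS) * BILLING_QUANTUM_MS


def estimated_cost_usd(requests, avg_duration_ms, memory_gb, price_per_gb_second):
    """Estimate the bill: request charge plus memory-time charge.

    price_per_gb_second is a hypothetical input, not an official price.
    """
    request_cost = requests * REQUEST_PRICE_USD
    billed_seconds = billed_duration_ms(avg_duration_ms) / 1000
    gb_seconds = requests * billed_seconds * memory_gb
    return request_cost + gb_seconds * price_per_gb_second


# One million thumbnail jobs of 120ms each with 0.5 GB of memory,
# using a made-up memory-time price purely for illustration.
example_cost = estimated_cost_usd(1_000_000, 120, 0.5, price_per_gb_second=0.00001)
```

Under the round-up assumption a 120ms function is billed as 200ms, so shaving a function down to just under a quantum boundary matters more than it looks.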

15:00 Customer Experiences

2 Spanish customer experiences: the migration/setup in AWS of (a second-hand sale portal) by Schibsted, and an online game (Golden Manager, a free2play soccer manager game).

Nothing revolutionary, but well planned: open source tools, low costs, run by a few people.

16:00 Coffee

Nespresso capsules

17:00 Infrastructure as Code (by Carlos Conde)

AWS Elastic Beanstalk

Their original service (maybe their classic cloud model). A simple way to script the deployment and scaling of simple apps (web applications, typical 3-tier apps).

AWS CloudFormation

This is IaaS or IaaS++, whatever you want to call it.

The whole platform is modeled in JSON templates. Changes to those templates are changes to the infrastructure.

The creation of all the services, provisioning, apps, DB, networking, balancer… (everything in AWS could be called a service) can be declared and scripted.

The atomic unit is called a stack (and reminds me a lot of Kubernetes’s pods)
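A small sketch of what "changes to the template are changes to the infrastructure" means in practice. The template is built as a Python dict and dumped to JSON; the top-level keys and the `AWS::S3::Bucket` type follow the CloudFormation template format, while the logical names (`ImagesBucket`, `ThumbnailsBucket`) and the two helper functions are hypothetical. Nothing here talks to AWS.

```python
import copy
import json


def with_resource(template, logical_id, resource_type, properties=None):
    """Return a copy of the template with one more resource declared."""
    updated = copy.deepcopy(template)
    resource = {"Type": resource_type}
    if properties:
        resource["Properties"] = properties
    updated["Resources"][logical_id] = resource
    return updated


def added_resources(old, new):
    """List the logical ids present in the new template but not the old."""
    return sorted(set(new["Resources"]) - set(old["Resources"]))


# Version 1 of a hypothetical stack: a single bucket.
v1 = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Hypothetical stack for uploaded images",
    "Resources": {
        "ImagesBucket": {"Type": "AWS::S3::Bucket"},
    },
}

# Version 2 declares a second bucket; the diff is the infrastructure change.
v2 = with_resource(v1, "ThumbnailsBucket", "AWS::S3::Bucket")
change = added_resources(v1, v2)

# This JSON text is what would be versioned and submitted as the stack template.
template_json = json.dumps(v2, indent=2, sort_keys=True)
```

Since the template is just text, versioning it gives you the history of the infrastructure for free.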

The goal is: NO SSH.  (nice goal btw, I fully agree, also it’s related to the micro-services paradigm and docker containers model)

DR (Disaster recovery environment)

Easy to set up and clone


No more Visios or outdated PowerPoints. The documentation is in the code. (He didn’t give examples, just a slide and a brief comment. The image that comes to my mind is GitHub repos and their .md text files. The templates are going to be versioned in GitHub or some clone, so it’s logical to document them there too, tying code and documentation together.)

Game Days

motto: if you didn’t measure it, it didn’t happen (rule #32 for Amazon?)

It’s a service offered by AWS for training customer staff. They clone the customer’s environments and for a day/weekend they play BOFH, so the customer’s staff learn to detect and respond to incidents (some of their favorites: changing the hosts files, but also changing IPs, halting the DB, moving servers to other networks, corrupting running services…)

The load testing methods are interesting (well, the first one; the second one is meh).

1) Replay network load (they save the network load of a typical day and they rerun it, the traffic is real. cool!)

2) Synthetic load tests (jmeter, etc).

17:15 Container Management

motto: Agility is Key

“Speed of iteration > iteration quality” or in another way, “fast release in small units > one big release”

(from the keynote)

Micro-services model: most of the items on the Amazon sites are micro-services. Amazon rolls out 95 production changes per minute (AWS included, I think; they didn’t specify). Each micro-service is maintained by a team of 5 people; the motto here is “you build it, you run it”. Those teams not only code but also maintain and run their services.

The most successful start-ups using AWS have a production roll-out rate of one every 20 minutes.

Amazon EC2 Container Service (+2015)  and

Container (docker) management in EC2 instances.

Task = set of containers.

JSON files defining container relationships.
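To make "task = set of containers" concrete, here is a sketch of such a JSON definition built in Python. The field names (`family`, `containerDefinitions`, `links`, `portMappings`) follow the task definition format as I understand it; with the service still in preview, treat them as assumptions. The container names, images and the `undefined_links` helper are hypothetical.

```python
import json

# Hypothetical task: a web container linked to a cache container.
task_definition = {
    "family": "web-with-cache",
    "containerDefinitions": [
        {
            "name": "cache",
            "image": "redis",
            "cpu": 128,
            "memory": 256,
            "essential": True,
        },
        {
            "name": "web",
            "image": "example/web-app",  # made-up image name
            "cpu": 256,
            "memory": 512,
            "essential": True,
            "links": ["cache"],  # the relationship between containers
            "portMappings": [{"containerPort": 8080, "hostPort": 80}],
        },
    ],
}


def undefined_links(task):
    """Return (container, link) pairs pointing outside the task."""
    containers = task["containerDefinitions"]
    names = {c["name"] for c in containers}
    return [
        (c["name"], link)
        for c in containers
        for link in c.get("links", [])
        if link not in names
    ]


problems = undefined_links(task_definition)
task_json = json.dumps(task_definition, indent=2)
```

A check like `undefined_links` is the kind of thing worth running before registering a task, because a link can only point at a container defined in the same task.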

Still in beta (open for preview, not for production)

The looks, the functionality, the syntax. It’s a Kubernetes clone integrated with AWS.

17:30 Security Operations AWS (Enrique Lopez)

Security is shared with the client. Securing the content and services inside the instances is the customers’ responsibility.

Data doesn’t leave the region that holds it. In Europe there are 2 regions: Ireland and Frankfurt (Germany).

3 types of data encryption:

  • managed by aws (keys and software)
  • shared (software by aws, keys by customers)
  • managed by client (keys and non aws software)


A dedicated HW appliance for each client; the keys are exclusive to the client and AWS doesn’t know them. KMS, on the other hand, would be the shared option.

VPC (Virtual Private Cloud)

VPC lets you configure a set of AWS instances as a private cloud, an extension of the datacenter using VPN/IPSec.

AWS Config

Still in preview, it’s an inventory management tool for tracking and monitoring changes in the configuration of the services, access permissions, code, etc.

Trusted Advisor

It’s an automated cost optimizer with security features: it checks all your services and lists your configuration and its risks (like open permissions, open firewall rules, read-write permissions,…)

It’s a “Best Practices” automation tool.

Penetration testing

There’s no problem: clients can arrange tests whenever they want, but AWS must be notified beforehand using a web form.

18:00 Mobile support / Amazon App Store

Skipped (late and not very interested, I could always look it up later if I needed that intel)