Disrupting Java apprentices with node.js

I started lecturing a few years ago, and for a while it was a full-time job. I lectured mainly in two programming languages: C in introductory classes (first semester) and Java for OOP, data structures and distributed systems classes (spread evenly across the bachelor’s degree).

After a while I got bored, and nowadays lecturing is no longer a full-time job for me. My main contribution now is to bring fresh tech into play and introduce it to seniors and other lecturers, and this is where this story starts.

About one year ago I started diving into node.js. Initially I only used it for non-critical stuff, still using Java as my main language.
I really liked node from the first day, but I just didn’t have the time to go deep into it back then.
About six months ago everything changed: I got really hooked on it, and now I finally feel that I have acquired enough skill to let it replace Java in my head.
The node ecosystem is awesome and it doesn’t have all the clutter that Java has; it is so clean and agile…

This semester I decided to bring node.js into a senior class and, at the same time, into a few senior projects.

Remember how I started this post?

Here almost every design/programming topic is taught in Java, so when I gave the first class about JavaScript and node.js everyone was like “WTF?”
A few students even looked at me as if I were drunk, because in their heads JavaScript was for browsers and nothing else.

In order to sell them node.js I live coded a drag-and-drop DIV element that moved simultaneously in all browsers using socket.io. Instantly everyone in the room was curious about it, even the more sceptical ones. (If you want to disrupt someone, show them something very graphic/explicit, and that is what I did.)
Code available at https://github.com/apocas/psi2013-node (it uses prototype-based objects, modules, and events).

It’s true that I could have implemented that in almost any other language, but how little effort it takes in node.js is really impressive, and that was what disrupted the audience (npm awesomeness helped :-D).

Right now seniors are starting to get their hands on node.js in multiple projects (https://github.com/portugol – Portugol rewritten in node.js). So far I feel that the hardest thing for them is the asynchronous architecture.

One big advantage I noticed, though, was that they came from a language where everything is an object (Java). Because of this they quickly understood objects in JavaScript and how events can be used for message passing in an asynchronous environment.

In my opinion this is one of the most important things to understand in the JavaScript/node.js world.
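To make the "events as message passing" idea concrete, here is a minimal sketch using Node's built-in EventEmitter. The Box class and the 'moved' event are made up for illustration (they are not taken from the psi2013-node repository), and the sketch uses class syntax rather than the prototype-based style mentioned above.

```javascript
const { EventEmitter } = require('events');

// Hypothetical example: a draggable box that announces position changes
// as events instead of returning them from a method call.
class Box extends EventEmitter {
  constructor() {
    super();
    this.x = 0;
    this.y = 0;
  }

  moveTo(x, y) {
    this.x = x;
    this.y = y;
    // Message passing: anyone interested subscribes to 'moved'.
    this.emit('moved', { x, y });
  }
}

const box = new Box();
const received = [];

// The listener plays the role a listener interface would play in Java.
box.on('moved', (position) => received.push(position));

box.moveTo(10, 20);
box.moveTo(30, 40);

console.log(received);
```

For someone coming from Java, the listener is close to an observer registered on an object; the difference is that nothing in Box needs to know who is listening or how many listeners there are.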

.onion Mapper

Initially I had the idea of developing this during Codebits, but I ended up developing it for fun before the event, which ruled out using it there.

.onion is an unofficial top-level domain suffix. The big thing about this TLD is that you can only access its domains over the Tor network.

Because of the anonymity of the Tor network and the need to use it to access this TLD, this type of network is very often called the deep web.

The idea here was to crawl the .onion network, but instead of crawling and data mining its contents I just wanted to map the relationships between its servers.

The stack is very simple. At the infrastructure level it was built around a main control node, which runs node.js and a Redis instance.

Additionally there were multiple crawlers running an onion-tweaked version of crawler4j. Each crawler grabs the *.onion links in the HTML code and saves them (domain relationships) in Redis using Jedis. At the network level Polipo was used as a proxy, along with, obviously, the Tor client.
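The link-grabbing step can be sketched roughly as follows. This is not the crawler4j/Jedis code, which is Java; it is a small JavaScript illustration under assumptions: the regular expression is my own guess at what counts as a *.onion link, and a plain in-memory Map stands in for Redis.

```javascript
// Matches http(s) links whose host ends in .onion and captures the
// onion domain itself. The pattern is an illustrative assumption.
const ONION_LINK = /https?:\/\/(?:[a-z0-9-]+\.)*([a-z2-7]+\.onion)\b/gi;

// Returns the set of distinct .onion domains linked from a page.
function extractOnionDomains(html) {
  const domains = new Set();
  for (const match of html.matchAll(ONION_LINK)) {
    domains.add(match[1].toLowerCase());
  }
  return domains;
}

// In-memory stand-in for Redis: source domain -> set of linked domains.
const relationships = new Map();

// Records which domains a crawled page on sourceDomain links to.
function recordRelationships(sourceDomain, html) {
  if (!relationships.has(sourceDomain)) {
    relationships.set(sourceDomain, new Set());
  }
  const targets = relationships.get(sourceDomain);
  for (const domain of extractOnionDomains(html)) {
    if (domain !== sourceDomain) targets.add(domain); // skip self-links
  }
  return targets;
}
```

Storing one set of targets per source domain maps naturally onto a set per key in a key-value store, which is presumably why something like Redis fits this job well.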

Each domain relationship is displayed in a graph rendered with sigma.js, and all data is delivered to the browser using socket.io.
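Before a graph can be drawn, the stored relationships have to be turned into nodes and edges. The sketch below shows one way to do that; the { nodes, edges } shape with id/label/source/target fields is an assumption about what the rendering side expects, and the sample domains are made up.

```javascript
// Converts "source -> targets" relationships into a node/edge list.
function toGraph(domainLinks) {
  const nodeIds = new Set();
  const edges = [];
  for (const [source, targets] of domainLinks) {
    nodeIds.add(source);
    for (const target of targets) {
      nodeIds.add(target);
      edges.push({ id: `${source}->${target}`, source, target });
    }
  }
  // Every domain becomes one node, whether it links out or is only linked to.
  const nodes = [...nodeIds].map((id) => ({ id, label: id }));
  return { nodes, edges };
}

// Made-up sample data.
const sampleLinks = new Map([
  ['a.onion', new Set(['b.onion', 'c.onion'])],
  ['b.onion', new Set(['a.onion'])],
]);

const graph = toGraph(sampleLinks);
console.log(graph.nodes.length, graph.edges.length);
```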

In two days it crawled 1.5M URLs, finding and mapping relationships between 440 domains. Keep in mind that this crawling was done inside the Tor network, which sometimes has very high latency.

Finally here it is.

Designing a monitor and control system for 200+ servers

A few months ago I had to design a proactive monitoring system that could handle 200+ servers with ease. The idea was not to build a simple monitor that passively watched the server farms and notified the admins when some threshold was reached.

Keeping a team watching the servers 24/7 has its problems, so it would be great if the system could lighten the load on them.

I wanted the system to be able to react according to the scenario it was facing at the moment. This scenario is represented by all the readings of each sensor loaded at the time, and it may be contained to a single server or be farm/cluster wide. With this reactive capability humans are notified only about situations that the system couldn’t handle/contain.
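The "react first, escalate only if containment fails" idea can be sketched like this. Skynet itself is written in Java and its real rules are not shown in this post, so the rule names, thresholds and reading names below are all hypothetical.

```javascript
// Hypothetical reaction rules: each one inspects the current scenario
// (sensor name -> latest reading) and may try to contain a problem.
const rules = [
  {
    name: 'mail-queue-too-long',
    matches: (scenario) => scenario.mailQueue > 1000,
    react: () => true, // pretend cleaning the queue worked
  },
  {
    name: 'disk-almost-full',
    matches: (scenario) => scenario.diskUsedPercent > 95,
    react: () => false, // no automatic fix available: needs a human
  },
];

// Runs every matching rule and returns the names of the problems the
// automatic reaction failed to contain, so only those reach the admins.
function evaluate(scenario, notifyHumans) {
  const escalated = [];
  for (const rule of rules) {
    if (rule.matches(scenario) && !rule.react(scenario)) {
      escalated.push(rule.name);
    }
  }
  if (escalated.length > 0) notifyHumans(escalated);
  return escalated;
}
```

With a scenario like { mailQueue: 5000, diskUsedPercent: 50 } nobody is notified, because the only matching rule reports that it contained the problem.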

Sorry if I offended anyone with the project name (Skynet), too many movies… lol. FYI, it has a TTS library that is used for many things, and one of them is saying “hasta la vista baby” 😛

Architecture

  • Starting from the core piece, it was written in Java for two main reasons. The first was that at the time I had only a few days to implement the prototype, and since I have years of experience in Java, that is where I was most productive.
  • The second reason was “reflection”. I know many other languages let you inspect and execute code at runtime, but again, previous experience with the technology allowed me to cut corners. Runtime inspection/execution was mandatory, since I wanted to be able to add components/sensors/… at any time and, more importantly, abstract all of this.
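The core uses Java reflection for this, and that code is not shown here. As a rough JavaScript analogue of the same idea (choosing which sensor classes to instantiate from names known only at runtime), something like the sketch below would work. The sensor classes, their readings and the profile format are all hypothetical.

```javascript
// Hypothetical sensors; the names and readings are made up.
class LoadSensor {
  read() { return { sensor: 'load', value: 0.42 }; }
}
class MailQueueSensor {
  read() { return { sensor: 'mailQueue', value: 12 }; }
}

// Registry keyed by name, so new sensors can be registered at any time
// without touching the code that loads them.
const sensorRegistry = new Map([
  ['LoadSensor', LoadSensor],
  ['MailQueueSensor', MailQueueSensor],
]);

// Instantiates the sensors listed by name in a server profile.
function loadSensors(profile) {
  return profile.sensors.map((name) => {
    const SensorClass = sensorRegistry.get(name);
    if (!SensorClass) throw new Error(`Unknown sensor: ${name}`);
    return new SensorClass();
  });
}

const mailServerProfile = { sensors: ['LoadSensor', 'MailQueueSensor'] };
const readings = loadSensors(mailServerProfile).map((s) => s.read());
console.log(readings);
```

In Java the registry lookup would typically be replaced by loading a class by its name and instantiating it, which is what makes adding a sensor a matter of dropping in a new class.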

Skynet Schematic

Input sources

  • Currently Skynet has many input sources. The main one is SSH sessions opened to each server, which allow it to monitor everything on that server. According to each server’s profile, the right set of sensors is loaded at runtime using reflection.
  • These SSH sessions are, of course, also used by Skynet to actively interact with the servers: for example blocking an IP, keeping mail queues clean, stopping some non-critical services if a server is under stress, etc. All this is done automatically, and if the problem cannot be contained then humans are alerted.
  • The second main input source is mail. This is great since end users/customers can interact with the system without knowing it and without human intervention, for example by requesting an IP unblock from a server in a shared hosting cluster.
  • There are many others, like RSS feeds, SMS and so on. RSS feed support is a funny story: Skynet actively scans defacement feeds (like zone-h and others) for IPs belonging to any of the servers connected to it. If a match is found it alerts the admins, allowing them to alert the website owner.
  • The applications are endless.
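The defacement feed check boils down to looking for monitored IPs in feed entries. A minimal sketch follows, assuming the feed has already been fetched and parsed into entries with title and text fields (that shape is my assumption, not the format of any real feed). The IPs used are documentation addresses.

```javascript
// Pulls IPv4-looking strings out of free text (no range validation).
const IPV4 = /\b(?:\d{1,3}\.){3}\d{1,3}\b/g;

// Returns one hit per monitored IP mentioned in a feed entry.
function findDefacedServers(feedEntries, monitoredIps) {
  const monitored = new Set(monitoredIps);
  const hits = [];
  for (const entry of feedEntries) {
    const ips = entry.text.match(IPV4) || [];
    for (const ip of ips) {
      if (monitored.has(ip)) hits.push({ ip, title: entry.title });
    }
  }
  return hits;
}

// Made-up feed entries for illustration.
const feedEntries = [
  { title: 'example.org defaced', text: 'Mirror saved for 192.0.2.10 (example.org)' },
  { title: 'unrelated site', text: 'IP 203.0.113.5' },
];
const hits = findDefacedServers(feedEntries, ['192.0.2.10', '198.51.100.7']);
console.log(hits);
```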

Data

  • All events and readings are stored in an offsite Redis instance, which adds persistence.

Output

  • The current version has modules for SMS, mail and Twitter. Twitter is used almost like a timeline log for each action Skynet takes, and since there is a Twitter client on almost any electronic device nowadays, it’s the perfect on-the-go log solution. (The feed is kept private.)

Security

  • The machines where the Skynet core runs are in a secure location without any direct inbound connections from the web. Since SSH sessions are used to talk to the servers, there would be a real danger if the location were compromised.
  • Key authentication is used and keys are kept only in volatile memory. If the power goes down they are lost, so even if someone steals the machines they will not be able to reestablish the sessions with the servers from the new location.
  • It is totally autonomous, accepting only an emergency shutdown command in case something starts to deviate. This command is not sent directly to Skynet, since there’s no direct connection to it from the outside; instead it is saved in a location that Skynet connects to in order to check for emergency commands (botnet style).
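The pull-based emergency shutdown can be sketched as a poller that only ever makes outbound checks. Everything here is hypothetical: the 'EMERGENCY_SHUTDOWN' command string, the readCommand function (which in a real setup might read a file or URL that both sides can reach) and the callback are placeholders, not Skynet's actual protocol.

```javascript
// Builds a poll function that checks a drop location for commands
// instead of accepting inbound connections.
function createCommandPoller(readCommand, onShutdown) {
  let stopped = false;
  return function pollOnce() {
    if (stopped) return false; // already shut down, nothing to do
    if (readCommand() === 'EMERGENCY_SHUTDOWN') {
      stopped = true;
      onShutdown();
      return true;
    }
    return false; // any other content is ignored
  };
}

// Example wiring with a fake drop location held in a variable.
let dropContents = null;
let shutdownCount = 0;
const pollOnce = createCommandPoller(
  () => dropContents,
  () => { shutdownCount += 1; }
);

pollOnce(); // nothing waiting yet
dropContents = 'EMERGENCY_SHUTDOWN';
pollOnce(); // command picked up, shutdown callback runs once
```

In practice pollOnce would be called on a timer. The point of the design is that the monitored core never has to listen on a port for the command to reach it.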

Web Architecture

  • Here comes my favorite part of all this. That Redis instance had to be accessed somehow, and for me the only web that makes sense (for this kind of thing) is realtime.
  • In order to achieve realtime and bragging rights you have to build it in full JavaScript, so I needed a good async data controller on the server side. This was the big opportunity for Node.js in this project.
  • Node.js made it possible to build something with socket.io really quickly, since some code was reused in the web clients. This allowed quick, painless and direct realtime access to the data in the Redis instance.
  • I added a few cool UI libraries into the mix (like Google Chart, jQuery, jGrowl) and a realtime dashboard was built overnight.

After Skynet went online and became “reactive”, human intervention in maintenance tasks and in solving simple event scenarios dropped drastically. More importantly, it filters the problems, solving the simple ones and passing only the harder ones to the sysadmins, which boosts productivity.