.onion Mapper

Initially I planned to develop this during Codebits, but I ended up building it for fun before the event, which ruled out using it there.

.onion is a “non official” top-level domain suffix. The big thing about this TLD is that its domains can only be accessed over the Tor network.

Because of the anonymity characteristics of the Tor network and the need to use it to access this TLD, this type of network is very often called the deep web.

The idea here was to crawl the .onion network, but instead of crawling and data mining its contents I just wanted to map the relationships between its servers.

The stack is very simple. At the infrastructure level it was built around a main control node, which runs node.js and a Redis instance.

Additionally, there were multiple crawlers running an onion-tweaked version of crawler4j. Each crawler grabs the *.onion links in the HTML code and saves them (the domain relationships) in Redis using Jedis. At the network level, Polipo was used as a proxy, along with, obviously, a Tor client.
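A minimal sketch of the link-grabbing step described above, written in Python rather than the Java/crawler4j setup that was actually used. The regex, the function names and the in-memory dictionary standing in for Redis are all assumptions made for illustration, not the original crawler's code.

```python
import re
from collections import defaultdict
from urllib.parse import urlparse

# Assumption: only absolute http(s) links found in href attributes are considered.
HREF = re.compile(r"""href=["'](https?://[^"'\s>]+)["']""", re.IGNORECASE)


def onion_domain(url):
    """Return the lowercased .onion hostname of url, or None for any other host."""
    host = urlparse(url).hostname
    if host and host.endswith(".onion"):
        return host
    return None


def extract_onion_domains(html):
    """Collect the distinct .onion hostnames linked from an HTML page."""
    domains = set()
    for url in HREF.findall(html):
        domain = onion_domain(url)
        if domain:
            domains.add(domain)
    return domains


def record_relationships(store, page_url, html):
    """Save source-domain -> linked-domain edges, skipping self-links.

    `store` is a plain dict of sets here (hypothetical); a Redis set per
    domain could play the same role.
    """
    source = onion_domain(page_url)
    if source is None:
        return
    for target in extract_onion_domains(html):
        if target != source:
            store[source].add(target)


store = defaultdict(set)
page = (
    '<a href="http://bbbbbbbbbbbbbbbb.onion/x">b</a> '
    '<a href="http://example.com/">clearnet</a> '
    '<a href="http://aaaaaaaaaaaaaaaa.onion/self">self</a>'
)
record_relationships(store, "http://aaaaaaaaaaaaaaaa.onion/index.html", page)
```

Only the domain-to-domain edge is kept, not the page contents, which matches the goal of mapping relationships rather than mining data.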

Each domain relationship is displayed in a graph rendered with sigma.js, and all data is delivered to the browser using socket.io.
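As a rough illustration of how stored relationships could be turned into something a graph renderer consumes, here is a Python sketch that builds a nodes-and-edges structure. The payload shape (nodes with `id` and `label`, edges with `id`, `source` and `target`) and the function name are assumptions; the real server ran on node.js and its actual message format is not described here.

```python
import json


def build_graph_payload(relationships):
    """Turn {source: {targets}} into a dict holding a node list and an edge list."""
    node_ids = set(relationships)
    for targets in relationships.values():
        node_ids.update(targets)

    # Sorted so the output is deterministic.
    nodes = [{"id": d, "label": d} for d in sorted(node_ids)]
    edges = []
    for source in sorted(relationships):
        for target in sorted(relationships[source]):
            edges.append({"id": f"e{len(edges)}", "source": source, "target": target})
    return {"nodes": nodes, "edges": edges}


# Hypothetical sample data; real .onion hostnames are much longer.
relationships = {"a.onion": {"b.onion", "c.onion"}, "b.onion": {"a.onion"}}
payload = json.dumps(build_graph_payload(relationships))
```

A JSON string like `payload` is the kind of message that could be pushed to the browser over a socket and handed to the graph library.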

In two days, it crawled 1.5M URLs, finding and mapping relationships between 440 domains. Keep in mind that this crawling was done inside the Tor network, which sometimes has very high latency.

Finally here it is.

Codebits 2012

It was my third Codebits and by far the best one!

Codebits is an event that combines characteristics of a hackathon and a conference.

Organized and sponsored by Sapo, each year it has 800 handpicked attendees.

This year I decided to “walk the talk” and really get into the spirit of the event: I gave a talk, saw talks, helped, got help, participated in a project (4th place :D). All this over three awesome days.

Unfortunately, I didn’t have much time to meet new people; it happens when you are always running.

Here are some photos to remember these three awesome days:

From SWF to HTML5

A few months ago I challenged our team to convert/rebuild the big Flash-based animation on our entry page in HTML5.

Well, this was one of those challenges that backfired 🙂

Our designer is an awesome graphic designer, but he is still sharpening his “web skills”, which was a problem since he could only deliver that animation in SWF format.

I started looking into painless solutions to convert the SWF file to HTML5. Adobe had some stuff (can’t remember its name) still in beta, which didn’t work very well at the time.

Then I found Google Swiffy, a project from Google’s DoubleClick team which converts SWF to HTML5, using their library to interpret things along the way.

The first problems we encountered were performance and some conversion issues. We rebuilt the Flash and ActionScript, which solved all those issues.

The second major issue was library compatibility with jQuery. I don’t know why (since it is a really bad idea), but they used a “$” to define variables inside their code base.

Well, I could easily have redefined the jQuery caller on my side, but I wasn’t starting something from scratch, so this was not an option.

Instead I swapped the problematic variables in Swiffy, solving the compatibility issues. This was really an easy fix.

You may find a jQuery-compatible Swiffy version at https://github.com/apocas/swiffy-jquery

It worked pretty well. Although the generated JS file is 50% bigger than the initial SWF file, it was a tradeoff we could handle.

This is related to the graphic/visual complexity of our SWF; if your SWF isn’t graphically complex, the end result could be entirely different.