Monday, December 15, 2008

IP mapping Romanian sites around the world

It happens (sometimes) in life that you are very close to something/someone(? :-) and because of it you cannot grasp the full extent of your opportunities. Taking a few steps back/away might help, if you train yourself to think outside the box; otherwise, taking a few years' break will definitely bring the "argh, I could have done this & that" moments :-)

Which brings us to the subject: I could have done it in 2000 (maybe), I could have done it in 2004 (definitely, maybe :-) or I'll just do it now (2008).

The task at hand was/is quite daunting, but one can break it into a few "easy" steps:
  1. build your own web crawler/robot
  2. feed it with prior art or just Google
  3. let it crunch the data/web for a while or two :-)
  4. (hopefully your robot behaves and no one will ban you; sorry guys)
  5. extract the www sites from the stack
  6. perform some IP magic/geo tagging on them
  7. find a good/nice way to plot/display your data
I guess by the time one reaches step 6, the chances of finishing the job go above 80% :-)
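
Steps 5 and 6 above can be sketched roughly as follows. This is only an illustration, not the code behind this post: the function names, the `.ro` suffix filter and the injectable `resolver` argument are all assumptions made for the example. It reduces a pile of crawled URLs to unique host names and then looks up one IPv4 address per host; the geo tagging itself would still need a separate IP-to-location lookup.

```python
import socket
from urllib.parse import urlsplit


def extract_hosts(urls, suffix=".ro"):
    """Step 5 (sketch): reduce crawled URLs to a sorted list of unique hosts.

    The suffix filter is an assumption for this example, not the post's rule.
    """
    hosts = set()
    for url in urls:
        # .hostname is lower-cased with the port stripped; None if there is no netloc
        host = urlsplit(url).hostname
        if host and host.endswith(suffix):
            hosts.add(host)
    return sorted(hosts)


def resolve_hosts(hosts, resolver=socket.gethostbyname):
    """Step 6, first half (sketch): map each host to an IPv4 address.

    Hosts that fail to resolve are skipped. `resolver` is injectable so the
    function can be tried without touching the network.
    """
    resolved = {}
    for host in hosts:
        try:
            resolved[host] = resolver(host)
        except OSError:  # socket.gaierror is a subclass of OSError
            continue
    return resolved
```

Passing a fake `resolver` (any callable taking a host name and returning an address string) keeps experiments offline and polite to other people's DNS servers.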

Luckily, I managed to do 7+ and to the left is the static result of it. A more dynamic map of Romanian sites around the world is hosted by Mapeed until January 2009, so be quick to enjoy it (Santa comes earlier this year for the ones clicking with IQ :-)

Update: the data was interpolated/reduced (depends on how you look at it :-) to ~50K geo locations that follow the initial distribution found in the collected data set. I have neither the time nor excessive passion/knowledge(? :-) to run the numbers in a complete statistical way. It's like what I learned the other day: it's good to have some data to question, rather than having no data at all.
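
One way to read "reduced while following the initial distribution" is proportional scaling over grid cells. The sketch below is an assumption about how such a reduction could be done, not a description of what was actually run for the map: `reduce_points`, the `precision` grid size and the "keep at least one point per cell" rule are all hypothetical choices.

```python
from collections import Counter


def reduce_points(points, target, precision=1):
    """Bin (lat, lon) points into grid cells and scale the counts down.

    Each cell keeps a share of `target` proportional to its original count,
    so dense areas stay dense. Every occupied cell keeps at least one point,
    which means the result can slightly exceed `target`.
    """
    cells = Counter(
        (round(lat, precision), round(lon, precision)) for lat, lon in points
    )
    total = sum(cells.values())
    if total <= target:
        return dict(cells)  # nothing to reduce
    return {
        cell: max(1, round(count * target / total))
        for cell, count in cells.items()
    }
```

With 80 points in one cell and 20 in another, asking for a target of 10 keeps the same 4:1 ratio between the two cells.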


Anonymous said...

nice one, how long did it take you to collect them?

tester said...

pretty hard to say, it was part of a 'hobby' project that I kept starting and abandoning over time...

still, the last 2+ months were more intense, putting everything together (for this post :-)

Sebastian Cochinescu said...

I didn't find the easter egg. :(

tester said...


clicking on the dots (n) -> zoom in
clicking on the dots (1) -> the IP class located there :-)

Anonymous said...

how did you do step 6? did you parse something, or something like that?

tester said...


nope, I used a service like: