Friday, January 18, 2013

Neo4j Un-Meetup in Gamlastan/Stockholm

I have been involved with Neo4JStockholm for about a year. It's a community of friends (who happen to be great hackers, fun thinkers and overall an awesome lot). We always welcome newcomers to our meetups as if they have been with us for years. But one piece of feedback that I got from past meetups was that there was a lot of intro information and not much advanced discussion. So for our first meetup of 2013 we decided to do a little experiment called Un-Meetup. The name is obviously inspired by the term Un-conference.

Of course we also wanted to go to a newer venue for this meetup. Omnicloud, a cool start-up based in Gamlastan (the old town, a really cool place to visit) in Stockholm, reached out to us and I am so glad that they did. To quote from their website:
"Omnicloud is drag-and-drop hosting on Mac OS X that empowers web developers to build great things."
We had access to at least two rooms so we decided to run two parallel tracks. We voted on the following topics:
-Introduction to NoSQL and Neo4J (track A)
-Neo4J in production
-Query optimization and performance (track B)
-Data modelling

So participants voted and chose the two topics marked as track A and track B above, and we had two informal sessions. We had plenty to talk about in both sessions and they could easily have gone on for hours. I (a self-elected/imposed moderator) was a bit of a pain in the a** about keeping the entire set-up informal (so less lecturing, more one-on-one discussion). On more than one occasion I interrupted the speakers to take a break. The highlight of the meetup was all the cool new members from various backgrounds who share an interest in Neo4J and in building better software.

Omnicloud was our host and their office served as the perfect location for this meetup. I can safely certify that this was the first meetup in Stockholm attended by a cat. Furthermore the developers at Omnicloud were really cool hosts and are smart people who happen to be working on really low-level kernel hacking stuff (you should check out their blog). Their company is building a product that would solve a lot of complexities for end users (making their lives richer and more fun by doing so). That's what Neo4J does and that's what every good product should be doing.

But enough ranting. I think this format was a great success and it should be replicated in other places. That's all I wanted to share.

See pictures here 

Thursday, September 6, 2012

Ideas for Neo4j community education

The purpose of this (longish) post is to jot down my ideas about Neo4j education in its community. This post is in response to an irc/online meetup called on Sep 12th 6th 2012. I have been involved with the Neo4j community for almost a year now. It all started with me attending Goto Aarhus 2010 as a student volunteer/crew. Since then I have been trying to learn more about Neo4j and NOSQL for my personal and professional use. I am by no means an expert in Neo4j, or in spreading knowledge (pedagogy). But as a loud-mouthed troublemaker I do have some opinions/ideas of my own.

There are many aspects of Neo4j that I am interested in (Cypher being one, Spring-Data integration being another). But one topic that I am deeply interested in is education about NOSQL in general and Neo4j in particular. The reason is quite selfish, to be honest. SQL and relational databases have a very deep grip on the job market and more importantly on the intellectual market (you can call it decades of collaborative brainwashing). As a developer who values his sanity I want to escape the fate of "SQL Join Hell" at my workplace. It is only by education and collaboration that we as a community can "get over" SQL.

So I am pleased that there is a genuine effort in the Neo4j community to address the issue of knowledge proliferation.


Theme: Neo4j Everywhere

The first topic that I would like to address is more of a slogan. As a strategy/theme Neo4j should be used/present everywhere (or at least there should be easy-to-follow examples that touch every aspect of database-driven software development). So whether it is enterprise application development or a casual data-crunching app, it would be of great help to have a Neo4j example.

It could also be applications that one uses personally on a daily basis. I for my own sake have been drooling to use a Neo4j backed blog engine (perhaps Structur might fit the use case and I may find it in me to ditch blogger.com for this CMS instead, would you do that for your own blog?).

A good effort in the examples department was the Neo4j Heroku application challenge. The result of this exercise is the repository of examples on Gensen. While this effort is commendable, there is considerable room for improvement (Disclosure: in terms of participation, I am myself guilty of hyping a demo github-issue-voting app and then not delivering; 2012 has been a rather crazy year so far for me). Perhaps the hacking challenge for 2013 will have a longer advertising/signup period, more marketing, varied challenges in different categories, and ideas along the lines of using already established datasources (use this datasource and do something cool) and converting a known open source project to Neo4j.

Neo4j in the classroom

I would like to see Neo4j being introduced in more classrooms/lecture halls around the world. In most universities around the globe the undergraduate/graduate database courses include guest lectures. Here is an opportunity to introduce NOSQL and graph databases to the younglings. This could be a community effort where individuals reach out to their own universities and arrange for/deliver guest lectures. As a group we could come up with a common toolkit/presentation for such activities. The real trick is to introduce the possibilities of using Neo4j to someone who has not been stung by SQL/relational databases. So on a practical note, I would love to see someone presenting Neo4j at Chalmers (my alma mater) and KTH here in Sweden. I might do one of them myself if I can muster the courage.

Of course the first criticism of such an approach is scalability (I have a much bigger potential audience for this rambling/blog compared to all the lectures that I can possibly give). While I truly believe that one-on-one contact trumps all other forms of knowledge sharing, the online format of learning has been gaining more traction over the last year or so (with Khan Academy, Stanford courses and all). So the next thing that I would like to see is a Coursera/Udacity course using graph databases (with emphasis on applications).

I am myself new to the subject of graphs and their applications (I slept through my graph theory lectures at college). I am sure there is a lot to learn from graph theory about how to do effective real-world modelling when it comes to graph databases. Currently there are two courses on applications of networks/graphs happening this fall on Coursera that have caught my eye: Networked Life and Social Network Analysis. I would love to see Neo4j as part of the toolset for the next round of these courses in 2013 (Neo4j could potentially be used in the assignments that students have to solve for such courses). So this could also be a practical goal. We can start by contacting the current instructors of these courses and see. They will hopefully choose Neo4j on merit (a practical approach to graph databases). One could offer a course on Neo4j itself but it has the potential to get outdated very soon.

All of this may sound very idealistic, but it is worth exploring in my opinion.

Spoon Feeding

Another thing that I would like to see clearly, in terms of examples and blog posts, is a rapidly evolving application. A case in point is an easy-to-follow example of how to have a fluid, evolving schema (step by step) with Neo4j over the course of changing (real-world) requirements. One could draw parallels to how this works in SQL or other NOSQL stores (hint: it's not easy). I know that all of this is possible, but I want a clear but simple example that I can show to others. And yes, I am thinking about a spoon-feeding example. Gensen is a great effort but it emphasises self-learning.

So thinking out loud (at the risk of sounding like a fool), an example could be some incremental features implemented in MySQL, Java and Spring (perhaps with Liquibase to manage database schema migrations) vs a similar application using Spring, Spring-Data and Neo4j. Perhaps this is all very easy to do, but for many of us "seeing is believing".

Cross Meetup collaboration

I would also like to see more collaboration amongst meetups in different cities and even cross meetup-group events, e.g. a MongoDB vs Neo4j event where MongoDB experts and Neo4j champs work on similar requirements in a hackathon. This is easier said than done. With such events we can shine light on the gaps (in Neo4j or in community knowledge, and perhaps on how well or badly Neo4j fares in comparison to MongoDB for instance). Besides, it would be like a friendly football match (for those who would rather code than play football). I have in the past talked about collaborative online meetups (opening up the possibility for Stockholm members to attend the awesome Chicago meetups live via Google hangouts) but that is one tough nut to crack as too many things can go wrong.

Conclusion

It is a very exciting time to be working with databases. Data growth (both in size and complexity) is almost certain in every aspect of our lives and a technology like Neo4j will surely help us tame this beast. If we want a future where data serves us and not the other way around, we should JOIN forces (no pun intended) to educate our fellow Graphistas.

Your ideas and comments are most welcome. This concludes my ramblings.......for the moment. 

Tuesday, July 31, 2012

OpenNMS on Windows 2008 ; Notes

OpenNMS installation on Windows 2008 is a bit complicated since you need to install the dependencies (Java, Postgres). There is an IzPack bundle and it does streamline the process, but it feels less independent. I followed the guide to install OpenNMS on Windows on the OpenNMS wiki. The only thing that I would add to the guide is to install everything on the root path (C:\Java, C:\Postgres, C:\OpenNMS).

If you pay attention to the logs of Performing External Processing, you may encounter some errors/warnings. You can ignore warnings regarding JRRD (an older, not used component) and C-Iplike-Postgres (a stored procedure which would be helpful for good performance when you have a lot of nodes and are using Iplike extensively).

Before you start the bat file you may like to check the log4j.properties file to set a relevant debugging level for your daemons (you can find more about what these daemons are on the Wiki). The default debugging level is Warning, which is not adequate when setting up the system for the first time. But after the setup is done, anything more than Warning just takes extra resources.

The last step is to start OpenNMS with a bat file.  

#it is best to do the following in Windows PowerShell running as an Admin
#you need to navigate to your %OPENNMS_HOME%/bin directory first
./opennms.bat start
 
Make sure that there is no OpenNMS instance already running. Sometimes it may take a while for OpenNMS to start. You can check the console to see where OpenNMS stands at the moment. You can stop OpenNMS with the following command.
#you need to navigate to your %OPENNMS_HOME%/bin directory first
./opennms.bat stop
 
Day-to-day running and configuration of OpenNMS is very similar to MacOS/Linux and is the subject of other blog posts.


Configuring Basic SNMP on Linux, Mac and Windows

SNMP is my weapon of choice when it comes to network and device monitoring. It is present in almost every enterprise-grade network device. The thing I like most about SNMP on servers is that one does not have to install any fancy agents just to get some basic performance metrics. Although with more people using the likes of Puppet and Chef it seems that managing custom agents won't be a big problem. And usually you have to install and configure a daemon anyway, but that configuration is a one-time affair and could be part of your OS deployment (a pre-configured AMI for AWS for example).

Despite the acronym SNMP, which stands for "Simple Network Management Protocol", I don't believe that this protocol is at all simple. SNMP is an IETF standard and its current version is 3, but organizations all over the world are full of v2 and v1 devices. With multiple versions in the wild, security issues, and potential scalability issues due to polling, some question whether SNMP will survive the next decade. But for the moment SNMP is a good standard protocol and is very useful for day-to-day monitoring and management of devices. If you would like to learn more about SNMP I would recommend Essential SNMP 2nd Edition from O'Reilly. These videos are a useful intro to SNMP. [1, 2]

This blog entry will try to cover basic SNMP configuration on Linux, Mac and Windows. In my opinion it is OK to run the daemon with default settings in a LAN setting but it should not be exposed to the internet (read as good firewall housekeeping). It is best to use a non-default port and community name, but one should not rely on security through obscurity pattern for production machines in general. I will try to post on how to secure SNMP at a later stage.  

Net-SNMP on Linux/Mac

I believe that the most widely used agent/daemon for SNMP is Net-SNMP. It is packaged on Ubuntu 10.04 as snmpd (the version that I am using is 5.4.2.1; a bit old but it works). On Ubuntu I also like to install some other utilities along with snmpd:

apt-get install snmp-mibs-downloader scli snmptrapfmt snmp snmpd

SNMP 5.4.x is present on MacOS-X (Server version) so nothing to do here.

After snmpd is installed (you can confirm it by which snmpd) we need to configure it. You can either use the configuration shown below or use snmpconf to write your own. Note that the port and community are changed.

#listening on port 1161 instead of default 161
agentaddress  1161
disk  / 10%
load  12 12 12
master  yes
#note the community name is not public, also rocommunity can read everything under the sun
rocommunity  notsopublic
syscontact  "Shahzada Hatim <jhon.smith@usa.net>"
syslocation  "Somewhere in Solar System"
sysservices 12

I would highly recommend that you go through generation of configuration from snmpconf and read the comments of the generated file.
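As a quick sanity check on the advice about non-default port and community names, here is a minimal sketch of a lint function for a snmpd.conf file. The function name check_snmpd_conf is made up for this post (it is not part of Net-SNMP), and it only looks for the two defaults discussed here; it does not validate the rest of the file.

```shell
# check_snmpd_conf FILE: hypothetical helper, not part of Net-SNMP.
# Prints one warning per default setting found; the return code is the warning count.
check_snmpd_conf() {
    conf=$1
    warnings=0

    # A community line whose community string is the well-known "public" or "private"
    if grep -Eiq '^[[:space:]]*r[ow]community6?[[:space:]]+(public|private)([[:space:]]|$)' "$conf"; then
        echo "warning: default community name in use"
        warnings=$((warnings + 1))
    fi

    # Without an agentaddress line snmpd falls back to its default port 161
    if ! grep -Eiq '^[[:space:]]*agentaddress[[:space:]]' "$conf"; then
        echo "warning: no agentaddress line, default port 161 will be used"
        warnings=$((warnings + 1))
    fi

    return "$warnings"
}
```

Something like check_snmpd_conf /etc/snmp/snmpd.conf && echo "looks fine" would then only print "looks fine" when neither default is found.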

Next step is to start the snmp daemon. I have found it useful to start snmpd in shell to check things first.

snmpd -c /etc/snmp/snmpd.conf -f -Le

This gives me a chance to test snmpd rapidly as all errors are logged on the screen. But once snmpd is configured the way we like it, we can start it like we normally start services.
service snmpd start

For MacOS one can change the settings in Server Admin tool found in MacOS server or the following command to start snmp.

sudo launchctl load -w /System/Library/LaunchDaemons/org.net-snmp.snmpd.plist

Microsoft  SNMP

Microsoft has its own SNMP daemon and I used this wiki to enable it for Windows 2008. You can install the Windows version of Net-SNMP to get tools like snmpwalk etc, but I found that the Net-SNMP daemon does not report everything about the machine so I chose to use the Windows version.

SNMP Diagnostics Basics

So you got your SNMP daemon to work as you wished. Great. But how do you know if it works in the first place? You can either let your NMS do the magic or you can poll SNMP yourself. We do this by using the snmpwalk command. On Linux and MacOS this is probably already present, but on Windows you will have to install the Net-SNMP package to get this working.

For the above snmpd setting we can use the following command
snmpwalk -v 2c -c notsopublic localhost:1161
snmpwalk -Os  -v 2c -c notsopublic localhost:1161
snmpwalk -On  -v 2c -c notsopublic localhost:1161
You may not get everything that you are looking for in the snmp output in the default tree, which is why you may need to query a relevant OID.

For example
snmpwalk -Os  -v 2c -c notsopublic localhost:1161 .1.3.6.1.2.1.25.1.1.0
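The OID in this example is hrSystemUptime from HOST-RESOURCES-MIB, and its value is an SNMP TimeTicks counter, i.e. hundredths of a second. As a rough sketch, a small helper (the name ticks_to_uptime is my own, not a Net-SNMP tool) can turn the raw number into something readable:

```shell
# ticks_to_uptime TICKS: hypothetical helper that converts SNMP TimeTicks
# (hundredths of a second) into days, hours, minutes and seconds.
ticks_to_uptime() {
    seconds=$(( $1 / 100 ))
    printf '%dd %02dh %02dm %02ds\n' \
        $(( seconds / 86400 )) \
        $(( seconds % 86400 / 3600 )) \
        $(( seconds % 3600 / 60 )) \
        $(( seconds % 60 ))
}

ticks_to_uptime 36610000   # prints "4d 05h 41m 40s"
```

Assuming the snmpd settings above, the raw tick count could be fetched with something like snmpget -Oqvt -v 2c -c notsopublic localhost:1161 .1.3.6.1.2.1.25.1.1.0 and passed to the function.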
  
Happy data collecting :)

Tuesday, July 24, 2012

Taming Windows 2008 with Cygwin

I usually prefer the unix command line for my server needs. But sometimes one has to make do with what is available. At my current work I have to install OpenNMS on a Windows 2008 R2 machine.

Everything would work just fine with only the native Windows tools, but it makes life so much easier for me to have basic command line tools (like cat, grep, awk, sed, tail and other goodies), which frankly I think Microsoft should provide by default. Fortunately we have Cygwin at our disposal.

Installing Cygwin is rather easy. I just prefer to keep every thing on C:\cygwin. The packages that I usually install on top of stock Cygwin are
screen
unzip
vim
patchutils
gzip
bzip2
openssh
coreutils
findutils
rxvt
In previous versions of Cygwin I used rxvt and a neat tip to set up a working console. But now with Cygwin 1.7.x one gets mintty, which is much better (although it still does not compare to the likes of iTerm on Mac).

The next step that I like to do is to configure openssh as a service. It has worked OK for me in the past, but for some reason the sshd service shuts down when I try to edit something privileged (like /etc/profile). I am not sure of the reason for this behavior, but I would suggest keeping a Windows Remote Desktop session open. Of course it goes without saying that the firewall should allow incoming connections on port 22 if you want to use SSH from a remote system.


And it is always nice to have an apt-get-like interface for installing new packages (which I got working by following this simple guide).

Monday, July 23, 2012

Installing mysql-5.5 on Ubuntu 10.04 from generic linux package

Usually I don't mind installing packages from the distribution's own channels. But if that is not easy then I usually resort to binary packages. This is the approach I usually follow when I want Java and MySQL installed on my VPS machine.

One can usually find self-contained, detailed instructions for these installations, but in the rare instance when they don't work newcomers get frustrated and blame it on Linux. I believe that it should be the responsibility of the binary package provider to have complete instructions.

The recent installation of mysql-5.5.25 community version that I did needed the following extra steps right after extracting the MySQL distribution. I chose to do it in the /opt folder.

export MYSQL_HOME=/path/to/your/mysql
apt-get install libaio1

#symlink the error messages file
ln -sf "$MYSQL_HOME/share/english/errmsg.sys" /usr/share/errmsg.sys

#create symlinks for all binaries
for f in "$MYSQL_HOME"/bin/*; do ln -sf "$f" /usr/bin/; done
After this you should be able to continue with the binary installation.
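If you want to be a bit more careful with the symlink step, a minimal sketch is to wrap the loop in a function that only links executable files. The name link_binaries is made up for this post, and the destination directory is a parameter so the function can be tried on a scratch directory first.

```shell
# link_binaries SRC_DIR DEST_DIR: hypothetical helper that symlinks every
# executable regular file in SRC_DIR into DEST_DIR.
link_binaries() {
    src=$1
    dest=$2
    for f in "$src"/*; do
        # skip directories and non-executable files; this also covers an empty SRC_DIR
        [ -f "$f" ] && [ -x "$f" ] || continue
        ln -sf "$f" "$dest/"
    done
}
```

For the installation above that would be link_binaries "$MYSQL_HOME/bin" /usr/bin, run as root.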

Monday, July 16, 2012

OpenNMS on Ubuntu 10.04 - Installation

This is a guide to installing OpenNMS on Ubuntu 10.04. One might ask why I am using such an old version of Ubuntu, and the answer is that my current VPS box on HostEurope.de is on the said version of Ubuntu. You can check your version of Ubuntu with the following command:

cat /etc/lsb-release 
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=10.04
DISTRIB_CODENAME=lucid
DISTRIB_DESCRIPTION="Ubuntu 10.04.4 LTS"

I assume that the installation on any other apt-based Linux distribution/version of Ubuntu would be identical to this guide. In fact this is not even a full guide, as I am following the Debian install instructions and the QuickStart Guide for OpenNMS but with some differences.

The following steps would be helpful to any one who wants a production ready OpenNMS in no time. In my experience it is better to install in the following steps. It is assumed that the following steps are executed with a user who is part of sudo group.

Make sure there is no postgres installation/instance already present/running with the opennms database. If there is one already there then you can either use it or use the install process to nuke everything and start fresh.
#check if postgres is already installed and which version, skip this block if it is not installed
sudo dpkg-query -W -f='${Status} ${Version}\n' postgresql

#check if postgres is running
sudo /etc/init.d/postgresql-8.4 status

#if there is no instance running then you can start the database
sudo /etc/init.d/postgresql-8.4 start

#otherwise check if there are running processes for postgres, there should be some
sudo ps aux | grep postgres

#check if any thing is listening on port 5432 (default postgresql port), if there is no output then it means that postgres is running on non-default port. You will have to adjust your commands and opennms for the non-default port.
sudo netstat -ntlp | grep postgres

#check if there is an opennms database already present, we can back it up
sudo sudo -u postgres psql -c "select count(*) from pg_catalog.pg_database where datname = 'opennms'" ;

#backup postgres if it exists
sudo sudo -u postgres pg_dump opennms > /tmp/pg_dump_opennms_`date +%Y-%m-%d-%H-%M-%S-%Z`
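The netstat check above greps for the process name, so it says little about which port is in use. As a rough sketch, a small filter can look at the port column instead. The function name port_is_listening is my own invention, and it assumes the usual Linux netstat -ntl column layout (local address in column 4, state in column 6).

```shell
# port_is_listening PORT: hypothetical helper that reads `netstat -ntl` style
# output on stdin and succeeds if some socket is listening on PORT.
port_is_listening() {
    awk -v port="$1" '
        # column 4 is the local address, e.g. 127.0.0.1:5432 or :::5432
        $6 == "LISTEN" {
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}
```

It could be used as sudo netstat -ntl | port_is_listening 5432 && echo "something is listening on 5432".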
Install Postgresql and change the security configuration.
#purge any instances of postgresql
sudo apt-get --purge autoremove postgresql

#install postgresql (8.4 in my case)
sudo apt-get install postgresql 

#find the pg_hba.conf file and change md5 to trust on non-commented lines
sudo find /etc -name 'pg_hba.conf' -type f -exec sed -i '/^[[:space:]]*#/!s/md5/trust/g' {} \;
sudo find /etc -name 'pg_hba.conf' -type f -exec sed -i '/^[[:space:]]*#/!s/ident/trust/g' {} \;

#start the database
sudo /etc/init.d/postgresql-8.4 start

#create opennms database, doing sudo twice to avoid permissions error
sudo sudo -u postgres createdb -U postgres -E UTF-8 opennms --template=template0
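To see what the two sed expressions used on pg_hba.conf actually do before letting them loose on the real file, here is a small sketch that runs them on a made-up sample. The sample content is invented for illustration, and sed -i as used here assumes GNU sed (which is what Ubuntu ships).

```shell
# Build a small sample file instead of touching the real pg_hba.conf.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# local all all md5   (a comment: must stay untouched)
local   all   all                    ident
host    all   all   127.0.0.1/32     md5
EOF

# Same expressions as above: on lines that are NOT comments, swap the auth method.
sed -i -e '/^[[:space:]]*#/!s/md5/trust/g' \
       -e '/^[[:space:]]*#/!s/ident/trust/g' "$sample"

cat "$sample"
```

The comment line keeps its md5, while both active lines now end in trust.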

Choose the stable OpenNMS distribution and then add the repository and key as stated in the wiki. Then install OpenNMS. I make sure that all previous instances and configurations are purged or backed up.
#backup any important opennms configurations and data already there
sudo mv /etc/opennms /tmp/etc_opennms_`date +%Y-%m-%d-%H-%M-%S-%Z`
sudo mv /var/lib/opennms/ /tmp/var_lib_opennms_`date +%Y-%m-%d-%H-%M-%S-%Z`

#purge any opennms stuff
sudo apt-get --purge autoremove opennms 

#install opennms
sudo apt-get install opennms
The next step is optional. You can install a C/libtool-based stored procedure called iplike.
#install iplike
sudo /usr/sbin/install_iplike.sh

#check if iplike.la exists
sudo find /usr/lib/postgresql -name 'iplike.la' -type f
The next step is to configure Java. If you installed OpenNMS from apt-get you probably also got OpenJDK 6 installed along with it. You can also use your own Java distribution. You will also need to initialize the OpenNMS database.
#check version of openjdk
/usr/lib/jvm/java-6-openjdk/bin/java -version

#configure OpenNMS java
sudo /usr/share/opennms/bin/runjava -S /usr/lib/jvm/java-6-openjdk/bin/java

#initialize opennms database
sudo /usr/share/opennms/bin/install -dis
Of course the last step is to start OpenNMS
#start opennms
sudo /etc/init.d/opennms start
Make sure you log in to your OpenNMS instance and change your password. This is just a start from where we can configure OpenNMS to monitor aspects of our machines and applications via SNMP, JMX, Selenium and other adapters.
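OpenNMS can take a while to come up, so the web UI is not necessarily reachable the moment the init script returns. A minimal sketch of a retry helper follows. The name wait_for is made up for this post, and the command it retries is entirely up to you.

```shell
# wait_for TRIES DELAY COMMAND...: hypothetical helper that re-runs COMMAND
# until it succeeds, sleeping DELAY seconds between attempts.
wait_for() {
    tries=$1
    delay=$2
    shift 2
    while [ "$tries" -gt 0 ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "$delay"
    done
    return 1
}
```

Assuming curl is installed and the web UI is on its default port 8980, something like wait_for 30 10 curl -sf http://localhost:8980/opennms/ && echo "OpenNMS is up" would poll for up to five minutes.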