Load Freebase Dump into Virtuoso (SPARQL database)

Sept 3, 2013 update: Freebase is about to switch from its Turtle format to RDF triples. This post only works with the Freebase RDF Turtle format. See the discussion at !topic/freebase-discuss/AG5sl7K5KBE.

Automatically Back Up Projects or Data using Git

Two years ago my hosting domain crashed and I lost all my data and code. The sad news: I did not have a backup. I had spent many hours building my website, and all that effort went to waste. It was a good lesson in keeping my data backed up.

Hadoop Map-Reduce quick tutorial

ML-for-NLP discussion on Hadoop.

When to use Hadoop? If you have a large input file that you want to split, feed each split as input to a program (the mapper), and later process the output of each mapper with another program (the reducer), then Hadoop is for you.
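The split, map, and reduce flow described here can be sketched in plain Python. This is only a local, single-process illustration under my own assumptions, not Hadoop's API: the names split_input, mapper, reducer, and run_job are hypothetical, and word counting is just a stand-in task.

```python
from collections import defaultdict
from itertools import islice


def split_input(lines, split_size):
    """Cut the input into consecutive chunks of at most split_size lines."""
    it = iter(lines)
    while True:
        chunk = list(islice(it, split_size))
        if not chunk:
            return
        yield chunk


def mapper(chunk):
    """Emit a (word, 1) pair for every word in one split."""
    for line in chunk:
        for word in line.split():
            yield word.lower(), 1


def reducer(key, values):
    """Combine all values collected for one key."""
    return key, sum(values)


def run_job(lines, split_size=2):
    """Run the mapper on each split, group its output by key, then reduce."""
    grouped = defaultdict(list)
    for chunk in split_input(lines, split_size):
        for key, value in mapper(chunk):
            grouped[key].append(value)
    return dict(reducer(key, values) for key, values in sorted(grouped.items()))


if __name__ == "__main__":
    text = ["a rose is a rose", "is a rose", "a rose"]
    print(run_job(text))
```

In a real cluster the splits are processed in parallel on different machines and the grouping step is done by the framework; here everything runs in one loop so the data flow stays visible.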

latexdiff - find the differences between two latex revisions

ML-for-NLP discussion on latexdiff

Latexdiff is a tool for highlighting the differences between two LaTeX files. Usage is straightforward at a Unix bash prompt:

latexdiff old.tex new.tex > diff.tex
pdflatex diff.tex

Run latexdiff -h for additional options.

Simple LaTeX Sample for Proposals using a Powerful Editor

I am currently tutoring the Informatics Research Proposal course, whose aim is to write a neat research proposal. All my students plan to use LaTeX, so I decided to write a short, simple LaTeX sample that covers the basics required to start a good research proposal.

Differences between TreeTagger and Penn Treebank Tagset

TreeTagger is widely used for part-of-speech (POS) tagging, but some well-known language processing tools such as Malt Parser require a corpus tagged with the Penn Treebank (PTB) tagset. Though TreeTagger uses the PTB tagset, there are some major differences (I believe the TreeTagger tagset is more expressive than the PTB one).
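To give a feel for the kind of differences involved, here is a rough sketch of a tag converter. The mapping is my assumption based on the commonly distributed English TreeTagger tagset (separate VH* and VV* verb tags, SENT, NP/NPS, PP/PP$, IN/that) and is not a complete or authoritative table; the names DIRECT, to_ptb, and convert are hypothetical. Check it against the tagset documentation shipped with your TreeTagger parameter file before relying on it.

```python
# Assumed one-to-one rewrites from English TreeTagger tags to PTB tags.
DIRECT = {
    "SENT": ".",      # sentence-final punctuation
    "NP": "NNP",      # proper noun, singular
    "NPS": "NNPS",    # proper noun, plural
    "PP": "PRP",      # personal pronoun
    "PP$": "PRP$",    # possessive pronoun
    "IN/that": "IN",  # "that" as a complementizer
}


def to_ptb(tag):
    """Map one TreeTagger tag to its assumed PTB equivalent."""
    if tag in DIRECT:
        return DIRECT[tag]
    # TreeTagger separates "have" (VH*) and lexical verbs (VV*) from "be" (VB*);
    # PTB uses VB* for all verbs, so only the prefix changes.
    if tag[:2] in ("VH", "VV"):
        return "VB" + tag[2:]
    return tag  # all other tags are assumed to be shared


def convert(tagged_tokens):
    """Convert a list of (word, TreeTagger tag) pairs to PTB tags."""
    return [(word, to_ptb(tag)) for word, tag in tagged_tokens]


if __name__ == "__main__":
    sentence = [("She", "PP"), ("has", "VHZ"), ("seen", "VVN"), ("Paris", "NP"), (".", "SENT")]
    print(convert(sentence))
```

The conversion only works in this direction: collapsing VH* and VV* into VB* discards the distinction that makes the TreeTagger tagset more expressive.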

NLP Conference Deadlines

List of top NLP/AI conference deadlines. Courtesy: Bharat Ram Ambati

Try this too: Rochester NLP list is good

Indian Institute of Idiots - Current Schooling System in India

We worry about what a child will become tomorrow, yet we forget that he is someone today.

Chinese Language is interesting

These days I am working on many languages, such as Vietnamese, Dutch, and Thai, apart from Indian languages. Now it is the turn of Chinese. I have always been interested in knowing more about China, for several reasons: China is a neighboring country of mine, there have been recent political disturbances between China and India, and there is ever more news about China saving the world from recession, being a super emerging power, blah blah. I always wonder why there is such an economic gap between India and China.

How to make a personal webpage - do's and don'ts

Namaste. You are reading my first blog post, and this is my new website. For the past few days, I have been working hard on my website. It all started like this.

I felt it was time to change my website, which I made four years ago during my early days of interaction with a foreign object called a computer. Now that I am quite familiar with it, I planned to have a new personal web presence. The Web has evolved so rapidly over these past four years that I think I have almost no place in it. Anyhow, I don't want to lose any more ground, so here is my effort.

