Unstructured Data in Finance

It is sad to see many financial companies burn millions of dollars on Big Data solutions and then use them for silly things like keeping tabs on system logs. So I was wondering where Big Data can add value for a financial company.… Continue reading

Parallel Computing for Analytics

Distributed file systems have more or less solved the problem of storing data elastically across several systems and searching and retrieving it in parallel. But some of the problems related to the speed of analytics on structured data still remain. Most analytics algorithms have parallel implementations, but I haven’t come across a good solution that scales across a cluster of commodity servers to do complex analytics without user-side coding. … Continue reading

Myths about Analytics

Without further ado:

  • Analytics work involves funky predictions: No. Most of the work done by analytics service providers is data cleaning and data visualisation.
  • Complex algorithms yield better results: No. Regression (including its derivatives, such as logistic regression) is enough to cover 99% of analytics work.
  • Fast multi-core processors are needed for analytics: No. 
Continue reading

Death of Data Warehouses

Data warehouses are complex, clumsy, ugly animals that have thankfully become redundant. What is the use of a centralized warehouse when one can smoothly browse data across the organisation using various data access techniques? Bandwidth is cheap, processing is cheap and storage does not cost anything.… Continue reading