It’s highly likely you will not need Hadoop

  • Do you have petabytes of data (1,000,000 GB)?
  • Are you willing to wait in long queues even before simple queries get answered?
  • Are your computational requirements embarrassingly parallel?

If the answer to all of the above questions is “yes”, then sure, go ahead with Hadoop.… Continue reading

BigData Sans BigInsights

There is a whole lot of bragging about how data is growing and changing the world. But ask any expert where it can be used, and most of them give examples of analytics and call those BigData use cases – predicting weather, calculating risk, doing trend analysis in diseases etc.… Continue reading

Analytics in a Box

Instead of selling data warehouses, analytics tools, reporting tools and analytics modeling services, data companies should build analytics as a product with industry-level customization and sell the product. The product should have all the layers: data, analytics and reporting. In fact, it will become even more attractive if the hardware is also bundled into the offering.… Continue reading

Three-way Disconnect

When it comes to modeling and analytics, I believe that there is a three-way disconnect between research institutions, technology companies and real businesses. When I say real businesses, I don’t include social media and other such new-age startups. I refer to companies providing tangible services.… Continue reading

Parallel Computing in Linux

This is a good introduction to parallel processing using Linux. It is slightly dated, but many of its principles still hold today.… Continue reading

Internet of Things

At last count, I have 5 devices at home that connect to the internet. I am planning to add a smart TV, a PlayStation and a new home theater, all of which will connect to the net. I have always been fascinated with the Internet of Things, but only now am I seeing it become a reality.… Continue reading

“There isn’t a better time to be an enterprise investor”

It is about time for a shift from B2C to B2B when it comes to startup investments. Companies require innovative technology and information solutions. Startups are best equipped to offer these. Hence it is time to invest in B2B startups. Investors are becoming weary of consumer startups with minuscule or no revenues.… Continue reading

The Bitcoin Swan Dive Was Utterly Predictable

Well said. With your permission, I am sharing it on my blog.… Continue reading

Scale of Technology

Again, I want to deviate slightly to share my world-view of technology. When I say technology, I primarily refer to information technology in this post. I’ve always been a believer in the enabling capacity of technology. This role has increased recently and is getting highly emphasized everywhere.… Continue reading

Unstructured Data in Finance

It is sad to see many financial companies burn millions of dollars on Big Data solutions and use them for silly things like keeping tabs on system logs. So I was wondering where Big Data can add value for a financial company.… Continue reading

Parallel Computing for Analytics

Distributed file systems have sort of solved the problem of elastically storing data across several systems and searching and retrieving it in parallel. But some of the problems related to the speed of analytics on structured data still remain. Most analytics algorithms have parallel implementations, but I haven’t come across a good solution that scales across a series of commodity servers to do complex analytics without user-side coding.… Continue reading

Myths about Analytics

Without further ado:

  • Analytics work involves funky predictions: No. Most of the work done by analytics service providers is data cleaning and data visualisation
  • Complex algorithms yield better results: No. Regression (including its derivatives like logistic regression) is enough to cover 99% of analytics work
  • Fast multi-core processors are needed for analytics: No. 
Continue reading