Looking to build a big-data platform? Here is an infrastructure checklist of what it should include:
- Scalable structured DB (PostgreSQL is a good choice)
- Scalable unstructured DB (MongoDB)
- Streaming data (Apache Storm)
- In-memory data processing (Apache Spark SQL and RDDs)
- Machine learning (Spark MLlib)
Also Python and associated libraries: although it's not a big-data tool in itself, Python is a great language with readily available packages for interacting with most big-data tools.
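As a rough illustration of the Python point, here is a minimal sketch that checks which client packages for the checklist items are installed in the current environment. The package choices (`psycopg2`, `pymongo`, `streamparse`, `pyspark`) are my assumptions about commonly used clients, not recommendations from the list above, and the names `CLIENT_MODULES` and `available_clients` are hypothetical.

```python
from importlib.util import find_spec

# Checklist component -> importable Python module.
# These mappings are illustrative assumptions, not part of the original checklist.
CLIENT_MODULES = {
    "PostgreSQL": "psycopg2",
    "MongoDB": "pymongo",
    "Apache Storm": "streamparse",
    "Spark SQL, RDDs, MLlib": "pyspark",
}


def available_clients(modules=CLIENT_MODULES):
    """Return {component: bool}, True when the module can be found for import."""
    return {
        component: find_spec(name) is not None
        for component, name in modules.items()
    }


if __name__ == "__main__":
    for component, installed in available_clients().items():
        status = "installed" if installed else "missing"
        print(f"{component}: {status}")
```

Running the script prints one line per component, so it is a quick way to see which parts of the stack you can already reach from Python before wiring anything together.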