Looking to build a big data platform? Here is a checklist of the infrastructure you will need:

  • Scalable structured DB (PostgreSQL is a good choice)
  • Scalable unstructured DB (MongoDB)
  • Streaming data (Apache Storm)
  • In-memory data processing (Apache Spark SQL and RDDs)
  • Machine learning (Spark MLlib)

Also consider Python and its associated libraries: although it is not a big data tool in itself, Python is a great language with readily available packages for interacting with most big data tools.
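
As a rough illustration of that last point, here is a minimal sketch that maps each tool on the checklist to one Python package commonly used to talk to it, and reports which of them are installed. The package choices (psycopg2, pymongo, streamparse, pyspark) are my own picks for illustration, not the only options, and the names `CLIENTS` and `installed_clients` are hypothetical helpers.

```python
from importlib.util import find_spec

# Checklist tool -> one commonly used Python package for it.
# These pairings are illustrative assumptions; other client libraries exist.
CLIENTS = {
    "PostgreSQL": "psycopg2",
    "MongoDB": "pymongo",
    "Apache Storm": "streamparse",
    "Spark SQL / RDD / MLlib": "pyspark",
}


def installed_clients(clients=CLIENTS):
    """Return {tool: bool}, True when the mapped package can be imported."""
    return {tool: find_spec(pkg) is not None for tool, pkg in clients.items()}


if __name__ == "__main__":
    for tool, ok in installed_clients().items():
        print(f"{tool}: {'installed' if ok else 'missing'}")
```

Running the script prints one line per tool, which makes it a quick sanity check on a fresh machine before you start wiring the platform together.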
