100% PowerPoint free! Learn at your own pace with 10 hours of video training and 15+ hours of in-depth exercises created by Matt Pouttu-Clarke and available exclusively through Harvard Innovation Labs on the Experfy Training Platform.
Learn to develop industrial-strength MapReduce applications hands-on, with tricks of the trade you will find nowhere else.
Apply advanced concepts such as Monte Carlo Simulations, Intelligent Hashing, Push Predicates, and Partition Pruning.
Learn to produce truly reusable User-Defined Functions (UDFs) that stand the test of time and work seamlessly with multiple Hadoop tools and distributions.
Learn the latest industry best practices on how to utilize Hadoop ecosystem tools such as Hive, Pig, Flume, Sqoop, and Oozie in an Enterprise context.
Add in the severe staffing problems Apache Spark faces because of its dependence on Scala, and Apache Beam seems to have all the bases covered to become the de facto streaming analytics API. The cool thing is that with Apache Beam you can switch runtime engines between Google Cloud Dataflow, Apache Spark, and Apache Flink. A generic streaming API like Beam also opens up the market for others to provide better and faster runtimes as drop-in replacements. Google is the perfect stakeholder because it is playing the cloud angle and doesn’t seem interested in supporting on-site deployments. Hats off to Google, and may the best Apache Beam runtime win!
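To make the "switch engines without rewriting the pipeline" idea concrete, here is a toy sketch in plain Python. It does not use the Beam SDK: the names `Runner`, `LocalRunner`, `ThreadedRunner`, `word_pipeline`, and `count_words` are all made up for illustration. The only point is that the pipeline is defined once, and the engine that executes it is a swappable argument.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Protocol

# Hypothetical model: a "pipeline" is just an ordered list of per-element transforms.
Transform = Callable[[str], str]


class Runner(Protocol):
    """Anything that can execute a list of transforms over some data."""

    def run(self, transforms: List[Transform], data: Iterable[str]) -> List[str]: ...


def apply_all(transforms: List[Transform], item: str) -> str:
    """Apply every transform to one element, in order."""
    for transform in transforms:
        item = transform(item)
    return item


class LocalRunner:
    """Executes the pipeline sequentially in the calling thread."""

    def run(self, transforms: List[Transform], data: Iterable[str]) -> List[str]:
        return [apply_all(transforms, item) for item in data]


class ThreadedRunner:
    """Executes the same pipeline on a thread pool (a stand-in for a 'faster' engine)."""

    def __init__(self, workers: int = 4) -> None:
        self.workers = workers

    def run(self, transforms: List[Transform], data: Iterable[str]) -> List[str]:
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            return list(pool.map(lambda item: apply_all(transforms, item), data))


def word_pipeline() -> List[Transform]:
    """The pipeline definition: written once, with no knowledge of the runner."""
    return [str.strip, str.lower]


def count_words(runner: Runner, lines: Iterable[str]) -> Counter:
    """Run the shared pipeline on the chosen runner, then count the cleaned words."""
    return Counter(runner.run(word_pipeline(), lines))
```

Calling `count_words(LocalRunner(), lines)` and `count_words(ThreadedRunner(), lines)` should give the same counts, which is the drop-in property the paragraph above is describing. In real Beam the runner is normally chosen through pipeline options rather than a constructor argument, so treat this as an analogy, not as Beam's API.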