289 pages, published in 2017
About the Author
My name is Frank Kane. I spent nine years at Amazon.com and IMDb.com, wrangling millions of customer ratings and transactions to produce things such as personalized recommendations for movies and products and "people who bought this also bought." I tell you, I wish we had had Apache Spark back then, during the years I spent trying to solve these problems. I hold 17 issued patents in the fields of distributed computing, data mining, and machine learning. In 2012, I left to start my own successful company, Sundog Software, which focuses on virtual reality environment technology and on teaching others about big data analysis.
We will do some really quick housekeeping here, just so you know where to put all the stuff for this book. First, I want you to go to your hard drive, create a new folder called SparkCourse, and put it in a place where you're going to remember it i