Processing Big Data with Azure HDInsight
Why this Book?
Hadoop has been the base for most of the emerging technologies in today’s big data world. It changed the face of distributed processing by using commodity hardware for large data sets. Until now, working with Hadoop and its ecosystem meant using Java, Scala, or Python, so developers coming from a .NET background had to learn one of these languages. Not anymore. This book focuses solely on .NET developers and uses C# as the base language. It covers Hadoop and its ecosystem components, such as Pig, Hive, Storm, HBase, and Spark, using C#. After reading this book, you, as a .NET developer, should be able to build end-to-end big data business solutions on the Azure HDInsight platform.
Azure HDInsight is Microsoft’s managed Hadoop-as-a-service offering in the cloud. Using HDInsight, you can get a fully configured Hadoop cluster up and running within minutes. The book focuses on the practical aspects of HDInsight and shows you how to use it to tackle real-world big data problems.
This book is for anyone who wants to get started with Azure HDInsight, understand its core fundamentals to modernize their business, or get more value out of their data. Anyone who wants a solid foundational knowledge of Azure HDInsight and the Hadoop ecosystem should take advantage of this book. The focus of the book appeals to the following two groups of readers.
To get the most out of this book, follow along with the sample code and run the hands-on programs directly in the Sandbox or in an Azure HDInsight environment.
About versions used in this book: Azure HDInsight changes very rapidly, and those changes arrive as Azure service updates. HDInsight is also a Hadoop distribution from Hortonworks, so new versions are introduced as they become available. The basics covered in this book will remain useful in upcoming versions.
221 pages, published in 2017
About the Author
Vinit Yadav is the founder and CEO of Veloxcore, a company that helps organizations leverage big data and machine learning. He and his team at Veloxcore are actively engaged in developing software solutions for their global customers using agile methodologies. He continues to build and deliver highly scalable big data solutions.
Vinit started working with Azure when it first came out in 2010, and since then, he has been continuously involved in designing solutions around the Microsoft Azure platform.
Vinit is also a machine learning and data science enthusiast, and a passionate programmer. He has more than 12 years of experience in designing and developing enterprise applications using various .NET technologies.
On a side note, he likes to travel, read, and watch sci-fi. He also loves to draw, paint, and create new things. Contact him on Twitter (@vinityad), by email, or on LinkedIn.
About the Technical Reviewer
Dattatrey Sindol (a.k.a. Datta) is a data enthusiast. He has worked in data warehousing, business intelligence, and data analytics for more than a decade. His primary focus is on Microsoft SQL Server, Microsoft Azure, Microsoft Cortana Intelligence Suite, and Microsoft Power BI. He also works in other technologies within Microsoft’s cloud and big data analytics space.
Currently, he is an architect at a leading digital transformation company in India. With his extensive experience in the data and analytics space, he helps customers solve real-world business problems and bring their data to life to gain valuable insights. He has published numerous articles and currently writes about his learnings on his blog at http://dattatreysindol.com.
You can follow him on Twitter (@dattatreysindol), connect with him on LinkedIn (https://www.linkedin.com/in/dattatreysindol), or contact him via email.