Result: Mastering Apache Spark 2.x : Advanced Techniques in Complex Big Data Processing, Streaming Analytics and Machine Learning

Title:
Mastering Apache Spark 2.x : Advanced Techniques in Complex Big Data Processing, Streaming Analytics and Machine Learning
Authors:
Resource Type:
eBook.
Subjects:
Database:
eBook Index

Further Information

Advanced analytics on your Big Data with the latest Apache Spark 2.x

Key Features
• Master the art of real-time Big Data processing using Apache Spark 2.x
• Perform machine learning, deep learning and streaming data analytics by extending the most up-to-date functionalities of Apache Spark
• An advanced guide with a unique combination of tips, instructions and practical examples on using Apache Spark effectively

Book Description
Apache Spark is an in-memory, cluster-based Big Data processing system that provides a wide range of functionalities such as graph processing, machine learning, stream processing, and more. This book will take your knowledge of Apache Spark to the next level by teaching you how to expand Spark's functionality and build your data flows and machine/deep learning programs on top of the platform.

The book starts with a quick overview of the Apache Spark ecosystem and introduces you to the new features and capabilities in Apache Spark 2.x. You will then work with the different modules in Apache Spark, such as interactive querying with Spark SQL, using DataFrames and Datasets effectively, streaming analytics with Spark Streaming, and performing machine learning and deep learning on Spark using MLlib and external tools such as H2O and Deeplearning4j. The book also contains chapters on efficient graph processing, memory management and using Apache Spark on the cloud. By the end of this book, you will have all the necessary information to master Apache Spark and use it efficiently for Big Data processing and analytics.
What you will learn
• Get to grips with the newly introduced features in Apache Spark 2.x
• Perform highly optimised unified batch and real-time data processing using SparkSQL and Structured Streaming
• Evaluate large-scale Graph Processing and Analysis using GraphX and GraphFrames
• Perform advanced machine learning and deep learning with Spark MLlib, SparkML, SystemML, H2O and DeepLearning4J
• Learn how specific parameter settings affect overall performance of an Apache Spark cluster
• Apply Apache Spark in Elastic deployments using Jupyter and Zeppelin Notebooks, Docker, Kubernetes and the IBM Cloud

Who this book is for
If you are an intermediate-level Spark developer looking to master the advanced capabilities and use-cases of Apache Spark 2.x, this book is for you. Big Data professionals who wish to know how to integrate and use the features of Apache Spark to build a strong Big Data pipeline will also find this book to be a useful resource. A fundamental knowledge of Apache Spark and the Scala programming language is assumed.