Agile Data Science: Building Full-Stack Data Analytics Applications with Spark

Building analytics products at scale requires a deep investment in people, machines, and time. How can you be sure you’re building the right models that people will pay for? With this hands-on book, you’ll learn a flexible toolset and methodology for building effective analytics applications with Spark.

Using lightweight tools such as Python, PySpark, Elastic MapReduce, MongoDB, ElasticSearch, Doc2vec, Deep Learning, D3.js, Leaflet, Docker and Heroku, your team will create an agile environment for exploring data, starting with an example application to mine flight data into an analytic product. You’ll learn an iterative approach that enables you to quickly change the kind of analysis you’re doing, depending on what the data is telling you. All example code in this book is available as working applications.

- Create analytics applications by using the Agile Data Science development methodology
- Build value from your data in a series of agile sprints, using the data-value pyramid
- Learn how to extract features for statistical models from a single dataset
- Visualize data with charts, and expose different aspects through interactive reports
- Use historical data to predict the future via classification and regression
- Translate predictions into actions
- Get feedback from users after each sprint to keep your project on track

Author: Russell Jurney
