Browse Results

Apache OfBiz Cookbook

by Ruth Hoffman

The best way to experience OFBiz is to dive right in and start "kicking the tires". No matter if you are an end user exploring the out-of-the-box e-commerce web store or a software developer getting ready to build a new application, you will find, eventually, that you perform the same tasks over and over again. This book is designed as a reference to guide you through those oft-encountered OFBiz tasks. It is a collection of recipes, not necessarily in any particular order of importance, that address and give answers to many of the real-world questions asked about how to do things with OFBiz. If you are an OFBiz user who has some familiarity with enterprise software systems, and perhaps more importantly, Internet and Web exposure, you will be able to glean useful information from this book. You will need only basic knowledge of modern browser behavior (for example: how to click a mouse button) to follow some recipes, while others assume a passing familiarity with a text editor and XML documents. If you are a software developer looking for Java and/or Groovy examples, this book also includes a chapter on Java software development.

Apache OFBiz Development: The Beginner's Tutorial

by Rupert Howell Jonathon Wong

This is an accessible step-by-step tutorial that introduces readers to the world of OFBiz through practical examples and clear explanations. It will guide you through the framework, teach you to tweak OFBiz and master widgets, entities, and permissions, and give you the knowledge to customize your own bespoke applications. This book is for developers who want to build easily deployed and supported OFBiz applications. No previous knowledge of OFBiz is assumed, but readers should be comfortable in a Java development environment.

Apache Oozie: The Workflow Scheduler for Hadoop

by Mohammad Kamrul Islam Aravind Srinivasan

Get a solid grounding in Apache Oozie, the workflow scheduler system for managing Hadoop jobs. With this hands-on guide, two experienced Hadoop practitioners walk you through the intricacies of this powerful and flexible platform, with numerous examples and real-world use cases. Once you set up your Oozie server, you’ll dive into techniques for writing and coordinating workflows, and learn how to write complex data pipelines. Advanced topics show you how to handle shared libraries in Oozie, as well as how to implement and manage Oozie’s security capabilities.
* Install and configure an Oozie server, and get an overview of basic concepts
* Journey through the world of writing and configuring workflows
* Learn how the Oozie coordinator schedules and executes workflows based on triggers
* Understand how Oozie manages data dependencies
* Use Oozie bundles to package several coordinator apps into a data pipeline
* Learn about security features and shared library management
* Implement custom extensions and write your own EL functions and actions
* Debug workflows and manage Oozie’s operational details

Apache Oozie Essentials

by Jagat Jasjit Singh

Unleash the power of Apache Oozie to create and manage your big data and machine learning pipelines in one go
About This Book
* Teaches you everything you need to know to get started with Apache Oozie from scratch and manage your data pipelines effortlessly
* Learn to write data ingestion workflows with the help of real-life examples from the author's own personal experience
* Embed Spark jobs to run your machine learning models on top of Hadoop
Who This Book Is For
If you are an expert Hadoop user who wants to use Apache Oozie to handle workflows efficiently, this book is for you. This book will be handy to anyone who is familiar with the basics of Hadoop and wants to automate data and machine learning pipelines.
What You Will Learn
* Install and configure Oozie from source code on your Hadoop cluster
* Dive into the world of Oozie with Java MapReduce jobs
* Schedule Hive ETL and data ingestion jobs
* Import data from a database through Sqoop jobs in HDFS
* Create and process data pipelines with Pig and Hive scripts as per business requirements
* Run machine learning Spark jobs on Hadoop
* Create quick Oozie jobs using Hue
* Make the most of Oozie's security capabilities by configuring Oozie's security
In Detail
As more and more organizations are discovering the use of big data analytics, interest in platforms that provide storage, computation, and analytic capabilities is booming exponentially. This calls for data management. Hadoop caters to this need. Oozie fulfils this necessity for a scheduler for a Hadoop job by acting as a cron to better analyze data. Apache Oozie Essentials starts off with the basics right from installing and configuring Oozie from source code on your Hadoop cluster to managing your complex clusters. You will learn how to create data ingestion and machine learning workflows. This book is sprinkled with examples and exercises to help you take your big data learning to the next level. You will discover how to write workflows to run your MapReduce, Pig, Hive, and Sqoop scripts and schedule them to run at a specific time or for a specific business requirement using a coordinator. This book has engaging real-life exercises and examples to get you in the thick of things. Lastly, you'll get a grip on how to embed Spark jobs, which can be used to run your machine learning models on Hadoop. By the end of the book, you will have a good knowledge of Apache Oozie. You will be capable of using Oozie to handle large Hadoop workflows and even improve the availability of your Hadoop environment.
Style and approach
This book is a hands-on guide that explains Oozie using real-world examples. Each chapter is blended beautifully with fundamental concepts sprinkled in between case study solution algorithms and topped off with self-learning exercises.

Apache Over Libya

by Will Laidlaw

In this military memoir, an Army Air Corps pilot recounts his experience flying Apache helicopters behind enemy lines in the First Libyan Civil War. In May 2011, after a routine exercise in the Mediterranean, HMS Ocean and her fleet of Apache attack helicopters were about to head home. But the civil war in Libya and the resulting NATO air campaign intervened. Soon the author and his fellow Apache pilots were flying at night over hostile territory. Despite Libya's cutting-edge defense systems and land-to-air weapons, the Apaches made nightly raids at ultra-low level behind enemy lines. They had to fight their way into Libya and complete their mission before the hazardous return to Ocean. Apache Over Libya describes the experiences of eight Army and two Royal Navy pilots who played a significant role in the NATO-led campaign. Despite fighting the best-armed enemy British aircrew have faced in generations, they defied the odds and survived. Thrilling firsthand action accounts vividly convey what it means to fly the Apache in combat at sea and over enemy-held terrain. This unforgettable account gives a rare insight into attack helicopter operations in war.

Apache Pulsar in Action

by David Kjerrumgaard

Deliver lightning-fast and reliable messaging for your distributed applications with the flexible and resilient Apache Pulsar platform.
In Apache Pulsar in Action you will learn how to:
* Publish from Apache Pulsar into third-party data repositories and platforms
* Design and develop Apache Pulsar functions
* Perform interactive SQL queries against data stored in Apache Pulsar
Apache Pulsar in Action is a comprehensive and practical guide to building high-traffic applications with Pulsar. You’ll learn to use this mature and battle-tested platform to deliver extreme levels of speed and durability to your messaging. Apache Pulsar committer David Kjerrumgaard teaches you to apply Pulsar’s seamless scalability through hands-on case studies, including IoT analytics applications and a microservices app based on Pulsar functions. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.
About the technology
Reliable server-to-server messaging is the heart of a distributed application. Apache Pulsar is a flexible real-time messaging platform built to run on Kubernetes and deliver the scalability and resilience required for cloud-based systems. Pulsar supports both streaming and message queuing, and unlike other solutions, it can communicate over multiple protocols including MQTT, AMQP, and Kafka’s binary protocol.
About the book
Apache Pulsar in Action teaches you to build scalable streaming messaging systems using Pulsar. You’ll start with a rapid introduction to enterprise messaging and discover the unique benefits of Pulsar. Following crystal-clear explanations and engaging examples, you’ll use the Pulsar Functions framework to develop a microservices-based application. Real-world case studies illustrate how to implement the most important messaging design patterns.
What's inside
* Publish from Pulsar into third-party data repositories and platforms
* Design and develop Apache Pulsar functions
* Create an event-driven food delivery application
About the reader
Written for experienced Java developers. No prior knowledge of Pulsar required.
About the author
David Kjerrumgaard is a committer on the Apache Pulsar project. He currently serves as a Developer Advocate for StreamNative, where he develops Pulsar best practices and solutions.
Table of Contents
PART 1 GETTING STARTED WITH APACHE PULSAR
1 Introduction to Apache Pulsar
2 Pulsar concepts and architecture
3 Interacting with Pulsar
PART 2 APACHE PULSAR DEVELOPMENT ESSENTIALS
4 Pulsar functions
5 Pulsar IO connectors
6 Pulsar security
7 Schema registry
PART 3 HANDS-ON APPLICATION DEVELOPMENT WITH APACHE PULSAR
8 Pulsar Functions patterns
9 Resiliency patterns
10 Data access
11 Machine learning in Pulsar
12 Edge analytics

Apache Reservation: Indigenous Peoples & the American State

by Richard J. Perry

“Indian reservations” were the United States’ ultimate solution to the “problem” of what to do with native peoples who already occupied the western lands that Anglo settlers wanted. In this broadly inclusive study, Richard J. Perry considers the historical development of the reservation system and its contemporary relationship to the American state, with comparisons to similar phenomena in Canada, Australia, and South Africa. The San Carlos Apache Reservation of Arizona provides the lens through which Perry views reservation issues. One of the oldest and largest reservations, its location in a minerals- and metals-rich area has often brought it into conflict with powerful private and governmental interests. Indeed, Perry argues that the reservation system is best understood in terms of competition for resources among interest groups through time within the hegemony of the state. He asserts that full control over their resources—and hence, over their lives—would address many of the Apache’s contemporary economic problems.

Apache Reservation: Indigenous Peoples and the American State

by Richard J. Perry

"Indian reservations" were the United States' ultimate solution to the "problem" of what to do with native peoples who already occupied the western lands that Anglo settlers wanted. In this broadly inclusive study, Richard J. Perry considers the historical development of the reservation system and its contemporary relationship to the American state, with comparisons to similar phenomena in Canada, Australia, and South Africa. The San Carlos Apache Reservation of Arizona provides the lens through which Perry views reservation issues. One of the oldest and largest reservations, its location in a minerals- and metals-rich area has often brought it into conflict with powerful private and governmental interests. Indeed, Perry argues that the reservation system is best understood in terms of competition for resources among interest groups through time within the hegemony of the state. He asserts that full control over their resources-and hence, over their lives-would address many of the Apache's contemporary economic problems.

Apache Resistance: Causes And Effects Of Geronimo's Campaign (Cause And Effect: American Indian History Ser.)

by Pamela Dell

The Apache of the American Southwest had long been in conflict with Mexican and U.S. soldiers and settlers by the time Geronimo began resisting these forces. The Apache warrior and his followers spent decades fighting to remain free and in control of their vast lands. The last stage of the long-running resistance began about 1877 when U.S. troops rounded up the Apache and moved them to a reservation. Unable to tolerate life there, Geronimo and his followers escaped several times, fleeing to the hills and their traditional ways. Each time they were captured and brought back. Geronimo surrendered for the last time in 1886 and Apache resistance collapsed. How would it affect the lives of the Apache and change the United States?

Apache Roller 4.0 – Beginner's Guide

by Alfonso Romero

This beginner's guide is packed with information, tips, and tricks, based on the author's extensive experience with Apache Roller. In next to no time, you will be able to build and deploy your own blog. The clear and concise hands-on exercises will teach you everything you need to know to install, configure, and use Apache Roller, along with the open source software required to run it. The book includes plenty of illustrations to guide you through all the detailed exercises and tutorials, so you can get the most out of every chapter. If you are interested in establishing a blog, using Apache Roller and popular web applications to write attractive posts and promote your blog on all the major social bookmarking services, this book is for you. No previous experience with Tomcat, MySQL, the Apache Web Server, or Linux is required.

Apache Solr: A Practical Approach to Enterprise Search

by Dikshant Shahi

Build an enterprise search engine using Apache Solr: index and search documents; ingest data from varied sources; apply various text processing techniques; utilize different search capabilities; and customize Solr to retrieve the desired results. Apache Solr: A Practical Approach to Enterprise Search explains each essential concept--backed by practical and industry examples--to help you attain expert-level knowledge. The book, which assumes a basic knowledge of Java, starts with an introduction to Solr, followed by steps to setting it up, indexing your first set of documents, and searching them. It then introduces you to information retrieval and its implementation in Apache Solr; this will help you understand your search problem, decide the approach to build an effective solution, and use various metrics to evaluate the results. The book next covers the schema design and techniques to build a text analysis chain for cleansing, normalizing and enriching your documents and addressing different types of search queries. It describes various popular matching techniques which are generally applied to improve the precision and recall of searches. You will learn the end-to-end process of data ingestion from varied sources, metadata extraction, pre-processing and transformation of content, various search components, query parsers and other advanced search capabilities. After covering out-of-the-box features, Solr expert Dikshant Shahi dives into ways you can customize Solr for your business and its specific requirements, along with ways to plug in your own components. Most important, you will learn about implementations for Solr scoring, factors affecting the document score, and tuning the score for the application at hand. The book explains why textual scoring is not sufficient for practical ranking of documents and ways to integrate real-world factors for contributing to the document ranking. 
You'll see how to influence user experience by providing suggestions and recommendations. You'll also see integration of Solr with important related technologies such as OpenNLP and Tika. Additionally, you will learn about scaling Solr using SolrCloud. This book concludes with coverage of semantic search capabilities, which are crucial for taking the search experience to the next level. By the end of Apache Solr, you will be proficient in designing and developing your search engine.
What you'll learn
* How to develop a search engine using Solr
* How to implement information retrieval concepts
* How to master search engine internals
* How to build your search strategy
* How to customize Solr for your unique search problem
* How to make your search engine intelligent and self-learning
Who this book is for
Professionals in data mining, data management, or Web development.
Table of Contents
Chapter 1: Introduction to Search Engines and Solr
Chapter 2: Solr Setup and Administration
Chapter 3: Understanding Information Retrieval
Chapter 4: Search Strategy and Schema Design
Chapter 5: Indexing Data
Chapter 6: Searching Solr
Chapter 7: Advanced Querying
Chapter 8: Customizing Document Ranking
Chapter 9: Going Beyond Text Matching
Chapter 10: SolrCloud
Chapter 11: Practical Search Examples

Apache Solr 3.1 Cookbook

by Rafal Kuc

This book is part of Packt's Cookbook series; each chapter looks at a different aspect of working with Apache Solr. The recipes deal with common problems of working with Solr by using easy-to-understand, real-life examples. The book is not in any way a complete Apache Solr reference and you should see it as a helping hand when things get rough on your journey with Apache Solr. Developers who are working with Apache Solr and would like to know how to combat common problems will find this book of great use. Knowledge of Apache Lucene would be a bonus but is not required.

Apache Solr 3 Enterprise Search Server

by Eric Pugh David Smiley

The book is written as a reference guide. It includes fully working examples based on a real-world public data set. This book is for developers who want to learn how to use Apache Solr in their applications. Only basic programming skills are needed.

Apache Solr 4 Cookbook

by Rafal Kuc

Apache Solr 4 Cookbook is written in a helpful, practical style with numerous hands-on recipes to help you master Apache Solr to get more precise search results and analysis, higher performance, and reliability. This book is for developers who wish to learn how to master Apache Solr 4. This book will specifically appeal to developers who wish to quickly get to grips with the changes and new features of Apache Solr 4. This book is also handy as a practical guide to solving common problems and issues when using Apache Solr.

Apache Solr Beginner's Guide

by Alfredo Serafini

Written in a friendly, example-driven format, the book includes plenty of step-by-step instructions and examples that are designed to help you get started with Apache Solr. This book is an entry-level text into the wonderful world of Apache Solr. The book will center around a couple of simple projects such as setting up Solr and all the stuff that comes with customizing the Solr schema and configuration. This book is for developers looking to start using Apache Solr who are stuck or intimidated by the difficulty of setting it up and using it. For anyone wanting to embed a search engine in their site to help users navigate around the mammoth data available, this book is an ideal starting point. Moreover, if you are a data architect or a project manager and want to make some key design decisions, you will find that every example included in the book contains ideas usable in real-world contexts.

Apache Solr Enterprise Search Server - Third Edition

by David Smiley

This book is for developers who want to learn how to get the most out of Solr in their applications, whether you are new to the field, have used Solr but don't know everything, or simply want a good reference. It would be helpful to have some familiarity with basic programming concepts, but no prior experience is required.

Apache Solr Essentials

by Andrea Gazzarini

If you are a competent developer with experience of working with technologies similar to Apache Solr and want to develop efficient search applications, then this book is for you. Familiarity with the Java programming language is required.

Apache Solr for Indexing Data

by Sachin Handiekar Anshul Johri

Enhance your Solr indexing experience with advanced techniques and the built-in functionalities available in Apache Solr
About This Book
* Learn about distributed indexing and real-time optimization to change index data on the fly
* Index data from various sources and web crawlers using built-in analyzers and tokenizers
* This step-by-step guide is packed with real-life examples on indexing data
Who This Book Is For
This book is for developers who want to increase their experience of indexing in Solr by learning about the various index handlers, analyzers, and methods available in Solr. Beginner-level Solr development skills are expected.
What You Will Learn
* Get to know the basic features of Solr indexing and the analyzers/tokenizers available
* Index XML/JSON data in Solr using the HTTP Post tool and the cURL command
* Work with Data Import Handler to index data from a database
* Use Apache Tika with Solr to index Word documents, PDFs, and much more
* Utilize Apache Nutch and Solr integration to index crawled data from web pages
* Update indexes in real-time data feeds
* Discover techniques to index multi-language and distributed data in Solr
* Combine the various indexing techniques into a real-life working example of an online shopping web application
In Detail
Apache Solr is a widely used, open source enterprise search server that delivers powerful indexing and searching features. These features help fetch relevant information from various sources and documentation. Solr also combines with other open source tools such as Apache Tika and Apache Nutch to provide more powerful features. This fast-paced guide starts by helping you set up Solr and get acquainted with its basic building blocks, to give you a better understanding of Solr indexing. You'll quickly move on to indexing text and boosting the indexing time. Next, you'll focus on basic indexing techniques, various index handlers designed to modify documents, and indexing a structured data source through Data Import Handler. Moving on, you will learn techniques to perform real-time indexing and atomic updates, as well as more advanced indexing techniques such as de-duplication. Later on, we'll help you set up a cluster of Solr servers that combine fault tolerance and high availability. You will also gain insights into working scenarios of different aspects of Solr and how to use Solr with e-commerce data. By the end of the book, you will be competent and confident working with indexing and will have a good knowledge base to efficiently program elements.
Style and approach
This fast-paced guide is packed with examples that are written in an easy-to-follow style, and are accompanied by detailed explanation. Working examples are included to help you get better results for your applications.

Apache Solr High Performance

by Surendra Mohan

This book is an easy-to-follow guide, full of hands-on, real-world examples. Each topic is explained and demonstrated in a specific and user-friendly flow, from search optimization using Solr to deployment of ZooKeeper applications. This book is ideal for Apache Solr developers who want to learn different techniques to optimize Solr performance with utmost efficiency, along with effectively troubleshooting the problems that usually occur while trying to boost performance. Familiarity with search servers and database querying is expected.

Apache Solr PHP Integration

by Jayant Kumar

This book is full of step-by-step example-oriented tutorials which will show readers how to integrate Solr in PHP applications using the available libraries, and boost the inherent search facilities that Solr offers. If you are a developer who knows PHP and is interested in integrating search into your applications, this is the book for you. No advanced knowledge of Solr is required. Very basic knowledge of system commands and the command-line interface on both Linux and Windows is required. You should also be familiar with the concept of Web servers.

Apache Solr Search Patterns

by Jayant Kumar

This book is for developers who already know how to use Solr and are looking for advanced strategies for improving their search using Solr. This book is also for people who work with analytics to generate graphs and reports using Solr. Moreover, if you are a search architect who is looking to scale your search using Solr, this is a must-have book for you. It would be helpful if you are familiar with the Java programming language.

Apache Spark 2 for Beginners

by Rajanarayanan Thottuvaikkatumana

Develop large-scale distributed data processing applications using Spark 2 in Scala and Python
About This Book
* This book offers an easy introduction to the Spark framework published on the latest version of Apache Spark 2
* Perform efficient data processing, machine learning and graph processing using various Spark components
* A practical guide aimed at beginners to get them up and running with Spark
Who This Book Is For
If you are an application developer, data scientist, or big data solutions architect who is interested in combining the data processing power of Spark from R, and consolidating data processing, stream processing, machine learning, and graph processing into one unified and highly interoperable framework with a uniform API using Scala or Python, this book is for you.
What You Will Learn
* Get to know the fundamentals of Spark 2 and the Spark programming model using Scala and Python
* Know how to use Spark SQL and DataFrames using Scala and Python
* Get an introduction to Spark programming using R
* Perform Spark data processing, charting, and plotting using Python
* Get acquainted with Spark stream processing using Scala and Python
* Be introduced to machine learning using Spark MLlib
* Get started with graph processing using the Spark GraphX
* Bring together all that you've learned and develop a complete Spark application
In Detail
Spark is one of the most widely-used large-scale data processing engines and runs extremely fast. It is a framework that has tools that are equally useful for application developers as well as data scientists. This book starts with the fundamentals of Spark 2 and covers the core data processing framework and API, installation, and application development setup. Then the Spark programming model is introduced through real-world examples followed by Spark SQL programming with DataFrames. An introduction to SparkR is covered next. Later, we cover the charting and plotting features of Python in conjunction with Spark data processing. After that, we take a look at Spark's stream processing, machine learning, and graph processing libraries. The last chapter combines all the skills you learned from the preceding chapters to develop a real-world Spark application. By the end of this book, you will have all the knowledge you need to develop efficient large-scale applications using Apache Spark.
Style and approach
Learn about Spark's infrastructure with this practical tutorial. With the help of real-world use cases on the main features of Spark we offer an easy introduction to the framework.

Apache Spark 2.x Cookbook

by Rishi Yadav

Over 70 recipes to help you use Apache Spark as your single big data computing platform and master its libraries
About This Book
• This book contains recipes on how to use Apache Spark as a unified compute engine
• Covers how to connect various source systems to Apache Spark
• Covers various parts of machine learning including supervised/unsupervised learning & recommendation engines
Who This Book Is For
This book is for data engineers, data scientists, and those who want to implement Spark for real-time data processing. Anyone who is using Spark (or is planning to) will benefit from this book. The book assumes you have a basic knowledge of Scala as a programming language.
What You Will Learn
• Install and configure Apache Spark with various cluster managers & on AWS
• Set up a development environment for Apache Spark including Databricks Cloud notebook
• Find out how to operate on data in Spark with schemas
• Get to grips with real-time streaming analytics using Spark Streaming & Structured Streaming
• Master supervised learning and unsupervised learning using MLlib
• Build a recommendation engine using MLlib
• Graph processing using GraphX and GraphFrames libraries
• Develop a set of common applications or project types, and solutions that solve complex big data problems
In Detail
While Apache Spark 1.x gained a lot of traction and adoption in the early years, Spark 2.x delivers notable improvements in the areas of API, schema awareness, performance, Structured Streaming, and simplifying building blocks to build better, faster, smarter, and more accessible big data applications. This book uncovers all these features in the form of structured recipes to analyze and mature large and complex sets of data. Starting with installing and configuring Apache Spark with various cluster managers, you will learn to set up development environments. Further on, you will be introduced to working with RDDs, DataFrames and Datasets to operate on schema-aware data, and real-time streaming with various sources such as Twitter Stream and Apache Kafka. You will also work through recipes on machine learning, including supervised learning, unsupervised learning & recommendation engines in Spark. Last but not least, the final few chapters delve deeper into the concepts of graph processing using GraphX, securing your implementations, cluster optimization, and troubleshooting.
Style and approach
This book is packed with intuitive recipes supported with line-by-line explanations to help you understand Spark 2.x's real-time processing capabilities and deploy scalable big data solutions. This is a valuable resource for data scientists and those working on large-scale data projects.

Apache Spark 2.x for Java Developers

by Sourav Gulati Sumit Kumar

Unleash the data processing and analytics capability of Apache Spark with the language of choice: Java About This Book • Perform big data processing with Spark—without having to learn Scala! • Use the Spark Java API to implement efficient enterprise-grade applications for data processing and analytics • Go beyond mainstream data processing by adding querying capability, Machine Learning, and graph processing using Spark Who This Book Is For If you are a Java developer interested in learning to use the popular Apache Spark framework, this book is the resource you need to get started. Apache Spark developers who are looking to build enterprise-grade applications in Java will also find this book very useful. What You Will Learn • Process data using different file formats such as XML, JSON, CSV, and plain and delimited text, using the Spark core Library. • Perform analytics on data from various data sources such as Kafka, and Flume using Spark Streaming Library • Learn SQL schema creation and the analysis of structured data using various SQL functions including Windowing functions in the Spark SQL Library • Explore Spark Mlib APIs while implementing Machine Learning techniques to solve real-world problems • Get to know Spark GraphX so you understand various graph-based analytics that can be performed with Spark In Detail Apache Spark is the buzzword in the big data industry right now, especially with the increasing need for real-time streaming and data processing. While Spark is built on Scala, the Spark Java API exposes all the Spark features available in the Scala version for Java developers. This book will show you how you can implement various functionalities of the Apache Spark framework in Java, without stepping out of your comfort zone. The book starts with an introduction to the Apache Spark 2.x ecosystem, followed by explaining how to install and configure Spark, and refreshes the Java concepts that will be useful to you when consuming Apache Spark's APIs. 
You will explore RDDs and their common action and transformation Java APIs, set up a production-like clustered environment, and work with Spark SQL. Moving on, you will perform near-real-time processing with Spark Streaming, Machine Learning analytics with Spark MLlib, and graph processing with GraphX, all using various Java packages. By the end of the book, you will have a solid foundation in implementing components in the Spark framework in Java to build fast, real-time applications. Style and approach This practical guide teaches readers the fundamentals of the Apache Spark framework and how to implement components using the Java language. It is a unique blend of theory and practical examples, and is written in a way that will gradually build your knowledge of Apache Spark.

Apache Spark 2.x Machine Learning Cookbook

by Siamak Amirghodsi Meenakshi Rajendran Broderick Hall Shuen Mei

Simplify machine learning model implementations with Spark About This Book • Solve the day-to-day problems of data science with Spark • This unique cookbook consists of exciting and intuitive numerical recipes • Optimize your work by acquiring, cleaning, analyzing, predicting, and visualizing your data Who This Book Is For This book is for Scala developers who have a fairly good exposure to and understanding of machine learning techniques but lack practical experience implementing them with Spark. A solid knowledge of machine learning algorithms is assumed, as well as hands-on experience of implementing ML algorithms with Scala. However, you do not need to be acquainted with the Spark ML libraries and ecosystem. What You Will Learn • Get to know how Scala and Spark go hand-in-hand for developers when developing ML systems with Spark • Build a recommendation engine that scales with Spark • Find out how to build unsupervised clustering systems to classify data in Spark • Build machine learning systems with the Decision Tree and Ensemble models in Spark • Deal with the curse of high-dimensionality in big data using Spark • Implement text analytics for search engines in Spark • Implement streaming machine learning systems using Spark In Detail Machine learning aims to extract knowledge from data, relying on fundamental concepts in computer science, statistics, probability, and optimization. Learning about algorithms enables a wide range of applications, from everyday tasks such as product recommendations and spam filtering to cutting-edge applications such as self-driving cars and personalized medicine. You will gain hands-on experience of applying these principles using Apache Spark, a resilient cluster computing system well suited for large-scale machine learning tasks. This book begins with a quick overview of setting up the necessary IDEs to facilitate the execution of code examples that will be covered in various chapters.
It also highlights some key issues developers face while working with machine learning algorithms on the Spark platform. We progress by uncovering the various Spark APIs and implementing ML algorithms to develop classification systems, recommendation engines, text analytics, clustering, and learning systems. Toward the final chapters, we'll focus on building high-end applications and explain various unsupervised methodologies and the challenges to tackle when implementing big data ML systems. Style and approach This book is packed with intuitive recipes supported with line-by-line explanations to help you understand how to optimize your workflow and resolve problems when working with complex data modeling tasks and predictive algorithms. This is a valuable resource for data scientists and those working on large-scale data projects.
