While Lisa is keeping a chimp at her house, she discovers that he knows sign language, and she and her friends hope to find a way to keep him from being taken to live at the zoo.
Building a simple but powerful recommendation system is much easier than you think. Approachable for all levels of expertise, this report explains innovations that make machine learning practical for business production settings--and demonstrates how even a small-scale development team can design an effective large-scale recommendation system. Apache Mahout committers Ted Dunning and Ellen Friedman walk you through a design that relies on careful simplification. You'll learn how to collect the right data, analyze it with an algorithm from the Mahout library, and then easily deploy the recommender using search technology, such as Apache Solr or Elasticsearch. Powerful and effective, this efficient combination does learning offline and delivers rapid response recommendations in real time.
- Understand the tradeoffs between simple and complex recommenders
- Collect user data that tracks user actions--rather than their ratings
- Predict what a user wants based on behavior by others, using Mahout for co-occurrence analysis
- Use search technology to offer recommendations in real time, complete with item metadata
- Watch the recommender in action with a music service example
- Improve your recommender with dithering, multimodal recommendation, and other techniques
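
The co-occurrence idea mentioned in this description can be illustrated with a minimal sketch. It is not the report's or Mahout's implementation: it scores items with plain co-occurrence counts instead of any statistical test a library would apply, and the function names and the tiny music-listening dataset are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_counts(histories):
    """Count how many users have each pair of items in their history."""
    counts = defaultdict(lambda: defaultdict(int))
    for items in histories.values():
        # Each unordered pair is counted once per user, in both directions.
        for a, b in combinations(sorted(set(items)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def recommend(user_items, counts, top_n=3):
    """Rank unseen items by their summed co-occurrence with the user's items."""
    seen = set(user_items)
    scores = defaultdict(int)
    for item in seen:
        for other, n in counts.get(item, {}).items():
            if other not in seen:
                scores[other] += n
    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical listening histories: user id -> genres played (actions, not ratings).
histories = {
    "u1": ["jazz", "blues", "soul"],
    "u2": ["jazz", "blues"],
    "u3": ["jazz", "rock"],
    "u4": ["blues", "soul"],
}
counts = cooccurrence_counts(histories)
print(recommend(["jazz"], counts))  # ['blues', 'rock', 'soul']
```

In the design the description outlines, the counting step would run offline, and the resulting item-to-item indicators would be served by a search engine rather than by an in-memory dictionary as here.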
Finding Data Anomalies You Didn't Know to Look For
Anomaly detection is the detective work of machine learning: finding the unusual, catching the fraud, discovering strange activity in large and complex datasets. But, unlike Sherlock Holmes, you may not know what the puzzle is, much less what "suspects" you're looking for. This O'Reilly report uses practical examples to explain how the underlying concepts of anomaly detection work. From banking security to natural sciences, medicine, and marketing, anomaly detection has many useful applications in this age of big data. And the search for anomalies will intensify once the Internet of Things spawns even more new types of data. The concepts described in this report will help you tackle anomaly detection in your own project.
- Use probabilistic models to predict what's normal and contrast that to what you observe
- Set an adaptive threshold to determine which data falls outside of the normal range, using the t-digest algorithm
- Establish normal fluctuations in complex systems and signals (such as an EKG) with a more adaptive probabilistic model
- Use historical data to discover anomalies in sporadic event streams, such as web traffic
- Learn how to use deviations in expected behavior to trigger fraud alerts
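
As a rough illustration of the adaptive-threshold idea, the sketch below flags a reading when it exceeds a high quantile of recently observed values. It is an assumption-laden stand-in, not the report's method: it computes an exact quantile over a small sliding window instead of using the t-digest algorithm, and the class name, parameters, and sample readings are hypothetical.

```python
from collections import deque


class AdaptiveThreshold:
    """Flag values above a quantile of the most recent observations."""

    def __init__(self, window=100, quantile=0.99, min_history=10):
        self.values = deque(maxlen=window)  # sliding window of past readings
        self.quantile = quantile
        self.min_history = min_history      # readings needed before flagging

    def threshold(self):
        # Exact quantile by sorting; a t-digest would approximate this
        # without keeping every value.
        ordered = sorted(self.values)
        idx = min(len(ordered) - 1, int(self.quantile * len(ordered)))
        return ordered[idx]

    def observe(self, x):
        """Return True if x exceeds the threshold learned so far, then record x."""
        is_anomaly = len(self.values) >= self.min_history and x > self.threshold()
        self.values.append(x)
        return is_anomaly


# Hypothetical sensor readings with one obvious spike.
readings = [10, 11, 9, 10, 12, 11, 10, 9, 11, 10, 10, 50, 11]
detector = AdaptiveThreshold(window=50, quantile=0.95)
flags = [detector.observe(x) for x in readings]
print([x for x, flagged in zip(readings, flags) if flagged])  # [50]
```

Because the threshold is recomputed from recent data, it moves as the definition of "normal" drifts, which is the property the description attributes to an adaptive threshold.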
If you're a business team leader, CIO, business analyst, or developer interested in how Apache Hadoop and Apache HBase-related technologies can address problems involving large-scale data in cost-effective ways, this book is for you. Using real-world stories and situations, authors Ted Dunning and Ellen Friedman show Hadoop newcomers and seasoned users alike how NoSQL databases and Hadoop can solve a variety of business and research issues. You'll learn about early decisions and pre-planning that can make the process easier and more productive. If you're already using these technologies, you'll discover ways to gain the full range of benefits possible with Hadoop. While you don't need a deep technical background to get started, this book does provide expert guidance to help managers, architects, and practitioners succeed with their Hadoop projects.
- Examine a day in the life of big data: India's ambitious Aadhaar project
- Review tools in the Hadoop ecosystem such as Apache's Spark, Storm, and Drill to learn how they can help you
- Pick up a collection of technical and strategic tips that have helped others succeed with Hadoop
- Learn from several prototypical Hadoop use cases, based on how organizations have actually applied the technology
- Explore real-world stories that reveal how MapR customers combine use cases when putting Hadoop and NoSQL to work, including in production
Many big data-driven companies today are moving to protect certain types of data against intrusion, leaks, or unauthorized eyes. But how do you lock down data while granting access to people who need to see it? In this practical book, authors Ted Dunning and Ellen Friedman offer two novel and practical solutions that you can implement right away. Ideal for both technical and non-technical decision makers, group leaders, developers, and data scientists, this book shows you how to:
- Share original data in a controlled way so that different groups within your organization only see part of the whole. You'll learn how to do this with the new open source SQL query engine Apache Drill.
- Provide synthetic data that emulates the behavior of sensitive data. This approach enables external advisors to work with you on projects involving data that you can't show them.
If you're intrigued by the synthetic data solution, explore the log-synth program that Ted Dunning developed as open source code (available on GitHub), along with how-to instructions and tips for best practice. You'll also get a collection of use cases. Providing lock-down security while safely sharing data is a significant challenge for a growing number of organizations. With this book, you'll discover new options to share data safely without sacrificing security.
More and more data-driven companies are looking to adopt stream processing and streaming analytics. With this concise ebook, you'll learn best practices for designing a reliable architecture that supports this emerging big-data paradigm. Authors Ted Dunning and Ellen Friedman (Real World Hadoop) help you explore some of the best technologies to handle stream processing and analytics, with a focus on the upstream queuing or message-passing layer. To illustrate the effectiveness of these technologies, this book also includes specific use cases. Ideal for developers and non-technical people alike, this book describes:
- Key elements in good design for streaming analytics, focusing on the essential characteristics of the messaging layer
- New messaging technologies, including Apache Kafka and MapR Streams, with links to sample code
- Technology choices for streaming analytics: Apache Spark Streaming, Apache Flink, Apache Storm, and Apache Apex
- How stream-based architectures are helpful to support microservices
- Specific use cases such as fraud detection and geo-distributed data streams
Ted Dunning is Chief Applications Architect at MapR Technologies, and active in the open source community. He currently serves as VP for Incubator at the Apache Foundation, as a champion and mentor for a large number of projects, and as committer and PMC member of the Apache ZooKeeper and Drill projects. Ted is on Twitter as @ted_dunning.
Ellen Friedman, a committer for the Apache Drill and Apache Mahout projects, is a solutions consultant and well-known speaker and author, currently writing mainly about big data topics. With a PhD in Biochemistry, she has years of experience as a research scientist and has written about a variety of technical topics. Ellen is on Twitter as @Ellen_Friedman.
Select your format based upon: 1) how you want to read your book, and 2) compatibility with your reading tool. To learn more about using Bookshare with your device, visit the Help Center.
Here is an overview of the specialized formats that Bookshare offers its members with links that go to the Help Center for more information.
- Bookshare Web Reader - a customized reading tool for Bookshare members offering all the features of DAISY with a single click of the "Read Now" link.
- DAISY (Digital Accessible Information System) - a digital book file format. DAISY books from Bookshare are DAISY 3.0 text files that work with just about every type of access technology that reads text. Books that contain images will have the download option of ‘DAISY Text with Images’.
- BRF (Braille Ready Format) - digital Braille for use with refreshable Braille devices and Braille embossers.
- MP3 (MPEG Audio Layer 3) - Provides audio only with no text. These books are created with a text-to-speech engine and spoken by Kendra, a high-quality synthetic voice from Ivona. Any device that supports MP3 playback is compatible.
- DAISY Audio - Similar to the DAISY 3.0 option above; however, this option uses MP3 files created with our text-to-speech engine, which uses Ivona's Kendra voice. This format will work with DAISY Audio compatible players such as Victor Reader Stream and Read2Go.