Browse Results

Showing 15,551 through 15,575 of 61,985 results

Data Governance and Data Management: Contextualizing Data Governance Drivers, Technologies, and Tools

by Rupa Mahanti

This book delves into the concept of data as a critical enterprise asset needed for informed decision making, compliance, regulatory reporting and insights into trends, behaviors, performance and patterns. With good data being key to staying ahead in a competitive market, enterprises capture and store exponential volumes of data. Considering the business impact of data, there needs to be adequate management around it to derive the best value. Data governance is one of the core data management related functions. However, it is often overlooked, misunderstood or confused with other terminologies and data management functions. Given the pervasiveness and importance of data, this book provides a comprehensive understanding of the business drivers for and benefits of data governance; the interactions of the data governance function with other data management functions; the components and aspects of data governance that can be facilitated by technology and tools; the distinction between data management tools and data governance tools; the readiness checks to perform before exploring the market to purchase a data governance tool; the aspects that must be considered when comparing and selecting appropriate data governance technologies and tools from the large number of options available in the marketplace; and the market players that provide tools for supporting data governance. This book combines the data and data governance knowledge that the author has gained over years of working in different industrial and research programs and projects associated with data, processes and technologies with unique perspectives gained through interviews with thought leaders and data experts. This book is highly beneficial for IT students, academicians, information management and business professionals and researchers to enhance their knowledge and get guidance on implementing data governance in their own data initiatives.

Data Governance and the Digital Economy in Asia: Harmonising Cross-Border Data Flows (Routledge Studies in the Modern World Economy)

by Paul Cheung Liu Jingting Ulrike Sengstschmid

Data governance is the cornerstone of digital economy growth, particularly in Asia, where both the digital economy and the policy space are fast expanding. The chapters collected in this volume delve into how diverse and rapidly evolving data governance models of ASEAN countries and their Asian partners are shaping the regional digital economy integration, particularly through cross-border data flows. The book begins with an examination of the diffusion of data governance rules globally and their economic impacts on a macro level. It then delves into a regional analysis, emphasising the interplay between data governance and economic development. Key discussions include data policies in India, China, South Korea, and ASEAN countries, enriched with insights from industry leaders. The book evaluates the role of regional and international trade agreements in facilitating digital trade and explores the consequences of widely differing data governance models for the ASEAN regional economy, with a special focus on implications for ASEAN’s Digital Economy Framework Agreement. Written for scholars of digital economy, data governance, and digital trade, this book provides a thorough understanding of Asia’s data regulatory environment. Policymakers and industry professionals will also find the book’s insights into the intricacies of digital economy policies and their implications useful in navigating the future of digital economic integration and growth in the ASEAN region.

Data Governance for Managers: The Driver of Value Stream Optimization and a Pacemaker for Digital Transformation (Management for Professionals)

by Lars Michael Bollweg

Professional data management is the foundation for the successful digital transformation of traditional companies. Unfortunately, many companies fail to implement data governance because they do not fully understand the complexity of the challenge (organizational structure, employee empowerment, change management, etc.) and therefore do not include all aspects in the planning and implementation of their data governance. This book explains the driving role that a responsive data organization can play in a company's digital transformation. Using proven process models, the book takes readers from the basics, through planning and implementation, to regular operations and measuring the success of data governance. All the important decision points are highlighted, and the advantages and disadvantages are discussed in order to identify digitization potential, implement it in the company, and develop customized data governance. The book will serve as a useful guide for interested newcomers as well as for experienced managers.

Data Governance: A Guide

by Dimitrios Sargiotis

This book is a comprehensive resource designed to demystify the complex world of data governance for professionals across various sectors. This guide provides in-depth insights, methodologies, and best practices to help organizations manage their data effectively and securely. It covers essential topics such as data quality, privacy, security, and management, ensuring that readers gain a holistic understanding of how to establish and maintain a robust data governance framework. Through a blend of theoretical knowledge and practical applications, this book addresses the challenges and benefits of data governance, equipping readers with the tools needed to navigate the evolving data landscape. In addition to foundational principles, this book explores real-world case studies that illustrate the tangible benefits and common pitfalls of implementing data governance. Emerging trends and technologies, including artificial intelligence, machine learning, and blockchain, are also examined to prepare readers for future developments in the field. Whether you are a seasoned data management professional or new to the discipline, this book serves as an invaluable resource for mastering the intricacies of data governance and leveraging data as a strategic asset for organizational success. This resourceful guide targets data management professionals, IT managers, compliance officers, data stewards, data owners, data governance managers and more. Business leaders, business executives, academic researchers, and students focused on computer science in data-related fields will also find this book a useful resource.

Data Governance: From the Fundamentals to Real Cases

by Mario Piattini Ismael Caballero

This book presents a set of models, methods, and techniques that allow the successful implementation of data governance (DG) in an organization and reports real experiences of data governance in different public and private sectors. To this end, this book is composed of two parts. Part I on “Data Governance Fundamentals” begins with an introduction to the concept of data governance that stresses that DG is not primarily focused on databases, clouds, or other technologies, but that the DG framework must be understood by business users, systems personnel, and the systems themselves alike. Next, chapter 2 addresses crucial topics for DG, such as the evolution of data management in organizations, data strategy and policies, and defensive and offensive approaches to data strategy. Chapter 3 then details the central role that human resources play in DG, analysing the key responsibilities of the different DG-related roles and boards, while chapter 4 discusses the most common barriers to DG in practice. Chapter 5 summarizes the paradigm shifts in DG from control to value creation. Subsequently, chapter 6 explores the needs, characteristics and key functionalities of DG tools, before this part ends with a chapter on maturity models for data governance. Part II on “Data Governance Applied” consists of five chapters which review the situation of DG in different sectors and industries. One chapter each is devoted to DG in the banking sector, public administration, insurance companies, healthcare and telecommunications. The book is aimed at academics, researchers and practitioners (especially CIOs, Data Governors, or Data Stewards) involved in DG. It can also serve as a reference for courses on data governance in information systems.

Data Governance: How to Design, Deploy and Sustain an Effective Data Governance Program (The Morgan Kaufmann Series on Business Intelligence)

by John Ladley

This book is for any manager or team leader who has the green light to implement a data governance program. The problem of managing data continues to grow with issues surrounding cost of storage, exponential growth, as well as administrative, management and security concerns. The way to address all of these issues at scale is data governance, which provides better services to users and saves money. What you will find in this book is an overview of why data governance is needed, how to design, initiate, and execute a program and how to keep the program sustainable. With the provided framework and case studies you will be enabled and educated in launching your very own successful and money saving data governance program.
- Provides a complete overview of the data governance lifecycle that can help you discern technology and staff needs
- Specifically aimed at managers who need to implement a data governance program at their company
- Includes case studies to detail 'do's' and 'don'ts' in real-world situations

Data Governance: Nachhaltige Geschäftsmodelle und Technologien im europäischen Rechtsrahmen

by Beatrix Weber

Data governance can be defined, across the dimensions of technology, economics, sustainability, and law, as the steering of the use, sharing, and reuse of data. The evolving European Union legal framework on data law, in particular the Data Governance Act, the Data Act, and the Digital Markets Act, together with existing legislation such as the General Data Protection Regulation, creates a regulatory framework for data holders, data users, and data subjects. In addition, ESG legislation in the areas of sustainability and environmental protection requires the legally compliant collection and use of data. Against this background, the internal market for data as products or services will grow in the long term only if technical innovations and standards enable data use that is sustainable and legally compliant but also value-creating for market participants. This work answers the question of how data use can generate economic value that combines current technical possibilities, sustainability goals, and what is legally permissible.

Data Governance: The Definitive Guide

by Valliappa Lakshmanan Evren Eryurek Uri Gilad Anita Kibunguchy-Grant Jessi Ashdown

As you move data to the cloud, you need to consider a comprehensive approach to data governance, along with well-defined and agreed-upon policies to ensure your organization meets compliance requirements. Data governance incorporates the ways people, processes, and technology work together to ensure data is trustworthy and can be used effectively. This practical guide shows you how to effectively implement and scale data governance throughout your organization.
Chief information, data, and security officers and their teams will learn strategy and tooling to support democratizing data and unlocking its value while enforcing security, privacy, and other governance standards. Through good data governance, you can inspire customer trust, enable your organization to identify business efficiencies, generate more competitive offerings, and improve customer experience. This book shows you how.
You'll learn:
- Data governance strategies addressing people, processes, and tools
- Benefits and challenges of a cloud-based data governance approach
- How data governance is conducted from ingest to preparation and use
- How to handle the ongoing improvement of data quality
- Challenges and techniques in governing streaming data
- Data protection for authentication, security, backup, and monitoring
- How to build a data culture in your organization

Data Grab: The New Colonialism of Big Tech and How to Fight Back

by Nick Couldry Ulises A. Mejias

A compelling argument that the extractive practices of today’s tech giants are the continuation of colonialism—and a crucial guide to collective resistance. Large technology companies like Meta, Amazon, and Alphabet have unprecedented access to our daily lives, collecting information when we check our email, count our steps, shop online, and commute to and from work. Current events are concerning—both the changing owners (and names) of billion-dollar tech companies and regulatory concerns about artificial intelligence underscore the sweeping nature of Big Tech’s surveillance and the influence such companies hold over the people who use their apps and platforms. As trusted tech experts Ulises A. Mejias and Nick Couldry show in this eye-opening and convincing book, this vast accumulation of data is not the accidental stockpile of a fast-growing industry. Just as nations stole territories for ill-gotten minerals and crops, wealth, and dominance, tech companies steal personal data important to our lives. It’s only within the framework of colonialism, Mejias and Couldry argue, that we can comprehend the full scope of this heist. Like the land grabs of the past, today’s data grab converts our data into raw material for the generation of corporate profit against our own interests. Like historical colonialism, today’s tech corporations have engineered an extractive form of doing business that builds a new social and economic order, leads to job precarity, and degrades the environment. These methods deepen global inequality, consolidating corporate wealth in the Global North and engineering discriminatory algorithms. Promising convenience, connection, and scientific progress, tech companies enrich themselves by encouraging us to relinquish details about our personal interactions, our taste in movies or music, and even our health and medical records. Do we have any other choice? Data Grab affirms that we do. 
To defy this new form of colonialism we will need to learn from previous forms of resistance and work together to imagine entirely new ones. Mejias and Couldry share the stories of voters, workers, activists, and marginalized communities who have successfully opposed unscrupulous tech practices. An incisive discussion of the digital media that’s transformed our world, Data Grab is a must-read for anyone concerned about privacy, self-determination, and justice in the internet age.

Data Information in Online Environments: 4th EAI International Conference, DIONE 2023, Nanchang, China, November 25–27, 2023, Proceedings (Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering #515)

by Yuanlong Cao Xun Shao

This book constitutes the refereed proceedings of the 4th EAI International Conference on Data Information in Online Environments, DIONE 2023, held in Nanchang, China, during November 25-27, 2023. The 21 full papers were carefully reviewed and selected from 81 submissions. The papers are grouped in thematic sessions as follows: the application of artificial intelligence: the new era of computer network by using machine learning, a caching strategy using deep q-learning for multi-access edge computing users, a deep reinforcement learning-based content updating algorithm for high definition map edge caching, advanced technology in computing, emerging technologies and applications in networks and management.

Data Infrastructure Management: Insights and Strategies

by Greg Schulz

This book looks at various application and data demand drivers, along with data infrastructure options ranging from legacy on-premises, public cloud, and hybrid to software-defined data center (SDDC), software data infrastructure (SDI), container, and serverless, as well as Infrastructure as a Service (IaaS), IT as a Service (ITaaS), and related technology, trends, tools, techniques, and strategies. Filled with example scenarios, tips, and strategy considerations, the book covers frequently asked questions and answers to aid strategy as well as decision-making.

Data Insight Foundations: Step-by-Step Data Analysis with R

by Nikita Tkachenko

This book is not a comprehensive guide; if that's what you're seeking, you may want to look elsewhere. Instead, it serves as a map, outlining the necessary tools and topics for your research journey. The goal is to build your intuition and provide pointers for where to find more detailed information. The chapters are deliberately concise and to the point, aiming to expose and enlighten rather than bore you. While examples are primarily in R, a basic understanding of the language is advantageous but not essential. Several chapters, especially those focusing on theory, require no programming knowledge at all. Parts of this book have proven useful to a diverse audience, including web developers, mathematicians, data analysts, and economists, making the material beneficial regardless of one’s background. The structure allows for flexible reading paths; you may explore the chapters in sequence for a systematic learning experience or navigate directly to the topics most relevant to you.
What You Will Learn
- Data Management: Master the end-to-end process of data collection, processing, validation, and imputation using R.
- Reproducible Research: Understand fundamental theories and achieve transparency with literate programming, renv, and Git.
- Academic Writing: Conduct scientific literature reviews and write structured papers and reports with Quarto.
- Survey Design: Design well-structured surveys and manage data collection effectively.
- Data Visualization: Understand data visualization theory and create well-designed and captivating graphics using ggplot2.
Who This Book Is For
Career professionals such as research and data analysts transitioning from academia to a professional setting where production quality significantly impacts career progression. Some familiarity with data analytics processes and an interest in learning R or Python are ideal.

Data Integration Life Cycle Management with SSIS: A Short Introduction By Example

by Andy Leonard

Build a custom BimlExpress framework that generates dozens of SQL Server Integration Services (SSIS) packages in minutes. Use this framework to execute related SSIS packages in a single command. You will learn to configure SSIS catalog projects, manage catalog deployments, and monitor SSIS catalog execution and history. Data Integration Life Cycle Management with SSIS shows you how to bring DevOps benefits to SSIS integration projects. Practices in this book enable faster time to market, higher quality of code, and repeatable automation. Code will be created that is easier to support and maintain. The book teaches you how to more effectively manage SSIS in the enterprise environment by drawing on the art and science of modern DevOps practices.
What You'll Learn
- Generate dozens of SSIS packages in minutes to speed your integration projects
- Reduce the execution of related groups of SSIS packages to a single command
- Successfully handle SSIS catalog deployments and their projects
- Monitor the execution and history of SSIS catalog projects
- Manage your enterprise data integration life cycle through automated tools and utilities
Who This Book Is For
Database professionals working with SQL Server Integration Services in enterprise environments. The book is especially useful to those readers following, or wishing to follow, DevOps practices in their use of SSIS.

Data Integration in the Life Sciences: 11th International Conference, DILS 2015, Los Angeles, CA, USA, July 9-10, 2015, Proceedings (Lecture Notes in Computer Science #9162)

by Naveen Ashish Jose-Luis Ambite

This book constitutes the proceedings of the 11th International Conference on Data Integration in the Life Sciences, DILS 2015, held in Los Angeles, CA, USA, in July 2015. The 24 papers presented in this volume were carefully reviewed and selected from 40 submissions. They are organized in topical sections named: data integration technologies; ontology and knowledge engineering for data integration; biomedical data standards and coding; medical research applications; and graduate student consortium.

Data Integration in the Life Sciences: 12th International Conference, DILS 2017, Luxembourg, Luxembourg, November 14-15, 2017, Proceedings (Lecture Notes in Computer Science #10649)

by Marcos Da Silveira Cédric Pruski Reinhard Schneider

This book constitutes the proceedings of the 12th International Conference on Data Integration in the Life Sciences, DILS 2017, held in Luxembourg, in November 2017. The 5 full papers and 5 short papers presented in this volume were carefully reviewed and selected from 16 submissions. They cover topics such as: life science data modelling; analysing, indexing, and querying life sciences datasets; annotating, matching, and sharing life sciences datasets; privacy and provenance of life sciences datasets.

Data Integration in the Life Sciences: 13th International Conference, DILS 2018, Hannover, Germany, November 20-21, 2018, Proceedings (Lecture Notes in Computer Science #11371)

by Maria-Esther Vidal Sören Auer

This book constitutes revised selected papers from the 13th International Conference on Data Integration in the Life Sciences, DILS 2018, held in Hannover, Germany, in November 2018. The 5 full, 8 short, 3 poster and 4 demo papers presented in this volume were carefully reviewed and selected from 22 submissions. The papers are organized in topical sections named: big biomedical data integration and management; data exploration in the life sciences; biomedical data analytics; and big biomedical applications.

Data Intelligence and Cognitive Informatics: Proceedings of ICDICI 2020 (Algorithms for Intelligent Systems)

by Selwyn Piramuthu I. Jeena Jacob Selvanayaki Kolandapalayam Shanmugam Przemyslaw Falkowski-Gilski

This book discusses new cognitive informatics tools, algorithms and methods that mimic the mechanisms of the human brain which lead to an impending revolution in understanding a large amount of data generated by various smart applications. The book is a collection of peer-reviewed best selected research papers presented at the International Conference on Data Intelligence and Cognitive Informatics (ICDICI 2020), organized by SCAD College of Engineering and Technology, Tirunelveli, India, during 8–9 July 2020. The book includes novel work in data intelligence domain which combines with the increasing efforts of artificial intelligence, machine learning, deep learning and cognitive science to study and develop a deeper understanding of the information processing systems.

Data Intelligence and Cognitive Informatics: Proceedings of ICDICI 2021 (Algorithms for Intelligent Systems)

by Robert Bestak I. Jeena Jacob Selvanayaki Kolandapalayam Shanmugam

The book is a collection of peer-reviewed best selected research papers presented at the International Conference on Data Intelligence and Cognitive Informatics (ICDICI 2021), organized by SCAD College of Engineering and Technology, Tirunelveli, India, during July 16–17, 2021. This book discusses new cognitive informatics tools, algorithms, and methods that mimic the mechanisms of the human brain which lead to an impending revolution in understanding a large amount of data generated by various smart applications. The book includes novel work in data intelligence domain which combines with the increasing efforts of artificial intelligence, machine learning, deep learning, and cognitive science to study and develop a deeper understanding of the information processing systems.

Data Intelligence and Cognitive Informatics: Proceedings of ICDICI 2022 (Algorithms for Intelligent Systems)

by I. Jeena Jacob Selvanayaki Kolandapalayam Shanmugam Ivan Izonin

The book is a collection of peer-reviewed best selected research papers presented at the International Conference on Data Intelligence and Cognitive Informatics (ICDICI 2022), organized by SCAD College of Engineering and Technology, Tirunelveli, India, during July 6–7, 2022. This book discusses new cognitive informatics tools, algorithms and methods that mimic the mechanisms of the human brain which lead to an impending revolution in understanding a large amount of data generated by various smart applications. The book includes novel work in data intelligence domain which combines with the increasing efforts of artificial intelligence, machine learning, deep learning and cognitive science to study and develop a deeper understanding of the information processing systems.

Data Intelligence and Cognitive Informatics: Proceedings of ICDICI 2023 (Algorithms for Intelligent Systems)

by Selwyn Piramuthu I. Jeena Jacob Przemyslaw Falkowski-Gilski

The book is a collection of peer-reviewed best selected research papers presented at the International Conference on Data Intelligence and Cognitive Informatics (ICDICI 2023), organized by SCAD College of Engineering and Technology, Tirunelveli, India, during June 27–28, 2023. This book discusses new cognitive informatics tools, algorithms and methods that mimic the mechanisms of the human brain which lead to an impending revolution in understanding a large amount of data generated by various smart applications. The book includes novel work in data intelligence domain which combines with the increasing efforts of artificial intelligence, machine learning, deep learning and cognitive science to study and develop a deeper understanding of the information processing systems.

Data Intensive Computing for Biodiversity (Studies in Computational Intelligence #485)

by Sarinder K. Dhillon Amandeep S. Sidhu

This book is focused on the development of a data integration framework for retrieval of biodiversity information from heterogeneous and distributed data sources. The data integration system proposed in this book links remote databases in a networked environment, supports heterogeneous databases and data formats, links databases hosted on multiple platforms, and provides data security for database owners by allowing them to keep and maintain their own data and to choose information to be shared and linked. The book is a useful guide for researchers, practitioners, and graduate-level students interested in learning state-of-the-art development for data integration in biodiversity.

Data Jujitsu: The Art of Turning Data into Product

by DJ Patil

Acclaimed data scientist DJ Patil details a new approach to solving problems in Data Jujitsu. Learn how to use a problem's "weight" against itself to:
- Break down seemingly complex data problems into simplified parts
- Use alternative data analysis techniques to examine them
- Use human input, such as Mechanical Turk, and design tricks that enlist the help of your users to take short cuts around tough problems
- Learn more about the problems before starting on the solutions, and use the findings to solve them or determine whether the problems are worth solving at all

Data Lake Analytics on Microsoft Azure: A Practitioner's Guide to Big Data Engineering

by Harsh Chawla Pankaj Khattar

Get a 360-degree view of how the journey of data analytics solutions has evolved from monolithic data stores and enterprise data warehouses to data lakes and modern data warehouses.
This book includes comprehensive coverage of how:
- To architect data lake analytics solutions by choosing suitable technologies available on Microsoft Azure
- The advent of microservices applications covering ecommerce or modern solutions built on IoT and how real-time streaming data has completely disrupted this ecosystem
- These data analytics solutions have been transformed from solely understanding the trends from historical data to building predictions by infusing machine learning technologies into the solutions
Data platform professionals who have been working on relational data stores, non-relational data stores, and big data technologies will find the content in this book useful. The book also can help you start your journey into the data engineer world as it provides an overview of advanced data analytics and touches on data science concepts and various artificial intelligence and machine learning technologies available on Microsoft Azure.
What You Will Learn
You will understand the:
- Concepts of data lake analytics, the modern data warehouse, and advanced data analytics
- Architecture patterns of the modern data warehouse and advanced data analytics solutions
- Phases (such as Data Ingestion, Store, Prep and Train, and Model and Serve) of data analytics solutions and technology choices available on Azure under each phase
- In-depth coverage of real-time and batch mode data analytics solutions architecture
- Various managed services available on Azure such as Synapse analytics, event hubs, Stream analytics, CosmosDB, and managed Hadoop services such as Databricks and HDInsight
Who This Book Is For
Data platform professionals, database architects, engineers, and solution architects

Data Lake Development with Big Data

by Pradeep Pasupuleti Beulah Salome Purra

Explore architectural approaches to building Data Lakes that ingest, index, manage, and analyze massive amounts of data using Big Data technologies About This Book * Comprehend the intricacies of architecting a Data Lake and build a data strategy around your current data architecture * Efficiently manage vast amounts of data and deliver it to multiple applications and systems with a high degree of performance and scalability * Packed with industry best practices and use-case scenarios to get you up-and-running Who This Book Is For This book is for architects and senior managers who are responsible for building a strategy around their current data architecture, helping them identify the need for a Data Lake implementation in an enterprise context. The reader will need a good knowledge of master data management, information lifecycle management, data governance, data product design, data engineering, and systems architecture. Also required is experience of Big Data technologies such as Hadoop, Spark, Splunk, and Storm. 
What You Will Learn * Identify the need for a Data Lake in your enterprise context and learn to architect a Data Lake * Learn to build various tiers of a Data Lake, such as data intake, management, consumption, and governance, with a focus on practical implementation scenarios * Find out the key considerations to be taken into account while building each tier of the Data Lake * Understand Hadoop-oriented data transfer mechanism to ingest data in batch, micro-batch, and real-time modes * Explore various data integration needs and learn how to perform data enrichment and data transformations using Big Data technologies * Enable data discovery on the Data Lake to allow users to discover the data * Discover how data is packaged and provisioned for consumption * Comprehend the importance of including data governance disciplines while building a Data Lake In Detail A Data Lake is a highly scalable platform for storing huge volumes of multistructured data from disparate sources with centralized data management services. It eliminates the need for up-front modeling and rigid data structures by allowing schema-less writes. Data Lakes make it possible to ask complex far-reaching questions to find out hidden data patterns and relationships. This book explores the potential of Data Lakes and explores architectural approaches to building data lakes that ingest, index, manage, and analyze massive amounts of data using batch and real-time processing frameworks. It guides you on how to go about building a Data Lake that is managed by Hadoop and accessed as required by other Big Data applications such as Spark, Storm, Hive, and so on, to create an environment in which data from different sources can be meaningfully brought together and analyzed. Data Lakes can be viewed as having three capabilities--intake, management, and consumption. 
This book will take readers through the process of developing a Data Lake and guide them (using best practices) in developing these capabilities. It will also explore often-ignored yet crucial considerations in building Data Lakes, with a focus on how to architect data governance, security, data quality, data lineage tracking, metadata management, and semantic data tagging. By the end of this book, you will have a good understanding of building a Data Lake for Big Data. You will be able to utilize Data Lakes for efficient and easy data processing and analytics. Style and approach Data Lake Development with Big Data provides architectural approaches to building a Data Lake. It follows a use case-based approach where practical implementation scenarios of each key component are explained. It also helps you understand how these use cases are implemented in a Data Lake. The chapters are organized in a way that mimics the sequential data flow evidenced in a Data Lake.

Data Lake for Enterprises

by Pankaj Misra and Tomcy John

A practical guide to implementing your enterprise data lake using Lambda Architecture as the base About This Book • Build a full-fledged data lake for your organization with popular big data technologies using the Lambda architecture as the base • Delve into the big data technologies required to meet modern-day business strategies • A highly practical guide to implementing enterprise data lakes with lots of examples and real-world use-cases Who This Book Is For Java developers and architects who would like to implement a data lake for their enterprise will find this book useful. If you want to get hands-on experience with the Lambda Architecture and big data technologies by implementing a practical solution using these technologies, this book will also help you. What You Will Learn • Build an enterprise-level data lake using the relevant big data technologies • Understand the core of the Lambda architecture and how to apply it in an enterprise • Learn the technical details around Sqoop and its functionalities • Integrate Kafka with Hadoop components to acquire enterprise data • Use Flume with streaming technologies for stream-based processing • Understand stream-based processing with reference to Apache Spark Streaming • Incorporate Hadoop components and know the advantages they provide for enterprise data lakes • Build fast, streaming, and high-performance applications using ElasticSearch • Make your data ingestion process consistent across various data formats with configurability • Process your data to derive intelligence using machine learning algorithms In Detail The term "Data Lake" has recently become prominent in the big data industry. Data scientists can make use of it in deriving meaningful insights that can be used by businesses to redefine or transform the way they operate.
Lambda architecture is also emerging as one of the prominent patterns in the big data landscape, as it not only helps to derive useful information from historical data but also correlates real-time data to enable businesses to make critical decisions. This book tries to bring these two important aspects, data lake and lambda architecture, together. This book is divided into three main sections. The first introduces you to the concept of data lakes and the importance of data lakes in enterprises, and gets you up to speed with the Lambda architecture. The second section delves into the principal components of building a data lake using the Lambda architecture. It introduces you to popular big data technologies such as Apache Hadoop, Spark, Sqoop, Flume, and ElasticSearch. The third section is a highly practical demonstration of putting it all together, and shows you how an enterprise data lake can be implemented, along with several real-world use-cases. It also shows you how other peripheral components can be added to the lake to make it more efficient. By the end of this book, you will be able to choose the right big data technologies using the lambda architectural patterns to build your enterprise data lake. Style and approach The book takes a pragmatic approach, showing ways to leverage big data technologies and lambda architecture to build an enterprise-level data lake.
