Apache Oozie
Summary
Get a solid grounding in Apache Oozie, the workflow scheduler system for managing Hadoop jobs. With this hands-on guide, two experienced Hadoop practitioners walk you through the intricacies of this powerful and flexible platform, with numerous examples and real-world use cases.
Once you set up your Oozie server, you’ll dive into techniques for writing and coordinating workflows, and learn how to write complex data pipelines. Advanced topics show you how to handle shared libraries in Oozie, as well as how to implement and manage Oozie’s security capabilities.
- Install and configure an Oozie server, and get an overview of basic concepts
- Journey through the world of writing and configuring workflows
- Learn how the Oozie coordinator schedules and executes workflows based on triggers
- Understand how Oozie manages data dependencies
- Use Oozie bundles to package several coordinator apps into a data pipeline
- Learn about security features and shared library management
- Implement custom extensions and write your own EL functions and actions
- Debug workflows and manage Oozie’s operational details
Table of Contents
Preface
1. Introduction to Oozie
-Big Data Processing
2. Oozie Concepts
-Oozie Applications
-Parameters, Variables, and Functions
-Application Deployment Model
-Oozie Architecture
3. Setting Up Oozie
-Oozie Deployment
-Basic Installations
-Advanced Oozie Installations
4. Oozie Workflow Actions
-Workflow
-Actions
-Action Types
-Synchronous Versus Asynchronous Actions
5. Workflow Applications
-Outline of a Basic Workflow
-Control Nodes
-Job Configuration
-Parameterization
-The job.properties File
-Configuration and Parameterization Examples
-Lifecycle of a Workflow
6. Oozie Coordinator
-Coordinator Concept
-Triggering Mechanism
-Coordinator Application and Job
-Coordinator Job Lifecycle
-Coordinator Action Lifecycle
-Parameterization of the Coordinator
-Execution Controls
-An Improved Coordinator
7. Data Trigger Coordinator
-Expressing Data Dependency
-Example: Rollup
-Parameterization of Dataset Instances
-Parameter Passing to Workflow
-A Complete Coordinator Application
8. Oozie Bundles
-Bundle Basics
-Bundle Specification
-Bundle State Transitions
9. Advanced Topics
-Managing Libraries in Oozie
-Oozie Security
-Supporting New API in MapReduce Action
-Supporting Uber JAR
-Cron Scheduling
-Emulate Asynchronous Data Processing
-HCatalog-Based Data Dependency
10. Developer Topics
-Developing Custom EL Functions
-Supporting Custom Action Types
-Overriding an Asynchronous Action Type
-Creating a New Asynchronous Action
11. Oozie Operations
-Oozie CLI Tool
-Oozie REST API
-Oozie Java Client
-The oozie-site.xml File
-The Oozie Purge Service
-Job Monitoring
-Oozie Instrumentation and Metrics
-Reprocessing
-Server Tuning
-Oozie High Availability
-Debugging in Oozie
-MiniOozie and LocalOozie
-The Competition
Index