
Parallel R

Data Analysis in the Distributed World

Specifications
Paperback, 108 pages | English
O'Reilly | 1st edition, 2012
ISBN13: 9781449309923
Classification
Main category: Computers and information technology

Summary

It's tough to argue with R as a high-quality, cross-platform, open source statistical software product, unless you're in the business of crunching Big Data. This concise book introduces you to several strategies for using R to analyze large datasets. You'll learn the basics of Snow, Multicore, Parallel, and some Hadoop-related tools, including how to find them, how to use them, when they work well, and when they don't.

With these packages, you can overcome R's single-threaded nature by spreading work across multiple CPUs, or offloading work to multiple machines to address R's memory barrier.

- Snow: works well in a traditional cluster environment
- Multicore: popular for multiprocessor and multicore computers
- Parallel: included in R as of the 2.14.0 release
- R+Hadoop: provides low-level access to a popular form of cluster computing
- RHIPE: uses Hadoop's power with R's language and interactive shell
- Segue: lets you use Elastic MapReduce as a backend for lapply-style operations

Specifications

ISBN13: 9781449309923
Language: English
Binding: paperback
Number of pages: 108
Publisher: O'Reilly
Edition: 1
Publication date: 10-1-2012

About Q Ethan McCallum

Q Ethan McCallum is a consultant, writer, and technology enthusiast, though perhaps not in that order. In his professional roles, he helps companies make smart decisions about data and technology. His written work appears online on The O'Reilly Network and Java.net, and also in print publications such as C/C++ Users Journal, Dr. Dobb's Journal, and Linux Magazine.

About Stephen Weston

Stephen Weston has been working in high performance and parallel computing for over 25 years. He was employed at Scientific Computing Associates in the 1990s, working on the Linda programming system, invented by David Gelernter. He was also a founder of Revolution Computing, leading the development of parallel computing packages for R, including nws, foreach, doSNOW, and doMC. He now works at Yale University as an HPC Specialist.

Table of Contents

Preface

1. Getting Started
-Why R?
-Why Not R?
-The Solution: Parallel Execution
-A Road Map for This Book
-In a Hurry?
-Summary

2. snow
-Quick Look
-How It Works
-Setting Up
-Working with It
-When It Works...
-...And When It Doesn't
-The Wrap-up

3. multicore
-Quick Look
-How It Works
-Setting Up
-Working with It
-When It Works...
-...And When It Doesn't
-The Wrap-up

4. parallel
-Quick Look
-How It Works
-Setting Up
-Working with It
-Summary of Differences
-When It Works...
-...And When It Doesn't
-The Wrap-up

5. A Primer on MapReduce and Hadoop
-Hadoop at Cruising Altitude
-A MapReduce Primer
-Thinking in MapReduce: Some Pseudocode Examples
-Binary and Whole-File Data: SequenceFiles
-No Cluster? No Problem! Look to the Clouds...
-The Wrap-up

6. R+Hadoop
-Quick Look
-How It Works
-Setting Up
-Working with It
-When It Works...
-...And When It Doesn't
-The Wrap-up

7. RHIPE
-Quick Look
-How It Works
-Setting Up
-Working with It
-When It Works...
-...And When It Doesn't
-The Wrap-up

8. Segue
-Quick Look
-How It Works
-Setting Up
-Working with It
-When It Works...
-...And When It Doesn't
-The Wrap-up

9. New and Upcoming
-doRedis
-RevoScale R and RevoConnectR (RHadoop)
-cloudNumbers.com
