- ISBN-10: 0-13-381116-6
- ISBN-13: 978-0-13-381116-2
- Copyright 2016
- Pages: 480
- Edition: 1st
Plan and Implement Hadoop Virtualization for Maximum Performance, Scalability, and Business Agility
Enterprises running Hadoop must absorb rapid changes in big data ecosystems, frameworks, products, and workloads. Virtualized approaches can offer important advantages in speed, flexibility, and elasticity. Now, a world-class team of enterprise virtualization and big data experts guides you through the choices, considerations, and tradeoffs surrounding Hadoop virtualization. The authors help you decide whether to virtualize Hadoop, deploy Hadoop in the cloud, or integrate conventional and virtualized approaches in a blended solution.
First, Virtualizing Hadoop reviews big data and Hadoop from the standpoint of the virtualization specialist. The authors demystify MapReduce, YARN, and HDFS and guide you through each stage of Hadoop data management. Next, they turn the tables, introducing big data experts to modern virtualization concepts and best practices.
Finally, they bring Hadoop and virtualization together, guiding you through the decisions you’ll face in planning, deploying, provisioning, and managing virtualized Hadoop. From security to multitenancy to day-to-day management, you’ll find reliable answers for choosing your best Hadoop strategy and executing it.
Coverage includes the following:
- Reviewing the frameworks, products, distributions, use cases, and roles associated with Hadoop
- Understanding YARN resource management, HDFS storage, and I/O
- Designing data ingestion, movement, and organization for modern enterprise data platforms
- Defining SQL engine strategies to meet strict SLAs
- Considering security, data isolation, and scheduling for multitenant environments
- Deploying Hadoop as a service in the cloud
- Reviewing the essential concepts, capabilities, and terminology of virtualization
- Applying current best practices, guidelines, and key metrics for Hadoop virtualization
- Managing multiple Hadoop frameworks and products as one unified system
- Virtualizing master and worker nodes to maximize availability and performance
- Installing and configuring Linux for a Hadoop environment
About the Authors
George J. Trujillo, Jr. is an experienced corporate executive with exceptional communication skills. He is an expert in change management, with strengths in leadership, critical thinking, and data-driven decision making. George is an internationally recognized data architect, leader, and speaker in big data and cloud solutions. His background includes big data architecture, Hadoop (Hortonworks, Cloudera), data governance, schema design, metadata management, security, NoSQL, and BI. He has many industry recognitions, including Oracle Recognized Double ACE, Sun Ambassador for Sun Microsystems’ Application Middleware Platform, VMware Recognized vExpert, VMware Certified Instructor, MySQL’s Socrates Award, and MySQL Certified DBA. His leadership in the user community includes serving on the Independent Oracle Users Group (IOUG) board of directors, as president of the IOUG Cloud SIG, as chair of the RMOUG Big Data SIG, and as president of the RMOUG Cloud SIG; participation in the Oracle Fusion Council and Oracle Beta Leadership Council; election to the IOUG’s “Oracles of Oracle” circle; and serving as a master presenter for the IOUG’s Master Series. His positions have included vice president of big data architecture in the financial services industry, master principal big data specialist at Hortonworks, tier one data specialist for the VMware Center of Excellence, and CEO of a professional services and training organization.
Charles Kim is the president of Viscosity North America, a niche consulting organization specializing in big data, Oracle Exadata/RAC, and virtualization. Charles is an architect in Hadoop/big data, Linux infrastructure, cloud, virtualization, engineered systems, and Oracle clustering technologies. He has written books for Oracle Press, Pearson, and Apress on the Oracle, Hadoop, and Linux technology stacks. He holds certifications in Oracle, VMware, Red Hat Linux, and Microsoft and has more than 23 years of IT experience on mission- and business-critical systems.
Charles presents regularly at VMworld, Oracle OpenWorld, IOUG, and various local/regional user group conferences. He is an Oracle ACE director, VMware vExpert, Oracle Certified DBA, Certified Exadata Specialist, and a Certified RAC Expert. Charles’s books include the following:
- Oracle Database 11g: New Features for DBAs and Developers
- Linux Recipes for Oracle DBAs
- Oracle Data Guard 11g Handbook
- Virtualizing Business Critical Oracle Databases: Database as a Service
- Oracle ASM 12c Pocket Reference Guide
- Expert Exadata Handbook
Charles is the president of the Cloud Computing (and Virtualization) SIG for the Independent Oracle Users Group. Charles blogs regularly at DBAExpert.com.
His LinkedIn profile is http://www.linkedin.com/in/chkim.
His Twitter handle is @racdba.
Steven Jones is a 16-year veteran of technical training with experience in UNIX, networking, database technology, virtualization, and big data. Steven works at VMware as a VMware Certified Instructor and holds the VCA and VCP 4, 5, and 6 certifications; he was named a vExpert in 2014 and 2015. He is a coauthor of Virtualize Oracle Business Critical Databases: Database Infrastructure as a Service, by Charles Kim, George Trujillo, Steven Jones, and Sudhir Balasubramanian (iBooks, 2014). He presented “Virtualizing Mission Critical Oracle RAC with vC Ops” at VMworld 2013 in San Francisco and Barcelona, and he has been a co-speaker worldwide for the VMware Education SDDC Intensive Workshop. Steven seeks to bring innovation, analogy, and narrative to understanding and mastering information technology as a service.
Rommel Garcia is a senior solutions engineer at Hortonworks, a leading open source company driving the adoption of Hadoop. Rommel has spent the past few years focusing on the design, installation, and deployment of large-scale Hadoop ecosystems. He has helped organizations implement security best practices and guidelines for Hadoop platforms. He has performance-tuned Hadoop clusters for organizations ranging from fast-growing startups to Fortune 100 companies. Rommel is a nationally recognized speaker at Hadoop and big data conferences. He is also well known for his expertise in performance-tuning Java applications and middle-tier platforms. He has a BS in electronics engineering and an MS in computer science. Rommel resides in Atlanta with his wife, Elizabeth, and his children, Mila and Braden.
Justin Murray is a senior technical marketing architect at VMware. He holds a BA and a post-graduate diploma in computer science from University College Cork in Ireland. Justin has worked in software engineering, technical training, and consulting in various companies in the UK and the United States. Since 2007, he has been working with VMware’s partner companies to validate and optimize big data and other next-generation application workloads on VMware vSphere.