Introduction to Big Data with Spark and Hadoop

$49

ENROLL NOW

Course Overview

What You'll Learn

You will start the course by understanding what big data is and exploring how insights from big data can be harnessed for a variety of use cases.
Next, you will learn about Hadoop, an open-source framework for the distributed processing of large data sets, and its ecosystem.
You will discover important applications that go hand in hand with Hadoop, like the Hadoop Distributed File System (HDFS), MapReduce, and HBase.

This self-paced IBM course introduces big data: its characteristics and its application in big data analytics. You will also gain hands-on experience with big data processing tools like Apache Hadoop and Apache Spark.

Bernard Marr defines big data as the digital trace that we are generating in this digital era. You will start the course by understanding what big data is and exploring how insights from big data can be harnessed for a variety of use cases. You’ll also explore how big data processing relies on techniques like parallel processing, scaling, and data parallelism.

Next, you will learn about Hadoop, an open-source framework for the distributed processing of large data sets, and its ecosystem. You will discover important applications that go hand in hand with Hadoop, like the Hadoop Distributed File System (HDFS), MapReduce, and HBase. You will become familiar with Hive, data warehouse software that provides an SQL-like interface to efficiently query and manipulate large data sets.

You’ll then gain insights into Apache Spark, an open-source processing engine that provides users with new ways to store and use big data. In this course, you will discover how to leverage Spark to deliver reliable insights. The course provides an overview of the platform, going into the components that make up Apache Spark. You’ll learn about DataFrames, perform basic DataFrame operations, and work with Spark SQL. You will also explore how Spark processes and monitors the requests your application submits and how you can track work using the Spark Application UI.

This course has several hands-on labs to help you apply and practice the concepts you learn. You will complete Hadoop and Spark labs using various tools and technologies, including Docker, Kubernetes, Python, and Jupyter Notebooks.

Course FAQs

Is this an accredited online course?

Accreditation for 'Introduction to Big Data with Spark and Hadoop' is determined by the provider, IBM. For online college courses or degree programs, we strongly recommend you verify the accreditation status directly on the provider's website to ensure it meets your requirements.

Can this course be used for continuing education credits?

Many of the courses listed on our platform are suitable for professional continuing education. However, acceptance for credit varies by state and licensing board. Please confirm with your board and IBM that this specific course qualifies.

How do I enroll in this online school program?

To enroll, click the 'ENROLL NOW' button on this page. You will be taken to the official page for 'Introduction to Big Data with Spark and Hadoop' on the IBM online class platform, where you can complete your registration.
