Machine Learning and Big Data Training with Frank Kane

[Activity] Run your first Spark program! Ratings histogram example.

  • November 5, 2019

Back to: Taming Big Data with Apache Spark 3 and Python – Hands On!

Lessons

  • Getting Started with Spark
    • Introduction
    • How to Use This Course
    • [Activity] Getting Set Up: Installing Python, a JDK, Spark, and its Dependencies.
    • [Activity] Installing the MovieLens Movie Rating Dataset
    • [Activity] Run your first Spark program! Ratings histogram example.
  • Spark Basics and the RDD Interface
    • Introduction to Spark
    • The Resilient Distributed Dataset (RDD)
    • Ratings Histogram Walkthrough
    • Key/Value RDDs, and the Average Friends by Age Example
    • [Activity] Running the Average Friends by Age Example
    • Filtering RDDs, and the Minimum Temperature by Location Example
    • [Activity] Running the Minimum Temperature Example, and Modifying it for Maximums
    • [Activity] Running the Maximum Temperature by Location Example
    • [Activity] Counting Word Occurrences using flatMap()
    • [Activity] Improving the Word Count Script with Regular Expressions
    • [Activity] Sorting the Word Count Results
    • Assignment: Tally up amount spent by customer using Spark
    • Assignment Solution
    • Assignment: Sort your results by amount spent per customer
  • SparkSQL, DataFrames, and Datasets
    • Introducing SparkSQL
    • Executing SQL commands and SQL-style functions on a DataFrame
    • Using DataFrames instead of RDDs
    • [Exercise] Implement Friends by Age with DataFrames
    • Exercise Solution: Friends by Age, with DataFrames
    • Word Count, with DataFrames
    • Minimum Temperature, with DataFrames
    • [Exercise] Implement Total Amount Spent with DataFrames
    • Exercise Solution: Total Amount Spent with DataFrames
  • Advanced Examples of Spark Programs
    • [Activity] Find the Most Popular Movie
    • [Activity] Use Broadcast Variables to Display Movie Names Instead of ID Numbers
    • Find the Most Popular Superhero in a Social Graph
    • [Activity] Run the Script – Discover Who the Most Popular Superhero is!
    • [Exercise] Find the Most Obscure Superheroes
    • Exercise Solution
    • Superhero Degrees of Separation: Introducing Breadth-First Search
    • Superhero Degrees of Separation: Accumulators, and Implementing BFS in Spark
    • [Activity] Superhero Degrees of Separation: Review the Code and Run it
    • Item-Based Collaborative Filtering in Spark, cache(), and persist()
    • [Activity] Running the Similar Movies Script using Spark’s Cluster Manager
    • [Exercise] Improve the Quality of Similar Movies
  • Running Spark on a Cluster
    • Introducing Elastic MapReduce
    • [Activity] Setting up your AWS / Elastic MapReduce Account and Setting Up PuTTY
    • Partitioning
    • Create Similar Movies from One Million Ratings – Part 1
    • [Activity] Create Similar Movies from One Million Ratings – Part 2
    • Create Similar Movies from One Million Ratings – Part 3
    • Troubleshooting Spark on a Cluster
    • More Troubleshooting, and Managing Dependencies
  • Machine Learning with Spark ML
    • Introducing MLlib
    • [Activity] Using MLlib to Produce Movie Recommendations
    • Analyzing the ALS Recommendations Results
    • [Activity] Linear Regression with Spark ML
    • [Exercise] Using Decision Trees to Predict Real Estate Prices
    • Exercise Solution
  • Spark Streaming, Structured Streaming, and GraphX
    • [Activity] Structured Streaming in Python
    • [Exercise] Using Windowed Operations with Structured Streaming
    • Exercise Solution
    • GraphX
  • You Made It! Where to Go from Here.
    • Learning More about Spark and Data Science
    • Continue your Learning Journey
https://sundog-education.com/ Website and all course content © Copyright 2021 Sundog Software LLC DBA Sundog Education. All rights reserved worldwide. "Sundog" is a registered trademark of Sundog Software, LLC. Read our privacy policy and terms of service.