Working with HDFS
Introduction
Hadoop ships with its own distributed filesystem, the Hadoop Distributed File System (HDFS). You will learn basic HDFS commands and how to move data between your local filesystem and HDFS.
We assume you have installed the HDFS Docker container as described in the previous step. You will load a local file into HDFS and then read it back.
Vagrant up
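A minimal sketch of this step, assuming Vagrant is installed on the host and you are in the directory that holds the Vagrantfile from the previous step. The `command -v` guard is an addition, not part of the original instructions; it only makes the snippet fail softly where Vagrant is missing.

```shell
# Run on the host, from the directory containing the Vagrantfile (assumed).
if command -v vagrant >/dev/null 2>&1; then
  vagrant up     # boot the VM, creating it on the first run
  vagrant ssh    # open a shell inside the VM
else
  echo "vagrant not found: install Vagrant on the host first" >&2
fi
```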
Stop and remove all containers (you will see errors if no containers exist)
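One way to do this, sketched under the assumption that you are inside the VM and the `docker` CLI is available. The plain form, `docker stop $(docker ps -aq)`, is what produces the errors mentioned above when the list is empty; the check on `containers` below is an added convenience that skips the call in that case.

```shell
# Run inside the VM. `containers` is an illustrative variable name.
if command -v docker >/dev/null 2>&1; then
  containers="$(docker ps -aq)"        # IDs of all containers, running or stopped
  if [ -n "$containers" ]; then
    docker stop $containers            # unquoted on purpose: one argument per ID
    docker rm $containers
  else
    echo "No containers to stop or remove."
  fi
else
  echo "docker not found: run this inside the Vagrant VM" >&2
fi
```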
Start the HDFS container
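The exact image and start command come from the previous step and are not repeated here, so the sketch below uses a placeholder. `HDFS_IMAGE` and its default value `your-hdfs-image` are hypothetical; substitute the image you installed. Some Hadoop images also need their own bootstrap script instead of plain `bash` to start the HDFS daemons.

```shell
# HDFS_IMAGE is a hypothetical placeholder; set it to the image you installed.
HDFS_IMAGE="${HDFS_IMAGE:-your-hdfs-image}"
if command -v docker >/dev/null 2>&1; then
  # -i keeps stdin open and -t allocates a terminal, giving an interactive shell.
  docker run -it "$HDFS_IMAGE" bash
else
  echo "docker not found: run this inside the Vagrant VM" >&2
fi
```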
You should see the bash prompt
Run commands
Update the environment variable
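The page does not say which variable is meant. A common case is adding the Hadoop `bin` directory to `PATH` so that the `hdfs` command can be found, which is what this sketch assumes. The fallback location `/usr/local/hadoop` is a guess; use the install path of your container.

```shell
# Assumption: Hadoop is installed under $HADOOP_HOME; the fallback path is hypothetical.
HADOOP_HOME="${HADOOP_HOME:-/usr/local/hadoop}"
export PATH="$PATH:$HADOOP_HOME/bin"

# Confirm the hdfs command now resolves; print a hint if it does not.
command -v hdfs || echo "hdfs still not on PATH; check HADOOP_HOME" >&2
```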
Create a data directory
Create a CSV file (enter the commands one at a time)
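The two steps above can be sketched as follows. The directory name `data`, the column names, and the three rows are made-up sample values for illustration; only the file name `people.csv` comes from the later steps.

```shell
# Create the local data directory (no error if it already exists).
mkdir -p data

# Build the CSV one line at a time: > creates the file, >> appends to it.
echo "id,name,age" > data/people.csv
echo "1,Alice,34" >> data/people.csv
echo "2,Bob,45" >> data/people.csv
echo "3,Carol,29" >> data/people.csv

# Show the result.
cat data/people.csv
```

Entering the lines one at a time makes it easy to spot a typo before it ends up in the file.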
HDFS Operations
Create an HDFS directory
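A minimal sketch, assuming you are at the container's bash prompt and `hdfs` is on your `PATH`. The directory name `demo` is hypothetical. A relative HDFS path such as this one resolves against your HDFS home directory, `/user/<username>`.

```shell
HDFS_DIR=demo   # hypothetical directory name
if command -v hdfs >/dev/null 2>&1; then
  hdfs dfs -mkdir -p "$HDFS_DIR"   # -p also creates missing parent directories
else
  echo "hdfs not found: run this inside the HDFS container" >&2
fi
```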
Write a file to HDFS
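One way to write a file is to create it locally and upload it with `-put`. The file name `hello.txt`, its contents, and the `demo` directory are hypothetical, and the sketch assumes that directory already exists on HDFS.

```shell
# Create a small local file to upload.
printf 'hello from the local filesystem\n' > hello.txt

if command -v hdfs >/dev/null 2>&1; then
  # -f overwrites the destination if a previous copy is already there.
  hdfs dfs -put -f hello.txt demo/hello.txt
else
  echo "hdfs not found: run this inside the HDFS container" >&2
fi
```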
Cat the HDFS file
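A sketch of reading the file back, assuming it was written to the hypothetical path `demo/hello.txt`. `-cat` streams the file's contents to your terminal without creating a local copy.

```shell
HDFS_FILE=demo/hello.txt   # hypothetical HDFS path
if command -v hdfs >/dev/null 2>&1; then
  hdfs dfs -cat "$HDFS_FILE"
else
  echo "hdfs not found: run this inside the HDFS container" >&2
fi
```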
Copy the file from HDFS to your local file system
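A sketch of the download, again assuming the hypothetical HDFS path `demo/hello.txt`. The local name `hello_copy.txt` is also made up. On older Hadoop releases `-get` refuses to overwrite an existing local file, so the sketch removes any stale copy first.

```shell
if command -v hdfs >/dev/null 2>&1; then
  rm -f hello_copy.txt                          # clear any copy from an earlier run
  hdfs dfs -get demo/hello.txt hello_copy.txt   # HDFS source, then local destination
  cat hello_copy.txt
else
  echo "hdfs not found: run this inside the HDFS container" >&2
fi
```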
Make an HDFS directory
Copy the people.csv file to HDFS
Examine people.csv on HDFS
Copy the HDFS file to your local file system
Read the local file
List HDFS files
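The people.csv steps above repeat the same round trip with the CSV file, and can be sketched as one sequence. The local path `data/people.csv`, the HDFS directory `people`, and the output name `people_from_hdfs.csv` are assumptions; adjust them to match the names you used.

```shell
LOCAL_CSV=data/people.csv   # assumed location of the CSV created earlier
HDFS_DIR=people             # hypothetical HDFS directory name

if ! command -v hdfs >/dev/null 2>&1; then
  echo "hdfs not found: run this inside the HDFS container" >&2
elif [ ! -f "$LOCAL_CSV" ]; then
  echo "$LOCAL_CSV not found: create the CSV file first" >&2
else
  hdfs dfs -mkdir -p "$HDFS_DIR"                        # make the HDFS directory
  hdfs dfs -put -f "$LOCAL_CSV" "$HDFS_DIR/people.csv"  # copy the CSV to HDFS
  hdfs dfs -cat "$HDFS_DIR/people.csv"                  # examine it on HDFS
  rm -f people_from_hdfs.csv
  hdfs dfs -get "$HDFS_DIR/people.csv" people_from_hdfs.csv  # copy it back
  cat people_from_hdfs.csv                              # read the local copy
  hdfs dfs -ls "$HDFS_DIR"                              # list the HDFS files
fi
```

Downloading to a different file name keeps the original CSV untouched, so you can compare the two copies with `diff` if you want to confirm the round trip.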
Exit the Docker container (#bash-4.18)
Stop all Docker containers on the VM
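A sketch of this step, assuming you have typed `exit` at the container prompt and are back in the VM shell. `running` is an illustrative variable name; the check on it avoids calling `docker stop` with no arguments.

```shell
# Run inside the VM, after leaving the container.
if command -v docker >/dev/null 2>&1; then
  running="$(docker ps -q)"      # IDs of running containers only
  if [ -n "$running" ]; then
    docker stop $running         # unquoted on purpose: one argument per ID
  else
    echo "No running containers."
  fi
else
  echo "docker not found: run this inside the Vagrant VM" >&2
fi
```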
Shut down the VM from the host machine
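A minimal sketch, assuming you have left the VM shell with `exit` and are back on the host in the directory that holds the Vagrantfile. `vagrant halt` shuts the VM down without deleting it, so a later `vagrant up` starts the same machine again.

```shell
# Run on the host, from the directory containing the Vagrantfile (assumed).
if command -v vagrant >/dev/null 2>&1; then
  vagrant halt      # graceful shutdown; the VM and its disk are kept
  vagrant status    # confirm the machine is reported as powered off
else
  echo "vagrant not found: nothing to shut down from this shell" >&2
fi
```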