Start All Daemons

start-all.sh 	# Deprecated
start-dfs.sh 	# NameNode, DataNode, Secondary NameNode
start-yarn.sh 	# ResourceManager, NodeManager

Java Process Status

Check whether all daemons are running.
jps lists each running Java process by name along with its process ID.

jps 
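
A minimal sketch of automating that check. The daemon list assumes a single-node setup started with start-dfs.sh and start-yarn.sh, and missing_daemons is a hypothetical helper name, not part of Hadoop.

```shell
# Expected daemon names for a single-node setup (an assumption; adjust for your cluster).
EXPECTED="NameNode DataNode SecondaryNameNode ResourceManager NodeManager"

# Reads jps-style lines ("<pid> <name>") on stdin and prints each expected daemon that is absent.
missing_daemons() {
    running=$(awk '{print $2}')
    for d in $EXPECTED; do
        printf '%s\n' "$running" | grep -qx "$d" || echo "$d"
    done
}

# Only call jps when it is available on this machine.
if command -v jps >/dev/null 2>&1; then
    jps | missing_daemons
fi
```

No output means every expected daemon was found.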

HDFS Web Console

http://localhost:50070/ (VM)

http://localhost:9870 (WSL)
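
The two ports match the default NameNode web UI ports of Hadoop 2.x (50070) and Hadoop 3.x (9870), so the VM is presumably running a 2.x release. A small sketch that picks the port from the installed version; namenode_ui_port is a hypothetical helper, and a custom dfs.namenode.http-address setting would override these defaults.

```shell
# Maps a Hadoop version string (e.g. "3.3.6") to the default NameNode web UI port.
namenode_ui_port() {
    case "$1" in
        2.*) echo 50070 ;;   # Hadoop 2.x default
        3.*) echo 9870 ;;    # Hadoop 3.x default
        *)   echo "unknown version: $1" >&2; return 1 ;;
    esac
}

# The first line of `hadoop version` has the form "Hadoop <version>".
if command -v hadoop >/dev/null 2>&1; then
    ver=$(hadoop version | awk 'NR==1 {print $2}')
    echo "http://localhost:$(namenode_ui_port "$ver")/"
fi
```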

Make Directory

hadoop fs -mkdir /lti
hadoop fs -mkdir -p /lti/david/ 	# Create parent directories as needed
hdfs dfs -mkdir day2 	# Relative path: creates the directory under /user/<username>/

Copy from Local FS to Hadoop FS

hadoop fs -copyFromLocal <local-file> <hdfs-path>
hadoop fs -copyFromLocal -f <local-file> <hdfs-path> 	# Force: overwrite if the destination exists
hadoop fs -put <local-path> <hdfs-path>
hadoop fs -put sample.txt hdfs://localhost:9000/lti/ 	# Full URI, e.g. for HDFS on another system

Copy from Hadoop FS to Local FS

hadoop fs -copyToLocal <hdfs-file> <local-path>
hadoop fs -get <hdfs-path> <local-path>
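
A rough sketch that chains the upload and download commands into a round-trip check. The helpers run and round_trip and the DRY_RUN variable are hypothetical conveniences, not Hadoop features; with DRY_RUN=1 the commands are only printed, so the sequence can be previewed without a cluster.

```shell
# Hypothetical helper: prints the command when DRY_RUN=1, otherwise runs it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Uploads a local file to an HDFS directory, downloads it again,
# and compares the downloaded copy with the original.
round_trip() {
    local_file=$1
    hdfs_dir=$2
    copy="$local_file.copy"
    run hadoop fs -mkdir -p "$hdfs_dir" &&
    run hadoop fs -put -f "$local_file" "$hdfs_dir/" &&
    run hadoop fs -get "$hdfs_dir/$(basename "$local_file")" "$copy" &&
    run cmp "$local_file" "$copy"
}

# Example: preview the sequence with DRY_RUN=1, then call
#   round_trip sample.txt /lti/david
```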

Append Files

hadoop fs -appendToFile <local-file> <hdfs-file>

Copy (From HDFS to HDFS)

hdfs dfs -cp /user/hadoop/file1 /user/hadoop/file2

Move (From HDFS to HDFS)

hdfs dfs -mv <hdfs-src> <hdfs-dest>

List Directory

hadoop fs -ls <dir>
hadoop fs -ls -R <dir> # Recursive
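
A small sketch for working with the listing. It relies on the usual -ls column layout (permissions, replication, owner, group, size, date, time, path); total_bytes is a hypothetical helper name and /lti is just the example directory used elsewhere in these notes.

```shell
# Sums the size column (5th field) of regular files in `hadoop fs -ls` output read from stdin.
# Directory entries start with "d" and the "Found N items" header starts with "F",
# so only lines beginning with "-" (regular files) are counted.
total_bytes() {
    awk '/^-/ {sum += $5} END {print sum + 0}'
}

# Only query HDFS when the hadoop command is available.
if command -v hadoop >/dev/null 2>&1; then
    hadoop fs -ls -R /lti | total_bytes
fi
```

For a plain total, hadoop fs -du -s <dir> reports the same information directly; the awk version is mainly useful when the listing needs further filtering.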

Delete (Remove from HDFS)

hdfs dfs -rm <hdfs-file>
hdfs dfs -rm -R <hdfs-dir> 	# Recursive
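
Because a recursive delete is easy to get wrong, here is a cautious sketch that checks the path first. safe_rm is a hypothetical wrapper, not an HDFS command; it uses hdfs dfs -test -e, which exits with 0 when the path exists.

```shell
# Removes an HDFS path only if it exists; refuses an empty path or the root directory.
safe_rm() {
    target=$1
    case "$target" in
        ""|"/") echo "refusing to remove '$target'" >&2; return 1 ;;
    esac
    if hdfs dfs -test -e "$target"; then
        hdfs dfs -rm -r "$target"
    else
        echo "nothing to remove at $target"
    fi
}

# Example: safe_rm /lti/david
```

If trash is enabled on the cluster, -rm moves files to the trash rather than deleting them immediately; adding -skipTrash bypasses that.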

Change Replication Factor

hadoop fs -setrep <num> /lti/

NOTE

The default replication factor and block size can be set cluster-wide in hdfs-site.xml:

  • dfs.replication
  • dfs.blocksize
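
A sketch of what those two entries might look like. hdfs_site_properties is a hypothetical helper that only prints the XML; the values shown (2 replicas, 128 MB blocks in bytes) are example assumptions, not recommendations.

```shell
# Prints the two <property> elements for the given replication factor and block size.
# Paste the output inside the <configuration> element of hdfs-site.xml.
hdfs_site_properties() {
    cat <<EOF
<property>
  <name>dfs.replication</name>
  <value>$1</value>
</property>
<property>
  <name>dfs.blocksize</name>
  <value>$2</value>
</property>
EOF
}

# Example values (assumptions): 2 replicas, 128 MB blocks expressed in bytes.
hdfs_site_properties 2 $((128 * 1024 * 1024))
```

Changing these defaults affects files written afterwards; existing files keep their replication factor until it is changed with -setrep.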

Distributed Copy (Inter-/Intra-Cluster Copying)

hadoop distcp hdfs://nn1:8020/foo/bar hdfs://nn2:8020/bar/foo
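
A small sketch for assembling the source and target URIs. hdfs_uri is a hypothetical helper, and nn1, nn2, and port 8020 are the placeholder values from the command above. The script only prints the resulting command so it can be reviewed before running.

```shell
# Builds an HDFS URI from a NameNode host, RPC port, and absolute path.
hdfs_uri() {
    echo "hdfs://$1:$2$3"
}

src=$(hdfs_uri nn1 8020 /foo/bar)
dst=$(hdfs_uri nn2 8020 /bar/foo)

# -update copies only files that are missing or differ at the destination.
echo "hadoop distcp -update $src $dst"
```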

Apache Hadoop 2.4.1 - File System Shell Guide

Apache Hadoop Distributed Copy – DistCp Guide