Time for action – using HDFS

As the preceding example shows, HDFS offers a familiar-looking interface whose commands resemble their Unix counterparts for manipulating files and directories on the filesystem. Let's try it out by typing the following commands:

$ hadoop fs -mkdir /user
$ hadoop fs -mkdir /user/hadoop
$ hadoop fs -ls /user
Found 1 items
drwxr-xr-x - hadoop supergroup 0 2012-10-26 23:09 /user/hadoop
$ echo "This is a test." >> test.txt
$ cat test.txt
This is a test.
$ hadoop dfs -copyFromLocal test.txt .
$ hadoop dfs -ls
Found 1 items
-rw-r--r-- 1 hadoop supergroup 16 2012-10-26 23:19 /user/hadoop/test.txt
$ hadoop dfs -cat test.txt
This is a test.
$ rm test.txt 
$ hadoop dfs -cat test.txt
This is a test.
$ hadoop fs -copyToLocal test.txt
$ cat test.txt
This is a test.

What just happened?

This example shows the use of the fs subcommand of the hadoop utility (note that the dfs and fs subcommands are equivalent). Like most filesystems, HDFS has the concept of a home directory for each user. These home directories are stored under the /user directory on HDFS and, before we go further, we create our home directory if it does not already exist.

We then create a simple text file on the local filesystem, copy it to HDFS by using the -copyFromLocal command, and check its existence and contents by using the -ls and -cat commands. As can be seen, the user's home directory is aliased to . because, as in Unix, an -ls command with no path specified is assumed to refer to that location, and relative paths (those not starting with /) are resolved from there.
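The path resolution described above can be sketched as a small shell function. This is only an illustration of the rule, not how Hadoop implements it: hdfs_resolve is a hypothetical helper name, and it assumes the default layout in which home directories live under /user.

```shell
# Hypothetical helper (not part of Hadoop): shows how a path given to
# "hadoop fs" would be interpreted for a given user, assuming home
# directories live under /user as in the example above.
hdfs_resolve() {
  hr_user="$1"
  hr_path="$2"
  case "$hr_path" in
    /*)   printf '%s\n' "$hr_path" ;;                           # absolute: used as-is
    .|"") printf '/user/%s\n' "$hr_user" ;;                     # . or no path: home directory
    ./*)  printf '/user/%s/%s\n' "$hr_user" "${hr_path#./}" ;;  # explicit ./ prefix
    *)    printf '/user/%s/%s\n' "$hr_user" "$hr_path" ;;       # relative: under home
  esac
}
```

For instance, hdfs_resolve hadoop test.txt prints /user/hadoop/test.txt, which matches the path reported by -ls in the session above.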

We then delete the file from the local filesystem, confirm with -cat that the copy on HDFS is unaffected, copy the file back from HDFS by using the -copyToLocal command, and check its contents using the local cat utility.


Mixing HDFS and local filesystem commands, as in the preceding example, is a powerful combination, but it is very easy to run on HDFS a command that was intended for the local filesystem, and vice versa. So be careful, especially when deleting.
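One way to reduce the risk when deleting is to make the target filesystem explicit before anything is removed. The following is a minimal sketch under stated assumptions: hdfs_rm, HDFS_USER, and DRY_RUN are hypothetical names introduced here for illustration, and home directories are again assumed to live under /user. By default the function only reports what it would delete.

```shell
# Hypothetical wrapper around "hadoop fs -rm" (illustrative only).
# HDFS_USER: user whose HDFS home is assumed; defaults to the local login name.
# DRY_RUN:   1 (the default) only prints the target; 0 performs the delete.
hdfs_rm() {
  hrm_target="$1"
  case "$hrm_target" in
    /*) hrm_full="$hrm_target" ;;                                   # absolute HDFS path
    *)  hrm_full="/user/${HDFS_USER:-$(id -un)}/$hrm_target" ;;     # relative to HDFS home
  esac
  echo "HDFS delete: $hrm_full"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    return 0          # dry run: nothing is removed
  fi
  hadoop fs -rm "$hrm_full"
}
```

Running hdfs_rm test.txt first with the default settings shows the full HDFS path that would be affected; setting DRY_RUN=0 then performs the deletion on HDFS, leaving any local file of the same name untouched.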

There are other HDFS manipulation commands; try hadoop fs -help for a detailed list.