How to refer files in HDInsight – Azure Storage using different ways

If you have started working with Big Data, you surely need to check the Microsoft support on it via Azure platform – HDInsight service. HDInsight allows you to create a Hadoop environment within few minutes and it can be anytime scaled out or in based on your requirements. I have written few posts on this, you can have a look on them using following links;
In order to work with data loaded to HDInsight, or Hadoop, data files have to be refereed using supported syntax. There are multiple ways for referring files in the storage with HDFS. Here are the ways;

Fully qualified path with wasb(s) protocol

This is most accurate and correct way of referring files in the storage. Here is the pattern;
wasb://container_name@storage_account_name.blob.core.windows.net/file_path.
Here is an example using Putty, connecting with HDInsight and reading a file (processed with Hive) exist. My container name is dinesqlashdinsight and storage name is dinesqlasstorage. File path is data/cleanlog/000000_0 (this is a Hive table in fact).

Connecting with the default container

If your files are in the default container, you can skip the container name and storage name as follow;
wasb:///file_path
Note the three slashes. It is required when you do not mentioned the container name.

Connecting using Hadoop/Linux/Unix native ways

Generally, when you work with Hadoop using Linux/Unix, you refer files without the protocol. Azure HDInsight supports the same and we can refer files using that syntax.
file_path

Do I need double quotes for my paths?

It is required when you have some odd characters like equal (=) sign with your path. See the example below. I try to read a data file exist in a the cluster and the path has equal signs, hence path is encased with double quotes.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

w

Connecting to %s