
Configuring the Linux Computer to Access the Azure Data Lake Store Using the Hadoop (HDFS) Agent

For each Azure Data Lake Store (ADLS), you must configure a set of Linux computers to access it.

For more information about ADLS support, see the Hadoop Azure Data Lake Support page at https://hadoop.apache.org/docs/current/hadoop-azure-datalake/index.html.

Prerequisites

Procedure

  1. Download and install Java 8 on the Linux computer.

    This procedure assumes the Java home path is /usr/java/default.
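
    If Java is not already installed, you can install OpenJDK 8 and verify it as follows. The package name is an assumption and varies by distribution; if your Java home differs from /usr/java/default, adjust the paths in the later steps accordingly.

    # Assumed package name for RHEL/CentOS; differs on Debian/Ubuntu
    sudo yum install -y java-1.8.0-openjdk-devel
    # Confirm the installed version reports 1.8.x
    java -version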

  2. Download and extract the Hadoop software on the computer.

    Verify that the Hadoop version is later than 3.0.0-alpha2.
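
    For example, to download Hadoop 3.0.3 from the Apache archive and extract it into the /hadoop303 directory used in the remaining steps (the mirror URL and target directory are examples; substitute your own):

    wget https://archive.apache.org/dist/hadoop/common/hadoop-3.0.3/hadoop-3.0.3.tar.gz
    mkdir -p /hadoop303
    tar -xzf hadoop-3.0.3.tar.gz -C /hadoop303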

  3. Edit the JAVA_HOME variable in the /hadoop303/hadoop-3.0.3/etc/hadoop/hadoop-env.sh file as follows:

    export JAVA_HOME=/usr/java/default

  4. Edit the HADOOP_CLASSPATH variable in the /hadoop303/hadoop-3.0.3/etc/hadoop/hadoop-env.sh file as follows:

    export HADOOP_CLASSPATH=${HADOOP_HOME}/share/hadoop/tools/lib/*
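
    This setting places the ADLS connector on the Hadoop classpath. You can confirm that the connector jar is present in the tools directory; for Hadoop 3.0.3 the listing should include hadoop-azure-datalake-3.0.3.jar:

    ls /hadoop303/hadoop-3.0.3/share/hadoop/tools/lib/ | grep datalake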

  5. Add the following properties to the /hadoop303/hadoop-3.0.3/etc/hadoop/core-site.xml file, substituting the values with the ADLS parameters that you obtained earlier:

    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>Azure_ADL_URI</value>
      </property>
      <property>
        <name>dfs.adls.oauth2.access.token.provider.type</name>
        <value>ClientCredential</value>
      </property>
      <property>
        <name>dfs.adls.oauth2.refresh.url</name>
        <value>OAuth_2.0_token_endpoint</value>
      </property>
      <property>
        <name>dfs.adls.oauth2.client.id</name>
        <value>Application_ID_or_Client_ID</value>
      </property>
      <property>
        <name>dfs.adls.oauth2.credential</name>
        <value>Application_or_Authentication_Key</value>
      </property>
      <property>
        <name>fs.adl.impl</name>
        <value>org.apache.hadoop.fs.adl.AdlFileSystem</value>
      </property>
      <property>
        <name>fs.AbstractFileSystem.adl.impl</name>
        <value>org.apache.hadoop.fs.adl.Adl</value>
      </property>
    </configuration>
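
    As a reference for the substitutions, the values generally take the following forms (the store name and tenant ID shown are hypothetical):

    Azure_ADL_URI                      adl://<your_store_name>.azuredatalakestore.net
    OAuth_2.0_token_endpoint           https://login.microsoftonline.com/<tenant_id>/oauth2/token
    Application_ID_or_Client_ID        the Application (client) ID of your Azure AD app registration
    Application_or_Authentication_Key  the key generated for that application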

  6. Add the Hadoop bin and sbin directories to the PATH environment variable in your profile or .bashrc file:

    export HADOOP_HOME=/hadoop303/hadoop-3.0.3
    export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
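
    After reloading your profile, confirm that the Hadoop binaries resolve:

    source ~/.bashrc
    which hdfs
    hadoop version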

  7. Verify that you can access the ADLS by running the following HDFS command; the output should list the contents of the store root, similar to this example:

    hdfs dfs -ls /
    Found 3 items
    drwxrwx---+ - 4902e6ff-70a5-494a-bd19-2edc6966ae64 4902e6ff-70a5-494a-bd19-2edc6966ae64 0 2017-12-15 07:04 /cluster
    drwxrwxr-x+ - 01863d38-90b3-4c6e-aed7-a2049987545b 4902e6ff-70a5-494a-bd19-2edc6966ae64 0 2018-06-13 23:50 /my_data
    drwxrwx---+ - 01863d38-90b3-4c6e-aed7-a2049987545b 4902e6ff-70a5-494a-bd19-2edc6966ae64 0 2018-06-13 23:57 /restore
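
    If the listing succeeds, you can optionally confirm write access with a quick round trip (the directory name is an example):

    hdfs dfs -mkdir /adls_test
    hdfs dfs -ls /
    hdfs dfs -rm -r /adls_test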
