Pseudo-Distributed HDFS Deployment & Common Hadoop Commands

Goals

  1. Install a pseudo-distributed HDFS deployment of Hadoop
  2. Common hadoop fs commands
  3. Where to find the configuration reference in the official docs
  4. Notes on the JDK, SSH, and the hosts file

1. Installing a Pseudo-Distributed HDFS Deployment of Hadoop

  1. Create the user and directories

    [root@aliyun ~]# useradd hadoop
    [root@aliyun ~]# su - hadoop
    [hadoop@aliyun ~]$ mkdir app software sourcecode log tmp data lib
    [hadoop@aliyun ~]$ ll
    total 28
    drwxrwxr-x 2 hadoop hadoop 4096 Nov 28 11:26 app    # extracted software / symlinks
    drwxrwxr-x 2 hadoop hadoop 4096 Nov 28 11:26 data   # data
    drwxrwxr-x 2 hadoop hadoop 4096 Nov 28 11:26 lib    # third-party jars
    drwxrwxr-x 2 hadoop hadoop 4096 Nov 28 11:26 log    # log files
    drwxrwxr-x 2 hadoop hadoop 4096 Nov 28 11:26 software # tarballs
    drwxrwxr-x 2 hadoop hadoop 4096 Nov 28 11:26 sourcecode  # source builds
    drwxrwxr-x 2 hadoop hadoop 4096 Nov 28 11:26 tmp    # temporary files
  2. Download/upload the tarball

    [hadoop@aliyun ~]$ cd software/
    [hadoop@aliyun software]$ wget http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.16.2.tar.gz
  3. Extract

    [hadoop@aliyun software]$ tar -xzvf hadoop-2.6.0-cdh5.16.2.tar.gz -C ../app/
    ...
    ...
    ...
    [hadoop@aliyun software]$ cd ../app/
    [hadoop@aliyun app]$ ln -s hadoop-2.6.0-cdh5.16.2/ hadoop
    [hadoop@aliyun app]$ ll
    total 4
    lrwxrwxrwx 1 hadoop hadoop 23 Nov 28 11:36 hadoop -> hadoop-2.6.0-cdh5.16.2/
    drwxr-xr-x 14 hadoop hadoop 4096 Jun 3 19:11 hadoop-2.6.0-cdh5.16.2
  4. Environment requirements (install the JDK)

    [root@aliyun java]# mkdir /usr/java
    [root@aliyun java]# cd /usr/java
    [root@aliyun java]# rz -E
    [root@aliyun java]# tar -xzvf jdk-8u144-linux-x64.tar.gz
    [root@aliyun java]# chown -R root:root jdk1.8.0_144/
    [root@aliyun java]# ln -s jdk1.8.0_144/ jdk
    [root@aliyun java]# ll
    total 4
    lrwxrwxrwx 1 root root 13 Nov 28 12:01 jdk -> jdk1.8.0_144/
    drwxr-xr-x 8 root root 4096 Jul 22 2017 jdk1.8.0_144
    [root@aliyun java]# vim /etc/profile
    #env
    export JAVA_HOME=/usr/java/jdk
    export PATH=$JAVA_HOME/bin:$PATH
    [root@aliyun java]# source /etc/profile
    [root@aliyun java]# which java
    /usr/java/jdk/bin/java
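    Putting $JAVA_HOME/bin in front of the existing $PATH matters: it lets the new JDK shadow any java already installed on the system. A quick sanity check of the ordering (a sketch, using the paths from this tutorial):

```shell
# Reproduce the two profile lines and confirm the new JDK's bin
# directory is the first entry searched (paths from this tutorial).
export JAVA_HOME=/usr/java/jdk
export PATH=$JAVA_HOME/bin:$PATH
echo "$PATH" | cut -d: -f1   # first PATH entry: /usr/java/jdk/bin
```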
  5. Configure JAVA_HOME explicitly (hadoop-env.sh), and check the hosts file

    [hadoop@aliyun hadoop]$ vi hadoop-env.sh
    export JAVA_HOME=/usr/java/jdk
    [root@aliyun java]# cat /etc/hosts
    127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
    ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6

    172.16.39.48 aliyun
  6. Configuration files

    etc/hadoop/core-site.xml:
    <configuration>
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://aliyun:9000</value>
        </property>
    </configuration>

    etc/hadoop/hdfs-site.xml:
    <configuration>
        <property>
            <name>dfs.replication</name>
            <value>1</value>
        </property>
    </configuration>
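    One optional addition worth knowing about (an assumption, not part of the original steps): hadoop.tmp.dir defaults to /tmp/hadoop-${user.name}, and /tmp may be cleared on reboot, taking the NameNode metadata with it. Pointing it at the tmp directory created in step 1 avoids that:

```xml
<!-- optional, in core-site.xml: keep HDFS data out of /tmp
     (the path assumes the directory layout from step 1) -->
<property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/tmp</value>
</property>
```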
  7. Passwordless SSH trust

    Run in the home directory:
    $ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
    $ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    $ chmod 0600 ~/.ssh/authorized_keys
    [hadoop@aliyun ~]$ ssh aliyun date
    Thu Nov 28 12:15:08 CST 2019
  8. Hadoop environment variables

    [hadoop@aliyun ~]$ vi .bashrc
    export HADOOP_HOME=/home/hadoop/app/hadoop
    export PATH=${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:$PATH
    [hadoop@aliyun ~]$ source .bashrc
    [hadoop@aliyun ~]$ which hadoop
    ~/app/hadoop/bin/hadoop
  9. Format the NameNode

    [hadoop@aliyun ~]$ hdfs namenode -format
    has been successfully formatted.
  10. First startup

    [hadoop@aliyun ~]$ start-dfs.sh 
    [hadoop@aliyun ~]$ jps
    10804 SecondaryNameNode
    10536 NameNode
    10907 Jps
    10654 DataNode
    [hadoop@aliyun ~]$

    Pitfall: the first startup asks you to type yes to confirm the trust relationship. Open the known_hosts file under ~/.ssh; this file stores the trusted host keys.

    [hadoop@aliyun .ssh]$ cat known_hosts
    aliyun,172.16.39.48 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBCjHBKn/7LF5sfbae1OLkK5QoWm11Xn8RZs1JTc7K8v4RFum1OKIjArocvRjLOYPsq5ezYo8TlBHTrAgeUcvkBM=
    localhost ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBCjHBKn/7LF5sfbae1OLkK5QoWm11Xn8RZs1JTc7K8v4RFum1OKIjArocvRjLOYPsq5ezYo8TlBHTrAgeUcvkBM=
    0.0.0.0 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBCjHBKn/7LF5sfbae1OLkK5QoWm11Xn8RZs1JTc7K8v4RFum1OKIjArocvRjLOYPsq5ezYo8TlBHTrAgeUcvkBM=

    If starting Hadoop later keeps prompting for a password, the likely cause is that a trust entry for the host already exists here but the key pair has since been regenerated; deleting this file (or just the stale entries in it) fixes the prompt.
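    Rather than deleting the whole file, ssh-keygen -R removes just the stale host's entry. A self-contained sketch in a temporary directory (the aliyun hostname is the one used throughout this tutorial):

```shell
# Demo: ssh-keygen -R drops only the matching host's line from a
# known_hosts file and keeps a .old backup of the original.
d=$(mktemp -d)
ssh-keygen -q -t ed25519 -N '' -f "$d/id"          # throwaway key, demo only
key=$(cut -d' ' -f1-2 "$d/id.pub")
printf 'aliyun %s\nlocalhost %s\n' "$key" "$key" > "$d/known_hosts"
ssh-keygen -R aliyun -f "$d/known_hosts"           # remove the stale aliyun entry
grep localhost "$d/known_hosts"                    # the other entry survives
```

    On the real machine the equivalent is `ssh-keygen -R aliyun` (which edits ~/.ssh/known_hosts by default); the next `ssh aliyun` then asks for yes once and records the new key.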

  11. Start NN, DN, and 2NN all as aliyun

  • NN: controlled by fs.defaultFS in core-site.xml

  • DN: the slaves file

  • 2NN: hdfs-site.xml

    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>aliyun:50090</value> <!-- note the port; it differs between old and new versions -->
    </property>
    <property>
        <name>dfs.namenode.secondary.https-address</name>
        <value>aliyun:50091</value> <!-- note the port; it differs between old and new versions -->
    </property>
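    The slaves file mentioned above lives in etc/hadoop/ of the Hadoop install and lists one DataNode hostname per line (it defaults to localhost); for this single-node setup it holds just the one host:

```
etc/hadoop/slaves:
aliyun
```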

2. Common hadoop fs Commands

hadoop fs -mkdir <path>           # create a directory
hadoop fs -put <localsrc> <dst>   # upload from the local filesystem
hadoop fs -get <src> <localdst>   # download to the local filesystem
hadoop fs -cat <src>              # print a file's contents
hadoop fs -rm [-r] <path>         # delete a file (a directory with -r)
hadoop fs -ls <path>              # list a directory

3. Where to Find the Configuration in the Official Docs

https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/SingleCluster.html#Pseudo-Distributed_Operation

4. Notes on the JDK, SSH, and the hosts File

The JDK and SSH are prerequisites for running Hadoop.

The hosts file stores the mapping between hostnames and IP addresses.

Author: Tunan
Link: http://yerias.github.io/2018/10/04/hadoop/1/
Copyright Notice: All articles in this blog are licensed under CC BY-NC-SA 4.0 unless stated otherwise.