Goals
- Install a pseudo-distributed HDFS deployment of Hadoop
- Common hadoop fs commands
- Where to find the configuration files in the official docs
- Review: JDK, SSH, and the hosts file
1. Install a pseudo-distributed HDFS deployment of Hadoop
Create the user and directories
[root@aliyun ~]# useradd hadoop
[root@aliyun ~]# su - hadoop
[hadoop@aliyun ~]$ mkdir app software sourcecode log tmp data lib
[hadoop@aliyun ~]$ ll
total 28
drwxrwxr-x 2 hadoop hadoop 4096 Nov 28 11:26 app        # extracted installs (symlinked)
drwxrwxr-x 2 hadoop hadoop 4096 Nov 28 11:26 data       # data
drwxrwxr-x 2 hadoop hadoop 4096 Nov 28 11:26 lib        # third-party jars
drwxrwxr-x 2 hadoop hadoop 4096 Nov 28 11:26 log        # log files
drwxrwxr-x 2 hadoop hadoop 4096 Nov 28 11:26 software   # tarballs
drwxrwxr-x 2 hadoop hadoop 4096 Nov 28 11:26 sourcecode # source code for builds
drwxrwxr-x 2 hadoop hadoop 4096 Nov 28 11:26 tmp        # temporary files (downloads/uploads)
[hadoop@aliyun ~]$ cd software/
[hadoop@aliyun software]$ wget http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.16.2.tar.gz
Extract the tarball
[hadoop@aliyun software]$ tar -xzvf hadoop-2.6.0-cdh5.16.2.tar.gz -C ../app/
...
...
...
[hadoop@aliyun software]$ cd ../app/
[hadoop@aliyun app]$ ln -s hadoop-2.6.0-cdh5.16.2/ hadoop
[hadoop@aliyun app]$ ll
total 4
lrwxrwxrwx 1 hadoop hadoop 23 Nov 28 11:36 hadoop -> hadoop-2.6.0-cdh5.16.2/
drwxr-xr-x 14 hadoop hadoop 4096 Jun 3 19:11 hadoop-2.6.0-cdh5.16.2
Environment requirement: install the JDK
[root@aliyun java]# mkdir /usr/java
[root@aliyun java]# cd /usr/java
[root@aliyun java]# rz -E   # upload the JDK tarball via lrzsz
[root@aliyun java]# tar -xzvf jdk-8u144-linux-x64.tar.gz
[root@aliyun java]# chown -R root:root jdk1.8.0_144/
[root@aliyun java]# ln -s jdk1.8.0_144/ jdk
[root@aliyun java]# ll
total 4
lrwxrwxrwx 1 root root 13 Nov 28 12:01 jdk -> jdk1.8.0_144/
drwxr-xr-x 8 root root 4096 Jul 22 2017 jdk1.8.0_144
[root@aliyun java]# vim /etc/profile
#env
export JAVA_HOME=/usr/java/jdk
export PATH=$JAVA_HOME/bin:$PATH
[root@aliyun java]# source /etc/profile
[root@aliyun java]# which java
/usr/java/jdk/bin/java
Set JAVA_HOME explicitly for Hadoop
[hadoop@aliyun hadoop]$ vi hadoop-env.sh
export JAVA_HOME=/usr/java/jdk
[root@aliyun java]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
172.16.39.48 aliyun
Configuration files
etc/hadoop/core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://aliyun:9000</value>
</property>
</configuration>
etc/hadoop/hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
Passwordless SSH trust
In the home directory, run:
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys
[hadoop@aliyun ~]$ ssh aliyun date
Thu Nov 28 12:15:08 CST 2019
Hadoop environment variables
[hadoop@aliyun ~]$ vi .bashrc
export HADOOP_HOME=/home/hadoop/app/hadoop
export PATH=${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:$PATH
[hadoop@aliyun ~]$ source .bashrc
[hadoop@aliyun ~]$ which hadoop
~/app/hadoop/bin/hadoop
Format the NameNode
[hadoop@aliyun ~]$ hdfs namenode -format
has been successfully formatted.
First startup
[hadoop@aliyun ~]$ start-dfs.sh
[hadoop@aliyun ~]$ jps
10804 SecondaryNameNode
10536 NameNode
10907 Jps
10654 DataNode
[hadoop@aliyun ~]$
Pitfall: the first startup asks you to type "yes" to confirm the trust relationship. Open ~/.ssh/known_hosts — this file stores the accepted host keys.
[hadoop@aliyun .ssh]$ cat known_hosts
aliyun,172.16.39.48 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBCjHBKn/7LF5sfbae1OLkK5QoWm11Xn8RZs1JTc7K8v4RFum1OKIjArocvRjLOYPsq5ezYo8TlBHTrAgeUcvkBM=
localhost ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBCjHBKn/7LF5sfbae1OLkK5QoWm11Xn8RZs1JTc7K8v4RFum1OKIjArocvRjLOYPsq5ezYo8TlBHTrAgeUcvkBM=
0.0.0.0 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBCjHBKn/7LF5sfbae1OLkK5QoWm11Xn8RZs1JTc7K8v4RFum1OKIjArocvRjLOYPsq5ezYo8TlBHTrAgeUcvkBM=
If starting Hadoop later keeps asking for a password, it is usually because known_hosts still holds an entry for the host while the key pair has since been regenerated; delete the file (or just the stale entries) and re-accept.
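A stale entry can also be removed for a single host without deleting the whole file, using ssh-keygen's built-in option (hostname aliyun is this tutorial's example):

```shell
# Remove only the entries for host "aliyun"; ssh-keygen backs the
# file up to known_hosts.old before rewriting it.
ssh-keygen -R aliyun -f ~/.ssh/known_hosts
# The next "ssh aliyun" will prompt once to accept the current host key.
```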
The DataNode and SecondaryNameNode both start on aliyun. Where each daemon's host is decided:
NN: fs.defaultFS in core-site.xml
DN: the slaves file
2NN: hdfs-site.xml
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>aliyun:50090</value> # note: the port number differs between Hadoop versions
</property>
<property>
<name>dfs.namenode.secondary.https-address</name>
<value>aliyun:50091</value> # note: the port number differs between Hadoop versions
</property>
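For this single-node setup, the slaves file (etc/hadoop/slaves under the Hadoop install; not shown in the transcript above, so assumed here) would hold just the one hostname, which is why the DataNode also starts on aliyun:

```
aliyun
```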
2. Common hadoop fs commands
hadoop fs -mkdir /dir            # create a directory
hadoop fs -ls /                  # list a directory
hadoop fs -put local.txt /dir    # upload a local file
hadoop fs -get /dir/local.txt .  # download a file
hadoop fs -cat /dir/local.txt    # print a file's contents
hadoop fs -rm -r /dir            # delete recursively
3. Where to find the configuration files in the official docs
The official Hadoop documentation lists every default value in the core-default.xml, hdfs-default.xml, mapred-default.xml, and yarn-default.xml reference pages (linked in the docs sidebar); the *-site.xml files under etc/hadoop/ override those defaults.
4. Review: JDK, SSH, and the hosts file
The JDK and SSH are prerequisites for running Hadoop.
The hosts file stores hostname-to-IP mappings.
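To confirm the mapping actually resolves (hostname aliyun is this tutorial's example), query the system resolver directly:

```shell
# getent consults /etc/hosts (and DNS, per nsswitch.conf) — the same
# resolution path the Hadoop daemons use for fs.defaultFS and slaves.
getent hosts aliyun
# With the /etc/hosts entry shown earlier, this prints: 172.16.39.48   aliyun
```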