MapJoin: the file is on HDFS, yet IDEA reports: File does not exist: /xxx/yyy.txt#yyy.txt
Exception in thread "main" java.io.FileNotFoundException: File does not exist: /data/dept.txt#dept.txt

First, check on HDFS whether the file actually exists. It does not, so put it up and run again.

Exception in thread "main" org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://192.168.91.10:9000/data/emp.txt

Again a path cannot be found. Check HDFS for this file as well; it does not exist either, so put it up and run again.

INFO DFSClient: Could not obtain BP-1292531802-192.168.181.10-1583457649867:blk_1073746058_5236 from any node:  No live nodes contain current block Block locations: DatanodeInfoWithStorage[192.168.91.10:50010,DS-f8088840-5065-4320-8b51-4563ccec125a,DISK] Dead nodes:  DatanodeInfoWithStorage[192.168.91.10:50010,DS-f8088840-5065-4320-8b51-4563ccec125a,DISK]. Will get new block locations from namenode and retry...
WARN DFSClient: DFS chooseDataNode: got # 2 IOException, will wait for 3263.5374223592803 msec.
WARN BlockReaderFactory: I/O error constructing remote block reader.
java.net.ConnectException: Connection timed out: no further information

Still failing: the client cannot reach the DataNode holding the block. Running jps on the server shows the processes have not died, so I suspected my local hosts file was missing the hostname-to-IP mapping for the Hadoop node. After adding the mapping, I ran it again.

java.lang.Exception: java.io.FileNotFoundException: dept.txt (系统找不到指定的文件。)
Caused by: java.io.FileNotFoundException: dept.txt (系统找不到指定的文件。)

As expected, it failed again (the Chinese message in parentheses is Windows saying "The system cannot find the file specified"). This time there is more log output, so let's analyze the surrounding context.

WARN FileUtil: Command 'G:\hadoop-2.7.2\bin\winutils.exe symlink E:\Java\hadoop\hadoop-client\dept.txt \tmp\hadoop-Tunan\mapred\local\1585106809398\dept.txt' failed 1 with: CreateSymbolicLink error (1314): ???????????
WARN LocalDistributedCacheManager: Failed to create symlink: \tmp\hadoop-Tunan\mapred\local\1585106809398\dept.txt <- E:\Java\hadoop\hadoop-client/dept.txt

The framework creates a link (a symlink) in the local working directory pointing at the distributed-cache copy of the HDFS file, and these two warnings show that creating the symlink failed. On Windows, CreateSymbolicLink error 1314 means "A required privilege is not held by the client", so starting IDEA as administrator resolves it.


Note that besides starting IDEA as administrator, you also have to enable symlink support in the program's main method; only then can the cached file be opened with new FileInputStream(new File("dept.txt")):

FileSystem.enableSymlinks();
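Once the symlink works, the mapper's setup() can read dept.txt like an ordinary local file and build the join table in memory. A minimal sketch of that loading step, assuming dept.txt holds one tab-separated deptno/dname pair per line (the file format and the DeptCache name are illustrative assumptions):

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class DeptCache {
    // Read the symlinked cache file and build deptno -> dname for the map-side join.
    // Assumes "deptno\tdname" per line; adapt the split to your actual format.
    public static Map<String, String> load(String path) throws IOException {
        Map<String, String> deptMap = new HashMap<>();
        try (BufferedReader reader = new BufferedReader(new FileReader(path))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] fields = line.split("\t");
                if (fields.length >= 2) {
                    deptMap.put(fields[0], fields[1]);
                }
            }
        }
        return deptMap;
    }
}
```

In the mapper, this map is built once in setup(), and each emp record looks up its department name by deptno during map(), so no reduce-side join is needed.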

Also note the configuration needed to operate on HDFS from the local machine:

System.setProperty("HADOOP_USER_NAME", "hadoop");      // act as the hadoop user on HDFS
Configuration conf = new Configuration();
conf.set("fs.defaultFS", "hdfs://192.168.91.10:9000"); // NameNode address
conf.set("dfs.client.use.datanode.hostname", "true");  // reach DataNodes by hostname (needs the hosts mapping above)
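Putting the pieces together, the driver registers the HDFS file in the distributed cache using the #dept.txt fragment, which names the symlink the mapper later opens. This is only a sketch under the assumptions above; the paths, the cluster address, and the MapJoinDriver class name are illustrative, and the mapper/output-type wiring is omitted:

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MapJoinDriver {
    public static void main(String[] args) throws Exception {
        System.setProperty("HADOOP_USER_NAME", "hadoop");
        FileSystem.enableSymlinks();                       // must run before the symlink is used

        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://192.168.91.10:9000");
        conf.set("dfs.client.use.datanode.hostname", "true");

        Job job = Job.getInstance(conf);
        job.setJarByClass(MapJoinDriver.class);
        // "#dept.txt" names the local symlink the mapper reads with FileInputStream
        job.addCacheFile(new URI("/data/dept.txt#dept.txt"));
        // job.setMapperClass(...) and output key/value types omitted for brevity

        FileInputFormat.setInputPaths(job, new Path("/data/emp.txt"));
        FileOutputFormat.setOutputPath(job, new Path("/output/mapjoin"));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```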
Author: Tunan
Link: http://yerias.github.io/2020/03/25/error/2/
Copyright Notice: All articles in this blog are licensed under CC BY-NC-SA 4.0 unless stated otherwise.