Today, while uploading a file to Hadoop through the Java API, I ran into this error:

java.io.EOFException: Unexpected EOF while trying to read response from server

The full log looks like this:

2021-03-02 11:22:37,367 INFO (DataStreamer.java:1791)- Exception in createBlockOutputStream blk_1073741842_1018
java.io.EOFException: Unexpected EOF while trying to read response from server
	at org.apache.hadoop.hdfs.protocolPB.PBHelperClient.vintPrefixed(PBHelperClient.java:551)
	at org.apache.hadoop.hdfs.DataStreamer.createBlockOutputStream(DataStreamer.java:1763)
	at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1680)
	at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:715)

This one was puzzling. My deployment environment:

  • Host: CentOS 7.4, 64-bit
  • Hadoop deployed in a Docker container on that CentOS host
  • Java: jdk-8u241-linux-x64
  • Hadoop: 3.2.2, pseudo-distributed deployment

A Hadoop instance deployed in a local VM worked fine: browsing, uploading, and downloading files were all normal. The instance deployed on Alibaba Cloud, however, could only create and list directories; every upload or download failed with:

java.io.EOFException: Unexpected EOF while trying to read response from server

First, I checked whether Hadoop itself was reporting any errors and whether the DataNode and NameNode were communicating normally:

[grapetec@hadoop hadoop-3.2.2]$ hdfs dfsadmin -report
Configured Capacity: 63277363200 (58.93 GB)
Present Capacity: 10731831296 (9.99 GB)
DFS Remaining: 10731794432 (9.99 GB)
DFS Used: 36864 (36 KB)
DFS Used%: 0.00%
Replicated Blocks:
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
Low redundancy blocks with highest priority to recover: 0
Pending deletion blocks: 0
Erasure Coded Block Groups:
Low redundancy block groups: 0
Block groups with corrupt internal blocks: 0
Missing block groups: 0
Low redundancy blocks with highest priority to recover: 0
Pending deletion blocks: 0

-------------------------------------------------
Live datanodes (1):

Name: 172.25.0.2:9866 (hadoop)
Hostname: hadoop
Decommission Status : Normal
Configured Capacity: 63277363200 (58.93 GB)
DFS Used: 36864 (36 KB)
Non DFS Used: 49307639808 (45.92 GB)
DFS Remaining: 10731794432 (9.99 GB)
DFS Used%: 0.00%
DFS Remaining%: 16.96%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Tue Mar 02 03:29:04 UTC 2021
Last Block Report: Tue Mar 02 02:36:15 UTC 2021
Num of Blocks: 0

Everything looked fine. Searching around online turned up nothing reliable either.

At first I assumed it was a network problem, a connection timeout of some kind. So I deployed a client on another cloud server to test from there, and that pinpointed the problem.
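When narrowing this down from the client side, the key question is whether the DataNode address advertised by the NameNode (172.25.0.2:9866 in the dfsadmin report above) is even resolvable and reachable from the client machine. A minimal sketch using only the Python standard library — the hostname and port below are the ones from my setup, substitute your own:

```python
import socket

def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection also performs name resolution, so an
        # unresolvable hostname fails here too (socket.gaierror is OSError).
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# 9866 is the DataNode data-transfer port shown in the dfsadmin report;
# 'hadoop' is the hostname the NameNode advertises for the DataNode.
for host, port in [("hadoop", 9866)]:
    status = "reachable" if can_connect(host, port) else "unreachable"
    print(f"{host}:{port} {status}")
```

If the hostname does not resolve, or the port is reachable from inside the Docker network but not from the remote client, you are looking at exactly the situation below.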

The final fix: edit the hosts file on the client and add a mapping for the Hadoop node. The NameNode hands the client the DataNode's address by hostname (`hadoop` here); if the client machine cannot resolve that hostname, the block write pipeline dies with the EOF error above.

# 'hadoop' is the hostname configured in the slaves file of the pseudo-distributed deployment
xx.xx.xx.xx hadoop
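Related to this, Hadoop has a client-side setting aimed at exactly this Docker/NAT situation: `dfs.client.use.datanode.hostname`. When set to true, the client connects to DataNodes by the hostname the NameNode reports rather than by the (possibly container-internal) IP address. A fragment for the client's hdfs-site.xml, worth trying if the hosts mapping alone does not help:

```xml
<property>
  <name>dfs.client.use.datanode.hostname</name>
  <value>true</value>
  <description>Connect to DataNodes by hostname instead of the IP address
  the NameNode reports, so the hosts-file mapping above takes effect.</description>
</property>
```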

That's all.