Preparation
Hadoop cluster
Using the Hadoop environment set up in the previous post, start the following Hadoop services:
./sbin/start-dfs.sh
./sbin/start-yarn.sh
./sbin/mr-jobhistory-daemon.sh start historyserver
Use jps to check that the services started. Note that start-all.sh is deprecated and no longer recommended.
[root@hadoop01 hadoop-2.6.0]# jps
1941 JobHistoryServer
1665 ResourceManager
1355 NameNode
1977 Jps
1497 SecondaryNameNode
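When the check is repeated often, the comparison of the jps listing against the expected daemons can be automated. The following is only a minimal sketch: the class name JpsCheck and the list of expected daemons are assumptions taken from the listing above, and a different cluster layout would need a different list.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class JpsCheck {
    // Daemons expected on the master node in this walkthrough (an assumption
    // based on the jps listing above; adjust for your own cluster layout).
    static final List<String> EXPECTED = Arrays.asList(
            "NameNode", "SecondaryNameNode", "ResourceManager", "JobHistoryServer");

    // Returns the expected daemon names that do not appear in the jps output.
    static Set<String> missing(String jpsOutput) {
        Set<String> running = new LinkedHashSet<>();
        for (String line : jpsOutput.split("\\R")) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length == 2) {          // each line looks like "<pid> <main class>"
                running.add(parts[1]);
            }
        }
        Set<String> missing = new LinkedHashSet<>(EXPECTED);
        missing.removeAll(running);
        return missing;
    }

    public static void main(String[] args) {
        // Sample text copied from the listing above; in practice, paste the real jps output.
        String sample = "1941 JobHistoryServer\n1665 ResourceManager\n1355 NameNode\n"
                + "1977 Jps\n1497 SecondaryNameNode\n";
        System.out.println("Missing daemons: " + missing(sample));
    }
}
```

An empty result means every daemon in the assumed list was found in the jps output.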
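IDEA + Maven
Install IDEA and configure Maven.
Windows system account
(The same as the account that runs Hadoop on Linux, e.g. root.)
On Windows, create a new account whose user name is root (it must match the account that runs Hadoop, for example root or hadoop).
Once the account has been created, simply log out of the session opened with that account; you do not need to work under it.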
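Setup
Hadoop
If access is denied while you are debugging HDFS functionality, and you are working in a test environment, try the following:
1. Calls to HDFS do not have to be made under the same user name that runs Hadoop, provided permission checking is turned off by setting dfs.permissions to false in hdfs-site.xml: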
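<configuration>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>hadoop01:9001</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/hadoop-2.6.0/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/hadoop-2.6.0/dfs/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.permissions</name>
        <value>false</value>
    </property>
</configuration>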
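Before restarting HDFS it can help to confirm that the edited file really carries the intended value. The sketch below is one way to do that with the JDK's built-in XML parser; the class name HdfsSiteCheck is hypothetical, and the inline XML string stands in for the contents of the real hdfs-site.xml.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class HdfsSiteCheck {
    // Reads Hadoop-style <property><name>..</name><value>..</value></property>
    // entries into a map, keeping the order in which they appear.
    static Map<String, String> parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> props = new LinkedHashMap<>();
            NodeList nodes = doc.getElementsByTagName("property");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element p = (Element) nodes.item(i);
                String name = p.getElementsByTagName("name").item(0).getTextContent().trim();
                String value = p.getElementsByTagName("value").item(0).getTextContent().trim();
                props.put(name, value);
            }
            return props;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse configuration XML", e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the file contents; only two of the properties are shown here.
        String xml = "<configuration>"
                + "<property><name>dfs.replication</name><value>2</value></property>"
                + "<property><name>dfs.permissions</name><value>false</value></property>"
                + "</configuration>";
        Map<String, String> props = parse(xml);
        System.out.println("dfs.permissions = " + props.get("dfs.permissions"));
    }
}
```

This only inspects the text of the file; the NameNode picks up the setting after HDFS is restarted.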
IDEA project
Create a new Maven project named hadoop.
1. POM dependencies
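<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>2.6.0</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs</artifactId>
    <version>2.6.0</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.6.0</version>
</dependency>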
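2. Create a test class named Test
The input and output files were already loaded into HDFS when wordcount was run on Hadoop earlier; here the HDFS API is used to inspect and debug them.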
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

import java.io.InputStream;
import java.net.URI;

/**
 * Created with j360 -> me.h360.hdfs.
 * User: min_xu
 * Date: 2015/4/14
 * Time: 9:05
 * Description: checks the state of files in HDFS
 */
public class Test {
    public static void main(String[] args) throws Exception {
        // Address of the HDFS NameNode
        String uri = "hdfs://192.168.145.128:9000/";
        Configuration config = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(uri), config);

        // List all files and directories under /tmp/input on HDFS
        FileStatus[] statuses = fs.listStatus(new Path("/tmp/input"));
        for (FileStatus status : statuses) {
            System.out.println(status);
        }

        // Create a file under /tmp/input on HDFS and write one line of text to it
        FSDataOutputStream os = fs.create(new Path("/tmp/input/test.log"));
        os.write("Hello World!".getBytes());
        os.flush();
        os.close();

        // Print the contents of the file just written under /tmp/input
        InputStream is = fs.open(new Path("/tmp/input/test.log"));
        IOUtils.copyBytes(is, System.out, 1024, true);
    }
}
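The last call, IOUtils.copyBytes(is, System.out, 1024, true), copies the stream through a 1024-byte buffer and, because the final argument is true, closes both streams when it finishes. The sketch below illustrates that behaviour with plain JDK streams so it can run without a cluster. It is an approximation written for this post, not Hadoop's source, and the class name CopySketch is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class CopySketch {
    // Copies everything from in to out through a buffer of bufferSize bytes.
    // When close is true, both streams are closed after the copy.
    static void copyBytes(InputStream in, OutputStream out, int bufferSize, boolean close) {
        try {
            byte[] buffer = new byte[bufferSize];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);   // write only the bytes actually read
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (close) {
                try {
                    out.close();
                    in.close();
                } catch (IOException ignored) {
                    // nothing useful can be done if closing fails in this sketch
                }
            }
        }
    }

    public static void main(String[] args) {
        InputStream in = new ByteArrayInputStream(
                "Hello World!".getBytes(StandardCharsets.UTF_8));
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        copyBytes(in, sink, 1024, true);
        System.out.println(new String(sink.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

Since the flag closes the destination as well, passing System.out together with true closes standard output. That is harmless in the Test class because the program exits immediately afterwards, but anything printed after that call would be lost.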
Debugging
Run the main method:
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
FileStatus{path=hdfs://192.168.145.128:9000/tmp/input/f1; isDirectory=false; length=20; replication=2; blocksize=134217728; modification_time=1428671368587; access_time=1428998938744; owner=root; group=supergroup; permission=rw-r--r--; isSymlink=false}
FileStatus{path=hdfs://192.168.145.128:9000/tmp/input/f2; isDirectory=false; length=25; replication=2; blocksize=134217728; modification_time=1428671368663; access_time=1428998938711; owner=root; group=supergroup; permission=rw-r--r--; isSymlink=false}
FileStatus{path=hdfs://192.168.145.128:9000/tmp/input/test.log; isDirectory=false; length=12; replication=3; blocksize=134217728; modification_time=1428991073630; access_time=1428998938072; owner=root; group=supergroup; permission=rw-r--r--; isSymlink=false}
Hello World!
Process finished with exit code 0
The output lists the files created in HDFS earlier, followed by the contents of test.log.
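Some fields in the FileStatus lines are easier to read after conversion: modification_time and access_time are milliseconds since the Unix epoch, and blocksize is in bytes. The sketch below converts the values from the first line of the output; the class name FileStatusFields and its helper methods are hypothetical and exist only for illustration.

```java
import java.time.Instant;

class FileStatusFields {
    // Converts a FileStatus timestamp (milliseconds since the Unix epoch) to a UTC instant.
    static Instant toInstant(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis);
    }

    // Converts a size in bytes to whole mebibytes (1 MiB = 1024 * 1024 bytes).
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Values copied from the FileStatus line for /tmp/input/f1 above.
        System.out.println("modified (UTC): " + toInstant(1428671368587L));
        System.out.println("block size: " + toMiB(134217728L) + " MiB");  // 128 MiB
    }
}
```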