首页 > 编程知识 正文

路由器网页日志,商务数据分析感想

时间:2023-05-05 14:01:48 阅读:109683 作者:1857

网站日志分析网站基本指标:pv(page view ):网页浏览总量

uv (unique view):独立访客数

vv (visit view) :访客的访问次数:

ip:独立的IP

2.uv(计算出每个城市的独立访客数)

也就是说,一天有多少人访问了网站? (一个人多次访问只能算一次。

代码实现:

(1) WebLogMapReduce:

mport org.Apache.Hadoop.conf.configuration; importorg.Apache.Hadoop.conf.configured; importorg.Apache.Hadoop.fs.file system; importorg.Apache.Hadoop.fs.path; importorg.Apache.Hadoop.io.text; importorg.Apache.Hadoop.MapReduce.job; importorg.Apache.Hadoop.MapReduce.lib.input.fileinputformat; importorg.Apache.Hadoop.MapReduce.lib.output.fileoutputformat; importorg.Apache.Hadoop.util.tool; importorg.Apache.Hadoop.util.tool runner; publicclassweblogmapreduce 01 extendsconfiguredimplementstool { @ overridepublicintrun (string [ ] args (throws exception )/) 设置//2.job//2.1 inputpathinputpath=new path (args [0] ); fileinputformat.setinputpaths (job,inputpath ); //2.2 map job.setmapperclass (weblog mapper 01.class ); job.setmapoutputkeyclass (text.class ); job.setoutputvalueclass (text.class; //2.3 shuffle步骤省略//2.4 reduce job.setreducerclass (weblog reducer 01.class ); job.setoutputkeyclass (text.class ); job.setoutputvalueclass (text.class; //2.5 output //如果输出目录存在,则删除filesystemhdfs=file system.get (this.getconf ); pathoutputpath=newpath(args[1]; if(HDFS.exists(outputpath ) ) HDFS.delete (output path,true ); } fileoutputformat.setoutputpath (job,outputpath ); //3 .提交布尔is success=job.wait for completion (true )的运行; return isSuccess? 0:1; } publicstaticvoidmain (string [ ] args ) throws exception (configuration configuration=new configuration ); int status=tool runner.run (configuration,new WebLogMapReduce01 ),args ); system.exit(status; (2) weblog映射器:

importorg.Apache.com mons.lang.string utils; importorg.Apache.Hadoop.io.long writable; importorg.Apache.Hadoop.io.text; importorg.Apache.Hadoop.MapReduce.mapper; import java.io.IOException; publicclassweblogmapper 01 extendsmapperlongwritable,Text,text { privatetextmapoutkey=new text () }; privatefinalstatictextmapoutvalue=new text (; @ overrideprotectedvoidmap (longwritablekey,Text value,Context context ) throws IOException,interrupted exception { stringring } items.length=36 ) if (string utils.is blank (items [1] ) ) { return; }mapoutkey.set(items[24]; //城市idmapoutvalue.set(items[5]; //用户idcontext.write(mapoutkey,mapOutValue; }else{ return; }}(3) WebLogReducer:

(减少每个城市对应的用户访问量,并将减少后的数量相加得到数字) )。

import java.io.IOException; import java.util.HashSet; import java.util.Set; publicclassweblogreducer 01 extendsreducerText,text,text { privatetextoutputvalue=new text () }; @ overrideprotectedvoidreduce (text key,IterableText values,Context context ) throws IOException,interrupted exception { seted } //输入遍历后的用户id (inta=set.size ); //重置后的用户id数output value.set (string.value of (a ) ); context.write(key,outputValue; }

版权声明:该文观点仅代表作者本人。处理文章:请发送邮件至 三1五14八八95#扣扣.com 举报,一经查实,本站将立刻删除。