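The two main difficulties in scraping data:
1. Analyzing the structure of the site
2. Writing regular expressions
Introductory version: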
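# Install and load RCurl (readLines() below is base R and does not need it)
install.packages("RCurl")
library(RCurl)

# Read the page source, one HTML line per vector element
url1 <- 'http://shenzhen.lashou.com/cate/meishi'
web <- readLines(url1, encoding = 'UTF-8')
web[1:10]   # look at the first lines to get a feel for the page structure

# Keep only the lines that contain the "goods-name" marker
goods_name <- web[grep("goods-name", web)]

# The name starts 2 characters after the first '">' and ends before the last 4 characters
goods_name2 <- substr(goods_name, regexpr("\">", goods_name) + 2, nchar(goods_name) - 4)
goods_name2

# Check the same extraction on a single line
i <- 2
goods_name[i]
substr(goods_name[i], regexpr("\">", goods_name[i]) + 2, nchar(goods_name[i]) - 4)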
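
To see what the substr/regexpr step does without hitting the live site, here is a minimal offline sketch. The HTML lines in it are made up for illustration (the real page markup may well look different), and the sub() pattern is just one possible alternative, not necessarily the best regular expression for this page.

```r
# Hypothetical lines standing in for what readLines() might return.
# The tags, attributes and product names are assumptions for illustration.
web <- c(
  '<div class="item">',
  '<a class="goods-name" href="/deal/1.html">Spicy hot pot for two</a>',
  '<span class="price">88</span>',
  '<a class="goods-name" href="/deal/2.html">Dim sum set</a>'
)

# Step 1: keep only the lines that contain the marker text
goods_name <- web[grep("goods-name", web, fixed = TRUE)]

# Step 2: position-based extraction, as in the code above.
# regexpr() gives the position of the first '">'; the name starts 2 characters later.
# The trailing '</a>' is 4 characters long, so it is cut off with nchar() - 4.
start <- regexpr('">', goods_name, fixed = TRUE) + 2
goods_name2 <- substr(goods_name, start, nchar(goods_name) - 4)

# Step 3: the same idea written as one regular expression with a capture group
goods_name3 <- sub('.*">(.*)</a>.*', "\\1", goods_name)

print(goods_name2)
print(identical(goods_name2, goods_name3))
```

The position-based version depends on every matching line ending in exactly 4 characters of closing tag, while the sub() version depends on the line containing '">' followed by '</a>'. Either one needs to be rechecked against the actual page source.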