
Big Data Collection from the Three Major Carriers: The Carrier Crawler Business

Date: 2023-05-06 02:34:00  Views: 271306  Author: 2925

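Anyone who has worked in P2P lending knows that when we assess a user's credit and set a credit limit, we collect the user's personal information. Companies typically collect online banking, carrier, and social security data. Here I focus on how carrier data collection is implemented; the other sources work in a similar way.

I. Credit assessment process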

The distributed crawler for carrier data collection consists of three main parts, as shown in the figure:

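1. API/H5: the entry point where the user grants authorization. It covers task creation, task status query, verification code input for a task, task result query, lookup of a phone number's home location, and so on.
2. Scheduling: after a task is created, it passes through the scheduler, which works out which collection entry to use. Each provincial carrier has different entries; China Mobile, for example, has the Mobile business hall, the Mobile mall, the Mobile app, and 10086. The scheduler has many other responsibilities that are not listed one by one here.
3. Collection: collection is the most important part of the whole project and also the part that is maintained most often, because it has to be updated whenever a carrier redesigns its site.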
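
The scheduling step described above can be sketched roughly as follows. This is only an illustrative sketch, not the author's implementation: the task states, the `ENTRIES` routing table, the province key, and the function names `pick_entry` and `schedule` are all hypothetical, and the entry names simply mirror the China Mobile entries mentioned in the text.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, FrozenSet, List, Optional, Tuple


class TaskState(Enum):
    """Hypothetical task states exposed by the status-query API."""
    CREATED = "created"
    WAITING_CODE = "waiting_code"   # waiting for the user's verification code
    COLLECTING = "collecting"
    DONE = "done"
    FAILED = "failed"


# Hypothetical routing table: (carrier, province) -> entries in priority order.
# "*" is a fallback used when a province has no dedicated configuration.
ENTRIES: Dict[Tuple[str, str], List[str]] = {
    ("mobile", "chongqing"): ["business_hall", "mall", "app", "10086"],
    ("mobile", "*"): ["app", "business_hall"],
}


@dataclass
class Task:
    mobile: str
    carrier: str
    province: str
    state: TaskState = TaskState.CREATED
    entry: Optional[str] = None
    tried: List[str] = field(default_factory=list)


def pick_entry(task: Task, unavailable: FrozenSet[str] = frozenset()) -> Optional[str]:
    """Return the first entry that is available and not yet tried for this task."""
    candidates = (ENTRIES.get((task.carrier, task.province))
                  or ENTRIES.get((task.carrier, "*"), []))
    for entry in candidates:
        if entry not in unavailable and entry not in task.tried:
            return entry
    return None


def schedule(task: Task, unavailable: FrozenSet[str] = frozenset()) -> Task:
    """Assign a collection entry to the task, or mark it failed if none is left."""
    entry = pick_entry(task, unavailable)
    if entry is None:
        task.state = TaskState.FAILED
    else:
        task.entry = entry
        task.tried.append(entry)
        task.state = TaskState.COLLECTING
    return task
```

Keeping the entries in a priority-ordered table means that when a carrier redesigns one entry, the scheduler can skip it (by adding it to `unavailable`) while the collector for that entry is being repaired.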
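The main data collected from carriers:
basic user information,
the user's plan (package) information,
call records,
payment and billing records, and so on.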
The collected fields are as follows:
{
  "mobile": "string",
  "code": "int",
  "message": "string",
  "name": "string",
  "idcard": "string",
  "carrier": "string",
  "province": "string",
  "city": "string",
  "open_time": "string",
  "level": "string",
  "package_name": "string",
  "state": "int",
  "available_balance": "int",
  "last_modify_time": "string",
  "packages": [
    {
      "bill_start_date": "string",
      "bill_end_date": "string",
      "items": [
        {
          "item": "string",
          "total": "string",
          "used": "string",
          "unit": "string"
        }…
      ]
    }…
  ],
  "families": [
    {
      "family_num": "string",
      "items": [
        {
          "long_number": "string",
          "short_number": "string",
          "member_type": "string",
          "join_date": "string",
          "expire_date": "string"
        }…
      ]
    }…
  ],
  "recharges": [
    {
      "details_id": "string",
      "recharge_time": "string",
      "amount": "int",
      "type": "string"
    }…
  ],
  "bills": [
    {
      "bill_month": "string",
      "bill_start_date": "string",
      "bill_end_date": "string",
      "base_fee": "int",
      "extra_service_fee": "int",
      "voice_fee": "int",
      "sms_fee": "int",
      "web_fee": "int",
      "extra_fee": "int",
      "total_fee": "int",
      "discount": "int",
      "extra_discount": "int",
      "actualFee": "int",
      "paid_fee": "int",
      "unpaid_fee": "int",
      "point": "int",
      "last_point": "int",
      "related_mobiles": "string",
      "notes": "string"
    }…
  ],
  "calls": [
    {
      "bill_month": "string",
      "total_size": "int",
      "items": [
        {
          "details_id": "string",
          "time": "string",
          "peer_number": "string",
          "location": "string",
          "location_type": "string",
          "duration": "int",
          "dial_type": "string",
          "fee": "int"
        }…
      ]
    }…
  ],
  "smses": [
    {
      "bill_month": "string",
      "total_size": "int",
      "items": [
        {
          "details_id": "string",
          "time": "string",
          "peer_number": "string",
          "location": "string",
          "send_type": "string",
          "msg_type": "string",
          "service_name": "string",
          "fee": "int"
        }…
      ]
    }…
  ],
  "month_info": {
    "miss_month_count": "int",
    "month_count": "int",
    "month_list": {
      "string": "int",
      "string": "int",
      "string": "int",
      "2018-05": "int",
      "2018-04": "int",
      "2018-03": "int"
    },
    "no_call_month": "int",
    "phone_no": "15******9",
    "user_id": "string"
  }
}
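
To show how a record with this shape might be consumed downstream, here is a minimal sketch that aggregates the `calls` section by billing month. The function name `summarize_calls` and the sample values are made up for illustration, and the units of `duration` and `fee` are not stated in the field list, so the sketch only sums the raw integers.

```python
from typing import Any, Dict


def summarize_calls(result: Dict[str, Any]) -> Dict[str, Dict[str, int]]:
    """Aggregate call records per bill_month from a collected result."""
    summary: Dict[str, Dict[str, int]] = {}
    for month in result.get("calls", []):
        items = month.get("items", [])
        summary[month["bill_month"]] = {
            "count": len(items),
            # Units are whatever the collector stored; they are not defined here.
            "duration": sum(item.get("duration", 0) for item in items),
            "fee": sum(item.get("fee", 0) for item in items),
            "distinct_peers": len(
                {item["peer_number"] for item in items if "peer_number" in item}
            ),
        }
    return summary


# Made-up sample that follows the field names listed above.
sample = {
    "mobile": "13800000000",
    "calls": [
        {
            "bill_month": "2018-05",
            "total_size": 2,
            "items": [
                {"peer_number": "10086", "duration": 60, "fee": 0},
                {"peer_number": "13900000000", "duration": 125, "fee": 20},
            ],
        },
        {"bill_month": "2018-04", "total_size": 0, "items": []},
    ],
}

if __name__ == "__main__":
    for bill_month, stats in summarize_calls(sample).items():
        print(bill_month, stats)
```

A month whose `items` list is empty still appears in the summary with zero counts, which makes months without any calls easy to spot.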
