
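Figure 1
Let's start with the project itself. I wrote this code while teaching myself Python in my spare time: it crawls the pages of a used-car website.
A crawler can be written in either PHP or Python, but Python is more flexible, which makes the crawling easier. If the data will later be used for analysis,
Python is clearly the better choice. The project files are shown in Figure 1.
First, let's look at what the project needs for configuration, going straight to the code.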
config.py
# -*- coding:utf-8 -*-
def cfg():
    c = {
        "url": "https://www.xin.com/guangzhou/s/?channel=baidu&keywordid=102424667978&creative=26383680751&mediaid=1",
        "name": "优信二手车",
        "version": "v1.0.0",
        "header": {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko'},
        "host": "localhost",
        "user": "root",
        "pass": "root",
        "port": 3306,
        "database": "yx",
        "num": 1
    }
    return c
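The code above is the configuration file. Every project should start with one: it keeps the commonly used settings (target URL, request headers, database credentials) in a single place, so any module can pull out whatever it needs.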
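As a rough sketch of how such a config might be consumed, the snippet below reads the dictionary once and maps it onto keyword arguments in the shape that `mysql.connector.connect` accepts. The helper name `db_kwargs` is hypothetical, and the `cfg()` here is a trimmed copy of the one above, included only so the example runs on its own.

```python
def cfg():
    # Trimmed copy of the config above so this example is self-contained.
    return {
        "host": "localhost",
        "user": "root",
        "pass": "root",
        "port": 3306,
        "database": "yx",
    }


def db_kwargs(c):
    # Hypothetical helper: rename config keys to connect() keyword names.
    # Note that the config stores the password under "pass".
    return {
        "host": c["host"],
        "user": c["user"],
        "password": c["pass"],
        "port": c["port"],
        "database": c["database"],
        "charset": "utf8",
    }


settings = cfg()              # call cfg() once and reuse the dict
kwargs = db_kwargs(settings)
print(kwargs["host"], kwargs["port"], kwargs["database"])
```

The resulting dictionary could then be passed along as `mysql.connector.connect(**kwargs)`, which avoids rebuilding the config for every key.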
datebase.py
#!/usr/bin/python
# -*- coding:utf-8 -*-
import time

import mysql.connector

import config
import main

# Read the configuration once instead of calling cfg() for every key.
cfg = config.cfg()
con = mysql.connector.connect(
    host=cfg.get('host'),
    user=cfg.get('user'),
    port=cfg.get('port'),
    password=cfg.get('pass'),
    database=cfg.get('database'),
    charset='utf8',
)


def conn():
    # Wait briefly, then close the shared connection.
    try:
        time.sleep(5)
        con.close()
        return con
    except mysql.connector.Error as e:
        print(e)


# Insert the scraped data into the database
def InsetData():
    cursor = con.cursor()
    # Placeholders let the driver quote every value, including carpic.
    sql = ("INSERT INTO yx_data(title, price, vehicle, record, carpic) "
           "VALUES (%s, %s, %s, %s, %s)")
    i = 1
    try:
        # The item[...][i] indexing depends on the structure
        # returned by main.ResetDate().
        for item in main.ResetDate():
            params = (item['title'][i], item['price'][i], item['vehicle'][i],
                      item['record'][i], item['carPic'][i])
            print(sql, params)
            i = i + 1
            try:
                cursor.execute(sql, params)
                con.commit()
            except mysql.connector.Error as e:
                print('insert failed: {}'.format(e))
    finally:
        # Close the cursor and connection once, after all rows are processed.
        cursor.close()
        con.close()


InsetData()
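Passing values as query parameters, rather than formatting them into the SQL string, keeps a quote character in a title from breaking the statement and avoids SQL injection. Here is a minimal sketch of that idea, using the standard-library `sqlite3` module as a stand-in so it runs without a MySQL server. The sample rows and the column types are made up for illustration. Note that `sqlite3` uses `?` placeholders, while `mysql.connector` uses `%s`.

```python
import sqlite3

# Made-up sample rows; real data would come from the scraper.
rows = [
    {"title": "Car A '2018'", "price": "8.50", "vehicle": "sedan",
     "record": "2018-05", "carPic": "a.jpg"},
    {"title": "Car B", "price": "12.30", "vehicle": "SUV",
     "record": "2019-11", "carPic": "b.jpg"},
]

con = sqlite3.connect(":memory:")
# Assumed schema: same column names as yx_data, all stored as text.
con.execute(
    "CREATE TABLE yx_data "
    "(title TEXT, price TEXT, vehicle TEXT, record TEXT, carpic TEXT)"
)

sql = ("INSERT INTO yx_data (title, price, vehicle, record, carpic) "
       "VALUES (?, ?, ?, ?, ?)")
params = [(r["title"], r["price"], r["vehicle"], r["record"], r["carPic"])
          for r in rows]

with con:  # commits on success, rolls back if an exception is raised
    con.executemany(sql, params)

count = con.execute("SELECT COUNT(*) FROM yx_data").fetchone()[0]
titles = [row[0] for row in
          con.execute("SELECT title FROM yx_data ORDER BY rowid")]
print(count, titles)
con.close()
```

The first title contains single quotes on purpose: with placeholders it is stored unchanged, whereas a statement built with string formatting would fail on it.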
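Copyright notice: This is an original article by gzwebsj, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.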