hub / github.com/lining0806/PythonSpiderNotes / Spider

Function Spider(url)

NewsSpider/NewsSpider.py:37–54

Source from the content-addressed store, hash-verified

    return zip(new_items, new_urls)

def Spider(url):
    i = 0
    print "downloading ", url
    myPage = requests.get(url).content.decode("gbk")
    # myPage = urllib2.urlopen(url).read().decode("gbk")
    myPageResults = Page_Info(myPage)
    save_path = u"网易新闻抓取"
    filename = str(i)+"_"+u"新闻排行榜"
    StringListSave(save_path, filename, myPageResults)
    i += 1
    for item, url in myPageResults:
        print "downloading ", url
        new_page = requests.get(url).content.decode("gbk")
        # new_page = urllib2.urlopen(url).read().decode("gbk")
        newPageResults = New_Page_Info(new_page)
        filename = str(i)+"_"+item
        StringListSave(save_path, filename, newPageResults)
        i += 1


if __name__ == '__main__':
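
The listing is Python 2 (`print` statement, `u""` literals). As a rough illustration of the same control flow in Python 3, here is a minimal sketch, not the repository's code. The names `spider`, `fetch`, `parse_index`, `parse_article`, and `save` are hypothetical stand-ins for `Spider`, `requests.get(url).content`, `Page_Info`, `New_Page_Info`, and `StringListSave`; they are passed in as parameters so the sketch runs without network access. The string literals are kept from the source: "网易新闻抓取" is roughly "NetEase news scrape" and "新闻排行榜" is "news ranking list".

```python
from typing import Callable, Iterable, List, Tuple

Pairs = List[Tuple[str, str]]


def spider(
    url: str,
    fetch: Callable[[str], bytes],                       # assumed: returns raw page bytes
    parse_index: Callable[[str], Iterable[Tuple[str, str]]],    # stand-in for Page_Info
    parse_article: Callable[[str], Iterable[Tuple[str, str]]],  # stand-in for New_Page_Info
    save: Callable[[str, str, Pairs], None],             # stand-in for StringListSave
    save_path: str = "网易新闻抓取",
    encoding: str = "gbk",
) -> int:
    """Save the index page results, then one file per linked page.

    Returns the number of files handed to `save`.
    """
    print("downloading", url)
    # list() matters in Python 3: a parser that returns zip(...) yields a
    # one-shot iterator, which would be empty by the time the loop below runs.
    index_results = list(parse_index(fetch(url).decode(encoding)))
    save(save_path, "0_新闻排行榜", index_results)

    count = 1
    # A separate name (item_url) avoids rebinding the `url` parameter,
    # which the original loop does.
    for item, item_url in index_results:
        print("downloading", item_url)
        page_results = list(parse_article(fetch(item_url).decode(encoding)))
        save(save_path, f"{count}_{item}", page_results)
        count += 1
    return count
```

With the `requests` library installed, `fetch` could be something like `lambda u: requests.get(u).content`; the three other callables would need Python 3 versions of the helpers this function calls.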

Callers 1

NewsSpider.py · File · 0.70

Calls 3

Page_Info · Function · 0.85
StringListSave · Function · 0.85
New_Page_Info · Function · 0.85

Tested by

no test coverage detected
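
One way the download-and-decode step could be covered without touching the network is to patch the HTTP call. The sketch below is an assumption-laden illustration in Python 3, not a test from the repository: `download` is a hypothetical helper modeled on the commented-out `urlopen(url).read().decode("gbk")` line, and the URL is a placeholder.

```python
import unittest
import urllib.request
from unittest import mock


def download(url, encoding="gbk"):
    """Hypothetical helper: fetch a page and decode its bytes (GBK by default)."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode(encoding)


class DownloadTest(unittest.TestCase):
    def test_decodes_gbk_bytes(self):
        # Fake response object usable as a context manager.
        fake = mock.MagicMock()
        fake.__enter__.return_value.read.return_value = "新闻".encode("gbk")
        # Replace urlopen for the duration of the test so no request is sent.
        with mock.patch("urllib.request.urlopen", return_value=fake) as opener:
            self.assertEqual(download("http://example.invalid/"), "新闻")
        opener.assert_called_once_with("http://example.invalid/")
```

Such a test case can be run with `python -m unittest` from the directory containing the file.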