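01. Program Walkthrough
To scrape period-costume portrait images (古装美女), the program pulls image data from 靓丽图库 (https://www.hexuexiao.cn/meinv/guzhuang/), as shown in the figure below:
The site hosts not only period-costume images but also other categories, such as Japanese and Korean portraits and anime illustrations. With small modifications to the program shared here, you can scrape those other categories too.
The image scraper breaks down into three parts.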
- Get the sub-page links
- Get the link for each image
- Download the images and save them locally
To make the program easier to follow, each of the three parts is implemented as its own function.
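02. Getting the Sub-page Links
Analyzing the page shows that it is a static page: the sub-page links we want can be obtained simply by parsing the page source.
As shown in the figure above, the page source contains the page's text along with the addresses of the sub-pages we want to scrape. Given this, we can use the requests library to fetch the page source, parse it with BeautifulSoup or an XPath-capable parser such as lxml, and extract the sub-page link addresses.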
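The code follows this analysis exactly: it first fetches the page source, then parses it with BeautifulSoup to collect all the sub-page links. Every line of the program is commented to make it easier to follow.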
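The author's code is not reproduced in this text, so the following is only a minimal sketch of the step described above. It assumes that sub-page links follow the `/a/<digits>.html` pattern seen in the example URLs later in the article; the function names `fetch_html` and `extract_subpage_links` and the User-Agent header are hypothetical choices for illustration, not taken from the original program.

```python
import re
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

# Assumption: sub-page links look like /a/<digits>.html,
# matching the example URLs quoted in the article.
SUBPAGE_HREF = re.compile(r"/a/\d+\.html$")


def fetch_html(url):
    """Request a page and return its source code as text."""
    resp = requests.get(url, headers={"User-Agent": "Mozilla/5.0"}, timeout=10)
    resp.raise_for_status()                 # fail loudly on HTTP errors
    resp.encoding = resp.apparent_encoding  # guess the charset from the body
    return resp.text


def extract_subpage_links(html, base_url):
    """Parse page source and return absolute sub-page URLs, without duplicates."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for a in soup.find_all("a", href=SUBPAGE_HREF):
        full_url = urljoin(base_url, a["href"])  # resolve relative links
        if full_url not in links:
            links.append(full_url)
    return links
```

A call might look like `extract_subpage_links(fetch_html(list_url), list_url)`, where `list_url` is the category page. Keeping the parsing separate from the request makes it possible to check the link extraction against saved HTML without touching the network.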
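03. Getting the Link for Each Image
Getting each image's link again comes down to parsing a static page. The one difference from collecting sub-page links is that the program has to determine how many images each sub-page contains, as shown in the figure below.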
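The program checks how many images each sub-page contains because every image under a sub-page has its own page URL. For example, https://www.hexuexiao.cn/a/124672-0.html and https://www.hexuexiao.cn/a/124672-1.html are the pages for two images derived from the sub-page https://www.hexuexiao.cn/a/124672.html. Once it has each image's page URL, the program applies the same static-page parsing to extract the image link, then saves the image with the self.savePic function.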
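The URL pattern in that example can be sketched as follows. This is an illustration only: the helper name `image_page_urls` is hypothetical, and the image count is passed in as a parameter because the text does not say how the original program reads it from the page.

```python
from typing import List


def image_page_urls(subpage_url: str, count: int) -> List[str]:
    """Build the per-image page URLs for one sub-page.

    Assumption: image pages are numbered from 0 and named
    <id>-<index>.html, as in the article's 124672-0 / 124672-1 example.
    """
    stem, ext = subpage_url.rsplit(".", 1)  # split off the "html" extension
    return [f"{stem}-{i}.{ext}" for i in range(count)]


urls = image_page_urls("https://www.hexuexiao.cn/a/124672.html", 2)
# urls[0] is https://www.hexuexiao.cn/a/124672-0.html
# urls[1] is https://www.hexuexiao.cn/a/124672-1.html
```

Each URL in the returned list would then be fetched and parsed like any other static page to find the image's own link.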
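04. Downloading the Images and Saving Them Locally
Saving the image data takes the three lines of code shown in the figure below.
The program requests the image data and writes it to a local file in binary mode.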
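Since the original three lines are not reproduced in this text, here is a hedged sketch of the same idea: request the image, open a file in binary write mode, and write the bytes. The function name `save_pic`, the `folder` and `session` parameters, and the choice to name the file after the last part of the URL path are assumptions made for this example, not details of the author's savePic method.

```python
import os
from urllib.parse import urlparse

import requests


def save_pic(img_url, folder="images", session=None):
    """Download one image and write it to disk in binary mode.

    `session` is an optional object with a requests-style .get();
    it defaults to the requests module itself.
    """
    os.makedirs(folder, exist_ok=True)
    # Assumption: use the last path segment of the URL as the file name.
    filename = os.path.join(folder, os.path.basename(urlparse(img_url).path))

    resp = (session or requests).get(img_url, timeout=10)  # 1. request the image
    resp.raise_for_status()
    with open(filename, "wb") as f:                        # 2. open in binary mode
        f.write(resp.content)                              # 3. write the raw bytes
    return filename
```

The `"wb"` mode matters here: image data is raw bytes, so writing it in text mode would either fail or corrupt the file.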
05. Results
Those three parts make up the entire program. Now let's look at the scraped results.
06. Summary
If you would like the source code, send "小助手" to the account's message backend, with the passphrase (古装美女).
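This article comes from: 菜鸟学Python
This is an original article; copyright belongs to 知行编程网. Sharing is welcome, but please retain the source when reposting.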