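There are quite a few eye-candy accounts (福利号) on Sina Weibo, and I wanted to save their pictures.
So yesterday, before lunch, I spent a little time writing a crawler in Python, dropped it on a VPS, and scraped upwards of ten thousand images.
I couldn't hold back the urge to share, so I skimmed Telegram's Bot API documentation, found that Telegram bots are fairly simple by design, and adapted the demo Telegram provides into a bot that sends images.
Demo of the Telegram bot: https://telegram.me/canbin_bot
Crawler: https://github.com/lincanbin/Sina-Weibo-Album-Downloader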
Telegram Bot: https://github.com/lincanbin/Telegram-Simple-Image-Bot
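For anyone wondering what "a bot that sends images" boils down to, here is a minimal sketch. It is not taken from the linked repository. It assumes the downloaded images sit in one local directory and that the bot replies through the Bot API's `sendPhoto` method. The helper names (`send_photo_url`, `pick_random_image`, `build_send_photo_request`) and the directory layout are hypothetical.

```python
import random
from pathlib import Path

API_ROOT = "https://api.telegram.org"
# Assumed set of image extensions; adjust to whatever the crawler saves.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif"}


def send_photo_url(token):
    """Return the Bot API endpoint for sendPhoto for the given bot token."""
    return f"{API_ROOT}/bot{token}/sendPhoto"


def pick_random_image(image_dir, rng=random):
    """Pick one image file at random from image_dir, or None if there is none."""
    images = sorted(
        p for p in Path(image_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    if not images:
        return None
    return rng.choice(images)


def build_send_photo_request(token, chat_id, image_path):
    """Describe the multipart POST that uploads image_path to chat_id.

    The request is only described here, not sent, so the sketch
    runs without network access or a real token.
    """
    return {
        "url": send_photo_url(token),
        "data": {"chat_id": str(chat_id)},
        "file_field": "photo",  # sendPhoto expects the upload in the "photo" field
        "file_path": str(image_path),
    }
```

With the `requests` library, the described request could then be sent roughly as `requests.post(req["url"], data=req["data"], files={"photo": open(req["file_path"], "rb")})`. A real bot would also need to receive incoming messages to learn the `chat_id`, which this sketch leaves out.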
1
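lincanbin OP The crawler pulled several GiB of images. It ran for a few minutes on DeactivatedOcean, saturating a 100 Mbps link, and did not get Deactivated. Over lunch yesterday I kept staring at the SSH client on my phone, afraid the account would get Deactivated. The images come from a batch of Weibo accounts, probably over ten thousand in total. I haven't looked through them one by one, but they are presumably mostly pictures of girls? |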
2
zangbob 2015-11-17 10:04:02 +08:00
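Thanks to the great Canbin (灿神) for sharing another fine project~~~ Starred ……
(Should I add a few more words, like "1024, OP is a good person" and so on…… :) |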
3
n37r06u3 2015-11-17 10:22:21 +08:00
Which browser was used for the screenshot in the README?
|
6
lwbjing 2015-11-17 10:41:38 +08:00
follow
|
7
jedyu 2015-11-17 10:58:29 +08:00
Lofter has plenty too.
|
9
Suclogger 2015-11-17 11:16:45 +08:00
So eye candy really is humanity's fundamental driving force, huh.
|
10
halfer53 2015-11-17 12:19:58 +08:00
Tumblr has the most by far, with enough eye candy of every kind to wear a man out.
|
11
lincanbin OP |
12
halfer53 2015-11-17 12:36:46 +08:00
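@lincanbin http://www.coolapk.com/apk/com.tumblr There are lots in the Coolapk (酷安) reviews. I also follow 200-plus zettai ryouiki (绝对领域) blogs on Tumblr; I'll send them to you when I get home.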
|
13
chengzhoukun 2015-11-17 12:43:35 +08:00
There are even screenshots of Xinwen Lianbo (新闻联播) 😓
|
14
lonelygo 2015-11-17 12:45:26 +08:00
Starred ✨, 1024 likes.
|
15
imn1 2015-11-17 12:47:42 +08:00
The crawler isn't hard to write; the hard part is collecting the accounts that post this stuff. Share the list, please.
|
16
Hysteria 2015-11-17 13:04:22 +08:00
The bot is unbelievably slick. Huge thumbs up.
|
18
phithon 2015-11-17 13:39:54 +08:00
Share the eye-candy accounts!!
|
19
PandaSaury 2015-11-17 14:18:33 +08:00
You could set up a place on GitHub dedicated to collecting eye-candy accounts.
|
20
mfinal 2015-11-18 00:45:00 +08:00
Starred 👍. Going to study how to crawl Weibo.
|
21
joewangyz 2015-11-18 14:44:55 +08:00
The key is the eye-candy accounts.. otherwise where would you get the OID and the photo-wall cookie..
|
23
banri 2015-11-18 21:37:31 +08:00
200 zettai ryouiki blogs!
|
30
fuliti 2015-11-22 16:03:12 +08:00
Looks amazing, but sadly I don't know how to use it.
|
31
JiaFeiX 2015-12-02 12:34:52 +08:00
Which accounts did the OP crawl?
|
32
bbjoe 2016-08-30 17:43:21 +08:00
Why does crawling an album keep missing images? For example, with 402 image IDs, I end up with only a hundred or so once the run finishes.
|
35
yxqcyl 2017-01-20 08:59:29 +08:00
What causes the following error?
['4065529837148919']
9f128f33jw1e8qgp5bmzyj2050050aa8.jpg
lxhxixi_org.gif
2Flxhxixi_org.gif
9f128f33ly1fbw8sp2ro7j20qo1beq4l.jpg
Exception in thread Thread-51:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 137, in _new_conn
    (self.host, self.port), self.timeout, **extra_kw)
  File "/usr/lib/python3/dist-packages/urllib3/util/connection.py", line 67, in create_connection
    for res in socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM):
  File "/usr/lib/python3.5/socket.py", line 732, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -3] Temporary failure in name resolution

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 559, in urlopen
    body=body, headers=headers)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 353, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/lib/python3.5/http/client.py", line 1106, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python3.5/http/client.py", line 1151, in _send_request
    self.endheaders(body)
  File "/usr/lib/python3.5/http/client.py", line 1102, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python3.5/http/client.py", line 934, in _send_output
    self.send(msg)
  File "/usr/lib/python3.5/http/client.py", line 877, in send
    self.connect()
  File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 162, in connect
    conn = self._new_conn()
  File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 146, in _new_conn
    self, "Failed to establish a new connection: %s" % e)
requests.packages.urllib3.exceptions.NewConnectionError: <requests.packages.urllib3.connection.HTTPConnection object at 0x7f6d60069b00>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 376, in send
    timeout=timeout
|
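The traceback bottoms out in `socket.gaierror: [Errno -3] Temporary failure in name resolution`, which means a DNS lookup failed inside one of the download threads. An uncaught exception like this ends the thread, so whatever it was fetching is skipped. One possible mitigation, sketched below, is to retry transient failures with a growing delay. This is not the crawler's actual code: `fetch_with_retry` and its parameters are hypothetical, and `fetch` stands in for whatever function downloads one URL. It catches `OSError`, the base class of `socket.gaierror`; as far as I know, the exceptions raised by `requests` derive from it as well.

```python
import time


def fetch_with_retry(fetch, url, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url), retrying on OSError with exponential backoff.

    fetch      -- hypothetical callable that downloads one URL
    attempts   -- total number of tries before giving up
    base_delay -- seconds to wait after the first failure; doubles each time
    sleep      -- injectable for testing; defaults to time.sleep
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch(url)
        except OSError as exc:
            last_error = exc
            # Wait before the next try, but not after the final failure.
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise last_error
```

If the name resolution failure is persistent rather than transient (for example a broken resolver configuration on the VPS), retrying will not help and the final error is re-raised.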