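I had a free weekend, so I stayed home and learned Python's gevent library. It is seriously impressive. Along the way I wrote a simple crawler to compare gevent with multithreading and multiprocessing. Criticism is welcome. One question as well: whenever the script runs in multiprocessing mode, it fails with the error below: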
https://github.com/hellorocky/alexaTop500
Traceback (most recent call last):
  File "alexa.py", line 116, in <module>
    site.multiprocess()
  File "alexa.py", line 83, in multiprocess
    self.domain = multiprocessing.Manager().list()
  File "/home/rocky/python/lib/python2.7/multiprocessing/managers.py", line 667, in temp
    token, exp = self._create(typeid, *args, **kwds)
  File "/home/rocky/python/lib/python2.7/multiprocessing/managers.py", line 565, in _create
    conn = self._Client(self._address, authkey=self._authkey)
  File "/home/rocky/python/lib/python2.7/multiprocessing/connection.py", line 175, in Client
    answer_challenge(c, authkey)
  File "/home/rocky/python/lib/python2.7/multiprocessing/connection.py", line 432, in answer_challenge
    message = connection.recv_bytes(256) # reject large message
IOError: [Errno 11] Resource temporarily unavailable
1
zwpaper 2016-10-07 14:23:18 +08:00 via iPhone
I'm on my phone and haven't looked closely, but a wild guess: could a socket port be in use and not handled properly?
|
2
hellorocky728 OP @zwpaper Probably not. I haven't started any service, so no port is in use.
|
3
ericls 2016-10-07 14:57:10 +08:00
How about trying loop.run_in_executor? Do it with futures and see what happens.
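For anyone unfamiliar with it, here is a minimal sketch of what the loop.run_in_executor approach could look like. It assumes Python 3.7+ (the traceback above is from Python 2.7, where asyncio is not available), and `fetch`, `crawl`, and `run` are hypothetical names, not functions from alexa.py. `fetch` is only a stand-in for a blocking HTTP request.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def fetch(domain):
    # Hypothetical stand-in for a blocking HTTP request.
    # A real crawler would call its HTTP client here.
    return domain, len(domain)


async def crawl(domains, max_workers=4):
    loop = asyncio.get_running_loop()
    # The executor caps how many blocking calls run at the same time.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [loop.run_in_executor(pool, fetch, d) for d in domains]
        # gather() returns the results in the same order as the inputs.
        return await asyncio.gather(*futures)


def run(domains):
    return asyncio.run(crawl(domains))
```

Swapping ThreadPoolExecutor for ProcessPoolExecutor would move the work into processes instead of threads; whether that avoids the error in this thread is untested.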
|
4
firebroo 2016-10-07 15:04:20 +08:00 via Android
A "resource xxx" error probably means there are too many processes.
|
5
stranbird 2016-10-07 15:49:22 +08:00
You could run it through a pep8 lint first.
|
6
pright 2016-10-07 15:51:22 +08:00
|
7
prasanta 2016-10-07 22:46:27 +08:00
Why not use async/await?
|
8
panda0 2016-10-08 10:05:28 +08:00
|
9
hellorocky728 OP @prasanta I plan to learn it over the next couple of days, then use it and compare~
|
10
rale 2016-10-08 10:42:28 +08:00
This errno 11 comes from C: http://www-numi.fnal.gov/offline_software/srt_public_context/WebDocs/Errors/unix_system_errors.html . Going by the define there (on Linux, errno 11 is EAGAIN), try retrying a few times and see.
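A rough sketch of what "retry a few times" could mean in Python, in case it helps. `retry_on_eagain` is a hypothetical helper, not part of any library, and the attempt count and delay are arbitrary. It is written for Python 3, where IOError is an alias of OSError; whether retrying actually cures the error in this thread is untested.

```python
import errno
import time


def retry_on_eagain(func, attempts=5, delay=0.1):
    """Call func(); if it fails with EAGAIN, wait and try again."""
    for attempt in range(attempts):
        try:
            return func()
        except OSError as exc:
            # Re-raise anything that is not EAGAIN, and give up
            # after the last attempt.
            if exc.errno != errno.EAGAIN or attempt == attempts - 1:
                raise
            time.sleep(delay)
```

Under those assumptions, the failing line could be wrapped as `retry_on_eagain(lambda: multiprocessing.Manager().list())`.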
|
11
hellorocky728 OP @rale I'm not sure how to do that, but thanks anyway~
|
12
hellorocky728 OP @stranbird Thanks
|
13
ToughGuy 2016-10-08 17:29:29 +08:00
Check the output of ulimit -a.
|
14
hellorocky728 OP @ToughGuy
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 3862
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 10240
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 3862
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
|
15
ToughGuy 2016-10-09 09:34:51 +08:00
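The system resource limits don't look like the problem. It would be best to add a queue to the multiprocessing part of your code, to limit how many processes are forked at the same time.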
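One way to cap the number of worker processes is sketched below. It uses multiprocessing.Pool rather than a hand-written queue: the pool feeds tasks to a fixed number of workers through its own internal queue, so at most `workers` processes exist at once. `check_domain` and `crawl_with_pool` are hypothetical names, and `check_domain` is only a placeholder for whatever alexa.py does per site.

```python
import multiprocessing


def check_domain(domain):
    # Hypothetical stand-in for the per-site work (e.g. one HTTP request).
    return domain, len(domain)


def crawl_with_pool(domains, workers=4):
    # At most `workers` processes run at any time; the remaining
    # tasks wait in the pool's internal queue.
    pool = multiprocessing.Pool(processes=workers)
    try:
        # map() returns the results in the same order as the inputs.
        return pool.map(check_domain, domains)
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    print(crawl_with_pool(["google.com", "youtube.com"]))
```

Because pool.map hands the results back to the parent, a sketch like this would not need a shared Manager().list() to collect them, which is the call that fails in the traceback.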
|