Nicely written, OP #1

Open
Allianzcortex opened this issue Oct 31, 2015 · 1 comment

Comments

@Allianzcortex

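I stumbled upon OP's article, and it is really nicely done. That said, the approach differs from the one in my tutorial ^^, and the cookie method is different too.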

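I skimmed the code but did not test it. You could rework it to add a pipeline, roughly like this. The content then shows up directly in ans.json, and the Unicode-to-UTF-8 conversion is handled as well.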
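```python
import json
import codecs


class doubanBookPipeline(object):
    """Write each scraped item to ans.json as one UTF-8 JSON object per line."""

    def __init__(self):
        # General form: codecs.open(filename, 'wb', encoding='utf-8')
        self.file = codecs.open('ans.json', 'wb', encoding='utf-8')

    def process_item(self, item, spider):
        # ensure_ascii=False keeps Chinese text readable instead of \uXXXX escapes
        line = json.dumps(dict(item), ensure_ascii=False) + "\n"
        self.file.write(line)
        return item

    def close_spider(self, spider):
        # Scrapy calls close_spider on a pipeline automatically when the spider finishes
        self.file.close()
```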
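To see why the pipeline passes `ensure_ascii=False`, here is a minimal standalone sketch that compares the two forms of `json.dumps` output. It assumes Python 3. The helper name `to_json_line` and the sample record are made up for illustration and are not part of OP's project.

```python
import json


def to_json_line(record):
    """Serialize one record as a JSON line, keeping non-ASCII text as-is."""
    return json.dumps(record, ensure_ascii=False) + "\n"


# Hypothetical sample item, standing in for a scraped book record.
record = {"title": "三体", "rating": 8.8}

# Default behaviour: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps(record)

# With ensure_ascii=False the characters are kept, so the file must be
# opened with an explicit encoding such as utf-8.
readable = to_json_line(record)

print(escaped)
print(readable, end="")
```

Both forms parse back to the same data. The difference only matters when someone opens ans.json and wants to read it directly.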

Also, OP's code probably can't log in anymore? zhihu.com/login has been removed. Submitting the form directly to zhihu.com should work instead.

Good luck~

@Allianzcortex
Author

With the requests library there is also the shameless approach of just copy-pasting the cookie directly~~
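A rough sketch of that copy-paste approach, assuming the cookie string is copied from the browser's developer tools as a raw `Cookie` header. The helper `parse_cookie_header` and the cookie names are hypothetical. The resulting dict is the kind of value the `cookies=` argument of `requests.get` accepts.

```python
def parse_cookie_header(raw):
    """Turn a raw 'name=value; name2=value2' Cookie header into a dict."""
    cookies = {}
    for pair in raw.split(";"):
        if "=" not in pair:
            continue  # skip empty or malformed fragments
        name, value = pair.split("=", 1)  # split once: values may contain '='
        cookies[name.strip()] = value.strip()
    return cookies


# Hypothetical cookie string, as it might be pasted from the browser.
raw_header = "sid=abc123; theme=dark; token=a=b"
cookies = parse_cookie_header(raw_header)
print(cookies)

# With the third-party requests library installed, the dict could be used as:
#   import requests
#   response = requests.get("https://www.zhihu.com", cookies=cookies)
```

A pasted cookie only works until the session expires, so this suits quick experiments better than a long-running crawler.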

Er, what I meant above is that I did not download the code and test it...
