
Problem syncing from mongo-connector to Elasticsearch #34

Open
jas502n opened this issue Nov 20, 2016 · 3 comments

@jas502n

jas502n commented Nov 20, 2016

When I did not modify /usr/local/lib/python2.7/dist-packages/mongo_connector/doc_managers/elastic2_doc_manager.py, the sync only ran for a few minutes after "Logging to mongo-connector.log" before it broke off: bugs had only 500 documents and drops had no articles at all.

Then, as suggested, I edited the file:
sudo vi /usr/local/lib/python2.7/dist-packages/mongo_connector/doc_managers/elastic2_doc_manager.py
changing:
self.elastic = Elasticsearch(hosts=[url],**kwargs.get('clientOptions', {}))

to:
self.elastic = Elasticsearch(hosts=[url],timeout=200, **kwargs.get('clientOptions', {}))
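For reference, the same timeout can also be set without editing the installed package: mongo-connector accepts a JSON config file, and the Elasticsearch doc manager passes anything under clientOptions straight to the Elasticsearch client. A minimal sketch (the file name config.json and the addresses here are assumptions; adjust to your setup):

# Write a mongo-connector config file carrying the timeout in clientOptions
cat > config.json <<'EOF'
{
    "mainAddress": "localhost:27017",
    "docManagers": [
        {
            "docManager": "elastic2_doc_manager",
            "targetURL": "localhost:9200",
            "args": {
                "clientOptions": {"timeout": 200}
            }
        }
    ]
}
EOF

# Start the connector from the config file instead of the -m/-t/-d flags
sudo mongo-connector -c config.json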

I deleted the directory under Elasticsearch's data/, then restarted the services: service mongodb restart, and started Elasticsearch as a non-root user with elasticsearch-2.3.4/bin/elasticsearch -d.

When I then ran sudo mongo-connector -m localhost:27017 -t localhost:9200 -d elastic2_doc_manager, I found the sync ("Logging to mongo-connector.log") did not go as you described. A full sync is supposed to take about 30 minutes, but mine ran for only a few minutes, so it never completed: bugs had only 11000 documents and drops had no articles at all.

I then went through mongo-connector's log from the sync and googled around, but found no solution... so I edited the file again, changing:
self.elastic = Elasticsearch(hosts=[url],timeout=200, **kwargs.get('clientOptions', {}))

to:
self.elastic = Elasticsearch(hosts=[url],timeout=20000, **kwargs.get('clientOptions', {}))

I deleted the directory under Elasticsearch's data/ and re-ran the sync. Again it broke off after only a few minutes of "Logging to mongo-connector.log"; this time bugs had only 9500 documents and drops still had no articles at all.

The log from cat mongo-connector.log is as follows:

cat mongo-connector.log 
2016-11-21 01:44:43,039 [CRITICAL] mongo_connector.oplog_manager:630 - Exception during collection dump
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/mongo_connector/oplog_manager.py", line 583, in do_dump
    upsert_all(dm)
  File "/usr/local/lib/python2.7/dist-packages/mongo_connector/oplog_manager.py", line 567, in upsert_all
    dm.bulk_upsert(docs_to_dump(namespace), mapped_ns, long_ts)
  File "/usr/local/lib/python2.7/dist-packages/mongo_connector/util.py", line 32, in wrapped
    return f(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/mongo_connector/doc_managers/elastic2_doc_manager.py", line 229, in bulk_upsert
    for ok, resp in responses:
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/helpers/__init__.py", line 162, in streaming_bulk
    for result in _process_bulk_chunk(client, bulk_actions, raise_on_exception, raise_on_error, **kwargs):
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/helpers/__init__.py", line 87, in _process_bulk_chunk
    resp = client.bulk('\n'.join(bulk_actions) + '\n', **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/client/utils.py", line 69, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/client/__init__.py", line 785, in bulk
    doc_type, '_bulk'), params=params, body=self._bulk_body(body))
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/transport.py", line 327, in perform_request
    status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/connection/http_urllib3.py", line 112, in perform_request
    raw_data, duration)
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/connection/base.py", line 62, in log_request_success
    body = body.decode('utf-8')
  File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
MemoryError
2016-11-21 01:44:43,054 [ERROR] mongo_connector.oplog_manager:638 - OplogThread: Failed during dump collection cannot recover! Collection(Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True, replicaset=u'rs0'), u'local'), u'oplog.rs')
2016-11-21 01:44:44,009 [ERROR] mongo_connector.connector:304 - MongoConnector: OplogThread <OplogThread(Thread-2, started 140037811336960)> unexpectedly stopped! Shutting down
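Reading the traceback: the crash is a MemoryError raised in log_request_success while decoding a bulk request body, not an Elasticsearch timeout, which would explain why raising timeout changed nothing; the Python process is running out of RAM handling a large bulk chunk. If the doc manager honors the bulkSize option as documented (that it does is my assumption here; the default is 1000 documents per bulk request), shrinking the chunks may let the dump finish. Same sketch as above, with bulkSize added:

# Same config as above, plus smaller bulk chunks (bulkSize is per docManager)
cat > config.json <<'EOF'
{
    "mainAddress": "localhost:27017",
    "docManagers": [
        {
            "docManager": "elastic2_doc_manager",
            "targetURL": "localhost:9200",
            "bulkSize": 200,
            "args": {
                "clientOptions": {"timeout": 200}
            }
        }
    ]
}
EOF

# Re-run the full dump with the reduced chunk size
sudo mongo-connector -c config.json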


cat oplog.timestamp shows nothing at all; the file is empty (oplog.timestamp is where mongo-connector records its resume position, so an empty file means the initial collection dump never completed).

Please help!

@hanc00l
Owner

hanc00l commented Nov 21, 2016

I'm not sure of the exact cause either. But there is a quick workaround: copy the entire elasticsearch directory from the VM onto your own machine and run it directly; it contains the complete, already-synced data.

@jas502n
Author

jas502n commented Nov 22, 2016

It turned out to be a problem on the MongoDB side: you cannot simply copy the database files over. You have to back the database up with mongodump and restore it with mongorestore. I have to say, restoring with mongorestore is fast. Anyone not yet comfortable with MongoDB can refer to this article:
http://www.jb51.net/article/85854.htm
(a tutorial on backing up and restoring MongoDB data on Windows or Linux, from 脚本之家/jb51.net)
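For anyone trying the same thing, a sketch of that dump-and-restore round trip (the database name wooyun and the paths are assumptions; substitute your own):

# On the VM: dump the MongoDB database to a directory of BSON files
mongodump --host localhost:27017 --db wooyun --out /tmp/dump

# On the target machine: restore that database from the dumped directory
mongorestore --host localhost:27017 --db wooyun /tmp/dump/wooyun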

@zhouddan

Hello, I've recently run into this same problem in a project. How did you solve it in the end?
