new structure

zhou1xiang2 · Dec 7, 2017 · d13495c
1 parent e6bc0e1
Showing 136 changed files with 26,672 additions and 14,294 deletions.
diff --git a/.gitignore b/.gitignore
@@ -1,28 +1,15 @@
 *.pyc
+*.txt
 *.log
 *.swp
 *.bak
 *.weights
 *.trec
 *.ranklist
 *.DS_Store
-*.mq2007
-matchzoo/*.txt
-data/mq2008/*
-data/mq2007
-data/toutiao
-data/example/*
-data/toutiao_jieba_new
-data/robust/*
 build/
+dist/
 log/*
 matchzoo/log/*
-qrels.*
-trec_eval
-matchzoo/lydev/*
-#matchzoo/models/*.config
-matchzoo/run_submit_gypsum_jobs_wikiqa.py
-matchzoo/run_model.py
-matchzoo/run_model_wraper.py
 log/*
 .idea/
diff --git a/MatchZoo.egg-info/PKG-INFO b/MatchZoo.egg-info/PKG-INFO
@@ -1,7 +1,7 @@
 Metadata-Version: 1.1
 Name: MatchZoo
-Version: 1.0
-Summary: MatchingZoom is a toolkit for text matching.It was developed with a focus on enabling fast experimentation.
+Version: 0.2.0
+Summary: MatchZoo is a toolkit for text matching. It was developed with a focus on facilitating the designing, comparing and sharing of deep text matching models.
 Home-page: https://github.com/faneshion/MatchZoo
 Author: Yixing Fan, Liang Pang, Jianpeng Hou, Jiafeng Guo, Yanyan Lan, Xueqi Cheng
 Author-email: [email protected]
@@ -12,6 +12,10 @@ Platform: UNKNOWN
 Classifier: Development Status :: 3 - Alpha
 Classifier: Environment :: Console
 Classifier: Operating System :: POSIX :: Linux
-Classifier: Programming Language :: Python :: 2.7
 Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
-Classifier: License :: OSI Approved :: BSD License
+Classifier: License :: OSI Approved :: Apache License
+Classifier: Programming Language :: Python :: 2.7
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.4
+Classifier: Programming Language :: Python :: 3.5
+Classifier: Programming Language :: Python :: 3.6
diff --git a/MatchZoo.egg-info/SOURCES.txt b/MatchZoo.egg-info/SOURCES.txt
@@ -25,4 +25,5 @@ matchzoo/metrics/evaluations.py
 matchzoo/metrics/rank_evaluations.py
 matchzoo/utils/__init__.py
 matchzoo/utils/rank_io.py
+matchzoo/utils/roc_auc.py
 matchzoo/utils/utility.py
diff --git a/MatchZoo.egg-info/requires.txt b/MatchZoo.egg-info/requires.txt
@@ -3,3 +3,5 @@ tensorflow >= 1.1.0
 nltk >= 3.2.3
 numpy >= 1.12.1
 six >= 1.10.0
+h5py >= 2.7.0
+tqdm >= 4.19.4
diff --git a/README.md b/README.md
@@ -1,5 +1,5 @@
 <div align='center'>
-<img src="./data/matchzoo-logo.png" width = "400"  alt="图片名称" align=center />
+<img src="./docs/_static/images/matchzoo-logo.png" width = "400"  alt="图片名称" align=center />
 </div>
 
 ---
@@ -55,14 +55,14 @@ In the main directory, this will install the dependencies automatically.
 
 For usage examples, you can run
 ```
-python main.py --phase train --model_file ./models/arci_ranking.config
-python main.py --phase predict --model_file ./models/arci_ranking.config
+python matchzoo/main.py --phase train --model_file examples/toy_example/config/arci_ranking.config
+python matchzoo/main.py --phase predict --model_file examples/toy_example/config/arci_ranking.config
 ```
 
 ## Overview
 The architecture of the MatchZoo toolkit is shown in the following figure:
 <div align='center'>
-<img src="./data/matchzoo.png" width = "400" height = "200" alt="图片名称" align=center />
+<img src="./docs/_static/images/matchzoo.png" width = "400" height = "200" alt="图片名称" align=center />
 </div>
 There are three major modules in the toolkit: data preparation, model construction, and training and evaluation. These three modules are organized as a data-flow pipeline.
 
@@ -87,11 +87,11 @@ Here, we adopt <a href="https://www.microsoft.com/en-us/download/details.aspx?id
 
 Take the DRMM as an example. In training phase, you can run
 ```
-python main.py --phase train --model_file models/wikiqa_config/drmm_wikiqa.config
+python matchzoo/main.py --phase train --model_file examples/wikiqa/config/drmm_wikiqa.config
 ```
 In testing phase, you can run
 ```
-python main.py --phase predict --model_file models/wikiqa_config/drmm_wikiqa.config
+python matchzoo/main.py --phase predict --model_file examples/wikiqa/config/drmm_wikiqa.config
 ```
 
 We have compared 10 models, the results are as follows.
@@ -166,12 +166,12 @@ We have compared 10 models, the results are as follows.
 </table>
 The loss of each model is shown in the following figure:
  <div align='center'>
-<img src="./data/matchzoo.wikiqa.loss.png" width = "550" alt="图片名称" align=center />
+<img src="./docs/_static/images/matchzoo.wikiqa.loss.png" width = "550" alt="图片名称" align=center />
 </div>
 
 The MAP of each model is shown in the following figure:
 <div align='center'>
-<img src="./data/matchzoo.wikiqa.map.png" width = "550" alt="图片名称" align=center />
+<img src="./docs/_static/images/matchzoo.wikiqa.map.png" width = "550" alt="图片名称" align=center />
 </div>
 Here, DRMM_TKS is a variant of DRMM for short text matching. Specifically, the matching histogram is replaced by a top-k max-pooling layer, and the remaining parts are unchanged.
 
@@ -297,11 +297,11 @@ Development Teams
 
 Acknowledgements
 =====
-We would like to express our appreciation to the following people for contributing source code to MatchZoo, including [Yixing Fan](https://scholar.google.com/citations?user=w5kGcUsAAAAJ&hl=en), [Liang Pang](https://scholar.google.com/citations?user=1dgQHBkAAAAJ&hl=zh-CN), [Liu Yang](https://sites.google.com/site/lyangwww/), [Lijuan Chen](), [Jianpeng Hou](https://github.com/HouJP), [Zhou Yang](), [Niuguo cheng](https://github.com/niuox) etc..
+We would like to express our appreciation to the following people for contributing source code to MatchZoo, including [Yixing Fan](https://scholar.google.com/citations?user=w5kGcUsAAAAJ&hl=en), [Liang Pang](https://scholar.google.com/citations?user=1dgQHBkAAAAJ&hl=zh-CN), [Liu Yang](https://sites.google.com/site/lyangwww/), [Yukun Zheng](), [Lijuan Chen](), [Jianpeng Hou](https://github.com/HouJP), [Zhou Yang](), [Niuguo cheng](https://github.com/niuox) etc..
 
 Feedback and Join Us
 =====
 Feel free to post any questions or suggestions on [GitHub Issues](https://github.com/faneshion/MatchZoo/issues) and we will reply to your questions there. You can also suggest adding new deep text matching models to MatchZoo and apply to join us in developing MatchZoo together.
 <div align='center'>
-<img src="./data/matchzoo-group.jpeg" width = "200"  alt="图片名称" align=center />
+<img src="./docs/_static/images/matchzoo-group.jpeg" width = "200"  alt="图片名称" align=center />
 </div>
diff --git a/Team.md b/Team.md
@@ -11,6 +11,9 @@ The following people contributed to the development of the MatchZoo project：
 - **Liu Yang (Core Developer)** 
     - PhD. student from Center for Intelligent Information Retrieval, University of Massachusetts Amherst
     - [HomePage](https://sites.google.com/site/lyangwww/)
+- **Yukun Zheng (Core Developer)** 
+    - Master student from Tsinghua University
+    - [HomePage]()
 - **Zhou Yang (Core Developer)** 
     - Master student from Chongqing University of Technology
     - [HomePage]()

diff --git a/build/lib/matchzoo/inputs/__init__.py b/build/lib/matchzoo/inputs/__init__.py
@@ -1,5 +1,39 @@
-# note 
-from pair_generator import PairGenerator
-from pair_generator import DRMM_PairGenerator
-from list_generator import ListGenerator
-from list_generator import DRMM_ListGenerator
+# note
+from __future__ import absolute_import
+import six
+from keras.utils.generic_utils import deserialize_keras_object
+
+from .point_generator import PointGenerator
+from .point_generator import Triletter_PointGenerator
+from .point_generator import DRMM_PointGenerator
+
+from .pair_generator import PairGenerator
+from .pair_generator import Triletter_PairGenerator
+from .pair_generator import DRMM_PairGenerator
+from .pair_generator import PairGenerator_Feats
+from .list_generator import ListGenerator
+from .list_generator import Triletter_ListGenerator
+from .list_generator import DRMM_ListGenerator
+from .list_generator import ListGenerator_Feats
+
+def serialize(generator):
+    return generator.__name__
+
+def deserialize(name, custom_objects=None):
+    return deserialize_keras_object(name,
+                                    module_objects=globals(),
+                                    custom_objects=custom_objects,
+                                    printable_module_name='input generator')
+
+def get(identifier):
+    if identifier is None:
+        return None
+    if isinstance(identifier, six.string_types):
+        identifier = str(identifier)
+        return deserialize(identifier)
+    elif callable(identifier):
+        return identifier
+    else:
+        raise ValueError('Could not interpret '
+                         'input generator identifier:', identifier)
+
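
Note on the new `get` helper in `inputs/__init__.py`: it appears to follow the Keras `get`/`deserialize` pattern, turning a generator name into the matching class. The sketch below shows that lookup in isolation. It is a minimal sketch under stated assumptions, not the committed code: the two classes are empty stand-ins, `_GENERATORS` is a hypothetical dictionary that plays the role of `globals()`, and the dictionary lookup replaces `deserialize_keras_object` so the example runs without Keras. It also checks `str` rather than `six.string_types`, so it assumes Python 3.

```python
# Empty stand-ins for the generator classes imported in inputs/__init__.py.
class PairGenerator(object):
    pass


class ListGenerator(object):
    pass


# Hypothetical registry; the committed code passes globals() to Keras instead.
_GENERATORS = {cls.__name__: cls for cls in (PairGenerator, ListGenerator)}


def serialize(generator):
    """Return the name under which a generator class can be looked up."""
    return generator.__name__


def get(identifier):
    """Resolve None, a class name, or a callable to a generator class."""
    if identifier is None:
        return None
    if isinstance(identifier, str):
        if identifier not in _GENERATORS:
            raise ValueError('Unknown input generator: ' + identifier)
        return _GENERATORS[identifier]
    if callable(identifier):
        # Already a class or factory: hand it back unchanged.
        return identifier
    raise ValueError('Could not interpret input generator identifier: %r'
                     % (identifier,))
```

With this pattern, a config file can name the generator as a plain string, for example `get('PairGenerator')`, and the caller never needs a hard-coded import for each class.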