alkeshpatel11 edited this page Apr 5, 2013 · 7 revisions

Indexing is the process of making text content available for fast, real-time searching.

How to set up schema.xml before using the indexing API

Perform the following steps in sequence to create a Solr schema for a given UIMA CAS:

  1. Identify the features from your UIMA Typesystem that you want to index

    e.g. text, sentences, noun phrases, named entities, etc.

  2. Go to the SOLR_HOME directory of your Solr server deployment and create a new Solr core in the "multicore" folder.

    e.g. In a typical setup, this folder is located at ...apache-solr-3.6.1/example/multicore/.

  3. Create the core and add the relevant entry for it to solr.xml.

  4. Refer to the http://wiki.apache.org/solr/SchemaXml documentation to create schema.xml. Add the fields identified in step 1, with their corresponding field types and other options.

    e.g.
    ...
    <field name="id" type="string" indexed="true" stored="true" multiValued="false" required="true"/>
    <field name="text" type="text_en" indexed="true" stored="true" multiValued="true" termVectors="true" termPositions="true" termOffsets="true"/>
    ...

  5. Restart the Solr server so that your changes take effect. If the schema was created properly, you should see the new core at http://host:port/solr/

How to index a document?

  1. Create a HashMap of the fields to be indexed, i.e. key = field name, value = field value.
  2. Call indexDocument(HashMap<String,Object> indexMap) followed by indexCommit().
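
The two steps above can be sketched in Java roughly as follows. This is only an illustration under stated assumptions: the `Indexer` interface and the `InMemoryIndexer` class are hypothetical stand-ins so that the example runs without a Solr server, and only the method names `indexDocument` and `indexCommit` come from this page. The field names `id` and `text` match the example schema.xml fields above. Passing a `List` as the value for the multi-valued `text` field is also an assumption about how the real API handles multi-valued fields.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;

public class IndexingExample {

    /** Hypothetical stand-in for the indexing API; only the two method names come from this page. */
    interface Indexer {
        void indexDocument(HashMap<String, Object> indexMap);
        void indexCommit();
    }

    /** In-memory placeholder (not the real implementation) so the sketch runs without Solr. */
    static class InMemoryIndexer implements Indexer {
        final List<HashMap<String, Object>> pending = new ArrayList<>();
        final List<HashMap<String, Object>> committed = new ArrayList<>();

        @Override
        public void indexDocument(HashMap<String, Object> indexMap) {
            // Documents are only queued here; they are not "searchable" until commit.
            pending.add(new HashMap<>(indexMap));
        }

        @Override
        public void indexCommit() {
            // Move all queued documents to the committed list.
            committed.addAll(pending);
            pending.clear();
        }
    }

    /** Step 1: build the map of field name to field value. */
    static HashMap<String, Object> buildIndexMap(String id, List<String> sentences) {
        HashMap<String, Object> indexMap = new HashMap<>();
        indexMap.put("id", id);                            // single-valued, required field
        indexMap.put("text", new ArrayList<>(sentences));  // multi-valued field (assumed to accept a List)
        return indexMap;
    }

    public static void main(String[] args) {
        InMemoryIndexer indexer = new InMemoryIndexer();

        // Step 1: create the HashMap of fields to be indexed.
        HashMap<String, Object> doc =
                buildIndexMap("doc-1", Arrays.asList("First sentence.", "Second sentence."));

        // Step 2: index the document, then commit.
        indexer.indexDocument(doc);
        indexer.indexCommit();

        System.out.println("Committed documents: " + indexer.committed.size());
    }
}
```

When indexing many documents, it is usually cheaper to call `indexDocument` for each one and call `indexCommit` once at the end, since commits in Solr are comparatively expensive.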
