
SOLR

Updating Solr -

Two parts:

DATA IMPORT HANDLER (get data from the database into Solr)

Use the DataImportHandler (DIH) to load your data into Solr initially. You can then run a delta import, triggered by curl from a cron job, to periodically pull database changes into the index (a sketch of such a cron entry follows). If you need more real-time updates, a trigger in the database can kick off the curl call to the delta DIH.
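A minimal sketch of that cron entry, assuming a core named test with the DataImportHandler registered at /dataimport (host, core name, and schedule are placeholders):

# run a delta-import every 5 minutes and commit the changes
*/5 * * * * curl -s 'http://localhost:8983/solr/test/dataimport?command=delta-import&clean=false&commit=true'

The same endpoint with command=full-import triggers a full rebuild.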

Sync Solr and the database using Solr's delta-import feature [ http://wiki.apache.org/solr/DataImportHandler#Delta-Import_Example ]

[ http://stackoverflow.com/questions/3226489/sync-solr-documents-with-database-records ]

Handling deleted / updated docs with Solr DIH: http://stackoverflow.com/questions/1555610/solr-dih-how-to-handle-deleted-documents?lq=1
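The wiki's Delta-Import example boils down to a data-config.xml along these lines; a sketch assuming a hypothetical item table with a last_modified column and a deleted_item table recording removed ids (driver, connection details, and table/column names are placeholders):

<dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost/mydb" user="user" password="pass"/>
  <document>
    <entity name="item" pk="id"
            query="SELECT * FROM item"
            deltaQuery="SELECT id FROM item WHERE last_modified > '${dih.last_index_time}'"
            deltaImportQuery="SELECT * FROM item WHERE id = '${dih.delta.id}'"
            deletedPkQuery="SELECT id FROM deleted_item WHERE deleted_at > '${dih.last_index_time}'">
      <field column="id" name="id"/>
      <field column="name" name="name"/>
    </entity>
  </document>
</dataConfig>

Running command=delta-import (as in the cron sketch above) then picks up rows modified or deleted since the last index time.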

solr performance [http://blog.michaelhamrah.com/2011/11/solr-improving-performance-and-other-considerations/]

COMMIT THE DATA

There are two options:

1. Hard commit

2. Soft commit

A hard commit uses the transaction log to get the id of the latest document changes, and also calls fsync on the index files to ensure they have been flushed to stable storage and no data loss will result from a power failure.

A soft commit is much faster since it only makes index changes visible and does not fsync index files or write a new index descriptor. If the JVM crashes or there is a loss of power, changes that occurred after the last hard commit will be lost. Search collections that have NRT requirements (that want index changes to be quickly visible to searches) will want to soft commit often but hard commit less frequently. A softCommit may be "less expensive" in terms of time, but not free, since it can slow throughput.[https://cwiki.apache.org/confluence/display/solr/Near+Real+Time+Searching]

[https://lucidworks.com/blog/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/]

A common configuration is to do a hard autoCommit every 1-10 minutes and an autoSoftCommit every second. With this configuration, new documents show up within about a second of being added, and if the power goes out, anything soft-committed since the last hard commit is lost.

Data sent to Solr is not searchable until it has been committed to the index. 

<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxDocs>10000</maxDocs>
    <maxTime>1000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
</updateHandler>

[https://cwiki.apache.org/confluence/display/solr/UpdateHandlers+in+SolrConfig]
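To get the hard-commit-rarely / soft-commit-often behaviour described above, an autoSoftCommit block can sit next to autoCommit in solrconfig.xml; a sketch assuming a 10-minute hard commit and a 1-second soft commit:

<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxTime>600000</maxTime>        <!-- hard commit every 10 minutes, flushes to disk -->
    <openSearcher>false</openSearcher>
  </autoCommit>
  <autoSoftCommit>
    <maxTime>1000</maxTime>          <!-- soft commit every second, makes changes visible -->
  </autoSoftCommit>
</updateHandler>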

You can add documents as soon as they are posted. Solr will first write them into memory and then, depending on your configuration settings, after a certain amount of time has passed or a certain number of documents are pending, it will "commit" them to the index on disk. [http://stackoverflow.com/questions/2086995/which-method-of-solr-implementation-should-i-use-3-questions?lq=1]

Setup SOLR

quick start http://lucene.apache.org/solr/quickstart.html

faceting https://cwiki.apache.org/confluence/display/solr/Faceting

searching https://cwiki.apache.org/confluence/display/solr/Searching

node client - https://github.com/lbdremy/solr-node-client

node client async - http://stackoverflow.com/questions/26566535/nodejs-solrclient-how-to-wait-for-response

solr query builder - https://github.com/maxcnunes/solr-query-builder

Running SOLR

bin/solr start

bin/solr create -c test

 - the new core is created at /usr/local/solr-5.5.0/server/solr/test

schema.xml is not there - cores created this way use the managed schema by default:

[ https://cwiki.apache.org/confluence/display/solr/Managed+Schema+Definition+in+SolrConfig ]

schema.xml issue http://stackoverflow.com/questions/31320696/solr-error-creating-core-fieldtype-x-not-found-in-the-schema

bin/solr delete -c test

bin/post -c test server/solr/test/conf/docs.xml
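Documents can also be pushed straight to the update handler with curl; a minimal sketch assuming the test core and a small JSON payload:

curl 'http://localhost:8983/solr/test/update?commit=true' \
     -H 'Content-Type: application/json' \
     -d '[{"id": "1", "title": "hello solr"}]'

With commit=true the document is searchable immediately; otherwise it becomes visible on the next (soft) commit.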

Node solr

add with options - https://github.com/lbdremy/solr-node-client/issues/134
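A minimal usage sketch with solr-node-client (callback-style API as of this writing; host, port, and core name are assumptions matching the test core above):

var solr = require('solr-client');

// assumes Solr is running locally with the "test" core created earlier
var client = solr.createClient({ host: '127.0.0.1', port: '8983', core: 'test' });

// add a document, then commit so it becomes searchable
client.add({ id: '1', title: 'hello solr' }, function (err, res) {
  if (err) return console.error(err);
  client.commit(function (err, res) {
    if (err) return console.error(err);
    console.log('committed', res);
  });
});

The issue linked above discusses passing extra options (e.g. commitWithin) along with add.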