Apache Solr 3 on Drupal 7 Tutorial with Screenshots
This page covers how to install and configure the Drupal 7 modules needed to connect to an existing Solr 3 installation.
On previous pages I have already covered:
- Installing one Tomcat server to host Solr
- Creating two Solr applications, each with multiple cores
Download the Drupal 7 Modules for Solr
1. Download the modules required
#drush dl search_api_solr apachesolr search_api_spellcheck apachesolr_autocomplete search_api
2. Download the Solr PHP client from http://code.google.com/p/solr-php-client/downloads/list (SolrPhpClient.r60.2011-05-04.zip)
3. Extract the file and put it in the /opt/www/drupal6/sites/all/libraries directory
|- SolrPhpClient
   |- Apache/
   |- ChangeLog
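Steps 2 and 3 can also be done from the command line. The following is a minimal sketch, assuming unzip is installed and that the archive unpacks into a top-level SolrPhpClient/ directory; both function names are hypothetical helpers for illustration, not part of Drush or the Solr PHP client.

```shell
#!/bin/sh
# Hypothetical helpers for steps 2-3; the names are illustrative only.

# Unpack the downloaded archive into the Drupal libraries directory.
# $1 = path to the SolrPhpClient zip, $2 = libraries directory
extract_solr_php_client() {
    mkdir -p "$2" && unzip -q -o "$1" -d "$2"
}

# Confirm the expected layout: SolrPhpClient/Apache/ and SolrPhpClient/ChangeLog.
# $1 = libraries directory
verify_solr_php_client() {
    [ -d "$1/SolrPhpClient/Apache" ] && [ -f "$1/SolrPhpClient/ChangeLog" ]
}

# Example usage (paths from this tutorial):
#   extract_solr_php_client SolrPhpClient.r60.2011-05-04.zip /opt/www/drupal6/sites/all/libraries
#   verify_solr_php_client /opt/www/drupal6/sites/all/libraries && echo "client in place"
```

The check only looks for the two entries listed in this tutorial, so it confirms the location of the library rather than its completeness.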
4. Enable the modules as shown
#drush en search_api_solr apachesolr search_api_spellcheck apachesolr_autocomplete search_api
Configure Solr to use the Drupal schema that comes with the apachesolr module
5. Before configuring the modules in Drupal, test the existing Solr instance.
Go to http://YourSolrServer:8983/Solr/core0/admin in your browser
6. Check that the Drupal Schema is showing in Solr similar to below.
Click on Schema and see that Drupal is mentioned.
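The same check can be made against the schema file on disk. This is a minimal sketch, assuming the schema shipped with the apachesolr module declares a name beginning with "drupal" on its <schema> element and that the name attribute sits on the same line as the tag (verify both against your copy); the function name is a hypothetical helper.

```shell
# Hypothetical helper: succeeds if the file's <schema> element has a
# name starting with "drupal" (case-insensitive).
# $1 = path to a schema.xml file
schema_is_drupal() {
    grep -qi '<schema[^>]*name="drupal' "$1"
}

# Example usage against the core used in this tutorial:
#   if schema_is_drupal apache-solr-3.5/example/solr/core0/conf/schema.xml; then
#       echo "Drupal schema installed"
#   else
#       echo "Drupal schema missing"
#   fi
```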
This is important: if Drupal is not listed in the schema, you will need to add the files from the solr-conf directory of the Drupal apachesolr module.
The apachesolr module comes with schema.xml, solrconfig.xml, and protwords.txt files, which must be used in your Solr installation.
drupal/sites/all/modules/apachesolr/solr-conf# ls
protwords.txt
schema.xml
solrconfig.xml
schema-solr3x.xml
solrconfig-solr3x.xml
7. Back up apache-solr-3.5/example/solr/core0/conf/schema.xml by renaming it to something like schema.bak. Then put the schema that comes with the apachesolr module in its place as schema.xml. The module ships solr-conf/schema.xml, but since we are using Solr 3.5, use solr-conf/schema-solr3x.xml instead.
8. Similarly, back up apache-solr-3.5/example/solr/core0/conf/solrconfig.xml by renaming it to solrconfig.bak. Then put the solr-conf/solrconfig-solr3x.xml that comes with the apachesolr module in its place as solrconfig.xml.
Make sure that the apache-solr-3.5/example/solr/core0/conf/ directory includes the following files. The Solr core may not load unless each one is present, even if only as an empty file:
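Steps 7 and 8 can be sketched as one shell function. This is a sketch under assumptions: it copies rather than moves the files so the module directory stays intact, it also copies protwords.txt, and the function name and .bak naming are illustrative rather than anything the module provides.

```shell
# Hypothetical helper: back up a core's schema.xml and solrconfig.xml,
# then install the Solr 3.x versions from the apachesolr module.
# $1 = the module's solr-conf directory, $2 = the core's conf directory
install_drupal_solr_conf() {
    src=$1
    dest=$2
    for name in schema solrconfig; do
        # Keep the original as NAME.bak if it exists.
        if [ -f "$dest/$name.xml" ]; then
            mv "$dest/$name.xml" "$dest/$name.bak" || return 1
        fi
        # The -solr3x variant must end up under the plain file name.
        cp "$src/$name-solr3x.xml" "$dest/$name.xml" || return 1
    done
    cp "$src/protwords.txt" "$dest/protwords.txt"
}

# Example usage (paths from this tutorial):
#   install_drupal_solr_conf drupal/sites/all/modules/apachesolr/solr-conf \
#       apache-solr-3.5/example/solr/core0/conf
```

Run it once per core. A second run on the same core would overwrite the .bak files with the Drupal versions, so keep a separate copy of the originals if you may need them.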
solrconfig.xml
schema.xml
elevate.xml
mapping-ISOLatin1Accent.txt
protwords.txt
stopwords.txt
synonyms.txt
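A quick way to confirm the list above is to test for each file in turn. This is a minimal sketch; the function name is a hypothetical helper, and it only reports missing files rather than creating them.

```shell
# Hypothetical helper: print each required file that is missing from a
# core's conf directory; the exit status is non-zero if any are missing.
# $1 = the core's conf directory
check_solr_conf_files() {
    missing=0
    for f in solrconfig.xml schema.xml elevate.xml \
             mapping-ISOLatin1Accent.txt protwords.txt \
             stopwords.txt synonyms.txt; do
        if [ ! -f "$1/$f" ]; then
            echo "missing: $1/$f"
            missing=1
        fi
    done
    return "$missing"
}

# Example usage:
#   check_solr_conf_files apache-solr-3.5/example/solr/core0/conf
#   check_solr_conf_files apache-solr-3.5/example/solr/core1/conf
```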
9. Do the same thing for the other core's directory in Solr, apache-solr-3.5/example/solr/core1/conf/.
You can then start Solr. For the example application, go to $SOLR/example/ and issue the following command: java -jar start.jar
10. Configure the Drupal search_api_solr module as shown
Go to http://drupalsite/admin/config/search/search_api and make sure it is mostly blank like this.
11. Configure the apachesolr search modules as shown
Go to http://drupalsite/admin/config/search/apachesolr and set as shown on the next few screens
12. Click the Settings tab and click the link circled in red to add the Solr server connection information.
13. To increase the number of pages sent for indexing on each cron run, click Advanced Configuration and change as shown.
14. On the settings page configure as shown
15. After saving the above screen, go back to it and click the test button to make sure it connects to the Solr server.
16. Next, click the Pages/Blocks tab
17. Edit the Core search settings as shown
Optional: Change the Solr configuration to commit records every 20 seconds instead of every 120 seconds, so items show up in the search sooner.
18. To change the 2-minute commit delay in Solr, edit the example/solr/conf/solrconfig.xml file
Change this section to shorten the autoCommit interval from 120 seconds to 20 seconds:
<autoCommit>
  <maxDocs>2000</maxDocs>
  <maxTime>120000</maxTime>
</autoCommit>
to
<autoCommit>
  <maxDocs>2000</maxDocs>
  <maxTime>20000</maxTime>
</autoCommit>
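The edit in step 18 can also be scripted with sed. This is a minimal sketch, assuming the <autoCommit> and </autoCommit> tags sit on separate lines and are not commented out in your solrconfig.xml; the function name and the .bak backup name are illustrative.

```shell
# Hypothetical helper: rewrite <maxTime> inside the <autoCommit> block.
# Assumes <autoCommit> and </autoCommit> are on separate lines and the
# block is not commented out.
# $1 = path to solrconfig.xml, $2 = new maxTime in milliseconds
set_autocommit_max_time() {
    # Keep a copy of the file as it was before the change.
    cp "$1" "$1.bak" || return 1
    # Only lines between the autoCommit tags are touched.
    sed "/<autoCommit>/,/<\/autoCommit>/ s|<maxTime>[0-9]*</maxTime>|<maxTime>$2</maxTime>|" \
        "$1.bak" > "$1"
}

# Example usage (20 seconds = 20000 ms):
#   set_autocommit_max_time example/solr/conf/solrconfig.xml 20000
```

Solr reads solrconfig.xml at startup, so restart Solr afterwards for the new interval to take effect.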
Solr with DOC and PDF Files
http://drupal.org/node/1377416
See Also:
Setting up Solr 4 with Tika support in Tomcat 6
Apache Solr, not able to index
Hi, I am not able to index my content. In the logs, the error is: Indexing failed on one of the following entity ids: node/1, node/2 "400" Status: Bad Request: Bad Request Apache Tomcat/6.0.33 - Error report HTTP Status 400 - ERROR:unknown field 'site' type Status report message ERROR:unknown field 'site' description The request sent by the client was syntactically incorrect (ERROR:unknown field 'site'). Apache Tomcat/6.0.33 Please help.
Index fails on nodes with bad data
I have seen something similar; the cause was some bad data in the content of one of the nodes. You might try changing or removing the content from the listed nodes, or even deleting the affected nodes so the rest of the site will be indexed, and then add the content back and troubleshoot.
drupal6
In step 3 you say drupal6; I guess this has to be drupal7. Unless you've named your Drupal 7 install "drupal6" ;) (Exactly what I did)
2 minute delay
Changing the 2 minute delay can have some performance implications. @source http://serverfault.com/questions/407407/why-set-a-delay-for-apache-solr-indexing
Yes, it depends on the site
In reply to 2 minute delay by Anonymous (not verified)
We have few updates, but want the new content indexed quickly, so reducing the delay helps us. As I understand it, if there are no additions, the Solr cache is not lost. But on a site with lots of updates and heavy traffic, reducing the delay could be a performance concern.