Apache Solr 3 on Drupal 7 Tutorial with screenshots

This page covers how to install and configure the Drupal 7 modules needed to connect to an existing Solr 3 instance.

On previous pages I have already covered:

Download the Drupal 7 Modules for Solr

1. Download the modules required

#drush dl search_api_solr apachesolr search_api_spellcheck apachesolr_autocomplete search_api

2. Download the Solr PHP client from http://code.google.com/p/solr-php-client/downloads/list (SolrPhpClient.r60.2011-05-04.zip)

3. Extract the file and put it in the /opt/www/drupal6/sites/all/libraries directory:

     DRUPAL_ROOT/sites/all/libraries/
      |- SolrPhpClient
         |- Apache/
         |- ChangeLog
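A quick way to confirm the layout above is a small shell check. This is only a sketch: the function name `check_solr_php_client` is made up for illustration, and it assumes the zip extracts to a top-level SolrPhpClient directory, as the tree above suggests.

```shell
#!/bin/sh
# Hypothetical helper: verify that the Solr PHP client sits where the
# tutorial expects it. Pass your Drupal root as the only argument.
check_solr_php_client() {
    lib_dir="$1/sites/all/libraries/SolrPhpClient"
    if [ -d "$lib_dir/Apache" ]; then
        echo "OK: found $lib_dir/Apache"
        return 0
    fi
    echo "MISSING: $lib_dir/Apache" >&2
    return 1
}

# Example (path taken from step 3, adjust to your install):
#   check_solr_php_client /opt/www/drupal6
```

If the check fails, the archive was probably extracted one level too deep or too shallow.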

4. Enable the modules as shown

#drush en search_api_solr apachesolr search_api_spellcheck apachesolr_autocomplete search_api

[Screenshot: enable apachesolr modules]

[Screenshot: search_api modules]

Configure Solr to use the Drupal schema that comes with the apachesolr module

5. Before configuring the modules in Drupal, test the existing Solr instance.

Go to http://YourSolrServer:8983/solr/core0/admin in your browser. Note that the path is case-sensitive; the example Jetty setup serves Solr under lowercase /solr.
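The same check can be done from the command line. The sketch below assumes `curl` is installed and that Solr is served under the lowercase `/solr` context path (the default for the Jetty example); the function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: build the admin URL for a core.
# Arguments: host, port, core name.
solr_admin_url() {
    printf 'http://%s:%s/solr/%s/admin/\n' "$1" "$2" "$3"
}

# Hypothetical helper: print the HTTP status code of the admin page.
# -s silences progress output, -L follows redirects,
# -o /dev/null discards the page body, -w prints only the status code.
solr_admin_status() {
    curl -s -L -o /dev/null -w '%{http_code}' "$(solr_admin_url "$1" "$2" "$3")"
}

# Example:
#   solr_admin_status YourSolrServer 8983 core0
```

A status of 200 suggests the core is up; anything else means Solr is not reachable at that address or the core name is wrong.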

6. Check that the Drupal schema shows up in Solr, similar to the screenshot below.

[Screenshot: Solr Admin]

Click on Schema and see that Drupal is mentioned.

This is important: If Drupal is not listed in the schema, then you will need to add the files from the solr-conf directory of the Drupal ApacheSolr module. 
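If you have shell access to the Solr server, you can also inspect the schema file directly. This sketch assumes the Drupal schema declares a name beginning with "drupal" on its `<schema>` element; check your copy of the file if in doubt. The function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given schema.xml looks like the
# Drupal one, i.e. its <schema> element has a name starting with "drupal".
# ASSUMPTION: the apachesolr schema is named that way.
is_drupal_schema() {
    grep -qi '<schema[^>]*name="drupal' "$1"
}

# Example:
#   is_drupal_schema apache-solr-3.5/example/solr/core0/conf/schema.xml \
#       && echo "Drupal schema in place" \
#       || echo "Drupal schema NOT found"
```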

The apachesolr module comes with schema.xml, solrconfig.xml, and protwords.txt files, which
must be used in your Solr installation.

     drupal/sites/all/modules/apachesolr/solr-conf# ls

     protwords.txt
     schema.xml
     solrconfig.xml
     schema-solr3x.xml
     solrconfig-solr3x.xml

7. Back up apache-solr-3.5/example/solr/core0/conf/schema.xml by renaming it to something like schema.bak. Then put the solr-conf/schema.xml that comes with the apachesolr module in its place. Since we are using Solr 3.5 or later, we can use solr-conf/schema-solr3x.xml instead; save it under the name schema.xml.

8. Similarly, back up apache-solr-3.5/example/solr/core0/conf/solrconfig.xml by renaming it to solrconfig.bak. Then put the solr-conf/solrconfig-solr3x.xml that comes with the apachesolr module in its place, saved under the name solrconfig.xml.
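Steps 7 and 8 can be scripted. The following is a minimal sketch, not a tested installer: the function name is hypothetical, and it copies the files instead of moving them so the module directory stays intact.

```shell
#!/bin/sh
# Hypothetical helper: back up the stock Solr config and install the
# Drupal one.
# Arguments: the module's solr-conf directory, the core's conf directory.
install_drupal_solr_conf() {
    src="$1"
    dest="$2"
    for name in schema solrconfig; do
        # Keep the original as NAME.bak, as in steps 7 and 8.
        if [ -f "$dest/$name.xml" ]; then
            mv "$dest/$name.xml" "$dest/$name.bak" || return 1
        fi
        # Install the Solr 3.x variant under the name Solr loads.
        cp "$src/$name-solr3x.xml" "$dest/$name.xml" || return 1
    done
    cp "$src/protwords.txt" "$dest/protwords.txt"
}

# Example:
#   install_drupal_solr_conf drupal/sites/all/modules/apachesolr/solr-conf \
#       apache-solr-3.5/example/solr/core0/conf
```

Running it a second time would overwrite the .bak files with the Drupal versions, so run it once per core.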

Make sure that the apache-solr-3.5/example/solr/core0/conf/ directory includes the following files. The Solr core may not load unless each one is present, even if only as an empty file:

solrconfig.xml
schema.xml
elevate.xml
mapping-ISOLatin1Accent.txt
protwords.txt
stopwords.txt
synonyms.txt
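A short script can check this list for you. In this sketch (function name hypothetical) the supporting files are created empty when missing, while schema.xml and solrconfig.xml are only reported, on the assumption that empty copies of those two would not be useful.

```shell
#!/bin/sh
# Hypothetical helper: make sure a core's conf directory has every file
# from the list above. Pass the conf directory as the only argument.
ensure_conf_files() {
    conf_dir="$1"
    status=0
    # ASSUMPTION: these two need real content, so only report them.
    for f in solrconfig.xml schema.xml; do
        if [ ! -s "$conf_dir/$f" ]; then
            echo "missing or empty: $conf_dir/$f" >&2
            status=1
        fi
    done
    # Supporting files: create an empty placeholder if absent.
    for f in elevate.xml mapping-ISOLatin1Accent.txt protwords.txt \
             stopwords.txt synonyms.txt; do
        if [ ! -e "$conf_dir/$f" ]; then
            : > "$conf_dir/$f"
            echo "created empty $conf_dir/$f"
        fi
    done
    return "$status"
}

# Example:
#   ensure_conf_files apache-solr-3.5/example/solr/core0/conf
```

Existing files are never touched, so it is safe to rerun.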

9. Do the same for the other multicore directory in Solr, apache-solr-3.5/example/solr/core1/conf/.

You can then start Solr. For the example application, go to $SOLR/example/ and issue the following command: java -jar start.jar

10. Configure the Drupal search_api_solr module as shown

Go to http://drupalsite/admin/config/search/search_api and make sure it is mostly blank, like this.

[Screenshot: search-api-blank]

11. Configure the apachesolr search modules as shown

Go to http://drupalsite/admin/config/search/apachesolr and set as shown on the next few screens

[Screenshot: apachesolr search1]

12. Click the Settings tab and click the link circled in red to add the Solr server connection information.

[Screenshot: apachesolr search2]

13. To increase the number of pages sent for indexing on each cron run, click Advanced Configuration and change the setting as shown.

[Screenshot: Number per Cron]

14. On the settings page, configure as shown

[Screenshot: apachesolr search2a]

15. After saving the above screen, go back to it and click the test button to make sure it connects to the Solr server.

 

16. Next, click the Pages/Blocks tab

[Screenshot: pages-blocks2]

17. Edit the Core search settings as shown

[Screenshot: pages-blocks]

Optional: Change the Solr configuration to commit records every 20 seconds instead of every 120 seconds, so items show up in the search sooner.

18. To change the 2-minute commit delay in Solr, edit the example/solr/conf/solrconfig.xml file (in a multicore setup, each core has its own copy under its conf directory).

Change this section to shorten the autoCommit interval from 120 seconds to 20 seconds:

     <autoCommit>
       <maxDocs>2000</maxDocs>
       <maxTime>120000</maxTime>
     </autoCommit>

to

     <autoCommit>
       <maxDocs>2000</maxDocs>
       <maxTime>20000</maxTime>
     </autoCommit>

 

Solr with DOC and PDF Files

http://drupal.org/node/1377416

See Also:

 

Setting up Solr 4 with Tika support in Tomcat 6

 

Hi, I am not able to index my content. The error in the logs is: Indexing failed on one of the following entity ids: node/1, node/2. "400" Status: Bad Request: Bad Request. Apache Tomcat/6.0.33 - Error report: HTTP Status 400 - ERROR:unknown field 'site'. type: Status report. message: ERROR:unknown field 'site'. description: The request sent by the client was syntactically incorrect (ERROR:unknown field 'site'). Apache Tomcat/6.0.33. Please help.

I have seen something similar; the cause was some bad data in the content of one of the nodes. You might try changing or removing the content from the listed nodes, or even deleting the affected nodes so the rest of the site gets indexed, and then add the content back and troubleshoot.

In step 3 you say drupal6; I guess this has to be Drupal 7. Unless you've named your Drupal 7 install "drupal6" ;) (Exactly what I did)

Changing the 2 minute delay can have some performance implications. @source http://serverfault.com/questions/407407/why-set-a-delay-for-apache-solr-indexing


We have few updates but want new content indexed quickly, so reducing the delay helps us. As I understand it, if there are no additions, the Solr cache is not lost. But on a site with lots of updates and heavy traffic, reducing the delay could be a performance concern.