Using Solr With TYPO3 On Debian Wheezy


TYPO3’s default search extension, “Indexed Search”, is fine for small web sites, but on bigger sites (more than 500 pages) it becomes very slow. Fortunately, you can replace it with a search extension that uses the ultra-fast Apache Solr search server. This tutorial explains how to use Apache Solr with TYPO3 on Debian Wheezy.

I do not issue any guarantee that this will work for you!

 

1 Preliminary Note

In this tutorial I’m using two servers:

  • server1.example.com (IP: 192.168.0.100): web server where the TYPO3 4.7 Introduction package is installed (in the www.example.com vhost).
  • server2.example.com (IP: 192.168.0.101): separate server where I will install Apache Solr.

Of course, it’s possible to install Solr on the same system as the web server; however, I’d like to split up both services so that they do not impact each other’s performance.

 

2 Installing Solr

server2.example.com:

First install Java:

apt-get install openjdk-6-jdk openjdk-6-jre unzip

update-alternatives --config java
update-alternatives --config javac

The TYPO3 project provides a Solr installation script which we download as follows:

wget http://forge.typo3.org/projects/extension-solr/repository/revisions/master/raw/resources/shell/install-solr.sh
chmod 755 install-solr.sh

Next check what the current Apache Tomcat 6 version is by visiting http://tomcat.apache.org/download-60.cgi. At the time of this writing it was 6.0.37. Now open install-solr.sh

vi install-solr.sh

… and make sure that the TOMCAT_VER variable holds the correct version number – if necessary, change it:

[...]
TOMCAT_VER=6.0.37
[...]
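
If you prefer not to edit the script by hand, the same change can be made with sed. This is only a sketch: the helper name set_tomcat_ver is my own, and it assumes that the assignment starts at the beginning of a line, as in the excerpt above.

```shell
#!/bin/sh
# Hypothetical helper (not part of install-solr.sh).
# Usage: set_tomcat_ver <script> <version>
set_tomcat_ver() {
    script="$1"
    version="$2"
    # Rewrite only the line that starts with TOMCAT_VER=; all other lines stay as they are.
    sed -i "s/^TOMCAT_VER=.*/TOMCAT_VER=${version}/" "$script"
    # Print the resulting line so that the change can be checked by eye.
    grep '^TOMCAT_VER=' "$script"
}

# Example (run in the directory that holds install-solr.sh):
# set_tomcat_ver install-solr.sh 6.0.37
```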

Now we install Solr. By default (if you don’t provide any languages as parameters), Solr is installed with support for the English language only; if you need support for more languages, just append them to the command, e.g. like this:

./install-solr.sh german english french

This installs Solr together with the Tomcat server it runs on. By default, Tomcat listens on 127.0.0.1 only. Because we want to access Solr from a remote host, we must configure Tomcat to listen on all interfaces, so we replace 127.0.0.1 with 0.0.0.0 in /opt/solr-tomcat/tomcat/conf/server.xml:

vi /opt/solr-tomcat/tomcat/conf/server.xml

[...]
    <Connector port="8080" protocol="HTTP/1.1"
               maxHttpHeaderSize="65536"
               connectionTimeout="20000"
               redirectPort="8443"
               address="0.0.0.0"
               URIEncoding="UTF-8" />
[...]
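
The replacement can also be scripted. The following sketch keeps a backup copy and swaps the address attribute; the helper name set_connector_address is an assumption of mine, and it only works if the file contains address="127.0.0.1" in exactly this form.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Tomcat or the TYPO3 installer).
# Usage: set_connector_address <server.xml> <new-address>
set_connector_address() {
    conf="$1"
    addr="$2"
    # Keep a backup next to the original file before changing it.
    cp "$conf" "$conf.bak"
    # Swap the loopback address for the new one.
    sed -i "s/address=\"127\.0\.0\.1\"/address=\"${addr}\"/" "$conf"
    # Show every address attribute so that the result can be checked.
    grep -n 'address=' "$conf"
}

# Example:
# set_connector_address /opt/solr-tomcat/tomcat/conf/server.xml 0.0.0.0
```

Instead of 0.0.0.0 you could also pass the server's own LAN address (192.168.0.101 in this setup), so that Tomcat listens on that interface only.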

Restart Tomcat:

/opt/solr-tomcat/tomcat/bin/shutdown.sh
/opt/solr-tomcat/tomcat/bin/startup.sh
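
Since Tomcat has to be restarted several times in this tutorial, the two commands can be wrapped in a small function. This is a sketch under my own assumptions: the function name restart_tomcat and the default pause of five seconds are not taken from the installer.

```shell
#!/bin/sh
# Hypothetical helper for the restarts used in this tutorial.
# Usage: restart_tomcat [tomcat-dir] [seconds-to-wait]
restart_tomcat() {
    tomcat_home="${1:-/opt/solr-tomcat/tomcat}"
    pause="${2:-5}"
    "$tomcat_home/bin/shutdown.sh"
    # shutdown.sh returns before the JVM has exited, so wait a moment
    # before starting Tomcat again.
    sleep "$pause"
    "$tomcat_home/bin/startup.sh"
}

# Example:
# restart_tomcat
```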

Next we can configure cores in Solr. By default, an English core is already configured; if you need more languages, you can add them to /opt/solr-tomcat/solr/solr.xml, e.g. like this:

vi /opt/solr-tomcat/solr/solr.xml

<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="true">
        <cores adminPath="/admin/cores" shareSchema="true">
                <core name="core_en" instanceDir="typo3cores" schema="english/schema.xml" dataDir="data/core_en" />
                <core name="core_de" instanceDir="typo3cores" schema="german/schema.xml" dataDir="data/core_de" />
                <core name="core_fr" instanceDir="typo3cores" schema="french/schema.xml" dataDir="data/core_fr" />
        </cores>
</solr>
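
All core entries follow the same pattern, so further lines can be generated instead of typed. The function below is a sketch with a name of my own choosing (core_line); it assumes that the schema directory for the language exists, which should be the case if the language was passed to install-solr.sh.

```shell
#!/bin/sh
# Hypothetical helper that prints one <core> line in the pattern used above.
# Usage: core_line <language-code> <schema-directory>
core_line() {
    code="$1"
    schema_dir="$2"
    printf '<core name="core_%s" instanceDir="typo3cores" schema="%s/schema.xml" dataDir="data/core_%s" />\n' \
        "$code" "$schema_dir" "$code"
}

# Example: print the entry for the German core.
# core_line de german
```

The printed line still has to be pasted inside the <cores> container of solr.xml by hand.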

Restart Solr afterwards:

/opt/solr-tomcat/tomcat/bin/shutdown.sh
/opt/solr-tomcat/tomcat/bin/startup.sh

Because we don’t want to start Tomcat manually each time the server is booted, we can add the Tomcat startup command to /etc/rc.local:

vi /etc/rc.local

[...]
/opt/solr-tomcat/tomcat/bin/startup.sh
[...]
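
On Debian, /etc/rc.local normally ends with exit 0, and the startup command must be placed before that line, otherwise it is never reached. The following sketch inserts a command in the right place and does nothing if the command is already there; the function name add_to_rc_local is an assumption of mine.

```shell
#!/bin/sh
# Hypothetical helper (not part of Debian).
# Usage: add_to_rc_local <rc.local-file> <command>
add_to_rc_local() {
    rc="$1"
    cmd="$2"
    # Do nothing if the exact command line is already present.
    if grep -qxF -- "$cmd" "$rc"; then
        return 0
    fi
    if grep -q '^exit 0' "$rc"; then
        # Print the command just before the first line that starts with "exit 0".
        awk -v cmd="$cmd" '!seen && /^exit 0/ { print cmd; seen = 1 } { print }' "$rc" > "$rc.tmp" &&
            cat "$rc.tmp" > "$rc" &&   # write back in place so the file keeps its permissions
            rm -f "$rc.tmp"
    else
        # No "exit 0" line found: append the command at the end.
        printf '%s\n' "$cmd" >> "$rc"
    fi
}

# Example:
# add_to_rc_local /etc/rc.local /opt/solr-tomcat/tomcat/bin/startup.sh
```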

2.1 Adding Authentication To Solr

Because Solr is listening on all interfaces, it is a good idea to add authentication to it. I will now configure the user user1 with the password secret for the English core.

Open /opt/solr-tomcat/tomcat/conf/web.xml

vi /opt/solr-tomcat/tomcat/conf/web.xml

… and add the following section somewhere inside the <web-app> container:

[...]
  <security-constraint>
    <web-resource-collection>
      <web-resource-name>Solr authenticated application</web-resource-name>
      <url-pattern>/core_en/*</url-pattern>
    </web-resource-collection>
    <auth-constraint>
      <role-name>role1</role-name>
    </auth-constraint>
  </security-constraint>

  <login-config>
    <auth-method>BASIC</auth-method>
    <realm-name>Admin and Update protection</realm-name>
  </login-config>
[...]

As you can see, this applies to the English core only (<url-pattern>/core_en/*</url-pattern>), and I’ve configured it for the role role1, so valid users must belong to that role. To add the user user1 with the password secret to that role, open /opt/solr-tomcat/tomcat/conf/tomcat-users.xml

vi /opt/solr-tomcat/tomcat/conf/tomcat-users.xml

… and add the following section inside the <tomcat-users> container:

[...]
  <role rolename="role1"/>
  <user username="user1" password="secret" roles="role1"/>
[...]

Restart Tomcat afterwards:

/opt/solr-tomcat/tomcat/bin/shutdown.sh
/opt/solr-tomcat/tomcat/bin/startup.sh
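
The authentication can also be checked from the command line with curl, for example from server1.example.com. This is a sketch: the function names are my own, and the admin/ping path is an assumption that only works if the core has a ping handler configured.

```shell
#!/bin/sh
# Hypothetical helpers for a quick check from the shell.

# Build the base URL of a Solr core.
# Usage: solr_core_url <host> <port> <core>
solr_core_url() {
    printf 'http://%s:%s/solr/%s/' "$1" "$2" "$3"
}

# Print only the HTTP status code of a request.
# Usage: http_status <url> [extra curl options...]
http_status() {
    url="$1"
    shift
    curl -s -o /dev/null -w '%{http_code}' "$@" "$url"
}

# Example:
# url=$(solr_core_url 192.168.0.101 8080 core_en)
# http_status "${url}admin/ping"                  # without credentials
# http_status "${url}admin/ping" -u user1:secret  # with credentials
```

Without credentials the protected core should answer with status 401; with the correct user and password the request should be let through.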

You can now open a browser and visit Solr at http://192.168.0.101:8080/solr, where you should see all configured cores.

When you visit the English core (for which we have just configured authentication), you should be asked for a username and a password:

After successful authentication, you should see the Solr admin page of the core, which means Solr is running successfully.

3 Configuring TYPO3

Now we are going to configure Solr search for our TYPO3 Introduction Package web site on www.example.com:

One important note: make sure that www.example.com can be resolved correctly from server1.example.com and server2.example.com. If you use a test domain which cannot be resolved or resolves to a wrong host, you will not be able to set up Solr search successfully. In such a case, you can add a record for example.com and www.example.com to /etc/hosts on both server1.example.com and server2.example.com:

vi /etc/hosts

[...]
192.168.0.100 example.com www.example.com
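
If you have to do this on several hosts, the record can be added with a small function that skips the step when the name is already listed. This is only a sketch; the function name add_hosts_entry is an assumption of mine.

```shell
#!/bin/sh
# Hypothetical helper (not part of Debian).
# Usage: add_hosts_entry <hosts-file> <ip> <name> [more names...]
add_hosts_entry() {
    hosts="$1"
    ip="$2"
    shift 2
    # Skip the file if the first name already appears as a host name
    # (field 2 or later) on a line that is not a comment.
    if awk -v name="$1" '!/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
                         END { exit !found }' "$hosts"; then
        return 0
    fi
    # Append "<ip> <name> <name> ..." as one line.
    printf '%s %s\n' "$ip" "$*" >> "$hosts"
}

# Example (run on both servers):
# add_hosts_entry /etc/hosts 192.168.0.100 example.com www.example.com
# getent hosts www.example.com    # shows what the name resolves to
```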

Now in the TYPO3 backend, go to the Extension Manager and there to the Import Extensions tab:

Click on the Update Repository button to the right of the Repository drop-down to download a list of available extensions. The list of available extensions is then updated.

Afterwards, still on the Import Extensions tab, type solr into the Filter field and press ENTER. You should see the extension Apache Solr for TYPO3 in the list – click on the Import extensions icon in front of it:

A window should pop up telling you that the extension has been imported. Click on the Close window link to close it:

Now go to the Available Extensions tab. Find the Solr extension and click on the Install extensions icon in front of it:

A window should pop up. It is possible that a dependency is not fulfilled (e.g., the pagebrowse extension is needed by the Solr extension), so you might have to click on the Import now link to import the missing extension:

A new window will open telling you that the missing extension has been imported. Close that window:

Back on the Available Extensions tab, we need to enable the dependency first (in this case the pagebrowse extension), so find that extension and click on the Install extensions icon:

A new window will open telling you that the extension has been installed. Close that window:

After you have enabled all dependencies of the Solr extension, click on the Install extensions icon in front of the Solr extension:

A new window pops up telling you that database changes have to be made to enable the Solr extension. Accept all proposed changes and click on the Make updates button:

Afterwards you can close the window:

The Solr extension is now installed. Now we must configure our TYPO3 web site to use the Solr search.

Go to the List module and click on the root of your web site (in this case it is the page called Home) and select Edit from the menu:

Go to the Behaviour tab and make sure that the Use as Root Page checkbox is checked:

Next we must create a domain record for our web site. Still in the List module, click on the Create new record icon and select Domain (under System Records).

Create the Domain record with www.example.com as the domain (if you use example.com instead of www.example.com as the main domain of your web site, fill in example.com without www).

Now we must tell the Solr extension where it can find our Solr server. Go to the Template module, select extension_configuration (under TypoScript Templates) and select Info/Modify in the drop-down menu at the top. Then click on Edit the whole template record:

On the General tab, fill in the following Solr configuration in the Constants field:

plugin.tx_solr.solr.scheme = http
#plugin.tx_solr.solr.host = 192.168.0.101
plugin.tx_solr.solr.host = user1:secret@192.168.0.101
plugin.tx_solr.solr.port = 8080
plugin.tx_solr.solr.path = /solr/core_en

config.index_enable = 1

(Make sure you fill in the correct user and password in the plugin.tx_solr.solr.host line. If you don’t use authentication, use plugin.tx_solr.solr.host = 192.168.0.101 instead.)

Go to the Includes tab and include the Apache Solr (solr) extension, then save the template:

Now log out and log back in. In the Clear Caches menu in the upper right corner, you should now find the option Initialize Solr connections; click on that option.

Now go to the Reports module and select Apache Solr Index:

If no errors are reported, the Solr server has been contacted successfully (no documents have been indexed yet, which is why you see 0 for all items). If the Solr server cannot be contacted, you will see a Failed to connect… error message; in this case you should install the devlog extension to find out what went wrong.

Go to the Search module next, select pages and click on the Initialize Index Queue button:

Next we must set up two scheduled tasks, one that creates the index of your TYPO3 page and one that commits the Solr index. In the Scheduler module, click on the Add task icon:

For the first scheduled task, select Index Queue Worker (solr) in the Class field, Recurring in the Type field, specify a start time, leave the End field empty, specify a frequency (like 3600 for one hour), select your root page in the Site field and save the scheduled task:

For the second scheduled task, select Commit Solr Index (solr) in the Class field, Recurring in the Type field, specify a start time, leave the End field empty, specify a frequency (like 3600 for one hour), select your root page in the Site field and save the scheduled task:

Afterwards, in the Scheduler module, select both scheduled tasks and click on the Execute selected tasks button to run them right now:

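
Recurring tasks are only executed automatically if the TYPO3 Scheduler itself is triggered regularly, which in TYPO3 4.x is usually done by a cron job that calls typo3/cli_dispatch.phpsh scheduler (this also needs a backend user named _cli_scheduler). The following sketch prints a matching line for a file in /etc/cron.d; the function name, the document root in the example, the five-minute interval and the www-data user are assumptions, so adjust them to your setup.

```shell
#!/bin/sh
# Hypothetical helper that prints a cron line in /etc/cron.d format
# (schedule, user, command).
# Usage: scheduler_cron_line <typo3-document-root> [user]
scheduler_cron_line() {
    docroot="$1"
    user="${2:-www-data}"
    # Run the TYPO3 Scheduler every five minutes as the given user.
    printf '*/5 * * * * %s /usr/bin/php %s/typo3/cli_dispatch.phpsh scheduler\n' "$user" "$docroot"
}

# Example (the document root is a placeholder, not taken from this tutorial):
# scheduler_cron_line /var/www/www.example.com/web > /etc/cron.d/typo3-scheduler
```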

If you visit your Solr server in your browser now and type *:* into the Query String field, Solr should show you a list of results in XML format, which means your TYPO3 page has been indexed successfully.
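
The same query can be sent from the shell. The sketch below reads a Solr XML response on standard input and prints the number of indexed documents from its numFound attribute; the function name num_found is my own, and the curl line in the comment assumes the host, core and credentials used in this tutorial.

```shell
#!/bin/sh
# Hypothetical helper that extracts the document count from a Solr XML response.
# Usage: <solr-xml-response> | num_found
num_found() {
    # Print the value of the first numFound="..." attribute found in the input.
    sed -n 's/.*numFound="\([0-9][0-9]*\)".*/\1/p' | head -n 1
}

# Example (rows=0 asks Solr for the count only, without the documents):
# curl -s -u user1:secret 'http://192.168.0.101:8080/solr/core_en/select?q=*:*&rows=0' | num_found
```

A number greater than 0 means that documents have reached the index.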

Back in the TYPO3 backend, we create a search page now so that we can use the Solr search from our TYPO3 web site. Go to the Page module and click on the Create new pages icon:

Drag and drop the Standard icon to the place in your tree structure where you want the search page to be located (e.g. after the Feedback page).

Click on the icon in front of the new page and select Edit from the menu.

Fill in a title for the new page (like Search) and save the page:

Afterwards, enable the page:

Then click on the Create new content element icon:

Go to the Form elements tab and select Search Form; then specify a position for that element:

Save the document afterwards:

You can now reload your TYPO3 web site; you should see a new menu item called Search. Go there, fill in a search term in the search form and submit the form.

If your search term is in the Solr index, you should get a list of results in virtually no time:

Congratulations, you have just set up Solr search for your TYPO3 web site!

 

  • Apache Solr: http://lucene.apache.org/solr/
  • TYPO3 Solr Extension: http://www.typo3-solr.com/
  • TYPO3: http://typo3.org/
  • Debian: http://www.debian.org/

 

 

 

 
