README.txt in Apache Solr Multilingual 6.2

Apache Solr Multilingual
========================

Name: apachesolr_multilingual
Authors: Markus Kalkbrenner | Cocomore AG
         Matthias Huder | Cocomore AG
Drupal: 6.x
Sponsor: Cocomore AG - http://www.cocomore.com
                       http://drupal.cocomore.com


Description
===========

Apache Solr Multilingual extends Apache Solr Search Integration
in a clean way to provide:
  * better support for non-English languages
  * support for multilingual search
  * an easy to use administration interface for non-English and
multilingual search


Installation
============

Currently, apachesolr_multilingual is compatible with only one
version of apachesolr, 6.x-2.0-beta5:
http://drupal.org/node/1173780

1. Place the whole apachesolr_multilingual folder into your Drupal
   modules/ directory or, better, into sites/x/modules/.

2. Enable the apachesolr_multilingual module at
   admin/build/modules

3. Optional but recommended:
   Enable the apachesolr_multilingual_texfile module at
   admin/build/modules. Apache Solr requires some text files
   such as stopwords.txt. This module adds an administration
   interface for these files to Drupal. Without it, you
   need to maintain these files manually.

Now you have different options to complete your setup:

1. Your site uses a single non-English language.
   If you also installed apachesolr_multilingual_texfile,
   continue at "A) Unique Language and Apache Solr Multilingual
   Texfile". Otherwise, continue at "C) Unique Language".

2. Your site uses multiple languages (multilingual) and your
   content is assigned to languages using the locale module.
   If you also installed apachesolr_multilingual_texfile,
   continue at "B) Multiple Languages and Apache Solr Multilingual
   Texfile". Otherwise, continue at "D) Multiple Languages".



A) Unique Language and Apache Solr Multilingual Texfile
=======================================================

1. Ensure that the language you want to cover is
   available and enabled at admin/settings/language

2. Enable the language you want to cover at
   admin/settings/apachesolr/multilingual
   and "Save configuration"

3. Adjust all Solr text files to your needs at
   admin/settings/apachesolr/multilingual

4. Download apachesolr_unique_language_config.zip at
   admin/settings/apachesolr/schema_generator

5. Extract apachesolr_unique_language_config.zip to your Solr
   conf directory and restart Solr

6. "Re-index all content" at admin/settings/apachesolr/index.
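
Extracting the downloaded archive into the Solr conf directory can be
scripted. The following is a minimal sketch, not part of the module:
the function name extract_solr_config is hypothetical, and both paths
are placeholders that depend on your installation. Restarting Solr is
left to whatever mechanism your servlet container uses.

```python
import zipfile
from pathlib import Path


def extract_solr_config(zip_path, conf_dir):
    """Extract every member of the config archive into conf_dir.

    Creates conf_dir if it does not exist and returns the sorted
    list of archive member names that were extracted.
    """
    conf_dir = Path(conf_dir)
    conf_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        # Existing files with the same names are overwritten.
        archive.extractall(conf_dir)
        return sorted(archive.namelist())
```

For example, extract_solr_config("apachesolr_unique_language_config.zip",
"/path/to/solr/conf") would unpack the archive, where /path/to/solr/conf
stands in for your real conf directory. Back up the existing conf
directory first, since files with the same names are overwritten.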


B) Multiple Languages and Apache Solr Multilingual Texfile
==========================================================

1. Ensure that all the languages you want to cover with
   multilingual search are available and enabled at
   admin/settings/language

2. Enable all the languages you want to cover with
   multilingual search at admin/settings/apachesolr/multilingual
   and "Save configuration"

3. Adjust all Solr text files to your needs at
   admin/settings/apachesolr/multilingual

4. Download apachesolr_multilingual_config.zip at
   admin/settings/apachesolr/schema_generator

5. Extract apachesolr_multilingual_config.zip to your Solr
   conf directory and restart Solr

6. "Re-index all content" at admin/settings/apachesolr/index.
   It's important that you already have content in every language
   at this point. The checkboxes in the next step do not exist
   for a language until you have indexed some content in that
   language.

7. Go to admin/settings/apachesolr/query-fields and set "Body" and
   "Title" to "Omit". Enable all language-specific bodies and titles,
   such as body_en or title_de, by selecting any value other than
   "Omit". Don't forget to "Save configuration".

8. Optional: As described in step 7, omit
     "Body text inside links (A tags)",
     "Body text inside H1 tags",
     "Body text inside H2 or H3 tags",
     "Body text inside H4, H5, or H6 tags",
     "Body text in inline tags like EM or STRONG"
   and turn on the language-specific fields such as
     "tags_a_de",
     "tags_h1_de",
     "tags_h2_h3_de",
     "tags_h4_h5_h6_de",
     "tags_inline_de".

9. Optional: If you installed the module "Taxonomy translation" and
   turned on "Index taxonomy term translations" at
   admin/settings/apachesolr/multilingual, omit
   "All taxonomy term names" and enable the language-specific
   equivalent, such as "taxonomy_names_de", instead, as described
   in step 7.
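
To see which language-specific fields to look for on the query-fields
page, the naming pattern from the examples above (body_en, title_de,
tags_a_de, taxonomy_names_de) can be expanded for each language. This
is a sketch only: the prefix list is taken from the examples in this
README and may be incomplete for a real site, and the function name
language_specific_fields is hypothetical.

```python
# Prefixes inferred from the examples in this README. The page at
# admin/settings/apachesolr/query-fields is the authoritative list.
FIELD_PREFIXES = [
    "title",
    "body",
    "tags_a",
    "tags_h1",
    "tags_h2_h3",
    "tags_h4_h5_h6",
    "tags_inline",
    "taxonomy_names",
]


def language_specific_fields(language_codes):
    """Return PREFIX_CODE field names for every language code given."""
    return [
        f"{prefix}_{code}"
        for code in language_codes
        for prefix in FIELD_PREFIXES
    ]
```

Calling language_specific_fields(["de", "en"]) yields names such as
title_de, body_de and tags_h2_h3_en. Fields for a language only appear
on the admin page once content in that language has been indexed.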


C) Unique Language
==================

1. Ensure that the language you want to cover is
   available and enabled at admin/settings/language

2. Enable the language you want to cover at
   admin/settings/apachesolr/multilingual
   and "Save configuration"

3. Download schema.xml for unique language setup at
   admin/settings/apachesolr/schema_generator

4. Copy schema.xml to your Solr conf directory

5. Ensure that you have these four files in your Solr conf
   directory:
     stopwords.txt
     synonyms.txt
     protwords.txt
     compoundwords.txt

6. Restart Solr

7. "Re-index all content" at admin/settings/apachesolr/index.
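
Before restarting Solr, the presence of the four text files listed
above can be checked with a few lines of Python. This is a minimal
sketch: the function name missing_text_files is hypothetical and the
conf directory path is whatever your installation uses.

```python
from pathlib import Path

# The four files named in the list above.
REQUIRED_FILES = [
    "stopwords.txt",
    "synonyms.txt",
    "protwords.txt",
    "compoundwords.txt",
]


def missing_text_files(conf_dir):
    """Return the required file names that are absent from conf_dir."""
    conf = Path(conf_dir)
    return [name for name in REQUIRED_FILES if not (conf / name).is_file()]
```

An empty result means all four files are in place. The check only
tests for existence, not for sensible content.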


D) Multiple Languages
=====================

1. Ensure that all the languages you want to cover with
   multilingual search are available and enabled at
   admin/settings/language

2. Enable all the languages you want to cover with
   multilingual search at admin/settings/apachesolr/multilingual
   and "Save configuration"

3. Download schema.xml for multilingual setup at
   admin/settings/apachesolr/schema_generator

4. Copy schema.xml to your Solr conf directory

5. Ensure that you have these four files in your Solr conf
   directory for each language:
     stopwords_LANGUAGE.txt
     synonyms_LANGUAGE.txt
     protwords_LANGUAGE.txt
     compoundwords_LANGUAGE.txt

6. Restart Solr

7. "Re-index all content" at admin/settings/apachesolr/index.
   It's important that you already have content in every language
   at this point. The checkboxes in the next step do not exist
   for a language until you have indexed some content in that
   language.

8. Go to admin/settings/apachesolr/query-fields and set "Body" and
   "Title" to "Omit". Enable all language-specific bodies and titles,
   such as body_en or title_de, by selecting any value other than
   "Omit". Don't forget to "Save configuration".

9. Optional: As described in step 8, omit
     "Body text inside links (A tags)",
     "Body text inside H1 tags",
     "Body text inside H2 or H3 tags",
     "Body text inside H4, H5, or H6 tags",
     "Body text in inline tags like EM or STRONG"
   and turn on the language-specific fields such as
     "tags_a_de",
     "tags_h1_de",
     "tags_h2_h3_de",
     "tags_h4_h5_h6_de",
     "tags_inline_de".

10. Optional: If you installed the module "Taxonomy translation" and
   turned on "Index taxonomy term translations" at
   admin/settings/apachesolr/multilingual, omit
   "All taxonomy term names" and enable the language-specific
   equivalent, such as "taxonomy_names_de", instead, as described
   in step 8.
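
The per-language file names in the list above follow one pattern, so
the expected set can be generated and checked against the conf
directory. This sketch assumes that LANGUAGE stands for the language
code used elsewhere in this README (for example "de" or "en"); the
function names are hypothetical.

```python
from pathlib import Path

# Base names from the per-language file list above.
BASE_NAMES = ["stopwords", "synonyms", "protwords", "compoundwords"]


def expected_language_files(language_codes):
    """Return BASE_CODE.txt for every base name and language code."""
    return [
        f"{base}_{code}.txt"
        for code in language_codes
        for base in BASE_NAMES
    ]


def missing_language_files(conf_dir, language_codes):
    """Return the expected per-language files absent from conf_dir."""
    conf = Path(conf_dir)
    return [
        name
        for name in expected_language_files(language_codes)
        if not (conf / name).is_file()
    ]
```

For a site with German and English enabled,
missing_language_files("/path/to/solr/conf", ["de", "en"]) lists every
file that still has to be created, where the path is a placeholder.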


Spell Checker
=============

How it works:
* The language-neutral spell checker doesn't use any stop words.
* As soon as a user limits a search by language facet, spell
  checking becomes language specific.

ToDo:
* Let the admin configure whether the spell checker is language
  specific when the site language changes (language selector, URL, ...)
* Let the admin configure whether more than one suggestion should be
  made in different languages (expensive because Solr needs to be
  queried once per language)


Apache Solr Text Files
======================

stopwords.txt
=============
TODO


protwords.txt
=============
TODO


synonyms.txt
============
TODO


compoundwords.txt
=================
TODO


Troubleshooting
===============

Searching for words containing accents or umlauts does not work!
Verify that your servlet container (Tomcat, Jetty, ...) is configured
to support UTF-8 characters within the URL. For Tomcat, add the attribute
URIEncoding="UTF-8" to your Connector definition. See Solr's documentation for details:
http://wiki.apache.org/solr/SolrInstall
http://wiki.apache.org/solr/SolrTomcat
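
The effect of a wrong URL encoding can be reproduced outside of Solr.
The sketch below uses only the Python standard library and an example
search term chosen for illustration. It shows how the same word is
percent-encoded under UTF-8 and under ISO-8859-1 (Latin-1), and what a
container sees when it decodes UTF-8 bytes as Latin-1, which is
assumed here to be the cause of the failed searches.

```python
from urllib.parse import quote, unquote

term = "Müller"  # example search term containing an umlaut

# How the term appears in a URL when encoded as UTF-8.
utf8_encoded = quote(term, encoding="utf-8")
print(utf8_encoded)    # M%C3%BCller

# The same term encoded as ISO-8859-1 (Latin-1).
latin1_encoded = quote(term, encoding="latin-1")
print(latin1_encoded)  # M%FCller

# A container that decodes the UTF-8 URL as Latin-1 receives a
# different string, so the search no longer matches indexed text.
garbled = unquote(utf8_encoded, encoding="latin-1")
print(garbled)         # MÃ¼ller
```

If the search term logged by Solr looks like the last line, the
container is most likely not decoding URLs as UTF-8.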
