How to check what your Xperience by Kentico stores inside the Lucene indexes

Lukasz Skowroński - Senior Solutions Architect

13 Sep 2024


Let’s jump in! 
 
If you are working on the search functionality for a website built on Xperience by Kentico (XbK), you can use one of the supported search engines: 

  • Algolia 
  • Lucene 
  • Azure AI Search 

Lucene is the only free engine supported by XbK, so it will likely be the most popular choice among companies that do not require highly advanced search capabilities. 

For now, we will not focus on the configuration details, as they are covered in the official documentation. 

Our focus will be on reviewing the indexed content. 

What is the tool and where to find it 

Lucene provides a tool called “Luke”, which was developed by the community years ago and is now part of the official Lucene release. Because the Lucene.Net library is based on an older version of Lucene, the Luke that ships with the current official distribution packages will not work with your indexes. 

If you check Kentico’s GitHub, you will see that XbK uses the Lucene.Net library in version 4.8.0: 

https://github.com/Kentico/xperience-by-kentico-lucene/compare/v8.0.0...v8.1.0 

If you check the details of Lucene.Net 4.8.0, you will find that it corresponds to the 4.8 version of the Lucene/Solr release (which makes sense). 

This is why we must find the older Luke release to work with the older version of Lucene.  

If you download a current version of Lucene and use the Luke that is included in the distribution, it will show an error when you try to open the index, and in the logs you will find the exception org.apache.lucene.index.IndexFormatTooOldException. 

 

To get the older version, go to the releases page of the community member who developed this tool years ago and download version 4.10.4.1:  

https://github.com/DmitryKey/luke/releases

After you download it and extract the files from the tar.gz archive, you also need to edit the “luke.sh” file that you will find inside and remove the “-XX:MaxPermSize=512” parameter, which is no longer supported by newer versions of Java.  

Once you have removed it, you will be able to run Luke from your terminal. 
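
If you prefer not to edit the file by hand, the same change can be scripted. The following is only a minimal sketch: the function name strip_max_perm_size is made up for this example, and it assumes the option appears in luke.sh in the form -XX:MaxPermSize=<number>, with or without a unit suffix such as m. It keeps a backup of the original file next to it as luke.sh.bak.

```shell
# Hypothetical helper: remove the obsolete -XX:MaxPermSize option
# from a launcher script, keeping a backup as <file>.bak.
strip_max_perm_size() {
  script="$1"
  # Keep the original so the change can be reverted.
  cp "$script" "$script.bak"
  # Delete the option together with the whitespace in front of it.
  # Writing back into the existing file preserves its permissions.
  sed -E 's/[[:space:]]*-XX:MaxPermSize=[0-9]+[kKmMgG]?//g' "$script.bak" > "$script"
}

# Run from the folder where Luke was extracted.
if [ -f ./luke.sh ]; then
  strip_max_perm_size ./luke.sh
fi
```

If Luke does not start afterwards, you can restore luke.sh.bak and make the change manually instead.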

 

How to use Luke 

 

Now that you have downloaded the correct version of Luke and adjusted its launch parameters, you can start it with the following command (on Windows devices):

sh .\luke.sh

As a result, you should see the Luke tool, where you can open existing indexes. 

You can find the index files generated by Lucene for your XbK instance (at least locally) in your App_Data folder.
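
If you are not sure which subfolder to open in Luke, you can search for Lucene's commit files. A Lucene index directory contains a file named segments_N (for example segments_1), so any folder that holds one is a candidate. This is a rough sketch: the function name find_lucene_indexes is invented for this example, and ./App_Data is assumed to be relative to your project folder.

```shell
# Hypothetical helper: print every directory under the given root
# that contains a Lucene commit file (segments_N).
find_lucene_indexes() {
  root="$1"
  find "$root" -type f -name 'segments_*' -exec dirname {} \; | sort -u
}

# Run from the project folder that contains App_Data.
if [ -d ./App_Data ]; then
  find_lucene_indexes ./App_Data
fi
```

Each printed path is a folder you can try opening in Luke.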

If you want to review some of the indexed documents, you can do this on the Documents tab. 

When your index is larger, you can also search for data using the Lucene query syntax.

The query syntax for your version is described in the official Apache Lucene documentation for version 4.8.  

Summary 

Hopefully, this tutorial will help you better understand how data is stored inside the indexes, so that you can write your queries more effectively. This knowledge also often helps when debugging the results you receive when you search for something on your website.  

Plenty of fun is ahead of you – good luck!  
 


Lukasz Skowroński

I have been awarded with the Sitecore MVP award seven times (the first time in 2017) for my continued support of the Sitecore Community. Besides blogging, as a Sitecore Community member, I organize all of the Sitecore User Group meetups in Poland. Since 2021 I have helped to organize the Sitecore User Group Conference (SUGCON) as one of the co-organizers.

