March 11th, 2006

xpclucene

xpclucene is an extension for Firefox 1.5+ that lets you index, tag, and search the text of the pages that you view in your browser. Some uses of this extension might be:

  • Bookmarks replacement
  • Searchable documentation repository
  • Recipe organizer

For developers, it provides a set of scriptable XPCOM wrappers around many of the CLucene classes. Other extensions can use these interfaces to create and search CLucene full-text indexes.

Download

How To Use

Quick Start

Tags

Query Syntax

For Developers

Download

Updated March 10, 2006

Linux x86 Debug
xpclucene-0.1.6-linux_x86_debug.xpi

Win32 Debug
xpclucene-0.1.6-win32_debug.xpi

OS X PPC Debug
xpclucene-0.1.6-macosx_ppc_debug.xpi

How To Use

Quick Start

After installing, you should have an xpclucene toolbar with a tag entry box and add and delete buttons. (You may customize your toolbars and move these controls onto another toolbar to save space.) To get started, navigate to a page you wish to store in the search index and click the “+ Add” button. Note that clicking the add button on a page that is already in the index updates the stored copy of that page rather than adding a second one.

To search for the indexed document, first change the search engine in the search bar to xpclucene. To do this, click the Google icon and select xpclucene from the dropdown.

Next, type your search terms into the search box and press Enter. Your search results will appear in the browser window.

The search results include a link to the indexed document, a context-highlighted excerpt from the indexed page, the date the page was added to the index, the tags assigned to the page, and a link to delete the page from the index.

Tags

When adding a page to the index, you may also specify one or more tags that are stored with the page. These tags can be used to categorize the pages you are indexing. You can later include tags in your searches to search only within a particular category. To add tags to a page, enter a space-delimited list of words in the tag box.

To constrain a search to a tag, include the term “tag:<tagname>” in the query.
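
As a rough illustration, the tag constraint is just another fielded term in the query string. The Python sketch below assembles such a string. The helper name `tagged_query` is made up for this example and is not part of xpclucene, and it joins terms with an explicit AND (standard Lucene syntax) so the result does not depend on the parser's default operator.

```python
def tagged_query(terms, tags):
    """Build a query string that limits a search to pages carrying every tag.

    Hypothetical helper for illustration only; not part of xpclucene.
    """
    # Each tag becomes a fielded term such as tag:recipes.
    parts = ["tag:" + tag for tag in tags]
    # Plain search terms follow and are matched against the default field.
    parts.extend(terms)
    return " AND ".join(parts)


print(tagged_query(["chocolate"], ["recipes", "dessert"]))
# prints: tag:recipes AND tag:dessert AND chocolate
```

The resulting string can be typed into the search box as-is.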

Query Syntax

When a page is added to the index, several other fields of searchable data are added as well. The fields that can be used in a search are:

  • title: The title of the page
  • link: The URL of the page
  • author: The domain of the page
  • tag: The tags assigned to the page
  • updated: The date and time the page was added to the index. The date is formatted as YYYYMMDDHHMMSS (in GMT) to assist in date range searching
  • content: The content of the page (this is the default field)

The complete Lucene query syntax can be found at the Apache Lucene Query Parser Syntax page. Some sample queries:

  • Search for pages that contain the words “javascript” and “ajax”:
    javascript ajax
  • Search for all pages in the domain chocolateandzucchini.com:
    author:chocolateandzucchini.com
  • Search for all pages indexed since the beginning of the year:
    updated:2006*
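
Because the updated field is a fixed-width, zero-padded timestamp, string order matches chronological order, so a Lucene range query of the form `field:[start TO end]` can select a span of dates. The Python sketch below formats two dates in the YYYYMMDDHHMMSS layout and builds such a query. The function names are hypothetical, and it assumes that xpclucene's query parser accepts the standard inclusive range syntax, which has not been verified here.

```python
from datetime import datetime, timezone


def index_timestamp(moment):
    """Format an aware datetime as YYYYMMDDHHMMSS in GMT, the layout of the updated field."""
    return moment.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")


def updated_between(start, end):
    """Build an inclusive range query on the updated field (hypothetical helper)."""
    return "updated:[{} TO {}]".format(index_timestamp(start), index_timestamp(end))


# Pages indexed between March 1 and the end of March 10, 2006 (GMT).
start = datetime(2006, 3, 1, tzinfo=timezone.utc)
end = datetime(2006, 3, 10, 23, 59, 59, tzinfo=timezone.utc)
print(updated_between(start, end))
# prints: updated:[20060301000000 TO 20060310235959]
```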

For Developers

Grab the source from the Subversion repository:
svn co http://skrul.com/svn/xpclucene/trunk