Xapian is an open source search engine library, which allows developers to add advanced indexing and search facilities to their own applications.
This document aims to be a guide to getting up and running with your first database, explaining basic concepts and providing code examples of the library's core functionality.
If you just want to follow our code examples, you can skip the chapter on "Core Concepts" and go straight to :ref:`a-practical-example` - but you should probably make sure you have Xapian installed first!
There are two pieces of Xapian you need to follow this guide: the library itself, and support for the language you're going to be using. We've written this guide mostly using Python for the examples, although we're also working on full translations into PHP and C++.
This guide documents Xapian 1.2 (except where a different version is explicitly mentioned) so you'll find it easier to follow if you use a version from the 1.2 release series. So let's get that onto your system.
Recent versions of both Debian and Ubuntu have Xapian 1.2 packaged: if you're using Debian 6.0 (squeeze) or later, or Ubuntu 11.10 (Oneiric Ocelot) or later you can just do one of the following depending on whether you want to work through the examples in Python or C++:
::

    $ sudo apt-get install python-xapian
    $ sudo apt-get install libxapian-dev
If you're using Ubuntu 11.04 (Natty Narwhal) or earlier, you'll need to install from our PPA.
Packages of the PHP bindings aren't available due to a licence compatibility issue, but you can build your own packages.
Many operating systems have packages available to make Xapian easy to install; information is available on our download page. This covers most popular Linux distributions, FreeBSD, Mac OS (Python and C++ only) and Windows using Microsoft Visual Studio.
If you're using a different operating system, you will need to compile from source, which should work on any Unix-like operating system, and on Windows using any one of Cygwin, MSYS+mingw or MSVC. Source code is again available from our download page, as are additional Makefiles for building using MSVC on Windows.
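Whichever route you took, a quick check from Python can confirm that the bindings import and that the library belongs to the 1.2 release series this guide documents. The following is a minimal sketch rather than an official tool: ``xapian.version_string()`` is provided by the Python bindings, while the helper names ``parse_version``, ``in_release_series`` and ``installed_xapian_version`` are our own and exist only for this illustration.

.. code-block:: python

    import re


    def parse_version(version):
        """Return the leading (major, minor) pair of a string like '1.2.25'."""
        match = re.match(r"(\d+)\.(\d+)", version)
        if match is None:
            raise ValueError("unrecognised version string: %r" % (version,))
        return int(match.group(1)), int(match.group(2))


    def in_release_series(version, series=(1, 2)):
        """True if `version` belongs to the given release series (default 1.2)."""
        return parse_version(version) == tuple(series)


    def installed_xapian_version():
        """Return the Xapian version string, or None if the bindings are missing."""
        try:
            import xapian
        except ImportError:
            return None
        return xapian.version_string()


    if __name__ == "__main__":
        version = installed_xapian_version()
        if version is None:
            print("The xapian module could not be imported.")
        elif in_release_series(version):
            print("Found Xapian %s, which matches this guide." % version)
        else:
            print("Found Xapian %s; this guide documents 1.2." % version)

If the import fails, the library may be installed without the Python bindings, or the bindings may have been built for a different Python interpreter than the one you are running.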
If you want to run the code we use to demonstrate Xapian's features (and we recommend you do), you'll need both the code itself and the two datasets we use.
The example code is available in Python, PHP, and C++ so far, although there's only a complete set of examples for Python at present.
.. todo:: finalise datasets and code and link to them from here
For now, you'll want to grab the documentation source from GitHub, which contains the example code in each language and also the data files described in the next paragraph (both are in the "code" subdirectory).
The first dataset is the first 100 objects taken from museum catalogue data released by the Science Museum, and the second we have curated ourselves from information on Wikipedia about the 50 US States. Both are provided as gzipped CSV files. The first dataset is released under the Creative Commons Attribution-NonCommercial-ShareAlike licence, and the second under Creative Commons Attribution-Share Alike 3.0.
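Because both datasets are gzipped CSV files, you can take a look at them with nothing more than the Python standard library. The sketch below assumes Python 3, UTF-8 encoded data and a header row in each file; ``read_gzipped_csv`` is a hypothetical helper written for this illustration, not part of the example code, and the file to read is passed on the command line because we don't name the data files here.

.. code-block:: python

    import csv
    import gzip
    import sys


    def read_gzipped_csv(path, limit=None):
        """Read rows from a gzipped CSV file as dicts keyed by the header row."""
        rows = []
        # "rt" decompresses and decodes to text; newline="" is what csv expects.
        with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
            for index, row in enumerate(csv.DictReader(handle)):
                if limit is not None and index >= limit:
                    break
                rows.append(row)
        return rows


    if __name__ == "__main__" and len(sys.argv) > 1:
        # Print the first few records of the file named on the command line.
        for record in read_gzipped_csv(sys.argv[1], limit=3):
            print(record)

Reading a handful of rows this way is a convenient way to see which fields each dataset provides before you start indexing it.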
.. todo:: link to here from every howto and everything that needs the data files and example code