Doodle is licensed under the GNU GPL. Indexing large volumes can take several hundred MB of memory (depending on the amount of meta-data found). Searching should nevertheless require almost no memory. Using the latest version of libextractor is recommended. Doodle has so far only been tested under Debian and RedHat GNU/Linux. Doodle is expected to work on any platform supported by GNU libextractor.
Debian packages can be found here. RedHat/Fedora RPM packages can be found here. A Python binding for doodle (and GNU libextractor) can be found here.
First the doodle database needs to be created. The simplest way to create the database is to run doodle with the -b option on the directories that are to be indexed. For example:
$ doodle -b
This will create the doodle database under ~/.doodle. Once the database exists, you can search it for a keyword:
$ doodle keyword
If you want to keep your doodle database up-to-date, you can either periodically re-run doodle with the -b option, or you can use doodled, the doodle daemon. doodled uses fam to notice whenever a file is changed and instantly updates the doodle database. In order to use doodled, you must have famd running. If famd is running, you can start doodled by passing the same arguments that you would pass to doodle to construct the database, but without the -b option:
$ doodled
You can also use doodled to construct the initial database. While doodled is updating the database, any doodle search will block until the update is complete. Note that while you may want to index your entire disk (i.e., doodle -b /), it is typically not a great idea to have doodled monitor your entire system for changes -- especially since /usr is unlikely to change frequently. You can address this issue by first indexing / and then using doodled to monitor only directories that change frequently:
$ doodle -b /
$ doodled
This way, your entire system will be in the index, and your home directory will always be up-to-date.
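The two steps above can be wrapped in a small script. This is only a sketch: the watched directory (WATCH_DIR, defaulting to $HOME), the DRY_RUN switch, the run helper and the pgrep check for famd are illustrative assumptions and not part of doodle itself. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: index the whole system once, then let doodled watch one directory.
# WATCH_DIR and DRY_RUN are hypothetical knobs introduced for this example.
WATCH_DIR="${WATCH_DIR:-$HOME}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

index_then_monitor() {
    # doodled needs famd; stop early if it is not running (real runs only).
    if [ "$DRY_RUN" != "1" ] && ! pgrep -x famd >/dev/null 2>&1; then
        echo "famd is not running; start it before doodled" >&2
        return 1
    fi
    run doodle -b /     # one-off index of the entire system
    run doodled "$1"    # keep only this directory up to date
}

index_then_monitor "$WATCH_DIR"
```

Run it once with the default DRY_RUN=1 to review the commands, then with DRY_RUN=0 to execute them.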
You can achieve a (limited) form of full-text search with doodle. For that, the dictionary-based plaintext extractors from libextractor are used. In order to use them, you need to pass the option -B LANG to doodle. LANG is a two-letter language code that selects the dictionary. Available languages at the moment are en, es, fr, it and no. Words and sentences that are found in the dictionary for the selected language will then be added to the index. While libextractor attempts to avoid full-text extraction for certain known binary formats, it may still find words in non-text files. Running with this option will dramatically increase the size of the index and the time it takes to build the index.
Note that changing the options used to build a database will not (!) cause doodle to re-index files that were previously processed with other options. The only way to force doodle to re-index files with different options is to either touch the files (change the modification timestamp) or to delete the old database and start from scratch.
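The touch-based way of forcing a re-index can be sketched as a shell helper. The reindex function, the run helper, the DRY_RUN switch and the example directory are hypothetical names introduced here; the doodle options in the example call (-n -l mime) are simply ones that appear elsewhere in this document. By default the script only prints what it would do.

```shell
#!/bin/sh
# Sketch: force doodle to re-index a tree after the build options changed.
# Usage: reindex DIR [doodle build options...]
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

reindex() {
    dir=$1
    shift
    # Refresh the modification times so doodle treats the files as changed.
    run find "$dir" -type f -exec touch {} +
    # Rebuild the index for this tree with the options given by the caller.
    run doodle "$@" -b "$dir"
}

# Example call with a hypothetical directory.
reindex "$HOME/Documents" -n -l mime
```

Touching files also affects other tools that look at modification times (backups, make). When that matters, deleting the old database and rebuilding from scratch is the alternative.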
If you are the system administrator, you might want to run doodle on the entire system periodically (cron job) and have doodled monitor the home directories in the background. In that case, it is suggested to make the doodle database group-readable for a group doodle. Set the permissions for the doodle binary to SGID to allow users to query the database. Doodle ensures that information about files not accessible to the user is not leaked by checking whether files found in the database are accessible to the user. doodled has to run as root since otherwise it would be impossible to index the personal files of all users. If that is too risky, doodled will still work, but will only index the files readable by the user that runs doodled.
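The administrator setup described above might look roughly like the following. The binary path, the database path, the 03:00 schedule and the helper names are assumptions made for this sketch; only the group name doodle and the use of SGID come from the text. The script prints the commands by default, so review them before running anything as root.

```shell
#!/bin/sh
# Sketch: shared doodle database readable through an SGID doodle binary.
# DOODLE_BIN and DOODLE_DB are hypothetical locations; adjust for your system.
DOODLE_BIN="${DOODLE_BIN:-/usr/bin/doodle}"
DOODLE_DB="${DOODLE_DB:-/var/lib/doodle/doodle.db}"
DOODLE_GROUP="doodle"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

setup_shared_database() {
    run groupadd "$DOODLE_GROUP"
    # Database (must already exist): root-owned, readable by the group only.
    run chown root:"$DOODLE_GROUP" "$DOODLE_DB"
    run chmod 640 "$DOODLE_DB"
    # Binary: SGID, so users read the database only through doodle.
    run chgrp "$DOODLE_GROUP" "$DOODLE_BIN"
    run chmod 2755 "$DOODLE_BIN"
}

# System crontab line (for example in /etc/cron.d) for a nightly full index.
cron_line() {
    printf '0 3 * * * root %s -d %s -b /\n' "$DOODLE_BIN" "$DOODLE_DB"
}

setup_shared_database
cron_line
```

With this layout, users would have to point doodle at the shared database (option -d) when searching.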
You can build a database from multiple doodle runs over distinct sets of files with different options. For example, the following can make sense:
$ doodle -b /usr/share/doc # normal index
$ doodle -n -l elf -l mime -b /usr /opt /bin /lib # only ELF and Mime support
$ doodled /home # monitor /home (changes frequently)
A simple doodle search will then find files in all listed directories. You can also build multiple disjoint databases and search all of them in one run (see doodle for option -d).
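Searching several disjoint databases in one run could be scripted as follows. This sketch assumes that option -d accepts a colon-separated list of database files when searching; check the doodle man page for your version before relying on it. The database paths, the helper names and the DRY_RUN switch are hypothetical.

```shell
#!/bin/sh
# Sketch: search several doodle databases with a single invocation.
# Assumption: -d takes a colon-separated list of database files.
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Join all arguments with ':' (a:b:c).
join_databases() {
    joined=""
    for db in "$@"; do
        joined="${joined:+$joined:}$db"
    done
    printf '%s\n' "$joined"
}

# Usage: search_all KEYWORD DATABASE...
search_all() {
    keyword=$1
    shift
    run doodle -d "$(join_databases "$@")" "$keyword"
}

# Example call with hypothetical database files.
search_all keyword /var/lib/doodle/system.db "$HOME/.doodle"
```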
Doodle uses Mantis for bugtracking. Visit https://bugs.gnunet.org/ to report bugs. You need to sign up for a reporter account. Please make sure you report bugs under Doodle and not under any of the other projects.
If Mantis does not work for you for some reason and you need to report a bug, contact christian@grothoff.org via e-mail.
Copyright (C) 2004, 2005, 2006, 2007, 2009, 2010, 2018, 2024 Christian Grothoff. Verbatim copying and distribution of this entire article is permitted in any medium, provided this notice is preserved.