RefDB Status

This page summarizes the development status of RefDB and outlines the planned changes and enhancements in the near future.

RefDB is written and maintained by humans, so it is quite possible that a nasty bug rears its ugly head once in a while. If you want to see a list of known bugs, or if you think you've found a new one, please use the RefDB bug tracker.


0.9.9: November 4, 2007

The PHP-based web interface was rewritten from scratch. Besides sporting a fresher look, it is far more usable. The query form now offers four different search strategies, from a simple string lookup in all fields to a field-based search using booleans and various types of matches. The results are now displayed with live links, i.e. each author or periodical name and each keyword is clickable and performs a new search for the requested item. These items are now also displayed as a "tag cloud", i.e. the font size reflects the frequency of each item in the database. Buttons let you edit or delete references, or display them in any of the supported output formats. References are added or edited via a web form. The field labels reflect the required contents for each reference type. If you add references from local files, the web interface now automatically determines the reference format and converts the references accordingly.

RefDB can now be queried via Search and Retrieve via URL (SRU). Both a simple standalone web server for local use and a CGI application are provided. SRU can be used from any web browser, from dedicated query tools like YAZ, or by other applications written to retrieve data via this protocol. The output format most commonly used with SRU (MODS) is supported. RefDB implements all SRU operations (explain, searchRetrieve, scan) and conforms to CQL Level 2.
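
As a rough illustration of what an SRU request looks like, the following Python sketch builds a searchRetrieve URL. The base URL, the CQL query, the `dc.title` index, and the helper name `build_sru_url` are assumptions made up for this example. The parameter names are taken from the SRU specification rather than from the RefDB documentation, and the short schema name `mods` may differ on your server.

```python
from urllib.parse import urlencode

# Hypothetical base URL; point this at your own SRU server or CGI application.
BASE_URL = "http://localhost:8080/refs"


def build_sru_url(cql_query, max_records=10, base_url=BASE_URL):
    """Return a searchRetrieve URL that asks for MODS records."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,                 # a CQL query string
        "maximumRecords": str(max_records),
        "recordSchema": "mods",             # assumed short name for MODS
    }
    # urlencode percent-escapes the query so it is safe inside a URL.
    return base_url + "?" + urlencode(params)


url = build_sru_url('dc.title = "immunology"')
print(url)
```

The resulting URL can be pasted into a web browser or handed to any HTTP client. The explain and scan operations use the same pattern with a different `operation` value and their own parameters.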

The internal data representation was improved. Now the data as well as all backends use the analytical/monographic/serial levels. Also, the RIS fields M1 through M3 are now represented according to their function in the individual reference types. This way they are searchable in a reasonable way, and their formatting in bibliography styles was improved too.

The upcoming schema-based versions of TEI (P5) and DocBook (V5.0) are fully supported. To this end, all XML output including the bibliographies can now use namespace prefixes and use the appropriate namespace declarations where needed. Also, raw bibliographies are now supported. You can use the same tools to scan a document for citations, and RefDB will create a matching bibliography. However, it will be unformatted, and you have to use your own formatting mechanisms.

New refdbc commands were added: countref and countnote are equivalent to getref and getnote, except that they just determine the number of matching entries instead of returning them. getax was added to retrieve author names from any level. The bibliography sorting order is now case-insensitive to keep "van Beethoven" from appearing after "ZZ Top". Bibliography styles can now distinguish between authors and editors on the monographic level.
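
The effect of the case-insensitive sorting order can be shown with a few lines of Python. This is only a sketch of the general idea, not RefDB's actual implementation, and the list of names is made up for the example.

```python
names = ["ZZ Top", "van Beethoven", "Adams", "de Gaulle"]

# The default sort compares code points, so every uppercase initial
# sorts before every lowercase one.
case_sensitive = sorted(names)

# Folding case before comparing interleaves both groups alphabetically.
case_insensitive = sorted(names, key=str.casefold)

print(case_sensitive)    # ['Adams', 'ZZ Top', 'de Gaulle', 'van Beethoven']
print(case_insensitive)  # ['Adams', 'de Gaulle', 'van Beethoven', 'ZZ Top']
```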

0.9.8: December 7, 2006

One of the main new features of this version is the support for multiple personal reference lists. Each user still has a default list which contains all datasets she added, but as many additional lists as necessary can be created. These lists are internally implemented on top of extended notes, turning the pickref and dumpref commands into simplified frontends for the extended note handling for the particular purpose of keeping lists of links to references.

Another important new feature is the checkref command. This is related to the addref command, but differs from the latter in that it adds the data to temporary tables and compares the new data with the existing datasets. This allows you to identify duplicate entries, misspelled keywords, alternate spellings of author names, and possible journal name abbreviations at a glance before adding the data permanently. See this example.

The getref as well as all other get* commands now support an optional limit:offset argument to limit the number of returned datasets. This feature is also convenient to loop over chunks of datasets in graphical frontends.
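
A frontend that loops over chunks needs a series of limit:offset values. The Python sketch below generates them. The helper name `chunk_arguments` is hypothetical, the offset is assumed to be zero-based, and how the value is passed to the get* commands is not shown here because it is not stated above.

```python
def chunk_arguments(total, chunk_size):
    """Yield 'limit:offset' strings that together cover `total` datasets."""
    for offset in range(0, total, chunk_size):
        # The last chunk may be smaller than chunk_size.
        limit = min(chunk_size, total - offset)
        yield f"{limit}:{offset}"


chunks = list(chunk_arguments(25, 10))
print(chunks)  # ['10:0', '10:10', '5:20']
```

The total could come from a count command run with the same query, so the frontend knows how many pages to offer before it retrieves any data.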

refdbd now optionally displays the number of references that use a particular author, keyword, or journal name to collect statistical information about your database. One nice application of this statistical information is presented by the getref html/xhtml output. If you use an optional stylesheet, the frequency information is used to render the author names, journal names, and keywords in different font sizes and colours, creating the equivalent of tag clouds of social tagging services. See this example. This also works in the xhtml output of the checkref command, allowing you to estimate whether a new reference is at the center or at the fringes of your research interests.
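
How frequency information can be turned into font sizes is sketched below in Python. The linear scaling, the point sizes, and the function name `font_sizes` are assumptions chosen for the illustration; the stylesheet shipped with RefDB may map frequencies differently.

```python
def font_sizes(frequencies, min_pt=10, max_pt=24):
    """Map each item to a font size, scaled linearly by its frequency."""
    if not frequencies:
        return {}
    lowest = min(frequencies.values())
    highest = max(frequencies.values())
    span = highest - lowest
    sizes = {}
    for item, count in frequencies.items():
        # If all counts are equal, every item gets the smallest size.
        scale = (count - lowest) / span if span else 0.0
        sizes[item] = round(min_pt + scale * (max_pt - min_pt))
    return sizes


# Made-up keyword counts for illustration.
counts = {"cytokines": 40, "apoptosis": 10, "zebrafish": 25}
sizes = font_sizes(counts)
print(sizes)  # {'cytokines': 24, 'apoptosis': 10, 'zebrafish': 17}
```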

UTF-8 is now the default character encoding throughout in order to simplify data exchange between the realms of SGML/XML and bibTeX. A new script bib2ris-utf8 creates UTF-8 versions of transformed bibTeX data which can be readily imported into refdbd. On the way back, refdbd now properly escapes bibTeX data and returns them in UTF-8, as current TeX implementations support UTF-8 out of the box. As a matter of fact, RIS data should now also be encoded in UTF-8, although you can still choose a different input encoding by setting up refdbc appropriately.

Citation keys are now created from author names and publication years by using an iconv transliteration. This greatly improves the results when importing non-English datasets.
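
The idea can be approximated in Python. Note that this sketch swaps in Unicode decomposition from the standard library for the iconv transliteration that RefDB uses, so the results may differ (iconv may, for example, render "ü" as "ue" depending on the locale). The key layout of lowercase surname plus year and the function name `citation_key` are assumptions for the example.

```python
import unicodedata


def citation_key(surname, year):
    """Build an ASCII key such as 'muller2004' from a surname and a year."""
    # Decompose accented letters into base letter plus combining mark,
    # then drop everything that is not plain ASCII.
    decomposed = unicodedata.normalize("NFKD", surname)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Keep letters only, so spaces, hyphens and apostrophes vanish.
    letters = "".join(ch for ch in ascii_only if ch.isalpha())
    return f"{letters.lower()}{year}"


print(citation_key("Müller", 2004))  # muller2004
print(citation_key("Dvořák", 1999))  # dvorak1999
```

Letters without a Unicode decomposition, such as "ß" or "ø", are simply dropped by this sketch, which is one reason a real transliteration table gives better results.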

In addition to the numeric and author/year citation formats, RefDB now also supports the citation key format which is more popular in engineering and computer sciences.

Finally, you can now choose the name of the main database freely, and you can even run RefDB with a single database that serves as a combined main and reference database. These features allow you to run RefDB on cheap web space, which often comes with a single MySQL database with a fixed name.

0.9.7: June 25, 2006

First-time users may find useful the new script that performs the initial post-install setup of RefDB. This includes creating the main database and a reference database, creating configuration files, and creating database users.

The remaining changes to the code proper include a wide variety of bugfixes for possible and observed segfaults and for problems during import and export of data. The changes are too small and too numerous to list here individually. Please see the NEWS and ChangeLog files in the sources.

A far-reaching set of changes touches the end user only indirectly: the documentation has been moved to XML, which required extensive changes to the build system. The reference section of the manual was rewritten and now consists mainly of the man pages. These have been converted from the troff sources to DocBook XML for this purpose. The man pages are in turn generated from these XML sources, which keeps the manual and the man pages automatically in sync.

0.9.6: November 15, 2005

Besides an impressive number of bugfixes, there were lots of usability improvements in this release. The most obvious is probably the full support for refdb-mode which turns Emacs into an integrated markup authoring and bibliography tool. However, the major changes are architectural. First of all, RefDB now uses the 0.8.x series of libdbi and libdbi-drivers, using their improved support for character encodings. Next, the client-server protocol was rewritten to improve error reporting, recovery from odd situations, and portability to other languages. The protocol is now documented in the manual. Finally, the previous limitation of the size of query strings was dropped.

SQLite3 is now supported as a database engine. MySQL versions 4.0 and later are now fully supported, including transactions (only with transaction-safe tables, of course).

The RefDB manual as well as all DTD documentations are now part of the sources and are installed automatically on your system. As another first in this release, man pages for all programs and utilities are now available.

References can now hold multiple UR and L1-L4 fields. Extended notes can now be declared public or private to share them with others or to keep them to yourself, respectively.

Bibliographies can now use the work title in place of missing authors. A new command updatejo offers a simple interface to maintain periodical synonyms.

A couple of scripts were added to the core distribution: refdb-ms is an interactive tool to write bibliography styles. refdb-backup and refdb-restore back up and restore, respectively, your reference databases.

0.9.5: December 13, 2004

There were many bug fixes and improvements in the bibliography support. Most notably, the driver files now support the latest DocBook and TEI stylesheets. The bibliography formatting capabilities were overhauled thoroughly. Support for YMD style dates and titles formatted with initial caps was added. Bibliography styles can now specify indenting and font sizes. These features are implemented by the newly added CSS support in (X)HTML documents.

The bibliography tool now handles missing references (i.e. works cited in the document but not present in the database) in a more intelligent fashion. You'll get a list of the IDs or citation keys of the missing works to make it easier to fix your document or your database. Using a new configuration option, missing references can either be treated as errors or as warnings.
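
Conceptually, this boils down to a set difference plus a switch that decides how strictly to react. The Python sketch below illustrates that logic only; the function name, the `treat_as_error` flag, and the sample keys are hypothetical and do not reflect the name of the actual configuration option.

```python
def missing_references(cited, in_database, treat_as_error=False):
    """Return the sorted citation keys that are cited but not in the database."""
    missing = sorted(set(cited) - set(in_database))
    if missing and treat_as_error:
        # Strict mode: stop processing and name every missing work.
        raise LookupError("missing references: " + ", ".join(missing))
    # Lenient mode: hand the list back so the caller can print a warning.
    return missing


cited_keys = ["miller1999", "doe2001", "smith2003"]
database_keys = ["doe2001"]
print(missing_references(cited_keys, database_keys))
# ['miller1999', 'smith2003']
```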

The query language was extended with a ":TA:" pseudo target which searches in the titles in all bibliographic levels. This is useful to locate references by title words across all reference types.

The configuration of the XML toolchain is now simpler as it supports XML catalogs. If you (or your operating system's package system) maintain an XML catalog in the default location, the RefDB configure script can automatically detect the required stylesheets. In addition, the toolchain for transformations can now be configured with a new refdbxmlrc configuration file.

A serious bug in the refdbxp tool was fixed. This speeds up the tool by orders of magnitude for large documents. Finally, the diagnostic messages for database connection errors were improved.

0.9.4: February 18, 2004

Extended notes are stored separately from reference entries, but they can be linked to any number of references, keywords, author or periodical names. This allows you to create notes about a topic and link every relevant item in your database to that note. Notes can be searched in a similar way as references. You can even search notes that are linked to a particular reference and vice versa.

refdbd now has built-in character encoding conversion support. This makes it easier to match the encoding of your incoming data and the encoding of your databases.

The query language was slightly changed to allow you to choose between literal matches and regular expression matches in most string comparisons.

The package configuration was enhanced to allow building and installing the server and the clients separately.

0.9.3: August 19, 2003

RefDB now supports XML documents as a native input format. The risx.dtd can be used to author reference data from scratch or as a target for SGML/XML transformations.

The separately available RefDBClient Perl module is a collection of classes and functions to access a refdbd server directly from a Perl script, without using the C clients. This allows rapid development of custom clients and CGI applications that utilize the functionality supplied by refdbd.

A couple of issues related to citation keys have been fixed. The bibliography scripts now use xsltproc to extract the citations from XML documents in a case-preserving fashion. This also means that (open)jade is no longer necessary to process XML documents with RefDB. SGML documents still require (open)jade, and the citation keys have to be treated as case-insensitive. A new switch for refdbd lets you uppercase citation keys automatically upon import to simplify working with SGML documents.

The new refdba:scankw command runs a manual keyword scan over all existing datasets. This is more thorough than the automatic keyword scan as it will also add keywords introduced by newer references to the older ones.

Furthermore, portability was improved, thus fixing problems on Cygwin and on OSX.

0.9.2: March 2, 2003

Finally, RefDB can use an embedded database engine in addition to the external database servers MySQL and PostgreSQL. The SQLite-based engine requires no administration and supports everything except personal reference lists (this may be fixed in a later release).

The script was extended to handle both tagged and XML Pubmed data and is intended to replace nmed2ris entirely. The web interface now uses by default. and now support the same config file and logging features as the C programs. The common functionality of these scripts was moved to an external Perl module package called "RefDB-perlmod" which is available at the RefDB project page.

The refdbc:getref command now returns the number of retrieved references.

As usual, a couple of bugs were fixed. Worth mentioning are fixes for a couple of minor query bugs, for bibtex bibliographies, and for refdbxp. Other fixes allow refdba and refdbc to be run as regular applications from CGI programs such as Perl or PHP scripts.

0.9.1: December 5, 2002

This maintenance release fixes a couple of bugs related to the processing of XML documents and to the PostgreSQL support. 0.9 had introduced a few incompatibilities with Solaris which are also fixed in this release.

To take the hassle out of processing your documents, RefDB now offers a script called "refdbnd". This is an interactive shell script which asks a few questions and then creates a skeleton document with the doctype you need, along with a customized Makefile. The Makefile controls all further processing steps, so unless you do weird things you'll be able to just say "make pdf" (or html, rtf, ps etc) to turn your SGML or XML source into a PDF document with formatted citations and bibliographies.

A new perl script "" was added. While nmed2ris converts the tagged Pubmed data format, is designed to convert the Pubmed XML data format to RIS. nmed2ris will eventually be dropped or replaced by an equivalent Perl script.

The new tutorial is an introduction for users who have access to a working RefDB setup. The tutorial explains the most common tasks with a lot of examples to get new users up to speed.

0.9: November 19, 2002

This version focuses on architectural changes and new features. The most prominent change is the integration of a database abstraction layer which currently lets you use MySQL and PostgreSQL as database servers. Support for an embedded database engine will be included in the next release.

PostgreSQL offers transactions and a variety of character encodings that can be selected individually for each database. If you run PostgreSQL anyway, you'll certainly appreciate the choice.

New features include the support for mnemonic citation keys, an automatic keyword scan, export of bibliography styles to XML documents, a new TEI XML backend to export reference data, a first shot at a MARC import filter (not yet feature-complete and most likely full of bugs, but at least it's a start), and a new subdirectory with example files for playing around.

RefDB now also supports a new short citation style in SGML and XML documents. You just need the minimum of markup with a list of citation keys or IDs that you want to cite, something like <citation role="REFDB">miller1999;doe2001</citation>. The new tool refdbxp expands this short notation to the full notation required for the document transformation.
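
To make the short notation concrete, the Python sketch below pulls the individual citation keys out of such an element. It covers only this first parsing step and is not how refdbxp itself is implemented; the full notation that refdbxp generates is not reproduced here. The function name `citation_keys` is made up for the example.

```python
import xml.etree.ElementTree as ET

snippet = '<citation role="REFDB">miller1999;doe2001</citation>'


def citation_keys(xml_text):
    """Return the keys or IDs listed in a short-form RefDB citation element."""
    element = ET.fromstring(xml_text)
    # Ignore elements that are not marked as RefDB citations.
    if element.tag != "citation" or element.get("role") != "REFDB":
        return []
    text = element.text or ""
    # Keys are separated by semicolons; skip empty fragments.
    return [key.strip() for key in text.split(";") if key.strip()]


keys = citation_keys(snippet)
print(keys)  # ['miller1999', 'doe2001']
```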

Back then in the days of yore

Of course there were releases prior to 0.9, but no one can actually remember these days, let alone say anything useful about them. If you do not want to leave the old releases in this state of obscurity, feel free to unearth the forgotten releases from CVS and play with them.


Due to the usual imponderabilities that affect a volunteer project, only a very tentative list of planned enhancements can be shown.




This is a list of links to unfinished stuff which may or may not show up in RefDB eventually.


This is a RelaxNG schema which tries to capture the data that RIS datasets can hold in a structured fashion that is easier for dataset authors to use and easier for database programmers to implement. A full description including the download info is here. Some additional info can be found in this blog entry.


Rich Text Format (RTF) is a plain-text format which can be handled by most commercial and free word processors. RefDB has had experimental support for this format since early 2008. See this brief description for further information.