27th Annual International ACM SIGIR Conference on Research and Development in IR, Sheffield, United Kingdom, 29 - 31 July 2004, pp.1-9
In-depth analysis of a specific subject in molecular biology,
particularly one concerning the structural and functional properties
of a particular group of sequences, typically requires access
to an extensive knowledge base. The knowledge base may take the form of a
specialist database or subject-specific data warehouse (SSDW) to
facilitate the organisation of specialized data and the extraction of
new knowledge. These SSDWs are particularly useful for data mining or
knowledge discovery processes that require relevant information
from multiple data sources. The construction of a specialist database is
a multi-step process that typically involves enrichment of
annotations (by domain experts), development and integration of
analytical tools (by computer programmers), and construction of the
system (by database experts). The SSDWs contain focused subsets of data
compiled from multiple data sources and enriched with user annotations.
In this article we present the BioWare system, which enables
its users to collect, annotate, publish, and update specialized
molecular data in personalized WWW-accessible databases. BioWare
contains four data-warehouse-enabling components: (i) BioWare-Retrieve
searches and extracts data from selected sources and integrates them
into a standardized format, (ii) BioWare-Prep provides a semi-automated
mechanism for user-driven cleaning, preliminary analysis and annotation
of the data, (iii) TEMPLAR enables users to rapidly create searchable
WWW-accessible SSDWs, and (iv) BioWare-Update enables incremental
updating of the SSDWs with new data from the sources. We have used
the BioWare system for the creation and maintenance of several bioinformatic
databases.
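
The retrieve, prepare, and update stages described above can be illustrated with a minimal sketch. This is not BioWare's code: the Record fields, the function names, and the sample sources are all hypothetical, chosen only to show how data from several sources might be standardized, cleaned and annotated, and then merged incrementally. The TEMPLAR publishing step is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """A standardized entry; the fields are hypothetical, not BioWare's schema."""
    accession: str
    sequence: str
    source: str
    annotations: dict = field(default_factory=dict)


def retrieve(sources):
    """Retrieve step: flatten per-source raw entries into standardized Records."""
    records = []
    for name, entries in sources.items():
        for raw in entries:
            records.append(Record(raw["id"], raw["seq"].upper(), name))
    return records


def prep(records, annotate):
    """Prep step: drop duplicate sequences, then attach user annotations."""
    seen, cleaned = set(), []
    for rec in records:
        if rec.sequence in seen:
            continue
        seen.add(rec.sequence)
        rec.annotations.update(annotate(rec))
        cleaned.append(rec)
    return cleaned


def update(warehouse, new_records):
    """Update step: add only records whose accession is not yet stored."""
    added = 0
    for rec in new_records:
        if rec.accession not in warehouse:
            warehouse[rec.accession] = rec
            added += 1
    return added


# Hypothetical sample data: two sources, one duplicated sequence.
sources = {
    "source_a": [{"id": "A1", "seq": "mkt"}, {"id": "A2", "seq": "gavl"}],
    "source_b": [{"id": "B1", "seq": "MKT"}],  # same sequence as A1
}
cleaned = prep(retrieve(sources), lambda rec: {"length": len(rec.sequence)})
warehouse = {}
first_pass = update(warehouse, cleaned)
second_pass = update(warehouse, cleaned)  # nothing new, so nothing is added
```

Keying the warehouse on accession is one simple way to make the update step idempotent, so that re-running it against unchanged sources adds nothing; a real system would also need to detect revised entries, which this sketch does not attempt.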