Submitted to OOPSLA'99 Metadata and Active Object-Model
Pattern Mining Workshop
Three Patterns for the Implementation of Ontologies in Java
Holger Knublauch
Research Institute for Applied Knowledge Processing
(FAW)
Helmholtzstraße 16
89081 Ulm, Germany
Holger.Knublauch@faw.uni-ulm.de
Abstract:
Ontologies - specifications of domain concepts and
their relationships - are essential building blocks of well-designed knowledge-based
systems. Although ontologies are high-level representations of abstract knowledge
models, they still have to be mapped to some kind of object model in
order to be used in an executable system. We present three patterns for
the implementation of ontologies in the object-oriented language Java.
Ontology concepts are either represented by reflection-backed JavaBeans
classes, by an active object-model (AOM), or by a mixed approach based
on extending the classes from the AOM. We compare the benefits and limitations
of these individual patterns and conclude that the choice of the pattern
depends on the requirements of the target system being considered.
1 Introduction
Although knowledge-based systems (KBSs) are in routine use in many application
domains, there is still no widely accepted framework for their development,
comparable to standard Software Engineering (SE) methodologies. Researchers
from the field of Knowledge Engineering (KE) have been proposing systematic
guidelines for the development of KBSs since the mid-eighties [14].
Their basic approach is the stepwise, evolutionary construction of models
which represent knowledge independent from its implementation, but with
respect to different knowledge types. The resulting explicit knowledge
models are supposed to be easier to validate, maintain and reuse. Ontologies
[8] - specifications
of domain concepts and their relationships - are essential building blocks
of these knowledge models.
Although several KE methodologies such as CommonKADS [13], MIKE [1] and
Protégé [11] are matters of spirited discussion on an academic level,
industrial software developers often find it difficult to apply them to
their projects [2]. We argue that this is largely because little attention
has so far been paid to the integration of knowledge models into the overall
software development process. Few KE methodologies lead to executable
code, so most do not support rapid prototyping, which is essential
for system evaluation by domain experts. Furthermore, there is little tool
support for knowledge modeling apart from research prototypes [4].
Finally, knowledge models are often stored in formal languages such as
Ontolingua [8] and are therefore difficult to access from the remaining
software modules. For these reasons, projects have to expend significant
development effort building up basic infrastructure before the main thrust
of development can commence.
To avoid such reinvention, software developers try to generalize
the approaches they use in their projects. A widely adopted goal is the
mining of patterns [6], which capture reusable design techniques at a high
level of abstraction.
Based on our experiences in the development of various medical decision-support
systems, this paper presents three patterns for the implementation of ontologies
in the object-oriented language Java. These patterns can aid developers
in integrating knowledge models into their software architectures.
Section 2 describes the task and use of ontologies in KBSs and thus
defines some requirements that have to be fulfilled by ontology design patterns.
Section 3 presents two versions of an active object-model pattern. Section
4 presents our own approach, based on the Java component model JavaBeans.
Section 5 reports on some benefits and limitations of these three patterns.
2 Ontologies in knowledge-based systems
KBSs differ from other software artifacts by the close involvement of domain
experts in the development process. These domain experts might have
little or no background in computer science and should therefore collaborate
with knowledge engineers in the construction of formal models of
their expertise. These models are based on concepts which best describe
the domain of discourse and some relationships and constraints
between them. An ontology is such a set of concepts and relationships.
Apart from defining the structure of knowledge, ontologies are also
frequently used to describe the input and output of problem-solving methods,
which perform the reasoning task in KBSs. Furthermore, the notion of ontologies
is increasingly popular in the domain of database research, for example
to specify the semantic relationships between heterogeneous, distributed
databases.
Various guidelines for constructing domain ontologies from
informal knowledge sources have been proposed [10],
but a detailed discussion is beyond the scope of this paper. The difficulty
of building ontologies is comparable to the problems encountered during the
design phase of object-oriented SE, where object models are typically redesigned
in an evolutionary, cyclic process [3].
For this reason, and even more so because knowledge modeling is mainly performed
by domain experts, comfortable editors and test environments are required.
A number of formal ontology representation languages aiming at knowledge
exchange have emerged [14].
Most of these languages include the following primitives (cf. [12]):
- Classes represent the concepts, arranged in an inheritance hierarchy.
  For example, a medical domain ontology might contain the concepts Vaccine
  and Disease and the relationship immunizes-against between them.
- Slots represent the attributes of the classes. Possible slot types
  are primitive types (integer, string, boolean), references to other objects
  (modeling relationships) and sets of values of these types. For example,
  each Disease has a name slot of type string.
- Facets are attached to classes or slots and contain meta information,
  such as comments, constraints and default values. For example, in order
  to qualify as a Vaccine, a drug has to immunize against at
  least one Disease.
- Instances represent specific entities from the domain knowledge
  base (KB). An example KB based on the medical ontology above might contain
  the specific vaccine MMR and the disease measles.
In order to be used by executable systems, ontologies have to be mapped
to some kind of object model. The items above clearly resemble object-oriented
concepts, especially because ontology concepts are often tightly connected
to the algorithms operating on them. However, the straightforward approach
of hard-coding ontologies with classes, fields and objects in the target
programming language violates some of the following requirements:
- Transparency: Users need to be able to extract and understand the
  ontology underlying the system for evaluation purposes.
- Maintainability: During evolutionary system development, there is
  a strong need for ontology redesign and knowledge base editing. A generic
  description of ontology concepts is needed, to enable the use of flexible
  editors and tools.
- Flexibility: High-level operations, such as the translation between
  different ontologies, are frequently needed. In support of such operations,
  ontology concepts have to be made explicit at run-time.
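For contrast, the straightforward hard-coded mapping mentioned before this list might look like the following minimal sketch. The concept names come from the medical example; the constructors, the exception type and the demo class are assumptions made purely for illustration. Note that the slot types and the "at least one Disease" facet exist only implicitly in the source code, which is why this style falls short of the requirements above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Naive mapping: concept -> class, slot -> field, instance -> object.
class Disease {
    String name;                         // slot "name" of type string

    Disease(String name) {
        this.name = name;
    }
}

class Vaccine {
    String name;
    // Relationship "immunizes-against", modeled as a multi-valued slot.
    final List<Disease> immunizesAgainst = new ArrayList<>();

    Vaccine(String name, List<Disease> diseases) {
        // The facet "at least one Disease" is buried in an if statement;
        // no editor or translation tool can discover it at run-time.
        if (diseases.isEmpty()) {
            throw new IllegalArgumentException(
                    "a Vaccine must immunize against at least one Disease");
        }
        this.name = name;
        this.immunizesAgainst.addAll(diseases);
    }
}

class NaiveOntologyDemo {
    public static void main(String[] args) {
        Disease measles = new Disease("measles");
        Vaccine mmr = new Vaccine("MMR", Collections.singletonList(measles));
        System.out.println(mmr.name + " immunizes against "
                + mmr.immunizesAgainst.get(0).name);
    }
}
```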
In short, an explicit, abstract ontology definition based on metadata
has to be encoded into the system. The following two sections present three
design patterns for including such metadata, which we used in some of our
Java-based KBS projects. We started with two patterns which are supported by
the KE methodology and tool Protégé-2000 [7].
In order to overcome some of the limitations we encountered with these
patterns, we then developed our own approach, which is presented in section 4.
3 Two active object-model patterns
Protégé-2000 is a system supporting the definition of domain
ontologies and knowledge bases. It provides user-friendly editors for the
definition of classes, slots and instances, which can be embedded in
custom applications by means of a Java API.
For the sake of illustration, the following list contains a simplified
version of this API, which can be used to read and modify ontologies at
run-time (the classes from the real API are of course much more complex):
- Cls (called Cls to be distinguishable from the standard java.lang.Class):
  Each Cls manages sets of Slots, sub-Clses and Instances, and has fields
  specifying the class name and whether the class is abstract or not.
- Slot: Provides meta information (name, type, allowed values) and the
  current values of a slot.
- Instance: Represents an instance of a given Cls.
These classes also support additional meta information, such as comments
and information on how to present the ontology in a graphical editor. Further
classes for constraints and so on exist.
According to Foote and Yoder [5],
"an active object-model (AOM) is an object model that provides meta information
about itself so that it can be changed at run-time". This API
is a typical example of the AOM pattern. Instances and even classes are
all encapsulated in objects which are instances of intermediate classes,
acting as an additional layer that manages all interactions. For example, in
order to access the value of the name slot of a specific Disease
instance, a reference to the name Slot has to be retrieved from
the Instance object, and a method such as getSlotValue()
has to be invoked on it.
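The access path just described can be sketched as follows. This is a deliberately tiny illustration and not the Protégé-2000 API: the class names Cls, Slot and Instance and the method name getSlotValue() follow the simplified list above, while createInstance(), addSlot(), getSlot() and setSlotValue() are hypothetical helpers introduced here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in: a slot carries meta information and its current value.
class Slot {
    private final String name;
    private final Class<?> type;
    private Object value;

    Slot(String name, Class<?> type) {
        this.name = name;
        this.type = type;
    }

    String getName() { return name; }
    Class<?> getType() { return type; }
    Object getSlotValue() { return value; }

    void setSlotValue(Object newValue) {
        // The meta information is used to check values at run-time.
        if (newValue != null && !type.isInstance(newValue)) {
            throw new IllegalArgumentException(name + " expects " + type.getSimpleName());
        }
        value = newValue;
    }
}

// Hypothetical stand-in: an ontology class is an object, not a Java class.
class Cls {
    private final String name;
    private final Map<String, Class<?>> slotTypes = new LinkedHashMap<>();

    Cls(String name) { this.name = name; }

    String getName() { return name; }

    // The model can be extended at run-time; only instances created
    // afterwards see the new slot in this simplified sketch.
    Cls addSlot(String slotName, Class<?> type) {
        slotTypes.put(slotName, type);
        return this;
    }

    Instance createInstance() { return new Instance(this, slotTypes); }
}

class Instance {
    private final Cls cls;
    private final Map<String, Slot> slots = new LinkedHashMap<>();

    Instance(Cls cls, Map<String, Class<?>> slotTypes) {
        this.cls = cls;
        for (Map.Entry<String, Class<?>> e : slotTypes.entrySet()) {
            slots.put(e.getKey(), new Slot(e.getKey(), e.getValue()));
        }
    }

    Cls getCls() { return cls; }
    Slot getSlot(String slotName) { return slots.get(slotName); }
}

class AomDemo {
    public static void main(String[] args) {
        Cls disease = new Cls("Disease").addSlot("name", String.class);
        Instance measles = disease.createInstance();
        Slot nameSlot = measles.getSlot("name");   // first retrieve the Slot object
        nameSlot.setSlotValue("measles");
        System.out.println(nameSlot.getSlotValue()); // then ask it for the value
    }
}
```

Every read or write passes through a lookup in the intermediate layer, which is what makes the model inspectable and changeable at run-time.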
Protégé-2000 also supports an extended version of this
pattern. Application developers can write their own subclasses of the API
classes in Java and thus introduce additional features, e.g., convenient
access methods such as Disease.getName(). This pattern is no longer
a pure AOM, because the ontology's behavior is determined not only by the
implementing object model but also by the new classes. Parts of
the functionality are therefore hard-coded in Java source code.
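A minimal sketch of this extended pattern, assuming a generic Instance class that stores slot values in a map (a hypothetical simplification, not the actual Protégé-2000 class): the subclass adds the convenient accessor Disease.getName(), and with it the knowledge that a Disease has a name moves into Java source.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical generic AOM instance: slot values are kept in a map.
class Instance {
    private final Map<String, Object> slotValues = new HashMap<>();

    Object getSlotValue(String slotName) { return slotValues.get(slotName); }

    void setSlotValue(String slotName, Object value) { slotValues.put(slotName, value); }
}

// Hand-written subclass: convenient and type-safe for callers, but the
// existence and type of the "name" slot are now hard-coded here.
class Disease extends Instance {
    String getName() { return (String) getSlotValue("name"); }

    void setName(String name) { setSlotValue("name", name); }
}

class AomPlusPlusDemo {
    public static void main(String[] args) {
        Disease measles = new Disease();
        measles.setName("measles");
        // Both access paths reach the same underlying slot value.
        System.out.println(measles.getName());
        System.out.println(measles.getSlotValue("name"));
    }
}
```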
4 Implementation of ontologies with JavaBeans
The two AOM patterns - the original API and its extension - are more or
less re-implementations of features already found in an object-oriented
language such as Java. As already stated in section 2, a naive implementation
of ontologies with "real" classes and instances does not provide all
the meta information required. However, Java provides reflection mechanisms,
so that programs can analyze their own structure at run-time.
4.1 JavaBeans
Backed by this reflection mechanism, the JavaBeans API [9]
was added to the Java standard with version 1.1. The API was initially
developed to support the development of sharable components with well-defined
interfaces. The JavaBeans designers mainly aimed at visual programming
with components such as buttons and dialogs, which can be easily configured
by builder tools. The API specifies how a JavaBeans class exposes its features,
so that tools can analyze and present them. For each class, a so-called
BeanInfo class can be defined, providing details on the properties, methods and
events. Using the BeanInfo, programmers can attach arbitrary additional information
objects to the descriptors of the class, its properties and methods. If no
special BeanInfo is available for a class, suitable default values are
derived by means of reflection.
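As a brief illustration of this default behavior, the following sketch asks java.beans.Introspector for the properties of a class that has no BeanInfo of its own. The Disease class, its contagious property and the describe() helper are hypothetical names chosen for this example.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.Map;
import java.util.TreeMap;

class IntrospectionDemo {
    // Hypothetical ontology class that follows the JavaBeans naming conventions.
    public static class Disease {
        private String name;
        private boolean contagious;

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public boolean isContagious() { return contagious; }
        public void setContagious(boolean contagious) { this.contagious = contagious; }
    }

    // Returns property name -> property type, as discovered by reflection.
    static Map<String, Class<?>> describe(Class<?> beanClass) {
        try {
            // Stop at Object so that the inherited "class" property is not reported.
            BeanInfo info = Introspector.getBeanInfo(beanClass, Object.class);
            Map<String, Class<?>> slots = new TreeMap<>();
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                slots.put(pd.getName(), pd.getPropertyType());
            }
            return slots;
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Class<?>> slot : describe(Disease.class).entrySet()) {
            System.out.println(slot.getKey() + " : " + slot.getValue().getSimpleName());
        }
    }
}
```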
JavaBeans properties can be of any primitive type, such as int, double
or boolean, or references to objects. Properties are either single-valued
or indexed. Indexed properties represent arrays of the above types.
Similar to the smart variable pattern from [5],
property access is funneled through suitable getter and setter methods.
Properties are said to be bound if they deliver a PropertyChangeEvent
after their value has changed. Bound properties are simply implemented
by adding an event dispatching call to the end of the setter method. Properties
are said to be constrained if the setter method may throw a PropertyVetoException
when a new value is not to be accepted. Such an exception can be raised either
by the class itself or by any VetoableChangeListener registered
on the property. The registered listeners of each property are typically
managed in specific Set fields and can be modified with suitable
add- and remove-listener methods.
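The following sketch shows one property that is both constrained and bound. Instead of hand-managed Set fields it uses the helper classes PropertyChangeSupport and VetoableChangeSupport from java.beans, which is a substitution made here for brevity; the Drug class and its dosage property are hypothetical.

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;
import java.beans.PropertyVetoException;
import java.beans.VetoableChangeListener;
import java.beans.VetoableChangeSupport;

// Hypothetical bean with one property ("dosage") that is constrained and bound.
class Drug {
    private final PropertyChangeSupport changes = new PropertyChangeSupport(this);
    private final VetoableChangeSupport vetoes = new VetoableChangeSupport(this);
    private double dosage;

    public double getDosage() { return dosage; }

    public void setDosage(double newDosage) throws PropertyVetoException {
        double old = dosage;
        // Constrained: any registered listener may reject the value by throwing.
        vetoes.fireVetoableChange("dosage", old, newDosage);
        dosage = newDosage;
        // Bound: listeners are notified after the value has changed.
        changes.firePropertyChange("dosage", old, newDosage);
    }

    public void addPropertyChangeListener(PropertyChangeListener l) {
        changes.addPropertyChangeListener(l);
    }

    public void removePropertyChangeListener(PropertyChangeListener l) {
        changes.removePropertyChangeListener(l);
    }

    public void addVetoableChangeListener(VetoableChangeListener l) {
        vetoes.addVetoableChangeListener(l);
    }

    public void removeVetoableChangeListener(VetoableChangeListener l) {
        vetoes.removeVetoableChangeListener(l);
    }
}

class BoundConstrainedDemo {
    public static void main(String[] args) {
        Drug drug = new Drug();
        drug.addPropertyChangeListener(evt ->
                System.out.println(evt.getPropertyName() + " -> " + evt.getNewValue()));
        drug.addVetoableChangeListener(evt -> {
            if ((Double) evt.getNewValue() < 0) {
                throw new PropertyVetoException("dosage must not be negative", evt);
            }
        });
        try {
            drug.setDosage(2.5);    // accepted, change listeners are notified
            drug.setDosage(-1.0);   // vetoed, the value stays at 2.5
        } catch (PropertyVetoException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```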
4.2 From ontologies to JavaBeans (and back)
We developed an approach for the mapping between ontologies and JavaBeans
classes:
Classes: Each ontology concept is implemented by means of a JavaBeans
class. For concepts with more than one parent class, Java's lack of multiple
inheritance is inconvenient but no obstacle, because Java interfaces
(types that declare methods without bodies) can be used to simulate it.
Slots: For each slot of an ontology class, a JavaBeans property
of the same name has to be defined. These properties should be declared
bound so that external components, such as knowledge editors and event-driven
visualization modules, are notified of changes. Using reflection mechanisms,
class and slot information can be extracted from the running programs.
Facets: Because property values can only be modified through
the associated setter methods, it is easy to implement constraints that
reject incoming values when preconditions are violated. The straightforward
approach to implementing constraints in Java is to hard-code the constraint
checks as if statements at the beginning of the setter method.
This might be suitable for rapid prototyping, but has several drawbacks,
because the specification of the method's behavior is opaque and
hard to configure. Instead we propose the use of generic constraint checker
classes, which register themselves as VetoableChangeListeners on
the property to check. As an example, consider a RangeChecker
class, which rejects numeric values outside a given range. Similarly, a
CardinalityChecker class disallows illegal array sizes in indexed properties.
Another example of constraint checking is maintaining dependencies between
slots, such as mutual link consistency in relationships. In this case, a
checker class can update the dependent values when the setter method of one
slot is invoked. The checker classes themselves should be implemented as
JavaBeans, so that their own properties (range and cardinality, respectively)
can be easily configured and analyzed.
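One possible shape of such a checker is sketched below. Only the name RangeChecker is taken from the text; its min and max properties, the Patient class and the per-property registration method are assumptions made for illustration.

```java
import java.beans.PropertyChangeEvent;
import java.beans.PropertyVetoException;
import java.beans.VetoableChangeListener;
import java.beans.VetoableChangeSupport;

// Generic constraint checker; itself a bean with configurable "min" and "max".
class RangeChecker implements VetoableChangeListener {
    private double min;
    private double max;

    public RangeChecker(double min, double max) {
        this.min = min;
        this.max = max;
    }

    public double getMin() { return min; }
    public void setMin(double min) { this.min = min; }
    public double getMax() { return max; }
    public void setMax(double max) { this.max = max; }

    @Override
    public void vetoableChange(PropertyChangeEvent evt) throws PropertyVetoException {
        Object value = evt.getNewValue();
        if (value instanceof Number) {
            double v = ((Number) value).doubleValue();
            if (v < min || v > max) {
                throw new PropertyVetoException(
                        evt.getPropertyName() + " must lie in [" + min + ", " + max + "]", evt);
            }
        }
    }
}

// Hypothetical ontology class: the setter contains no constraint logic itself.
class Patient {
    private final VetoableChangeSupport vetoes = new VetoableChangeSupport(this);
    private int age;

    public int getAge() { return age; }

    public void setAge(int newAge) throws PropertyVetoException {
        vetoes.fireVetoableChange("age", age, newAge);
        age = newAge;
    }

    public void addVetoableChangeListener(String propertyName, VetoableChangeListener l) {
        vetoes.addVetoableChangeListener(propertyName, l);
    }
}

class RangeCheckerDemo {
    public static void main(String[] args) {
        Patient patient = new Patient();
        patient.addVetoableChangeListener("age", new RangeChecker(0, 150));
        try {
            patient.setAge(42);     // accepted
            patient.setAge(200);    // rejected by the checker
        } catch (PropertyVetoException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println("age = " + patient.getAge());
    }
}
```

Because the limits are ordinary bean properties of the checker, a tool could display or change them without touching the Patient source.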
However, for reverse-engineering abstract ontologies from Java classes,
pure JavaBeans-based reflection mechanisms are insufficient, because they
only reveal whether a property is constrained, but do not provide details
on the type of constraints being checked. Therefore, ontology classes should
explicitly provide access to the sets of registered listeners, e.g., by
implementing a method such as
Iterator getVetoableChangeListeners(String propertyName).
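A sketch of such an accessor, under the assumption that listeners are held in a java.beans.VetoableChangeSupport (whose getVetoableChangeListeners(String) method returns an array). The Vaccine class and this minimal CardinalityChecker, with its single minimum property, are hypothetical.

```java
import java.beans.PropertyChangeEvent;
import java.beans.PropertyVetoException;
import java.beans.VetoableChangeListener;
import java.beans.VetoableChangeSupport;
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical checker: vetoes arrays that hold fewer than "minimum" elements.
class CardinalityChecker implements VetoableChangeListener {
    private final int minimum;

    public CardinalityChecker(int minimum) { this.minimum = minimum; }

    public int getMinimum() { return minimum; }

    @Override
    public void vetoableChange(PropertyChangeEvent evt) throws PropertyVetoException {
        Object value = evt.getNewValue();
        if (value instanceof Object[] && ((Object[]) value).length < minimum) {
            throw new PropertyVetoException("too few values for " + evt.getPropertyName(), evt);
        }
    }
}

class Vaccine {
    private final VetoableChangeSupport vetoes = new VetoableChangeSupport(this);
    private String[] immunizesAgainst = new String[0];

    public String[] getImmunizesAgainst() { return immunizesAgainst.clone(); }

    public void setImmunizesAgainst(String[] diseases) throws PropertyVetoException {
        vetoes.fireVetoableChange("immunizesAgainst", immunizesAgainst, diseases);
        immunizesAgainst = diseases.clone();
    }

    public void addVetoableChangeListener(String propertyName, VetoableChangeListener l) {
        vetoes.addVetoableChangeListener(propertyName, l);
    }

    // Lets tools find out which checkers guard a given property.
    public Iterator<VetoableChangeListener> getVetoableChangeListeners(String propertyName) {
        return Arrays.asList(vetoes.getVetoableChangeListeners(propertyName)).iterator();
    }
}

class ListenerAccessDemo {
    public static void main(String[] args) {
        Vaccine mmr = new Vaccine();
        mmr.addVetoableChangeListener("immunizesAgainst", new CardinalityChecker(1));
        Iterator<VetoableChangeListener> it = mmr.getVetoableChangeListeners("immunizesAgainst");
        while (it.hasNext()) {
            VetoableChangeListener l = it.next();
            if (l instanceof CardinalityChecker) {
                // The constraint detail is now recoverable at run-time.
                System.out.println("minimum cardinality: " + ((CardinalityChecker) l).getMinimum());
            }
        }
    }
}
```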
Another type of facet - documentation - is supported by the JavaBeans
standard through the getShortDescription method of the PropertyDescriptor
and BeanDescriptor classes. Furthermore, support for visual editors is
included: For each property of a class, it is possible to specify a PropertyEditor
component, which graphically presents the property value in a human-readable
form. Similarly, Customizer components can be defined as editors
on class level. These components can be used to adapt knowledge acquisition
tools to specific needs. Other types of facets can be added by attaching
user-defined objects to the BeanInfo.
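The descriptor mechanism can be sketched as follows. The example builds a single PropertyDescriptor directly; in a complete bean it would be returned from a BeanInfo class. The Disease class and the attribute key "defaultValue" are conventions assumed for this illustration and are not defined by the JavaBeans standard.

```java
import java.beans.IntrospectionException;
import java.beans.PropertyDescriptor;

class FacetDemo {
    // Hypothetical ontology class with a single "name" slot.
    public static class Disease {
        private String name;

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }

    // Builds the descriptor of the "name" property and attaches two facets to it.
    static PropertyDescriptor describeName() {
        try {
            PropertyDescriptor pd = new PropertyDescriptor("name", Disease.class);
            // Documentation facet, supported directly by the standard.
            pd.setShortDescription("Common name of the disease");
            // User-defined facet stored as a named attribute of the descriptor.
            pd.setValue("defaultValue", "unknown");
            return pd;
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        PropertyDescriptor pd = describeName();
        System.out.println(pd.getName() + ": " + pd.getShortDescription());
        System.out.println("default = " + pd.getValue("defaultValue"));
    }
}
```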
Instances: Ontology instances are simply Java objects. In support of maintaining
the set of instances in a knowledge base, we are developing a shell to
edit JavaBeans instances, which is a simple but powerful knowledge acquisition
tool for any JavaBeans-based ontology. Utilizing reflection to identify
classes, properties and current values, the shell provides various views
on the knowledge bases, such as trees, forms and graphs.
Apart from knowledge acquisition, the second part of KBS development
which requires special tool support is ontology editing. In order to implement
an editor of ontology concepts and relationships, we made use of the flexibility
of our generic JavaBeans instance editor. First, we defined a meta ontology
on the structure of ontologies, consisting of concepts like Class, Slot
and Constraint, and employed the instance editor to enter class
and slot instances. Then we designed a Java code generator module, which
registers itself as a listener on the meta ontology classes and modifies
the source code modules to reflect the property changes caused by the user.
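The listener-driven generation step can be illustrated with a toy example. It is only a sketch of the idea, assuming a hypothetical meta-ontology bean SlotDef with bound name and type properties and a hypothetical AccessorGenerator that keeps the generated text in memory; the generator described above modifies source code modules, which this sketch does not attempt.

```java
import java.beans.PropertyChangeEvent;
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;

// Hypothetical meta-ontology concept: each SlotDef instance describes one slot.
class SlotDef {
    private final PropertyChangeSupport changes = new PropertyChangeSupport(this);
    private String name;
    private String type;

    SlotDef(String name, String type) {
        this.name = name;
        this.type = type;
    }

    public String getName() { return name; }

    public void setName(String newName) {
        String old = name;
        name = newName;
        changes.firePropertyChange("name", old, newName);
    }

    public String getType() { return type; }

    public void setType(String newType) {
        String old = type;
        type = newType;
        changes.firePropertyChange("type", old, newType);
    }

    public void addPropertyChangeListener(PropertyChangeListener l) {
        changes.addPropertyChangeListener(l);
    }
}

// Toy generator: regenerates accessor source whenever the slot definition changes.
class AccessorGenerator implements PropertyChangeListener {
    private String source = "";

    @Override
    public void propertyChange(PropertyChangeEvent evt) {
        source = generate((SlotDef) evt.getSource());
    }

    public String getSource() { return source; }

    static String generate(SlotDef slot) {
        String n = slot.getName();
        String t = slot.getType();
        String cap = Character.toUpperCase(n.charAt(0)) + n.substring(1);
        return "private " + t + " " + n + ";\n"
             + "public " + t + " get" + cap + "() { return " + n + "; }\n"
             + "public void set" + cap + "(" + t + " " + n + ") { this." + n + " = " + n + "; }\n";
    }
}

class GeneratorDemo {
    public static void main(String[] args) {
        SlotDef slot = new SlotDef("name", "String");
        AccessorGenerator generator = new AccessorGenerator();
        slot.addPropertyChangeListener(generator);
        slot.setName("commonName");   // an edit in the instance editor would do this
        System.out.print(generator.getSource());
    }
}
```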
5 Comparison of the three patterns
This section lists some benefits and limitations of the three patterns.
We introduce the following names for the patterns:
- AOM: The ontology is modeled by means of intermediate classes which
  constitute an active object-model.
- AOM++: Like AOM, but the intermediate classes themselves can be
  subclassed.
- KBeans: The ontology is modeled by means of JavaBeans classes which
  obey the conventions described above (e.g. properties are smart variables).
These patterns were evaluated according to a number of criteria we found
to be important in our projects. The list of criteria is far from being
complete or well-structured and our discussion is informal. Although the
results of this evaluation might also be interesting in related domains,
we focus on ontologies as the target of the modeling process.
- How to transfer an abstract ontology into a running system?
AOM and AOM++ objects have to be instantiated either by constructor
code or (better) by a special builder tool. In the latter case, any part
of the ontology, e.g., classes, properties and instances, can be modified
dynamically. The same holds for KBeans instances - but not for classes.
In KBeans, changing classes requires the modification of the underlying
Java source code - either with a special purpose tool (as mentioned above)
or with any other Java editor. On the one hand this means that developers
have the choice between various modeling tools (modern Java IDEs and CASE
tools often include special support for editing JavaBeans). On the other
hand, a recompilation is required after each modification and the system
is error-prone if inexperienced programmers manually modify classes and
violate coding conventions.
- How to extract abstract models from the system?
Since AOM and AOM++ both use intermediate classes to manage metadata,
any structural information on the model can easily and efficiently be obtained
by calling the respective access methods. KBeans uses reflection mechanisms
(encapsulated in JavaBeans feature descriptors) for metadata access. The
use of reflection mechanisms is less comfortable than what can be provided
by special-purpose AOM classes. Furthermore, the default reflection mechanisms
supported by the JavaBeans framework have some limitations (e.g. missing
information on the type of constraints as described above). Thus, custom design
and coding conventions have to be introduced and obeyed.
- What about the run-time performance of the models?
Since both AOM patterns are based on the intermediate object layer,
any access to values has to be funneled through additional data structures.
Furthermore, in order to look up a single value, one first has to retrieve
a reference to the metadata object. Because these objects have to be stored
in some kind of container, considerable performance is lost. At least some
of these disadvantages can be avoided in the AOM++ pattern, by introducing
additional accessor methods. However, the KBeans pattern, which is based
on "real" variables, is certainly the most efficient. In one project we
required complex pharmacokinetic simulations based on instances from a
Protégé ontology. By changing to KBeans, we boosted the performance of the
simulation method by a factor of 1000.
- How much memory is required by an ontology?
AOM stores all meta information in intermediate objects, which represent
an extra layer requiring additional memory. Every single boolean is encapsulated
in a slot object. AOM++ and KBeans introduce new Java classes, which cost
class memory. KBeans is mostly based on meta information, which is stored
in the Java virtual machine anyway. Usually, the two AOM patterns require
much more memory than KBeans.
- How can models be exchanged and distributed between platforms, e.g.,
  across the internet?
Since in AOM the ontology is simply a set of serializable instances,
interactions with other applications based on the same intermediate classes
are no problem. AOM++ and KBeans both require the exchange of class files
as well and are therefore less flexible. Furthermore, the reuse of JavaBeans
classes in other computer languages is restricted to languages with similar
reflective capabilities. In any case, special tools are required to read
models and to transform them into a platform-independent language.
- Which pattern is best for additional ontology features, such as constraints,
  event notifications, documentation and default values?
All approaches are equally powerful, although models based on the KBeans
pattern are the most difficult to configure, because many such features
have to be declared in the corresponding BeanInfo classes.
- How can a model be connected to code, such as methods implementing problem-solving
  algorithms?
The pure AOM pattern does not allow custom methods to be added to the model,
because ontology classes are only represented by instances. This limitation
does not occur in AOM++ and KBeans, where any additional code can be attached
to the ontology classes.
6 Conclusions
Since each of the three approaches has its individual strengths and weaknesses,
the choice of the pattern depends on the requirements of the application
being considered. In the development of a number of medium-sized knowledge-based
systems, we found the KBeans pattern to be the most suitable approach.
In these projects, we very often had to implement interactions between
ontologies, the user interface and the algorithms. KBeans provides the closest
link between the ontology and the remaining classes of the system and
therefore allows a smooth integration of abstract knowledge models into
the architecture.
Acknowledgements
Sincere acknowledgements to Ray Fergerson and the other members of the
Protégé gang for their inspiring project.
References
[1] J. Angele, D. Fensel, D. Landes, and R. Studer. Developing Knowledge-Based
Systems with MIKE. Journal of Automated Software Engineering, 5(4), 1998.
[2] R. Benjamins, D. Fensel, C. Pierret-Golbreich, E. Motta, R. Studer,
B. Wielinga, and M. Rousset. Making Knowledge Engineering Technology Work.
In 9th Int. Conf. on Software Engineering and Knowledge Engineering,
Madrid, Spain, 1997.
[3] D. Coleman, P. Arnold, S. Bodoff, C. Dollin, H. Gilchrist, F. Hayes,
and P. Jeremaes. Object-Oriented Development: The Fusion Method.
Prentice-Hall, 1994.
[4] A. Duineveld, R. Stoter, M. Weiden, B. Kenepa, and R. Benjamins.
Wondertools? A Comparative Study of Ontological Engineering Tools. In Knowledge
Acquisition Workshop, Banff, Canada, 1999.
[5] B. Foote and J. Yoder. Metadata and Active Object-Models. In OOPSLA
Workshop on Metadata and Active Object-Models, Vancouver, Canada, 1998.
[6] E. Gamma, R. Helm, R. Johnson, and J. Vlissides. Design Patterns:
Elements of Reusable Object-Oriented Software. Addison-Wesley, 1995.
[7] W. Grosso, H. Eriksson, R. Fergerson, J. Gennari, S. Tu, and M. Musen.
Knowledge Modeling at the Millennium (The Design and Evolution of Protégé-2000).
In Knowledge Acquisition Workshop, Banff, Canada, 1999.
[8] T. Gruber. A Translation Approach to Portable Ontology Specifications.
Knowledge Acquisition, 5(2), 1993.
[9] G. Hamilton. JavaBeans Specification. Sun Microsystems (http://java.sun.com/beans),
1997.
[10] F. López. Overview of Methodologies for Building Ontologies. In
IJCAI-99 Workshop on Ontologies and Problem-Solving Methods, Stockholm, Sweden,
1999.
[11] M.A. Musen, S.W. Tu, H. Eriksson, J.H. Gennari, and A.R. Puerta.
PROTÉGÉ-II: An Environment for Reusable Problem-Solving Methods
and Domain Ontologies. In International Joint Conference on Artificial
Intelligence, Chambery, France, 1993.
[12] N. Noy and M. Musen. An Algorithm for Merging and Aligning Ontologies.
In 16th Nat. Conf. on Artificial Intelligence (AAAI-99), Workshop on
Ontology Management, Orlando, FL, USA, 1999.
[13] A. Schreiber, B. Wielinga, R. de Hoog, H. Akkermans, and W. van de
Velde. CommonKADS: A Comprehensive Methodology for KBS Development.
IEEE Expert, 1994.
[14] R. Studer, D. Fensel, S. Decker, and V.R. Benjamins. Knowledge Engineering:
Survey and Future Directions. In 5th German Conf. on Knowledge-based
Systems, Wuerzburg, Germany, 1999.