HIPPO: Incorporating Hypertext using Fuzzy Components

Paul Newton
Electronic Publishing Research Group
Department Of Computer Science
University Of Nottingham, NG7 2RD
England
email: pkn@cs.nott.ac.uk
url: http://www.ep.cs.nott.ac.uk/~pkn/hippo/htf/htf.html

Introduction
In recent years, the hypertext community has moved towards more flexible, open forms of hypertext, with many researchers calling for support for heterogeneous applications and platforms [10,12]. Academics have rejected many of the traditional notions of hypertext which adopt monolithic, closed implementations, and have begun to consider hypertext as an abstract service which can be incorporated into other applications. This allows users to augment their conventional environment with hypertext functionality and linking concepts, while using their existing tools and legacy data. Open hypertext theory and hypertext integration have advanced considerably, with the emergence of many popular open systems and models [2,7,9,14,15,16,21]. Many of these models are heavily influenced by early work on classical hypertext [1,10,18], and by the development of abstract models such as Dexter and Trellis [8,11,17].

While many of these open approaches to hypertext differ in their implementation and precise abstractions, many of them share the idea of some form of hypertext link services layer. Hypertext abstractions are implemented as a substrate layer which sits between the user applications and the underlying operating system. Applications can then make use of these hypertext abstractions, and incorporate hypertext functionality to structure their data. This approach raises many issues surrounding the integration of hypertext into the existing environment - typically, designers have explored ways in which existing applications should communicate and integrate with the hypertext substrate. Furthermore, different models hold widely differing views as to the demands made on existing applications - while some models place most of the hypertext functionality in the hypertext services layer, others choose to make increased demands on applications, and require applications to maintain additional structures to support hypertext.

Problems with Open Hypertext Theory
The research into open hypertext theory has produced some excellent results, and shown the advantages of integrating hypertext services into existing environments. However, despite the advantages offered by hypertext and the flexibility of these open systems, users can still experience the unusual paradox of a closed environment. While open hypertext implementations allow the integration of hypertext services into existing applications, users must still adopt the particular abstractions and hypertext model offered by a particular open hypertext system. The growing array of open models each offer different abstractions and views of hypertext, and aim to proselytise users to their particular model. If a user wishes to adopt the linking strategies or some of the concepts offered by a given model, then they are forced to relinquish their existing open system and all of its features, and adopt another system. Some of these problems are lessened by adopting a loose coupling between applications and hypertext services, in which the data itself is left untouched by the hypertext layer, and the hypertext structure is maintained separately. However, this rigid, static notion of hypertext falls short of many of the claims and potential advantages of open hypertext theory; some of the particular areas of concern are detailed below:

Each open system implements a particular set of hypertext abstractions and a particular linking model (albeit implicitly in many cases). Each open model has clear definitions of nodes, anchors, link functionality, how the user interacts with the hypertext framework, how persistent information is stored, what constitutes a link, the requirements for integration etc. The behaviour of the system and the hypertext services is fixed and defined by the system designer - any means of augmenting or modifying this environment is very limited and restrictive.
The boundaries of the system are very clearly defined - the hypertext model defines very clearly what the system offers and what it cannot do. There is no ambiguity or flexibility in the system specification - the alternative notion of a fuzzy system is discussed later in the paper.
Open hypertext implementations are very static - they do not adapt to meet the user's needs, or change as the system is used. A hypertext model should offer an immersive environment which changes to reflect the usage and needs of the community that operates within it.
Distribution has been largely ignored by much of the hypertext community. Only recently, with the popularity of the World Wide Web, has there been an emphasis on distributed systems. However, many current implementations do not fully embrace distributed ideals: the systems themselves are not distributed, but often operate centrally. Many systems simply offer network access or some form of integration with the WWW, and claim to be distributed.
Open systems are tied to a particular implementation and instance of the underlying model. A flexible hypertext system should allow the user to select the most suitable implementation of a particular hypertext abstraction - this may be selected based on the current platform, the hardware capabilities, or any number of other criteria such as efficiency, network traffic, authority etc. Users should be free to choose the ways in which they integrate with a system, what level of integration they require, and the implementations they use to achieve this.

Many hypertext systems allow a limited form of integration and extensibility, using some approach based around scripting languages or plugin/API support. Indeed, some of the earliest hypertext systems such as NoteCards[10] and KMS[1] provided some form of scripting support, and many contemporary applications support an API (Application Programmer's Interface) framework (e.g. Netscape[13]). However, these approaches can prove very limiting: they often force developers to adopt a particular language or platform, and the extensions still operate within the framework of the host system, so they must still adopt the underlying concepts and abstractions defined by the original hypertext system. Furthermore, any extensions can only employ services and API calls which the system designers make available to developers.

Some recent contributions to open theory have employed different approaches to hypertext integration and extensibility. For example, the HyperDISCO[19] and Hyperform[20] systems adopt an object-oriented framework as a means of extending the hypertext models. Developers can extend and modify a set of core classes to implement specific behaviours and augment the hypertext model. While these approaches have proved very useful and make substantial contributions to the flexibility of hypertext environments, they do not address how a system should be extended, which objects can be used and when. They do not provide adequate support for a changing system, and still exhibit some of the problems described previously.

The HIPPO Model
The HIPPO model takes the view that the user is completely in control of the hypertext environment, that the hypertext services are there to be used by a user, but that they should not restrict the user, or force them to change their working practices. The hypertext model and the behaviour of the hypertext services should be selected by the user, and the user should be free to change this whenever they wish. The hypertext user should choose what constitutes a node, what type of links are supported, what happens when they traverse a link etc - indeed all aspects of the hypertext model and semantics should be controlled by the user, not dictated by a system designer (this builds on some of the ideas in Bieber[4,5]). The HIPPO model takes a computational view of hypertext, by abstracting the functionality and semantics of a hypertext environment into many small, lightweight processes. These processes can implement any operation or set of operations (link retrieval, presentation of a node, processing of link sets, storage of objects etc), which can then be used together to define an arbitrary hypertext system. In this way, the user chooses the behaviour of the hypertext system and which processes they use to integrate with their existing applications and environment. This model builds on some existing research into computational/dynamic hypertext systems [3,5,16,17,22].

HIPPO introduces a classification system to manage the administration of these processes - this uses a hierarchical system somewhat similar to the MIME-type taxonomy system used to encode data types[23]. This relates to some of the early work done on hypertext link classification [5,6,18], which examined the nature of hyperlinks and identified different link types; HIPPO, however, extends this to include all aspects of hypertext semantics and functionality. For example, a classification system for a set of hypertext processes might read:

...

/storage/dbms/relational/...

/storage/dbms/network/...

/storage/dbms/object/...

/storage/dbms/relational/concurrent/...

/presentation/node/text/...

/presentation/node/compound/text/...

/link/single-direction/follow/...

/link/single-direction/follow/http/...

/link/bi-direction/follow/...

/link/single-direction/retrieve/...

...

Processes can then register themselves under any of these classification types, and make themselves available to hypertext users. These processes could reside on the user's local machine; however, the flexibility of this approach to hypertext becomes more interesting when applied to a distributed model. The HIPPO system allows system processes to register with a trading service, which stores information about each process, its location, arguments etc. These trader services can then service requests from clients, and return information about the processes they know about. This notion of traders is widely used in other areas of distributed programming, such as CORBA and DNS, and allows processes to reside anywhere in the network domain. In this way, processes can make use of remote hardware and software platforms, and utilise remote resources without being restricted to the target architecture of the user. In addition, the system can support an arbitrary number of users, who can all add processes to the system. By employing a distributed model, the arbitrary hypertext model that the user decides to use is itself distributed, which results in a more robust and scalable architecture that inherits the well documented advantages of distributed systems.
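The registration and lookup steps described above can be sketched as follows. This is a minimal, in-memory illustration only: the names ProcessOffer, Trader, register and lookup are assumptions made for the example and are not taken from the HIPPO implementation, and a real trading service would be reached over the network rather than held in a local object.

```python
from dataclasses import dataclass


def _segments(path):
    """Split a classification path such as '/link/follow' into its parts."""
    return [part for part in path.split("/") if part]


@dataclass(frozen=True)
class ProcessOffer:
    """One registered hypertext process (all field names are hypothetical)."""
    name: str
    classification: str   # e.g. "/link/single-direction/follow/http"
    location: str         # host on which the process runs
    arguments: tuple = ()


class Trader:
    """In-memory stand-in for a trading service."""

    def __init__(self):
        self._offers = []

    def register(self, offer):
        """Record a process under its classification type."""
        self._offers.append(offer)

    def lookup(self, classification):
        """Return every offer registered at or below the given classification."""
        wanted = _segments(classification)
        return [
            offer for offer in self._offers
            if _segments(offer.classification)[:len(wanted)] == wanted
        ]


# Example use: two processes on different (made-up) hosts.
trader = Trader()
trader.register(ProcessOffer("http-follow", "/link/single-direction/follow/http", "hostA.example"))
trader.register(ProcessOffer("text-view", "/presentation/node/text", "localhost"))
link_processes = trader.lookup("/link/single-direction")
```

Matching on whole path segments, rather than on string prefixes, means that a request for /link/single-direction returns processes registered deeper in that branch, without accidentally matching an unrelated branch whose name merely begins with the same characters.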

The classification hierarchy included above is largely arbitrary, and could be reconstructed in many different ways - indeed, the HIPPO system could adopt a number of policies for administering the development of the taxonomy system. One approach would be to utilise a centralised authority to control and approve additions to the hierarchy, in much the same way as the existing MIME hierarchy. Applicants make requests for additions to the hierarchy, which are then examined before approval or rejection. This could work well, but a more interesting approach would be to allow the hierarchy to develop itself; processes could (and should) be made available by all users of the system, and could perhaps be given some temporary status. If a process becomes sufficiently useful and widely used, then it could be automatically added to the hierarchy. In this way, the hierarchy becomes largely autonomous, and develops to reflect the communal usage of the system. A useful analogy to this approach is seen in the development and administration of the USENET newsgroup hierarchy.
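One possible reading of the self-developing hierarchy is sketched below. The paper does not say how "sufficiently useful and widely used" would be measured, so the simple use counter and the promote_after threshold are assumptions made purely for illustration, as are the names Taxonomy, propose and record_use.

```python
from collections import Counter


class Taxonomy:
    """Hierarchy whose entries are promoted by usage (hypothetical policy)."""

    def __init__(self, promote_after=3):
        self.approved = set()       # entries that are part of the hierarchy
        self.provisional = set()    # entries with temporary status
        self.promote_after = promote_after  # assumed usage threshold
        self._uses = Counter()

    def propose(self, classification):
        """Any user may offer a new entry; it starts with temporary status."""
        if classification not in self.approved:
            self.provisional.add(classification)

    def record_use(self, classification):
        """Count one use, promoting the entry once it is used often enough."""
        if classification not in self.provisional:
            return
        self._uses[classification] += 1
        if self._uses[classification] >= self.promote_after:
            self.provisional.discard(classification)
            self.approved.add(classification)
```

A centralised policy, by contrast, would replace record_use with an explicit approval step carried out by the controlling authority.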

System Specifications Based Around Document Objects
While the system described so far offers a great deal of flexibility, where each user is free to adopt the hypertext model they choose, this newfound freedom could become just as restrictive as existing systems. Users can become overwhelmed by the (potentially unbounded) functionality and number of remote processes at their disposal. The system addresses this problem by moving the focus of the hypertext paradigm from the system to the information itself. Existing approaches to hypertext appear to offer hypertext models which encapsulate the document - a system provides the functionality, and the document is shoe-horned into the framework. However, the author believes it is important to give a higher priority to the information - different tasks and information fields demand widely different frameworks and functionality. This ranges from the trivial requirements of different file formats, to the more pertinent question of differing platforms and available resources, to the perhaps more wide-reaching question of different behaviours for different tasks and users.

It is clear that an expert working in a specific, highly-skilled medical domain makes very different demands on their hypertext model to a beginner looking for information about their favourite piece of music etc. The HIPPO model attempts to address this by treating the document as the primary object of interest - each document (or other "unit of information") has a system configuration associated with it. Typically, this would be a list of processes from the taxonomy above, which are deemed useful and important for the item of information. In this way, each different document or type of information includes a different specification of a system, so as a user moves through an information space, the system changes to match the task at hand. The user is free to accept or reject the proposed system specification at any time, and can always select from any of the processes available in their trading space.
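The idea of a document carrying its own proposed system configuration, which the user may then accept, trim or extend, can be sketched as follows. The Document type and the active_system function are hypothetical names introduced for this example; the classification paths are taken from the taxonomy shown earlier.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """A unit of information plus the system specification proposed for it."""
    uri: str
    proposed: list = field(default_factory=list)  # classification paths


def active_system(document, rejected=(), extras=()):
    """Return the processes in use once the user's choices are applied.

    rejected: proposed classifications the user declines.
    extras:   further classifications picked from the user's trading space.
    """
    rejected = set(rejected)
    chosen = [c for c in document.proposed if c not in rejected]
    for classification in extras:
        if classification not in chosen:
            chosen.append(classification)
    return chosen
```

Because the specification travels with the document, calling active_system on each document as the user moves through an information space yields a different system each time, while the rejected and extras arguments keep the final decision with the user.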

Fuzzy Components
While the view of the document as the primary object proves useful for providing a flexible hypertext model, the idea of simply providing a "system specification" is not ideal. Firstly, this raises the obvious question of who actually decides on this system specification for each document object? It seems natural to allow the author to specify the components, link data etc that a user should use, yet this can never be ideal in all cases. Indeed, this seems to be a failing with many hypertext systems - while the author has a useful level of expertise, and should surely contribute to the hypertext structure and behaviour, they can never predict how any given user will use the information. They do not know the level of expertise of a user, why they are viewing the documents, or what they hope to gain from them.

Furthermore, it seems very demanding to ask an author or user to define a particular system specification, to say that the optimum system consists of a given set of processes and nothing else. The current notion of a system specification which is associated with each document defines the hypertext system as a fixed set of processes, and excludes any processes outside of this set. However, it seems intuitive to suggest that some processes may prove more useful than others - many users use simple linking models and navigation tools, but may wish to use some of the more complex components from time to time. Also, it seems natural to allow this "ideal" system specification to reflect the demands and usage patterns of the hypertext community, and to change over time.

HIPPO supports this less rigid method of specifying systems by employing the notion of fuzzy logic, where objects are no longer forced to belong to, or be excluded from, a set, but are allowed to have a fuzzy membership value. For example, a typical system specification may specify several processes which are considered vital to a document, yet may also list a number of hypertext behaviours which, while not as important, may prove useful in some cases. Attaching "fuzzy values" to hypertext components and processes opens up the possibilities of adaptive hypertext and user modelling[24]. Fuzzy values can be altered to reflect the status of a user or some other criteria. For example, the fuzzy value associated with a process could be increased as more users find it useful. Adaptive methods are a relatively new development in the hypertext community, and have thus far only been applied to sets of links or interfaces. The use of fuzzy logic with hypertext processes opens up the possibilities for truly adaptive hypertext, where the actual behaviour and hypertext abstractions can adapt to meet the demands of the user.
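A fuzzy system specification of this kind might be represented as a mapping from classifications to membership values in the range 0 to 1. The sketch below is illustrative only: the class name FuzzySpecification, the reinforce update rule (moving a value a fixed fraction of the way towards 1 each time a user finds the process useful) and the threshold-based select are all assumptions, since the paper does not prescribe how fuzzy values are updated or consumed.

```python
def clamp(value):
    """Keep a membership value inside the interval [0, 1]."""
    return max(0.0, min(1.0, value))


class FuzzySpecification:
    """System specification with graded, rather than all-or-nothing, membership."""

    def __init__(self, memberships):
        # Maps a classification path to its membership value.
        self.memberships = {c: clamp(m) for c, m in memberships.items()}

    def reinforce(self, classification, rate=0.1):
        """Raise a value when a user finds the process useful (assumed rule)."""
        current = self.memberships.get(classification, 0.0)
        self.memberships[classification] = clamp(current + rate * (1.0 - current))

    def select(self, threshold):
        """Return classifications at or above the threshold, strongest first."""
        chosen = [c for c, m in self.memberships.items() if m >= threshold]
        return sorted(chosen, key=lambda c: -self.memberships[c])
```

A user who wants only the processes considered vital to a document could call select with a high threshold, while a lower threshold would also bring in the occasionally useful behaviours; repeated calls to reinforce let the specification drift towards what the community actually uses.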

Conclusion
This position paper has discussed some of the techniques used in the HIPPO hypertext system, which takes an alternative approach to integrating hypertext into the user's environment by abstracting the hypertext model into arbitrary hypertext processes. In this way, the user is free to decide which processes and which methods they use to integrate hypertext with their existing environment. A taxonomy for this process model is introduced, along with a brief discussion of the distributed methods employed in its implementation. The paper then introduces the idea of using the document as the unit of hypertext integration, so that the hypertext model changes to reflect the current task of the user. Finally, the notion of fuzzy logic is incorporated into the hypertext model - the author views this area as particularly interesting for future work. Current hypertext systems offer a very static model to the user, yet it seems useful to consider ways in which users can be given a less rigid definition of a hypertext model, which changes to reflect the needs of the users.

References

[1] Robert M. Akscyn, Donald L. McCracken, and Elise A. Yoder. KMS: A Distributed Hypermedia System for Managing Knowledge in Organisations. Communications Of The ACM, 31(7):820-835, Jul 1988.
[2] Kenneth M. Anderson, Richard N. Taylor, and E. James Whitehead Jr. Chimera: Hypertext for Heterogeneous Software Environments. In Proceedings of the 1994 ECHT Conference, pages 94-107, Sep 1994.
[3] Helen Ashman. Issues in the Use of External And Remote Services in Hypermedia Systems. Submitted to 2nd Workshop on Incorporating Hypertext Functionality into Software Systems (part of Hypertext 1996).
[4] Michael Bieber. Issues in Modeling a "Dynamic" Hypertext Interface for Non-Hypertext Systems. In Proceedings of the 1991 Hypertext Conference, pages 203-217, Dec 1991.
[5] Michael Bieber. What Every Information Systems Developer Should Know About Hypertext. Submitted to 2nd Workshop on Incorporating Hypertext Functionality into Software Systems (part of Hypertext 1996).
[6] Steven J. DeRose. Expanding the Notion of Links. In Proceedings of the Hypertext 1989 Conference, pages 249-257, Nov 1989.
[7] Andrew M. Fountain, Wendy Hall, Ian Heath, and Hugh Davis. Microcosm: An Open Model For Hypermedia With Dynamic Linking. In Hypertext: Concepts, Systems and Applications. The Proceedings of the European Conference on Hypertext, pages 298-311, France, 1990. Cambridge University Press.
[8] Franca Garzotto, Paolo Paolini. HDM - A Model-Based Approach to Hypertext Application Design. ACM Transactions on Information Systems, 11(1):1-26, Jan 1993.
[9] Kaj Gronbaek and Randall H. Trigg. Design Issues for a Dexter-Based Hypermedia System. Communications Of The ACM, 37(2):41-49, Feb 1994.
[10] Frank G. Halasz. Reflections On Notecards: Seven Issues For The Next Generation Of Hypertext Systems. Communications Of The ACM, 31(7):836-852, Jul 1988.
[11] Frank Halasz and Mayer Schwartz. The Dexter Hypertext Reference Model. Communications of the ACM, 37(2):27-39, Feb 1994.
[12] Kathryn Malcolm. Industrial Strength Hypermedia: Requirements for a Large Engineering Enterprise. In Proceedings of the 1991 Hypertext Conference, pages 13-23, Dec 1991.
[13] The Netscape Plug-in Developer's Guide. URL: http://www.netscape.com
[14] Amy Pearl. Sun's Link Service: A Protocol For Open Linking. In Proceedings of the 1989 Hypertext Conference, pages 137-146, Nov 1989.
[15] Antoine Rizk and Louis Sauter. Multicard: An Open Hypermedia System. In Proceedings of the 1992 ECHT Conference, pages 4-10, Milano, Nov 1992.
[16] John L. Schnase, John J. Leggett, David L. Hicks. Open Architectures For Integrated Hypermedia-Based Information Systems. In Proceedings of the Twenty-Seventh Annual Hawaii International Conference on System Sciences, pages 386-395, Jan 1994.
[17] P. David Stotts and Richard Furuta. Petri-Net-Based Hypertext: Document Structure with Browsing Semantics. ACM Transactions on Information Systems, 7(1):3-29, Jan 1989.
[18] Randall H. Trigg. TEXTNET: a Network-Based Approach to Text Handling. ACM Transactions on Office Information Systems, 4(1):1-23, Jan 1986.
[19] Uffe Kock Wiil and John J. Leggett. The HyperDISCO Approach to Open Hypermedia Systems. In Proceedings of the Hypertext 1996 Conference, pages 140-148, Washington DC, March 1996.
[20] Uffe Kock Wiil and John J. Leggett. Hyperform: Using Extensibility to Develop Dynamic, Open and Distributed Hypertext Systems. In Proceedings of the 1992 ECHT Conference, pages 251-261, Milano, Dec 1992.
[21] Nicole Yankelovich, Bernard J. Haan, Norman K. Meyrowitz, and Steven Drucker. Intermedia: The Concept and Construction of a Seamless Information Environment. Computer, pages 81-96, Jan 1988.
[22] Peter J. Nurnberg, John J. Leggett, and John L. Schnase. Hypermedia Operating Systems: A New Paradigm for Computing. In Proceedings of the Hypertext 1996 Conference, pages 194-202, Washington DC, March 1996.
[23] MIME (Multipurpose Internet Mail Extensions): Mechanisms for specifying and describing the format of Internet message bodies. Network Working Group, Request for Comments (RFC) 1341, Jun 1992.
[24] Peter Brusilovsky. Methods and techniques of adaptive hypermedia. User Modeling and User Adapted Interaction, 6(2-3), 1996.