Introduction
The importance of information systems is often underestimated by the programming
community. While high standards for coding, bug fixing and behavior have been established and
acknowledged, information systems (documentation and web sites) rarely receive the same
pursuit of perfection. This paper describes some ideas for the design and creation of such
systems, as well as guidelines for their management. It is mainly intended for
active developers and people interested in understanding the project management
processes.
A container for projects
The Java Apache Project started as a single project (the Apache JServ servlet engine)
and later became involved in the general idea of Java software creation. For this reason,
the whole information system is designed so that many independent projects may
be contained and glued together by the same resources (web site, mailing lists, CVS
repository). These resources offer a common ground between the hosted projects, reduce
the overhead of general project management, and save time that can be dedicated to more
productive (and fun) activities.
Main goals
- Reduce the work
- Volunteer projects usually have a hard time finding people willing to spend their time
writing documentation or improving, rewriting or porting what already exists. One of the major issues in
designing good information systems is to minimize the work contributors must do
to manage and enhance the systems.
- Esthetic issues
- Esthetic issues do have a place in information system design, even in places where such
needs may appear to be of secondary importance. While trying to keep the overhead of esthetic
content to a minimum, a pleasant and original graphic style can help a project to acquire
new subscribers by attracting their attention. The Mozilla web site was an inspiration for both graphics and design ideas.
- Heritability
- Another big issue in volunteer projects is keeping the process separate from the
contributors. This allows easy replacement when people leave the project or become
unavailable for a while. Bottlenecks caused by depending on particular people should be avoided by documenting every aspect of the
development and writing guidelines (like this paper) on how things were done and how
they should be done in the future.
- Flexibility
- Even if every aspect of the development process is documented, flexibility and
adaptability should always be considered important issues. Software projects, like the people
who develop them, are subject to change. The project's processes should remain as flexible
as the project itself, so that they are easy to adapt as the project evolves.
The java.apache.org web site
The project's web site is the portal to all the resources hosted by the Java Apache
Project. It is the front door of the information system, which includes all the subprojects'
documentation and glues it together. The HTML structure of the web site is designed
around this idea. The graphics are separated from the content by using frames and
by placing graphic files and informative files in different directories. Although the site was
designed to be pleasant and well rendered on graphic displays at all resolutions,
it can also be browsed with browsers on text-only systems, so that the widest
possible range of users has access to the information it contains. We
understand that in many countries, text-only internet access is the rule, not the
exception.
The web site is composed of two parts: the navigation bar in the left frame and the
browsing area in the main frame. The navigation bar uses the same file that text-only
browsers use as a table of contents. This avoids the overhead of maintaining two different
files for the same thing.
The tree structure of the web site reflects its design ideas:
- / the root directory contains the index.html page as well as the graphic frame pages.
This directory should not contain any information, only graphic content that does not
change. This allows better separation between graphics and content, which is important for
dynamic content generation.
- /main the main directory contains all the general documentation shared between
projects as well as the table of contents file used for the navigation bar.
- /images this directory contains the graphic content for the main site. Projects
store their graphics in their own directories (see below).
- /images/ads this directory contains the images used as advertisements on sites
that donate advertising space.
- /xxx every project has its own directory, which should contain an exact
copy of the HTML documentation stored in the CVS module for that project.
- /xxx/images this directory stores the graphic content for that project.
This allows the documentation to be valid whether it is browsed from a distribution or from the web
site.
- /xxx/dist the distribution directory where product releases are placed. Every
project has its own. This directory should also contain the HEADER.html and README.html
files used by the web server to add graphic contents and other information to the
directory listing. This directory should not be contained in the distributed HTML
documentation.
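The layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the project directory name "jserv" are hypothetical and do not come from the project itself. It shows which site paths belong to a project and which of them would also live in the project's CVS module (everything under /xxx except /xxx/dist).

```python
from pathlib import PurePosixPath

# Shared directories, as listed in the tree structure of the web site.
SHARED_DIRS = {
    "root": PurePosixPath("/"),
    "main": PurePosixPath("/main"),
    "images": PurePosixPath("/images"),
    "ads": PurePosixPath("/images/ads"),
}


def project_dirs(project):
    """Return the per-project directories (the /xxx entries of the list)."""
    base = PurePosixPath("/") / project
    return {"docs": base, "images": base / "images", "dist": base / "dist"}


def _is_inside(path, directory):
    """True if path is the directory itself or lies somewhere below it."""
    return path == directory or directory in path.parents


def in_cvs_module(project, site_path):
    """True if a site path is part of the documentation kept in CVS.

    Assumption: everything under /<project> is mirrored from the CVS
    module except the distribution directory /<project>/dist.
    """
    dirs = project_dirs(project)
    path = PurePosixPath(site_path)
    if not _is_inside(path, dirs["docs"]):
        return False
    return not _is_inside(path, dirs["dist"])
```

For the hypothetical "jserv" project, /jserv/images/logo.gif would count as part of the CVS-managed documentation, while /jserv/dist/README.html would not.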
Subproject documentation
Each subproject directory should contain a copy of the documentation section found in
the project's CVS module. This allows the web site to be updated automatically, without human
intervention, simply by updating the directory with a CVS client. The HTML documentation for
every project must be designed with this idea in mind: it should be possible to simply
move the documentation onto the web site without breaking any link. For this reason,
relative links must be used between files, and absolute links must be used when they refer
to directories hosted on the web site. In fact, some directories present on the web site
(e.g. the distribution directory) should not be placed in the CVS module. Each subproject
must have its own graphic content (logo, snapshots, etc.) stored in a subdirectory
of the main documentation directory.
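The linking rule can be made concrete with a small checker. The following is a minimal sketch, not a tool used by the project: the function names are hypothetical, and it assumes that the distribution directory is called dist and that page paths are given relative to the root of the documentation directory. A relative link is treated as portable when it stays inside the documentation directory and does not point into dist, which exists only on the web site.

```python
import posixpath
from urllib.parse import urlsplit


def classify_link(href):
    """Classify a link as 'external', 'site-absolute' or 'relative'."""
    parts = urlsplit(href)
    if parts.scheme or parts.netloc:
        return "external"
    if parts.path.startswith("/"):
        return "site-absolute"
    return "relative"


def relative_link_is_portable(page, href):
    """Check that a relative link works both on the site and in a distribution.

    page is the path of the linking page relative to the documentation
    root (for example 'install/index.html'); href is the relative link.
    """
    # Resolve the link against the directory of the page that contains it.
    target = posixpath.normpath(
        posixpath.join(posixpath.dirname(page), urlsplit(href).path)
    )
    # Links that climb out of the documentation directory break when moved.
    if target == ".." or target.startswith("../"):
        return False
    # The distribution directory is not shipped with the documentation.
    return not (target == "dist" or target.startswith("dist/"))
```

Under these assumptions, a link such as images/logo.gif from index.html is portable, whereas ../main/index.html or dist/ should instead be written as absolute links to the web site.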
Since each subproject has only one documentation directory, shared by the distribution
and the web site, the documentation must be complete in all its parts and limit
external or absolute links to those resources that should not be included in the product
distribution. This allows both complete offline browsing and simple web site
management across different projects, since the process can be automated.
Very Important: one other thing to keep in mind is that the web site
will show these pages in a frame. For this reason, every link to an external resource MUST
have target="_top" set, to avoid mixing Java Apache information with
other resources in a way that could create legal problems. Any external link found in the
documentation without the proper target must be considered a bug and corrected.
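Because a missing target is to be treated as a bug, the rule lends itself to an automatic check. The sketch below uses Python's standard html.parser module; the class and function names are hypothetical, and it assumes that "external" means an http, https or ftp link or any link that names a host. It is one possible way to find offending links, not the project's own tooling.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class ExternalLinkChecker(HTMLParser):
    """Collect external links that are missing target="_top"."""

    def __init__(self):
        super().__init__()
        self.violations = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if not href:
            return
        parts = urlsplit(href)
        # Assumption: only these schemes, or an explicit host, count as external.
        is_external = parts.scheme in ("http", "https", "ftp") or bool(parts.netloc)
        target = (attributes.get("target") or "").lower()
        if is_external and target != "_top":
            self.violations.append(href)


def find_untargeted_external_links(html):
    """Return the href of every external link without target="_top"."""
    checker = ExternalLinkChecker()
    checker.feed(html)
    checker.close()
    return checker.violations
```

Running find_untargeted_external_links over each HTML file in a project's documentation directory would list the links that need to be corrected; relative links between documentation files are ignored.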
Other documentation media
HTML was designed for distributed information systems, but its nature does not allow
it to be ported cleanly to other media. While this has been partially addressed by the introduction
of media abstraction in CSS (Cascading Style Sheets), we analyse other documentation
formats to evaluate their benefits for the project.
- HTML is the standard for
hypertext documentation systems. It has a human readable and editable format, it is simple,
open and standard, and editors are available for every operating system. Its main problem
is the lack of media abstraction, which gives poor results on media other than computer
displays (paper, speech, braille devices, etc.).
- TeX is a well-known typesetting format but, like HTML, it is tied to one
medium, in this case print. Tools exist to convert TeX files to other formats
(plain text, info, HTML, etc.), but for the same reason that HTML produces poor printed
material, TeX-generated HTML files make poor use of hypermedia content such as links
and graphics.
- DocBook (SGML) is an SGML DTD designed to be a container for well
described documentation information. While DocBook files are not viewed or printed
directly, tools exist to convert these files to both print media (dvi, ps, pdf) and
hypertext media (html). The Linux Documentation Project as well as O'Reilly use this
system for their documents and books.
- DocBook (XML) is the (as yet unofficial) XML port of the SGML DocBook DTD. Porting
DocBook to XML would allow the use of the new languages that are currently being
standardized by the World Wide Web Consortium to enhance
portability and media abstraction. The use of XSL (Extensible Stylesheet Language)
together with XLink (XML Linking Language) would give such a documentation
format great flexibility, because content (XML), presentation (XSL) and hierarchy (XLink) may be separated
into different files. Browsers and automatic tools may then choose to "shape" the
information in the most useful manner for the medium being used, allowing complete media
abstraction. Although major efforts have been committed to the design of such a flexible
information system, there are currently no standards available for XSL and XLink (only
drafts and proposals) and very few research tools work with them.
Conclusions
While many issues in the creation of information systems have been addressed by this
paper, better tools and documentation formats are needed to significantly improve the
usability and usefulness of such systems. XML and the other languages that are now being
researched and standardized offer a promising alternative to old operating-system-oriented
tools and to languages such as HTML that are portable but not abstract enough. Currently, the Java
Apache Project uses plain text and HTML as documentation formats, but will move to more
advanced and flexible documentation systems based on XML when (and if) they are
standardized and freely available.