Chemical & Engineering News,
November 13, 1995

Copyright © 1995 by the American Chemical Society.

Chemists urged to increase activity in web development


The Internet is a sea of initials and acronyms - FTP (file transfer protocol), RTF (rich text format), GIF (graphics interchange format), HTML (hypertext markup language), and HTTP (hypertext transfer protocol), to name a few. With its unique applications on the World Wide Web (WWW), chemistry is promising to add even more.

WWW is a client-server domain on the Internet that makes it possible for computers anywhere in the network to share a file stored on the server, be it located in the next block or continents away. What makes the sharing possible is the concept of hypertext, by means of which one document, or web page, can be linked to another, forming a "web" of information. Each web page has its own address or uniform resource locator (URL).

For a user, the window to the web is a graphical user interface software program called a browser. Netscape Navigator from Netscape Communications Corp., Mountain View, Calif., is one of the more popular browsers in use today. The browser enables a user to tunnel from one document to another by clicking a mouse on hyperlinks - words, phrases, or icons coded in a special way that is indicated by underlining or color. The document may be a text, graphic, video, or audio file. Files may be displayed or downloaded.

Documents intended for display by the browser are encoded in HTML, which makes use of tags enclosed in angle brackets to let the browser know how to display the document. It also makes use of anchors - similar to tags but lengthier - for linking text to image, audio, or video files. Converter programs are available that can convert the text from a word processing program such as Microsoft Word or WordPerfect, saved in RTF, to a document encoded in HTML.
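As a rough illustration of the tags and anchors described above, the following Python sketch reads a small HTML page and lists the files its anchors point to. The sample page, the file names in it, and the helper names AnchorCollector and extract_links are invented for this example; only the html.parser module from the Python standard library is assumed.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect (href, link text) pairs from the anchor elements of a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the anchor currently open, if any
        self._text = []     # text seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


# Hypothetical page: tags in angle brackets control display,
# anchors link the text to an image file and a coordinate file.
SAMPLE_PAGE = """<html><body>
<h1>Caffeine</h1>
<p>View the <a href="caffeine.gif">structure diagram</a> or
the <a href="caffeine.pdb">3-D coordinates</a>.</p>
</body></html>"""


def extract_links(html_text):
    """Return the list of (target, link text) pairs found in html_text."""
    parser = AnchorCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


if __name__ == "__main__":
    for target, text in extract_links(SAMPLE_PAGE):
        print(target, "<-", text)
```

A browser does essentially the reverse: it renders the text between the anchor tags as an underlined or colored hyperlink and fetches the target file when the user clicks on it.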

Chemistry brings some unique features to this mix. Henry S. Rzepa, a reader in organic chemistry at Imperial College, London, and various colleagues have projects under way aimed at addressing some of the issues generated by the peculiarities of chemistry on the web.

The variable quality of chemistry sites is an issue chemistry shares with other areas. To some extent, Rzepa believes, the problem stems from a lack of tools. For example, early web servers had no mechanism to prevent "dangling hyperlinks," he explains. But structured servers such as one called Hyper-G are under development, he says, and promise much for the future because by definition one cannot insert or move a document without updating the link information in such a server.
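The idea of a server that cannot insert or move a document without updating its link information can be sketched in a few lines of Python. This is a toy model written for this article, not a description of how Hyper-G is actually implemented; the class LinkedStore and its methods are hypothetical names.

```python
class LinkedStore:
    """Toy document store that keeps link information beside the documents."""

    def __init__(self):
        self.docs = {}    # document name -> text
        self.links = {}   # source name -> set of target names

    def insert(self, name, text, targets=()):
        """Add a document; refuse links to documents that do not exist."""
        missing = [t for t in targets if t not in self.docs and t != name]
        if missing:
            raise ValueError(f"link targets do not exist: {missing}")
        self.docs[name] = text
        self.links[name] = set(targets)

    def move(self, old, new):
        """Rename a document and rewrite every link that pointed to it."""
        if new in self.docs:
            raise ValueError(f"document already exists: {new}")
        self.docs[new] = self.docs.pop(old)
        self.links[new] = self.links.pop(old)
        for targets in self.links.values():
            if old in targets:
                targets.discard(old)
                targets.add(new)

    def dangling(self):
        """List (source, target) pairs whose target is missing."""
        return sorted(
            (source, target)
            for source, targets in self.links.items()
            for target in targets
            if target not in self.docs
        )


if __name__ == "__main__":
    store = LinkedStore()
    store.insert("index.html", "home page")
    store.insert("benzene.html", "C6H6", targets=["index.html"])
    store.move("index.html", "home.html")
    print(store.links["benzene.html"])   # the link follows the move
    print(store.dangling())              # no dangling hyperlinks remain
```

Because links are stored separately from the document text, the store can check and rewrite them on every insert or move, which an early file-based web server had no way to do.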

Other issues are distinctive to chemistry, however. One is indexing. Text is relatively simple to index, Rzepa explains, but chemists making use of the web face a big challenge in indexing chemical content.

A Rzepa colleague, Peter Murray-Rust, head of protein structure computing at Glaxo Research & Development, Stevenage, England, has been actively pursuing a project in this area called chemical markup language (CML). He describes it as HTML with some chemistry added. Once the chemistry in a document is properly and extensively tagged, he says, it becomes much easier to index.
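Why tagging helps indexing can be shown with a small Python sketch. The molecule tag and its name and formula attributes below are hypothetical stand-ins invented for this example and are not actual CML syntax; the two sample documents are likewise made up.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical tagged documents. The <molecule> element and its
# attributes are illustrative assumptions, not real CML.
DOCUMENTS = {
    "doc1": (
        '<p>Extraction of <molecule name="caffeine" formula="C8H10N4O2">'
        "caffeine</molecule> from tea.</p>"
    ),
    "doc2": (
        '<p>Nitration of <molecule name="benzene" formula="C6H6">benzene'
        '</molecule> compared with <molecule name="caffeine" '
        'formula="C8H10N4O2">caffeine</molecule>.</p>'
    ),
}


def build_formula_index(documents):
    """Map each tagged molecular formula to the documents that mention it."""
    index = defaultdict(set)
    for doc_id, markup in documents.items():
        root = ET.fromstring(markup)
        for molecule in root.iter("molecule"):
            index[molecule.get("formula")].add(doc_id)
    return {formula: sorted(ids) for formula, ids in index.items()}


if __name__ == "__main__":
    for formula, doc_ids in sorted(build_formula_index(DOCUMENTS).items()):
        print(formula, doc_ids)
```

Without the tags, an indexer would have to guess from plain text which words are compound names; with them, the chemical content can be read directly from the markup.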

There are further chemical indexing challenges. One involves images. Here, Rzepa says, the chemical community has only just begun to address the issues through new graphics formats such as one called portable network graphics (PNG). And there is the further challenge of indexing three-dimensional metaphors such as virtual reality modeling language (VRML).

Rzepa sees indexing considerations as a subset of more encompassing concerns. Chemistry on the web is at a crossroads, he says. He notes that at the 1st International World Wide Web Conference in Geneva in May 1994, science had a significant presence. But by the second conference in Chicago in October 1994, science was reduced to a small room seating about 40 people. In Darmstadt, Germany, in April 1995 there was no explicit science agenda, nor will there be one in Boston for the meeting coming up next month. All of which indicates, Rzepa says, that the World Wide Web is now dominated by commercial activities, and as a result parts of the essential infrastructure for chemistry - such as development of a new version of hypertext markup language (HTML 3) and of indexing - are facing a less certain future.

Chemistry software houses are beginning to show an interest, as are publishers, Rzepa notes. "Only if they commit themselves fully," he maintains, "will we begin to get properly developed and supported tools for the chemistry community."

Chemistry faces a crossroads in the standards area as well, Rzepa adds. Efforts are being made to establish standards, such as chemical MIME (multipurpose Internet mail extensions). MIME types are standards that enable web browser software and Internet agents to identify the media content of a file as, say, image (GIF, for example), audio, or video, and to treat it according to the way the information is likely to be processed and presented. Current standard MIME types, however, cannot handle the generic media and presentation requirements that a chemical MIME type would be able to manage - for example, rendering molecular geometries in either 2- or 3-D, with some degree of navigation through chemical objects possible, and with control over how individual objects such as atoms are displayed.
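The role a MIME type plays can be sketched with Python's standard mimetypes module. The type name chemical/x-pdb, the file names, and the handler descriptions are assumptions made for this example; the article does not specify what names a chemical MIME type would use.

```python
import mimetypes

# A private registry, so the sketch does not alter the global type table.
registry = mimetypes.MimeTypes()

# Assumption for illustration: register a chemical media type for
# files ending in .pdb under the name "chemical/x-pdb".
registry.add_type("chemical/x-pdb", ".pdb")


def media_type(filename):
    """Return the MIME type guessed from the file name extension."""
    mime, _encoding = registry.guess_type(filename)
    return mime or "application/octet-stream"


def choose_handler(filename):
    """Pick a (hypothetical) action based on the major media type."""
    major = media_type(filename).split("/")[0]
    handlers = {
        "text": "display as text",
        "image": "display inline",
        "audio": "play",
        "video": "play",
        "chemical": "open in molecule viewer",
    }
    return handlers.get(major, "offer download")


if __name__ == "__main__":
    for name in ("caffeine.gif", "molecule.pdb", "archive.unknownext"):
        print(name, "->", media_type(name), "->", choose_handler(name))
```

The point of the sketch is the dispatch step: once a file is labeled with a chemical type rather than a generic one, the browser or a helper program can treat it as a molecule instead of as opaque data.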

One potential application for chemical MIME types is visualization and user manipulation of a molecular structure based on 3-D coordinate data, such as the PDB files that might be acquired through the web from the Protein Data Bank database compiled at Brookhaven National Laboratory. Web interfaces to molecular dynamics and quantum chemistry packages would be another example of a chemical MIME-type application.
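Before a structure can be displayed or manipulated, a viewer must first read the 3-D coordinates from the file. The Python sketch below does this for the fixed-column ATOM and HETATM records of the PDB file format. The three sample records use made-up coordinates rather than data from a real Protein Data Bank entry, and the function names are hypothetical.

```python
# Illustrative records only; the coordinates are invented for this sketch.
SAMPLE_PDB = """\
HEADER    ILLUSTRATIVE FRAGMENT
ATOM      1  N   ALA A   1      11.104   6.134  -6.504  1.00  0.00           N
ATOM      2  CA  ALA A   1      11.639   6.071  -5.147  1.00  0.00           C
ATOM      3  C   ALA A   1      12.500   5.000  -4.250  1.00  0.00           C
END
"""


def parse_atoms(pdb_text):
    """Extract serial number, atom name, residue name, and x, y, z."""
    atoms = []
    for line in pdb_text.splitlines():
        # Coordinate records start with ATOM or HETATM in columns 1-6.
        if line[:6] in ("ATOM  ", "HETATM"):
            atoms.append({
                "serial": int(line[6:11]),        # columns 7-11
                "name": line[12:16].strip(),      # columns 13-16
                "residue": line[17:20].strip(),   # columns 18-20
                "xyz": (
                    float(line[30:38]),           # x, columns 31-38
                    float(line[38:46]),           # y, columns 39-46
                    float(line[46:54]),           # z, columns 47-54
                ),
            })
    return atoms


def centroid(atoms):
    """Return the mean position, a natural center for rotating a view."""
    count = len(atoms)
    return tuple(sum(atom["xyz"][i] for atom in atoms) / count for i in range(3))


if __name__ == "__main__":
    atoms = parse_atoms(SAMPLE_PDB)
    for atom in atoms:
        print(atom["serial"], atom["name"], atom["residue"], atom["xyz"])
    print("centroid:", centroid(atoms))
```

A viewer launched for a chemical MIME type would go on to draw these atoms and let the user rotate them; the sketch stops at the parsing step.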

Efforts to establish such standards for chemistry have received a mixed response from the largely nonscientific standards bodies like the Internet Engineering Task Force, according to Rzepa. These organizations have not yet developed a robust mechanism for receiving proposals from scientists, he says.

Hence, Rzepa says, "Chemists need to be more proactive in developing the web for their own purposes, because that is the only way its potential will come to be realized."


