Cover Story
October 1, 2007 - Volume 85, Number 40, p. 16
The Semantic Web
Pharma Researchers Adopt An Orphan Internet Standard
Rick Mullin
The World Wide Web Consortium (W3C), the Internet standards arbiter that developed the HTML markup language, released a set of standards several years ago known collectively as the semantic Web. It operates on linkages between data called triples, in which two URI-identified pieces of information are connected by an explicitly named relationship in a kind of subject-verb-object arrangement. The semantic Web has gained far less momentum with programmers than HTML, which can be searched on the basis of written language.
However, the so-far-neglected standard, which relies on extensive and standardized coding of Web-searchable data and documents, may soon be adopted by the big drug companies, where a coterie of information technology (IT) specialists sees its potential for organizing R&D data and expediting drug discovery and development. In that setting, a triple might link a specific compound to a specific cellular receptor through a functional relationship.
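To make the subject-verb-object idea concrete, the sketch below builds a compound-receptor triple of the kind described above, using the open-source rdflib library for Python. The namespace, compound, and receptor names are hypothetical placeholders for illustration, not identifiers drawn from any of the projects discussed here.

```python
from rdflib import Graph, Namespace

# Hypothetical namespace, compound, and receptor names -- illustration only
EX = Namespace("http://example.org/pharma/")

g = Graph()

# One triple: subject (a compound), predicate (a named relationship),
# object (a cellular receptor)
g.add((EX.compoundX17, EX.bindsTo, EX.serotoninReceptor5HT2A))

# Serialize the graph in Turtle notation to show the subject-verb-object form
print(g.serialize(format="turtle"))
```

Because each element of the triple is a Web-style identifier rather than a cell in a table, data from different sources can be merged simply by adding more triples to the same graph.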
"Many pharmaceutical companies are exploring the use of the semantic Web," says Susie Stephens, principal research scientist for discovery and development informatics at Eli Lilly & Co. It is one of many avenues Lilly is investigating to develop a research IT regime, she says.
Stephens sees the Web-searching technology, which has both Internet and intranet applications, as a promising alternative to software-based data mining. "It's a more flexible approach to incorporate new data sets as you go along," Stephens says. "And I always say that data is more important than software."
Not surprisingly, the semantic Web poses a challenge to the research IT status quo. "Anyone who has been taught that the only way to represent data is in a relational database with columns and rows will have a hard time thinking of universal identifiers," says Eric Neumann, director of consultancy Clinical Semantic Group, referring to the linked triples of the semantic Web. He contends, however, that the Internet search standard actually makes data more fluid and accessible than a conventional spreadsheet-style database, where information is locked into specific contexts.
"The semantic Web gets rid of the parsing problem that goes along with that kind of database, as long as there is a clear definition of relationships between things," he says. "But, frankly, the IT groups have never looked at the problem like that. They are completely stunned at first by it."
John Wilbanks, executive director of the Science Commons, a spin-off of Creative Commons that develops routes to legal sharing of copyrighted scientific documents and data, sees a critical mass of IT-savvy researchers enthusiastically pursuing projects using the semantic Web. He compares their efforts to pioneering work on the Internet itself.
"Around 1995 or 1996, all the subterranean work exploded," Wilbanks says, "and most people discovered the Web. What is happening now on the semantic Web is similar to what was going on in the five years leading up to that explosion."
Science Commons, in association with W3C, recently launched a demonstration project called Neurocommons to illustrate the benefits of the semantic Web in neurological disease research. Neumann, who is involved with both Science Commons and W3C, says Neurocommons has put together "vocabularies describing different parts of the brain. We are doing federated physiological and neuroscientific queries where people are able to take inputs and do a variety of different things with them."
According to Stephens, while she and her drug industry colleagues are involved with Science Commons and other collaborative efforts, Lilly, like most other companies, is concentrating on developing semantic Web applications internally. While the standard would, ideally, be developed in an open environment, she says it provides tools that can be used now by companies on their own to improve data management and communication among researchers.
Wilbanks agrees. He says companies will eventually have to adapt in-house semantic Webs to a broader standard that expedites collaborative research between companies and institutions. Such a standard will most likely emerge as in-house projects "boil over" and merge. "There are enough databases and enough smart people involved," he says. "You can really see the momentum now."
Cover Story
- The Big Picture
- Drug firms forge an information management architecture to take on the research data glut
- The Semantic Web
- Pharma Researchers Adopt An Orphan Internet Standard
- Target Practice
- Software and database vendors aim to supply researchers with tools to extract quality from quantity
- Electronic Lab Notebooks
- A Collaborative Tool Settles Into Drug Research