April 2002
Vol. 5, No. 4, pp. 51–53.
sites and software

A league of IT’s own?

Alliances are forming as companies compete for the life-sciences market.

Scientists face a growing problem in today’s drug discovery world, namely, how to extract knowledge from the ever-increasing information that has resulted from the genomic revolution. With high-throughput screening technologies and the growing reliance on in silico techniques, the amount of data that scientists must manage, store, mine, and use has grown exponentially. In addition to data management issues, scientists now face information technology (IT) problems. They must decide which tools (computers, databases, software, servers, and networks) to use, and in what manner to use them. They also must find ways to simplify and accelerate drug discovery research.

The first step
Recognizing the potential of genomics- and proteomics-based technologies for drug discovery, a handful of large pharmaceutical companies, including GlaxoSmithKline, Merck & Co., Abbott, and Roche, embraced IT. These companies and others established strong informatics programs to support the exponential growth of data resulting from genomics and proteomics research. For these companies, the first step was to build an informatics infrastructure and then, either internally or externally, to add key applications in support of specific research goals.

In May 2001, Merck displayed a new strategy by acquiring Rosetta Inpharmatics, a genomics company, for about $600 million. Merck hopes to leverage Rosetta’s expertise in genomics, gain valuable bioinformaticists, and, most importantly, acquire new drug targets. Although this acquisition looks like a great move strategically, questions remain: Will a new blockbuster drug come out of the $600 million deal? Will this become the new paradigm in the pharmaceutical industry, with one large pharma following in another’s footsteps? We must wait and see. Ultimately, many different strategies will evolve to deal with the rising problem of information.

Paramount in importance is the way scientists think about and deal with IT—including the tools, software applications, genomic and proteomic databases, data integration, and infrastructure (hardware and middleware). For the research scientist, computers were initially used in experimental analysis. Data generated by a scientist would be analyzed and stored in one place. At that time, most “scientific discovery” was made in vitro. With the advent of networks and the Internet, however, drug discovery now occurs in silico.

With in silico techniques, scientists are faced with a new problem—how to access and integrate all the data. Whereas researchers used to work with data in relatively few formats and a limited number of locations, they now deal with heterogeneous data in repositories throughout the world.

For example, a scientist might need to access text files on the Internet, assay data in Sybase format at multiple locations, 2-D and 3-D chemical structures in Oracle databases, sequence data in SQL format, and toxicological data, not to mention antiquated legacy data in older databases. For the scientist, this is turning out to be a nightmare. Current IT infrastructure does not always provide simple access to data, and different data types are not easily integrated. According to Adel Mikhail, vice president of strategic development at LabBook, several companies are aware of these issues and are working to solve them. “The days of the large back-end database are over,” he says. “Products like eLabBook and DiscoveryLink understand this. They allow different types of data to be retrieved from multiple locations and can integrate it all on the fly, right from the researcher’s desktop.”

The new paradigm
In a recent survey conducted by Front Line Strategic Consulting (Foster City, CA), most life scientists responded that their main focus is now on the content and tools specific to their research. They have embraced these areas as priorities because of the new research paradigm, which includes the rapid increase in data volume and types, the need to access and integrate all the information, and pressures to accelerate drug discovery timelines. What is surprising in the survey is that scientists give little thought to how the content and tools work together and to the IT infrastructure and middleware supporting their research efforts. The survey also found that most researchers assume and accept that the IT infrastructure is a “given” in their companies.

Front Line’s finding is echoed by Brian Guza, of First Consulting Group’s Discovery Practice (Long Beach, CA), who says, “Informatics applications often gather momentum among a small, influential group of end users. Comprehensive informatics products are not currently available; scientific organizations are forced to buy multiple tools to meet their overall needs. By the time that IT resources become involved, it’s too late to evaluate software products against established standards in technology and infrastructure. As a result, many scientists use tools that are poorly integrated or ill-suited for the hardware and infrastructure in their organization.”

So what does this mean for a pharmaceutical or biotechnology company? Unless an organization plans for its informatics future with its researchers, an inefficient patchwork of applications and content will surround a stagnant infrastructure. Furthermore, although this patchwork may serve individual researchers’ needs, it will not efficiently meet the organization’s goals.

The end of the tunnel
Realizing that their life scientists were not computer engineers, the pharmaceutical and biotechnology industries looked to IT suppliers for help. Compaq, one of the first suppliers to enter the life sciences, recognized the growing need and business opportunity by forming a partnership with Celera (Rockville, MD) in 1998. Under the agreement, Compaq supplied Celera with the computing power to unravel the human genome. Commenting on the relationship, Michael Duggan, Compaq’s director of business development for the business critical solutions group, notes, “The relationship between Celera and Compaq resulted in the decoding of the human genome in record time. We anticipate an ongoing relationship (with Celera) in the future.”

In 2000, attention shifted from Compaq to IBM when IBM announced a $100 million life-sciences initiative. Under the initiative, IBM agreed to provide solutions in high-performance computing, infrastructure, data management, and integration. After launching the initiative, IBM stated that it would build the most powerful supercomputer in the world.

Blue Gene (the appropriately named supercomputer) will tackle predicting the 3-D structure of a protein—the most difficult problem in life sciences. IBM also announced an agreement with NuTec Sciences and the Winship Cancer Institute at Emory University, both in Atlanta, to develop an integrated information system that will allow physicians to tailor cancer treatment to a patient’s genetic makeup. Under the agreement, IBM will supply the world’s 10th largest supercomputer along with software for Web application serving, data management and integration, and information portals.

In 2001, IBM announced the upcoming launch of DiscoveryLink, the second prong of its life-sciences initiative, which focuses on helping the typical discovery scientist. DiscoveryLink is a middleware solution that lets a single query mine information from heterogeneous data sources. Software components called “wrappers” present each source as part of what IBM calls a “federated” database. Specifically, a wrapper translates researchers’ queries into structured query language (SQL) queries that can then be used to search for information in DB2 or Oracle’s database software. According to Sharon Nunes, the person responsible for DiscoveryLink and the director of IBM’s Life Science Solutions, “We are continually hearing about the problems heterogeneous data are causing life scientists. With DiscoveryLink (and a single query), scientists will have access to all their data streams (structural information, 3-D files, 2-D files, tables, flat files, etc.) in one virtual database.” Reflecting Nunes’ comments, Aventis Pharma has implemented DiscoveryLink to facilitate its drug discovery efforts.
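
To make the wrapper idea concrete, the following is a minimal sketch in Python and is not DiscoveryLink's actual interface. It assumes two hypothetical sources, a flat file of assay results and a small relational table of structures, and every name in it (ASSAY_CSV, flat_file_wrapper, build_federated_view, and so on) is invented for illustration. Each wrapper presents its source as plain rows, and a single SQL query then spans both. For simplicity the sketch copies the rows into one in-memory database; federated middleware of the kind described above would instead leave the data in place and dispatch translated queries to each source.

```python
import csv
import io
import sqlite3

# Hypothetical source 1: assay results kept in a flat file (CSV text here).
ASSAY_CSV = """compound_id,ic50_nm
C001,12.5
C002,480.0
C003,3.2
"""


def make_structure_db():
    """Hypothetical source 2: compound structures in a relational database."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE structures (compound_id TEXT, smiles TEXT)")
    db.executemany(
        "INSERT INTO structures VALUES (?, ?)",
        [("C001", "CCO"), ("C002", "c1ccccc1"), ("C003", "CC(=O)O")],
    )
    return db


def flat_file_wrapper(text):
    """Wrapper: present flat-file rows as (compound_id, ic50_nm) tuples."""
    for row in csv.DictReader(io.StringIO(text)):
        yield row["compound_id"], float(row["ic50_nm"])


def database_wrapper(db):
    """Wrapper: present rows from the structure database as tuples."""
    yield from db.execute("SELECT compound_id, smiles FROM structures")


def build_federated_view(assay_text, structure_db):
    """Expose both sources as tables in one in-memory 'virtual' database."""
    fed = sqlite3.connect(":memory:")
    fed.execute("CREATE TABLE assays (compound_id TEXT, ic50_nm REAL)")
    fed.execute("CREATE TABLE structures (compound_id TEXT, smiles TEXT)")
    fed.executemany("INSERT INTO assays VALUES (?, ?)",
                    flat_file_wrapper(assay_text))
    fed.executemany("INSERT INTO structures VALUES (?, ?)",
                    database_wrapper(structure_db))
    return fed


def potent_compounds(fed, max_ic50_nm):
    """Run one SQL query that joins data originating in both sources."""
    query = """
        SELECT s.compound_id, s.smiles, a.ic50_nm
        FROM structures AS s
        JOIN assays AS a ON a.compound_id = s.compound_id
        WHERE a.ic50_nm <= ?
        ORDER BY a.ic50_nm
    """
    return fed.execute(query, (max_ic50_nm,)).fetchall()
```

Calling potent_compounds(build_federated_view(ASSAY_CSV, make_structure_db()), 50) returns the two compounds whose assay value is at or below the cutoff, each paired with its structure, even though the two facts started out in different kinds of storage.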

More recently, IBM announced collaborations with several life-science specialists. In partnering with LION Bioscience (Heidelberg, Germany), IBM will combine its DiscoveryLink middleware with LION’s SRS integration platform to provide standard-setting information management capabilities. IBM also will provide the IT backbone for Proteome Systems’ (Woburn, MA) commercial offerings.

What does all this mean for life scientists and pharmaceutical companies? According to IBM, it means knowledge discovery. For a pharmaceutical company such as Aventis or Schering-Plough, it means the ability to efficiently mine the data contained in its numerous and often incompatible databases. For a biotechnology company, IBM’s approach means a one-stop shop for IT infrastructure.

In support of life science’s growing dependence on IT, other key suppliers, including Oracle, Hitachi, Sun, Motorola, and Agilent, have entered the field. In a highly publicized $185 million collaboration, Oracle, Hitachi, and Myriad Genetics have teamed up to map the human proteome (the repertoire of proteins in the human body) in less than three years. The union joins Myriad’s proteomic capabilities, Oracle’s software, and Hitachi’s expertise in electronics technology to improve understanding of the molecular basis for disease.

Sun Microsystems (Palo Alto, CA) has committed to the life sciences by forming an Informatics Advisory Council (IAC) and hosting an annual summit meeting. The IAC, a group of IT specialists from academia, industry, and public agencies, was formed to address the data analysis needs of the life sciences community and to discuss the future of standards, visualization, analysis solutions, and hardware and software platform requirements. Commenting on the underlying need for the IAC, Sun’s Sia Zadeh states, “Data integration is the number one challenge in the postgenomics era.” Sun has also formed alliances with software developers along the drug discovery value chain under its Discovery Informatics Program, in which it is working to promote the adoption of Java technology, a powerful, cross-platform programming tool that enables data sharing over the Web.

The Interoperable Informatics Infrastructure Consortium (I3C) is another group that has recently formed to address the “growing pains” felt in the life sciences. An international consortium led by major IT players, including IBM and Sun, the I3C now boasts more than 60 participants. Similar to the IAC, the I3C was formed to

  • address the need for open standards, protocols, administration, and technical infrastructure for the life sciences;
  • establish a common communications standard protocol that is extensible; and
  • promote discussion of current issues shaping technology evolution,
    development, and use.

However, not all IT companies think alike. Unlike Compaq, IBM, and Sun, the heavyweight Motorola has taken a different approach. Instead of producing the IT infrastructure and software to support informatics, Motorola is making the biochips, instrumentation, assays, and reagents, such as those used in its CodeLink Bioarray System, that researchers will use to conduct genomic and proteomic research. Motorola has also formed strategic alliances, such as that with the SNP Consortium, through which it will provide genotyping services. Key to Motorola’s success are its large size, expertise in manufacturing, and deep pockets. Along the same lines, Agilent Technologies (Palo Alto, CA) has launched several genomics-based products in support of discovery research.

The final analysis
In the life sciences, the two major issues facing researchers today are how to deal with the explosion of data in the postgenomics era and how to use that wealth of heterogeneous data to create knowledge efficiently. The major IT companies have entered the field, each taking one of several fundamental approaches.


Jon Meyer is a consultant and Jim Thompson is a partner at Front Line Strategic Consulting in Foster City, CA. Send your comments or questions regarding this article to mdd@acs.org or the Editorial Office by fax at 202-776-8166 or by post at 1155 16th Street, NW; Washington, DC 20036.
