
October 2000
Vol. 3, No. 8, pp. 81–83.

Cooperative toxicology

A new toxicology consortium may bring together competitors to further science and save both money and animals.

In our competitive economy, sharing important information is often out of the question: company data are kept under lock and key. However, exchanging data becomes feasible as projects grow larger than individual companies can handle. Numerous consortia have been formed in various scientific disciplines, but one particular area has been left wanting: toxicology.

Predicting a compound’s toxicity has been difficult because, although prediction software has improved, the available data set has simply been too small. A substantial amount of public data exists, but it is scattered across myriad places. In addition, most companies conducting pharmaceutical or other consumer chemical research have databases of compounds and associated toxicities, but these fragmented sources rarely contain more than a few thousand compounds. For a toxicity prediction system to be effective, a database of a few hundred thousand to a few million compounds is needed.

The International Toxicology Information Center (ITIC) consortium was founded to serve the need for large amounts of toxicology data in one location. From humble beginnings, the consortium has entered the pilot stage, and supporters anticipate benefits to the entire chemical and pharmaceutical industry. The consortium hopes to enable companies to predict the toxicology of small molecules, thus avoiding the expense of additional in vitro and in vivo testing.

History of the consortium

ITIC is based on an idea that originated with David Tennant when he worked for the Ministry of Agriculture, Fisheries, and Food in the United Kingdom. He presented a draft proposal to the International Life Sciences Institute Europe (ILSI) for sharing toxicological data from commercial and academic sources. However, ILSI discovered that companies were not willing to share data, even nonconfidential data, with competitors.

In the United States, a similar idea caught on with Procter & Gamble employees who approached Unilever to gauge its interest in such a project. Unilever was aware of Tennant’s project and indicated further interest in the idea.

In 1996, Procter & Gamble and Unilever asked Philip Judson to conduct a feasibility study. Judson, of LHASA, Ltd., and the University of Leeds, contacted about 30 organizations to determine whether there was interest in forming a toxicology consortium and found that all but one organization supported the principles of the project. “About half thought it was a practical idea, while the other half thought it was good in principle, but feared it would not be acceptable to company legal departments. The one dissenting voice came from ILSI in Europe at that time, because of their previous conclusions,” Judson said.

The first meeting of interested parties occurred in Aspen, CO, in July 1997. Although members decided that the initiative should move forward, the original funding for the project had run out. During 1998 and 1999, the members decided on the structure of the project and its goals, although there was still no funding. ILSI was asked to coordinate the project and accepted the task.

In mid-2000, five companies agreed to make small contributions, and the British government awarded the project a grant of about $20,000 as a “pilot project aimed at reducing dependency on animal experiments.” These events set the project in motion. The ILSI Health and Environmental Sciences Institute is organizing the project, and LHASA is supporting the organization with its database experience. LHASA is a U.K.-based nonprofit that currently runs collaborations in which companies share information to build rules for knowledge bases.

Toxic DBs
The ITIC is not alone in its attempt to reduce animal use in toxicology testing. Gene Logic is building its GeneExpress gene expression database, which contains a module, called ToxExpress, that focuses on toxicology. This module is a reference database with gene expression profiles of rat (tissues and primary cells) and human (primary cells) samples that have shown toxic responses to known compounds. End users can test new compounds by comparing the gene expression profile of their compound with the profiles of known toxins in the ToxExpress module. This module is not a competitor to the ITIC project, as it covers a different means of toxicology prediction. A gene-expression method and a structure-based method of toxicology prediction could be used to complement each other and further reduce animal studies.

The current draft proposal

The pilot program’s first phase will focus on selecting a database format, developing the database, filling it with public data, and testing the database. The selection and development of a database format are discussed below. The database, once decided upon, will be filled with toxicology information from various public sources.

The second phase of the project will be an extensive review of the pilot database. If progress is satisfactory, the project will move into the third phase—developing a proposal for a full-scale database. The full-scale proposal will be presented to sponsors and the scientific and industrial communities to gain additional support.

While the pilot database will use public information, the use of private information is much more complicated. Private companies that donate data may withhold the most useful data for fear of divulging secrets. This could skew the database and narrow its usefulness. One solution under consideration is to construct a private database that contains commercial toxicology data. Access to this private database would be controlled by a trusted third party, ensuring that a company’s data would not be viewed by the public or by other companies.

Researchers could then develop prediction methods on public and private data and make them available to the public. The actual information would never be released; however, it would help build reliable “rules” for predicting toxic substances.
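As a rough illustration of how pooled data could yield rules without exposing the underlying records, the Python sketch below tallies how often a structural fragment appears in toxic compounds and reports only those aggregate fractions. Everything in it is an assumption made for illustration: the function name `derive_rules`, the fragment labels, the `min_support` threshold, and the sample records are invented and do not describe the consortium's actual method or data.

```python
from collections import defaultdict


def derive_rules(records, min_support=2):
    """Return {fragment: fraction of records containing it that are toxic}.

    `records` is an iterable of (fragment_labels, is_toxic) pairs.
    Fragments seen in fewer than `min_support` records are left out, so a
    rule never rests on (or points back to) a single compound.
    """
    seen = defaultdict(int)
    toxic = defaultdict(int)
    for fragments, is_toxic in records:
        for fragment in fragments:
            seen[fragment] += 1
            if is_toxic:
                toxic[fragment] += 1
    return {f: toxic[f] / seen[f] for f in seen if seen[f] >= min_support}


# Invented placeholder records; the fragment labels and flags are not real results.
public_records = [({"nitro", "ring"}, True), ({"hydroxyl"}, False)]
private_records = [({"nitro"}, True), ({"hydroxyl", "ring"}, False), ({"ring"}, True)]

# Only this dictionary of aggregate rules would leave the trusted third party.
rules = derive_rules(public_records + private_records)
```

In this sketch the individual private records stay with whoever runs `derive_rules`; researchers outside would see only the per-fragment fractions.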

Technology involved
The coordinators of this project, ILSI and LHASA, have focused on the International Uniform Chemical Information Database (IUCLID), which is currently being used by the European Union (EU) as an information repository for so-called high-production-volume (HPV) chemicals. The standard was developed by the European Chemicals Bureau (ECB, part of the EU) and operates on an Oracle database (it also has a PC version that uses Microsoft Access). In addition to the EU’s use and development, IUCLID has been adopted by the Organization for Economic Cooperation and Development (OECD) and the U.S. Environmental Protection Agency (EPA) for HPV Screening Information Data Set (SIDS) data collection. The EPA is developing a user-friendly PC version that will allow additional data for U.S. compliance to be submitted while still allowing the same data to be submitted to the international versions of IUCLID. In addition, a Web module will be built by the EPA to provide searching of public data from the U.S. HPV challenge program.

IUCLID’s biggest advantage is its widespread use and proven track record; however, there are serious drawbacks in the existing versions. The primary problem is that it is not currently used to store data on chemical structures. One of the main goals of this pilot project is to facilitate searching based on structure–activity relationships (SAR) to find substructures that could contribute to toxicity. Therefore, IUCLID must be modified to fully handle data on chemical structures. Two secondary problems also hinder IUCLID’s adoption: IUCLID cannot store raw documents that are associated with tests, and it doesn’t allow researchers to assign a quality evaluation for the data. The ITIC design encourages researchers to assess the quality of information that studies contain.
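To make the three missing capabilities concrete, here is a minimal Python sketch of a record that carries a chemical structure, a quality rating, and links to raw documents, together with a simple search by structural fragment. It is not the IUCLID schema or the ITIC design. The class `ToxRecord`, its field names, the 1-to-4 quality scale, the fragment labels, and the sample entries (including their toxic flags) are all hypothetical. A real system would compute substructure matches from the structure itself rather than rely on pre-assigned labels.

```python
from dataclasses import dataclass, field


@dataclass
class ToxRecord:
    compound_id: str
    smiles: str            # chemical structure stored as a SMILES string
    fragments: frozenset   # pre-assigned structural fragment labels (assumption)
    toxic: bool            # placeholder flag, not a real test result
    quality: int           # hypothetical scale: 1 (poor) to 4 (reliable)
    documents: list = field(default_factory=list)  # names of raw study files


def search(records, fragment, min_quality=1):
    """Return records that contain `fragment` and meet the quality threshold."""
    return [
        r for r in records
        if fragment in r.fragments and r.quality >= min_quality
    ]


# Invented sample entries for illustration only.
RECORDS = [
    ToxRecord("cmpd-1", "c1ccccc1[N+](=O)[O-]",
              frozenset({"aromatic_ring", "nitro"}), True, 4, ["study-001.pdf"]),
    ToxRecord("cmpd-2", "Nc1ccccc1",
              frozenset({"aromatic_ring", "primary_amine"}), True, 3),
    ToxRecord("cmpd-3", "CCO", frozenset({"hydroxyl"}), False, 2),
]

hits = search(RECORDS, "aromatic_ring", min_quality=4)
```

Here `hits` holds only the first entry, because the second contains the fragment but falls below the requested quality rating.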

The EPA, the OECD, and the EU may correct these shortcomings in the new versions of IUCLID that are in development. For instance, the proprietary database maintained by the ECB uses more data fields than the public version. Perhaps the most important aspect of IUCLID’s development is that the ECB is willing to work with the toxicology consortium and other international bodies to modify the IUCLID system.

Experience with other consortia
Numerous consortia in other disciplines offer parallels to the ITIC. Two of them, the Linguistic Data Consortium (LDC) and the single nucleotide polymorphism (SNP) consortium, show the attributes needed to secure industry support.

The LDC (Philadelphia) uses broadcasts and other private news sources, in speech and in text, to populate its databases. The large volume of data collected is used by a variety of research efforts. Christopher Cieri of the LDC indicated that many of the consortium’s members “work in automatic speech recognition, information retrieval (search engines), other natural language processing (parsing), speech synthesis, linguistics, language teaching, and speech pathology.” The original goal of the LDC was to serve the need for large amounts of language data; however, Cieri said, “The LDC has expanded beyond its original goal as data publisher to also provide data collection and annotation services.” These additional services directly benefit the industry as the consortium ages. Although the ITIC consortium has no plans to provide collection, annotation, or archival services as the LDC does, it could in the future consider storing and organizing private data.

Consortia benefits
Karluss Thomas of the International Life Sciences Institute described the major benefits of the International Toxicology Information Center:
  • reducing the number of animals necessary to develop safety data for pharmaceutical and chemical products,
  • improving structure–activity relationships (SARs) as a predictive tool for toxicology,
  • using early toxicological endpoints to predict long-term toxicological responses, and
  • reducing redundant testing by data sharing.

The SNP consortium is identifying and mapping single nucleotide polymorphisms (SNPs) that occur in the human genome. This consortium deals with new data and makes all information public. The companies involved are contributing relatively small amounts of capital and receiving enormous amounts of information, although other companies can access the data as well. The endeavor will assist the entire industry and is also seen as an advancement for science and a public relations benefit for those involved.

The ITIC is similar to these two examples in providing valuable services to the industry it serves. Like the LDC, the ITIC will provide large amounts of data that are needed for studies that will serve the industry. Like the SNP consortium, the ITIC will serve the advancement of science and the reduced use of animals, both of which are publicly supported endeavors.

The “Big Picture”

The ITIC has a significant challenge ahead as it attempts to convince industry competitors to cooperate for the greater good of the industry and the scientific community, with the additional benefit of reducing animal testing. Consortia have been effective in producing economic returns in other industries, but the benefits are difficult to measure. The toxicology consortium is no different. However, sharing toxicology data appears to benefit the industry both financially and socially.
