Institute of Bioinformatics

Cell line based compound prioritization and response prediction

Term: 1/2009 - 12/2010

Project funded by Johnson & Johnson Pharmaceutical Research and Development, a division of Janssen Pharmaceutica N.V.

Project supported by the Belgian government through project no. IWT 80536

Research Partners:
Hasselt University, Center for Statistics, Belgium
Johannes Kepler University, Institute of Bioinformatics, Austria

Topic:
The aim of this project is to use a diverse panel of cell lines, a novel high-throughput gene expression platform and innovative data analysis procedures to collect whole-genome expression data of compound treatments originating from positive phenotypic screen hits. This process would allow us to group chemically diverse molecules according to how they trigger similar biological mechanisms. Such an approach will influence different areas of drug-discovery-related research:

By introducing biological information into the selection process of compounds, we aim to incorporate mechanistic data on compounds early in the drug discovery pipeline.
Furthermore, a consistent grouping of compounds across cell lines of diverse genetic backgrounds should lead to hints on possible consequences of drug treatment that are essential for the determination of the mechanism of action of a new medical entity.
This information should also provide the researcher with potential leads for generating markers that indicate whether a patient will respond to a drug.

The pharmaceutical industry is facing major reductions in sales as their leading products go off patent. At the same time, pharma research has been slow in developing new drugs. Almost 50% fewer drugs were brought to the market in the years from 2002 to 2006 as compared to the last five years of the 1990s.
The solution that many in the sector favour is a shift toward biotech from chemistry-based drug development. "For all our amazing advances in the last 50 years, we are still working with the tools of the first pharmaceutical revolution…using advanced chemistry to treat disease symptoms," Lilly's Taurel said in a 2003 speech.
Drug discovery research involves several steps from target identification to preclinical development. The identification of a small molecule as a modulator of protein function and the subsequent process of transforming it into a series of lead compounds are key activities in modern drug discovery.
Currently this process involves the identification and qualification of a drug target. The modulation of the target via a small molecule is thought to have a beneficial effect on the patient as the target is hypothesized to be linked to the cause or the symptoms of a given disease.
As the identification of a compound based on such a single target focuses on the activity of a compound on this target, the initial identification of hit compounds does not include a broader characterization of the compound effects on a cell. Potential side activities are only identified in later toxicity studies. Furthermore, if a target later turns out to be insufficiently qualified (e.g. involved in other essential pathways or in other tissues besides the diseased tissue), compounds developed against this target cannot move forward.
The only parameters that are available at the early stages between identifying a hit and designing a series of lead compounds are chemical properties. While these are essential characteristics for further development (e.g. solubility), they are insufficient to provide further biological information on the effects of the molecule on the cell.
In other words, the biologically relevant data are acquired too late in the research process, which can lead to the termination of compound development projects at a time when substantial resources in chemistry and biology have already been spent.
A setting in which biologically relevant data on the effects of a compound on a cell could be obtained much earlier should prove beneficial to the overall success and length of the drug discovery and development process.

University of Linz
The group of Prof. Hochreiter applies machine learning methods to biological and medical data. The group has solid expertise in data analysis, especially for medical and biological data, and has recently developed feature selection and classification methods for high-dimensional and noisy measurements, which are relevant to this project. Feature selection methods for special tasks have also been developed in Hochreiter's group, e.g. to identify genes belonging to certain pathways. The group has developed the leading method for microarray data preprocessing (normalization and summarization), as their method FARMS won an international challenge (http://affycomp.biostat.jhsph.edu). Hochreiter's group introduced a protocol for gene selection that has gained a worldwide reputation. These methods are currently applied in a project analyzing the gene expression profile of choroidal melanoma in collaboration with the Department of Medicine (Hematology, Oncology and Transfusion Medicine) of the University Hospital Benjamin Franklin, Berlin. Another area of expertise of the Institute of Bioinformatics is databases and information systems for biological data.