
Projects closed


10.12.2015

Use read simulators to validate spike-in protocol

Objective: Most algorithms for detecting differentially expressed genes assume that the majority of genes do not change their expression level across conditions. This assumption runs through the complete RNA-seq workflow, starting with the use of a fixed amount of RNA for sequencing. Recently it has been shown that certain conditions, e.g. constant activation of a transcription factor, can affect gene expression globally. To detect these global changes, new experimental and computational approaches need to be developed. We propose to use cells from a foreign organism as spike-ins to calibrate for all sources of technical variation and to monitor global shifts in gene expression. ...
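The calibration idea can be illustrated with a toy calculation. The following is only a sketch under stated assumptions: every sample receives the same amount of spike-in RNA, sequencing returns expected (noise-free) read counts, and all gene names, amounts, and function names are hypothetical rather than part of the project's protocol.

```python
import math


def expected_counts(rna_amounts, depth):
    """Expected reads per gene when a fixed number of reads is sequenced."""
    total = sum(rna_amounts.values())
    return {gene: depth * amount / total for gene, amount in rna_amounts.items()}


def library_size_normalize(counts, spike_genes):
    """Scale endogenous counts by the total endogenous read count."""
    endogenous = {g: c for g, c in counts.items() if g not in spike_genes}
    total = sum(endogenous.values())
    return {g: c / total for g, c in endogenous.items()}


def spike_in_normalize(counts, spike_genes):
    """Scale endogenous counts by the total spike-in read count."""
    spike_total = sum(counts[g] for g in spike_genes)
    return {g: c / spike_total for g, c in counts.items() if g not in spike_genes}


# Hypothetical RNA amounts: every endogenous gene doubles in the treated
# sample, while the spike-in amount is identical in both samples.
spikes = {"spike1"}
control = {"geneA": 500.0, "geneB": 500.0, "spike1": 100.0}
treated = {"geneA": 1000.0, "geneB": 1000.0, "spike1": 100.0}

control_counts = expected_counts(control, depth=1_000_000)
treated_counts = expected_counts(treated, depth=1_000_000)

for name, normalize in [("library size", library_size_normalize),
                        ("spike-in", spike_in_normalize)]:
    c = normalize(control_counts, spikes)
    t = normalize(treated_counts, spikes)
    print(name, {g: round(math.log2(t[g] / c[g]), 3) for g in c})
```

In this toy setting, scaling by library size hides the global doubling, whereas scaling by the spike-in reads recovers it. Real data adds sampling noise and variation in the spike-in amount itself, which this sketch ignores.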


10.12.2015

Predicting proteomes from transcriptomes

Objective: To make a protein, a cell needs an mRNA. However, gene expression (mRNA quantities) and protein expression (protein quantities) correlate only weakly. Translation is regulated, as are RNA processing and the degradation of both RNA and protein. All these processes affect mRNA and protein quantities and weaken the correlation between gene and protein expression. However, the regulating processes might leave their own traces in the expression of other genes. Here we aim to predict the expression of a protein not only from the expression of the corresponding gene, but from the expression of all genes on a microarray. Data: The ICGC consortium provides paired gene and protein expression profiles for thousands of tumor samples. ...


10.12.2015

Predicting Interactions between microRNAs and mRNA

Objective: microRNAs (miRNAs) are short RNAs (~21-24 nt) that can bind to mRNAs and regulate their abundance, thereby affecting various biological processes. Modern high-throughput technologies make it possible to measure the expression of many mRNAs and miRNAs in the same samples. For such paired expression profiles, it was shown that a least angle regression approach can reconstruct gene expression from miRNA expression and discover new interactions that can be associated with cancer [1]. In this project, the approach described in [1] is to be applied to newer and larger miRNA-mRNA datasets, and the results are to be compared with the previous outcomes. ...


10.12.2015

Analyzing stromal lymphoma profiles

Objective: A lymphoma biopsy does not consist only of tumor cells. It typically also contains cells from the tumor microenvironment (stroma), including fibroblasts, endothelial cells from blood vessels, and immune cells. If we extract RNA from lymphoma tissue, approximately 20%-30% of the total RNA does not come from lymphoma cells. Some genes are expressed only in stromal cells, so all RNA measured for them is stromal. Differences in the expression of these genes reflect the cellular composition of the lymphoma microenvironment. In this project we want to generate “pure” stromal profiles of lymphomas and analyze them. ...


09.11.2015

Implement and validate Affinity Propagation-based hierarchical clustering on a GPU cluster

Objective: Affinity Propagation is a fairly new hierarchical clustering method. Its advantage over more conventional clustering techniques is that decisions about group affinity can be revisited. However, the method requires large amounts of random access memory, which has limited its application to relatively small datasets. Recently, Affinity Propagation has been implemented successfully on a graphics processing unit (GPU), and ongoing work aims at running it on a GPU cluster to enable the clustering of up to one hundred thousand objects. The objective of this thesis is to apply GPU-based Affinity Propagation to various large datasets and to compare its performance with that of other clustering algorithms such as Ward's method. ...


09.11.2015

Program a wrapper for Segemehl output

Objective: Segemehl is a next-generation sequencing read mapper that can detect unusual splicing events, for example trans-splicing, where distant RNA molecules are spliced together. It can also detect circular RNAs by looking for reads that support the fusion of the two ends of an RNA molecule. Currently, segemehl output is limited to alignment files (SAM/BAM), which make it hard to reconstruct the circular and trans-spliced RNAs that existed in the sample. The objective is to program a tool that works on SAM/BAM files, reconstructs the RNA population of the sample, and offers visualization of these RNAs in a genome browser. ...


09.11.2015

Curve Fitting with Distance Correlation

Objective: Dynamic models of biological networks predict the course of molecular quantities over time. Typically these predictions are parameterized nonlinear curves, and the values of the parameters are often unknown and need to be estimated from data. Moreover, a fit of the model to real measurements is needed for validation. This is a classical machine learning problem: “curve fitting”. We assume a functional relation of the form y = f(x,p) + err, where f(.,p) is a parameterized family of functions. Curves are traditionally fitted by minimizing the squared difference between the curve and the observations over all possible values of the parameter p. The cost function for fitting is hence L(p) = E((y - f(x,p))^2). From y = f(x,p) + err we also know that the residuals r(x) = y - f(x,p) must be independent of x. A potential dependence between r(x) and x can be quantified and tested using the distance correlation D(x,r(x)). If there is a significant dependence, this is strong evidence that the fitted curve is incorrect, even if it is close to the data. Here we investigate whether curve fitting can be improved by including the distance correlation in the cost function: L’(p) = a L(p) + (1-a) D(x,r(x)). ...
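A minimal sketch of the combined cost L’(p) is given below, assuming the standard sample distance correlation (double-centered pairwise distance matrices) for D and the mean squared residual for L. The model family `quadratic`, the candidate grid, and the weight a = 0.5 are hypothetical choices made only for illustration; a real study would use a proper optimizer and would have to balance the scales of the two terms.

```python
import math


def centered_distances(values):
    """Pairwise absolute differences, double-centered by row, column and grand mean."""
    n = len(values)
    dist = [[abs(u - v) for v in values] for u in values]
    row_mean = [sum(row) / n for row in dist]
    grand_mean = sum(row_mean) / n
    # dist is symmetric, so column means equal row means.
    return [[dist[i][j] - row_mean[i] - row_mean[j] + grand_mean
             for j in range(n)] for i in range(n)]


def mean_product(a, b):
    """Average of the elementwise product of two square matrices."""
    n = len(a)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / (n * n)


def distance_correlation(x, y):
    """Sample distance correlation of two equally long 1-D samples, in [0, 1]."""
    a = centered_distances(x)
    b = centered_distances(y)
    dvar_x = mean_product(a, a)
    dvar_y = mean_product(b, b)
    if dvar_x <= 0.0 or dvar_y <= 0.0:
        return 0.0  # a constant sample carries no dependence information
    dcov = mean_product(a, b)
    return math.sqrt(max(dcov, 0.0) / math.sqrt(dvar_x * dvar_y))


def combined_cost(p, f, xs, ys, a=0.5):
    """L'(p) = a * L(p) + (1 - a) * D(x, r(x)) with L the mean squared residual."""
    residuals = [y - f(x, p) for x, y in zip(xs, ys)]
    squared_error = sum(r * r for r in residuals) / len(residuals)
    return a * squared_error + (1.0 - a) * distance_correlation(xs, residuals)


def fit_on_grid(f, xs, ys, candidates, a=0.5):
    """Pick the candidate parameter with the smallest combined cost."""
    return min(candidates, key=lambda p: combined_cost(p, f, xs, ys, a))


def quadratic(x, p):
    """Hypothetical model family f(x, p) = p * x^2."""
    return p * x * x


# Toy data generated from p = 2 with a small alternating error term.
xs = [i / 10 for i in range(20)]
ys = [2.0 * x * x + 0.05 * (-1) ** i for i, x in enumerate(xs)]
best_p = fit_on_grid(quadratic, xs, ys, [1.0, 1.5, 2.0, 2.5, 3.0])
print("selected p:", best_p)
```

With a misspecified p the residuals grow with x, so the distance correlation term penalizes the fit in addition to the squared error.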


01.11.2015

Simulating expression data with different amounts of total RNA and studying the effects on differential expression, correlation, clustering, and classification

Objective: Current gene expression analysis protocols assume constant amounts of RNA across samples. It has been shown that the total amount of RNA can change, e.g. after the activation of a specific transcription factor. In this project, the student will simulate expression data from different amounts of total RNA and evaluate the effects on differential expression estimates, correlation of genes, clustering, and sample classification, taking global changes in mRNA level into account. ...
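One possible starting point for such a simulation is sketched below. It assumes log-normally distributed baseline abundances, a hypothetical treatment that doubles 90% of all genes, and noise-free expected read counts at a fixed sequencing depth; none of these choices come from the project description, and a realistic simulator would add count noise.

```python
import math
import random


def simulate_abundances(n_genes, seed=0):
    """Draw hypothetical per-gene RNA abundances from a log-normal distribution."""
    rng = random.Random(seed)
    return [rng.lognormvariate(3.0, 1.0) for _ in range(n_genes)]


def sequence(abundances, depth=1_000_000):
    """Expected read counts when a fixed number of reads is drawn from the sample."""
    total = sum(abundances)
    return [depth * a / total for a in abundances]


def log2_fold_changes(control, treated):
    """Per-gene log2 ratio of treated over control."""
    return [math.log2(t / c) for c, t in zip(control, treated)]


n_genes, n_amplified = 1000, 900
control = simulate_abundances(n_genes)
# Hypothetical global shift: the first 900 genes double, the rest stay unchanged.
treated = [a * 2.0 if i < n_amplified else a for i, a in enumerate(control)]

true_lfc = log2_fold_changes(control, treated)
observed_lfc = log2_fold_changes(sequence(control), sequence(treated))

print("amplified gene: true", true_lfc[0], "observed", round(observed_lfc[0], 3))
print("unchanged gene: true", true_lfc[-1], "observed", round(observed_lfc[-1], 3))
```

Because both samples are sequenced to the same depth, the observed fold changes are shifted downward: amplified genes look less induced than they are, and unchanged genes appear repressed. Differences between genes are preserved, which is why the global shift goes unnoticed without an external reference.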


01.11.2015

Search for unusual RNAs in RNA-seq data

Objective: For the first time, RNA-seq allows the expression of all RNAs, coding and non-coding, to be detected on a genome scale independent of annotation. Many novel linear RNAs, such as transcript isoforms, have been characterized based on high-throughput sequencing data. Recently, circular RNAs and trans-spliced RNAs, where parts of RNA from different strands are merged together, have also been found. Their function is largely unclear. One sequence aligner, segemehl, uses multiple split-read alignment and is therefore able to detect circular RNAs and trans-splicing events. In this project, segemehl is to be applied to RNA-seq data and unusual splicing events are to be identified. ...


01.11.2015

Implement an R package for miRNA gene finding

Objective: We have developed a pipeline to detect miRNA precursors from RNA sequencing reads. To improve usability, the pipeline should run in R (and invoke scripts outside of R), or alternatively in another programming language that the student knows well. The pipeline should be extended with graphical output and should summarize results in a format that is easily understood by users. ...

