Mini Symposium - January 9, 2012

Bioinformatics and Medicine


  • January 9, 2012
  • 11.00 - 17.15
  • room: G2-104 (CeBiTec Building)


Speakers

  • Bernhard Balkenhol, Anna Falkenhain - infinity^3, Bielefeld
  • Nils Hoffmann - Universität Bielefeld
  • Peter Husemann - Universitätsklinikum Düsseldorf
  • Pina Krell - Universitätsklinikum Düsseldorf / Universität Bielefeld
  • Tim Nattkemper - Universität Bielefeld
  • Knut Reinert - Institut für Informatik, Freie Universität Berlin
  • Peter Robinson - Institute for Medical Genetics, Charité, Berlin
  • cancelled due to illness: Peter Walden - Klinische Forschergruppe Tumorimmunologie, Charité, Berlin

Schedule


11.00 - 11.45 Knut Reinert “Bringing Algorithms to clinical research - Software libraries and infrastructure”
11.45 - 12.30 Bernhard Balkenhol and Anna Falkenhain “Personalized translational medicine - an application area for SOA, Grid, and Cloud?”
12.30 - 13.30 lunch
13.30 - 14.00 Nils Hoffmann “Computational Metabolomics”
14.00 - 14.30 Pina Krell “Profiling the T-cell receptor repertoire using next-generation sequencing”
14.30 - 15.00 Peter Husemann “Bioinformatics challenges in a university clinic”
15.00 - 15.15 coffee break
15.15 - 16.00 Peter Robinson “The Human Phenotype Ontology as a Tool for Annotation, Analysis, and Clinical Diagnostics in Human Genetics”
16.00 - 16.45 Peter Walden (cancelled due to illness)
16.45 - 17.15 Tim Nattkemper “Data mining in multivariate bioimage data”

Abstracts

Knut Reinert: "Bringing Algorithms to clinical research - Software libraries and infrastructure"

In this talk I will first introduce SeqAn, a generic C++ library for biological sequence analysis. I will briefly outline its design and contents and show how it streamlines the process of building efficient applications for biomedical NGS research. I will then discuss the steps we should take to move forward with interdisciplinary research.
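
As a flavour of the library's generic interface, here is a minimal sketch in the style of the SeqAn alignment tutorials. It is illustrative only: header names and call signatures vary between SeqAn releases, and the sequences are made up.

    #include <iostream>
    #include <seqan/align.h>     // alignment data structures and algorithms
    #include <seqan/sequence.h>  // generic sequence types

    int main()
    {
        // Two short DNA fragments to be aligned globally.
        seqan::Dna5String ref  = "ACGTGGATCGATTACA";
        seqan::Dna5String read = "ACGTGATCGTTACA";

        // An Align object holds one gapped row per sequence.
        seqan::Align<seqan::Dna5String> align;
        seqan::resize(seqan::rows(align), 2);
        seqan::assignSource(seqan::row(align, 0), ref);
        seqan::assignSource(seqan::row(align, 1), read);

        // Global alignment with match = 2, mismatch = -1, gap = -2.
        int score = seqan::globalAlignment(align, seqan::Score<int>(2, -1, -2));
        std::cout << "score: " << score << "\n" << align;
        return 0;
    }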

Bernhard Balkenhol, Anna Falkenhain: "Personalized translational medicine - an application area for SOA, Grid, and Cloud?"

Research, it is said, is moving closer to the bedside: “translational medicine” pursues the goal of bringing findings from clinical research into medical practice as early as possible while at the same time - as “personalized medicine” - determining and taking into account the individual genetic and epigenetic constitution of patients as precisely as possible.

At the same time, the need for standardized processes and for convenient, secure handling of ever-growing data volumes is in any case part of everyday life for hospitals.

The advantages of a central, highly automated platform dedicated to both purposes, with a connection to a dedicated fibre-optic infrastructure, are obvious. Both can be achieved by unifying interfaces and establishing automated business processes that follow common process standards such as ITIL, COBIT, and eTOM (for telecommunications and service companies) and that rely on grid and cloud infrastructure. In particular, such a platform can

  • guarantee central, secure storage of the data by means of an integrated archive system,
  • provide an access-control system that meets the high demands placed on highly sensitive data management, and
  • enable direct access to the relevant research centres.

In this way even the largest data volumes, such as histological sections or 3-D models, can be handled and exchanged conveniently and discussed “online” between experts - possibilities that were out of reach a few years ago because of technical limitations.

The talk sketches the chain from the definition of a process all the way to its technical implementation within such a platform. One focus will be on a core topic of SOA (service-oriented architectures): business-IT alignment. Here the need for close coordination between domain knowledge and IT expertise is examined.

Opportunities and visions are illustrated by means of examples and put up for discussion.

Nils Hoffmann: "Computational Metabolomics for Clinical Applications"

The field of metabolomics, which studies the microcosm of the large variety of chemical compound classes involved in the biology of an organism, has seen rapid development over the past decade. It has helped in various areas to supplement and deepen the insight into the interactions and dependencies of tissues, cells, organisms, and their environment gained from other “omics” techniques such as genomics, transcriptomics, or proteomics. Computational metabolomics is still a rather loosely defined term that may comprise anything from computational methods for raw-data preprocessing and normalization to statistical analysis and the prediction of candidate formulas for unknown compounds.

In this talk, we will give a comprehensive overview of the main areas of computational metabolomics and some of their individual aspects. We will then show recent results of computational metabolomics applied to clinical data, where it is used to search for metabolic biomarkers and to provide further hints at the pathways and regulatory activities involved. The final goal of this research is the optimization of routine clinical diagnostics and treatment protocols for the benefit of patients.
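
To make one of these preprocessing steps concrete, the sketch below applies total-ion-current (TIC) normalization, scaling each run so that its summed intensity is 1 and intensity profiles become comparable across samples. The data are hypothetical, and TIC scaling is just one common normalization scheme, not necessarily the one used in this work.

    #include <iostream>
    #include <numeric>
    #include <vector>

    // Total-ion-current (TIC) normalization: divide every intensity in a run
    // by the run's summed intensity, so profiles are comparable across runs.
    void ticNormalize(std::vector<std::vector<double>> &runs)
    {
        for (std::vector<double> &intensities : runs)
        {
            double tic = std::accumulate(intensities.begin(), intensities.end(), 0.0);
            if (tic > 0.0)
                for (double &x : intensities)
                    x /= tic;
        }
    }

    int main()
    {
        // Toy data: two runs measuring three hypothetical metabolites each.
        std::vector<std::vector<double>> runs = {{120.0, 30.0, 50.0},
                                                 {240.0, 58.0, 102.0}};
        ticNormalize(runs);
        for (const std::vector<double> &run : runs)
        {
            for (double x : run)
                std::cout << x << ' ';
            std::cout << '\n';
        }
        return 0;
    }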

Pina Krell: "Profiling the T-cell receptor repertoire using next-generation sequencing"

At the molecular level, the key to combating the vast and highly variable pool of pathogenic microorganisms lies in the CDR3 of the T-cell receptor (TCR) molecule. Somatic alterations in the complementarity-determining region 3 (CDR3) shape the T-cell receptor repertoire. Diversity in the repertoire is responsible for the recognition of foreign particles from viruses and bacteria and enables the immune system to trigger an immune response against a broad range of foreign intruders. Given this immunological role, characterizing TCR repertoire diversity in health and disease is of major interest for a broad range of questions, from clinical to basic immunology.

Alterations in the CDR3 are generated through cell-specific, irreversible germ-line rearrangement, V-D-J gene recombination. In addition, random nucleotide deletion and addition between the rearranged genes further increase diversity and can, at least theoretically, result in more than 10^18 different TCR molecules. In practice, some 2×10^7 T cells, each carrying a unique CDR3, are thought to reside in the lymphoid organs and circulating blood of a human. This extraordinary diversity has long challenged in-depth sequence analysis of the TCR repertoire, and thus only small parts of the repertoire have been characterized at the sequence level.

The emergence of second-generation, high-throughput sequencing (HTS), with its major gains in capacity and speed, has made TCR repertoire sequencing a practicable approach. Nevertheless, raw data obtained from HTS technologies are usually error-prone, which makes reliable estimation of TCR repertoire diversity difficult.

In this talk we address the challenges that stand in the way of reliable TCR repertoire analysis from data generated by high-throughput sequencing technologies. We will show how data analysis can be improved using sequencing-error estimators, but also how rarely sampled sequences still challenge reliable repertoire analysis. Finally, to underline the practical applicability and diagnostic value of the approach, we will present current results from healthy and impaired repertoires and discuss future applications.
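
As an illustration of the kind of diversity estimation at stake, the sketch below applies the classical Chao1 lower bound from ecology to toy clonotype counts. The CDR3 sequences and counts are hypothetical, and Chao1 stands in for, rather than reproduces, the estimators presented in the talk.

    #include <iostream>
    #include <map>
    #include <string>

    // Bias-corrected Chao1 lower bound on clonotype richness:
    //   S_est = S_obs + f1 * (f1 - 1) / (2 * (f2 + 1))
    // where f1 and f2 count clonotypes observed exactly once and twice.
    // Singletons are exactly where sequencing errors pile up, which is why
    // error correction matters before such estimators are applied.
    double chao1(const std::map<std::string, long> &clonotypeCounts)
    {
        long sObs = static_cast<long>(clonotypeCounts.size());
        long f1 = 0, f2 = 0;
        for (const auto &kv : clonotypeCounts)
        {
            if (kv.second == 1) ++f1;
            else if (kv.second == 2) ++f2;
        }
        return sObs + static_cast<double>(f1) * (f1 - 1) / (2.0 * (f2 + 1));
    }

    int main()
    {
        // Toy CDR3 clonotype read counts (hypothetical amino-acid sequences).
        std::map<std::string, long> counts = {{"CASSLGQGAEAFF", 12},
                                              {"CASSPGTGGNEQFF", 2},
                                              {"CASRDRGNTIYF", 1},
                                              {"CASSLEGQGFGYTF", 1}};
        std::cout << "observed clonotypes: " << counts.size()
                  << ", Chao1 estimate: " << chao1(counts) << "\n";
        return 0;
    }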

Peter Husemann: "Bioinformatics challenges in a university clinic"

Current sequencing techniques dramatically enhance the possibilities of clinical disease diagnosis. Yet there are several theoretical as well as practical challenges when dealing with sequences of the human genome. In this talk, I show some established workflows that are employed at the University Hospital Düsseldorf (UKD) in the context of childhood leukemia. This includes an introduction to data acquisition, a description of the (pre)processing steps, and an overview of possible bioinformatic analyses.

Peter Robinson: "The Human Phenotype Ontology as a Tool for Annotation, Analysis, and Clinical Diagnostics in Human Genetics"

Phenotypic analysis has played a central role in the mapping of disease genes and in many other fields, and humans are particularly good at recognizing human phenotypic traits and anomalies. To date, phenotypic descriptions of human disease in patient charts, scientific articles, and databases have usually been written as free text. Although such descriptions can be highly expressive for human readers, it is extremely difficult to parse them computationally to discover their “meaning”, which has hampered data integration across databases as well as computational analysis of human phenotypic abnormalities. We have developed the Human Phenotype Ontology (HPO), which now contains about 10,000 terms representing individual phenotypic anomalies, and have annotated all clinical entries in Online Mendelian Inheritance in Man (OMIM). The HPO was developed on the basis of information in OMIM, from which we created a comprehensive controlled vocabulary and ontological structure that allow a number of ontological algorithms to be used for the analysis of the human phenotype. The HPO project is currently developing logical definitions of HPO terms, based on links to the Foundational Model of Anatomy (FMA), the Gene Ontology, and other bio-ontologies, that will allow computerized reasoning over these domains.

In this presentation, we will introduce the HPO project and explain the advantages of using an ontology to represent human phenotypes in databases. We will discuss recent algorithms for clinical differential diagnosis based on ontological similarity searches. We will show how logical definitions of HPO terms can be used to enable computerized reasoning over human and model-organism phenotypes in order to identify candidate genes for the phenotypic abnormalities seen in CNV diseases.
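
As a rough sketch of how such ontological similarity searches work, the toy code below scores two phenotype terms by the information content (IC) of their most informative common ancestor, the classic Resnik measure. The mini-hierarchy and IC values are invented, and this is the general technique, not the talk's specific algorithm.

    #include <algorithm>
    #include <iostream>
    #include <iterator>
    #include <map>
    #include <set>
    #include <string>

    using TermSet = std::set<std::string>;

    // Resnik similarity: the information content (IC) of the most informative
    // common ancestor of two terms. IC(t) = -log p(t), where p(t) would be
    // the fraction of annotated diseases whose annotations fall under term t.
    double resnik(const std::string &a, const std::string &b,
                  const std::map<std::string, TermSet> &ancestors,
                  const std::map<std::string, double> &ic)
    {
        TermSet common;
        std::set_intersection(ancestors.at(a).begin(), ancestors.at(a).end(),
                              ancestors.at(b).begin(), ancestors.at(b).end(),
                              std::inserter(common, common.begin()));
        double best = 0.0;
        for (const std::string &t : common)
            best = std::max(best, ic.at(t));
        return best;
    }

    int main()
    {
        // Invented mini-hierarchy; each term counts among its own ancestors.
        std::map<std::string, TermSet> anc = {
            {"Arachnodactyly", {"Arachnodactyly", "Abnormality of the fingers",
                                "Abnormality of the hand", "All"}},
            {"Brachydactyly",  {"Brachydactyly", "Abnormality of the fingers",
                                "Abnormality of the hand", "All"}}};
        // Invented IC values; rarer, more specific terms carry higher IC.
        std::map<std::string, double> ic = {
            {"All", 0.0}, {"Abnormality of the hand", 1.2},
            {"Abnormality of the fingers", 2.3},
            {"Arachnodactyly", 5.1}, {"Brachydactyly", 4.7}};
        // Prints 2.3: the most informative shared ancestor is
        // "Abnormality of the fingers".
        std::cout << resnik("Arachnodactyly", "Brachydactyly", anc, ic) << "\n";
        return 0;
    }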

The ontology and annotation files, together with a browser for the human phenome and other information, are available at the HPO homepage: http://www.human-phenotype-ontology.org

Tim Nattkemper: "Data mining in multivariate bioimage data"

In medical diagnostics and research, imaging is one of the most prominent technologies, since it allows molecular features to be related to morphology or anatomy. In the last decade, new microscopy techniques (such as High Content Screens (HCS), high-resolution imaging (STED), or multivariate bioimaging (MBI, TIS, MELC)) and innovative new imaging approaches such as MALDI imaging have made it possible to visualize molecular interaction and co-localization in tissue in high dimensions. While the potential of these approaches promises to close some of the gaps in our systems-biology understanding of disease, the analysis of the image data represents a new and fascinating challenge for bioinformatics researchers. In this talk, some solutions to the delicate problem of bioimage data mining will be presented, ranging from segmentation to visualization.
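
To give a flavour of one step in such a pipeline, the sketch below segments pixels by clustering their per-pixel marker-intensity vectors with a minimal k-means. The pixel data, feature count, and cluster number are hypothetical; k-means is a generic stand-in for the segmentation methods discussed in the talk.

    #include <iostream>
    #include <vector>

    using Vec = std::vector<double>;

    // Squared Euclidean distance between two feature vectors.
    double dist2(const Vec &a, const Vec &b)
    {
        double d = 0.0;
        for (std::size_t i = 0; i < a.size(); ++i)
            d += (a[i] - b[i]) * (a[i] - b[i]);
        return d;
    }

    // Minimal k-means: alternate between assigning each pixel to its nearest
    // centre and moving each centre to the mean of its assigned pixels.
    std::vector<int> kmeans(const std::vector<Vec> &pixels,
                            std::vector<Vec> centers, int iterations)
    {
        std::vector<int> label(pixels.size(), 0);
        for (int it = 0; it < iterations; ++it)
        {
            for (std::size_t p = 0; p < pixels.size(); ++p)
                for (std::size_t c = 0; c < centers.size(); ++c)
                    if (dist2(pixels[p], centers[c]) < dist2(pixels[p], centers[label[p]]))
                        label[p] = static_cast<int>(c);
            std::vector<Vec> sum(centers.size(), Vec(pixels[0].size(), 0.0));
            std::vector<int> n(centers.size(), 0);
            for (std::size_t p = 0; p < pixels.size(); ++p)
            {
                for (std::size_t i = 0; i < pixels[p].size(); ++i)
                    sum[label[p]][i] += pixels[p][i];
                ++n[label[p]];
            }
            for (std::size_t c = 0; c < centers.size(); ++c)
                if (n[c] > 0)
                    for (std::size_t i = 0; i < sum[c].size(); ++i)
                        centers[c][i] = sum[c][i] / n[c];
        }
        return label;
    }

    int main()
    {
        // Four pixels x two markers (hypothetical intensities), two clusters.
        std::vector<Vec> pixels = {{0.9, 0.1}, {0.8, 0.2}, {0.1, 0.9}, {0.2, 0.7}};
        std::vector<int> label = kmeans(pixels, {{1.0, 0.0}, {0.0, 1.0}}, 10);
        for (int l : label)
            std::cout << l << ' ';
        std::cout << '\n';
        return 0;
    }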