DISTRIBUTION ANALYSIS ON GENE ARRAYS THROUGH SYMBOLIC DATA ANALYSIS

Mireille Gettler Summa
CEREMADE
Université Paris IX Dauphine
1, Pl. du Ml. De Lattre de Tassigny
75016 – PARIS – France

Abstract

New approaches in Data Mining and Data Analysis make it possible to represent classes as distributions and to analyse them in that form. Classes are built either through queries arising from expert knowledge or through algorithms that provide clusters and their representation as distributions or in symbolic data form. Software implementing these new methods was built and applied to the French National Data Base, the compilation of which started in 2002 and is still ongoing; the data are still in a validation phase. The database consists of multiple-way arrays: tissue samples, gene expression measures, patient clinical information, gene function and associated organs. Symbolic data analysis appears to be an efficient framework for exploring distribution data, and has already produced interesting results for the end users.

Keywords: Symbolic Marking, Symbolic Pyramidal clustering, Symbolic Multiple Symbolic Analysis, Statistics on distributions, Transcriptome
