Intelligent Data Mining
Sunday, 9:00 - Noon (Pavilion Suite 1)
Dr. Iveta Mrázová, Charles University, Czech Republic
Overview:
This tutorial provides an
overview of principal concepts and techniques applicable to data mining.
Data mining consists mainly of exploring and analyzing
large quantities of data with the aim of discovering relationships among
them. In this context, "intelligent" refers to systems that can
interact with their environment and that can adapt themselves to changes
both in the feature space and in time. Emphasis will be placed on adaptive
methods developed for data mining and their capability to detect
meaningful novel patterns. The ability to detect significant input
patterns and to identify their characteristic features can be used both
for training and optimizing the structure of the model at hand and for
improving its performance (generalization ability, robustness, storage
capacity, etc.). At the same time, it can help to gain insight into the
structure of the processed data and to detect irregularities and errors in
them.
Tasks well suited to data mining include classification,
estimation, prediction, affinity grouping, clustering and description.
Several types of data mining techniques are known, among them
cluster detection, memory-based reasoning, market basket analysis,
decision trees, link analysis, artificial neural networks, fuzzy-logic-based
systems and genetic algorithms. Many of these techniques were
introduced a few years or decades ago, mostly in the area of computer
science and artificial intelligence. Data mining algorithms typically
require multiple passes over large volumes of data, and many of them are
computationally intensive. Nevertheless, several trends are increasing the
need for powerful data mining tools: an increasingly service-based
economy, the advent of mass customization and the competitive advantage
conferred by appropriate information.
Data mining techniques are applied in a wide variety of fields, including
economics, artificial intelligence (AI), databases, Web technologies,
medicine and statistics. The choice of a particular combination of
techniques to apply in a given situation depends on both the nature of the
data mining task to be accomplished and the nature of the available data.
Appropriate visualization of mutual relationships among the data enables
well-informed decision making and reasoning. Unfortunately, for some models it
is relatively complicated to explain and visualize what they are doing.
For other purposes, it might be useful to extract a clear set of simple
rules providing insight into how a particular model works. A similar
requirement is easy reusability of the applied model.
Instructor's Background:
Dr. Iveta Mrázová teaches courses on Data Mining and Artificial Neural
Networks at Charles University in Prague, Czech Republic. She received the
M.S. degree from the Friedrich-Schiller-University, Jena (Germany) in 1989
and the Ph.D. degree from the Institute of Computer Science of the Czech
Academy of Sciences in 1997. She has published numerous research papers in the
areas of artificial neural networks, pattern recognition and image
processing. In 1996, she received the Annual Prize of the Bolzano
Foundation for a collection of original publications “On the Internal
Knowledge Representation in Neural Networks.” In 2000, her paper
“Generalized Relief Error Networks” won the first runner-up award in
“Theoretical developments in computational intelligence” at ANNIE'2000
(St. Louis, USA). In 2001, the Union of Czech Mathematicians and
Physicists and the Czech Society for Mechanics awarded her the Prize of
Prof. Babuška for “outstanding work in the field of computer science.”
From September 2002 to June 2003, she was a Fulbright Visiting Scholar at
the Engineering Management Department, University of Missouri-Rolla.