Prof. Nikitas N. Karanikolas, Faculty of Philology, Belgrade, 25-26 May 2016
Prof. Nikitas N. Karanikolas
Dept. of Informatics, Technological Educational Institute (TEI) of Athens,
Wednesday, 25 May 2016, 12:00, Meeting Room (Sala za sednice)
1. Lecture “Building Stemmers for IR and Related Domains”
This work is part of a project that aims to define a methodology for building simple but robust stemmers without requiring knowledge of the stemmer’s target language. The methodology starts with a very simple primary stemmer that removes the longest suffix (from a list of available suffixes) that matches the ending of the examined word. Information Retrieval (IR) experts then state their arguments against the results of the primary stemmer. These arguments are valuable knowledge: they allow us to apply supervised learning to automatically produce better stemmers that conform to the arguments expressed by the IR experts. We also evaluate our supervised-learning-based methodology on building stemmers for languages of which the experts have no knowledge.
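The primary stemmer described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's implementation: the function name `primary_stem`, the `min_stem_len` guard, and the small English suffix list are all assumptions made for the example.

```python
def primary_stem(word, suffixes, min_stem_len=1):
    """Strip the longest listed suffix that matches the end of `word`.

    `min_stem_len` is a hypothetical guard (not mentioned in the abstract)
    that prevents a suffix from consuming the whole word.
    """
    # Try longer suffixes first so that the longest match wins.
    for suffix in sorted(suffixes, key=len, reverse=True):
        stem_len = len(word) - len(suffix)
        if word.endswith(suffix) and stem_len >= min_stem_len:
            return word[:stem_len]
    # No listed suffix matched: the word is returned unchanged.
    return word


# Hypothetical suffix list, for illustration only.
SUFFIXES = ["s", "es", "ed", "ing", "ings"]

for w in ["meetings", "walked", "cat"]:
    print(w, "->", primary_stem(w, SUFFIXES))
```

With this list, "meetings" loses "ings" rather than just "s", because longer suffixes are tried first. Results of this kind are what the IR experts would then argue against.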
2. Lecture “Text Classification Based on Phrases”
Thursday, 26 May 2016, 11:30, room T.B.A.
Finding the correct category (class) to which a new, unclassified document belongs is an interesting and difficult problem with a wide range of applications. Our methodology for narrative text classification is based on two techniques: we calculate the distance (similarity) between the new unclassified document and all the pre-classified documents of each class, and we also calculate the similarity of the new document to the ‘Average class document’ of each class. In both cases we use key-phrases (text phrases or key terms) as the distinctive features. The proposed text classification method therefore relies on the automatic extraction of an authority list of key-phrases that is appropriate for discriminating between different classes. In this paper, we apply this methodology to Greek text and present the key concepts, the algorithms, and some critical decisions (e.g. stemming the words to produce the key-phrases instead of basing the key-phrases on the inflected words). However, the proposed methods are general and could be applied to any text, regardless of language. A number of parameters of the mining algorithm are also fine-tuned. The actual text classification system, the adopted (embedded) ideas and the alternative parameter values are evaluated using two training sets (test collections). Finally, some useful conclusions are drawn and discussed.
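The two similarity techniques named in the abstract can be illustrated with a small Python sketch. Everything specific here is an assumption made for the example: the abstract does not state which similarity measure is used or how the two scores are combined, so the sketch uses cosine similarity over key-phrase counts and an equal-weight average. The function names and the toy English data are hypothetical, key-phrases are matched as plain substrings, and stemming is skipped.

```python
from collections import Counter
from math import sqrt


def phrase_vector(text, key_phrases):
    """Count occurrences of each authority-list key-phrase in `text`."""
    lowered = text.lower()
    return Counter({p: lowered.count(p) for p in key_phrases if p in lowered})


def cosine(a, b):
    """Cosine similarity of two sparse vectors (assumed measure)."""
    dot = sum(a[p] * b[p] for p in a if p in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def average_document(vectors):
    """Mean key-phrase vector of a class (the 'Average class document')."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return {p: c / len(vectors) for p, c in total.items()}


def classify(text, training, key_phrases):
    """Score each class; `training` maps a class label to a non-empty list of texts."""
    doc = phrase_vector(text, key_phrases)
    scores = {}
    for label, texts in training.items():
        vectors = [phrase_vector(t, key_phrases) for t in texts]
        # Technique 1: similarity to every pre-classified document of the class.
        mean_sim = sum(cosine(doc, v) for v in vectors) / len(vectors)
        # Technique 2: similarity to the average class document.
        centroid_sim = cosine(doc, average_document(vectors))
        # Equal weighting is an assumption, not taken from the abstract.
        scores[label] = 0.5 * mean_sim + 0.5 * centroid_sim
    return max(scores, key=scores.get), scores


# Toy data, for illustration only.
KEY_PHRASES = ["heart surgery", "blood pressure", "search engine", "query processing"]
TRAINING = {
    "medicine": ["the patient received heart surgery", "heart surgery and blood pressure"],
    "computing": ["the search engine ranks documents", "search engine and query processing"],
}
label, scores = classify("blood pressure after heart surgery", TRAINING, KEY_PHRASES)
print(label)
```

In the methodology described above, the key-phrase list would be extracted automatically and built from stemmed words; here it is simply given by hand.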
Short CV of Prof. Nikitas Karanikolas
Nikitas N. Karanikolas graduated from the Department of Statistics and Informatics of the Athens University of Economics and Business (AUEB) in 1988. He received his PhD from the Department of Applied Informatics of AUEB in 1994. He has worked as a professional in software development since 1987. In 1996, he became Systems Head of the Technological Educational Institute’s Library (TEI of Athens Library). One year later, in November 1997, he became head of the Department of Informatics and Organization of the Aretaieio University Hospital, a position he held until November 2004. He then moved (after election) to the academic position of Assistant Professor in the Department of Informatics of TEI of Athens. He was promoted to Associate Professor in the same department in November 2010 and to Professor in December 2014.
He was elected to the Board of Directors of the Greek Computer Society (GCS) and served on it for four years: as a Member of the Board for two years (2004-2006) and as Secretary General for a further two years (2006-2008).
Dr. Karanikolas’s academic interests are: Medical Informatics; Natural Language Processing and Understanding; Computational Linguistics; Information Retrieval; Text Data Mining; Databases; e-Commerce.
He has authored 15 journal papers, 4 papers as book contributions and about 50 conference papers. In most of them, he is the first or the only author. He is also the author of 5 books (which are primary textbooks in Greek universities) and the editor of 3 conference proceedings.
Home page: http://users.teiath.gr/nnk/