Ettekanded / Talks

Marc Bezem: A Hard Problem in Max-Algebra, Control Theory, Hypergraphs and Other Areas

Abstract: Let F be a conjunction of "Max-atoms" of the form max(x,y)+k >= z, where x, y, z are variables and k is a constant value. We consider the satisfiability problem of such formulas (e.g., over the integers or rationals). This problem, which has many applications, is easily shown to be in NP and in co-NP. However, decades of efforts (in several research communities) have not produced any polynomial decision procedure for this apparently so simple problem.

We show that the Max-atom problem is PTIME-equivalent to several other well-known and at first sight unrelated problems on hypergraphs and on Discrete Event Systems for which the existence of PTIME algorithms is also open. Since there are few interesting problems in NP intersection co-NP that are not known to be polynomial, the Max-atom appears to be relevant.

Silvio Capobianco: Interactive Seminar: A Colorful Introduction to Cellular Automata

Slides of the talk: [pdf]

Abstract: Cellular automata are models of uniform synchronous parallel computation, where each identical node of a regular grid updates according to the state of a fixed-shape neighborhood. Their structure makes them very well suited to simulations of real systems. Applications are manifold, from physics to biology and medicine, from traffic to population dynamics. Theoretical studies are also broad and intriguing.

The interactive seminar will introduce cellular automata through visual experiments. These will run on a personal computer with Linux or Windows and a Python interpreter. The flexibility of the interpreted language allows quick creation from scratch on-the-fly modifications, so that several variants can be tried.

Technicalities:
We will make use of the SIMP/STEP software by Ted Bach: http://sourceforge.net/projects/simpstep/
It is a Python module which currently runs on 32-bit Windows (with MinGW) and Linux. The software also runs acceptably on a Pentium D 3.4GHz with Linux Mint 9 Fluxbox inside VirtualBox.
Full instructions for downloading and installing can be found at: http://www.cs.ioc.ee/~silvio/simp.html
Anyone who wishes to experiment with setups different than above is welcome to share impression and outcomes by writing to: silvio(at)cs.ioc.ee

James Chapman: Algebras of Relative Monads

Slides of the talk: [pdf]

Abstract: I will recap what a relative monad is and then present some concrete examples of algebras of relative monads. For the examples of relative monads that we know if you construct their algebras then you arrive at something quite canonical and often well-known. This provides assurance that the notion of relative monad is a good and useful one.

Mark Fišel: Automatic Translation Error Analysis

Slides of the talk: [pdf]

Abstract: Current state-of-the-art methods of evaluating machine translation produce a single score for a specific translation system output. We develop a method of in-depth analysis of typical translation errors, such as wrong, untranslated or misplaced word forms and phrases. First the system output (hypothesis) is aligned with a perfect translation, done by a human translator (reference). The alignment is used to detect specific differences between the hypothesis and reference, which are later summarized as translation errors of different types. Our method is language-independent, but can benefit from linguistic analysis of the evaluated translations. As a result it is possible to automatically assess the weaknesses of a translation system, which is much less expensive than manual error analysis, and much more informative than a single score evaluation.

Olga Gerassimenko: Spoken interaction modelling: the functions of feedback repetitions in spontaneous dialogue

Slides of the talk: [pdf]

Abstract: In modelling spoken dialogue, we benefit strongly from implementing the technics and strategies used by speakers in spontaneous spoken dialogue. Those strategies are highly effective when the spoken interaction is used, but they may be contra-intuitive or insufficiently recognised.

The flow of dialogue is significantly guided by the feedback of interlocutors. Among other means of feedback, full or partical repetitions of the interlocutor's talk are used to indicate the relevance of the information repeated. The functions of repetitions in speech include 1) agreement, 2) acknowledgement and 3) repair initiation. The latter two functions need to be distinguished by the dialogue system in order to understand whether the confirmation is needed. In the spontaneous speech, the distinction might be vague due to the fact that the absense of interlocutor's reaction to repetition would be a sign of its correctness (regardless whether the repetition was a mere acknowledgement or a request for confirmation); and the repairing reaction would follow wrong repetition even if the speaker has designed it as an acknowledgement and is not waiting for the reaction.

However, the confirmation following or not following the repetition is also related to the formal parameters of repetition turns (intonation contour and pauses preceding or following the turn), to the position of a turn in the dialogue structure and the institutional role of a speaker; all those cues are formalisable and can be helpful for the dialogue modelling.

The talk would demonstrate those cues in the qualitative analysis of 180 spontaneous information dialogues taped in natural environment (phone calls to the institutions, including 60 calls to the directory enquiries, held in Estonian, 60 calls to the outpatient's department, held in Estonian, and 60 calls to the outpatient's department, held in Russian, in order to compare language- and structure-driven differences).

Madeline González Muñiz: Time-specific Attribute-based Signatures

Slides of the talk: [pdf]

Abstract: We explore the setting in which digital signatures may be created and/or verified during specific time intervals. We construct time-specific signature schemes in the attribute-based signature setting by providing slight modifications to a known scheme. After discussing time-specific encryption, we propose a time server and encryption scheme suitable for the attribute-based setting.

Oskar Gross: Creating Mindmaps of Documents - Using an Example of a News Surveillance System

Slides of the talk: [pdf]

Abstract: There are thousands of news stories published through different sources every day. Filtering out those, which you are interested in is actually a non-trivial task. Our goal is to provide summaries and mindmaps of the news stories in order to grasp the essence of the document in a very short time. Our goal is to filter documents which are redundant, show mindmaps of novel connections created in the documents and provide short summaries. For achieving this functionality we are currently using tpf-idf-tpu measure for creating bisociations of the documents. We also take a look, how bisocations can be used in the field of computational creativity.

Pelle Jakovits: SciCloud: Adapting Scientific computing problems to Cloud

Slides of the talk: [pdf]

Abstract: Cloud computing, with its promise of virtually infinite resources, seems to suit well in solving resource greedy scientific computing problems. To study this, we established a scientific computing cloud (SciCloud) project and environment, on our internal clusters. With these clouds, students and researchers can efficiently use the already existing resources of university computer networks, in solving computationally intensive scientific, mathematical, and academic problems. However, to be able to exploit the full potential of the cloud infrastructure, the applications must be reduced to frameworks that can successfully exploit the cloud resources, like MapReduce. We show that while MapReduce is suitable for wide range of data processing and analyzing tasks, it is not useable for solving scientific computing problems. We compare MapReduce to alternative cloud computing frameworks, provide our preliminary results and outline our future work and goals in the context of the SciCloud project.

Wolfgang Jeltsch: The Curry-Howard Correspondence between Temporal Logic and Functional Reactive Programming

Slides of the talk: [pdf]

Abstract: Functional Reactive Programming (FRP) is an approach to programming with time-varying values and events. This talk demonstrates that FRP programs can be interpreted as proofs of temporal propositions. This relationship between FRP and temporal logic can be substantiated by a Curry-Howard correspondence. Developing such a correspondence leads to interesting ideas for programming and logic. Examples are time-dependent type inhabitation and a concept of temporal constructivism.

Siim Karus: Predicting Code Churn from XML Metrics

Slides of the talk: [pdf]

Abstract: XML is an increasingly ubiquitous language in modern software projects. Despite its widespread use and the amount of legacy code that it represents, very few studies have addressed the question of how XML code artefacts relate to the maintenance effort of software projects. This talk gives a peek into how metrics extracted from XML and XSLT file revisions and organisational metrics extracted from version control repositories help estimate files cumulative yearly code churn in 13 open-source software projects. We find that models built on organisational metrics are superior to those built purely with code structure metrics. We also find that metrics associated with XML files can be successfully used for making highly accurate estimations of cumulative code churn of not just XML files but all files in project.

Neeme Kahusk: Semantic analysis of simple sentences: the way to go for Estonian

Slides of the talk: [pdf]

Abstract: To carry out the task of getting inferences from simple sentences in Estonian, we have to take several steps. The presentation will focus on inevitable problems to solve on our way from plain text to theorem proving. While morphological analysis and disambiguation is one of the solved problems, there is still more to do before we can move on even in a single limited domain.

Liina Kamm: Genotype Mining Using Secure Multiparty Computation Techniques

Slides of the talk: [pdf]

Abstract: Genome-wide association studies are one of the driving reasons behind the formation of nationwide and privately funded gene banks. One of the main reasons why the identification of complex genetic traits is so difficult, is the relatively small scale of association studies. Larger sample sizes do not only increase the sensitivity of statistical tests, but also make it possible to apply a wide range of data mining methods. It would be ideal if the researchers could have access to nation- and continent-wide patients cohorts.

The privacy of individual gene donors is the biggest concern in such projects. In many countries, genotype information is classified as sensitive information that can be handled by complying specific restrictions. Ideally, no single party should be able to release the data or parts of it. We have set up an infrastructure where the genotype data can be stored and processed without single point of failure using the SHAREMIND system. We implement an algorithm for a simple genome-wide association study on this system and go on to show that the solution is practically feasible using current technologies and it is flexible enough to implement wide variety of data mining algorithms. The algorithm is executed in an oblivious manner so that only the desired outcome is revealed and nothing else. Differently from well-known data perturbation and masking techniques, the security guarantees are cryptographic. As such they depend only on the computational complexity of well-established mathematical problems and not on the background knowledge of the attacker.

Meelis Kull: Hypothesis generation in bioinformatics

Slides of the talk: [pdf]

Abstract: One important role of bioinformatics is to provide hypotheses for biologists who can then confirm or reject these based on additional experiments. Due to high cost of experiments it is necessary to rank the hypotheses by decreasing plausibility. This presentation describes a general statistical framework of hypothesis generation and ordering in bioinformatics. An example of over-representation analysis is studied in more detail.

Marko Kääramees: Model-based Synthesis of Reactive Planning On-line Testers

Slides of the talk: [pdf]

Abstract: I describe a method and algorithm for model-based construction of an on-line reactive planning tester (RPT) for black-box testing of embedded systems specified by non-deterministic extended finite state machine (EFSM) models. The key idea of RPT lies in off-line learning of the System Under Test (SUT) model to prepare the data for efficient on-line reactive planning. The result of the off-line analysis is a set of constraints used in on-line testing for guiding the SUT towards trap-labeled transitions in SUT model and generating required data for inputs.

Peep Küngas: Reverse-Engineering Conference Rankings: What Does it Take to Make a Reputable Conference?

Slides of the talk: [pdf]

Abstract: During the recent years several national and community-driven conference rankings have been compiled and published. These rankings are used for a variety of different purposes. For instance, bureaucrats use them in evaluation of performance of academic institutions and individual scientists, scientists use these rankings in selection of target conferences where to submit papers, conference committees use the rankings to position themselves with respect to competing venues etc. Although the ranking process itself is often not so transparent and the rankings are subjective to ranking communities, these rankings can be seen as an indicator of perceived reputation of particular conferences since publication in a top-tier conference increases the perceived reputation of its authors.

In order to understand the main measurable ingredients of the perceived reputation of conferences we decided to analyze some of the existing ranking with respect to their bibliometric indicators and acceptance rates. The intuition is that many scientists consider those conferences reputable, which have either low acceptance ratio or whose articles are highly cited. Our findings presented in this paper confirm that while in national rankings the acceptance ratio is the dominant feature, in community-driven rankings both acceptance ratio and bibliometric indicators are equally important in deciding the reputation of a conference. It also turns out that only top-tier conferences can be identified with good enough precision through bibliometric indicators and acceptance rates.

Peeter Laud: Polynomial-time logic and liveness

Slides of the talk: [pdf]

Abstract: Trace properties are among the most important classes of properties of cryptographic protocols. Among those, safety properties have received enormous attention from the protocol analysis community, while the liveness properties have not been considered so much, especially with respect to the "computational" semantics of protocols where messages are modeled as bit-strings and the adversary can be any probabilistic polynomial-time algorithm. We believe that among other reasons, the lack of proper understanding of liveness in this model is a major cause for this lack of attention. To remedy the situation, we propose a game-based interpretation for the first-order linear time logic (FO LTL), where the players are polynomially bounded. We show that all axioms and inference rules of the first-order logic and linear time logic continue to hold under this interpretation. We discuss how the logic can be used to analyse the Asokan-Shoup-Waidner fair contract signing protocol and establish that this protocol provides polynomial fairness guarantees. (Joint work with Dominique Unruh.)

Sven Laur: Oblivious shuffle finally done right

Abstract: Oblivious shuffle protocols are commonly discussed in the context of electronic voting, where they are used as a tool to eliminate connections between ballots and voters. Most of these protocols are based on homomorphic encryption; however, the same problem is also relevant in the context of share computing. In particular, the ability to obliviously shuffle a database of shares makes it easy to filter out database records satisfying a private inclusion criterion. In this talk, we present a solution that has asymptotically optimal communication and computation complexity. We first present a solution for the semihonest model and then show how to protect the protocol against active corruption. The construction meets theoretical limits for corruption. Namely, the set of tolerated adversarial coalitions must satisfy the Q2 condition in the semihonest model and the Q3 condition in the malicious model. The presentation covers a joint work with Bingsheng Zhang and Jan Willemson.

Artjom Lind: Mobile Friend-to-Friend Computing

Slides of the talk: [pdf]

Abstract: Cloud computing for desktop computers has recently become popular, and the cloud is already going mobile. The term "mobile cloud" means the transparent migration and distribution of applications and services in a mobile environment. Friend to Friend (F2F) Computing is a new paradigm for distributed computing, merging ideas from Peer-to-Peer, High Performance Computing, and social networks in instant messaging. The framework allows users to set up the small computation Grids (spontaneous Grids) based on mutual trust in current instant messaging systems. Privacy concept of personal mobile device in conjunction with mutual trust of Friend-to-Friend platform makes it possible to build private cloud infrastructure using using mobiles. For this research, we develop Friend-to-Friend mobile port targeting Nokia Symbian and Google Android platforms.

Ubiquitous systems become more and more popular spanning thousands of different devices including gadgets and mobile devices of different vendors. Therefore, it becomes challenging to distribute and configure applications in such heterogeneous environment. Friend-to-Friend (F2F) Computing is designed to solve compatibility and configuration issues of ubiquitous systems, making the overall process from development to deployment user friendly. It provides a unique platform for distributing and running the applications independent of the underlying set of devices.

Helger Lipmaa: Progression-Free Sets and Sublinear Pairing-Based Non-Interactive Zero-Knowledge Arguments

Slides of the talk: [pdf]

Abstract: In Asiacrypt 2010, Groth constructed a non-interactive zero-knowledge (NIZK) argument for circuit satisfiability with constant communication, linear verifier's computation, and quadratic common reference string length and prover's computation. In the current paper, we show how to reduce the common reference string length to quasilinear, and the prover's computation from quadratic number of exponentiations to quadratic number of multiplications. We do this by using a connection to the theory of progression-free sets. As an independent technical construction, we show that for any $n > 0$, $[n]$ has a progression-free subset of odd integers of cardinality $n^{1 - o (1)}$. Moreover, we use slightly weaker security assumptions than Groth. One of the applications of the current paper is a perfect zap for circuit satisfiability with quasi square-root communication complexity.

Cryptology ePrint Archive: Report 2011/009

Keiko Nakata: Walking through infinite trees constructively via mixed induction-coinduction

Slides of the talk: [pdf]

Abstract: I look at familiar formulae in temporal logics from the viewpoint of constructive type theory. On two-colored streams, say red and black, two formulae are of particular interest: "a stream is infinitely often red" and "a stream is almost always black". These formulae naturally extend to binary red-black trees with infinite paths, by quantifying over paths. For instance, we have "every path is infinitely often red" and "some path is almost always black". I examine formulations of these formulae by means of predicates on paths and predicates on trees, and relate the path-based and tree-based views. These formulations require mixed induction and coinduction. With our eyes refined through the glasses of constructive type theory, we discover subtleties in different formulations. (Joint work with Tarmo Uustalu)

Tõnu Tamme: Future of e-mail

Slides of the talk: [pdf]

Abstract: For twenty years e-mail has been the prominent means of computerized communication. We keep our messages in the inbox or store them in several mostly manually created hierarchical folders. The growing number of messages has introduced the information overload problem. Thus organizing and managing huge e-mail repositories is an important and vital task. Our goal is to analyze the explorative search capabilities of present standard e-mail clients and webmail services, and to find out some improvements of them through the use of categorization and graph exploration techniques.

Konstantin Tretjakov: Improved Landmark-based Estimation of Shortest Path Distances in Very Large Graphs

Abstract: Computing the shortest path between a pair of vertices in a graph is a fundamental primitive in graph algorithmics. Classical exact methods for this problem do not scale up to contemporary social networks with hundreds of millions of users and billions of connections. A number of approximate methods have been proposed, including several landmark-based methods that have been shown to scale up to very large graphs with acceptable accuracy. We present some improvements to existing landmark-based shortest path estimation methods which allow us to achieve higher accuracy with acceptable time and memory overhead.

Reina Uba: Clone Detection in Repositories of Business Process Models

Slides of the talk: [pdf]

Abstract: Business process model repositories capture precious knowledge about an organization or a business domain. In many cases, these repositories contain hundreds or even thousands of models and they represent several man-years of effort. Over time, process model repositories tend to accumulate duplicate fragments, as new process models are created by copying and merging fragments from other models. This calls for methods to detect duplicate fragments in process models that can be refactored as separate subprocesses in order to increase readability and maintainability. This talk introduces an indexing structure to support the fast detection of clones in large process model repositories.

Vesal Vojdani: Goblint against automobile racing: Detecting concurrency flaws in interrupt-driven software

Slides of the talk: [pdf]

Abstract: I will give an overview of the Goblint project, focusing primarily on our recent efforts to analyze embedded controllers in the automotive domain. We consider programs consisting of a set of tasks (including interrupt service routines) scheduled according to their run-time priority. As the execution of a task may be interrupted by the environment, synchronization is required to ensure data consistency. This is achieved through the acquisition of "resources" which raise the priority of the executing task such that the priority-driven scheduler is no longer tempted to preempt it. For real-time systems, ingenious synchronization protocols have been devised to determine how priorities are adjusted in order to maintain schedulability (in the face of hard real-time constraints) even as usage of shared resources increase. Nevertheless, it remains the task of the programmer to correctly use the given synchronization primitives to maintain data consistency. We present static analysis techniques to verify the correct use of these resource acquisition primitives. Our analysis takes into account the refined control flow induced by priority-driven scheduling where priorities are adjusted according to a specific widely-used synchronization protocol, the immediate ceiling priority protocol. (Joint work with Martin Schwarz, Helmut Seidl, Markus Müller-Olm and Peter Lammich.)

Jan Willemson: X-Road Pseudonymization Service

Slides of the talk: [pdf]

Abstract: Pseudonymization is sometimes used as a light-weight alternative for fully cryptographic solutions, when information from different data sources needs to be linked in a privacy-preserving manner. In this talk, we review some previously proposed pseudonymization techniques, point out their design flaws and propose a simple solution based on a unified database access layer called X-Road developed in Estonia. Our solution has been fully implemented and the benchmarking results together with the security analysis are presented to conclude the talk.

Liina Kamm
Peeter Laud
Helger Lipmaa
Tarmo Uustalu
Varmo Vene
Viimane uuendus 9.02.2011