Machine Learning



June 2019 – Present
Vienna, Austria



August 2016 – February 2019
Vienna, Austria



I was part of the BigDAMA project, a fundamental research project focused on big data analytics for network traffic monitoring and analysis.

My research focused mostly on how to represent network traffic for detecting attacks at the network level (i.e., which features to use). I experimented with both classical features (which ones have people been using, and why?) and feature learning approaches based on Deep Learning techniques, in particular representing traffic in 2-dimensional spaces.

March 2016 – July 2016
Lisbon, Portugal



I was part of the SUMMA project, a large H2020 project.

I specifically tackled the problem of entity linking, in which, given a text with some entity mentions (e.g., David and Victoria), the goal is to find the corresponding entities in Wikipedia (e.g., David Beckham and Victoria Beckham). We used both classical methods (such as SVMs) and modern Deep Learning approaches.


DeepArchitect is a framework for automatically searching over computational graphs in arbitrary domains, designed with a focus on modularity, ease of use, reusability, and extensibility.

City-GAN uses Conditional Generative Adversarial Networks to generate pictures of fake buildings with the architectural characteristics of a specific city.

The Traffic Flow Mapper is a prototype tool for visualizing network traffic in 2D.

MDCGenPy is a synthetic dataset generator made specifically for testing clustering algorithms. It offers considerable flexibility in generating data with specific shapes, with little effort.

The NTARC database is a collective effort to label and categorize research in the field of network traffic analysis.

Multilingual embeddings can be used for any Natural Language Processing task that applies to multiple languages.

Recent Publications

We present Network Traffic Analysis Research Curation (NTARC), a data model to store key information about network traffic analysis …

We propose a formal language for encoding search spaces over general computational graphs, applicable in particular to neural network …

We present a tool for generating multidimensional synthetic datasets for testing, evaluating, and benchmarking unsupervised …

We used semi-supervised Autoencoders to obtain 2D visualizations of network traffic that separate distinct types of attacks.

We analyse the features used in network traffic research, and propose a new traffic vector based on how often they are chosen in the …

We propose a joint formulation for learning task-specific cross-lingual word embeddings, along with classifiers for that task. We …