I am a machine learning expert working at Cyan Security. My current research focuses on applying machine learning techniques to cyber security.
I received an MSc in Applied Mathematics from Técnico Lisboa. I worked for almost two years on machine learning applied to Natural Language Processing, both in my MSc thesis and at Priberam. In 2016 I moved to Vienna to work in cyber security, an area that has always fascinated me. After working as a project assistant for the Communications Networks group at the Technical University of Vienna, Austria, I decided to move to industry.
Outside of work I’m passionate about music, in particular metal (check out some of my work), and about video games.
MSc in Applied Mathematics, 2015
BSc in Applied Mathematics and Computation, 2013
I was part of the BigDAMA project, a fundamental research project focused on big data analytics for network traffic monitoring and analysis.
My research mostly concerns how to represent network traffic for detecting attacks at the network level (i.e., which features to use). I experimented with both classical features (what have people been using, and why?) and feature learning approaches based on Deep Learning techniques, in particular representing traffic in 2-dimensional spaces.
I was part of the SUMMA project, a large H2020 project.
I specifically tackled the problem of named entity linking, in which, given a text with some entity mentions (e.g., David and Victoria), the goal is to find the corresponding entities in Wikipedia (e.g., David Beckham and Victoria Beckham). We used both classical methods (such as SVMs) and modern Deep Learning approaches.
DeepArchitect is a framework for automatically searching over computational graphs in arbitrary domains, designed with a focus on modularity, ease of use, reusability, and extensibility.
City-GAN uses Conditional Generative Adversarial Networks to generate pictures of fake buildings with the architectural characteristics of a specific city.
The Traffic Flow Mapper is a prototype tool for visualizing network traffic in 2D.
MDCGenPy is a synthetic dataset generator made specifically for testing clustering algorithms. It offers great flexibility in generating data with specific shapes, with little effort.
The NTARC database is a collective effort to label and categorize research in the network traffic analysis field.
Multilingual embeddings can be used for any Natural Language Processing task that spans multiple languages.