Daniel C. Ferreira

Machine Learning Engineer

Freelancer

ABOUT ME

I am a Machine Learning expert based in Vienna, Austria, with a background in mathematics. I have been working in academia and industry since 2016. My focus is on applying ML techniques to Natural Language Processing and cybersecurity.

I have experience in the full lifecycle of ML, from data collection, processing, and labeling all the way to deployment, and I like to work on projects where a little bit of everything is needed. My past professional experience includes a leading cybersecurity company and the Technical University of Vienna, among others.

Outside of work I’m passionate about music, in particular metal (check out some of my work), and also video games.

Interests

Artificial Intelligence
Cyber Security
Natural Language Processing
Blockchain
Music

Education

MSc in Applied Mathematics, 2015
Técnico Lisboa
BSc in Applied Mathematics and Computation, 2013
Técnico Lisboa

EXPERIENCE

Machine Learning Engineer

Freelancer

Oct 2022 – Present

Open to embarking on new challenging projects in NLP and cybersecurity. Feel free to reach out if you think I can help you :)

Also available via Toptal.

DATA SCIENTIST

CYAN SECURITY

Jun 2019 – Sep 2022 Vienna, Austria

I tackled a multitude of ML problems at the intersection of NLP and cybersecurity.

I worked on topics such as website categorization (using multilingual text and images), DNS tunneling detection, and IoT security. I placed heavy emphasis on developing production-ready ML models and following MLOps best practices, using state-of-the-art Machine Learning models (e.g., Transformers).

RESEARCHER

TU WIEN

Aug 2016 – Feb 2019 Vienna, Austria

I was part of the BigDAMA project, a fundamental research project focused on big data analytics for network traffic monitoring and analysis.

My research mostly concerned how to represent network traffic for detecting attacks at the network level (i.e., which features to use). I experimented with both classical features (what have people been using, and why?) and feature learning approaches using Deep Learning techniques, in particular representing traffic in 2-dimensional spaces.

JUNIOR RESEARCHER

PRIBERAM

Mar 2016 – Jul 2016 Lisbon, Portugal

I was part of the SUMMA project, an H2020 project for media monitoring with multiple industry and academic international partners.

I specifically tackled the problem of entity linking in which, given a text with some entity mentions (e.g., David and Victoria), the goal is to find the corresponding entities in Wikipedia (e.g., David Beckham and Victoria Beckham). We used both classical methods (such as SVMs) and modern Deep Learning approaches.

Featured Publications

July, 2019 IJCNN 2019

Extreme Dimensionality Reduction for Network Attack Visualization with Autoencoders

We used semi-supervised Autoencoders to obtain 2D visualizations of network traffic that separate distinct types of attacks.

August, 2017 SIGCOMM Reproducibility

A meta-analysis approach for feature selection in network traffic research

We analyse the features used in network traffic research, and propose a new traffic vector based on how often each feature is chosen in the literature.

August, 2016 ACL

Jointly Learning to Embed and Predict with Multiple Languages

We propose a joint formulation for learning task-specific cross-lingual word embeddings, along with classifiers for that task. We obtain state-of-the-art results on multiple multilingual datasets.