Gergely Papp | Machine Learning Engineer

Prisma Present

Extracting psychological data, resilience, and player communication statistics from e-sport footage for further analysis.

Python, Huggingface, Docker, Pygame, Databricks

Linux Bash Agent Present

Developing a natural language interface for terminal operations that can execute complex tasks through prompting, with optimization for deployment on resource-constrained devices.

Python, Azure, Huggingface

RAG 2024

Participated in a Retrieval Augmented Generation project involving vector databases, knowledge graphs, and text generation with LLMs.

Python, Huggingface, PyTorch

MSc Thesis 2024

Investigated Vision Transformers' ability to generalize across object properties (shape, texture, color, count) on CLEVR-4.

Python, Huggingface

Stitch-BERT 2023

Analyzed how NLP transformers fine-tuned for different languages and tasks relate geometrically and functionally, revealing potential for cross-task insights.

GitHub Repository

PyTorch, Python

Gaming Bot 2023

Developed a rule-based AI in NodeJS that automates gameplay for a browser game, handling attack timing, reacting to reports, and logging to an HTML dashboard.

NodeJS, HTML, JavaScript

Energy Consumption Prediction 2023

Developed a time series forecasting model that predicts energy consumption from historical data, covering data collection, cleaning, preprocessing, and model training.

Python, PyTorch, Docker

Self-Supervised Learning 2022

Explored innovative self-supervised image classification methods competing with state-of-the-art approaches, achieving promising results on smaller datasets.

GitHub Repository

Python, PyTorch, Wandb, Docker

Functional Similarity 2021

Demonstrated that geometric and functional similarity in neural networks are distinct concepts using affine transformations between networks.

Lead Engineer - Responsible for codebase, experiments, and analysis

First NeurIPS paper accepted from a Hungarian institute (2020), establishing a novel approach to comparing neural network representations beyond geometric similarity.

GitHub Repository

Python, PyTorch

Image Interpretation of CNNs 2021

Visualized the role of neurons in CNNs using Lucid and GANs, revealing what images best represent specific classes in a CelebA-trained classifier.

Python, Lucid, TensorFlow

License Plate Recognition 2020

Initiated and built a ViT-based license plate recognition system during a brief window of opportunity while management was away.

Main ML Engineer - Led data collection, labeling, model training, and production integration

Successfully replaced an expensive third-party solution with a superior in-house system that became a core product for Asura Technologies, demonstrating initiative and technical excellence.

Python, TensorFlow, OpenCV, NumPy, C#, Jenkins, CI/CD

MRZ Extraction 2020

Developed a system to extract Machine Readable Zones from passport images through comprehensive data collection, preprocessing, and model training.

Python, TensorFlow, OpenCV, Tesseract, NumPy

Home Quarantine System 2020

Created an ML system to track quarantine compliance during the COVID-19 pandemic, with face detection and anti-cheating mechanisms.

ML Engineer - Led data collection, processing, and model training

Successfully delivered a critical application under tight time constraints that was adopted by the Hungarian government and used by tens of thousands of citizens.

Project Overview

Python, TensorFlow, OpenCV, NumPy, Jenkins

Wheel Counter 2020

Developed a real-time system to count wheels through comprehensive data collection, preprocessing, and model training.

Python, TensorFlow, Keras, OpenCV, Neptune.ai, Jenkins, CI/CD

Car + License Plate Detection 2019

Implemented a real-time YOLO-based system for detecting vehicles and license plates, overcoming challenges with fish-eye camera distortion through specialized preprocessing.

Python, TensorFlow, Keras, SQL, OpenCV, REST API, Neptune.ai, Jenkins, CI/CD

People Counter 2019

Developed a tool that counts people entering a shopping mall in real time with 90%+ accuracy, providing reliable customer traffic estimates.

Python, TensorFlow, Keras, OpenCV, Jenkins

Make & Model Recognition 2019

Categorized car images into their make and model with 90%+ accuracy through comprehensive data collection, preprocessing, and model training.

Python, TensorFlow, Keras, OpenCV, Jenkins, CI/CD

Watermeter Reader 2019

Built an end-to-end OCR solution that automatically cleans, rotates, and extracts readings from watermeter images, deployed to the cloud for seamless integration.

Python, TensorFlow, Docker, REST API

AlphaZero 2018

Reimplemented AlphaZero to compare temporal difference learning with Monte Carlo methods. The study revealed unique in-game strategies learned through reinforcement learning.

Python, TensorFlow

Balance Sheet Reconciliation 2018

Engineered a tool using hierarchical clustering to reconcile balance sheets between different accounting systems, automating complex financial comparisons.

Python, Pandas, NumPy, Excel, Scikit-learn

Invoice OCR 2018

Architected an OCR system that translates invoices into structured formats and extracts relevant information using a shallow neural network for efficient data processing.

Python, TensorFlow, Tesseract, Scikit-learn

Structured VAE Latent 2018

Investigated latent space properties of structured VAEs by forcing a torus shape while reconstructing clock images, advancing understanding of controlled generative models.

Python, TensorFlow, OpenCV

Gun Detection 2018

Tackled the challenge of detecting small firearms in high-resolution images in real time, discovering the critical importance of contextual information for accurate detection.

Python, TensorFlow, OpenCV, NumPy

Time Series Forecasting 2017

Built a predictive model using linear regression to forecast future values based on historical time series data, enabling data-driven decision making.

Python, Scikit-learn, Pandas

Chess Engine 2017

Designed a Java-based neural chess engine from scratch without the use of tree search, achieving entry-level play.

Java

Technical Skills

Languages

Libraries & Frameworks

DevOps & Tools

Professional Experience

Deep Learning Research Engineer

Alfréd Rényi Institute of Mathematics

Teaching Assistant

University of Amsterdam

Machine Learning Consultant

Asura Technologies Ltd.

Machine Learning Engineer

Asura Technologies Ltd.

Risk Analyst (AI Team)

Morgan Stanley

Education

Master of Artificial Intelligence

University of Amsterdam

Bachelor of Computer Science

University of Manchester

Research & Publications

Neural Networks (2023)

ReScience (2023)

NeurIPS (2021)

AITP (2021)

Achievements & Distinctions

European Champion in Pool Billiard (2010)

27th place in National Secondary School Mathematics Competition

24th place in National Secondary School Programming Competition

Participated in multiple hackathons

Hobbies & Interests

Gym

Piano

Video Games

Board Games

Pool Billiard