I'm an assistant professor at the Institute of Informatics at Hacettepe University. I received my PhD in Computer Engineering from Hacettepe University, where my research focused on sign language recognition with limited supervision and semantic representations.
I'm interested in machine learning (data-efficient learning with minimal supervision), computer vision (image and video understanding), multimodal machine learning (integrating and reasoning over multiple data modalities such as text, vision, and audio), and machine learning in production (the deployment and scalability of machine learning systems in real-world settings).
The paper introduces MoDeST, a multi-domain and multilingual dataset for scientific title generation in English and Turkish, evaluates large language models across various scientific input types and learning settings, and highlights the effectiveness of abstracts and domain-specific modeling for improving title generation performance.
We address zero-shot sign language recognition by leveraging textual and attribute-based semantic class representations from sign language dictionaries, introducing three benchmark datasets and a spatiotemporal recognition model that integrates visual and semantic features to recognize previously unseen sign classes.
We propose a novel embedding-based framework for few-shot sign language recognition (FSSLR) in cross-lingual settings, leveraging spatiotemporal visual and hand landmark features, and introducing three multilingual benchmarks to demonstrate its effectiveness in recognizing novel signs in unseen languages with limited examples.
We introduce the "WildestFaces" dataset and a novel clean-to-violent domain transfer framework for recognizing individuals in violent videos using models trained on clean ID photos, addressing challenges of domain discrepancy, limited video data, through stacked affine-transforms, attention-driven temporal adaptation, and a self-attention-based model.
The paper introduces the zero-shot sign language recognition (ZSSLR) problem using the ASL-Text dataset with 250 classes and textual descriptions, proposing a spatiotemporal framework combining 3D-CNNs, BiLSTMs, and text embeddings to recognize unseen signs by leveraging dictionary-based semantic representations.
Patents
A Method for Real-Time Running Detection and Tracking in Video Footage, National. Y. C. Bilge et al., 2023 013301, 21.02.2025.
Calibration of Angle Measuring Sensors (IMU) in Portable Devices Using DEM and Landform Signature, National. Y. C. Bilge et al., 2023 003176, 21.02.2024.