
MSc Thesis Defense: Feyza Teker, MULTI-DATASET LABEL SPACE UNIFICATION FOR OFF-ROAD AUTONOMOUS DRIVING USING LARGE LANGUAGE MODELS, Date & Time: 30 June, 2026 – 9:00 AM, Place: FENS L029

MULTI-DATASET LABEL SPACE UNIFICATION FOR OFF-ROAD AUTONOMOUS DRIVING USING LARGE LANGUAGE MODELS

 

 

Feyza Teker
Mechatronics Engineering, MSc Thesis, 2026

 

Thesis Jury

Prof. Mustafa Ünel (Thesis Advisor)

Assoc. Prof. Kemalettin Erbatur

Assoc. Prof. Ali Fuat Ergenç

 

 

 

Date & Time: 30 June, 2026 – 9:00 AM

Place: FENS L029

Keywords: off-road autonomous driving, semantic segmentation, multi-dataset training, large language models, ontology mapping

 

Abstract

 

Training semantic segmentation models for off-road autonomous driving is difficult because the available datasets are individually small, geographically narrow, and annotated under inconsistent label conventions. This thesis proposes a multi-dataset training framework that unifies several off-road and urban datasets under a single label space through automated, LLM-based ontology construction and knowledge distillation. The unified taxonomy and the dataset-specific mapping functions are generated by Gemini 2.5 Pro rather than built by hand. Variability in the LLM output is controlled through repeated queries, consensus filtering, and reverse cross-validation under explicit error criteria. Each master label is annotated with a discrete traversability tier, which embeds navigation-relevant information directly into the label space. A teacher model fine-tuned on GOOSE generates pseudo-labels for five off-road auxiliary datasets and one urban auxiliary dataset using test-time augmentation; the pseudo-labels are constrained by the ground-truth annotations through the ontology mapping and filtered by a per-pixel confidence threshold. The student is trained with tempered dataset sampling to compensate for the size imbalance between datasets. Mask2Former and OneFormer are each evaluated as both teacher and student. On the GOOSE validation set, the best teacher reaches 0.673 mIoU and the best student reaches 0.704 mIoU, surpassing the published GOOSE baseline and showing that LLM-based automated ontology construction can match and exceed the quality of manual construction. The framework offers a scalable path towards multi-dataset training in the off-road domain.
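As an illustration of the tempered dataset sampling mentioned in the abstract, the sketch below uses one common form, in which each dataset is drawn with probability proportional to its size raised to the power 1/T. The abstract does not state the formula or temperature used in the thesis, so the function name `tempered_sampling_probs`, the temperature values, and the dataset sizes are all assumptions made for illustration only.

```python
def tempered_sampling_probs(dataset_sizes, temperature=2.0):
    """Return per-dataset sampling probabilities p_i proportional to n_i ** (1 / temperature).

    Assumed form, not taken from the thesis: temperature=1.0 gives
    size-proportional sampling, larger temperatures flatten the
    distribution towards uniform.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    weights = {
        name: size ** (1.0 / temperature)
        for name, size in dataset_sizes.items()
    }
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical image counts, chosen only to make the effect visible.
sizes = {"large": 9000, "small": 1000}

proportional = tempered_sampling_probs(sizes, temperature=1.0)
tempered = tempered_sampling_probs(sizes, temperature=2.0)
```

With these hypothetical sizes, the small dataset is sampled more often under the tempered setting than under size-proportional sampling, which is the kind of compensation for size imbalance the abstract describes.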
