Data-driven safety specification using Large Language Models
- Subject:Large Language Models, Contract-based Design, functional safety, specification mining
- Type:Masterarbeit
- Date:ab 01 / 2025
- Tutor:
Data-driven safety specification using Large Language Models
Context
Cyber-physical systems are often used in safety-critical environments, for example cars, airplanes or medical devices. Ensuring (functional) safety is therefore a central task in the development of the systems. A formal specification that defines the guarantees and assumptions of the system is particularly important for this purpose. Signal-temporal logic (STL) is particularly suitable as a formal description language for cyber-physical systems. Unfortunately, this methodology entails additional effort in development: a specification in STL is required for each component, which to date has to be created manually.
This Master's thesis is therefore dedicated to the question: how can the STL formalization of a component be automated? How can we determine the system's specification from only recorded data? Various publications have shown that contracts can be derived from data using genetic algorithms, decision trees or clustering methods. The approach to be investigated here aims instead to use a Large Language Model (LLM) for the automatic creation of the formal description. LLMs have shown outstanding results in the generation of texts in various application domains. The central question of the master thesis is therefore how to teach an LLM to generate a text in formal language from signals (time series).
Goals
- Research on the application of LLMs to time series, such as prompting, quantization and alignment with language
- Design and implementation of a suitable method
- Creation of a data set for training and evaluation of the process
- Evaluation of the implemented method
Requirements
- Interest in LLMs, functional safety and formal methods (STL)
- Solid programming skills (Python)
- Reliable and independent way of working