Methods and Algorithms for Object Detection in Images and Video Based on the YOLO Family of Architectures

Student's Name: Salamonovych Yurii Zinoviiovych
Qualification Level: Master
Speciality: Information Technology Design
Institute: Institute of Computer Science and Information Technologies
Mode of Study: full-time
Academic Year: 2025-2026
Language of Defence: Ukrainian
Abstract: Yurii SALAMONOVYCH; Tamara KLYMKOVYCH, PhD (Phys.–Math.), Associate Professor of the Department of ASP (supervisor). Methods and Algorithms for Object Detection Based on the YOLO Family of Architectures. Master’s Thesis – Lviv Polytechnic National University, Lviv, 2025.
The current stage of information technology development is characterized by the active integration of intelligent systems capable of performing tasks that until recently were considered the exclusive domain of human cognition. One of the fastest-growing areas is computer vision, which combines methods from computer science, mathematics, biology, and the cognitive sciences. Its primary goal is to develop algorithms and models that automatically interpret visual information in a manner similar to human perception.
The aim of the master’s thesis is to investigate neural network methods of computer vision, identify their advantages and limitations, and analyze the prospects of applying modern architectures to practical tasks.
The scientific novelty of the work lies in the systematic analysis and generalization of contemporary neural-network-based approaches to object detection, covering both two-stage and single-stage architectures, and in revealing the role of transformer models in forming global context during image processing. For the first time, the study conceptually compares the mechanisms of feature extraction, bounding-box regression, and attention modeling, which provides a deeper understanding of the evolution of computer-vision methods and highlights their potential for improving the accuracy and efficiency of object-detection algorithms.
Practical significance of the obtained results: the results of this research can be applied in the development of video-surveillance systems, autonomous vehicles, medical diagnostics, robotics, and other fields that require fast and accurate object detection.
The presented methods and analytical approaches may serve as a foundation for building custom detection models, improving existing algorithms, and optimizing their computational performance.
Object of the study: processes of automated image interpretation in computer-vision systems.
Subject of the study: neural-network object detection models, CNN and transformer architectures, and methods of classification, segmentation, localization, and keypoint analysis.
Keywords: COMPUTER VISION, CONVOLUTIONAL NEURAL NETWORKS, TRANSFORMERS, OBJECT DETECTION, SEGMENTATION, IMAGE CLASSIFICATION, CNN, R-CNN, YOLO, SELF-ATTENTION.
List of used sources:
1. Goodfellow I., Bengio Y., Courville A. Deep Learning. MIT Press, 2016.
2. Szeliski R. Computer Vision: Algorithms and Applications. Springer, 2022.
3. Girshick R. Rich feature hierarchies for accurate object detection and semantic segmentation (R-CNN), 2014.