Chapter 1. General Overview of Computer Vision Systems
In this chapter, you will learn about the fundamentals and the general scheme of a computer vision system. The chapter will enable
you to take a wide perspective when approaching computer vision problems.
Introducing computer vision systems
We observe everything around us with our five senses: touch, taste, smell, hearing, and vision. Although all five are crucial,
one of them has the biggest impact on perception. It is vision, the main topic of this book.
When looking at a scene, we understand and interpret the details within a meaningful context. This seems easy, but it is a very complex
process that is hard to model. What makes vision easy for humans and hard for devices? The answer lies in the
difference between human and machine perception, and many researchers are working to narrow that difference.
One of the most important milestones on this journey is the invention of the camera. Although a camera is a good tool for saving
visual records of scenes, it can do much more than that. As with the camera, people have always tried to build devices that make
life better. The current trend is to develop intelligent devices, and awareness of the surrounding environment is a crucial step
toward them. The same holds for us: vision makes the biggest difference. Thanks to technology, it is possible to mimic the human
visual system and implement it on various types of devices, which lets us build vision-enabled devices.
Images, and timed series of images called video, are computed representations of the real world. Any
vision-enabled device recreates real scenes via images. Because extracting interpretations and hidden knowledge from images is
complex, computers are generally used for this purpose. The term computer vision comes from the modern approach of
enabling machines to understand the real world in a human-like way. Because computer vision is needed to automate daily tasks with
devices or machines, the field is growing quickly, and many frameworks, tools, and libraries have already been developed.
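To make the idea of a computed representation concrete, the following minimal sketch models a grayscale image as a grid of pixel intensities and a video as a timed series of such grids. It uses plain Python lists rather than any vision library, and the names image, mean_intensity, brighten, and video are illustrative assumptions, not part of OpenCV.

```python
# A tiny 3x4 grayscale "image": each number is a pixel intensity
# (0 = black, 255 = white). Real images work the same way, only larger.
image = [
    [0,   0, 255, 255],
    [0, 128, 128, 255],
    [0,   0, 255, 255],
]

height = len(image)      # number of rows
width = len(image[0])    # number of columns


def mean_intensity(frame):
    """Average brightness of a frame: a very crude summary of its content."""
    total = sum(sum(row) for row in frame)
    return total / (len(frame) * len(frame[0]))


def brighten(frame, amount):
    """Return a new frame with every pixel raised by `amount`, capped at 255."""
    return [[min(255, pixel + amount) for pixel in row] for row in frame]


# A "video" is a timed series of images: (timestamp in seconds, frame) pairs.
# Here each hypothetical frame is slightly brighter than the previous one.
video = [(i / 10, brighten(image, 10 * i)) for i in range(3)]

for timestamp, frame in video:
    print(f"t={timestamp:.1f}s mean intensity={mean_intensity(frame):.1f}")
```

Every later vision step, however sophisticated, is ultimately a computation over numbers arranged like this.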
The Open Source Computer Vision Library (OpenCV) changed the game in computer vision, and many people have contributed to make
it even better. It is now a mature library that provides state-of-the-art design blocks, which are covered in later sections of this
book. Because it is an easy-to-use library, you don't need to know the complex calculations under the hood to accomplish vision tasks.
This simplicity makes sophisticated tasks easy, but you should still know how to approach problems and how to use the design tools
in harmony.
Approaching computer vision problems
To solve any complex problem, such as a computer vision problem, it is crucial to divide it into simple, realizable
substeps and to understand the purpose of each step. This chapter aims to show you how to approach any computer vision problem
and how to model the problem by using a generic model template.
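The idea of splitting a problem into small substeps, each with one clear purpose, can be sketched as a chain of functions. The stage names below (acquire, preprocess, extract_feature, decide) and the threshold values are hypothetical choices made for illustration. They are not the generic model template itself, and the frame is hard-coded instead of coming from a camera.

```python
def acquire():
    """Stand-in for a camera: returns a fixed 3x3 grayscale frame."""
    return [
        [10, 200, 30],
        [220, 240, 15],
        [5, 210, 25],
    ]


def preprocess(frame, threshold=128):
    """Turn the frame into a binary mask: 1 for bright pixels, 0 otherwise."""
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in frame]


def extract_feature(mask):
    """Reduce the mask to a single number: the count of bright pixels."""
    return sum(sum(row) for row in mask)


def decide(bright_count, min_pixels=3):
    """Report a detection when enough bright pixels are present."""
    return bright_count >= min_pixels


def run_pipeline():
    """Run the substeps in order; each one consumes the previous one's output."""
    frame = acquire()
    mask = preprocess(frame)
    bright_count = extract_feature(mask)
    return decide(bright_count)


if __name__ == "__main__":
    print("Bright object detected:", run_pipeline())
```

Because each substep has a single input and a single output, any one of them can be replaced or tested on its own without touching the rest.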
A practical computer vision architecture, explained in this book, consists of the combination of an Arduino system and an OpenCV
system, as shown in the following diagram:
Arduino is solely responsible for collecting sensory information, such as temperature or humidity, from the environment and
sending this information to the OpenCV vision controller system. The communication between the vision controller system and the
Arduino system can be either wired or wireless, as Arduino can handle both easily. After the vision system processes the data from
Arduino and the webcam, it comes to a detection (or recognition) conclusion. For example, it can even recognize your face. The next