Computer Vision and Our Biological Vision Algorithms
In computer science, we often come across sub-fields that turn out to be rabbit holes, yet they keep amazing us as we try to grasp their nature, no matter how many times we have dived into them.
One of them is computer vision. This field is special because it has roots in our own visual system. To understand how artificial vision, image recognition and similar tasks are possible at all, we have to go back to the source: the biological ability we are trying to replicate, one we acquired automatically at birth.
Many theories exist, and some have been discarded, but let’s take “geons” as an example. Geons, short for “geometric ions”, come from a theory suggesting that we, as humans, break objects down into simple geometric shapes and recognize objects in the world by how those shapes are combined.
Even though the human vision system is not that simple to define, theories like this show that we are trying to solve the problem with the understanding we have.
One thing I find fascinating about human vision is that our algorithms have bugs in them, just like the algorithms we write for our computers.
Our vision system gets fooled when something doesn’t have edges that we can distinguish. It also gets fooled when an object is far away and there is nothing nearby to help us judge its size and distance.
Optical illusions are evidence that our biological algorithms contain plenty of errors.
The question becomes: if we don’t even have a perfect vision algorithm ourselves, how are we going to make our artificial, computer-based vision technology useful beyond what it is now?
Mathematically speaking, as a course from the University of Colorado Boulder suggested, it is impossible to determine what an object in 3D space actually looks like from a 2D image alone.
Meaning, we can’t be sure of what a table actually looks like, because our vision system only receives light-intensity signals on a 2D plane: the retina.
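To make that ambiguity concrete, here is a minimal sketch using an idealized pinhole camera as a stand-in for the eye. The `project` function, the focal length and the table-corner coordinates are all made up for illustration; none of them come from the course mentioned above. The sketch shows a small table up close and a table four times larger and four times farther away landing on exactly the same spot of the 2D image.

```python
def project(point, focal_length=1.0):
    """Project a 3D point (x, y, z) onto the 2D image plane of an
    idealized pinhole camera (an assumption, not a model of the real eye)."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    # Perspective division: depth z is folded into the 2D coordinates and lost.
    return (focal_length * x / z, focal_length * y / z)


# Hypothetical corner of a small table close to the camera...
near_corner = (0.5, 0.25, 2.0)
# ...and the same corner of a table four times larger, four times farther away.
far_corner = tuple(4 * c for c in near_corner)

near_pixel = project(near_corner)
far_pixel = project(far_corner)

print(near_pixel)  # (0.25, 0.125)
print(far_pixel)   # (0.25, 0.125)
```

Both corners produce the same 2D coordinates, so from this single image there is no way to tell which of the two tables we are looking at. Any extra certainty has to come from other cues, such as nearby objects of known size.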
Can we make artificial intelligence that can do better than us?
Can we break the boundaries that our biological vision system sets for us, as captured by the formula below?
$$I(u, v) = \int_V L(p, n, v) \cdot \mathrm{PSF}(p, u, v) \, dV$$
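One way to get a feel for this formula is to approximate the integral with a plain sum. The sketch below is my own reading, with several assumptions: L is taken as the light leaving a scene point p, PSF as the weight with which that point contributes to the image position (u, v), and the integral over the volume V as a sum over a handful of sample points. The Gaussian `gaussian_psf`, the `pixel_intensity` helper and the two toy scenes are hypothetical, and for simplicity the radiance is a fixed number per sample rather than a function of the normal n and the viewing direction.

```python
import math


def gaussian_psf(p, u, v, sigma=0.5):
    """Hypothetical PSF: how strongly scene point p contributes to image
    position (u, v). The point is projected with a pinhole model and the
    result is blurred with a Gaussian of width sigma."""
    x, y, z = p
    du, dv = x / z - u, y / z - v
    return math.exp(-(du * du + dv * dv) / (2 * sigma * sigma))


def pixel_intensity(u, v, samples, psf=gaussian_psf):
    """Riemann-sum approximation of I(u, v).

    samples is a list of (p, radiance, dV) tuples, where radiance is the
    already-evaluated value of L at p and dV is the volume the sample covers."""
    return sum(radiance * psf(p, u, v) * dV for p, radiance, dV in samples)


# Two toy scenes: one glowing point close to the camera, one farther away.
scene_near = [((0.0, 0.0, 1.0), 1.0, 1.0)]
scene_far = [((0.0, 0.0, 5.0), 1.0, 1.0)]

print(pixel_intensity(0.0, 0.0, scene_near))
print(pixel_intensity(0.0, 0.0, scene_far))
```

Under these assumptions both scenes give the same intensity at (0, 0), and at every other image position too, because the sum collapses the whole volume onto a 2D plane. The depth of the point never shows up in the result.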