HVS: Reviewing the World's Best Imaging System and Three Preferred Features

When you scan through the viewfinder of your newly acquired, sophisticated digital camera to achieve the best possible frame, have you ever wondered about another automated scanning system behind the viewfinder that is silently directing you towards that desired frame? Your eye, the HUMAN VISUAL SYSTEM (HVS), has been likened to a camera in many descriptions and indeed, superficially, this is true.

I have chosen to write this article to cite just three pieces of evidence where digital camera technology has adopted the anatomy and physiology of the HVS and has yet to match its results. The HVS is actually a scanning system: it is proficient in fine focusing, exposure and white balance, is capable of compression, can respond to lighting conditions that vary over up to 12 orders of magnitude, and capably delivers information about the three-dimensional world around us. This complex processing takes place predominantly in the brain (the visual cortex) and leads us to perceive images approximately 1/10 to 2/10 of a second after they occur. Beyond this simplest of explanations, the structure and operation of the human eye are far more complicated than those of the most sophisticated consumer digital cameras on the market to date. In this context, we must not overlook the incredible variation in complexity, operation and performance of visual systems in other animals, but that is beyond the scope of this discussion.

Understanding the basic functioning of the human eye leads to a better understanding of the design and operation of the digital camera systems that capture consumers’ imagination. Does that knowledge make anyone a better photographer? Not necessarily, just as knowledge of grammar does not necessarily make anyone a captivating writer, yet it is desirable. I have intentionally confined this article to the RETINA to keep the discussion precise, but this comparative study can be extended to dark adaptation, movement and focusing (accommodation), elementary color theory, binocular vision, and so on. I am publishing this article to encourage aspiring photographers not to chase the hype and mirage of the ‘ultimate digital camera system’, and to help them understand that even the most sophisticated consumer digital camera system is only a tool to capture OUR imagination and, in many respects, cannot match the final image formed by the HVS. Furthermore, it must be noted that a photograph cannot be ‘created’ without the ‘creative’ capacity of the right hemisphere of the brain and brain-eye coordination, as well as applied knowledge of visual literacy, visual anthropology, time and space, socioeconomics, communication skills, psychology, and many more relevant subjects. In that context, clinical medicine and photography have many similarities in practice, but that is a topic for a different discussion.

So, what is the retina? Similar to a digital camera sensor, the retina is a layer on the inner surface of the eye consisting of photosensitive cells and nerve cells, whose task is to encode the incoming light into electrical signals for the brain or, to be precise, for the visual cortex. The photosensitive cells are the rods and cones, and the nerve cells are the ganglion cells. This, I trust, is enough to understand the rest of the article.

First instance. As consumers increasingly (and casually) use smartphones for everyday photography, the first instance is about the much-talked-about dual-camera phones, famous for capturing detail in low light or difficult lighting conditions, which created waves in the consumer digital market. Technically, it is a setup of two cameras, a concept similar, but not identical, to the stereo cameras used for 3D or range imaging; to capture detail in low light, manufacturers installed one RGB camera and one monochrome camera. So when you press the shutter (or rather a shutter icon on the screen) of your smartphone, it captures two images simultaneously, one in color and one in monochrome, and finally combines them to produce a sharp, vivid image. The detailed operation is mentioned here if you would like to know. So, you are definitely happy to grab the ultimate camera phone, and proud that there is no evidence of ‘pixelation’ when you magnify the image many times on the screen with your finger gestures to make your friends envious. Congratulations, my friend, on your new digital device! But that is a pretty fundamental ‘technology’ of the human eye and, I must say, quite an old-fashioned one.

As mentioned earlier, the retina of the human eye consists of photosensitive cells: rods and cones. These cells are primarily responsible for three types of vision. The rods are highly sensitive cells and provide monochromatic vision at low light levels (scotopic vision); three variations of cone cells provide color vision in bright light (photopic vision); and finally, both types of cells work together in difficult lighting situations between moonlight and twilight (an illumination range from 0.034 to 3.4 cd m^-2) by using both monochrome and color vision (mesopic vision). Without going into further detail, the adoption of one color and one monochrome camera is nothing but an attempt to replicate this mesopic vision to produce a sharp image in low light, yet it never comes close to the final outcome of the eye.

Sometimes human binocular vision is loosely compared with this dual-camera setup in smartphones, but it is not worth lengthening this article with that ‘technological marvel’ of the human eye.
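
For readers who like to see ideas in code, here is a minimal Python sketch of the two notions above. The luminance thresholds are the mesopic limits quoted in this article; everything else is my own assumption for illustration. In particular, `fuse_pixel` is a hypothetical, deliberately naive way of combining a color and a monochrome reading (keep the color ratios, take the brightness from the monochrome value), and it is not the algorithm of any phone manufacturer, nor a model of how the retina actually works. The luma weights are the standard Rec. 601 coefficients.

```python
# Mesopic limits in cd/m^2, taken from the range quoted in the article.
MESOPIC_LOW = 0.034
MESOPIC_HIGH = 3.4


def vision_regime(luminance):
    """Classify a luminance level (cd/m^2) into the three types of vision."""
    if luminance < MESOPIC_LOW:
        return "scotopic"   # rods only: monochrome, low light
    if luminance <= MESOPIC_HIGH:
        return "mesopic"    # rods and cones working together
    return "photopic"       # cones only: color, bright light


def luma(rgb):
    """Rec. 601 luma of an (r, g, b) pixel with channels in the range 0..1."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def fuse_pixel(rgb, mono):
    """Hypothetical fusion: keep the color ratios of rgb, take brightness from mono."""
    y = luma(rgb)
    if y == 0.0:
        return (mono, mono, mono)  # no color information, fall back to grey
    scale = mono / y
    # Clip to 1.0 so that a bright monochrome reading cannot overflow a channel.
    return tuple(min(1.0, channel * scale) for channel in rgb)


def fuse_image(color_image, mono_image):
    """Fuse two same-sized images given as nested lists of pixels."""
    return [
        [fuse_pixel(c, m) for c, m in zip(color_row, mono_row)]
        for color_row, mono_row in zip(color_image, mono_image)
    ]


if __name__ == "__main__":
    for level in (0.01, 1.0, 100.0):
        print(level, "cd/m^2 ->", vision_regime(level))
    # A dim, noisy color pixel combined with a cleaner monochrome reading.
    print(fuse_pixel((0.2, 0.1, 0.05), 0.3))
```

The point of the sketch is only that the monochrome channel supplies the brightness detail while the color channel supplies the hue, which is the division of labor the rods and cones share in mesopic conditions.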

The second instance is about compression. Since its introduction, the Joint Photographic Experts Group (JPEG) format has been the commonest image format in digital camera systems, from mobile phones to high-end consumer digital cameras, and the most widely accepted digital image file format on almost all digital platforms, including the World Wide Web. Technically, JPEG is a lossy, or irreversible, compression of digital images that typically achieves 10:1 compression with little perceptible loss in image quality, yet the end result is still lossy. From a professional editorial image to a casual Instagram post, JPEG is the backbone of the entire digital image industry.

Now, coming back to the HVS. The photosensitive cells of the retina (rods and cones) encode the incoming light into electrical signals, which are then transmitted by the nerve cells (the axons of the ganglion cells) that form the optic nerve, a process quite similar to the encoding performed by digital camera sensors. There are approximately 125 million photosensitive cells in the retina, but only about 1 million nerve cells (ganglion cells). So, if we consider mesopic vision, the light information from 125 million photosensitive cells is encoded and transmitted through 1 million nerve cells. This may be interpreted as compression of the image in the order of 125:1, and it is lossless. The significance of this level of compression may be demonstrated by comparing an image that has been compressed to a level of 10:1 using the lossy JPEG format with its original. Apart from proprietary digital negative or RAW files, no consumer camera system is able to produce a compressed file format that is as robustly lossless as that produced by the HVS. In this context, it must be noted that the HVS is not equipped to capture and store RAW images to process them later.

This highest level of lossless compression is also possible by exploiting temporal correlation. Imagine an anchor in a stage show; the anchor moves during the event but the stage is fixed. Therefore, the stage will exhibit temporal correlation, and this information need not be transmitted again.
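
The arithmetic and the stage-show example above can be put into a few lines of Python. This is only a sketch under my own assumptions: the cell counts are the approximate figures quoted in this article, and `changed_pixels` and `apply_changes` are hypothetical helper names that treat a "frame" as a flat list of brightness values. It is not a model of retinal processing or of any real video codec; it simply shows that sending only what changed still lets the receiver rebuild the frame exactly.

```python
# Approximate cell counts quoted in the article.
PHOTORECEPTORS = 125_000_000
GANGLION_CELLS = 1_000_000


def compression_ratio(inputs, outputs):
    """Ratio of input channels to output channels, e.g. 125.0 for 125:1."""
    return inputs / outputs


def changed_pixels(previous, current):
    """Return {index: new_value} for every pixel that differs between two frames."""
    return {
        index: new
        for index, (old, new) in enumerate(zip(previous, current))
        if old != new
    }


def apply_changes(previous, changes):
    """Rebuild the current frame from the previous frame plus the changes."""
    frame = list(previous)
    for index, value in changes.items():
        frame[index] = value
    return frame


if __name__ == "__main__":
    print("HVS ratio:", compression_ratio(PHOTORECEPTORS, GANGLION_CELLS))

    # The "stage" is a row of ten dark pixels; the "anchor" is the bright pixel.
    frame_1 = [0, 0, 9, 0, 0, 0, 0, 0, 0, 0]
    frame_2 = [0, 0, 0, 9, 0, 0, 0, 0, 0, 0]  # the anchor moved one step

    changes = changed_pixels(frame_1, frame_2)
    print("values sent:", len(changes), "instead of", len(frame_2))

    # The reconstruction is exact, so nothing was lost.
    assert apply_changes(frame_1, changes) == frame_2
```

Only the two pixels touched by the anchor need to be sent; the eight unchanged stage pixels are the temporally correlated part that is never retransmitted.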

The third point of discussion is about the most common term on digital imaging platforms: RGB, or Red-Green-Blue. All the consumer digital camera giants are experimenting with RGB, implementing it in different ways and patterns on the sensor to achieve the best possible image and color reproduction. But the source of these concepts and experiments is actually the underlying phenomenon of human vision. Whilst only one type of rod cell exists in the human retina, three types of cone cells have been identified. These three types of cone cells are sensitive to three different colors and are hence denoted as short (S), medium (M) and long (L) wavelength cones, but they are by no means found in equal numbers. Here is my attempt to put their color sensitivity, peak sensitivity and distribution percentage in tabular form for easier understanding:

Cone type	Color sensitivity	Peak sensitivity	Distribution %
Short (S)	Blue	420 nm (highest)	1-2%
Medium (M)	Green	534 nm	32%
Long (L)	Red	564 nm	64%

*The integrated peak sensitivity of the cones is close to 555 nm, which corresponds closely to the peak output of the sun.
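
As a rough sanity check on the figures in the table, here is a small Python sketch that averages the three peak wavelengths, weighting each by its share of the cone population. Please treat it as a back-of-the-envelope illustration only: taking 1.5% as the midpoint of the 1-2% range is my assumption, and a population-weighted mean of peak wavelengths is not how the eye's overall spectral sensitivity is actually measured.

```python
# (cone type, peak wavelength in nm, share of cone population in %)
# Values come from the table above; 1.5 is an assumed midpoint of "1-2%".
CONES = [
    ("S", 420.0, 1.5),
    ("M", 534.0, 32.0),
    ("L", 564.0, 64.0),
]


def weighted_peak(cones):
    """Population-weighted mean of the peak wavelengths, in nm."""
    total_share = sum(share for _, _, share in cones)
    weighted_sum = sum(peak * share for _, peak, share in cones)
    return weighted_sum / total_share


if __name__ == "__main__":
    print(f"weighted peak: {weighted_peak(CONES):.1f} nm")
```

With the numbers from the table this comes out at a little under 552 nm, which sits in the same neighborhood as the 555 nm figure in the note above.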

By presenting these instances, I am not at all discouraging anyone from buying a new digital camera system or upgrading to newer technology; rather, my intention is to encourage aspiring photographers to nurture their own vision, the ultimate camera system that they already own, and to start practicing scanning (observing) and capturing frames without the assistance of any camera. This will enable anyone to make the most of his or her existing digital camera system, because constraints often give birth to the best creative minds. Or, as is often quoted, necessity is the mother of invention.

So, let’s start scanning your environment without wasting time on researching and reading theoretical reviews of digital cameras on the Internet. Welcome to the world of the art and science of Imaging!