From industrial inspection to remote healthcare, millisecond-level precision focusing is driven by the close collaboration of optics, electronics, and algorithms.
When we start a video conference or scan a document, the USB camera can instantly present a clear image thanks to autofocus technology. This seemingly simple function is in fact a precise collaboration of optical design, electronic control, and algorithmic decision-making. From traditional stepper-motor-driven lens modules, to revolutionary liquid lenses, to the migration of mobile phone camera technology into USB cameras, autofocus has developed multiple technological paths to meet the needs of different scenarios.
1. The core principle of autofocus: a closed loop of optics, evaluation, and execution
The core task of autofocus is to precisely focus the incident light on the photosensitive element by adjusting the distance between the lens and the image sensor.
In a USB camera, achieving this goal relies on the collaborative work of three major modules:
Optical acquisition system: The lens, filter, and CMOS image sensor (such as the 12 megapixel OIS12M module) are responsible for capturing incoming light and converting it into electrical signals. Light passing through different parts of the lens forms slightly offset images on the sensor, and the phase difference (PD value) between them can be used to calculate where the focal point lies.
Clarity evaluation system: After receiving image data over the USB interface, the host computer applies a Fast Fourier Transform (FFT) or gradient (differential) operations to compute spectral amplitude or edge sharpness, producing an image clarity evaluation function, or focus value (FV). The FV essentially measures image contrast by accumulating the grayscale differences between adjacent pixels: the larger the differences, the sharper the image (see the sketch after these three modules).
Execution mechanism: Following the instructions of the decision-making system, the driving device (stepper motor, VCM voice coil motor, or liquid lens) moves the lens. For example, a stepper motor drives the lens forward and backward through a transmission gear set with micrometer-level accuracy, while a VCM voice coil motor relies on electromagnetic actuation to achieve precise displacement. The entire closed-loop control process can be summarized as: capture image → calculate clarity → adjust the lens → verify the result → lock the focus. When the system detects defocus, it immediately triggers this process so that the image is restored to sharpness.
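The sketch below illustrates this loop in Python with OpenCV. It is a minimal illustration, not any particular camera's firmware: the gradient-based FV is a generic Tenengrad-style Sobel measure, and capture_frame() and move_lens() are hypothetical stand-ins for the device's actual driver interface.

```python
import cv2
import numpy as np

def focus_value(gray):
    """Tenengrad-style sharpness: sum of squared Sobel gradient magnitudes.
    Larger values mean stronger edges, i.e. a sharper image."""
    gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)
    return float(np.sum(gx * gx + gy * gy))

def autofocus(capture_frame, move_lens, positions):
    """Minimal capture -> evaluate -> adjust -> verify -> lock loop.

    capture_frame(): returns a BGR frame from the USB camera (hypothetical helper).
    move_lens(p):    drives the motor/liquid lens to position p (hypothetical helper).
    positions:       candidate lens positions to scan.
    """
    best_pos, best_fv = None, -1.0
    for p in positions:
        move_lens(p)                        # adjust the lens
        frame = capture_frame()             # capture an image
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        fv = focus_value(gray)              # calculate clarity
        if fv > best_fv:
            best_pos, best_fv = p, fv
    move_lens(best_pos)                     # lock the focus at the FV peak
    return best_pos, best_fv
```

In practice the exhaustive scan in autofocus() would be replaced by the hill-climbing search described in section 3, but the capture-evaluate-adjust-lock structure stays the same.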
2. Technological Implementation Paths: From Traditional Gears to the Liquid Revolution
(1). Traditional mechanical drive scheme: The rise and fall of stepper motors
Early USB cameras commonly used a stepper motor combined with a transmission gear set. A prototype developed at Zhejiang University used the OV7620 sensor chip: after the host computer detects defocus, it sends pulse signals through the USB interface to the motor drive circuit (built around a PIC16C73A chip). The motor rotates a fixed angle (e.g., 1.8°) for each pulse received, and a worm or screw drive converts the rotation into linear displacement of the lens (see the sketch below).
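As a back-of-the-envelope illustration of this pulse-to-displacement relationship, the snippet below converts a target lens travel into a pulse count. The 1.8° step angle comes from the text; the 0.5 mm screw lead and microstepping factor are assumed example values, not figures from the prototype.

```python
def pulses_for_travel(travel_mm, step_angle_deg=1.8, screw_lead_mm=0.5, microsteps=1):
    """Number of drive pulses needed to move the lens by travel_mm.

    One full motor revolution (360 / step_angle_deg full steps) advances the
    lens by one screw lead; microstepping subdivides each full step further.
    """
    steps_per_rev = 360.0 / step_angle_deg * microsteps
    travel_per_pulse_mm = screw_lead_mm / steps_per_rev   # here 2.5 micrometers per pulse
    return round(travel_mm / travel_per_pulse_mm)

# Example: moving the lens 0.1 mm with a 1.8-degree motor and a 0.5 mm lead screw
# requires 0.1 / (0.5 / 200) = 40 pulses.
print(pulses_for_travel(0.1))  # -> 40
```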
The advantages are a simple structure and low cost, but the drawbacks are obvious: limited lifespan due to mechanical wear (typically hundreds of thousands of focusing cycles), slow focusing (100-500 milliseconds), and weak impact resistance, which makes the mechanism prone to failure in mobile devices.
(2). Liquid lens revolution: millisecond level response without mechanical motion
The electrowetting technology developed by Varioptic in France has opened up a new path. It seals two immiscible liquids, an insulating oil and a conductive aqueous solution, in a chamber. When a voltage is applied to the electrode, the curvature of the liquid-liquid interface changes as the surface tension balance shifts, adjusting the focal length within milliseconds.
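The voltage-to-curvature behaviour behind this is commonly modeled by the Young-Lippmann equation. The short sketch below evaluates it for illustrative material constants; the zero-voltage contact angle, interfacial tension, and dielectric parameters are assumed example values, not Varioptic specifications.

```python
import math

def contact_angle_deg(voltage, theta0_deg=140.0, gamma=0.04, eps_r=3.0, d=1e-6):
    """Young-Lippmann relation: cos(theta) = cos(theta0) + eps*V^2 / (2*gamma*d).

    voltage    applied voltage [V]
    theta0_deg contact angle at 0 V [deg]                 (assumed)
    gamma      oil/water interfacial tension [N/m]        (assumed)
    eps_r, d   relative permittivity and thickness of the dielectric (assumed)
    """
    eps0 = 8.854e-12
    cos_theta = math.cos(math.radians(theta0_deg)) + (eps_r * eps0 * voltage**2) / (2 * gamma * d)
    cos_theta = max(-1.0, min(1.0, cos_theta))   # clamp to the physically meaningful range
    return math.degrees(math.acos(cos_theta))

# Raising the voltage lowers the contact angle, reshaping the liquid interface
# and hence changing the lens's focal length, with no moving parts involved.
for v in (0, 20, 40, 60):
    print(v, round(contact_angle_deg(v), 1))
```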
PixeLINK's USB 3.0 industrial camera is the first to apply this technology, and its advantages are remarkable:
No physical moving parts: lifespan exceeding 400 million operations
Ultra-high-speed focusing: <50 milliseconds in open-loop mode, approximately 10 focus operations per second in closed-loop mode
Strong environmental adaptability: able to withstand 2000g mechanical impact, with macro capability of <5 cm
Extremely low power consumption: the lens itself consumes less than 1 mW of power
(3). Mobile technology migration plan: VCM and continuous focusing
With the increasing demand for image quality in laptop cameras, mobile phone camera module technology has begun to be carried over. The USB module developed by Sunny Optoelectronics uses a VCM voice coil motor (common in mobile phone cameras) combined with a 5-megapixel CMOS sensor to achieve a miniaturized design less than 5 mm thick.
A VCM works on an electromagnetic principle: changes in current drive the coil up and down in a magnetic field, displacing the lens. Its advantages are small size, fast response, and support for continuous autofocus (CAF): the system continuously monitors changes in the FV value and refocuses once sharpness drops beyond a threshold, keeping moving scenes sharp (a minimal sketch follows).
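A minimal sketch of such a CAF monitor is shown below. The drop threshold and the capture_frame(), run_autofocus(), and focus_value() helpers are illustrative assumptions rather than any vendor's implementation; focus_value() is the sharpness measure sketched in section 1.

```python
def continuous_autofocus(capture_frame, run_autofocus, focus_value, drop_ratio=0.7):
    """Refocus whenever sharpness falls below drop_ratio of the last locked FV.

    capture_frame(), run_autofocus(), focus_value() are hypothetical helpers
    for frame capture, a full focus search, and the sharpness metric.
    """
    locked_fv = None
    while True:
        fv = focus_value(capture_frame())
        if locked_fv is None or fv < drop_ratio * locked_fv:
            run_autofocus()                           # sharpness dropped: search again
            locked_fv = focus_value(capture_frame())  # re-lock on the new peak
```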
3. Core algorithm: How does the camera "think" about the focus?
Focus search strategy
Global search method: Move the lens from the nearest end to the farthest end, calculating the FV value throughout, and select the peak position. Slow but highly reliable, suitable for initial focusing.
Hill-climbing algorithm: the mainstream optimization. The system first moves the lens in large steps to determine the trend of FV changes, then switches to small-step fine adjustment when approaching the peak. Modern variants such as variable-step, variable-speed hill climbing dynamically divide the range into a far-focus zone (large-step fast scan) and a near-focus zone (small-step fine tuning); a sketch follows this list.
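The following sketch shows a variable-step hill climb of this kind. The step sizes and the measure_fv()/move_lens() helpers are assumptions made for illustration, not a specific camera's firmware.

```python
def hill_climb_focus(measure_fv, move_lens, pos_min, pos_max, coarse_step=40, fine_step=4):
    """Variable-step hill climb over lens positions.

    measure_fv(p): move the lens to position p and return the FV there (hypothetical).
    move_lens(p):  drive the lens to position p (hypothetical).
    A coarse pass follows the rising FV trend until it starts to fall, then a
    fine pass repeats the climb with a small step around the coarse peak.
    """
    def climb(start, stop, step):
        best_p, best_fv = start, measure_fv(start)
        p = start + step
        while p <= stop:
            fv = measure_fv(p)
            if fv > best_fv:
                best_p, best_fv = p, fv
            elif fv < best_fv:          # passed the peak: stop this pass
                break
            p += step
        return best_p, best_fv

    coarse_p, _ = climb(pos_min, pos_max, coarse_step)
    lo = max(pos_min, coarse_p - coarse_step)
    hi = min(pos_max, coarse_p + coarse_step)
    fine_p, fine_fv = climb(lo, hi, fine_step)
    move_lens(fine_p)                   # lock focus at the refined peak
    return fine_p, fine_fv
```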
Peak determination mechanism
Traditional single-peak detection is susceptible to noise. The microscope cameras of Hangzhou Atlas Optoelectronics adopt a "two rises, two falls" criterion: when the FV values at five consecutive positions satisfy FV₁ < FV₂ < FV₃ > FV₄ > FV₅, that is, two consecutive rises followed by two consecutive falls, FV₃ is accepted as a genuine focus peak rather than a noise spike.
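A direct translation of this criterion into code might look like the check below; the five-sample window and the FV₁ < FV₂ < FV₃ > FV₄ > FV₅ ordering follow the description above, while the function name is purely illustrative.

```python
def is_true_peak(fv):
    """'Two rises, two falls' check on five consecutive FV samples.

    Returns True when fv[0] < fv[1] < fv[2] > fv[3] > fv[4], i.e. the middle
    sample is preceded by two rises and followed by two falls, which is far
    less likely to be produced by a single noisy measurement.
    """
    assert len(fv) == 5
    return fv[0] < fv[1] < fv[2] > fv[3] > fv[4]

print(is_true_peak([10, 14, 20, 16, 11]))  # True: genuine peak in the middle
print(is_true_peak([10, 25, 20, 16, 11]))  # False: the rise is not monotonic
```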
Scene adaptation technology
After focusing is completed, the system continuously monitors the scene brightness and the FV value of the focus area. If a significant change is detected (such as target movement or a sudden change in lighting), refocusing is triggered; the system waits for the brightness/FV fluctuations to settle within a threshold before judging that the scene has returned to stillness (see the sketch below). This dynamic adaptation significantly improves performance in low light.
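One way to sketch this watch-and-wait behaviour is shown below; the change tolerances, the stable-frame count, and the measure() helper that returns per-frame brightness and FV are illustrative assumptions.

```python
def scene_changed(prev, curr, brightness_tol=10.0, fv_tol=0.15):
    """Flag a scene change from a shift in mean brightness or relative FV.
    prev/curr are (brightness, fv) tuples; tolerances are assumed values."""
    db = abs(curr[0] - prev[0])
    dfv = abs(curr[1] - prev[1]) / max(prev[1], 1e-9)
    return db > brightness_tol or dfv > fv_tol

def wait_until_still(measure, stable_frames=5):
    """Return only after `stable_frames` consecutive frames show no change.
    measure() yields (brightness, fv) for the current frame (hypothetical)."""
    prev, streak = measure(), 0
    while streak < stable_frames:
        curr = measure()
        streak = 0 if scene_changed(prev, curr) else streak + 1
        prev = curr
```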
4. Frontier Hybrid Technology and Application Adaptation
Hybrid focusing technology
High-end USB cameras adopt a hybrid scheme of phase detection (PDAF) and contrast focusing (CDAF). PDAF mimics human binocular disparity by placing special masked pixels on the CMOS sensor (left-half-masked and right-half-masked pixels appearing in pairs); the phase difference between them gives a fast preliminary position, after which CDAF performs the fine tuning (see the sketch below). The 4K surveillance camera reference design jointly developed by Renesas Electronics and Lianyong Technology adopts this scheme and maintains excellent target recognition accuracy in low light.
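Conceptually the hybrid flow looks like the sketch below. The linear PD-to-lens-step conversion factor and the helper functions are assumptions made for illustration; in real modules that conversion is calibrated per sensor.

```python
def hybrid_autofocus(read_pd, move_lens_by, fine_search, pd_to_steps=2.5, pd_deadband=0.2):
    """PDAF coarse positioning followed by CDAF fine tuning.

    read_pd():        phase difference from the masked pixel pairs (hypothetical).
    move_lens_by(n):  relative lens move in motor steps (hypothetical).
    fine_search():    small-range contrast (hill-climb) search (hypothetical).
    pd_to_steps:      assumed calibration factor from PD value to lens steps.
    """
    pd = read_pd()
    if abs(pd) > pd_deadband:
        move_lens_by(int(pd * pd_to_steps))   # one coarse jump derived from the PD value
    return fine_search()                      # contrast AF refines around that position
```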
Technology adaptation for industry applications
Industrial inspection and medical imaging: PixeLINK liquid-lens cameras excel in fields such as barcode scanning and retinal recognition thanks to their vibration resistance and strong macro capability.
Dynamic video recording: The OIS13M anti-shake camera combines optical image stabilization (OIS) with autofocus to achieve stable imaging on drones or in sports cycling.
Microscopic imaging: Hangzhou Atlas Optoelectronics controls its microscope cameras through vendor-specific UVC commands, and mitigates local-peak interference at high magnification through adaptive search-direction recognition.
5. Future Evolution Directions
With the development of computational photography technology, USB camera autofocus is evolving in three directions:
Algorithmic intelligence: using deep learning to predict focus positions and shorten the mechanical search travel, for example by pre-identifying the subject region via scene semantic segmentation, or predicting the target trajectory through motion-blur analysis.
Hardware fusion: hybrid drives combining liquid lenses and VCMs are becoming a new trend, for example an IMX415 sensor module achieving 3x optical zoom while maintaining a compact 38×67.39 mm footprint.
Protocol and transmission upgrades: the new-generation USB4 interface goes far beyond the 480 Mbps ceiling of USB 2.0, making real-time transmission and processing of 8K, high-pixel-count data possible and providing the data foundation for ultra-high-precision focusing (a rough bandwidth estimate follows).
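As a rough sanity check on these bandwidth figures, the snippet below estimates the raw data rate of an 8K stream; the 30 fps frame rate and 12-bit pixel depth are assumed example values.

```python
def raw_bitrate_gbps(width=7680, height=4320, fps=30, bits_per_pixel=12):
    """Uncompressed video bit rate in Gbit/s (assumed example parameters)."""
    return width * height * fps * bits_per_pixel / 1e9

rate = raw_bitrate_gbps()
print(f"8K@30fps, 12-bit raw: {rate:.1f} Gbps")          # ~11.9 Gbps
print("vs USB 2.0 (0.48 Gbps) and USB4 (up to 40 Gbps)")  # why USB 2.0 cannot carry it
```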