Region of Interest (ROI) in Machine Vision

In many machine vision applications, only a portion of the camera's full field of view contains information relevant to the inspection task. For example, a high-resolution camera may monitor an entire conveyor area while the barcode, feature, or defect of interest occupies only a small section of the image. Processing the complete image in such cases can increase bandwidth and processing requirements unnecessarily.

A Region of Interest (ROI) addresses this by limiting image acquisition to a defined subset of the sensor area. Instead of reading and transmitting the full pixel array, the camera captures only the selected region containing the relevant information. This reduces the amount of image data that must be transferred and processed, which can help increase achievable frame rates while preserving the sensor's native spatial resolution within the selected region.

How Hardware ROI Works

Hardware ROI changes how the image sensor reads out pixel data rather than simply cropping the image after acquisition.

When an ROI is configured through the camera SDK or vision software, the user defines a rectangular region using parameters such as width, height, X-offset, and Y-offset.

The offset values determine the position of the ROI within the sensor's pixel array. During acquisition, the sensor readout logic processes only the selected region instead of the full frame. Because fewer pixels are digitized and transmitted, bandwidth and processing requirements are reduced, often allowing higher frame rates and lower system latency.
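The four parameters can be sketched in code. The example below is a minimal illustration, not any vendor's API: the `Roi` type, the `fit_roi` helper, and the step sizes (4 pixels horizontally, 2 vertically) are assumptions made for this example. Many cameras only accept offsets and sizes in fixed increments, so the sketch grows a requested rectangle to the nearest allowed values and keeps it on the sensor. GenICam-compliant cameras typically expose the same four values as Width, Height, OffsetX, and OffsetY; the actual increments must be read from the camera.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Roi:
    offset_x: int
    offset_y: int
    width: int
    height: int


def _floor(value, step):
    return (value // step) * step


def _ceil(value, step):
    return -(-value // step) * step


def fit_roi(roi, sensor_w, sensor_h, step_x=4, step_y=2):
    """Grow a requested ROI to hypothetical increments and keep it on the sensor."""
    # Snap the left/top edge down and the right/bottom edge up,
    # so the requested area stays covered.
    left = _floor(max(roi.offset_x, 0), step_x)
    top = _floor(max(roi.offset_y, 0), step_y)
    width = min(_ceil(roi.offset_x + roi.width, step_x) - left, _floor(sensor_w, step_x))
    height = min(_ceil(roi.offset_y + roi.height, step_y) - top, _floor(sensor_h, step_y))
    # Shift the ROI back if it would extend past the sensor edge.
    left = min(left, _floor(sensor_w - width, step_x))
    top = min(top, _floor(sensor_h - height, step_y))
    return Roi(left, top, width, height)


print(fit_roi(Roi(1003, 501, 637, 300), sensor_w=1920, sensor_h=1200))
# Roi(offset_x=1000, offset_y=500, width=640, height=302)
```

Writing the resulting values to the camera is SDK-specific and is left out here.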

The Physics of ROI: X-Axis vs. Y-Axis

Engineers are often surprised that reducing the width of an image ROI does not always produce the same frame-rate improvement as reducing its height. This behavior is related to how many CMOS image sensors perform row-by-row readout.

Many CMOS sensors read the pixel array sequentially from top to bottom. As a result, reducing the vertical dimension of the ROI can have a greater effect on sensor readout time than reducing the horizontal dimension.

Reducing Height (Y-Axis): When the height of the ROI is reduced, the sensor processes fewer rows during image acquisition. On many rolling shutter sensors, this can significantly reduce readout time and increase the achievable frame rate.
Reducing Width (X-Axis): Reducing the width of the ROI lowers the total number of transmitted pixels and can reduce interface bandwidth requirements. However, because the sensor may still process the same number of rows, the effect on maximum sensor readout speed is often smaller than reducing image height.

Design Consideration: In applications where frame rate is a primary constraint, it may be beneficial to orient the camera or the part so that the shorter dimension of the inspection region lies along the sensor's vertical axis, which keeps the number of rows in the ROI as small as possible. The actual performance improvement depends on the sensor architecture and camera implementation.
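The difference between the two axes can be made concrete with a simplified readout model. The sketch below assumes that frame time is the number of rows multiplied by a line time, and that the line time cannot drop below a fixed minimum. The function name and all timing values (`min_line_time_us`, `pixel_time_us`, `overhead_rows`) are invented for illustration and do not describe any specific sensor.

```python
def max_frame_rate(width, height, *, min_line_time_us=10.0,
                   pixel_time_us=0.004, overhead_rows=20):
    """Estimate readout-limited frame rate under a simple row-by-row model."""
    # A narrower row only helps once it is shorter than the minimum line time.
    line_time_us = max(min_line_time_us, width * pixel_time_us)
    # Every row costs one line time, plus a fixed number of overhead rows.
    frame_time_us = (height + overhead_rows) * line_time_us
    return 1e6 / frame_time_us


full = max_frame_rate(1920, 1200)
half_width = max_frame_rate(960, 1200)
half_height = max_frame_rate(1920, 600)

print(f"full frame:  {full:.1f} fps")         # full frame:  82.0 fps
print(f"half width:  {half_width:.1f} fps")   # half width:  82.0 fps
print(f"half height: {half_height:.1f} fps")  # half height: 161.3 fps
```

In this model, halving the width changes nothing because the line time is already at its floor, while halving the height nearly doubles the frame rate. Real sensors may behave differently, and some do gain speed from narrower rows, so measured values from the camera should take precedence.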

Decision Matrix: ROI vs. Binning vs. Decimation

System integrators commonly use ROI, binning, and decimation to reduce image data volume and increase achievable frame rates. The most suitable approach depends on the imaging requirements, performance constraints, and acceptable tradeoffs for the application.

Technique	How It Reduces Data	What You Sacrifice	Best Used For
Region of Interest (ROI)	Limits sensor readout to a defined region of the pixel array	Field of View. Image data outside the selected ROI is not captured.	High-speed tracking of small, predictable features (e.g., barcodes, laser lines) where maximum spatial resolution within the selected region is required
Pixel Binning	Combines adjacent pixels into larger "super-pixels"	Spatial Resolution. The full field of view is retained, but fine spatial detail is reduced.	Low-light, high-speed inspection where full field of view is needed and improving signal-to-noise ratio is important
Decimation	Skips rows and columns entirely (e.g., reads pixel 1, skips 2 and 3)	Spatial Resolution & Light. Detail is lost, and not all captured pixel data contributes to the final image.	High-speed, full field-of-view applications where lighting is abundant and where high-precision sub-pixel measurement is not required
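To show what each technique keeps and discards, the three operations in the table can be emulated in software on a tiny image. This is only an illustration of the resulting data: on a real camera these steps happen on the sensor or in the camera, and binning may sum rather than average depending on the device. The function names and the 4x4 test image are made up for this sketch.

```python
def crop(img, x, y, w, h):
    """ROI: keep a rectangle at full resolution, drop everything else."""
    return [row[x:x + w] for row in img[y:y + h]]


def bin_2x2(img):
    """Binning: average each 2x2 block into one output pixel."""
    return [
        [(img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 4
         for c in range(0, len(img[0]) - 1, 2)]
        for r in range(0, len(img) - 1, 2)
    ]


def decimate(img, step=2):
    """Decimation: keep every step-th row and column, skip the rest."""
    return [row[::step] for row in img[::step]]


image = [
    [10, 12, 20, 22],
    [14, 16, 24, 26],
    [30, 32, 40, 42],
    [34, 36, 44, 46],
]

print(crop(image, 1, 1, 2, 2))  # [[16, 24], [32, 40]]
print(bin_2x2(image))           # [[13.0, 23.0], [33.0, 43.0]]
print(decimate(image))          # [[10, 20], [30, 40]]
```

All three outputs contain a quarter of the original pixels. The crop preserves original pixel values over a smaller area, binning uses every pixel but merges them, and decimation covers the full area while ignoring three out of four pixels.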

Multiple ROIs (MROI)

Many modern industrial image sensors, including sensor families such as Sony Pregius, support Multiple Regions of Interest (MROI). This feature allows the user to define multiple independent regions within a single sensor frame.

For example, when inspecting features located in separate areas of a large assembly, it may not be necessary to acquire or transmit the entire image area between them. Instead, separate ROIs can be configured around each relevant feature or inspection zone.

During acquisition, the camera reads and transmits only the selected regions rather than the complete sensor frame. This reduces the amount of image data that must be transferred and processed, which can improve bandwidth efficiency and increase achievable frame rates while maintaining the larger field of view of the overall optical system.
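A rough calculation shows why this matters. The sketch below compares the pixel count of two small regions against the single ROI that would be needed to enclose both, and against the full frame. It assumes independent, non-overlapping rectangles given as (x, y, width, height); how a particular sensor actually implements MROI may differ. The sensor size and region coordinates are example values only.

```python
def roi_pixels(rois):
    """Total pixels in a list of non-overlapping (x, y, w, h) regions."""
    return sum(w * h for _, _, w, h in rois)


def bounding_box(rois):
    """Smallest single rectangle (x, y, w, h) that encloses all regions."""
    x0 = min(x for x, _, _, _ in rois)
    y0 = min(y for _, y, _, _ in rois)
    x1 = max(x + w for x, _, w, _ in rois)
    y1 = max(y + h for _, y, _, h in rois)
    return (x0, y0, x1 - x0, y1 - y0)


SENSOR_W, SENSOR_H = 4096, 3000  # example sensor size
rois = [(200, 300, 512, 512), (3300, 2200, 512, 512)]

full = SENSOR_W * SENSOR_H
single = roi_pixels([bounding_box(rois)])
multi = roi_pixels(rois)

print(f"full frame:       {full:>10,} px")    # 12,288,000 px
print(f"one enclosing ROI:{single:>10,} px")  #  8,712,144 px
print(f"two separate ROIs:{multi:>10,} px")   #    524,288 px
```

With these example numbers, two separate regions carry roughly 4% of the full-frame data, while a single ROI large enough to contain both would still carry about 71%.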

Frequently Asked Questions

Does using an ROI magnify the image or change its scale?

No. An ROI only restricts which part of the sensor is read out. It does not change the focal length of your lens, the working distance, or the physical size of the pixels. If a defect measures 10 pixels across on the full sensor, it will still measure exactly 10 pixels across inside the ROI.

Is a hardware ROI the same as cropping the image in software?

No, they are fundamentally different. If you capture a full 20 MP image and crop it down to 5 MP in your vision software (such as OpenCV or HALCON), your host PC saves processing time on later steps, but your camera is still exposing, reading out, and transmitting 20 MP of data over the cable for every frame. Software cropping does not increase your camera's maximum frame rate.
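The 20 MP versus 5 MP case can be put into numbers with a simple interface-bandwidth estimate. The sketch below assumes 8-bit monochrome pixels and a usable link throughput of 380 MB/s; both figures and the function name are assumptions for illustration, and the sensor's own readout speed may impose a lower limit than the link.

```python
def link_limited_fps(pixels, bytes_per_pixel=1, usable_bytes_per_s=380e6):
    """Upper bound on frame rate imposed by the camera interface alone."""
    return usable_bytes_per_s / (pixels * bytes_per_pixel)


full_frame = 20_000_000   # 20 MP transmitted, then cropped on the host
hardware_roi = 5_000_000  # 5 MP ROI set in the camera

print(f"software crop: {link_limited_fps(full_frame):.0f} fps")    # software crop: 19 fps
print(f"hardware ROI:  {link_limited_fps(hardware_roi):.0f} fps")  # hardware ROI:  76 fps
```

The software crop yields the same 5 MP image for later processing, but the link still carries 20 MP per frame, so the interface limit stays where it was.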

Can the ROI position be changed while the camera is running?

Yes, most industrial camera SDKs allow you to update the X and Y offsets on the fly. This is highly useful in tracking applications. If an initial wide-field image locates a moving part, the software can command the camera to place a tight ROI around the part and then update the X/Y offsets frame by frame to follow the part as it moves across the sensor.
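The offset update in such a tracking loop might look like the sketch below. It only computes the new offsets: the part position is assumed to be known in full-sensor coordinates, the 4-pixel offset increment is an assumption, and the call that writes the offsets to the camera is SDK-specific and omitted. The function name `follow` is hypothetical.

```python
def follow(center_x, center_y, roi_w, roi_h, sensor_w, sensor_h, step=4):
    """Return (offset_x, offset_y) that centers a fixed-size ROI on a part."""
    def place(center, size, limit):
        offset = int(center - size / 2)             # center the ROI on the part
        offset = max(0, min(offset, limit - size))  # keep the ROI on the sensor
        return (offset // step) * step              # snap down to the assumed increment
    return place(center_x, roi_w, sensor_w), place(center_y, roi_h, sensor_h)


# A part moving left to right across a 1920 x 1200 sensor, tracked with a 256 x 256 ROI.
for cx in (100, 960, 1900):
    print(follow(cx, 600, 256, 256, 1920, 1200))
# (0, 472)
# (832, 472)
# (1664, 472)
```

If the part is located inside the ROI image rather than in a full frame, the current offsets have to be added to the detected position before calling a function like this, because ROI pixel coordinates start at zero.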
