-
Notifications
You must be signed in to change notification settings - Fork 21
Scan: Introduction
RScan provides three different scan modes,
- GCMODE
- RMODE
- SMODE
GCMODE uses high pass filter (hpf) followed by setting white point & black point. This mode is best suited to scan documents that don't contain pictures. HPF ensures that output image only contains the information written on the document and it eliminates most of the background like color of the sheet of paper as well as noise such as shadow to a great extent. PS: GCMODE outputs scanned docs with less or no shadow
RMODE sets black point followed by setting white point to ouput scanned documents. RMODE acheives the famous magic color effect
SMODE outputs black & white scanned image. SMODE uses some processing in LAB color space to generate superb black & white scanned documents
First lets get an intution about white-point and black-point operations which seem to be the backbone of whole project. After that I will write in brief about the high pass filter followed by detailed explanation on how these processing techniques are implemented using OpenCV.
Note before we start : This project offers the code to scan documents but it is not completely automated. User will have to provide values for
whitePoint
,blackPoint
&kernelSize
as input (kernelSize only for GCMODE). Selecting these values is not that tough once we get a proper feel of the algorithms but still we need to automate the selection of these values in newer versions of the project.
Selecting black-point of an image is defining a value such that any pixel having intensity less than the black-point should now be considered as black.
For example, In a grayscale image selecting black-point value as 50 implies that all pixels below 50 are now made 0 and rest of the pixel currently in the range of [50,255] are mapped to [0,255].
In a simpler language any pixel with value less than the black point is made completely black whereas other pixels with value in range of [black-point, 255] are mapped to the range [0,255]
This operation is achieved using the basic map function,
def map(x, in_min, in_max, out_min, out_max):
return (x - in_min) * (out_max - out_min) / (in_max - in_min) + out_min
Above function works well with Python as it can handle numpy arrays. For Java we will use subtraction & multiplication methods of OpenCV Core to implement the map
For Python implementation,
First we need to use img = img.astype('int32')
to convert image's numpy array to 'int32' type before using the map function map(img, blackPoint, 255, 0, 255)
.
After using the map
function, pixel values in range [blackPoint, 255] are now changed to the range [0, 255] and pixels below blackPoint have some negative value.
Now we need to use the following thresholding operation in order to convert all negative values to zero
ret, img = cv2.threshold(img, 0, 255, cv2.THRESH_TOZERO)
For Java implementation,
map is implemented as follows,
Core.subtract(processedImg, new Scalar(blackPoint, blackPoint, blackPoint), processedImg);
float tmp = (255.0f) / (255.0f - blackPoint);
Core.multiply(processedImg, new Scalar(tmp, tmp, tmp), processedImg);
In Java we don't require THRESH_TOZERO
type thresholding operation as images in Java are stored as Mat
objects and operations on these images performed using Core.subtract
Core.multiply
are of saturation type i-e any value which tries to become negative saturates to zero and values trying to get above 255 saturate to 255 (numpy arithmetics in Python are modulo based).
Selecting white-point of an image is defining a value such that any pixel having intensity more than the white-point should now be considered pure white.
For example, In a grayscale image selecting white-point value as 150 implies that all pixels above 150 are now made 255 and rest of the pixel currently in the range of [0,150] are mapped to [0,255].
In a simpler language any pixel with value more than the white point is made completely white whereas other pixels with value in range of [0, whitePoint] are mapped to the range [0,255]
This operation is acheived using THRESH_TRUNC
type thresholding & map
function
High Pass Filter allows edges in the image to pass through it hence when a document image is passed through it, features such as hand written or types characters, boundaries of drawings, etc are available in the output image whereas low frequency contents such as shadow, color of the sheet of paper are lost. HPF has shown some very good results in scanning documents that don't contain pictures, for eg: image of handwritten class notes.
I have referenced implementation of the HPF from this question on Stackoverflow.
Above link gives the perfect intution about high pass filters using OpenCV on Python. For Java, we will be using filter2D
method of Imgproc
class to implement the HPF.
Upcoming sections will discuss these things in detail.
One final thing before we start detailed discussions in next sections !
I want to add code from the project's main
method which will help the reader get a much better understanding of how to actually use the source codes to scan an input image.
public class Main {
public static void main(String[] args) {
System.loadLibrary(Core.NATIVE_LIBRARY_NAME);
/* read image */
Mat image = Imgcodecs.imread("C:/Users/Sourabh Khemka/Desktop/RScan/ML/dataset/data_img_008.jpg");
/* create an object of Scan class */
Scan scanner = new Scan(image, 51, 66, 160);
/* "scanImage" method of the Scan class takes the desired mode of operation as input i-e GCMODE/RMODE/SMODE
and then uses a switch-case block to call the appropriate set of functions. */
Mat scannedImg = scanner.scanImage(Scan.ScanMode.GCMODE);
/* write the ouput image as .jpg file */
Imgcodecs.imwrite("C:/Users/Sourabh Khemka/Desktop/RScan/ML/scanned.jpg", scannedImg);
}
}
email id : [email protected]