OpenCV person detector (HOG parameters explained)

[하마] 이승현 (wowlsh93@gmail.com) 2016. 3. 7. 11:20

Translation of http://www.pyimagesearch.com/2015/11/16/hog-detectmultiscale-parameters-explained/

HOG detectMultiScale parameters explained

Figure 2: On my system, it takes approximately 0.09s to process a single image using the default parameters.

img (required)

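This parameter is fairly self-explanatory: it is the image containing the objects we want to detect (people, in these photos). It is the only required argument to the detectMultiScale function, and it can be either a color or a grayscale image.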

hitThreshold (optional)

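The hitThreshold parameter is optional and is not used by default in the detectMultiScale function.

The OpenCV documentation says only this:

: Threshold for the distance between features and SVM classifying plane.

In practice the threshold is applied to the SVM decision value, that is, the signed distance between the HOG feature vector and the separating hyperplane: a window whose score falls below hitThreshold is rejected, so raising the value removes false positives at the cost of more missed detections. In my opinion, you should leave this parameter alone unless you are seeing a high rate of false-positive detections in your images.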

winStride (optional)

The winStride parameter is a 2-tuple that dictates the “step size” in both the x and y location of the sliding window.

Both winStride and scale are extremely important parameters that need to be set properly. These parameters have tremendous implications not only for the accuracy of your detector, but also for the speed at which it runs.

In the context of object detection, a sliding window is a rectangular region of fixed width and height that “slides” across an image, just like in the following figure:

Figure 3: An example of applying a sliding window to an image for face detection.

At each stop of the sliding window (and for each level of the image pyramid, discussed in the scale section below), we (1) extract HOG features and (2) pass these features on to our Linear SVM for classification. The process of feature extraction and classifier decision is an expensive one, so we would prefer to evaluate as few windows as possible if our intention is to run our Python script in near real-time.

The smaller winStride is, the more windows need to be evaluated (which can quickly turn into quite the computational burden):

$ python detectmultiscale.py --image images/person_010.bmp --win-stride="(4, 4)"

Figure 4: Decreasing the winStride increases the amount of time it takes to process each image.

Changing winStride to (4, 4) increased the detection time to 0.27s. Conversely, a larger winStride means fewer windows to evaluate, which makes the detector much faster but raises the overall chance of missing detections.

$ python detectmultiscale.py --image images/person_010.bmp --win-stride="(16, 16)"

Figure 5: Increasing the winStride can reduce our pedestrian detection time (from 0.09s down to 0.06s), but as you can see, we miss out on detecting the boy in the background.

I usually start with a winStride of (4, 4), then increase it little by little until the trade-off between speed and detection accuracy is reasonable.
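To get a feel for how quickly the window count grows as the stride shrinks, here is a rough sketch. It assumes the 64x128 detection window of OpenCV's default people detector and a hypothetical 400x400 image, and it ignores padding and the image pyramid. `count_windows` is an illustrative helper, not an OpenCV function.

```python
def count_windows(image_size, win_size=(64, 128), win_stride=(8, 8)):
    """Count sliding-window positions on a single image (no padding, no pyramid)."""
    img_w, img_h = image_size
    win_w, win_h = win_size
    step_x, step_y = win_stride
    if img_w < win_w or img_h < win_h:
        return 0  # the window does not fit in the image at all
    # Positions along each axis, then their product.
    cols = (img_w - win_w) // step_x + 1
    rows = (img_h - win_h) // step_y + 1
    return cols * rows


# Hypothetical 400x400 image, three candidate strides.
for stride in [(4, 4), (8, 8), (16, 16)]:
    print(stride, count_windows((400, 400), win_stride=stride))
```

Under these assumptions the counts are 5865, 1505, and 396: halving the stride in both directions roughly quadruples the number of windows that must go through HOG extraction and the SVM.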

padding (optional)

The padding parameter is a tuple which indicates the number of pixels in both the x and y direction in which the sliding window ROI is “padded” prior to HOG feature extraction.

As suggested by Dalal and Triggs in their 2005 CVPR paper, Histogram of Oriented Gradients for Human Detection, adding a bit of padding surrounding the image ROI prior to HOG feature extraction and classification can actually increase the accuracy of your detector.

Typical values for padding include (8, 8), (16, 16), (24, 24), and (32, 32).

scale (optional)

An image pyramid is a multi-scale representation of an image:

Figure 6: An example image pyramid.

At each layer of the image pyramid the image is downsized and (optionally) smoothed via a Gaussian filter.

The scale parameter controls the factor by which our image is resized at each layer of the image pyramid, ultimately influencing the number of levels in the image pyramid.

A smaller scale increases the number of layers in the image pyramid and therefore increases the computation time.

$ python detectmultiscale.py --image images/person_010.bmp --scale 1.01

Figure 7: Decreasing the scale to 1.01

The amount of time it takes to process our image has significantly jumped to 0.3s. We also now have an issue of overlapping bounding boxes. However, that issue can be easily remedied using non-maxima suppression.

Meanwhile, a larger scale will decrease the number of layers in the pyramid as well as decrease the amount of time it takes to detect objects in an image:

$ python detectmultiscale.py --image images/person_010.bmp --scale 1.5

Figure 8: Increasing our scale allows us to process nearly 50 images per second, at the expense of missing some detections.

Here we can see that we performed pedestrian detection in only 0.02s, implying that we can process nearly 50 images per second. However, this comes at the expense of missing some detections, as evidenced by the figure above.
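The conversion from seconds per image to images per second is easy to repeat for your own settings. The sketch below uses only the Python standard library; `time_call` and `frames_per_second` are hypothetical helper names, and the function you pass in stands for whatever wraps your own detector call.

```python
import time


def time_call(fn, *args, repeats=5, **kwargs):
    """Return the average wall-clock seconds taken by one call to fn."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn(*args, **kwargs)
    return (time.perf_counter() - start) / repeats


def frames_per_second(seconds_per_frame):
    """Convert a per-image processing time into a throughput figure."""
    return 1.0 / seconds_per_frame


# The timings quoted in this post, converted to throughput.
for seconds in (0.09, 0.02):
    print(seconds, "s per image is about", round(frames_per_second(seconds)), "images per second")
```

Averaging over a few repeats smooths out one-off delays such as disk caching, which matters when the differences between settings are a few hundredths of a second.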

Finally, if you decrease both winStride and scale at the same time, you’ll dramatically increase the amount of time it takes to perform object detection:

$ python detectmultiscale.py --image images/person_010.bmp --scale 1.03 \
	--win-stride="(4, 4)"

Figure 9: Decreasing both the scale and window stride.

We are able to detect both people in the image — but it’s taken almost half a second to perform this detection, which is absolutely not suitable for real-time applications.

Keep in mind that for each layer of the pyramid a sliding window with winStride steps is moved across the entire layer. While it’s important to evaluate multiple layers of the image pyramid, allowing us to find objects in our image at different scales, doing so also adds a significant computational burden, since for each layer a full series of sliding windows, HOG feature extractions, and SVM decisions must be performed.
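The combined effect of scale and winStride can be estimated with a little arithmetic. The following sketch is a simplification under stated assumptions: a 64x128 window, a hypothetical 400x400 input, no padding, and no cap on the number of pyramid levels (OpenCV applies its own internal limit). `pyramid_sizes` and `total_windows` are illustrative names, not OpenCV functions.

```python
def pyramid_sizes(image_size, scale, win_size=(64, 128)):
    """Yield (width, height) of each pyramid layer while the window still fits."""
    if scale <= 1.0:
        raise ValueError("scale must be greater than 1")
    img_w, img_h = image_size
    factor = 1.0
    while True:
        w, h = round(img_w / factor), round(img_h / factor)
        if w < win_size[0] or h < win_size[1]:
            break
        yield w, h
        factor *= scale  # each layer is `scale` times smaller than the last


def total_windows(image_size, scale, win_stride, win_size=(64, 128)):
    """Return (number of layers, total window positions over all layers)."""
    layers, total = 0, 0
    for w, h in pyramid_sizes(image_size, scale, win_size):
        cols = (w - win_size[0]) // win_stride[0] + 1
        rows = (h - win_size[1]) // win_stride[1] + 1
        total += cols * rows
        layers += 1
    return layers, total


for scale, stride in [(1.5, (8, 8)), (1.05, (8, 8)), (1.05, (4, 4))]:
    print(scale, stride, total_windows((400, 400), scale, stride))
```

With these assumptions a scale of 1.5 gives 3 layers and a scale of 1.05 gives 24, and every one of those layers is swept by the sliding window, which is why lowering both parameters at once is so costly.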

Typical values for scale are normally in the range [1.01, 1.5]. If you intend to run detectMultiScale in real-time, this value should be as large as possible without significantly sacrificing detection accuracy.

Again, along with winStride, scale is the most important parameter for you to tune in terms of detection speed.
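For reference, here is a minimal sketch of how these parameters are passed to OpenCV from Python. It assumes the `cv2` bindings are installed and uses the default people detector that ships with OpenCV. `build_detect_kwargs` and `detect_people` are hypothetical helper names, and the default values shown are only starting points, not recommendations from the OpenCV documentation.

```python
def build_detect_kwargs(win_stride=(4, 4), padding=(8, 8), scale=1.05):
    """Collect the tuning parameters as keyword arguments for detectMultiScale."""
    if scale <= 1.0:
        raise ValueError("scale must be greater than 1")
    return {
        "winStride": tuple(win_stride),
        "padding": tuple(padding),
        "scale": float(scale),
    }


def detect_people(image, **kwargs):
    """Run OpenCV's default HOG + Linear SVM people detector on an image array."""
    import cv2  # imported here so the helper above works without OpenCV installed

    hog = cv2.HOGDescriptor()
    hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())
    # Returns bounding boxes as (x, y, w, h) plus one SVM score per box.
    rects, weights = hog.detectMultiScale(image, **kwargs)
    return rects, weights
```

A typical call would load a frame with `cv2.imread`, then run `detect_people(image, **build_detect_kwargs(scale=1.1))` and compare both the timing and the detections against the previous setting.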

finalThreshold (optional)

I honestly can’t even find finalThreshold inside the OpenCV documentation (specifically for the Python bindings) and I have no idea what it does. I assume it has some relation to the hitThreshold, allowing us to apply a “final threshold” to the potential hits, weeding out potential false-positives, but again, that’s simply speculation based on the argument name.

If anyone knows what this parameter controls, please leave a comment at the bottom of this post.

useMeanShiftGrouping (optional)

The useMeanShiftGrouping parameter is a boolean indicating whether or not mean-shift grouping should be performed to handle potential overlapping bounding boxes. This value defaults to False and in my opinion, should never be set to True — use non-maxima suppression instead; you’ll get much better results.

When using HOG + Linear SVM object detectors you will undoubtedly run into the issue of multiple, overlapping bounding boxes where the detector has fired numerous times in regions surrounding the object we are trying to detect:

Figure 10: An example of detecting multiple, overlapping bounding boxes.

To suppress these multiple bounding boxes, Dalal suggested using mean shift (Slide 18). However, in my experience mean shift performs sub-optimally and should not be used as a method of bounding box suppression, as evidenced by the image below:

Figure 11: Applying mean-shift to handle overlapping bounding boxes.

Instead, utilize non-maxima suppression (NMS). Not only is NMS faster, but it obtains much more accurate final detections:

Figure 12: Instead of applying mean-shift, utilize NMS. Your results will be much better.
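As an illustration of the idea, here is a small pure-Python sketch of greedy non-maxima suppression. It is one common variant (keep the highest-scoring box, drop any box whose intersection-over-union with a kept box exceeds a threshold) and is not necessarily the implementation used to produce the figures in this post. The function names and the 0.5 threshold are assumptions.

```python
def to_corners(rect):
    """Convert an (x, y, w, h) rectangle to (x1, y1, x2, y2) corners."""
    x, y, w, h = rect
    return (x, y, x + w, y + h)


def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Keep the best-scoring boxes, dropping those that overlap a kept box too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return [boxes[i] for i in keep]


# Two near-duplicate detections of one person plus a separate detection.
rects = [(10, 10, 100, 200), (12, 14, 100, 200), (300, 50, 80, 160)]
boxes = [to_corners(r) for r in rects]
print(non_max_suppression(boxes, scores=[0.9, 0.6, 0.8]))
```

The two near-duplicate boxes collapse into the higher-scoring one, while the separate detection is left untouched.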

Tips on speeding up the object detection process

Whether you’re batch processing a dataset of images or looking to get your HOG detector to run in real-time (or as close to real-time as feasible), these three tips should help you milk as much performance out of your detector as possible:

1. Resize your image or frame to be as small as possible without sacrificing detection accuracy. Prior to calling the detectMultiScale function, reduce the width and height of your image. The smaller your image is, the less data there is to process, and thus the detector will run faster.
2. Tune your scale and winStride parameters. These two arguments have a tremendous impact on your object detector speed. Both scale and winStride should be as large as possible, again, without sacrificing detector accuracy.
3. If your detector is still not fast enough, you might want to look into re-implementing your program in C/C++. Python is great and you can do a lot with it, but sometimes you need the compiled binary speed of C or C++. This is especially true for resource-constrained environments.

Summary

In this lesson we reviewed the parameters to the detectMultiScale function of the HOG descriptor and SVM detector. Specifically, we examined these parameter values in the context of pedestrian detection. We also discussed the speed and accuracy tradeoffs you must consider when utilizing HOG detectors.

If your goal is to apply HOG + Linear SVM in (near) real-time applications, you’ll first want to start by resizing your image to be as small as possible without sacrificing detection accuracy: the smaller the image is, the less data there is to process. You can always keep track of your resizing factor and multiply the returned bounding boxes by this factor to obtain the bounding box sizes in relation to the original image size.
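The bookkeeping for the resizing factor can be sketched as follows. This is plain arithmetic with no OpenCV dependency; `resized_dims` and `rescale_boxes` are hypothetical helper names, and the 1600x1200 frame and 400-pixel target width are made-up example values.

```python
def resized_dims(width, height, target_width):
    """Return the new (width, height) and the factor that maps back to the original."""
    ratio = width / float(target_width)
    return target_width, int(round(height / ratio)), ratio


def rescale_boxes(boxes, ratio):
    """Scale (x, y, w, h) boxes found on the resized image back to original pixels."""
    return [tuple(int(round(v * ratio)) for v in box) for box in boxes]


# A hypothetical 1600x1200 frame shrunk to 400 pixels wide before detection.
new_w, new_h, ratio = resized_dims(1600, 1200, 400)
detections_small = [(50, 40, 64, 128)]  # as if returned by the detector
print((new_w, new_h), rescale_boxes(detections_small, ratio))
```

Keeping the ratio alongside the resized frame means the detector only ever sees the small image, while drawing and downstream logic can still work in the coordinates of the original.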

Secondly, be sure to play with your scale and winStride parameters. These values can dramatically affect the detection accuracy (as well as the false-positive rate) of your detector.

Finally, if you still are not obtaining your desired frames per second (assuming you are working on a real-time application), you might want to consider re-implementing your program in C/C++. While Python is very fast (all things considered), there are times you cannot beat the speed of a binary executable.

License: Attribution, NonCommercial, NoDerivatives
