Time for an update on my ongoing person identification in images project; for all the background you can check out these previous posts:
- Analyzing AWS Rekognition Accuracy with Neo4j
- AWS Rekognition Graph Analysis – Person Label Accuracy
- Person Recognition: OpenCV vs. AWS Rekognition
In my earlier serverless series I discussed and provided code for getting images into S3 and processed by AWS Rekognition – including storing the Rekognition label data in DynamoDB.
This post builds on all of those concepts.
In short – I’ve been collecting comparative data on person recognition using AWS Rekognition and OpenCV and storing that data in Neo4j for analysis.
My assumption was that the trained person detection model that ships with OpenCV was efficient enough to run “embedded”. To validate that, I decided to run OpenCV and the person detection algorithm on a Raspberry Pi located on-site. If you’ve read my earlier series of posts, you know a Raspberry Pi also served as a local controller node in my serverless IoT environment.
I used the following installation guide for OpenCV on a Raspberry Pi to get started:
One additional note: the build takes up a lot of space, so I strongly recommend doing it on an external USB drive. It won’t be fast, but it won’t fill your root partition…
Queuing Images for OpenCV Processing
Two things were important when considering how to let the OpenCV processor know there were images to process:
- Not interfere with the work already being done in the serverless IoT system
- Not transmit the image across the internet multiple times
I solved this with two simple steps:
NAS Storage and Mapping
The first step was to ensure the second Raspberry Pi had exactly the same mapping to my NAS as the local controller node. This allowed both Pis to reference an image file by the exact same path.
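For reference, here is what that shared mapping might look like as an NFS mount in /etc/fstab on both Pis – the NAS hostname and export path below are placeholders, not my actual setup:

```
# identical entry on the controller Pi and the OpenCV Pi (placeholder paths)
nas.local:/volume1/patrolcams  /mnt/patrolcams  nfs  defaults,ro  0  0
```

With the same entry on both machines, a path like /mnt/patrolcams/<camera>/<file> resolves identically on either node.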
SQS Messaging for Files
Since the local controller node was already detecting the arrival of each file, I saw no benefit in having the second Pi detect the arrival event as well. Instead, I leveraged another AWS service – SQS – to pass the event along.
With a minimum of code I could add a message to SQS from the IoT local controller node and receive it on the OpenCV Pi.
Code to Create the SQS Message
I modified the daemon on the IoT Controller by adding the following function:
```python
def put_file_info_on_sqs(object_info, logger, app_config):
    # Get the service resource
    import boto3
    import json

    if object_info['img_type'] == 'snap':
        sqs = boto3.resource('sqs')
        # Get the queue
        queue = sqs.get_queue_by_name(QueueName='image_for_person_detection')
        logger.info("Putting message: {} on queue.".format(json.dumps(object_info)))
        response = queue.send_message(MessageBody=json.dumps(object_info))
    # Fin
```
At the appropriate time in the processing sequence I simply call this function and an SQS message is created on the named queue.
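For illustration, a hypothetical call site might look like the following – the field names mirror those used elsewhere in this post’s code, but the values are made up:

```python
object_info = {
    'img_type': 'snap',                          # only 'snap' images are queued
    'file_name': 'SDAlarm_20170227-174726.jpg',  # example file name
    'camera_name': 'DownDownC1'                  # example camera
}
put_file_info_on_sqs(object_info, logger, app_config)
```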
Processing Images with OpenCV
To process the images with OpenCV I created a daemon that reads the SQS queue and processes each image; the information obtained is then added to the Neo4j graph.
Reading Messages from SQS
```python
def get_sqs_messages():
    """
    Gets messages from AWS SQS and sends them along for processing.
    :return:
    """
    # Get the service resource
    sqs = boto3.resource('sqs')
    # Get the queue
    queue = sqs.get_queue_by_name(QueueName='image_for_person_detection')

    for message in queue.receive_messages(MessageAttributeNames=['Author']):
        image_info = json.loads(message.body)
        if image_info['img_type'] == "snap":
            logger.info("Processing message (message id: {}) for file: {}".format(
                message.message_id, image_info['file_name']))
            image_info['message_id'] = message.message_id
            threading.Thread(name=('message_processor-' + message.message_id),
                             target=process_image_message,
                             args=(image_info,)).start()
            # process_image_message(image_info)
        # FIN
        message.delete()
    # End For
# end get_sqs_messages
```
This code reads the message from SQS and starts a processing thread for the image.
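I won’t reproduce the full daemon wrapper, but a minimal sketch of what the outer loop could look like is a simple poll:

```python
import time

def main():
    # poll SQS indefinitely; receive_messages() returns immediately when the
    # queue is empty (unless long polling is configured), so a short sleep
    # keeps the loop from spinning
    while True:
        get_sqs_messages()
        time.sleep(5)

if __name__ == '__main__':
    main()
```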
Processing with OpenCV
The following code processes the image with OpenCV:
```python
def detect_people(image_path):
    """
    Execution of the OpenCV person detection algorithm.
    :param image_path:
    :return:
    """
    # HOG detector config values
    image_size = get_config_item(app_config, "opencv_processing_info.image_size")
    win_stride = get_config_item(app_config, "opencv_processing_info.win_stride")
    padding = get_config_item(app_config, "opencv_processing_info.padding")
    scale = get_config_item(app_config, "opencv_processing_info.scale")

    # initialize the HOG descriptor/person detector
    hog = cv2.HOGDescriptor()
    hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

    return_info = None
    image_start = time.time()

    # load the image and resize it to (1) reduce detection time
    # and (2) improve detection accuracy
    filename = image_path[image_path.rfind("/") + 1:]
    # print("[INFO] Processing file: {}".format(imagePath))
    image = cv2.imread(image_path)
    if image is not None:
        return_info = {}
        full_image = image.copy()
        scale_ratio = 1
        if image_size != 'original':
            image = imutils.resize(image, width=min(image_size, image.shape[1]))
            scale_ratio = float(full_image.shape[1]) / float(image_size)

        orig = image.copy()

        # detect people in the image
        people = hog.detectMultiScale(image, winStride=(win_stride, win_stride),
                                      padding=(padding, padding), scale=scale)

        logger.info("People detection done for image {} in {} seconds".format(
            image_path, (time.time() - image_start)
        ))

        rects = people[0]
        weights = people[1]
        return_info['rectangles'] = rects
        return_info['weights'] = weights
        return_info['image_dimensions'] = image.shape

        # draw the original bounding boxes
        for (x, y, w, h) in rects:
            cv2.rectangle(orig, (x, y), (x + w, y + h), (0, 0, 255), 2)

        # apply non-maxima suppression to the bounding boxes using a
        # fairly large overlap threshold to try to maintain overlapping
        # boxes that are still people
        rects = np.array([[x, y, x + w, y + h] for (x, y, w, h) in rects])
        pick = non_max_suppression(rects, probs=None, overlapThresh=0.65)
        return_info['number_merged_rectangles'] = len(pick)

        # draw the final bounding boxes
        for (xA, yA, xB, yB) in pick:
            xA *= scale_ratio
            yA *= scale_ratio
            xB *= scale_ratio
            yB *= scale_ratio
            cv2.rectangle(full_image, (int(xA), int(yA)), (int(xB), int(yB)),
                          (0, 255, 0), 2)
        # End for

        if len(pick) > 0:
            # show some information on the number of bounding boxes
            logger.info("[MATCH] {}: found {} person with {} boxes returned from HOG - confidence {}".format(
                filename, len(pick), len(rects), weights))

            cv2.imwrite("/tmp/full_" + filename, full_image)
            return_info['annotated_image_file'] = "/tmp/full_" + filename
        # FIN

        return_info['processing_time'] = time.time() - image_start
        logger.info("Image processing for {} done in {} seconds".format(
            image_path, (time.time() - image_start)
        ))
    # Fin
    return return_info
```
This is quite a bit to digest – and I recommend you read up on OpenCV as you review this, but I’ll break it down a bit.
The first section sets the configurable values for the detector; these were loaded earlier from a configuration file. The OpenCV documentation covers these parameters in detail.
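I haven’t included the configuration file itself, but given the get_config_item calls above and the parameter values that appear in the Cypher later in this post (image size 400, window stride 4, padding 32, scale 1.05), a plausible JSON layout would be:

```json
{
  "opencv_processing_info": {
    "image_size": 400,
    "win_stride": 4,
    "padding": 32,
    "scale": 1.05
  }
}
```

Note that image_size can also be the string "original" to skip resizing entirely.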
We then scale the image to reduce detection time. We retain the ratio so we can later draw the detection box on the full size/resolution image.
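As a concrete illustration (the frame width here is assumed for the example):

```python
# a 1920-pixel-wide frame resized to the configured image_size of 400
scale_ratio = 1920.0 / 400.0   # 4.8
# a detection box edge at x=50 in the resized image maps back to the
# full-resolution frame at:
x_full = int(50 * scale_ratio)  # 240
```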
The actual detection happens with a single line of code:
```python
# detect people in the image
people = hog.detectMultiScale(image, winStride=(win_stride, win_stride),
                              padding=(padding, padding), scale=scale)
```
After that we do some non-maxima suppression to merge boxes within which people are detected.
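To see what the suppression step does in isolation, here is a tiny self-contained example using the same imutils helper – the box coordinates are made up:

```python
import numpy as np
from imutils.object_detection import non_max_suppression

# two heavily overlapping (x1, y1, x2, y2) boxes around the same person
boxes = np.array([[100, 60, 300, 460],
                  [110, 70, 310, 470]])
pick = non_max_suppression(boxes, probs=None, overlapThresh=0.65)
print(pick)  # only one of the two boxes survives
```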
Lastly, we draw the detected box on the original scale image (by de-scaling the box), log some information and return the resulting data.
Storing the Results in Neo4j
Once the detection is complete we store the data in Neo4j for later analysis. In short, we want to generate Cypher to update the graph that looks like this:
```cypher
MERGE (this_image:Image {object_key: "patrolcams/DownDownC1/2017-02-27/Hour-17/snap/SDAlarm_20170227-174726.jpg", timestamp: 1488246446})
MERGE (this_label:Label {label_name: "OpenCV Person"})
MERGE (this_image)-[:HAS_NOT_OPENCV_LABEL {confidence: [], number_found: "0",
    opencv_win_stride: 4, opencv_image_size: 400, opencv_padding: 32,
    opencv_scale: 1.05, opencv_processing_time: 2.00539588928}]->(this_label)
```
This creates (or uses) the image node for the image, the label node which is the indication that OpenCV processed the image, and connects them via one of two edges:
- HAS_NOT_OPENCV_LABEL – no person detected
- HAS_OPENCV_LABEL – a person was detected
On that edge we put all of the current configurable values for the detector (so we can evaluate performance across configuration changes) and the processing time for the detector.
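That makes comparing detector configurations a simple aggregation later on. This isn’t a query from my analysis, but a sketch of the idea would look like:

```cypher
MATCH (:Image)-[e:HAS_OPENCV_LABEL|HAS_NOT_OPENCV_LABEL]->(:Label {label_name: "OpenCV Person"})
RETURN e.opencv_image_size, e.opencv_win_stride, e.opencv_padding, e.opencv_scale,
       count(e) AS images, avg(e.opencv_processing_time) AS avg_seconds
```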
The code used to generate the Cypher is:
```python
def add_info_to_graph(image_info, person_found=True):
    """
    Adds the OpenCV output information to the local graph store for analysis
    :param image_info: The dict containing the information from the SQS message
    :param person_found: Default True. Indicates if a person was found in the image.
    :return:
    """
    start_timing = time.time()
    logger.info("Updating the analytics graph for {}".format(image_info['s3_object']))

    s3_object = "patrolcams" + \
        '/' + image_info['camera_name'] + '/' + \
        image_info['date_string'] + '/' + \
        image_info['hour_string'] + '/' + \
        image_info['img_type'] + '/' + \
        image_info['just_file']

    add_image_node = 'MERGE(this_image:Image {object_key: "' + s3_object + '", ' + \
                     'timestamp: ' + str(image_info['utc_ts']) + '}) '
    add_label_node = 'MERGE(this_label:Label {label_name: "OpenCV Person"}) '

    increment = 0
    string_confidence_array = ""
    for confidence in image_info['opencv_person_detection_info']['weights']:
        if increment > 0:
            string_confidence_array += ","
        string_confidence_array += str(confidence[0])
        increment += 1
    # End For

    label_name = ':HAS_OPENCV_LABEL'
    if not person_found:
        label_name = ':HAS_NOT_OPENCV_LABEL'
    # Fin

    relate_image_to_label = 'MERGE (this_image)-[' + label_name + ' {confidence: [' + \
        string_confidence_array + '], ' + \
        'number_found: "' + \
        str(image_info['opencv_person_detection_info']['number_merged_rectangles']) + \
        '", opencv_win_stride: ' + str(get_config_item(app_config, "opencv_processing_info.win_stride")) + ', ' + \
        'opencv_image_size: ' + str(image_info['opencv_person_detection_info']['image_dimensions'][1]) + ', ' + \
        'opencv_padding: ' + str(get_config_item(app_config, "opencv_processing_info.padding")) + ', ' + \
        'opencv_scale: ' + str(get_config_item(app_config, "opencv_processing_info.scale")) + ', ' + \
        'opencv_processing_time: ' + \
        str(image_info['opencv_person_detection_info']['processing_time']) + \
        ' }]->(this_label)'

    full_query_list = add_image_node + " " + add_label_node + " " + relate_image_to_label
    logger.info("Full Cypher Query: {}".format(full_query_list))

    neo_session = driver.session()
    tx = neo_session.begin_transaction()
    tx.run(full_query_list)
    tx.commit()
    neo_session.close()

    logger.info("Graph updated for {} in {} seconds.".format(
        image_info['s3_object'], time.time() - start_timing))
    return True
```
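One design note: the query is built by string concatenation, which works but is fragile if odd characters ever show up in an object key. The Neo4j Python driver also supports query parameters; a sketch of that alternative (reusing the names from the function above) would be:

```python
# sketch only: the same MERGE with driver-managed parameters instead of
# string concatenation; the driver handles quoting and type conversion
with driver.session() as neo_session:
    neo_session.run(
        'MERGE (this_image:Image {object_key: $key, timestamp: $ts})',
        key=s3_object, ts=image_info['utc_ts'])
```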
Results
Over the last 30 days this setup has processed 103,137 images with OpenCV. 36,313 contained a person, 66,824 did not (at least based on the output of the detector).
Using the following Cypher I can see how many images were processed and the average processing time per image:
```cypher
match (camera:Camera)-[:HAS_IMAGE]-(image:Image)-[image_label_edge]-(label:Label)
where label.label_name in ['Person', 'OpenCV Person']
  and image.timestamp > ((timestamp()/1000) - 60*60*24*30)
return label.label_name, type(image_label_edge),
       count(type(image_label_edge)), avg(image_label_edge.opencv_processing_time)
```
This results in the following output:
The images process in an average of ~3 seconds each – quite good given the limited processing capacity of the Raspberry Pi and the configuration set I’m using.
The row with the label_name “Person” shows the number of images in which AWS Rekognition detected a person over the same time period.
You’ll note the (very interesting) ~12,000 difference in person detection – more on that in a later post.
The Payoff
Since no Neo4j post is complete without some graph image eye candy, here is a heat map of where OpenCV has detected people in the last hour: