This article describes how Talend integrates with Amazon Rekognition to process images, and the value this image processing functionality brings to customers. It showcases several use cases where Talend and Amazon Rekognition are used together, along with a sample Job for each use case.
The article is a continuation of the Talend AWS Machine Learning integration series. You can read the previous articles, Introduction to Talend and Amazon Real-Time Machine Learning, Talend and Amazon Comprehend Integration, and Talend and Amazon Translate Integration in the Talend Community Knowledge Base.
This article was written using Talend 7.1. However, you can apply the logic provided here to earlier versions of Talend to integrate with Amazon Rekognition.
Currently, Amazon Rekognition is only available in selected AWS regions. Talend recommends verifying the availability of the service from the AWS Global Infrastructure Region Table before creating the overall application architecture.
This section examines practical use cases where Talend and Amazon Rekognition can be used together: automatic celebrity recognition, image moderation, automatic vehicle license plate recognition using text detection in images, facial analysis, and image labeling for crime scene investigation.
This process is useful in the media industry, where classifying images taken by photographers and tagging them with the celebrities they contain is a cumbersome manual task. Talend and Amazon Rekognition can automate the classification and indexing of images by celebrity.
The diagram above illustrates the workflow, and how Talend helps you to categorize images of various celebrities with an easy-to-use graphical application design interface. The steps in the flow are:
Note: For more information about using Talend with S3, see Amazon S3 in the Talend Help Center.
Automatic identification of vehicle license plates is a popular use case worldwide for traffic management and law enforcement. The following process shows how Talend and Amazon Rekognition can be used to perform this activity reliably.
The steps in this process are:
Image moderation is an important aspect of a web site, especially when its target audience includes people of all ages. Talend and Amazon Rekognition help you identify images containing suggestive or explicit content that is not appropriate for your site, so you can remove them before publishing.
The steps in this process are:
Crime scene investigation is a time-consuming process in which the items in an image need to be labelled and faces in the image need to be detected. Using Talend and Amazon Rekognition, crime scene analysis can be performed in near real time: the items in the image are labelled automatically, and analysts at headquarters can perform facial analysis.
The steps in the process are:
Create a Talend user routine by performing the following steps. The functions used by all the sample Jobs in the following sections are implemented as separate Java functions within a single Talend routine.
Connect to Talend Studio, and create a new routine called AWS_Rekognition that connects to the Amazon Rekognition service, transmits the input image reference, and collects the response from the service.
Insert the following code into the Talend routine:
package routines;

import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.services.rekognition.model.S3Object;
import com.amazonaws.services.rekognition.AmazonRekognition;
import com.amazonaws.services.rekognition.AmazonRekognitionClientBuilder;
import com.amazonaws.services.rekognition.model.Image;
import com.amazonaws.services.rekognition.model.RecognizeCelebritiesRequest;
import com.amazonaws.services.rekognition.model.RecognizeCelebritiesResult;
import com.amazonaws.services.rekognition.model.Attribute;
import com.amazonaws.services.rekognition.model.DetectFacesRequest;
import com.amazonaws.services.rekognition.model.DetectFacesResult;
import com.amazonaws.services.rekognition.model.DetectLabelsRequest;
import com.amazonaws.services.rekognition.model.DetectLabelsResult;
import com.amazonaws.services.rekognition.model.DetectTextRequest;
import com.amazonaws.services.rekognition.model.DetectTextResult;
import com.amazonaws.services.rekognition.model.DetectModerationLabelsRequest;
import com.amazonaws.services.rekognition.model.DetectModerationLabelsResult;
import org.apache.commons.logging.LogFactory;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.annotation.JsonView;
import org.apache.http.protocol.HttpRequestExecutor;
import org.apache.http.client.HttpClient;
import org.apache.http.conn.DnsResolver;
import org.joda.time.format.DateTimeFormat;

public class AWS_Rekognition {

    public static String Celebrity_Detection(String AWS_Access_Key, String AWS_Secret_Key,
            String AWS_regionName, String photo, String bucket) {
        // AWS Connection
        BasicAWSCredentials awsCreds = new BasicAWSCredentials(AWS_Access_Key, AWS_Secret_Key);
        AmazonRekognition rekognitionClient = AmazonRekognitionClientBuilder.standard()
                .withCredentials(new AWSStaticCredentialsProvider(awsCreds))
                .withRegion(AWS_regionName).build();

        // Celebrity recognition from picture
        RecognizeCelebritiesRequest request = new RecognizeCelebritiesRequest()
                .withImage(new Image().withS3Object(new S3Object()
                        .withName(photo).withBucket(bucket)));
        RecognizeCelebritiesResult result = rekognitionClient.recognizeCelebrities(request);
        String response_text = result.getCelebrityFaces().toString();
        return response_text;
    }

    public static String Face_Detection(String AWS_Access_Key, String AWS_Secret_Key,
            String AWS_regionName, String photo, String bucket) {
        // AWS Connection
        BasicAWSCredentials awsCreds = new BasicAWSCredentials(AWS_Access_Key, AWS_Secret_Key);
        AmazonRekognition rekognitionClient = AmazonRekognitionClientBuilder.standard()
                .withCredentials(new AWSStaticCredentialsProvider(awsCreds))
                .withRegion(AWS_regionName).build();

        // Face detection from picture
        DetectFacesRequest request = new DetectFacesRequest()
                .withImage(new Image().withS3Object(new S3Object()
                        .withName(photo).withBucket(bucket)))
                .withAttributes(Attribute.ALL);
        DetectFacesResult result = rekognitionClient.detectFaces(request);
        String response_text = result.getFaceDetails().toString();
        return response_text;
    }

    public static String Label_Detection(String AWS_Access_Key, String AWS_Secret_Key,
            String AWS_regionName, String photo, String bucket, Integer MaxLabels, Float Confidence_Percent) {
        // AWS Connection
        BasicAWSCredentials awsCreds = new BasicAWSCredentials(AWS_Access_Key, AWS_Secret_Key);
        AmazonRekognition rekognitionClient = AmazonRekognitionClientBuilder.standard()
                .withCredentials(new AWSStaticCredentialsProvider(awsCreds))
                .withRegion(AWS_regionName).build();

        // Label detection from image
        DetectLabelsRequest request = new DetectLabelsRequest()
                .withImage(new Image().withS3Object(new S3Object()
                        .withName(photo).withBucket(bucket)))
                .withMaxLabels(MaxLabels).withMinConfidence(Confidence_Percent);
        DetectLabelsResult result = rekognitionClient.detectLabels(request);
        String response_text = result.getLabels().toString();
        return response_text;
    }

    public static String Image_Moderation(String AWS_Access_Key, String AWS_Secret_Key,
            String AWS_regionName, String photo, String bucket, Float Confidence_Percent) {
        // AWS Connection
        BasicAWSCredentials awsCreds = new BasicAWSCredentials(AWS_Access_Key, AWS_Secret_Key);
        AmazonRekognition rekognitionClient = AmazonRekognitionClientBuilder.standard()
                .withCredentials(new AWSStaticCredentialsProvider(awsCreds))
                .withRegion(AWS_regionName).build();

        // Image moderation call
        DetectModerationLabelsRequest request = new DetectModerationLabelsRequest()
                .withImage(new Image().withS3Object(new S3Object()
                        .withName(photo).withBucket(bucket)))
                .withMinConfidence(Confidence_Percent);
        DetectModerationLabelsResult result = rekognitionClient.detectModerationLabels(request);
        String response_text = result.getModerationLabels().toString();
        return response_text;
    }

    public static String Text_Detection(String AWS_Access_Key, String AWS_Secret_Key,
            String AWS_regionName, String photo, String bucket) {
        // AWS Connection
        BasicAWSCredentials awsCreds = new BasicAWSCredentials(AWS_Access_Key, AWS_Secret_Key);
        AmazonRekognition rekognitionClient = AmazonRekognitionClientBuilder.standard()
                .withCredentials(new AWSStaticCredentialsProvider(awsCreds))
                .withRegion(AWS_regionName).build();

        // Text detection call
        DetectTextRequest request = new DetectTextRequest()
                .withImage(new Image().withS3Object(new S3Object()
                        .withName(photo).withBucket(bucket)));
        DetectTextResult result = rekognitionClient.detectText(request);
        String response_text = result.getTextDetections().toString();
        return response_text;
    }
}
The Talend routine needs additional JAR files. Install the following JAR files in the routine:
Add additional Java libraries to the routine. For more information on how to add Java libraries, see the Talend and Amazon Comprehend Integration article of the series.
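Once the routine compiles and the libraries are in place, you can optionally smoke-test one of its functions, for example from a tJava component, before building the sample Jobs. The snippet below is a minimal sketch; the access key, secret key, bucket name, and region are placeholders that you must replace with your own values.
// Minimal smoke test of the routine, for example from a tJava component.
// The credentials, region, image name, and bucket name below are placeholders.
String response = routines.AWS_Rekognition.Celebrity_Detection(
        "MY_ACCESS_KEY",          // AWS access key (placeholder)
        "MY_SECRET_KEY",          // AWS secret key (placeholder)
        "us-east-1",              // a region where Amazon Rekognition is available
        "Celebrity_sample1.jpg",  // image file name in the S3 bucket
        "my-sample-bucket");      // S3 bucket name (placeholder)
System.out.println(response);     // prints the raw celebrity recognition response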
The setup activities are complete. The next section shows a sample Job for the functionalities described in the practical use cases.
For ease of understanding, and to keep the focus on the integration between Talend and Amazon Rekognition, the sample Jobs use a CSV file for input, sample images that are already loaded to AWS S3, and a tLogRow component for output.
The sample Job, Celebrity_Detection_sample_job.zip, is attached to this article. It extracts the image metadata (AWS S3 bucket name and image file name within AWS S3) from the input file and transmits the message to the Amazon Rekognition service. The response (in JSON format) from the Amazon Rekognition service is parsed, and celebrity information for each inbound text record is published in the console.
The configuration details are as follows:
Create a new Standard Job called Celebrity_Detection_sample_job.
The first stage in associating the routine with a Talend Job is to add the routine to the newly created Job by selecting Setup routine dependencies.
Add the AWS_Rekognition routine to the User routines section of the pop-up screen, to link the newly created routine to the Talend Job.
Note: You must perform this step for all of the Jobs mentioned in this article.
Review the overall Job flow, shown in the following diagram:
Configure the context variables as shown below:
The input file for the Job, Celebrity_input_file.txt, is located in the Talend and AWS Rekognition_sample files.zip file attached to this article. The Job references two images stored in AWS S3: Celebrity_sample1.jpg, a picture of Robert Downey Jr., and Celebrity_sample2.jpg, a picture of Chris Evans.
Configure the tFileInputDelimited component as shown below:
Use a tMap component to call the Amazon Rekognition celebrity detection service through the Talend routine. Pass the parameters in the same order as in the function signature, as shown in the code snippet below.
AWS_Rekognition.Celebrity_Detection(context.AWS_Access_Key,context.AWS_Secret_Key, context.AWS_regionName,row1.image_name,row1.bucket_name)
Configure the tMap component layout as shown below:
The output from the Amazon Rekognition call is a string in JSON format. The response text is parsed into variables as shown below. Leave the bucket_name and image_name columns empty, because you map them directly from the input flow. Two additional columns, celebrity_name and match_confidence, contain the data derived from the JSON message.
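If you prefer to derive the celebrity name and match confidence inside the routine rather than parsing the JSON string in the Job, a hypothetical variant of the Celebrity_Detection function could build a delimited string from the SDK response objects. The fragment below is not part of the attached Job; it would replace the two lines that build response_text.
// Hypothetical alternative: return one "name;confidence" pair per detected celebrity
// instead of the raw toString() output.
StringBuilder sb = new StringBuilder();
for (com.amazonaws.services.rekognition.model.Celebrity celebrity : result.getCelebrityFaces()) {
    if (sb.length() > 0) {
        sb.append("|");                          // separator between celebrities
    }
    sb.append(celebrity.getName())               // for example, "Robert Downey Jr."
      .append(";")
      .append(celebrity.getMatchConfidence());   // match confidence as a percentage
}
return sb.toString();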
Notice that the data then passes to a tLogRow component, which displays the output in the console.
The sample Job, Text_Detection_sample_job.zip, is attached to this article. It extracts the S3 image metadata from a CSV file and performs a call to the Amazon Rekognition text detection service to identify vehicle license numbers. The output from the service is parsed and displayed in the console.
The configuration details are as follows:
Create a new Standard Job called Text_Detection_sample_job.
Add the AWS_Rekognition routine to the User routines section of the pop-up screen, to link the newly created routine to the Talend Job. The following diagram shows the overall Job flow:
The data in the sample text file, text_detection_input_file (attached to this article), and the corresponding input image in AWS S3 are shown below:
Use a tFileInputDelimited component to configure the input file, as shown below:
Talend calls the Text_Detection function of the AWS_Rekognition routine in the tMap component, as shown below. This transfers the data from Talend to Amazon Rekognition, and sends the responses back to the output_data field.
AWS_Rekognition.Text_Detection(context.AWS_Access_Key,context.AWS_Secret_Key, context.AWS_regionName,row1.image_name,row1.bucket_name)
Configure the tMap component, as shown below:
Use the tExtractJSONFields component to parse the JSON response. This maps the text, along with the corresponding confidence level, to the new detected_text and match_confidence output fields.
The tMap component extracts the detected text, which can contain the actual number plate in different combinations. In this example, the record with the maximum length and match confidence is selected through the filter condition of the number_plate output section, as shown in the code snippet below. Replace this with a country-specific regular expression where appropriate; a hypothetical example follows the snippet.
Float.parseFloat(row2.match_confidence) > 99 && row2.detected_text.length() > 5
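As an illustration of a country-specific filter, the hypothetical expression below keeps only detected text matching a generic "ABC-1234" style plate pattern; adapt the regular expression to the format used in your country.
// Hypothetical tMap filter using a regular expression instead of a length check.
// The pattern "[A-Z]{2,3}-?[0-9]{3,4}" is only an example of a generic plate format.
Float.parseFloat(row2.match_confidence) > 99 && row2.detected_text.matches("[A-Z]{2,3}-?[0-9]{3,4}")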
Review the data printed in the output console.
The sample Job, Image_Moderation_sample_job.zip, is attached to this article. It extracts the S3 image metadata from a CSV file and calls the Amazon Rekognition image moderation service to determine whether an image is safe for all age groups. If the image is unsafe, the service flags it. The output from the service is parsed and displayed in the console.
The configuration details are as follows:
Create a new Standard Job called Image_Moderation_sample_job.
Add the AWS_Rekognition routine to the User routines section of the pop-up screen, to link the newly created routine to the Talend Job. The following diagram shows the overall Job flow:
The sample text file, image_moderation_input_file.txt, attached to this article, references two images used for image moderation. The first image, a person in beachwear, is not appropriate for all age groups. The second image, a celebrity photo, is acceptable for all age groups.
Configure the tFileInputDelimited component, as shown below:
Talend calls the Image_Moderation function of the AWS_Rekognition routine using the tMap component, as shown below. This transfers the data from Talend to Amazon Rekognition and sends the response back to the output_data field. Add a context variable called Confidence_Percent, of type Float, to set the minimum confidence for returned moderation labels. In this example, the value is set to 60%.
AWS_Rekognition.Image_Moderation(context.AWS_Access_Key,context.AWS_Secret_Key, context.AWS_regionName,row1.image_name,row1.bucket_name,context.Confidence_Percent)
Configure the tMap component, as shown below:
Add another tMap component to determine whether an image is unsafe. Moderation labels returned with a confidence of 60% or higher flag the image as potentially unsafe for all age groups, and the service output is in JSON format. In this example, the tMap component checks the length of the JSON message and sets the alert_flag variable accordingly; however, you can process the JSON message in other ways to suit your use case.
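A minimal sketch of such an expression for alert_flag is shown below. It assumes that an empty moderation result is returned as the two-character string "[]", so any longer response means at least one moderation label was detected; the attached Job may implement this check slightly differently.
// Hypothetical tMap expression for the alert_flag column:
// flag the image as unsafe whenever the moderation response contains any label.
row2.output_data != null && row2.output_data.trim().length() > 2 ? "UNSAFE" : "SAFE"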
Review the data printed in the output console.
The sample Job, Face_Label_Detection_sample_job.zip, is attached to this article. It extracts the S3 image metadata from a CSV file and calls the Amazon Rekognition facial analysis service to identify the details of a person's face in an image. Similarly, the image labeling service is used to label the objects present in an image with an associated confidence level. The output from each service is parsed and displayed in the console.
The configuration details are as follows:
Create a new Standard Job called Face_Label_Detection_sample_job.
Add the AWS_Rekognition routine to the User routines section of the pop-up screen, to link the newly created routine to the Talend Job. The following diagram shows the overall Job flow:
Configure the context variables, as shown below:
The data in the sample text file, face_label_input_file (attached to this article), and the corresponding input image in AWS S3 are shown below.
Use a tFileInputDelimited component to configure the input file, as shown below:
Talend calls the Face_Detection and Label_Detection functions of the AWS_Rekognition routine in the tMap component, as shown below. This transfers the data from Talend to Amazon Rekognition, and sends the responses back to the output_data field.
AWS_Rekognition.Face_Detection(context.AWS_Access_Key,context.AWS_Secret_Key, context.AWS_regionName,row1.image_name,row1.bucket_name)
AWS_Rekognition.Label_Detection(context.AWS_Access_Key,context.AWS_Secret_Key, context.AWS_regionName,row1.image_name,row1.bucket_name,context.MaxLabels,context.Confidence_Percent)
Configure the tMap component, as shown below:
Use the tExtractJSONFields component to parse the JSON response; additional output fields have been added to capture the facial details. In this example, specific facial details are parsed into the output fields, but additional facial attributes can be extracted from the JSON message to suit your use case, as illustrated below.
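For reference, these attributes are available on the FaceDetail objects that the routine serializes. The fragment below is a minimal sketch of how some of them could be read directly inside the routine (after the detectFaces call), assuming at least one face was detected.
// Hypothetical fragment reading selected attributes from the first detected face.
// These getters return values because the request uses Attribute.ALL.
com.amazonaws.services.rekognition.model.FaceDetail face = result.getFaceDetails().get(0);
String gender = face.getGender().getValue();          // for example, "Male" or "Female"
Integer ageLow = face.getAgeRange().getLow();          // lower bound of the estimated age
Integer ageHigh = face.getAgeRange().getHigh();        // upper bound of the estimated age
Boolean smiling = face.getSmile().getValue();          // whether the face is smiling
Boolean eyeglasses = face.getEyeglasses().getValue();  // whether eyeglasses were detected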
Review the facial analysis printed in the output console.
Use the tExtractJSONFields component to parse the JSON response for image labeling, as shown below:
Review the image label analysis printed in the output console.
Amazon Rekognition supports image file sizes up to 15 MB when passed as an S3 object, and up to 5 MB when submitted as an image byte array.
Amazon Rekognition supports JPEG and PNG image formats.
For best results, Talend recommends images with 640x480 resolution or higher. Images below 320x240 may increase the chances of missing faces, objects, or inappropriate content. However, Amazon Rekognition accepts images that are at least 80 pixels in both dimensions.
To detect and analyze an object in an image, the smallest object or face present in the image should be at least 5% of the size (in pixels) of the shorter image dimension.
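If your pipeline uploads images to S3 before calling Rekognition, a simple pre-check against these limits can avoid failed service calls. The class below is a minimal sketch (not part of the attached Jobs) using only standard Java APIs.
package routines;

import java.awt.image.BufferedImage;
import java.io.File;
import javax.imageio.ImageIO;

public class RekognitionImageCheck {

    // Hypothetical helper: returns true if a local image file respects the
    // documented Amazon Rekognition limits before it is uploaded to S3.
    public static boolean isAcceptable(String path) throws Exception {
        File file = new File(path);

        // JPEG and PNG are the supported formats.
        String name = file.getName().toLowerCase();
        boolean supportedFormat = name.endsWith(".jpg") || name.endsWith(".jpeg") || name.endsWith(".png");

        // Up to 15 MB when the image is referenced as an S3 object.
        boolean sizeOk = file.length() <= 15L * 1024 * 1024;

        // Each dimension must be at least 80 pixels; 640x480 or higher is recommended.
        BufferedImage img = ImageIO.read(file);
        boolean dimensionsOk = img != null && img.getWidth() >= 80 && img.getHeight() >= 80;

        return supportedFormat && sizeOk && dimensionsOk;
    }
}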
This article describes use cases for integrating Talend with the Amazon Rekognition service. In real-world scenarios, the input data typically arrives through web services or message queues rather than the input files used in the sample Jobs.