Cloudinary Blog

New AI-Based Image Auto-Crop Algorithm Sticks to the Subject

If you're the developer for an online store, a news site, a social media app, or any other website that delivers new media content on a regular basis, you are probably painfully familiar with both the challenge and importance of delivering well-cropped images.

In a nutshell, your customers expect top-quality, quick-loading photos that clearly show them what they need to see no matter what device they happen to be using at the moment. Making them happy requires delivering the same image in many different aspect ratios, and potentially cropping closer or wider on your main subject, depending on size. Hardly sounds reasonable, but if you don't meet their expectations, your customers will just go somewhere else...

(Why do I feel like I'm describing my teenage daughters? Well, to be honest, the demanding nature and short attention span of today's online customers have a lot in common with teens…)

But unlike with teens, you can't afford to fall short of your users' demands. If you've got hundreds or thousands of images to deliver every day, you can't manually crop every image to the right size for every device and placement. Simple client-side scaling of your high-quality images would mean completely unacceptable performance and quality. So it's clear that you must adjust the size and aspect ratio of your images programmatically, and then deliver the various image resolutions using responsive delivery code, probably via the <picture> element and the srcset attribute.

Then the question arises: if you programmatically crop the same photo to significantly different aspect ratios, how can you be sure your code won't crop off the most important parts of your products, news subjects, or that adorable cat your users just have to share with the world?

If you are already familiar with Cloudinary's powerful media management and delivery capabilities, you might be saying: what's the problem? Just use Cloudinary's image auto-cropping gravity parameter (aka g_auto) and it will do the hard work for you!

And in fact, thousands of Cloudinary developers already use g_auto in their image URLs to automatically crop millions of images daily. That parameter applies a sophisticated algorithm that analyzes the pixels in an image and prioritizes the most salient areas of each image on-the-fly. The analysis gives priority to skin tones, edge detection, color contrasts, any detected faces, and more, in order to determine the most important areas to keep when it crops.
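
To make the idea of "keeping the most salient area" concrete, here is a toy sketch in Python. It is not Cloudinary's algorithm: the saliency grid, the `best_crop` helper, and the brute-force window search are all assumptions made purely for illustration. Given a made-up importance score per cell, it slides a crop window across the grid and keeps the position with the highest total score.

```python
from typing import List, Tuple


def best_crop(saliency: List[List[float]], crop_w: int, crop_h: int) -> Tuple[int, int]:
    """Return (left, top) of the crop window with the highest total saliency.

    Hypothetical helper for illustration only: a brute-force search over
    every window position, not Cloudinary's actual implementation.
    """
    rows, cols = len(saliency), len(saliency[0])
    if crop_w > cols or crop_h > rows:
        raise ValueError("crop window is larger than the saliency grid")

    best_score = float("-inf")
    best_pos = (0, 0)
    for top in range(rows - crop_h + 1):
        for left in range(cols - crop_w + 1):
            score = sum(
                saliency[y][x]
                for y in range(top, top + crop_h)
                for x in range(left, left + crop_w)
            )
            if score > best_score:  # strict '>' keeps the first window on ties
                best_score = score
                best_pos = (left, top)
    return best_pos


# A made-up 3x6 grid: the "subject" (high scores) sits on the right side.
grid = [
    [0.0, 0.1, 0.0, 0.2, 0.9, 0.8],
    [0.1, 0.0, 0.1, 0.3, 1.0, 0.9],
    [0.0, 0.1, 0.0, 0.2, 0.8, 0.7],
]

# Left edge of a 3-column center crop, for comparison.
center_left = (len(grid[0]) - 3) // 2
print("center crop starts at column", center_left)
print("saliency crop starts at", best_crop(grid, 3, 3))
```

In this made-up grid the center crop would cut through the subject, while the saliency-driven window shifts right to keep it.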

But what if Cloudinary's already powerful image auto-crop functionality could be even smarter?

Introducing Subject Detection Image Auto Cropping

Cloudinary's new and improved deep-learning-based g_auto transformation parameter goes beyond the saliency analysis described above, adding the capability to actually detect the subjects in an image that would be most likely to capture a person's attention.

To accomplish this, our new image auto-cropping deep learning mechanism has been (and continues to be) fed with tens of thousands of images and corresponding human input that together teach our machines to predict the important regions in images, no matter their subject and layout. This involves advanced computations performed by GPU-based hardware clusters that process millions of crop requests on the fly. The result is completely unique auto-cropping functionality that provides very impressive outputs.

Remember those products, news images, and user-generated content (aka cats) we talked about above?

Look what would happen to these mobile-camera photographs if we were to crop them to a square using the standard center cropping that other programs often apply, and how the new and improved g_auto with subject-detection cropping comes to the rescue in each of these crops:

[Images: Product: original, center crop, and g_auto crop with subject detection]
[Images: News image: original, center crop, and g_auto crop with subject detection]
[Images: Cat image: original, center crop, and g_auto crop with subject detection]

How do you apply this cool cropping method on your delivered images? After uploading your original image to your Cloudinary account, just specify auto as the gravity (area to keep) in your on-the-fly delivery URL, along with a crop mode such as fill, lfill, or crop, and of course an aspect ratio or width/height combination that's different from the original.

For example, here's how you'd deliver that nicely cropped cat above as a 500px square (1.0 aspect ratio or ar_1 in the delivery URL):

Ruby:
cl_image_tag("docs/cat_yellow_leaves2.jpg", :gravity=>"auto", :aspect_ratio=>"1", :width=>500, :crop=>"fill")
PHP v1:
cl_image_tag("docs/cat_yellow_leaves2.jpg", array("gravity"=>"auto", "aspect_ratio"=>"1", "width"=>500, "crop"=>"fill"))
PHP v2:
(new ImageTag('docs/cat_yellow_leaves2.jpg'))
  ->resize(Resize::fill()->width(500)->aspectRatio(1)
    ->gravity(Gravity::autoGravity()));
Python:
CloudinaryImage("docs/cat_yellow_leaves2.jpg").image(gravity="auto", aspect_ratio="1", width=500, crop="fill")
Node.js:
cloudinary.image("docs/cat_yellow_leaves2.jpg", {gravity: "auto", aspect_ratio: "1", width: 500, crop: "fill"})
Java:
cloudinary.url().transformation(new Transformation().gravity("auto").aspectRatio("1").width(500).crop("fill")).imageTag("docs/cat_yellow_leaves2.jpg");
JS:
cloudinary.imageTag('docs/cat_yellow_leaves2.jpg', {gravity: "auto", aspectRatio: "1", width: 500, crop: "fill"}).toHtml();
jQuery:
$.cloudinary.image("docs/cat_yellow_leaves2.jpg", {gravity: "auto", aspect_ratio: "1", width: 500, crop: "fill"})
React:
<Image publicId="docs/cat_yellow_leaves2.jpg" >
  <Transformation gravity="auto" aspectRatio="1" width="500" crop="fill" />
</Image>
Vue.js:
<cld-image publicId="docs/cat_yellow_leaves2.jpg" >
  <cld-transformation gravity="auto" aspectRatio="1" width="500" crop="fill" />
</cld-image>
Angular:
<cl-image public-id="docs/cat_yellow_leaves2.jpg" >
  <cl-transformation gravity="auto" aspect-ratio="1" width="500" crop="fill">
  </cl-transformation>
</cl-image>
.NET:
cloudinary.Api.UrlImgUp.Transform(new Transformation().Gravity("auto").AspectRatio("1").Width(500).Crop("fill")).BuildImageTag("docs/cat_yellow_leaves2.jpg")
Android:
MediaManager.get().url().transformation(new Transformation().gravity("auto").aspectRatio("1").width(500).crop("fill")).generate("docs/cat_yellow_leaves2.jpg");
iOS:
imageView.cldSetImage(cloudinary.createUrl().setTransformation(CLDTransformation().setGravity("auto").setAspectRatio("1").setWidth(500).setCrop("fill")).generate("docs/cat_yellow_leaves2.jpg")!, cloudinary: cloudinary)
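
Each of these SDK calls ends up requesting the same transformation in the delivery URL. As a rough sketch of what that URL looks like, the Python below assembles it by hand. The `build_url` helper and the `demo` cloud name are assumptions for illustration (substitute your own cloud name); in real code you would normally let the SDK generate the URL for you.

```python
def build_url(cloud_name: str, public_id: str, **params: str) -> str:
    """Assemble a Cloudinary-style image delivery URL by hand.

    Hypothetical helper: joins URL parameters such as ar_1 or g_auto with
    commas, sorted by name, and places them before the public ID.
    """
    transformation = ",".join(f"{key}_{value}" for key, value in sorted(params.items()))
    return f"https://res.cloudinary.com/{cloud_name}/image/upload/{transformation}/{public_id}"


# 'demo' is a placeholder cloud name; ar, c, g, and w are the short URL
# names for aspect ratio, crop mode, gravity, and width.
url = build_url("demo", "docs/cat_yellow_leaves2.jpg", ar="1", c="fill", g="auto", w="500")
print(url)
```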

Classic Or Subject?

This latest addition to Cloudinary's growing set of machine learning capabilities analyzes the image as a whole, rather than applying the pixel-by-pixel analysis used in our classic auto-cropping feature.

In the majority of cases, the classic saliency algorithms and our new subject-detection method will provide the same or very similar results. But when processing the vast number of auto-crops that Cloudinary handles every day, there are some cases where we felt we could provide even better results. For example, the new subject-detection algorithm can be more reliable in cases where the true main subject of the photo would otherwise have to compete with elements like sunlight, faces, or other large areas of color contrast that are not actually central to the 'story' of the image.

In the following photo, for instance, the classic auto-cropping algorithm gives increased priority to the bright contrasts of the leaves against the sky, and so it doesn't keep the girl when the aspect ratio is changed significantly. But the artificial intelligence algorithm emulates what our intuition tells us, and the automatic crop is right on target.

[Images: original, classic auto-crop, and new auto-crop]

Just a quick note here: to get the best of all worlds, the default g_auto parameter now applies a combination of the subject and classic algorithms (leaning more heavily towards the subject results). But you can always explicitly request either the classic or the subject mechanism with any crop mode by using auto:classic or auto:subject as the gravity (g_) value.
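
The three gravity options map onto the URL parameter in a simple way. The Python below is a minimal sketch of that mapping; the `auto_gravity` helper is hypothetical and covers only the three options described in this post.

```python
from typing import Optional

VALID_AUTO_MODES = {None, "classic", "subject"}


def auto_gravity(mode: Optional[str] = None) -> str:
    """Return the gravity URL parameter for a given auto-cropping mode.

    Hypothetical helper covering only the three options described here:
    the default blend (g_auto), auto:classic, and auto:subject.
    """
    if mode not in VALID_AUTO_MODES:
        raise ValueError(f"unsupported auto gravity mode: {mode!r}")
    return "g_auto" if mode is None else f"g_auto:{mode}"


for mode in (None, "classic", "subject"):
    # Combine with a crop mode, aspect ratio, and width, as in the cat example earlier.
    print(f"ar_1,c_fill,{auto_gravity(mode)},w_500")
```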

Auto-Cropping and Responsive Art Direction

The value of great image auto-cropping really comes into play when you start thinking about delivering the same image on different devices.

Organizations that need to play it safe tend to just scale down their original image and deliver the same thing regardless of the device viewport's aspect ratio. But that means you sacrifice detail on smaller screens, or whenever the aspect ratio is significantly different from the original.

When you rotate your phone between portrait and landscape, you are switching between ~19:9 vs. 9:19, or an aspect ratio of 2.1 vs. 0.47, while your computer screen viewers are probably using a 4:3 or 16:9 screen. Do you really want to deliver the identical image in all these views?

In general, images that fill the device viewport get the most engagement. But you can only go that route if you can be sure that even if 50% or more of the image may be cropped out, you'll still keep the important parts of your image intact.
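
To see how much of an image a viewport-filling crop can discard, here is a small worked calculation in Python. The `fraction_cropped` helper and the 4:3 source photo are assumptions for illustration; the target ratios are the ones mentioned above.

```python
def fraction_cropped(original_ratio: float, target_ratio: float) -> float:
    """Fraction of the original image area lost when a fill crop changes
    the aspect ratio (both ratios are width divided by height)."""
    kept = min(original_ratio, target_ratio) / max(original_ratio, target_ratio)
    return 1.0 - kept


original = 4 / 3  # assumed aspect ratio of the source photo
targets = {
    "phone, landscape (19:9)": 19 / 9,
    "phone, portrait (9:19)": 9 / 19,
    "desktop (16:9)": 16 / 9,
}
for label, ratio in targets.items():
    print(f"{label}: ratio {ratio:.2f}, {fraction_cropped(original, ratio):.0%} cropped")
```

Under these assumptions, filling a 9:19 portrait viewport from a 4:3 photo discards roughly 64% of the image, which is why the choice of what to keep matters so much.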

For example, with the new g_auto, you can confidently deliver images that fill a phone's viewport, even when viewers rotate their phones, so you can offer them the more engaging Option B images rather than the tiny Option A images.

[Images: two side-by-side comparisons of Option A and Option B]
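
One way to wire this up is with a <picture> element that picks a differently cropped rendition per orientation. The Python below is a sketch that generates such markup; the `auto_crop_url` and `picture_markup` helpers, the `demo` cloud name, the widths, and the alt text are all assumptions for illustration, and a production page would likely add more breakpoints.

```python
def auto_crop_url(public_id: str, aspect_ratio: str, width: int, cloud_name: str = "demo") -> str:
    """Hand-built delivery URL using fill cropping with automatic gravity.
    'demo' is a placeholder cloud name."""
    return (
        f"https://res.cloudinary.com/{cloud_name}/image/upload/"
        f"ar_{aspect_ratio},c_fill,g_auto,w_{width}/{public_id}"
    )


def picture_markup(public_id: str, alt: str) -> str:
    """Return a <picture> element that serves a tall crop to portrait
    viewports and a wide crop to landscape ones."""
    portrait = auto_crop_url(public_id, "9:19", 400)    # assumed width
    landscape = auto_crop_url(public_id, "19:9", 800)   # assumed width
    return "\n".join([
        "<picture>",
        f'  <source media="(orientation: portrait)" srcset="{portrait}">',
        f'  <source media="(orientation: landscape)" srcset="{landscape}">',
        f'  <img src="{landscape}" alt="{alt}">',
        "</picture>",
    ])


html = picture_markup("docs/cat_yellow_leaves2.jpg", "Cat among yellow leaves")
print(html)
```

Because both renditions use automatic gravity, the same source image can fill either orientation without hand-picked crop coordinates.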

When the Subject Is Subject to Change…

Programmatically cropped images are an essential part of any web or mobile site. But when the subjects, and the location of those subjects within your images, are unpredictable, automatic cropping isn't enough. The cropping mechanism you use has to be smart enough to 'know' what the end users are going to want to see. With Cloudinary's subject detection auto cropping, you can confidently deliver great photos that bring your users' attention directly to the subject at hand, from responsive view to responsive view.

Oh, by the way, if your subject is not so likely to change (for example, if you are selling microwaves, umbrellas, vehicles, or food), you may want to take advantage of our new object-aware image cropping add-on. This add-on applies another deep learning tool that gives the highest preservation ("don't crop") priority to specific objects or categories you specify. If the specified objects aren't found, then (by default) the subject-detection auto-cropping algorithm described here is still applied.

The end goal? With these smart AI algorithms, you can confidently use the same cropping transformation with virtually every image you deliver, no matter the size and layout of your images, the user's device, or your graphic design.

The subject detection auto-cropping we've demonstrated in this post is available with all of Cloudinary's plans, even the free plan!

Learn more about g_auto:subject and all the automatic cropping options in our docs.
