Overview

The image vision tool lets your Utari workers load and analyze images. With it, workers can describe visual content, extract information from images, answer questions about pictures, and provide detailed analysis of visual elements. It is useful for workflows that involve design review, image analysis, or visual data extraction.

Image Vision Capabilities

When you enable the image vision tool, your worker gains two essential capabilities:

Load Images

Upload and analyze images from files or URLs for visual analysis and interpretation

Clear Images from Context

Remove images from the conversation to manage the 3-image limit and free up space for new images

Important Limitation: Workers can only process up to 3 images at a time in a single conversation. Use the “clear images from context” capability to remove images when you reach this limit.

Enabling Image Vision

1

Select Your Worker

Navigate to the worker you want to configure and click on the Tools tab.
2

Find Image Vision Tool

Scroll through the available tools to locate Image Vision.
3

Enable the Tool

Check the box next to Image Vision to activate both capabilities:
  • Load images
  • Clear images from context
4

Verify Capabilities

Ensure both capabilities are enabled so your worker can both load images and clear them from context.
5

Save Configuration

Your changes are automatically saved. The worker can now analyze images.

Using Image Vision

Uploading and Analyzing Images

1

Start a Chat

Open a conversation with a worker that has image vision enabled.
2

Upload an Image

Click the Attach files button and select an image from your computer, or provide an image URL.
3

Request Analysis

Ask your worker to analyze the image. For example:
    Can you please explain this image to me?
4

Review the Analysis

Your worker will:
  • Load and process the image
  • Analyze visual elements
  • Provide a detailed description
  • Answer specific questions about the content

Example Analysis Output

When analyzing an image, your worker provides detailed descriptions such as:
This is a striking graphic design with bold red, orange, and black color scheme, split into two contrasting scenes:

Left side: Urban market scene with vibrant colors and busy atmosphere
Right side: A solitary figure against a darker background

The composition creates a dramatic contrast between communal energy and individual isolation.

Image Management Features

Viewing Image Controls

Once an image is uploaded, you have several controls:

Enlarge

Click to view the image at a larger size

Minimize

Click to reduce the image size

Download

Save the image to your device

Managing the 3-Image Limit

When you reach the 3-image limit:
1

Recognize the Limit

You’ll be unable to upload additional images once 3 are loaded in the conversation.
2

Clear Images

Request your worker to clear images:
    Can you please clear the images from the context?
3

Upload New Images

Once cleared, you can upload up to 3 new images to continue your work.
Clear images strategically. If you need to reference previous images later, save your analysis or download the images before clearing them.
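
The limit-and-clear behavior described above can be pictured with a small toy model. This is only an illustrative sketch: `ImageContext`, `load`, and `clear` are hypothetical names chosen for this example and are not part of any Utari API. The only fact taken from this guide is the limit of 3 images per conversation.

```python
class ImageContext:
    """Toy model of a conversation's image slots (hypothetical, not Utari's implementation)."""

    LIMIT = 3  # per-conversation image limit described in this guide

    def __init__(self) -> None:
        self.images: list[str] = []

    def load(self, name: str) -> bool:
        """Add an image if a slot is free; return False when the limit is reached."""
        if len(self.images) >= self.LIMIT:
            return False
        self.images.append(name)
        return True

    def clear(self) -> int:
        """Remove all images and return how many were removed."""
        removed = len(self.images)
        self.images.clear()
        return removed
```

In this model a fourth `load` call fails until `clear` has been called, which mirrors the sequence of recognizing the limit, clearing, and uploading again.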

Use Cases for Image Vision

Design and Creative Review

Design Feedback

Upload design mockups, logos, or graphics and ask:
  • “Analyze this logo design and provide feedback on color choice and composition”
  • “What design principles are demonstrated in this layout?”
  • “Compare these two design options and recommend improvements”

Content Analysis

Visual Content Understanding

Analyze images for content creation:
  • “Describe this image for an alt text description”
  • “What emotions does this image convey?”
  • “Identify the key visual elements in this photo”

Data Extraction

Information Extraction

Extract text and data from images:
  • “Read the text from this screenshot”
  • “Extract the data from this chart or graph”
  • “Transcribe the information from this document photo”

Product Analysis

Product Review

Analyze product images:
  • “Describe this product and its features”
  • “What are the key selling points visible in this product image?”
  • “Compare these product photos for quality and presentation”

Educational Content

Image Explanation

Explain complex visual content:
  • “Explain what’s happening in this diagram”
  • “Describe the components shown in this technical illustration”
  • “What does this infographic communicate?”

Advanced Image Analysis Requests

Detailed Descriptions

Provide a detailed analysis of this image, including composition, color palette, mood, and visual elements.

Comparative Analysis

Compare these two images and highlight the differences in style, composition, and effectiveness.

Specific Element Analysis

Analyze the color scheme in this image and suggest complementary colors.

Contextual Questions

What is the likely purpose of this image? Who is the target audience?

Combining Image Vision with Other Tools

Image vision becomes even more powerful when combined with other Utari capabilities:

+ Document Creator

Analyze images and create comprehensive reports or descriptions in document format

+ Web Search

Compare uploaded images with similar images found online for context

+ Files and Folder

Save image analyses and descriptions in organized folders for future reference

+ Knowledge Base

Apply brand guidelines or design SOPs when analyzing images

Example Combined Workflows

Analyze this product image using our brand guidelines from the knowledge base, then create a detailed product description document and save it in the "Product Descriptions" folder.
Compare this logo design to similar logos online using web search, then provide a comprehensive analysis document with recommendations.

Best Practices

Be Specific

Ask clear, specific questions that state what you want to learn from the image

Provide Context

Give background information about the image’s purpose or intended use

Manage Limits

Track your image count and clear when necessary to avoid hitting the 3-image limit

Use High Quality

Upload clear, high-resolution images for better analysis results

Ask Follow-ups

Ask multiple questions about the same image to get comprehensive insights

Save Important Analysis

Save or document important image analyses before clearing images from context

Image Formats and Requirements

Supported Formats

The image vision tool works with common image formats:
  • JPEG/JPG: Standard photo format
  • PNG: Graphics and screenshots
  • GIF: Animated or static images
  • WebP: Modern web image format
  • BMP: Bitmap images
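
If you prepare many files before uploading, a local pre-check against the formats listed above can catch unsupported files early. The sketch below is an assumption-laden helper that runs on your own machine: the function names are hypothetical, it only inspects file extensions (not file contents), and it does not call any Utari API.

```python
from pathlib import Path

# Extensions for the formats listed in this guide (JPEG/JPG, PNG, GIF, WebP, BMP).
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".webp", ".bmp"}


def is_supported_image(filename: str) -> bool:
    """Return True if the file extension matches a listed format (case-insensitive)."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def split_by_support(filenames: list[str]) -> tuple[list[str], list[str]]:
    """Split filenames into (supported, unsupported) lists, preserving order."""
    supported = [f for f in filenames if is_supported_image(f)]
    unsupported = [f for f in filenames if not is_supported_image(f)]
    return supported, unsupported
```

For example, `split_by_support(["logo.PNG", "scan.tiff"])` places `logo.PNG` in the supported list and `scan.tiff` in the unsupported list, so you know which files to convert first.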

Quality Recommendations

For best results:
  • Use high-resolution images when possible
  • Ensure text in images is clear and legible
  • Avoid extremely large file sizes (compress if needed)
  • Use well-lit, focused images for better analysis

Workflow Examples

Design Review Process

1

Upload Design

    [Upload design mockup]
    Analyze this website design mockup for usability and visual appeal.
2

Detailed Feedback

    What specific improvements would you suggest for the navigation layout?
3

Color Analysis

    Analyze the color palette and suggest alternatives that might improve contrast.
4

Document Feedback

    Create a design review document with all the feedback and save it to the "Design Reviews" folder.

Content Creation Assistant

1

Image Upload

    [Upload product photo]
    Describe this product in detail for an e-commerce listing.
2

Alt Text Creation

    Create an SEO-optimized alt text description for this image.
3

Social Media Caption

    Write three different social media captions for this image, each with a different tone.

Batch Image Analysis

1

First Set

    [Upload 3 images]
    Analyze these three product images and rank them by visual appeal.
2

Clear and Continue

    Clear the images from context, please.
3

Second Set

    [Upload 3 new images]
    Analyze these next three images using the same criteria.
4

Comprehensive Report

    Based on all six images we've reviewed, create a comprehensive analysis document.
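
For larger sets, it can help to plan the upload and clear sequence before starting the chat. The following sketch splits a list of image files into groups of three and lists the chat steps in order. It is a planning aid only: `batches` and `plan_batch_review` are hypothetical helper names, the file names are placeholders, and nothing here sends messages to a worker.

```python
from typing import Iterator, List

MAX_IMAGES_PER_BATCH = 3  # the per-conversation limit described in this guide
CLEAR_PROMPT = "Clear the images from context, please."


def batches(images: List[str], size: int = MAX_IMAGES_PER_BATCH) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` images."""
    for start in range(0, len(images), size):
        yield images[start:start + size]


def plan_batch_review(images: List[str], analysis_prompt: str) -> List[str]:
    """Return the ordered chat steps: upload a batch, analyze it, clear before the next batch."""
    steps: List[str] = []
    for index, batch in enumerate(batches(images)):
        if index > 0:
            # Free the image slots before uploading the next group.
            steps.append(CLEAR_PROMPT)
        steps.append(f"[Upload: {', '.join(batch)}]")
        steps.append(analysis_prompt)
    return steps
```

Calling `plan_batch_review` with six file names produces two upload steps with a clear request between them, matching the workflow above. The final report request still has to be sent separately.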

Troubleshooting

If your worker doesn’t respond to an uploaded image:
  • Verify that the image vision tool is enabled for the worker
  • Confirm the image uploaded successfully (check for upload confirmation)
  • Check that the image format is supported
  • Make sure the file isn’t corrupted
  • Try re-uploading the image

If you can’t upload another image, you’ve likely hit the 3-image limit:
  • Ask the worker to clear images from context
  • Wait for confirmation that images are cleared
  • Upload new images
  • Consider starting a new conversation for fresh context

If the analysis is too vague, improve your requests:
  • Ask more specific questions
  • Provide context about what you’re looking for
  • Break down your analysis into multiple focused questions
  • Specify the type of details you want (colors, composition, objects, etc.)

If the worker seems to describe the wrong image:
  • Check which image you’re referencing in your question
  • If multiple images are uploaded, specify “the first image” or “the image showing [subject]”
  • Consider clearing old images to avoid confusion

If text extraction is inaccurate, try:
  • Uploading a higher resolution version
  • Ensuring text is clearly visible and not too small
  • Checking that text isn’t obscured or distorted
  • Explicitly asking “extract the text from this image”

If images aren’t being cleared:
  • Ensure the “Clear images from context” capability is enabled
  • Use clear phrasing like “clear the images” or “remove images from context”
  • Try the exact phrase: “Can you please clear the images from the context?”

Privacy and Security

Important Considerations:
  • Don’t upload images containing sensitive personal information
  • Avoid uploading confidential business documents without proper authorization
  • Be aware that images are processed to enable analysis
  • Clear sensitive images from context after analysis
  • Follow your organization’s data handling policies

Summary

You’ve successfully learned how to:
  • Enable and configure the image vision tool for your workers
  • Upload and analyze images within conversations
  • Manage the 3-image limit using the clear images capability
  • Request different types of image analysis and descriptions
  • Combine image vision with other Utari tools for enhanced workflows
  • Apply best practices for effective image analysis

The image vision tool transforms your Utari workers into visual analysts, capable of understanding, describing, and extracting insights from images to support your creative, analytical, and content creation workflows.
