Overview
The image vision tool empowers your Utari workers to “see” and analyze images. This capability enables workers to describe visual content, extract information from images, answer questions about pictures, and provide detailed analysis of visual elements. It’s essential for any workflow involving visual content, design review, image analysis, or visual data extraction.
Image Vision Capabilities
When you enable the image vision tool, your worker gains two essential capabilities:
Load Images
Load images from uploaded files or URLs so the worker can analyze and interpret them
Clear Images from Context
Remove images from the conversation to manage the 3-image limit and free up space for new images
Enabling Image Vision
1. Select Your Worker
Navigate to the worker you want to configure and click on the Tools tab.
2. Find Image Vision Tool
Scroll through the available tools to locate Image Vision.
3. Enable the Tool
Check the box next to Image Vision to activate both capabilities:
- Load images
- Clear images from context
4. Verify Capabilities
Ensure both capabilities are enabled so your worker can both upload and manage images effectively.
5. Save Configuration
Your changes are automatically saved. The worker can now analyze images.
Using Image Vision
Uploading and Analyzing Images
1. Start a Chat
Open a conversation with a worker that has image vision enabled.
2. Upload an Image
Click the Attach files button and select an image from your computer, or provide an image URL.
3. Request Analysis
Ask your worker to analyze the image. For example:
4. Review the Analysis
Your worker will:
- Load and process the image
- Analyze visual elements
- Provide a detailed description
- Answer specific questions about the content
Example Analysis Output
When analyzing an image, your worker provides detailed descriptions such as:
Image Management Features
Viewing Image Controls
Once an image is uploaded, you have several controls:
Enlarge
Click to view the image at a larger size
Minimize
Click to reduce the image size
Download
Save the image to your device
Managing the 3-Image Limit
When you reach the 3-image limit:
1. Recognize the Limit
You’ll be unable to upload additional images once 3 are loaded in the conversation.
2. Clear Images
Ask your worker to clear the images, for example: “Can you please clear the images from the context?”
3. Upload New Images
Once cleared, you can upload up to 3 new images to continue your work.
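The limit behavior described above can be pictured with a small Python model. This is only an illustrative sketch: `ImageContext`, `ImageLimitReached`, and the file names are hypothetical and are not part of any Utari API, and the sketch assumes that clearing removes every loaded image at once.

```python
class ImageLimitReached(Exception):
    """Raised when a load is attempted while the context is already full."""


class ImageContext:
    """Toy model of a conversation's image slots (hypothetical, not Utari's API)."""

    def __init__(self, limit=3):
        self.limit = limit  # the 3-image limit described above
        self.images = []

    def load(self, name):
        # Loading fails once the limit is reached, mirroring the blocked upload.
        if len(self.images) >= self.limit:
            raise ImageLimitReached(
                f"{self.limit} images already loaded; clear before loading more"
            )
        self.images.append(name)

    def clear(self):
        # Assumption: clearing removes all images; returns how many were freed.
        freed = len(self.images)
        self.images.clear()
        return freed


context = ImageContext()
for name in ["mockup-a.png", "mockup-b.png", "mockup-c.png"]:
    context.load(name)

try:
    context.load("mockup-d.png")  # fourth image: rejected
    fourth_loaded = True
except ImageLimitReached:
    fourth_loaded = False

freed = context.clear()       # frees all three slots
context.load("mockup-d.png")  # now succeeds
```

The point of the model is the ordering: a fourth load is rejected until a clear has happened, after which the slots are available again.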
Use Cases for Image Vision
Design and Creative Review
Design Feedback
Upload design mockups, logos, or graphics and ask:
- “Analyze this logo design and provide feedback on color choice and composition”
- “What design principles are demonstrated in this layout?”
- “Compare these two design options and recommend improvements”
Content Analysis
Visual Content Understanding
Analyze images for content creation:
- “Describe this image for an alt text description”
- “What emotions does this image convey?”
- “Identify the key visual elements in this photo”
Data Extraction
Information Extraction
Extract text and data from images:
- “Read the text from this screenshot”
- “Extract the data from this chart or graph”
- “Transcribe the information from this document photo”
Product Analysis
Product Review
Analyze product images:
- “Describe this product and its features”
- “What are the key selling points visible in this product image?”
- “Compare these product photos for quality and presentation”
Educational Content
Image Explanation
Explain complex visual content:
- “Explain what’s happening in this diagram”
- “Describe the components shown in this technical illustration”
- “What does this infographic communicate?”
Advanced Image Analysis Requests
Detailed Descriptions
Comparative Analysis
Specific Element Analysis
Contextual Questions
Combining Image Vision with Other Tools
Image vision becomes even more powerful when combined with other Utari capabilities:
+ Document Creator
Analyze images and create comprehensive reports or descriptions in document format
+ Web Search
Compare uploaded images with similar images found online for context
+ Files and Folder
Save image analyses and descriptions in organized folders for future reference
+ Knowledge Base
Apply brand guidelines or design SOPs when analyzing images
Example Combined Workflows
Best Practices
Be Specific
Ask clear, specific questions about what you want to know about the image
Provide Context
Give background information about the image’s purpose or intended use
Manage Limits
Track how many images are loaded and clear them when necessary to avoid hitting the 3-image limit
Use High Quality
Upload clear, high-resolution images for better analysis results
Ask Follow-ups
Ask multiple questions about the same image to get comprehensive insights
Save Important Analysis
Save or document important image analyses before clearing images from context
Image Formats and Requirements
Supported Formats
The image vision tool works with common image formats:
- JPEG/JPG: Standard photo format
- PNG: Graphics and screenshots
- GIF: Animated or static images
- WebP: Modern web image format
- BMP: Bitmap images
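If you prepare many files before uploading, a quick local check against the list above can catch unsupported formats early. The following is a minimal sketch: the helper name `is_supported_image` and the extension set are assumptions derived from the formats listed here, not an official Utari validator, and the check looks only at the file extension rather than the file contents.

```python
from pathlib import Path

# Assumption: file extensions corresponding to the formats listed above.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".webp", ".bmp"}


def is_supported_image(filename):
    """Return True when the file extension matches a listed format."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


candidates = ["logo.PNG", "chart.webp", "scan.tiff", "notes.pdf"]
ready = [name for name in candidates if is_supported_image(name)]
skipped = [name for name in candidates if not is_supported_image(name)]
```

Files that end up in `skipped` would need converting to one of the listed formats before upload.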
Quality Recommendations
For best results:
- Use high-resolution images when possible
- Ensure text in images is clear and legible
- Avoid extremely large file sizes (compress if needed)
- Use well-lit, focused images for better analysis
Workflow Examples
Design Review Process
1. Upload Design
2. Detailed Feedback
3. Color Analysis
4. Document Feedback
Content Creation Assistant
1. Image Upload
2. Alt Text Creation
3. Social Media Caption
Batch Image Analysis
1. First Set
2. Clear and Continue
3. Second Set
4. Comprehensive Report
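For a larger collection, the batch workflow above amounts to splitting the images into sets of three and clearing between sets. The sketch below plans that sequence in Python. The function names, step wording, and file names are illustrative assumptions; the code only produces a checklist to follow in the chat and does not call any Utari API.

```python
from typing import Iterator, List


def batches(items: List[str], size: int = 3) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def plan_batch_analysis(images: List[str]) -> List[str]:
    """Return the ordered chat actions for analyzing all images."""
    groups = list(batches(images))
    steps = []
    for index, group in enumerate(groups):
        steps.append("upload: " + ", ".join(group))
        steps.append("analyze this set and save the findings")
        # Clear only between sets; nothing needs clearing after the last one.
        if index < len(groups) - 1:
            steps.append("clear images from context")
    steps.append("compile a comprehensive report from the saved findings")
    return steps


plan = plan_batch_analysis(["a.png", "b.png", "c.png", "d.png", "e.png"])
```

Saving the findings before each clear matters because, as the best practices note, analyses should be documented before images are removed from context.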
Troubleshooting
Worker can't see the image
Verify that:
- The image vision tool is enabled for the worker
- The image uploaded successfully (check for upload confirmation)
- The image format is supported
- The file isn’t corrupted
If all of these check out, try re-uploading the image.
Can't upload more images
You’ve likely hit the 3-image limit:
- Ask the worker to clear images from context
- Wait for confirmation that images are cleared
- Upload new images
- Consider starting a new conversation for fresh context
Analysis is too vague or generic
Improve your requests:
- Ask more specific questions
- Provide context about what you’re looking for
- Break down your analysis into multiple focused questions
- Specify the type of details you want (colors, composition, objects, etc.)
Worker describes wrong image
Check which image your question refers to:
- If multiple images are uploaded, specify “the first image” or “the image showing [subject]”
- Consider clearing old images to avoid confusion
Text in image not recognized
Try:
- Uploading a higher resolution version
- Ensuring text is clearly visible and not too small
- Checking that text isn’t obscured or distorted
- Explicitly asking “extract the text from this image”
Clear images command not working
Ensure that:
- The “Clear images from context” capability is enabled
- You’re using clear phrasing like “clear the images” or “remove images from context”
If it still doesn’t work, try the exact phrase: “Can you please clear the images from the context?”
Privacy and Security
Summary
You’ve successfully learned how to:
- Enable and configure the image vision tool for your workers
- Upload and analyze images within conversations
- Manage the 3-image limit using the clear images capability
- Request different types of image analysis and descriptions
- Combine image vision with other Utari tools for enhanced workflows
- Apply best practices for effective image analysis