Classes for working with vision models.
Classes
GeneratedImage
GeneratedImage
(
image_bytes
:
bytes
,
generation_parameters
:
typing
.
Dict
[
str
,
typing
.
Any
]
)
Generated image.
Image
Image
(
image_bytes
:
bytes
)
Image.
ImageCaptioningModel
ImageCaptioningModel
(
model_id
:
str
,
endpoint_name
:
typing
.
Optional
[
str
]
=
None
)
Generates captions from image.
Examples::
model = ImageCaptioningModel.from_pretrained("imagetext@001")
image = Image.load_from_file("image.png")
captions = model.get_captions(
image=image,
# Optional:
number_of_results=1,
language="en",
)
ImageGenerationModel
ImageGenerationModel
(
model_id
:
str
,
endpoint_name
:
typing
.
Optional
[
str
]
=
None
)
Generates images from text prompt.
Examples::
model = ImageGenerationModel.from_pretrained("imagegeneration@002")
response = model.generate_images(
prompt="Astronaut riding a horse",
# Optional:
number_of_images=1,
seed=0,
)
response[0].show()
response[0].save("image1.png")
ImageGenerationResponse
ImageGenerationResponse
(
images
:
typing
.
List
[
GeneratedImage
])
Image generation response.
ImageQnAModel
ImageQnAModel
(
model_id
:
str
,
endpoint_name
:
typing
.
Optional
[
str
]
=
None
)
Answers questions about an image.
Examples::
model = ImageQnAModel.from_pretrained("imagetext@001")
image = Image.load_from_file("image.png")
answers = model.ask_question(
image=image,
question="What color is the car in this image?",
# Optional:
number_of_results=1,
)
MultiModalEmbeddingModel
MultiModalEmbeddingModel
(
model_id
:
str
,
endpoint_name
:
typing
.
Optional
[
str
]
=
None
)
Generates embedding vectors from images.
Examples::
model = MultiModalEmbeddingModel.from_pretrained("multimodalembedding@001")
image = Image.load_from_file("image.png")
embeddings = model.get_embeddings(
image=image,
contextual_text="Hello world",
)
image_embedding = embeddings.image_embedding
text_embedding = embeddings.text_embedding
MultiModalEmbeddingResponse
MultiModalEmbeddingResponse
(
_prediction_response
:
typing
.
Any
,
image_embedding
:
typing
.
Optional
[
typing
.
List
[
float
]]
=
None
,
text_embedding
:
typing
.
Optional
[
typing
.
List
[
float
]]
=
None
,
)
The image embedding response.

