Date: 2023-07-28

The Gist

Headlines about AI advances dominate the daily business news, with the biggest announcements usually landing at the start or close of the business week. Google Bard’s latest move to take the lead in the AI race is to allow image uploads alongside your prompt text. The feature gives you a new way to apply imagination and creativity when constructing a prompt.

How Uploading Images in Google Bard Works

To craft a prompt using an image in Google Bard, write your prompt as you normally would, then click the plus sign next to the query window and select the image to upload. Images can be in JPEG or PNG format.
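Because only certain formats are accepted, it can help to confirm that a file really is a JPEG or PNG before uploading it. The following is a minimal Python sketch of one way to do that by reading the file's leading bytes. It is not part of Bard, and the function names `detect_image_format` and `is_uploadable` are hypothetical helpers made up for this illustration.

```python
from typing import Optional

# Leading bytes ("magic numbers") that identify each format.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SIGNATURE = b"\xff\xd8\xff"


def detect_image_format(data: bytes) -> Optional[str]:
    """Return "png", "jpeg", or None based on the leading bytes."""
    if data.startswith(PNG_SIGNATURE):
        return "png"
    if data.startswith(JPEG_SIGNATURE):
        return "jpeg"
    return None


def is_uploadable(path: str) -> bool:
    """Hypothetical helper: True if the file looks like a JPEG or PNG."""
    with open(path, "rb") as image_file:
        header = image_file.read(8)  # 8 bytes covers the longer PNG signature
    return detect_image_format(header) is not None
```

Checking the file contents rather than the extension catches files that were simply renamed, such as a GIF saved with a .png name.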

Bard interprets image details based on its understanding of the prompt and of the shapes in the image. Bard’s training data incorporates various aspects of Google’s vast image sources, such as Google Lens. Bard will describe what it sees in the image, answer questions about it, and even recognize specific people's faces.

Large language model (LLM) output still calls for caution because it is not deterministic. Despite the innovation and LLMs’ ability to produce fluent answers to your questions, marketers must remember that models are still performing a statistical calculation over tokens, the word fragments that make up a prompt or prompt chain. Some answers will therefore still require verification against the user’s own knowledge and experience with the prompt subject.
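To make the "statistical calculation" point concrete, here is a toy Python sketch of how a model might pick its next token. The probabilities are invented for illustration and do not come from Bard or any real model. The function names are likewise assumptions. Sampling from the distribution can give a different answer on each run, while always taking the most likely token cannot.

```python
import random

# Made-up next-token probabilities, for illustration only.
NEXT_TOKEN_PROBS = {"red": 0.5, "white": 0.3, "sparkling": 0.2}


def sample_token(probs: dict, rng: random.Random) -> str:
    """Pick one token at random, weighted by its probability."""
    tokens = list(probs)
    weights = list(probs.values())
    return rng.choices(tokens, weights=weights, k=1)[0]


def greedy_token(probs: dict) -> str:
    """Always pick the single most likely token (deterministic)."""
    return max(probs, key=probs.get)


if __name__ == "__main__":
    rng = random.Random()  # unseeded, so repeated runs may differ
    print([sample_token(NEXT_TOKEN_PROBS, rng) for _ in range(5)])
    print(greedy_token(NEXT_TOKEN_PROBS))
```

Running the script several times should show the sampled list changing while the greedy choice stays the same, which is why the same prompt can yield different answers.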

For example, I shared an image of wine bottles and asked Bard to identify the brands. The bottles and their labels were turned in various directions but were still visible. Bard got a few brands right, like Kendall-Jackson and Chateau Ste. Michelle, but also suggested a few that were not in the image. It also overlooked Coppola, whose label uses a large font.
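A check like this can be made systematic by comparing what the model reports against what you know is in the image. The Python sketch below is one simple way to do that, assuming you can list the true contents yourself. The function name `check_brands` is hypothetical, and "Brand X" is a placeholder for the brands Bard named that were not in the image.

```python
def check_brands(reported, actual):
    """Sort reported brands into correct, not-in-image, and missed."""
    reported_set = {brand.strip().lower() for brand in reported}
    actual_set = {brand.strip().lower() for brand in actual}
    return {
        "correct": sorted(reported_set & actual_set),
        "not_in_image": sorted(reported_set - actual_set),
        "missed": sorted(actual_set - reported_set),
    }


if __name__ == "__main__":
    # "Brand X" is a placeholder, not a brand Bard actually returned.
    reported = ["Kendall-Jackson", "Chateau Ste. Michelle", "Brand X"]
    actual = ["Kendall-Jackson", "Chateau Ste. Michelle", "Coppola"]
    print(check_brands(reported, actual))
```

Lowercasing and trimming the names before comparing keeps small formatting differences from being counted as errors.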

Google Bard Image Query

Despite glitches like this, I think uploading image files in Bard will serve users well. It is a major step forward from a May update in which Bard began adding images from Google Search to its responses. In my wine image example, Bard did return an image of each wine it identified. Bard also offered slightly different variations of its response.

Google Bard Image Query Result

All of this shows how insightful an image analysis can be. Image uploads change the way a model consumes data because the analysis of the image is included in the model’s token considerations.

The Other Latest Google Bard Updates

The image upload feature was introduced as part of a series of enhancements to Google Bard that were unveiled in July. The feature that most complements image uploads is the option to tailor Bard's responses with a simple click. Users can tap to modify the length of the response or calibrate the tone, enabling a shift from a more professional to a more casual demeanor.

Another noteworthy addition is the text-to-speech functionality, which reads the response aloud in tandem with the displayed text. It supports more than 39 languages in addition to US English, including Hindi and Spanish.

Having responses read aloud offers an alternative way to absorb the information, while the multilingual capability can help Bard users who collaborate with partners working in other languages. Additionally, shareable links for Bard conversations mean that these latest updates can be put to use together across diverse scenarios.

What Google Bard Image Uploads Mean for Customer Experience

Bard's newly acquired capacity to interpret images is a pivotal update that marketers should closely monitor. The feature gives Bard's model an additional medium to reference within a prompt, augmenting the user experience through a multimodal approach. "Multimodal" refers to combining several media to shape a prompt's context. Because the model draws on more than one mode to generate a response, it can deliver more nuanced information for decision-making and insights.

The ability to analyze an image can also be useful in advanced analytics that support customer experiences. An analyst could construct a better story about the data using images that reflect customer activity or activities in support of a customer experience. The potential outcomes would push boundaries in delivering insights from exploratory data analysis and cohort analysis.

The image upload gives Google an advantage over OpenAI in the AI wars, though as you can probably imagine, any advantage in the fast-moving world of AI may be a short-lived one. ChatGPT, while not supporting direct image upload, does permit users to describe an image in the prompt or input a link which ChatGPT can incorporate into your prompt. Regardless of the approach, ChatGPT is capable of generating text-based dialogues or descriptions.

The capacity to be multimodal is a desirable feature among AI service providers, so it should not be long before OpenAI decides to include uploaded images among its own multimodal capabilities.

In the meantime, Google Bard users should expect to discover new and exciting ways to prompt AI with images. Those users should also expect a significant boost in the way AI enhances their tasks.