Dan Huby, CTO at Montala, ResourceSpace’s parent company, has recently contributed a feature article exploring the multimodal capabilities of OpenAI’s latest version of GPT-4, specifically its image input. Following on from his previous analysis of the platform, Dan puts the system through its paces by prompting it to generate text from a given image and using that text to populate a standard descriptive metadata field. Other tests include generating a comma-separated list of applicable keywords, or tags, of the kind you would use for categorisation, and creating some impressive marketing copy. The results are compared with both the previous version of OpenAI’s platform and Google’s Vision API.

“It’s quite remarkable how the AI not only generated a detailed description but also coined a catchy slogan: ‘Don’t just float—flaunt.’ While I would advise marketing teams to refine and customise this AI-generated content to suit their specific needs, it undeniably serves as a solid foundation for crafting engaging real-world copy. This level of creativity and precision from an AI system is not only impressive but also indicative of its potential as a valuable tool in content creation. It’s evident that GPT-4’s multimodal capabilities bring a substantial new tool to the world of DAM systems. This isn’t just about technology for technology’s sake; it’s about practical, real-world applications. The accuracy in image description, keyword suggestion, and even the creation of marketing copy is impressive, but what’s even more exciting is the potential for future developments.”

