ExaWizards Has Developed exaBase Visual QA, a Generative AI Model That Can Describe Images
– Accurate interpretation of hazards and anomalies in images, pre-trained and ready for commercial use –
ExaWizards Inc. (Headquarters: Minato-ku, Tokyo; Representative Director & President: Makoto Haruta; hereafter, “ExaWizards”) announces the development of “exaBase Visual QA,” a generative AI model that can explain the content of an image. exaBase Visual QA can interpret features such as dangers identified in images and generate explanatory text. It can also be used in commercial applications such as consumer services.
ExaWizards provides AI-based services and products to improve productivity and solve social issues.
☑︎“exaBase Visual QA” features
Generative AI models that recognize images have difficulty accurately generating text about the dangers present in complex images. ExaWizards has therefore trained generative AI models on what people pay attention to when they look at an image.
As a result, the model can accurately interpret danger and discomfort in images, which humans recognize intuitively. Users obtain explanatory text by interacting, as they would with a chatbot, with a system that incorporates exaBase Visual QA.
When a user types “Are there any potential dangers?” about an image such as the one below, the system can respond with sentences such as “The worker will fall if he loses his balance or the scaffolding collapses. The worker could be injured if the power tool he is using to connect metal rods slips. Appropriate safety precautions should be taken.” The system generates long responses, which can be summarized using ChatGPT to focus on the important parts.
Prototype of exaBase Visual QA (input and output are currently in English only, but can be translated into other languages with the translation function).
We confirmed that exaBase Visual QA achieves interpretation accuracy up to 10% higher than other commercially available models. The model is also smaller, and faster at generation and inference, than models of similar accuracy.
exaBase Visual QA is based on open-source generative AI models, which we have further trained so that it is ready to use. Accuracy in specific fields can be improved through “fine-tuning,” that is, additional training and adjustment using data from the individual field.
☑︎Applicable fields – Applicable across a wide range of fields and usable as a classification model –
exaBase Visual QA can be asked any question and can be applied to a wide range of image domains. Natural images (those not artificially generated) can be interpreted with high accuracy. It can also be used as a “classification model” to sort data based on interpreted semantic content.
*Determining hazards at construction sites and other workplaces.
*Assessing places where there is a variety of human activity, such as daycare centers and schools.
*Identifying and analyzing the location and content of malfunctions in various objects.
*Identifying incidents based on images from cameras, sensors, etc.
*Compressing data by converting a large amount of video into text and extracting specific scenes.
*Developing classification models to set pass/fail criteria on product lines, etc.
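As a rough illustration of the “classification model” use described above, the sketch below sorts images into categories based on the text generated for each one. It is only a sketch under stated assumptions: the `describe` callable stands in for whatever system returns exaBase Visual QA's explanatory text (no API is described in this release), and the keyword table `HAZARD_KEYWORDS` and the label names are hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical keyword rules for illustration only; not part of exaBase Visual QA.
HAZARD_KEYWORDS: Dict[str, List[str]] = {
    "fall": ["fall", "scaffolding", "balance"],
    "tool": ["power tool", "slips"],
}


def classify_description(text: str) -> List[str]:
    """Return every label whose keywords appear in the generated description."""
    lowered = text.lower()
    labels = [
        label
        for label, words in HAZARD_KEYWORDS.items()
        if any(word in lowered for word in words)
    ]
    return labels or ["no_hazard_found"]


def sort_images(
    image_ids: List[str], describe: Callable[[str], str]
) -> Dict[str, List[str]]:
    """Group image IDs by the labels derived from their descriptions.

    `describe` is an assumed stand-in that maps an image ID to generated text.
    """
    buckets: Dict[str, List[str]] = {}
    for image_id in image_ids:
        for label in classify_description(describe(image_id)):
            buckets.setdefault(label, []).append(image_id)
    return buckets


if __name__ == "__main__":
    # Canned descriptions in place of real model output.
    canned = {
        "site_01.jpg": "The worker will fall if he loses his balance.",
        "site_02.jpg": "Nothing unusual is visible.",
    }
    print(sort_images(list(canned), canned.__getitem__))
```

In practice, simple substring matching would likely be replaced by a second question to the model or by a fine-tuned classifier; the point here is only that generated text can serve as the sorting key.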
☑︎Technology Offerings – Available for video as well as still images –
exaBase Visual QA is a generative AI model that can be incorporated into a variety of software and systems. It is currently available for PoC (Proof of Concept) use. Initially, the model will be used for still images, but it can also be used for video.
[ExaWizards Corporate Profile]
Company name: ExaWizards Inc.
Location : 21F, Shiodome Sumitomo Building, 1-9-2 Higashi-Shinbashi, Minato-ku, Tokyo
Established : February 2016
Representative : Makoto Haruta, Representative Director & President
Business : Industrial innovation and resolution of social issues via AI service development
URL : https://exawizards.com/
<Contact for public relations>
E-mail address of the Public Relations Division of ExaWizards Inc.: publicrelations@exwzd.com