We present a new product based on computer vision and artificial intelligence, designed to simplify customer service in online stores.

With the launch of ultramarinosoctavio.com, we considered how we could improve the management of potential order issues. To avoid overloading Octavio's customer service department and make it easier for users to report problems, we have implemented a solution that allows customers to record videos, audio messages, or send texts explaining the issue with their order.

AI-Assisted Customer Support

Instead of sending issues directly to the corresponding department, our system first processes the video and audio recordings, extracting transcripts and selecting the most relevant images using computer vision. A large language model (LLM) then analyzes the root cause of the problem and generates a comprehensive, standardized report with key points and potential solutions, helping the team respond much faster.

In this case, we leverage the multimodal capabilities of GPT-4o, which can interpret both the content of the images and the information in the order and in the transcripts of the user's messages to provide a reliable response.

The final result is an issue report that integrates into the tool's order flow. It includes the information provided by the user, an AI-generated analysis, and a set of actions recommended by the model.

The information sent to support is completed and refined progressively through a series of steps. One advantage of this design is that even if the AI services are unavailable, the tool keeps working and sends the support department all the information gathered up to that point.

Additionally, store managers can respond in a similar manner by recording an audio message that will be transcribed and automatically enhanced using Artificial Intelligence to create a clearer and more detailed response for the customer.

Why doesn't the tool respond directly to customers?

This solution does not respond directly to users because its purpose is to assist the customer support team. Keeping a person in the loop also prevents errors that could damage the business's reputation. A notable example is the logistics company DPD, whose support chatbot ended up swearing at a customer and criticizing the company itself.

As we explain in our course on ChatGPT and Artificial Intelligence for businesses, relying on AI for unsupervised, automated customer responses is still a risky bet today.

With our approach, we analyze user issues, enrich them with relevant context, and pre-generate possible solutions so the support team can respond much more efficiently. If the AI makes a mistake, the support staff can simply discard the unsuitable suggestions.

Roadmap

We have developed a fully functional MVP that meets our client's current needs. This same tool can be installed, with some intervention on our part, in any other business or online store based on WooCommerce.

In the future, we would like to turn it into a standalone WordPress plugin, so the user only needs to download, activate, and use it autonomously without our intervention.

If you are interested in this product for your business or want to integrate automation solutions using Artificial Intelligence, feel free to contact us. Tell us your problems or needs, and we will be happy to help you.

Technical solution

Warning: this section is for tech enthusiasts. If you are passionate about the technical field, go ahead, keep reading, and enjoy. But if coding and technology are not your thing, don't worry: better skip this part, because you might end up feeling as lost as a developer in a gym 🤓.

On the front end, we use our favorite JavaScript framework, Angular (version 19 is fantastic), along with TailwindCSS. This combination allowed us to develop a Web Component under a microfrontend approach.

A microfrontend is an architecture that enables the division of the frontend into independent and reusable modules, each developed, deployed, and maintained separately. This facilitates integration into heterogeneous systems.

This component integrates seamlessly into the order pages of WooCommerce, providing interactivity without altering the original structure or presentation of the platform.

To access the camera and microphone on the user's device, we use WebRTC, which lets customers record video and audio directly in the browser without installing additional software. We also isolate the component from the host page using Shadow DOM, which encapsulates its markup and styles and avoids conflicts with other page elements.
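As a rough illustration of in-browser recording and Shadow DOM encapsulation, here is a minimal sketch that uses plain browser APIs (`getUserMedia`, `MediaRecorder`, custom elements) rather than the Angular code of the product. The tag name `issue-reporter`, the `recorded` and `record-error` events, the 30-second limit, and the list of preferred formats are all assumptions made for the example.

```typescript
// Preferred container formats, most to least preferred (an assumption; tune per target browsers).
const CANDIDATE_TYPES: string[] = ["video/webm;codecs=vp9,opus", "video/webm", "video/mp4"];

// Returns the first candidate the predicate accepts, or undefined to let the browser choose.
function pickMimeType(isSupported: (type: string) => boolean): string | undefined {
  for (const type of CANDIDATE_TYPES) {
    if (isSupported(type)) return type;
  }
  return undefined;
}

// Records camera and microphone for at most maxMs milliseconds and resolves with the recording.
async function recordVideo(maxMs: number): Promise<Blob> {
  const stream = await navigator.mediaDevices.getUserMedia({ video: true, audio: true });
  const mimeType = pickMimeType((type) => MediaRecorder.isTypeSupported(type));
  const recorder = mimeType ? new MediaRecorder(stream, { mimeType }) : new MediaRecorder(stream);
  const chunks: Blob[] = [];
  recorder.ondataavailable = (event) => {
    if (event.data.size > 0) chunks.push(event.data);
  };
  const finished = new Promise<Blob>((resolve) => {
    recorder.onstop = () => {
      stream.getTracks().forEach((track) => track.stop()); // release camera and microphone
      resolve(new Blob(chunks, { type: recorder.mimeType }));
    };
  });
  recorder.start();
  setTimeout(() => {
    if (recorder.state !== "inactive") recorder.stop();
  }, maxMs);
  return finished;
}

// Registers a custom element that keeps its markup and styles inside a shadow root.
// The tag name and the event names are hypothetical.
function defineIssueReporter(tagName: string = "issue-reporter"): void {
  if (customElements.get(tagName)) return; // already registered
  class IssueReporter extends HTMLElement {
    constructor() {
      super();
      const root = this.attachShadow({ mode: "open" });
      root.innerHTML = "<style>button { font: inherit; }</style><button>Record issue</button>";
      root.querySelector("button")?.addEventListener("click", async () => {
        try {
          const recording = await recordVideo(30000);
          this.dispatchEvent(new CustomEvent("recorded", { detail: recording }));
        } catch (error) {
          // For example, the user denied camera or microphone permission.
          this.dispatchEvent(new CustomEvent("record-error", { detail: error }));
        }
      });
    }
  }
  customElements.define(tagName, IssueReporter);
}
```

Styles declared inside the shadow root do not leak into the store's page, and the page's own selectors do not reach the button. In an Angular project, comparable encapsulation is typically obtained with `ViewEncapsulation.ShadowDom` and `createCustomElement` from `@angular/elements`; the text above does not say which mechanism the product uses.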

On the backend, we extended WordPress's REST API with a custom endpoint. This endpoint receives the user's submission (video, audio, or text), stores it, and hands it to an asynchronous task flow that runs the artificial intelligence steps. Because the heavy work happens asynchronously, the request does not block the server while those steps complete. Internally, we communicate with the OpenAI API to handle all transcription and AI inference tasks.
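The "store first, process later" pattern can be sketched in a few lines. This is an in-memory TypeScript illustration of the idea, not the WordPress code: the names `IssueStore`, `StoredIssue`, and `Step` are hypothetical, and the text does not say which task mechanism the real backend uses. The sketch also shows how a failing AI step can leave the stored submission intact, in line with the behavior described earlier for when AI services are unavailable.

```typescript
type Status = "queued" | "processing" | "done";

interface StoredIssue {
  id: number;
  message: string; // what the customer sent
  status: Status;
  enrichment: Record<string, string>; // results added by completed steps
  failedSteps: string[]; // steps that threw and were skipped
}

// A step reads the issue so far and returns the text it contributes.
type Step = { name: string; run: (issue: StoredIssue) => Promise<string> };

class IssueStore {
  private issues = new Map<number, StoredIssue>();
  private nextId = 1;
  private steps: Step[];

  constructor(steps: Step[]) {
    this.steps = steps;
  }

  // What an endpoint handler would call: persist, schedule, and return immediately.
  submit(message: string): number {
    const issue: StoredIssue = {
      id: this.nextId++,
      message,
      status: "queued",
      enrichment: {},
      failedSteps: [],
    };
    this.issues.set(issue.id, issue);
    // Defer the slow work so the caller is not kept waiting.
    setTimeout(() => {
      void this.process(issue);
    }, 0);
    return issue.id;
  }

  get(id: number): StoredIssue | undefined {
    return this.issues.get(id);
  }

  private async process(issue: StoredIssue): Promise<void> {
    issue.status = "processing";
    for (const step of this.steps) {
      try {
        issue.enrichment[step.name] = await step.run(issue);
      } catch {
        // A failed step is recorded and skipped; the customer's message is still available.
        issue.failedSteps.push(step.name);
      }
    }
    issue.status = "done";
  }
}
```

With this shape, `submit` returns an identifier while the issue is still `queued`, and support can read whatever has been gathered so far at any point.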

Finally, we leave you with a flowchart showing the various steps the tool takes to generate the issue report.

flowchart TB
    A[Start: User reports issue] --> C{Is it a video?}
    C -- Yes --> D[Extract audio<br>with FFmpeg]
    C -- No --> E{Is it audio?}
    D --> F["Transcribe with Whisper (video)"]
    F --> G[Use GPT-4o to extract<br>relevant frames from video]
    G --> H[Analyze frames<br>with Computer Vision]
    E -- Yes --> I["Transcribe with Whisper (audio)"]
    E -- No --> J{Did they upload photos?}
    J -- Yes --> K[Analyze images with Computer Vision]
    J -- No --> M
    H --> M[Combine transcript or message,<br>images, and order information]
    I --> M
    K --> M
    M --> N[GPT-4o Generates final report]
    N --> O[Customer support<br>reviews and adjusts]
    O --> P[Response sent to customer]
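
The branching in the flowchart can also be read as a small routing function. The sketch below only decides which steps apply to a given submission; the `Submission` type and the step names are hypothetical labels chosen for the example, and the actual calls to FFmpeg, Whisper, or GPT-4o are left out.

```typescript
// What the customer submitted. Photos are only checked when there is no video or audio,
// mirroring the order of the questions in the flowchart.
type Submission =
  | { kind: "video" }
  | { kind: "audio" }
  | { kind: "text"; photoCount: number };

// Returns the ordered list of steps the flowchart prescribes for a submission.
function planSteps(submission: Submission): string[] {
  const steps: string[] = [];
  if (submission.kind === "video") {
    steps.push("extract-audio", "transcribe", "select-frames", "analyze-images");
  } else if (submission.kind === "audio") {
    steps.push("transcribe");
  } else if (submission.photoCount > 0) {
    steps.push("analyze-images");
  }
  // Every path ends the same way: combine the context, generate the report, human review.
  steps.push("combine-context", "generate-report", "support-review");
  return steps;
}
```

A plain text message without photos skips straight to the three common steps, while a video goes through all seven.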