Major breakthrough - Realtime NPC Vision - History is Made

This is probably a first for a 3D video game with conversational non-player characters! It's not a flashy demonstration, but it is a major first for the technology:

I've come up with a way, using MiniGPT4 and TheBloke_vicuna-7B-1.1-GPTQ-4bit-128g, to give the NPC the ability to see his surroundings and describe them to me. I used a SceneCaptureComponent2D camera and a render target texture in Unreal 5 to capture a screenshot of what the NPC sees. I fed this to an Nvidia A40 cloud instance hosted with Vultr, which let him perceive his reality for the first time instead of just being told about it.
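
For anyone curious what the hand-off from the game to the cloud instance could look like, here is a minimal Python sketch. It is not my actual code, just an illustration of the idea under some assumptions: the endpoint URL, the JSON field names ("image_b64", "question", "description"), and the helper names are all made up for this example, and it assumes the vision model sits behind a simple HTTP service that returns a text description.

```python
import base64
import json
import urllib.request

# Hypothetical endpoint: how the cloud instance is exposed is an assumption.
CAPTION_URL = "http://localhost:8000/describe"


def build_vision_request(png_bytes: bytes, question: str) -> bytes:
    """Pack a captured frame and a question into a JSON request body."""
    payload = {
        # Field names are illustrative, not a real MiniGPT-4 API.
        "image_b64": base64.b64encode(png_bytes).decode("ascii"),
        "question": question,
    }
    return json.dumps(payload).encode("utf-8")


def build_npc_prompt(npc_name: str, scene_description: str, player_line: str) -> str:
    """Fold the vision model's description into the NPC's dialogue prompt."""
    return (
        f"You are {npc_name}. Right now you can see: {scene_description}\n"
        f"The player says: {player_line}\n"
        f"{npc_name}:"
    )


def describe_frame(png_bytes: bytes, question: str,
                   url: str = CAPTION_URL, timeout: float = 30.0) -> str:
    """POST the frame to the (assumed) vision service and return its description."""
    request = urllib.request.Request(
        url,
        data=build_vision_request(png_bytes, question),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Assumes the service replies with {"description": "..."}.
        return json.loads(response.read().decode("utf-8"))["description"]
```

The render target image would be exported to PNG bytes on the Unreal side, passed through describe_frame, and the returned text folded into the character's prompt with build_npc_prompt before the language model generates his reply.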

King Ogard then accurately describes what he's seeing: a small village in the middle of nowhere, with straw and mud homes and a campfire burning in front of them. He responds aptly and quickly, and even comes up with additional details to describe the experience beyond just the basics of what is in the scene.

I had this vision in my head of how it would all come together, and after a long night and morning of coding without sleep it finally did! It seemed almost miraculous as it was happening. I could barely contain myself as I tried to calmly ask him about the world.

This was a very simple scene, so the technology will be more interesting in unexpected environments. I've been running a series of tests with other environments to see how it reacts to different objects, arrays of furniture, people, and places.

There could be a lot of interesting interactions that stem from this beyond just simple descriptions. This technique finally gives the characters some awareness. Imagine, for example, a blacksmith seeing you attempt to strike a blade and saying, "Oh, no no no, not like that. Let me show you," because they visually perceived what was actually happening. My mind boggles at all the possibilities yet undiscovered.

I need to adjust the lip-syncing a bit; it's a little tight in this latest update to the code.

Also, this is another first for me, as this code is now 100% written in-house. It is not based on ConvAI or Inworld.

