
Steam Square Enix has revived "The Portopia Serial Murder Case" as an AI Language Processing tech demo

Krvavi Abadas

Mr. Archivist



on its own, this is an interesting idea. but then you remember that the original game hasn't been re-released (much less localized) since a mobile phone port last updated in 2006.

particularly bad as it's historically significant, being one of the first projects created by Dragon Quest creator Yuji Horii and influencing major developers like Hideo Kojima* and Eiji Aonuma. not even having an untranslated copy of, say, the Famicom port on Virtual Console or Nintendo Switch Online is truly baffling.
reportedly, this is using the original scenario for the game. but the fact it is a tech demo leads me to believe it's only a small portion of it.

alongside the language processing, which is basically an enhanced text parser (the original computer versions of the game actually did use one of these, but later ports like the Famicom version replaced it with proper adventure game commands) that uses AI to match what you wrote to a working command, it also has Text-To-Speech, Speech-To-Text, and ChatGPT-like text generation (removed in the public build to prevent it from generating offensive responses).

*Metal Gear Solid V actually includes a small amount of data from the PC-6001 port of Portopia on a cassette tape, though this likely wasn't an intentional easter egg, just Kojima pulling random "garbage" noise to create what sounds like classified intel, unaware that it could be easily decoded.
 
This is a cool concept, actually, that I think people are overly eager to dismiss because of the baggage associated with generative AI. Yeah, sure, full game eventually please, but ever since the game was originally developed over 40 years ago, the idea that you could interact with NPCs by typing full sentences has been an unreachable dream. We're finally close to making that a reality, in a way that I think genuinely lives up to the spirit of that era of adventure games.

That entire era ran on frustratingly specific keyword searches, which eventually got dropped in favor of selecting full sentences or distilling input down to a handful of common commands, but what this new demo is doing was the genuine goal of those early adventure games.

They probably have a whiteboard of "games to remake" somewhere at SE headquarters, and sometime in the middle of last year someone went "oh, this shit is a good match for Portopia." EDIT: This is more like the last few years of chatbot-type systems, so real development and exploration could have started years ago. This is word embedding/sentence similarity routing to pre-written answers.

EDIT: It looks like this is really just the full game, and not just a demo portion.


The official writeup about it is a good history of what they are going for. It isn't even going for generative answers; it is only using this to map player intent to the existing written responses.
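For anyone curious what "sentence similarity routing to pre-written answers" looks like in practice, here is a minimal sketch. A real system would use a learned sentence-embedding model; a bag-of-words vector and cosine similarity stand in here, and the sample sentences and responses are invented for illustration.

```python
# Toy sketch of sentence-similarity routing: the player's free-form input
# is matched against sample sentences, each tied to a pre-written response.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in for a real sentence-embedding model: a bag-of-words vector.
    return Counter(text.lower().strip(".!? ").split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

# Hypothetical sample sentence -> pre-written response pairs.
responses = {
    "show me the map": "Yasu spreads the map on the desk.",
    "ask yasu about the victim": "Yasu: 'Kouzou was a moneylender, boss.'",
}

def route(player_input: str) -> str:
    # Route to the response whose sample sentence is most similar.
    best = max(responses, key=lambda s: cosine(embed(s), embed(player_input)))
    return responses[best]

print(route("Could you show the map to me?"))
# -> "Yasu spreads the map on the desk."
```

The key point is that no text is generated: the model only decides which existing line to show.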

Reading the description... this... is a full remake? For free?



Boooooooooooooooooooooo!!!!!!!!!!! We hate this!!!!!!!!!
Why do you hate this?
 
reportedly, this is using the original scenario for the game. but the fact it is a tech demo leads me to believe it's only a small portion of it.

I wouldn't be worried about this actually, the full game is very short. I played through the fan patch of the Famicom version last year in one sitting. The short length+historical significance is probably why it was chosen as a free tech demo.

This use case of AI doesn't seem unethical to me since it's not generating content and they actually did create a localization for it, but I understand why people are wary.

I'll give it a go on Sunday. I loved the original and the genre it spawned. For me, I'm actually most interested in the music, because the version I played was mostly silent.
 
People really need to stop kneejerk reacting with BOOOOO IT SUCKS the second the word AI is uttered. Actually read what it's doing and how it's used. This is a perfectly fine example, and a really cool way to do adventure games that need player input beyond clicking on a thing or picking it from a menu. It feels far more natural.

Portopia itself is far more than just the originator of visual novels. It's probably one of the top 5 most influential games in Japan (along with everyone's favorite Xevious).
 
This seems like a relatively reasonable use case for natural language processing in games. From what I can tell, it's a completely ordinary adventure game where you have a set of actions you can do at any point, but a model of some kind is used to predict your action from free-form text input. Provided that the classification works, this helps avoid the pitfall of having to figure out exactly which verbs and nouns the game wants you to input, and should actually make the gameplay experience better.

I'm sure this has been brought to life because of the recent AI hype, but as long as they don't have the generative stuff enabled, this isn't anything particularly crazy and could have been done years ago with well-established methods. I'm sure more of that is coming as well, but this isn't it.

Edit: of course there's the possibility that they collect the player inputs (I didn't look into this), which would make the whole thing more vile.
 
That's one way for them to gather data and build upon their AI platform.

Do you get pissy whenever you interact with a chatbot or a call center that says "Please say what you would like to do"? Because that's the technology in use here.
Edit: of course there's the possibility that they collect the player inputs (I didn't look into this), which would make the whole thing more vile.

This type of chatbot is iterative. Usually, the developers will write the responses and a list of sample sentences that will route into those responses, but they will use what did and did not match successfully to update the lists of sample sentences that correspond to each response.
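That iterative loop can be sketched roughly as below. The intents, sample sentences, and 0.6 threshold are all hypothetical, and difflib's string similarity stands in for a real embedding model; the point is the structure: match against every sample sentence, fall back below the threshold, and log misses so writers can grow the sample lists.

```python
# Sketch of iterative chatbot routing: each response owns a list of sample
# sentences, and unmatched inputs are logged for writers to review later.
from difflib import SequenceMatcher

intents = {  # hypothetical intent -> sample sentences
    "GO_HARBOR": ["go to the harbor", "head down to the port"],
    "CALL_YASU": ["call yasu", "ask yasu to come here"],
}
THRESHOLD = 0.6
unmatched_log = []  # fed back to the writers to expand the sample lists

def similarity(a: str, b: str) -> float:
    # Stand-in for embedding similarity: character-level string similarity.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def route(player_input: str):
    intent, score = max(
        ((name, max(similarity(player_input, s) for s in samples))
         for name, samples in intents.items()),
        key=lambda pair: pair[1],
    )
    if score < THRESHOLD:
        unmatched_log.append(player_input)  # a miss: add new samples later
        return None
    return intent

print(route("go to the port"))  # close enough to a GO_HARBOR sample
```

Over time the `unmatched_log` is what makes the system improve: the model doesn't learn on its own, the sample lists get edited.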
 
Seems like a good use of the technology, but pretty limited if it's just mapping your commands to the existing commands programmed into a 40-year-old game.
and ChatGPT-like text generation (removed in the public build to prevent it from generating offensive responses).
Now THAT sounds like a big use of the technology that could demand building a Tech Preview around, but uhhh, only the guys in the office get to see it.
 
Seems like a good use of the technology, but pretty limited if it's just mapping your commands to the existing commands programmed into a 40-year-old game.

Square Enix seems so interested in making modern horror/mystery adventure games that make no financial sense that I can easily see them already at work on using this in a Portopia 2 or the inevitable Nameless Game reboot.
 
I'm scared of AI not because of the tech itself, but because I don't trust the people behind it.

But separating that fear from myself, this is cool to me. This is something I loved about old PC adventure games: the crazy text parser that never worked because it just couldn't account for every reply you made. This addresses that issue in a way that feels radical to me.
 
Bethesda will be all over that
AI assistance in huge RPGs like Bethesda games seems pretty inevitable in the future, and it would definitely be right up their alley with the whole "Radiant Quests" thing they started way back in Oblivion. I can see it happening.
 
15GB of storage required! Like, 1GB for the game, 14GB for the models!

I assume that there are a variety of sizes of models you can select in the options, so it is probably something like 6GB Large, 3GB Medium, 1.5GB small, etc.

When you run one of the OpenAI models on your PC, you specify which model to use, and it will download and cache just that one. This looks like it is grabbing all of the model sizes upfront, which is understandable given how people normally expect Steam downloads to work.

I wonder if you are going to be able to specify the text model separately from the voice model. I imagine that the voice model is a way larger chunk of storage than the text model.
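If the guess above is right, the selection logic could be as simple as picking the largest model that fits in available VRAM. A minimal sketch, using the approximate Whisper VRAM figures quoted later in this thread:

```python
# Pick the largest speech model that fits the player's VRAM budget.
# Figures are the approximate Whisper requirements discussed in this thread.
WHISPER_VRAM_GB = {"base": 1, "small": 2, "medium": 5, "large": 10}

def pick_model(vram_gb):
    fitting = [m for m, need in WHISPER_VRAM_GB.items() if need <= vram_gb]
    # Return the biggest model that fits, or None if nothing does.
    return max(fitting, key=WHISPER_VRAM_GB.get) if fitting else None

print(pick_model(8))  # an 8 GB card tops out at "medium"
```

A shipped game would presumably also probe the GPU at runtime rather than ask the player, but the tradeoff is the same.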
 
I assume that there are a variety of sizes of models you can select in the options, so it is probably something like 6GB Large, 3GB Medium, 1.5GB small, etc.

When you run one of the OpenAI models from your PC, you specify which model to use, and it will download and cache just that one. This looks like it is just grabbing all of the sizes of model upfront, which is understandable given how people normally expect Steam downloads to work.
Why do that for a game, rather than the developer choosing the one they've decided works well?
 
Why do that for a game, rather than the developer choosing the one they've decided works well?

Because they have wildly different VRAM requirements. Bigger is generally better, but not everybody has 8GB of VRAM, and I don't think it's sane to make a 40-year-old adventure game require a video card like that. This cutting-edge approach has system requirements that tie into gameplay in a way PC gamers have never seen before, so there's no good analogy, and intuition about how system requirements should work breaks down.

As a tech demo, it makes sense that they want to demonstrate the strengths of different model sizes. In a more commercially focused release, I think they are still likely to ship several models and let the player decide. Being able to play the entire game via microphone is really nifty but has such high technical requirements that I don't see developers hard-gating to just the absolute best available model.

It is possible that, eventually, games using voice or text commands could require always-online, but for now they want to demonstrate what can be done on commercial gaming hardware, which is probably the main reason why they are putting this approach out now rather than four years ago.

As an example, here are the VRAM requirements (like, hard hard hard requirements, no fudging, no reduced performance if you miss) for Whisper, which is OpenAI's voice-to-text model.

Size     Required VRAM
base     ~1 GB
small    ~2 GB
medium   ~5 GB
large    ~10 GB

The GTX 960 that is the minimum spec for this game could, at best, only run the fourth best transcription model (if they were using OpenAI Whisper, which they aren't). The good models get way better results but have a requirement that is 10x higher.

EDIT: After some more thought, I guess one of the key skills for game developers in this space is going to be getting the most mileage out of the smaller models, part of which will be about training the models on more fitting data to squeeze the most out of each GB of VRAM for these specific use cases. Just relying on bigger and better models is not a sustainable strategy.
 
Because they have wildly different VRAM requirements. Bigger is generally better, but not everybody has 8GB of VRAM, and I don't think it's sane to make a 40-year-old adventure game require a video card like that. This cutting-edge approach has system requirements that tie into gameplay in a way PC gamers have never seen before, so there's no good analogy, and intuition about how system requirements should work breaks down.

As a tech demo, it makes sense that they want to demonstrate the strengths of different model sizes. In a more commercially focused release, I think they are still likely to ship several models and let the player decide. Being able to play the entire game via microphone is really nifty but has such high technical requirements that I don't see developers hard-gating to just the absolute best available model.

It is possible that, eventually, games using voice or text commands could require always-online, but for now they want to demonstrate what can be done on commercial gaming hardware, which is probably the main reason why they are putting this approach out now rather than four years ago.

As an example, here are the VRAM requirements (like, hard hard hard requirements, no fudging, no reduced performance if you miss) for Whisper, which is OpenAI's voice-to-text model.

Size     Required VRAM
base     ~1 GB
small    ~2 GB
medium   ~5 GB
large    ~10 GB

The GTX 960 that is the minimum spec for this game could, at best, only run the fourth best transcription model (if they were using OpenAI Whisper, which they aren't). The good models get way better results but have a requirement that is 10x higher.

EDIT: After some more thought, I guess one of the key skills for game developers in this space is going to be getting the most mileage out of the smaller models, part of which will be about training the models on more fitting data to squeeze the most out of each GB of VRAM for these specific use cases. Just relying on bigger and better models is not a sustainable strategy.

FTR there have also been a lot of advancements in how to split models across VRAM and RAM/disk over the last couple of weeks. It's now possible to run larger models with less VRAM than the full model would need, at some performance hit.

With that said, yeah, everything seems to indicate just making models bigger won't actually make them better.
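The split idea can be sketched as a toy: greedily place layers on the GPU until a VRAM budget runs out and spill the rest to system RAM. Real offloading implementations are far more involved, and the layer sizes and budget here are invented for illustration.

```python
# Toy sketch of VRAM/RAM model splitting: assign each layer to the GPU
# while the memory budget allows, then spill the remainder to the CPU side.
def split_layers(layer_sizes_gb, vram_budget_gb):
    placement, used = {}, 0.0
    for i, size in enumerate(layer_sizes_gb):
        if used + size <= vram_budget_gb:
            placement[i] = "gpu"
            used += size
        else:
            placement[i] = "cpu"  # slower to run, but the model still fits
    return placement

# A hypothetical 10-layer model, 0.5 GB per layer, on a 3 GB VRAM budget:
print(split_layers([0.5] * 10, 3.0))
```

The performance hit comes from shuttling activations between the spilled layers and the GPU-resident ones, which is why "some performance hit" is the operative phrase.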


Mildly interested in this kind of application for the tech, honestly.
 
FTR there have also been a lot of advancements in how to split models across VRAM and RAM/disk over the last couple of weeks. It's now possible to run larger models with less VRAM than the full model would need, at some performance hit.

With that said, yeah, everything seems to indicate just making models bigger won't actually make them better.


Mildly interested in this kind of application for the tech, honestly.

I had heard of some of the advancements (like faster implementations of Whisper), but was unaware of some of the broader ones. This is a really exciting time for this type of tech. I can imagine SE using this existing tech demo/game as a kind of staging and test area for different advancements in their own internal implementation of this stuff.
 
It is out and, as expected, it lets the player select which model to use, but only for the Voice Input mode.

My 3070 with 8GB of VRAM can't do the Large model but can do the Medium model, so it's possible the voice model really is Whisper under the hood? VRAM size, storage size, and use case all check out.

EDIT: Oh, the load screen straight up says it is Whisper. Yeah, Medium requires 5GB of VRAM. Transcription accuracy seems pretty good. It transcribed "Harada" as "Hirata" once, but the text processing from there still picked up the correct intent.
 
An incredibly important game from a historical perspective, that finally gets an English localisation after 40 years... that's ruined by AI ChatGPT guff.

Do not trust this software. It is almost certainly harvesting your data through some form of telemetry/spyware. S-E would not be giving out this game for free without there being some sort of behind-the-scenes gotcha and ulterior motive; and we already know that they're invested heavily in NFTs and AI art related tech.
 
An incredibly important game from a historical perspective, that finally gets an English localisation after 40 years... that's ruined by AI ChatGPT guff.

Do not trust this software. It is almost certainly harvesting your data through some form of telemetry/spyware. S-E would not be giving out this game for free without there being some sort of behind-the-scenes gotcha and ulterior motive; and we already know that they're invested heavily in NFTs and AI art related tech.

You clearly have no fucking clue what you're talking about. What about this is Chat GPT? None of the text is generative. It is voice transcription software and chatbot-like routing.
 
I recommend using the PAUSE BREAK option to display the sentence similarity. It is still too finicky about what it expects from the user.

The actual voice transcription system is ace; it manages to be very loose with names, spelling, and pronunciation. The problem is the grammar of the text commands and what maps to eligible commands.

They need something like two or three times as many "sample sentences" to compare against, because as it is there are still some definite weaknesses compared to the old style of text parsing.

I wonder if Japanese and English have the same number of sample sentences, and if one of them performs better than the other.
 
An incredibly important game from a historical perspective, that finally gets an English localisation after 40 years... that's ruined by AI ChatGPT guff.

Do not trust this software. It is almost certainly harvesting your data through some form of telemetry/spyware. S-E would not be giving out this game for free without there being some sort of behind-the-scenes gotcha and ulterior motive; and we already know that they're invested heavily in NFTs and AI art related tech.

Please put some effort into educating yourself about what you're talking about before going on a crusade against it. As stated in this same thread, this is not using ChatGPT, runs locally, and can be used offline.

This kind of thing only helps spread potentially harmful misinformation.
 


Seems it's not being received super well. A shame, as I don't think it's a terrible idea, but it seems that, at least with English commands, it barely works. Also, the fact that the game is so obviously extremely budget art-wise feels pretty disrespectful to such a historically important game.
 
Seems like people had high expectations when they saw "AI" and didn't properly read the description. I tried it out, and it worked just fine for me.
 
Seems like people had high expectations when they saw "AI" and didn't properly read the description. I tried it out, and it worked just fine for me.

I dig it overall, but I used to do work with chatbots and chatbot design, and SE very clearly didn't write nearly enough example sentences that map into the game commands. It is like each action has two or three command structures that map to it, rather than two or three dozen.

The off-the-shelf voice transcription (via Whisper) is fantastic, but it is way too specific about what it wants, for what looks like off-the-shelf word embeddings that map commands into actions. There is meat in this approach, but it still takes more manual effort than they put in, either in writing example sentences or in building a language model that better knows the synonyms inherent to adventure-game-ese.
 
I finished it and enjoyed it. There's obviously room for improvement over this minimum viable product, but they get a lot of mileage out of extremely few Sample Sentences being compared against. The big single word near the end of the game accepts synonyms that weren't programmed in, but there are a few problems that developers in Japan would not have anticipated.

  • English Whisper transcriptions will often end with a period, which hurts the similarity score.
  • Capitalization, which doesn't exist in Japanese, can also hurt the similarity score.
  • I think English and Japanese use the same similarity thresholds, but there's no reason to think those thresholds mean quite the same thing in the two languages.

They got a lot done with shockingly few example sentences. A retail release and not an Educational Software "Tech Preview" would spend developer resources (which didn't exist here) to smooth over these types of issues.
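The first two bullets above suggest a cheap fix: normalize the transcription before scoring similarity. A minimal sketch, assuming the matcher compares lowercase, punctuation-free strings:

```python
# Normalize a voice transcription before similarity scoring, so a trailing
# period from Whisper or stray capitalization can't drag the score down.
def normalize(transcription: str) -> str:
    return transcription.strip().rstrip(".!?").lower()

print(normalize("Go to the Harbor."))  # -> "go to the harbor"
```

The sample sentences would need the same normalization applied once at build time so both sides of the comparison match.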
 
It's officially the worst rated game on Steam.

But it is under Educational Software! It doesn't even show up under a user's list of games.

There's like half of a developer listed on the "game", with more of the credits being links to various white papers and git repos. This was a few AI researchers at Square Enix testing out word embeddings on an existing game script and structure, with virtually zero freedom to adjust the game to match the technology.

There's a whole lot of problems, but it also has a lot of baggage from people that come in with expectations for the term AI, or that hate it for being AI.

I played it 100% with a mic and had nowhere near the problems people seem to be having with it. They have a huge amount of improvement to make for a commercial release, and a variety of bugs or unintended effects on the similarity scores, but they demonstrated the basics of how this type of parser would work in a full game (with a lot more effort on creating example sentences). It successfully picks up on synonyms that are totally not programmed into the game code.

I think it is really impressive to finally get these types of models up and running in a game structure. People that expect something like ChatGPT running locally don't quite understand that you kind of sort of need dozens of 4080s or whatever as a bare minimum to run that level of model.

A full game developer team, working on improving this demo into a sellable game with a full year of effort, would be able to get awesome results. As it is, this absolute bare-minimum translation (as in, fewer verbs than the old parser) manages to produce nearly the same results because it successfully picks up on synonyms. You still have to speak like you're playing an old-school adventure game, unfortunately, with "X to Y", and only at a spot where that action is allowed; Yasu is not good at telling the player when/where that prohibition applies (other than the two "let's wait until we are back at headquarters" / "let's wait until after the interrogation" lines).

There's a ton to think about, and structuring the game around the parser, rather than just bolting the parser onto an existing script, is going to bring giant gains. That wasn't the point of this tech demo, though. Throwing proper game developer resources at this is going to result in some fucking great experiences. A retail release would probably need a dozen times as many sample sentences, in a variety of sentence structures, with similarity thresholds tuned individually for each sentence and/or each prompt. But the basic concept here is exciting. They really did surface all the specific considerations that will come up when this gets applied to a full RPG in a few years.
 
It's officially the worst rated game on Steam.
probably not, no. it's at least tied with a Chinese-language-only card game.

i also don't like how this is undermining the original version's historical importance. while i still have issues with rolling the game out like this instead of a traditional re-release, the overwhelming criticism is throwing it into the same awful feedback loop that Balan Wonderworld and Babylon's Fall went through: the idea of it allegedly being "one of the worst games ever," with no consideration for anything else about it.
 

