Blubberquark Software @blubberquark - Tumblr Blog

Why Doesn't This Exist: ATX eGPU Dock

I was looking for eGPU enclosures. It wasn't encouraging. Reddit was right about this: You might as well buy a laptop and a PC. It's not worth the money. Since that Reddit post was written, things got worse. Docks that allow you to plug in an eGPU might as well not exist. The existing products are prohibitively expensive.

Why is that? It got me thinking. There are USB-3 docks with HDMI output, and there are docks with M.2 slots. There are even external SATA drive enclosures. Why not just sell the whole package in an ATX form factor? Why not put SATA and NVMe and USB and PCIe ports on an ATX-sized or ITX-sized motherboard, but without the CPU and RAM? Why not do that?

This way, you could re-use your old case and drives and power supply. You could save on everything except for the actual GPU. Then you put a clearly marked USB4 port on the rear I/O of the board, where rear I/O normally goes. That port also has USB-PD, and you plug in the laptop there, with a cable that can handle 100W.

It should be easy enough to make this a thing, with an ATX form factor and power supply. You could plug in front I/O and everything. You could use the SATA ports with your old CD drive.

I don't think this is an extremely clever idea, or that I must be a genius for coming up with it. Quite the contrary: I think it is an extremely obvious idea, and I am puzzled why this is not already a product I can buy.

Ratings

It's not a big problem in my day-to-day life, I must admit. It doesn't keep me awake at night. It's just that sometimes, when it comes up, it gives me pause: I don't always know how to rate things.

Sometimes it's unambiguous. When rating a seller, I rate 5 stars when the product arrives without a hitch, or when refunds are handled promptly. Do you ever give a seller 3 stars? When rating a hotel, the problems start. I am supposed to rate it out of 10, but does that include the price? Should I give that overpriced hotel an 8 out of 10, because it has most of the amenities and a nice location, or should I rate it a 6 because you should expect just a little more for that price?

When rating a movie, should I listen to my inner Siskel and Ebert and rate whether the movie was good, or should I just give it a thumbs up because I had an okay time? There's a difference. If I can give the movie up to four stars, I would be inclined to do the former.

Netflix used to have percentage ratings, and that communicated something to me. Now they have thumbs up, and that communicates something else. If I give a thumbs-up, Netflix will show me more movies of that type. It's only really for me. I can't thumbs-down a bad movie in order to communicate that it's bad like I can give one star only to a product on Amazon. They went from one kind of rating system to another, or from a rating system to a recommendation system, but they kept it in the same place in the UI.

Meaning

Giving something a thumbs-up or five stars or 95% can mean any one of the following: Show me more like this; show this to more people; add this to my list; adjust the score upwards; this is a good product; the product is irrelevant as long as the transaction went smoothly and it did; this hotel is clean and in a nice location.

I can usually distinguish between "This is good" and "Show me more like this". I appreciate that Steam has both, with reviews and discovery as separate systems. I also understand that you aren't supposed to grade Uber drivers on a curve.

It's all good as long as users understand what a good rating means, or what a bad rating means. If users don't know whether ratings are public, or which ones will be used in a recommendation system, then problems arise. They might give high ratings to movies they don't want to watch again, and get confused by the recommendations.

Problems will also arise when users disagree about the meaning of votes and ratings. If a downvote on Reddit is supposed to express disagreement, that's okay as long as users disagree, but if it's meant to be only a punishment for dangerous, false, or bad-faith comments, then users might argue whether they "really deserved" that downvote.

The arguments about meaning don't matter when raters agree about the meaning of the rating, or when they agree that they don't need to agree. For this to work they both need to agree that the rating is subjective, and they need to be correct in that assessment. If one rater thinks the meaning of a thumbs-down is "The Godfather is objectively a bad film", and the other thinks "Do not show me this movie again", then there's still a problem, and potential for conflict. If all raters think the meaning is "Do not show me this movie again", then you better not make any pronouncements about the quality of The Godfather based on aggregate data.

Distributions

Even when users/raters agree on the meaning of ratings, you often do not see nice unimodal rating distributions. Different users value different things, and even if they agree on what rating a product on Amazon 4 stars out of 5 should mean, they disagree about the ratings of individual products. Naively, you might expect there to be a distribution with a peak around the "true" quality of a product, falling off monotonically in both directions, with most products being of average quality.

If a product is "really" a three and a half star product, you may expect a distribution like 05-10-40-30-15, meaning 5% gave one star, 10% gave two stars, and so on.

If you have ever spent any time at all researching your options on Amazon, you know this is not the case. Most products have a bimodal distribution of star ratings. The most common is 40-5-10-15-30. There is a peak at one star, and a peak at 5 stars, with a low ramp-up in between. The product is cheap, and it does the job, except for half the users, it broke after one or two uses. The people who didn't get a defective unit rate it higher. There are variations of this ratings distribution, depending on how quickly it breaks, how good customer service is, how well it works before breaking, and so on.
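
To put numbers on the two example distributions from the text, here is a minimal Python sketch. The helper names (parse_distribution, mean_stars, share_at_extremes) are made up for illustration and are not anything Amazon exposes. It only shows how much of the shape a single average hides.

```python
def parse_distribution(text):
    """Turn a string like '40-5-10-15-30' into percentages for 1..5 stars."""
    shares = [int(part) for part in text.split("-")]
    if len(shares) != 5 or sum(shares) != 100:
        raise ValueError(f"expected five percentages summing to 100: {text!r}")
    return shares


def mean_stars(shares):
    """Average star rating implied by the percentages."""
    return sum(stars * share for stars, share in enumerate(shares, start=1)) / 100


def share_at_extremes(shares):
    """Percentage of raters who gave either one star or five stars."""
    return shares[0] + shares[-1]


unimodal = parse_distribution("05-10-40-30-15")  # the "expected" shape
bimodal = parse_distribution("40-5-10-15-30")    # the common Amazon shape

for name, shares in [("unimodal", unimodal), ("bimodal", bimodal)]:
    print(name, mean_stars(shares), share_at_extremes(shares))
```

The averages come out as 3.4 and 2.9, which does not look like a dramatic gap. The share of one-star and five-star ratings is 20% in the first case and 70% in the second, and that difference is what you actually care about as a buyer.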

Other bimodal distributions, like 40-10-5-15-30, are also common. If you see a one-sided distribution like 60-20-10-5-5, you know you should avoid the product, no matter what the written reviews say. If you see something with a peak around three or four stars, like 10-15-25-40-10, you cannot possibly know why the product wasn't rated five stars without reading some of the reviews. It's probably something that does the job, but has a major ergonomic flaw, but it also could be a product that does the job, but badly. There must be a drawback or more people would have rated it five stars. Maybe it's too bulky. Maybe it's too expensive.

Likewise, if a product is rated 10-75-5-5-5, you don't know why it doesn't have a peak at one star without reading the text.

The underlying problem with the meaning of Amazon ratings is that "product quality" is not one-dimensional, even if raters all agree more or less on what product quality is, what it means, and that star ratings should somehow reflect product quality. I suspect that for the average Amazon buyer, even stupid-sounding reviews like "I don't know whether it works because I haven't tried it yet. I bought the life jacket for emergencies" or "This is too big to fit on my shelf" fulfil some important function.

Many Amazon review distributions are a composite of two of these: Bell curve, fat-tailed one-sided distribution, uniform with a single outlier peak, one-sided distribution, bimodal/bathtub curve. Many Amazon listings contain products that are updated, or use old product pages with good ratings to sell new products.

The five-star range is just about granular enough. If you look at professional reviews of games, the difference between 76% and 82% is highly meaningful, and the step from 66% to 72% is a world of difference. On the other hand, the difference between two out of ten and four out of ten is completely meaningless, and 1/10 might as well not exist. Everything that is 69% and below, or 6/10 and below, means "don't buy", and everything below 80% means "wait for a sale". Sometimes even games that are reviewed as "deeply flawed" or "uninspired" routinely get an 8/10. Out of five stars, 8/10 means 3 stars, and on a four star scale with half-stars it means 2½. AAA games like Detroit: Become Human get enough benefit of the doubt to still get an 8/10, with reviewers panning the gameplay, writing, cliched premise, pacing, and characters of a primarily story-driven game, but grasping for positives to justify a B+ for effort. Only something truly baffling like The Quiet Man can break through the 7/10 sound barrier and get a 4/10.

With that in mind, it's no wonder that Steam has limited your choices to thumbs-up and thumbs-down. In practice

I just wish Amazon would let me see "recent reviews" or a graph over time. Steam ostensibly introduced that feature to fight review bombs, but in practice it helps buyers filter out games that were initially good but recently got replaced with pay-to-win games, much like those zombie Amazon listings.

The Theme Category

All of this brings me to Ludum Dare. If you follow my work there, you have probably guessed where I am going with this. I wrote a post on the LDJAM.com site about the Ludum Dare rating system. I'm not a fan. I thought, now that the LDJAM site is offline and PoV is in hiding, rather than trying to re-write my post from memory, I'd try to approach the same topic from the opposite direction. I knew I would still end up in the same spot.

Ludum Dare has these rating categories: "overall", "fun", "innovation", "graphics", "audio", "humour", "mood", and "theme".

Some are harder to grasp, some are self-explanatory. But some people think "graphics" means "this game has impressive/detailed/elaborate graphics", and some people think it means "the graphics serve the main point or main gameplay loop of the game very well". Some people think that humour means "this game is funny and has a lot of jokes" and some people think "this game uses humour well, if at all".

The "theme" category is the worst offender, because it means either "this game sticks close to the theme and incorporates it into the mechanics" or "this game reinterprets the theme in a creative way" or "this game interprets the theme in a unique and novel fashion" or "this game really digs into what [whatever this theme is this time] means in our day and age".

So if the theme is "Dating Simulator", then some people will make "Kitchen appliance dating simulator" (I am very confident that this already exists) and some people will make "Carbon Dating Simulator", a game where (psych!) you really just date fossils, and somebody will interpret the same theme to make a game also titled "Carbon Dating Simulator", except (double psych!) you actually go on dates with fossils. Maybe there will also be a game where you're really old and you date other old people and the humour doesn't land... What can you say, it's a game jam, and they can't all be winners.

If, on the other hand, I went to itch.io and joined the next lesbian dating sim jam (again, without Googling, I think I can confidently say that this exists in multiple forms), and if I uploaded a game where you're a straight woman from the Greek island Lesbos dating men (possibly from other parts of Greece) on the island of Lesbos, then the people running the jam wouldn't congratulate me for my creative interpretation of the theme, even if I put in a lot of work and took photos of real places on Lesbos in Greece. They would probably think I was taking the piss out of them, even if the game was actually good, even if it was well-written and played (on my re-read of this paragraph, I realise that for the first time ever, a pun was actually not intended) completely straight.

In the old rule set of Ludum Dare, the rules were roughly "the theme is optional, but if you don't use it at all, others will give you a bad overall score, and if you use it well, they may give you a good theme score."

In the new rule set of Ludum Dare, the rules around theme are roughly "The theme is not really optional, but also not mandatory. You are supposed to use it, but this will not be enforced. Also, if somebody is doing something really inventive with the theme, you may want to reward that with a high theme score."

When I tried to point out my problems with the Ludum Dare ratings, I got some interesting feedback. Different people on the Ludum Dare site actually interpret the theme question differently.

Some think it means "adhering to the theme", and they rate accordingly, and consistently so. To some of them, five stars in theme means a definitive and comprehensive exploration of the theme. To others, it means a mechanical interpretation of the theme, or to have the theme deeply integrated into the gameplay in another way.

Some think the theme category means "interpreting the theme in a far-fetched, non-literal or non-obvious way", and they give games low ratings when they take on the theme in a straightforward or mechanical way. The most important thing is not how much you rely on the theme, but how unique and clever your interpretation is.

Some think the theme may not be strictly mandatory, but they give games a low overall rating if they do not incorporate the theme enough, and a high theme rating when they interpret the theme in a far-fetched way.

Developers from group #1 were most likely to be annoyed by the difference in interpretation of the theme category. Developers from group #2 mostly seemed unbothered, because it's all subjective anyway. Group #3 seemed to have made their peace with the rating system.

I wasn't surprised by the people who didn't care about the disagreement. It's just for fun, you aren't supposed to take it seriously, what matters is participation, yada yada. I've heard all of that before. I also wasn't surprised by the insistence that ratings are subjective "anyway", so there is no difference between "this game is objectively good, therefore I give it five stars" and "this game is objectively good, but I don't like it, one star". If ratings are supposed to be subjective, then disagreement about the meaning of the rating categories is just noise – I disagree with that sentiment. I find it annoying, but I expected to encounter it.

What I didn't see coming is that it was mostly people like me, those who agreed with my own interpretation of the theme category, who felt the same annoyance about the disagreement. People in #2 and #3 were not concerned, or interested in clarifying what theme ratings actually mean, or how you should rate games. I should have seen it coming, I guess.

Solutions

Even though it is not an agreed-upon problem, we can still apply solutions that work for everybody. At the very least, we could apply unobtrusive solutions that won't alienate people who don't think there is anything wrong.

Mandatory Randomly Assigned Ratings: This won't solve the problems of the theme category, but it would make Ludum Dare ratings "more fair". Every player/developer gets a list of five games that run on the same platform. If you develop a game for Windows, you are assigned five games that run on Windows. You have to rate them. This is in addition to the 20 ratings your game needs. If you don't rate these five games, your game won't be counted. This would make the ratings more evenly calibrated and less skewed by popularity and selection effects, but it would also be a highly obtrusive change.

Sample Ratings: There is a list of games that have been rated by a panel of experts, with each rating in each category present at least once. If you want to see whether a game deserves two stars in graphics, you can compare it to a canonical list of two-star graphics games. Of course this would be highly obtrusive, take a lot of time, and you could never get a panel to agree on star ratings. Players/raters/developers would also need to play these games, and not just look at their names in the list. It would be quite harsh on the devs of canonical one-star games. It could solve the problem though.

Explanations of Categories: This is not that obtrusive. All you would need to do is to write a long version of the guide/rules, revise the current guide/rules so it isn't in contradiction with the long version, and then link to the long version from the short version. What is fun? What is mood? When do games deserve a mood rating? Is it ambition or execution? Can a game be five stars on fun and zero overall? What do these ratings mean? I suspect many developers would find taking a definitive and opinionated stance on matters of taste distasteful, and they would probably protest.

Rank Ranking: Instead of letting raters give star ratings, let's do pairwise comparison or best-to-worst sorting. This might be highly contentious, and it would be an obtrusive change to the rating system, but here is a tweak: Instead of making raters list games best to worst, add a view where raters are shown each category sorted best to worst. The rater can drag games around in this view. If a game with more stars is dragged below a game with fewer stars, the star ratings of these games are swapped. Alternatively, give the rater a tier list view, where S tier is five stars, and D tier is one star. These views would be optional and unobtrusive.
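
A minimal sketch of the drag-and-swap tweak, in Python. The function name, the data layout (a best-to-worst list plus a dict of star ratings), and the game titles are all my own assumptions; nothing here reflects how the Ludum Dare site stores ratings.

```python
def drag_below(order, stars, moved, target):
    """Move `moved` to just below `target` in a best-to-worst list.

    If that places a game with more stars under a game with fewer stars,
    the two star ratings are swapped, as described in the text.
    Returns new (order, stars) values and leaves the inputs untouched.
    """
    new_order = [game for game in order if game != moved]
    new_order.insert(new_order.index(target) + 1, moved)

    new_stars = dict(stars)
    if new_stars[moved] > new_stars[target]:
        new_stars[moved], new_stars[target] = new_stars[target], new_stars[moved]
    return new_order, new_stars


# Hypothetical games in one category, sorted best to worst.
order = ["Alpha", "Beta", "Gamma"]
stars = {"Alpha": 5, "Beta": 4, "Gamma": 2}

order, stars = drag_below(order, stars, moved="Alpha", target="Beta")
print(order, stars)
```

This only handles the pairwise swap. Dragging a game past several others at once could leave the games in between out of order with their stars, so a real implementation would probably need to shift ratings along the whole span.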

Two Theme Rankings, Three Graphics Rankings, Five Golden Rings, Seven Samurai: Specify multiple distinct interpretations of rating categories, and let some raters loose on old Ludum Dare games. Tell them to rate "How closely does this follow the theme" or "How detailed/elaborate are the graphics" and "How realistic are the graphics" and "How pretty are the graphics" and "How well do the graphics serve gameplay compared to more detailed/more realistic/prettier graphics". Is there that much difference in practice? Which one predicts Ludum Dare ratings best?

Recommendation System: After the jam is over, instead of showing the games in ranked order, go full Netflix, and show logged-in users games they might also like. Under games, show related games, based on a simple "people who liked this game also liked..." recommendation engine. You could also allow logged-out users to apply fine-grained metrics and tags, but these would not be subjective or value-laden. Instead, they would be information like "number of minutes to complete" or "how many times did I die before I beat the game" or "sentiment on a scale from fluffy to grimdark".
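
The "people who liked this game also liked..." part could be as simple as counting co-occurrences. The sketch below assumes the likes are available as a dict from user to a set of liked games; that layout, the function name also_liked, and the sample users and games are invented for illustration.

```python
from collections import Counter


def also_liked(likes, game, top_n=3):
    """Return up to `top_n` games most often liked together with `game`.

    `likes` maps each user to the set of games that user liked.
    """
    counts = Counter()
    for liked in likes.values():
        if game in liked:
            counts.update(liked - {game})
    return [title for title, _ in counts.most_common(top_n)]


# Hypothetical data: five users, four games.
likes = {
    "ann": {"A", "B", "C"},
    "bob": {"A", "B"},
    "cy": {"A", "B", "C"},
    "dee": {"C", "D"},
    "eve": {"A", "D"},
}

print(also_liked(likes, "A"))
```

A plain count like this favours games that are popular overall, so a real engine would likely normalise by how often each game is liked in total.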

Histograms: This one is the easiest to implement, unobtrusive, and definitely the most bang for your buck. Instead of telling players the average "fun" or "overall" score, show them the histogram/bar chart. You know, like on Amazon.
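
As a rough sketch of how little this takes, here is a text-only histogram in Python. The function name and the sample ratings are assumptions of mine, and a real site would render bars in HTML rather than with "#" characters.

```python
from collections import Counter


def star_histogram(ratings, width=20):
    """Render 1..5 star ratings as a text bar chart, five stars on top."""
    counts = Counter(ratings)
    total = len(ratings)
    lines = []
    for stars in range(5, 0, -1):
        share = counts[stars] / total if total else 0
        bar = "#" * round(share * width)
        lines.append(f"{stars} stars | {bar:<{width}} | {share:4.0%}")
    return "\n".join(lines)


# Hypothetical "fun" ratings for one game: 20 votes in a 40-5-10-15-30 shape.
ratings = [1] * 8 + [2] * 1 + [3] * 2 + [4] * 3 + [5] * 6
print(star_histogram(ratings))
```

The average of these sample ratings is 2.9, which tells a developer much less than seeing the two peaks.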

derinthescarletpescatarian

I'm constantly seeing stories of people devastated that ai deleted their code or borked their computer or permanently deleted all their important work records or whatever, and while I'm sure many stories are fake I know at least some of them are real, and I don't get it. I don't understand why the ai is capable of doing this. Even if you do think "vibe coding" or whatever is a useful practice, you... you check the code and push it yourself, right? The ai can't push code? The ai shouldn't be able to touch your backups? WHY IS THE AI PHYSICALLY CAPABLE OF PUSHING CODE? WHY IS IT ABLE TO DELETE YOUR FILES AND DELETE YOUR BACKUP FILES? People tell it "do not change this without approval" and then get confused and upset when it does but WHY WAS IT PHYSICALLY ABLE TO? Am I misunderstanding something here? Shouldn't this be basic? Is there some complicated computer reason I don't understand where ais have to be physically able to fuck around with your important shit and your important shit's backups if you use them? Because that doesn't sound right. At least back up your records to an external device nightly or something if there's some architectural reason I don't understand that means ais have to be able to delete them all.

derinthescarletpescatarian

I'm of the opinion that even if you have to use AI it shouldn't have access to anything that you wouldn't be comfortable giving your thirteen year old nephew access to. Why is "AI irretrievably deleted all my emails" a possibility.

justhavetofeelthewaves

It’s a failure of sandboxing and not wanting to be overly prescriptive of all file permissions. If I want the agent to be able to modify a file, it has to have permission to write that file. Write permission means delete permission too.

Pushing code is a one line command: git push. Having it push commits to your own working branch is (sorta) fine. It’s when it pushes to the deploy branch (master or main) that it’s like “ok no it shouldn’t be doing that.” But people don’t differentiate that sometimes or the agent ignores the directive and. Well.

derinthescarletpescatarian

See maybe I'm a dinosaur but I just assumed that the ai would be writing the code in a safe locked off environment and the user would have to export it to somewhere else from where it could be pushed. Like the AI would be writing in Notepad or something, or at least a similar program that preserved code formatting properly. But I guess that very simple security feature is too time consuming. Sure let Spicy Autocomplete fuck around with your important shit. I don't know, I'm not a coder.

theskiesareopen

I'm actually a software engineer and security researcher that has been working on this kind of problem for years, since well before LLMs came around. The short version is that everything has this problem because a long time ago we decided to build all computers on a model that is harder to explain and harder to use and less secure but fits very nicely to a corporation that wants to be the sole authority on what a computer can do mixed with programmers/users who don't want to feel like there's anything they can't do. Most of the time you don't notice this because the malice, incompetence, and mistakes of other people usually get filtered out before they bite you in particular too hard, and people don't think about putting up safeguards that protect them from themselves. AI just lets a new source of mistakes crop up, one that no one reviews and that doesn't appear to be your mistake. The sandboxing you were talking about used to be the case at the very dawn of AI dev tools when it could only be a chat, but then as soon as people started adding features like "let the AI check test results" and "let the AI look for what might be relevant" that punched massive holes through the security that couldn't be fixed without dramatically reimagining how the security is done, and that takes vision and time and effort, which are notably lacking in current software development.

The long version is that by default absolutely everything you run on a computer has full authority to do anything you can do. Your web browser can read any file on your computer, any game can snoop on your browser history, your dev tools can wipe your hard drive. The only ways around this without changing our minds on how our computers secure things are very difficult and time consuming and require individual attention to get working per program.

The current model is something called "access control list" (mostly) which means that for every thing on your computer there is a list of who is allowed to access it, and whenever something tries to access it the computer checks if the who that something is for can access it. Therefore, everything you do on the computer can do everything you can, unless you specifically make a new "who" that the what belongs to and go through and enumerate in a list everything it should be able to do and whenever you want to, say, give it a new file to read, you have to add that file to a list of things it may access before providing the file rather than just giving it the file. Anything that looks like it is violating this (outside of some very niche systems like seL4, capnp, or my own research which I don't expect you to have encountered) is actually just transparently making a "who" and setting up lists behind the scenes. For example, a web browser will make a new "who" (but not the same as the OS's "who") for each website domain you visit and automatically populate the list from Cross Origin Resource Sharing settings the server provides. That's why going to one website doesn't steal your credentials on other websites unless it can trick the browser or trick you into telling the browser to let it.

The alternative model is called "capabilities" with a bunch of adjectives describing different ways they can be represented in different scenarios. (Ocaps, Zcaps, etc.) In this, instead of having a big list of things you are allowed to do and anything you ask to do things inheriting that list, when you ask something to do things, you hand it some reference that both lets it find the object and tells the system it's allowed to access the object. This means that programs you run can't do everything you can, they can only do things you say they can, and the act of saying they can do something and telling them to do it is the same action which makes it secure by default instead of insecure by default.
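
The difference in calling convention can be sketched in a few lines of Python. This is only an illustration of the idea: Python itself does not enforce capabilities, so the second function could still call open() on its own, and both function names are invented for this example.

```python
import io


def count_lines_ambient(path):
    """Ambient-authority style: the callee gets a name and opens it itself.

    Nothing stops it from opening any other path the user can read.
    """
    with open(path) as handle:
        return sum(1 for _ in handle)


def count_lines_capability(readable):
    """Capability style: the caller hands over an already opened object.

    The reference is both the way to find the data and the permission
    to read it; the callee never needs to see a path.
    """
    return sum(1 for _ in readable)


# The caller decides exactly what the callee may touch.
document = io.StringIO("first line\nsecond line\n")
print(count_lines_capability(document))
```

In a real capability system the second style is the only one available, which is what makes it secure by default.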

The Android permissions system where the Tumblr app can't just steal and wipe your entire photo roll can be viewed as a sort of attempt to build a capability system on top of an ACL system without quite understanding that. For example, trying to open a file in Android opens a separate privileged file picker and the app that asked to open a file only gets the result of the action, as a cap should be. In Windows, however, when you ask an app to open a file it just looks directly at every file you have access to and then you tell it which one.

I and a variety of other researchers are working, sometimes at cross purposes, to make things based around caps more flexible and easier to get started with and more powerful. ACLs have a lot of accumulated software and infrastructure, but caps can subsume it and make things nicer and cleaner and easier in the future. And as you might guess from the length of this post I am always excited to talk about what these systems can become.

blubberquark

I understand that the original Unix security model was "users don't trust each other, and root doesn't trust users, but users trust the system programs and their own programs with their own data" while the Android security model is "users don't trust their own apps with their data". I understand your "ocaps, zcaps" parenthesis to mean something like intents (on Android) or XDG Desktop Portal.

I'd go for something less "research grade" and more "practical". I'm not trying to disparage your research, but if you want to be less "ivory tower", I can see two ways you can go:

Option A: Vibe Coding IDE

You make something that is like "Eclipse for vibe coding" or "AI native Visual Studio". You have different plug-ins, different tools, and an LLM system that can do stuff for you. Plug-ins could be a build system, a test runner, a GUI designer that takes MSPaint.exe drawings and turns them into code or XAML, a version control plug-in, and so on. These plug-ins all run with user permissions, and they aren't self-modifying LLM output. They might be vibe-coded, but there was a human in the loop. They are part of the IDE, that is the important part. Maybe there is also a plug-in for documentation, so the LLM knows about all the APIs you are using. Maybe there is a standardised format for docs. If you press "run", it runs, without any extra steps.

Now importantly, the LLM has access to a suite of non-destructive tools. It can look at the version history, it can query LLVM-based parsers and linters, it can see compiler errors and warnings, it can use static analysis. But what it can't do is delete tests, download files from the Internet, or change the build system. (Maybe it can change the build scripts, but then you'd have to run the build system inside a sandbox, too). This works best when the LLM works with a language like Rust or Haskell, one with an effect system or guaranteed no side effects. If the LLM wants to add a dependency or change the build scripts, or if the LLM wants to do a git command that isn't just adding and committing changes, this must go through the user.
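
The "anything beyond add and commit goes through the user" rule could look roughly like the Python sketch below. The allowlists, the function name review_git_command, and the "allow"/"ask_user" return values are assumptions for illustration, not part of any existing IDE.

```python
import shlex

# Assumed allowlists: read-only queries plus the two writes named in the text.
READ_ONLY = {"status", "log", "diff", "show"}
SAFE_WRITES = {"add", "commit"}


def review_git_command(command_line):
    """Decide whether an LLM-proposed command runs directly or needs the user.

    Returns "allow" for allowlisted git subcommands and "ask_user" for
    everything else, including anything that is not a git command.
    """
    args = shlex.split(command_line)
    if len(args) < 2 or args[0] != "git":
        return "ask_user"
    if args[1] in READ_ONLY | SAFE_WRITES:
        return "allow"
    return "ask_user"


for proposed in ["git status", "git commit -m 'wip'", "git push origin main", "rm -rf ."]:
    print(proposed, "->", review_git_command(proposed))
```

The check is deliberately conservative: global options before the subcommand (such as "git -C elsewhere push") fall through to the user. It does not inspect flags after the subcommand, so a stricter version would also have to look at things like "commit --amend".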

If the LLM wants to run or de-bug the code it has written – i.e. code that isn't part of a trusted IDE plug-in – then it will run inside a firejail or a similar jail, so that the generated code doesn't have the permission to read $HOME or to read ./.git.

You could even configure the system so that it runs static analysis first, and when it can't prove that LLM-generated code will behave, then the test-suite will run the LLM-generated code inside a sandbox. You could also configure the system so that the user will be prompted when the LLM generates or modifies "naughty" code, or alternatively, the LLM can be asked to refactor "naughty" code to behave better, and if that fails, the decision is escalated to the user.

Theoretically, a combination of static analysis, interfaces to trusted tools, and sandboxing should be good enough for an LLM-based IDE, unless you really want the thing to operate completely autonomously.

The biggest problem with this approach is that in practice, you need to run arbitrary code in the build system. In the real world the build system often needs to download things from online, and update itself, and every project uses a different build system, so you write plug-ins for build systems to interact with each other, and then another build system crops up and nothing is interoperable. Gone are the days of "./configure && make && make install". Gone are the days of self-contained Visual Studio projects with .sln files. Everything needs an SQL database and a job queue and a document database and the credential to an S3 bucket.

That brings us to:

Option B: Containerise It!

You let the LLM run wild and execute arbitrary shell commands. Instead of giving it an interface to tools, you just let it run GDB and clang and whatever it desires. I realise that the idea of sandboxing the program with firejail involves ACLs, but this one is really different. You have the LLM running (predicting tokens) on your machine, and you have git running on your machine, or some other system to periodically snapshot the progress, but any code generated by the LLM, and any commands executed by the LLM are always running inside a container. There is no ACL in the sense that everything the LLM ever sees is sandboxed.
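
One way to picture this: every command the LLM proposes is wrapped in a container invocation before it runs. The Python sketch below only builds the argument list for "docker run" and does not execute anything. The image name llm-sandbox:latest, the /work mount point, and the choice to disable networking are all assumptions of mine, and other container tools would work just as well.

```python
def sandboxed_command(workdir, command, image="llm-sandbox:latest"):
    """Wrap an LLM-proposed command so it only ever runs inside a container.

    `workdir` is a scratch copy of the project on the host; snapshots
    (git or otherwise) are assumed to be taken from outside the container.
    """
    return [
        "docker", "run", "--rm",      # throw the container away afterwards
        "--network", "none",          # assumption: no network inside the sandbox
        "--volume", f"{workdir}:/work",
        "--workdir", "/work",
        image,
        *command,
    ]


# Hypothetical usage: the LLM asked to run the test suite.
argv = sandboxed_command("/tmp/scratch-project", ["make", "test"])
print(" ".join(argv))
```

Disabling the network is the strictest choice. It runs straight into the build-system problem from Option A, so in practice you would probably have to loosen it for dependency downloads.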

This is still problematic. The LLM could generate a program that runs "sudo rm -rf /" inside your container, and then later you take the output of your independent LLM agent, and you compile it, and it wipes your machine. It's still possible. The solution is still code review and a human in the loop, but at least when it happens, you only have yourself to blame.

I just used a web site that had an "AI search" instead of full-text search. It was awful. It took between 30 seconds and a minute to run. It used an LLM to expand my search keyword into a full sentence, and it showed me what the LLM thought I wanted. It had correctly understood what my search keyword meant.

Then it returned hundreds of search results, and only the first was relevant to my search, or even contained my search keyword. The rest was just noise.

#AI enshittification

Animal Well: The Benefit of the Doubt

Animal Well is difficult to write about without spoiling, and hard to review fairly.

Animal Well is also the type of game many hobbyist or first-time game developers want to make, but shouldn't. It's big, sprawling even, and it's built in a custom engine. Animal Well is a game with niche appeal, but also a lot of marketing hype, and only one full-time developer. I'm sure it worked out commercially, and I'm sure using an ARG to market Animal Well worked out in this instance, but I don't think this method of community building works for every game and every developer.

Every moment of Animal Well is a spoiler, but in a way, Animal Well is structurally similar to Noita or FEZ: There's the main game, there are secrets that go beyond the main game, and there are hints to puzzles that lie outside the game, like an alternate-reality game. I am still on my first playthrough of Animal Well, and I am far from getting to what would be 100% in FEZ, and yet I have already encountered many of the post-100% collectibles. Maybe my own playstyle is to blame. When I play a metroidvania, I sometimes get stuck looking for the next area in the wrong spot. When I play Animal Well, I often keep looking in the same room until I find a secret collectible, and when that happens, I am always disappointed. I never felt good about finding these. I wanted to find a way out, instead I find post-game content.

Maybe FEZ just handled this better, because in FEZ, you need to find 64 cubes, so every cube means something. The eggs in Animal Well feel like consolation prizes. They make me ask: "Was this really worth a moon?"

So what is Animal Well in a concrete, moment-to-moment-gameplay sense? It's a puzzle platformer, a metroidvania, and a game with secrets and riddles. Like Blasphemous, it's a metroidvania with few movement upgrades; I think you can beat the game with only one upgrade. Like Stephen's Sausage Roll, it's a puzzle game with many rules, and the rules are always in play, but you don't see all the rules from the start. You're supposed to learn gradually. Like Myst, it's a game of riddles and learning what the rules are, and once you know the rules, the puzzles are much simpler.

Supposedly (as in I have been told this, but cannot verify this firsthand), once you have beaten the game, or once you have been spoiled, you can beat Animal Well rather quickly on your second playthrough. There isn't really anything you unlock, just things you learn, and these things obviously stay with you when you start a new save file. There are many other games where this is the case, of course: Every point-and-click adventure, every puzzle game where you need to learn the mechanics, every metroidvania with complicated routing. It's just that in Animal Well, knowing where to go is supposed to be like knowing where to go and starting the game with triple jump unlocked.

All this meant that I had to go into the game completely blind. Everybody who recommended the game to me told me this was the correct way to play it. If I hadn't been recommended Animal Well, then I probably wouldn't have bought it anyway, but if I had bought it on a lark, I would have probably read the wiki by now, or I would have just refunded it.

I don't know if I am supposed to look online, because there are some things that the community has to find out collectively. I don't think so. I clearly haven't come close to the mid-game of the main path. I'm below 50% of any%.

But I know one thing: If I had made a game in the style of Animal Well (but worse, I freely admit), not only would it have sold poorly, but most players would just have quit in frustration. In the world where Billy Basso uploaded Animal Well to itch.io as a free download one day, without any pre-release marketing, probably nobody important would have taken note of the game, and few people, if any, would have beaten the game, never mind found the secrets. Maybe there would have been 2000 page views, 200 downloads, and 10 people would have taken the disk or gotten even a few rooms out from the beginning, instead of quitting in frustration.

Many hobbyist developers want to make their own Animal Well. Some want to do it as their first game. They wanted to make a game like this before Animal Well was a thing. They want to make their big, sprawling magnum opus with many mechanics, and a lot more content. They want to spend seven years building a game in a custom engine.

If you played somebody's magnum opus, how much benefit of the doubt would you give it?

Accessibility Options

I saw a post that I won't link to, because I don't like to start fights.

indie devs will give you a cheat menu labelled Accessibility and then not let you change the size of the subtitles [...] scrolling through a dozen "ignore a central mechanic" toggles and noticing a distinct lack of photosensitivity mode

It's easy to trace this back to Celeste, a game that marketed itself on this particular brand of accessibility, but I distinctly remember that there was an "accessibility" option in VVVVVV that made the player character invincible, but not the characters you escort in escort missions. Many games seem to have blindly copied the Celeste philosophy of calling the difficulty menu "accessibility", without understanding which options their own games need.

"Should be Easy"

Let's get the inflammatory part out of the way: If you comment "It's easy enough to do a shader, I don't understand why they didn't just do a shader" or "You can just make it a checkbox in the options menu", then you had better have shipped a game of a comparable size before you say this. You're still allowed to call for a colourblind mode, but don't use words like "easy", "just", or "simple enough". Non-programmers sometimes say things like "Just add a button that does X", without understanding that actually doing the thing is the hard part, and programmers sometimes say things like "Just implement something" without understanding that specifying and designing the thing is the actual problem.

Sometimes you're hamstrung by your engine. Sometimes the only way to do colourblind mode is to make a separate set of sprites, because you're using Klik'n'Play. It's probably easier to do colourblind variations of select sprites or textures in your game than to write a global colourblind mode shader that always does the right thing, or to decide which elements of an open world game need colourblind post-processing. By all means, tell me when something is hard to read, but don't be condescending and tell me to use shaders.

The Elephant

Back to the less inflammatory part. Before Celeste, Terry Cavanagh's VVVVVV had invincibility mode as "accessibility". That game also had a big seizure-inducing elephant. (Don't Google to see what it looks like. The VVVVVV wiki has a flashing gif. *facepalm*)

It was some kind of joke, probably – "The elephant in the room", or something like that. There was no way to turn the flashing off in the original release. It was only patched in the open source version, ten years later, by an unpaid volunteer: https://github.com/TerryCavanagh/VVVVVV/pull/293. The elephant flashed at the refresh rate of the game, in a different colour every time it was drawn.

My own games never have epilepsy toggles. Why is that? I don't ever add flashing elephants to my game as some sort of joke. I don't add full-screen strobes, and I don't add jump scares or unexpected loud noises. You see, I used to be photosensitive as a child. I remember what it's like. I never add anything in the first place that needs to be toggled off.

High Contrast

Some of my own games have colourblind mode toggles.

Most of my own games used to have a "tweet my score" button, but I'll probably go back to having it in my games once I implement support for Bluesky and Mastodon where you can configure your preferred system and instance. I had a whole suite of features that I considered must-haves, even for small game jam games, but over time, I stopped doing many of them.

These days I don't do colourblind mode. Many of my games are playable in black and white, at a low resolution, on tiny screens. They already have good contrast. The first time I did colourblind mode, I overdid it with the contrast, and then I realised all I really need is some subtle texturing that lets players distinguish between the red guys and the green guys.

If that subtle pattern is always there, I can just design my game around that. I don't need a toggle. I just design it so it helps people who can clearly see colours as well.

Curb Cut

If I make another game that's about red guys versus green guys, or if I release a full version of that first game, it may have a colourblind toggle again, but it's not really necessary when I design the red guys to be a different shape from the green guys. If the only way to distinguish the red pirates from the green orcs is colour, then colour-blind mode is needed. If it's a fighting game where you have two ninjas, and one has a red sash and another a green sash (I know about the other forms of colourblindness by the way, I will continue using red and green without loss of generality) then you need some sort of accessibility option. Unless both players pick the same character, or the same faction in an RTS game, colourblind shaders are not as important as distinctive shapes of different characters/units.

By similar logic, if the camera always follows the player character's red go-kart, then there is little need for a shader to distinguish it from an identical green go-kart. You always know which one is you. If different player characters have name tags, that might help more than a "colourblind" toggle.

Now this doesn't mean you never need a colourblind option in your menu, but usually your time is better spent making sure everything is legible and accessible with default settings.

Give Me Some Options

I have accidentally (or maybe inevitably) arrived at the same position I already hold when it comes to keybindings, or difficulty:

Don't rely on custom options. Make sure you have good defaults!

You shouldn't use "the player can tweak the difficulty if a section is too hard" or "the player can change keybindings if the controls are awkward" as an excuse for bad design. You should spend more time fixing and simplifying your controls, or perhaps some time designing alternative control schemes. You should spend less time on the keybinding menu. If you aren't working on a big AAA title, you should really think twice (think about simplifying your controls) before making your keybindings configurable. You should think twice before adding the "I have seizures, don't give me epilepsy" toggle. You should think twice before implementing an easy mode that gives the player more health.

You should do your job as a game designer, and remove the cheap unavoidable damage, and the boss attacks that aren't telegraphed. Instead of adding options, you should find a design that works for everybody.

Job Security

There is a sad reason why this isn't done more often. Within big organisations, making the character designs more legible is already the job of somebody, but it's underappreciated. The character designer wants the characters to look cool. Not everybody's career benefits from legible character design. If you are the programmer who does the colourblind shaders, this is something you can point to, though. You did that. There's a toggle in the menu as proof.

On a larger level, your publisher and their marketing team can point to the options menu and say "Our game is for you, epileptic target group", and lead developers can go to GDC and get recognition from their peers for caring about people who are epileptic, or short-sighted, or hard of hearing, or...

Good design, on the other hand, is invisible. If the game just lets you zoom in at all times with the mouse wheel, that doesn't look like you are going the extra mile for your visually impaired players, it just looks like you made something playable by accident. If there is an option to make text more legible by giving it an outline, why isn't that on by default?

You can easily overdo it with accessibility options. If there is an option to make font sizes larger than the player character, then people who can't see where they are jumping can at least read the dialogue in the cut scenes. It shows the players who don't need this option that you care.

#game design #accessibility #game development

Keybinding Woes

I recently tried to play a game (title anonymised intentionally) with a game controller. The game was controlled with ZX and arrow keys.

I have a game-pad that I can map to keyboard keys, so I used that, but somehow it didn't work. My keyboard is mapped to German, and even though I could use the game-pad to type in Z and X on the German keyboard, it didn't register right. I tried to re-map the keys in-game, but the game just didn't register Z and X when typed with the game-pad. I used a different controller with Xinput, and software to map the d-pad to arrow keys. Again, it did not work.

I tried to swap Y and Z in the keybindings of the game, and I tried to map the controller to Y and X, and that also didn't work. Somehow, the game allowed me to re-bind keys, but not to overcome keyboard layouts.

I could also play the game with Xinput support, but at that point I was trying to find out why that didn't work. What did work was using Z and X on the keyboard itself, or re-mapping controls to X and C, or switching the keyboard layout to US.

Scancodes

I wonder if the developers never tested on a non-US keyboard, or if using joy2key somehow caused an edge case with scancodes and keycodes. When we were porting PyGame to SDL 2.0.X, I happened upon some inconsistencies between the SDL 1.2 approach to keyboard events, and the new SDL 2.0.X approach. What we should have done was to tell everybody to change their code, but instead we decided on some sort of half-measure to allow full backward-compatibility with old games, even though the SDL developers themselves didn't aim for that for their own project. If you want to type text in SDL 2, you should explicitly start and end text entry, so the OS or desktop environment can show an on-screen keyboard, or an input method for emoji or special characters or just regular characters when you are writing in a language like Japanese.

I realised that you theoretically could ignore keyboard layouts altogether, and just use scancodes, and then you wouldn't have to bother. Who cares which letters are on the keys? When you actually want text, you start text input mode, and SDL gives you unicode, and you more or less ignore keyboard events unless they are Tab or Escape or Return. That includes Shift and Alt-Glyph! You don't need to know what's on the keyboard, because SDL will just give you a "mü" or "µ" or even "ム".
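A minimal sketch of that split, in plain Python with no SDL involved: KeyDown and TextInput are stand-ins for SDL 2's SDL_KEYDOWN and SDL_TEXTINPUT events, and NameEntry is a hypothetical text field. The scancode values follow SDL 2's SDL_Scancode enum; everything else is made up for illustration.

```python
from dataclasses import dataclass

# Values follow SDL 2's SDL_Scancode enum.
SCANCODE_RETURN, SCANCODE_ESCAPE = 40, 41


@dataclass
class KeyDown:
    """Stand-in for SDL_KEYDOWN: a physical key, identified by scancode."""
    scancode: int


@dataclass
class TextInput:
    """Stand-in for SDL_TEXTINPUT: finished unicode text from the OS."""
    text: str


class NameEntry:
    """Hypothetical text field that only listens while text input is on."""

    def __init__(self):
        self.active = False
        self.buffer = ""
        self.result = None

    def start(self):
        # A real game would call SDL_StartTextInput at this point.
        self.active = True
        self.buffer = ""

    def handle(self, event):
        if not self.active:
            return
        if isinstance(event, TextInput):
            # Layout, dead keys and input methods have already been applied.
            self.buffer += event.text
        elif isinstance(event, KeyDown):
            if event.scancode == SCANCODE_RETURN:
                self.result = self.buffer   # commit the text
                self.active = False         # ... and SDL_StopTextInput
            elif event.scancode == SCANCODE_ESCAPE:
                self.active = False         # cancel without a result
            # Every other key press is ignored while typing.


entry = NameEntry()
entry.start()
for ev in [KeyDown(29), TextInput("µ"), TextInput("ム"), KeyDown(SCANCODE_RETURN)]:
    entry.handle(ev)
print(entry.result)
```

The key press with scancode 29 never reaches the buffer. Only the text events do, so the field never needs to know which layout produced them. Tab could be handled the same way as Return and Escape.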

For keybindings, you could show something like this:

You just show where the keys are, that's enough! No more keycodes!

If you want to know what the keycodes are for ZX or WASD, you just run SDL_GetKeyFromScancode. This way you could even try to auto-fill keybindings programmatically. You decide what the default keyboard would look like on your local keyboard layout, and instead of hardcoding WASD or ZQSD or YX + arrows or ZX + arrows or XC + arrows, you hard-code the scancodes, and you just fill in the keyboard mapping when the game starts for the first time.
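Here is a small sketch of that auto-fill idea, again in plain Python. The scancode numbers follow SDL 2's SDL_Scancode enum, but the lookup tables are a hand-written, hypothetical stand-in for SDL_GetKeyFromScancode that covers only a few keys and three layouts. The action names are invented for the example.

```python
# Scancodes name physical key positions; values follow SDL 2's SDL_Scancode enum.
SC_A, SC_D, SC_S, SC_W, SC_X, SC_Z = 4, 7, 22, 26, 27, 29

# Hypothetical stand-in for SDL_GetKeyFromScancode: the letter printed on
# each physical key. Only the keys used below are listed.
US_LABELS = {SC_A: "A", SC_D: "D", SC_S: "S", SC_W: "W", SC_X: "X", SC_Z: "Z"}
LAYOUT_OVERRIDES = {
    "us": {},
    "de": {SC_Z: "Y"},                        # QWERTZ swaps Y and Z
    "fr": {SC_A: "Q", SC_W: "Z", SC_Z: "W"},  # AZERTY
}

# Defaults are hard-coded as key positions, never as letters.
DEFAULT_BINDINGS = {
    "up": SC_W, "left": SC_A, "down": SC_S, "right": SC_D,
    "jump": SC_Z, "shoot": SC_X,
}


def key_label(scancode, layout):
    """Return the letter printed on the key at this physical position."""
    return LAYOUT_OVERRIDES[layout].get(scancode, US_LABELS[scancode])


def binding_labels(layout):
    """Labels to show in the keybinding menu when the game first starts."""
    return {action: key_label(sc, layout)
            for action, sc in DEFAULT_BINDINGS.items()}


for layout in ("us", "de", "fr"):
    print(layout, binding_labels(layout))
```

With these tables, the same six positions come out as WASD plus ZX on a US layout, WASD plus YX on a German one, and ZQSD plus WX on a French one, without a single letter being hard-coded in the bindings.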

Why isn't everybody doing this?

I'll have to evtest how joy2key software works to see if this is the reason, or maybe that one game just hardcoded the US keymap somewhere.

#game development #SDL #UX #localisation #I don't want to put the game devs on blast

What made Brood War good, or Red Alert 2 fun?

Previously: https://blubberquark.tumblr.com/post/726536336618733568/starcraft-2

In this post I would like to compare StarCraft: Brood War, StarCraft II, and Command & Conquer: Red Alert 2.

Defining The Genre

Before we get there, let us take a quick look through my history with RTS games:

Command & Conquer and Red Alert: These put the RTS genre on the map for me. I played them a ton. My father played them a ton. The tone of the first game is grungy, campy, and extremely 90s. Thematically, I like this game the most of all RTS games. There's a war between the "status quo coalition" GDI, bland and professional, and the guerilla army Nod with its charismatic leader Kane. The Brotherhood of Nod is some kind of vaguely Maoist-Third-Worldist ideological construct and multinational terrorist network. The second row of leadership are vying for power and aren't hardcore ideologues, or maybe they see the ideology of Nod as up for grabs, but there is also an "information war" going on, with some Western journalists (including a grown man who acts like he is secretly wearing a Che Guevara shirt under his clothes) secretly sympathetic to Kane. The story of Red Alert is a lot more sci-fi than the first Command & Conquer, and the tone feels more over the top and silly. The gameplay is basically the same. The biggest difference is that Red Alert works with higher-resolution monitors than 640x480. You play with the mouse, with one mouse button only, and macro/unit production is centralised and streamlined, in a UI panel off to the side of the screen. The 90s grunge/post cold war/20 minutes into the future story of Command & Conquer felt a lot more interesting than the alt-history cold war gone hot story of Red Alert.

WarCraft II: If you didn't play the C&C franchise first, you started with WarCraft II. The game design, user interface, and tone of WarCraft II are completely different from C&C, and yet both are recognisably RTS games. You control workers to build things. There are multiple resources. You need to select a production building to produce units. The tone is "generic Tolkien-esque fantasy". Where C&C has two different factions that are differentiated by themes and mechanics and only mostly balanced, WarCraft II has two factions that are the same with different skins. Where C&C has a map that needs to be discovered, WarCraft II has fog of war. WarCraft II uses the left mouse button to select units, and the right to give orders. Where C&C has full-motion video cut scenes, WarCraft II has a couple of rendered sequences.

Z: This game has far fewer mechanics than the other two. There is no fog of war. There are no resources. Both players have the same units. You capture buildings by capturing map sectors, and buildings produce units automatically. The result is a frantic scramble of micro-management, and the outcome of a game is often decided in the first half of the first minute: If you capture more than half the map, you win. The single-player campaign always feels like a multi-player match, with the enemy AI starting from the same (or roughly the same) position as the player, but ruthlessly efficient at moving all infantry units to capture sectors at the same time.

Single-Player RTS

Both StarCraft and Tiberian Sun were sci-fi games that followed up on the earlier successes of their respective developers, with more units, more game mechanics, more stuff, isometric graphics, and a darker tone. Red Alert 2 made a complete 180-degree turn, swapping the dystopian vision of a war-torn future for a colourful, campy and downright silly alternate history cold war. How do you beat Soviet giant mutant Krakens? You guessed it: Trained military Dolphins.

In terms of level design, there was not that much difference between StarCraft and Command & Conquer. Common level tropes include:

Get a new unit and build a lot of it to steam-roll the enemy.

Defend a base from waves of attackers.

You have no base and only a small infiltration force in an enemy base. The enemy has no economy and does not produce any more units. This can either be a "stealth" mission, or a "micro" mission.

The level is basically a puzzle. You have to abuse a terrain feature, a unit ability, or the range of your unit to defeat a superior force.

The level is definitely a puzzle where you have to do something counter-intuitive, like destroying one of your own units/buildings, selling all your buildings for cash, or capturing a specific enemy building.

The factions in Tiberian Sun, StarCraft, and Red Alert 2 were not balanced for human-versus-human multiplayer. The units in the original Command & Conquer weren't either. The imbalance worked differently between the games. Whereas a dozen marines from StarCraft could defeat a siege tank with some micro, basic C&C infantry units were mostly useless against armoured vehicles.

Units in C&C were slower, infantry squishier and armour tankier, and generally did more damage than units in StarCraft. Static defences in C&C were much more effective. You can overwhelm a photon cannon or a bunker with enough basic infantry, but you can't really overwhelm a Tesla coil, a gatling gun, or a Nod obelisk the same way.

This feels more "realistic". You can't just shoot an assault rifle at a fortified position in real life. To attack a fortified position, you would use air power or artillery. It also allows for a certain flavour of puzzle-like level design, where you have to infiltrate a level, or where you have to attack an enemy position from a certain angle with a certain unit, and then another position, and then another.

This is completely different from the dynamics of a multi-player game in which the enemy is always trying to win, and trying to get the most advantage, and the most use out of his units. If you carefully stage an attack and angle your units to attack a fortified position, the other player won't stand there and watch. He'll outmanoeuvre you, counter-attack, or use air units against your artillery. In Z, the computer enemy was always trying to win, in a one-on-one match.

In StarCraft on the other hand, the same type of single-player level design works differently. You don't win these levels by carefully avoiding static defence, but by pulling back damaged units, by engaging larger formations with hit-and-run tactics to split them up and destroy them piece by piece, and by using fog of war and high ground. If there is a combination of mobile units and static defence, you can try to lure the mobile units away and destroy them without the cover of the static defence, and then destroy the static defence by using units that outrange it.

In practice, both games play the same in single player. The simpler control scheme of C&C (no attack-move, hotkey to scatter infantry against area-of-effect damage, easy building menu) doesn't really matter when the player is only ever expected to have one main base, one of each building, and no outposts to mine additional resources. In C&C, the economy is simplified, and much chunkier, because you have only one type of resource, and you have one harvester per base. In StarCraft, you need to expand to tap new resources, but in C&C, your harvesters will just mine long-distance. Harassment is much more difficult, because harvesters have armour and so many hit-points. Again, it doesn't matter. If the level design wants you to expand to another base and mine more in StarCraft, it's obvious. It's not really a decision you make.

The new generation of RTS games, with isometric graphics, more gimmicks, and more units in play, relied more on level design for narrative purposes. C&C still had full-motion cut scenes, and StarCraft still had pre-rendered cut scenes every now and then, and text plus voice-over between levels, but in addition to this, both games heavily relied on scripted events and story bits that happened in-engine, during gameplay. Both games were not designed to be competitive. They were designed for the people who build one of each unit and one of each building, or the people who build 100 of one unit and then move them all across the map.

Neither Tiberian Sun nor StarCraft was designed as a competitive multi-player game first.

The StarCraft base game without the Brood War expansion was not balanced for competitive play. As the competitive metagame for Brood War was developing, Red Alert 2 was developed. It was based on the same engine as Tiberian Sun, but stand-alone.

Depth of Micro

The addition of units like Lurkers, Medics, and Dark Templar led to Brood War becoming more strategically balanced and interesting.

Brood War wasn't as streamlined or as optimised as Tiberian Sun, to the point that they made the unit cap a game mechanic. Tiberian Sun had horribly janky pathfinding, full of little special cases to cut corners, but Brood War had aliens running around in an alien fashion. StarCraft ran at a lower resolution, with the pixel-art units being bigger on the screen.

Brood War was more micro-intensive, or rather APM-intensive. Even the economy needs a certain number of actions per minute. You have to build workers, build depots, select buildings, build units, select more buildings, build units, and so on. Instead of building a single harvester, you build workers, and assign them to bases. Long distance mining is not the norm, because it is very inefficient.

The animation/movement system of StarCraft was janky in a way that allowed players to micro certain unit types to shoot while moving only when micromanaged correctly, or to stutter-step with the right hot-keys. This clearly wasn't intentional as a game mechanic. It was a result of compromises made to make the units behave smoothly and realistically, while also feeling responsive.

All this only matters if micro matters. If units move too slowly for micromanagement to be useful, or if they deal so much damage at once that you can't pull back a damaged unit, or if they deal so little damage that every battle becomes a battle of attrition and slow players can pull off complicated management of health bars, then micro would be moot.

Breadth of Strategy

In Brood War, you can go for a rush, economy, tech, or a mixture where you apply some pressure and expand or tech behind it. You have many tactical and strategic options. You can harass the mineral lines, contain the enemy, maintain map control, scout tech buildings, troop numbers, and army position. Between containing the enemy and map vision, there is the harder to describe concept of "map control". Finally, you can counter-attack.

The game design of Red Alert 2 leads to a much simpler strategic balance. You have fewer harvesters that are easier to defend, so harassment is less of a thing, and expansions are less of a thing. On top of the heavy armour, harvesters are either armed (Soviets), or can be quickly teleported (Allies) to safety. In general there is a tremendous defender's advantage, because static defences are strong and easy, if slow, to repair. There are fewer counters to turtle strategies, because you can't really expand when the enemy turtles, there are no unit upgrades, and there are no damage/armor upgrades. Infantry can be stationed in buildings, which gives the weak and squishy infantry an enormous defender's advantage.

The stealth/burrow/detection system of StarCraft reifies the single-player "infiltration mission" into a game mechanic, one you can use in a multi-player game! There is no such thing in Red Alert 2.

Fun

For many people, competitive StarCraft is not fun. It's exhausting. Blizzard took a lot of the incidental complexity and depth of micro of StarCraft out for StarCraft II, trying to create a competitive metagame similar to Brood War, but without streamlining unit production to a menu and controls to one mouse button in the vein of Command & Conquer. I have played a lot of StarCraft II, and the ladder always felt stressful.

When designing a game, you don't always aim for a thinky atmosphere where every turn takes minutes. Sometimes you design to limit the information horizon, so players quickly end their turns and make one move after the other. Sometimes you design for quick action, long-term planning, tactics, or even mood.

Crushing the enemy is fun. Using micro-intensive gimmicky units to cheese the computer can be fun. Commanding armies of hundreds of units can be fun. When the game allows it, pausing a real-time game and thinking through a situation, or thinking through a series of moves in a turn-based game, can also be fun. It all depends.

Red Alert 2 has turned up the defender's advantage, and reduced the frustration of harassment. In multiplayer, early infantry units are weak offensively. There is no cannon rush, bunker rush, or zergling rush. All of this allows weak players who like to build one of each unit to stay in the game until at least the mid-game, and it ensures that a game between two mediocre players will end in a clash of two massive armies, or a long, drawn-out war of attrition.

With its overall more streamlined controls and mechanics, and a narrower band of strategies, Red Alert 2 has also narrowed the gap between knowing "how the pieces move" from playing the single-player campaign and the competitive multi-player game. Of course, this gap is even wider in Chess itself, the origin of "I know how the pieces move". There is more than one way to narrow this gap. Stronghold focused on defence (like the later single-player Bad North), and Z made the single-player missions more like multi-player games (like the later Swords & Soldiers).

Allowing the player to garrison infantry in neutral buildings, and making static defences stronger lets players progress towards the late game, massing armies, and building Krakens and an Apocalypse Tank.

In Brood War, on the other hand, often the best strategy to counter your opponent was to build Zerglings, or Marines and Medics. The way you lost was by getting your workers roasted by Firebats. It's just not as fun as Krakens and Tesla Coils.

#game design #real time strategy #brood war #red alert 2

UFO 50: A Jam for Everybody Else

Before I get started, let me re-iterate: UFO 50 is great. I think you should buy it. Even if not all the games can be winners, there's enough in there to justify the purchase price. There's something in there for everybody; everybody puts different games into their S tier; every game is in somebody's A tier and in somebody's D tier.

Most games in UFO 50 could be better. Some are even bad on purpose. Well, they aren't bad on purpose, but they feel outdated or incomplete or janky on purpose. There's always an excuse: There is enough content for an arcade game, they had to work with the limited control scheme, the game has the difficulty and pacing of an arcade port, the game is supposed to be "Nintendo hard", this is supposed to look like it was developed for an 8-bit home computer, the game design is deliberately clunky to evoke the 1980s, the game design is deliberately clunky in certain places because this is part of Chun+Smolski's early work, the game is unpolished because it makes sense in-universe...

Some games are 80s on purpose, and some games are 80s as an excuse to make a game design work, or to paper over the reason it doesn't.

And what the hell do you want? There are 49 other games if you don't like this one!

For every one of these excuses, there are probably two games in UFO 50 it applies to, and two more games where this limitation was ignored. There are games with modern game design. There are games with modern-looking pixel art. There are games that aren't "Nintendo hard", games with the amenities you would expect of modern games, but not old ones from back then.

I just can't break out of the mindset of rating a game jam when playing UFO 50. More than with other games, I immediately see the seams, and I can't turn off my critical eye. It's just what happens when I play these small, self-contained games, games that feel like they are fun, but just wouldn't work as well if they weren't small.

It doesn't help that I harbour no nostalgia for the NES. I didn't make the connection with Blaster Master or Battletoads or Cheetahmen. I remember the 2000s shareware game Battle Painters.

On the flip side, it feels like there are games that are missing from this collection, games that are more like Ultima III (like the modern PC game Fit for a King), or Rogue, Giana Sisters, Ballerburg, and Tetris. There should be more games with split-screen multiplayer.

The games in the UFO 50 collection are often deliberately made to feel like they originally came with a printed manual. I remember the manual for WarCraft II. I still have it somewhere. It lists every unit. I have a strategy guide for Pokemon Red, or rather, I used to. It's somewhere in an attic with my parents.

If I make a game for a game jam, I make sure that the game can be beaten in 20 minutes by a tired game developer who wants to rate 20 other games today, or in 5 minutes on the second play-through. I don't design for a missing manual. At the very least, I make sure that all the content and major game mechanics aren't gated behind a difficult first level or a difficult second level boss. I would rather make the level-select screen open from the start, so the player can see everything and form an opinion.

UFO 50 is not a game jam. It just feels like one.

This dynamic also explains why UFO 50 is both over-appreciated and under-rated. Game designers love it, and love to gush about it. Players often get frustrated by it. If you're a game designer, it's like a game that gives normal people the feeling of being in a game jam (the second half of it, anyway). I still think game designers appreciate UFO 50 a little too much. Some of you are just too modest. You could probably have developed your own take on "Ninpek" or "Porgy"!

This divide becomes clear when developers and players make tier lists of games. In the initial reviews of UFO 50, reviewers unfamiliar with games like A Good Snowman highlighted games like "Grimstone" and "Block Koala", because "Grimstone" has 30 hours of content, and "Block Koala" has 50 levels, like a "full" game. And yet, "Block Koala" is kind of bad. It faithfully mimics the visuals of Adventures of Lolo, but there's an undo functionality that feels out of place in the 1980s, and way too modern. The puzzle design reminds me (not in a good way) of the original Sokoban, and 1990s/early-2000s fan level packs (think Sasquatch for Sokoban and Chip's Challenge) with a lot of "stuff" in them.

The puzzle design of "Block Koala" also, rather embarrassingly, reminds me of my own attempts at puzzle design. This enabled me to just sprint through these puzzles. Puzzle game designers can immediately pinpoint the problems with "Block Koala". There is no "Twisty Farms" moment. The game immediately following, named "Camouflage", does everything right in terms of modern puzzle design, but it looks less polished at first glance. It drives home the point that they knew what they were doing. "Block Koala" isn't bad on purpose in order to be bad. It's bad on purpose because it's deliberately bad in the way 1980s puzzle games were accidentally bad, and it faithfully reproduces all that 1980s jank. There is a simple fix to the game feel that could have made "Block Koala" about 30% less tedious, without compromising on the d-pad-and-two-button controls, and its absence feels very deliberate.

When you ask somebody who isn't designing puzzle games, or read a cursory review, you will hear that "Block Koala" has better graphics (top-down oblique projection, like in Lolo) than "Camouflage" with its flat bird's eye view and simplistic square tiles. Players seem to find that "Block Koala" is mainly inaccessible because the mechanics aren't properly tutorialised. This is clearly not the case! Players seem to complain that the very levels that introduce new mechanics aren't preceded by levels that tutorialise these mechanics better! Many modern puzzle games let players figure out game mechanics. It's the main mode of introducing mechanics in puzzle games such as The Witness, Baba is You, or Stephen's Sausage Roll. The fiction of the "lost print manual" is at odds with the decidedly modern game design and the antiquated level design.

UFO 50 invites you to think about game design, because it is like a game jam. UFO 50 invites you to compare and contrast these games. This explains why UFO 50 was, although probably a financial success, not as celebrated as it could have been. There was a real disconnect between general gaming audiences and what I think is the "target audience" of the collection: People who were on TIGSource back in the day, people who had a NES back in the 80s, people who do game jams.

Difficulty Settings

If you play-test my game, I will listen to your complaints, take notes of your feelings, and I will investigate the causes. I will also look at your suggestions, but that doesn't mean I will implement them.

It's an accepted truth in game design that sometimes the correct response to feedback is doubling down, doing the opposite of the suggestion, or fixing an earlier problem so that the later frustration doesn't occur. If you tell me a concrete change you want implemented, I may or may not do it. Usually I won't. Players will often have contradictory feedback anyway, and implementing suggested changes won't make even half of them happy. It's more important to know why you think that would make the game better than to do what you want.

Players just don't know what the game needs.

The only exception to this rule is when somebody frames the suggestion as an accessibility issue. Immediately it becomes rude and/or politically charged to disentangle the difficulty/progression issue from the suggested fix of an easy mode and the suggested form that easy mode should take: More health, more checkpoints, a different jump, more i-frames...

The usual considerations about play-testing feedback still apply, but saying it out loud will make any developer look like a jerk. Many people will shy away from stating what is otherwise conventional wisdom in play-testing and game design. At the very least, we should de-couple the idea that a certain game needs an easy mode from the specific implementation of the easy mode.

Understanding The Assignment

When I was a child, I hated this one kind of writing assignment. Well, now that I think about it, that's misleading. When I was in school, I hated all kinds of writing assignments, like "the best day of my summer holidays", but I hated one in particular, and particularly strongly, above and beyond the others: "The story is over. What happens next? Please write a short story! (2 pages A4)"

I couldn't express back then how I loathed this type of assignment. To me, it showed a lack of understanding of what a story is and how it works. Why do ten year olds have to pick up where a story ended? Why did I have to write a story using all the characters, themes, and settings of a story after the tension had been resolved, after all loose plot threads had been tied up, after the mystery had been solved and the secrets revealed?

Now, as an adult, I can immediately and without difficulty appreciate why you would give children such an assignment, with clear boundaries and a starting point. I understand the didactic and pedagogical value of having children write such a story. It certainly beats quizzing children on certain plot points to check whether they actually read it. Maybe you think ten year olds would benefit immensely from a free-form fiction writing workshop – we're on tumblr after all, and somebody on here must have already done exactly that – but all in all, I am perfectly happy today to concede that it would have been overkill.

But back then, this type of assignment didn't just feel like busywork, it didn't just feel like I was being set up to fail by my teachers – teachers who never explained why this kind of homework was important or what we were supposed to learn from it – it felt so stupid and pointless, I had no choice but to conclude that my teachers didn't understand what makes for a good story. I had no choice but to conclude that nothing my teachers thought about books that didn't come from the actually smart people via the blue Reclam booklets (the American equivalent would be CliffsNotes) was worth paying any attention to. They couldn't write their own short story to save their lives! They already had no taste in books, and they couldn't really articulate why they read the books they read in their own time, and why we shouldn't read those books instead of the books on the syllabus. And now they made us write a story?

I would have loved to critique some of their stories.

I once asked my teacher about her best day of the holidays. Her answer was: "I read a really good book at the beach."

Sometimes I read something written about computer games, by an ostensibly smart and educated person, a games critic, a writer, and I get the same feeling. I start to doubt whether they actually understand what a game is and how it works. The opinion I am looking at just seems so misguided, and it seems to indicate a fundamental misunderstanding of what games are. Again, I could be wrong. Maybe there's a really good reason for people to talk about games like that.

But more and more, I see people talk about the plot and the world-building of games, the characters and their motivations, when the first question should be: What do you do in the game?

You can just imagine, if you're my age at least, the stereotypical situation where your friend has invited you to his home to play some Super Smash Bros on his brand-new Nintendo 64, and his mother walks past the living room TV and asks "Who are these creatures and why are they punching each other?"

If I switched Super Smash Bros out for Mario Kart 64, and had the mother in this situation ask instead "Where are they driving to?", then you could have called this scenario "contrived", and you would possibly have armchair-psychoanalysed me for my opinion of mothers. But that's what it feels like to me, if you ask "Where do the power lines go?" or "What do the NPCs eat?" or "Why isn't there a bathroom in that house?"

There are other ways to misunderstand what games are about. A large number of mods just implements things that were intentionally left out.

The German state television station ZDF dedicated one whole whopping minute to Silksong last month. The review mentioned that the game is crafted with a lot of attention to detail, that you play as Hornet (she's a bug with a sword), that it's a side-scrolling platformer, that there's combat, and the review also mentioned that the game is very difficult, and there are no difficulty settings – the default mode is hard mode, and there is no "normal" mode or "easy" mode. That's it. I wonder who that review is for. I have no doubt that the games journalist who produced the review played more than a minute. He probably put in a lot of time, but the end result is a review that doesn't really help people who played Hollow Knight, and that doesn't really help people who didn't play Metroidvanias.

I have no doubt that the games journalist who reviewed the game, as well as everybody who touched that segment and all the other journalists and producers who cover games for that station, actually understands what Hollow Knight is. What I don't understand is the coverage. Who is it for? They might as well have said "Silksong is out. If you know, you know." – Or they might have done a six-minute piece about the global cultural phenomenon. They could have dedicated more than three minutes to gaming that month, or taken some air time from the other games they covered in these three minutes. I still remember the one-minute review for Elden Ring (same problems), and the one-minute review for Detroit: Become Human (decent, actually). I think the latter worked quite well in describing what the game was about, because they just explained the rough outline of the story. That particular game is its story.

"If only you could talk to the monsters" has become a punch line for a reason, but I can't help but think that there are people out there who are primarily writing about games for the people who want to skip the boss fights in Titan Souls, or who think it would be neat to talk to the monsters in Doom, but they would never play Doom either way.

I think these people cover games based on the expectations of people who have never played them and never will, people like my old teacher. I think these people were really good at writing these "What happens next" assignments in school, and "The best day of my summer holidays". I think they are really good at writing, and knowing what the audience expects.

blubberquark

Hollow Knight Design

Silksong is in many ways just more Hollow Knight. It's bigger, more polished, and more difficult, but it's just more Hollow Knight, after Hollow Knight ended. Whenever the game design of Silksong differs from Hollow Knight, you can usually attribute the difference to one or sometimes two of three reasons: In-universe reasons, doubling down on what made Hollow Knight special, and trying to be different for the sake of being different.

Maybe that sounds too dismissive. Silksong is a game that's very "conservative" in the changes it makes to the Hollow Knight formula, with the same story beats, structure, and visual style. The few mechanical changes can be traced back to narrative/lore reasons, with only a handful of genuinely new systems: Quests, crests, tools.

This closeness in design to the original invites comparison, and it makes the differences really pop out at you. When a difference in the experience can be clearly attributed to a specific mechanic or design decision, like the diagonal dash or the two currencies, then it's easy to criticise that decision. Team Cherry could always just have stuck with the tried-and-true from Hollow Knight. It's right there. It's easy to compare and contrast.

But what is the tried-and-true? What is the design of Hollow Knight?

Metroidvania

Before we get to Hollow Knight itself, we need to keep in mind that Hollow Knight is based on, or inspired by, two genres: Souls-likes (Dark Souls in particular) and Metroidvanias. Hollow Knight takes the overall mood and difficulty from Dark Souls, as well as many mechanics/dynamics. There's the emphasis on dodging and parrying, the way healing works, resting at benches, weapon upgrades with Pale Ore. Hollow Knight wears that inspiration on its sleeve. The influence of the Metroidvania genre is much stronger, but much more diffuse. There is no single Metroid or Castlevania game that is referenced in particular, no wall chicken, no morph ball, none of that kind of thing. The influence of the Metroidvania genre is clearly there, not hidden or obfuscated, in the moment-to-moment gameplay and in the map, and it is much stronger than the influence of Dark Souls.

In a Metroidvania, you explore a large 2D map that is organised into rooms, and you find secrets and fight bosses to gain new movement abilities. These movement abilities allow you to get to new areas of the map and fight more bosses. Common abilities are dash, double jump, swim, walk on water, dive into water, walk on spikes or other floor hazards, a skill to destroy certain obstacles, and so on.

Hollow Knight

In Hollow Knight, bosses gate some new areas. Not all areas are gated by bosses, and not all bosses gate new areas, but unlocking a major part of the map usually depends on beating a boss. When first exploring new areas, progress is gated by arenas that spawn non-boss enemies. Progress to new areas is usually not gated by movement upgrades, but by bosses. Movement upgrades aren't rewards for boss fights, but for platforming sections. Platforming sections usually have an easy way back after you beat them. Doors that open from the other side are preferred to passages that visibly need a double jump or a similar movement upgrade.

When possible, Hollow Knight uses mostly doors openable from the other side, with only a few mobility based gates. When exploring a new area for the first time, it is a lot more linear than later, because of all the one-way doors.

The earliest parts of the game are designed around a limited set of moves. Early game platforming and bosses are designed around jumping, and then dashing. Then there's the early to mid-game, with jumping, dashing, downward slashes, and wall-jumps. You can pretty much get to the final boss having seen 70% of the areas, half of the bosses, and a third of the collectibles.

In Aria of Sorrow, progress is mostly not gated by doors that can be opened, but by movement abilities and souls. These are equippable skills that can be dropped by enemies. These are most comparable to Hollow Knight's charms. You can farm regular enemies to gain XP and level up, or to get the souls they drop.

In Fez, progress is at first gated by the number of cubes collected, and later by exploration itself.

In Aquaria, bosses usually reside in "temples", self-contained areas you enter at the entrance and progress through in a linear fashion.

The Standard Boss Move Set

The level design of Hollow Knight makes it possible to use fewer obvious "come back later when I have double jump" move set gates. Instead, the player comes back later from the other side, opening a one-way door into a permanent shortcut back to and from an earlier area.
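The one-way door idea can be sketched as a small graph problem. This is only an illustration, not how Hollow Knight actually stores its map: the room names, the `CORRIDORS` and `ONE_WAY_DOORS` tables, and the `steps` function are all made up for this example. A door can always be walked through from its "open" side, and only becomes a two-way shortcut once it has been opened.

```python
from collections import deque

# Hypothetical rooms, not taken from any real game map.
CORRIDORS = [("crossroads", "cavern"), ("cavern", "overlook")]  # always two-way
ONE_WAY_DOORS = [("overlook", "crossroads")]  # (openable_from, leads_to)


def build_edges(opened):
    """Build the walkable connections, given the set of doors already opened."""
    edges = {}

    def link(a, b):
        edges.setdefault(a, set()).add(b)

    for a, b in CORRIDORS:
        link(a, b)
        link(b, a)
    for door in ONE_WAY_DOORS:
        a, b = door
        link(a, b)          # always passable from the side that opens it
        if door in opened:
            link(b, a)      # permanent shortcut once it has been opened
    return edges


def steps(start, goal, opened=()):
    """Breadth-first search: fewest room transitions from start to goal."""
    edges = build_edges(set(opened))
    dist = {start: 0}
    queue = deque([start])
    while queue:
        room = queue.popleft()
        if room == goal:
            return dist[room]
        for nxt in edges.get(room, ()):
            if nxt not in dist:
                dist[nxt] = dist[room] + 1
                queue.append(nxt)
    return None  # goal is not reachable
```

Before the door is opened, the first trip from "crossroads" to "overlook" has to go the long way through "cavern". After the player opens the door from the "overlook" side, the same trip is a single step, in both directions, without any movement upgrade being involved.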

All this is deliberate. If you could grind enemies to get stronger, then there would be more constraints on the order you fight bosses in. If you could sequence-break to gain the double jump early, or if you could sequence-break and fight a certain boss before you get the shadow dash, then some bosses might have to have much more "slack" in their boss fights to make them beatable.

The use of openable doors in the early and mid-game (and also the late game, but that doesn't matter here) instead of movement upgrades and

As is, boss fights and map design interact to provide three things:

Every boss fight can be beaten at the earliest moment you could encounter it. (And you can't grind for it anyway, you can only explore)

Every boss fight can be beaten flawlessly, without taking a single hit of damage. (And every hit you take is one mistake)

It is never in your interest to deliberately tank a shot or even deliberately risk getting hit for getting one hit in. (And it is never possible to beat a boss with pure DPS or ranged attacks only)

When you meet the Mantis Lords, you have all the tools you "need" to beat the Mantis Lords, but it can't hurt to gain more charms, money, mask shards, and nail upgrades before you try again. There are more ways to progress than just fighting bosses: Exploration and platforming challenges. You can get to Deepnest another way and never fight the Mantis Lords, but on the flip side, you may want to fight the Mantis Lords early.

The Big Picture

The deliberate and linear streak of the design in the small (where the game wants you to explore an area in a certain order, but then it opens up and you can go in every direction the next time) is counterbalanced by massive freedom on a larger scale. You have the freedom to explore areas in almost any order because progress isn't that often gated by movement abilities. All bosses can be tough but beatable because of the way progress works, because of the way the map works, because of the way the hit point system works. In Hollow Knight's mid-game you can make incremental progress in every direction for far longer than you might have thought.

In older Metroidvania games, using abilities to gate new map areas cut down on large-scale game state. Moment-to-moment gameplay may need a couple of kilobytes to describe, but large-scale, what you really need is some stats, some inventory slots, and eight bits that describe whether you have found the double jump, morph ball, water walking, and so on. This doesn't just cut down on save-game size, although I suspect this used to be a concern in the 8-bit era and on handheld gaming devices, but it also cuts down the game state you need to think about in designing the game. Unlockable doors were used sparingly in older Metroidvania games. Now that I think about it, Aria of Sorrow saves how much of the map you have explored, which items you have picked up, which bosses you have beaten, and I think some kind of pokedex/bestiary. Save-game size probably didn't factor into it that much, but design space absolutely did.
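To make the "eight bits" concrete, here is a minimal sketch in Python. The ability names, the `Ability` enum, and the helper functions are assumptions made up for this illustration. They are not the save format of any real game.

```python
from enum import IntFlag


class Ability(IntFlag):
    # Hypothetical abilities; one bit each, eight bits in total.
    DASH = 1 << 0
    WALL_JUMP = 1 << 1
    DOUBLE_JUMP = 1 << 2
    SWIM = 1 << 3
    MORPH_BALL = 1 << 4
    WATER_WALK = 1 << 5
    SPIKE_WALK = 1 << 6
    BREAK_WALLS = 1 << 7


def can_enter(required: Ability, owned: Ability) -> bool:
    """A gate opens when every required ability bit is set."""
    return (owned & required) == required


def save_byte(owned: Ability) -> bytes:
    """The whole large-scale movement state fits into a single byte."""
    return int(owned).to_bytes(1, "little")


def load_byte(data: bytes) -> Ability:
    """Turn the saved byte back into a set of abilities."""
    return Ability(int.from_bytes(data, "little"))


owned = Ability.DASH | Ability.WALL_JUMP
```

With ability gates only, every question of the form "can the player get in here yet?" is a bit mask check like `can_enter(Ability.DASH | Ability.DOUBLE_JUMP, owned)`. Every openable door adds another bit of state that the designer has to keep in mind.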

Hollow Knight cares about the design space of bosses, and spends its design budget on making sure bosses all work as intended.

There is a certain elegance to using double-jump height as a gate for new areas, or in putting the double jump ability in a pit two jumps deep, so that the player needs to "take the plunge" and hopefully gain a new movement ability to get out the same way. Hollow Knight usually doesn't do that. Hollow Knight usually has platforming challenges with a prize at the end, and then the way back is a long corridor with a door that can be opened on the way back.

Due to the size of the map in Hollow Knight, there is still a lot of backtracking, but theoretically, this design is supposed to cut down on backtracking. You don't have to wait and come back two or nine or twenty hours later when you have the double jump, because at that point, you'll double back from the other side anyway. The Castlevania games, as well as Blasphemous and Bloodstained, also have huge maps. The difference between the mapping system of games like Aria of Sorrow (sorry for coming back to this one in particular so often, but I really like it) and games like Hollow Knight and Aquaria (totally different in tone, gameplay, and boss design) is that Aquaria doesn't tell you how big the world is going to end up. I forgot how Super Metroid works, but I think it's set up so you don't know at first how big the final map is going to be. I have the game on my SNES mini, but I can't be bothered.

On the negative side, this kind of design means that you can't really sequence-break. You can sequence-break to some extent. You can access areas out of order, and you can jump off enemies in certain places before you have the double jump. The design of one-way doors/one-way destructible walls even ensures that you can double back and open a path sometimes without getting stuck. By and large though, there is no intended route that lets you get the double-jump ("Monarch Wings") before dash or wall-jump. You can do mid-game things in the mid-game in any order, or you can skip some mid-game content and do it later, but the mid-game will always be the part where you have wall-climb and dash.

As far as I can tell, Silksong has kept all these design decisions in place, for the same reasons.

blubberquark

I know this post would really have benefited from some screenshots and maybe hand drawn maps, or arrows scribbled on in-game maps, but then it wouldn't have been something to write on a weekend, but something I would first have to re-play Hollow Knight for, but also Aquaria, and Aria of Sorrow. If I was writing this Tumblr as a job, it would take me a week to properly research this post. If I was doing this as a YouTube video, I would probably spend another week getting footage from Symphony of the Night, Bloodstained, Super Metroid and Environmental Station Alpha. Other people have already done the work of mapping these games out topographically, but it would be interesting to make a map of the power-up space as a directed acyclic graph that should look like the visualisation of a partial order.
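A rough sketch of what such a power-up graph could look like in Python follows. The prerequisite table is hypothetical and deliberately tiny. It only encodes the claim from the post that the double jump comes after dash and wall-jump, plus one unrelated ability for contrast. The names `PREREQS`, `ancestors`, and `must_precede` are made up for this example.

```python
from graphlib import TopologicalSorter

# Hypothetical table: each power-up maps to the power-ups needed before it.
PREREQS = {
    "dash": set(),
    "wall_jump": {"dash"},
    "double_jump": {"dash", "wall_jump"},
    "swim": set(),
}


def ancestors(prereqs, item):
    """Everything that has to be collected, directly or indirectly, before item."""
    seen = set()
    stack = list(prereqs[item])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(prereqs[current])
    return seen


def must_precede(prereqs, a, b):
    """True if a comes before b in the partial order."""
    return a in ancestors(prereqs, b)


# One of possibly many valid pick-up orders (raises CycleError if there is a cycle).
order = list(TopologicalSorter(PREREQS).static_order())
```

Two power-ups where neither `must_precede` the other, like "dash" and "swim" in this table, are incomparable in the partial order. The more incomparable pairs a game has, the more freedom the player has in the order of pick-ups.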

It would take the same time it took me to beat Hollow Knight blind for the first time to re-play 30 mini Metroidvania games from past Ludum Dare jams (sadly, early ones like Leaf Me Alone (LD 26) and Pocket Planet (LD 23) seem to be completely wiped off the Internet), and to compare the progression/map design of a "full" Metroidvania that isn't as difficult as Hollow Knight with these 30.

A Ludum Dare metroidvania is usually really linear, and you backtrack a lot simply because the only thing that gates content is power-ups, and the map is so small you can just let the player backtrack through everything to re-explore (or half of everything if the player knows where to go) every time.

Hollow Knight is non-linear in many ways, except with movement power-ups, and the first time you explore a new area to get to the map guy. In the time it would take me to beat it again and compare it to others, I could program multiple games on the scale of Pocket Planet!

#game design #hollow knight

UFO 50 Through Designer Eyes

UFO 50 was my Game Of The Year 2024. Even if it isn't yours, it's probably in your top 3 for best indie game of 2024, right behind Animal Well. It's that good. If you haven't played it, you should.

(Note on formatting: I will put titles of games in italics as usual, but games that exist within the fictional world of UFO 50 will be put in italics and quotes, like "Barbuta")

The premise, or if you're trying to sound mean, the conceit of the game is that it isn't a game, but a collection of 50 games developed for the "LX" series of computers in the 1980s. It is kind of an inversion of the "fantasy console" concept: Instead of having the specifications, hardware limitations, and developer tools for a fantasy console, we have all the games that were ever developed for it. The games all still follow the same limitations you would expect from a fantasy console, with limited gamepad buttons, a limited/fixed colour palette, limited/fixed resolution, and chip-tunes, but they aren't actually developed for a fantasy console, they all just look and feel as if they were developed for it.

UFO 50 isn't a mini-game collection. It's a game collection. The games are all designed as if they had been developed in the 1980s, for a series of computers that look like the Apple Macintosh, with hardware limitations that first evoke the ZX Spectrum, and then Commodore C64 or even the NES. There is a "backstory" there, and you see the games get bigger and more elaborate and more and more "modern" or "professional" over time. The first game is bare-bones and clunky, like a labour of love by somebody who knows how to program but not how to design games, but the second is much better designed, yet still bare-bones. It goes forward from there. Soon games have music, a main menu, end credits, and sometimes multiplayer. Not all of the games have a massive scope, but most of them feel like they could have existed on the NES or C64 in some form, and their scope and design sensibilities often follow what you would have expected from an NES game released in the mid-1980s, or a home-brew C64 game.

Not all games can be winners, but there's remarkably little "filler" in there, to get to the round number 50. I would have expected one uninspired Tetris clone. Even if you dislike some games, there is something in there for everybody, and many people are especially fond of the games I liked least. The games I liked most are "Camouflage", a puzzle game that feels equal parts retro in its presentation and modern in its thinky design, "Bug Hunter", a Brough-like reminiscent of Into The Breach, and "Avianos", a turn-based strategy game with automatically executed RTS-like real-time battles.

The collection seems remarkably well thought out, because none of these games are Tetris – but one is Snake, one is Pong, and you would never have thought about it like that until I told you just now. The games all fit into the world-building of "LX Software", later "UFOsoft", and they have in-universe developer credits and trivia about the development history.

The mix of short games, long games, easy games, difficult games, thinky and twitchy games works really well. Some games don't have a save-game function, so they always start from the start, but some do have save-games. The games with a more arcade-ish difficulty curve are designed to be played from the start, so it still works.

With all that said, the longer I played, the harder it became to suspend my disbelief, and soon I found myself wearing my game designer hat, playing all these little games like the entries in a Ludum Dare competition. These games have issues. They could be better. It's not an excuse that there are so many other games that are good! If some of these games were Ludum Dare games, I would advise not to develop them into commercial releases.

Some of the games have quite limited graphics. Some games are suffering under the constraints of the control scheme. Some games are quite short in terms of "content", in the sense that they have few levels, some of which are made more difficult to stretch out the play-time. Some games are deliberately made "Nintendo hard". A few games really feel like filler. "Camouflage" could have had more "third dimension" in its graphics, instead of these flat tiles. "Rock On! Island" would have benefited from more varied tower abilities and more information about damage types in the in-game UI. "Party House" feels like it was designed as a Flash game played with a mouse first. It really feels like it wants you to use a mouse.

In that same vein, "Magic Garden", "Block Koala", "Mooncat", "Warptank", and "Devilition" all feel like the opposite of filler, as if they had been designed independently, before UFO 50.

Many games in the collection feel off-the-wall, goofy in a way that old games felt. Sure, there are also modern off-the-wall games, like anything by David Cage or Hideo Kojima, but modern AAA games are trying to feel more grounded, and modern indie or mobile games are often self-aware or even self-conscious in their weirdness. The games in UFO 50 lean into the weirdness without feeling ironic or self-conscious about it. This works, tonally. There is a game where you are a golf ball exploring a world, a game where you are an alien delivering onions, a game based on Shuffleboard, and a game where you are inexplicably playing as owls. And then there's the flying walrus.

It also works, in-universe, that many games are lacking in tutorialisation. When these games would "originally" (in-universe) have been sold, there would have been a manual with a strategy guide in the box with the floppy disk. There is a short description of the controls in the game select screen or the pause menu, but some games, like "Avianos" and "Bug Hunter", just throw a lot of mechanical complexity at the player in the first level, or they are deliberately designed for the player to figure out, like "Planet Zoldath", "Mooncat", and "Barbuta" – "Night Mansion" is also designed to be figured out, but it's a point-and-click adventure where that kind of thing is to be expected. Although "Rock On! Island" and "Avianos" don't introduce units and mechanics one by one throughout their campaigns, "Attactics" and "Lords of Diskonia" do.

Many games in the UFO 50 collection have old-school design flaws, except when they don't. Many games in the UFO 50 collection are short, or shorter than they should be, but some are unreasonably long. If they had been made for Ludum Dare, I would have commented that some of the games felt really interesting, and quite long for a two-day game jam, but mechanically not interesting enough to develop into a commercial release. I have literally seen a game that has the exact same premise and game mechanics as "Porgy" done in a Ludum Dare.

In the context of the game, a lack of a mini-map means you'll have to consult the printed manual, or draw your own map on paper.

I don't mind that "Camouflage" feels like a way to recycle or salvage a game idea plus some already designed levels that can't be developed into a stand-alone release. I mind it when an author tries to pre-emptively guard his work against criticism. It's one thing to lamp-shade, to have a character notice and mention that a situation he is in is unrealistic. It's another thing to have a "critic" character in your work, to anticipate and acknowledge criticism but at the same time depict potential critics in a bad light. The worst kind of guarding against criticism is making something "not for critics" or "bad on purpose".

I have some games of my own that are un-salvageable. They aren't just short and have mainly novelty going for them. To be un-salvageable, the main game mechanics have to be anti-fun. Sometimes I try something out, and I realise adding more stuff to the game will only stretch out the anti-fun. I can't just make 50 anti-fun games and jam them together into a collection.

I don't begrudge Mossmouth the success they had with UFO 50. I think it is a clever trick to salvage games and game ideas that are good, but short. It's certainly better than trying to turn "Kick Club" into a full release. I just can't shake the feeling that "Combatants" is bad on purpose, and that "Rock On! Island" and "Block Koala" are designed the way they are because the developers were more concerned with making a game that fits with the game design sensibilities of the time, and a game that noticeably pushes against the limitations of the control scheme, than with making a better game.

Still (and that's the most maddening part of it all) it works. "Barbuta" feels janky and dated on purpose. I psyched myself out playing "Block Koala" because I was expecting the puzzles to have more modern, more "clever" solutions. "Block Koala" feels like Sokoban used to feel, and not at all like Stephen's Sausage Roll.

UFO 50 is bad in so many ways, but it always seems on purpose. In making these games janky and deliberately not as good as they could have been, the dev team may have made the right call.

#UFO 50 #game design

I haven't played Silksong (yet), but based on first principles, I think nearly every time it's better to release a balance patch than an easy mode, if the choice was between the two. But making Hornet start with two or three extra hit points is way easier to do on short notice than changing the geometry on platforming sections where people fall into spikes over and over again, or changing bosses to telegraph their attacks better.

A game that does "easy mode" well is Mobility by auroriax. Many games do "easy mode" poorly, and many more games do not have an "easy mode" because the developer knows a poor easy mode makes the game worse, and a proper easy mode would be a lot of work. Online commenters make it look so easy: Just give me more hitpoints!

Going by the balance of Hollow Knight again, I think that Hollow Knight, post patches and DLC, was balanced with the end game in mind, because by that time everybody had beaten the game or quit, and it didn't make sense to add new content to Greenpath at that point. You could of course side-step all this if the easy mode is disabled (or partially disabled) when you enter a DLC boss arena. Silksong on the other hand is new, and it seems to have been initially balanced for people who have played the first Hollow Knight before, but not balanced solely around the endgame of Hollow Knight. This time they had more opportunity to add more stuff to the mid-game. Upcoming DLC will probably focus on the endgame with post-100% optional bosses again, but based on what I heard, Silksong has more stuff in the early and mid-game than Hollow Knight, and the balance of Silksong doesn't presuppose that the player beat all the optional bosses of the DLC before starting the next game. The time to add more stuff to the early game is before the release, not in DLC anyway.

Given the years of work that went into Silksong, I find it more likely that individual bosses, hitboxes, timings, or platforming sections will be tweaked, and not the jump height or damage of the main attack of the player character, or the number of charm slots.

shituationist

Zeppelin tech is only going to get better, so it's paramount for businesses to learn how to use zeppelins, and experiment with zeppelin tech in their workflows. If you don't experiment with zeppelins in your business, you're at risk of being left behind. One company found that the introduction of zeppelin tech into their stacks increased productivity by 10x.

blubberquark

This is much funnier than my comparison to consumer 3d printing.

blubberquark

Why Not Write Cryptography

I learned Python in high school in 2003. This was unusual at the time. We were part of a pilot project, testing new teaching materials. The official syllabus still expected us to use PASCAL. In order to satisfy the requirements, we had to learn PASCAL too, after Python. I don't know if PASCAL is still standard.

Some of the early Python programming lessons focused on cryptography. We didn't really learn anything about cryptography itself then, it was all just toy problems to demonstrate basic programming concepts like loops and recursion. Beginners can easily implement some old, outdated ciphers like Caesar, Vigenère, arbitrary 26-letter substitutions, transpositions, and so on.

The Vigenère cipher will be important. It goes like this: First, in order to work with letters, we assign numbers from 0 to 25 to the 26 letters of the alphabet, so A is 0, B is 1, C is 2 and so on. In the programs we wrote, we had to strip out all punctuation and spaces, write everything in uppercase and use the standard transliteration rules for Ä, Ö, Ü, and ß. That's just the encoding part. Now comes the encryption part. For every letter in the plain text, we add the next letter from the key, modulo 26, round robin style. The key is repeated after we get to the end. Encrypting "HELLOWORLD" with the key "ABC" yields ["H"+"A", "E"+"B", "L"+"C", "L"+"A", "O"+"B", "W"+"C", "O"+"A", "R"+"B", "L"+"C", "D"+"A"], or "HFNLPYOSND". If this short example didn't click for you, you can look it up on Wikipedia and blame me for explaining it badly.
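For illustration, here is a minimal Python sketch of the scheme as described above. It is not the program we used in class; the function names (`normalize`, `vigenere`, `encrypt`, `decrypt`) are made up for this example, and it assumes a non-empty key.

```python
import string

ALPHABET = string.ascii_uppercase  # A is 0, B is 1, ..., Z is 25


def normalize(text):
    """Uppercase, transliterate German umlauts, drop everything that is not A-Z."""
    text = text.upper()  # in Python, "ß".upper() already gives "SS"
    for umlaut, replacement in (("Ä", "AE"), ("Ö", "OE"), ("Ü", "UE")):
        text = text.replace(umlaut, replacement)
    return "".join(c for c in text if c in ALPHABET)


def vigenere(text, key, sign):
    """Add (sign=+1) or subtract (sign=-1) the repeating key, modulo 26."""
    key = normalize(key)
    out = []
    for i, c in enumerate(normalize(text)):
        shift = ALPHABET.index(key[i % len(key)])  # round robin over the key
        out.append(ALPHABET[(ALPHABET.index(c) + sign * shift) % 26])
    return "".join(out)


def encrypt(plain, key):
    return vigenere(plain, key, +1)


def decrypt(cipher, key):
    return vigenere(cipher, key, -1)


print(encrypt("Hello, World!", "ABC"))  # HFNLPYOSND
print(decrypt("HFNLPYOSND", "ABC"))     # HELLOWORLD
```

This is a toy for teaching loops and modular arithmetic, nothing more.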

Then our teacher left in the middle of the school year, and a different one took over. He was unfamiliar with encryption algorithms. He took us through some of the exercises about breaking the Caesar cipher with statistics. Then he proclaimed, based on some back-of-the-envelope calculations, that a Vigenère cipher with a long enough key, with the length unknown to the attacker, is "basically uncrackable". You can't brute-force a 20-letter key, and there are no significant statistical patterns.

I told him this wasn't true. If you re-use a Vigenère key, it's like re-using a one-time pad key. At the time I had just read the first chapters of Bruce Schneier's "Applied Cryptography", and some pop history books about cold war spy stuff. I knew about the problem with re-using a one-time pad. A one-time pad is the same as if your Vigenère key is random and as long as the message, so there is no way to make any inferences from one letter of the encrypted message to another letter of the plain text. This is mathematically proven to be completely uncrackable, as long as you use the key only one time, hence the name. Re-use of one-time pads actually happened during the cold war. Spy agencies communicated through number stations and one-time pads, but at some point, the Soviets either killed some of their cryptographers in a purge, or they messed up their book-keeping, and they re-used some of their keys. The Americans could decrypt the messages.

Here is how: If you have message $A$ and message $B$, and you re-use the key $K$, then an attacker can take the encrypted messages $A+K$ and $B+K$, and subtract them. That creates $(A+K) - (B+K) = A - B + K - K = A - B$. If you re-use a one-time pad, the attacker can just filter the key out and calculate the difference between two plaintexts.
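To make the cancellation concrete, here is a small Python sketch. The two messages and the key are invented for this example, and the helper names are hypothetical. All arithmetic is modulo 26, as in the Vigenère scheme.

```python
def to_numbers(text):
    """Map A-Z to 0-25."""
    return [ord(c) - ord("A") for c in text]


def add_key(plain, key):
    """Add the repeating key to the plaintext, letter by letter, modulo 26."""
    return [(p + key[i % len(key)]) % 26 for i, p in enumerate(plain)]


def subtract(x, y):
    """Letter-wise difference modulo 26."""
    return [(a - b) % 26 for a, b in zip(x, y)]


# Hypothetical plaintexts and key, chosen only for this demonstration.
a = to_numbers("ATTACKATDAWN")
b = to_numbers("RETREATATSIX")
k = to_numbers("SECRETKEY")

cipher_a = add_key(a, k)
cipher_b = add_key(b, k)

# The attacker only sees cipher_a and cipher_b, yet the key drops out:
difference = subtract(cipher_a, cipher_b)
print(difference == subtract(a, b))  # True
```

The attacker never needs to know $K$ or even its length for this step. It only requires that both messages were encrypted with the same key, starting at the same position.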

My teacher didn't know that. He had done a quick back-of-the-envelope calculation about the time it would take to brute-force a 20 letter key, and the likelihood of accidentally arriving at something that would resemble the distribution of letters in the German language. In his mind, a 20 letter key or longer was impossible to crack. At the time, I wouldn't have known how to calculate that probability.

When I challenged his assertion that it would be "uncrackable", he created two messages that were written in German, and pasted them into the program we had been using in class, with a randomly generated key of undisclosed length. He gave me the encrypted output.

Instead of brute-forcing keys, I decided to apply what I knew about re-using one-time pads. I wrote a program that took some of the most common German words, and added them to sections of $(A-B)$. If a word was equal to a section of $B$, then this would generate a section of $A$. Then I used a large spellchecking dictionary to see if the section of $A$ generated by guessing a section of $B$ contained any valid German words. If yes, it would print the guessed word in $B$, the section of $A$, and the corresponding section of the key. There was only a little bit of key material that was common to multiple results, but that was enough to establish how long the key was. From there, I modified my program so that I could interactively try to guess words and it would decrypt the rest of the text based on my guess. The messages were two articles from the local newspaper.
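The original program is long gone, so what follows is only a rough Python sketch of the word-guessing step, under simplifying assumptions: the messages, the key, the guessed word list and the tiny "dictionary" are all invented for this example, and every name in the code is hypothetical.

```python
A = ord("A")


def add(x, y):
    """Letter-wise sum of two uppercase strings, modulo 26."""
    return "".join(chr((ord(a) + ord(b) - 2 * A) % 26 + A) for a, b in zip(x, y))


def sub(x, y):
    """Letter-wise difference of two uppercase strings, modulo 26."""
    return "".join(chr((ord(a) - ord(b)) % 26 + A) for a, b in zip(x, y))


def encrypt(plain, key):
    """Vigenère: add the repeating key to the plaintext."""
    stream = (key * (len(plain) // len(key) + 1))[:len(plain)]
    return add(plain, stream)


def guess_words(cipher_a, cipher_b, guesses, dictionary):
    """Slide each guessed word along (A - B) and keep plausible results."""
    diff = sub(cipher_a, cipher_b)  # the key cancels out, this is A - B
    hits = []
    for guess in guesses:
        for pos in range(len(diff) - len(guess) + 1):
            # If the guess really sits at pos in B, this is a section of A.
            section_a = add(diff[pos:pos + len(guess)], guess)
            if any(word in section_a for word in dictionary):
                key_part = sub(cipher_a[pos:pos + len(guess)], section_a)
                hits.append((pos, guess, section_a, key_part))
    return hits


# Invented test data; the attacker would only have the two ciphertexts.
cipher_a = encrypt("DASWETTERISTHEUTESCHOEN", "GEHEIM")
cipher_b = encrypt("DIEZEITUNGKOMMTMORGENAN", "GEHEIM")

for hit in guess_words(cipher_a, cipher_b, ["ZEITUNG", "MORGEN"], {"WETTER", "HEUTE"}):
    print(hit)
```

With this data, guessing "ZEITUNG" at position 3 yields the section "WETTERI" and the key section "EIMGEHE". The repetition inside that key section is the kind of overlap that gives away the key length. A real dictionary produces false positives too, which is why the interactive guessing step was still necessary.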

When I showed the decrypted messages to my teacher the next week, he got annoyed, and accused me of cheating. Had I installed a keylogger on his machine? Had I rigged his encryption program to leak key material? Had I exploited the old Python random number generator that isn't really random enough for cryptography (but good enough for games and simulations)?

Then I explained my approach. My teacher insisted that this solution didn't count, because it relied on guessing words. It would never have worked on random numeric data. I was just lucky that the messages were written in a language I speak. I could have cheated by using a search engine to find the newspaper articles on the web.

Now the lesson you should take away from this is not that I am smart and teachers are sore losers.

Lesson one: Everybody can build an encryption scheme or security system that he himself can't defeat. That doesn't mean others can't defeat it. You can also create a secret alphabet to protect your teenage diary from your kid sister. It's not practical to use that as an encryption scheme for banking. Something that works for your diary will in all likelihood be inappropriate for online banking, never mind state secrets. You never know if a teenage diary won't be stolen by a determined thief who thinks it holds the secret to a Bitcoin wallet passphrase, or if someone is re-using his banking password in your online game.

Lesson two: When you build a security system, you often accidentally design around an "intended attack". If you build a lock to be especially pick-proof, a burglar can still kick in the door, or break a window. Or maybe a new variation of the old "slide a piece of paper under the door and push the key through" trick works. Non-security experts are especially susceptible to this. Experts in one domain are often blind to attacks/exploits that make use of a different domain. It's like the physicist who saw a magic show and thought it must be powerful magnets at work, when it was actually invisible ropes.

Lesson three: Sometimes a real world problem is a great toy problem, but the easy and didactic toy solution is a really bad real world solution. Encryption was a fun way to teach programming, not a good way to teach encryption. There are many problems like that, like 3D rendering, Chess AI, and neural networks, where the real-world solution is not just more sophisticated than the toy solution, but a completely different architecture with completely different data structures. My own interactive codebreaking program did not work like modern approaches work either.

Lesson four: Don't roll your own cryptography. Don't even implement a known encryption algorithm. Use a cryptography library. Chances are you are not Bruce Schneier or Dan J Bernstein. It's harder than you thought. Unless you are doing a toy programming project to teach programming, it's not a good idea. If you don't take this advice to heart, a teenager with something to prove, somebody much less knowledgeable but with more time on his hands, might cause you trouble.

blubberquark

Thank you @all-hail-the-conn-8d and everyone who got me to 1000 reblogs!

#1000 reblogs #tumblr milestone #thank you
