44D AGAIN AGAIN @44dagainagain - Tumblr Blog

Two Subtle C Bugs

Let me tell you a tale of two bugs that wasted my afternoon.

One: SDL wants your vkGetInstanceProcAddr

I like using SDL for cross-platform windows, event handling, and whatnot. For everything else, I try to write my own libraries. I largely do this to understand how things work. Case in point: I wrote my own Vulkan "meta loader" (think Volk).

My meta loader tries to be simple and straightforward: each Vulkan function is a global function pointer in a single translation unit with external linkage.

Well, turns out SDL tries to be clever: before it calls dlopen, it first checks if it can dlsym the current process for vkGetInstanceProcAddr. Guess what it finds? My pointer! The problem is, I need SDL to load Vulkan (I could write my own but SDL already knows which libraries to look for) so that I can initialize vkGetInstanceProcAddr in my loader. It's a catch-22.

I got around this by renaming vkGetInstanceProcAddr in my meta loader. I thought about trying to hide the symbols but it didn't seem to work (maybe I just did it wrong).

I could have relied on the application to load all the functions that it needs instead of maintaining one big list in a translation unit somewhere. This is what SDL does under the hood. Or, instead of globals, I could have hidden them in a struct. But then your function calls don't match the docs 1:1.
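Here is a rough sketch of the rename workaround. Everything in it is a simplified stand-in: ml_vkGetInstanceProcAddr, ml_set_loader, and the function pointer types are hypothetical names, not the real Vulkan or SDL declarations. The only point is that the exported symbol no longer has the name that a dlsym lookup for vkGetInstanceProcAddr would match.

```cpp
// Simplified stand-ins for the Vulkan types (assumption: the real ones
// come from the Vulkan headers).
using VoidFn = void (*)();
using GetInstanceProcAddrFn = VoidFn (*)(void* instance, const char* name);

extern "C" {
// Before the fix this global was named vkGetInstanceProcAddr, so a symbol
// lookup on the current process could find it.
// After the fix it lives under a prefixed (hypothetical) name.
GetInstanceProcAddrFn ml_vkGetInstanceProcAddr = nullptr;
}

// Hypothetical entry point: the application hands over whatever loader
// function the windowing library resolved from the real Vulkan library.
void ml_set_loader(GetInstanceProcAddrFn loader) {
    ml_vkGetInstanceProcAddr = loader;
}

// Dummy loader used only to exercise the sketch; it resolves nothing.
VoidFn fake_get_instance_proc_addr(void*, const char*) {
    return nullptr;
}
```

The cost is the one mentioned above: call sites now use the prefixed name, so they no longer match the docs 1:1 for that function.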

Two: Stack Corruption via Struct Size Mismatch using #ifdef

This one caught me off guard. I naively thought the linker could handle this but how could it? Some libraries take advantage of the behavior.

When a function was trying to write to a struct pointer, it would raise SIGBUS on an address of a function pointer I had loaded in the same stack. This screams stack corruption but where and by who?

I had a struct member hidden by an #ifdef. If the macro was defined then the struct was 3 (8-byte) ints (24 bytes) large, otherwise the struct was 2 ints (16 bytes) large. The struct was exposed by a header which was used by the library implementation and the client code.

Turns out the client code didn't define the macro but the library did. That meant that the client code only allocated 16 bytes on the stack for the object, whereas the library code thought it was 24 bytes. When the library went to write something to that 3rd int (bytes 17-24), it wrote past the stack space allocated for the struct and corrupted the stack of the caller!

I solved this by ensuring the client code had the macro defined as well. But I could also have added an assert for the expected struct size in the library. Or I could have removed the macro altogether (seems dangerous to leave it in frankly). Or I could have made the struct opaque and relied on the library to allocate it for the client.
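A minimal sketch of the size-check idea, assuming 8-byte members to match the 24/16 byte sizes above. The two structs stand in for the two views of the same header (macro defined vs. not), and library_init is a hypothetical entry point. It is a runtime variation on the assert: the caller passes sizeof its own view of the struct and the library refuses to write if the sizes disagree.

```cpp
#include <cstddef>
#include <cstdint>

// The struct as the library sees it (macro defined): 3 x 8 bytes.
struct LibraryThing {
    std::int64_t a;
    std::int64_t b;
    std::int64_t extra;  // the member hidden behind the #ifdef
};

// The same struct as the client sees it (macro not defined): 2 x 8 bytes.
struct ClientThing {
    std::int64_t a;
    std::int64_t b;
};

static_assert(sizeof(LibraryThing) == 24, "library build expects 24 bytes");
static_assert(sizeof(ClientThing) == 16, "client build sees 16 bytes");

// Hypothetical library entry point. The caller reports the size of its
// own view of the struct so a mismatch is caught before any write.
bool library_init(void* thing, std::size_t caller_size) {
    if (caller_size != sizeof(LibraryThing)) {
        return false;  // writing `extra` would overrun the caller's object
    }
    auto* t = static_cast<LibraryThing*>(thing);
    t->a = 0;
    t->b = 0;
    t->extra = 0;
    return true;
}
```

With this in place, a client built without the macro gets a failed init instead of a corrupted stack.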

#c

Your wheel is causing your shifting problems

If you're having problems shifting on your bicycle and you feel like you've tried everything you can find on the internet, try checking that your wheels are installed correctly.

I was really struggling with the trigger shifter on my bicycle. (Don't ask me what model it is, it's some Shimano 8 gear thing that's so old I can't find a picture of it on the internet). It was taking so much effort to shift down using the thumb lever, especially 2 to 1. So much effort that I injured my thumb and now I can't put too much pressure on it! My bike is my main method of transportation though so I had to do something.

So I put my bike on my bike stand, pulled up Google, and went to work. I checked:

The rear derailleur. This is my primary suspect for anything shifting related. The left-right movement of the gears is mainly dictated by cable tension, which you tune with the barrel adjuster. But even with the barrel adjuster in the sweet spot to make the gear changes, it was still too hard to shift.

Is the rear derailleur hanger bent? I put my bike on my bike stand and, nope, not bent. Looked fine. The gears of the derailleur were in a pretty straight line with the cassette.

The cable. I had replaced the cable a few months ago (when the old cable had shredded itself inside the shifter) so I didn't think it was a bad cable. Had I installed it wrong? I took it out, inspected it, put it back in, no difference.

The cable housing. Maybe there was something clogging the housing, or maybe the cable had created a channel that was causing some resistance? I took the cable housing off the bike but I couldn't see anything. The cable slid up and down the housing well enough.

The shifter. Normally it's not the shifter, but the grease can dry out and get gummy. Also, when the old cable shredded itself, maybe a little piece had gotten lodged inside? I took out the cable and shifted gears but everything was working fine. I tried putting some dry lube inside (it's all I had on hand). It almost felt like it made a difference but when I tried again the next day it was still too hard to shift.

At this point I was flummoxed. I had spent so many hours and all I had was a sore thumb. All the shifting components are fine, what the hell is causing my problem? I was prepared to take it into a bike shop.

My one last ditch effort was the wheel. I had changed my tires last month for winter, which involved taking the wheel off. What if I didn't install the wheel as before, so the cassette (attached to the wheel) is no longer parallel to the derailleur gears? This would effectively cause the same problems as a bent derailleur hanger: a misalignment between the planes of the cassette and derailleur gears.

The wheel has a bolt running through it that slides into a slot in the tines of the rear fork. Then it gets secured with a nut. I loosened the nut slightly and the wheel dropped into a different position, just by a few millimetres. I tightened it again, tested the shifter and BAM shifted like new.

All this to say: counter-intuitively, if you're having shifting problems, maybe the problem is with your wheel (not your shifter). I had not seen this mentioned anywhere else on the internet but it makes sense. Give this a try before giving up.

#bicycle

Lightroom: SDR vs HDR with Tone Mapping

Initially, I wasn’t sold on HDR. It seemed like a gimmick: why not stretch the existing SDR colors between the darkest and brightest that the panel can display? (2)

Now, I'm sitting on top of 7000 photos (1) from a recent vacation, a subscription to Lightroom, and no deadline for editing. And I just clicked the HDR button for the first time...

#lightroom #hdr #sdr

Korsakovia: The Mod: The Book

(This post was originally written around May, 2019. I found it rotting in my drafts.)

I got into bookbinding before Christmas of last year. It’s a slow and frustrating skill to acquire; most of your time is spent repeatedly folding paper, punching holes, cutting board, or waiting for glue to dry. However, it’s rewarding when everything comes together.

Yesterday, I put the finishing touches on this:

Yes, Korsakovia can now be ingested in book form! I’m going to outline the high level process of taking the book from unformatted script to printed product.

#bookbinding #korsakovia

__declspec(naked) (Adventures in Reverse Engineering)

Even the most basic function which does nothing produces machine code:

The push+mov is called the prolog. The pop+ret are the epilog. This is boilerplate to set up the stack, save registers, handle arguments, etc. based on calling convention.

... but what if I want an empty function? That's where __declspec(naked) comes in.

(int 3 is a debugger breakpoint???)

Now we can do whatever we want!

This still requires a calling convention because the body of the function does not dictate how the client is expected to pass arguments, which registers to save, etc.

(remember, call is just a push eip + jmp; ret is just a pop eip + jmp)

Serializer for class 'Companion' is not found.

The newish type-safe Navigation Compose lets you use objects instead of strings for routes. So, typically I start with something like this if there are no parameters for the route:

@Serializable
private object NewTaskDestination
// ...
navController.navigate(NewTaskDestination)

Then I go "oh, I need to add an ID to the destination to access some resource." So I change the object into a data class:

@Serializable
private data class NewTaskDestination(val resourceId: Int = 0)
// …
navController.navigate(NewTaskDestination)

Then, if I recompile and run, I get this exception trying to navigate:

kotlinx.serialization.SerializationException: Serializer for class 'Companion' is not found. Please ensure that class is marked as '@Serializable' and that the serialization compiler plugin is applied.

This is the most unhelpful error in the history of ever I swear to God!

wtf is Companion

what does serialization have to do with a navigation problem?

You know what the solution is? Open and closed parentheses when calling navigate:

@Serializable
private data class NewTaskDestination(val resourceId: Int = 0)
// …
navController.navigate(NewTaskDestination())

I just dftkjhodhjpdoghj I want to murder someone now. Why didn't my IDE tell me this?!

#jetpack compose #kotlin #navigation

Kotlin: WTF is this code (trailing lambdas)

I'm trying to learn Kotlin so that I can use Jetpack Compose so that I can make an Android app. Following the Android codelab, I almost immediately hit a roadblock: wtf is this code doing:

class MainActivity : ComponentActivity() {
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        enableEdgeToEdge()
        setContent {
            GreetingCardTheme {
                Scaffold(modifier = Modifier.fillMaxSize()) { innerPadding ->
                    Greeting(
                        name = "lol",
                        modifier = Modifier.padding(innerPadding)
                    )
                }
            }
        }
    }
}

As a C++ developer (with a little Java knowledge), some of this is bewildering. Especially:

setContent {
    // ...
}

WTF? It looks like a function call but there are no parentheses. It looks like a scope but how is the code run? I was scratching my head: does this have something to do with the annotation (no, that's just metadata), or something to do with accessors (no, you need to spell out get() and set()), or something to do with anonymous objects (no, you need to spell out object). Then I took a look at the setContent method declaration:

public fun ComponentActivity.setContent(
    parent: CompositionContext? = null,
    content: @Composable () -> Unit
)

Ok that's interesting, the first parameter has a default so it's likely optional, and the last parameter looks like a lambda. Maybe that has something to do with it? (BTW Kotlin Unit == C++ void) So I go to the Kotlin docs and, lo and behold, there it is: trailing lambdas.

According to Kotlin convention, if the last parameter of a function is a function, then a lambda expression passed as the corresponding argument can be placed outside the parentheses [...] If the lambda is the only argument in that call, the parentheses can be omitted entirely

That means that setContent is taking a lambda. That lambda also calls a function that takes a lambda, and so on. The entire UI hierarchy is just a bunch of lambdas that something (?) eventually executes... And, despite looking like types due to PascalCase, the names are actually functions (which normally use camelCase). How this all works is a question for another day.

One more mystery:

Scaffold(modifier = Modifier.fillMaxSize()) { innerPadding ->
    // ...
}

In C++, lambda parameters go before the curly braces. In Kotlin, lambda parameters go inside the curly braces. Here, the Scaffold() function wants a lambda that takes padding parameters.
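For comparison, here is roughly what the same shape looks like in C++, where the lambda has to stay inside the parentheses and its parameter list sits before the braces. This is only an analogy: scaffold and build_ui are made-up names and have nothing to do with how Compose is actually implemented.

```cpp
#include <functional>
#include <string>

// Hypothetical stand-in for a function whose last parameter is a function.
std::string scaffold(int padding, const std::function<std::string(int)>& content) {
    // Call the lambda, handing it the padding it should use.
    return "[" + content(padding) + "]";
}

std::string build_ui() {
    // C++: the lambda is an ordinary argument inside the parentheses,
    // and its parameters come before the curly braces.
    return scaffold(8, [](int inner_padding) {
        return "Greeting(padding=" + std::to_string(inner_padding) + ")";
    });
}
```

Calling build_ui() runs the lambda once, with the padding that scaffold chose.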

EDIT: To their credit, the Android Developer website mentions this in one of their codelabs. However, getting to the right codelab that contains this information is (IMO) not intuitive.

#kotlin #android #jetpack compose

Merging git repos into a monorepo with git subtree

Sometimes you yearn for the simplicity of a monorepo. Maybe you just want everything in one place, maybe you just want to more easily share code.

Regardless of your motivation, git makes it easy to split and combine repos via git subtree. While it isn't part of git core, most distributions package git with subtree anyway.

After your monorepo is set up, you can start importing repos into it with git subtree add:

git subtree -P <prefix> add <repository> <remote-ref>

-P says which subdirectory in the monorepo the imported repo should live in. This allows you to organize the code that you're importing.

<repository> is the thing you want to import, e.g. https://github.com/user/repo

<remote-ref> is the branch or tag to import. Typically, this is main or master.

Compared to submodules, subtrees exist independently of their remote. If the remote of the subtree is deleted, the subtree in your monorepo is a copy so it persists. However, if the remote of the submodule is deleted then you can no longer clone it.

Compared to good ol' copy-paste, subtrees preserve git history. Although if you don't want the history you can also --squash.

One word of caution: if you want history then you need to include the merge commit. On GitHub, this means you have to Create a merge commit; do not squash it! If you squash the merge during submission in GitHub then you lose the history. Maybe I'll go into details another day...

#git #subtree #monorepo

New C++ Projects: Setting Yourself Up for Success

I started a new project at work and it got me thinking about some easy things you can do that immediately elevate your code quality.

Source Control (git)

I'm a git guy but any version control system will do. You just need some way of saving and tracking code over time.

Build System (CMake + Ninja)

While I'd love to recommend Bazel, it's not very cross-platform. Modern CMake is the closest that it gets. My only gripe is that everything gets added to the ALL target by default. If you build your dependencies from scratch via submodules then the usual CMake commands can build too much. So you either need to train people to build specific targets, or not use submodules, or just deal with bloated build times.

Please read Professional CMake. It's the most comprehensive, up-to-date CMake guide.

Style Guide

Google Style is my go-to since it's comprehensive, tested, and makes sense (for the most part).

Dependency Management (conan)

Building everything from source is a good approach since it provides more control. If you find that it doesn't build fast enough, I'd consider setting up conan.

Auto-format (clang-format)

Just use Google style. Set up to format on save. This makes everything consistent and prevents arguments about code style.

Unit Tests (gtest)

Please write unit tests. I'm begging you.

You catch bugs.

You write fundamentally different code. What behaviors do you want? Are they intuitive? What dependencies do you have? Should they be injected? This forces you to think about SOLID principles and maintainability from the get-go.

Static Analysis (clang-tidy)

C++ is full of ways to shoot yourself in the foot. clang-tidy provides good guardrails.

Code Coverage

This can help you understand which parts of your code could be better tested. Don't use coverage% as a strict metric, otherwise you just waste time writing pointless tests.

Sanitizers

This pairs well with static analysis to give you more guardrails. This only works if you have good tests, so please write tests.

Documentation

I used to be a Doxygen fanatic but I got trained on... just reading the source code. Structure your code in ways that can be read by people, then put the docs there in comments. If the headers are too complicated since you're using weird things like templates then maybe don't use templates?

Include what you use

Link what you use

#c++

Steam Deck OLED 1TB

I bought a Steam Deck OLED so I can play gachas while I'm away on vacation.

Why I went with the Steam Deck and not the competition (Lenovo Legion Go or Asus ROG Ally):

Steam Deck seems to be the most stable and reliable. Apparently there are software and hardware issues with the others.

That does mean sacrificing on-paper tech specs considering the competition offers higher resolutions and display refresh rates. However, I'm not looking for a desktop replacement; this is an airport fidget. The increased resolution and refresh rate would eat into battery life for honestly not a whole lot of gain.

I also sacrifice Windows (yes you can install Windows but it doesn't seem to work great) but I'm comfortable with Linux and Proton seems to work well. I prefer an OS that I can hack in a pinch...

Unorganized thoughts:

It defaults into "Native Big Picture" but can be swapped into Desktop Mode. This works with USB-C hubs! Having a keyboard and mouse really helps getting non-Steam games working.

Desktop Mode is Arch running KDE. Time to play pacman. I'm a GNOME guy but KDE is very polished, powerful, and approachable.

I call it "Native Big Picture" because it offers more than Desktop Mode running Big Picture. Primarily, the Steam Deck overlay only works in "Native Big Picture" but not Desktop Mode running Big Picture. This seems important for optimizing battery life. Also, the input remapping only seems to work in "Native Big Picture".

"Native Big Picture" does run a basic window manager, it just lacks decoration and shows up centered.

I got the 90hz model. I notice it occasionally but 60hz is fine for me. Anything above that is gravy. I worry about the battery life that this eats up...

Notes for getting games:

Want Minecraft? Use PrismLauncher. This makes it slightly easier to add as a non-Steam game. I still had to convert their .desktop to .sh so it can be added as a non-Steam game...

Use Epic? Install Heroic from the app store, then add the games to Steam as non-Steam games.

Play Genshin? Sounds like HoyoPlay works but I used Heroic. I think that I needed to change the install drive from Z: to C:. Then I added the Unity binary as the non-Steam game. Then, in Steam, I needed to turn on compatibility with Proton. This game does not work in Desktop Mode since input remapping doesn't work in that mode. You'll want to log in with a physical mouse and keyboard.

The above applies for ZZZ as well.

Palia? I installed through Heroic then added PaliaClient.exe as a non-Steam game. It was complaining about the C++ Runtime but Reddit to the rescue. I installed protontricks and used that to get vcrun2022. I found it easiest to log in when not docked, otherwise the game resolution got really messed up and input didn't work well. You need to log in every time so I recommend having a password manager like 1password (which you can install from the app store).

Current pain points and annoyances:

While it basically runs any Windows game, every non-Steam game I've downloaded requires non-zero time investment to add it to Steam. Typically I need to drop into Desktop Mode, install the game, find the binary on disk, launch it once or twice to make sure it works, maybe write a wrapper script, then go into Steam and hope it can be added. Then hope Proton works. I've had trouble importing .desktop files (which is annoying...) but those are trivial to convert into a .sh.

You need to keep the screen on to download games. Burn-in isn't a problem with the OLED under normal use though so this is more a LOL than anything else.

Steam+X is the soft keyboard. This is not easy to discover and necessary for Desktop Mode. It also doesn't work as well as a physical keyboard.

#steam deck #proton #linux #video games #steam #valve

MSVC Boolean Branches (Adventures in Reverse Engineering)

There are many ways of writing the same boolean expression, but some are not the same! Particularly, pay attention to explicit comparisons to TRUE! Anyway, I use this graphic when I get confused about how to decompile something.
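As a small illustration of why an explicit comparison to TRUE is not the same as a plain truthiness test, here is a sketch. Bool32 and kTrue are stand-ins I made up for the Win32 BOOL (an int) and TRUE (1), so this compiles without the Windows headers; it does not reproduce the graphic.

```cpp
// Stand-ins for the Win32 definitions (assumption: BOOL is an int, TRUE is 1).
using Bool32 = int;
constexpr Bool32 kTrue = 1;

// Truthiness test: any non-zero value takes the branch.
bool taken_implicit(Bool32 flag) {
    if (flag) {
        return true;
    }
    return false;
}

// Explicit comparison: only the exact value 1 takes the branch.
bool taken_explicit(Bool32 flag) {
    if (flag == kTrue) {
        return true;
    }
    return false;
}
```

The two agree for 0 and 1 but disagree for any other non-zero value, so the compiler has to emit a different comparison for each form.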

#decompiling #msvc #boolean

x87 FPU and SSE (Adventures in Reverse Engineering)

Game: Diablo Pre-Release Demo (1996)
Language: C++
Toolchain: Visual C++ 4.0 (suspected)

Take a look at this:

These are x87 FPU instructions (described in Intel® 64 and IA-32 Architectures Software Developer’s Manual, Chapter 8). This roughly corresponds to:

if (gamma_correction >= min_gamma) gamma_correction -= gamma_delta;

However, if you put any floating point math into Compiler Explorer, you won't get the same output! That's because the default mode now is to use the SSE instructions.

Fortunately, you can disable SSE on MSVC using /arch:IA32.

#floating point #diablo #decompiling #c++

__fastcall (Adventures in Reverse Engineering)

Game: Diablo Pre-Release Demo (1996)
Language: C++
Toolchain: Visual C++ 4.0 (suspected)

Calling conventions are something that a programmer rarely thinks about but is very important to code interoperability. It covers things like: do I push all my arguments on the stack? Can I use registers to pass arguments? Who cleans up the stack, you or me? What matters most about calling convention is that both caller and callee agree.

The default calling convention for the Visual C++ 4.0 toolchain seems to be __fastcall. Here's the Microsoft documentation: https://learn.microsoft.com/en-us/cpp/cpp/fastcall?view=msvc-170 In essence: the first two arguments (if they fit in 4 bytes) can use ecx and edx; other arguments go on the stack in reverse order (e.g. the third argument is at the top, followed by the fourth, etc).

But let's look at an example:

#diablo #decompiling #c++

Was this a macro? (Adventures in Reverse Engineering)

Game: Diablo Pre-Release Demo
Language: C++
Toolchain: Visual C++ 4.0 (suspected)

One of the tricky things about decompiling is that a lot is lost when the source goes through a compiler:

comments are thrown away

human-readable names are discarded

the preprocessor kicks in and copy-pastes code

variables can be optimized away

In this case, let's look at the preprocessor. The preprocessor runs pretty early on and expands macros into new code. This is indistinguishable from the developer copy-pasting a block of code. Compiler Explorer backs us up here: if you look at this sample, both functions produce the same assembly despite seemingly being implemented differently.
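To make that concrete, here is a hypothetical pair of functions (not the sample linked above, and not code from the game). One uses a macro and the other is the expansion written out by hand. After preprocessing, the compiler sees the same expression in both.

```cpp
// Hypothetical macro, invented for illustration.
#define CLAMP_TO_RANGE(v, lo, hi) ((v) < (lo) ? (lo) : ((v) > (hi) ? (hi) : (v)))

int clamp_with_macro(int v) {
    return CLAMP_TO_RANGE(v, 0, 100);
}

// What the preprocessor hands to the compiler for the function above,
// typed out manually as if it had been copy-pasted.
int clamp_by_hand(int v) {
    return ((v) < (0) ? (0) : ((v) > (100) ? (100) : (v)));
}
```

From the binary alone there is no way to tell which of the two the original developer wrote.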

So when I see things like this:

#diablo #decompiling #macro #c++ #preprocessor

How I (want to) make Pre-ablo

Total reverse engineering including decompilation is the holy grail of any mod project where an SDK is not provided.

I'm currently embarking on a journey of totally decompiling the Diablo Pre-Release Demo using Devilution as a reference. After I have the source code, I can make any change I want anywhere I want. I'll be free from the whims of the original DIABLO.EXE that I'm patching. I won't need binary patching, and I won't need DLL hijacking. I won't need to worry about new code throwing off relative offsets, nor will I need to worry about how to jump in and out of patch code; the patch will be seamlessly integrated.

This is a long way off and I make slow progress. I'm currently investigating what my options are for speeding up this process.

#diablo #decompiling

How I (now) make Pre-ablo

Binary patching isn't the only option for modding. There's also DLL Hijacking. If you think about it, I'm acting like a computer virus; any mechanism that a computer virus uses to hijack a process can be used here. DLL Hijacking is just the simplest one in this case. You'll see why in a moment.

The Diablo Pre-Release Demo relies on DPLAY.DLL, DDRAW.DLL and STORM.DLL. The one that stands out here is DPLAY.DLL; isn't the demo single-player? Yes it is, but it includes some snippets of non-functional multiplayer.* So, since the game isn't using DPLAY.DLL, how about I substitute my own!

What DIABLO.EXE needs from DPLAY.DLL is:

DPLAY.DLL exists in the DLL search path (e.g. the same directory as DIABLO.EXE)

DPLAY.DLL exports two functions: DirectPlayCreate() and DirectPlayEnumerate().

I can make my own DLL that looks like DPLAY.DLL except it has my own C++ code, and the game will accept it and run with it. That code patches the game in memory. I very naughtily do this in DllMain() (which you're not supposed to do but it works so hey why not) using VirtualProtect() with PAGE_EXECUTE_READWRITE.

The downside is that I still need to, to some degree, jump in and out of the functions that I write. Though with __declspec(naked) I don't need to worry about the compiler generating prolog/epilog that tramples the register contents.

So now I write C++ code that gets compiled into a DLL and patches itself into memory when it's loaded. Now I can use more traditional software development workflows and scrap the tedious binary patching method entirely.

(If I need to get multiplayer working in the future, I can move my code over to DDRAW.DLL since I already distribute a custom version of that.)

* That the demo has multiplayer is interesting from an archival, historical perspective but I gloss over it here because it's not important to DLL hijacking. I also can't comment on whether or not it actually works...

#dll hijack #directx #diablo

How I (used to) make Pre-ablo

Pre-ablo is my mod of the Diablo Pre-Release Demo that aims to be as faithful to the source as possible while allowing it to be played from start to finish. I started it because the only existing mod of the Pre-Release Demo was Alpha4 and, while I enjoyed it, I wanted something with fewer creative liberties. Alpha4 is a new game entirely; I just wanted to play the Pre-Release Demo as it was without crashing!

However, I'm relatively new to the world of decompiling and x86 assembly. I figured the best place to start was the most obvious and I could bootstrap my way from there.

I started with binary patching. It was painful. The workflow looked like this:

1. Load DIABLO.EXE into IDA. I have a running IDA file where I annotate functions and variables based on Devilution.

2. Identify the broken code. This often requires understanding the x86 assembly, mentally decompiling it to C++, and saving those annotations into the IDA file. Very rarely do I already know exactly what the code is doing, so understanding the code is a large part of this time.

3. Identify a fix. The best fix is one that doesn't add net new instructions so I tried to favor those. I'd also take some shortcuts which I regretted later...

4. Identify where the fix would go. In step 3 I said that I didn't want to add net new instructions. This is largely thanks to the .EXE format. If I add new bytes to the file then all the offsets are now wrong and the game won't work. In these cases I had to repurpose dead code, and figure out how to jump in and out. Every jump is more work down the line...

5. Turn the fix into asm. My mind works in C++ so I need to mentally translate that into x86 assembly. This is tricky since I'm largely unfamiliar with x86 assembly (though I get better every day).

6. Turn the asm into machine code. Oh boy this sucked. This sucked so hard. Machine code is meant to be read by the processor, not humans! It ends up being incredibly terse! In addition, x86 has some peculiarities: it uses a multibyte encoding and has a lot of weird edge cases. At one point I switched to using Ghidra to do this for me (but this created its own headaches since I can't use my IDA annotations in Ghidra...). Also I needed to calculate relative offsets to the current instruction pointer...

7. Insert the machine code into DIABLO.EXE. I did this with a hex editor. By hand. If I made a mistake I'd have to start over.
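To give a feel for the relative-offset arithmetic in the machine code step, here is a small sketch that encodes a near jmp (opcode E9 followed by a 32-bit little-endian displacement). The helper name and the addresses in the example are made up for illustration. This is not a tool I actually used; it just shows the calculation that otherwise has to be done by hand.

```cpp
#include <array>
#include <cstdint>

// Encode `jmp rel32` for an instruction located at address `from` that
// should land on address `to`.
std::array<std::uint8_t, 5> encode_jmp_rel32(std::uint32_t from, std::uint32_t to) {
    constexpr std::uint32_t kInstructionSize = 5;  // 1 opcode byte + 4 offset bytes
    // The displacement is relative to the address of the NEXT instruction.
    // Unsigned wraparound yields the two's-complement form for backward jumps.
    const std::uint32_t rel = to - (from + kInstructionSize);
    return {
        0xE9,
        static_cast<std::uint8_t>(rel & 0xFF),
        static_cast<std::uint8_t>((rel >> 8) & 0xFF),
        static_cast<std::uint8_t>((rel >> 16) & 0xFF),
        static_cast<std::uint8_t>((rel >> 24) & 0xFF),
    };
}
```

For example, a jump placed at 0x401000 that targets 0x401010 encodes as E9 0B 00 00 00, because the offset is counted from 0x401005. Move the patch by a single byte and every one of these offsets has to be recomputed.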

I made this reproducible by encoding the binary differences using vcdiff. That way, I could take a fresh DIABLO.EXE and reconstruct Pre-ablo by applying the vcdiff patches in order. It also separated the logical changes into a list of discrete patches; changing one patch (usually) had no impact on the other.

This sucked but it worked. I used this approach until v0.4 when I replaced it with something better...

(The "easy way out" was to pay for IDA Pro. Which is several hundred US $. No thanks.)

#diablo #modding #assembly #decompiling
