DaVinci Resolve is the obvious choice for software these days. The learning curve is... steep. To put it mildly. But it's the industry standard for color grading, it's trying to steal market share for editing, and it's available for free. The big advantage here is that you can just keep using the same program as your skills grow. There is nothing you'll ever do for a fanvid, including effects, that can't be done in the free version of Resolve. Some people say to start with an easy program, but I find that many of them suck nowadays. YMMV.
Any of the big programs will have a million YouTube tutorials on the technical aspects. We no longer need to rely on fandom tutorials for that.
As for aesthetics... That gets tricky since there are a lot of different fannish communities that have diametrically opposed views on what makes a good vid. If you want to go oldschool, there's a vidding discord with a bunch of fans from the LJ era. Some of the extant fancons have vidshows and vidding panels or even run vidding courses. Escapade, DC Slash, etc.
Some parts of vidding aesthetics are just film aesthetics. When I got interested, I wandered over to the local university bookstore and got myself a copy of In the Blink of an Eye, which is just about the only thing film schools ever manage to assign unless you're specializing in editing. I also think The Eye is Quicker and The Visual Story are worth a look. You certainly don't need to know film nerd shit, but those kinds of books can give you some vocabulary should you want it.
The thing is, you already know aesthetics.
Forty years ago, Media Fandom vidders were learning from vids. You still see meta, written long after that stopped being true, saying that you have to watch good vids to make good vids.
But in fact, any person in the 2020s (and, frankly, long before) has watched a lifetime of short-form video with vid-like aesthetics by the time they ever think of doing any kind of fannish remix video.
Trailers, commercials, music videos, short films, plenty of YouTube content: it's all relevant to one type of vidding or another.
If you want to do very oldschool Media Fandom vidding that is highly narrative but doesn't use any show audio, your best bet is to look at the visual grammar of film. The character with the closeup is the important one, the emotions of a character looking near the camera are more accessible than the emotions of a character we see in profile, etc. etc.
I've made a couple of videos about visual grammar aimed at fans.
In general, I would ignore any vidding meta about music choice. Everyone is a coward with bad taste, whether that's an entire community that only edits to Linkin Park or an entire community that only vids to Sarah McLachlan. Down with 4/4! Down with only Top 40 pop! Unless you like it, of course.
Just pick something you're willing to listen to eight billion times if you're doing a song without show audio. If you're doing a more trailer-y vid, pick something that works well as a short, punchy backdrop to dialogue. Don't play dialogue and song lyrics at the same time. Again, this is basic shit from every commercial trailer and ad ever.
That linked guide seems fine. It's full of things that will vary by person, of course. "(compared to phone) It’s not portable unless you have a beast of a laptop." is cracking me up. For ages, I did all my editing on a MacBook Pro, which is indeed a beast.
Don't use After Effects. That guide is right that it's super popular, but unless you are primarily interested in heavy effects, it's a worse choice than a program aimed at editing, IMO. Vegas is insanely popular with vidders who use Windows. I'm not as familiar with it, but it seems to be popular at least in part because a lot of vidders have released presets and effects packs and such that work with it.
You can use any program, really. I know a Linux vidder who used Blender because it was what was available. I once was an idiot and could not figure out how I'd fucked up Premiere right before a big deadline, so I made my vid in Avid. (Don't use Avid. Nobody vids on Avid and for good reason.)
That guide's comments about the Ken Burns effect are... sort of right, but I think they're glossing over the fact that it really depends on the type of footage. If you're vidding a sitcom or most anime, you may well want to add motion. If you're vidding a single film that is shot well, it probably already has a lot of beautiful camera motion, and adding different motion requires a deft hand. Ditto Pixar, which spends huge amounts of time and money on imitating live action camera styles.
I use 4K Video Downloader+ to get footage, music, etc. It works on many sites and is good at keeping up with YouTube's constant attempts to thwart it.
Generally, I think you should just grab some footage and some music and go play around.
Most of the other details will depend on your aesthetic aims, which are going to be a lot more specific than "A vid, any vid".