typicalacademic replied to your post: You know any decent books for modern neural net...
curious what you don’t like about AllenNLP—I’ve found it pretty workable? could totally believe that would change if I were working on slightly different tasks though
Oh, this is a fun question -- at work I’m the only person who has used AllenNLP so I occasionally talk about it but I can’t really have a conversation about it
Cut for highly specific shop talk
Here are some of the things that have frustrated me:
Weird assumption that you’re doing everything on the command line instead of in code. This means that if I want to programmatically control something like training, I either have to circuitously write Python that issues commands that start other Python sessions, or go into `allennlp.commands` and deal with code that seems written “to do the command” and not for my direct use.
Relatedly, not set up well for hyperparam tuning, or anything else where you’re trying out different things and doing many training jobs. Training code is coupled to serialization code in a way unlike anything I’ve seen elsewhere: every single training job must have its own directory, and then it thinks you want a snapshot of the model weights at every epoch and a separate copy of the weights from the best-val epoch and a third copy of those weights inside the final model at the end.
I remember the abstraction with “Fields” for model inputs being constantly frustrating. I never understood its motivation (I assume there is one) but it puts this weird and complex barrier between the data you have and the actual input your model receives. Generally I know exactly what it is I want my model to be seeing on an array-by-array level, but if I want my model to see “X” I can’t just say “X,” I have to find the appropriate “Field” and the appropriate Y such that passing Y to the Field returns X. I remember this causing bugs in things that are trivial anywhere else, like keeping track of your label encoding, adding in “side info” aligned with your tokens, etc.
For me the Field stuff was an obnoxious headache in particular with their BERT implementation: it treats BERT’s tokenizer as a second tokenizer on top of “your model’s” tokenizer, with BERT’s confusingly nested inside a so-called “token indexer,” and at the same pipeline step some arrays will be aligned with what BERT actually sees and others with the “””real””” tokens (see here).
Defines its own abstractions even for things that are quite common and not really NLP-related. I never really understood what a Predictor and an Archive were for (I think I half understood what a Model was for), and I haven’t really used PyTorch otherwise so maybe it lacks these very basic features? but it limited my range of options when I needed help because I was always dealing with the special AllenNLP trademarked flavor of a thing
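To make the BERT alignment complaint above a bit more concrete, here's a toy sketch of the bookkeeping involved. This is not AllenNLP's code and not BERT's real tokenizer: `toy_wordpiece` and `align` are made-up names, and the "tokenizer" just chops tokens into fixed-size chunks. The point is only that you end up with two sequences of different lengths, plus an offsets array to map between them.

```python
from typing import List, Tuple


def toy_wordpiece(token: str, max_len: int = 4) -> List[str]:
    """Toy stand-in for a subword tokenizer (NOT BERT's real algorithm).

    Chops a non-empty token into fixed-size pieces and marks every
    continuation piece with a leading '##'.
    """
    pieces = [token[i:i + max_len] for i in range(0, len(token), max_len)]
    return [pieces[0]] + ["##" + p for p in pieces[1:]]


def align(tokens: List[str]) -> Tuple[List[str], List[int]]:
    """Return the flat wordpiece sequence and, for each original token,
    the index of its first wordpiece."""
    wordpieces: List[str] = []
    offsets: List[int] = []
    for token in tokens:
        offsets.append(len(wordpieces))
        wordpieces.extend(toy_wordpiece(token))
    return wordpieces, offsets


tokens = ["tokenization", "is", "annoying"]
labels = ["NOUN", "VERB", "ADJ"]  # aligned with the original tokens

wordpieces, offsets = align(tokens)

# Spread the per-token labels onto the wordpiece sequence.
# Continuation pieces get a placeholder label "X".
wp_labels = ["X"] * len(wordpieces)
for label, start in zip(labels, offsets):
    wp_labels[start] = label
```

Here `labels` has length 3 and `wp_labels` has length 6, so anything that mixes the two without going through `offsets` will be silently misaligned.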
If there’s a common theme, it’s something like “AllenNLP seems like it’s trying to create very powerful abstractions that can concisely express a lot of fancy things that people do in papers, but it’s okay with those abstractions being brittle and opaque to the user as long as they technically work in their reference cases.” It feels like it’s made so you can call `allennlp train` and immediately reproduce some glitzy paper, but not for the person who wrote the paper, who wants to try a lot of crazy ideas really fast and knows just what those ideas are supposed to look like.
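For reference, the "Python that starts other Python sessions" workaround for running many training jobs looks roughly like the sketch below. It assumes the `allennlp train` CLI with its `-s` (serialization directory) and `-o` (overrides) options. The config filename, the `sweeps/lr` directory, and the `trainer.optimizer.lr` key are placeholders that depend on your own config, and `build_train_commands` and `run_all` are hypothetical helper names.

```python
import json
import subprocess
from pathlib import Path
from typing import Any, Dict, List


def build_train_commands(
    config_path: str,
    sweep_root: str,
    overrides_list: List[Dict[str, Any]],
) -> List[List[str]]:
    """Build one `allennlp train` command per set of overrides.

    Every run gets its own serialization directory, since each training
    job needs a separate one.
    """
    commands = []
    for i, overrides in enumerate(overrides_list):
        run_dir = Path(sweep_root) / f"run_{i:03d}"
        commands.append([
            "allennlp", "train", config_path,
            "-s", str(run_dir),
            "-o", json.dumps(overrides),
        ])
    return commands


def run_all(commands: List[List[str]]) -> None:
    """Run the jobs one after another; each call starts a fresh process."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


# Hypothetical sweep over learning rates; the override keys are an
# assumption about how the config file is laid out.
learning_rates = [1e-3, 1e-4]
overrides_list = [
    {"trainer": {"optimizer": {"lr": lr}}} for lr in learning_rates
]
commands = build_train_commands("config.jsonnet", "sweeps/lr", overrides_list)
```

Calling `run_all(commands)` would then launch the jobs sequentially, leaving one full serialization directory per run under `sweeps/lr`.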
typicalacademic replied to your post: Please yell about the existential threat
@oceankin fwiw I think Culture is something like “shared gravity” and unity or at least closeness with other people by conforming to the same gravity. So (spoilers) in the MF ending they join the mainstream Culture, to an extent; in CG they have a counterCulture; in CM they relate to things without really having a Culture at all, but having something alternative (maybe culture with a small c, but—it’s different)
o crap o god i cannot tell you how unprepared i was for someone to reply to my tags on this post
anyway LT’s path and the MF ending are the ones i haven’t played yet so uh i’ll get back to you on that
typicalacademic replied to your post: @typicalacademic replied to your post: ...
huh I guess. 3150 vs today looks v different but maybe that’s a bit more than two years. or maybe it’s more that they’re always wide-eyed now vs occasionally then
doing a more thorough review, i think the last sentence is correct—jeph seemed to experiment with eye styles during this period.
would definitely be interested in doing a reread and categorizing some of this stuff, I find it endlessly fascinating.