Bye Hitch
17 December 2022
In the ordinary moral universe, the good will do the best they can, the worst will do the worst they can, but if you want to make good people do wicked things, you’ll need religion.
I used to believe this to be true, as I did many things that Hitch wrote. The statement is obviously false, as well stubbornly simplistic. I’m not too concerned with the semantics of the quote, or the soundness of its argument, but I do think about it a lot, especially at this moment in my life.
I have no answers. I only have compassion for those who fear the intolerance of society, and loathing for the intolerant.
Just Causation
6 December 2022
Me: Does causation imply correlation?
ChatGPT: Yes, causation implies correlation. If one event causes another, then there must be a relationship or correlation between the two events. Causation is the relationship between two events in which one event causes the other to occur. This means that if event A causes event B, there must be a correlation between the two events, because the presence of A is necessary for B to occur.
Correlation doesn’t imply causation is a dictum that’s been repeated to death in statistics classes everywhere. But does that mean causation necessarily implies correlation? Initially, I too would have had a similar answer to ChatGPT above, but as I’ve discovered, the answer depends on what you mean by ‘correlation’. To be precise, and its always good to be precise with math and statistics, causation does not necessarily imply correlation.
Correlation generally refers to linear correlation, as in Pearson’s correlation, which only measures linear relationships between variables. However, its perfectly possible to have one variable to have only a non-linear dependency on another, but the corresponding linear correlation would be zero. While we do have non-linear correlation measures, mutual information is the measure that we want in our new dictum: Causation implies high mutual information.
I’m comforted that I’m not the only one who made the error of not being clear and precise with my words (and thus my understanding) in statistics. Nobel prize winners do it too.
An aside about ChatGPT
The first time I tried the causation prompt with ChatGPT, it didn’t understand the question, and proceeded to give a rote explanation of why correlation doesn’t imply causation. The fact that these models are easily tipped off by this sort of diversionary tactic (as Venelin explored in the adversarial QA paper) dissuades me the most from the idea that there is latent language understanding and meaning that can arise from the simple task of language modeling. However, when I tried it again before writing this post, ChatGPT got it right! I managed to come up with another one that tricked it along the same lines though:
Me: What came second, the chicken or the egg
ChatGPT: The question of which came first, the chicken or the egg, is a classic conundrum. It is often used as a metaphor for a problem that has no clear solution, or for a situation in which it is difficult to determine the cause and effect. In this case, there is no definitive answer, as the question itself is based on a false premise. Chickens hatch from eggs, so a chicken must have come from an egg. However, the egg that the chicken came from must have been laid by a chicken, so the chicken must have come first. This cycle continues indefinitely, making it impossible to determine which came first.
Build to Learn
29 November 2022
Andrej Karpathy on the Lex Fridman podcast, when asked how beginners should approach learning machine learning:
I am a believer at a high level of the 10,000 hours concept…You literally have to put in 10,000 hours of work…
Now, while Malcolm Gladwell does peddle a lot of crap (see Gelman), the 10,000 hours idea is reasonable. Keep working on something for a few hours everyday for a few years, and you’ll probably be really, really good at it at the end. I think it’s good advice, to which I’d add a small aphorism. The best way to learn a technical concept in software, machine learning and related fields, is to build something with it. I learnt SwiftUI and Combine by building DeTeXt. I learnt Javascript by building websites and web apps. Most of my understanding of transformers and modern deep neural nets comes from trying to use them in my research projects. So yes, spend a lot of time consistently on a topic, and try building something while you’re learning it.
I want to be a good writer, and the only way forward towards that goal is to write more. But what am I building? This blog? This doesn’t feel like a cohesive unit that I’m building, not that it has to be one. Perhaps I’ll keep chipping away at the 10,000 hours by writing regularly here, until I find something I want to build.
Game-y game
9 October 2022
I’ve gotten back into gaming recently, and realized something while I played the brilliant metroidvania game Dandara: Trials of Fear+ – video games fall under a spectrum of game-y-ness just like any other category. This was a concept the great Greg Carlson introduced to me in a Pragmatics class: there are prototypical examples of a category that help us cognitively to classify things in the world. So while penguins and magpies are birds, a magpie is very bird-y bird, while a penguin is less so. Similarly, Dandara is more of a game-y game, than say, Inside or Firewatch as dunkey bemoans in the beginning of this video. This isn’t a novel thought, but it’s another way of thinking past the uninteresting are video games art? question.
DeTeXt, a year later
6 November 2021
A year and 2 months ago, I released DeTeXt – my first iOS app that I had built using SwiftUI, Combine, CoreML and PencilKit. Thinking back, I believe I went from no-code to an app in the App Store in 6 weeks? I can hardly believe it now, but I am proud of building DeTeXt – I’m glad it helped some people find the LaTeX symbol they wanted and saved perhaps a few seconds of their day. Building and releasing DeTeXt emphasized one aspect of myself – I love building tools and working on projects that help other people be successful at their work, even in very small ways.
Initially, I had trained the MobileNetV2 neural network for DeTeXt’s symbol recognition engine on a 4 GPU cluster. Today, I did the same on my M1 Pro Macbook Pro running on battery, thanks to tensorflow-metal and the 16 core GPU. I wish I had documented the training better on the cluster, but I do have the training time for the M1 Pro: 6 minutes per epoch, for training a simple image classification neural net on over 200,000 images. In 1 epoch, the network reached an accuracy of 65%. Six minutes to train a complex neural network with over 7 million parameters, on battery power. Wow.