Sitemap

A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.

Pages

Posts

But What is a Model?

47 minute read

In the last technical posts, where I was trying to figure out whether my model was ready to be compared or not, I realized that it was never really about comparisons! A model is not intrinsically good or bad; rather, the setup is either a match or not. When I tried to see how cross-validation helps me evaluate my model, the answer was the same: a model is not the only thing to be evaluated; it is the whole setup. So, what is a model exactly? And how does one know when it is a match or not?

Mind and Body

7 minute read

The inventor, as Tesla points to; the creative, as Becker names; the spiritual, as religions refer to; and the masses, as they are looked down upon.
None is the true human. None is the epitome of the human condition. Each one is the human. Each one is part of what it means to be human.

How Good is My Model? Part 5: When cross-validation went rogue!

32 minute read

In the last technical post, I talked about how to tell when one is in a state to start comparing models. I found that I needed to satisfy some conditions before concluding that my model is suitable for my data and representation. Now, assuming that I have such a suitable model and I want to compare it to other suitable models (or I found no such model and just want to see which of my suboptimal models is the least suboptimal), is cross-validation the next logical step?

Short answer: Not as we use it today!

Feelings

14 minute read

Feelings are there to tell us something about a situation, and to guide us toward an “appropriate” action in that situation.
If they ever hint at something other than “appropriate action”, then maybe we are not listening well.

It feels weird to think that science can be unbiased!

3 minute read

Science is based on people pursuing topics by asking questions.
It has become clear to me that the way each person asks a question is deeply rooted in how they feel in life at the moment of asking it.

How Good is My Model? Part 4: To Compare, or Not to Compare

25 minute read

In this post, I go back to the “How Good is My Model?” lane and continue the journey. However, since the next stop is the “Cross-Validation land,” which in my field is mainly about model comparison, one needs to go through this sanity check before moving to comparison. This check should indicate whether I am ready to start comparing models—or not yet.

Repeat after me

4 minute read

If I were to raise my kid to make sure there’s one thing they learn to do, it would be this: repeat after anyone who is speaking.
Repeat their words back to them — to make sure you’ve captured a glimpse of their depth before you jump into spelling out your own.

No one knows better!

8 minute read

I know that this might come as a shock to some of you.
Especially in our societies that are deeply rooted in hierarchy, role models, influencers, and aspirations to the next “great” thing.
But believe me, no one knows better…

Distributions for Machine Learning: The Art of Asking Questions!

28 minute read

In the last post about distributions, I saw how a distribution is an answer that shows the state of the world for a question. I also ended the post by showing how machine learning (ML) is immersed in distributions. And so, just by logical induction, ML is about asking questions. In this post, I want to discover how the formulation of my questions can dramatically make or break my ML model!

The pains of my existence

11 minute read

I have come to the realization, very recently, that my life has been centred around pain and suffering. This was my baseline throughout my life, and this is the antagonist of my existence.

But What is a Distribution?

17 minute read

Throughout the past posts of the “How Good is My Model?” series, I have been talking solely in terms of distributions. I think of the model performance as a distribution, and I build all my intuition with a distribution in mind. But what do I have in mind exactly when I think in distributions? And why does it matter? And why do I end up talking about beliefs?? And why does my model care about beliefs???

How Good is My Model? Part 3: Truce with Small Datasets.

15 minute read

In the last two posts, I was trying to answer a question but found that it only became entangled with the limitations of reality. To know how good a model is, I need to land on the true distribution of its performance—to see what is common and what is rare. However, this requires a large number of i.i.d. samples, which unfortunately are not easy to find in our field.

How Good is My Model? Part 2: LLN, CLT, i.i.d., and a Messy World

17 minute read

In the last post, I discussed how a model’s performance is more faithfully reported as a distribution than as a single value. I also discussed why it is important to report it mindfully, distinguishing between what is typical and what is expected, and how the standard deviation (σ) can quantify a model’s stability. However, one distribution from a small test set is not guaranteed to give an overview of the model’s true distribution. In this post, I’ll discuss how to move forward from this bottleneck.

How Good is My Model? Part 1: µ ± σ and a Bit More

13 minute read

In my first blog post, I made two bold statements about transformer models for molecular property prediction in our review [1]: namely, that they fall short in terms of both novelty and benchmarking. In this and the following posts, I hope to convince you of these conclusions.

  [1] A pre-print version of the review is publicly accessible.

I dislike ChatGPT, but it helped me understand the kid in me better!

7 minute read

Think of a kid who goes to the park and runs after every rock, every insect, every flower, and even each grain of sand. All this kid needs is someone to walk with them—someone to label the rock, the insect, the flower, and the sand. Then, the kid will build a magical story out of these labels. This kid is my brain. And ChatGPT has been this someone.

The Beginning: Something is Broken!

8 minute read

So, my PhD topic is to implement transformer models for toxicity prediction and molecular generation. Or in plain English, given a molecule (e.g., a drug), I want to train an “AI” model to predict its possible toxic effect on the body and the environment. Even better, I want it to create new formulas for molecules that will be safe biologically and environmentally. However, in this post, I will show you how I had to make a major detour before I could even attempt to do so!

portfolio

publications

Paper Title Number 4

Published in GitHub Journal of Bugs, 2024

This paper is about fixing template issue #693.

Recommended citation: Your Name, You. (2024). "Paper Title Number 4." GitHub Journal of Bugs. 1(3).

talks

teaching

Teaching experience 1

Undergraduate course, University 1, Department, 2014

This is a description of a teaching experience. You can use markdown like any other post.

Teaching experience 2

Workshop, University 1, Department, 2015

This is a description of a teaching experience. You can use markdown like any other post.