I do not think one needs to “stay updated” on research anymore…
It has become an automatic response that every time I am presented with a new article to read, I deeply sigh…
When I started my PhD, my topic was (and still is) employing language models to understand the chemical properties of molecules.
I began the PhD by writing a literature review. I needed to know where the field stood so that I would know where to head from there.
When I worked on this review, I was not an expert in any of the fields it touched. I wasn’t an expert in machine learning. I wasn’t an expert in chemistry. And I wasn’t an expert in cheminformatics. I was simply a newcomer trying to navigate a large topic.
The only thing I did have experience in was logic!
I know how to think.
I know how, when presented with a topic, to ask questions.
And so, the review1 we published was not really a review of machine learning methods. It was not a review of chemistry. And it was not a review of cheminformatics.
It was a review of logic.
People in computer science proposed a model called transformers. And people in cheminformatics wanted to use it. The model was big and complicated. So I asked: How can it be broken down?
Once it was broken down, the next question became: holy duck, this is too much nuance to keep track of! How did people do it?
That question became the foundation of our review: How did people tackle all the nuances in this model?
The papers we reviewed were filtered only through this guiding question. We wanted to know how each paper navigated the complicated nature of the model.
And unfortunately — as we showed clearly in our review — they didn’t…
Most of the research done before our review was basic stitching of different parts from computer science and cheminformatics to create a kind of “deformed” hybrid.
Simply applying logic to the state of the field made us aware of how deeply it sat in brown mush.
Now, this is not to say these papers lacked ideas. Each paper had an idea. That idea was probably bright and intelligent.
But the execution was, to put it as kindly as possible… horrible.
The way the research was conducted did not allow me, as a fellow researcher, to make use of it.
As I showed in my first blog post, and as we presented in our review, the lack of standards in conducting this research made the insights from these articles intangible. To say that "something is good" and to build on it in the next step, one needs a way to define what "good" means.
And apparently, none of the reviewed articles invested time in defining this.
But honestly, it is not really their fault. What they did was exactly what had been published for the past decade in the field of machine learning.
They simply followed the existing pattern.
They collected datasets from some sources. They trained a model. They reported a table of numbers and highlighted whichever number was bigger.
Many of the people developing transformer models for molecular property prediction came from computer science, and they brought their practices with them.
And so, my realization of the root problem in my field made me aware of the larger problem in the entire machine learning domain.
We lost the definition of “what is good.”
And now, research is becoming a pure pursuit of whims.
Anyone can, and by all means is free to, generate an idea, apply it, and show that it works "well for them." But because we no longer share a common definition of what "good" is, it does not matter anymore whether an idea is "the real deal" or not…
And consequently, it does not matter whether I am aware of it or not.
You do you. I do me. If we happen to meet — in person or online — we talk about our ideas. If we don’t, then we don’t.
You keep working on yours, and I will keep working on mine.
If I stumble upon your work while searching for something relevant to mine, and I find it easy to read and follow, then I will naturally become updated with it. If I do not stumble upon it, or cannot fit it within my own definition of good, then that is unfortunate. But both of us will move on.
And if at some point we realize we are working on the same idea, let’s toast to it. For we managed to think of the same idea in the midst of all this messiness.
The last part above is accommodating and peaceful, but it took a lot of self-bargaining to get there.
Because the reality was: while I was working on the review, while trying to understand what “good” means, and at the beginning of working on this blog site, the dominant feeling I had was rage.
Rage against the inconsistencies.
Rage against the promise we were taught to believe — that science is objective, logical, and robust — only to discover that it wasn’t.
Rage for all the times I doubted myself before doubting the system, because I didn’t know (and no one told me) that the problem lay in the system, not in me.
Rage for all the sweat and tears I had to shed to find myself and trust her, because the system was too noisy to hear me.
Rage for the safety I lost — the safety of logic and reason that can mend the mind and correct the course — only to realize that science is an institution like any other.
An institution made by humans, governed by humans, influenced by humans, and therefore, it will be forever human.
Just as religion was before it, alchemy before it, and any other human institution before it.
I only managed to reach peace with the current state of science after I made peace with the current state of humanity.
Because the rage was not truly against science per se. It was against something far deeper.
Something ancient.
It was rage against existence itself.
