Posts by Tags

Analytical vs empirical

How Good is My Model? Part 3: Truce with Small Datasets.

15 minute read

Published:

In the last two posts, I was trying to answer a question, only to find it entangled with the limitations of reality. To know how good a model is, I need to pin down the true distribution of its performance—to see what is common and what is rare. However, this requires a large number of i.i.d. samples, which unfortunately are not easy to come by in our field.

Bootstrapping

How Good is My Model? Part 3: Truce with Small Datasets.

15 minute read

Published:

In the last two posts, I was trying to answer a question, only to find it entangled with the limitations of reality. To know how good a model is, I need to pin down the true distribution of its performance—to see what is common and what is rare. However, this requires a large number of i.i.d. samples, which unfortunately are not easy to come by in our field.
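The teaser above points at the shortage of i.i.d. samples; bootstrapping, the tag here, works around it by resampling the few observations you do have. A minimal sketch of the idea, not the post's own code, in plain Python with hypothetical metric values:

```python
import random

def bootstrap_means(scores, n_resamples=1000, seed=0):
    """Approximate the sampling distribution of the mean score
    by resampling the observed scores with replacement."""
    rng = random.Random(seed)
    n = len(scores)
    return [
        sum(rng.choice(scores) for _ in range(n)) / n
        for _ in range(n_resamples)
    ]

# Hypothetical per-run metric values from a small test set:
scores = [0.71, 0.64, 0.69, 0.75, 0.62]
means = bootstrap_means(scores)
lo, hi = sorted(means)[25], sorted(means)[975]  # rough 95% interval
```

Each resampled mean stays within the range of the observed scores, so the interval reflects only the variability the small sample can reveal.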

Central Limit Theorem (CLT)

How Good is My Model? Part 2: LLN, CLT, i.i.d., and a Messy World

17 minute read

Published:

In the last post, I discussed how a model’s performance is more faithfully reported as a distribution than as a single value, why it’s important to report it mindfully so as to distinguish what is typical from what is expected, and how the standard deviation (σ) can quantify a model’s stability. However, I also noted that one distribution from a small test set is not guaranteed to reflect the model’s true distribution. In this post, I’ll discuss how to move past this bottleneck.
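As a minimal illustration of σ quantifying stability, here is a sketch in plain Python; both score lists are hypothetical and chosen only so the means coincide:

```python
import statistics

def report(scores):
    """Summarize repeated evaluation runs as (mean, standard deviation)
    instead of a single number."""
    mu = statistics.mean(scores)
    sigma = statistics.stdev(scores)  # sample standard deviation
    return mu, sigma

# Two hypothetical models with identical average performance:
stable = [0.80, 0.79, 0.81, 0.80, 0.80]
unstable = [0.95, 0.60, 0.90, 0.65, 0.90]

mu_s, sd_s = report(stable)
mu_u, sd_u = report(unstable)
# Same mean, very different stability: sd_s is far smaller than sd_u.
```

Reporting only the mean would make the two models look interchangeable; the σ is what separates them.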

Confidence

How Good is My Model? Part 4: To Compare, or Not to Compare

25 minute read

Published:

In this post, I return to the “How Good is My Model?” lane and continue the journey. However, since the next stop is “Cross-Validation land,” which in my field is mainly about model comparison, one needs to pass a sanity check before moving on to comparison. This check should indicate whether I am ready to start comparing models—or not yet.

Cross-Validation

How Good is My Model? Part 5: When cross-validation went rogue!

32 minute read

Published:

In the last technical post, I talked about how to tell when one is ready to start comparing models. I found that I needed to satisfy some conditions before concluding that my model is suitable for my data and representation. Now, assuming that I have such a suitable model and want to compare it to other suitable models—or that I found no such model and just want to see which of my suboptimal models is the least suboptimal—is cross-validation the next logical step?

Short answer: Not as we use it today!
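Since the posts under this tag question how cross-validation is used, the bare mechanism is worth having in front of you: a minimal k-fold index splitter, a plain-Python sketch rather than the post's own code:

```python
def kfold_indices(n, k):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.
    Earlier folds get the extra samples when n is not divisible by k."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size

# Every sample lands in exactly one test fold:
folds = list(kfold_indices(10, 3))
```

The k test scores this produces are what get averaged in practice; whether that average supports the comparisons we draw from it is exactly what the post interrogates.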

Data Limitations

How Good is My Model? Part 5: When cross-validation went rogue!

32 minute read

Published:

In the last technical post, I talked about how to tell when one is ready to start comparing models. I found that I needed to satisfy some conditions before concluding that my model is suitable for my data and representation. Now, assuming that I have such a suitable model and want to compare it to other suitable models—or that I found no such model and just want to see which of my suboptimal models is the least suboptimal—is cross-validation the next logical step?

Short answer: Not as we use it today!

Diagnostics

How Good is My Model? Part 4: To Compare, or Not to Compare

25 minute read

Published:

In this post, I return to the “How Good is My Model?” lane and continue the journey. However, since the next stop is “Cross-Validation land,” which in my field is mainly about model comparison, one needs to pass a sanity check before moving on to comparison. This check should indicate whether I am ready to start comparing models—or not yet.

Distributions

How Good is My Model? Part 5: When cross-validation went rogue!

32 minute read

Published:

In the last technical post, I talked about how to tell when one is ready to start comparing models. I found that I needed to satisfy some conditions before concluding that my model is suitable for my data and representation. Now, assuming that I have such a suitable model and want to compare it to other suitable models—or that I found no such model and just want to see which of my suboptimal models is the least suboptimal—is cross-validation the next logical step?

Short answer: Not as we use it today!

How Good is My Model? Part 4: To Compare, or Not to Compare

25 minute read

Published:

In this post, I return to the “How Good is My Model?” lane and continue the journey. However, since the next stop is “Cross-Validation land,” which in my field is mainly about model comparison, one needs to pass a sanity check before moving on to comparison. This check should indicate whether I am ready to start comparing models—or not yet.

How Good is My Model? Part 3: Truce with Small Datasets.

15 minute read

Published:

In the last two posts, I was trying to answer a question, only to find it entangled with the limitations of reality. To know how good a model is, I need to pin down the true distribution of its performance—to see what is common and what is rare. However, this requires a large number of i.i.d. samples, which unfortunately are not easy to come by in our field.

How Good is My Model? Part 2: LLN, CLT, i.i.d., and a Messy World

17 minute read

Published:

In the last post, I discussed how a model’s performance is more faithfully reported as a distribution than as a single value, why it’s important to report it mindfully so as to distinguish what is typical from what is expected, and how the standard deviation (σ) can quantify a model’s stability. However, I also noted that one distribution from a small test set is not guaranteed to reflect the model’s true distribution. In this post, I’ll discuss how to move past this bottleneck.

How Good is My Model? Part 1: µ ± σ and a Bit More

13 minute read

Published:

In my first blog post, I made two bold claims about transformer models for molecular property prediction in our review [1] — namely, that they fall short in terms of both novelty and benchmarking. In this and the following posts, I hope to convince you of these conclusions.

  1. Publicly accessible pre-print version here 

Evaluation

How Good is My Model? Part 5: When cross-validation went rogue!

32 minute read

Published:

In the last technical post, I talked about how to tell when one is ready to start comparing models. I found that I needed to satisfy some conditions before concluding that my model is suitable for my data and representation. Now, assuming that I have such a suitable model and want to compare it to other suitable models—or that I found no such model and just want to see which of my suboptimal models is the least suboptimal—is cross-validation the next logical step?

Short answer: Not as we use it today!

How Good is My Model? Part 4: To Compare, or Not to Compare

25 minute read

Published:

In this post, I return to the “How Good is My Model?” lane and continue the journey. However, since the next stop is “Cross-Validation land,” which in my field is mainly about model comparison, one needs to pass a sanity check before moving on to comparison. This check should indicate whether I am ready to start comparing models—or not yet.

How Good is My Model? Part 3: Truce with Small Datasets.

15 minute read

Published:

In the last two posts, I was trying to answer a question, only to find it entangled with the limitations of reality. To know how good a model is, I need to pin down the true distribution of its performance—to see what is common and what is rare. However, this requires a large number of i.i.d. samples, which unfortunately are not easy to come by in our field.

How Good is My Model? Part 2: LLN, CLT, i.i.d., and a Messy World

17 minute read

Published:

In the last post, I discussed how a model’s performance is more faithfully reported as a distribution than as a single value, why it’s important to report it mindfully so as to distinguish what is typical from what is expected, and how the standard deviation (σ) can quantify a model’s stability. However, I also noted that one distribution from a small test set is not guaranteed to reflect the model’s true distribution. In this post, I’ll discuss how to move past this bottleneck.

How Good is My Model? Part 1: µ ± σ and a Bit More

13 minute read

Published:

In my first blog post, I made two bold claims about transformer models for molecular property prediction in our review [1] — namely, that they fall short in terms of both novelty and benchmarking. In this and the following posts, I hope to convince you of these conclusions.

  1. Publicly accessible pre-print version here 

The Beginning: Something is Broken!

8 minute read

Published:

So, my PhD topic is to implement transformer models for toxicity prediction and molecular generation. Or, in plain English: given a molecule (e.g., a drug), I want to train an “AI” model to predict its possible toxic effects on the body and the environment. Even better, to create new molecules that will be safe biologically and environmentally. However, in this post, I will show you how I had to make a major detour before I could even attempt to do so!

Expected vs Typical

How Good is My Model? Part 1: µ ± σ and a Bit More

13 minute read

Published:

In my first blog post, I made two bold claims about transformer models for molecular property prediction in our review [1] — namely, that they fall short in terms of both novelty and benchmarking. In this and the following posts, I hope to convince you of these conclusions.

  1. Publicly accessible pre-print version here 

Homoscedasticity

How Good is My Model? Part 4: To Compare, or Not to Compare

25 minute read

Published:

In this post, I return to the “How Good is My Model?” lane and continue the journey. However, since the next stop is “Cross-Validation land,” which in my field is mainly about model comparison, one needs to pass a sanity check before moving on to comparison. This check should indicate whether I am ready to start comparing models—or not yet.

Independent and Identically Distributed

How Good is My Model? Part 2: LLN, CLT, i.i.d., and a Messy World

17 minute read

Published:

In the last post, I discussed how a model’s performance is more faithfully reported as a distribution than as a single value, why it’s important to report it mindfully so as to distinguish what is typical from what is expected, and how the standard deviation (σ) can quantify a model’s stability. However, I also noted that one distribution from a small test set is not guaranteed to reflect the model’s true distribution. In this post, I’ll discuss how to move past this bottleneck.

Language models

The Beginning: Something is Broken!

8 minute read

Published:

So, my PhD topic is to implement transformer models for toxicity prediction and molecular generation. Or, in plain English: given a molecule (e.g., a drug), I want to train an “AI” model to predict its possible toxic effects on the body and the environment. Even better, to create new molecules that will be safe biologically and environmentally. However, in this post, I will show you how I had to make a major detour before I could even attempt to do so!

Law of Large Number (LLN)

How Good is My Model? Part 2: LLN, CLT, i.i.d., and a Messy World

17 minute read

Published:

In the last post, I discussed how a model’s performance is more faithfully reported as a distribution than as a single value, why it’s important to report it mindfully so as to distinguish what is typical from what is expected, and how the standard deviation (σ) can quantify a model’s stability. However, I also noted that one distribution from a small test set is not guaranteed to reflect the model’s true distribution. In this post, I’ll discuss how to move past this bottleneck.

Machine learning (ML)

How Good is My Model? Part 5: When cross-validation went rogue!

32 minute read

Published:

In the last technical post, I talked about how to tell when one is ready to start comparing models. I found that I needed to satisfy some conditions before concluding that my model is suitable for my data and representation. Now, assuming that I have such a suitable model and want to compare it to other suitable models—or that I found no such model and just want to see which of my suboptimal models is the least suboptimal—is cross-validation the next logical step?

Short answer: Not as we use it today!

How Good is My Model? Part 4: To Compare, or Not to Compare

25 minute read

Published:

In this post, I return to the “How Good is My Model?” lane and continue the journey. However, since the next stop is “Cross-Validation land,” which in my field is mainly about model comparison, one needs to pass a sanity check before moving on to comparison. This check should indicate whether I am ready to start comparing models—or not yet.

How Good is My Model? Part 3: Truce with Small Datasets.

15 minute read

Published:

In the last two posts, I was trying to answer a question, only to find it entangled with the limitations of reality. To know how good a model is, I need to pin down the true distribution of its performance—to see what is common and what is rare. However, this requires a large number of i.i.d. samples, which unfortunately are not easy to come by in our field.

How Good is My Model? Part 2: LLN, CLT, i.i.d., and a Messy World

17 minute read

Published:

In the last post, I discussed how a model’s performance is more faithfully reported as a distribution than as a single value, why it’s important to report it mindfully so as to distinguish what is typical from what is expected, and how the standard deviation (σ) can quantify a model’s stability. However, I also noted that one distribution from a small test set is not guaranteed to reflect the model’s true distribution. In this post, I’ll discuss how to move past this bottleneck.

How Good is My Model? Part 1: µ ± σ and a Bit More

13 minute read

Published:

In my first blog post, I made two bold claims about transformer models for molecular property prediction in our review [1] — namely, that they fall short in terms of both novelty and benchmarking. In this and the following posts, I hope to convince you of these conclusions.

  1. Publicly accessible pre-print version here 

The Beginning: Something is Broken!

8 minute read

Published:

So, my PhD topic is to implement transformer models for toxicity prediction and molecular generation. Or, in plain English: given a molecule (e.g., a drug), I want to train an “AI” model to predict its possible toxic effects on the body and the environment. Even better, to create new molecules that will be safe biologically and environmentally. However, in this post, I will show you how I had to make a major detour before I could even attempt to do so!

Model Limitations

How Good is My Model? Part 5: When cross-validation went rogue!

32 minute read

Published:

In the last technical post, I talked about how to tell when one is ready to start comparing models. I found that I needed to satisfy some conditions before concluding that my model is suitable for my data and representation. Now, assuming that I have such a suitable model and want to compare it to other suitable models—or that I found no such model and just want to see which of my suboptimal models is the least suboptimal—is cross-validation the next logical step?

Short answer: Not as we use it today!

Molecular Property Prediction (MPP)

The Beginning: Something is Broken!

8 minute read

Published:

So, my PhD topic is to implement transformer models for toxicity prediction and molecular generation. Or, in plain English: given a molecule (e.g., a drug), I want to train an “AI” model to predict its possible toxic effects on the body and the environment. Even better, to create new molecules that will be safe biologically and environmentally. However, in this post, I will show you how I had to make a major detour before I could even attempt to do so!

Reporting

How Good is My Model? Part 5: When cross-validation went rogue!

32 minute read

Published:

In the last technical post, I talked about how to tell when one is ready to start comparing models. I found that I needed to satisfy some conditions before concluding that my model is suitable for my data and representation. Now, assuming that I have such a suitable model and want to compare it to other suitable models—or that I found no such model and just want to see which of my suboptimal models is the least suboptimal—is cross-validation the next logical step?

Short answer: Not as we use it today!

How Good is My Model? Part 4: To Compare, or Not to Compare

25 minute read

Published:

In this post, I return to the “How Good is My Model?” lane and continue the journey. However, since the next stop is “Cross-Validation land,” which in my field is mainly about model comparison, one needs to pass a sanity check before moving on to comparison. This check should indicate whether I am ready to start comparing models—or not yet.

How Good is My Model? Part 3: Truce with Small Datasets.

15 minute read

Published:

In the last two posts, I was trying to answer a question, only to find it entangled with the limitations of reality. To know how good a model is, I need to pin down the true distribution of its performance—to see what is common and what is rare. However, this requires a large number of i.i.d. samples, which unfortunately are not easy to come by in our field.

How Good is My Model? Part 2: LLN, CLT, i.i.d., and a Messy World

17 minute read

Published:

In the last post, I discussed how a model’s performance is more faithfully reported as a distribution than as a single value, why it’s important to report it mindfully so as to distinguish what is typical from what is expected, and how the standard deviation (σ) can quantify a model’s stability. However, I also noted that one distribution from a small test set is not guaranteed to reflect the model’s true distribution. In this post, I’ll discuss how to move past this bottleneck.

How Good is My Model? Part 1: µ ± σ and a Bit More

13 minute read

Published:

In my first blog post, I made two bold claims about transformer models for molecular property prediction in our review [1] — namely, that they fall short in terms of both novelty and benchmarking. In this and the following posts, I hope to convince you of these conclusions.

  1. Publicly accessible pre-print version here 

Sample quality

How Good is My Model? Part 2: LLN, CLT, i.i.d., and a Messy World

17 minute read

Published:

In the last post, I discussed how a model’s performance is more faithfully reported as a distribution than as a single value, why it’s important to report it mindfully so as to distinguish what is typical from what is expected, and how the standard deviation (σ) can quantify a model’s stability. However, I also noted that one distribution from a small test set is not guaranteed to reflect the model’s true distribution. In this post, I’ll discuss how to move past this bottleneck.

Sample size

How Good is My Model? Part 2: LLN, CLT, i.i.d., and a Messy World

17 minute read

Published:

In the last post, I discussed how a model’s performance is more faithfully reported as a distribution than as a single value, why it’s important to report it mindfully so as to distinguish what is typical from what is expected, and how the standard deviation (σ) can quantify a model’s stability. However, I also noted that one distribution from a small test set is not guaranteed to reflect the model’s true distribution. In this post, I’ll discuss how to move past this bottleneck.

How Good is My Model? Part 1: µ ± σ and a Bit More

13 minute read

Published:

In my first blog post, I made two bold claims about transformer models for molecular property prediction in our review [1] — namely, that they fall short in terms of both novelty and benchmarking. In this and the following posts, I hope to convince you of these conclusions.

  1. Publicly accessible pre-print version here 

Small Datasets

How Good is My Model? Part 3: Truce with Small Datasets.

15 minute read

Published:

In the last two posts, I was trying to answer a question, only to find it entangled with the limitations of reality. To know how good a model is, I need to pin down the true distribution of its performance—to see what is common and what is rare. However, this requires a large number of i.i.d. samples, which unfortunately are not easy to come by in our field.

Standards

How Good is My Model? Part 5: When cross-validation went rogue!

32 minute read

Published:

In the last technical post, I talked about how to tell when one is ready to start comparing models. I found that I needed to satisfy some conditions before concluding that my model is suitable for my data and representation. Now, assuming that I have such a suitable model and want to compare it to other suitable models—or that I found no such model and just want to see which of my suboptimal models is the least suboptimal—is cross-validation the next logical step?

Short answer: Not as we use it today!

How Good is My Model? Part 4: To Compare, or Not to Compare

25 minute read

Published:

In this post, I return to the “How Good is My Model?” lane and continue the journey. However, since the next stop is “Cross-Validation land,” which in my field is mainly about model comparison, one needs to pass a sanity check before moving on to comparison. This check should indicate whether I am ready to start comparing models—or not yet.

How Good is My Model? Part 3: Truce with Small Datasets.

15 minute read

Published:

In the last two posts, I was trying to answer a question, only to find it entangled with the limitations of reality. To know how good a model is, I need to pin down the true distribution of its performance—to see what is common and what is rare. However, this requires a large number of i.i.d. samples, which unfortunately are not easy to come by in our field.

How Good is My Model? Part 2: LLN, CLT, i.i.d., and a Messy World

17 minute read

Published:

In the last post, I discussed how a model’s performance is more faithfully reported as a distribution than as a single value, why it’s important to report it mindfully so as to distinguish what is typical from what is expected, and how the standard deviation (σ) can quantify a model’s stability. However, I also noted that one distribution from a small test set is not guaranteed to reflect the model’s true distribution. In this post, I’ll discuss how to move past this bottleneck.

How Good is My Model? Part 1: µ ± σ and a Bit More

13 minute read

Published:

In my first blog post, I made two bold claims about transformer models for molecular property prediction in our review [1] — namely, that they fall short in terms of both novelty and benchmarking. In this and the following posts, I hope to convince you of these conclusions.

  1. Publicly accessible pre-print version here 

The Beginning: Something is Broken!

8 minute read

Published:

So, my PhD topic is to implement transformer models for toxicity prediction and molecular generation. Or, in plain English: given a molecule (e.g., a drug), I want to train an “AI” model to predict its possible toxic effects on the body and the environment. Even better, to create new molecules that will be safe biologically and environmentally. However, in this post, I will show you how I had to make a major detour before I could even attempt to do so!

Statistics

How Good is My Model? Part 5: When cross-validation went rogue!

32 minute read

Published:

In the last technical post, I talked about how to tell when one is ready to start comparing models. I found that I needed to satisfy some conditions before concluding that my model is suitable for my data and representation. Now, assuming that I have such a suitable model and want to compare it to other suitable models—or that I found no such model and just want to see which of my suboptimal models is the least suboptimal—is cross-validation the next logical step?

Short answer: Not as we use it today!

How Good is My Model? Part 4: To Compare, or Not to Compare

25 minute read

Published:

In this post, I return to the “How Good is My Model?” lane and continue the journey. However, since the next stop is “Cross-Validation land,” which in my field is mainly about model comparison, one needs to pass a sanity check before moving on to comparison. This check should indicate whether I am ready to start comparing models—or not yet.

How Good is My Model? Part 3: Truce with Small Datasets.

15 minute read

Published:

In the last two posts, I was trying to answer a question, only to find it entangled with the limitations of reality. To know how good a model is, I need to pin down the true distribution of its performance—to see what is common and what is rare. However, this requires a large number of i.i.d. samples, which unfortunately are not easy to come by in our field.

How Good is My Model? Part 2: LLN, CLT, i.i.d., and a Messy World

17 minute read

Published:

In the last post, I discussed how a model’s performance is more faithfully reported as a distribution than as a single value, why it’s important to report it mindfully so as to distinguish what is typical from what is expected, and how the standard deviation (σ) can quantify a model’s stability. However, I also noted that one distribution from a small test set is not guaranteed to reflect the model’s true distribution. In this post, I’ll discuss how to move past this bottleneck.

How Good is My Model? Part 1: µ ± σ and a Bit More

13 minute read

Published:

In my first blog post, I made two bold claims about transformer models for molecular property prediction in our review [1] — namely, that they fall short in terms of both novelty and benchmarking. In this and the following posts, I hope to convince you of these conclusions.

  1. Publicly accessible pre-print version here 

Transformers

The Beginning: Something is Broken!

8 minute read

Published:

So, my PhD topic is to implement transformer models for toxicity prediction and molecular generation. Or, in plain English: given a molecule (e.g., a drug), I want to train an “AI” model to predict its possible toxic effects on the body and the environment. Even better, to create new molecules that will be safe biologically and environmentally. However, in this post, I will show you how I had to make a major detour before I could even attempt to do so!

Trustworthy ML

How Good is My Model? Part 5: When cross-validation went rogue!

32 minute read

Published:

In the last technical post, I talked about how to tell when one is ready to start comparing models. I found that I needed to satisfy some conditions before concluding that my model is suitable for my data and representation. Now, assuming that I have such a suitable model and want to compare it to other suitable models—or that I found no such model and just want to see which of my suboptimal models is the least suboptimal—is cross-validation the next logical step?

Short answer: Not as we use it today!

How Good is My Model? Part 4: To Compare, or Not to Compare

25 minute read

Published:

In this post, I return to the “How Good is My Model?” lane and continue the journey. However, since the next stop is “Cross-Validation land,” which in my field is mainly about model comparison, one needs to pass a sanity check before moving on to comparison. This check should indicate whether I am ready to start comparing models—or not yet.

How Good is My Model? Part 3: Truce with Small Datasets.

15 minute read

Published:

In the last two posts, I was trying to answer a question, only to find it entangled with the limitations of reality. To know how good a model is, I need to pin down the true distribution of its performance—to see what is common and what is rare. However, this requires a large number of i.i.d. samples, which unfortunately are not easy to come by in our field.

How Good is My Model? Part 2: LLN, CLT, i.i.d., and a Messy World

17 minute read

Published:

In the last post, I discussed how a model’s performance is more faithfully reported as a distribution than as a single value, why it’s important to report it mindfully so as to distinguish what is typical from what is expected, and how the standard deviation (σ) can quantify a model’s stability. However, I also noted that one distribution from a small test set is not guaranteed to reflect the model’s true distribution. In this post, I’ll discuss how to move past this bottleneck.

How Good is My Model? Part 1: µ ± σ and a Bit More

13 minute read

Published:

In my first blog post, I made two bold claims about transformer models for molecular property prediction in our review [1] — namely, that they fall short in terms of both novelty and benchmarking. In this and the following posts, I hope to convince you of these conclusions.

  1. Publicly accessible pre-print version here 

Uncertainty

How Good is My Model? Part 3: Truce with Small Datasets.

15 minute read

Published:

In the last two posts, I was trying to answer a question, only to find it entangled with the limitations of reality. To know how good a model is, I need to pin down the true distribution of its performance—to see what is common and what is rare. However, this requires a large number of i.i.d. samples, which unfortunately are not easy to come by in our field.