Chai-1: Decoding the molecular interactions of life

269 points, posted 5 days ago
by glowingvoices

62 Comments

xianshou

5 days ago

In light of last week's fiasco with Reflection (https://venturebeat.com/ai/new-open-source-ai-leader-reflect...), I hope the community has a newfound enthusiasm for independent testing!

This is extremely exciting news if true, so I'm eager to have it either confirmed or questioned. The one thing I hope we won't be doing is accepting SOTA evals from open-sourced models at face value.

deisteve

5 days ago

I don't know how people like Matt Shumer can attempt what looks like fraud and deception, have it chalked up as a giant oopsie (which isn't really convincing), and not face any consequences.

For the rest of us, this is a privilege that we don't have. We can't deceive or defraud our investors because it has real consequences... but not for people like Matt Shumer. Why is that?

Loughla

4 days ago

Having mountains of money, in the US, is equated to being smart and better than. This means that failures, unless they purposefully exploit other better thans, are always forgivable. Even when they're mildly intentional.

mupuff1234

4 days ago

Pretty sure it's not a US only thing.

mecsred

4 days ago

Just imagine the legal system as a money duel. If you have little money you can be crushed at no cost. Trying to fight someone with big money, even if you're likely to win, will take a lot of time and money. Unless the fraud was black and white or you're in for the long haul it's easier just to lick the wounds.

parentheses

4 days ago

Does that logic apply to the State, which is usually the plaintiff?

Doesn't seem so since they have seemingly endless capital but have limits in what they can bring to bear. You tell me...

nayroclade

4 days ago

"The state" is not a monolith. Anti-fraud enforcement is handled by agencies with limited budgets and resources. Often they are deliberately underfunded and understaffed precisely so they cannot cause too much damage and embarrassment by going after really big targets.

wslh

5 days ago

Theranos everywhere? Except you can’t afford to mess up when it comes to health.

f6v

4 days ago

Oh, pharma messes up all the time.

But it’s an interesting question. You can’t be too risk-averse because there’re thousands of patients dying horrible deaths every single day. There’s simply a need for bold approaches in many areas of medicine.

wslh

4 days ago

I'd like to add a perspective that might not resonate with everyone, based on the famous quote: "Any sufficiently advanced technology is indistinguishable from magic." I sometimes adapt this to say: "Any sufficiently advanced technology is indistinguishable from a scam."

mmmore

5 days ago

Does the use of "foundation" and "multi-modal" for describing this model mean anything, or are those just used as buzzwords? Funnily enough, the only place those terms appear in the paper is in the abstract.

Also the paper says they basically copied the methods used for AlphaFold, but then included the ability to input language embeddings, and input some other side constraints that I don't have the biology knowledge to understand. They don't show any data that indicate how much these changes improve performance. They show a very modest improvement over AF3 (small enough that I would think it could be achieved through randomness/small variations in the training parameters). So I don't think this is very revolutionary, but I suppose it replicates AF3.

dekhn

5 days ago

If by "multi-modal", you mean "it takes several different datatypes as input or output", then yes, it's multi-modal. See Figure 1 in the Tech Report.

alexk101

5 days ago

Foundational maybe isn't the best label for this kind of model. My understanding of foundational models is that they are made to be a baseline which can be further fine tuned for specific downstream tasks. This seems more like an already fine tuned model, but I haven't looked carefully enough at the methodology to say.

lainga

5 days ago

Would you then call it a buzzword, or is there some gentler excluded-middle interpretation of that word's application to the project?

IanCal

4 days ago

I don't think it's a particular buzzword here. They claim it's useful across a range of tasks, and that's the key part imo.

Now, "predictions for parts of drug discovery" isn't the widest range, so perhaps you need to consider "foundation" as somewhat context dependent, but I don't think it's a wild claim. Neither "foundation" nor "fine tuned" are really better than each other, but those are probably the two ends of a spectrum here.

My get-out clause here is that someone with a better understanding of the field may say these are actually extremely narrowly trained things, and the tests are equivalent to multiple different coding problem challenges rather than programming/translation/poetry/etc.

brookst

5 days ago

It’s about like referring to a famous person’s red carpet attire as “off the shelf [designer name]”. It downplays the effort that went into it more than anything.

ashvardanian

4 days ago

There is a pretty noticeable improvement for antibody-antigen interactions - looks like double-digit percents. Check out figure 4 here: https://chaiassets.com/chai-1/paper/technical_report_v1.pdf

mmmore

4 days ago

Figure 4 is comparing the model with itself, unless I'm misunderstanding it. The takeaway seems to be the model performs better if you give it extra "constraints", i.e. extra info already known about the protein.

The table with a comparison to alpha fold gives a less than one percentage point improvement.

bbstats

4 days ago

the error bars are like 5-10x the size of that 'defeat'
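
To make the error-bar point concrete, here is a minimal sketch, assuming two success rates measured on independent benchmark samples and a normal approximation to the binomial. The function name and the example numbers are made up for illustration; they are not figures from the Chai-1 or AF3 reports.

```python
import math

def diff_with_ci(p_a, n_a, p_b, n_b, z=1.96):
    """Difference of two success rates with an approximate 95% interval.

    Assumes independent samples and the normal approximation to the
    binomial; both are simplifications.
    """
    diff = p_a - p_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return diff, diff - z * se, diff + z * se

# Hypothetical inputs, NOT taken from either paper.
diff, lo, hi = diff_with_ci(0.77, 400, 0.76, 400)

# If the interval straddles zero, the gap is within noise at this sample size.
significant = lo > 0 or hi < 0
```

With a few hundred targets, an interval like this is several points wide, so a one-point gap cannot be separated from noise. When both models are scored on the same targets, a paired test would be the better tool; this sketch ignores that.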

mandoline

4 days ago

This is an exciting result – but knowledge of protein structure is usually not a limiting factor in drug discovery: https://www.chemistryworld.com/opinion/why-alphafold-wont-re...

Would be interesting to try to estimate the impact of results like these across the drug development pipeline.

E.g. N% improvement on our most predictive benchmarks X, Y, Z could impact clinical success by M% +- E% (where E would likely be quite large).

LarsDu88

4 days ago

I was just working on a small protein diffusion model and felt bad when I started copying and pasting quaternion functions from pytorch3d to avoid dependency hell.

Lo and behold I see Chai did the same shit in their repo. Lol

zan2434

5 days ago

This is both awesome and feels very dangerous to release publicly, no? Can’t this be used to discover novel bioweapons as easily as it can be used to discover new medicines?

Genuinely curious, would love to learn if that isn’t true / or is generally just not that big of a deal compared to other risks.

matrix2003

5 days ago

We already have some pretty horrific and documented/accessible bioweapons.

This gets into the philosophy of restricting access to knowledge. The conclusion I keep arriving at is that we’re lucky that there don’t appear to be many Timothy McVeighs walking around. I don’t think there is a practical defense from people like that.

cowsandmilk

5 days ago

I think you overestimate the difficulty of discovering bioweapons. There is a reason toxicology is the dead end for tons of drug molecules. It is very easy already to design molecules that will kill someone.

emporas

4 days ago

Even the word bioweapon is not accurate to describe a deadly (or harmful) biological agent. A weapon usually means that there is a source of deadly force, and a target. The source doesn't want to be hit by the same weapon it uses to hit others.

This is vastly difficult to achieve using biology. Any organism on the planet has its own agency, and it will hit anything to reproduce and eat. In addition this is not limited to toxicology and releasing toxins, because the agent can just eat tissue.

For example phosphorus has been used in chemical warfare, but even that cannot be described 100% as a weapon. The phosphorus gas can hit people who released it the same as everyone else, it just depends on the wind.

Right now, on everyone's palms, there are thousands of organisms which create electricity, eat wood and kill animals. Given that the palms are washed, that number is reduced to some thousand different species. If the palms have not been washed in the last 24 hours, that number shoots up to a hundred thousand different species, even millions.

I do not see any difficulty for someone to enhance a harmful agent and make it deadly, using just regular computation and not even A.I.. However the person who facilitated this, will be a target too.

zan2434

4 days ago

This actually makes a lot of sense! Sounds like finding dangerous chemicals is easy and is not the actual limitation at all.

whymauri

4 days ago

As someone who worked in molecular ADMET, this x1000.

taspeotis

5 days ago

This is as unethical as that time JVC released VHS which allowed people to record videos but also pirate content!!1

mmmore

5 days ago

You'd have to work at the RIAA to think that piracy and bioweapons are comparable.

I don't know how much releasing this model is a delta on safety, but we certainly need to do a better job of vetting who can order viruses; my understanding is there are very few restrictions right now. This will become more important as models get more capable.

zan2434

5 days ago

Clear snark aside, content piracy has pretty bounded risks so isn’t a reasonable comparison

dekhn

5 days ago

Nobody has really been able to make a convincing argument about whether these sorts of tools haven't led to large-scale terrorism through bioweapons because the underlying problem is hard (for a sufficiently motivated adversary), or because terrorists don't have the resources/knowledge/skill. As far as we can tell, the sufficiently motivated adversaries who have tried either failed, succeeded secretly, or were convinced to walk back from the brink due to the potential consequences.

In short there are other ways to negatively affect large numbers of people that are easier, and presumably those avenues are being explored first. But we don't know what we don't know.

peterldowns

5 days ago

If you're implying that the answer is "yes this is too dangerous", could you possibly give a few examples of technological developments that aren't "very dangerous to release publicly" by the same standard?

For instance, would any of the following technologies be acceptably "safe"?

- physical locks (makes it possible to keep work secret or inaccessible to the government)

- solar power (power is suddenly much cheaper, means bad guys can do more with less money)

- general workload computers (run arbitrary code, including bad things)

- printing press (ideology spreads much more quickly, erodes elite hold over culture)

- Haber-Bosch process (necessary for creating the ammunition needed to fight the world wars)

mmmore

5 days ago

You left out the most relevant comparison:

- nuclear fission, which provides an abundant source of environmentally friendly energy, but allows people to make bombs capable of wiping out whole cities at once (and potentially causing nuclear winter)

But even in that case, I believe that it's a good thing that we have access to nuclear power, and I certainly want us to use more nuclear power. At the same time, I'm very glad that a bomb is hard enough to make that ISIS couldn't do it, let alone any number of lone wolf terrorists. So I think I would apply the same logic to biotechnology; speeding up medical progress seems extremely valuable and I'm excited about how AF and other AI systems can help with this, but we should mitigate the ability for bad actors to use the same tools for evil.

An aspect that's unique about biotechnology that's different in comparison to the examples you gave is that most of those technologies help good and bad people approximately equally, and since there's many more reasonable than crazy people they're not super dangerous.

There's a concern that technologies that make bioengineering easier could make it easier to produce and proliferate novel pathogens, much more so than they make it easier to prevent pandemics; in other words, it favors "offense" more than "defense". The only example you listed that has a similar dynamic in my mind is the Haber-Bosch process, but that has large positive downstream effects separate from its use for ammunition. Again, this is not to say we should stop medical progress, but that we should act to mitigate the dangers, and keep this concept in mind as the technology progresses.

That said, I'm not certain how much the current tools are dangerous in this way. My understanding is that there is lower hanging fruit in mitigating these issues right now; for example, better controls at labs studying viruses, and better vetting of people who order pathogens online.

dosinga

4 days ago

The printing press indeed led to religious wars in Europe. The Ottomans banned it and avoided that fate. And the progress associated with it.

echelon

5 days ago

The science to restrict is molecular biology (bacteria) or virology, not applied mathematics (AI). These folks can already do some wild things with the materials they have on hand and don't need fancy AI to help them.

Structure prediction is just one small slice of all of the things you'd need to do. Choosing a vector, culturing it, splicing it into an appropriate location for regulation, making sure it's compatible with the environment, making sure your payload is conserved, study the mechanism of infection and make sure all of the steps are unimpeded, make sure it works with all of the host and vector kinetics, study the pathology, study the epidemiology. And that's just for starters.

This would require a team and an enormous amount of resources. People motivated enough to do this can already do it and don't need the AI piece.

f6v

4 days ago

There’s still a long way from in-silico prediction to wet-lab validation. You need a full-blown molecular biology lab to test any of these.

Then again, you can just release existing dangerous pathogens. Like, poison a water with something deadly. So you don’t need a new one if you’re a terrorist.

crackalamoo

5 days ago

Not a solution, but maybe if a bad actor tried to create a bioweapon, a trusted organization could use this technology as an antidote. Unfortunately this still leaves the possibility of some kind of insidious, undetectable bioweapon.

m00x

5 days ago

No, it's a very small piece for what you'd need to make bioweapons.

d_silin

5 days ago

...as difficult as discovering new medicines, you mean?

Chemistry and molecular biology are fiendishly complicated fields, far more complex and less predictable than what the general public (and most non-biochem STEM majors) imagine them to be.

How do I know? I thought of one brilliant startup idea that would solve so many of the world's problems if only we used computers to simulate biological systems.

Result: https://xkcd.com/1831/

Reference materials:

https://www.amazon.ca/Molecular-Biology-Cell-Loose-Version/d...

I strongly recommend treating it as an introductory-level text, on the same level as "K&R - C Programming Language". Yes, all 1464 pages of it.

https://www.amazon.ca/Fundamentals-Systems-Biology-Synthetic...

On the same level as above text, but with more math.

https://www.amazon.com/Introduction-Computational-Chemistry-...

That or any other book on computational chemistry will give you an understanding why it is difficult to design anything of value in biological systems. ML can only help so much.

Also check out this page for entire field scope:

https://en.wikipedia.org/wiki/Omics

dekhn

5 days ago

MBoC is more like Knuth's textbooks. It's a towering monument to the achievements of humanity over the past 150 years (molecular biology proper is less than 100 years old). As well as being highly accessible (readable).

It's done in an interesting style, with lots of direct references to current literature. I was surprised to see a recent edition on IA: https://archive.org/details/alberts-molecular-biology-of-the...

glowingvoices

4 days ago

Thank you for the textbooks! I've started studying Molecular Biology of the Cell to prepare for undergrad, but this is the first time I've heard about the others.

Are there any other books you would recommend?

d_silin

4 days ago

Search for "computational biology" on Amazon, but I'd say go first to online courses if you have time and commitment, like:

https://www.coursera.org/specializations/bioinformatics

https://www.coursera.org/specializations/systems-biology

Also, check out

https://www.coursera.org/courses?query=computational%20biolo...

Then you will have a better understanding of the subject area and the literature to search.

glowingvoices

4 days ago

I'm still in high school, so I don't think I'll have time to fit the courses into my schedule. I'll definitely look for the books though! Thanks.

IncreasePosts

5 days ago

The saving grace of civilization is that, for the most part, terrorists are dumb.

mmmore

5 days ago

Unfortunately this is not always true. For example, one of the architects of the Tokyo subway sarin attacks[1], Masami Tsuchiya[2], had a masters in physical and organic chemistry.

[1] https://en.wikipedia.org/wiki/Tokyo_subway_sarin_attack

[2] https://en.wikipedia.org/wiki/Masami_Tsuchiya_(terrorist)

IncreasePosts

4 days ago

Yes, a lot of terrorists have engineering degrees also.

But they're also dumb, which is why they think terrorizing random people will positively improve the world in some direction they care about.

I won't go into details, but I think if I had 19 dudes with a death wish in America, and a few million dollars, I could do something far worse than 9/11.

sudosysgen

4 days ago

The goal of an attack like 9/11 isn't really to kill the maximum number of civilians in order to terrorize random people.

The attack had a significant degree of symbolism. The intended audience was twofold: the Western public and leadership, with a durable message that they weren't untouchable (hence the attacks on the Pentagon and attempt on the Capitol), hence targeting large landmarks; the combination of civilian and military targets was to signify that they held the two to be equivalent. Plans were actually presented to attack other targets that would lead to more casualties, notably a nuclear power plant.

The other goal was to incite a religious conflict from the Muslim world against the US, and therefore probably from the US against as many Muslim countries as possible.

So the primary goal really wasn't to kill as many random people as possible (though of course that was a consideration), it was actually to target the tallest buildings possible as well as the most important government institutions.

Unfortunately, it really did move the world in the direction they wanted. Despite being extremely evil, they actually were remarkably successful at causing the social and geopolitical changes they wanted given the resources they had, and that caused yet more damage we shouldn't ignore. It also bears remembering (especially today) that terrorists often and unfortunately aren't as dumb as we think, and we underestimate them and simplify their motives to our peril.

pfisherman

5 days ago

There is a big gap between a master’s and a PhD, and then another between a PhD and a seasoned pro. To do something like a bioweapon, you would need a reasonably sized team of pros w/ a lot of capital intensive infrastructure. It would be virtually impossible to do in secret.

throwup238

5 days ago

How hard would it be for a biohacker to use these models to develop novel proteins? Let's say I wanted to take GFP and create another color fluorescent or something.

glowingvoices

5 days ago

I don't think it'd be too difficult. Train a PLM to generate proteins, validate with AF3, and send them off to a lab. You might want to read the ESM-3 paper if you're interested in stuff like this (not affiliated in any way).

pama

5 days ago

The title in HN is inaccurate. Having a 1% higher score on one metric is not beating a previously published model. This is a replicate, which is fine enough.

dang

4 days ago

Ah yes - thanks! We've changed it to the article title now.

Submitters: "Please submit the original source. If a post reports on something found on another site, submit the latter." - https://news.ycombinator.com/newsguidelines.html

(Submitted title was "Chai-1 Defeats AlphaFold 3")

dgfitz

5 days ago

Is there some sort of betting line I can make money off with all this? “-150 a new model isn’t released in the next month claiming it is currently the best at something” would let me retire years early.

If there is another line that said “+500 this model will be forgotten and useless in 6 months” could take my retirement from years to months.
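
For anyone unfamiliar with the notation, these are American (moneyline) odds: a negative line is the stake needed to win 100, a positive line is the profit on a 100 stake. A minimal sketch of the conversion to break-even probability follows; the function names are my own, and it ignores any bookmaker margin.

```python
def implied_probability(american_odds):
    """Break-even win probability implied by American (moneyline) odds."""
    if american_odds < 0:
        stake = -american_odds          # amount risked to win 100
        return stake / (stake + 100)
    return 100 / (american_odds + 100)  # 100 risked to win `american_odds`

def profit_per_100_staked(american_odds):
    """Profit on a winning bet of 100 units."""
    if american_odds < 0:
        return 100 * 100 / -american_odds
    return float(american_odds)

p_new_model = implied_probability(-150)  # 0.6
p_forgotten = implied_probability(500)   # 1/6, about 0.167
```

So the -150 line only pays off if the event happens more than 60% of the time, and the +500 line needs the event to happen more than about one time in six.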

anitil

5 days ago

I believe Manifold does this sort of thing, though I've never used it myself.

tfehring

5 days ago

Manifold [0] has markets on this sort of thing, but it primarily uses fake money. (They're working on a real-money "sweepstakes" thing, which I'm not super familiar with.) If you're outside the US and looking for a real-money market, Polymarket [1] is probably your best bet. In the US, real-money prediction market contracts are regulated by the CFTC, so availability of contracts is pretty limited; Kalshi [2] would be the most likely option, but I doubt they have anything on this topic.

[0] https://manifold.markets

[1] https://polymarket.com

[2] https://kalshi.com/

pants2

4 days ago

Your best bet in the US is to use Polymarket with a VPN

thefourthchime

5 days ago

-180 it’s a wrapper around alphafold with some pre prompt.

talldayo

4 days ago

> “-150 a new model isn’t released in the next month claiming it is currently the best at something” would let me retire years early.

An optimist and their seed funding are easily parted.

trott

4 days ago

I'm the author of AutoDock Vina (the most cited docking program, and the "runner-up" in the AlphaFold 3 paper)

Docking software is used to scan millions and billions of drug-like molecules looking for new potential binders. So it needs to be able to generalize, rather than just memorize.

But the evaluation approach used here and in the original paper (1) does not test how well the software will perform on novel molecules, because the test set is related to the training set.

If you understand the basics of ML and physics, you may be interested in my detailed critique here: https://olegtrott.substack.com/p/are-alphafolds-new-results-...

I'm glad that Chai-1 has been released though, as this will probably help people evaluate the method better.

(1) It looks like they are a bit different, as this paper allows 40% sequence identity. It's still high. I believe that sequences with 40% identity tend to have the same shapes, especially in the binding site, where it matters.
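
As a rough illustration of what a sequence-identity cutoff measures, here is a minimal sketch, assuming the two sequences are already aligned (same length, `-` for gaps) and that identity is counted over non-gap columns only. Definitions of the denominator vary between tools, so this is one convention among several and not necessarily the one either paper uses. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over columns where neither side is a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip gapped columns under this convention
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

def allowed_in_test_set(aligned_a, aligned_b, threshold=40.0):
    """True if a test sequence is below the identity cutoff to a training one."""
    return percent_identity(aligned_a, aligned_b) < threshold

# Toy, made-up sequences: 7 of 9 compared columns match.
identity = percent_identity("MKTAYIAK-Q", "MKSAYLAKRQ")
```

A filter like this only bounds whole-sequence similarity. Two proteins under the cutoff can still share a nearly identical binding site, which is the leakage concern raised above.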

uptownfunk

4 days ago

Thanks for your work and also for your comments on AF3 and Chai-1. It sounds like you are implying there are potentially gross and subtle types of dataset leakage taking place between the train and test sets, which are resulting in what seem to be inflated performance metrics? These are pretty serious issues if so. Also, I would agree with previous commenters that a marginal improvement over SOTA is proof more that they have recreated something than really made significant new progress. But this has been an issue with LLMs for some time now. But it sounds like they have some bright engineers from good brand-name companies who are coming together with some VC backing to try and do something in this space. I do appreciate that the weights are open. I would like to learn more about their future direction and their training methods.

marviel

5 days ago

> We are releasing Chai-1 via a web interface for free, including for commercial applications such as drug discovery. We are also releasing the code for Chai-1 for non-commercial use as a software library. We believe that when we build in partnership with the research and industrial communities, the entire ecosystem benefits.