Artificial intelligence meets biology: how OpenProtein.AI is putting protein design tools in the hands of scientists worldwide

Artificial intelligence has already proven it can speed up drug development and deepen our understanding of diseases. But there is a problem that few people talk about openly: the most powerful models that exist today are, in practice, out of reach for most scientists.

Not because of a lack of interest, but because the majority of biologists simply do not have training in machine learning. And that creates a curious paradox: the technology advances at breakneck speed, but the people who could benefit from it the most are stuck at the door, unable to get in.

This is exactly the gap between protein design and advanced computing that OpenProtein.AI set out to close. The company was founded by Tristan Bepler, who completed his PhD at MIT in 2020, and Tim Lu, a former MIT associate professor who completed his doctorate in 2007. Together, they built a platform that works as a bridge between two worlds that rarely communicate well: the technical universe of machine learning and the day-to-day reality of biology labs.

And what makes this story even more interesting is that it did not come from a big corporation trying to monetize a trend. It came from a simple yet powerful realization: the problem was not just technical, it was about access.

What is OpenProtein.AI and why it matters right now

OpenProtein.AI is a no-code computational biology platform that puts artificial intelligence tools directly into the hands of researchers who work with proteins. It does not require them to master programming languages or neural network architectures to get started. Instead of forcing the scientist to learn machine learning from scratch, the platform offers an intuitive web interface where users can upload their own experimental data, train models on it, adjust parameters, and apply the resulting predictive models. For those who prefer to integrate via code, the platform also provides APIs. This changes who gets to use these technologies: the biologist who ran the experiment can build and query the model directly, without a machine learning specialist in between.

The timing of this initiative could not be more relevant. We are living in a moment when protein design has moved beyond being an academic curiosity and has become one of the hottest frontiers in applied science. Proteins are the molecular machines that carry out virtually everything inside cells, and being able to engineer them with precision opens enormous doors: from developing new biopharmaceuticals to creating more efficient industrial enzymes and even solutions for environmental challenges. The catch is that this process has always been slow, expensive, and heavily dependent on trial and error. Artificial intelligence has the potential to drastically cut that cycle, but only if it is accessible to the people who actually understand the biology behind the problem.

As Bepler himself put it, this is an exciting moment because these models not only make protein engineering more efficient, shortening development cycles for therapeutics and industrial applications, but also expand the ability to design new proteins with specific characteristics. The company’s broader vision goes further: they are creating a language to describe biological systems and are already thinking about applying these approaches to modalities that go beyond proteins.

That is why the OpenProtein.AI proposition resonates so strongly within the scientific community. It does not try to replace the scientist or oversimplify the science. It offers an environment where the researcher’s biological knowledge combines with the computational power of modern predictive models, generating something that neither side could achieve on its own. The company also provides free access to the platform for academic scientists, reinforcing that the commitment to accessibility is real and not just rhetoric.

The academic journey that led to the creation of the platform

The story of OpenProtein.AI begins in the hallways of MIT. Bepler arrived at the university in 2014 as part of the doctoral program in Computational and Systems Biology, studying under the guidance of Bonnie Berger, Simons Professor of Mathematics at MIT. It was during this experience that he realized how much we still do not understand about the molecules that form the fundamental building blocks of biology.

According to Bepler, at the time biomolecules and proteins had not yet been characterized well enough to build good predictive models of the behavior of, for example, a complete genomic circuit or a protein interaction network. That realization led him to investigate proteins at a much more detailed level.

He began exploring ways to predict the amino acid chains that make up proteins by analyzing evolutionary data. This happened before DeepMind, Google's AI lab, released AlphaFold, the powerful protein structure prediction model. That work resulted in one of the first generative artificial intelligence models for understanding and designing proteins, what the team calls a protein language model.

What particularly excited Bepler was the classic protein framework and the relationships between sequence, structure, and function. These connections are still not well understood, and he wanted to figure out how to use foundation models to skip the structure component and go directly from sequence to function. That seemingly simple question carried enormous ambition and ended up shaping the entire trajectory of the company.

After completing his PhD in 2020, Bepler joined Tim Lu’s lab in the MIT Department of Biological Engineering as a postdoctoral researcher. Lu recalls that it was the era when the idea of integrating AI with biology was starting to gain traction. Bepler helped the lab build better computational models for biologics design, and together they noticed a real disconnect: the most advanced tools existed, but the biologists who would love to use them did not know how to code. OpenProtein.AI was born directly from the idea of broadening access to those tools.

How predictive models are transforming protein design

For decades, understanding how a protein folds in three-dimensional space was considered one of the greatest challenges in biology. The problem was so hard that it earned its own name, the protein folding problem, and it resisted repeated attempts using traditional methods. When AlphaFold delivered extraordinary results in predicting protein structures, the scientific world was stunned by what artificial intelligence was capable of. But predicting structures is only one part of the challenge. The next step, designing entirely new proteins with specific functions, demands even more sophisticated predictive models, and this is exactly where platforms like OpenProtein.AI come in.

The predictive models used in protein design work, in simplified terms, by learning patterns from enormous databases of known protein sequences. From those patterns, they can suggest which combinations of amino acids are most likely to produce a protein with a given characteristic, whether that is thermal stability, the ability to bind to a specific target, or resistance to certain chemical environments.
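To make the idea of "learning patterns from known sequences" concrete, here is a deliberately tiny sketch in Python. It is not how PoET or any OpenProtein.AI model works: it is a simple per-position frequency profile, the sequences are invented for illustration, and the function names are hypothetical. It only shows the basic principle that a model fitted to related sequences can rate a new candidate as more or less typical of the family.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def fit_profile(aligned_seqs, pseudocount=1.0):
    """Estimate per-position residue probabilities from equal-length sequences."""
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all sequences must have the same length")
    total = len(aligned_seqs) + pseudocount * len(AMINO_ACIDS)
    profile = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in aligned_seqs)
        # The pseudocount keeps unseen residues from getting probability zero.
        profile.append(
            {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}
        )
    return profile


def log_likelihood(profile, seq):
    """Sum of log-probabilities of each residue under the profile."""
    if len(seq) != len(profile):
        raise ValueError("sequence length does not match the profile")
    return sum(math.log(profile[i][aa]) for i, aa in enumerate(seq))


# Invented toy "family" of related sequences (not real proteins).
family = ["MKTAYIA", "MKSAYLA", "MRTAYIA", "MKTGYIA"]
profile = fit_profile(family)

typical = log_likelihood(profile, "MKTAYIA")  # resembles the family
unusual = log_likelihood(profile, "WWWWWWW")  # resembles nothing in it
print(typical > unusual)
```

Real protein language models capture far richer dependencies between positions, but the workflow is the same in spirit: fit on known sequences, then score candidates.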

What OpenProtein.AI does differently is allow researchers to feed these models with their own experimental data, making the predictions far more relevant to the specific context of each study. Instead of relying solely on generic public data, the scientist can build a model tailored to the reality of their lab.
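As a rough illustration of what "a model tailored to the reality of their lab" can mean, the sketch below fits a very simple additive model to a handful of made-up measurements. This is a classic baseline swapped in for illustration, not the method the platform uses: the wild-type sequence, the activity values, and all function names are assumptions.

```python
WILD_TYPE = "MKTAYIA"  # hypothetical parent sequence


def mutations(seq, wild_type=WILD_TYPE):
    """Return (position, residue) pairs where seq differs from the wild type."""
    if len(seq) != len(wild_type):
        raise ValueError("sequence length must match the wild type")
    return [(i, aa) for i, (aa, wt) in enumerate(zip(seq, wild_type)) if aa != wt]


def fit_additive_model(measurements, wild_type=WILD_TYPE):
    """Learn one effect per single mutation, relative to the wild-type value."""
    baseline = measurements[wild_type]
    effects = {}
    for seq, value in measurements.items():
        muts = mutations(seq, wild_type)
        if len(muts) == 1:  # only single mutants define an effect directly
            effects[muts[0]] = value - baseline
    return baseline, effects


def predict(seq, baseline, effects, wild_type=WILD_TYPE):
    """Predict by summing known effects; unseen mutations contribute nothing."""
    return baseline + sum(effects.get(m, 0.0) for m in mutations(seq, wild_type))


# Made-up assay results: sequence -> relative activity.
lab_data = {
    "MKTAYIA": 1.00,  # wild type
    "MRTAYIA": 1.30,
    "MKTGYIA": 0.60,
    "MKTAYLA": 1.10,
}
baseline, effects = fit_additive_model(lab_data)

# A double mutant the lab never measured, predicted from its two single effects.
double_mutant = predict("MRTAYLA", baseline, effects)
print(round(double_mutant, 2))
```

The point of the sketch is the direction of the data flow: measurements from one lab's own assay shape the predictions, rather than generic public data alone.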

PoET: the protein language model that changes the game

One of the standout features of the platform is PoET, which stands for Protein Evolutionary Transformer. This is the core protein language model at OpenProtein.AI, and it was trained on groups of proteins to generate sets of related proteins. Bepler and his collaborators demonstrated that PoET can generalize across evolutionary constraints in proteins and incorporate new protein sequence information without requiring full retraining. This means other researchers can add their own experimental data to improve the model, making it increasingly accurate for their specific use cases.

In practice, researchers can use their data to train models and optimize protein sequences, then use other tools on the platform to analyze those proteins. It is possible to generate entire libraries of protein sequences in silico (that is, on the computer) and then run those sequences through predictive models for validation and structural predictions. All of this is available through a no-code front end, although APIs are available for those who prefer to access the platform programmatically.

The models help researchers design proteins more quickly and then decide which ones are promising enough for additional lab testing. It is also possible to input proteins of interest and let the models generate new proteins with similar properties. This fundamentally transforms the experimentation cycle, reducing the number of blind attempts and directing resources toward candidates with the highest chance of success.
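The generate-then-prioritize loop described above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the parent sequence is invented, and `toy_score` is a placeholder standing in for whatever trained predictive model a lab would actually use. It is not an OpenProtein.AI function.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def single_site_library(parent):
    """Yield every sequence that differs from the parent at exactly one position."""
    for pos, original in enumerate(parent):
        for aa in AMINO_ACIDS:
            if aa != original:
                yield parent[:pos] + aa + parent[pos + 1:]


def top_candidates(library, score_fn, k):
    """Rank the library by predicted score and keep the k best for lab testing."""
    return sorted(library, key=score_fn, reverse=True)[:k]


def toy_score(seq):
    """Placeholder predictor: fraction of hydrophobic residues (illustrative only)."""
    return sum(aa in "AILMFVW" for aa in seq) / len(seq)


parent = "MKTAYIA"  # invented parent sequence
library = list(single_site_library(parent))
shortlist = top_candidates(library, toy_score, k=5)

print(len(library))  # 7 positions x 19 alternatives each
print(shortlist)
```

Swapping `toy_score` for a real model is the only change needed to turn this into a genuine triage step: the library stays on the computer, and only the shortlist goes to the bench.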

In 2024, OpenProtein.AI released a new version of the model, PoET-2, which outperforms much larger models while using only a fraction of the computational resources and experimental data. This kind of efficiency is particularly important because not every lab or company has access to supercomputers or unlimited budgets for cloud processing.

Accessible computational biology: the real impact on laboratories

When people talk about democratizing computational biology, it is easy for the message to sound like empty marketing. But when you look at what actually happens in research labs around the world, the situation is very concrete and quite frustrating. Many research groups have rich data, relevant scientific questions, and solid experimental capabilities, but they simply have no way to take advantage of the latest advances in artificial intelligence because there is no machine learning infrastructure available to them.

Hiring a data science specialist with expertise in computational biology is expensive and difficult, and the open tools available require a level of technical expertise that most biologists do not have — and should not need to have just to use a research support tool.

OpenProtein.AI tackles this problem head-on by offering an experience designed around the actual workflow of people who work with proteins. As Bepler explains, the team worked hard to make the platform an open toolbox with specific workflows but without being locked to a particular protein function or protein class. One of the major advantages of these models is that they are very good at understanding proteins broadly, learning about the entire space of possible proteins.

The practical impact of this shows up as shorter discovery cycles and more informed experimental decisions. Instead of synthesizing and testing dozens of protein variants in the dark, a lab using predictive models can prioritize candidates with the highest probability of success, saving time, reagents, and money. In a world where research funding is always scarce, this efficiency is not just convenient — it can be the deciding factor between a project moving forward or being shelved.

Partnerships with the pharmaceutical industry are already underway

The relevance of OpenProtein.AI goes beyond promises. Pharmaceutical giant Boehringer Ingelheim began using the platform in early 2025, and the companies recently announced an expanded collaboration. Under the partnership, the OpenProtein.AI platform and models will be integrated directly into Boehringer Ingelheim’s protein engineering work on treatments for diseases such as cancer and autoimmune or inflammatory conditions.

This type of partnership signals something important: the pharmaceutical industry, traditionally conservative when it comes to adopting new technologies, is recognizing that artificial intelligence tools for protein design are no longer experimental. They are an essential part of the pipeline for developing new therapies. And when a company the size of Boehringer Ingelheim bets in this direction, it is a strong indicator that the market as a whole is moving.

For smaller biotech companies and academic labs, the existence of a platform like OpenProtein.AI levels the playing field in a way that would have been unthinkable just a few years ago. A research group at a university in Brazil, for example, can access the same foundation models that a European pharmaceutical multinational is using. That is the concrete promise of technological democratization, and it is already happening.

What lies ahead at the intersection of AI and protein science

The field that brings together artificial intelligence and protein design is still in its early chapters, and the feeling among those who follow it closely is that the pace of evolution will continue to surprise. OpenProtein.AI itself is already looking toward the next frontier. Bepler wants to tackle the question of how to describe proteins more completely: what is the meaningful, domain-specific language for the protein constraints used during generation? How do you incorporate more evolutionary constraints? How do you describe an enzymatic reaction that a protein performs in a way that a model can generate sequences to carry out that reaction?

Tim Lu, who currently serves in an advisory role at the company, points to an area that particularly excites him: moving beyond simple protein binding events and using these models to predict and design dynamic characteristics, where a protein needs to engage two, three, or four biological mechanisms at the same time, or change its function after binding to a target. This kind of functional complexity is the next big challenge, and solving it could pave the way for truly programmable therapies.

Another point worth paying attention to is the trend of integrating different types of biological data. Amino acid sequences, three-dimensional structures, gene expression data, functional assay results — all of this is gradually being incorporated into artificial intelligence approaches that can reason in a multimodal way, combining different sources of information to generate more robust predictions. OpenProtein.AI is already operating in this direction, and the expectation is that platforms like this will evolve to incorporate even more layers of biological context over time.

The importance of open access for scientific progress

Lu makes an observation that deserves to be highlighted: as the work becomes more complex, with approaches incorporating things like protein logic and dynamic therapies, the existing experimental tools become limiting. He emphasizes that it is truly important to create open ecosystems around AI and biology. There is a real risk that artificial intelligence resources become so concentrated that the average researcher cannot use them. Open access is critical for the scientific field to keep moving forward.

That warning is not trivial. As artificial intelligence models become more expensive to train and maintain, the natural tendency of the market is to concentrate those capabilities in a few hands. If that happens at the intersection of AI and biology, the result would be a scenario where only large corporations could use the most advanced tools, leaving academic research and smaller companies behind. The decision by OpenProtein.AI to offer free access for academia is a direct counterpoint to that trend, and it signals a philosophy that science advances better when tools are shared.

At its core, what this story represents is something bigger than a single platform or a specific technology. It points to a future where the distance between having a scientific hypothesis and being able to test it computationally is much smaller than it is today. Where a researcher anywhere in the world, with relevant data and good questions, can use cutting-edge artificial intelligence without needing an army of engineers by their side. Accessible computational biology is not just a nice idea. It is, increasingly, a real necessity for science to advance at the speed that the challenges of our time demand. 🧬🤖

Rafael

Operations
