Mishcon de Reya

Profluent's OpenCRISPR-1: open sequence licensing in the life sciences industry

Posted on 8 May 2024

On 22 April 2024, Profluent Bio Inc., an AI-first protein design company based in Berkeley, California, published a blog post entitled "Editing the Human Genome with AI", announcing the debut of its OpenCRISPR™ initiative, which will make publicly available gene editing tools comprising novel CRISPR-Cas9-like proteins not found in nature. The blog post was accompanied by a press release and a New York Times article.

CRISPR-Cas9 is a technology used in the life sciences industry to precisely edit DNA sequences in order to alter the sequence of amino acids produced from the transcription and translation of the DNA. The technology was first identified as part of the natural defences of bacteria: DNA sequences (known as CRISPR sequences), derived from previous viral infections, become incorporated within the bacterial genome. A corresponding enzyme, Cas9, uses the CRISPR sequences as a guide to identify, and ultimately cleave, DNA corresponding to new infections by the earlier virus. CRISPR-Cas9 has been adapted for use in the life sciences industry as a convenient means of identifying, cleaving, splicing, and otherwise editing DNA, with broad R&D and therapeutic applications.

CRISPR-Cas9 technologies are the subject of multiple international patent families, so lawful use by commercial entities generally requires costly licence payments. Of particular interest to the life sciences industry, therefore, is the fact that OpenCRISPR™ is to be licensed free of charge on an "open source" basis.

Background

Profluent (which has attracted investments totalling USD35,000,000) is a spin-out of a Salesforce-funded project to develop ProGen, the protein equivalent of a large language model ("LLM"), which treats amino acid sequences as a language. ProGen was trained to predict the next amino acid, given a previous sequence, using a large protein sequence database. Researchers found that ProGen was capable of generating functional artificial protein sequences across multiple protein families (including enzymes, antibodies, gene editors, and peptides) "akin to generating grammatically and semantically correct natural language sentences" of the kind output by LLMs.

The OpenCRISPR™ initiative relates to novel gene editors similar to the established CRISPR-Cas9 system. Profluent has announced that it is open sourcing the novel gene editors, but not the deep learning technology that generated them.

OpenCRISPR licensing

Profluent has stated that it will be making its OpenCRISPR™ gene editor protein sequences available, free of charge.

The established commercial licensors of CRISPR-Cas9 technologies are The Broad Institute, which licenses CRISPR patents on behalf of MIT, Harvard, and several other institutions, and ERS Genomics, which licenses CRISPR patents on behalf of Prof. Emmanuelle Charpentier, the University of California, and the University of Vienna. Provided that none of the Profluent sequences (or their methods of use) falls within the scope of any CRISPR-Cas9 patents licensed out by these licensors, it is likely that a commercial ecosystem will arise in which suppliers make and sell CRISPR-Cas9 reagents to Profluent's open source recipes. Because none of the companies in the supply chain would be paying licence fees, these reagents could be priced at a substantial discount to the corresponding products made under licence from the established licensors.

In fact, if, as Profluent claims, its AI-designed gene editor proteins are superior to existing CRISPR-Cas9 technologies, there may be a rapid migration to Profluent's technology, driven by an irresistible combination of efficacy and price advantages.

Rather than adopting one of the many established open source or Creative Commons licences, however, Profluent is sharing, with prospective licensees, a "term sheet" setting out the key terms of a standard licence. Profluent has indicated that it will enter into formal licences with prospective licensees based on the term sheet, which may introduce some friction into the adoption of OpenCRISPR™.

The key difference between the non-exclusive licence described in Profluent's term sheet and the established open source and Creative Commons licences is a condition expressly prohibiting the use of Profluent's OpenCRISPR™ technologies for any "Restricted Purpose". This is defined to include germline editing, the production of non-reproducing plants, and "any other purpose generally considered in the scientific or medical community to be unethical", a somewhat subjective test.

If a restriction of this kind were incorporated in a licence for open source software, the conditionality would mean that the licence would fall outside the multi-part definition promulgated by the Open Source Initiative ("OSI"), the main arbiter of open source software licences, specifically Articles 5 ("No Discrimination Against Persons or Groups") and 6 ("No Discrimination Against Fields of Endeavor").

Developments in open source licensing 

In recent years, the widespread adoption of open source software components by commercial businesses has led to an evolution in the approach to licensing taken by the proprietors of open source versions of essential components such as database management software. In particular, vendors have begun to offer open source software under a dual licensing model, with different licence terms applying to different classes of licensee; for example, not-for-profit licensees may be treated differently from for-profit licensees.

One approach to dual licensing is to supplement, or "patch", an established open source licence to cater for the two classes of licensee. This may be achieved by stating that additional clauses have been appended to an established open source licence (e.g. the "Commons Clause") or by editing the provisions of an established open source licence to introduce new provisions (e.g. the Server Side Public Licenses published by the vendors of MongoDB and Graylog, which are, respectively, edited versions of the GNU Affero General Public License v3 and the GNU General Public License v3). More recently, vendors have elected to use customised versions of new modular licence templates (e.g. the Business Source Licence promulgated by MariaDB plc). These licences fall outside the scope of the OSI definition and the relevant software is often referred to as "source available" rather than "open source".

Profluent may be the first company in the life sciences industry to seek to apply the open source philosophy to materials used in gene editing. However, Profluent is not the first company to consider the challenge of open sourcing biological systems and reagents. Notable examples include BioBricks, and the BiOS (Biological Open Source) initiative of the Australian non-profit organisation, Cambia.

The BioBricks Foundation (established in 2006) provides resources enabling public access to biomolecules free of charge on a permissive / sharealike basis. The BioBricks Foundation has produced two template documents to facilitate the free exchange of biological materials:

  • the Open Material Transfer Agreement, which envisages the exchange of biological materials (DNA, RNA, proteins, cell lines, etc) without restrictions on use, subject to a comprehensive disclaimer of the providing party's liability – similar to a permissive open source software licence; and
  • the BioBrick Public Agreement, which was developed for sharing BioBricks parts (modular genetically encoded functional units) but which can also be used to share any genetically encoded function freely.

The BiOS initiative promotes a legally enforceable framework to enable the sharing of patented and non-patented technology among a community whose members sign up to the BiOS "concordance": a set of responsible sharing principles, including an agreement not to assert IP rights against each other.

The BiOS organisation is the proprietor of patents covering certain gene transfer technologies which it licenses on terms requiring licensees to make improvements available to BiOS and other licensees, a concept referred to as "patentleft", a nod to the principle of "copyleft" promulgated by the Free Software Foundation and embodied in its family of GNU General Public Licenses.

As with BioBricks, the BiOS initiative has produced template licence, material transfer, and non-assertion agreements for specific Cambia-owned patented and unpatented technologies. These agreements may be adapted to cover other technologies.

Open sequence licensing

It appears that there is an unmet need for an ultra-low-friction procedure for incorporating open source markers in technologies used in the life sciences industry. In the open source software industry, a publisher can simply select one or more licences from the extensive repertoire of established licences, apply it to their software, and start competing. Although there are some parallels between software source code and DNA/protein sequences, the analogy soon becomes strained.

Open source licensing in the software field is principally concerned with the copyrights that subsist in original code. Patent rights are referred to in some of the more sophisticated open source software licences, but generally in the context of requiring users to forgo the right to enforce their patents as a condition of benefiting from the open source licence. By contrast, novel biological sequences are typically protected by patents, and copyrights in sequences, if they subsist at all (the general consensus is that they do not), can be safely ignored. It is not surprising, therefore, that the principal subject matter of the licence described in Profluent's term sheet is a patent application (US Provisional 63/627457) and all patents claiming priority from it.

Perhaps the time has come for stakeholders to develop an "open sequence" definition, and a series of "open sequence" and/or "sequence-available" licences governing the use of novel DNA and/or amino acid sequences by academic and commercial players in the life sciences industry. Such a definition, and the description of the permitted uses in such licences, could be drafted in language appropriate to biotechnology, rather than the software-specific language common to the established open source licences, and could properly reflect the applicable regulatory environment, including with respect to ethical use.
