We believe superintelligence could arrive within the next 10 years. These AI systems would have vast capabilities—they could be hugely beneficial, but also potentially pose large risks.
Today, we align AI systems to ensure they are safe using reinforcement learning from human feedback (RLHF). However, aligning future superhuman AI systems will pose fundamentally new and qualitatively different technical challenges.
Superhuman AI systems will be capable of complex and creative behaviors that humans cannot fully understand. For example, if a superhuman model generates a million lines of extremely complicated code, humans will not be able to reliably evaluate whether the code is safe or dangerous to execute. Existing alignment techniques like RLHF that rely on human supervision may no longer be sufficient. This leads to the fundamental challenge: how can humans steer and trust AI systems much smarter than them?
This is one of the most important unsolved technical problems in the world. But we think it is solvable with a concerted effort. There are many promising approaches and exciting directions, with lots of low-hanging fruit. We think there is an enormous opportunity for the ML research community and individual researchers to make major progress on this problem today.
As part of our Superalignment project, we want to rally the best researchers and engineers in the world to meet this challenge—and we’re especially excited to bring new people into the field.
Superalignment Fast Grants
We’re launching $10M in grants to support technical research towards the alignment and safety of superhuman AI systems, including weak-to-strong generalization, interpretability, scalable oversight, and more.
In partnership with Eric Schmidt, we are launching a $10M grants program to support technical research towards ensuring superhuman AI systems are aligned and safe:
We are offering grants of $100K–$2M, with a term of one to two years, to academic labs, nonprofits, and individual researchers.
No prior experience working on alignment is required; we are actively looking to support researchers who are excited to work on alignment for the first time.
We will need new breakthroughs to steer and control AI systems much smarter than us. This is one of the most important unsolved technical problems of our time. But we think it is a solvable machine learning problem. New researchers can make enormous contributions!
You are welcome to apply for a grant larger than this range; however, note that we don’t expect to be able to support such grants.
Our application process is simple, and we’ll get back to you within four weeks of applications closing.
You can learn more about this opportunity by visiting the funder's website.
Eligibility:
We expect to be able to make grants to individuals and institutions in most countries, barring legal restrictions.
We expect grants to be primarily used for funding compute costs, salaries, students, and human data; please specify the intended use of the funds in the budget section of the application form.
Preferences:
With these grants, we are particularly interested in funding the following research directions:
Weak-to-strong generalization: Humans will be weak supervisors relative to superhuman models. Can we understand and control how strong models generalize from weak supervision? (A toy sketch of this setup appears after this list.)
Interpretability: How can we understand model internals? And can we use this to, e.g., build an AI lie detector? (A minimal probing sketch also appears after the list.)
Scalable oversight: How can we use AI systems to assist humans in evaluating the outputs of other AI systems on complex tasks?
Many other research directions, including but not limited to honesty, chain-of-thought faithfulness, adversarial robustness, and evals and testbeds.
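To make the weak-to-strong direction concrete, here is a minimal, purely illustrative sketch using off-the-shelf scikit-learn models (the dataset, model choices, and the gap-recovered metric are our assumptions, not a prescribed methodology): a deliberately small "weak supervisor" is fit on ground truth, a higher-capacity "strong" model is trained only on the weak model's labels, and we measure how much of the gap to the strong model's ceiling is recovered.

```python
# Illustrative weak-to-strong sketch (assumed setup, not an official recipe).
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=4000, n_features=40, n_informative=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Weak supervisor: deliberately low-capacity, fit on a small slice of ground truth.
weak = LogisticRegression(max_iter=200).fit(X_train[:500], y_train[:500])
weak_labels = weak.predict(X_train)  # imperfect labels standing in for human supervision

# Strong student trained only on the weak supervisor's labels.
strong_on_weak = GradientBoostingClassifier(random_state=0).fit(X_train, weak_labels)

# Strong ceiling: the same model class trained directly on ground truth.
strong_ceiling = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

weak_acc = weak.score(X_test, y_test)
w2s_acc = strong_on_weak.score(X_test, y_test)
ceiling_acc = strong_ceiling.score(X_test, y_test)

# Fraction of the weak-to-ceiling performance gap recovered by the strong student.
gap_recovered = (w2s_acc - weak_acc) / (ceiling_acc - weak_acc)
print(f"weak={weak_acc:.3f}  weak-to-strong={w2s_acc:.3f}  "
      f"ceiling={ceiling_acc:.3f}  gap recovered={gap_recovered:.2f}")
```

The interesting question in this direction is whether, and under what conditions, the strong student generalizes beyond the errors of its weak supervisor rather than simply imitating them.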
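In the same spirit, here is a hypothetical sketch of one interpretability idea behind an "AI lie detector": train a simple linear probe on a model's internal activations to predict whether a statement is true or false. Real work would extract activations from an actual network; here, synthetic vectors with a planted "truthfulness direction" stand in for them, purely as an illustrative assumption.

```python
# Illustrative linear-probe sketch on synthetic "activations" (assumed data).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n, d = 2000, 128                      # number of statements, activation width
labels = rng.integers(0, 2, size=n)   # 1 = true statement, 0 = false statement

# Planted linear "truthfulness" feature (an assumption made for illustration).
truth_direction = rng.normal(size=d)
activations = rng.normal(size=(n, d)) + np.outer(labels - 0.5, truth_direction)

A_train, A_test, y_train, y_test = train_test_split(activations, labels, random_state=0)

# Linear probe: does a simple classifier on internal activations detect truthfulness?
probe = LogisticRegression(max_iter=1000).fit(A_train, y_train)
print(f"probe accuracy on held-out activations: {probe.score(A_test, y_test):.3f}")
```

Whether such probes remain reliable on real model internals, and whether they can be made robust to models that learn to evade them, is exactly the kind of open question this research direction covers.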
https://openai.com/index/superalignment-fast-grants/