Quick List of High-Impact AI Security Projects
Written by Lukas Berglund and Caleb Parikh.
About this doc
Securing powerful AI systems against theft or exfiltration could significantly decrease the risk of AI-enabled catastrophe. However, compared to other interventions, like technical AI safety research, AI security has received little attention. To inspire more work in this area, this document outlines several promising AI security projects that people can contribute to.
Our focus is on securing AI systems from sophisticated threat actors. We do not address other areas at the intersection of AI and cybersecurity, such as AI cyber evaluations and improving AI robustness.
We are especially interested in:
Securing AI IP (i.e. model weights and algorithmic insights) from state-level actors.
Preventing loss of control to misaligned and powerful AI systems (AI control).
This list reflects projects the authors find personally compelling rather than a comprehensive evaluation of the highest-impact initiatives in the field. It is not exhaustive; if a project is absent from this list, that does not imply we consider it unimportant.
If you are interested in working on any of these projects, please fill in this short form. We would love to connect you with potential collaborators, funders, and advisors.
Strategy/Policy
Report on Securing AI Algorithmic Insights
Scope: Well-defined
Motivation
AI labs generate and possess numerous algorithmic insights that can significantly reduce resource requirements for training highly capable AI systems.
These insights are poorly protected and vulnerable to theft by foreign actors.
It would be useful to explain why algorithmic secrets are strategically important and to provide recommendations for securing them.
These recommendations could draw from existing industry standards for safeguarding proprietary/classified technology.
The Meselson Center at RAND (led by Sella Nevo) is currently working on this report and is looking for experts to contribute to it or lead it.
Aims
Develop guidelines for securing algorithmic secrets by building on existing guidelines and standards and by interviewing key stakeholders and experts.
The report should be similar in structure and scope to the Securing AI Model Weights report.
Useful Skills and Experience
We are interested in people with cybersecurity experience who can lead or contribute to the project. We are particularly interested in people with a background in insider threats, corporate espionage, and espionage more generally.
Report on Promising Interventions to Secure AI Model Weights
Scope: Moderately-defined
Motivation
We think that RAND’s Securing AI Model Weights report did an excellent job of identifying relevant threat models and pointing to security measures that might be required as we approach transformative AI systems.
That said, it is not clear what kinds of products, organizations, and infrastructure need to be developed in advance, or what work needs to be done inside AI labs, compute providers, and so on.
E.g., red-teaming tools, security standards, hardware security modules, hardware for upload limits, and on-chip mechanisms for training and inference compute (a minimal sketch of an upload-limit check appears after this list).
Taking the Securing AI Model Weights report and turning it into a set of recommendations for high-impact work could provide a much-needed guiding light for the AI security field.
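As a toy illustration of what "hardware for upload limits" is pointing at, the sketch below tracks outbound traffic against a per-host daily budget set well below the size of the model weights. This is only a software-level analogy with made-up names and thresholds; actual proposals envision enforcement at the network or hardware layer rather than in application code.

```python
# Illustrative sketch only: a software-level egress budget monitor.
# Host names, limits, and transfer sizes are hypothetical.
from dataclasses import dataclass, field
from datetime import date
from collections import defaultdict

@dataclass
class EgressBudget:
    """Flags hosts whose daily outbound traffic could plausibly exfiltrate weights."""
    daily_limit_bytes: int  # e.g. a small fraction of total weight size
    usage: dict = field(default_factory=lambda: defaultdict(int))

    def record_transfer(self, host: str, day: date, n_bytes: int) -> bool:
        """Record an outbound transfer; return True if the host's daily budget is exceeded."""
        self.usage[(host, day)] += n_bytes
        return self.usage[(host, day)] > self.daily_limit_bytes

# Example: hosts capped at roughly 10 GiB of egress per day
budget = EgressBudget(daily_limit_bytes=10 * 2**30)
if budget.record_transfer("inference-node-17", date.today(), 12 * 2**30):
    print("ALERT: egress budget exceeded; investigate possible weight exfiltration")
```

Even this crude version surfaces the key design question: how much legitimate egress does a given workload actually need, and how far below the total weight size can the cap be pushed?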
Aims
Distill concrete projects from the Securing AI Model Weights report, focusing on projects that are better suited to people outside of AI labs.
Disseminate the report to people looking for projects to work on, grantmakers, and other relevant parties.
Useful Skills and Experience
We expect this project to mostly involve distilling relevant parts of RAND’s Securing AI Model Weights report, talking to experts to better understand which measures require novel work, and figuring out which measures will likely be handled by existing organizations.
Generalist researchers, ideally with some cybersecurity experience (or a willingness to quickly get up to speed on the relevant parts of cybersecurity), seem well suited to this project.
AI Lab Cybersecurity Standards
Scope: Moderately-defined
Motivation
Future laws might require frontier AI labs to adhere to cybersecurity standards to help protect model weights and algorithmic secrets.
There are some existing industry standards, but they are often too vague or missing key measures needed for state-proof security.
Aims
The Meselson Center at RAND is currently drafting standards and is looking for experts to contribute to, or potentially lead, that line of work.
The goal is to create actionable standards that could be used as part of a legal framework for securing model weights.
Useful Skills and Experience
People with cybersecurity expertise, particularly related to cybersecurity standards.
Workshops for Key Stakeholders on Extreme Security Preparedness
Scope: Moderately-defined
Motivation
AI capabilities are rapidly evolving, creating new security challenges for AI labs.
Our impression is that there is little existing planning from AI labs on improving security as we approach transformative AI and face state-sponsored attempts to steal AI IP.
Convening technical staff from AI labs, security researchers, and relevant members of government could help AI labs prepare for automated AI R&D and support the development of adequate security standards.
Aims
Accelerate the development of concrete security standards for SL5 (Security Level 5) and automated AI R&D scenarios.
Create a forum for technical experts to provide input on extreme security measures and get buy-in from relevant stakeholders including senior security staff at AI labs.
Useful Skills and Experience
The main priority is finding an individual (or organization) to organize the workshop series, which would involve:
Coordinating between technical experts and CISOs from multiple organizations.
Ensuring productive discussions while navigating potential antitrust concerns.
Synthesizing technical input into actionable security standards and plans.
The ideal organizer would have experience in:
Project management on projects with high-profile stakeholders.
Facilitating high-level discussions between technical and non-technical stakeholders.
Understanding of AI development processes and associated security concerns.
We have assembled a partial steering group of AI security experts who can contribute to the content of the workshops. If you are interested in taking this project on as an individual, management support would be available for the position.
Policy Levers for Incentivizing Advanced Model-Weight Security
Scope: Broadly-defined
Motivation
AI companies have natural incentives to protect their systems, but may not fully account for broader societal risks or invest sufficiently in extreme security measures.
Defending against well-resourced attacks (e.g., from state actors) requires considerable effort and resources, which can conflict with other company objectives like rapid R&D progress.
Policy interventions could help align company incentives with broader societal interests in robustly secure AI systems.
Multiple potential policy levers exist, but their effectiveness and feasibility for improving AI security are not well understood.
Aims
Research and analyze policy levers for incentivizing advanced AI model security, with a focus on:
Potential impact on security practices.
Feasibility of implementation.
Possible unintended consequences.
Develop specific policy proposals that could be implemented by relevant agencies and legislative bodies, including but not limited to:
Software export controls for advanced AI models and related technologies.
Procurement rules requiring enhanced security measures for government use of advanced AI systems.
Preferential permitting laws for highly secure AI research and development facilities.
Security standards for AI systems above certain capability thresholds.
Useful Skills and Experience
Background in technology policy.
Familiarity with relevant regulatory bodies (e.g., NIST, CISA, FTC in the US context).
Understanding of legislative processes and how to craft implementable policy proposals.
Experience in cybersecurity, particularly in protecting high-value digital assets.
National security background.
Knowledge of AI development processes and associated security concerns.
Technical
Developing Roadmaps for Extreme Security
Scope: Moderately-defined
Motivation
Implementing state-proof security measures could take around 5 years, according to RAND researchers and other experts.
Early design decisions in AI infrastructure need to be compatible with future extreme security requirements and adaptable to changes in AI technology.
Many critical security measures for extreme scenarios (SL5) differ significantly from current practices (SL2/3).
Aims
Develop a comprehensive plan for extreme security measures that can be rapidly implemented when needed.
Identify and prioritize security features that require long lead times for development or implementation (see the sketch after this list for one simple way to surface these).
Create proposals for modifying current AI infrastructure designs to allow for future extreme security measures.
Produce sharable "public goods" in the form of security methods, software, and R&D that can benefit the entire AI industry. Alternatively, the outputs could be shared in a limited capacity with key stakeholders in government and industry.
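To make the lead-time framing concrete, here is a minimal illustrative sketch of one way a roadmap could surface which measures need to start first: work backwards from a target readiness date using per-measure lead-time estimates. The measure names, lead times, and target date are hypothetical placeholders, not actual estimates.

```python
# Illustrative sketch only: all measure names, lead times, and dates are made up.
from datetime import date, timedelta

TARGET_READY = date(2030, 1, 1)  # hypothetical date by which extreme security must be in place

# measure -> estimated lead time in years (research + build + deploy)
lead_times_years = {
    "on-chip attestation for training hardware": 5.0,
    "SL5-grade data center construction": 4.5,
    "formally verified weight-access interfaces": 3.0,
    "red-team program for extreme threat models": 1.5,
}

def latest_start(lead_years: float) -> date:
    """The latest date work on a measure can begin and still hit TARGET_READY."""
    return TARGET_READY - timedelta(days=int(lead_years * 365.25))

# Measures that must begin earliest are printed first
for measure, years in sorted(lead_times_years.items(), key=lambda kv: latest_start(kv[1])):
    print(f"start by {latest_start(years)}: {measure}")
```

A real roadmap would of course model dependencies between measures and uncertainty in the estimates, but even a crude start-by ordering highlights which items cannot wait.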
Useful Skills and Experience
This project is somewhat novel, and we aren’t sure what knowledge and experience the ideal team would have. We’re particularly excited about people who have some of the following (though no individual needs all of them):
Background in cybersecurity, with a focus on extreme threat models.
Understanding of current AI infrastructure, security practices, and threat models.
Knowledge of hardware security features, particularly those relevant to AI systems.
Familiarity with data center design and construction, especially for high-security applications.
Experience in red teaming and stress-testing security measures.
Strong project management skills.
Background in policy and regulation related to AI and cybersecurity.
The team should include or have access to experts in AI safety, hardware engineering, software security, and physical security.
Developing Forecasts of Lead Times for SL5 Measures
Scope: Moderately-defined
Motivation
Security Level 5 (SL5) measures are those designed to protect against sophisticated state-level actors.
Implementing SL5 measures is complex and time-consuming, potentially taking years to fully realize.
AI labs and governments may underestimate the time required to implement robust security measures, leading to a false sense of preparedness.
Accurate and trusted forecasts of lead times for SL5 measures could create a compelling case for immediate action and help stakeholders prioritize their security efforts. They could also help make the case for measures that have long serial lead times but are not particularly useful right now.
The project could demonstrate the potential risks of delaying security implementations by illustrating the gap between projected AI capability advancements and security measure readiness.
Aims
Create forecasts for the lead times required to research, develop, and implement critical SL5 measures (an illustrative forecasting sketch follows this list).
Identify potential bottlenecks and critical paths in the implementation of SL5 measures to prioritize the areas requiring the most urgent attention.
Provide AI labs, policymakers, and other stakeholders with actionable timelines and priorities for enhancing AI security.
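As one hypothetical illustration of the kind of forecasting this project could produce, the sketch below Monte Carlo samples lognormal lead times for a few invented SL5 measures and reports when the slowest parallel track finishes. All names and parameters are made up; a real forecast would elicit distributions from experts and model dependencies between measures.

```python
# Illustrative sketch only: hypothetical measures and parameters, not real estimates.
import numpy as np

rng = np.random.default_rng(0)

# (measure, median_years, spread): invented values for illustration
measures = [
    ("hardware-backed weight encryption", 3.0, 0.4),
    ("high-security data center retrofit", 4.0, 0.3),
    ("insider-threat program overhaul", 2.0, 0.5),
]

n_samples = 100_000
samples = np.column_stack([
    rng.lognormal(mean=np.log(median), sigma=spread, size=n_samples)
    for _, median, spread in measures
])

# Assume the measures proceed in parallel, so overall readiness is set by the slowest track
total_years = samples.max(axis=1)

print("P50 years to readiness:", round(float(np.percentile(total_years, 50)), 1))
print("P90 years to readiness:", round(float(np.percentile(total_years, 90)), 1))
```

Reporting a P90 alongside the P50 makes explicit how much of the case for starting early rests on tail scenarios.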
Useful Skills and Experience
Background in cybersecurity, with a focus on advanced threat models and state-level actors.
Familiarity with AI development processes, infrastructure, and current security practices.
Knowledge of forecasting methodologies, particularly for technological development timelines.
Understanding of hardware security features and their development cycles.
Ability to collaborate with and gather input from diverse stakeholders, including AI researchers, security experts, and policymakers.
Securing Data Centers Used for Frontier AI Model Training
Scope: Broadly-defined
Motivation
Frontier AI models may require specialized security measures at the data center level to protect against state-level actors.
Current data center security practices may be insufficient for the unique challenges posed by training and running frontier AI models.
Developing technical recommendations and prototyping novel technologies could significantly enhance the security of data centers hosting these advanced AI systems.
The development and implementation of advanced security measures for data centers likely have long lead times, necessitating immediate action to be prepared for future needs.
Aims
Develop technical recommendations tailored to the specific security needs of data centers housing frontier AI systems.
Recommendations could cover either the construction of new data centers or the retrofitting of existing ones; in the latter case, it may be useful to tailor recommendations to specific existing data centers.
Design and prototype novel technologies that address unique security challenges associated with frontier AI models in data center environments.
Useful Skills and Experience
Knowledge of data center architecture and security practices.
Understanding of the unique security requirements for frontier AI models.
Experience in cybersecurity, particularly in protecting high-value digital assets.
Computer hardware expertise, particularly in areas relevant to AI systems and data center infrastructure.
Field-Building
Running AI Security Conferences
Scope: Well-defined
One concrete proposal: running two conferences in the next 12 months with 200–2,000 people, bringing together security professionals, key stakeholders from AI labs, policymakers, and researchers to address critical issues in AI security, with a focus on preparing for transformative AI.
Motivation
Relatively few security professionals are focused on securing AI models, and fewer still are focused on preparing to secure AI models against capable attacks by nation-state actors. AI is increasingly seen as an important area, but by default it seems unlikely that the field will prioritize the work that most effectively mitigates catastrophic risks.
AI Security conferences could both accelerate the AI security field, and potentially determine its direction over the next few years.
Aims
Present GCR-relevant content (aim for 75%+ of the program)
Facilitate discussions on critical AI security issues
Build relationships and strengthen networks among attendees
Build alignment on the importance of AI security measures, particularly for extreme security
Increase the potential for future collaborations and events
Resource Requirements
Estimated 2,000 organizer hours for a large conference (e.g., 2,000 attendees)
Breakdown of time:
Management/Strategy/Evaluation: ~500 hours
Content Curation: ~250 hours
Event Production: ~250 hours
Admissions Management: ~200 hours
Communications: ~150 hours
Volunteer Coordination: ~100 hours
Useful Skills and Experience
Ideal candidates would:
Be knowledgeable about AI security: in our experience, these events go better when there is significant input from reasonably knowledgeable and experienced people. Some of this can come from assembling a steering group and having “executor”-type people run the actual events.
Have some experience organizing events, or experience managing projects with multiple moving parts.
Be excited about field building and creating something new.
Possible next steps
Talk to caleb@airiskfund.com for more information on past events in this space and their current plans (aisecurity.forum).
Form an organizing committee, apply for funding, secure a venue, and develop a marketing strategy.
Running a Cybersecurity Boot Camp
Scope: Well-defined
Motivation
There are a variety of projects in AI security that would benefit from increased technical capacity.
MLAB, an intensive machine learning boot camp, provides a precedent for this kind of program (see below).
There are very few security professionals who are focused on global catastrophic risks or have thought extensively about securing powerful AI models.
We suspect that after going through an intense boot camp, some people will be able to start on high-leverage AI security projects straight away.
Aims
Upskill 20-40 people quickly in areas of cybersecurity relevant to securing AI IP, selecting strongly for people we expect to do GCR-relevant work soon.
Transition 10+ people to work on pressing AI security projects shortly after the program ends.
Useful Skills and Experience
We have some cybersecurity experts working on developing the curriculum. The current organizing team includes members who previously managed MLAB, a machine learning boot camp that produced several AI safety researchers now at top organizations. The main priority is finding someone to organize the boot camp (advertise it, house participants, make sure day-to-day activities run smoothly, etc.).
Experience managing a complex project and commitment to completing necessary tasks, even tedious ones, to ensure the boot camp is successful.
If you are interested in working on any of these projects, please fill in this short form. We would love to connect you with potential collaborators, funders, and advisors.