We ran a hackathon at the end of the Turing Seminar at ENS Paris-Saclay and ENS Ulm, an academic course inspired by the AGISF, with 28 projects submitted by 44 participants on November 11–12.
We share a selection of projects. See them all here.
I think some of them could even be turned into valuable blog posts, and I learned a lot reading them all. Here are a few extracts.
David Heurtel-Depeiges [link]
Basically an adaptation of the famous dictionary learning paper (sparse autoencoders for monosemantic features) to vision CNNs.
“When looking at 100 random features, 46 were found to be interpretable and monosemantic, [...] When doing the same experiment with 100 random neurons, 2 were found to be interpretable and monosemantic”. Very good execution.
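To make the method concrete, here is a minimal sketch of this kind of dictionary learning: a sparse autoencoder trained to reconstruct recorded CNN activations through an overcomplete dictionary of features. Everything below (sizes, penalty, training loop) is an illustrative assumption, not the project’s actual code.

```python
# Illustrative sketch only: a sparse autoencoder for dictionary learning
# on CNN activations. All sizes and hyperparameters are assumptions.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_act: int, d_dict: int):
        super().__init__()
        self.encoder = nn.Linear(d_act, d_dict)  # activation -> feature coefficients
        self.decoder = nn.Linear(d_dict, d_act)  # reconstruct activation from features

    def forward(self, x):
        feats = torch.relu(self.encoder(x))      # non-negative, pushed to be sparse
        return self.decoder(feats), feats

d_act, d_dict = 512, 4096                        # overcomplete dictionary (8x here)
sae = SparseAutoencoder(d_act, d_dict)
opt = torch.optim.Adam(sae.parameters(), lr=1e-3)
l1_coeff = 1e-3                                  # sparsity penalty strength (assumed)

acts = torch.randn(4096, d_act)                  # stand-in for recorded CNN activations
for _ in range(100):
    recon, feats = sae(acts)
    loss = ((recon - acts) ** 2).mean() + l1_coeff * feats.abs().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Interpretability is then assessed per dictionary feature (e.g. by inspecting the inputs that maximally activate it) rather than per neuron, which is what the 46/100 vs 2/100 comparison above measures.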
Théo Saulus [link]
Imho, a near-SOTA summary of the current discourse on open-sourcing models.
Tu Duyen Nguyen, Adrien Ramanana Rahary [link]
“What is the goal of this document? Our goal is to clarify, and sometimes criticize, the arguments regularly put forward in this debate, both from the proponents and the opponents of open-sourced AI models. In an effort to better understand both stances, we have classified the most common arguments surrounding this debate into five main topics. [...]
In an effort to highlight the strengths and weaknesses of each stance, we will present and challenge for each family the arguments of both sides, as well as include in each family some arguments which we think are relevant, but have not been mentioned in most discussions on these topics.”
They tried to be exhaustive, but there are still some gaps. The format is nevertheless interesting (though it could be more concise).
Gurvan Richardeau, Raphaël Pesah [link]
“The first analysis we’ve done involved examining major themes by employing text data analysis and ML methods (such as tf-idf analysis and topic modeling) within a corpus of 1,644 publications from the Factiva database over the past five years, specifically related to AI safety. The second analysis (Analysis 2) uses another database: Europresse.”
Here are some snippets:
“First let’s have a look at the global distribution of the articles over the years.”
“Here are the results for the sentiment analysis:”
“Around 71% of the articles talking of AGI talk about it as a good thing whereas 23% are talking about it as a thing to worry about.”
“We got 57,000 articles about AGI in 2023 against 18,000 for the ones related to AI safety, so this time it is three times less. [...] There are between two and three times more articles talking about AGI without AI safety than articles about AI safety in 2023.”
My comment: Interesting. There are many more figures in the reports. Maybe those kinds of metrics could be used to measure the impact of public outreach?
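As a concrete illustration of the pipeline they describe, here is a minimal tf-idf + topic-modeling sketch in scikit-learn (NMF on tf-idf vectors). The tiny corpus and all parameters are placeholders, not the authors’ actual code or data.

```python
# Placeholder sketch of a tf-idf + topic-modeling pipeline; the corpus
# stands in for the 1,644 Factiva articles used in the report.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import NMF

articles = [
    "New benchmark shows progress toward artificial general intelligence.",
    "Researchers call for stronger AI safety standards and oversight.",
    "Startup raises funding to build safe and aligned AI systems.",
]

vectorizer = TfidfVectorizer(stop_words="english", max_features=5000)
tfidf = vectorizer.fit_transform(articles)

n_topics = 2                                   # illustrative; tune on the real corpus
nmf = NMF(n_components=n_topics, random_state=0)
doc_topics = nmf.fit_transform(tfidf)          # document-topic weights

terms = vectorizer.get_feature_names_out()
for k, comp in enumerate(nmf.components_):
    top = comp.argsort()[-5:][::-1]            # five highest-weight terms per topic
    print(f"topic {k}:", ", ".join(terms[i] for i in top))
```

The sentiment figures quoted above would presumably come from running an off-the-shelf sentiment classifier over the same corpus.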
Gaspard Berthelier [link]
A good summary of the paper “Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback”.
Thomas Michel, Théo Rudkiewicz [link]
An alternative title could be “Situational Awareness from AI to Zombies, with premium pedagogical memes”.
Vincent Bardusco [link]
A good summary of several papers (listed in the linked document).
Antoine Poirier, Théo Communal [link]
A document summarizing the main aspects of DeepMind's plan. Until now, their agenda was spread across a series of blog posts and papers, but here it is all summarized in ten pages.
Gabriel Ben Zenou, Joachim Collin [link]
An alternative title could be “Against Against Almost Every Theory of Impact of Interpretability”.
Some students tried to distill the discussion and criticize my position, and you can find my criticism of their criticism in the comments of the Google Doc. Here is my main comment.
Inès Larroche, Bastien Le Chenadec [link]
It’s a very good summary of what is happening in China regarding AI.
Victor Morand [link]
It’s a good summary of LeCun’s plan and of the numerous criticisms it has received.
Mathis Embit [link]
A summary of the main ways to regulate AI. Comparing the different policies side by side highlights what makes each distinctive and adds depth.