Soran Ghaderi

I received my MSc in Artificial Intelligence from the University of Essex, supervised by Professor Luca Citi. Previously, I obtained my bachelor's degree from the University of Kurdistan. During my postgraduate studies, I worked on reasoning in LLMs for code generation and developed a new approach for guiding the generation process. It consists of a separate deep-think stage with self-reflection and direct integration of the resulting thoughts (hidden states) into the hidden states of the LLM's main generation pass using model surgery. This framework, called NIR, is applicable to various LLMs without fine-tuning. As an extension, it has a thought memory manager that controls the memories and thoughts of the LLM (not included in the evaluations), which allows approaches such as Neural Turing Machines or DNC to be integrated with LLMs. I have also contributed to open-source projects and collaborated with research teams.

Research Interests

My ultimate objective is to study the cognitive mechanisms underlying intelligence and develop embodied agents able to reason and interact with the real world.

A major challenge for current AI models is their struggle with generalization, consistent logical reasoning, and decision-making, especially in scenarios that require multi-step inference or handling previously unseen situations. Current models (e.g., LLMs) can generate highly semantically coherent text, yet often lack the robust reasoning abilities necessary for tasks such as complex problem-solving, planning, or adapting to novel environments.

Researchers have explored many approaches, such as prompt engineering methods (e.g., CoT and self-reflection). These can enhance reasoning abilities to an extent as a temporary, short-term solution, but they are limited and not architecturally native to the models. Therefore, my research focuses on developing differentiable approaches for implementing iterative reasoning and decision-making in uncertain environments in embodied agents. This involves developing new architectures that can maintain coherent thought processes across multiple inference steps.

Currently, I am exploring diffusion models, score-based and flow-based models, differential geometry (particularly Riemannian geometry), metric learning, energy-based models, and joint embedding predictive architectures operating in latent spaces, among others, to design generalizable intelligent agents capable of reasoning (as an optimization problem), deliberate planning (System 2-like), and handling uncertainty at inference time.

In a wider context, I am fascinated by the idea of developing a network of specialized networks, covering functions such as memory and rapid learning, goal-driven planning, spatial reasoning, and error detection and conflict monitoring, that operate collaboratively to consistently make good decisions within a complex environment. I am also interested in transformers and attention mechanisms, with a bias toward efficient computation, in these settings: generative models, multimodal learning, and self-supervised learning.

Publications

Neural Integration of Iterative Reasoning (NIR) in LLMs for Code Generation

Soran Ghaderi
