A Survey on Language Model based Autonomous Agents
This April 2024 paper "A Survey on Large Language Model based Autonomous Agents" provides a comprehensive review of the developing field of LM-based autonomous agents.
The authors organise their survey based on three key aspects: the construction, application, and evaluation of these agents.
Agent Construction
The construction of LM-based autonomous agents involves two main problems:
Designing the agent architecture to better leverage LMs
Enabling the agent to acquire capabilities for accomplishing specific tasks
Designing the Agent Architecture
For the first problem, the authors propose a unified agent framework that encompasses most previous studies.
This framework consists of four modules:
Profiling Module
Identifies the role of the agent using strategies such as handcrafting, LM-generation, or dataset alignment.
Memory Module
Stores information perceived from the environment and leverages recorded memories to facilitate future actions.
The authors discuss memory structures (unified and hybrid), formats (natural language, embeddings, databases, structured lists), and operations (reading, writing, reflection).
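As a rough illustration of the three memory operations, the sketch below implements a toy hybrid memory in Python. Everything in it is an assumption made for illustration and is not taken from the survey: the names `MemoryRecord` and `HybridMemory`, the word-overlap relevance score (a stand-in for embedding similarity), and the `summarize` callable (a stand-in for an LM call).

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class MemoryRecord:
    text: str
    step: int  # time step at which the record was written


@dataclass
class HybridMemory:
    """Toy hybrid memory: a bounded short-term buffer plus a long-term list."""
    short_term_size: int = 3
    short_term: List[MemoryRecord] = field(default_factory=list)
    long_term: List[MemoryRecord] = field(default_factory=list)
    step: int = 0

    def write(self, text: str) -> None:
        """Writing: store an observation; the oldest overflow moves to long-term."""
        self.step += 1
        self.short_term.append(MemoryRecord(text, self.step))
        if len(self.short_term) > self.short_term_size:
            self.long_term.append(self.short_term.pop(0))

    def read(self, query: str, k: int = 2) -> List[str]:
        """Reading: rank all records by word overlap with the query, then recency."""
        query_words = set(query.lower().split())

        def score(record: MemoryRecord):
            overlap = len(query_words & set(record.text.lower().split()))
            return (overlap, record.step)

        ranked = sorted(self.short_term + self.long_term, key=score, reverse=True)
        return [record.text for record in ranked[:k]]

    def reflect(self, summarize: Callable[[List[str]], str]) -> str:
        """Reflection: condense long-term records into one higher-level record."""
        summary = summarize([record.text for record in self.long_term])
        self.step += 1
        self.long_term = [MemoryRecord(summary, self.step)]
        return summary


memory = HybridMemory(short_term_size=2)
for observation in ["door is locked", "key is under the mat",
                    "room is dark", "lamp needs a bulb"]:
    memory.write(observation)

print(memory.read("where is the key"))
```

A real agent would likely replace the overlap score with vector similarity and call an LM inside `summarize`; the split into `write`, `read`, and `reflect` is the part that mirrors the survey's taxonomy.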
Planning Module
Assists agents in planning future actions.
Planning strategies are categorised based on whether the agent receives feedback (planning without feedback and planning with feedback from environments, humans, or models).
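The difference between the two planning categories can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the survey's method: `toy_planner` is a hand-written stand-in for an LM planner, `ToyDoor` is a hypothetical environment, and only environment feedback is modelled (human and model feedback would enter the loop the same way).

```python
from typing import Callable, List

# A planner maps (goal, feedback so far) to the remaining steps.
Planner = Callable[[str, List[str]], List[str]]


def plan_without_feedback(goal: str, planner: Planner) -> List[str]:
    """Generate the whole plan once and never revise it."""
    return planner(goal, [])


def plan_with_feedback(goal: str, planner: Planner,
                       execute: Callable[[str], str],
                       max_steps: int = 10) -> List[str]:
    """Execute one step at a time and replan using the feedback from each step."""
    feedback: List[str] = []
    executed: List[str] = []
    for _ in range(max_steps):
        plan = planner(goal, feedback)
        if not plan:  # the planner considers the goal reached
            break
        step = plan[0]
        feedback.append(execute(step))
        executed.append(step)
    return executed


def toy_planner(goal: str, feedback: List[str]) -> List[str]:
    """Hypothetical rule-based planner standing in for an LM."""
    saw_lock = any("locked" in f for f in feedback)
    saw_key = any("key" in f for f in feedback)
    if saw_lock and not saw_key:
        return ["find key", "open door"]
    if any("opened" in f for f in feedback):
        return []
    return ["open door"]


class ToyDoor:
    """Hypothetical environment: the door opens only once the key is found."""

    def __init__(self) -> None:
        self.has_key = False

    def execute(self, step: str) -> str:
        if step == "find key":
            self.has_key = True
            return "found key"
        if step == "open door":
            return "door opened" if self.has_key else "door is locked"
        return "unknown step"


print(plan_without_feedback("open the door", toy_planner))
print(plan_with_feedback("open the door", toy_planner, ToyDoor().execute))
```

Without feedback the agent commits to a single step that fails against the locked door; with feedback it observes the failure, inserts the missing step, and then retries.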
Action Module
Translates the agent's decisions into specific outcomes.
The authors analyse this module from four perspectives:
action goal (task completion, communication, exploration)
action production (memory recollection, plan following)
action space (external tools, internal knowledge)
action impact (changing environments, altering internal states, triggering new actions)
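To make the four-module framework concrete, the following Python sketch wires a profile, a memory, a planning step, and an action step into one small class. It is a minimal sketch under assumptions, not code from the survey: the `Agent` class, the `"tool: argument"` decision format, `fake_llm`, and the `calculator` tool are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Agent:
    profile: str                                      # profiling module: role description
    llm: Callable[[str], str]                         # hypothetical text-in, text-out model
    tools: Dict[str, Callable[[str], str]]            # action space: external tools
    memory: List[str] = field(default_factory=list)   # memory module: plain-text log

    def plan(self, task: str) -> str:
        """Planning module: ask the model for the next action as 'tool: argument'."""
        prompt = (
            f"Role: {self.profile}\n"
            f"Memory: {'; '.join(self.memory[-3:])}\n"
            f"Task: {task}\n"
            f"Tools: {', '.join(self.tools)}\n"
            "Next action:"
        )
        return self.llm(prompt)

    def act(self, decision: str) -> str:
        """Action module: parse the decision, call the tool, record the outcome."""
        name, _, argument = decision.partition(":")
        tool = self.tools.get(name.strip())
        if tool is None:
            result = f"unknown tool {name.strip()!r}"
        else:
            result = tool(argument.strip())
        self.memory.append(f"{decision} -> {result}")  # action impact: internal state
        return result

    def step(self, task: str) -> str:
        return self.act(self.plan(task))


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model; always picks the same action."""
    return "calculator: 2 + 3"


def calculator(expression: str) -> str:
    """Toy tool that only handles 'a + b' with integers."""
    left, _, right = expression.partition("+")
    return str(int(left) + int(right))


agent = Agent(profile="careful arithmetic assistant",
              llm=fake_llm, tools={"calculator": calculator})
print(agent.step("add two and three"))
print(agent.memory)
```

Keeping the modules as separate methods makes it possible to swap any one of them (for example, a different memory format or planner) without touching the others, which is the main point of a unified framework.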
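Enabling the Agent to Acquire Capabilities
For the second problem, the authors distinguish two strategies, depending on whether the underlying LM is fine-tuned: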
Capability Acquisition with Fine-tuning
Fine-tuning the agent based on task-dependent datasets constructed from human annotation, LLM generation, or real-world applications.
Capability Acquisition without Fine-tuning
Enhancing agent capabilities through prompt engineering (describing desired capabilities using natural language prompts) or mechanism engineering (developing specialized modules, introducing novel working rules, etc.).
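The two fine-tuning-free routes can be contrasted with a short Python sketch. It is illustrative only: `build_prompt` shows prompt engineering as a capability description plus few-shot examples, while `trial_and_error` shows one possible working rule for mechanism engineering (retrying with critic feedback). The function names, the prompt layout, and the `llm` and `critic` callables are assumptions, not an API from the survey or any library.

```python
from typing import Callable, List, Tuple


def build_prompt(capability: str, examples: List[Tuple[str, str]],
                 question: str) -> str:
    """Prompt engineering: describe the capability and add few-shot examples."""
    lines = [f"You are able to {capability}."]
    for example_question, example_answer in examples:
        lines.append(f"Q: {example_question}\nA: {example_answer}")
    lines.append(f"Q: {question}\nA:")
    return "\n".join(lines)


def trial_and_error(llm: Callable[[str], str],
                    critic: Callable[[str], Tuple[bool, str]],
                    prompt: str, max_trials: int = 3) -> str:
    """Mechanism engineering: retry with critic feedback appended to the prompt."""
    answer = llm(prompt)
    for _ in range(max_trials - 1):
        ok, feedback = critic(answer)
        if ok:
            break
        prompt = f"{prompt}\nPrevious answer: {answer}\nFeedback: {feedback}\nA:"
        answer = llm(prompt)
    return answer


def fake_llm(prompt: str) -> str:
    """Stand-in model: answers wrongly until it sees feedback in the prompt."""
    return "4" if "Feedback:" in prompt else "5"


def fake_critic(answer: str) -> Tuple[bool, str]:
    """Stand-in critic: accepts only the answer '4'."""
    return (answer == "4", "the sum of 2 and 2 is not " + answer)


prompt = build_prompt("add numbers", [("1 + 1", "2")], "2 + 2")
print(prompt)
print(trial_and_error(fake_llm, fake_critic, prompt))
```

In both cases the model weights stay untouched: prompt engineering changes what the model is told, whereas mechanism engineering changes the procedure wrapped around the model.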
Agent Application
The authors provide a systematic overview of LM-based autonomous agent applications in three areas:
social science (psychology, political science and economy, social simulation, jurisprudence, research assistant)
natural science (documentation and data management, experiment assistant, natural science education)
engineering (civil, computer science, aerospace, industrial automation, robotics & embodied AI).
Applications
Social Science
The authors discuss several key areas within social science where LM-based autonomous agents can be applied:
Psychology: LM-based agents can be used for conducting simulation experiments and providing mental health support. The authors cite examples of studies where LMs are used to complete psychology experiments and generate results that align with human participants. They also highlight the potential of conversation agents to help users cope with anxieties, social isolation, and depression.
Political Science and Economy: LLM-based agents can be used for ideology detection, predicting voting patterns, understanding political speech, and exploring human economic behaviours in simulated scenarios.
Social Simulation: LM-based agents can be employed to simulate social phenomena, such as the propagation of harmful information, in virtual environments. The authors provide examples of studies that simulate online social communities, investigate the impacts of agent behavioural characteristics in social networks, and simulate human daily life.
Jurisprudence: LM-based agents can serve as aids in legal decision-making processes, facilitating more informed judgments. Examples include simulating the decision-making processes of multiple judges and supporting legal search strategies.
Research Assistant: LLM-based agents can be used as versatile assistants in social science research, offering assistance in tasks such as generating article abstracts, extracting keywords, crafting study scripts, and identifying novel research inquiries.
Natural Science
The authors present several representative areas within natural science where LLM-based agents can play important roles:
Documentation and Data Management: LM-based agents can excel in tasks related to collecting, organising, and synthesising literature, thanks to their strong language understanding capabilities and ability to employ tools such as the internet and databases for text processing.
Experiment Assistant: LM-based agents can independently conduct experiments, making them valuable tools for supporting scientists in their research projects. Examples include automating the design, planning, and execution of scientific experiments, and providing recommendations for experimental procedures while emphasizing potential safety risks.
Natural Science Education: LM-based agents can be used to develop agent-based educational tools that facilitate students' learning of experimental design, methodologies, and analysis. Examples include assisting researchers in exploring, discovering, solving, and proving mathematical problems, and automatically solving and explaining university-level mathematical problems.
Engineering
The authors review and summarise the applications of LM-based agents in several major engineering domains:
Civil Engineering: LM-based agents can be used to design and optimise complex structures such as buildings, bridges, dams, and roads. An example is an interactive framework where human architects and agents collaborate to construct structures in a 3D simulation environment.
Computer Science & Software Engineering: LM-based agents offer potential for automating coding, testing, debugging, and documentation generation. Examples include an end-to-end framework where multiple agent roles communicate and collaborate through natural language conversations to complete the software development life cycle, and a self-collaboration framework for code generation using LMs.
Industrial Automation: LM-based agents can be used to achieve intelligent planning and control of production processes. An example is a framework that integrates LMs with digital twin systems to accommodate flexible production needs.
Robotics & Embodied Artificial Intelligence: Recent works have developed more efficient reinforcement learning agents for robotics and embodied artificial intelligence, focusing on enhancing autonomous agents' abilities for planning, reasoning, and collaboration in embodied environments.
Lastly, the authors remark on the risks and challenges associated with using LLM-based agents in these applications, such as the susceptibility of LLMs to hallucination and other errors, and the potential for malicious exploitation.
They emphasise the need for users to possess the necessary expertise and knowledge to exercise caution during experimentation and the importance of implementing security measures to ensure responsible and ethical use.