AI Agent

The latest developments in the AI Agent field: autonomous agents, AI assistants, tool calling, planning and reasoning, and more.

Agent

Mastercard’s AI payment demo points to agent-led commerce

A recent demonstration from Mastercard suggests that payment systems may be heading toward a future where software agent...

Agent

Deploying agentic finance AI for immediate business ROI

Agentic finance AI improves business efficiency and ROI only when deployed with strict governance and clear return on in...

Agent

Nokia and AWS pilot AI automation for real-time 5G network slicing

Telecom networks may soon begin adjusting themselves in real time, as operators test systems that allow AI agents to man...

Agent

Trace raises $3M to solve the AI agent adoption problem in enterprise

Trace is launching with $3 million in seed funding, including investment from Y Combinator, Zeno Ventures, Transpose Pla...

Agent

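InnoCare completes patient enrollment in Phase 3 trial of its BCL2 inhibitor combined with orelabrutinib

36Kr has learned that InnoCare, a biopharmaceutical high-tech company, announced today that the registrational Phase III clinical trial of its internally developed novel BCL2 inhibitor mesutoclax (ICP-248) in combination with the BTK inhibitor orelabrutinib as first-line treatment for chronic lymphocytic leukemia/small lymphocytic lymphoma (CLL/SLL) has completed patient...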

Agent

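US media: AI giants to sign pledge to supply their own power

According to a report on the 25th by the US news site Axios, representatives of several US tech giants plan to visit the White House next week to meet President Trump, where they will sign a written document pledging to supply or purchase on their own the electricity needed for artificial intelligence (AI) data centers. According to the report, several US tech giants have already pledged to take measures to keep consumers, as a result of the development of AI technology, from facing...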

Agent

Anthropic acquires computer-use AI startup Vercept after Meta poached one of its founders

Seattle-based Vercept developed complex agentic tools, including a computer-use agent that could complete tasks inside a...

Agent

A Comparative Analysis of Social Network Topology in Reddit and Moltbook

arXiv:2602.13920v3 Announce Type: replace-cross Abstract: Recent advances in agent-mediated systems have enabled a new p...

Agent

Bypassing AI Control Protocols via Agent-as-a-Proxy Attacks

arXiv:2602.05066v2 Announce Type: replace-cross Abstract: As AI agents automate critical workloads, they remain vulnerab...

Agent

Beyond RAG for Agent Memory: Retrieval by Decoupling and Aggregation

arXiv:2602.02007v2 Announce Type: replace-cross Abstract: Agent memory systems often adopt the standard Retrieval-Augmen...

Agent

RebuttalAgent: Strategic Persuasion in Academic Rebuttal via Theory of Mind

arXiv:2601.15715v3 Announce Type: replace-cross Abstract: Although artificial intelligence (AI) has become deeply integr...

Agent

Stabilizing Off-Policy Training for Long-Horizon LLM Agent via Turn-Level Importance Sampling and Clipping-Triggered Normalization

arXiv:2511.20718v2 Announce Type: replace-cross Abstract: Reinforcement learning (RL) algorithms such as PPO and GRPO ar...

Agent

SPACeR: Self-Play Anchoring with Centralized Reference Models

arXiv:2510.18060v2 Announce Type: replace-cross Abstract: Developing autonomous vehicles (AVs) requires not only safety ...

Agent

FML-bench: Benchmarking Machine Learning Agents for Scientific Research

arXiv:2510.10472v2 Announce Type: replace-cross Abstract: Large language models (LLMs) have sparked growing interest in ...

Agent

ClearFairy: Capturing Creative Workflows through Decision Structuring, In-Situ Questioning, and Rationale Inference

arXiv:2509.14537v2 Announce Type: replace-cross Abstract: Capturing professionals' decision-making in creative workflows...

Agent

Multi-agent deep reinforcement learning with centralized training and decentralized execution for transportation infrastructure management

arXiv:2401.12455v2 Announce Type: replace-cross Abstract: Life-cycle management of large-scale transportation systems re...

Agent

OptiRepair: Closed-Loop Diagnosis and Repair of Supply Chain Optimization Models with LLM Agents

arXiv:2602.19439v2 Announce Type: replace Abstract: Supply chain optimization models frequently become infeasible becaus...

Agent

OR-Agent: Bridging Evolutionary Search and Structured Research for Automated Algorithm Discovery

arXiv:2602.13769v2 Announce Type: replace Abstract: Automating scientific discovery in complex, experiment-driven domain...

Agent

OMNI-LEAK: Orchestrator Multi-Agent Network Induced Data Leakage

arXiv:2602.13477v2 Announce Type: replace Abstract: As Large Language Model (LLM) agents become more capable, their coor...

Agent

Toward Ultra-Long-Horizon Agentic Science: Cognitive Accumulation for Machine Learning Engineering

arXiv:2601.10402v4 Announce Type: replace Abstract: The advancement of artificial intelligence toward agentic science is...

Agent

InsightX Agent: An LMM-based Agentic Framework with Integrated Tools for Reliable X-ray NDT Analysis

arXiv:2507.14899v3 Announce Type: replace Abstract: Non-destructive testing (NDT), particularly X-ray inspection, is vit...

Agent

GUI-Libra: Training Native GUI Agents to Reason and Act with Action-aware Supervision and Partially Verifiable RL

arXiv:2602.22190v1 Announce Type: cross Abstract: Open-source native GUI agents still lag behind closed-source systems o...

Agent

SWE-Protégé: Learning to Selectively Collaborate With an Expert Unlocks Small Language Models as Software Engineering Agents

arXiv:2602.22124v1 Announce Type: cross Abstract: Small language models (SLMs) offer compelling advantages in cost, late...

Agent

Training Generalizable Collaborative Agents via Strategic Risk Aversion

arXiv:2602.21515v1 Announce Type: cross Abstract: Many emerging agentic paradigms require agents to collaborate with one...

Agent

Adversarial Intent is a Latent Variable: Stateful Trust Inference for Securing Multimodal Agentic RAG

arXiv:2602.21447v1 Announce Type: cross Abstract: Current stateless defences for multimodal agentic RAG fail to detect a...

Agent

The Headless Firm: How AI Reshapes Enterprise Boundaries

arXiv:2602.21401v1 Announce Type: cross Abstract: The boundary of the firm is determined by coordination cost. We argue ...

Agent

Black-Box Reliability Certification for AI Agents via Self-Consistency Sampling and Conformal Calibration

arXiv:2602.21368v1 Announce Type: cross Abstract: Given a black-box AI system and a task, at what confidence level can a...

Agent

A General Equilibrium Theory of Orchestrated AI Agent Systems

arXiv:2602.21255v1 Announce Type: cross Abstract: We establish a general equilibrium theory for systems of large languag...

Agent

AgenticTyper: Automated Typing of Legacy Software Projects Using Agentic AI

arXiv:2602.21251v1 Announce Type: cross Abstract: Legacy JavaScript systems lack type safety, making maintenance risky. ...

Agent

Budget-Aware Agentic Routing via Boundary-Guided Training

arXiv:2602.21227v1 Announce Type: cross Abstract: As large language models (LLMs) evolve into autonomous agents that exe...

Agent

Field-Theoretic Memory for AI Agents: Continuous Dynamics for Context Preservation

arXiv:2602.21220v1 Announce Type: cross Abstract: We present a memory system for AI agents that treats stored informatio...

Agent

2-Step Agent: A Framework for the Interaction of a Decision Maker with AI Decision Support

arXiv:2602.21889v1 Announce Type: new Abstract: Across a growing number of fields, human decision making is supported by...

Agent

Power and Limitations of Aggregation in Compound AI Systems

arXiv:2602.21556v1 Announce Type: new Abstract: When designing compound AI systems, a common approach is to query multip...

Agent

ARLArena: A Unified Framework for Stable Agentic Reinforcement Learning

arXiv:2602.21534v1 Announce Type: new Abstract: Agentic reinforcement learning (ARL) has rapidly gained attention as a p...

Agent

Beyond Refusal: Probing the Limits of Agentic Self-Correction for Semantic Sensitive Information

arXiv:2602.21496v1 Announce Type: new Abstract: While defenses for structured PII are mature, Large Language Models (LLM...

Agent

A Hierarchical Multi-Agent System for Autonomous Discovery in Geoscientific Data Archives

arXiv:2602.21351v1 Announce Type: new Abstract: The rapid accumulation of Earth science data has created a significant s...

Agent

Google and Samsung just launched the AI features Apple couldn’t with Siri

Google just announced that Gemini will soon be able to take care of some multistep tasks on your phone, like ordering fo...

Agent

OpenClaw creator’s advice to AI builders is to be more playful and allow yourself time to improve

Peter Steinberger talks about the creation of his viral AI agent OpenClaw and how being more "playful" makes for a bette...

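Agent

Sanctioned by Google, OpenClaw's founder hits back: Anthropic would call first, you just ban accounts

Editors: Zenan, Yang Wen. OpenClaw, which has been making headlines frequently of late, has finally been "sanctioned." On Monday, Google announced that it was restricting some developers from using its vibe coding platform Antigravity and accused them of "malicious use," a move that sparked controversy on social platforms. W...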

Agent

The Metaphysics We Train: A Heideggerian Reading of Machine Learning

arXiv:2602.19028v2 Announce Type: replace-cross Abstract: This paper offers a phenomenological reading of contemporary m...

Agent

ST-EVO: Towards Generative Spatio-Temporal Evolution of Multi-Agent Communication Topologies

arXiv:2602.14681v3 Announce Type: replace-cross Abstract: LLM-powered Multi-Agent Systems (MAS) have emerged as an effec...

Agent

AceGRPO: Adaptive Curriculum Enhanced Group Relative Policy Optimization for Autonomous Machine Learning Engineering

arXiv:2602.07906v2 Announce Type: replace-cross Abstract: Autonomous Machine Learning Engineering (MLE) requires agents ...

Agent

Repurposing Synthetic Data for Fine-grained Search Agent Supervision

arXiv:2510.24694v2 Announce Type: replace-cross Abstract: LLM-based search agents are increasingly trained on entity-cen...

Agent

A Survey of Data Agents: Emerging Paradigm or Overstated Hype?

arXiv:2510.23587v2 Announce Type: replace-cross Abstract: The rapid advancement of large language models (LLMs) has spur...

Agent

Breaking Agent Backbones: Evaluating the Security of Backbone LLMs in AI Agents

arXiv:2510.22620v2 Announce Type: replace-cross Abstract: AI agents powered by large language models (LLMs) are being de...

Agent

Towards Scalable Oversight via Partitioned Human Supervision

arXiv:2510.22500v2 Announce Type: replace-cross Abstract: As artificial intelligence (AI) systems approach and surpass e...

Agent

Performance Asymmetry in Model-Based Reinforcement Learning

arXiv:2505.19698v3 Announce Type: replace-cross Abstract: Recently, Model-Based Reinforcement Learning (MBRL) have achie...

Agent

BrowseComp-$V^3$: A Visual, Vertical, and Verifiable Benchmark for Multimodal Browsing Agents

arXiv:2602.12876v2 Announce Type: replace Abstract: Multimodal large language models (MLLMs), equipped with increasingly...

Agent

STAR: Similarity-guided Teacher-Assisted Refinement for Super-Tiny Function Calling Models

arXiv:2602.03022v2 Announce Type: replace Abstract: The proliferation of Large Language Models (LLMs) in function callin...

Agent

NewtonBench: Benchmarking Generalizable Scientific Law Discovery in LLM Agents

arXiv:2510.07172v3 Announce Type: replace Abstract: Large language models are emerging as powerful tools for scientific ...

Agent

A Framework for Studying AI Agent Behavior: Evidence from Consumer Choice Experiments

arXiv:2509.25609v2 Announce Type: replace Abstract: Environments built for people are increasingly operated by a new cla...

Agent

DS-STAR: Data Science Agent for Solving Diverse Tasks across Heterogeneous Formats and Open-Ended Queries

arXiv:2509.21825v4 Announce Type: replace Abstract: While large language models (LLMs) have shown promise in automating ...

Agent

TASER: Table Agents for Schema-guided Extraction and Recommendation

arXiv:2508.13404v4 Announce Type: replace Abstract: Real-world financial filings report critical information about an en...

Agent

A Survey on the Optimization of Large Language Model-based Agents

arXiv:2503.12434v2 Announce Type: replace Abstract: With the rapid development of Large Language Models (LLMs), LLM-base...

Agent

"Are You Sure?": An Empirical Study of Human Perception Vulnerability in LLM-Driven Agentic Systems

arXiv:2602.21127v1 Announce Type: cross Abstract: Large language model (LLM) agents are rapidly becoming trusted copilot...

Agent

Cooperative-Competitive Team Play of Real-World Craft Robots

arXiv:2602.21119v1 Announce Type: cross Abstract: Multi-agent deep Reinforcement Learning (RL) has made significant prog...

Agent

Toward an Agentic Infused Software Ecosystem

arXiv:2602.20979v1 Announce Type: cross Abstract: Fully leveraging the capabilities of AI agents in software development...

Agent

See and Fix the Flaws: Enabling VLMs and Diffusion Models to Comprehend Visual Artifacts via Agentic Data Synthesis

arXiv:2602.20951v1 Announce Type: cross Abstract: Despite recent advances in diffusion models, AI generated images still...

Agent

Some Simple Economics of AGI

arXiv:2602.20946v1 Announce Type: cross Abstract: For millennia, human cognition was the primary engine of progress on E...

Agent

Airavat: An Agentic Framework for Internet Measurement

arXiv:2602.20924v1 Announce Type: cross Abstract: Internet measurement faces twin challenges: complex analyses require e...