LLM aided design


LLM-aided design refers to the use of large language models (LLMs) as intelligent agents throughout the end-to-end process of system design, including conceptualization, prototyping, verification, and optimization. This evolving interdisciplinary approach integrates advances in natural language processing (NLP), program synthesis, and automated reasoning to support tasks in domains such as electronic design automation (EDA), software engineering, hardware design, and cyber-physical systems.

Unlike traditional automation tools, LLMs, especially transformer-based models such as GPT-4, Claude<ref>Anthropic et al. The Claude 3 Model Family: Opus, Sonnet, Haiku. Anthropic Model Card (PDF), 2024. [https://www-cdn.anthropic.com/de8ba9b01c9ab7cbabf5c33b80b7bbc618857627/Model_Card_Claude_3.pdf Available online]</ref>, LLaMA, and domain-specialized variants such as [https://ollama.com/library/codellama CodeLlama], can interpret, generate, and refine structured and unstructured data, including natural language specifications, hardware description language (HDL) or HDL-like code, constraint definitions, tool scripts, and design documentation. LLM-aided design thus represents a shift from tool-assisted engineering toward a form of co-design in which the model participates actively in architectural exploration, logic synthesis, formal verification, and post-silicon validation. It sits at the intersection of artificial intelligence, computer-aided design (CAD), and systems engineering.

Introduction

Engineering workflows in hardware and software development have traditionally relied on manual translation of high-level design intents into machine-readable specifications. These processes, though robust, are time-consuming and often require significant domain expertise. The introduction of large language models into design workflows aims to streamline this process by enabling natural language interaction, synthesis of domain-specific artifacts, and integration with design toolchains.

In recent years, engineering design has seen a rapid convergence of artificial intelligence (AI) and domain-specific modeling. LLMs such as GPT-4, Claude, and LLaMA can understand and generate code, documents, and designs from natural language descriptions. This capability lets human designers work with AI systems to improve design correctness and reduce time-to-market. The aim is to allow designers to express intent in natural language and rely on the model to produce Verilog, VHDL, HLS C, or firmware code.

LLM-aided design differs from earlier forms of automated design through its ability to generalize across tasks and contexts. Unlike rule-based or template-driven systems, large language models can encode domain-specific heuristics and adapt to various inputs—including design specifications, codebases, formal properties, and documentation—without requiring extensive retraining. This flexibility supports their use in diverse design settings such as system-on-chip development, embedded systems, robotic control, and cyber-physical system modeling.

LLM-aided design adds a new epistemic layer to the engineering process, in which models contribute to design reasoning rather than only carrying out commands. This enables uses such as flow-control automation, formal assertion generation, and template retrieval for HLS code repair. It has also given rise to domain-adapted LLMs known as circuit foundation models (CFMs), which aim to reason and generate across the whole RTL-to-GDSII pipeline.

Background and Foundations of LLM-Aided Design

The integration of large language models (LLMs) into electronic design automation (EDA) represents a shift in how hardware systems are specified, verified, and developed. While EDA has conventionally relied on predefined workflows, rule-based synthesis tools, and extensive manual intervention, the growth of LLMs has introduced a new design approach driven by reasoning, abstraction, and natural language interaction. This shift aligns with the broader trajectory of artificial intelligence, in which general-purpose models have increasingly been specialized for domain-specific tasks, including those that traditionally required expert engineers.

=From Transformers to Circuit Reasoning=

The transformer architecture introduced by Vaswani et al. (2017)<ref>Vaswani, Ashish; Shazeer, Noam; Parmar, Niki; Uszkoreit, Jakob; Jones, Llion; Gomez, Aidan N.; Kaiser, Lukasz; and Polosukhin, Illia. Attention Is All You Need. *Proceedings of the 31st International Conference on Neural Information Processing Systems* (NIPS'17), 6000–6010. Curran Associates Inc. ISBN 9781510860964. [https://arxiv.org/abs/1706.03762 Available online]</ref> serves as the foundation of LLM-aided design. This architecture largely replaced RNNs and LSTMs<ref>Sherstinsky, Alex. Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) network. Physica D: Nonlinear Phenomena, vol. 404, 2020, p. 132306. [https://doi.org/10.1016/j.physd.2019.132306 Available online]</ref> in natural language processing because its self-attention mechanism can model long-range dependencies. It is the basis for the GPT series, from the original GPT through GPT-4o and later models, with successive iterations showing significantly better zero-shot reasoning, code generation, and language understanding.

By 2020, GPT-3's ability to produce functional code, including basic HTML, Python, and even Verilog, had drawn the interest of the AI community. This led hardware design researchers to hypothesize that LLMs could be used for logic design and verification by exploiting the structural similarities between programming languages and hardware description languages (HDLs). Early experiments using GPT-3 to write Verilog or assist in debugging showed potential but also exposed critical limitations, such as syntax errors, hallucinations, and output that was incompatible with synthesis tools.

Attempts to address these limitations led to a new direction: domain-specific foundation models tailored to EDA. These models, referred to as circuit foundation models, are trained or fine-tuned on HDL code, simulation traces, synthesis logs, and constraint files. By 2023, tools like RTLLM<ref>Lu, Yao; Liu, Shang; Zhang, Qijun; and Xie, Zhiyao. RTLLM: An Open-Source Benchmark for Design RTL Generation with Large Language Model. *Proceedings of the 29th Asia and South Pacific Design Automation Conference* (ASPDAC '24), 722–727. IEEE Press, 2024. [https://doi.org/10.1109/ASP-DAC58780.2024.10473904 Available online]</ref> began to deliver on the vision of LLM-aided design through carefully engineered prompts, feedback loops, and domain-aligned datasets.

{| class="wikitable sortable"
|+ Timeline of LLM Trends in EDA
! Year !! Milestone
|-
| 2017 || Transformer introduced by Vaswani et al.
|-
| 2020 || GPT-3 exhibits rudimentary HDL generation capability.
|-
| 2021 || Prompt-based Verilog code generation appears in exploratory tools.
|-
| 2022 || RTLLM pioneers structured, feedback-driven generation pipelines.
|-
| 2023 || Domain-specific fine-tuning (VeriGen<ref>Thakur, Shailja; Ahmad, Baleegh; Pearce, Hammond; Tan, Benjamin; Dolan-Gavitt, Brendan; Karri, Ramesh; and Garg, Siddharth. VeriGen: A Large Language Model for Verilog Code Generation. ACM Transactions on Design Automation of Electronic Systems, vol. 29, no. 3, article 46, 2024, pp. 1–31. [https://doi.org/10.1145/3643681 Available online]</ref>, RTLCoder<ref>Liu, Shang; et al. RTLCoder: Fully Open-Source and Efficient LLM-Assisted RTL Code Generation Technique. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 44, no. 4, pp. 1448–1461, April 2025. IEEE. [https://doi.org/10.1109/TCAD.2024.3483089 Available online]</ref>) and agent frameworks (RTLFixer<ref>Tsai, Yunda; Liu, Mingjie; and Ren, Haoxing. RTLFixer: Automatically Fixing RTL Syntax Errors with Large Language Model. Proceedings of the 61st ACM/IEEE Design Automation Conference (DAC '24), Article 53, 6 pages. Association for Computing Machinery, 2024. [https://doi.org/10.1145/3649329.3657353 Available online]</ref>, MEIC<ref>Xu, Ke; Sun, Jialin; Hu, Yuchen; Fang, Xinwei; Shan, Weiwei; Wang, Xi; and Jiang, Zhe. MEIC: Re-thinking RTL Debug Automation using LLMs. Proceedings of the 43rd IEEE/ACM International Conference on Computer-Aided Design (ICCAD '24), Article 100, 9 pages. Association for Computing Machinery, 2025. [https://doi.org/10.1145/3676536.3676801 Available online]</ref>) become practical.
|-
| 2024 || Vision-language fusion (LayoutCopilot<ref>Liu, B.; et al. LayoutCopilot: An LLM-Powered Multi-Agent Collaborative Framework for Interactive Analog Layout Design. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2025. IEEE. [https://doi.org/10.1109/TCAD.2025.3529805 Available online]</ref>) and analog LLMs (LaMAGIC<ref>Chang, Chen-Chia; Shen, Yikang; Fan, Shaoze; Li, Jing; Zhang, Shun; Cao, Ningyuan; Chen, Yiran; and Zhang, Xin. LaMAGIC: Language-Model-Based Topology Generation for Analog Integrated Circuits. Proceedings of the 41st International Conference on Machine Learning (ICML '24), Article 241, 10 pages. JMLR.org, 2024. [https://proceedings.mlr.press/v202/chang24a.html Available online]</ref>, AnalogCoder<ref>Lai, Yao; Lee, Sungyoung; Chen, Guojin; Poddar, Souradip; Hu, Mengkang; Pan, David Z.; and Luo, Ping. AnalogCoder: Analog Circuit Design via Training-Free Code Generation. Proceedings of the AAAI Conference on Artificial Intelligence, vol. 39, no. 1, pp. 379–387, 2025. [https://doi.org/10.1609/aaai.v39i1.32016 Available online]</ref>) expand into new design domains.
|-
| 2025 || Multi-agent architectures and graph-text fusion (DRC-Coder<ref>Chang, Chen-Chia; Ho, Chia-Tung; Li, Yaguang; Chen, Yiran; and Ren, Haoxing. DRC-Coder: Automated DRC Checker Code Generation Using LLM Autonomous Agent. In: Proceedings of the 2025 International Symposium on Physical Design (ISPD ’25), ACM, 2025, pp. 143–151. [https://doi.org/10.1145/3698364.3705347 Available online]</ref>) reshape design verification.
|}

=Decoder vs. Encoder Models in Co-Design=

1. Decoder-Based Autoregressive Models: Based on architectures like GPT and [https://ollama.com/library/codellama CodeLlama], these models are used for generation tasks. They can translate natural language specifications into HDL, generate testbenches, and repair buggy RTL. Prompt chaining and few-shot prompting are two of the techniques used to make these models effective in synthesis-aligned code generation.

2. Encoder-Based Graph Reasoning Models: Inspired by models such as BERT and adapted into graph neural networks (e.g., ChiPFormer<ref>Lai, Yao; Liu, Jinxin; Tang, Zhentao; Wang, Bin; Hao, Jianye; and Luo, Ping. ChiPFormer: Transferable Chip Placement via Offline Decision Transformer. *Proceedings of the 40th International Conference on Machine Learning* (ICML '23), Article 757, 19 pages. JMLR.org, 2023. [https://proceedings.mlr.press/v202/lai23a.html Available online]</ref>), these models are optimized for inference tasks over structural representations such as netlists or intermediate representations (IRs). They can estimate timing, identify bottlenecks, and perform logic equivalence checks.

The design ecosystem is increasingly adopting hybrid strategies, in which decoder models generate artifacts and encoder models verify or optimize them, forming a closed co-design loop. This dual architecture resembles human design workflows, where generation and validation are closely interdependent.

Methodological Landscape of LLM-Aided Design

LLM-aided design covers multiple stages of the hardware-software co-design pipeline, including natural language specification, HDL synthesis, analog circuit design, formal verification, and layout generation. While foundational techniques such as prompting, supervised fine-tuning (SFT), and retrieval-augmented generation (RAG) cover much of the field, their practical application varies with the nature of the task. The following table classifies typical LLM methodologies by EDA task domain for several recently published, domain-specific LLMs and tools:

{| class="wikitable"
|+ Methodology by Task Domain in LLM-Aided Design
! Representative LLMs/Tools !! LLM Methodology Used !! Task Domain
|-
| RTLLM, VeriGen, RTLFixer || Prompt engineering, self-refinement, score-based SFT || Specification to HDL
|-
| ChatEDA<ref>Wu, Haoyuan; He, Zhuolun; Zhang, Xinyun; Yao, Xufeng; Zheng, Su; Zheng, Haisheng; and Yu, Bei. ChatEDA: A Large Language Model Powered Autonomous Agent for EDA. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 43, no. 10, 2024, pp. 3184–3197. [https://doi.org/10.1109/TCAD.2024.3383347 Available online]</ref> || Instruction tuning, retrieval-augmented generation || Constraint Generation
|-
| AutoSVA<ref>Orenes-Vera, Marcelo; Manocha, Aninda; Wentzlaff, David; and Martonosi, Margaret. AutoSVA: Democratizing Formal Verification of RTL Module Interactions. Proceedings of the 58th Annual ACM/IEEE Design Automation Conference (DAC '21), pp. 535–540. IEEE Press, 2022. [https://doi.org/10.1109/DAC18074.2021.9586118 Available online]</ref>, LLM4DV<ref>Zhang, Zixi; Chadwick, Greg; McNally, Hugo; Zhao, Yiren; and Mullins, Robert. LLM4DV: Using Large Language Models for Hardware Test Stimuli Generation. Proceedings of the 33rd IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM '25), pp. 1–5, 2025. [https://arxiv.org/abs/2310.04535 Available online]</ref> || Coverage-driven generation || Testbench & Assertions
|-
| LayoutCopilot, ChatEDA || Vision-language models, TCL script generation || Floorplan/Layout Synthesis
|-
| AnalogCoder, LaMAGIC || Topology suggestion, layout constraints, Bayesian tuning || Analog Circuit Synthesis
|}

= Core Methodologies =

Below are a few core methodologies, with insights from recent tools and frameworks:

== Specification to HDL Translation ==

LLMs can generate synthesizable RTL (Verilog, VHDL) directly from natural language specifications. This process is significantly enhanced using:

* Prompt engineering and hierarchical prompting, for structured code generation,
* Context window expansion, to provide multi-level module and signal context,
* Self-refinement and feedback from compiler logs, allowing the LLM to repair its output and converge to synthesizable HDL,
* Supervised fine-tuning (SFT), including score-based variants, as seen in tools like VeriGen and RTLCoder, to improve alignment with design intent and functional correctness.
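
The self-refinement step in the list above can be sketched as a small loop. This is a minimal illustration, not the method of any specific tool: the `generate` and `check` callables are hypothetical stand-ins for an LLM call and a compiler or linter run, and the stub functions exist only to exercise the loop.

```python
from typing import Callable, List, Tuple

# Hypothetical interfaces (assumptions, not a real library API).
Generator = Callable[[str], str]             # prompt -> candidate HDL text
Checker = Callable[[str], Tuple[bool, str]]  # HDL text -> (passed, compiler log)


def refine_hdl(spec: str, generate: Generator, check: Checker,
               max_rounds: int = 3) -> Tuple[str, bool, List[str]]:
    """Generate HDL from a spec, feeding compiler logs back until it passes."""
    prompt = f"Write synthesizable Verilog for this specification:\n{spec}"
    logs: List[str] = []
    code = ""
    for _ in range(max_rounds):
        code = generate(prompt)
        passed, log = check(code)
        logs.append(log)
        if passed:
            return code, True, logs
        # Append the failing attempt and its log so the next round can repair it.
        prompt = (f"{prompt}\n\nPrevious attempt:\n{code}\n"
                  f"Compiler log:\n{log}\n"
                  "Fix the errors and return the full module.")
    return code, False, logs


# Stand-in for a model: emits a missing semicolon until it sees a compiler log.
def fake_generate(prompt: str) -> str:
    body = "assign y = a & b;" if "Compiler log" in prompt else "assign y = a & b"
    return f"module and2(input a, input b, output y);\n  {body}\nendmodule"


# Stand-in for a compiler: accepts only the version with the semicolon.
def fake_check(code: str) -> Tuple[bool, str]:
    if "a & b;" in code:
        return True, "ok"
    return False, "error: syntax error near 'endmodule' (missing ';')"


code, passed, logs = refine_hdl("2-input AND gate", fake_generate, fake_check)
```

In a real flow, `check` would wrap whichever simulator, linter, or synthesis tool the project uses, and the round limit bounds cost when the model fails to converge.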

== Testbench and Assertion Generation ==

LLMs can synthesize SystemVerilog assertions (SVA), property checks, test stimuli, and full verification environments from examples and coverage goals. Approaches include:

* Coverage-driven generation, in which the LLM proposes stimuli aimed at specific coverage goals while preserving random-seed diversity.

Tools such as AutoSVA and LLM4DV have been reported to achieve higher assertion coverage and better bug exposure than traditional constrained-random verification methods.
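
Coverage-driven generation can be illustrated with a small feedback loop. This is a sketch under stated assumptions: the coverage bins for an 8-bit adder and the `propose` callable (standing in for an LLM that is told which bins remain uncovered) are hypothetical and not taken from AutoSVA or LLM4DV.

```python
from typing import Callable, Iterable, List, Set, Tuple

Stimulus = Tuple[int, int]

ALL_BINS = {"zero_operand", "max_operand", "carry_out", "equal_operands"}


def coverage_bins(a: int, b: int) -> Set[str]:
    """Hypothetical coverage model for an 8-bit adder."""
    bins = set()
    if a == 0 or b == 0:
        bins.add("zero_operand")
    if a == 255 or b == 255:
        bins.add("max_operand")
    if a + b > 255:
        bins.add("carry_out")
    if a == b:
        bins.add("equal_operands")
    return bins


def coverage_loop(propose: Callable[[Set[str]], Iterable[Stimulus]],
                  max_rounds: int = 5) -> Tuple[List[Stimulus], Set[str]]:
    """Ask for stimuli targeting uncovered bins until all bins are hit."""
    covered: Set[str] = set()
    kept: List[Stimulus] = []
    for _ in range(max_rounds):
        missing = ALL_BINS - covered
        if not missing:
            break
        for a, b in propose(missing):
            new_hits = coverage_bins(a, b) - covered
            if new_hits:  # keep only stimuli that add coverage
                kept.append((a, b))
                covered |= new_hits
    return kept, covered


# Stand-in for an LLM call: returns one stimulus per missing bin.
def fake_propose(missing: Set[str]) -> List[Stimulus]:
    table = {"zero_operand": (0, 7), "max_operand": (255, 1),
             "carry_out": (200, 100), "equal_operands": (5, 5)}
    return [table[name] for name in sorted(missing)]


stimuli, covered = coverage_loop(fake_propose)
```

The key design choice is that coverage is measured by the verification environment, not by the model, so the model only needs to respond to a list of missing goals.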

== HDL Debugging and Repair ==

LLMs assist in both syntactic repair (fixing compilation errors) and semantic repair (correcting logical or functional behavior), drawing on:

* Template libraries and error log parsing,
* Similarity search over past fixes,
* Retrieval-augmented generation (RAG) pipelines such as RTLFixer and MEIC, which iteratively improve code until it passes lint, synthesis, or formal checks.
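
The retrieval step can be sketched as follows. This is an illustrative assumption, not the RTLFixer or MEIC implementation: the error messages and hints in `PAST_FIXES` are invented examples, and plain string similarity from the standard library stands in for the embedding search a production RAG pipeline would more likely use.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

# Hypothetical database of (compiler error, repair guidance) pairs.
PAST_FIXES = [
    ("syntax error near 'endmodule'",
     "Check for a missing semicolon on the preceding statement."),
    ("'q' is not a valid l-value",
     "Declare the signal as reg (or logic) before assigning it in an always block."),
    ("port 'clk' is not declared",
     "Add the port to the module header or fix the spelling."),
]


def retrieve_fixes(error: str, k: int = 1) -> List[Tuple[float, str]]:
    """Return the k hints whose stored error text is most similar to `error`."""
    scored = [(SequenceMatcher(None, error.lower(), msg.lower()).ratio(), hint)
              for msg, hint in PAST_FIXES]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_repair_prompt(code: str, error: str) -> str:
    """Combine the failing code, the error, and retrieved hints into one prompt."""
    hints = "\n".join(f"- {hint}" for _, hint in retrieve_fixes(error, k=2))
    return (f"Compiler error:\n{error}\n\nRelated past fixes:\n{hints}\n\n"
            f"Code:\n{code}\n\nReturn the corrected module.")


top = retrieve_fixes("error: 'q' is not a valid l-value in top.v")
prompt = build_repair_prompt("module top; assign q = d; endmodule",
                             "error: 'q' is not a valid l-value in top.v")
```

The resulting prompt would then be passed to the model, and the repaired code re-checked, which is the iterative part of the pipeline described above.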

== HLS Code Refinement ==

Standard C/C++ often relies on constructs that HLS tools restrict or do not support, such as recursion and pointers. LLM-based flows address this by:

* Detecting and rewriting non-HLS-friendly patterns using prompt-repair pipelines,
* Generating test harnesses and compiler hints (e.g., `#pragma HLS unroll`),
* Converting ML kernels into synthesizable HLS by combining structural abstraction and loop pattern rewrites, as in tools like GPT4AIGChip<ref>Fu, Y.; et al. GPT4AIGChip: Towards Next-Generation AI Accelerator Design Automation via Large Language Models. *Proceedings of the 2023 IEEE/ACM International Conference on Computer Aided Design* (ICCAD '23), San Francisco, CA, USA, pp. 1–9. IEEE, 2023. [https://doi.org/10.1109/ICCAD57390.2023.10323953 Available online]</ref>.
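
The detection half of such a pipeline can be approximated before any model is involved. The sketch below is a heuristic illustration only: the regular expressions and the C sample are assumptions made for this example, and a real flow would use a proper C/C++ parser rather than pattern matching. The findings would then be placed in the repair prompt.

```python
import re
from typing import Dict, List

# Heuristic patterns only; these are illustrative, not exhaustive.
PATTERNS: Dict[str, re.Pattern] = {
    "dynamic_allocation": re.compile(r"\b(malloc|calloc|realloc|free)\s*\("),
    "pointer_declaration": re.compile(r"\b(?:int|float|double|char)\s*\*\s*\w+"),
}
CONTROL_KEYWORDS = {"if", "for", "while", "switch", "return"}


def extract_body(source: str, start: int) -> str:
    """Return the text from `start` up to the brace that closes the block."""
    depth, i = 1, start
    while i < len(source) and depth > 0:
        if source[i] == "{":
            depth += 1
        elif source[i] == "}":
            depth -= 1
        i += 1
    return source[start:i - 1]


def find_recursion(source: str) -> List[str]:
    """Flag functions whose body appears to call the function itself."""
    hits = []
    for m in re.finditer(r"\b\w+\s+(\w+)\s*\([^)]*\)\s*\{", source):
        name = m.group(1)
        if name in CONTROL_KEYWORDS:
            continue
        if re.search(rf"\b{re.escape(name)}\s*\(", extract_body(source, m.end())):
            hits.append(name)
    return hits


def scan(source: str) -> Dict[str, List[str]]:
    """Collect all non-HLS-friendly findings, omitting empty categories."""
    report = {name: [m.group(0) for m in pat.finditer(source)]
              for name, pat in PATTERNS.items()}
    report["recursion"] = find_recursion(source)
    return {key: found for key, found in report.items() if found}


SAMPLE = """
int fact(int n) {
    if (n <= 1) return 1;
    return n * fact(n - 1);
}

void scale(float *buf, int n) {
    float *tmp = (float *)malloc(n * sizeof(float));
    for (int i = 0; i < n; i++) tmp[i] = buf[i] * 2.0f;
    free(tmp);
}
"""

report = scan(SAMPLE)
```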

==Constraint Generation==

Constraint files are essential for synthesis, placement, and timing correctness. LLM-based tools like ChatEDA support constraint generation through:

* Instruction tuning, enabling fine-grained command generation (e.g., for SDC and XDC formats),
* Retrieval-augmented generation (RAG), which pulls prior constraints from similar designs or databases to keep generation consistent with the domain,
* Generation of multi-domain timing, placement, and I/O constraints with contextual accuracy.
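
One way to keep generated constraints well-formed is to have the model emit structured intent and render the SDC text deterministically. The sketch below assumes that design choice; it is not described as ChatEDA's approach. The `ClockIntent` and `InputDelayIntent` types are hypothetical, while `create_clock` and `set_input_delay` are standard SDC commands.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClockIntent:
    port: str          # clock port; also used as the clock name here
    period_ns: float


@dataclass
class InputDelayIntent:
    port: str
    clock: str
    delay_ns: float


def to_sdc(clocks: List[ClockIntent],
           input_delays: List[InputDelayIntent]) -> List[str]:
    """Render validated intent records as SDC command lines."""
    lines: List[str] = []
    known_clocks = set()
    for clk in clocks:
        if clk.period_ns <= 0:
            raise ValueError(f"clock {clk.port}: period must be positive")
        lines.append(f"create_clock -name {clk.port} "
                     f"-period {clk.period_ns:.3f} [get_ports {clk.port}]")
        known_clocks.add(clk.port)
    for dly in input_delays:
        # Reject delays that reference a clock the model never defined.
        if dly.clock not in known_clocks:
            raise ValueError(f"input delay on {dly.port}: unknown clock {dly.clock}")
        lines.append(f"set_input_delay -clock {dly.clock} "
                     f"{dly.delay_ns:.3f} [get_ports {dly.port}]")
    return lines


sdc = to_sdc([ClockIntent("clk", 10.0)],
             [InputDelayIntent("din", "clk", 2.0)])
```

Validation at this layer catches a common failure mode of generated constraints, namely references to objects that do not exist, before the file reaches the synthesis tool.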

==Floorplan and Layout Synthesis==

Physical design requires careful placement and routing. LLM-vision hybrid models such as LayoutCopilot and ChatEDA employ:

* Vision-language modeling to interpret and manipulate layout imagery (DEF/GDSII),
* TCL script generation, customized for tools like [https://www.cadence.com/en_US/home/tools/digital-design-and-signoff/soc-implementation-and-floorplanning/innovus-implementation-system.html Innovus] and [https://www.synopsys.com/implementation-and-signoff/physical-implementation/ic-compiler.html ICC2],
* Automatic power grid and macro placement proposals, based on learned design intents.

==Analog Circuit Synthesis==

Analog design poses unique challenges due to its sensitivity and lack of digital abstraction. Tools like AnalogCoder and LaMAGIC use:

* Topology suggestion via LLMs, based on specification matching (gain, slew, bandwidth),
* Layout constraint prediction, such as symmetry, matching, and parasitic awareness,
* Bayesian optimization and tuning, informed by LLM predictions for transistor sizing and performance trade-offs.

These methodologies collectively depict LLMs as design agents capable of integrating with CAD flows, reasoning over heterogeneous inputs (text, code, specs, layout), and adapting to domain-specific constraints. As tools mature, the distinction between synthesis, verification, and optimization continues to blur—paving the way for closed-loop, autonomous hardware design.

Among these, HDL generation has emerged as one of the most deeply investigated tasks in LLM-aided EDA research, serving as a methodological testbed for broader design automation challenges. It captures the full interplay between natural language, symbolic code, feedback refinement, and tool integration. The following case study synthesizes key techniques employed in HDL generation workflows.

=Methodological Classification of HDL Generation: A Case Study=

The following table, drawing on recent papers including the 2025 survey by Pan et al., summarizes the methodologies underlying LLM-aided HDL generation:

{| class="wikitable"
|+ HDL Generation Methodologies Using LLMs
! Project Name !! Model Used !! Approach Type !! Summary
|-
| RTLLM || GPT-3.5 || Prompt Engineering || Multi-step planning-based prompt design with syntax and functional log feedback.
|-
| Chip-Chat<ref>Blocklove, Jason; Garg, Siddharth; Karri, Ramesh; and Pearce, Hammond. Chip-Chat: Challenges and Opportunities in Conversational Hardware Design. In: Proceedings of the 2023 ACM/IEEE 5th Workshop on Machine Learning for CAD (MLCAD), IEEE, Sept. 2023, pp. 1–6. [https://doi.org/10.1109/MLCAD58807.2023.10299874 Available online]</ref> || ChatGPT-4 || Conversational Co-design || Full pipeline HDL synthesis guided via interactive dialogue with GPT-4.
|-
| VeriGen || CodeGen-16B || Fine-tuning || Trained on textbook + GitHub Verilog, improved synthesis-valid output, syntax robustness.
|-
| ChatEDA || LLaMA-20B || QLoRA + Instruction Tuning || Trained on GPT-4-generated EDA instructions; interprets and executes user commands.
|-
| RTLCoder || Mistral-7B || Scored SFT || Uses synthesis scores to steer SFT toward functionally valid and resource-efficient HDL.
|-
| BetterV<ref>Pei, Zehua; Zhen, Hui-Ling; Yuan, Mingxuan; Huang, Yu; and Yu, Bei. BetterV: Controlled Verilog Generation with Discriminative Guidance. Proceedings of the 41st International Conference on Machine Learning (ICML '24), Article 1628, 9 pages. JMLR.org, 2024. [https://proceedings.mlr.press/v202/pei24a.html Available online]</ref> || CodeLlama + TinyLlama || Controlled Gen + SFT || Bayesian discriminator modifies token probability for valid HDL output.
|-
| RTLFixer || GPT-4 || RAG + Agent Framework || Uses ReAct prompting and error categorization DBs for debug-oriented HDL refinement.
|}

These methods highlight key trends and research frontiers:

* Prompting + Logs: RTLLM illustrates that prompting, when combined with feedback from toolchains, can yield competitive HDL generation without model retraining.
* Fine-tuning on RTL: VeriGen and RTLCoder show that focused fine-tuning, especially when guided by quality signals (e.g., synthesis logs, functional correctness), improves output robustness.
* Controlled Generation: BetterV applies probabilistic controls during token sampling, moving Verilog generation beyond plain maximum-likelihood decoding.
* Agent Architectures: RTLFixer represents an emerging paradigm in which LLMs serve not just as code generators but as self-refining agents that read logs, trace waveforms, and perform symbolic analysis.
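
The idea behind controlled generation can be shown with a toy reweighting step. This is a generic sketch of discriminator-guided decoding, not BetterV's actual formulation: the token probabilities, the discriminator scores, and the `strength` parameter are all invented for illustration.

```python
from typing import Dict


def guided_distribution(lm_probs: Dict[str, float],
                        disc_scores: Dict[str, float],
                        strength: float = 1.0) -> Dict[str, float]:
    """Reweight next-token probabilities by a discriminator score, then renormalize.

    Each token weight is p_lm(token) * score(token) ** strength; tokens without
    a score are left unchanged (score 1.0).
    """
    weighted = {tok: p * (disc_scores.get(tok, 1.0) ** strength)
                for tok, p in lm_probs.items()}
    total = sum(weighted.values())
    if total == 0:
        raise ValueError("all candidate tokens were assigned zero weight")
    return {tok: w / total for tok, w in weighted.items()}


# Toy example: the language model prefers 'endmodule', but the (hypothetical)
# discriminator judges that ending the module here would be invalid.
lm_probs = {";": 0.30, "endmodule": 0.50, "begin": 0.20}
disc_scores = {";": 0.9, "endmodule": 0.1, "begin": 0.5}

guided = guided_distribution(lm_probs, disc_scores)
best = max(guided, key=guided.get)
```

With these numbers the guided distribution favors `;` even though the unguided model ranked `endmodule` first, which is the effect the "beyond maximum-likelihood decoding" item describes.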

The table also highlights the significance of multi-agent collaboration, retrieval-augmented generation (RAG), and tool-in-the-loop frameworks, which move beyond simple completion tasks toward autonomous reasoning and repair. The performance advantages of fine-tuned and multi-modal frameworks over plain prompting, as shown in benchmarks like VerilogEval<ref>Pinckney, Nathaniel; Batten, Christopher; Liu, Mingjie; Ren, Haoxing; and Khailany, Brucek. Revisiting VerilogEval: A Year of Improvements in Large-Language Models for Hardware Code Generation. *ACM Transactions on Design Automation of Electronic Systems* (TODAES), Association for Computing Machinery, February 2025. [https://doi.org/10.1145/3718088 Available online]</ref> and PyHDL-Eval<ref>Batten, Christopher; Pinckney, Nathaniel; Liu, Mingjie; Ren, Haoxing; and Khailany, Brucek. PyHDL-Eval: An LLM Evaluation Framework for Hardware Design Using Python-Embedded DSLs. In: Proceedings of the 2024 ACM/IEEE International Symposium on Machine Learning for CAD (MLCAD '24), ACM, 2024, article 10, pp. 1–17. [https://doi.org/10.1145/3670474.3685948 Available online]</ref>, suggest that tightly integrated model-tool co-evolution is needed for engineering-grade HDL generation.

Datasets and Evaluation Infrastructure

Large language models in EDA are developed, tuned, and evaluated using domain-specific datasets. These datasets span a range of formats, from performance metrics and natural language requirements to tokenized Verilog corpora and annotated tool logs. They enable supervised fine-tuning, domain adaptation, and benchmarking of synthesis validity and generation quality.

In addition to increasing dataset volume, recent initiatives have improved granularity and diversity. Instruction-tuning datasets such as the one used for ChatEDA teach LLMs how to interact with toolchains; benchmark sets such as VerilogEval assess model output quality; and design-level corpora like RTLCoder and [https://huggingface.co/datasets/GaTech-EIC/MG-Verilog MG-Verilog] offer structural annotations and synthesis metadata. MG-Verilog also provides human-annotated multilingual Verilog pairs that facilitate abstraction and cross-language translation. The VeriGen dataset uses textbook-derived Verilog tasks to support foundational, pedagogically oriented fine-tuning.
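
Code-generation benchmarks of this kind commonly score models with the pass@k metric: the probability that at least one of k sampled completions passes the functional tests. The sketch below uses the standard unbiased estimator from the code-generation literature; that a given benchmark reports exactly this metric is an assumption here, since the text above does not name one.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate from n samples, of which c passed.

    Computes 1 - C(n - c, k) / C(n, k): one minus the probability that a
    random size-k subset of the n samples contains no passing sample.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    if n - c < k:
        return 1.0  # every size-k subset must include a passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)


# Example: 10 completions sampled for one problem, 3 of them pass simulation.
p1 = pass_at_k(n=10, c=3, k=1)
p5 = pass_at_k(n=10, c=3, k=5)
```

Per-problem estimates are then averaged over the benchmark; reporting several values of k shows how much repeated sampling helps.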

Tooling and Infrastructure: Practical Deployments

Several practical tools now demonstrate that LLM-aided design is no longer theoretical:

* ChatEDA: Serves as a natural language interface for controlling Vivado, [https://www.intel.com/content/www/us/en/products/details/fpga/development-tools/quartus-prime.html Quartus], or [https://www.cadence.com/en_US/home/tools/digital-design-and-signoff/soc-implementation-and-floorplanning/innovus-implementation-system.html Innovus] workflows. It interprets user intent and translates it into tool-specific commands.
* RTLLM-Editor: An IDE that integrates real-time HDL generation, compilation feedback, and syntax repair.
* LLM4DV and AutoSVA: Specialized for formal verification, these tools generate SystemVerilog assertions and support coverage-driven testbench synthesis.

These tools reflect growing operational maturity and are being integrated into prototyping, verification, closure, and constraint generation workflows.