
INTERNATIONAL JOURNAL OF DESIGN, ANALYSIS AND TOOLS FOR INTEGRATED CIRCUITS AND SYSTEMS

The International Journal of Design, Analysis and Tools for Integrated Circuits and Systems (IJDATICS) was created by a network of researchers and engineers from both academia and industry. IJDATICS is an international journal intended for professionals and researchers in all fields of design, analysis and tools for integrated circuits and systems. The objective of IJDATICS is to foster a better understanding between the communities of researchers and practitioners in academia and industry.

Vijayakumar Nanjappan Jie Zhang University College Cork, Ireland Xi'an Jiaotong-Liverpool University Hui-Huang Hsu Tamkang University, Taiwan Editor-in-Chief Ka Lok Man Xi'an Jiaotong-Liverpool University, China Associate Editors Danny Hughes Katholieke Universiteit Leuven, Belgium M L Dennis Wong Heriot-Watt University, Scotland Editorial Board Yuxuan Zhao Kamran Siddique Xi'an Jiaotong-Liverpool University, China University of Alaska Anchorage Tomas Krilavičius Young B. Park Vytautas Magnus University, Lithuania Dankook University, Korea Vladimir Hahanov Salah Merniz Kharkov National University of Radio Electronics, Ukraine Paolo Prinetto Politecnico di Torino, Italy Massimo Poncino Politecnico di Torino, Italy Alberto Macii Politecnico di Torino, Italy Joongho Choi University of Seoul, South Korea Wei Li Fudan University, China Michel Schellekens University College Cork, Ireland Emanuel Popovici University College Cork, Ireland Jong-Kug Seon LS Industrial Systems R&D Center, South Korea Umberto Rossi STMicroelectronics, Italy Franco Fummi University of Verona, Italy Graziano Pravadelli University of Verona, Italy Vladimir PavLov Intl.
Software and Productivity Engineering Institute, USA Ajay Patel Intelligent Support Ltd, United Kingdom Thierry Vallee Georgia Southern University, USA Menouer Boubekeur University College Cork, Ireland Monica Donno Minteos, Italy Jun-Dong Cho Sung Kyun Kwan University, South Korea AHM Zahirul Alam International Islamic University Malaysia, Malaysia Gregory Provan University College Cork, Ireland Miroslav N. Velev Aries Design Automation, USA M. Nasir Uddin Lakehead University, Canada Dragan Bosnacki Eindhoven University of Technology, The Netherlands Dave Hickey University College Cork, Ireland Maria OKeeffe University College Cork, Ireland Milan Pastrnak Siemens IT Solutions and Services, Slovakia John Herbert University College Cork, Ireland Zhe-Ming Lu Sun Yat-Sen University, China Jeng-Shyang Pan National Kaohsiung University of Applied Sciences, Taiwan Chin-Chen Chang Feng Chia University, Taiwan Mong-Fong Horng Shu-Te University, Taiwan Liang Chen University of Northern British Columbia, Canada Chee-Peng Lim University of Science Malaysia, Malaysia Ngo Quoc Tao Vietnamese Academy of Science and Technology, Vietnam Mentouri University, Algeria Oscar Valero University of Balearic Islands, Spain Yang Yi Sun Yat-Sen University, China Damien Woods University of Seville, Spain Franck Vedrine CEA LIST, France Bruno Monsuez ENSTA, France Kang Yen Florida International University, USA Takenobu Matsuura Tokai University, Japan R. Timothy Edwards MultiGiG, Inc., USA Olga Tveretina Karlsruhe University, Germany Maria Helena Fino Universidade Nova De Lisboa, Portugal Adrian Patrick ORiordan University College Cork, Ireland Grzegorz Labiak University of Zielona Gora, Poland Jian Chang Texas Instruments Inc, USA Yeh-Ching Chung National Tsing-Hua University, Taiwan Anna Derezinska Warsaw University of Technology, Poland Kyoung-Rok Cho Chungbuk National University, South Korea Yong Zhang Shenzhen University, China R.
Liutkevicius Vytautas Magnus University, Lithuania Yuanyuan Zeng University College Cork, Ireland D.P. Vasudevan University College Cork, Ireland Arkadiusz Bukowiec University of Zielona Gora, Poland Maziar Goudarzi University College Cork, Ireland Jin Song Dong National University of Singapore, Singapore Dhamin Al-Khalili Royal Military College of Canada, Canada Zainalabedin Navabi University of Tehran, Iran Lyudmila Zinchenko Bauman Moscow State Technical University, Russia Muhammad Almas Anjum National University of Sciences and Technology, Pakistan Deepak Laxmi Narasimha University of Malaya, Malaysia Danny Hughes Xi'an Jiaotong-Liverpool University, China Jun Wang Fujitsu Laboratories of America, Inc., USA A.P. Sathish Kumar PSG Institute of Advanced Studies, India N. Jaisankar VIT University, India Atif Mansoor National University of Sciences and Technology, Pakistan Steven Hollands Synopsys, Ireland Felipe Klein State University of Campinas, Brazil Enggee Lim Xi'an Jiaotong-Liverpool University, China Kevin Lee Murdoch University, Australia Prabhat Mahanti University of New Brunswick, Saint John, Canada Tammam Tillo Xi'an Jiaotong-Liverpool University, China Yanyan Wu Xi'an Jiaotong-Liverpool University, China Wen Chang Huang Kun Shan University, Taiwan Masahiro Sasaki The University of Tokyo, Japan Vineet Sahula Malaviya National Institute of Technology, India D. Boolchandani Malaviya National Institute of Technology, India Zhao Wang Xi'an Jiaotong-Liverpool University, China Shishir K. Shandilya NRI Institute of Information Science & Technology, India J.P.M. Voeten Eindhoven University of Technology, The Netherlands Wichian Sittiprapaporn Mahasarakham University, Thailand Aseem Gupta Freescale Semiconductor Inc., USA Kevin Marquet Verimag Laboratory, France Matthieu Moy Verimag Laboratory, France Ramy Iskander LIP6 Laboratory, France Suryaprasad Jayadevappa PES School of Engineering, India S. Hariharan B. S.
Abdur Rahman University, India Chung-Ho Chen National Cheng-Kung University, Taiwan Kyung Ki Kim Daegu University, South Korea Shiho Kim Chungbuk National University, South Korea Hi Seok Kim Cheongju University, South Korea Siamak Mohammadi University of Tehran, Iran Brian Logan University of Nottingham, UK Ben Kwang-Mong Sim Gwangju Institute of Science & Technology, South Korea Asoke Nath St. Xavier's College, India Tharwon Arunuphaptrairong Chulalongkorn University, Thailand Shin-Ya Takahasi Fukuoka University, Japan Cheng C. Liu University of Wisconsin at Stout, USA Farhan Siddiqui Walden University, Minneapolis, USA Yui Fai Lam Hong Kong University of Science & Technology, Hong Kong Jinfeng Huang Philips & LiteOn Digital Solutions, The Netherlands

Assistant Editor-in-Chief
Shuaibu Musa Adam, Katholieke Universiteit Leuven, Belgium

Publisher
Cooperation Name: Solari Co., Hong Kong
Address: Unit 1-5, 20/F, Midas Plaza, 1 Tai Yau Street, San Po Kong, Kowloon, Hong Kong
Phone: (852) 3966-2536
ISSN: 2071-2987 (online version), 2223-523X (print version)
https://www.cicet.org/ijdatics/

Preface

Welcome to Volume 14, Number 1 of the International Journal of Design, Analysis and Tools for Integrated Circuits and Systems (IJDATICS).
This issue brings together six diverse studies that collectively illustrate frontiers in AI, the Internet of Things (IoT), Integrated Circuits and Systems, and Computer Engineering Technology. Three key themes are evident in these papers:

• Intelligent Agents and Workflow Automation: Two papers explore how AI systems can be architected to execute complex tasks with minimal human intervention.
• Feature Enhancement and Signal Parsing: Two papers address the pervasive challenge of extracting accurate insights from data characterized by high noise, sparsity, or modal heterogeneity.
• Algorithmic Applications and Human-AI Interaction: Two papers focus on the practical deployment of intelligent systems in specific domains, addressing both the optimization of computational strategies and the evaluation of AI efficacy in collaborative environments.

We would also like to thank the IJDATICS editorial team, which is led by:

Editor-in-Chief
Ka Lok Man, Xi'an Jiaotong-Liverpool University, China

Guest Editors
Jie Zhang, Xi'an Jiaotong-Liverpool University, China
Yuxuan Zhao, Xi'an Jiaotong-Liverpool University, China

Assistant Editor-in-Chief
Shuaibu Musa Adam, Katholieke Universiteit Leuven, Belgium

Table of Contents
Vol. 14, No. 1, June 2025

1. Jia-Yang Jiang, Wan-Chi Yang, Shih-Jung Wu and Chih-Yung Chang, A Self-Learning Multimodal Pet Assistant Based on RAG-Enhanced, Tamkang University, Taiwan
2. Ting Yih Chang, Yi-Ti Lin, Chih-Yung Chang, AI Agent-Driven Procurement Automation with n8n Integration, Tamkang University, Taiwan
3. Tsang-Yu Lin, Ho Thi Trang, Chung-Chih Lin, Enhancing Segmentation Performance for Cellular and Subcellular Structures in Micrographs: Leveraging ROI, Neighbor Extraction, and Size Constraints, Tamkang University, Taiwan
4. Tzu-Chia Huang, Chih-Yung Chang, Robust Feature Extraction and Adaptive Denoising for Underwater Bioacoustic Recognition, Tamkang University, Taiwan
5. Xiaomei Fang, Jing Zhu, DeepSeek’s Effectiveness in Providing Feedback on Academic Writing in Higher Education, Xi’an Jiaotong-Liverpool University, China
6. Jingya Sun, Pu Wang, Long-Term Dynamic Location Recommendation for Large-Scale Movable Facility Allocation, Soochow University, China

A Self-Learning Multimodal Pet Assistant Based on RAG-Enhanced Response and Expert Verification

Jia-Yang Jiang (a), Wan-Chi Yang (b), Shih-Jung Wu (a), and Chih-Yung Chang (a), Member, IEEE.
(a) The Department of Computer Science and Information Engineering, Tamkang University.
(b) National Taipei University of Nursing and Health Sciences.
Email: 812414034@o365.tku.edu.tw, wanchi@ntunhs.edu.tw, wushihjung@mail.tku.edu.tw and cychang@mail.tku.edu.tw.

Abstract: This paper introduces a self-learning multimodal pet assistant tailored for real-time group chat environments, where pet owners frequently seek urgent, community-based support. Leveraging a Retrieval-Augmented Generation (RAG) architecture, the system processes inputs across text, image, audio, and video modalities, enabling timely, accurate, and context-aware responses to a wide range of pet-related inquiries.
A language model-based routing mechanism assesses the complexity and risk level of each query to determine whether veterinary expert review is needed, thereby enabling a conditional escalation process that balances AI autonomy with professional oversight. To support personalized and coherent multi-turn interactions, the assistant incorporates user-specific memory encoding that tracks individual preferences and historical queries. It also features a community consensus mechanism that captures socially validated information from group interactions, allowing the knowledge base to grow over time through both expert annotation and user agreement signals. This dual-source learning strategy enhances the system’s adaptability and trustworthiness. While the system’s core components have been implemented, empirical evaluation and deployment testing are ongoing. Early validation focuses on the accuracy of multimodal retrieval, the relevance of generated responses, and the effectiveness of the expert-in-the-loop verification workflow. The proposed framework offers a scalable, adaptive solution for supporting both everyday pet care and emergency scenarios in digitally connected communities, contributing to the development of responsible AI applications in animal health and welfare.

Index Terms: Pet health, Retrieval-Augmented Generation (RAG), multi-modal dialogue system, personalized chatbot, vector similarity, veterinary collaboration, group knowledge mining

I. INTRODUCTION

The growing integration of companion animals into urban households has led to a significant transformation in the landscape of pet ownership. As pets increasingly assume the role of family members, there is a parallel surge in the demand for accessible, timely, and expert-level pet care guidance.
In densely populated cities, where veterinary resources may be limited or unevenly distributed, pet owners frequently turn to digital platforms, particularly community chat environments such as LINE groups, for support. Within these informal support networks, urgent inquiries are often raised, ranging from the accidental ingestion of toxic foods and medications to acute behavioral disturbances and post-surgery care concerns. However, the asynchronous and unstructured nature of such group interactions often results in delayed responses or a complete lack of expert input, leaving pet owners vulnerable during critical moments. These gaps underscore the need for a real-time, domain-specific support system that combines conversational understanding with actionable, trustworthy advice.

To address this issue, this work presents a self-improving pet assistant system built on a Retrieval-Augmented Generation (RAG) architecture [1]. The system is engineered to detect unanswered or inadequately addressed questions within group chat conversations and to proactively deliver contextually appropriate, multimodal responses that align with user intent and urgency. Core architectural innovations include transformer-based temporal encoding to track dialogue evolution, user-specific memory modeling for personalized assistance, and a human-in-the-loop (HITL) escalation mechanism to ensure factual reliability and ethical soundness in complex cases. By integrating expert validation with machine learning-driven inference, the system bridges the gap between community-generated content and professional-grade guidance.

Further distinguishing this architecture is its knowledge growth capability, which enables continuous improvement through the dynamic assimilation of user interactions and emergent community consensus. This involves tracking repeated behavioral patterns, refining retrieval accuracy based on feedback loops, and incorporating structured annotations from veterinary experts.
The assistant thus becomes more robust and personalized over time, providing increasing value with prolonged use.

Recent applications of deep learning and natural language processing in vertical domains have demonstrated the technical feasibility and societal relevance of real-time, specialized AI systems. For example, in legal technology, neural models have been effectively applied to speaker identification and domain-specific terminology extraction for transcription workflows [2]. In the field of education, plagiarism detection systems have adopted coarse-to-fine semantic similarity frameworks to evaluate writing originality at multiple levels of abstraction [3]. Healthcare research has embraced AIoT-powered image recognition to support wound prognosis monitoring in long-term care environments, where resource constraints mirror those of veterinary contexts [4]. In urban planning, skip-gram-based vector space modeling has been leveraged to recommend restaurant locations by analyzing semantic patterns of consumer reviews and spatial behavior [5]. These studies collectively emphasize the growing trend toward AI systems tailored to highly contextualized, domain-specific environments, a paradigm that directly informs the design philosophy and implementation strategy of our proposed pet assistant system.

INTERNATIONAL JOURNAL OF DESIGN, ANALYSIS AND TOOLS FOR INTEGRATED CIRCUITS AND SYSTEMS, VOL. 14, NO. 1, JUNE 2025

II. RELATED WORK

Dialogue systems have undergone a rapid and transformative evolution over the past decades, beginning with early rule-based agents such as ELIZA and progressing to today's transformer-based large language models (LLMs), including ChatGPT and Claude. These modern systems exhibit remarkable fluency and contextual awareness. However, despite such advancements, traditional end-to-end generative dialogue models often suffer from limitations in factual grounding, domain-specific reasoning, and long-term context retention.
These deficiencies are particularly pronounced in high-stakes applications such as healthcare, law, and pet care, where misinformation can lead to real-world consequences.

To address the need for grounded generation, Retrieval-Augmented Generation (RAG) has emerged as a powerful paradigm that combines neural retrieval with autoregressive generation. RAG enables systems to dynamically access external knowledge sources, thereby enhancing factual consistency and response relevance. This architecture has shown efficacy across a spectrum of knowledge-intensive NLP tasks, including weakly supervised anomaly detection [6], [7], graph-based violence detection and classification [8], and open-domain question answering. By grounding outputs in retrieved evidence, RAG-based systems offer a more verifiable and context-aware response pipeline.

Concurrently, recent research has emphasized the need for model interpretability, bias mitigation, and personalization in neural systems. For instance, ensemble-based interpretable architectures [9] have been applied to enhance transparency in decision-making, while post-hoc explanation frameworks reveal latent correlations in neural representations [10], including mitigation strategies for Shapley value-based bias in feature attribution [11]. Such explainability measures are especially vital in domains like pet care, where recommendations involving diet, medication, or behavior must be both reliable and understandable to non-expert users.

Moreover, advances in multimodal learning pipelines have facilitated cross-domain and cross-sensory alignment. For example, explainable multimodal models have been used to generate educational content through synchronized visual-verbal reasoning [12], while hybrid 3D reconstruction techniques integrate vision and language for precise cross-modal representation alignment [13].
These developments inform the design of pet dialogue systems that must interpret and generate content across images (e.g., rashes, x-rays), text (symptoms, histories), and conversational cues.

In terms of real-world deployment, many systems now incorporate human-in-the-loop (HITL) mechanisms to ensure reliability and correctness. HITL approaches have proven effective in domain-sensitive pipelines, such as BUAS, which uses bottom-up article selection for textual similarity in scholarly content [14]. In retrieval-augmented systems, HITL mechanisms are used to conditionally escalate uncertain or potentially harmful outputs to experts, blending automation with oversight. These methods are particularly pertinent for pet care contexts, where urgent and nuanced cases demand both speed and precision.

Our proposed architecture builds on and extends these foundational works by embedding RAG into a pet-oriented dialogue assistant. Key innovations include semantic retrieval tailored to veterinary and behavioral domains, user memory encoding for personalization over time, conditional expert routing for HITL escalation, and continual knowledge integration through consensus tracking and system retraining. Together, these components create a robust, explainable, and adaptive dialogue system capable of delivering timely and trustworthy guidance in informal, real-world group chat environments.

III. RESEARCH METHODS

The proposed system architecture adopts a multimodal processing pipeline that integrates visual, auditory, and textual modalities for real-time pet-related query resolution. The architecture supports context-sensitive response generation, dynamic memory adaptation, and expert-informed escalation.

A. Multimodal Input Representation

We denote the sequence of incoming user messages as a set of modality-specific utterances. Each input $x_i^m$ from modality $m \in \{\text{text}, \text{image}, \text{audio}, \text{video}\}$ is encoded using a pretrained encoder $f_m(\cdot)$, producing an embedding $e_i^m = f_m(x_i^m)$ projected into a shared semantic space $\mathbb{R}^d$, where $d$ typically ranges from 256 to 1024 depending on encoder configuration. This common space enables uniform downstream processing.

B. Frame-Level Feature Extraction and Temporal Encoding

For modalities with a sequential nature, such as audio or video, we divide the input into segments across time and extract features $\{f_1, f_2, \ldots, f_T\}$. These are passed into a modality-specific Transformer encoder, which captures cross-frame dependencies and outputs a temporally aware representation $F^{(m)} \in \mathbb{R}^d$. Rather than detailing every intermediate transformation, we treat this as a learned function that summarizes time-dependent signals into a fixed-size embedding.

C. Modality Fusion and Representation Aggregation

Each modality-specific representation $h_m$ is weighted by a learnable scalar $\alpha_m$, and the final multimodal representation is computed as a weighted sum:

$$h_{\text{fused}} = \sum_{m} \alpha_m \cdot h_m \tag{1}$$

The weights $\alpha_m$ are normalized such that $\sum_{m} \alpha_m = 1$, ensuring a balanced fusion across modalities.

D. Retrieval-Augmented Generation and Similarity-Based Inference

Given a fused input vector $h_{\text{fused}}$, we search a knowledge base $\mathcal{K} = \{(\mathbf{k}_j, \mathbf{t}_j)\}$, where $\mathbf{k}_j$ is the embedding of knowledge entry $j$ and $\mathbf{t}_j$ is its corresponding text. The similarity between the input and a knowledge entry is measured using cosine similarity:

$$\operatorname{sim}(h_{\text{fused}}, \mathbf{k}_j) = \frac{h_{\text{fused}} \cdot \mathbf{k}_j}{\lVert h_{\text{fused}} \rVert \, \lVert \mathbf{k}_j \rVert} \tag{2}$$

The top-$k$ most similar entries form the context set $C$, which is prepended to the query and passed into a pretrained language model $\mathrm{LLM}(\cdot)$ for grounded response generation:

$$y = \mathrm{LLM}(h_{\text{fused}}, C) \tag{3}$$

E. Expert Verification and Conditional Escalation

To ensure the safety and accuracy of responses, particularly in scenarios involving medical or high-risk content, the system includes a conditional expert verification mechanism.
Upon receiving a user query, the system first encodes the input into a semantic vector $h_{\text{fused}}$ through multimodal fusion. The system then uses a specialized classification prompt to determine whether the query requires professional veterinary review. If the query is classified as non-critical, the system-generated response is returned to the user directly. However, if the system determines that the question falls within the scope of veterinary expertise (such as health symptoms, drug-related issues, or ambiguous visual inputs), the assistant does not immediately return the generated response. Instead, it routes the output, along with the query context, to a licensed veterinarian via a dedicated review interface.

In this stage, the veterinarian plays a verifying role. If the AI-generated response is deemed accurate and safe, the expert approves it, and the system returns it to the user. If modifications are necessary, the expert provides a corrected response $y^{*}$, which then replaces the original answer and is delivered to the user. This ensures that the final output in high-risk cases is professionally vetted. Additionally, both the original AI response and the expert-modified version are stored for system learning and audit, enabling the assistant to improve its judgment and reduce unnecessary escalations over time. This process strikes a balance between automated efficiency and expert responsibility, maintaining user trust while ensuring reliability in medical scenarios.

F. User Memory Encoding and Personalization

For each user $u$, we maintain a memory bank $\mathcal{M}_u = \{(m_i, x_i)\}$, where $m_i$ is the embedding of a past utterance $x_i$. At response time, the system computes similarity between the current query and historical entries.
Attention weights $\omega_i$ are computed as:

$$\omega_i = \frac{\exp\left(\operatorname{sim}(h_{\text{fused}}, m_i)/\tau\right)}{\sum_{j} \exp\left(\operatorname{sim}(h_{\text{fused}}, m_j)/\tau\right)} \tag{4}$$

Here, $\tau$ is a temperature parameter that controls the sharpness of the attention distribution; lower values of $\tau$ result in more focused attention on the most relevant memory entries. Using these weights, a personalized memory vector $m_u = \sum_i \omega_i m_i$ is constructed as the weighted sum over historical embeddings. This memory vector is then used to condition the response generation process, enhancing personalization and contextual continuity in multi-turn conversations.

G. Community Knowledge and Consensus Storage

To support long-term knowledge accumulation, the system leverages group consensus as a signal for persistent learning. When a generated or user-contributed message $y$ receives sufficient acknowledgment, such as reactions or replies, from members of the user group $U$, a consensus score is calculated as:

$$\kappa(y) = \frac{1}{|U|} \sum_{u \in U} \mathbb{I}_u(y) \tag{5}$$

Here, $\mathbb{I}_u(\cdot)$ is an indicator function that returns 1 if user $u$ has acknowledged message $y$, and 0 otherwise. This score reflects the proportion of users who endorsed the message. If the consensus score exceeds a predefined threshold $\theta$, the embedding $k_y$ of the message is stored in the knowledge base $\mathcal{K}$, marking it as socially validated knowledge. This allows the system to incorporate community-approved insights into future response generation, enabling the assistant to learn not only from expert input but also from recurring agreement patterns among users.

IV. FUTURE WORK

At the current stage, the system framework and architectural components have been comprehensively designed and partially implemented. However, empirical evaluation, scalability testing, and user-centered optimization remain in progress. Future work focuses on validating the system across multiple dimensions (technical performance, user experience, and real-world applicability) under both controlled and semi-naturalistic conditions.
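To make the quantities under validation concrete, the formulas of Section III can be sketched in a few lines of plain Python: the weighted fusion of Eq. (1), the cosine-similarity retrieval of Eq. (2), the temperature-scaled memory weights of Eq. (4), and the consensus score of Eq. (5). This is a minimal illustration rather than the authors' implementation. The function names, the toy vectors, the user names, the threshold value, and the choice of a softmax to normalize the fusion weights are all assumptions, since the paper does not specify them.

```python
import math


def softmax(scores):
    """Numerically stable softmax; outputs are positive and sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_sum(weights, vectors):
    """Component-wise sum of w * v over paired weights and vectors."""
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]


def cosine_sim(u, v):
    """Cosine similarity of two equal-length vectors (Eq. 2)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def fuse(reps, logits):
    """Eq. 1. Assumption: alpha_m is a softmax over hypothetical logits."""
    return weighted_sum(softmax(logits), reps)


def top_k_context(h, knowledge_base, k):
    """Texts t_j of the k entries (k_j, t_j) whose keys are most similar to h."""
    ranked = sorted(knowledge_base, key=lambda kt: cosine_sim(h, kt[0]), reverse=True)
    return [text for _, text in ranked[:k]]


def memory_vector(h, memory, tau):
    """Eq. 4 weights, then the weighted sum m_u of past embeddings."""
    weights = softmax([cosine_sim(h, m) / tau for m in memory])
    return weights, weighted_sum(weights, memory)


def consensus_score(acknowledged_by, group):
    """Eq. 5: fraction of group members who acknowledged the message."""
    return len(set(acknowledged_by) & set(group)) / len(group)


# Toy data; every value below is hypothetical.
h_fused = fuse([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], [0.0, 0.0])  # text, image
kb = [([1.0, 1.0, 0.0], "chocolate is toxic to dogs"),
      ([0.0, 0.0, 1.0], "grooming schedule"),
      ([1.0, 0.0, 0.0], "lilies are toxic to cats")]
context = top_k_context(h_fused, kb, k=2)

memory = [[1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
sharp, m_u = memory_vector(h_fused, memory, tau=0.1)
flat, _ = memory_vector(h_fused, memory, tau=10.0)

kappa = consensus_score(["ann", "bo", "cy"], ["ann", "bo", "cy", "di"])
THETA = 0.6  # hypothetical threshold; the paper does not give a value
store = kappa > THETA
```

With these toy values, `context` holds the two toxicity entries, the lower temperature places nearly all of the memory weight on the closest entry while the higher temperature spreads it almost evenly, and the consensus score of 0.75 exceeds the assumed threshold, so the message would be stored.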
The multimodal input modules, encompassing image and video understanding, are presently undergoing rigorous validation using benchmark datasets such as the Oxford-IIIT Pet Dataset, with additional consideration of domain-specific extensions involving real-world pet imagery. A key area of investigation involves assessing the quality and robustness of visual embeddings under varying encoder architectures (e.g., ViT, ConvNeXt, CLIP) to determine their efficacy in supporting semantic grounding and context-aware retrieval. In particular, we are analyzing how visual cues, such as lesion patterns, posture, or facial expressions, can enhance or complement textual symptom descriptions, thereby improving overall response fidelity in the dialogue system.

In parallel, we are establishing a controlled experimental environment to systematically evaluate the system’s end-to-end performance across multiple metrics, including response quality, retrieval precision, generation latency, and dialogue coherence in pet-related group chat scenarios. This testbed simulates common user interactions such as emergency queries, routine care advice, and behavioral counseling, allowing us to assess how well the retrieval-augmented generation (RAG) pipeline adapts to varying information needs and temporal contexts. Additionally, we are closely monitoring the behavior of the user memory mechanism, which is designed to track user-specific preferences and interaction history across multi-turn exchanges. Performance here will be measured in terms of personalization accuracy and state consistency over time.

The human-in-the-loop (HITL) verification process involving licensed veterinarians is also being piloted. This component is critical for evaluating the balance between automated inference and expert oversight, particularly in high-risk or ambiguous situations.
We aim to quantify the escalation rate, expert response latency, and corrective intervention outcomes to refine the system’s conditional routing logic. This includes examining thresholds for uncertainty detection and criteria for knowledge conflicts.

Moreover, real user feedback is being actively collected through structured interviews, usage logs, and satisfaction surveys within small-scale deployment groups. Insights gained from this feedback loop will inform future fine-tuning strategies, including prompt engineering, reward-based reinforcement learning, and retrieval index updates. We are also exploring adaptive tuning based on community consensus signals, such as agreement ratios in group discussions and correction patterns from HITL annotations.

In the long term, the outcomes of these experiments will serve as the empirical foundation for a follow-up deployment phase, targeting community-based animal welfare networks, such as online rescue groups, veterinary teleconsultation hubs, and urban pet-owner alliances. Subsequent studies will document the results of scalability trials, generalization across pet species and breeds, and cross-linguistic adaptability, with the ultimate aim of validating the system’s readiness for large-scale, real-world deployment as a trusted digital companion for pet care and welfare.

V. CONCLUSION

This study presents a unified and intelligent pet assistant system specifically designed for group-based conversational platforms where timely and trustworthy information is critical. By integrating four key components, namely multimodal input processing, transformer-based temporal encoding, retrieval-augmented generation (RAG), and personalized user memory modeling, the system addresses the core challenge of delivering accurate, context-aware, and personalized responses to diverse pet-related inquiries in real-time social environments.
These technical innovations are tightly coupled with a commitment to user safety, particularly in scenarios involving potential medical emergencies, toxic ingestion, or behavioral crises. To this end, a human-in-the-loop (HITL) mechanism allows for conditional routing to licensed veterinarians, ensuring that low-confidence or high-risk responses are subjected to professional validation.

Beyond immediate question answering, the system is designed for continuous improvement and long-term adaptability. It incorporates adaptive learning mechanisms based on community consensus signals, such as repeated confirmations, corrections, and behavioral patterns within chat environments. This allows the assistant to evolve alongside its user base, refining its retrieval index, personalization strategies, and confidence estimation algorithms through real-world interaction data. The resulting feedback loop positions the system as a living knowledge agent, one that not only responds but also learns, adapts, and aligns with evolving community standards and best practices in pet care.

Although the architectural foundation has been carefully laid and several components have been partially implemented, the system is still undergoing comprehensive empirical evaluation. Current and upcoming phases of development include benchmarking multimodal retrieval accuracy, testing the temporal dynamics of transformer-based conversation tracking, and refining multi-turn user interaction models for continuity and relevance. In parallel, we are piloting and iterating on the HITL workflow, assessing the efficiency, scalability, and professional burden of expert escalation under real-world constraints.

Ultimately, this system aims to provide a scalable, reliable, and socially responsible solution for supporting urban pet owners, veterinary professionals, and animal welfare communities.
By combining advanced NLP, multimodal processing, and expert input, it enhances digital pet care while promoting a more connected and AI-assisted society. This work lays the foundation for future deployments across varied linguistic and cultural contexts, advancing accessible and responsible companion animal care.

REFERENCES

[1] P. Lewis, E. Perez, A. Piktus, et al., “Retrieval-augmented generation for knowledge-intensive NLP tasks,” in Adv. Neural Inf. Process. Syst. (NeurIPS), vol. 33, pp. 9459–9474, 2020.
[2] S. Jhang, S.-J. Jhang, C.-Y. Chang, S.-J. Wu, and D. S. Roy, “Courtroom transcription: A deep learning approach to legal terminology and speaker identification,” in Proc. Int. Conf. Consumer Electron.-Taiwan (ICCE-Taiwan), Taoyuan, Taiwan, 2024.
[3] C.-Y. Chang, S.-J. Jhang, S.-J. Wu, and D. S. Roy, “JCF: Joint coarse- and fine-grained similarity comparison for plagiarism detection based on NLP,” J. Supercomput., vol. 80, no. 1, pp. 363–394, 2024.
[4] C.-L. Chen, S.-C. Chiang, L.-P. Hung, and S.-J. Jhang, “Applying AIoT image recognition for prognosis of wound healing in long-term care residential facility,” Wireless Netw., vol. 30, no. 7, pp. 6523–6536, 2024.
[5] C.-Y. Chang, S.-J. Jhang, Y.-T. Yang, H.-C. Chang, and Y.-J. Chang, “Utilizing Skip-Gram for restaurant vector creation and its application in the selection of ideal restaurant locations,” in Proc. Int. Conf. Smart Grid and Internet of Things, Cham, Switzerland: Springer Nature, 2023.
[6] W.-D. Jiang, C.-Y. Chang, S.-C. Kuai, and D. S. Roy, “From Explicit Rules to Implicit Reasoning in Weakly Supervised Video Anomaly Detection,” arXiv preprint, arXiv:2410.21991, 2024.
[7] W.-D. Jiang, C.-Y. Chang, H.-C. Chang, and J.-Y. Chen, “Injecting Explainability and Lightweight Design into Weakly Supervised Video Anomaly Detection Systems,” arXiv preprint, arXiv:2412.20201, 2024.
[8] Wen-Dong Jiang, Chih-Yung Chang and Diptendu Sinha Roy, “Detection, Retrieval, and Explanation Unified: A Violence Detection System Based on Knowledge Graphs and GAT”, arXiv preprint, arXiv:2501.06224, 2025. [9] Yue-Shi Lee, Show-Jane Yen, Wendong Jiang, Jiyuan Chen and Chih- Yung Chang, “Illuminating the black box: An interpretable machine learning based on ensemble trees”, Expert Systems with Applications, Volume 272, Pp126720, Feb 2025. [10] Wen-Dong Jiang, Chih-Yung Chang, Show-Jane Yen and Diptendu Sinha Roy,“Explaining the Unexplained: Revealing Hidden Correlations for Better Interpretability” , arXiv preprint, arXiv:2412.01365, 2024. [11] Wen-Dong Jiang, Chih-Yung Chang, Show-Jane Yen, Shih-Jung Wu and Diptendu Sinha Roy,“RealExp: Decoupling correlation bias in Shapley values for faithful model interpretations”, Information Processing & Management, Volume 62, Issue 4, , Pp.104153, July 2025. [12] Huang, T. C., Chang, C. Y., Tsai, H. I., and Tao, H. S, “A Multimodal Learning Approach for Translating Live Lectures into MOOCs Materials”, In proceeding of the 2024 International Conference on Consumer Electronics-Taiwan (ICCE-Taiwan), Pp. 687-688, 2024. [13] T. -H. Tsai and J. -Y. Jiang, "3D Hand Mesh Reconstruction from Monocular Image by using Intermediate Representations," In proceeding of the 2023 International Conference on Consumer Electronics-Taiwan (ICCE-Taiwan), Pp. 565-566, 2023. [14] S. Jhang, S.-J. Jhang, and others, “BUAS: Joint bottom-up article selection for quick article similarity identification based on NLP,” Int. J. Design, Anal. Tools Integr. Circuits Syst., vol. 11, no. 2, 2022. INTERNATIONAL JOURNAL OF DESIGN, ANALYSIS AND TOOLS FOR INTERGRATED CIRCUITS AND SYSTEMS, VOL. 14, NO. 
1, JUNE 2025

Abstract: This study proposes a high-level automated procurement agent model by integrating AI Agent technology with the low-code automation platform n8n, aiming to establish an enterprise-grade procurement workflow deployable within a LINE Bot interface. By incorporating multimodal input comprehension, semantic extraction, modular command execution, and reactive workflow generation, the system can autonomously handle procurement requests, data validation, inventory lookup, supplier price comparison, managerial approval, and order notification, minimizing human intervention. Furthermore, a formalized modeling and reasoning architecture is introduced, which leverages mathematical constructs such as state-action representations, cost functions, and policy optimization. This enables the agent to reason over time-dependent procurement states, make cost-effective decisions, and adapt dynamically to varying inputs across modalities.

Index Terms: AI Agent, Procurement Automation, Low-code Workflow (n8n), Multimodal Semantic Parsing, LINE Bot Integration.

I. BACKGROUND AND MOTIVATION

Recent developments in artificial intelligence and automation technologies have prompted enterprises to rethink their internal operations and workflows, especially in repetitive yet mission-critical domains such as procurement. Traditionally, procurement workflows rely on manual verification, email-based approvals, and fragmented systems that make real-time coordination and responsiveness challenging. In small to medium-sized enterprises, the lack of IT infrastructure further exacerbates these inefficiencies. The growing accessibility of large language models (LLMs), low-code orchestration platforms like n8n, and communication bots such as LINE Bot offers a unique opportunity to transform procurement from a manual, document-centric process into a responsive, intelligent, and automated pipeline.
By leveraging AI Agents equipped with semantic reasoning, multimodal understanding, and process-aware decision-making, businesses can reduce human error, improve speed, and optimize cost across the procurement lifecycle. This paper aims to formalize and implement a workflow that captures this transformation, highlighting how AI Agents and low-code automation can jointly enable practical and scalable solutions for modern enterprise procurement.

II. INTRODUCTION

In recent years, rapid advancements in artificial intelligence, particularly in language models and agent-based systems, have opened new opportunities for enterprise process automation. Natural language processing (NLP) has evolved from rule-based systems into deep learning-driven semantic models, with large language models (LLMs) such as ChatGPT demonstrating powerful contextual understanding and task execution capabilities. These models have given rise to autonomous AI Agents equipped with memory, task planning, and environmental interaction abilities, transforming them into central elements of automation pipelines.

AI Agents are no longer limited to textual interaction but are increasingly capable of integrating speech and image modalities, while interfacing with heterogeneous systems like enterprise databases, email servers, and ERP modules. In domains such as procurement, these agents can handle end-to-end processes: from multimodal input (e.g., voice or images), information extraction via OCR or speech-to-text, and semantic parsing through LLMs, to price comparison, approval routing, and notification dispatch, all completed through automated workflows.

Simultaneously, low-code automation platforms like n8n and Zapier have emerged as crucial enablers for enterprises, especially SMEs without dedicated engineering teams. These tools offer modular and visual process design interfaces that make AI Agent orchestration highly flexible and scalable.
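The end-to-end procurement process outlined above (input, extraction, semantic parsing, price comparison, approval, notification) can be viewed as a dependency graph of stages, much like the nodes of a visual workflow. The sketch below is a minimal illustration in Python using the standard-library `graphlib` module rather than n8n itself; the stage names and the `execution_order` helper are hypothetical and chosen only for this example.

```python
from graphlib import TopologicalSorter

# Hypothetical stage names. Each key maps to the set of stages it depends on.
PIPELINE = {
    "ocr": {"receive_input"},             # image input -> text
    "speech_to_text": {"receive_input"},  # voice input -> text
    "semantic_parse": {"ocr", "speech_to_text"},
    "price_comparison": {"semantic_parse"},
    "approval_routing": {"price_comparison"},
    "notification": {"approval_routing"},
}


def execution_order(graph):
    """Return one valid stage order.

    Raises graphlib.CycleError if the graph contains a cycle,
    i.e. if it is not a directed acyclic graph.
    """
    return list(TopologicalSorter(graph).static_order())


order = execution_order(PIPELINE)
print(order)
```

Declaring only the dependencies and deriving the order keeps the two extraction stages independent of each other, so either one may run first, or both in parallel, without changing the downstream stages.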
n8n in particular provides robust node-based orchestration with support for GPT APIs, Whisper, Google Vision, MySQL, SMTP, and more, making it an ideal platform for implementing this research prototype.

In summary, this study builds a comprehensive AI procurement agent workflow system, integrating ChatGPT-based semantic modeling with multimodal inputs such as OCR and voice recognition, connected through API modules and rendered into a Directed Acyclic Graph (DAG) structure. Formal modeling and decision logic are employed to validate its system feasibility within the LINE Bot ecosystem.

III. RELATED WORK

Several recent studies have explored intelligent automation for procurement and enterprise workflows. Gupta et al. [1] proposed a hybrid ML-based framework for decision support in