Building Agentic Action Space Frameworks
Leo Borcherding
2024-07-18 · 10 min read
Announcing WillFreeAI: Revolutionizing AI Agentic Frameworks
Artificial intelligence is among the most rapidly advancing technologies in the world. From everyday people to companies and organizations, AI has driven revolutionary changes that will ultimately bear fruit for our world, our children, and our legacy. This future is bright: a world where automation has freed us from our limited resources and finite abilities. To climb this ladder, we must start looking at the bigger picture and ask ourselves what it is we are trying to build. For this, we must not only apply logical first-principles reasoning, but also apply ethical considerations to the capabilities of our agentic systems in their action spaces.
I am excited to announce the founding of WillFreeAI, an open-source AI software organization devoted to supervised fine-tuning, with an emphasis on multimodal super alignment through the construction of agentic framework action spaces and intercommunication protocols.
Ollama Agent Roll Cage (OARC)
Two months ago, I built the Ollama Agent Roll Cage (OARC) and have been refining it daily. OARC is a local Python agent that integrates Ollama LLMs with Coqui-TTS, Keras classifiers, LLaVA vision, and Whisper speech recognition to create a unified chatbot agent for local, custom automation. It manages chatbot creation through conversation history, model management, function calling, and an interaction space for Windows software, local files, and callable screenshots. OARC includes the /create command, allowing users to define a model and system prompt via text and speech commands. Join the OARC community on Discord for more help and to participate in community mod pack development.
Leoleojames1/ollama_agent_roll_cage: 🤖👽🤬💬 Boost your AI experience with this Ollama add-on!
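To make the /create command concrete, here is a minimal sketch of how such a chat command might be parsed. The function name and return shape are illustrative assumptions, not OARC's actual implementation:

```python
import re

def parse_create_command(text: str):
    """Parse a '/create <name> <system prompt>' style chat command.

    Returns a dict describing the new agent, or None if the text is
    not a /create command. Illustrative only; OARC's real parser and
    command set may differ.
    """
    match = re.match(r"^/create\s+(\S+)\s+(.+)$", text.strip(), re.DOTALL)
    if match is None:
        return None
    name, system_prompt = match.groups()
    return {"model_name": name, "system_prompt": system_prompt.strip()}

cmd = parse_create_command("/create c3po You are a fussy protocol droid.")
# cmd -> {"model_name": "c3po", "system_prompt": "You are a fussy protocol droid."}
```

In a full agent loop, the returned dict would be handed to Ollama's model-creation machinery to register the new persona; non-command text would fall through to the normal chat path.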
WillFreeAI and Our Vision
Building on the success of OARC, my colleagues and I at WillFreeAI are excited to announce two major software applications that will push the boundaries of AI development: Agent Roll Cage (ARC) and Agent Chef (AC).
Agent Roll Cage (ARC) V1.0
ARC will be WillFreeAI's primary deployment platform for libraries of agentic frameworks. It will host open-source fine-tuned models for specialized agents, including those for function calling, programming, LaTeX mathematics, synthetic dataset generation, and more. Our goal with ARC is to create a comprehensive platform that supports the development and deployment of advanced agentic systems, making powerful AI tools accessible to everyone.
Agent Chef (AC) V1.0
Agent Chef is our robust tool for dataset refinement, structuring, and generation. By leveraging procedural and synthetic dataset generation techniques, Agent Chef will enable users to refine and clean their fine-tuning data, eliminating data poisoning and low-quality knowledge bases. Additionally, it will provide templates, frameworks, and agents for constructing specialized fine-tuning datasets for function calling, programming, LaTeX, and other specific use cases. Agent Chef aims to revolutionize home-brewed AI by offering tools to create high-quality, domain-specific datasets.
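The cleaning step can be illustrated with a toy refinement pass over prompt/response pairs. This is a stdlib-only stand-in for what Agent Chef would do; a real pipeline would operate on Parquet files and apply much richer quality checks:

```python
def refine_dataset(rows):
    """Drop empty, too-short, and duplicate prompt/response pairs.

    A toy stand-in for a dataset-cleaning stage: real pipelines would
    read Parquet, score sample quality, and filter poisoned data.
    """
    seen = set()
    cleaned = []
    for row in rows:
        prompt = row.get("prompt", "").strip()
        response = row.get("response", "").strip()
        if len(prompt) < 5 or len(response) < 5:
            continue  # too short to carry useful training signal
        key = (prompt, response)
        if key in seen:
            continue  # exact duplicate
        seen.add(key)
        cleaned.append({"prompt": prompt, "response": response})
    return cleaned

raw = [
    {"prompt": "What is 2+2?", "response": "2 + 2 = 4."},
    {"prompt": "What is 2+2?", "response": "2 + 2 = 4."},  # duplicate
    {"prompt": "", "response": "orphan answer"},            # empty prompt
]
clean = refine_dataset(raw)  # one clean pair survives
```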
The ARC 1.0 Framework
Agentic Action Space
The agentic action space refers to the environment and scope within which an AI agent operates and makes decisions. This includes all the possible actions the agent can take, the interactions it can perform, and the outcomes it can influence. In the context of OARC and ARC, the agentic action space is designed to be flexible and expansive, incorporating various tools and models to enhance the agent's capabilities.
Function Calling Model
A function calling model is a specialized AI model designed to execute specific functions based on given inputs. These models can dynamically call and perform functions during interactions, enabling the agent to handle complex tasks such as data processing, mathematical computations, and more. In ARC, function calling models leverage Nomic embeddings and dynamic function calling mechanisms to enhance the agent's functionality.
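The core dispatch mechanism can be sketched in a few lines: a registry of callable tools, plus a dispatcher that executes a model-emitted JSON call. The registry contents and call format here are illustrative assumptions, not ARC's actual schema:

```python
import json

# Toy tool registry; a real agent would expose many more functions.
def add(a, b):
    return a + b

def word_count(text):
    return len(text.split())

REGISTRY = {"add": add, "word_count": word_count}

def dispatch(call_json: str):
    """Execute a model-emitted call of the form
    {"name": ..., "arguments": {...}}. Illustrative sketch only."""
    call = json.loads(call_json)
    fn = REGISTRY[call["name"]]
    return fn(**call["arguments"])

result = dispatch('{"name": "add", "arguments": {"a": 2, "b": 3}}')
# result -> 5
```

In the embedding-driven variant described above, the model would not emit the function name directly; instead, the user's request would be embedded and matched against embeddings of the registry entries to select which function to call.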
Nomic Embedding Model
A Nomic embedding model converts unstructured data into vector representations. These vectors encode semantic information about the data, making it easier for computers to manipulate and analyze based on meaning and context. Nomic embedding models are particularly useful in applications like semantic search, classification, and clustering. They can handle both text and image data, projecting them into a unified embedding space that performs strongly on a variety of retrieval tasks.
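Semantic search over embeddings reduces to ranking by cosine similarity. The sketch below uses hand-made 3-dimensional vectors as a stand-in; a real system would get high-dimensional vectors from a model such as nomic-embed-text-v1:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 3-d "embeddings"; a real model would emit hundreds of dimensions.
docs = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.95],
}

def semantic_search(query_vec, corpus):
    """Return corpus keys ranked by similarity to the query vector."""
    return sorted(corpus, key=lambda k: cosine(query_vec, corpus[k]),
                  reverse=True)

ranking = semantic_search([0.85, 0.15, 0.05], docs)
# ranking -> ["cat", "dog", "car"]
```

The same ranking primitive underlies the dynamic function calling and RAG components listed in the framework below: only the corpus changes (function descriptions, conversation turns, or retrieved documents).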
The following framework is the base for ARC 1.0, integrating a variety of components to create a versatile and powerful agentic system:
ARC 1.0 Framework diagram
- World Building, Define Reality and Action Space: The process starts with defining the reality and action space of the agent, setting the stage for the interaction environment.
- Whisper Speech Recognition: Converts spoken language into text, facilitating natural language processing. openai/whisper-large-v3
- General Large Language Model (LLM): The core of the framework, responsible for understanding and generating human-like text. meta-llama/Meta-Llama-3-8B
- Nomic Embedding: Conversation History: Maintains the context of conversations using Nomic embeddings. nomic-ai/nomic-embed-text-v1
- Parquet Datasets: Used for refining, synthesizing, and constructing data. sebdg/fine-tune-emotions
- Nomic Embedding: Dynamic Function Calling: Enables the system to dynamically call functions based on the conversation context. nomic-ai/nomic-embed-text-v1
- Nomic Database: Function Calling & Build List RAG: Manages function calling and builds lists using Retrieval-Augmented Generation (RAG). nomic-ai/nomic-embed-text-v1
- COQUI TTS or Bark Text-to-Speech Generation: Converts text responses back into speech. BARK: suno/bark COQUI: Borcherding/XTTS-v2_C3PO
- Google Deep Dream and Stable Diffusion Image Generation: For generating and modifying images based on user inputs. DEEP DREAM: tensorflow.org/tutorials/generative/deepdream SD: stabilityai/stable-diffusion-3-medium
- YOLO Image Recognition Classifier and LLaVA Image Recognition: For analyzing and understanding visual inputs. qnguyen3/nanoLLaVA-1.5
- Keras Classifier Navigator and Time Series Classifier: Used for navigating and interpreting time-series data. sebdg/emotions_classifier
- Output to Action Space: Directs the agent's actions based on the processed information.
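The flow through these components can be sketched as a chain of stages. Every stage below is a stub; the real system would call Whisper for transcription, an Ollama-hosted LLM for generation, and a TTS engine such as Coqui or Bark for audio output:

```python
# Minimal sketch of the ARC pipeline as a chain of stages.
# All stage bodies are stand-ins for the real model calls.

def speech_to_text(audio):
    return audio["transcript"]          # stand-in for Whisper

def llm_respond(prompt, history):
    history.append(prompt)              # conversation history as context
    return f"echo: {prompt}"            # stand-in for the LLM

def text_to_speech(text):
    return {"waveform": text}           # stand-in for Coqui/Bark TTS

def run_pipeline(audio, history):
    text = speech_to_text(audio)
    reply = llm_respond(text, history)
    return text_to_speech(reply)

history = []
out = run_pipeline({"transcript": "hello agent"}, history)
# out -> {"waveform": "echo: hello agent"}
```

Swapping a stub for a real model changes one function body without touching the rest of the chain, which is what lets the framework absorb new models as they become available.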
Each component plays a crucial role in ensuring that the agentic system is robust, responsive, and capable of handling a wide range of tasks. Integrating them yields an architecture that can be continuously improved and expanded as new models and technologies become available. This framework is the foundation of WillFreeAI's vision, enabling us to build advanced, ethical, and efficient AI systems that will drive the future of automation and artificial intelligence.
Join us on this exciting journey as we continue to innovate and push the boundaries of what's possible with AI. Together, we can build a brighter future with intelligent, ethical, and powerful agentic systems.