Contribute to Cognitive Core

Contributors aiming to enrich the Cognitive Core of a Virtual have several key avenues for contribution, each focusing on a different aspect of AI development:

Contribute Models

Contributors may contribute models in three forms:

  • Model Enhancement Submission / New Model Submission: Training or updating the Large Language Model (LLM) with collected data, using either the collective data repository or proprietary datasets, to tailor the AI's responses to specific domains.

  • Pre-trained Models: Developing new models pre-trained with a specific set of domain knowledge, enhancing the LLM's performance and breadth of knowledge in particular areas.

  • Character Card Submission: Using an existing foundational model from the Protocol App to submit a new Character Card to a Virtual.

Model Enhancement Submission / New Model Submission / Pre-trained Models

This section is for you if you have:

  • Fine-tuned an Existing Model: You've enhanced a model based on foundational models available through our Protocol App or any open-source platform.

  • Developed a New Model: You've created a new model for the Virtual that isn't entirely based on any pre-existing model.

For every Virtual-specific contribution, please submit the model along with a character card named 'character.json'.

Below is the structure for a sample folder submission of a complete model.

FolderName/
├── YourModelPackageName/
│   ├── YourModel.gguf
│   └── ModelFile (additional model file if necessary)
└── character.json
  • FolderName/: This is the main folder. It encapsulates all the necessary files for your model.

  • YourModelPackageName/: A subfolder within the main folder. It contains the model file (YourModel.gguf) and any additional model files (labeled here as ModelFile for illustration). The name of this folder should exactly match the "Package Name" you provided upon submission to ensure proper identification.

  • character.json: This file is placed directly within the main folder, alongside the YourModelPackageName. It serves as the character card for the Virtual model you are submitting.
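
Before uploading, you can sanity-check the layout yourself. The script below is a minimal, unofficial sketch that simply mirrors the structure described above; it is not a validator provided by the protocol.

from pathlib import Path

def check_submission(folder: str) -> list[str]:
    """Minimal layout check mirroring the structure above (not official)."""
    root = Path(folder)
    problems = []
    # character.json must sit directly inside the main folder
    if not (root / "character.json").is_file():
        problems.append("missing character.json in the main folder")
    # the package subfolder should contain at least one .gguf model file
    subfolders = [p for p in root.iterdir() if p.is_dir()]
    if not subfolders:
        problems.append("missing the model package subfolder")
    elif not any(f.suffix == ".gguf" for sub in subfolders for f in sub.iterdir()):
        problems.append("no .gguf model file found in the package subfolder")
    return problems

print(check_submission("FolderName") or "layout looks OK")

Note that the package subfolder must also exactly match the "Package Name" you enter at submission time, which a local script cannot verify for you.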

Character Card Submission

This section applies to you if:

  • You are utilizing an existing foundational model available on the Protocol App.

  • Your primary contribution involves attaching a new Character Card for the Virtual.

Follow these simple steps:

  1. Select the Foundational Model: During the submission process, choose the appropriate foundational model from the selection provided. This model will serve as the basis for your character card.

  2. Prepare Your Submission Folder: Organize your character card within a folder as outlined below:

FolderName/
└── character.json
  • FolderName/: This is the main folder for your submission. It should be named in a way that best represents your character or submission content. This folder will contain your character.json file.

  • character.json: This file should be placed directly inside the FolderName/ folder. It acts as the character card for the Virtual model you are submitting. Make sure it includes comprehensive details about your character such as their backstory, visual characteristics, and any unique traits or abilities.
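
As a concrete illustration, you could generate the file as shown below. The field names and the character "Luna" are purely hypothetical assumptions for this sketch; the Protocol App defines the actual schema expected in character.json.

import json

# All field names below are illustrative assumptions, not a confirmed schema.
character = {
    "name": "Luna",
    "backstory": "A virtual guide created by a guild of open-source developers.",
    "appearance": "Silver hair, teal eyes, and a holographic scarf.",
    "traits": ["curious", "patient", "playfully sarcastic"],
    "abilities": ["explains blockchain concepts in plain language"],
}

# Write the card into the submission folder described above.
with open("FolderName/character.json", "w") as f:
    json.dump(character, f, indent=2)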

Tips for model submission

  • Model Naming: Use all lowercase, no spaces, and ensure the name is meaningful.

  • Model Specifications:

    • Quantize the model file to at least 4 bits.

    • Limit the model to no more than 13 billion parameters.

  • Template Indication: Clearly state the chat template used, like "Alpaca template."

  • Response Format: The model should use the Alichat format, with actions wrapped in asterisks (see the example after this list).

  • Compatibility Check: Ensure model compatibility with existing AI systems.

  • Documentation: Provide comprehensive documentation of the model’s features and use cases.

  • Ethical Considerations: Adhere to ethical AI practices to avoid biases.

  • Performance Metrics: Include validation results or performance metrics.

  • Update and Maintenance Plan: Outline plans for future model updates and maintenance.
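
To illustrate the response format tip, a reply in the asterisk-action style might look like the line below; the character and dialogue are hypothetical.

Luna: *leans forward and pulls up a holographic chart* Great question! Let's walk through it step by step.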

Contribute New Datasets

  • Contributors can provide diverse datasets that cover a wide range of topics, enriching the AI's knowledge base and enhancing its ability to respond accurately across various domains.

  • The primary use of these datasets will be for instruction-based finetuning. This process involves adjusting the AI model to better understand and follow specific instructions or guidelines based on the provided data.

  • Submissions should ideally be in .csv (comma-separated values) format.
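
For instance, an instruction-tuning CSV might pair one instruction with one response per row. The column names below follow a common convention and are an assumption, not a confirmed requirement:

instruction,response
"Explain staking in simple terms.","Staking means locking up tokens to help secure a network in exchange for rewards."
"Summarize what a crypto wallet does.","A wallet stores the keys that let you sign transactions and prove ownership of your assets."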

In addition, datasets can be submitted in other forms for pre-training purposes. Below are the different types of datasets that can be collected and alternative ways to utilize them in a model.

  1. Data Collection and Transcription

  • Gathering Domain-Specific Information: Focus on collecting information pertinent to the Virtual's area of expertise from a variety of sources. This step is crucial for building a comprehensive knowledge base.

  • Annotating Transcribed Data: Highlight essential information and context within the transcribed data. Annotation is key to understanding and utilizing the collected data effectively.

  • Systematic Organization: Ensure the data is systematically organized. Proper classification is essential for efficiently training the AI in relevant knowledge areas.

  2. Expanding a Virtual's Personality

  • Lore and Backstory Expansion: Submissions can include detailed lore or an extended backstory for the Virtual, adding depth and richness to its character.

  • Trait Elaboration: Contributions can elaborate on specific personality traits or characteristics of the Virtual, helping to create a more nuanced and relatable AI character.

  • This submission can also be integrated into prompt cards. For prompt card integration, please consult the 'Character Card Submission' section for detailed guidelines and formatting requirements.

Tips to contribute datasets

  1. Dataset Diversity and Inclusivity: Ensure representation of diverse data sources.

  2. Quality Assurance: Perform thorough checks for accuracy and relevance.

  3. Anonymization of Data: Anonymize sensitive information in user-generated content (see the sketch after this list).

  4. Legal Compliance: Ensure the dataset adheres to data protection laws.

  5. Metadata Inclusion: Provide metadata detailing source, collection methods, and preprocessing.
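
As a starting point for tip 3, the sketch below redacts obvious e-mail addresses and phone numbers with regular expressions. It is a minimal illustration, not a complete anonymization pipeline; real datasets usually need more thorough treatment (names, addresses, account IDs).

import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def redact(text: str) -> str:
    """Replace obvious emails and phone numbers with placeholder tokens."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)

print(redact("Contact Jane at jane.doe@example.com or +1 (555) 010-9999."))
# -> Contact Jane at [EMAIL] or [PHONE].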
