DEEPNIGHT has developed ai1, a model with more than 600 billion parameters that stands as the second-largest model in the world after GPT-4. ai1 is designed to perform on par with GPT-4 and has a context window of 8k tokens. It was trained on a diverse corpus of text, including RefinedWeb, open-source GitHub code, and Common Crawl, and further fine-tuned for logical understanding, reasoning, and function calling.
One of ai1's key features is its chaining methodology, which generates instruction-based prompts internally and thereby reduces the need for the extensive prompt engineering common with models such as ChatGPT, GPT-4, and Llama. The model is adept at automation tasks, understanding human emotions, roleplay, and coding. It also includes global memory units for storing data outside the immediate context; these can hold function schemas, among other things.
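Since ai1's API is not public, the global-memory idea can only be illustrated, not demonstrated. The sketch below is a hypothetical stand-in: the `GlobalMemory` class and its method names are assumptions, not ai1's actual interface. It shows the core benefit described above, where a function schema is registered once in a persistent store and later requests reference it without re-sending it inside the 8k-token context window.

```python
import json


class GlobalMemory:
    """Minimal key-value store standing in for ai1's global memory units.

    Hypothetical sketch: ai1's real interface is not public, so this
    class and its method names are illustrative assumptions only.
    """

    def __init__(self):
        self._store = {}

    def register_schema(self, name, schema):
        # Store a function schema once; it persists across requests,
        # so it never consumes tokens in the model's context window.
        self._store[name] = schema

    def get_schema(self, name):
        # Retrieve a previously registered schema, or None if absent.
        return self._store.get(name)


memory = GlobalMemory()
memory.register_schema(
    "get_weather",
    {
        "description": "Fetch current weather for a city",
        "parameters": {"city": {"type": "string"}},
    },
)

# A later request can look the schema up by name instead of
# embedding the full JSON in every prompt.
schema = memory.get_schema("get_weather")
print(json.dumps(schema["parameters"]))
```

The design choice being illustrated is simply indirection: the prompt carries only a short reference (`"get_weather"`), while the bulky schema lives outside the context.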
However, there is no detailed roadmap for ai1's future, as the developers have expressed concern about open-source research being used for profit by other companies. Access to ai1 will remain closed for some time while the team continues to evaluate and improve the model.