Grok is an AI crafted in the spirit of the Hitchhiker’s Guide to the Galaxy, designed to tackle almost any question and, even more challengingly, to propose what questions one might consider asking!
X (prev. Twitter) engineered Grok to respond with a touch of humor and a rebellious streak, so it's not for those who dislike a bit of comedy in their answers!
One standout feature of Grok is its access to real-time global knowledge through the 𝕏 platform. Grok is also prepared to handle the provocative inquiries that most other AI systems would dismiss.
Grok is currently a very early beta product – the best xAI could build with two months of development – so users should expect rapid improvements week by week, shaped by community feedback.
Why Twitter is building Grok AI (ChatGPT Competitor)
At xAI, their ambition is to forge AI instruments that propel humanity in its thirst for comprehension and wisdom.
By nurturing Grok, they are set on:
Collecting feedback and ensuring they're crafting AI utilities that offer the greatest good for humanity. They hold that it's crucial to design AI resources that serve individuals from every walk of life and the entire political spectrum.
Empowering their users: they also strive to give users powerful AI capabilities, always within the bounds of the law. With Grok, their objective is to publicly pilot and demonstrate this philosophy.
Fueling research and novelty: Their intent is for Grok to act as a robust research aide for anyone, aiding in swift information retrieval, data analysis, and the ignition of fresh insights.
Ultimately, their goal is for their AI tools to be allies in the pursuit of insight.
History of Grok
The engine behind Grok is Grok-1, their cutting-edge Large Language Model (LLM), developed and iterated on extensively over the past four months.
Upon unveiling xAI, they began with a prototype LLM (Grok-0) endowed with 33 billion parameters. This nascent model approached the performance of LLaMA 2 (70B) in standard language model benchmarks but with just half the training resources. Over the past two months, they've achieved substantial advancements in reasoning and coding prowess, culminating in Grok-1.
This top-tier language model boasts impressive benchmarks, scoring 63.2% on the HumanEval coding challenge and 73% on MMLU.
To gauge the advancements made with Grok-1, they have performed a battery of tests using established machine learning benchmarks tailored to assess mathematical and reasoning faculties:
GSM8k: Middle school math word problems (Cobbe et al. 2021), evaluated with a chain-of-thought prompt.
MMLU: Multidisciplinary multiple-choice questions (Hendrycks et al. 2021), provided with 5-shot in-context examples.
HumanEval: A Python code-completion task (Chen et al. 2021), evaluated zero-shot for pass@1.
MATH: Middle school and high school mathematics problems written in LaTeX (Hendrycks et al. 2021), prompted with a fixed 4-shot format.
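The pass@1 score cited for HumanEval is typically computed with the unbiased pass@k estimator introduced in the HumanEval paper (Chen et al. 2021): generate n code samples per problem, count the c samples that pass the unit tests, and estimate the probability that at least one of k drawn samples passes. A minimal sketch in Python (the function name and the sample counts below are illustrative, not from xAI's evaluation):

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator from Chen et al. (2021).

    n: total code samples generated for a problem
    c: number of samples that pass all unit tests
    k: evaluation budget
    Returns 1 - C(n-c, k) / C(n, k).
    """
    if n - c < k:
        # Fewer failing samples than the budget: some draw must pass.
        return 1.0
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)

# For k = 1 the estimator reduces to the passing fraction c / n:
print(round(pass_at_k(n=200, c=40, k=1), 6))  # 0.2
```

A model's overall benchmark score is then the mean of this estimate across all problems in the suite; for pass@1 with a single greedy sample per problem, it is simply the fraction of problems solved.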