We have now passed Alan Turing’s test for high-grade artificial intelligence, which asked whether a machine could give a human the impression of conversing with another human. Most companies have also moved past other academic benchmarks, which, like leaderboards in a video game, provided a useful guideline but did not necessarily wow the general public. In fact, these notions may prove antiquated, as most people now look to AI with loftier goals: broad tasks that go beyond precision and recall over some corpus.
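For readers who have not met the traditional metrics mentioned above, here is a minimal sketch of precision and recall for a single query. The function name and the document IDs are hypothetical, made up purely for illustration; real benchmarks average these scores over many queries.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: items the system returned
    relevant:  items a human judged to be correct answers
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # items that were both returned and correct
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical document IDs, for illustration only.
retrieved = ["doc1", "doc2", "doc3", "doc4"]
relevant = ["doc2", "doc4", "doc5"]

p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")  # prints "precision=0.50 recall=0.67"
```

Precision rewards a system for returning only correct items, while recall rewards it for missing none. Neither number says anything about whether a conversation felt useful to the person on the other end.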
Emerging Use Cases – Ask Our Bot About The News
In our use case, we have deployed our own fine-tuned model, which answers questions based on our past news stories and gives a general Chicano perspective on current events. This is both necessary and quite fun for us: it is an avenue to combine our interests in news, culture and technology. Surely we will not be the only ones to do so, and in the near future this should become more common.
- Backpack found in Central Park that could be linked to Brian Thompson's killer
- Christian de Jesús “N”, alleged attacker in the Melanie Barragán case, arrested
- Klebsiella oxytoca outbreak in Edomex raises alarm after the deaths of 13 children
- Mexico consolidates its position as the top trading partner of the United States in 2024
- Amazon and Walmart dominate sales during Black Friday and Cyber Monday in the United States
- New clues in the murder of United Healthcare CEO Brian Thompson: hidden message found on the bullets
- The controversy over the AI that decided who received help from United Health
- Ambush of police officers in Culiacán leaves one dead and four wounded
- Financial Times names Sheinbaum among the 25 most influential women of 2024
- Colombia’s Immigration Strike Ends After Labor Deal, Easing Chaos at Airports
Thus, what we have shown is that we can run a reasonably high-performing help agent that interfaces with the content we prioritize. Similarly, a company could let its customers interact more dynamically with its terms of agreement, service conditions or product offerings. Essentially, we have created value where there would otherwise be none, because we can interface with and opine on content in a way no human would: we cannot afford a 24/7 human operator to tell you stuff about the news!
New Benchmarks
Earlier, I noted that benchmarks and leaderboards are somewhat out of place now. They were narrowly focused on engineering goals, not human interactions. Now, full-on human social functions are being attributed to conversational agents powered by large language models. These products are measured not only with traditional benchmarks but also with the professional aptitude tests associated with official credentials, like the MCAT or LSAT. In a sense, the bar is now higher because we are comparing LLMs to human performance, not to past models; at the same time, we are devaluing the ability to memorize content and make inferences from that information base.
Human performance is at times just memory recall. The lack of creativity in many professional aspects of work makes them less meaningful and subject to this high-level automation. We should embrace these changes, as they free us up for more creative and macro-level decision making.