Can AI Chatbots Replace Humans?

Chatbots exist to take over part of the work that people do, mostly the routine part. The person then acts as the chatbot's operator, and with good management software even constant human supervision may not be needed. People move from the role of worker to that of observer. Bots are also used in the entertainment industry. For example, you can talk to an AI chat, play a roleplaying game, and genuinely enjoy communicating with the neural network.

Artificial Intelligence Personality

Language models can also take on personality archetypes. In a study led by Hang Jiang, a computer scientist at the Massachusetts Institute of Technology (MIT), researchers had GPT-3.5 assume hundreds of personas by prompting it to behave with different combinations of personality traits, such as introverted, antagonistic, conscientious, neurotic, and closed to experience.

For each persona, the model took a standardized personality test and wrote an 800-word childhood narrative, which was then analyzed for psycholinguistic features associated with personality traits. The model dutifully displayed the personality traits assigned to it, both in the test and in the stories. Such models might allow researchers to test how well people with different personality traits would perform in different jobs.

Market researchers are already finding value in these models. In one recent study, Ayelet Israeli and her colleagues found that GPT-3.5 exhibits realistic consumer behavior. When asked whether it would buy a laptop at different prices, the model was less price sensitive when told its income was $120,000 rather than $50,000. It preferred the brand of toothpaste it had previously purchased and would pay less for yogurt if it already had plenty at home. It also said it would pay realistic premiums for certain product features, such as fluoride in toothpaste or aluminum-free deodorant.

The model did not always give the same answer. Instead, it offered a range of answers about its preferences and willingness to pay. Israeli and her colleagues aggregated the model's many responses, creating a virtual survey of buyers of these products in a fraction of the time and cost it would take in the real world. The data used to train language models skews toward Western, affluent people, so such a consumer survey may be skewed in the same way. However, Israeli imagines having the AI impersonate different consumers, or widening the scope of the study, to create a more representative analysis of a product's appeal or potential.

One market research company already uses language models in its work. The startup Synthetic Users has created a service using OpenAI models in which clients — including Google, IBM, and Apple — can describe the type of person they want to interview and ask them questions about their needs, wants, and feelings about a particular product, for example, a new website or wearable device. The company’s system generates synthetic interviews that co-founder Kwame Ferreira says are “infinitely richer” and more useful than the “bland” feedback companies get when interviewing real people.

Chatbots

Chatbots can also be used to study more complex interactions between people. Last year, Stanford University and Google researchers developed "social simulacra" to study user behavior on platforms such as Facebook and Reddit. The researchers populated a platform called SimReddit with the equivalent of 1,000 different users by repeatedly prompting GPT-3 with a user's identity, the community topic, the community rules, and previous forum posts. People found it difficult to distinguish the resulting discussions from real ones, and platform developers found the tool useful for designing rules and moderation methods.

This year, researchers created a more immersive simulation populated by so-called "generative agents." The characters were given the ability to remember experiences, reflect on them, and make and carry out plans. Organized behavior emerged: the researchers gave one agent the idea of throwing a Valentine's Day party, and within two days the agents in the town had organized it in a coordinated manner. Joon Sung Park, a Stanford computer science graduate student who led both projects, said the virtual world could be used to study the impact of economic policies over time before imposing them on real people.

How to Tell Whether a Chatbot Is Good

The choice of metrics for assessing the quality of a chatbot depends on how the problem is formulated. For conversational chatbots, both automatic metrics and metrics that require human judgment are used. For example, to estimate the quality of AI roleplay, we need users' assessments.

For many years, scientists have been trying to develop an automatic metric that shows how adequate or human-like a conversation is by evaluating the text a chatbot produces. The most popular metric is called "perplexity," which can be considered an intermediate estimate. Perplexity reflects how uncertain the model is when predicting the next word of real human text: the more probable the model finds the words a person actually wrote, the lower its perplexity. The lower this indicator, the better the chatbot tends to communicate. Engineers monitor perplexity throughout training, and after training several different neural networks they select the best one according to this indicator. The next step is a live test using metrics from the second category, which require people to do the evaluating.
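As a rough illustration of the perplexity metric described above, the sketch below computes it from the probability a model assigned to each word (token) of a reference text. The function name `perplexity` and the sample probabilities are assumptions made for illustration, not output from any real chatbot. Perplexity is taken here in its usual form: the exponential of the average negative log-probability per token.

```python
import math

def perplexity(token_probs):
    """Perplexity of a text, given the probability the model assigned
    to each token that actually appeared in it (hypothetical inputs)."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    # Average negative log-probability per token (cross-entropy in nats).
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)

# Made-up probabilities for two imaginary models on the same four-token reply.
confident_model = [0.5, 0.5, 0.5, 0.5]
unsure_model = [0.1, 0.1, 0.1, 0.1]

print(perplexity(confident_model))  # lower perplexity: better prediction
print(perplexity(unsure_model))     # higher perplexity: worse prediction
```

A model that gives every actual token a probability of 0.5 behaves as if it were choosing between two equally likely words at each step, which is why its perplexity comes out close to 2.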

In the first version of the test, people talk to two different versions of a chatbot and decide which one is better. This estimate is relative and is suited to pairwise comparisons.
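A pairwise comparison of this kind can be summarized as the share of votes each version receives. The sketch below is only one simple way to do that; the labels "A", "B" and "tie", the function name `vote_shares`, and the sample votes are all hypothetical.

```python
from collections import Counter

def vote_shares(votes):
    """Fraction of evaluator votes received by each label."""
    if not votes:
        raise ValueError("need at least one vote")
    counts = Counter(votes)
    total = len(votes)
    return {label: count / total for label, count in counts.items()}

# Made-up votes from eight evaluators who each talked to versions A and B.
votes = ["A", "B", "A", "A", "tie", "B", "A", "A"]
shares = vote_shares(votes)
print(shares)
```

The result only says which of the two versions was preferred more often. It gives no absolute measure of how good either one is.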

The second option uses absolute evaluation criteria: a person chats with a bot and rates two indicators, sensibleness (reasonableness) and specificity. People check how much sense the bot's answers make in context, whether the bot said something nonsensical or out of place, and how original and interesting the answers are. The average of the two indicators is the final absolute score by which chatbots can be compared. AI companion bots tend to score very high on these criteria, which is what makes them good roleplay partners.
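The absolute score described above can be sketched as follows. The sketch assumes each bot reply gets two yes/no labels from a human rater, and that a reply which does not make sense is never counted as specific. Both the labeling scheme and the sample ratings are illustrative assumptions.

```python
def absolute_score(ratings):
    """ratings: list of (sensible, specific) booleans, one pair per bot reply.
    Returns sensibleness, specificity, and their average."""
    if not ratings:
        raise ValueError("need at least one rated reply")
    n = len(ratings)
    sensibleness = sum(1 for sensible, _ in ratings if sensible) / n
    # Assumption: a reply only counts as specific if it is also sensible.
    specificity = sum(1 for sensible, specific in ratings
                      if sensible and specific) / n
    return {
        "sensibleness": sensibleness,
        "specificity": specificity,
        "average": (sensibleness + specificity) / 2,
    }

# Made-up ratings for four replies from one chatbot.
ratings = [(True, True), (True, False), (False, False), (True, True)]
score = absolute_score(ratings)
print(score)
```

Because the score is absolute, results for different chatbots can be compared directly, provided the raters follow the same instructions.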

The evaluation criteria for task-oriented bots are different. The job of a conversational chatbot is to keep a conversation going, whereas the job of a task-oriented bot is to solve a specific user problem.

Conclusion

Can AI replace humans? Most likely, no. However, it can take on some functions and make our lives easier. Don't be afraid of bots. Lifelike roleplay with an AI bot is one example of how useful new technologies can be.