


Posted by Daniel Adiwardana, Senior Research Engineer, and Thang Luong, Senior Research Scientist, Google Research, Brain Team

Modern conversational agents (chatbots) tend to be highly specialized: they perform well as long as users don't stray too far from their expected usage. To better handle a wide variety of conversational topics, open-domain dialog research explores a complementary approach, attempting to develop a chatbot that is not specialized but can still chat about virtually anything a user wants. Besides being a fascinating research problem, such a conversational agent could lead to many interesting applications, such as further humanizing computer interactions, improving foreign language practice, and making relatable interactive movie and videogame characters.

However, current open-domain chatbots have a critical flaw: they often don't make sense. They sometimes say things that are inconsistent with what has been said so far, or that lack common sense and basic knowledge about the world. Moreover, chatbots often give responses that are not specific to the current context. For example, "I don't know" is a sensible response to any question, but it's not specific. Current chatbots do this much more often than people because it covers many possible user inputs.

In "Towards a Human-like Open-Domain Chatbot", we present Meena, a 2.6 billion parameter end-to-end trained neural conversational model. We show that Meena can conduct conversations that are more sensible and specific than existing state-of-the-art chatbots. Such improvements are reflected in a new human evaluation metric that we propose for open-domain chatbots, called Sensibleness and Specificity Average (SSA), which captures basic but important attributes of human conversation. Remarkably, we demonstrate that perplexity, an automatic metric readily available to any neural conversational model, correlates highly with SSA.

A chat between Meena (left) and a person (right).
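To make the metric concrete, here is a minimal sketch of how an SSA-style score could be aggregated, assuming each response has already received binary sensible/specific judgments from human raters. The `RatedResponse` container and the sample labels are hypothetical illustrations, not the paper's evaluation pipeline.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class RatedResponse:
    # Binary human judgments for one chatbot response (hypothetical schema).
    sensible: bool
    specific: bool

def ssa(ratings: list[RatedResponse]) -> float:
    """Sensibleness and Specificity Average: the mean of the fraction of
    responses judged sensible and the fraction judged specific."""
    sensibleness = mean(r.sensible for r in ratings)
    specificity = mean(r.specific for r in ratings)
    return (sensibleness + specificity) / 2

# Three hypothetical rated responses: one sensible and specific,
# one sensible but generic, one that makes no sense at all.
print(ssa([RatedResponse(sensible=True, specific=True),
           RatedResponse(sensible=True, specific=False),
           RatedResponse(sensible=False, specific=False)]))  # -> 0.5
```

A generic reply like "I don't know" still earns sensibleness credit under such a scheme, but it drags down the specificity half of the average, which is exactly the behavior the metric is meant to penalize.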

Meena is an end-to-end, neural conversational model that learns to respond sensibly to a given conversational context. The training objective is to minimize perplexity, the uncertainty of predicting the next token (in this case, the next word in a conversation). At its heart lies the Evolved Transformer seq2seq architecture, a Transformer architecture discovered by evolutionary neural architecture search to improve perplexity.
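As a worked illustration of the objective: perplexity is the exponential of the average negative log-likelihood a model assigns to the tokens it must predict, so a model that is always as uncertain as a uniform choice among k tokens has perplexity k. The function below is a sketch of the definition over a hypothetical list of per-token log-probabilities, not Meena's training code.

```python
import math

def perplexity(token_log_probs: list[float]) -> float:
    """Exponential of the average negative log-likelihood the model
    assigns to each next token."""
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)

# A model that assigns probability 0.25 to every next token is as
# uncertain as a uniform choice among four tokens:
print(perplexity([math.log(0.25)] * 10))  # ~4.0
```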

Meena has a single Evolved Transformer encoder block and 13 Evolved Transformer decoder blocks, as illustrated below. The encoder is responsible for processing the conversation context to help Meena understand what has already been said in the conversation. The decoder then uses that information to formulate an actual response. Through tuning the hyper-parameters, we discovered that a more powerful decoder was the key to higher conversational quality.

Example of Meena encoding a 7-turn conversation context and generating a response, "The Next Generation".
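The encoder/decoder asymmetry can be written down as a small configuration sketch. Everything below is illustrative: the dataclass and field names are hypothetical, and real Evolved Transformer blocks contain far more structure than a pair of block counts.

```python
from dataclasses import dataclass

@dataclass
class Seq2SeqConfig:
    # Hypothetical configuration fields for a seq2seq chatbot.
    num_encoder_blocks: int  # reads the conversation context
    num_decoder_blocks: int  # generates the response token by token

# Meena's reported layout: one encoder block paired with a much deeper,
# more powerful decoder stack, which hyper-parameter tuning showed was
# the key to higher conversational quality.
meena_layout = Seq2SeqConfig(num_encoder_blocks=1, num_decoder_blocks=13)
```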
Conversations used for training are organized as tree threads, where each reply in the thread is viewed as one conversation turn. We extract each conversation training example, with seven turns of context, as one path through a tree thread. We choose seven as a good balance between having long enough context to train a conversational model and fitting models within memory constraints (longer contexts take more memory).
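A minimal sketch of that extraction step, under the assumption that a thread is a simple tree of `Turn` nodes whose replies are their children: every reply along every path becomes a training target, with its context truncated to the most recent seven turns. The `Turn` structure and the sample thread are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Turn:
    text: str
    replies: list["Turn"] = field(default_factory=list)

def extract_examples(root: Turn, max_context: int = 7):
    """Walk a tree thread and emit (context, response) training pairs,
    where the context is at most `max_context` preceding turns."""
    examples = []

    def walk(node: Turn, path: list[str]) -> None:
        path = path + [node.text]  # the path so far is the context
        for reply in node.replies:
            examples.append((path[-max_context:], reply.text))
            walk(reply, path)

    walk(root, [])
    return examples

# Hypothetical thread: one post with two competing reply branches.
thread = Turn("Have you seen Star Trek?",
              [Turn("Yes! Which series?", [Turn("The Next Generation")]),
               Turn("No, is it any good?")])
for context, response in extract_examples(thread):
    print(context, "->", response)
```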
