Research Engineer - Multimodal Companion Agent

DeepMind
On-site
Tokyo, Japan

At Google DeepMind, we value diversity of experience, knowledge, backgrounds and perspectives and harness these qualities to create extraordinary impact. We are committed to equal employment opportunities regardless of sex, race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity, pregnancy, or related condition (including breastfeeding) or any other basis as protected by applicable law. If you have a disability or additional need that requires accommodation, please do not hesitate to let us know.

Snapshot

We are seeking a highly motivated and innovative Research Engineer to join our team in Tokyo, focused on advancing the state of the art in multimodal companion agents. You will work with researchers and software engineers developing a cutting-edge companion agent. This will involve applying the latest advancements in large language models (LLMs), particularly in the multimodal domain (vision, audio, text). The focus will be on developing more capable, robust, factual, actionable, co-presenting, and helpful companion agents, with the potential to impact millions of users. In this role, you will have the opportunity to apply your expertise in areas such as LLM post-training and evaluation to create AI agents that can understand and interact with the world in unprecedented ways. This role offers a unique opportunity to collaborate with a world-class, cross-functional team at Google DeepMind, work on challenging problems, and develop innovative solutions in a dynamic and collaborative environment. If you are passionate about shaping the future of human-computer interaction through AI and are eager to make a significant impact in the rapidly evolving landscape of agentic technologies, we encourage you to apply.

About us

Artificial Intelligence could be one of humanity’s most useful inventions. At Google DeepMind, we’re a team of scientists, engineers, machine learning experts and more, working together to advance the state of the art in artificial intelligence. We use our technologies for widespread public benefit and scientific discovery, and collaborate with others on critical challenges, ensuring safety and ethics are the highest priority.

The role

As a Research Engineer at Google DeepMind, you will contribute to the development of the Gemini-powered multimodal companion, creating an immersive and engaging user experience across domains including education, health, and gaming. You will work on cutting-edge research that feeds directly into impactful products, pushing the boundaries of AI to create companion agents that can truly understand and respond to human needs.

Key responsibilities

  • Implementation & Optimization: Translate research concepts into practical implementations by developing and optimizing multimodal AI models, and building and maintaining robust data pipelines for training and evaluation.
  • Experimentation & Evaluation: Design, implement, and run experiments to evaluate the performance and robustness of multimodal companion AI agents, using metrics and techniques like prompt engineering and few-shot learning.
  • Contextual Interaction: Implement algorithms that enable the agent to analyze user interactions via vision and audio and to provide contextually relevant assistance by voice.
  • Collaboration & Knowledge Sharing: Work closely with research scientists and engineers, contributing to team discussions, sharing knowledge, and actively participating in code reviews to foster a collaborative environment.
  • Innovation & Product Impact: Proactively identify and address technical challenges, stay updated on the latest AI advancements, and focus on developing solutions that can be effectively integrated into Google products and services, contributing to product impact.

About you

You are a passionate and talented Research Engineer with a strong foundation in AI and a proven ability to conduct impactful research and engineering. You embrace change and thrive under ambiguity. You have a collaborative mindset and are excited to work as part of a team to tackle ambitious research and engineering challenges. You are passionate about seeing research translated into real-world products that improve the lives of users, are eager to work in an environment where research has a direct path to product impact, and are driven by a desire to create positive change through AI.

  • Bachelor's, Master's, or Ph.D. degree in Computer Science, Artificial Intelligence, or a related field.
  • A minimum of 5 years of relevant professional experience.
  • Experience with core software engineering and applied implementations of AI.
  • Ability to innovate with and assess new LLM advances and techniques for pilot projects, quickly demonstrating viability and potential impact.
  • Solid understanding of deep learning, natural language processing, speech processing, and/or computer vision.
  • Experience with relevant ML frameworks such as JAX, TensorFlow, or PyTorch.
  • Strong programming skills in Python.
  • Excellent communication and collaboration skills.

In addition, the following would be an advantage: 

  • Experience with multimodal learning, LLMs, and/or companion AI agents.
  • Strong publication record in top-tier AI conferences or journals.
  • Strong track record in competitions in machine learning, data science, or AI in games.
  • Experience with pretraining, post-training techniques, prompt engineering, few-shot learning, and evaluations.
  • Familiarity with large-scale model training and deployment.
  • Experience working in a collaborative, cross-functional team environment, particularly across different time zones.
  • Experience with C++.