AI Research
2023
August 6-10
Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold (SIGGRAPH 2023 Conference Proceedings)
-
Synthesizing visual content that meets users' needs often requires flexible and precise controllability of the pose, shape, expression, and layout of the generated objects.
-
Existing approaches gain controllability of generative adversarial networks (GANs) via manually annotated training data or a prior 3D model, which often lack flexibility, precision, and generality.
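-
A minimal sketch of the drag idea, under stated assumptions: the real DragGAN optimizes StyleGAN2 latents with motion supervision plus point tracking, while the toy generator, points, and step size below are hypothetical stand-ins used only to show the motion-supervision step.
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical stand-in for a pretrained generator's feature maps;
# DragGAN itself uses intermediate StyleGAN2 features.
generator = torch.nn.Sequential(
    torch.nn.ConvTranspose2d(8, 16, 4, stride=4),  # 8x8 latent -> 32x32 features
    torch.nn.ReLU(),
    torch.nn.Conv2d(16, 16, 3, padding=1),
)
generator.requires_grad_(False)  # only the latent is optimized

def feature_at(feat, xy):
    # Bilinearly sample the CxHxW feature map at pixel (x, y).
    h, w = feat.shape[-2:]
    gx = 2 * xy[0] / (w - 1) - 1   # grid_sample expects coords in [-1, 1]
    gy = 2 * xy[1] / (h - 1) - 1
    grid = torch.stack([gx, gy]).view(1, 1, 1, 2)
    return F.grid_sample(feat, grid, align_corners=True).flatten()

latent = torch.randn(1, 8, 8, 8, requires_grad=True)
opt = torch.optim.Adam([latent], lr=2e-3)

handle = torch.tensor([10.0, 10.0])  # point the user grabs
target = torch.tensor([20.0, 16.0])  # where the user drags it
step = 2.0 * (target - handle) / (target - handle).norm()

with torch.no_grad():
    f0 = feature_at(generator(latent), handle)  # appearance to carry along

for _ in range(200):
    feat = generator(latent)
    # Motion supervision: make the features a small step ahead of the
    # handle match the handle's original features, nudging image content
    # toward the target. (Point tracking, which re-locates the handle
    # each iteration in DragGAN, is omitted from this toy.)
    loss = F.l1_loss(feature_at(feat, handle + step), f0)
    opt.zero_grad()
    loss.backward()
    opt.step()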
April 11
Amazon creates a new user-centric simulation platform to develop embodied AI agents
-
Amazon Alexa AI recently created a new simulation platform specifically for embodied-AI research, the field that specializes in the development of autonomous robots.
-
Our primary objective was to develop an interactive embodied-AI framework to catalyze the creation of next-generation embodied-AI agents
-
Several embodied-AI simulation platforms have been proposed in recent years (e.g., AI2-THOR, Habitat, iGibson). These platforms provide simulated scenes where embodied agents can navigate and interact with objects, yet most are not designed for humans to interact with the agents, because they lack user-centricity.
-
Alexa Arena offers a framework with user-centric capabilities, such as smooth visuals during robot navigation, continuous background animations and sounds, in-room viewpoints that simplify room-to-room navigation, and visual hints embedded in the scene that help human users generate suitable instructions for task completion.
-
In a game-like setting, users can interact with virtual robots through natural-language dialogue, providing invaluable feedback and helping the robots learn and complete their tasks.
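-
As a non-authoritative illustration of that game-like loop, the toy below pairs each user utterance with the robot's feedback and records the exchange; the Robot class and its one-verb grammar are invented for this sketch and are not the Alexa Arena API.
from dataclasses import dataclass, field

@dataclass
class Robot:
    # Toy embodied agent; real Arena robots act in simulated 3D scenes.
    location: str = "kitchen"
    log: list = field(default_factory=list)

    def act(self, instruction: str) -> str:
        verb, _, arg = instruction.partition(" ")
        if verb == "goto":
            self.location = arg               # viewpoint-style navigation
            return f"moved to {arg}"
        return "sorry, I don't understand"    # invites a clarifying turn

def interact(robot: Robot, user_turns: list) -> None:
    for turn in user_turns:
        feedback = robot.act(turn)
        robot.log.append((turn, feedback))    # dialogue becomes training data
        print(f"user: {turn!r} -> robot: {feedback}")

interact(Robot(), ["goto living_room", "polish the lamp"])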
April 10
Powerful new Meta AI tool can identify individual items within images
-
allowing computers to detect and comprehend the elements of a previously unseen image and isolate them for user interaction
-
Meta's Segment Anything Model (SAM) hunts for related pixels in an image and identifies the common components that make up all the pieces of the picture.
-
SAM can be activated by user clicks or text prompts. Meta researchers envision further uses for SAM in the AR/VR realm: when a user focuses on an object, it can be delineated, defined, and "lifted" into a 3D image.
-
A free working model is available online. Users can select from an image gallery or upload their own photos. They can then tap anywhere on the screen or draw a rectangle around an item of interest and watch SAM define, for instance, the outline of a nose, face or entire body.
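-
A minimal usage sketch of the click and rectangle prompts described above, assuming Meta's open-source segment_anything package (pip install segment-anything) and its published ViT-H checkpoint file; the image path and pixel coordinates are placeholders.
import numpy as np
import cv2
from segment_anything import SamPredictor, sam_model_registry

# Load the model; the checkpoint must be downloaded from Meta's release.
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

image = cv2.cvtColor(cv2.imread("photo.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)  # one-time image embedding; prompts are then cheap

# "Tap anywhere on the screen": a single foreground click at (x, y).
masks, scores, _ = predictor.predict(
    point_coords=np.array([[320, 240]]),
    point_labels=np.array([1]),   # 1 = foreground, 0 = background
    multimask_output=True,        # e.g., nose vs. face vs. whole body
)
best_mask = masks[np.argmax(scores)]

# "Draw a rectangle around an item": a box prompt in xyxy pixel coords.
box_masks, _, _ = predictor.predict(box=np.array([100, 80, 500, 420]))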