Tencent Voyager Turns Photos Into Explorable 3D Worlds

Tencent releases HunyuanWorld-Voyager, a groundbreaking open-weights model that generates steerable, depth-enabled 3D-like video from a single image, though it comes with notable caveats.

Tencent released HunyuanWorld-Voyager, an open-weights AI model that can generate 3D-consistent video sequences from a single image, letting users steer a camera through virtual scenes. The tool outputs both RGB video and depth data to enable direct 3D reconstruction without traditional modeling steps.

Each generated clip comprises 49 frames, roughly two seconds of video, but Tencent notes that clips can be chained into sequences lasting several minutes. The result gives the sensation of moving through a real 3D space even though every frame is still a 2D image. Depth maps accompany the video, and combining the two yields 3D point clouds for reconstruction.
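To illustrate how an RGB frame plus a depth map can become a point cloud, here is a minimal sketch using the standard pinhole camera model. It is not Voyager's own reconstruction code: the function name `depth_to_point_cloud` and the intrinsics `fx`, `fy`, `cx`, `cy` are assumptions made for illustration, and a real pipeline would take the intrinsics from the camera or the model's metadata.

```python
import numpy as np

def depth_to_point_cloud(rgb, depth, fx, fy, cx, cy):
    """Back-project one RGB-D frame into a colored point cloud.

    rgb:   (H, W, 3) array of colors
    depth: (H, W) array of depth along the camera z-axis;
           non-positive values are treated as invalid
    fx, fy, cx, cy: assumed pinhole intrinsics, in pixels
    Returns (points, colors), each of shape (N, 3).
    """
    h, w = depth.shape
    # Pixel grids: u indexes columns (x), v indexes rows (y).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.astype(np.float64)
    # Invert the pinhole projection u = fx * x / z + cx (same for v).
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    valid = z > 0
    points = np.stack([x[valid], y[valid], z[valid]], axis=-1)
    colors = rgb[valid]
    return points, colors

# Tiny synthetic frame: 2 x 3 pixels, one pixel with no depth.
rgb = np.zeros((2, 3, 3), dtype=np.uint8)
depth = np.full((2, 3), 2.0)
depth[0, 0] = 0.0
points, colors = depth_to_point_cloud(rgb, depth, fx=1.0, fy=1.0, cx=1.0, cy=0.5)
print(points.shape)
```

Point clouds from successive frames would still need to be transformed by each frame's camera pose before they can be merged into one scene.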

The system works from a single input image and a user-defined camera trajectory. Users can move the camera forward, backward, left, or right, or turn it, and the model blends the image and depth data with a memory-efficient “world cache” to produce video that follows the chosen camera path.
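A camera trajectory of this kind is commonly represented as one pose per frame. The sketch below builds such a path as 4x4 camera-to-world matrices for a 49-frame clip. The helper names `yaw_matrix` and `make_trajectory`, the axis convention (camera looking down +z, turning about the y-axis), and the pose format are all assumptions for illustration; the article does not describe the input format Voyager actually expects.

```python
import numpy as np

FRAMES_PER_CLIP = 49  # clip length reported in the article

def yaw_matrix(angle_rad):
    """Rotation about the vertical (y) axis."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])

def make_trajectory(forward_total, yaw_total_rad, n_frames=FRAMES_PER_CLIP):
    """Return (n_frames, 4, 4) camera-to-world poses.

    The camera starts at the origin looking down +z, then moves
    and turns in equal per-frame steps (hypothetical convention).
    """
    poses = np.tile(np.eye(4), (n_frames, 1, 1))
    step = forward_total / (n_frames - 1)
    position = np.zeros(3)
    for i in range(n_frames):
        rotation = yaw_matrix(yaw_total_rad * i / (n_frames - 1))
        poses[i, :3, :3] = rotation
        poses[i, :3, 3] = position
        # Advance along the current viewing direction (rotated +z axis).
        position = position + step * rotation[:, 2]
    return poses

# Move 2 units straight ahead over one clip, without turning.
straight = make_trajectory(forward_total=2.0, yaw_total_rad=0.0)
print(straight.shape, straight[-1, :3, 3])
```

Passing a non-zero `yaw_total_rad` produces a curved path, since each step follows the direction the camera faces at that frame.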

Voyager was trained on more than 100,000 video clips, including Unreal Engine content, which effectively taught the model to imitate camera motion through 3D environments. Although it maintains spatial coherence impressively, the output remains video with depth rather than true 3D geometry, reflecting the limitations of current transformer-based generative models.

There are practical caveats. Running Voyager demands substantial GPU power: Tencent cites 60 GB of GPU memory for 540p output, with 80 GB recommended for better results. The weights are published on Hugging Face, but the license restricts use in the EU, the UK, and South Korea, and commercial deployments require separate licensing from Tencent. The approach is positioned as a tool for 3D reconstruction and video synthesis rather than an immediate replacement for video games or full 3D engines.
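As a quick way to read the memory figures above, the following sketch classifies a GPU by its memory size. The thresholds come from the article; the function name `memory_tier` and the tier labels are hypothetical and are not part of any Tencent tooling.

```python
MIN_GB_540P = 60     # figure Tencent cites for 540p, per the article
RECOMMENDED_GB = 80  # recommended for better results, per the article

def memory_tier(gpu_memory_gb):
    """Classify a GPU's memory against the figures quoted in the article."""
    if gpu_memory_gb >= RECOMMENDED_GB:
        return "recommended"
    if gpu_memory_gb >= MIN_GB_540P:
        return "minimum (540p)"
    return "insufficient"

for size_gb in (24, 64, 80):
    print(size_gb, "GB:", memory_tier(size_gb))
```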

Arstechnica
