We introduce Shape Tokens, a 3D representation that is continuous, compact, and easy to integrate into machine learning models. Shape Tokens serve as conditioning vectors, representing shape information within a 3D flow-matching model. This flow-matching model is trained to approximate probability density functions corresponding to delta functions concentrated on the surfaces of 3D shapes. By incorporating Shape Tokens into various machine learning models, we can generate new shapes, convert images to 3D, align 3D shapes with text and images, and render shapes directly at variable, user-specified resolutions. Additionally, Shape Tokens enable a systematic analysis of geometric properties, including normals, density, and deformation fields. Across tasks and experiments, the use of Shape Tokens demonstrates strong performance compared to existing baselines.
Figure 1: Our Shape Tokens representation can be readily used as input / output to machine learning models in various applications, including single-image-to-3D (left), neural rendering of normal maps (top right) and 3D-CLIP alignment (bottom right). The resulting models achieve strong performance compared to baselines for individual tasks.
Video 1: The video shows our single-image-to-3D point cloud results. The input images show unseen objects from the Objaverse test set. Each clip first shows the input image, then the generated point cloud. [Credits]
Video 2: From the same unseen input image, we generate multiple point clouds independently. [Credits]
Figure 2: Overview of our architecture. (Left) We model a 3D shape as a probability density function that is concentrated on the surface, forming a delta function in 3D. (Right) Our tokenizer uses cross attention to aggregate information from the point cloud sampled on the shape into Shape Tokens. The velocity estimator uses only cross attention and an MLP, so the velocity of each point is computed independently of the other points.
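The two components in Figure 2 can be illustrated with a small NumPy sketch. It is not the released model: the weights are random, there is a single attention head, and every name and size (`init_params`, `tokenize`, `velocity`, `d=16`, `n_tokens=8`) is a hypothetical choice made for illustration. It only shows the data flow: learned queries cross-attend over the point cloud to produce tokens, and each query point cross-attends over those tokens, with no attention between points.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, context, Wq, Wk, Wv):
    """Single-head cross attention: every query row attends over all context rows."""
    q, k, v = queries @ Wq, context @ Wk, context @ Wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def init_params(rng, d=16, n_tokens=8):
    """Random stand-ins for trained weights (hypothetical sizes)."""
    def w(*shape):
        return rng.normal(scale=shape[0] ** -0.5, size=shape)
    return {
        "embed": w(3, d),                        # xyz -> point feature
        "queries": w(n_tokens, d),               # learned token queries
        "tok_attn": (w(d, d), w(d, d), w(d, d)),
        "x_embed": w(4, d),                      # (x, y, z, t) -> query feature
        "vel_attn": (w(d, d), w(d, d), w(d, d)),
        "mlp1": w(d, d),
        "mlp2": w(d, 3),
    }

def tokenize(points, params):
    """Aggregate an (n, 3) point cloud into (n_tokens, d) tokens."""
    feats = points @ params["embed"]
    return cross_attention(params["queries"], feats, *params["tok_attn"])

def velocity(x, t, tokens, params):
    """Per-point velocity at time t, conditioned on the tokens only."""
    xt = np.concatenate([x, np.full((x.shape[0], 1), t)], axis=1)
    q = xt @ params["x_embed"]
    h = q + cross_attention(q, tokens, *params["vel_attn"])  # points attend to tokens
    h = np.tanh(h @ params["mlp1"])                          # row-wise MLP
    return h @ params["mlp2"]

rng = np.random.default_rng(0)
params = init_params(rng)
points = rng.normal(size=(256, 3))   # stand-in for points sampled on a shape
tokens = tokenize(points, params)
x = rng.normal(size=(32, 3))         # query locations
v_all = velocity(x, 0.3, tokens, params)
```

Because the points never attend to each other, evaluating `velocity` on a subset of `x` gives the same rows as evaluating it on the full batch. In this sketch the tokens are also unchanged when the input points are reordered, since the attention sums over the whole point set.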
Figure 3: Reconstruction, densification, and normal estimation of unseen point clouds from the GSO dataset. For each row, we take a point cloud of 16,384 points (xyz only), compute its Shape Tokens, and draw 262,144 i.i.d. samples from the resulting p(x|s). The columns render the input and the sampled point clouds from different viewpoints. As indicated by the labels in parentheses, we color the input points by their xyz coordinates, and the sampled points by the uvw coordinates of their initial noise or, in the last two columns, by their estimated normals. Note that we do not provide normals as input to the shape tokenizer.
Video 4: We compute Shape Tokens on input point clouds (16,384 points) from unseen Google Scanned Objects. Then we sample 16x more points (262,144 points). The video shows the uvw-to-xyz trajectory of the flow-matching sampling process, i.e., the ODE trajectory. We color the points with their initial position in the noise space (uvw). [Credits]
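The sampling process in this video can be sketched as Euler integration of an ODE, run independently for each noise point. The velocity field below is a hypothetical stand-in, not the learned, token-conditioned estimator: it moves each point in a straight line toward the unit sphere, which plays the role of the shape surface. The function names and the step count are likewise assumptions.

```python
import math
import random

def toy_velocity(x, t):
    """Hypothetical stand-in for the learned velocity: moves x radially
    so that it reaches the unit sphere at t = 1."""
    r = math.sqrt(sum(c * c for c in x))
    return [(c / r - c) / (1.0 - t) for c in x]

def sample_point(uvw, n_steps=50):
    """Euler-integrate dx/dt = v(x, t) from t = 0 (noise, uvw) to t = 1 (data, xyz)."""
    x = list(uvw)
    h = 1.0 / n_steps
    for k in range(n_steps):
        t = k * h                      # t never reaches 1, so 1 - t stays positive
        v = toy_velocity(x, t)
        x = [c + h * vc for c, vc in zip(x, v)]
    return x

random.seed(0)
# Each sample is independent, so the number of output points is a free choice.
noise = [[random.gauss(0.0, 1.0) for _ in range(3)] for _ in range(1000)]
points = [sample_point(u) for u in noise]
radii = [math.sqrt(sum(c * c for c in p)) for p in points]
```

Since every point is integrated on its own, drawing more noise points yields a denser point cloud on the same surface, which is the sense in which the video samples 16x more points than the input.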
Figure 4: The ODE integration trajectory defines a mapping from xyz (data) to uvw (noise).
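A minimal sketch of why the trajectory defines a mapping in both directions: integrating a smooth velocity field forward in time carries a noise point (uvw) to a data point (xyz), and integrating the same field backward in time returns to the starting point. The field `smooth_velocity` is an arbitrary hypothetical example (a contraction plus a rotation), not the learned field, and the fixed-step RK4 integrator is an assumption made for illustration.

```python
def smooth_velocity(x, t):
    """Hypothetical smooth field: contraction toward a center plus rotation about z."""
    cx, cy, cz = 0.5, -0.2, 0.1
    dx, dy, dz = x[0] - cx, x[1] - cy, x[2] - cz
    return [-dx - 2.0 * dy, 2.0 * dx - dy, -0.5 * dz]

def rk4_step(f, x, t, h):
    """One classical fourth-order Runge-Kutta step of size h (h may be negative)."""
    k1 = f(x, t)
    k2 = f([a + 0.5 * h * b for a, b in zip(x, k1)], t + 0.5 * h)
    k3 = f([a + 0.5 * h * b for a, b in zip(x, k2)], t + 0.5 * h)
    k4 = f([a + h * b for a, b in zip(x, k3)], t + h)
    return [a + h / 6.0 * (b1 + 2.0 * b2 + 2.0 * b3 + b4)
            for a, b1, b2, b3, b4 in zip(x, k1, k2, k3, k4)]

def integrate(f, x, t0, t1, n_steps=200):
    """Integrate dx/dt = f(x, t) from t0 to t1 with fixed-size RK4 steps."""
    h = (t1 - t0) / n_steps
    t = t0
    for _ in range(n_steps):
        x = rk4_step(f, x, t, h)
        t += h
    return x

uvw = [0.3, -1.2, 0.8]
xyz = integrate(smooth_velocity, uvw, 0.0, 1.0)        # noise -> data
uvw_back = integrate(smooth_velocity, xyz, 1.0, 0.0)   # data -> noise
round_trip_error = max(abs(a - b) for a, b in zip(uvw, uvw_back))
```

In this toy setting the round trip recovers the starting point up to integration error, so each xyz location can be labeled with the uvw coordinate it came from.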
Video 3: The video shows recent single-image-to-3D methods on Google Scanned Objects, which are unseen by all methods. From left to right:
Input image
Splatter-image (CVPR 2024): trained on Objaverse
Point-e (2022): trained on several million proprietary 3D meshes
Make-a-shape (ICML 2024): trained on 18 datasets, including Objaverse
Ours: trained on Objaverse
Note that this video is not intended to compare individual methods: the models differ in their training data (e.g., Point-e was trained on proprietary 3D meshes) and in their mechanisms (e.g., Splatter-image is not a generative model, and our method assumes a known camera model). We provide the results for the viewer's reference. [Credits]
Video 5: The video shows neural rendering results on unseen point clouds. From Shape Tokens, we use a neural network to estimate each ray's intersection point and surface normal independently. From left to right:
Ground-truth surface normal
Pointersect (CVPR 2023)
Ours
[Credits]
Mesh/Image Credits: Google Scanned Objects, fedomo.ru, Jacob.Elhatmi, WrenArt, undeadfae, Monicag97, STK_produktion, Andi R, xabi, th_jabba, johnnokomis, LasquetiSpice, AdiXXioN, taplinhvip111, Stolmark, Koppany.IDK, vetorprotensao, JacksonSanders, remdwaas, GRAPHTEC AMERICA, iiircha, despinozavi, AstrumProjects, asleshka, ulmsklv, S.Duce, idcim, Darkkostas25, CREATRBOI, steam2020, fedomo.ru, AnirudhRao, 3DFoxHound, pattarrian, katienixdesigns, icepacha, A109082012, RyanCrosby, Armen Gevorgyan, EnjoyLife_Tlt, Fong Chen, WHA Arquitectos, andreagonzalez28, YouSaveTime, Cutestormy, amy3d, daand, EfrenR, Poppy, MARTINICE GROUP, julianChee, Whatsername, Stuart, danielleclark, redkaratz, LuDiChRiS, mbilalsiddique1, Frybrix, defnotdan, invisiprim, Brent Loncher, MrMaxICT, Stevie_66, Jesse Van Norman, WuhuAirline, anyaachan, Lustron.ru, КУКАЛЕВ, Maxmalow, Karolina K Bieńkowska, Steel Frame Solutions Limited, James Robson, tepapalearninglab, exhibitbook, Christopher Cox, apoiocad, Padraig Daly, CurveCreativeStudio, DennisGray, grantbowlds, YouniqueĪdeaStudio, nobodyroo, dinomaster, pattarrian, rodrigo.ferrada, tamaliteitor123, George B, Csaba Baity (tsabszy), tim.a.schmitz, romane_bouverot, RPG_Engineer, rilisjr, DJMaesen, agglover, Adrian Carter, mohamedsuspeito, Kevin Bond, faizn0rdin, SpaceCowBoy, Giravolt, NukedGames, bhrf, mscla1r3, ScannerDev, Vikrama Raghuraman, NoobiePie, prostair.pl, Rzyas, Phil Gosch, gFiamma, pahlevidaffa, Onironauta digital, pixelsquare, SketchingSushi, Mateus Schwaab, archmixes, jacob_kenndey, lidija.simo, Jessica Peterson, Ltcolscotty, 3Dystopia, Vincent Laberge, frdifrn, Frédérick Pagé, camlaneve, Matt, IronEqual, Tursito, Davidk, Mrs.X.A.YarnArt, prostair.pl, ChrisLee, guseu, Guilhermino, dieterreinert, Mattyew, natalimedeiros, leopro, Trappemakeren, beehn, alisachen69, Chrifuf, cncbrasil, zuzana vajdova, nguyenlouis32, DarksProducer, globalshizaku, louayleo, semmert179, naruemol.pholnuangma, Eric Haines, 3DHA, Nick_Sherman, chaosexcell, 
ssarinareza, aveli.ladva, Tomas Rubianes, RainerWahnsinn, Lucas Jaenisch, cs_adam, trinityscsp, a109082026, JasseeNFT, Cowdi, Kisielev Mikhail, kay Quobad, secretariatep, me16019, scailman, Stichting Consortium Beroepsonderwijs, fedomo.ru, PatelDev, bipolarbear, Emm (Scenario), De Oliveira M., Наруто, Keita-sama, RodierGabrielle, mizuhi, shughes, Gregory Khodyrev, millerj449, Marko31, David_Holiday, edouard.angebault, fedomo.ru, Artem Shamsuarov, Alan Grice Staircase Co Ltd, THESTIG03, vamsikrishna.v, Dundee Howff Conservation Group, sinhoroto, jia100, 10668285, Born_Canadian, jashma82, aki.karppinen, DarkAaron999, Luckster, julius.j.bib, trolosqlfod, RBG_illustrations, fedomo.ru, MOHMAX1, jamesdeantv1, moxmoin, Adrian21, andrea bocchini geometra, Re3xyyz, Binkley-Spacetrucker, FeralMan, unownlord, pigfinite, duperonvincent, ayekerik, 140813, antonio.a.longoria, cyber0063, Mateja Veljkovic, Vonka Stairs Ltd, breezeca, kishi, 97jana, Sogomonyan_Vaagn, peachybunny, gb.prof.69, milen.margaryan2003, nguyenhuydang, andysmiles4games, Aorie, jonamanz9673, mommy long legs, buckygaming2019, gwen.domingo, PointXX, Lukas Guhse, arakiminoru, Tatiana Sumarokova, potaato, Lustron.ru, jhseok8927, Xillute | Dev, re1monsen, c4n, Ceat, joseph.terronez, matousekfoto, Max Wittig, rltw, lsbergin, KIΣITO, Aiden Huxley, 3Dystopia, MartyUkovGBS, Jamie Rose, Mihail.Burduja, ashpatz845, Schack-Trapper, brian.h.moyer, Excel Stairs Ltd, Behets, Noemi.Mancilla.Serrano, madison319478, Drake, xeratdragons, timpugh44, GSMRF, Lauren Hasegawa, Ca7chi, dewathoem, schaffsp, newfields-3dprinting, Dikart, MariaMam, Micayla Spiros, silvinomc00, Neut2000, Orie J. Braun, hafsa.ishtiaq97, Robwaah007, shakiller, newfields-3dprinting, -Slash-, Saumleid, DreamSail Games - Graham, Jingbari, sualogo3d, maypassamon, Uğur Yakışık, Caitlin, LynSalvador, lanvalond, TheDesigner, e90r96, guilherme.vinicius, Lustron.ru, ZOMBIEFOLIFE, TroyMay21, Qubx.3D