Tag: 3d

360 Reconstruction From a Single Image Using Space Carved Outpainting (19 Sep 2023)

This is my reading note for 360 Reconstruction From a Single Image Using Space Carved Outpainting. This paper proposes a method for 3D reconstruction from a single image. To this end, it represents the 3D object with a NeRF and iteratively updates the NeRF by rendering new views using DreamBooth.
OmnimatteRF Robust Omnimatte with 3D Background Modeling (17 Sep 2023)

This is my reading note on OmnimatteRF: Robust Omnimatte with 3D Background Modeling. The paper proposes a method for video matting. It models the background as a 3D NeRF and each foreground object as a 2D image.
Towards Practical Capture of High-Fidelity Relightable Avatars (15 Sep 2023)

This is my reading note for Towards Practical Capture of High-Fidelity Relightable Avatars. This paper proposes a method to relight a mixture volume representation of the face. The major contribution is to explicitly enforce the linearity of lighting in the network.
Dynamic Mesh-Aware Radiance Fields (08 Sep 2023)

This is my reading note on Dynamic Mesh-Aware Radiance Fields. This paper proposes a method for rendering a NeRF and a mesh together. To do that, it modifies the ray tracing procedure. To handle occlusion and shadows, an SDF is used to represent the surface of the NeRF, and the light source is estimated from the NeRF.
Neuralangelo High-Fidelity Neural Surface Reconstruction (03 Sep 2023)

This is my reading note on Neuralangelo: High-Fidelity Neural Surface Reconstruction. This paper proposes a method to reconstruct 3D surfaces in very high detail. The proposed method is based on two improvements: 1) using numerical gradients instead of analytical ones to remove non-locality, and 2) using multi-resolution Instant NGP to improve details from coarse to fine.
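
To make the first point concrete, here is a minimal sketch of a numerical (central finite-difference) gradient of a signed distance function. It is not the paper's implementation: `sphere_sdf` is a hypothetical stand-in for the learned SDF, and the step size `eps` is an arbitrary choice for illustration. The point is only that each gradient component is computed from two samples `eps` apart, so it depends on more than a single location.

```python
import math


def sphere_sdf(p, radius=1.0):
    """Signed distance to a sphere at the origin (stand-in for a learned SDF)."""
    return math.sqrt(sum(c * c for c in p)) - radius


def numerical_gradient(f, p, eps=1e-3):
    """Central finite-difference gradient of f at point p (3D)."""
    grad = []
    for i in range(3):
        hi = list(p)
        lo = list(p)
        hi[i] += eps  # sample slightly ahead along axis i
        lo[i] -= eps  # sample slightly behind along axis i
        grad.append((f(hi) - f(lo)) / (2.0 * eps))
    return grad


# For a sphere SDF the gradient is the unit vector pointing away from the center.
g = numerical_gradient(sphere_sdf, [0.0, 2.0, 0.0])
print(g)
```

A larger `eps` spreads each gradient over a wider neighborhood, while a smaller one approaches the analytical gradient.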
DreamFusion Text-to-3D using 2D Diffusion (01 Sep 2023)

This is my reading note on DreamFusion: Text-to-3D using 2D Diffusion. This paper proposes a method (score distillation sampling, or SDS) to distill a pre-trained text-to-image diffusion model into a 3D model. The 3D model, which is based on NeRF, is trained per text prompt.
Efficient Geometry-aware 3D Generative Adversarial Networks (27 Aug 2023)

This is my reading note on Efficient Geometry-aware 3D Generative Adversarial Networks. EG3D proposes a 2D-to-3D generation method based on StyleGAN and a triplane-based NeRF. The high-level idea is to use StyleGAN to generate a triplane, which is then rendered into images. The rendered images are then discriminated against the input images at two resolutions. The camera pose is also required to generate the triplane.
ProlificDreamer High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation (20 Aug 2023)

This is my reading note on ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation. This method proposes variational score distillation to replace score distillation sampling and improve the details of text-to-image or text-to-3D models. Project page: https://ml.cs.tsinghua.edu.cn/prolificdreamer/
NeuralField-LDM Scene Generation with Hierarchical Latent Diffusion Models (17 Aug 2023)

This is my reading note on NeuralField-LDM: Scene Generation with Hierarchical Latent Diffusion Models. It trains an auto-encoder to project RGB images of a scene, together with their camera poses, into a latent space (voxel NeRF). It uses three levels of latents to represent the scene and then uses a hierarchical latent diffusion model to model them.
FineRecon Depth-aware Feed-forward Network for Detailed 3D Reconstruction (14 Aug 2023)

This is my reading note for FineRecon: Depth-aware Feed-forward Network for Detailed 3D Reconstruction. It proposes a high-detail surface reconstruction algorithm based on a voxel volume and multi-view geometry. There are two major novelties: it improves reconstruction accuracy using a novel MVS depth-guidance strategy, and it enables the reconstruction of sub-voxel detail with a novel TSDF prediction architecture that can be queried at any 3D point, using point back-projected fine-grained image features.
Efficient Geometry-aware 3D Generative Adversarial Networks (25 Jul 2023)

This is my reading note for Efficient Geometry-aware 3D Generative Adversarial Networks. The paper proposes a 2D-to-3D generation method based on StyleGAN and a triplane-based NeRF. The high-level idea is to use StyleGAN to generate a triplane, which is then rendered into images. The rendered images are then discriminated against the input images at two resolutions. The camera pose is also required to generate the triplane.
Rotation in 3D (27 Sep 2022)

This is my note on rotation in 3D space. There are many different ways of representing a rotation in 3D space, e.g., a 3x3 rotation matrix, Euler angles (pitch, yaw and roll), the Rodrigues axis-angle representation and quaternions. The relationships and conversions between those representations are described below. You could also use scipy.spatial.transform.Rotation to convert between representations.
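
As a small illustration of such conversions, the sketch below goes from axis-angle to a unit quaternion and then to a 3x3 rotation matrix using only the Python standard library. The function names are made up for this example, and the quaternion is stored scalar-last as (x, y, z, w), which is an assumption of this sketch rather than something the note prescribes.

```python
import math


def axis_angle_to_quaternion(axis, angle):
    """Unit quaternion (x, y, z, w) for a rotation of `angle` radians about `axis`."""
    norm = math.sqrt(sum(a * a for a in axis))
    x, y, z = (a / norm for a in axis)  # normalize the rotation axis
    s = math.sin(angle / 2.0)
    return (x * s, y * s, z * s, math.cos(angle / 2.0))


def quaternion_to_matrix(q):
    """3x3 rotation matrix (list of rows) from a unit quaternion (x, y, z, w)."""
    x, y, z, w = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ]


def rotate(matrix, v):
    """Apply a 3x3 rotation matrix to a 3D vector."""
    return [sum(matrix[i][j] * v[j] for j in range(3)) for i in range(3)]


# Rotate the x-axis by 90 degrees about the z-axis; the result is the y-axis
# up to floating-point error.
q = axis_angle_to_quaternion((0.0, 0.0, 1.0), math.pi / 2)
R = quaternion_to_matrix(q)
v = rotate(R, [1.0, 0.0, 0.0])
print(v)
```

In practice the same round trip can be done with scipy.spatial.transform.Rotation (for example `Rotation.from_rotvec(...)` followed by `as_quat()` or `as_matrix()`), which also uses the scalar-last quaternion order by default.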
GANFIT Generative Adversarial Network Fitting for High Fidelity 3D Face Reconstruction (01 May 2021)

This is my reading note for GANFIT: Generative Adversarial Network Fitting for High Fidelity 3D Face Reconstruction. The code is available in GANFit. GANFit reconstructs high-quality texture and geometry from a single image with precise identity recovery. To do this, it utilizes GANs to train a very powerful generator of facial texture in UV space. Then, it revisits the original 3D Morphable Model (3DMM) fitting approaches from a new perspective, using non-linear optimization to find the latent parameters that best reconstruct the test image. It optimizes the parameters with the supervision of pretrained deep identity features through an end-to-end differentiable framework.
Avatarme Realistically renderable 3d facial reconstruction in-The-wild (28 Apr 2021)

This is my reading note for Avatarme: Realistically renderable 3d facial reconstruction ‘in-The-wild’, which was published in CVPR 2020; the code is available on GitHub. Avatarme aims to reconstruct photorealistic 3D faces from a single “in-the-wild” image with an increasing level of detail. It can generate 4K by 6K-resolution 3D faces from a single low-resolution image that, for the first time, bridges the uncanny valley.
RingNet-Learning to Regress 3D Face Shape and Expression from an Image without 3D Supervision (23 Apr 2021)

This is my reading note for Learning to Regress 3D Face Shape and Expression from an Image without 3D Supervision (code). The paper is also called RingNet and was published in CVPR 2019. The paper addresses the problem of 3D face reconstruction from a single 2D image, and its training requires no 3D ground truth. To this end, RingNet leverages multiple images of a person and automatically detected 2D face features. It uses a novel loss that encourages the face shape to be similar when the identity is the same and different for different people. This is based on the observation that an individual’s face shape is constant across images, regardless of expression, pose, lighting, etc.
Learning an Animatable Detailed 3D Face Model from In-The-Wild Images (18 Apr 2021)

This is my reading note for the paper Learning an Animatable Detailed 3D Face Model from In-The-Wild Images; the code is available in DECA. The paper proposes a method for producing an animatable, detailed 3D face model from uncontrolled image datasets. Here, uncontrolled means that no controlled lighting or camera view angles are required; however, the dataset does need to contain multiple images for each subject.
My Paper Reading List for 3D Face Reconstructions (20 Mar 2021)

Here is my paper reading list for 3D face reconstruction, based on Papers with Code. 3D face reconstruction is the task of reconstructing a face from an image into a 3D form (or mesh). Most of the papers on the list are from 2017 to 2020.
Visual Localization via Deep Learning (13 Apr 2019)

Visual localization aims to estimate the location, which is usually the coordinate (orientation and position) in the world coordinate system, given one or more images.