#AI #Metaverse #NFT #Tech news #Web3

Shap-E: Creating 3D Objects from Prompts

Shap-E is an innovative artificial intelligence tool developed by OpenAI. Just as Dall-E generates images from text prompts, Shap-E creates 3D models from your prompts. In this article, we will cover everything you need to know about Shap-E and how to use it effectively.

Introduction

In recent years, OpenAI has revolutionized the world with its GPT-based AI models. From the versatile text generation of ChatGPT to the image creation of Dall-E 2, these models have pushed the boundaries of AI capabilities. Now, OpenAI introduces Shap-E, an AI model capable of generating 3D objects that can be opened in Microsoft Paint 3D or converted into STL files for 3D printers. This new tool holds great potential for architects, interior designers, video game developers, as well as the film and animation industry.

Three Shap-E Demos

Once you have installed Shap-E, you can access it through Jupyter Notebook. It provides three sample notebooks: “text-to-3D,” “image-to-3D,” and “encode-model.” These notebooks allow you to visualize and execute code snippets to see the resulting outputs.

Text-to-3D

The “text-to-3D” notebook, although still in its early stages, generates 3D objects from a text prompt. It produces two types of output: animated color GIFs that can be viewed in a web browser, and monochrome PLY files that can be opened with programs like Paint 3D.

By default, the prompt is “a shark,” which produces four 64×64 GIFs. You can modify the code to increase the resolution or try other examples suggested by OpenAI, such as “an airplane that looks like a banana.” Results with your own prompts may vary.

Image-to-3D

The “image-to-3D” script allows you to convert existing 2D image files into 3D PLY files. As an example, OpenAI demonstrates converting a Corgi illustration into a low-resolution rotating GIF file. You can modify the code to generate a 3D PLY file that can be opened with Paint 3D.

While users can try feeding the script with their own images, the results may be less convincing. It is recommended to use PNG format for the input image.

Configuration Requirements for Shap-E

Running Shap-E requires a powerful configuration. With an RTX 3080 GPU and a Ryzen 9 5900X CPU, rendering a complete output takes approximately 5 minutes. However, using a laptop like the Asus ROG Strix Scar 18 with an RTX 4090 GPU and an Intel Core i9-13980HX CPU can reduce the rendering time to two to three minutes.

For older machines with an 8th Gen Intel U CPU and integrated graphics, rendering a single output can take several hours. Therefore, it is advisable to use a PC with a latest-generation Nvidia GPU. Additionally, the initial execution of a script involves downloading models, which can take some time due to their large size (2-3 GB).
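
If you are not sure whether your machine will use the GPU or fall back to the CPU, a quick check you can run once PyTorch is installed (as described in the next section) looks like this:

import torch

# Reports whether PyTorch can see a CUDA-capable Nvidia GPU.
print('CUDA available:', torch.cuda.is_available())
if torch.cuda.is_available():
    print('GPU:', torch.cuda.get_device_name(0))
else:
    print('No GPU detected: Shap-E will fall back to the CPU, which is much slower.')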

Installation and Usage of Shap-E

The Shap-E model is freely available on GitHub and can be executed locally on your PC. Once you have downloaded all the required files, there is no need to be connected to the internet. Unlike other tools provided by the company, Shap-E does not require an OpenAI API key, so you will not be charged for its usage.

However, installing and using Shap-E is not as straightforward as using Dall-E. OpenAI provides little guidance beyond the Python pip install command; it neither lists the required dependencies nor warns that their latest versions may not work correctly.

To install and run Shap-E on Windows, it is recommended to work inside WSL2 (Windows Subsystem for Linux) and use Miniconda to create a dedicated Python environment, which avoids most dependency issues. Here are the steps:

  1. Install Miniconda or Anaconda on Linux. You can find the file and instructions on the Conda website.

  2. Create a Conda environment named “shap-e” with Python 3.9 or another installed version using the command: conda create -n shap-e python=3.9.

  3. Activate the environment with the command: conda activate shap-e.

  4. Install PyTorch. If you have an Nvidia GPU, use the command: conda install pytorch=1.13.0 torchvision pytorch-cuda=11.6 -c pytorch -c nvidia. If you don’t have an Nvidia GPU, use the CPU-based installation command: conda install pytorch torchvision torchaudio cpuonly -c pytorch. Note that CPU-based processing for 3D operations can be extremely slow.

  5. Install PyTorch3D with the command: pip install "git+https://github.com/facebookresearch/pytorch3d.git". If you encounter a CUDA error, try running sudo apt install nvidia-cuda-dev and repeat the step.

  6. Install Jupyter Notebook using Conda with the command: conda install -c anaconda jupyter.

  7. Clone the Shap-E code repository using the command: git clone https://github.com/openai/shap-e.

  8. Navigate to the cloned “shap-e” folder and install it in editable mode with the commands: cd shap-e followed by pip install -e . (note the space before the final dot).

  9. Launch Jupyter Notebook using the command: jupyter notebook and open the provided localhost URL, which will be something like http://localhost:8888?token= followed by a token.
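
Before opening the notebooks, it is worth confirming that the main packages import cleanly. A minimal sanity check, run with python inside the activated “shap-e” environment (assuming the steps above completed without errors):

# Quick import check for the freshly created environment.
import torch
import pytorch3d
import shap_e

print('torch', torch.__version__, '| CUDA available:', torch.cuda.is_available())
print('pytorch3d', pytorch3d.__version__)
print('shap_e package imported successfully')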

Testing the Text-to-3D Demo

To test the text-to-3D demo, navigate to the shap-e/examples directory and double-click the sample_text_to_3d.ipynb file. A notebook will open with different code sections. Highlight each section and click the “Run” button to execute it, waiting for each process to complete before moving to the next section.

This initial process may take some time as it involves downloading several large models to your local hard drive. Once completed, you should see four 3D shark models in your browser and four .ply files in the examples folder. You can open the files using programs like Paint 3D or convert them to STL files using an online converter.
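
If you prefer to convert the PLY files to STL locally instead of using an online converter, a third-party library such as trimesh can do it. This is an optional extra dependency (pip install trimesh), not part of Shap-E, and the file name below is only illustrative:

import trimesh

# Load one of the generated PLY meshes and re-export it as STL.
mesh = trimesh.load('example_mesh_0.ply')   # illustrative file name
mesh.export('example_mesh_0.stl')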

If you wish to modify the prompt and try again, simply refresh the browser and change “a shark” to something else in the prompt section. You can also increase the image resolution by changing the size from 64 to a higher value.
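
For reference, here is a minimal sketch of what the notebook's cells do, based on OpenAI's sample_text_to_3d.ipynb; the argument names follow the sample notebook and may differ slightly between repository versions. The prompt, batch_size and size variables are the ones you would edit:

import torch
from shap_e.diffusion.sample import sample_latents
from shap_e.diffusion.gaussian_diffusion import diffusion_from_config
from shap_e.models.download import load_model, load_config
from shap_e.util.notebooks import create_pan_cameras, decode_latent_images, decode_latent_mesh

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Load the renderer ("transmitter"), the text-conditional model and the diffusion config.
xm = load_model('transmitter', device=device)
model = load_model('text300M', device=device)
diffusion = diffusion_from_config(load_config('diffusion'))

prompt = 'a shark'        # change this to try other prompts
batch_size = 4            # number of variations generated per run
guidance_scale = 15.0

latents = sample_latents(
    batch_size=batch_size,
    model=model,
    diffusion=diffusion,
    guidance_scale=guidance_scale,
    model_kwargs=dict(texts=[prompt] * batch_size),
    progress=True,
    clip_denoised=True,
    use_fp16=True,         # as in the sample notebook; set False on CPU
    use_karras=True,
    karras_steps=64,
    sigma_min=1e-3,
    sigma_max=160,
    s_churn=0,
)

size = 64                 # raise to 128 or 256 for higher-resolution renders
cameras = create_pan_cameras(size, device)
for i, latent in enumerate(latents):
    # Render a turntable of frames (the notebook shows these as an animated GIF widget).
    images = decode_latent_images(xm, latent, cameras, rendering_mode='nerf')
    # Save the mesh as a PLY file, as in the notebook's final cell.
    with open(f'example_mesh_{i}.ply', 'wb') as f:
        decode_latent_mesh(xm, latent).tri_mesh().write_ply(f)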

Testing the Image-to-3D Demo

To try the image-to-3D script, double-click on the sample_image_to_3d.ipynb file in the examples folder. Similar to the previous demo, highlight each section and click “Run” to execute them.

By default, you will see four small Corgi images. It is recommended to add the following code to the last section of the notebook to generate PLY files in addition to GIF files:

from shap_e.util.notebooks import decode_latent_mesh

# Write each generated latent out as a PLY mesh file.
for i, latent in enumerate(latents):
    with open(f'example_mesh_{i}.ply', 'wb') as f:
        decode_latent_mesh(xm, latent).tri_mesh().write_ply(f)

Remember to modify the image location in section 3 if you want to use a different image. It is also recommended to change batch_size to 1 to generate a single output, and you can raise the size to 128 or 256 for higher resolution.
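
Putting those changes together, here is a minimal sketch of the image-to-3D pipeline based on OpenAI's sample_image_to_3d.ipynb, with batch_size set to 1 and a placeholder image path (your_image.png); as above, the exact argument names may differ between repository versions:

import torch
from shap_e.diffusion.sample import sample_latents
from shap_e.diffusion.gaussian_diffusion import diffusion_from_config
from shap_e.models.download import load_model, load_config
from shap_e.util.image_util import load_image
from shap_e.util.notebooks import create_pan_cameras, decode_latent_images, decode_latent_mesh

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Load the renderer and the image-conditional model.
xm = load_model('transmitter', device=device)
model = load_model('image300M', device=device)
diffusion = diffusion_from_config(load_config('diffusion'))

batch_size = 1                           # a single output instead of four
guidance_scale = 3.0
image = load_image('your_image.png')     # placeholder path; PNG input is recommended

latents = sample_latents(
    batch_size=batch_size,
    model=model,
    diffusion=diffusion,
    guidance_scale=guidance_scale,
    model_kwargs=dict(images=[image] * batch_size),
    progress=True,
    clip_denoised=True,
    use_fp16=True,
    use_karras=True,
    karras_steps=64,
    sigma_min=1e-3,
    sigma_max=160,
    s_churn=0,
)

size = 128                               # preview render resolution
cameras = create_pan_cameras(size, device)
images = decode_latent_images(xm, latents[0], cameras, rendering_mode='nerf')

# Export the result as a PLY mesh, as in the snippet above.
with open('image_mesh_0.ply', 'wb') as f:
    decode_latent_mesh(xm, latents[0]).tri_mesh().write_ply(f)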

Finally, you can also run Shap-E outside Jupyter: create a Python script along the lines of the sketch below, save it as text-to-3d.py (or any other name), and execute it with python text-to-3d.py, entering your prompt when the program asks for it. You will receive PLY files as output, but no GIF. If you are proficient in Python, feel free to modify the script to suit your needs.
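
Here is a minimal sketch of what such a script could look like, reusing the same shap_e modules as the sample notebooks (the sampling settings mirror the text-to-3D notebook, and the output file names are just examples):

# text-to-3d.py -- prompt-to-PLY sketch; writes mesh files only, no GIFs.
import torch
from shap_e.diffusion.sample import sample_latents
from shap_e.diffusion.gaussian_diffusion import diffusion_from_config
from shap_e.models.download import load_model, load_config
from shap_e.util.notebooks import decode_latent_mesh

def main():
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    xm = load_model('transmitter', device=device)
    model = load_model('text300M', device=device)
    diffusion = diffusion_from_config(load_config('diffusion'))

    prompt = input('Enter your prompt: ')
    batch_size = 1

    latents = sample_latents(
        batch_size=batch_size,
        model=model,
        diffusion=diffusion,
        guidance_scale=15.0,
        model_kwargs=dict(texts=[prompt] * batch_size),
        progress=True,
        clip_denoised=True,
        use_fp16=torch.cuda.is_available(),   # fp16 only makes sense on a GPU
        use_karras=True,
        karras_steps=64,
        sigma_min=1e-3,
        sigma_max=160,
        s_churn=0,
    )

    # Write each latent out as a PLY mesh named after its index.
    for i, latent in enumerate(latents):
        with open(f'output_{i}.ply', 'wb') as f:
            decode_latent_mesh(xm, latent).tri_mesh().write_ply(f)
        print(f'Saved output_{i}.ply')

if __name__ == '__main__':
    main()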

Conclusion

In conclusion, Shap-E, developed by OpenAI, brings exciting possibilities for generating 3D objects based on prompts. It offers three demos: text-to-3D, image-to-3D, and encode-model. While Shap-E is still in its early stages and has some limitations, it showcases the potential for creating 3D models through text and images. By following the installation and usage instructions, you can explore Shap-E’s capabilities and leverage them for various applications in architecture, interior design, gaming, and more.


FAQs

  1. Is Shap-E available for free?

    • Yes, Shap-E is available for free on GitHub, and there is no need for an OpenAI API key or any associated charges.
  2. Can I use Shap-E on older machines with limited specifications?

    • Shap-E requires a powerful configuration, particularly a latest-generation Nvidia GPU, for efficient rendering. Older machines with integrated graphics may result in significantly longer rendering times.
  3. How can I modify the code to generate higher-resolution outputs?

    • In the text-to-3D demo, you can increase the image resolution by changing the size parameter. Similarly, in the image-to-3D demo, you can adjust the size to achieve higher-resolution results.
  4. Are there any alternatives to Shap-E for generating 3D objects?

    • While Shap-E is a unique tool, there are other AI-based software and libraries available that can assist in generating 3D objects from various inputs. However, each tool has its own features and capabilities.
  5. Can Shap-E be used for commercial purposes?

    • Yes, Shap-E can be used for commercial purposes. As an open-source tool, it provides flexibility for users to integrate it into their projects and leverage its 3D object generation capabilities for commercial applications.
  6. Does Shap-E support languages other than English?

    • Shap-E primarily operates on text prompts, so it is compatible with any language that can be expressed as text input. However, it’s worth noting that the quality of results may vary depending on the language and the availability of training data.
  7. Can Shap-E generate complex and highly detailed 3D objects?

    • Shap-E is still a developing tool, and its current version may have limitations in terms of the level of detail and complexity of the generated 3D objects. However, as the technology evolves, it is expected to improve and offer more advanced capabilities.
  8. Are there any additional resources or tutorials available for learning Shap-E?

    • OpenAI provides documentation and resources on their GitHub repository for Shap-E. You can refer to the official documentation, sample notebooks, and community forums for further guidance on installation, usage, and exploring the potential of Shap-E.
  9. Can Shap-E be used with other 3D software or frameworks?

    • Yes, Shap-E can be integrated with other 3D software and frameworks. It provides compatibility with common formats like PLY, which can be opened and manipulated using various 3D software and libraries.
  10. Will OpenAI continue to improve and update Shap-E?

    • OpenAI is dedicated to ongoing research and development in the field of AI. While the specific roadmap for Shap-E’s future updates is not outlined, OpenAI is likely to refine and enhance the capabilities of Shap-E based on user feedback and advancements in the field.
