Robert Cassidy Project 4 #6

Open
wants to merge 8 commits into
base: master
158 changes: 22 additions & 136 deletions README.md
@@ -1,28 +1,15 @@
![AntiAliasing Level 3](https://raw.githubusercontent.com/RTCassidy1/Project4-Rasterizer/master/renders/AALevel3.png)
-------------------------------------------------------------------------------
CIS565: Project 4: CUDA Rasterizer
-------------------------------------------------------------------------------
Fall 2014
-------------------------------------------------------------------------------
Due Monday 10/27/2014 @ 12 PM
-------------------------------------------------------------------------------

-------------------------------------------------------------------------------
NOTE:
-------------------------------------------------------------------------------
This project requires an NVIDIA graphics card with CUDA capability! Any card with CUDA compute capability 1.1 or higher will work fine for this project. For a full list of CUDA capable cards and their compute capability, please consult: http://developer.nvidia.com/cuda/cuda-gpus. If you do not have an NVIDIA graphics card in the machine you are working on, feel free to use any machine in the SIG Lab or in Moore100 labs. All machines in the SIG Lab and Moore100 are equipped with CUDA capable NVIDIA graphics cards. If this too proves to be a problem, please contact Patrick or Karl as soon as possible.

-------------------------------------------------------------------------------
INTRODUCTION:
-------------------------------------------------------------------------------
In this project, you will implement a simplified CUDA based implementation of a standard rasterized graphics pipeline, similar to the OpenGL pipeline. In this project, you will implement vertex shading, primitive assembly, perspective transformation, rasterization, fragment shading, and write the resulting fragments to a framebuffer. More information about the rasterized graphics pipeline can be found in the class slides and in your notes from CIS560.

The basecode provided includes an OBJ loader and much of the mundane I/O and bookkeeping code. The basecode also includes some functions that you may find useful, described below. The core rasterization pipeline is left for you to implement.
This is a simplified CUDA based implementation of a standard rasterized graphics pipeline, similar to the OpenGL pipeline. This project implements vertex shading, primitive assembly, perspective transformation, rasterization, fragment shading, and writes the resulting fragments to a framebuffer.

You MAY NOT use ANY raycasting/raytracing AT ALL in this project, EXCEPT in the fragment shader step. One of the purposes of this project is to see how a rasterization pipeline can generate graphics WITHOUT the need for raycasting! Raycasting may only be used in the fragment shader effect for interesting shading results, but is absolutely not allowed in any other stages of the pipeline.

Also, you MAY NOT use OpenGL ANYWHERE in this project, aside from the given OpenGL code for drawing Pixel Buffer Objects to the screen. Use of OpenGL for any pipeline stage instead of your own custom implementation will result in an incomplete project.

Finally, note that while this basecode is meant to serve as a strong starting point for a CUDA rasterizer, you are not required to use this basecode if you wish, and you may also change any part of the basecode specification as you please, so long as the final rendered result is correct.

-------------------------------------------------------------------------------
CONTENTS:
@@ -31,54 +18,14 @@ The Project4 root directory contains the following subdirectories:

* src/ contains the source code for the project. Both the Windows Visual Studio solution and the OSX makefile reference this folder for all source; the base source code compiles on OSX and Windows without modification.
* objs/ contains example obj test files: cow.obj, cube.obj, tri.obj.
* renders/ contains an example render of the given example cow.obj file with a z-depth fragment shader.
* windows/ contains a Windows Visual Studio 2010 project and all dependencies needed for building and running on Windows 7.

The Windows and OSX versions of the project build and run exactly the same way as in Project0, Project1, and Project2.
* renders/ contains 3 videos of the rasterizer in action.

-------------------------------------------------------------------------------
REQUIREMENTS:
ADDITIONAL FEATURES:
-------------------------------------------------------------------------------
In this project, you are given code for:

* A library for loading/reading standard Alias/Wavefront .obj format mesh files and converting them to OpenGL style VBOs/IBOs
* A suggested order of kernels with which to implement the graphics pipeline
* Working code for CUDA-GL interop

You will need to implement the following stages of the graphics pipeline and features:

* Vertex Shading
* Primitive Assembly with support for triangle VBOs/IBOs
* Perspective Transformation
* Rasterization through either a scanline or a tiled approach
* Fragment Shading
* A depth buffer for storing and depth testing fragments
* Fragment to framebuffer writing
* A simple lighting/shading scheme, such as Lambert or Blinn-Phong, implemented in the fragment shader

You are also required to implement at least 3 of the following features:

* Additional pipeline stages. Each one of these stages can count as 1 feature:
* Geometry shader
* Transformation feedback
* Back-face culling
* Scissor test
* Stencil test
* Blending

IMPORTANT: For each of these stages implemented, you must also add a section to your README stating what the expected performance impact of that pipeline stage is, and real performance comparisons between your rasterizer with that stage and without.

* Correct color interpolation between points on a primitive
* Texture mapping WITH texture filtering and perspective correct texture coordinates
* Support for additional primitives. Each one of these can count as HALF of a feature.
* Lines
* Line strips
* Triangle fans
* Triangle strips
* Points
* Back-Face Culling
* Anti-aliasing
* Order-independent translucency using a k-buffer
* MOUSE BASED interactive camera support. Interactive camera support based only on the keyboard is not acceptable for this feature.

-------------------------------------------------------------------------------
BASE CODE TOUR:
@@ -99,86 +46,25 @@ You will also want to familiarize yourself with:
* utilities.h, which serves as a kitchen-sink of useful functions

-------------------------------------------------------------------------------
SOME RESOURCES:
-------------------------------------------------------------------------------
The following resources may be useful for this project:

* High-Performance Software Rasterization on GPUs
* Paper (HPG 2011): http://www.tml.tkk.fi/~samuli/publications/laine2011hpg_paper.pdf
* Code: http://code.google.com/p/cudaraster/ Note that looking over this code for reference with regard to the paper is fine, but we most likely will not grant any requests to actually incorporate any of this code into your project.
* Slides: http://bps11.idav.ucdavis.edu/talks/08-gpuSoftwareRasterLaineAndPantaleoni-BPS2011.pdf
* The Direct3D 10 System (SIGGRAPH 2006) - for those interested in doing geometry shaders and transform feedback.
* http://133.11.9.3/~takeo/course/2006/media/papers/Direct3D10_siggraph2006.pdf
* Multi-Fragment Effects on the GPU using the k-Buffer - for those who want to do a k-buffer
* http://www.inf.ufrgs.br/~comba/papers/2007/kbuffer_preprint.pdf
* FreePipe: A Programmable, Parallel Rendering Architecture for Efficient Multi-Fragment Effects (I3D 2010)
* https://sites.google.com/site/hmcen0921/cudarasterizer
* Writing A Software Rasterizer In Javascript:
* Part 1: http://simonstechblog.blogspot.com/2012/04/software-rasterizer-part-1.html
* Part 2: http://simonstechblog.blogspot.com/2012/04/software-rasterizer-part-2.html

-------------------------------------------------------------------------------
NOTES ON GLM:
-------------------------------------------------------------------------------
This project uses GLM, the GL Math library, for linear algebra. You need to know two important points on how GLM is used in this project:

* In this project, indices in GLM vectors (such as vec3, vec4), are accessed via swizzling. So, instead of v[0], v.x is used, and instead of v[1], v.y is used, and so on and so forth.
* GLM Matrix operations work fine on NVIDIA Fermi cards and later, but pre-Fermi cards do not play nice with GLM matrices. As such, in this project, GLM matrices are replaced with a custom matrix struct, called a cudaMat4, found in cudaMat4.h. A custom function for multiplying glm::vec4s and cudaMat4s is provided as multiplyMV() in intersections.h.

-------------------------------------------------------------------------------
README
ADDITIONAL FEATURES TOUR:
-------------------------------------------------------------------------------
All students must replace or augment the contents of this Readme.md in a clear
manner with the following:

* A brief description of the project and the specific features you implemented.
* At least one screenshot of your project running.
* A 30 second or longer video of your project running. To create the video you
can use http://www.microsoft.com/expression/products/Encoder4_Overview.aspx
* A performance evaluation (described in detail below).
* Correct color interpolation between points on a primitive
* My triangles support per-vertex color. In the rasterization kernel I compute the barycentric coordinates of each fragment within its triangle and use them as weights to blend the three vertex colors, so vertices closer to the fragment contribute more.
* As an aside, when first implementing this I had a sign error and implemented "front-face culling". While not a desirable feature, it made a funny video that can be found here: https://www.youtube.com/watch?v=q9GIzXXPtGc&feature=youtu.be
* Back-Face Culling
* I included back-face culling in my primitive assembly and not as a separate feature. I augmented my triangle struct with a field indicating whether the triangle has been culled.
* In the rasterization kernel, if a triangle has been culled, the kernel returns immediately.
* Anti-Aliasing
* This is the most in-depth feature. The cudaRasterizeCore() method contains a variable that sets the anti-aliasing level. Left at 1, the rasterizer behaves as normal. Set to a value N larger than 1, it supersamples the entire rasterization process N times in both the x and y directions. The render kernel then downsamples the fragments back to the given resolution using Gaussian weights.
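
The color interpolation described above can be sketched on the host side as follows. This is a minimal illustration, not the code from this pull request: `Vec2`, `Color`, `signedArea2`, `barycentric`, and `interpolateColor` are hypothetical names, and the triangle is assumed to be already projected to 2D screen space.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical minimal types; the PR itself uses GLM vectors.
struct Vec2 { float x, y; };
struct Color { float r, g, b; };

// Twice the signed area of triangle (a, b, c).
// Positive when the vertices are in counter-clockwise order (y up).
inline float signedArea2(Vec2 a, Vec2 b, Vec2 c) {
    return (b.x - a.x) * (c.y - a.y) - (c.x - a.x) * (b.y - a.y);
}

// Barycentric weights of point p with respect to triangle (a, b, c).
// The weights sum to 1 and are all in [0, 1] exactly when p is inside.
inline std::array<float, 3> barycentric(Vec2 a, Vec2 b, Vec2 c, Vec2 p) {
    const float area = signedArea2(a, b, c);
    assert(std::fabs(area) > 0.0f && "degenerate triangle");
    return {{ signedArea2(p, b, c) / area,
              signedArea2(a, p, c) / area,
              signedArea2(a, b, p) / area }};
}

// Blend the three per-vertex colors using the barycentric weights.
inline Color interpolateColor(const std::array<float, 3>& w,
                              Color ca, Color cb, Color cc) {
    return { w[0] * ca.r + w[1] * cb.r + w[2] * cc.r,
             w[0] * ca.g + w[1] * cb.g + w[2] * cc.g,
             w[0] * ca.b + w[1] * cb.b + w[2] * cc.b };
}
```

For example, with vertices (0,0), (4,0), (0,4) colored red, green, and blue, the point (1,1) gets weights (0.5, 0.25, 0.25) and therefore the color (0.5, 0.25, 0.25). The sign of `signedArea2` also gives the winding order of the projected triangle, which is one common basis for a back-face test; whether this PR decides culling that way is not stated here.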

-------------------------------------------------------------------------------
PERFORMANCE EVALUATION
-------------------------------------------------------------------------------
The performance evaluation is where you will investigate how to make your CUDA
programs more efficient using the skills you've learned in class. You must have
performed at least one experiment on your code to investigate the positive or
negative effects on performance.

We encourage you to get creative with your tweaks. Consider places in your code
that could be considered bottlenecks and try to improve them.

Each student should provide no more than a one page summary of their
optimizations along with tables and or graphs to visually explain any
performance differences.

-------------------------------------------------------------------------------
THIRD PARTY CODE POLICY
-------------------------------------------------------------------------------
* Use of any third-party code must be approved by asking on Piazza. If it is approved, all students are welcome to use it. Generally, we approve use of third-party code that is not a core part of the project. For example, for the ray tracer, we would approve using a third-party library for loading models, but would not approve copying and pasting a CUDA function for doing refraction.
* Third-party code must be credited in README.md.
* Using third-party code without its approval, including using another student's code, is an academic integrity violation, and will result in you receiving an F for the semester.

-------------------------------------------------------------------------------
SELF-GRADING
-------------------------------------------------------------------------------
* On the submission date, email your grade, on a scale of 0 to 100, to Liam, [email protected], with a one paragraph explanation. Be concise and realistic. Recall that we reserve 30 points as a sanity check to adjust your grade. Your actual grade will be (0.7 * your grade) + (0.3 * our grade). We hope to only use this in extreme cases when your grade does not realistically reflect your work - it is either too high or too low. In most cases, we plan to give you the exact grade you suggest.
* Projects are not weighted evenly, e.g., Project 0 doesn't count as much as the path tracer. We will determine the weighting at the end of the semester based on the size of each project.

---
SUBMISSION
---
As with the previous project, you should fork this project and work inside of
your fork. Upon completion, commit your finished project back to your fork, and
make a pull request to the master repository. You should include a README.md
file in the root directory detailing the following

* A brief description of the project and specific features you implemented
* At least one screenshot of your project running.
* A link to a video of your raytracer running.
* Instructions for building and running your project if they differ from the
base code.
* A performance writeup as detailed above.
* A list of all third-party code used.
* This Readme file edited as described above in the README section.

The biggest performance hit comes from anti-aliasing. Without anti-aliasing my rasterizer rendered the cow at 60 fps, which I used as a baseline. At 2x anti-aliasing that dropped to 10 fps, at 3x to 4-5 fps, and at 5x to 1-2 fps. A steep drop is expected, because a level of N supersamples in both x and y and so produces N^2 times as many fragments (4x, 9x, and 25x here).
* I had hoped that back-face culling would help with this, but I did not get any performance gains. I think this is because of my implementation: a culled triangle is still submitted to the rasterization kernel, where it fails fast, but since threads execute together in warps, there were probably very few warps that held ONLY back-facing triangles and could finish early.
* In a future iteration I need to use stream compaction to remove the culled triangles so they never reach the rasterization kernel at all.
* There is also room to improve the anti-aliasing. Currently I supersample the entire image, but if I supersampled only the edges I would produce far fewer fragments.
* Another quick improvement would be in the downsampling algorithm. Currently I calculate the Gaussian weight for every subpixel of every pixel. The weights could be computed ahead of time and passed to the kernels, so they just read them instead of recomputing them on every frame for every subpixel.
* I could also try moving the subpixel fragments' color values into shared memory. With anti-aliasing, multiple pixels sample the same fragments, so if I set this up correctly I could probably avoid a lot of global memory reads.
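
The stream compaction idea in the list above can be sketched on the host as follows. This is only an illustration under stated assumptions: `Triangle` is a hypothetical stand-in for the PR's triangle struct (only the cull flag matters), and `compactTriangles` is a made-up helper name. On the GPU the same step would typically be done with a parallel primitive such as Thrust's `thrust::remove_if` or `thrust::copy_if`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the rasterizer's triangle struct.
struct Triangle {
    int id;       // placeholder payload
    bool culled;  // set during primitive assembly
};

// Remove culled triangles in place, preserving the order of the survivors.
// Returns the number of triangles that still need to be rasterized.
inline std::size_t compactTriangles(std::vector<Triangle>& tris) {
    auto newEnd = std::remove_if(tris.begin(), tris.end(),
                                 [](const Triangle& t) { return t.culled; });
    tris.erase(newEnd, tris.end());
    // Postcondition: nothing culled remains.
    assert(std::none_of(tris.begin(), tris.end(),
                        [](const Triangle& t) { return t.culled; }));
    return tris.size();
}
```

The rasterization kernel would then be launched over only the returned count, so every thread in every warp has real work instead of a mix of live and early-exit triangles.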
I have two more videos of the renderer, one with anti-aliasing at 3x: https://www.youtube.com/watch?v=uRSzpbR4ZaQ&feature=youtu.be
and one without anti-aliasing: https://www.youtube.com/watch?v=J8bXx7zOvN0&feature=youtu.be
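
As a rough sketch of the precomputed-weights improvement mentioned in the performance notes, the Gaussian table could be built once and reused for every pixel. Everything here is an assumption for illustration: the function names, the `sigma` parameter, the single-channel image, and the choice to filter only each pixel's own n x n block of subpixels (the PR's filter may cover a wider footprint).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Build an n x n table of Gaussian weights centered on the pixel center and
// normalized to sum to 1. n is the supersampling factor per axis; sigma is
// measured in subpixels (a hypothetical tuning parameter).
inline std::vector<float> makeGaussianWeights(int n, float sigma) {
    assert(n >= 1 && sigma > 0.0f);
    std::vector<float> w(static_cast<std::size_t>(n) * n);
    const float center = 0.5f * static_cast<float>(n - 1);
    float sum = 0.0f;
    for (int y = 0; y < n; ++y) {
        for (int x = 0; x < n; ++x) {
            const float dx = x - center;
            const float dy = y - center;
            const float v = std::exp(-(dx * dx + dy * dy) / (2.0f * sigma * sigma));
            w[static_cast<std::size_t>(y) * n + x] = v;
            sum += v;
        }
    }
    for (float& v : w) v /= sum;
    return w;
}

// Downsample a row-major, single-channel image of size (width*n) x (height*n)
// to width x height, reading weights from the precomputed table.
inline std::vector<float> downsample(const std::vector<float>& hi, int width,
                                     int height, int n,
                                     const std::vector<float>& weights) {
    const std::size_t hiWidth = static_cast<std::size_t>(width) * n;
    assert(hi.size() == hiWidth * static_cast<std::size_t>(height) * n);
    assert(weights.size() == static_cast<std::size_t>(n) * n);
    std::vector<float> lo(static_cast<std::size_t>(width) * height, 0.0f);
    for (int py = 0; py < height; ++py) {
        for (int px = 0; px < width; ++px) {
            float acc = 0.0f;
            for (int sy = 0; sy < n; ++sy) {
                for (int sx = 0; sx < n; ++sx) {
                    const std::size_t row = static_cast<std::size_t>(py) * n + sy;
                    const std::size_t col = static_cast<std::size_t>(px) * n + sx;
                    acc += weights[static_cast<std::size_t>(sy) * n + sx] * hi[row * hiWidth + col];
                }
            }
            lo[static_cast<std::size_t>(py) * width + px] = acc;
        }
    }
    return lo;
}
```

In a CUDA version, the table would be computed once on the host whenever the anti-aliasing level changes and copied to the device, so the render kernel only performs the multiply-accumulate loop.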
Binary file added renders/AALevel3.png
Binary file added renders/AALevel3.tiff
Binary file not shown.
Binary file added renders/AntiAliasLevel3.mp4
Binary file not shown.
Binary file added renders/AntiAliasOff.mp4
Binary file not shown.
Binary file added renders/CullWrongSide.mp4
Binary file not shown.
10 changes: 8 additions & 2 deletions src/main.cpp
@@ -28,6 +28,9 @@ int main(int argc, char** argv){
cout << "Usage: mesh=[obj file]" << endl;
return 0;
}

//Setup Camera
cam = camera();

frame = 0;
seconds = time (NULL);
@@ -57,7 +60,7 @@ void mainLoop() {
}

string title = "CIS565 Rasterizer | " + utilityCore::convertIntToString((int)fps) + " FPS";
glfwSetWindowTitle(window, title.c_str());
glfwSetWindowTitle(window, title.c_str());

glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
glBindTexture(GL_TEXTURE_2D, displayImage);
@@ -92,9 +95,12 @@ void runCuda(){

ibo = mesh->getIBO();
ibosize = mesh->getIBOsize();

nbo = mesh->getNBO();//get normals
nbosize = mesh->getNBOsize();//get normal length

cudaGLMapBufferObject((void**)&dptr, pbo);
cudaRasterizeCore(dptr, glm::vec2(width, height), frame, vbo, vbosize, cbo, cbosize, ibo, ibosize);
cudaRasterizeCore(dptr, glm::vec2(width, height), frame, vbo, vbosize, cbo, cbosize, ibo, ibosize, nbo, nbosize, cam);
cudaGLUnmapBufferObject(pbo);

vbo = NULL;
7 changes: 7 additions & 0 deletions src/main.h
@@ -23,6 +23,8 @@
#include "rasterizeKernels.h"
#include "utilities.h"



using namespace std;

//-------------------------------
@@ -49,6 +51,11 @@ float* cbo;
int cbosize;
int* ibo;
int ibosize;
float* nbo; //added
int nbosize; //added

//newstuff:
camera cam;

//-------------------------------
//----------CUDA STUFF-----------