Direct X, XNA, etc: 2007

Friday, December 28, 2007

Frame of Reference and Animation Thoughts

Recently I have been studying the idea of frame hierarchies used in graphics programming, specifically with DirectX and the X file graphics format. Here are some of my findings:

Frames of Reference

Frames of reference are all around us. A typical example is someone giving directions to someone else. Directions are typically given "within a frame of reference": directions to location (B) assume you are starting at location (A).

In the context of DirectX graphics, a model is usually more complex than a single frame and mesh. Hence, we need a way to manipulate a series of frames (and their corresponding meshes) so that each frame can be changed on its own while the frames still maintain a relationship to one another. The concepts of a hierarchy and a frame of reference let us accomplish this.

Basically, we establish a root of the hierarchy and then siblings/children under this. Each sibling will share the same parent world matrix, and each child will have a parent (which it will get its world matrix from).

The order of operations on the world matrix at the child level determines how the object is manipulated/changed. If we apply, for instance, a rotation to the child's model space matrix and then concatenate this matrix with the child's parent matrix, we can achieve independent translation/rotation/scaling for each frame while maintaining the overall hierarchical relationship.
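To make the concatenation step concrete, here is a minimal sketch, assuming a row-major 4x4 matrix with the row-vector convention that Direct3D uses. The names Mat4, Frame, and updateHierarchy are made up for illustration and are not D3DX types; the node only borrows the child/sibling layout described above.

```cpp
#include <array>
#include <cassert>

// Row-major 4x4 matrix, row-vector convention (v' = v * M).
using Mat4 = std::array<float, 16>;

Mat4 identity() {
    Mat4 m{};
    m[0] = m[5] = m[10] = m[15] = 1.0f;
    return m;
}

Mat4 translation(float x, float y, float z) {
    Mat4 m = identity();
    m[12] = x; m[13] = y; m[14] = z;   // translation lives in the last row
    return m;
}

Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r[i * 4 + j] += a[i * 4 + k] * b[k * 4 + j];
    return r;
}

// Hypothetical frame node: a local matrix plus child/sibling links.
struct Frame {
    Mat4 local = identity();    // transform relative to the parent
    Mat4 world = identity();    // local concatenated with the parent's world
    Frame* firstChild = nullptr;
    Frame* nextSibling = nullptr;
};

// Child world = child local * parent world. All siblings receive the
// same parent world matrix.
void updateHierarchy(Frame* frame, const Mat4& parentWorld) {
    assert(frame != nullptr);
    frame->world = multiply(frame->local, parentWorld);
    for (Frame* c = frame->firstChild; c != nullptr; c = c->nextSibling)
        updateHierarchy(c, frame->world);
}
```

Calling updateHierarchy(&root, identity()) walks the whole tree; calling it on a child with its parent's world matrix reprocesses only that subtree.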

Animation

How does this frame hierarchy business help us when we begin to use animation? D3DX defines animation interfaces for this. As time advances, D3DX looks at the animation sets that are active (allocated to tracks in the animation mixer), and when one of them references a frame name that is found in our frame hierarchy, that frame is animated.

The D3DX API is constructed so that the X file definition can contain animation sets, which in turn contain animation definitions. These are linked to frames in the hierarchy. Animation is a much more complex subject, but to simplify it: D3DX will crawl our hierarchy as time advances, looking for frames that are targeted by animation sets/animations. It then does the calculations to determine what the animation should do based on the time values passed in and the animation definition.

One key point to remember: because we are using a hierarchy, we don't necessarily have to reprocess the entire tree if only a few children are affected by a change. We just process that part of the tree.

Tuesday, November 20, 2007

XNA Game Studio 2.0 BETA released!

The second release of XNA Game Studio came out today. I am currently in the process of installing it and will be posting my experiences. Link to download.

Wednesday, November 14, 2007

Better Zuner than later :)

I have recently had the opportunity to experience the newest version of firmware and software available for the Zune. I am the proud owner of a Halo 3 30GB Zune, now termed the Zune 30, and I have been using the device for about 6 months.

Device update

New firmware is included with the release of the new Zune devices, and users of the 30GB model are not left out. The 30 is very similar in specs to the new Zune 80, sans the extra disk space, the slightly bigger screen (.2" bigger), and the slight decrease in weight on the new devices. The new devices still use 802.11b/g wireless, and this brings one of the biggest updates: wireless music syncing. The menu system introduced with the new firmware is very similar to the prior version but is faster, and transitions are more "polished". One of the biggest improvements I have welcomed is the ability to stop a video in progress, go listen to something else, and find your bookmark automatically there when you return to the video. Losing my place was one of my biggest pet peeves with the previous version.

Marketplace update

As most who are interested in the Zune know, the software that accompanies the Zune has been totally rewritten, and the new version has much more of a Web 2.0 feel. There are more settings to help personalize your device. Podcast subscriptions are now supported. Like the device, the software feels much faster (asynchronous tricks ;)).

Online components

Some new additions to the system are the online components. There is now a personalized portal that functions as your Zune home online. Zune cards have been introduced. These are very similar to gamercards for the Xbox 360; the difference, obviously, is that they relate to how you're using your device. The site shows users with the "most plays", among other things. Users are able to "share" their Zune cards with friends. Mine is here. This drives home the social aspect that was originally sought with this device. I personally love it! Add to this the fact that if you are an Xbox 360 user, your friends list is available here for you (as well as a view of your friends' Zune stats, if they choose to share the info).

Conclusion

The update has been very easy and is welcomed by me and many others. I will do my part to keep the "social" growing!

Sunday, November 11, 2007

Posting from Windows Live Writer

This is a post generated using a new tool recently released by Microsoft (Windows Live Writer). I have never used offline blog editing software, so I am looking forward to this. It might help me stay up on my blog posts.

Tuesday, August 28, 2007

I'm back....packing reduced instruction optimization

Been a while since I posted, but I am knee deep in assembly at the moment, reviewing for game programming class. Binary division was a recent topic: reducing instructions by using bit shifts. The interesting part is how to find the remainder when the number does not divide evenly. NOTE: This works only with divisors that are a power of 2 (typical for the mov(b/w/l) instruction sizes).

These operations are used with strings a lot.

Let's say we have a string with a length of 10 and we want to use movl (4 byte stride) to read the value in.

10 / 4 = 1010b >> 2 = 10b = 2

Now we know that 10/4 is 2.5, so to calculate the remainder simply do this.

4 - 1 = 3 = 11b
1010b & 11b = 10b = 2
2 of 4 bytes is .5

A shift or an AND costs the CPU far less than a divide instruction, which equates to better performance. This is useful when trying to optimize C++ game code to get acceptable performance.
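The same trick can be sketched in C++. This is only an illustration of the shift and mask arithmetic above; DivResult and divideByPowerOfTwo are names invented for the example, and the divisor is assumed to be 2^shift.

```cpp
#include <cassert>
#include <cstdint>

struct DivResult {
    std::uint32_t quotient;
    std::uint32_t remainder;
};

// Divides value by 2^shift without a divide instruction:
// the quotient is a right shift, the remainder is a mask with (divisor - 1).
DivResult divideByPowerOfTwo(std::uint32_t value, unsigned shift) {
    assert(shift < 32);  // shifting a 32-bit value by 32 or more is undefined
    const std::uint32_t divisor = std::uint32_t{1} << shift;
    return { value >> shift, value & (divisor - 1) };
}
```

For the string example, divideByPowerOfTwo(10, 2) gives a quotient of 2 full movl reads and a remainder of 2 bytes left over.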

Sunday, July 15, 2007

Bungie Day!

Bungie celebrated Bungie Day last week (07/07/2007 - 777). Weird, but it may well be the day when the most people get married (http://www.time.com/time/business/article/0,8599,1630320,00.html). They released a new theme and gamerpics on that day, and that day only. Listed below are the blades.

Monday, July 9, 2007

Modifying the size of your integer value

When moving an integer from a smaller field (say a word) to a larger one (say a doubleword), you should not just movw to the new register.

movw %ax, %bx

This should not be done because you can not be certain that the upper part of the EBX register is zeroed out ahead of time. To do this you should first zero out the EBX (destination) and then move your intended value there.

movl $0, %ebx
movw %ax, %bx

Intel provides an instruction that does this in one step, movzx. It takes a source (a register or memory location) and zero-extends it into a larger destination (register only).

movl $300, %ecx
movzx %cl, %ebx

Just thought this was interesting.
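For comparison, here is roughly what those moves do, modeled in C++ with fixed-width integers. This is a sketch of the arithmetic only, not generated assembly, and the function names are invented for the example.

```cpp
#include <cassert>
#include <cstdint>

// Models `movzx %cl, %ebx`: only the low 8 bits of the source are read,
// and the rest of the 32-bit destination is filled with zeros.
std::uint32_t zeroExtendLowByte(std::uint32_t value) {
    const std::uint8_t low = static_cast<std::uint8_t>(value);
    return static_cast<std::uint32_t>(low);
}

// Models `movw %ax, %bx` into a register that was NOT cleared first:
// the upper 16 bits of the destination keep whatever was there before.
std::uint32_t moveLowWord(std::uint32_t destination, std::uint16_t source) {
    return (destination & 0xFFFF0000u) | source;
}
```

Note that with 300 in ECX, CL only holds the low byte (300 is 0x12C, so CL is 0x2C), and the zero-extended result in EBX is 44, not 300.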

Saturday, July 7, 2007

Under the C++ covers....literally

In my continuing studies of C++ I have been studying assembly language. Not because I will be authoring large programs written totally in assembly, but because when you are down to your last options and trying to squeeze performance from a piece of code, a look at the assembly generated by the HLL (high level language) can reveal things that are otherwise unseen.

Studies on Assembly: (get the cpu vendor id info)

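# cpuid - program to get the processor vendor id
.section .data
output:
# 42 bytes long; the 12 x's start at offset 28, matching the offsets used below
.ascii "The processor Vendor ID is 'xxxxxxxxxxxx'\n"
.section .text
.globl _start
_start:
movl $0, %eax          # CPUID function 0: vendor id string
cpuid
movl $output, %edi     # EDI = address of the output string
movl %ebx, 28(%edi)    # vendor id characters 1-4
movl %edx, 32(%edi)    # vendor id characters 5-8
movl %ecx, 36(%edi)    # vendor id characters 9-12
movl $4, %eax          # system call 4: write
movl $1, %ebx          # file descriptor 1: stdout
movl $output, %ecx     # address of the string to write
movl $42, %edx         # length of the string
int $0x80
movl $1, %eax          # system call 1: exit
movl $0, %ebx          # exit code 0
int $0x80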

This is a pretty simple assembly program. In the .data section there is an output label (think variable) defined. The .ascii directive is the "type" of the data; in this case it means store an ASCII string. The x's are placeholders for the real value coming later. The space for this data is reserved at assembly time. The next section, .text, is where the instructions for the program are stored.

The first instruction loads the register EAX with a literal value of zero. The next instruction is cpuid, which instructs the processor to return the id we are after; the zero value in EAX selects the CPUID output option. After the CPUID instruction runs, we must collect the result, which will be in 3 output registers. The first instruction here (movl $output, %edi) loads the address of the output label into a register (EDI). Next we copy the contents of the 3 result registers (EBX, EDX, then ECX) into the placeholder part of the string, at offsets 28, 32, and 36 from the address in EDI.

Now that the result is in place we can output it. This program uses a Linux system call (int $0x80) to have the kernel write to the console. This is a software interrupt (with the value 0x80). The EAX register holds the number of the system call that will be executed when we make this interrupt (4 is write). For write, EBX holds the file descriptor (1 is standard output), ECX holds the address of the actual output (movl $output, %ecx), and EDX holds the length of the string. The last three instructions make a second system call (1 is exit), where EBX holds the exit code that is given when the program exits. That's pretty much it. I will be posting more as my study of C++ and assembly continues.
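The register-to-string step can be sketched in C++ as well. This only models how the three 32-bit results are laid out as 12 characters; it does not execute CPUID, and vendorFromRegisters is a name invented for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>
#include <string>

// Rebuilds the 12-character vendor id from the three CPUID(0) result
// registers. The characters are stored low byte first, in the order
// EBX, EDX, ECX, which is the same order the assembly writes them in.
std::string vendorFromRegisters(std::uint32_t ebx, std::uint32_t edx,
                                std::uint32_t ecx) {
    std::string vendor;
    for (std::uint32_t reg : {ebx, edx, ecx}) {
        for (int byte = 0; byte < 4; ++byte) {
            vendor.push_back(static_cast<char>((reg >> (8 * byte)) & 0xFF));
        }
    }
    return vendor;
}
```

Feeding it the values an Intel processor returns (0x756e6547, 0x49656e69, 0x6c65746e) spells out GenuineIntel.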

Wednesday, July 4, 2007

Torque X for XNA creators (FREE!)

Check out one of the coolest benefits of becoming an XNA Creators Club member (http://creators.xna.com/subscribers/torquex.aspx). This makes the $100 entry to the Creators Club easier to swallow for some. Content is being added to the site daily (new tutorials and info). Hats off to the XNA Dev Team at Microsoft!

Friday, June 29, 2007

Size matters?

Screenshot posted from inside Bungie's realm. Apparently this is the backend storage that will support the video/screenshot capabilities of the upcoming Halo 3. You can store as many as you like on your local storage, but there are limits, yet to be announced, for publicly viewable ones. That's a lot of bytes though!

Thursday, June 21, 2007

Compressing Textures.....the DirectX way

Compression of textures is something that DirectX can take care of for us, but it makes sense to understand how it does this.

Consider the following gradient:

If you were required to store this in a texture, you would end up having to store hundreds of RGB values to represent it. If you instead stored just 2 colors (the ends), you could perform linear interpolation to get the remaining colors, and you would only be storing two values. A crude calculation to do just that is here:

Color_Gradient = (Color2 - Color1) / (GradientWidth - 1);
for(i=0; i < GradientWidth; i++) {
SetPixelColor(i, 0, Color1 + (Color_Gradient * i));

}

This is roughly how DirectX can compress textures, resulting in massive savings. Of course a big texture will probably have more than 2 colors and more than a simple bar object, so DirectX partitions the texture into 4 x 4 blocks. Based on the texture compression format chosen (DXT1 through DXT5), each block stores two RGB color values plus a 4 x 4 grid that says how to interpolate between them, and some formats add a second 4 x 4 grid for ALPHA. The specifics of how the interpolation is performed are again dictated by the compression format chosen. More to come on this..
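To show the idea of two endpoint colors plus a grid of small indices, here is a simplified sketch in the spirit of DXT1. It is an assumption-heavy illustration: it uses plain 8-bit channels and one index byte per texel instead of the packed 16-bit colors and 2-bit indices of the real format, and Rgb, buildPalette, and decodeBlock are names invented for the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

struct Rgb { std::uint8_t r, g, b; };

// Weighted blend of two channel values: (wa*a + wb*b) / 3.
std::uint8_t blend(std::uint8_t a, std::uint8_t b, int wa, int wb) {
    return static_cast<std::uint8_t>((wa * a + wb * b) / 3);
}

// Four-entry palette: the two stored endpoints plus two colors
// interpolated one third and two thirds of the way between them.
std::array<Rgb, 4> buildPalette(Rgb c0, Rgb c1) {
    return {{
        c0,
        c1,
        Rgb{ blend(c0.r, c1.r, 2, 1), blend(c0.g, c1.g, 2, 1), blend(c0.b, c1.b, 2, 1) },
        Rgb{ blend(c0.r, c1.r, 1, 2), blend(c0.g, c1.g, 1, 2), blend(c0.b, c1.b, 1, 2) }
    }};
}

// Decodes one 4 x 4 block: each of the 16 texels is just a palette index.
std::array<Rgb, 16> decodeBlock(Rgb c0, Rgb c1,
                                const std::array<std::uint8_t, 16>& indices) {
    const std::array<Rgb, 4> palette = buildPalette(c0, c1);
    std::array<Rgb, 16> texels{};
    for (std::size_t i = 0; i < texels.size(); ++i) {
        assert(indices[i] < 4);  // only four palette entries exist
        texels[i] = palette[indices[i]];
    }
    return texels;
}
```

The savings come from storing 2 colors and 16 tiny indices per block instead of 16 full colors.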

Sunday, June 17, 2007

Binary Serialization in .NET

Persisting objects to another medium for future use is one advantage of the serialization mechanisms built into the .NET Framework. Listed below are 2 code sections showing serialization and deserialization to disk.

SERIALIZE:

Stream binstream = File.OpenWrite(ConfigurationManager.AppSettings[1].ToString());
BinaryFormatter binFmtr = new BinaryFormatter();
binFmtr.Serialize(binstream, m_hashTable);
binstream.Close();

DESERIALIZE:

FileStream fStream = new FileStream(ConfigurationManager.AppSettings[1].ToString(), FileMode.Open);
BinaryFormatter bin = new BinaryFormatter();
htObject = bin.Deserialize(fStream) as Hashtable;
fStream.Close();

This can be useful in XNA programming. More C++ for DirectX9 posts will be coming next week.

GetHashCode() in .NET

So in .NET, there is a virtual function defined for all object types (everything) named GetHashCode(). This function can be used to generate a hash value for an object, but the main caveat is that this hash is not guaranteed to be unique. The hash calculation returns a 32-bit integer, which means there are at most 4,294,967,296 (2^32) different values possible, so treating the hash as unique is only reasonable for very small amounts of data.
To help the situation, an overridden GetHashCode() can be used. When first confronted with this dilemma, I was going to just XOR the values I was using to generate the hash.

public override int GetHashCode()
{
return value1.GetHashCode() ^
value2.GetHashCode() ^
value3.GetHashCode();
}

Turns out this is not the best way to do this. If you look deeper at the case where value1 and value2 are switched, you will see the calculation comes out the same (and hence 2 different sets of values result in the same hash). This is caused by the fact that XOR, like addition and multiplication, is commutative. You should make the result depend on the order of the values to remove this failure from the algorithm you are constructing.

public override int GetHashCode()
{
int result = 21;
result = result*47 + value1.GetHashCode();
result = result*47 + value2.GetHashCode();
result = result*47 + value3.GetHashCode();

return result;
}

This will work!
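The difference is easy to check. Here is a small C++ sketch of the two approaches, with plain unsigned integers standing in for the three GetHashCode() results; xorCombine and orderedCombine are names invented for the example, and unsigned arithmetic is used so that overflow simply wraps.

```cpp
#include <cassert>
#include <cstdint>

// XOR is commutative, so swapping any two inputs gives the same hash.
std::uint32_t xorCombine(std::uint32_t a, std::uint32_t b, std::uint32_t c) {
    return a ^ b ^ c;
}

// Multiplying the running result before adding each value makes the
// position of every input matter.
std::uint32_t orderedCombine(std::uint32_t a, std::uint32_t b, std::uint32_t c) {
    std::uint32_t result = 21;
    result = result * 47 + a;
    result = result * 47 + b;
    result = result * 47 + c;
    return result;
}
```

With the inputs (1, 2, 3) and (2, 1, 3), xorCombine returns the same value for both, while orderedCombine returns two different values. Collisions are still possible with any 32-bit hash; they are just no longer guaranteed by a simple swap.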

Friday, June 15, 2007

Halo 2 Vista.....we hardly knew ya!

So I have finished the Normal level campaign of Halo 2 Vista. This game is definitely worth the buy, even for gamers who have played H2 on Xbox/360. The graphics are that good! I will continue the fight for the Heroic and Legendary achievements. Btw, I will be getting back to more DirectX centric blogging now. ;)

Wednesday, June 6, 2007

Pac Man World Championships - NYC

Got back a bit ago from the Pac Man World Championships. Had a great time hanging out with the Microsoft crew and the fellow gamers. We went to the Marriott in Times Square after the gaming for a community get together. Interesting pics can be found on http://picasaweb.google.com/caleteeter/PacManWorldChampionships ; these are courtesy of John Porcaro from the Gamerscore Blog.

Wednesday, May 23, 2007

Halo 2 for Vista Delayed!!!

Halo 2 for Vista supporters will need to wait just a bit longer to "continue" the fight. Apparently, there were some problems with the DVD discs (physical problems). The new release date is May 31, 2007.

UPDATE: There were rumors circulating the internet that H2 Vista would be delayed till July 4th, but these are bogus per a Microsoft spokesperson. May 31 is the date.

UPDATE: I swung by the local GameStop last night and H2 Vista is scheduled for June 1st.

Lambert's Law

More on my continuing study of lighting effects on objects in three dimensional space. Lambert's Law states that the intensity of light reflected from a point on a surface is directly proportional to the cosine of the angle between the incoming light ray and the direction the point is facing (the vertex normal). Basically this means that we can scale reflected light (and hence the color perceived by the user) by that cosine. A dot product between the 2 vectors (light vector and surface normal), both normalized, gives the cosine of the angle between the two. We can then scale the light intensity by this value. Of course the material of our surface (its reflective properties) will determine what the final result will be.
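As a minimal sketch of that calculation, assuming a light vector that points from the surface toward the light: Vec3 and lambertFactor are names invented for the example, and negative cosines (light behind the surface) are clamped to zero.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

Vec3 normalize(const Vec3& v) {
    const float len = std::sqrt(dot(v, v));
    assert(len > 0.0f);  // a zero-length vector has no direction
    return { v.x / len, v.y / len, v.z / len };
}

// Cosine of the angle between the surface normal and the direction to
// the light, clamped so surfaces facing away receive no light.
float lambertFactor(const Vec3& normal, const Vec3& toLight) {
    return std::max(0.0f, dot(normalize(normal), normalize(toLight)));
}
```

The returned factor is what the light intensity gets multiplied by: 1 when the light is straight along the normal, falling to 0 at 90 degrees and beyond.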

Sunday, May 20, 2007

Halo 3 Beta has arrived!!

Halo 3 Beta arrived, slightly delayed, but here nonetheless. Anything of this size is bound to have problems with production deployment, so it's understood, Bungie! I have been playing since launch, May 16th, and this is the best Halo yet in my opinion. Of course we only get a small taste of multiplayer (no campaign) but it looks really impressive.

My stats on H3 Beta:
http://www.bungie.net/Account/Profile.aspx?player=Windozer

Sunday, April 1, 2007

Chatting about Game Cinematics with THQ's Coray Seifert

I had the pleasure of attending a talk hosted by Game Institute with speaker Coray Seifert from Kaos Studios, a THQ development house. The talk was about trailers and cinematics. It was basically a review of some top trailers and cinematics (H3, GOW, etc). Attendees were asked to view the content and comment to the group on things like the type of content, issues with the trailers, and their likes/dislikes and reasons. My thoughts are below.

H3 Intro Trailer - great trailer, showing scale of game, sound builds (kinda slow but worth the payoff), in game technical innovations shown (real-time reflections on Master Chief's visor and sky)

Gears of War In-Game Cinema - great in game cinema, actual cut from the game, "cnn-style" camera view shown, again technical innovations showcased well, audio a bit weak at parts (drowned out)

Gears of War TV Trailer - personally thought this was the least exciting, slow music (Tears for Fears' "Mad World"), uses in game engine (real-time rendered), technically very good

Halo Wars Trailer - outsourced to production house so no in game engine used, great for users of Halo to see the spartans and elites in a more "real-life" scenario, audio weak at end

Overall it was good to hear Coray's comments as well, as he was able to point out things like how to keep the viewer's interest, how to transition from in-game cinematics to real game play, and how to use audio to pull off the theme and excitement.

Sunday, March 25, 2007

Color macro in DirectX

Quick post, but I ran into a macro defined in d3d9types.h that accepts 4 float values and returns a single DWORD. This came up when I was working with the lighting pipeline in DirectX 9. Specifically, there is a fundamental difference between the way vertex diffuse/specular colors are stored (DWORDs) and the way DirectX materials are stored (D3DCOLORVALUEs, aka 4 floats).

definition:

#define D3DCOLOR_COLORVALUE(r, g, b, a) \
D3DCOLOR_RGBA((DWORD)((r)*255.f), (DWORD)((g)*255.f), (DWORD)((b)*255.f), (DWORD)((a)*255.f))
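Here is a standalone sketch of what that conversion amounts to, written without the DirectX headers. packColorValue is a name invented for the example; it assumes each component is in the range 0 to 1 and packs the bytes in the A, R, G, B order that a D3DCOLOR DWORD uses.

```cpp
#include <cassert>
#include <cstdint>

// Scales each float component to 0..255 (truncating, like the cast in
// the macro) and packs the four bytes as 0xAARRGGBB.
std::uint32_t packColorValue(float r, float g, float b, float a) {
    auto toByte = [](float c) {
        assert(c >= 0.0f && c <= 1.0f);  // assumption: normalized input
        return static_cast<std::uint32_t>(c * 255.f) & 0xFFu;
    };
    return (toByte(a) << 24) | (toByte(r) << 16) | (toByte(g) << 8) | toByte(b);
}
```

Opaque red (1, 0, 0, 1) packs to 0xFFFF0000, and half-transparent blue (0, 0, 1, 0.5) packs to 0x7F0000FF because 127.5 truncates to 127.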

Friday, March 23, 2007

DirectX Fixed Function Lighting - Part 1

So I have been studying the lighting module included with DirectX (fixed-function pipeline) and I created this as a summary of these effects.

There are three main types of illumination models.

Emissive - The object "emits" its own light (i.e. a neon sign).
Ambient - An object is lit by an outside source and, based on its material properties, reflects this light, which is subsequently reflected by other objects, until eventually all objects are lit.
Direct Lighting - This is further broken into two subcategories (positional and directional). Positional is where the light comes from one defined point in space. Directional is where the light comes from a source that is infinitely far away (i.e. the sun).

We can set emissive at the object level, on frame advance, and we can also set the ambient level at the device level. So we are really focusing on direct lighting. With direct lighting two other illumination types are introduced, diffuse light and specular light.

Diffuse Light - the full intensity of the light, scaled by the cosine of the angle between the direction to the light source and the normal of the object face. Lambert's Law dictates this.
Specular Light - used to highlight an object to give the illusion of shiny and smooth surface.

This leads to the basic lighting equation used by DirectX.

I = A + D + S + E

A = Ambient Light
D = Diffuse Light
S = Specular Light
E = Emissive Light
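As a rough sketch of the equation above, the four terms can be summed per color channel and clamped to the displayable range. This only illustrates the simplified sum as written here, not the exact steps of the fixed-function pipeline; Color and combineLighting are names invented for the example.

```cpp
#include <algorithm>
#include <cassert>

struct Color { float r, g, b; };

float clamp01(float v) {
    return std::min(1.0f, std::max(0.0f, v));
}

// I = A + D + S + E, evaluated per channel and clamped to [0, 1].
Color combineLighting(const Color& ambient, const Color& diffuse,
                      const Color& specular, const Color& emissive) {
    return {
        clamp01(ambient.r + diffuse.r + specular.r + emissive.r),
        clamp01(ambient.g + diffuse.g + specular.g + emissive.g),
        clamp01(ambient.b + diffuse.b + specular.b + emissive.b)
    };
}
```

The clamp matters because several bright terms can easily add up to more than full intensity.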

Friday, March 16, 2007

Great article on importance of not building garbage!

This shows the importance of being careful to not build up garbage that the GC has to collect, especially on XNA as the GC is not as efficient.

GDC 2007 Slides Online!

The following are the slides from GDC. I have been unable to find any audio or notes to accompany these; I touched base with Dave Weller via the XNA forums and he did not have these available.

UPDATE: The slide audio is available here for a fee!

Tuesday, March 6, 2007

XNA Creators Club Site Live!!

It's finally here. We all got the opportunity to join the XNA Creators Club a while back, but now the Creators Club web site has been released (it was actually released Monday, the first day of GDC). Apparently, the MSDN forums for XNA and DirectX will be moving over here. The new forums are great, a much richer experience. Kudos to Microsoft and the XNA Team!!!

UPDATE from GDC! From theZman

Monday, March 5, 2007

STL Vectors Unplugged - Part 2

The dissection of the STL vector continues!! This time we are looking closer at how much overhead is added for this "auto-grow" array. There is no free lunch, and this proves the point. Link to vector size discussion.

Sunday, March 4, 2007

STL Vectors Unplugged - Part 1

I came upon this article on STL vectors and how the instantiation of new ones (of different types) can affect the output binary size. Check it out! Interesting reading.

Friday, March 2, 2007

Performance Optimization Tips for XNA

A big issue that is getting some press has resulted from new products like XNA Game Studio: questions are arising about how to make things like the garbage collector deterministic in managed code. If code in a critical game loop is stalled for garbage collection, the results would seem to be very bad (stutter, dropped frames, oh my!). Luckily, the CF team that worked on the XNA Framework has taken this into account (they've got our backs sometimes ;)). Listed below are 2 articles detailing some of these optimizations. Good Stuff!!!!

Managed Code Performance Part 1

Managed Code Performance Part 2

Tuesday, February 20, 2007

Determining intersection of ray and triangle

One of the best ray-triangle intersection routines, written by Tomas Akenine Moller.

/* Ray-Triangle Intersection Test Routines */
/* Different optimizations of my and Ben Trumbore's */
/* code from journals of graphics tools (JGT) */
/* http://www.acm.org/jgt/ */
/* by Tomas Moller, May 2000 */
#include <math.h>
#define EPSILON 0.000001
#define CROSS(dest,v1,v2) \
dest[0]=v1[1]*v2[2]-v1[2]*v2[1]; \
dest[1]=v1[2]*v2[0]-v1[0]*v2[2]; \
dest[2]=v1[0]*v2[1]-v1[1]*v2[0];
#define DOT(v1,v2) (v1[0]*v2[0]+v1[1]*v2[1]+v1[2]*v2[2])
#define SUB(dest,v1,v2) \
dest[0]=v1[0]-v2[0]; \
dest[1]=v1[1]-v2[1]; \
dest[2]=v1[2]-v2[2];
/* the original jgt code */
int intersect_triangle(double orig[3], double dir[3],
double vert0[3], double vert1[3], double vert2[3],
double *t, double *u, double *v)
{
double edge1[3], edge2[3], tvec[3], pvec[3], qvec[3];
double det,inv_det;
/* find vectors for two edges sharing vert0 */
SUB(edge1, vert1, vert0);
SUB(edge2, vert2, vert0);
/* begin calculating determinant - also used to calculate U parameter */
CROSS(pvec, dir, edge2);
/* if determinant is near zero, ray lies in plane of triangle */
det = DOT(edge1, pvec);
if (det > -EPSILON && det < EPSILON)
return 0;
inv_det = 1.0 / det;
/* calculate distance from vert0 to ray origin */
SUB(tvec, orig, vert0);
/* calculate U parameter and test bounds */
*u = DOT(tvec, pvec) * inv_det;
if (*u < 0.0 || *u > 1.0)
return 0;
/* prepare to test V parameter */
CROSS(qvec, tvec, edge1);
/* calculate V parameter and test bounds */
*v = DOT(dir, qvec) * inv_det;
if (*v < 0.0 || *u + *v > 1.0)
return 0;
/* calculate t, ray intersects triangle */
*t = DOT(edge2, qvec) * inv_det;
return 1;
}
/* code rewritten to do tests on the sign of the determinant */
/* the division is at the end in the code */
int intersect_triangle1(double orig[3], double dir[3],
double vert0[3], double vert1[3], double vert2[3],
double *t, double *u, double *v)
{
double edge1[3], edge2[3], tvec[3], pvec[3], qvec[3];
double det,inv_det;
/* find vectors for two edges sharing vert0 */
SUB(edge1, vert1, vert0);
SUB(edge2, vert2, vert0);
/* begin calculating determinant - also used to calculate U parameter */
CROSS(pvec, dir, edge2);
/* if determinant is near zero, ray lies in plane of triangle */
det = DOT(edge1, pvec);
if (det > EPSILON)
{
/* calculate distance from vert0 to ray origin */
SUB(tvec, orig, vert0);

/* calculate U parameter and test bounds */
*u = DOT(tvec, pvec);
if (*u < 0.0 || *u > det)
return 0;

/* prepare to test V parameter */
CROSS(qvec, tvec, edge1);

/* calculate V parameter and test bounds */
*v = DOT(dir, qvec);
if (*v < 0.0 || *u + *v > det)
return 0;

}
else if(det < -EPSILON)
{
/* calculate distance from vert0 to ray origin */
SUB(tvec, orig, vert0);

/* calculate U parameter and test bounds */
*u = DOT(tvec, pvec);
/* printf("*u=%f\n",(float)*u); */
/* printf("det=%f\n",det); */
if (*u > 0.0 || *u < det)
return 0;

/* prepare to test V parameter */
CROSS(qvec, tvec, edge1);

/* calculate V parameter and test bounds */
*v = DOT(dir, qvec) ;
if (*v > 0.0 || *u + *v < det)
return 0;
}
else return 0; /* ray is parallel to the plane of the triangle */
inv_det = 1.0 / det;
/* calculate t, ray intersects triangle */
*t = DOT(edge2, qvec) * inv_det;
(*u) *= inv_det;
(*v) *= inv_det;
return 1;
}
/* code rewritten to do tests on the sign of the determinant */
/* the division is before the test of the sign of the det */
int intersect_triangle2(double orig[3], double dir[3],
double vert0[3], double vert1[3], double vert2[3],
double *t, double *u, double *v)
{
double edge1[3], edge2[3], tvec[3], pvec[3], qvec[3];
double det,inv_det;
/* find vectors for two edges sharing vert0 */
SUB(edge1, vert1, vert0);
SUB(edge2, vert2, vert0);
/* begin calculating determinant - also used to calculate U parameter */
CROSS(pvec, dir, edge2);
/* if determinant is near zero, ray lies in plane of triangle */
det = DOT(edge1, pvec);
/* calculate distance from vert0 to ray origin */
SUB(tvec, orig, vert0);
inv_det = 1.0 / det;

if (det > EPSILON)
{
/* calculate U parameter and test bounds */
*u = DOT(tvec, pvec);
if (*u < 0.0 || *u > det)
return 0;

/* prepare to test V parameter */
CROSS(qvec, tvec, edge1);

/* calculate V parameter and test bounds */
*v = DOT(dir, qvec);
if (*v < 0.0 || *u + *v > det)
return 0;

}
else if(det < -EPSILON)
{
/* calculate U parameter and test bounds */
*u = DOT(tvec, pvec);
if (*u > 0.0 || *u < det)
return 0;

/* prepare to test V parameter */
CROSS(qvec, tvec, edge1);

/* calculate V parameter and test bounds */
*v = DOT(dir, qvec) ;
if (*v > 0.0 || *u + *v < det)
return 0;
}
else return 0; /* ray is parallel to the plane of the triangle */
/* calculate t, ray intersects triangle */
*t = DOT(edge2, qvec) * inv_det;
(*u) *= inv_det;
(*v) *= inv_det;
return 1;
}
/* code rewritten to do tests on the sign of the determinant */
/* the division is before the test of the sign of the det */
/* and one CROSS has been moved out from the if-else if-else */
int intersect_triangle3(double orig[3], double dir[3],
double vert0[3], double vert1[3], double vert2[3],
double *t, double *u, double *v)
{
double edge1[3], edge2[3], tvec[3], pvec[3], qvec[3];
double det,inv_det;
/* find vectors for two edges sharing vert0 */
SUB(edge1, vert1, vert0);
SUB(edge2, vert2, vert0);
/* begin calculating determinant - also used to calculate U parameter */
CROSS(pvec, dir, edge2);
/* if determinant is near zero, ray lies in plane of triangle */
det = DOT(edge1, pvec);
/* calculate distance from vert0 to ray origin */
SUB(tvec, orig, vert0);
inv_det = 1.0 / det;

CROSS(qvec, tvec, edge1);

if (det > EPSILON)
{
*u = DOT(tvec, pvec);
if (*u < 0.0 || *u > det)
return 0;

/* calculate V parameter and test bounds */
*v = DOT(dir, qvec);
if (*v < 0.0 || *u + *v > det)
return 0;

}
else if(det < -EPSILON)
{
/* calculate U parameter and test bounds */
*u = DOT(tvec, pvec);
if (*u > 0.0 || *u < det)
return 0;

/* calculate V parameter and test bounds */
*v = DOT(dir, qvec) ;
if (*v > 0.0 || *u + *v < det)
return 0;
}
else return 0; /* ray is parallel to the plane of the triangle */
*t = DOT(edge2, qvec) * inv_det;
(*u) *= inv_det;
(*v) *= inv_det;
return 1;
}
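For anyone who wants to try the idea without the macros, here is a compact C++ sketch that follows the same steps as the first routine above (the one that divides early). It is my own restatement, not Tomas Moller's code; Vec3, Hit, and intersectTriangle are names invented for the example.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;

Vec3 sub(const Vec3& a, const Vec3& b) {
    return { a[0] - b[0], a[1] - b[1], a[2] - b[2] };
}

Vec3 cross(const Vec3& a, const Vec3& b) {
    return { a[1] * b[2] - a[2] * b[1],
             a[2] * b[0] - a[0] * b[2],
             a[0] * b[1] - a[1] * b[0] };
}

double dot(const Vec3& a, const Vec3& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// t is the distance along the ray; (u, v) are barycentric coordinates.
struct Hit { double t, u, v; };

// Returns true and fills `hit` when the ray's line crosses the triangle.
bool intersectTriangle(const Vec3& orig, const Vec3& dir,
                       const Vec3& v0, const Vec3& v1, const Vec3& v2,
                       Hit& hit) {
    const double kEpsilon = 0.000001;
    const Vec3 edge1 = sub(v1, v0);
    const Vec3 edge2 = sub(v2, v0);
    const Vec3 pvec = cross(dir, edge2);
    const double det = dot(edge1, pvec);
    if (std::fabs(det) < kEpsilon) return false;  // ray parallel to triangle
    const double invDet = 1.0 / det;
    const Vec3 tvec = sub(orig, v0);
    const double u = dot(tvec, pvec) * invDet;
    if (u < 0.0 || u > 1.0) return false;
    const Vec3 qvec = cross(tvec, edge1);
    const double v = dot(dir, qvec) * invDet;
    if (v < 0.0 || u + v > 1.0) return false;
    hit = { dot(edge2, qvec) * invDet, u, v };
    return true;
}
```

Like the original, this does not reject a negative t, so callers that only want hits in front of the ray origin should check hit.t themselves.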