Sunday 25 October 2015

Intersections Part 1

So last month I've been working on my latest commercial project (nothing involving any extreme creative skills), so I'm back into research mode.

I have plenty of new ideas for rendering, and quite a few parts of my engine are undergoing a reasonable cleanup (mostly a new binding model to ease the DX12 transition later on).

There are different areas in my tool that I'm keen on improving; many of them will be for other blog posts, but one has lately drawn my attention and I really wanted to get it sorted.

As many of you know (or don't), I've been working on many interactive installations, from small to (very) large.

One common requirement for those is some form of hit detection: you have some input device (Kinect, camera, mouse, touch, Leap...), and you need to know if you hit some object in your scene in order to have those elements react.

After many years in the industry, I've developed a lot of routines in that area, so I thought it would be nice to have all of them in a decent library (to just pick from when needed).

After a bit of conversation with my top coder Eric, we wanted to draw up a feature list: what do we expect of an intersection engine? The following came up:
  • We have various scenarios, and some routines are a better fit for some use cases, so we don't want a "one mode to rule them all". For example, if our objects are near spherical, we don't want to ray cast mesh triangles; a ray cast against the bounding sphere is appropriate (and of course much faster).
  • We want our routines sandboxed: rather than a 4v/flaretic subpatch, each should be a single node with inputs/outputs, with the cooking done properly inside and optimized. That saves us load time, reduces compilation times for shaders (or allows precompiled ones), and gives an easier-to-control workflow (if a routine is not needed, it costs 0).
  • We want our library minimal, so hit routines should not even create data themselves; they are a better fit as pure behaviours (it also helps to have those routines working in different environments).
  • We don't want to be GPU only: if a case fits better on the CPU, then we should use the CPU (if preparing buffers costs more time than performing the test directly, then let's just do it directly on the CPU).

Next we wanted to decide which types of output we needed; this is what came out:

  • bool/int flag, which indicates whether an object is hit or not
  • filtered version for hit objects
  • filtered version for non-hit objects

Then here are the most important hit detection features we require (they cover a large part of our use cases in general):
  • Mouse/Pointer(s) to 2d shape (in most cases we want rectangle, circle).
  • Pointer(s) to 3d object (with selectable precision: either raycast a bounding volume, e.g. sphere/box, or go down to triangle level).
  • Area(s) to shapes (rectangle selection).
  • Arbitrary texture to shape (the most common scenario for this is an infrared camera, or the Kinect body index texture). In that case we also want the ability to differentiate between user ids as well as object ids.
  • In any 3d scenario, we also eventually want either the closest object or all objects that pass the test.
  • We also have 3 general cases in 3d: Intersect (ray), Containment (is our object inside another test primitive), Proximity (is our object "close enough" to some place). A tiny sketch of those three follows below.
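To make those three cases concrete, here is a minimal HLSL sketch of each one tested against a sphere. The helper names are illustrative only, not the actual library routines.

Code Snippet
//Intersect: does a ray (normalized direction) hit the sphere?
bool IntersectRaySphere(float3 rayOrigin, float3 rayDir, float3 center, float radius)
{
    float3 oc = rayOrigin - center;
    float b = dot(oc, rayDir);
    float c = dot(oc, oc) - radius * radius;
    //Hit if the discriminant is non negative and the sphere is not fully behind the ray
    return (b * b - c) >= 0.0f && (c <= 0.0f || b <= 0.0f);
}

//Containment: is the object sphere fully inside the test sphere?
bool ContainsSphere(float3 testCenter, float testRadius, float3 objCenter, float objRadius)
{
    return distance(testCenter, objCenter) + objRadius <= testRadius;
}

//Proximity: is the object sphere "close enough" to some place?
bool IsNearSphere(float3 place, float maxDistance, float3 objCenter, float objRadius)
{
    return distance(place, objCenter) - objRadius <= maxDistance;
}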
Once those requirements are set, to perform hit detection we generally have the following 2 main scenarios: you use an analytical function, or you use a map.

So let's show some examples; in this first post, I'll only speak about analytical functions.

In this case, we generally follow the usual pattern: convert our input into the desired structure (point to ray, for example), and every mode follows this pseudo code (C# version):


Code Snippet
bool[] hitResults = new bool[objectCount];
List<int> hitObjectList = new List<int>();

for (int i = 0; i < objectCount; i++)
{
    var obj = testObjects[i];

    // Run the intersection/containment/proximity test against the user input
    bool hr = PerformTest(userObject, obj);
    hitResults[i] = hr;

    // Keep a filtered list of the indices of hit objects
    if (hr)
    {
        hitObjectList.Add(i);
    }
}


Pretty much all test modes follow this pattern; the only difference afterwards is the test function.

Obviously, when we start to reach a certain number of elements, this can become slow. And many times our objects might already be on the GPU, so we are not going to read them back to the CPU.

Translating this into HLSL is extremely straightforward; here is some pseudo code for it:


Code Snippet
bool PerformTest(SomeUserObject userInput, SomeStruct obj)
{
    return //Perform your intersection/containment/proximity routine here
}

StructuredBuffer<SomeStruct> ObjectBuffers : register(t0);

RWStructuredBuffer<uint> RWObjectHitResultBuffer : register(u0);

//Use one or the other on slot u1, depending on the chosen mode
AppendStructuredBuffer<uint> AppendObjectHitBuffer : register(u1);
RWStructuredBuffer<uint> RWObjectHitBuffer : register(u1); //In this case, UAV should have a counter flag

cbuffer cbUserInput : register(b0)
{
    SomeUserObject userInput;
};

cbuffer cbObjectData : register(b1)
{
    uint objectCount;
};

[numthreads(128, 1, 1)]
void CS(uint3 i : SV_DispatchThreadID)
{
    //Discard threads past the object count (dispatch size is rounded up)
    if (i.x >= objectCount)
        return;

    uint oid = i.x;

    SomeStruct obj = ObjectBuffers[oid];

    bool hitResult = PerformTest(userInput, obj);
    RWObjectHitResultBuffer[oid] = hitResult;
    if (hitResult)
    {
        //If we use append buffer
        AppendObjectHitBuffer.Append(oid);

        //If we use counter buffer
        uint idx = RWObjectHitBuffer.IncrementCounter();
        RWObjectHitBuffer[idx] = oid;
    }
}


As you can see, there's no huge difference there.

It's pretty straightforward to perform ray to sphere/triangle/box tests as a starter.
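As an illustration, here is a hedged sketch of the ray/box case using the classic slab method (the sphere case was sketched earlier; all names here are illustrative, not the actual library routines):

Code Snippet
//Ray vs axis-aligned box, slab method.
//rayDirInv = 1.0 / rayDirection, computed once per ray.
bool IntersectRayBox(float3 rayOrigin, float3 rayDirInv, float3 boxMin, float3 boxMax)
{
    float3 t1 = (boxMin - rayOrigin) * rayDirInv;
    float3 t2 = (boxMax - rayOrigin) * rayDirInv;

    //Entry and exit distances per axis
    float3 tEntry = min(t1, t2);
    float3 tExit = max(t1, t2);

    float tmin = max(max(tEntry.x, tEntry.y), tEntry.z);
    float tmax = min(min(tExit.x, tExit.y), tExit.z);

    //Hit if the interval is non empty and not fully behind the ray origin
    return tmax >= tmin && tmax >= 0.0f;
}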

Rectangle selection is also extremely simple:

  • Construct a 2d transformation for the screen area to check
  • Multiply the inverse by the camera projection
  • Build a frustum from this
  • Perform an object/frustum test instead of a ray test.
Here is a small range test example:



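For the frustum test itself, here is a minimal sphere/frustum sketch, assuming the crop matrix for the selected rectangle has been combined with the camera view projection on the CPU, and the usual D3D row-vector convention (mul(pos, matrix)). All names are illustrative, not the actual library routines.

Code Snippet
bool FrustumSphereTest(float4x4 cropViewProjection, float3 center, float radius)
{
    //With row vectors, the Gribb/Hartmann planes come from the matrix columns
    float4x4 m = transpose(cropViewProjection); //m[i] is column i
    float4 planes[6] =
    {
        m[3] + m[0], //left
        m[3] - m[0], //right
        m[3] + m[1], //bottom
        m[3] - m[1], //top
        m[2],        //near (D3D clip space, z in 0..1)
        m[3] - m[2]  //far
    };

    [unroll]
    for (int p = 0; p < 6; p++)
    {
        float4 plane = planes[p] / length(planes[p].xyz); //normalize the plane
        if (dot(plane.xyz, center) + plane.w < -radius)
            return false; //bounding sphere fully outside one plane: no hit
    }
    return true; //inside or intersecting the selection frustum
}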
Simple no? ;)

Now I can foresee 2 important questions that our astute reader is probably already thinking of:
  • How do we get the closest object?
  • What if we have several user inputs?


Of course, there are solutions for that.

Closest object.

First, we will consider that our test function is also capable of returning the distance.

So we modify our code as follows:


Code Snippet
struct HitResult
{
    uint objectID;
    float distanceToObject;
};

bool PerformTest(SomeUserObject userInput, SomeStruct obj, out float distanceToObject)
{
    return //Perform your intersection/containment/proximity routine here, writing the distance
}

StructuredBuffer<SomeStruct> ObjectBuffers : register(t0);

RWStructuredBuffer<uint> RWObjectHitResultBuffer : register(u0);

AppendStructuredBuffer<HitResult> AppendObjectHitBuffer : register(u1);

cbuffer cbUserInput : register(b0)
{
    SomeUserObject userInput;
};

cbuffer cbObjectData : register(b1)
{
    uint objectCount;
};

[numthreads(128, 1, 1)]
void CS(uint3 i : SV_DispatchThreadID)
{
    if (i.x >= objectCount)
        return;

    uint oid = i.x;

    SomeStruct obj = ObjectBuffers[oid];

    float d;
    bool hitResult = PerformTest(userInput, obj, d);
    RWObjectHitResultBuffer[oid] = hitResult;
    if (hitResult)
    {
        HitResult hr;
        hr.objectID = oid;
        hr.distanceToObject = d;
        //If we use append buffer
        AppendObjectHitBuffer.Append(hr);
    }
}


Now our buffer also contains the distance to each object; the only step left is to grab the closest element.

We have 2 ways to work that out:

  • Use a compute shader: use InterlockedMin to keep the closest element (since distance is generally positive, asuint preserves ordering and there's no sign-flipping float-to-uint trick to apply), then perform another pass to check whether an element's distance is equal to the minimum. See the sketch right after this list.
  • Use the pipeline: the depth buffer is pretty good at keeping the closest element, so we might as well let it do the work for us ;)
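For the compute route, here is a minimal two-pass sketch, reusing the HitResult buffer produced above and assuming RWMinBitsBuffer[0] is cleared to 0xFFFFFFFF before dispatch (all buffer and kernel names are illustrative):

Code Snippet
struct HitResult
{
    uint objectID;
    float distanceToObject;
};

StructuredBuffer<HitResult> ObjectHitBuffer : register(t0);
RWStructuredBuffer<uint> RWMinBitsBuffer : register(u0); //cleared to 0xFFFFFFFF
RWStructuredBuffer<uint> RWClosestObjectBuffer : register(u1);

cbuffer cbHitData : register(b0)
{
    uint hitCount; //copied back from the UAV counter
};

//Pass 1: atomic min over the raw distance bits (ordering is preserved
//for positive floats reinterpreted as uint)
[numthreads(128, 1, 1)]
void CSMin(uint3 tid : SV_DispatchThreadID)
{
    if (tid.x >= hitCount)
        return;
    InterlockedMin(RWMinBitsBuffer[0], asuint(ObjectHitBuffer[tid.x].distanceToObject));
}

//Pass 2: any element whose distance equals the minimum is the closest one
[numthreads(128, 1, 1)]
void CSSelect(uint3 tid : SV_DispatchThreadID)
{
    if (tid.x >= hitCount)
        return;
    HitResult hr = ObjectHitBuffer[tid.x];
    if (asuint(hr.distanceToObject) == RWMinBitsBuffer[0])
        RWClosestObjectBuffer[0] = hr.objectID;
}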
Using the pipeline is extremely easy as well; the process is as follows:
  • Create a 1x1 render target (uint), associated with a 1x1 depth buffer
  • Prepare an indirect draw buffer (from the UAV counter; in D3D11, CopyStructureCount can copy the hidden counter into the args buffer), and draw as a point list: write to pixel 0 in the vertex shader, and pass the distance so it's written to the depth buffer.
Since code speaks more than words, here it is:


Code Snippet
struct HitResult
{
    uint objectID;
    float distanceToObject;
};

StructuredBuffer<HitResult> ObjectHitBuffer : register(t0);

cbuffer cbObjectData : register(b1)
{
    float invFarPlane;
};

void VS(uint iv : SV_VertexID, out float4 p : SV_Position,
    out float objDist : OBJECTDISTANCE,
    out uint objID : OBJECTID)
{
    p = float4(0, 0, 0, 1); //We render to a 1x1 texture, position is always 0
    HitResult hr = ObjectHitBuffer[iv];

    objID = hr.objectID;
    //Make sure we go in 0-1 range
    objDist = hr.distanceToObject * invFarPlane;
}

void PS(float4 p : SV_Position, float objDist : OBJECTDISTANCE, uint objID : OBJECTID,
    out uint closestObjID : SV_Target0, out float d : SV_Depth)
{
    //Just push object id
    closestObjID = objID;
    d = objDist; //Depth test will preserve the closest distance
}


Now our pixel contains our closest object (clear the render target to 0xFFFFFFFF so that value means "no hit").

To finish this first part, let's now handle the case where we have multiple "user inputs".

We want to know the closest object per user.

This is not much more complicated (but of course it will cost a test for each user/object pair).

Code Snippet
struct HitResult
{
    uint objectID;
    float distanceToObject;
};

bool PerformTest(UserInput userInput, SomeStruct obj, out float distanceToObject)
{
    return //Perform your intersection/containment/proximity routine here
}

StructuredBuffer<SomeStruct> ObjectBuffers : register(t0);
StructuredBuffer<UserInput> UserInputBuffer : register(t1);

RWStructuredBuffer<uint> RWObjectHitResultBuffer : register(u0);

RWStructuredBuffer<HitResult> RWObjectHitBuffer : register(u1); //Counter flag
RWStructuredBuffer<uint> RWObjectHitUserIDBuffer : register(u2);

cbuffer cbObjectData : register(b0)
{
    uint objectCount;
    uint userCount;
};

[numthreads(128, 1, 1)]
void CS(uint3 tid : SV_DispatchThreadID)
{
    if (tid.x >= objectCount)
        return;

    uint oid = tid.x;

    SomeStruct obj = ObjectBuffers[oid];
    uint hitCount = 0;
    for (uint i = 0; i < userCount; i++)
    {
        //Fetch the current user input and test it against our object
        UserInput userInput = UserInputBuffer[i];

        float d;
        bool hitResult = PerformTest(userInput, obj, d);

        if (hitResult)
        {
            hitCount++;
            HitResult hr;
            hr.objectID = oid;
            hr.distanceToObject = d;

            uint idx = RWObjectHitBuffer.IncrementCounter();
            RWObjectHitBuffer[idx] = hr;
            RWObjectHitUserIDBuffer[idx] = i;
        }
    }
    //Store how many users hit this object
    RWObjectHitResultBuffer[oid] = hitCount;
}


Now we have a buffer with every hit from every user (here is a small example screenshot):

[screenshot: per-user hit results]

So instead of using a 1x1 texture, we use an Nx1 texture (where N is the user input count).

The process to get the closest element is (almost) the same as for a single input.

The only difference: in the vertex shader, route the objectID/distance to the relevant user's pixel (see the sketch below), and you're set!
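Here is a hedged sketch of that routing (buffer names are illustrative): same idea as the 1x1 version, but each point is offset to its user's column in the Nx1 target.

Code Snippet
struct HitResult
{
    uint objectID;
    float distanceToObject;
};

StructuredBuffer<HitResult> ObjectHitBuffer : register(t0);
StructuredBuffer<uint> ObjectHitUserIDBuffer : register(t1);

cbuffer cbUserData : register(b0)
{
    float invFarPlane;
    float userCount; //N, the width of the render target
};

void VS(uint iv : SV_VertexID, out float4 p : SV_Position,
    out float objDist : OBJECTDISTANCE, out uint objID : OBJECTID)
{
    HitResult hr = ObjectHitBuffer[iv];
    uint userID = ObjectHitUserIDBuffer[iv];

    //Map the user index to the center of its pixel, in clip space (-1..1)
    float x = ((userID + 0.5f) / userCount) * 2.0f - 1.0f;
    p = float4(x, 0, 0, 1);

    objID = hr.objectID;
    objDist = hr.distanceToObject * invFarPlane; //0-1 range for the depth buffer
}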


That's it for the first part. Next round, I'll explain how the "map technique" works, stay tuned.
