Thursday 31 July 2014

Timeline

One obviously useful feature for most graphics and audio toolsets is some form of timeline.

In 4v you have the native one, which with a little effort could become OK (add undo, keyboard keyframe moving, export/import, and a better color editor), but in its current state it is at best irritating. It's not as bad as the HLSL code completion in 4v, but that's another story ;)

Then you have the more standalone ones; Vezer looks pretty cool but is Mac only, so not for me.

You also have Duration, which is likely the best, but it has two major flaws: no scrolling and no groups. Its OpenGL rendering would also not be so easy to integrate into my tool.

Then you have the brand new Posh/SVG based 4v timeliner, presented as usual as the next big thing that will make your grandmother dance rock and roll again, and so on.

So the first pre-alpha release tests were, let's call it... disappointing. Only one track type, no interpolation, and enough UI bugs that it could eat a piece of wood like piranhas would eat my bum.
The first "public" release is, well... appalling. I personally hardly see the point of releasing something in that state, except to show off your incompetence at building user interfaces.

So the first thing you do when you build a timeliner is prepare a decent track setup; there is NO point in showing 5 keyframes and one track. So I start to prepare 8 tracks with 40 keyframes each (which is really a small setup), and then everything goes laggy: I get 40% CPU usage (where the hell does that go?), selecting and moving several keyframes just blows everything up, and you can't even recover, you have to restart the software. Usability at its best.

So I start to report those issues, and happily suggest that DirectX11/OpenGL rendering for that type of thing would likely fit and scale much better, but the only reply I get is "yes I know it's slow but I don't care, DX is not as interesting as SVG". So I try to explain that if your UI doesn't scale you might want to look for something else, but I get the usual "I don't give a fuck" type of attitude, which pisses me off so much. We also suggested that some of the changes could easily make it into the old one (some undo + keyboard support); same answer, "I don't give a fuck".

So I don't really understand the point (and as a user I'm slightly pissed off that you pay 500 euros a license to get that type of answer). I'm also really horrified that this took several months to produce, and I'm getting really worried about the user interface in the next vvvv50: since there is no will to improve the old version, the decision is to produce a brand new UI framework which is just as bad, so I have more or less no faith that we will ever get a decently smooth user interface in either the previous or the next gen vvvv.

But let's stop whining and go to the more fun part ;)

So I'm still without a timeliner: I now have two 10%-baked unusable pieces of junk, and some others which are nice but which I can't integrate.

So since I also mentioned that DX11 would be a good candidate, I decided to spend an afternoon writing track renderers. That would also prove the point that modern rendering wins against this browser jazz. I decided to focus only on the rendering for now, since it's not too hard to hit-test and drag a point after all ;)

The main focus is of course fast rendering; I want a smooth user interface ;)

I just push my tracks into buffers and then render them. I decided to start with the color track, which is really simple.

Code Snippet
struct ColorKeyFrame
{
    float4 color;
    float time;
    int trackid;
};

struct KeyFrameLink
{
    int left;
    int right;
};

StructuredBuffer<ColorKeyFrame> ColorKeyFrameBuffer;
StructuredBuffer<KeyFrameLink> ColorLinkBuffer;
StructuredBuffer<float2> TrackOffsetBuffer; //x = top, y = height
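
On the CPU side these are plain blittable structs; here is a sketch of the upload, assuming SharpDX (names are mine, any DX11 wrapper does the same dance):

Code Snippet
using System.Runtime.InteropServices;
using SharpDX;
using SharpDX.Direct3D11;
using Buffer = SharpDX.Direct3D11.Buffer;

[StructLayout(LayoutKind.Sequential)]
struct ColorKeyFrame
{
    public Vector4 Color; // float4 color
    public float Time;    // float time
    public int TrackId;   // int trackid
}

static Buffer CreateStructuredBuffer<T>(Device device, T[] data) where T : struct
{
    int stride = Marshal.SizeOf(typeof(T));
    return Buffer.Create(device, data, new BufferDescription
    {
        SizeInBytes = stride * data.Length,
        StructureByteStride = stride,
        BindFlags = BindFlags.ShaderResource,
        OptionFlags = ResourceOptionFlags.BufferStructured,
        Usage = ResourceUsage.Default, // keyframes get edited, so not Immutable
        CpuAccessFlags = CpuAccessFlags.None
    });
}
// bind with: new ShaderResourceView(device, buffer)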

Render a bulk of instanced quads; grab the color from the left keyframe and from the right keyframe, and position the quad with a map function (which stretches the quad's unit u coordinate over the [left.time, right.time] span).

Code Snippet
psInput VS(vsInput input)
{
    psInput output;

    KeyFrameLink cl = ColorLinkBuffer[input.ii];
    ColorKeyFrame left = ColorKeyFrameBuffer[cl.left];
    ColorKeyFrame right = ColorKeyFrameBuffer[cl.right];
    int tid = input.ii;
    tid = tid / cpPerTrack; //which track this instance belongs to

    float2 pos = input.uv;
    float2 offset = TrackOffsetBuffer[tid];

    //stretch the quad's unit u across the keyframe time span, then to [-1,1]
    pos.x = map(pos.x,left.time,right.time);
    pos.x *= 2.0f;
    pos.x -= 1.0f;

    //place the quad into its track band
    pos.y += offset.x;
    pos.y *= offset.y;

    output.pos = mul(float4(pos,0,1),tRange);
    output.colstart = left.color;
    output.colend = right.color;
    output.uv = input.uv.x;

    return output;
}

Send to the (hardcore) pixel shader:

Code Snippet
float4 PS(psInput input) : SV_Target
{
    return lerp(input.colstart,input.colend, input.uv);
}

That was so hard ;)

Now let's go for the value track. Positioning the keyframe points (one small segment instance each) is so simple that there's nothing to say about it. So let's build the connections instead, and add the fact that I want more curves (aka tweens):


Code Snippet
struct ValueKeyFrame
{
    float value;
    float time;
    int trackid;
    int interpolation;
};

struct KeyFrameLink
{
    int left;
    int right;
};


Of course the usual buffers are there too; we send a PatchListWithOneControlPoint batch for each connection (we don't need 2 control points, that's beautiful).

Code Snippet
struct linkData
{
    int left : CPOINT0;
    int right : CPOINT1;
};
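
Submitting all the connections is then a single draw with the one-control-point patch topology (a sketch, SharpDX-style; linkBuffer and connectionCount are my placeholder names):

Code Snippet
// one control point per patch, the tessellator expands each patch into an isoline
context.InputAssembler.PrimitiveTopology = SharpDX.Direct3D.PrimitiveTopology.PatchListWith1ControlPoints;
context.InputAssembler.SetVertexBuffers(0,
    new VertexBufferBinding(linkBuffer, sizeof(int) * 2, 0)); // one linkData per connection
context.Draw(connectionCount, 0);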

Now we only need to pass the keyframe IDs through (the vertex and hull shaders are simple pass-throughs; the hull constant function just sets the isoline tessellation factors) until we reach the domain shader:

Code Snippet
[domain("isoline")]
psInput DS(hsConstantOutput input, OutputPatch<linkData, 1> op, float2 uv : SV_DomainLocation)
{
    psInput output;
    float t = uv.x;

    ValueKeyFrame kl = ValueKeyFrameBuffer[op[0].left];
    ValueKeyFrame kr = ValueKeyFrameBuffer[op[0].right];

    float2 start = ComputePosition(op[0].left);
    float2 end = ComputePosition(op[0].right);

    float x = lerp(start.x,end.x,t);
    float y = lerp(start.y,end.y,lerpFunc(t,kl.interpolation));

    float2 pos = float2(x,y);
    output.pos = mul(float4(pos,0.0f, 1.0f), tRange);

    return output;
}

lerpFunc selects the interpolation function; these are all the basic tween modes.
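
For illustration, the tween modes are just the classic easing curves; here is roughly what lerpFunc amounts to, written as C# (the HLSL is a direct port, and the mode ids are my own assumption):

Code Snippet
static float LerpFunc(float t, int interpolation)
{
    switch (interpolation)
    {
        case 0: return t;                // linear
        case 1: return t * t;            // quadratic ease in
        case 2: return t * (2.0f - t);   // quadratic ease out
        case 3: return t < 0.5f          // quadratic ease in/out
            ? 2.0f * t * t
            : -1.0f + (4.0f - 2.0f * t) * t;
        default: return t;
    }
}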

Now well, the bang track is a simple instanced quad or line, nothing to speak about.

Then let's add the wave track. I thought it would be a bit complicated, but it was so easy it's almost embarrassing. First, get your favorite audio API (I used Bass). Load a music file, read all the samples as float and feed them to a big fat structured buffer (float or float2). I chose float, so I can push any type of multi-channel data later.
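
For reference, decoding the whole file with Bass looks roughly like this (a sketch using the Bass.Net wrapper, error handling omitted):

Code Snippet
using Un4seen.Bass;

// decode-only stream, float samples (no playback)
Bass.BASS_Init(-1, 44100, BASSInit.BASS_DEVICE_DEFAULT, IntPtr.Zero);
int stream = Bass.BASS_StreamCreateFile("music.mp3", 0L, 0L,
    BASSFlag.BASS_STREAM_DECODE | BASSFlag.BASS_SAMPLE_FLOAT);

long byteLength = Bass.BASS_ChannelGetLength(stream);       // length in bytes
float[] samples = new float[byteLength / sizeof(float)];    // interleaved channels
Bass.BASS_ChannelGetData(stream, samples, (int)byteLength); // read everything
// -> feed 'samples' to the structured buffer (WaveDataBuffer)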

Render a quad, then here is the insane pixel shader:

Code Snippet
float4 PS(psInput input) : SV_Target
{
    uint cnt, stride;
    WaveDataBuffer.GetDimensions(cnt,stride);
    float frames = cnt * 0.5f; // interleaved stereo: two floats per frame
    uint frame = (uint)(input.uv.x * frames);

    float sampleleft = WaveDataBuffer[frame*2];
    float sampleright = WaveDataBuffer[frame*2+1]; // unused here, single channel display

    float y = input.uv.y;
    y *= 2.0f;
    y -= 1.0f; // remap to [-1,1]

    return (abs(sampleleft) > abs(y)) + 0.1f; // 1.1 inside the waveform, 0.1 outside
}

That was hardcore ;) I could do a bit of oversampling for sure, but it already looks pretty OK.

So here we go, then you add a small ruler.

And now pushing some 4000 keyframes:


Unzoomed view:


And let's push the UI a little bit, since I said I wanted it to scale: 14K keyframes:



3 hours well spent: I now have a pretty decent, scalable timeline renderer, which is a pretty huge step forward (80 fps while rendering every frame, with no caching, is rather good so far).

To conclude, hlsl > all :)

Wednesday 16 July 2014

OpenCV, Compute and immutability

I know some of my friends rather regularly use the very nice OpenCV contribution from Elliot Woods.

Most of the time people use the camera/projector calibration tool. This works pretty well (it could do with some UI improvements), but every time I wanted to look at it I ended up with the same problem: you need to download the whole Image pack.

This is a great pack of course, but in that scenario downloading a 500 MB bulk of dlls (which can also depend on the version) is, let's call it, "not ideal". OK, in these times of super fast internet you would think it's fine, but my hard drive doesn't like it (so I don't either). All that to call a single OpenCV function!

So I just wrote a small P/Invoke dll (using static library linking instead of dynamic), plus a 20-line dynamic plugin to call the function, and here we go: a 1 megabyte dll which doesn't need any other external dependency. I like minimalism ;)
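
For the curious, the shape of it is just this (a sketch: the dll name, export name and signature below are placeholders, not the actual wrapper):

Code Snippet
using System.Runtime.InteropServices;

static class CvCalibration
{
    // export from the statically linked OpenCV wrapper dll
    // (illustrative signature, not the real one)
    [DllImport("CvCalibrateWrapper.dll", CallingConvention = CallingConvention.Cdecl)]
    public static extern double CalibrateCamera(
        [In] float[] objectPoints,  // xyz triplets
        [In] float[] imagePoints,   // xy pairs
        int pointCount,
        int imageWidth, int imageHeight,
        [Out] float[] cameraMatrix, // 3x3, row major
        [Out] float[] distCoeffs);  // k1 k2 p1 p2 k3
}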

I remember I've wanted to do this for a while, and considering the amount of (no) time it took, I feel very embarrassed ;)



Now, after this there are a (few) things I wanted to add/change in that tool.

First, the point selection is nice in some cases, but not so nice in others. Basically the current technique renders object-space coordinates into a texture, then you just sample that texture.

If you have a model crammed with small polygons it's fine, but what I would generally like is simply to get the closest vertex from a triangle raycast. Since I don't want to blow up 100k rays on the CPU, and the 3d model is already on the GPU (as obviously we want to render it), let's do a little bit of compute shader ;)

So first let's load the model into one big fat buffer (to avoid the subset annoyance, a simple prefix sum on the indices, sketched after the snippet), then we have the following data structures:

Code Snippet
StructuredBuffer<float3> PositionBuffer : POSITIONBUFFER;
StructuredBuffer<uint3> IndexBuffer : INDEXBUFFER;

AppendStructuredBuffer<float3> AppendVertexHitBuffer : APPENDVERTEXHITBUFFER;

float3 raypos : RAYPOSITION;
float3 raydir : RAYDIRECTION;

int FaceCount : FACECOUNT;

float eps : EPSILON = 0.000001f;
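
About that prefix sum: merging the subsets just means appending every subset's vertices into one array and re-basing each subset's indices by the running vertex count. A CPU-side sketch (Positions/Indices per subset are assumed names):

Code Snippet
var positions = new List<Vector3>();
var indices = new List<uint>();
uint vertexOffset = 0; // running prefix sum of vertex counts

foreach (var subset in subsets)
{
    positions.AddRange(subset.Positions);
    foreach (uint i in subset.Indices)
    {
        indices.Add(i + vertexOffset); // re-base into the big buffer
    }
    vertexOffset += (uint)subset.Positions.Length;
}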

Pretty simple: the mouse position is converted back to a ray, and we use the append buffer to collect potential candidates (since we might hit several triangles).
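
Converting the mouse position back to a ray is the usual unproject dance; a sketch with SharpDX's Vector3.Unproject (viewport size and matrices assumed at hand):

Code Snippet
var wvp = world * view * projection;

// unproject the mouse position at the near and far planes
Vector3 nearPoint = Vector3.Unproject(new Vector3(mouseX, mouseY, 0.0f),
    0, 0, viewportWidth, viewportHeight, 0.0f, 1.0f, wvp);
Vector3 farPoint = Vector3.Unproject(new Vector3(mouseX, mouseY, 1.0f),
    0, 0, viewportWidth, viewportHeight, 0.0f, 1.0f, wvp);

Vector3 rayPos = nearPoint;                               // -> raypos
Vector3 rayDir = Vector3.Normalize(farPoint - nearPoint); // -> raydir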

Now here is our ray shader (I omit the intersection formula, which is the same as the one at http://www.geometrictools.com/):

Code Snippet
[numthreads(64,1,1)]
void CS_RayTriangle(uint3 dtid : SV_DispatchThreadID)
{
    if (dtid.x >= FaceCount) { return; }

    uint3 face = IndexBuffer[dtid.x];

    float3 p1 = PositionBuffer[face.x];
    float3 p2 = PositionBuffer[face.y];
    float3 p3 = PositionBuffer[face.z];

    float3 diff = raypos - p1;
    float3 e1 = p2 - p1;
    float3 e2 = p3 - p1;
    float3 n = normalize(cross(e1,e2));

    float DdN = dot(raydir,n);
    float fsign;

    bool hit = true;

    //ray/triangle intersection test omitted (see geometrictools.com), sets hit

    if (hit)
    {
        AppendVertexHitBuffer.Append(p1);
        AppendVertexHitBuffer.Append(p2);
        AppendVertexHitBuffer.Append(p3);
    }
}

Now when we hit a triangle, we append its 3 vertices as "candidates"; we then need to find the one closest to us. We could read the filtered data back (using CopySubresourceRegion) and finish the computation on the CPU, but that's not fun, so let's continue ;)

The first option that comes to mind is to sort the data, but that's expensive; we only want the closest element.

So first let's process all elements and write the closest distance into a single buffer:

Code Snippet
//RWMinDistanceBuffer[0] is assumed cleared to 0xFFFFFFFF before dispatch
[numthreads(64,1,1)]
void CS_MinDistance(uint3 dtid : SV_DispatchThreadID)
{
    if (dtid.x >= VertexHitCountBuffer.Load(0)) { return; }

    float3 p = VertexHitBuffer[dtid.x];

    float d = distance(raypos,p);
    uint dummy;
    InterlockedMin(RWMinDistanceBuffer[0],asuint(d),dummy);
}

Now we need to find the element that matches this closest distance:

Code Snippet
[numthreads(64,1,1)]
void CS_StoreIndex(uint3 dtid : SV_DispatchThreadID)
{
    if (dtid.x >= VertexHitCountBuffer.Load(0)) { return; }

    float3 p = VertexHitBuffer[dtid.x];
    float d = distance(raypos,p);
    uint ud = asuint(d);

    uint mind = MinDistanceBuffer[0];
    //if this element matches the minimum distance, store its index
    //(several threads may match, any one winning the write is fine)
    if (ud == mind)
    {
        RWMinElementBuffer[0] = dtid.x;
    }
}

Please note that we don't store the position directly, since interlocked operations are only allowed on int/uint types. Also, we don't handle the case where more than one candidate sits at exactly the minimum distance; it's easy to replace the store by an append (but at some point we need to decide which point we select anyway).
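
The asuint trick is safe because non-negative IEEE-754 floats keep their ordering when their raw bits are compared as unsigned integers; a quick C# sanity check:

Code Snippet
float a = 0.25f, b = 3.5f;
uint ua = System.BitConverter.ToUInt32(System.BitConverter.GetBytes(a), 0); // asuint(a)
uint ub = System.BitConverter.ToUInt32(System.BitConverter.GetBytes(b), 0); // asuint(b)
System.Diagnostics.Debug.Assert((a < b) == (ua < ub)); // ordering preserved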

And just get position:

Code Snippet
[numthreads(1,1,1)]
void CS_ExtractPosition(uint3 dtid : SV_DispatchThreadID)
{
    uint idx = RWMinElementBuffer[0];
    RWPositionBuffer[0] = VertexHitBuffer[idx];
}

Copy those 12 bytes back to the CPU and you have your closest vertex.
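
The readback itself, for completeness (a SharpDX-flavoured sketch; you keep one small staging buffer around for this):

Code Snippet
// 12-byte staging buffer (one float3), CPU readable
var staging = new SharpDX.Direct3D11.Buffer(device, new BufferDescription
{
    SizeInBytes = 12,
    Usage = ResourceUsage.Staging,
    CpuAccessFlags = CpuAccessFlags.Read,
    BindFlags = BindFlags.None,
    OptionFlags = ResourceOptionFlags.None
});

context.CopyResource(positionBuffer, staging); // GPU -> staging
var box = context.MapSubresource(staging, 0, MapMode.Read, MapFlags.None);
Vector3 closest = SharpDX.Utilities.Read<Vector3>(box.DataPointer);
context.UnmapSubresource(staging, 0);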

One part of the morning well spent ;)

Now, one feature which is always useful for a good editor (since in the end we edit points) is of course some form of undo/redo.

You have three main ways to implement undo:

  • For each action, write one function to update your model and one function to revert it. This can be really cumbersome and error prone.
  • Serialize the state, and on undo build a new (or partly modified) state from the serialized data.
  • Use immutable state.
I have a growing interest in using more immutability in general; it is safer and I like the concepts around it (OK, it doesn't map well everywhere and can consume memory), but in this case (something like 10 points plus a couple of projectors' data) it sounds like a good use case.

So here is a calibration point:

Code Snippet
public class CalibrationPoint
{
    private readonly Vector2 screenPosition;
    private readonly Vector3 objectPosition;

    public CalibrationPoint(Vector2 screenPosition, Vector3 objectPosition)
    {
        this.screenPosition = screenPosition;
        this.objectPosition = objectPosition;
    }

    public Vector2 ScreenPosition
    {
        get { return this.screenPosition; }
    }

    public Vector3 ObjectPosition
    {
        get { return this.objectPosition; }
    }
}

You can see that once we create our point, we can't change its properties anymore.

Now, to update properties, instead of setting data directly we return a new point. There's a little trick to help memory: if the property value is the same, we return the same instance:

Code Snippet
public CalibrationPoint SetScreenPosition(Vector2 screenPosition)
{
    return this.screenPosition == screenPosition ? this : new CalibrationPoint(screenPosition, this.objectPosition);
}

public CalibrationPoint SetObjectPosition(Vector3 objectPosition)
{
    return this.objectPosition == objectPosition ? this : new CalibrationPoint(this.screenPosition, objectPosition);
}

public CalibrationPoint Set(Vector2 screenPosition, Vector3 objectPosition)
{
    return this.objectPosition == objectPosition &&
        this.screenPosition == screenPosition ? this : new CalibrationPoint(screenPosition, objectPosition);
}

Now do the same for Projector and calibration data:

Code Snippet
public class Projector
{
    private readonly string name;
    private readonly IEnumerable<CalibrationPoint> points;

    public Projector(string name, IEnumerable<CalibrationPoint> points)
    {
        if (name == null)
        {
            throw new ArgumentNullException("name");
        }
        if (points == null)
        {
            throw new ArgumentNullException("points");
        }
        this.name = name;
        this.points = points;
    }

    public string Name
    {
        get { return this.name; }
    }

    public IEnumerable<CalibrationPoint> Points
    {
        get { return this.points; }
    }
}

Some of the functions to modify (create new) state:

Code Snippet
public Projector AddPoint(Vector2 screenPosition, Vector3 objectPosition)
{
    var point = new CalibrationPoint(screenPosition, objectPosition);

    return new Projector(this.name, this.points.Concat(new CalibrationPoint[] { point }));
}

public Projector RemovePoint(CalibrationPoint point)
{
    return new Projector(this.name, this.points.Where(p => p != point));
}

Calibration class:

Code Snippet
public class Calibration
{
    private readonly IEnumerable<Projector> projectors;
    private readonly CalibrationSettings settings;

    public Calibration(CalibrationSettings settings, IEnumerable<Projector> projectors)
    {
        if (settings == null)
        {
            throw new ArgumentNullException("settings");
        }
        if (projectors == null)
        {
            throw new ArgumentNullException("projectors");
        }
        this.settings = settings;
        this.projectors = projectors;
    }
}

And to update projector data:

Code Snippet
public Calibration UpdateProjector(Projector oldProjector, Projector newProjector)
{
    if (oldProjector == newProjector)
    {
        return this;
    }
    else
    {
        var projs = this.projectors.ToList();
        int idx = projs.IndexOf(oldProjector);

        if (idx >= 0)
        {
            projs[idx] = newProjector;
            return new Calibration(this.settings, projs);
        }
        else
        {
            throw new ArgumentException("This projector is not part of this calibration data", "oldProjector");
        }
    }
}

I could do the argument checks first of course, and you can use some "builder classes" to maintain those updates, but you get the point.

Now someone will say: this is a lot of work for simple classes...

But once you are done with this (not so bad) boilerplate, here is our undo stack:

Code Snippet
public class CalibrationUndoStack
{
    private readonly Stack<Calibration> undoStack;

    public CalibrationUndoStack(Calibration initial)
    {
        this.undoStack = new Stack<Calibration>();
        this.undoStack.Push(initial);
    }

    public void Apply(Func<Calibration, Calibration> commandFunc)
    {
        var newState = commandFunc(this.Current);
        if (newState != this.Current)
        {
            this.undoStack.Push(newState);
        }
    }

    public Calibration Current
    {
        get { return this.undoStack.Peek(); }
    }

    public Calibration Undo()
    {
        return this.CanUndo ? this.undoStack.Pop() : this.undoStack.Peek();
    }

    public bool CanUndo
    {
        get { return this.undoStack.Count > 1; }
    }
}

As you can see, since we always return a new state, we pass a lambda to the stack, and if the object has been modified (i.e. the function returned a new state) we push the new state. That's how easy it is.
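
Usage is then just passing lambdas around (a quick sketch; CalibrationSettings taking a matrix is as in the F# sample further down):

Code Snippet
var stack = new CalibrationUndoStack(
    new Calibration(new CalibrationSettings(Matrix.Identity), new Projector[0]));

stack.Apply(c => c.AddProjector("front")); // new state returned -> pushed
stack.Apply(c => c);                       // same instance -> nothing pushed
stack.Undo();                              // pop: Current is the initial state again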

Implementing update commands then becomes as trivial as:

Code Snippet
public static Calibration AddProjector(Calibration c, string name)
{
    return c.AddProjector(name);
}

public static Calibration RenameProjector(Calibration state, Projector projector, string newname)
{
    var p = projector.SetName(newname);
    return state.UpdateProjector(projector, p);
}

public static Calibration AddPoint(Calibration state, Projector projector, Vector2 screen, Vector3 obj)
{
    var p = projector.AddPoint(screen, obj);
    return state.UpdateProjector(projector, p);
}

public static Calibration SetScreenPoint(Calibration state, Projector projector, CalibrationPoint point, Vector2 screen)
{
    var newPoint = point.SetScreenPosition(screen);
    var newProjector = projector.UpdatePoint(point, newPoint);
    return state.UpdateProjector(projector, newProjector);
}

And as you can see, this looks pretty verbose, so here is how to do the same in F# (I love type inference, amongst many other things):

Code Snippet
module CalibrationCommandsFS =

    let addprojector (c:Calibration,n) = c.AddProjector(n)

    let renameprojector(c:Calibration,p:Projector,n) = c.UpdateProjector(p,p.SetName(n))

    let addpoint(c:Calibration,p:Projector,s,o) = c.UpdateProjector(p,p.AddPoint(s,o))

    let setscreenpoint(c:Calibration,proj:Projector,pt:CalibrationPoint,s) = c.UpdateProjector(proj,proj.UpdatePoint(pt,pt.SetScreenPosition(s)))

And to operate on calibration:

Code Snippet
let s = new CalibrationSettings(Matrix.Identity)
let empty : Projector list = []
let c = new Calibration(s, empty)

let x = new CalibrationUndoStack(c)

x.Apply(fun c -> addprojector(c,"hello"))
That's it for now, but likely more f# soon ;)