by baggers
Well, it's been a good week.
The first thing I got working was a particle system. It was based on the techniques shown here and in one of ferris' old demos.
The base technique is very simple:
You then run 1 shader that:
And another shader that draws 1048576 quads using the positions from the current gbuffer
Finally you swap the source & destination gbuffer.
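The swap at the end is the classic ping-pong pattern. Here is a minimal CPU-side sketch of it in plain Common Lisp, assuming the update pass reads particle state from one buffer and writes the new state into the other. Ordinary arrays stand in for the GPU buffers, the "shader" just moves each particle at a made-up constant velocity, and every name is hypothetical rather than taken from cepl.

```lisp
;; CPU-side model of the ping-pong idea. Plain arrays stand in for the
;; two GPU buffers. All names are hypothetical, not cepl's API.

(defun make-particle-buffer (count)
  "One single-float x position per particle, standing in for a texture."
  (make-array count :element-type 'single-float :initial-element 0.0))

(defun update-pass (source destination dt)
  "Stand-in for the update shader: read SOURCE, write DESTINATION."
  (dotimes (i (length source) destination)
    ;; Assumed behaviour: every particle moves at a constant velocity of 1.0.
    (setf (aref destination i)
          (+ (aref source i) (* dt 1.0)))))

(defun run-frames (frame-count particle-count dt)
  "Run FRAME-COUNT update passes, swapping the buffers after each one.
Returns the buffer holding the newest positions."
  (let ((source (make-particle-buffer particle-count))
        (destination (make-particle-buffer particle-count)))
    (dotimes (frame frame-count source)
      (update-pass source destination dt)
      ;; A draw pass would read DESTINATION at this point.
      ;; Swap roles: the freshly written buffer becomes the next source.
      (rotatef source destination))))
```

Because the two buffers only trade roles, nothing is allocated per frame, which is the appeal of doing the same thing with textures on the gpu.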
The result is 1 million particles running very smoothly. I'm not sure of the final fps, as my machine was capped at 60fps. You can see the result in the picture attached, yay!
This was cool and will serve as a great base for a particle system that actually looks pretty :) However there was one fly in the ointment: my code to create the initial textures and streams was disgustingly slow.
Revisiting the code behind my abstraction over c-arrays made it very clear just how much I have learned since I started this project. I was throwing away performance and memory all over the shop!
Ok, so a little background: cepl (my lisp abstraction over gl) needs to send lots of data to the gpu. As opengl is a C library, we talk to it through cffi (the common foreign function interface). This lets us do all the usual things an ffi does: allocate 'C memory', call C functions, etc.
In cepl I have a c-array type which holds a pointer, the dimensions of the array and the datatype of the elements. I then make functions that feel very lispy to interact with this c-array, and use cffi to do the work behind the scenes. Common Lisp is a dynamic language, but it is one that compiles to machine code and has a bunch of ways to specify types and metadata that the compiler can use to optimize your code. For certain classes of problems it can outperform C (but at that point you are HEAVILY annotating your code).
One big place I was throwing away performance was in how I was converting lisp data to C data. The conversion functions have an argument for specifying the C type of the data. I was providing this, but at runtime, so the library (and compiler) had no chance to hardcode the call to the correct conversion function. This meant HUGE numbers of type lookups. This was fixed by generating functions with the types hardcoded and storing these new functions with the c-array.
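To make the difference concrete, here is a rough sketch of the idea, not cepl's actual code. It assumes cffi is already loaded (for example via Quicklisp) and uses only `cffi:foreign-alloc`, `cffi:mem-aref` and `cffi:foreign-free`. The function names are made up for illustration.

```lisp
;; Assumes CFFI is already loaded, e.g. with (ql:quickload :cffi).
;; The function names are hypothetical, not cepl's real API.

(defun fill-slow (pointer type values)
  "TYPE is a runtime value here, so it has to be resolved on every write."
  (loop for value in values
        for i from 0
        do (setf (cffi:mem-aref pointer type i) value)))

(defun make-element-setter (type)
  "Compile a setter with TYPE spliced in as a constant, so the type is
visible at compile time instead of being looked up per element."
  (compile nil `(lambda (pointer index value)
                  (setf (cffi:mem-aref pointer ',type index) value))))

(defun fill-fast (pointer setter values)
  "SETTER comes from MAKE-ELEMENT-SETTER and is reused for every element."
  (loop for value in values
        for i from 0
        do (funcall setter pointer i value)))

(defun roundtrip-demo ()
  "Write three floats through a generated setter and read them back."
  (let ((pointer (cffi:foreign-alloc :float :count 3))
        (setter (make-element-setter :float))) ; built once, stored, reused
    (unwind-protect
         (progn
           (fill-fast pointer setter '(1.0 2.0 3.0))
           (loop for i below 3
                 collect (cffi:mem-aref pointer :float i)))
      (cffi:foreign-free pointer))))
```

The call to `compile` costs something up front, which is why it only pays off when the generated function is kept around (here, alongside the array) and called many times.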
Next, I was allocating massive amounts of memory because I had written the code without thinking about performance. This is a fine thing to do if it is then easy to refactor into performant code. It was turning out to be a little ugly though, as the number of nested loops kept increasing. To combat this I decided to make functions to map across the c-arrays, so now I have:
map-c takes a c-array and a function. It calls the function on every element in the c-array and returns a new c-array.
map-c-into takes a source c-array, a destination c-array, and a function. It calls the function on every element in the source c-array and stores the result in the destination c-array.
across takes a c-array and a function. It then calls the function, passing the c-array and the indices of the element it is currently visiting. You could then destructively modify the c-array if you want to.
across-ptr takes a c-array and a function. It then calls the function, passing the pointer and the indices of the element it is currently visiting.
With these it is super easy to make functions that modify the C memory without converting between lisp & C data more than necessary.
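For a feel of what these look like, here is a stripped-down sketch, assuming cffi is loaded. It only handles one-dimensional arrays of floats, and the struct and function names are invented for the example. The real cepl functions work on its own c-array type and may take their arguments in a different order.

```lisp
;; Assumes CFFI is already loaded. A simplified 1D, float-only stand-in
;; for cepl's c-array. Every name below is hypothetical.

(defstruct (float-array (:constructor %make-float-array (pointer len)))
  pointer
  len)

(defun make-float-array (values)
  "Allocate C memory holding the single-floats in the list VALUES."
  (%make-float-array
   (cffi:foreign-alloc :float :initial-contents values)
   (length values)))

(defun free-float-array (array)
  (cffi:foreign-free (float-array-pointer array)))

(defun map-floats-into (destination function source)
  "Store (FUNCTION element) for each SOURCE element into DESTINATION."
  (let ((src (float-array-pointer source))
        (dst (float-array-pointer destination)))
    (dotimes (i (min (float-array-len source) (float-array-len destination))
                destination)
      (setf (cffi:mem-aref dst :float i)
            (funcall function (cffi:mem-aref src :float i))))))

(defun across-floats (function array)
  "Call FUNCTION with ARRAY and each index. FUNCTION may modify ARRAY."
  (dotimes (i (float-array-len array) array)
    (funcall function array i)))

(defun float-array->list (array)
  (loop for i below (float-array-len array)
        collect (cffi:mem-aref (float-array-pointer array) :float i)))

(defun double-demo ()
  "Double every element of one array into another, then read it back."
  (let ((source (make-float-array '(1.0 2.0 3.0)))
        (destination (make-float-array '(0.0 0.0 0.0))))
    (unwind-protect
         (progn
           (map-floats-into destination (lambda (x) (* x 2.0)) source)
           (float-array->list destination))
      (free-float-array source)
      (free-float-array destination))))
```

The destination is allocated once and written in place, so no intermediate lisp lists or extra C arrays get created inside the loop.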
With these done I got the time to generate and populate the textures and buffers from around 20 seconds to about 0.8 seconds. That was a damn good start :D
However I'm now in the mood for optimizing, and I'm noticing I'm still allocating more memory than I need to. After a lot of digging it turns out that this may be related to the dispatch of the generic functions (kind of like methods) used in converting types in cffi. Luckily they have a way of telling the compiler how to optimize this, so now my C struct -> lisp data conversion is much more memory efficient (and plenty faster too). However the feature to optimize the lisp data -> C struct conversion is not currently implemented.
So now my next task is to add this feature to that library and see if they accept it into the project. It would be awesome if they do, as cffi has made my entire cepl project possible, and getting my code in there would feel great.
Right, that's all for now, time to go cook food.
Ciao!
Last Edited on Sun Mar 13 2016 14:17:58 GMT-0400 (EDT)
on Mon Mar 14 2016 12:12:12 GMT-0400 (EDT)
This sounds pretty awesome! And kudos on that major performance improvement through all that you've learned 👍 It's always so satisfying to look back at an old project when you weren't as familiar with the language and you can see how much you've improved.
Anyways, I'd love to see a vidya (or gif) of this in action! Nicely done!
on Mon Mar 14 2016 12:17:17 GMT-0400 (EDT)
I had a friend come in the room while I was on skype with Jake, and I was talking about blender and my friend asked Jake what he was doing. And more or less Jake replied with something like "Yeah, I'm making something like blender" and that just blew my friend's mind lol. That's how I feel when I'm working with particles and you're making a particle engine :P
on Mon Mar 14 2016 15:32:11 GMT-0400 (EDT)
SO FAST SO MANY SO COOL
on Mon Mar 14 2016 22:39:08 GMT-0400 (EDT)