Benchmarking NME with BunnyMark

A few months ago, a friend recommended that I try BunnyMark, a cute rendering benchmark. I had been meaning to port it to Haxe to benchmark NME, and today I decided to give it a whirl.

The first two tests were performed on my laptop, comparing Flash Player against a Windows application compiled with NME. The first test uses a separate bitmap object for each bunny, so rendering performance relies more heavily on the display list.

Test 1 – Display List

10,000 bitmaps — Flash — 37 FPS
10,000 bitmaps — CPP — 29 FPS

The second test uses blitting to draw all of the bunnies onto a single bitmap. Flash supports “copyPixels”; NME supports it too, and adds “drawTiles”, which performs much faster.
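To make the idea of blitting concrete, here is a minimal Python sketch of copying one small sprite into a single shared frame buffer at several positions. It is not NME or Flash code: the nested-list buffer and the names `make_buffer` and `blit` are hypothetical stand-ins chosen for illustration, and a real renderer would copy whole rows of pixel memory rather than loop per pixel.

```python
# Toy blitting sketch: every bunny is copied into ONE shared frame buffer,
# instead of each bunny living in the display list as its own bitmap object.
# Buffers are plain nested lists of pixel values (an assumption for illustration).

def make_buffer(width, height, fill=0):
    """Create a height x width buffer filled with a single pixel value."""
    return [[fill] * width for _ in range(height)]

def blit(dest, sprite, x, y):
    """Copy `sprite` into `dest` with its top-left corner at (x, y).

    Pixels that would land outside `dest` are skipped (clipped).
    """
    dest_height, dest_width = len(dest), len(dest[0])
    for row_index, row in enumerate(sprite):
        dest_y = y + row_index
        if dest_y < 0 or dest_y >= dest_height:
            continue
        for col_index, pixel in enumerate(row):
            dest_x = x + col_index
            if 0 <= dest_x < dest_width:
                dest[dest_y][dest_x] = pixel

frame = make_buffer(8, 4)           # the single bitmap everything is drawn onto
bunny = [[1, 1], [1, 1]]            # a 2x2 stand-in for the bunny image
positions = [(0, 0), (3, 1), (7, 3)]

for bunny_x, bunny_y in positions:  # one copy per bunny, no per-bunny objects
    blit(frame, bunny, bunny_x, bunny_y)
```

The last position hangs off the bottom-right corner of the buffer, so only one of its four pixels is copied.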

Test 2 – Blitting

10,000 objects — Flash — 59 FPS
10,000 objects — CPP — 60+ FPS

30,000 objects — Flash — 21 FPS
30,000 objects — CPP — 44 FPS

Blitting was the fastest rendering method in this benchmark. Flash performed well, but at 30,000 objects NME ran at just over twice its frame rate (44 FPS versus 21 FPS).
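For anyone who wants to check the arithmetic, this short Python sketch computes the CPP-to-Flash frame rate ratio from the figures reported above. The 10,000-object blitting run is left out on purpose: the CPP result there is "60+ FPS", which only gives a lower bound. The dictionary layout and the `speedup` helper are my own, for illustration only.

```python
# Frame rates copied from the results above (frames per second).
results = {
    ("display list", 10_000): {"flash": 37, "cpp": 29},
    ("blitting", 30_000): {"flash": 21, "cpp": 44},
}

def speedup(flash_fps, cpp_fps):
    """How many times faster the CPP build ran than Flash (below 1.0 means slower)."""
    return cpp_fps / flash_fps

ratios = {
    test: speedup(fps["flash"], fps["cpp"])
    for test, fps in results.items()
}

for (method, count), ratio in ratios.items():
    print(f"{method}, {count:,} objects: CPP ran at {ratio:.2f}x the Flash frame rate")
```

A ratio below 1.0 for the display list test reflects Flash being faster there, while the blitting ratio is a little above 2.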

Test 3 – Mobile Performance

The real question, in my mind, was how things would perform on a mobile device. The original BunnyMark post I linked above said that Philippe tested BunnyMark using AIR on his Android phone, and was able to display 600 bunnies at 30 FPS. I wondered how many bunnies I would be able to display using NME.

The Android phone I have for testing is an LG Optimus S. It has a 600 MHz processor, so it certainly is not the fastest Android phone on the market. I was able to render nearly 4,000 bunnies at 30 FPS. If Philippe and I were running the benchmark on the same phone, that would be a performance increase of nearly seven times.
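The "nearly seven times" figure is simple division, sketched below in Python along with the per-bunny time budget it implies at 30 FPS. I treat "nearly 4,000" as exactly 4,000, so both numbers are rough upper and lower bounds rather than measurements, and the variable names are mine.

```python
air_bunnies = 600      # Philippe's AIR result at 30 FPS
nme_bunnies = 4_000    # "nearly 4,000" in my test, rounded up (assumption)
target_fps = 30

# How many times more bunnies NME displayed at the same frame rate.
bunny_ratio = nme_bunnies / air_bunnies

# Time available per frame, and per bunny within that frame.
frame_budget_ms = 1000 / target_fps
per_bunny_us = frame_budget_ms * 1000 / nme_bunnies

print(f"bunny ratio: {bunny_ratio:.1f}x")
print(f"frame budget: {frame_budget_ms:.1f} ms, about {per_bunny_us:.1f} microseconds per bunny")
```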

  • Achmad Aulia Noorhakim

    I don’t think you can run an AIR app on a 600 MHz Android phone (mine is a Galaxy Mini), so I think the difference should be way more than just seven times. It’s also interesting that on PC the performance of CPP is *not* trashing Flash’s performance. What does NME use for CPP on PC? SDL? Or OpenGL (for the display list)?

  • http://www.joshuagranick.com Joshua Granick

    It looks like Windows is probably using OpenGL.

    One reason for Flash performing faster with the display list may be that NME is creating GL surfaces for each object, rather than drawing with software. If you are drawing with software, like Flash, then having many objects in your display list is probably not as much of a concern.

    On the other hand, when I have worked with the source code for NME, I have focused primarily on mobile features and performance. Since mobile uses OpenGL ES instead of OpenGL, the two probably have different renderers, which means the OpenGL ES renderer in NME may be more optimized than the desktop renderer. Even so, the desktop renderer generally appears to be running on par with or faster than Flash (it may not be “insane”, but 2x the speed of Flash on the blitting test is pretty good!).

    I would love to see more performance on the desktop, but I know that if I have time, I’ll be spending it improving mobile somehow, for now. Maybe that will change when I design a game for the desktop and really want to push the envelope… :)

  • http://www.joshuagranick.com Joshua Granick

    On the other hand, maybe it is just this benchmark!

    This was just posted on Twitter:

    massive amount of particles for #haXe #NME: http://pastebin.com/MQJXyEWZ 1mio particles @40fps and 3mio @20fps here (AS3: 300K) http://pic.twitter.com/XXktEku

    If I am reading this right, NME is running 3 million particles at the same speed as AS3 handles 300,000? That would be a ten-fold, “Flash trashing” performance boost.

  • Guest

    Is this blog running in concrete5 with the wordpress blog application?

  • http://www.joshuagranick.com Joshua Granick

    Yes, it sure is!

  • Dinko Pavicic

    I’m not 100% sure, but I think drawTiles in NME doesn’t use blitting but polygon batching in OpenGL, with all the bunnies rendered in one draw call. In theory, that could be the reason test 1 is poor on the cpp target: it doesn’t use batching, so every sprite is its own draw call, which is probably too much for the GPU.

  • adam

    Hi, would it be possible to get a copy of your source code? I am trying out Haxe and set up a similar benchmark, but could not even get 1,000 bunnies running at 30 FPS. Obviously I am doing something wrong, and it would be great to see how you achieved this.

    Thanks,

  • http://www.joshuagranick.com Joshua Granick

    I don’t have the source code with me on this machine, but there are a few different ways to render. Using drawTriangles or drawTiles may be faster for a demo like this than copyPixels, though usually you are not trying to draw the same object on the same screen thousands of times :)

  • Don

    I don’t have the FPS counter on screen, but I was able to get 5,000 bunnies running smoothly on an iPod touch 4th gen using drawTiles. That is about 10x faster than copyPixels, but this is all ballpark.

  • http://www.joshuagranick.com Joshua Granick

    Wow, that’s great! That’s a huge boost over copyPixels.