Processing 2.0 is out! Processing 2.0 is in!

Yesterday was a very important day for the Processing project, as a new stable version, “the 2.0”, has been released. This release is the result of the hard work of a small team of volunteers over the course of the past two years, plus the fundamental support and contributions from the entire Processing community. For me, this release is particularly significant since it includes a major rewrite of the OpenGL and video libraries, which represents my main contribution to the project since I became involved in it almost 5 years ago. After a long period of development, it is very satisfying to reach a point where the code is good enough to leave behind the nebulous territory of alphas and betas. Of course, a stable release like this is also a compromise between imagination and time. Despite the standing issues that result from that compromise, Processing 2.0 retains all the functionality that turned it into a widely used tool in the computational arts, while adding new features and improvements that extend its capabilities and serve as the starting point for future developments. In what follows, I’d like to describe in more detail some of the technical challenges we faced while working on the new OpenGL library, and the solutions attempted in order to deal with those challenges.

Processing 1.x already included an OpenGL-based renderer that took advantage of hardware acceleration to speed up 3D graphics. However, this renderer had to be rewritten entirely from scratch due to the changes in the OpenGL API from 2.0 to 3.0 on the desktop, and the introduction of OpenGL ES (GLES) 2.0 on mobile platforms. In fact, there are two new OpenGL-based renderers in Processing 2.0: P3D and P2D, with the latter further optimized for two-dimensional rendering.
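For instance, a sketch opts into one of these renderers through the third argument of size(); a minimal example:

void setup() {
  // P2D selects the OpenGL renderer optimized for 2D drawing;
  // use P3D instead for three-dimensional scenes
  size(400, 400, P2D);
}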

Retained mode rendering

As mentioned in earlier posts, the old and easy-to-learn immediate mode in OpenGL:

glBegin(GL_TRIANGLES);
  glVertex3f( 0.0f, 1.0f, 0.0f);
  glVertex3f(-1.0f,-1.0f, 0.0f);
  glVertex3f( 1.0f,-1.0f, 0.0f);
glEnd();

is all but gone, replaced by the use of buffer objects throughout the entire OpenGL API. For example, the immediate-mode code for rendering a triangle shown above would need to be replaced by something along the lines of (in C):

GLuint buffer;

const float vertices[] = {
   0.0f, 1.0f, 0.0f, 1.0f,
  -1.0f,-1.0f, 0.0f, 1.0f,
   1.0f,-1.0f, 0.0f, 1.0f,
};

// Create the buffer object and upload the vertex data to GPU memory
glGenBuffers(1, &buffer);

glBindBuffer(GL_ARRAY_BUFFER, buffer);
glBufferData(GL_ARRAY_BUFFER, sizeof(vertices), vertices, GL_STATIC_DRAW);
glBindBuffer(GL_ARRAY_BUFFER, 0);

// Bind the buffer again and point vertex attribute 0 at the positions
glBindBuffer(GL_ARRAY_BUFFER, buffer);
glEnableVertexAttribArray(0);
glVertexAttribPointer(0, 4, GL_FLOAT, GL_FALSE, 0, 0);

glDrawArrays(GL_TRIANGLES, 0, 3);

The use of vertex buffer objects, although requiring more complex code, also results in faster rendering because the geometry can be organized in batches that are sent to the GPU with fewer calls. In addition, geometry that remains unchanged can be stored in a “static” buffer that is uploaded to the GPU memory only once, which offers significant performance improvements.
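Conversely (just a sketch of the idea, reusing the buffer and vertices from the example above), geometry that changes every frame can be allocated with the GL_DYNAMIC_DRAW hint and updated in place with glBufferSubData, without reallocating GPU storage:

glBindBuffer(GL_ARRAY_BUFFER, buffer);
// allocate the storage once, passing NULL data and the dynamic hint
glBufferData(GL_ARRAY_BUFFER, sizeof(vertices), NULL, GL_DYNAMIC_DRAW);
// ...then refresh the contents on each frame
glBufferSubData(GL_ARRAY_BUFFER, 0, sizeof(vertices), vertices);
glBindBuffer(GL_ARRAY_BUFFER, 0);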

What impact do these changes in the OpenGL API have on Processing? First of all, Processing’s drawing API is basically immediate-mode, with functions like beginShape and endShape mapping very closely to the gl-equivalents:

beginShape(TRIANGLES);
vertex( 0, 1, 0);
vertex(-1,-1, 0);
vertex( 1,-1, 0);
endShape();

Therefore, the buffer-based operation of OpenGL 3.0+ and GLES 2.0+ posed two related challenges: first, how to reimplement the OpenGL renderer so that the immediate-mode API in Processing continues to work just as it did in 1.x while offering reasonable performance; and second, how to extend the drawing API to incorporate retained-mode rendering (i.e., enabling static geometry to be stored in an object) in a way that is consistent with the existing API.

The solution to the first challenge is purely technical and invisible to the Processing user. It involved a lot of code behind the scenes that tries to batch the geometry from the immediate-mode calls (rect, ellipse, etc.) as efficiently as possible. Implementing this code properly is tricky because gl-state changes, like disabling/enabling lighting or texturing, break the batching, and because the buffers must be kept from growing too large (which is particularly important on mobile).

In fact, continuous changes to the gl-state can have a substantial negative effect on performance and make the new OpenGL renderer slower than the one in 1.x, even for relatively simple code. For example, rendering a large number of interleaved textured/non-textured rectangles, as in:

for (int i = 0; i < 1000; i++) {
  image(img, 0, 0, 50, 50);
  rect(0, 0, 50, 50);
}

would perform slower than grouping the textured and non-textured rectangles together:

for (int i = 0; i < 1000; i++) {
  image(img, 0, 0, 50, 50);
}

for (int i = 0; i < 1000; i++) {
  rect(0, 0, 50, 50);
}

although the visual output will be the same in both cases. Here, the automatic batching that the 2.0 renderer performs internally to optimize performance cannot help. Even though this seems a rather contrived example, it is important to keep this limitation of the new renderer in mind, especially when working with more complex sketches where certain rendering paths might inadvertently lead to this kind of scenario.

The second challenge, integrating retained-mode rendering into Processing’s API, was addressed with the introduction (or rather, the re-introduction, since the class already existed in 1.x) of the PShape class. In a nutshell, the PShape class now contains methods that reproduce the immediate-mode functions, which allow you to “draw and store” the geometry inside a PShape object:

PShape sh;
float angle;

void setup() {
  size(400, 400, P3D);
  sh = createShape();
  sh.beginShape();
  sh.fill(180);
  sh.vertex(-200, -200);
  sh.vertex(200, -200);
  sh.vertex(200, 200);
  sh.vertex(-200, 200);
  sh.endShape();
}

void draw() {
  background(0);
  translate(200, 200);
  rotateY(radians(angle));
  shape(sh);
  angle += 1;
}

The vertex data is transferred to the GPU only once, which makes rendering much faster, especially for large meshes. For a more detailed description of the PShape functionality, check the excellent tutorial written by Daniel Shiffman.

It should also be noted that a PShape object can be modified after creation, although doing so will cancel out the performance gains (since the vertex buffers will be refreshed every time changes are made). All the drawing methods that can be used between PShape.beginShape() and PShape.endShape() to create the shape have associated getters and setters that can be used outside the beginShape/endShape block to query the current values and set new ones. A simple example of this functionality is the following:

PShape sh;

void setup() {
  size(400, 400, P3D);
  sh = createShape();
  sh.beginShape();
  sh.fill(180);
  sh.vertex(0, 0);
  sh.vertex(400, 0);
  sh.vertex(400, 400);
  sh.vertex(0, 400);
  sh.endShape();
}

void draw() {
  background(0);
  sh.setVertex(0, mouseX, mouseY);
  shape(sh);
}
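These getters and setters are not limited to vertex positions. As a variation on the draw() function above (a minimal sketch assuming the same sh object, using PShape’s getVertex() and setFill() methods), the fill color can also be updated after creation:

void draw() {
  background(0);
  // query the current position of the first vertex
  PVector v = sh.getVertex(0);
  // recompute the fill color from that position...
  sh.setFill(color(map(v.x, 0, width, 0, 255), 180, 180));
  // ...and move the vertex to the mouse location, as before
  sh.setVertex(0, mouseX, mouseY);
  shape(sh);
}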

GLSL Shaders

The other major change in GL 3.0+ and GLES 2.0+ was the removal of the so-called fixed-function pipeline. With this change, things like the default lighting and matrix calculations are gone, and GLSL shaders are now required to perform any kind of rendering operation, even if it is as simple as drawing a two-dimensional, solid-color quad without any perspective or geometric transformations applied to it.

On the one hand, this removes the artificial limitations that the fixed-function pipeline imposed on modern GPUs – which provide quite advanced programming capabilities – and allows for an amazing degree of flexibility to create real-time graphics with the GPU. On the other hand, writing shaders just to do quick sketching with default camera and lighting settings can become a hassle, and even a significant barrier for programmers who are not familiar with a low-level language such as GLSL. So the challenge here was to reconcile these two conflicting aspects of shader programming, so that Processing users can still get gl-accelerated drawing without having to learn GLSL, while advanced users have the option of integrating their own shaders into Processing sketches. The new shader API included in Processing 2.0 is described at length in this tutorial, so I won’t go into details here.
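Just to give a flavor of it, here is a minimal sketch that loads a custom shader and applies it to subsequent geometry (the file names are placeholders for a fragment and a vertex shader stored in the sketch’s data folder):

PShader myShader;

void setup() {
  size(400, 400, P3D);
  // "frag.glsl" and "vert.glsl" are placeholder file names
  myShader = loadShader("frag.glsl", "vert.glsl");
}

void draw() {
  background(0);
  shader(myShader);  // geometry drawn from here on uses the custom shader
  rect(100, 100, 200, 200);
  resetShader();     // restore Processing's default shaders
}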

One issue that is worth noting is that giving access to a low-level shading language from a relatively high-level drawing API is not entirely straightforward, because they operate at different levels: Processing’s API generates geometry with a very specific structure, while GLSL doesn’t impose many constraints on the input geometry, other than the data types. This mismatch was handled by imposing a number of naming conventions that need to be followed in the shader code in order to be recognized by Processing. In addition to that, the shader must belong to one of six available types (point, line, color, texture, light, texlight), so Processing knows what type of geometry can be rendered by the shader. The type is specified by a #define embedded either in the fragment or vertex shader, so it doesn’t affect the validity of the GLSL code. All of these considerations are covered in the tutorial mentioned above.
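To make the conventions concrete, a bare-bones color shader pair might look as follows (my own minimal reconstruction following the conventions described in the tutorial, not code shipped with Processing; the attribute names vertex and color, the transform uniform, and the type define are what Processing looks for):

// vertex shader
uniform mat4 transform;   // Processing's combined modelview-projection matrix

attribute vec4 vertex;    // vertex position, filled in by the renderer
attribute vec4 color;     // per-vertex fill color

varying vec4 vertColor;

void main() {
  gl_Position = transform * vertex;
  vertColor = color;
}

// fragment shader
#define PROCESSING_COLOR_SHADER

#ifdef GL_ES
precision mediump float;
#endif

varying vec4 vertColor;

void main() {
  gl_FragColor = vertColor;
}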

The PGL interface

With the evolution that OpenGL underwent over the last decade, the API now has several different variants: the deprecated pre-3.0 (fixed-function) versions on the desktop, OpenGL 3.0+ and 4.0+ (programmable-pipeline only) also on the desktop, and OpenGL ES 1.x (fixed-function) and OpenGL ES 2.x+ (programmable) on mobile.

These different OpenGL API “versions” can sometimes be available on the same system, as OpenGL ES 2.0 is effectively a subset of OpenGL 3.0. JOGL – the OpenGL Java bindings that Processing uses to talk to the native gl drivers – implements a system of profiles that gives access to a specific subset of GL functions. The OpenGL renderer in Processing provides access to all the JOGL profiles through a “glue” object called pgl:

void draw() {
  // get the PGL object and prepare the renderer for low-level gl calls
  PGL pgl = beginPGL();
  GL gl = pgl.gl; // get the common GL profile from pgl
  GL2 gl2 = gl.getGL2(); // get the GL2 profile from JOGL
  …
  gl2.glEnable(GL2.GL_BLEND); // use your GL code
  …
  endPGL();
}

Alternatively, the PGL object itself implements all the GL functions that are part of the OpenGL ES 2.0 specification, plus a few additional ones that are not part of the GLES 2.0 spec but are necessary on the desktop for multisampled rendering (blitFramebuffer, renderbufferStorageMultisample), color buffer IO (readBuffer, drawBuffer), and buffer mapping (mapBuffer, mapBufferRange, unmapBuffer). Offering this interface in PGL itself is motivated by the decision to use GLES 2.0 as the baseline GL API, which ensures compatibility across desktop, mobile, and also the web with WebGL. For example:

void setup() {
  size(400, 400, P3D);
}

void draw() {
  // Setting the background color with OpenGL:
  PGL pgl = beginPGL();
  pgl.clearColor(1, 0, 0, 1);
  pgl.clear(PGL.COLOR_BUFFER_BIT);
  endPGL();
}

One advantage of using the gl functions from PGL instead of the JOGL profiles is that the former are “automatically aware” of the changes made to the drawing surface by the Processing code, whereas using the lower-level JOGL calls would require additional work to interact properly with the Processing calls. For example, the following sketch draws some shapes with Processing, and then uses PGL to read the color buffer at the mouse location. Since the color buffer might be multisampled (with Processing internally handling the buffer swapping and blitting), the readPixels() call does some additional setup behind the scenes to read the pixel color from the right buffer:

import java.nio.ByteBuffer;

void setup() {
  size(400, 400, P3D);
}

void draw() {
  noStroke();
  fill(255, 0, 0);
  triangle(0, 0, width, 0, 0, height);
  fill(0, 0, 255);
  triangle(width, 0, width, height, 0, height);

  if (mousePressed) {
    PGL pgl = beginPGL();
    ByteBuffer buffer = ByteBuffer.allocateDirect(1 * 1 * Integer.SIZE / 8);

    pgl.readPixels(mouseX, height - mouseY, 1, 1, PGL.RGBA, PGL.UNSIGNED_BYTE, buffer);

    // get the first three bytes
    int r = buffer.get() & 0xFF;
    int g = buffer.get() & 0xFF;
    int b = buffer.get() & 0xFF;
    println(mouseX + " " + mouseY + " = " + r + " " + g + " " + b);
    buffer.clear();
    endPGL();
  }
}

For another example showing the interaction between the Processing calls and the GL calls in PGL, take a look at the LowLevelGL sketch under Demos|Graphics, included in the Processing download.

Posted June 4, 2013 by ac in Programming


16 responses to “Processing 2.0 is out! Processing 2.0 is in!”


  1. Wow, very nice work. Thank you so much for your contributions!

  2. You’ve made a great step forward for Processing. The rewriting of the PShape is fantastic, simple and clear. The way you used OpenGL is also very good. Congratulations and thanks for your hard work!

  3. I was wondering. In Processing 1.5.1 I made extensive use of your GLGraphics in combination with the GLVideo library to draw images on textures. This sped things up a lot for me. However, is this still necessary in Processing 2 (I know GLGraphics is not available for Processing 2), since GLVideo is implemented as well? Is the P2D mode really faster now and comparable with using GLGraphics for 2D images in Processing 1.5.1?

    I made an application that captures the webcam and I slice that up in multiple images from 1 spot (kind of a kaleidoscope effect).

    Thank you for your major work on Processing 2!

    • Hello kasper, Processing 2.0 applies the optimizations from GLGraphics/GSVideo automatically (i.e., video frames are copied directly into the gl textures).

  4. Pingback: Code & form » Processing 2.0 released | Marius Watz

  5. Batching the rendering data is such a great idea! And the code isn’t complicated at all; I’ve just given it a try and it works much faster! Thank you so much for this ;)

  6. I am an enthusiastic and self-taught student; everything I learn is from browsing the web. Thank you very much for sharing!
    When will we get a Syphon for Processing version 2?
    Greetings!

    damian nahmiyas
    • Hello, that depends on the time I will be able to put into Processing & Syphon in the coming weeks. In principle, a working version of the Syphon library for Processing 2.0 requires putting back into the OpenGL renderer some functions that provide low-level access to the color buffers (https://github.com/processing/processing/issues/1860); this will probably happen in a subsequent point release (2.x).

      • Andres, thanks for your hard work here. So I understand correctly… If we use Syphon with 2.0b9 we’re primarily missing the ability to use vertex buffer object mode?

        I use syphon to do transparency running Processing sketches in Resolume. Can you suggest an alternate route with the release of 2.0 final?

      • Hello Bob, 2.0b9 has all the vertex buffer optimizations. The main difference between 2.0b9 and 2.0 final, at least regarding the opengl renderer, is the bugfixes.

  7. Great work! I’m really enjoying the speed of PShape and the integration of GLGraphics features into the new Processing 2.0. One question is about the GLTexture class. How come updateTexture() is so much faster than loadPixels()? Is there a way to replicate that speed without GLGraphics? Right now if I cast a PGraphics3D object as a PImage I can access the pixel array and it is close… Thanks!

    • Hi, PGraphicsOpenGL.loadPixels() might be slower than the original updateTexture() method because it does a few additional tasks: 1) if the surface is antialiased (smooth with 2X or higher), it will first blit the contents of the multisampled color buffer into a regular texture before copying it into the pixels array; 2) if no get/set pixel operations have been performed before the loadPixels() call, then the geometry will be flushed to the GPU, in order to ensure that the contents of the color buffer are up to date. But I’m not sure if these are the reasons for the slowness; if you have simple code that precisely quantifies the problem, please upload it somewhere and I will take a look at it. Thanks!

  8. Seems in the new version gl.getGL2() no longer exists. When I try to put this into Eclipse, the GL doesn’t show me the calls that should be part of GLBase. So not sure if pgl.gl is returning the correct thing or what. Weird.

    PGL pgl = beginPGL();
    GL gl = pgl.gl; // get the common GL profile from pgl
    GL2 gl2 = gl.getGL2(); // get the GL2 profile from JOGL

    • Sorry, I missed your last comment. There were some recent changes in the PGL interface. In 2.0.3, you could do:

      PGL pgl = beginPGL();
      GL2 gl2 = pgl.gl.getGL2();
      endPGL();

      But in 2.1, in order to access JOGL-specific fields, such as gl, you need to cast pgl as a PJOGL object:

      PGL pgl = beginPGL();
      GL2 gl2 = ((PJOGL)pgl).gl.getGL2();
      endPGL();

      The reason is that now PGL is a base abstract class that is used to subclass specific implementations (JOGL, GLES, LWJGL).

      gl is in fact a static member of PJOGL, so you could just do:

      PGL pgl = beginPGL();
      GL2 gl2 = PJOGL.gl.getGL2();
      endPGL();

  9. Hi

    I’m trying to figure out the limitations of OpenGL and GLSL in Processing. In particular, is it possible to write to a frame buffer with a custom shader or does it currently require a subclass of PGraphicsOpenGL as per your advice in this thread:

    https://forum.processing.org/one/topic/2-0xx-how-to-set-multiple-render-targets-for-a-fragment-shader.html

    I know there are some effects that require writing to the frame buffer ‘off screen’ so I’m trying to find out how difficult this is currently..

    Thanks

    Paul.

    • Hi Paul, sorry I didn’t see your comment until now (I’m no longer updating this blog). Yes, it is possible to write to a frame buffer with a custom shader. All you need to do is create a PGraphics object, and draw to it using the regular Processing API.
