This is the sixth featured recipe from the IPython Cookbook, the definitive guide to high-performance scientific computing and data science in Python.

Most existing plotting or visualization libraries in Python can display small or medium datasets (containing no more than a few tens of thousands of points). In the Big Data era, it is sometimes necessary to display larger datasets.

Vispy is a young 2D/3D high-performance visualization library that can display very large datasets. Vispy leverages the computational power of modern Graphics Processing Units (GPUs) through the OpenGL library.

Vispy offers a Pythonic object-oriented interface to OpenGL, useful to those who know OpenGL or who are willing to learn it. Higher-level graphical interfaces are also being developed at the time of this writing, and experimental versions are already available. These interfaces do not require any knowledge of OpenGL.

In this recipe, we give a brief introduction to the fundamental concepts of OpenGL. There are two situations where you would need to know these concepts:

  • If you want to use Vispy today, before the availability of stable high-level plotting interfaces.
  • If you want to create custom, sophisticated, high-performance visualizations that are not yet implemented in Vispy.

Here, we display a digital signal using Vispy's object-oriented interface to OpenGL.

Getting ready

Vispy depends on NumPy. A backend library is also necessary (PyQt4/PySide, wxPython, glfw, among others).

This recipe has been tested with the development version of Vispy. You should clone the GitHub repository and install Vispy with python setup.py install.

The API used in this recipe might change in future versions.

How to do it...

  1. Let's import NumPy, vispy.app (to display a canvas), and vispy.gloo (object-oriented interface to OpenGL).
import numpy as np
from vispy import app
from vispy import gloo
  2. In order to display a window, we need to create a Canvas.
c = app.Canvas(keys='interactive')
  3. When using vispy.gloo, we need to write shaders. These programs, written in a C-like language called GLSL, run on the GPU and give us full flexibility for our visualizations. Here, we create a trivial vertex shader that directly displays 2D data points (stored in the a_position variable) in the canvas. The function main() executes once per data point (also called vertex). The variable a_position contains the (x, y) coordinates of the current vertex. All this function does is to pass these coordinates to the next stage of processing in the rendering pipeline. We give more details in the How it works section below.
vertex = """
attribute vec2 a_position;
void main(void)
{
    gl_Position = vec4(a_position, 0.0, 1.0);
}
"""
  4. The other shader we need to create is the fragment shader. It lets us control the pixels' color. Here, we display all data points in black. This function runs once per generated pixel.
fragment = """
void main()
{
    gl_FragColor = vec4(0.0, 0.0, 0.0, 1.0);
}
"""
  5. Next, we create an OpenGL Program. This object contains the shaders and allows us to link the shader variables to Python/NumPy data.
program = gloo.Program(vertex, fragment)
  6. We link the variable a_position to a (1000, 2) NumPy array containing the coordinates of 1000 data points. In the default coordinate system, the coordinates of the four canvas corners are (+/-1, +/-1). Here, we generate a random time-dependent signal in [-1, 1].
program['a_position'] = np.c_[
        np.linspace(-1.0, +1.0, 1000),
        np.random.uniform(-0.5, +0.5, 1000)].astype(np.float32)
  7. We create a callback function that is called when the window is resized. Updating the OpenGL viewport ensures that Vispy uses the entire canvas.
@c.connect
def on_resize(event):
    gloo.set_viewport(0, 0, *event.size)
  8. We create a callback function that is called when the canvas needs to be refreshed. This on_draw function renders the entire scene. First, we clear the window in white (this needs to be done at every frame). Then, we draw a succession of line segments using our OpenGL program. The vertices used for this visual are those returned by the vertex shader.
@c.connect
def on_draw(event):
    gloo.clear((1, 1, 1, 1))
    program.draw('line_strip')
  9. Finally, we show the canvas and we run the application.
c.show()
app.run()

The following figure shows a screenshot:

Basic visualization example with Vispy

How it works...

OpenGL is an open standard for hardware-accelerated interactive visualization. It is widely used in video games, industrial software (Computer-Aided Design (CAD), virtual reality), and scientific applications (medical imaging, computer graphics, and so on).

OpenGL is a mature technology created in the early 1990s. In the early 2000s, OpenGL 2.0 brought a major new feature: the possibility to customize fundamental steps of the rendering pipeline. This pipeline defines the way data is processed on the GPU for real-time rendering. Many OpenGL courses and tutorials cover the old, fixed pipeline. Vispy supports exclusively the modern, programmable pipeline.

Here, we introduce the fundamental concepts of the programmable pipeline used in this recipe. OpenGL is considerably more complex than what we will cover here. However, Vispy provides a vastly simplified API for the most common features of OpenGL.

Vispy is based on OpenGL ES 2.0, a flavor of OpenGL that is supported on desktop computers, mobile devices, and modern Web browsers (through WebGL). Modern graphics cards may support additional features. Those features will be available in future versions of Vispy.

There are four major elements in the rendering pipeline of a given OpenGL program:

  1. Data buffers store numerical data on the GPU. The main types of buffers are vertex buffers, index buffers and textures.
  2. Variables are available in the shaders. There are four major types of variables: attributes, uniforms, varyings and texture samplers.
  3. Shaders are GPU programs written in a C-like language called OpenGL Shading Language (GLSL). The two main types of shaders are vertex shaders and fragment shaders.
  4. The primitive type defines the way data points are rendered. The main types are points, lines and triangles.

Here is how the rendering pipeline works:

  1. Data is sent on the GPU and stored in buffers.
  2. The vertex shader processes the data in parallel and generates a number of 4D points in a normalized coordinate system (+/-1, +/-1). The fourth dimension is a homogeneous coordinate (generally 1).
  3. Graphics primitives are generated from the data points returned by the vertex shader (primitive assembly and rasterization).
  4. The fragment shader processes all primitive pixels in parallel and returns each pixel's color as RGBA components.
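Step 2 above can be illustrated with plain NumPy (not Vispy): OpenGL divides the 4D homogeneous position emitted by the vertex shader by its fourth component (the perspective divide) to obtain normalized device coordinates. The clip-space values below are made up for the example.

```python
import numpy as np

# A vertex shader outputs a 4D homogeneous position (x, y, z, w).
clip = np.array([0.5, -0.25, 0.0, 1.0])

# OpenGL divides by w ("perspective divide") to get normalized
# device coordinates in [-1, 1]; with w = 1 the point is unchanged.
ndc = clip[:3] / clip[3]
```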

In this recipe's example, there is only one GPU variable: the attribute a_position. An attribute is a variable that takes one value per data point. Uniforms are global variables (shared by all data points), whereas varyings are used to pass values from the vertex shader to the fragment shader (with automatic linear interpolation for a pixel between 2 or 3 vertices).
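To make the interpolation of varyings concrete, here is a NumPy sketch of what the GPU does for a varying along a line segment between two vertices; the endpoint colors are invented for the example.

```python
import numpy as np

# Colors attached to the two endpoints of a line segment
# (made-up values for the example).
v0_color = np.array([0.0, 0.0, 0.0])  # black at vertex 0
v1_color = np.array([1.0, 1.0, 1.0])  # white at vertex 1

def varying_at(t):
    # t in [0, 1] is the relative position of a pixel along the
    # segment; this linear interpolation is what the GPU applies
    # automatically to a varying between the two vertices.
    return (1 - t) * v0_color + t * v1_color

mid = varying_at(0.5)  # a pixel halfway along the segment gets mid-gray
```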

In vispy.gloo, a Program is created with the vertex and fragment shaders. Then, the variables declared in the shaders can be set with the syntax program['varname'] = value. When varname is an attribute variable, the value can just be a 2D NumPy array. In this array, each row contains the components of one data point.

Similarly, we could declare and set uniforms and textures in our program.
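As a sketch, a uniform could be declared in the fragment shader and set from Python as follows; the variable name u_color is our own choice for the example, not part of the recipe.

```python
# A fragment shader declaring a uniform: a single value shared by
# all pixels (unlike an attribute, which takes one value per vertex).
fragment = """
uniform vec4 u_color;
void main()
{
    gl_FragColor = u_color;
}
"""

# With vispy.gloo, we would then set it from Python, for example:
#     program = gloo.Program(vertex, fragment)
#     program['u_color'] = (1.0, 0.0, 0.0, 1.0)  # opaque red
```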

Finally, program.draw() renders the data using the specified primitive type. Here, the line_strip primitive type tells the GPU to run through all vertices in the vertex buffer and to draw a line segment from one point to the next. If there are n points, there will be n-1 line segments.

Other primitive types include points and triangles, with several ways of generating lines or triangles from a list of vertices.
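The relation between the number of vertices and the number of generated segments can be sketched as follows; line_strip follows from the recipe, while 'lines' and 'line_loop' are standard OpenGL line primitives.

```python
def num_segments(n_vertices, primitive):
    """Number of line segments generated from n_vertices."""
    if primitive == 'line_strip':
        # Consecutive vertices are chained: n - 1 segments.
        return n_vertices - 1
    elif primitive == 'lines':
        # Every pair of vertices forms an independent segment.
        return n_vertices // 2
    elif primitive == 'line_loop':
        # Like a strip, plus a closing segment back to the start.
        return n_vertices
    raise ValueError(primitive)
```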

In addition, an index buffer may be provided. An index buffer contains indices pointing to the vertex buffers. Using an index buffer would allow us to reuse any vertex multiple times during the primitive assembly stage. For example, when rendering a cube with a triangles primitive type (one triangle is generated for every triplet of points), we could use a vertex buffer with 8 data points and an index buffer with 36 indices (3 points per triangle, 2 triangles per face, 6 faces).
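The cube example above can be sketched with plain NumPy; the corner ordering and face winding are our own choices for the illustration. In Vispy, the indices would then be wrapped in a gloo.IndexBuffer.

```python
import numpy as np

# The 8 corners of a unit cube; vertex index is 4*x + 2*y + z.
positions = np.array([[x, y, z]
                      for x in (0, 1)
                      for y in (0, 1)
                      for z in (0, 1)], dtype=np.float32)

# Each face as a quad of corner indices, wound around the face.
quads = [(0, 2, 6, 4), (1, 3, 7, 5),   # z = 0 and z = 1 faces
         (0, 1, 5, 4), (2, 3, 7, 6),   # y = 0 and y = 1 faces
         (0, 1, 3, 2), (4, 5, 7, 6)]   # x = 0 and x = 1 faces

# Split every quad into 2 triangles:
# 6 faces * 2 triangles * 3 indices = 36 indices for 8 vertices.
indices = np.array([tri for (a, b, c, d) in quads
                    for tri in ((a, b, c), (a, c, d))],
                   dtype=np.uint32).ravel()
```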

There's more...

The example shown here is extremely simple. The approach provided by OpenGL and Vispy is nevertheless particularly powerful. It gives us full control on the rendering pipeline, and it allows us to leverage the computational power of the GPU in a nearly optimal way.

High performance is achieved by minimizing the number of data transfers to the GPU. When displaying static data (for example, a scatter plot), it is possible to send the data to the GPU at initialization time only. Rendering dynamic data is still reasonably fast; the order of magnitude of host-to-GPU data transfers is roughly 1 GB/s.
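As a back-of-the-envelope sketch (the 1 GB/s figure is the order of magnitude quoted above, not a measurement), here is what that transfer rate means for a large dynamic signal:

```python
# 10 million 2D points stored as float32: (x, y) = 8 bytes per point.
n_points = 10_000_000
bytes_per_point = 2 * 4
frame_bytes = n_points * bytes_per_point          # 80 MB per upload

# At ~1 GB/s of host-to-GPU bandwidth, the transfer alone caps
# the update rate at roughly a dozen uploads per second.
bandwidth = 1e9                                   # bytes per second
max_updates_per_second = bandwidth / frame_bytes  # 12.5
```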

Besides, it is critical to issue as few OpenGL draw calls as possible, as every draw call incurs significant overhead. High performance is achieved by rendering all similar primitives at once (batch rendering). GPUs are particularly efficient at batch rendering, even when the properties of the points differ (for example, points with various sizes and colors).

Finally, geometric or pixel transformations can be executed on the GPU with very high performance using the shaders. The massively parallel architecture of GPUs, consisting of hundreds or thousands of computing units, is fully leveraged when transformations are implemented in the shaders.

General-purpose computations can be done in the shaders in the context of visualization. There is one major drawback compared to proper GPGPU frameworks such as CUDA or OpenCL, though: in the vertex shader, a given thread has access to only one data point, and in the fragment shader, a thread has access to only one pixel. There are ways to mitigate this issue, but they lead to a drop in performance.

However, OpenGL can interoperate with CUDA/OpenCL. Buffers can be shared between OpenGL and the GPGPU framework. Complex CUDA/OpenCL computations can be applied to vertex buffers or textures in real time, enabling highly efficient rendering of numerical simulations.

Vispy for scientific visualization

As we have seen in this recipe, Vispy requires the user to know OpenGL and GLSL. However, higher-level graphical interfaces are currently being developed. Those interfaces will bring to scientists the power of GPUs for high-performance interactive visualization.

Visuals will provide reusable, reactive graphical components like shapes, polygons, 3D meshes, surface plots, network graphs, and others. These visuals will be fully customizable and may be used without knowledge of OpenGL. A shader composition system will allow advanced users to reuse snippets of GLSL functionality in a modular way.

Visuals will be organized within a scene graph implementing GPU-based transformations.

Scientific plotting interfaces will be implemented. Vispy could also serve as a high-performance backend for existing plotting libraries such as matplotlib.

Vispy will also support full integration in the IPython notebook using WebGL.

Eventually, Vispy could implement many kinds of scientific visualizations:

  • Scatter plots can be rendered efficiently with point sprites, using one vertex per data point. Panning and zooming can be implemented in the vertex shader, enabling fast interactive visualization of millions of points.
  • Digital signals, static or dynamic (real-time), can be displayed with polylines. High-quality rendering of curves can be achieved using an OpenGL implementation of the Anti-grain Geometry (agg) library.
  • Network graphs can be displayed by combining points and line segments.
  • 3D meshes can be displayed with triangles and index buffers. Geometric transformations and realistic lighting can be implemented in the vertex and fragment shader.
  • Real-time streams of images can be displayed efficiently with textures.
  • Axes, grids, ticks, text, and labels can be rendered efficiently in the fragment shader.
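As a sketch of the GPU-side panning and zooming mentioned in the first bullet above, two uniforms can transform every vertex on the GPU; the names u_pan and u_zoom are our own choices for the example.

```python
# A vertex shader applying pan and zoom on the GPU: the whole
# dataset stays in GPU memory; only two small uniforms change
# on each interaction event.
vertex = """
attribute vec2 a_position;
uniform vec2 u_pan;
uniform vec2 u_zoom;
void main()
{
    vec2 position = u_zoom * (a_position + u_pan);
    gl_Position = vec4(position, 0.0, 1.0);
}
"""

# On a mouse event, Python would only update the uniforms, e.g.:
#     program['u_pan'] = (dx, dy)
#     program['u_zoom'] = (zx, zy)
```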

Many examples can be found in Vispy's gallery.


You'll find the rest of the chapter in the full version of the IPython Cookbook, by Cyrille Rossant, Packt Publishing, 2014.