OpenVX - Hardware acceleration API for Computer Vision applications and libraries

Computer vision has become an essential component of many modern applications including gesture tracking, smart video surveillance, automatic driver assistance, biometrics, computational photography, augmented reality, visual inspection, robotics and more. The Khronos vision working group has been formed to drive industry consensus to create a cross-platform API standard to enable hardware vendors to implement and optimize accelerated computer vision algorithms. The Khronos vision API can accelerate high-level libraries, such as the popular OpenCV open source vision library, or be used by applications directly. A strong focus of the working group will be on providing computer vision on mobile and embedded systems and enabling acceleration on a wide variety of computing architectures including CPUs, GPUs and DSPs. The vision API will also explore interoperability with existing Khronos standards for camera control, video processing, compute acceleration and graphics rendering.

Computer Vision Working Group Proposal

Industry Requirement

Computer vision has become an essential component of many modern applications including gesture tracking, smart video surveillance, automatic driver assistance, biometrics, computational photography, augmented reality, visual inspection, robotics, and many more. Nearly all modern consumer compute devices, from smartphones to desktop computers, can be capable computer vision systems. Any computer vision system contains an image sensor (mostly working in the visible or infrared spectrum) or a 3D sensor (a time-of-flight camera or a structured light device such as Kinect), but the competitive advantage of almost any computer vision technology lies both in the software algorithms that process sensor data to solve a particular problem and in efficient hardware that runs those algorithms.

Many applications require computer vision algorithms to work in real-time, processing data obtained from sensors on the fly. Consequently, many hardware vendors have developed accelerated computer vision libraries for their products: CoreImage by Apple, IPP by Intel, NPP by NVIDIA, IMGLIB and VLIB by TI, the recently announced FastCV by Qualcomm and several others. As each of these companies develops its own API, the market fragments for developers, creating a need for an open standard that will align the efforts of hardware vendors and simplify the development of efficient cross-platform computer vision applications.

The most widely used cross-platform computer vision library is the Open Source Computer Vision Library (OpenCV). It is used by many academic groups as well as commercial companies as a framework for developing computer vision algorithms. OpenCV was initially created by Intel to act as an open source library with a hardware acceleration layer for Intel architectures. Currently the core development of OpenCV is undertaken by Itseez and is sponsored by Willow Garage, NVIDIA, and Google. The current version of OpenCV provides implementations on a variety of desktop CPUs and GPUs, but cross-platform acceleration is not easily possible. Implementations for mobile architectures are in development, but participation by the hardware vendor community is urgently needed to create a mobile-oriented, cross-platform, computer vision acceleration API.

Proposed Solution

The proposal is to create an open, royalty-free acceleration API standard for computer vision. The standard will specify a Hardware Abstraction Layer (CV HAL) that standardizes data formats and enables hardware vendors to implement, optimize and compete on implementations of computer vision algorithms, while providing portable, cross-platform access to efficiently accelerated computer vision functionality for use by high-level libraries and directly by applications. Future versions of the OpenCV open source library could choose to use the CV HAL to portably access hardware acceleration. The working group should evaluate whether or not to create an example implementation of the CV HAL and whether such an implementation should be open sourced.
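As an illustration only, the hardware abstraction idea described above might be sketched in C as a shared image format plus a table of function pointers that each vendor fills in with its own optimized routines. Every name here (cvhal_image, cvhal_backend, cvhal_threshold, the reference backend) is hypothetical and invented for this sketch; none of it is taken from the proposal or from any existing Khronos, OpenCV or FastCV API, and the choice of a threshold operation is arbitrary.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shared data format: 8-bit grayscale, row-major. */
typedef struct {
    uint8_t *data;
    size_t   width;   /* pixels per row */
    size_t   height;  /* number of rows */
    size_t   stride;  /* bytes between the starts of consecutive rows */
} cvhal_image;

/* Hypothetical backend table: one entry per standardized operation.
 * A vendor would supply its own table pointing at accelerated code. */
typedef struct {
    const char *name;
    int (*threshold)(const cvhal_image *src, cvhal_image *dst, uint8_t t);
} cvhal_backend;

/* Portable reference implementation (assumed, unoptimized):
 * dst = 255 where src > t, else 0. Returns 0 on success, -1 on bad input. */
static int ref_threshold(const cvhal_image *src, cvhal_image *dst, uint8_t t)
{
    if (!src || !dst || !src->data || !dst->data)
        return -1;
    if (src->width != dst->width || src->height != dst->height)
        return -1;
    for (size_t y = 0; y < src->height; ++y) {
        const uint8_t *s = src->data + y * src->stride;
        uint8_t *d = dst->data + y * dst->stride;
        for (size_t x = 0; x < src->width; ++x)
            d[x] = (s[x] > t) ? 255 : 0;
    }
    return 0;
}

static const cvhal_backend cvhal_reference_backend = {
    "reference-cpu", ref_threshold
};

/* Callers depend only on the table, not on who implemented it. */
static int cvhal_threshold(const cvhal_backend *be, const cvhal_image *src,
                           cvhal_image *dst, uint8_t t)
{
    if (!be || !be->threshold)
        return -1;
    return be->threshold(src, dst, t);
}
```

Under these assumptions, a high-level library or an application would call cvhal_threshold with whichever backend table the platform provides, so the same calling code could run on a CPU, GPU or DSP implementation without change.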

Methodology

We propose that the CV HAL standard be created using the standard Khronos members-only collaborative process, based on an open call for contributions and consensus-based refinement towards fulfilling an agreed Statement of Work (SOW). The working group would work to create a royalty-free standard and avoid IP issues, as with any Khronos working group.

The Board would initially establish a temporary working group with a Board-designated chair. The temporary working group would work to generate consensus on an agreed SOW. If the generated SOW is then approved by the Board, a full-time working group would be established, which would elect a chair and specification editor from its membership.

The group shall collaborate with the corresponding Khronos working groups as needed to study, and where shown to be beneficial work towards, leveraging and interoperating with other Khronos APIs, including:

  • OpenMAX for camera control and streaming functionality;
  • OpenCL as one possible implementation API;
  • interoperability with OpenGL and OpenCL as peer APIs;
  • using CV HAL to create StreamInput filter nodes.

The group will welcome contributions and API proposals from all working group members. The lower-level part of the current OpenCV 2.3 API is available as one contribution and Qualcomm have indicated that FastCV may be made available as a contribution also.

CV HAL is a temporary working name. The working group should name the API once its scope and purpose have been agreed.

Design Philosophy

The Computer Vision working group may consider using and refining the following design goals and principles for the CV HAL standard:

  • provide interoperable acceleration of computer vision algorithms across platforms;
  • keep the scope for the 1.0 version tightly focused on areas of computer vision agreed to be appropriate and widely beneficial;
  • ensure efficient implementation is possible on a wide variety of computing architectures including CPUs, GPUs and DSPs;
  • keep a primary focus on enabling mobile and embedded devices;
  • efficiently support a possible wide range of high-level computer vision libraries, not solely the OpenCV higher-level library;
  • be usable directly by applications as well as higher level libraries;
  • the design of a clean, forward-looking API takes priority over reuse of existing libraries; however, existing APIs and libraries should be leveraged where possible to promote rapid adoption and usage.

Deliverables

The Computer Vision working group will produce the following deliverables:

  • A specification for the CV HAL API;
  • Manual pages for the API;
  • An Adoption process that defines the criteria for implementation compliance;
  • A conformance test suite that tests the operation of a CV HAL implementation;
  • Logo and certification mark usage guidelines.

Industry Support

It is expected that hardware vendors will adopt the standardized HAL to deliver efficient computer vision acceleration, and that computer vision companies will use CV HAL to create high-level algorithms, libraries and applications. When this proposal was presented at the September 2011 Khronos TAP meeting, several member companies expressed support, including ARM, Freescale, Intel, NVIDIA, Qualcomm, STMicroelectronics and TI.

We would expect strong interest among companies active in computer vision applications to join Khronos to help define this specification including augmented reality vendors, automotive companies, robotics companies and computer vision research institutes.

Khronos Infrastructure

This is expected to be a typically-organized, medium-sized working group:

  • working group size of between 10 and 20 members;
  • conformance tests will need to be Khronos funded, though existing OpenCV open source implementations may possibly be leveraged to expedite their implementation. It is expected to be feasible to set Adopter fees that more than cover the costs of test generation, as most silicon vendors would want to implement this API.

Key exhibitions at which to promote this standard include the ISMAR augmented reality conference.

Milestones

It is suggested that the Computer Vision working group refine the following milestone plan as discussions develop:

  1. December 2nd 2011 – Board meeting vote to establish the temporary working group;
  2. January 2012 F2F – first working group F2F Meeting;
  3. April 2012 F2F – agreed SOW sent to Board for approval of full-time working group;
  4. Q4 2012 – draft spec in review.