The Capybaras of Ipaussu [ Abstract Outtake ] /// JP2 (JPEG 2000) Databending, Photo from 2019
23___05 (Creation of Identity: The Face of a New Generation) { jpeg2000 databending glitch, 2018
GPU HDR Processing for SONY Pregius Image Sensors
Author: Fyodor Serzhenko
The fourth generation of SONY Pregius image sensors (IMX530, IMX531, IMX532, IMX535, IMX536, IMX537, IMX487) is capable of working in HDR mode. That mode is called "Dual ADC" (Dual Gain), which means that two 12-bit raw frames originate from the same exposure, digitized via two ADCs with different analog gains. If the ratio of these gains is around 24 dB, one can get one 16-bit raw image from two 12-bit raw frames with different gains. This is the main idea of HDR for these image sensors: extending the dynamic range up to 16 bits by using two 12-bit raw frames with the same exposure and different analog gains. That method guarantees that both frames have been exposed at the same time and are not spatially shifted.
That Dual ADC feature was originally introduced in the third generation of SONY Pregius image sensors, but HDR processing had to be implemented outside the image sensor. In the latest version, HDR processing is done inside the image sensor, which makes it more convenient to work with. Dual ADC mode with on-sensor combination (combined mode) is applicable for high speed sensors only.
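The relation between the 24 dB gain ratio and the 16-bit result can be checked with a few lines of arithmetic. This is only a sketch: the function names are illustrative and are not part of any sensor or SDK API.

```python
import math

def gain_ratio(gain_db: float) -> float:
    """Convert an analog gain difference in dB to a linear amplitude ratio."""
    return 10 ** (gain_db / 20)

def merged_bit_depth(frame_bits: int, gain_db: float) -> int:
    """Bits needed for the merged linear range, rounded to the nearest bit."""
    return frame_bits + round(math.log2(gain_ratio(gain_db)))

print(round(gain_ratio(24), 2))   # 15.85, i.e. roughly 16x
print(merged_bit_depth(12, 24))   # 16
```

A 24 dB ratio is close to a factor of 16, which is 4 extra bits on top of the 12-bit frames.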
In the Dual ADC mode we need to specify some parameters for the image sensor. There are two ways of getting the extended dynamic range from SONY Pregius image sensors:
In the combined mode the image sensor can output one 12-bit raw frame with applied merge feature (when we combine two 12-bit frames with Low gain and High gain) and simple tone mapping (when we apply PWL curve to 16-bit merged data). That approach allows us to have minimum camera bandwidth because in that case the image size is minimal - this is just a 12-bit raw frame.
In the non-combined mode the image sensor outputs two 12-bit raw images which can be processed later outside the image sensor. This is the worst case for the camera bandwidth, but it is promising for high quality merge and sophisticated tone mapping.
Apart from that, there are two other options:
We can process just the Low gain or the High gain image, but evidently the dynamic range in that case will be no better than in the Dual ADC mode.
It's also possible to apply our own HDR algorithm to the results of the combined mode as an attempt to improve image quality and dynamic range.
Dual Gain mode parameters for image merge
Threshold - this is an intensity level where we should start utilizing Low gain data instead of High gain
Low gain (AD1) and High gain (AD2) - these are values for analog gain (0 dB, 6 dB, 12 dB, 18 dB, 24 dB)
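The role of the threshold and the gain ratio in the merge can be illustrated with a minimal per-pixel sketch. It is not the on-sensor algorithm: it assumes that the threshold is a fraction of the 12-bit full scale measured on the High gain sample, that the gain ratio is exactly 16, and that a hard switch (no blending) is used. The function name is hypothetical.

```python
def merge_dual_gain(low, high, gain_ratio=16, threshold=0.4, max_val=4095):
    """Merge Low gain and High gain 12-bit samples into one linear value per pixel.

    The output is expressed in High gain units, so Low gain samples are
    scaled up by gain_ratio. threshold is a fraction of the 12-bit full
    scale and is tested on the High gain sample (an assumption of this sketch).
    """
    limit = threshold * max_val
    merged = []
    for lo, hi in zip(low, high):
        if hi < limit:
            merged.append(hi)               # below threshold: keep the High gain sample
        else:
            merged.append(lo * gain_ratio)  # at or above threshold: use scaled Low gain
    return merged

print(merge_dual_gain([10, 100, 2000], [160, 1600, 4095]))  # [160, 1600, 32000]
```

With a ratio of 16 the largest merged value is 4095 × 16 = 65520, which fits into 16 bits.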
Dual Gain mode parameters for HDR
Two pairs of knee points for PWL curve (gradation compression from 16-bit range to 12-bit). They actually come from Low gain and High gain values, and from parameters of gradation compression.
Below is the picture with detailed info concerning PWL curve which is applied after image merge, and it's done inside the image sensor. We can see how gradation compression is implemented at the image sensor.
This is an example of real parameters for Dual ADC mode for SONY IMX532 image sensor
Dual ADC Gain Ratio: 12 dB
Dual ADC Threshold: 40%
Compression Region Selector 1:
Compression Region Start: 6.25%
Compression Region Gain: -12 dB
Compression Region Selector 2:
Compression Region Start: 25%
Compression Region Gain: -18 dB
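To show what a PWL curve with two knee points does, here is a small sketch that uses the example values above. How the sensor normalizes the region start percentages and how the result is scaled to the 12-bit output is not described here, so the sketch simply works on values normalized to [0, 1] and treats each region gain as the slope of its segment. Both choices are assumptions, and the function name is illustrative.

```python
def pwl_compress(x, knees=((0.0625, -12.0), (0.25, -18.0))):
    """Piecewise-linear gradation compression of a normalized value x in [0, 1].

    knees is a sequence of (region_start, region_gain_db) pairs sorted by start.
    Below the first knee the slope is 1; each region then uses its own slope.
    """
    starts = [k[0] for k in knees] + [1.0]
    slopes = [10 ** (gain_db / 20) for _, gain_db in knees]
    y = min(x, starts[0])                       # unity-slope segment
    for i, slope in enumerate(slopes):
        seg = min(x, starts[i + 1]) - starts[i]  # part of x inside this region
        if seg > 0:
            y += seg * slope
    return y

for v in (0.05, 0.0625, 0.25, 1.0):
    print(v, round(pwl_compress(v), 4))
```

The curve stays linear in the shadows and compresses the highlights progressively, which is the purpose of gradation compression.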
For further testing we will capture frames from IMX532 image sensor at XIMEA camera MC161CG-SY-UB-HDR with exactly the same parameters of Dual ADC mode.
If we compare images with gain ratio 16 (High gain is 16 times greater than Low gain) and exposure ratio 1/16 (long exposure for Low gain and short exposure for High gain), then we clearly see that images are alike, but High gain image has the following two problems: it has more noise and more hot pixels due to strong analog signal amplification. These issues should be taken into account.
Apart from the standard Dual ADC combined mode, there is a quite popular approach which can bring good results with minimal effort: we can use just the Low gain image and apply custom tone mapping instead of the PWL curve. In that case the dynamic range is lower, but the image can have less noise in comparison with images from the combined mode.
Why do we need to apply our own HDR image processing?
It makes sense if on-sensor HDR processing in Dual ADC mode could be improved. That could be the way of getting better image quality by implementing more sophisticated algorithms for image merge and tone mapping. GPU-based processing is usually very fast, so we can still process image series with HDR support in realtime, which is a must for camera applications.
HDR image processing pipeline on NVIDIA GPU
We've implemented an image processing pipeline on NVIDIA GPU for Dual ADC frames from SONY Pregius image sensors. Actually, we've extended our standard pipeline to work with such HDR images. On NVIDIA GPU we can process any frames from SONY image sensors in the HDR mode: one 12-bit HDR raw image (combined mode) or two 12-bit raw frames (non-combined mode). Our result can be better not only due to our merge and tone mapping procedures, but also due to high quality debayering, which also influences the quality of processed images. Why do we use a GPU? This is the key to getting much higher performance and image quality than can be achieved on the CPU.
Low gain image processing
As we've already mentioned, this is the simplest method which is widely accepted and it's actually the same as a switched-off Dual ADC mode. Low gain 12-bit raw image has less dynamic range, but it also has less noise, so we can apply either 1D LUT or more complicated tone mapping algorithm to that 12-bit raw image to get better results in comparison with combined 12-bit HDR image which we can get directly from SONY image sensor. This is a brief info about the pipeline:
Acquisition of 12-bit raw image from a camera with SONY image sensor
BPC (bad pixel correction)
Demosaicing with MG algorithm (23×23)
Color correction
Curves and Levels
Local tone mapping
Gamma
Optional JPEG or J2K encoding
Monitor output, streaming or storage
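The 1D LUT option mentioned for the Low gain image can be sketched as follows. A plain gamma curve is used here only as a stand-in for a real tone curve, and the 12-bit to 8-bit mapping and the function names are assumptions for illustration, not the curve used in the pipeline above.

```python
def build_gamma_lut(gamma=2.2, in_bits=12, out_bits=8):
    """Build a lookup table that maps every 12-bit input code to an 8-bit output."""
    in_max = (1 << in_bits) - 1
    out_max = (1 << out_bits) - 1
    return [round(((v / in_max) ** (1.0 / gamma)) * out_max)
            for v in range(in_max + 1)]

def apply_lut(pixels, lut):
    """Apply the table to a sequence of integer pixel values."""
    return [lut[p] for p in pixels]

lut = build_gamma_lut()
print(apply_lut([0, 4095], lut))  # [0, 255]
```

Because the table has only 4096 entries, applying it is a single memory lookup per sample, which is why a 1D LUT is the cheapest form of global tone mapping.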
Fig.1. Low gain image processing for IMX532
Image processing at the Combined mode
Though we can get ready 12-bit raw HDR image from SONY image sensor at Dual ADC mode, there is still a way to improve the image quality. We can apply our own tone mapping to make it better. That's what we've done and the results are consistently better. This is a brief info about the pipeline:
Acquisition of 12-bit raw HDR image from a camera with SONY image sensor
Preprocessing
BPC (bad pixel correction)
Demosaicing with MG algorithm (23×23)
Color space conversion
Global tone mapping
Local tone mapping
Optional JPEG or J2K encoding
Monitor output, streaming or storage
Fig.2. SONY Dual ADC combined mode image processing for IMX532 with a custom tone mapping
Low gain + High gain (non-combined) image processing
To get both raw frames from the SONY image sensor, we need to send them to a PC via the camera interface. This can be a problem for interface bandwidth, and for some cameras it may be necessary to reduce the frame rate to cope with bandwidth limitations. If we use PCIe, Coax or 10/25/50-GigE cameras, then it could be possible to send both raw images in realtime without frame drops.
As soon as we get two raw frames (Low gain and High gain) for processing, we start with preprocessing, then merge them into one 16-bit linear image and apply a tone mapping algorithm. Good tone mapping algorithms are usually more complicated than just a PWL curve, so we can get better results, though it definitely takes much more time. To solve that issue in a fast way, high performance GPU-based image processing could be the best approach. That's exactly what we've done, and we can get better image quality and higher dynamic range in comparison with the combined HDR image from SONY and with the processed Low gain image as well.
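A quick estimate shows why the non-combined mode is demanding for the camera interface. The sketch below uses the 5328×3040 frame size of the IMX532. The frame rate of 30 fps is an assumed value for illustration only, and the estimate ignores protocol overhead and assumes tightly packed 12-bit samples.

```python
def raw_stream_gbit_per_s(width, height, bits_per_pixel, fps, frames_per_capture=1):
    """Raw payload rate in Gbit/s for a stream of packed raw frames."""
    bits = width * height * bits_per_pixel * frames_per_capture * fps
    return bits / 1e9

combined = raw_stream_gbit_per_s(5328, 3040, 12, 30)
non_combined = raw_stream_gbit_per_s(5328, 3040, 12, 30, frames_per_capture=2)
print(round(combined, 2), round(non_combined, 2))  # 5.83 11.66
```

Under these assumptions the non-combined mode needs twice the payload rate of the combined mode, which already exceeds a 10-GigE link.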
HDR workflow for Dual ADC non-combined image processing on GPU
Acquisition of two raw images in non-combined Dual ADC mode
Preprocessing of two images
BPC (bad pixel correction) for both images
RAW Histogram and MinMax for each frame
Merge for Low gain and High gain raw images
Demosaicing with MG algorithm (23×23)
Color space conversion
Global tone mapping
Local tone mapping
Optional JPEG or J2K encoding
Monitor output, streaming or storage
In that workflow the most important modules are merge, global/local tone mapping and demosaicing. We've implemented that image processing pipeline with Fastvideo SDK which is running very fast on NVIDIA GPU.
Fig.3. SONY Dual ADC non-combined (two-image) processing for IMX532
Summary for Dual ADC mode on GPU
Better image quality
Sophisticated merge for Low gain and High gain images
Global and local tone mapping
High quality demosaicing
Better dynamic range
Fewer artifacts for brightness and color
Less noise
High performance processing
We believe that the best results for image quality could be achieved in the following modes:
Simultaneous processing of two 12-bit raw images in the non-combined mode.
Processing of one 12-bit raw frame in the combined mode with a custom tone mapping algorithm.
If we are working in the non-combined mode, then we can get good image quality, but camera bandwidth limitations and processing time could be a problem. If we are working with the results of the combined mode, image quality is comparable, the processing pipeline is less complicated (the performance is better), and we need less bandwidth, so it could be recommended for most use cases. With a proper GPU, image processing could be done in realtime at the max fps.
The above frames were captured from SONY IMX532 image sensor at Dual ADC mode. The same approach is applicable to all high speed SONY Pregius image sensors of the 4th generation which are capable of working at Dual ADC combined mode as well.
Processing benchmarks on Jetson AGX Xavier and GeForce RTX 2080TI in the combined mode
We've done time measurements for kernel times to evaluate the performance of the solution in the combined mode. This is the way to get high dynamic range and very good image quality, so the knowledge about performance could be valuable. Below we publish timings for several image processing modules because full pipeline could be different in general case.
Table 1. GPU kernel time in ms for IMX532 raw frame processing in the combined mode (5328×3040, bayer, 12-bit)
This is just a part of the full image processing pipeline, and it is given to show how fast the processing could be on the GPU.
References
Fastvideo SDK for Image & Video Processing on GPU
RAW to RGB conversion on GPU
XIMEA high speed color industrial camera with Sony IMX532 image sensor
Original article see at: https://fastcompression.com/blog/gpu-hdr-processing-sony-pregius-image-sensors.htm
Part 2: JPEG2000 solutions in science and healthcare. JP2 format limitations
Author: Fyodor Serzhenko
In the first part of the article, JPEG 2000 in science, healthcare, digital cinema and broadcasting, we discussed the key technologies of JPEG2000 and focused on its application in digital cinema.
In this second part, we will continue examining the functions of JPEG2000, as well as review its main drawback and talk about the other application areas where the format turned out to be in high demand. At the end we will present a solution which simplifies and makes the process of working with the format much more convenient.
1. JPEG2000 in science and medicine
Window mode support is one of the handy features that makes JPEG2000 attractive. Scientists often have to work with files of enormous resolution, the width and height of which can exceed 40,000 pixels, but only a small part of which is of interest. Standard JPEG would have to decode the entire image to work with it, while JPEG2000 allows you to decode only a selected area.
JP2 is also used for space photography. Those wonderful pictures of Mars taken, for example, with a HiRISE camera, are available in JP2 format. Still, the data link from space to Earth is subject to interference, so errors may occur during the transfer or even entire data packets may be lost. However, when the special mode is enabled, it is somewhat error-resilient, which can be helpful when communication or storage devices are unreliable. This mode allows you to detect errors that occur when data is lost during transmission. It is important to note that the image is divided into small blocks (for example, 32x32 or 64x64 pixels), where, after preliminary transformations, each bit plane is encoded separately. Thus, a lost bit most likely spoils only some of the less significant bit planes, and this usually has little effect on overall quality. By the way, in JPEG, the loss of a bit can lead to significant distortions of a big part of or even the entire image.
Regarding the operation of the special mode with the integrity check in the JPEG2000 format file, additional information is added to the compressed file to check the correctness of the data. Without this information, we often can’t determine during decoding whether there’s an error or not, and we continue the process as if nothing had happened. As a result, it’s still possible that even one erroneous bit will spoil quite a large part of the image. If this mode is enabled, however, then we detect any error when it appears and can limit its effect on other parts of the image.
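The remark about bit planes can be made concrete with a toy example. It only shows how the size of an error depends on the bit plane that was hit. In a real JPEG2000 stream the bit planes belong to quantized wavelet coefficients and are entropy coded, so actual error propagation is more involved. The function names are illustrative.

```python
def bit_planes(value, bits=8):
    """Return the bits of value from the most significant plane to the least."""
    return [(value >> k) & 1 for k in range(bits - 1, -1, -1)]

def flip_bit(value, plane):
    """Flip the bit in the given plane (0 = least significant)."""
    return value ^ (1 << plane)

v = 200
print(bit_planes(v))           # [1, 1, 0, 0, 1, 0, 0, 0]
print(abs(flip_bit(v, 0) - v)) # 1   (least significant plane)
print(abs(flip_bit(v, 7) - v)) # 128 (most significant plane)
```

An error in a low bit plane changes the value by one or two levels, while the same error in the top plane changes it by half of the full range.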
The JPEG2000 format also plays an important role in healthcare. In this application area, it is extremely important to maintain a sufficient bit depth of the source data to make it possible to capture all the subtleties of each area of the body under examination. JPEG2000 is used in CTs, X-rays, MRIs, etc.
Also, in accordance with FDA (Food and Drug Administration) requirements, images acquired by means of medical imaging must be stored in the original format (without loss). The JPEG2000 format is an ideal solution in this case.
Another interesting feature of JPEG2000 is the compression of three-dimensional data arrays. This can be highly relevant both in science and in medicine (for example, three-dimensional tomography results). The 10th part of the JPEG2000 standard is devoted to the compression of such data: JP3D (volumetric imaging).
2. JP2 format limitations
Unfortunately, JP2 (JPEG2000) has its limitations. It’s not supported by most web browsers (with the exception of Safari). The format is computationally complex, and existing open source codecs have been too slow for active use over the years. Even now, when the speed of processors is increasing with each new generation, and codecs are being optimized and accelerated, their capabilities still leave something to be desired. To illustrate the importance of codec speed, let's return to the topic of digital cinema for a moment: specifically, to the creation of DCPs (Digital Cinema Packages), the sets of files that we enjoy in cinemas. Again, JPEG2000 is the standard for digital cinema and, accordingly, is required to create a DCP. Unfortunately, its computational complexity makes this task quite resource-intensive and time-consuming. Moreover, existing open source codecs don't allow decoding 12-bit movies at the required rate of 25, 30 or 60 fps, even at 2K resolution, let alone 4K.
3. How to speed up processing with the JP2 format
JPEG2000 provides modes for operating at a higher speed, but this is achieved at the expense of a slight reduction in quality or compression ratio. However, even the slightest reduction in image quality can be unacceptable for some application areas.
To speed up the process with JPEG2000, we at Fastvideo have developed our own implementation of the JPEG 2000 codec. Our solution is based on NVIDIA CUDA technology, thanks to which it’s now possible to make a parallel implementation of the coder and decoder using all CPU and GPU cores.
As a consequence, the Fastvideo solution performs much better in comparison to the competition and provides fundamentally new capabilities for users. We believe that our solution will encourage more people to use JP2 format, as well as significantly speed up JP2 processing for people who already use it. Our goal is to make high-quality images much more accessible for specialists in application areas where the original image quality is required by default (e.g., science and healthcare).
Other info from Fastvideo concerning JPEG 2000 solutions
JPEG2000 codec on GPU
JPEG2000 vs JPEG vs PNG: What's the Difference?
J2K encoding benchmarks
J2K decoding benchmarks
Fast FFmpeg J2K decoder on NVIDIA GPU
MXF Player
Original article see at: https://www.fastcompression.com/blog/jpeg2000-applications-part2.htm Subscribe to our mail list: https://mailchi.mp/fb5491a63dff/fastcompression
JPEG2000 in science, healthcare, digital cinema and broadcasting
Author: Fyodor Serzhenko
This article is devoted to the JPEG2000 algorithm and will be presented in two parts. In the first part, we will discuss the key technologies of the algorithm and explain why it has become so popular in digital cinema and broadcasting. In the second part, we will talk about other application areas and important features of JPEG2000. We will also discuss its main drawback and present a solution that can significantly improve the usability of JPEG2000.
Part 1: JPEG2000 in digital cinema and broadcasting. Features of JP2 format
The cinema captured the hearts and minds of people all over the world from the very beginning. Comedy movies by Charlie Chaplin and horror films by Alfred Hitchcock left no one indifferent. It took just a little bit more than a century for the industry to evolve from black-and-white silent cinema to IMAX movies, the quality of which leaves a deep impression the moment a spectator watches one for the first time.
Okay, but are you aware of what makes IMAX movies so captivating? And why does it differ so much in video quality from what we used to watch on standard TV channels? The answer is the compression algorithm and image format used.
JP2 is the file format for images compressed with the JPEG2000 algorithm
1. JPEG2000 in digital cinema
The JP2 format (among others) has been actively used in digital cinema for a long time. It was developed in 2000 and selected as a digital cinema standard by the Digital Cinema Initiatives (DCI) group, which includes Disney, Fox, Paramount, MGM, Sony Pictures Entertainment, Universal, and Warner Bros. Studios, in 2004. The same year, some amendments relating to digital cinema were added to the first part of the JPEG2000 standard.
Good compression for digital cinema was simply necessary. An hour-and-a-half movie in 2K or 4K resolution with 12-bit color channels and 24 fps, compressed using JPEG2000 at a standard bitrate of 250 Mbit/s takes up to 160 Gigabytes.
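The size estimate follows directly from the bitrate and the duration. A short check, with an illustrative function name:

```python
def stream_size_gb(bitrate_mbit_s, duration_min):
    """Size in decimal gigabytes of a constant-bitrate stream."""
    return bitrate_mbit_s * 1e6 * duration_min * 60 / 8 / 1e9

size_gb = stream_size_gb(250, 90)
print(size_gb)                        # 168.75
print(round(size_gb * 1e9 / 2**30))   # 157 (in GiB)
```

At the full 250 Mbit/s, ninety minutes of video take about 169 GB (about 157 GiB), which is in line with the figure quoted above.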
The JPEG2000 compression algorithm, thanks to which we can enjoy vivid images in IMAX, is based on two key technologies — a discrete wavelet transform (DWT) and embedded block coding with optimal truncation (EBCOT), each of which has its own role:
DWT creates a multi-scale image representation to select the spatial and frequency components of the image. It makes it possible, for example, to watch a 4K movie in 2K resolution.
EBCOT arranges the data about the pixels of each coded block by importance, providing a smooth degradation of the picture quality as the compression ratio increases.
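The multi-scale idea behind DWT can be sketched in one dimension with the reversible 5/3 lifting transform, the wavelet that JPEG2000 uses for lossless coding. This is a simplified illustration rather than codec code: it handles one decomposition level of an even-length integer signal, while a real codec applies the transform to rows and columns over several levels. The function names are illustrative.

```python
def dwt53_forward(x):
    """One level of the reversible 5/3 lifting transform (even-length int list)."""
    n = len(x)
    assert n >= 2 and n % 2 == 0
    half = n // 2
    # Predict step: detail (high-pass) coefficients from the odd samples.
    d = []
    for i in range(half):
        left = x[2 * i]
        right = x[2 * i + 2] if 2 * i + 2 < n else x[n - 2]  # symmetric extension
        d.append(x[2 * i + 1] - (left + right) // 2)
    # Update step: approximation (low-pass) coefficients from the even samples.
    s = []
    for i in range(half):
        prev = d[i - 1] if i > 0 else d[0]                   # symmetric extension
        s.append(x[2 * i] + (prev + d[i] + 2) // 4)
    return s, d

def dwt53_inverse(s, d):
    """Undo dwt53_forward exactly by reversing the lifting steps."""
    half = len(s)
    n = 2 * half
    x = [0] * n
    for i in range(half):
        prev = d[i - 1] if i > 0 else d[0]
        x[2 * i] = s[i] - (prev + d[i] + 2) // 4
    for i in range(half):
        left = x[2 * i]
        right = x[2 * i + 2] if 2 * i + 2 < n else x[n - 2]
        x[2 * i + 1] = d[i] + (left + right) // 2
    return x

signal = [10, 12, 14, 16, 18, 20, 22, 24]
low, high = dwt53_forward(signal)
print(low, high)                           # [10, 14, 18, 23] [0, 0, 0, 2]
print(dwt53_inverse(low, high) == signal)  # True
```

The low-pass half is a half-resolution version of the signal. A decoder that stops there gets the smaller image, which is how a 4K stream can be shown in 2K, and adding the detail coefficients restores the original exactly.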
2. Format features: 12-bit and lossless compression option
In this section, we will discuss the JPEG2000 format itself, its features and applications. So how come images in JP2 format are so fascinating? The answer is simple: the color depth. One of the most important advantages of the format is working with high-bit data. In other words, the JP2 format is designed to describe one pixel of an image using more bits than a monitor that is not designed for professional color work, and thereby store more information about color. If you compare a standard JPEG image (8 bits per channel) with images in the IMAX format (12 bits per channel), you’ll see that an 8-bit image simply cannot convey such a range of color and brightness as a 12-bit image. As a consequence, IMAX image quality differs fundamentally.
Another important advantage of the JPEG2000 algorithm is the relationship between the compression ratio and the image quality (measured by any metric). The image file size and the transmission speed depend on the compression ratio. What’s more, the quality of the restored image depends on the compression as well. It’s quite clear that the presence of artifacts does not delight anyone.
Thanks to the use of wavelets (DWT), images in JP2 don’t acquire such conspicuous artifacts at high compression ratios as in its predecessor JPEG — when compressing an image with JPEG, the boundaries of 8x8-pixel squares become visible. It’s impossible to completely avoid artifacts, but visually they’re much less noticeable. As a result, JPEG2000 allows you to compress images more, and lose much less quality than JPEG allows with the same compression ratios. You can find a more detailed comparison of JPEG2000 with JPEG in one of our articles.
It’s worth noting that JPEG2000 was developed to provide both lossy and mathematically lossless compression in a single compression architecture. Depending on its contents, an image can be compressed up to 2.5 times without any quality loss, which reduces its data footprint by 60% (to 40% of the original size). However, there are always exceptions: some images can’t be reduced in size using lossless compression, or the compression ratio would be close to 1, but such compression is quite achievable for the majority of them. Anyway, such compression capabilities are in great demand wherever it’s necessary to store a large amount of data in a compressed form for a long time (e.g., documentation, images, and video), while maintaining the possibility of lossless recovery. For example, it can be quite useful in libraries, museums, etc.
Lossless compression is of great use in the following situations:
when advanced image analysis or multi-stage processing is performed or is supposed to be performed, and each stage can introduce an additional quality loss.
when minor details captured at the camera's sensitivity limit can be of great significance.
For example, early detection of diseases, research of nano-objects and processes at the sensitivity limit of a microscope, study of extremely distant space objects.
3. JPEG2000 in broadcasting
One more use case of the JPEG2000 format which is worth mentioning is sports broadcasting, such as football and basketball tournaments. During broadcasting, the still-uncompressed video is transmitted from the camera to an add-on device, which compresses the images using JPEG2000. Subsequently, they are transmitted in JP2 format to the server where the re-encoding is performed to create a video suitable for an audience. In this case, both fast image transmission and quality preservation are essential. JPEG2000 uses EBCOT coding, which makes it possible to select the order of alternation of resolutions, quality layers, color components and positions within compressed bytestream. Thanks to EBCOT coding, JPEG2000 supports dynamic quality distribution. In other words, it allows you to automatically adjust the amount of transmitted data depending on the bandwidth of the channel. Thus, images of the highest possible quality for a given IP channel are quickly transmitted to the servers.
To be continued…
Other info from Fastvideo concerning JPEG2000
JPEG2000 codec on GPU
JPEG2000 vs JPEG vs PNG: What's the Difference?
J2K encoding benchmarks
J2K decoding benchmarks
Fast FFmpeg J2K decoder on NVIDIA GPU
MXF Player
Remote color grading
Original article see at: https://www.fastcompression.com/blog/jpeg2000-applications-part1.htm Subscribe to our mail list: https://mailchi.mp/fb5491a63dff/fastcompression
J2K codec performance on Jetson TX2
NVIDIA Jetson TX2 hardware is very promising for imaging and other embedded applications. That high-performance and low-power hardware is utilized in autonomous solutions, especially the industrial version Jetson TX2i. Since J2K compression is a common task for UAV (Unmanned Aerial Vehicle) applications, here we evaluate such a solution and its limitations.
You can find detailed info concerning our testing approach for JPEG2000 encoding and decoding on desktop/server NVIDIA GPUs at the corresponding links. Here we follow exactly the same procedure, but it's applied to the Jetson hardware.
J2K encoding/decoding parameters
File format – JP2
Lossy JPEG2000 compression with CDF 9/7 wavelet
Lossless JPEG2000 compression with CDF 5/3 wavelet
Compression ratio (for lossy algorithm) ~ 12.0:1 which corresponds to visually lossless encoding
Subsampling mode – 4:4:4
Number of DWT resolutions – 7
Codeblock size – 32×32
MCT – on
PCRD – off
Tiling – off
Window – off
Quality layers – one
Progression order – LRCP (L = layer, R = resolution, C = component, P = position)
Modes of operation – single or multithreaded batch
2K test image (24-bit) – 2k_wild.ppm
4K test image (24-bit) – 4k_wild.ppm
In many cases the compression ratio for visually lossless encoding could be much higher for the JPEG2000 algorithm, so we would suggest testing different parameters to achieve the best compression ratio with an acceptable image quality. By decreasing the quality coefficient, one can get not only better compression, but also a higher framerate both for encoding and decoding. Our benchmarks show the performance results for the above images and parameters. It's not the maximum performance, which could be better in many other cases.
Hardware and software
NVIDIA Jetson TX2
CUDA Toolkit 10.2
JPEG2000 codec benchmarks on NVIDIA Jetson TX2
Jetson TX2 has a 4-core ARM Cortex-A57 @ 2 GHz and a 2-core Denver2 @ 2 GHz. These two types of cores have different performance, which should be taken into account. Since the Tier-2 stage of the JPEG2000 algorithm is implemented on CPU, the performance of both CPU and GPU cores determines the framerate. From that point of view, multithreading can be useful (we use up to 12 threads), but in the single mode we could get different performance depending on the CPU core used. So in the single mode we need to set an affinity mask to ensure that the fastest CPU core is used.
In the tests discussed we've restricted memory usage to 2 GB. This was done under the assumption that Jetson TX2 can have only 4 GB of memory, so this is an important limitation for the whole image processing solution.
Here we haven't considered the task of J2K transcoding to H.264 on Jetson. That task requires additional tests, though from our previous experience with desktop/server GPUs, performance of the transcoding should not differ significantly, because Jetson has hardware support of H.264 encoding (separate from GPU), which is accessible via V4L2 interface and can be used simultaneously with JPEG2000 decoder.
By request we could offer Fastvideo SDK for Jetson for evaluation - please fill the form below and send it to us.
Other info from Fastvideo concerning JPEG2000 and Jetson
JPEG2000 codec on GPU
JPEG2000 vs JPEG vs PNG: What's the Difference?
J2K encoding benchmarks
J2K decoding benchmarks
Fast FFmpeg J2K decoder on NVIDIA GPU
MXF Player
Jetson Benchmark Comparison: Nano vs TX2 vs Xavier
Jetson image processing for camera applications
Original article see at: https://www.fastcompression.com/blog/j2k-codec-on-jetson-tx2.htm
Subscribe to our mail list: https://mailchi.mp/fb5491a63dff/fastcompression
Low-latency software for remote collaborative post production
Fastvideo is a team of professionals in GPU image processing, realtime camera applications, digital cinema and high performance imaging solutions. Fastvideo has been helping production companies for quite a long time, and recently we've implemented low-latency software for collaborative post production.
Today, with restrictions on in-person collaboration, delays in shipping and limitations on travel, single point of ingest and delivery for an entire production becomes vitally important. The main goal is to offer all services both on-premises and remotely. We believe that in the near future we will see virtual and distributed post production finishing.
When you are shooting a movie on a tight schedule and need to accelerate your post production workflow, a remote collaborative approach is the right solution. You don't need to have all professionals on-site: with a remote approach you can collaborate in realtime wherever your teammates are located. The industry trend toward remote production solutions is clear, and it is not just due to the coronavirus. The idea of accelerating post via remote operation is viable, and companies strive to remove various limitations of the conventional workflow, so professionals can now choose a place and a time to work remotely on post production.
Nowadays, there are quite a lot of software solutions that offer reliable remote access via local networks or via the public internet. Still, most of them were not built with professional tasks such as colour grading, VFX and compositing in mind. In post production we need to utilize professional hardware which can display 10-bit or 12-bit footage. Skype, Zoom and many other video conferencing solutions are not capable of doing that, so we've implemented software to solve that problem.
Business goals to achieve at remote collaborative post production
You will share content in realtime for collaborative workflows in post production
Lossless or visually lossless encoding guarantees high image quality and exact colour reproduction
Reduced travel and rent costs for the team due to remote colour grading and reviewing
Remote work will allow you to choose the best professionals for the production
Your team will work on multiple projects (time saving and multi-tasking)
Goals from technical viewpoint
Low latency software
Fast and reliable data transmission over internal or public network
Fast acquisition and processing of SD/HD-SDI and 3G-SDI streams (unpacking, packing, transforms)
Realtime J2K encoding and decoding (lossy or lossless)
High image quality
Precise colour reproduction
Maximum bit depth (10-bit or 12-bit per channel)
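The SDI unpacking and packing step listed above can be illustrated on the CPU. The sketch below is not the GPU implementation described in this article. It only shows how one 16-byte block splits into Y, Cb and Cr samples and back, assuming the usual v210 layout (three 10-bit components per little-endian 32-bit word, six pixels per block). The function names are hypothetical.

```python
import struct

MASK10 = 0x3FF  # each v210 component is 10 bits wide


def pack_v210_block(y, cb, cr):
    """Pack 6 luma and 3+3 chroma 10-bit samples into one 16-byte v210 block."""
    assert len(y) == 6 and len(cb) == 3 and len(cr) == 3
    # Component order inside each 32-bit word, lowest bits first (assumed v210 layout).
    words = [
        (cb[0], y[0], cr[0]),
        (y[1], cb[1], y[2]),
        (cr[1], y[3], cb[2]),
        (y[4], cr[2], y[5]),
    ]
    packed = [
        (a & MASK10) | ((b & MASK10) << 10) | ((c & MASK10) << 20)
        for a, b, c in words
    ]
    return struct.pack("<4I", *packed)


def unpack_v210_block(block):
    """Split one 16-byte v210 block into lists of Y, Cb and Cr samples."""
    words = struct.unpack("<4I", block)
    parts = [(w & MASK10, (w >> 10) & MASK10, (w >> 20) & MASK10) for w in words]
    (cb0, y0, cr0), (y1, cb1, y2), (cr1, y3, cb2), (y4, cr2, y5) = parts
    return [y0, y1, y2, y3, y4, y5], [cb0, cb1, cb2], [cr0, cr1, cr2]
```

A real pipeline would run this over every block of every row in parallel, which is why this step maps well onto a GPU.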
Task to be solved
The post industry needs a low-latency, high-quality video encode/decode solution for remote work that follows this pipeline:
Capture baseband video streams via HD-SDI or 3G-SDI frame grabber (Blackmagic DeckLink 8K Pro, AJA Kona 4 or Kona 5)
Live encoding with J2K codec that supports 10-bit YUV 4:2:2 and 10/12-bit 4:4:4 RGB
Send the encoded material via TCP/UDP packets to a receiver/decoder - point-to-point transmission over ethernet or public internet
Decode from stream at source colorspace/bit-depth/resolution/subsampling - Rec.709/Rec.2020, 10-bit 4:2:2 YUV or 10/12-bit 4:4:4 RGB
Send stream to baseband video playout device (Blackmagic/AJA frame grabber) to display 10-bit YUV 4:2:2 or 10/12-bit 4:4:4 RGB material on external display
Latency requirements: sub 300 ms
Basic hardware layout: Video Source (Baseband Video) -> Capture device (DeckLink) -> SDI unpacking on GPU -> J2K Encoder on GPU -> Facility Firewall (IPsec VPN) -> Public Internet -> Remote Firewall (IPsec VPN) -> J2K Decoder on GPU -> SDI packing on GPU -> Output device (DeckLink) -> Video Display (Baseband Video)
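The point-to-point transmission step in the pipeline above needs some way to delimit encoded frames on a TCP byte stream. The sketch below shows one common approach, a fixed-size header carrying the payload length and a capture timestamp. The header layout and function names are assumptions made for illustration and do not describe the wire format used by the software in this article.

```python
import socket
import struct

# Hypothetical header: payload length in bytes (uint32) and
# capture timestamp in microseconds (uint64), network byte order.
HEADER = struct.Struct("!IQ")


def send_frame(sock, payload, timestamp_us):
    """Send one encoded frame preceded by its header."""
    sock.sendall(HEADER.pack(len(payload), timestamp_us) + payload)


def _recv_exact(sock, n):
    """Read exactly n bytes, since recv() may return fewer than requested."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection mid-frame")
        buf.extend(chunk)
    return bytes(buf)


def recv_frame(sock):
    """Receive one frame; returns (payload, timestamp_us)."""
    length, timestamp_us = HEADER.unpack(_recv_exact(sock, HEADER.size))
    return _recv_exact(sock, length), timestamp_us
```

Carrying the capture timestamp with each frame lets the receiver measure end-to-end delay against the 300 ms target, provided both clocks are synchronized.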
Hardware/software/parameters
HD-SDI or 3G-SDI frame grabbers: Blackmagic DeckLink 8K Pro, AJA Kona 4, AJA Kona 5
NVIDIA GPU: GeForce RTX 2070, Quadro RTX 4000 or better
OS: Windows 10 or Linux (Ubuntu/CentOS)
Frame Size: 1920×1080 (Full HD; DCI 2K is 2048×1080)
Frame Rates: 23.976, 24, 25, 29.97, 30 fps
Bit-depth: 8/10/12 (encode - ingest), 8/10/12 (decode - display)
Pixel formats: RGB or RGBA, v210, R12L
Frame compression: lossy or lossless
Colour Spaces for 8/10-bit YUV or 8/10/12-bit RGB: Rec.709, DCI-P3, P3-D65, Rec.2020 (optional)
Audio: 2-channel PCM or more
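To see why compression is needed before sending such streams over a public network, it helps to estimate the uncompressed data rate for the pixel formats listed above. The sketch below assumes the row layouts commonly documented for v210 (6 pixels in 16 bytes, rows padded to a multiple of 48 pixels) and R12L (8 pixels in 36 bytes). The helper names are hypothetical.

```python
def v210_row_bytes(width):
    """Bytes per row for 10-bit YUV 4:2:2 in v210 (assumed 48-pixel row padding)."""
    return ((width + 47) // 48) * 128


def r12l_row_bytes(width):
    """Bytes per row for 12-bit RGB 4:4:4 in R12L (assumed 8 pixels per 36 bytes)."""
    return (width * 36) // 8


def stream_mbit_per_s(row_bytes, height, fps):
    """Uncompressed video data rate in megabits per second."""
    return row_bytes * height * fps * 8 / 1e6


if __name__ == "__main__":
    for name, row in (("v210", v210_row_bytes(1920)), ("R12L", r12l_row_bytes(1920))):
        for fps in (24, 30):
            print(f"{name} 1920x1080 @ {fps} fps: {stream_mbit_per_s(row, 1080, fps):.0f} Mbit/s")
```

Under these assumptions a 1920×1080 stream at 24 fps needs roughly 1.06 Gbit/s in v210 and 1.79 Gbit/s in R12L before encoding.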
How to encode/decode J2K images fast?
CPU-based J2K codecs are quite slow. For example, FFmpeg-based software solutions rely on the J2K codec from libavcodec (mj2k) or on OpenJPEG, neither of which is fast. Just test that software to check the latency and the performance. This is not surprising, since the J2K algorithm has very high computational complexity. Even with multiple threads or processes on the CPU, the performance of the J2K solution from libavcodec is still insufficient. This is a problem even for 8-bit frames at 2K resolution, and for 4K images (12-bit, 60 fps) the performance is much worse.
The reason why FFmpeg and other software are not fast at this task is simple - they run on the CPU and are not optimized for high performance. A benchmark comparison of J2K encoding and decoding for the OpenJPEG, Jasper, Kakadu, J2K-Codec, CUJ2K and Fastvideo codecs shows the performance for 2K and 4K images (J2K lossy/lossless algorithms).
Maximum performance for J2K encoding and decoding in streaming applications is achieved in multithreaded batch mode. Batching is a must to get the massively parallel processing that the JPEG2000 algorithm allows. Batch processing, however, means collecting several images before processing starts, which is bad for latency. Adding multithreading to batching improves performance further, but the latency gets worse. This is the trade-off between performance and latency in J2K encoding and decoding. For example, a remote colour grading application needs minimum latency, so each J2K frame has to be processed separately, without batching and without multithreading. In most other cases it is better to choose an acceptable latency and get the best performance with batching and multithreading.
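The latency cost of batching can be estimated with simple arithmetic: with a live source, the first frame of a batch has to wait until the last one arrives. The sketch below computes that collection delay and checks it against the 300 ms requirement stated earlier. The `other_stages_ms` figure (capture, encode, network, decode, playout) is a hypothetical input, not a measured value.

```python
def batch_collection_delay_ms(batch_size, fps):
    """Time the first frame of a batch waits for the remaining frames to arrive."""
    return (batch_size - 1) * 1000.0 / fps


def fits_budget(batch_size, fps, other_stages_ms, budget_ms=300.0):
    """True if collection delay plus all other stages stays within the latency budget."""
    return batch_collection_delay_ms(batch_size, fps) + other_stages_ms <= budget_ms


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        delay = batch_collection_delay_ms(n, 24)
        ok = fits_budget(n, 24, other_stages_ms=150.0)
        print(f"batch={n}: +{delay:.1f} ms collection delay, within 300 ms: {ok}")
```

At 24 fps every additional frame in a batch adds about 41.7 ms, so the batch size has to stay small when the whole chain must remain under 300 ms.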
Other info from Fastvideo about J2K and digital cinema applications
JPEG2000 codec on GPU
Fast FFmpeg J2K decoder on NVIDIA GPU
MXF Player
Fast CinemaDNG Processor software on GPU
BRAW Player and Converter for Windows and Linux
Original article: https://www.fastcompression.com/blog/remote-post-production-software.htm
JPEG2000
Subtext: "I was actually a little relieved when I learned that JPEG2000 was used in the DCI digital cinema standard. I was feeling so bad for it!"