**** BEGIN LOGGING AT Sat Nov 11 03:00:02 2017
Nov 11 11:09:11 slapin: the most wasteful part I found was the texture colorspace converter, that was really awful
Nov 11 11:09:29 slapin: the rest was kinda OK, the blob is rendering into fbdev and using the PAN ioctl to implement faux double-buffering
Nov 11 11:09:43 slapin: which is crap, but better than what the rest of the shyte blobs do
Nov 11 21:14:02 well, it looks like it copies large buffers a lot, which I tried to solve via dmabufs, but it seems that requires an ARM engineer to help me implement that feature...
Nov 11 21:14:28 especially bad things happen where v4l is concerned...
Nov 11 21:15:49 I don't want to use the hardware overlay as it is so stupidly implemented - it might work for android, but not for a normal windowing system or inside a browser window...
Nov 11 21:16:04 you can't draw above video :(
Nov 11 21:16:57 I want normal 2D display + video and this can only be done with a proper mali driver, not a half-assed one...
Nov 11 21:20:35 so you either have shitty performance and a normal display (like 99% CPU utilisation) or you get good performance (25% CPU util) but can't use subtitles and can't have windows on top of video
Nov 11 21:22:27 it is not possible to render the video offscreen when the overlay is used... :(
Nov 11 21:32:43 slapin: I do exactly this for a customer -- live video playback on a texture
Nov 11 21:32:51 slapin: the video is coming from a v4l2 device
Nov 11 21:34:05 slapin: do the following to import a DMABUF from the v4l2 device into the GPU as a texture
Nov 11 21:34:08 const EGLint attr[] = {
Nov 11 21:34:10     EGL_WIDTH, ctx->v4lfmt.fmt.pix.width,
Nov 11 21:34:13     EGL_HEIGHT, ctx->v4lfmt.fmt.pix.height,
Nov 11 21:34:15     EGL_LINUX_DRM_FOURCC_EXT, DRM_FORMAT_BGRA8888,
Nov 11 21:34:18     EGL_DMA_BUF_PLANE0_FD_EXT, fd,
Nov 11 21:34:20     EGL_DMA_BUF_PLANE0_OFFSET_EXT, 0,
Nov 11 21:34:23     EGL_DMA_BUF_PLANE0_PITCH_EXT, ctx->v4lfmt.fmt.pix.width * 4,
Nov 11 21:34:25     EGL_NONE
Nov 11 21:34:28 };
Nov 11 21:34:30 EGLImageKHR img;
Nov 11 21:34:33 img = ctx->eglCreateImageKHR(ctx->display, EGL_NO_CONTEXT,
Nov 11 21:34:35                              EGL_LINUX_DMA_BUF_EXT, NULL, attr);
Nov 11 21:34:38 assert(img);
Nov 11 21:34:40 glActiveTexture(GL_TEXTURE0);
Nov 11 21:34:43 glBindTexture(GL_TEXTURE_EXTERNAL_OES, ctx->tex);
Nov 11 21:34:45 glTexParameteri(GL_TEXTURE_EXTERNAL_OES, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
Nov 11 21:34:48 glTexParameteri(GL_TEXTURE_EXTERNAL_OES, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
Nov 11 21:34:51 glTexParameteri(GL_TEXTURE_EXTERNAL_OES, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
Nov 11 21:34:54 glTexParameteri(GL_TEXTURE_EXTERNAL_OES, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
Nov 11 21:34:57 ctx->glEGLImageTargetTexture2DOES(GL_TEXTURE_EXTERNAL_OES, img);
Nov 11 21:35:00 ctx->eglDestroyImageKHR(ctx->display, img);
Nov 11 21:35:02 The fragment shader should have roughly this in it
Nov 11 21:35:05 "uniform samplerExternalOES camTex; \n"
Nov 11 21:35:07 "void main() \n"
Nov 11 21:35:10 "{ \n"
Nov 11 21:35:12 "    gl_FragColor = texture2D(camTex, vTexCoord); \n"
Nov 11 21:35:15 "} \n";
Nov 11 21:37:05 ah, this must be at the beginning of the fragment shader program too
Nov 11 21:37:05 "#extension GL_OES_EGL_image_external : enable\n"
Nov 11 21:37:46 ctx->eglCreateImageKHR is obtained via eglGetProcAddress("eglCreateImageKHR") in the egl init
Nov 11 21:38:16 ctx->camtexture = glGetUniformLocation(ctx->program, "camTex");
Nov 11 21:38:34 slapin: that should get you started with the DMABUF importing
Nov 11 21:38:51 slapin: if you use glTexImage2D(), then yes, it will do the texture colorspace conversion and it'll be a disaster
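A minimal sketch of the "egl init" step mentioned at 21:37:46, assuming a ctx struct shaped like the one used in the chat; the struct layout and the helper name egl_resolve_extensions() are illustrative, not from the original code. The extension entry points are looked up by name with eglGetProcAddress().

#include <EGL/egl.h>
#include <EGL/eglext.h>
#include <GLES2/gl2.h>
#include <GLES2/gl2ext.h>

struct ctx {
    EGLDisplay display;
    GLuint tex;
    /* extension entry points resolved at init time */
    PFNEGLCREATEIMAGEKHRPROC eglCreateImageKHR;
    PFNEGLDESTROYIMAGEKHRPROC eglDestroyImageKHR;
    PFNGLEGLIMAGETARGETTEXTURE2DOESPROC glEGLImageTargetTexture2DOES;
    /* ... v4lfmt, program, camtexture, ... */
};

/* Resolve the EGLImage / external-texture extension functions by name. */
static int egl_resolve_extensions(struct ctx *ctx)
{
    ctx->eglCreateImageKHR = (PFNEGLCREATEIMAGEKHRPROC)
        eglGetProcAddress("eglCreateImageKHR");
    ctx->eglDestroyImageKHR = (PFNEGLDESTROYIMAGEKHRPROC)
        eglGetProcAddress("eglDestroyImageKHR");
    ctx->glEGLImageTargetTexture2DOES = (PFNGLEGLIMAGETARGETTEXTURE2DOESPROC)
        eglGetProcAddress("glEGLImageTargetTexture2DOES");

    /* All three are needed for the DMABUF import path shown above. */
    if (!ctx->eglCreateImageKHR || !ctx->eglDestroyImageKHR ||
        !ctx->glEGLImageTargetTexture2DOES)
        return -1;

    return 0;
}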
Nov 11 21:39:07 slapin: btw the DMABUF importing might only work with newer blobs (r7p0)
Nov 11 21:39:28 thanks, I still need to learn EGL to do such tricks, but it looks promising, I just need to hack that into the library
Nov 11 21:39:41 can I run newer blobs with older chips?
Nov 11 21:39:58 afaik you need a blob for your CPU for some reason
Nov 11 21:40:13 the rockchip blobs didn't work on zynqmp last time I tried
Nov 11 21:40:54 slapin: I used the above to stream video from four cameras at the same time, process it in a shader program and composite it onto the display
Nov 11 21:41:10 slapin: it worked at some 80 FPS (rendered), good enough
Nov 11 21:41:34 slapin: oh btw, regarding the fd, you get that by doing ...
Nov 11 21:41:52 memset(&rqbufs, 0, sizeof(rqbufs));
Nov 11 21:41:52 rqbufs.count = CONFIG_V4L2_BUF_COUNT;
Nov 11 21:41:52 rqbufs.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
Nov 11 21:41:52 rqbufs.memory = V4L2_MEMORY_MMAP;
Nov 11 21:41:52 ret = ioctl(v4lfd, VIDIOC_REQBUFS, &rqbufs);
Nov 11 21:42:02 buf.index = i;
Nov 11 21:42:02 buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
Nov 11 21:42:02 buf.memory = V4L2_MEMORY_MMAP;
Nov 11 21:42:03 ret = ioctl(v4lfd, VIDIOC_QUERYBUF, &buf);
Nov 11 21:42:14 memset(&expbuf, 0, sizeof(expbuf));
Nov 11 21:42:14 expbuf.index = i;
Nov 11 21:42:14 expbuf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
Nov 11 21:42:15 ret = ioctl(v4lfd, VIDIOC_EXPBUF, &expbuf);
Nov 11 21:42:31 ^ this is important, since it exports the MMAPed buffer (EXPBUF) and gives you the DMABUF FD in expbuf.fd
Nov 11 21:42:43 ret = ioctl(v4lfd, VIDIOC_QBUF, &buf);
Nov 11 21:43:00 then you poll on the v4l2 device FD, when you get POLLIN, you DQBUF, update the texture and QBUF again
Nov 11 22:44:35 thanks a lot, now I have hope to make non-overlay video work
Nov 11 22:53:17 slapin: and you can do 2D compositing with the GPU too
Nov 11 23:13:16 well, 2D composition with dedicated hardware is much better...
Nov 11 23:35:08 Overlays? Depends ...
Nov 12 01:58:04 marex-cloud: depends on what? if you use a hardware compositor (most SoCs have such things these days) it is way better than using GPU stuff
Nov 12 01:59:06 the hardware + software composition is about 3x better than the GPU one on m2
**** ENDING LOGGING AT Sun Nov 12 03:00:02 2017
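A minimal sketch of the poll/DQBUF/QBUF loop described at 21:43:00, assuming the buffers were already requested, exported and queued as in the log; update_texture() is a hypothetical helper standing in for the EGLImage import shown earlier, and dmabuf_fds[] holds the FDs returned by VIDIOC_EXPBUF per buffer index.

#include <poll.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/videodev2.h>

/* Hypothetical helper: re-imports the DMABUF as a GL_TEXTURE_EXTERNAL_OES texture. */
extern void update_texture(int dmabuf_fd);

static int capture_one_frame(int v4lfd, const int *dmabuf_fds)
{
    struct pollfd pfd = { .fd = v4lfd, .events = POLLIN };
    struct v4l2_buffer buf;
    int ret;

    /* Block until the driver has a filled buffer for us (POLLIN). */
    ret = poll(&pfd, 1, -1);
    if (ret <= 0)
        return -1;

    /* Dequeue the filled buffer. */
    memset(&buf, 0, sizeof(buf));
    buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    buf.memory = V4L2_MEMORY_MMAP;
    ret = ioctl(v4lfd, VIDIOC_DQBUF, &buf);
    if (ret < 0)
        return -1;

    /* Update the texture from the DMABUF exported for this buffer index. */
    update_texture(dmabuf_fds[buf.index]);

    /* Hand the buffer back to the driver for the next frame. */
    return ioctl(v4lfd, VIDIOC_QBUF, &buf);
}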