Partial video editing without transcoding












Is it possible to make partial edits to each frame of a video (e.g. adding a logo) without transcoding, and save the video in the same format?

I mean, the pixel positions in each frame are calculable, so is it possible to modify the file's bytes in such a way that they produce the desired change in a frame (e.g. adding a logo)?

I know that in formats like XviD this is not possible (at least not easily), but is it completely impossible, or is it possible for some formats (e.g. MPG or raw AVI)?










video video-editing video-codecs

edited May 1 '13 at 12:26 by slhck
asked Apr 29 '13 at 22:27 by RYN






















          3 Answers


















          5














This is only (practically) possible for lossless video encoding.

Lossless to lossless

It's very simple to do what you want with lossless or uncompressed video. Many such formats store the video data pixel by pixel, mostly in the YUV colorspace, and every frame stands on its own.

Such a video is easy to edit because you know where every frame is, and where every pixel is within the byte stream. You can simply overwrite parts of the byte stream without consequences for the rest of the stream. Even if the lossless video was compressed (for example with arithmetic coding), you could decompress it first, edit it, and then compress and save it again without any loss.

So, for example, if you have raw YUV video in an AVI container, you could edit it frame by frame and save it as YUV in AVI again.
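To make the "you know where every pixel is" point concrete, here is a minimal sketch of overwriting a rectangle in place. It assumes a headerless planar YUV 4:2:0 (I420) file with known, even dimensions rather than an AVI container, and the function names are made up for illustration. It only touches the luma (Y) plane, so the patch shows up as a brightness change while the colour underneath stays as it was.

```python
import os

def frame_size_i420(width, height):
    """Bytes per frame: one full-resolution Y plane plus quarter-size U and V planes."""
    return width * height + 2 * (width // 2) * (height // 2)

def stamp_luma(path, width, height, patch_rows, left, top):
    """Overwrite a rectangle of the Y plane in every frame, in place.

    patch_rows is a list of equal-length bytes objects, one per patch row.
    Returns the number of frames that were touched.
    """
    if width % 2 or height % 2:
        raise ValueError("this sketch assumes even frame dimensions")
    if left + len(patch_rows[0]) > width or top + len(patch_rows) > height:
        raise ValueError("patch does not fit inside the frame")
    frame_size = frame_size_i420(width, height)
    total = os.path.getsize(path)
    if total % frame_size:
        raise ValueError("file size is not a whole number of frames")
    frames = total // frame_size
    with open(path, "r+b") as f:
        for n in range(frames):
            for r, row in enumerate(patch_rows):
                # Frame start + row offset inside the Y plane + column offset.
                f.seek(n * frame_size + (top + r) * width + left)
                f.write(row)
    return frames
```

The file size never changes and no byte outside the patch is rewritten, which is exactly why this kind of edit is lossless for the rest of the picture.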



Keeping the lossy encoding

But if you want to use lossy video encoding after the editing step, or keep the original lossy encoding, this is not practically possible. There are two problems: the encoding process itself, and the fact that frames often depend on each other.

Lossy compression tries to shrink the data as much as possible by removing details the human eye doesn't see. This is done in several steps, but the most important one transforms the pixel domain into the frequency domain, often with variants of the Discrete Cosine Transform.

This step takes a block of, say, 8×8 pixels and transforms it into a block of frequency coefficients. The coefficients are then quantized, and many of them are dropped entirely. This reduces the amount of information (and thus the file size), but it also throws away visual information, which reduces the quality of the video. Which coefficients are dropped depends on the quality setting of the encoder. The video is then stored not as pixels, but as frequency coefficients.
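The transform-and-drop step can be sketched in a few lines. This is a simplified illustration, not what any particular encoder does: it uses a floating-point orthonormal 8×8 DCT and a single uniform quantizer step (a made-up parameter), whereas real codecs use integer transforms and per-frequency quantization tables.

```python
import math

N = 8

def _alpha(k):
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

def dct_1d(v):
    """Orthonormal DCT-II of an 8-sample vector."""
    return [_alpha(k) * sum(v[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                            for n in range(N)) for k in range(N)]

def idct_1d(c):
    """Inverse of dct_1d."""
    return [sum(_alpha(k) * c[k] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for k in range(N)) for n in range(N)]

def _transpose(m):
    return [list(row) for row in zip(*m)]

def dct_2d(block):
    rows = [dct_1d(r) for r in block]                      # transform each row
    return _transpose([dct_1d(c) for c in _transpose(rows)])  # then each column

def idct_2d(coeffs):
    rows = [idct_1d(r) for r in coeffs]
    return _transpose([idct_1d(c) for c in _transpose(rows)])

def quantize(coeffs, step):
    """Round every coefficient to a multiple of `step`; small ones become 0."""
    return [[round(c / step) for c in row] for row in coeffs]

def dequantize(levels, step):
    return [[q * step for q in row] for row in levels]

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
```

Running `idct_2d(dct_2d(block))` gives back the original pixels up to floating-point noise, so the transform itself loses nothing. Inserting `quantize`/`dequantize` in between is what discards information, and doing it again after an edit discards more.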



When you want to edit a lossy video, you first have to reconstruct a pixel-by-pixel representation from the frequency coefficients. At this point you could edit the video and insert a logo, but to store it again you'd have to perform the lossy transformation step again, and throw away more information. This is, in essence, the main cause of generation loss.

Another problem is that in most lossy video, some frames depend on information contained in other frames. More specifically, P- and B-frames only store differences relative to earlier (or later) I-, P-, or B-frames. If you changed the contents of an I-frame, all the frames that depend on it would also change when decoded, which is typically not what you want. Peter Cordes' answer below highlights this point. And he's right that, in principle, you could edit a lossy I-frame-only video in place, but in practice it would be very hard to accomplish.
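The way an edit spreads through dependent frames can be modelled with a toy dependency table. This is a deliberately simplified sketch: the frame numbers and reference lists below are hypothetical, and real streams track references per block rather than per frame.

```python
def affected_frames(refs, edited):
    """Return the frames whose decoded picture changes if `edited` frames change.

    refs maps a frame index to the list of frame indices it predicts from.
    """
    dirty = set(edited)
    changed = True
    while changed:
        changed = False
        for frame, sources in refs.items():
            if frame not in dirty and any(s in dirty for s in sources):
                dirty.add(frame)
                changed = True
    return sorted(dirty)

# Hypothetical group of pictures in display order: I B B P B B P, then a new I.
gop = {
    0: [],        # I
    1: [0, 3],    # B
    2: [0, 3],    # B
    3: [0],       # P
    4: [3, 6],    # B
    5: [3, 6],    # B
    6: [3],       # P
    7: [],        # I (starts the next group)
}
```

With this table, `affected_frames(gop, [0])` covers every frame up to the next I-frame, while an edit to frame 7 stays confined to frame 7.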



So, very simply speaking, unless you can store a video losslessly, you can't edit it in place without sacrificing quality.






• Actually, you don't technically need your source to be a lossless representation of anything. Editing without applying any lossy compression to the output of your edit will result in a file MUCH bigger than the original, but with exactly the same quality as the original.
  – Peter Cordes, Jan 17 '15 at 16:57

• Absolutely. I tried to clarify that part, but your answer is good. I guess I was thinking more about what's realistic, in the sense of whether there is a tool that does this job. And I have yet to see a nice implementation of something that modifies bitstreams to change some macroblock frequencies in such a way that the inverse transformation gives you what you originally wanted (as specified in the pixel domain).
  – slhck, Jan 17 '15 at 17:37

• Yeah, I agree most people wouldn't want to inflate their videos by a factor of 10 to 100, since if there are already artifacts from something, it's easier to accept more.
  – Peter Cordes, Jan 17 '15 at 21:59

• re: bitstream hacking: There are still-image JPEG tools like jpegclub.org/jpegtran (packaged for ages in most Linux distros as part of libjpeg-progs / libjpeg-turbo-progs) that do lossless transformations like crop, rotate, or even rescale (DCT is cool). Replacing some blocks with other blocks should be easy, too, but I don't know if there's existing code for it.
  – Peter Cordes, Jan 17 '15 at 22:04

• Actually, it looks like there is existing code: I see jpegclub.org/jpegjoin/jpegjoin.txt. "The main purpose of Jpegjoin is to compose (join) multiple images into a single image, similar to the arrangement of row and column cells in a table. Also provided is a feature to drop logo and legend images on the bottom of the cells."
  – Peter Cordes, Jan 17 '15 at 22:06





















          2














slhck is correct if we're talking about complex video codecs with motion vectors (edit: actually, any kind of intra or inter prediction). You could modify an H.264 stream by re-encoding only the macroblocks in the area you want to replace with your logo: logo blocks in I-frames, skip blocks in P- and B-frames. But any time the camera panned, nearby blocks would get their picture data by copying from the part of the picture that now contains the logo. So the edges of your logo would get copied around, corrupting other parts of the picture until the next I-frame.

So he's right, but it's because of references, not lossiness. (edit: His edits clear up most of the confusion.)

For simpler codecs where every block is independent (without even intra prediction), e.g. MJPEG, you could replace the coded blocks only in the area where you want the logo, without decoding and re-encoding the rest of the picture. I think there's support for doing this in a single JPEG image file, but I don't know if anything supports it for MJPEG without breaking out the frames into separate JPEG files.
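Block-level replacement only works on whole coded blocks, so the first step is working out which blocks a logo rectangle touches. Here is a small sketch of that arithmetic. The default block size of 16 assumes JPEG with 4:2:0 chroma subsampling, where the minimum coded unit is 16×16 pixels, and the helper names are invented for this example.

```python
def covered_blocks(left, top, width, height, block=16):
    """List (row, col) indices of every block that the pixel rectangle touches."""
    first_col = left // block
    first_row = top // block
    last_col = (left + width - 1) // block
    last_row = (top + height - 1) // block
    return [(r, c)
            for r in range(first_row, last_row + 1)
            for c in range(first_col, last_col + 1)]

def aligned_rect(left, top, width, height, block=16):
    """Grow the rectangle outward to block boundaries; returns (x, y, w, h)."""
    x0 = (left // block) * block
    y0 = (top // block) * block
    x1 = -(-(left + width) // block) * block    # ceiling division
    y1 = -(-(top + height) // block) * block
    return x0, y0, x1 - x0, y1 - y0
```

A 30×10 logo placed at (20, 10) ends up rewriting a 48×32 region, because every block it partly overlaps has to be replaced in full.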



http://jpegclub.org/jpegjoin/jpegjoin.txt has info on how to use jpegtran to replace some blocks of JPEG files without decoding and re-encoding the rest of the image.

Here's an example of mucking around with a video stream without decoding/re-encoding: hiding data in an H.264 stream, in the low coefficients of iPCM macroblocks (literal, not DCT, coding). You still have to decode and re-encode the CABAC entropy-coding layer, which is the final zip-style compression for the bitstream.

If the source is in a lossless format, you're not incurring a huge file-size cost by decompressing, editing, and recompressing the whole thing, just CPU time. Many lossless formats essentially zip each whole frame separately, rather than operating on blocks, so you'd have to do a full decompress-edit-recompress. Lossless H.264 has all the inter and intra prediction of normal lossy H.264, so even though it does have macroblocks, a full decode-edit-encode is needed there, too.
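The decompress-edit-recompress cycle for a "zip each frame" style format can be sketched with Python's zlib as a stand-in codec. The frame layout here (one byte per pixel, rows stored back to back) and the function name are assumptions made for the example, not the layout of any real lossless video format.

```python
import zlib

def edit_compressed_frame(blob, width, left, top, patch_rows):
    """Decompress one frame, overwrite a rectangle, and recompress it.

    blob is a zlib-compressed frame of one byte per pixel, `width` pixels per row.
    patch_rows is a list of bytes objects, one per patch row.
    """
    pixels = bytearray(zlib.decompress(blob))
    for r, row in enumerate(patch_rows):
        start = (top + r) * width + left
        pixels[start:start + len(row)] = row
    return zlib.compress(bytes(pixels))
```

The compressed bytes of the edited frame generally differ everywhere, not just around the patch, which is why the whole frame has to go through the cycle. The decoded pixels outside the patch are still bit-identical, so nothing is lost.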



None of this changes anything for the actual practical purpose of adding a logo to a video, UNLESS your lossy source is MJPEG. I'm just writing this for those curious about why.






            1














I would say it should be possible.

During the US changeover to OTA HD (ATSC), almost all the networks provided video from their home studios via satellite, using a very high bit rate video stream (much higher than the OTA bit rate). They then provided each local television station with a decoder and a lower-bitrate encoder, so the station could completely decode the high-quality stream, add its station logo, picture inserts, etc., and then encode to the broadcast-standard bit rate for transmission.

However, the Fox network opted to encode the stream in their home studios, presumably using a higher-quality (and more expensive) encoder, but at the broadcast bit rate. Then they supplied a "splicer" to each local affiliate, which could partially decode the stream, insert logos, crawls, etc., and then re-encode just that portion before broadcast.

I do realize that the MPEG-2 stream used in OTA broadcast differs from h.264/MPEG-4, but from what I know of the technical details of the formats, this is most likely still feasible with h.264/MPEG-4. (In fact, MPEG-4 adds features that support alpha compositing, although I suspect few decoders actually handle this feature.)

That said, I have yet to find software that does this. I have streams to which I'd like to add a graphics bar (mostly a sports score) without re-encoding the already mediocre quality from a cheap camera.

Edit: I did find this paper, "Replacing picture regions in H.264/AVC bitstream by utilizing independent slices," which appears to provide the technical details to achieve this. But I'm still looking for some proof-of-concept software.






Answer by slhck (the first answer above): answered May 1 '13 at 12:25, edited Jan 17 '15 at 17:35













              • Actually, you don't technically need your source to be a lossless representation of anything. Editing without applying any lossy compression to the output of your edit will result in a file MUCH bigger than the original, but it will have exactly the same quality as the original.

                – Peter Cordes
                Jan 17 '15 at 16:57











              • Absolutely. I tried to clarify that part, but your answer is good. I guess I was more thinking about what's realistic in the sense of… is there a tool that does this job. And I have yet to see a nice implementation of something that modifies bitstreams to change some macroblock frequencies in such a way that the inverse transformation gives you what you originally wanted (as specified in the pixel domain).

                – slhck
                Jan 17 '15 at 17:37











              • Yeah, I agree most people wouldn't want to inflate their videos by a factor of 10 to 100. And if there are already artifacts from the original encode, it's easier to accept a few more.

                – Peter Cordes
                Jan 17 '15 at 21:59











              • re: bitstream hacking: There are still-image jpeg tools like jpegclub.org/jpegtran (packaged for ages in most Linux distros as part of libjpeg-progs / libjpeg-turbo-progs) that do lossless transformations such as crop, rotate, or even rescale (DCT is cool). Replacing some blocks with other blocks should be easy, too, but IDK if there's existing code for it.

                – Peter Cordes
                Jan 17 '15 at 22:04











              • actually, looks like there is existing code: I see jpegclub.org/jpegjoin/jpegjoin.txt. "The main purpose of Jpegjoin is to compose (join) multiple images into a single image, similar to the arrangement of row and column cells in a table. Also provided is a feature to drop logo and legend images on the bottom of the cells."

                – Peter Cordes
                Jan 17 '15 at 22:06




































              2














              slhck is correct if we're talking about complex video codecs with motion vectors (edit: actually, any kind of intra or inter prediction). You could modify an H.264 stream by re-encoding only the macroblocks in the area you want to replace with your logo: logo in I-frames, skip blocks in P- and B-frames. But any time the camera panned, nearby blocks would get their picture data by copying from the part of the picture that now contains the logo. So the edges of your logo would get copied around, corrupting other parts of the picture until the next I-frame.
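Here is a toy sketch of that smearing effect. Everything in it is an assumption for illustration: frames are one-dimensional rows of 8 pixels, a "P-frame" is just one motion vector per pixel (copy from `prev[i + mv]`, with no residual), and a motion vector of 0 stands in for a skip block. Real H.264 works on 2-D macroblocks with sub-pixel motion, residuals and intra prediction.

```python
LOGO = 99  # pixel value used for the logo


def decode(i_frame, p_frames):
    """Decode an I-frame followed by P-frames that only contain motion vectors."""
    frames = [list(i_frame)]
    for mvs in p_frames:
        prev = frames[-1]
        # Pixel i is predicted from prev[i + mv]; indices are clamped at the edges.
        cur = [prev[min(max(i + mv, 0), len(prev) - 1)] for i, mv in enumerate(mvs)]
        frames.append(cur)
    return frames


i_frame = [0, 1, 2, 3, 4, 5, 6, 7]
pan = [-1] * 8           # camera pan: every pixel copies its left neighbour
p_frames = [pan, pan]

# The edit: logo in the I-frame, "skip" (mv = 0) at the logo position in P-frames.
logo_at = {2, 3}
edited_i = [LOGO if i in logo_at else v for i, v in enumerate(i_frame)]
edited_p = [[0 if i in logo_at else mv for i, mv in enumerate(mvs)] for mvs in p_frames]

clean = decode(i_frame, p_frames)
edited = decode(edited_i, edited_p)

# Pixels outside the logo area that no longer match the clean decode.
smeared = [
    i
    for i, (a, b) in enumerate(zip(clean[2], edited[2]))
    if a != b and i not in logo_at
]

print("clean  frame 2:", clean[2])
print("edited frame 2:", edited[2])
print("corrupted pixels outside the logo:", smeared)
```

After two panned frames, pixels 4 and 5 show logo data even though only pixels 2 and 3 were edited. The untouched motion vectors next to the logo keep pulling logo pixels sideways, one more pixel per frame, until a new I-frame resets the picture.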



              So he's right, but it's because of references, not lossy-ness. (edit: His edits clear up most of the confusion.)



              For simpler codecs where every block is independent (not even any intra prediction), e.g. mjpeg, you could replace the coded blocks only in the area you wanted to logo, without decoding / re-encoding the rest of the picture. I think there's support for doing this in a single jpeg image file, but IDK if anything supports it for mjpeg, without breaking out the frames to separate jpeg files.
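As a rough sketch of the idea (not of mjpeg itself): if every block of every frame is coded on its own, you can swap the coded logo blocks in and copy all other coded blocks through untouched. Below, `zlib` is only a stand-in for a per-block coder, and the function names and the 4-blocks-per-frame layout are invented for this example. Real JPEG is a bit messier, since the DC coefficient of each block is coded relative to the previous block and the Huffman-coded data is not byte-aligned per block, which is why tools like jpegtran work on the entropy-decoded coefficients rather than splicing raw bytes.

```python
import zlib


def encode_block(pixels):
    """Stand-in for an intra-only block coder: each block is compressed on its own."""
    return zlib.compress(bytes(pixels))


def decode_block(coded):
    return list(zlib.decompress(coded))


def stamp_logo(coded_frames, logo_blocks):
    """Replace selected coded blocks in every frame; all other blocks are copied as-is.

    logo_blocks maps a block index to an already-coded logo block.
    """
    out = []
    for frame in coded_frames:
        new_frame = list(frame)            # shallow copy: untouched blocks are reused
        for idx, coded_logo in logo_blocks.items():
            new_frame[idx] = coded_logo
        out.append(new_frame)
    return out


# Two frames, four blocks per frame, four pixels per block.
frames = [
    [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3], [4, 4, 4, 4]],
    [[5, 5, 5, 5], [6, 6, 6, 6], [7, 7, 7, 7], [8, 8, 8, 8]],
]
coded = [[encode_block(b) for b in f] for f in frames]

# Put the logo into block 3 of every frame.
logo = {3: encode_block([255, 0, 0, 255])}
stamped = stamp_logo(coded, logo)

for frame in stamped:
    print([decode_block(b) for b in frame])
```

Blocks 0 to 2 of each output frame are the very same coded bytes as in the input, so they pick up no generation loss. Only the logo block was ever encoded.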



              http://jpegclub.org/jpegjoin/jpegjoin.txt has info on how to use jpegtran to replace some blocks of jpeg files, without decoding/encoding the rest of the image.



              Here's an example of mucking around with a video stream without decoding/re-encoding: hiding data in an H.264 stream, in the low coefficients of iPCM macroblocks (literal, not DCTed, coding). You have to decode / re-encode the CABAC entropy-coding layer. (It's the final zip-style compression for the bitstream.)



              If the source is in a lossless format, you're not incurring a huge filesize cost by decompressing, editing, and recompressing the whole thing, just CPU time. Many lossless formats essentially zip each whole frame separately, rather than operating on blocks, so you'd have to do a full decompress-edit-recompress. Lossless H.264 has all the inter and intra prediction of normal lossy H.264, so even though it does have macroblocks, a full decode-edit-encode is needed there, too.
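A minimal sketch of the "zip each whole frame" case, using Python's `zlib` as a stand-in for a lossless per-frame codec. The frame size, pixel layout and function names are assumptions for illustration. The point is that the whole frame has to be decompressed and recompressed, but every pixel outside the logo comes back bit-exact.

```python
import zlib

WIDTH, HEIGHT = 8, 4  # tiny hypothetical frame, one byte per pixel


def compress_frame(pixels):
    """Losslessly compress a whole frame at once."""
    return zlib.compress(bytes(pixels))


def decompress_frame(coded):
    return list(zlib.decompress(coded))


def add_logo(coded, logo, x0, y0):
    """Full decompress-edit-recompress of one frame."""
    pixels = bytearray(zlib.decompress(coded))
    for dy, row in enumerate(logo):
        for dx, value in enumerate(row):
            pixels[(y0 + dy) * WIDTH + (x0 + dx)] = value
    return zlib.compress(bytes(pixels))


frame = list(range(WIDTH * HEIGHT))
coded = compress_frame(frame)

logo = [[200, 201], [202, 203]]
edited = decompress_frame(add_logo(coded, logo, x0=6, y0=0))

logo_positions = {6, 7, 14, 15}
unchanged = all(
    edited[i] == frame[i] for i in range(WIDTH * HEIGHT) if i not in logo_positions
)
print("pixels outside the logo unchanged:", unchanged)
```

Repeating the edit any number of times costs CPU time but never degrades the rest of the picture, which is the difference from the lossy case.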



              None of this changes anything for the actual practical purpose of logoing a video, UNLESS your lossy source was mjpeg. I'm just writing this for those curious about why.














                  edited Jan 17 '15 at 22:35

























                  answered Jan 15 '15 at 5:34









                  Peter Cordes























                      1














                      I would say it should be possible.



                      During the US changeover to OTA HD (ATSC), almost all the networks provided video from their home studios via satellite, using a very high bit rate (much higher than the OTA bit rate) video stream. They then provided each local television station with a decoder and lower bitrate encoder, so they could completely decode the high-quality stream, add their station logo, picture inserts, etc., and then encode to the broadcast standard bitrate for transmission.



                      However, the Fox network opted to encode the stream in their home studios, presumably using a higher-quality (and more expensive) encoder, but at the broadcast bit rate. They then supplied a "splicer" to each local affiliate, which could partially decode the stream, insert logos, crawls, etc., and then re-encode just that portion before broadcast.



                      I do realize that the MPEG-2 stream used in OTA broadcast differs from h.264/MPEG-4, but from what I know of the technical details of the formats, this is most likely still feasible with h.264/MPEG-4. (In fact, MPEG-4 actually adds features that support alpha compositing, although I suspect few decoders may actually handle this feature.)



                      That said, I have not been able to find such a piece of software. I have streams to which I'd like to add a graphics bar (mostly a sports score) without re-encoding the already mediocre quality from a cheap camera.



                      Edit: I did find a paper, "Replacing picture regions in H.264/AVC bitstream by utilizing independent slices," that appears to provide the technical details needed to achieve this. But I'm still looking for some proof-of-concept software.














                          edited Dec 30 '18 at 23:31

























                          answered Dec 30 '18 at 23:20









                          rtillery





























