Stick Theory: How To Use x264 Like A Pro, Only Pro’s Can’t Legally Use It For Production Work

Just as String Theory attempts to govern the universe, Stick Theory governs how x264 works. Stick Theory is the theory that when choosing x264 settings, even if you have no idea what anything does, as long as you keep throwing shit at a wall, you’ll see what sticks, resulting in quality video.

A trend I notice is that very few people can really use x264 properly. Everyone expects wonderful magical GUIs and automated shenanigans where you just set a preset and leave it, and your video comes out looking reasonably non-crap. The thing many people don’t know is that CLI is good for you, and not as scary as one might think. Additionally, x264 has presets now, and soon, a --device flag for incredibly simple encodes for PS3 or Windows Mobile or whatever else. This makes basic x264 usage really simple. Just read x264 --help, or if you want more comprehensive info, --longhelp or even --fullhelp.

I lurk in a lot of IRC channels. More and more often now, I am seeing people asking in places like #Archlinux how to encode. The channel is for linux discussion, not encoding. At the same time, the guys in #x264 don’t really like handing out tailored settings for every random lazy guy that drops into the channel. Spending two minutes on a quick google search will give you most info you want to see. http://mewiki.project357.com/wiki/X264_Settings should be one of the first results, and it has more info than pretty much everywhere else on the entire internet besides for possibly the musty halls of Darkhold, or inside Loren Merritt’s head.

I also see a lot of users calling Random Distro X crappy just because it doesn’t provide a crappy tool like Handbrake to transcode their porn with. The thing is, x264 is really easy to use. It’s simple, can be installed or compiled in seconds, and with the preset system you can do pretty decent encodes as long as your source is good. There is no real need for a GUI or for people to think encoding is hard. x264 is practically made for the lazy man now. Stop filling up my 119 IRC channels with useless shit about why you suck at encoding, people.

The other problem with x264 users of course is that they often don’t know what different settings do, and mess with them anyway. This is where Stick Theory comes in. While throwing random shit at a wall DOES eventually give you a good result, what is significantly faster and more informed is to learn what common x264 settings do, especially things like the vbv-buffer, crf, aq, psy-rd, trellis, bframes, b-pyramid, subme, motion estimation, rc-lookahead, and a few others. The above linked wiki article is a very good repo of x264 info mostly written by some very talented encoders (look Dae, I called you talented!).

Knowing what you’re doing will let you write out a full x264 command string in less than a minute to have a very well tuned encode, tailored for your content, resulting in VERY nice results. Of course this isn’t strictly necessary, but knowing it always helps. That does however sound tedious. Luckily, with x264, you can use both the presets AND the main flags, allowing you to choose a preset, autotune it with the --tune flag, and then override as necessary for a very easy yet customised encode string. And all in the space of a minute.
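As a rough sketch of that preset, then tune, then override order, here is how such a command string could be put together in Python. The helper name build_x264_command and the file names are made up for illustration. --preset, --tune, --crf, --bframes and --output are real x264 options, but the values are only an example, not a recommendation for your content.

```python
def build_x264_command(source, output, preset="slow", tune="animation", overrides=None):
    """Return an argv list: preset first, then tune, then manual overrides."""
    cmd = ["x264", "--preset", preset, "--tune", tune]
    # Explicit options are listed after preset/tune, in the same order the
    # text describes: pick a preset, tune it, then override as necessary.
    for flag, value in (overrides or {}).items():
        cmd += ["--" + flag, str(value)]
    cmd += ["--output", output, source]
    return cmd


# Hypothetical file names and override values, purely for illustration.
cmd = build_x264_command("input.y4m", "output.mkv", overrides={"crf": 18, "bframes": 5})
print(" ".join(cmd))
```

The list form can be handed straight to subprocess.run if you want to launch the encode from a script instead of typing it out.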

Now, stop being lazy and get off my lawn and out of my IRC channel. x264 does all the work for you, it’s MAGIC I say! (Some might argue that the AI in things like AQ and CRF is sentient, but Dark_Shikari assures me it isn’t, yet)

This Is Not An Elaborately Large Quote I Am Just Writing Some TL;DR About Subtitle Formats To Explain Things As Requested By The Masses

While speaking to Eric over at Siren Visual and my bro Shadow Wolf at Supanova this weekend, the topic of various subtitle formats and how they impact visual typesetting and typography came up. Today, I’m going to be writing about the three main ways you could classify subtitle formats and how they work; namely, text-based subtitles, DVD IDX/SUB format, and the two (yes two!) BD subtitle formats, PGS/SUP and TTXML.

I’m super lazy and there is so much variation in text-based subtitling that I’ll sorta skim over this. Text-based subtitles are formats such as SRT or SSA. They are common in the ripping and fansubbing communities mostly for their ability to be turned on and off at will. Some are more complex than others. ASS (SSA V4+) for example is capable of rendering full text effects as well as vector graphics and with a container like Matroska, it can be packaged with fonts and used for full typography and visual effects as well as the subtitles themselves. SRT on the other hand is a much more basic format: it just stores lines with their times. There are many formats like this and they are used in many places and so I leave further research to the reader as this post is mostly aimed at DVD and Blu-Ray.

DVD uses a format that can be referred to as either IDX, Sub, or VobSub after the horrible renderer for Windows it used to have. DVD SUB uses 4 colours in a raw bitmap storage format. That said, the way your DVD player displays each of those colours is up to the manufacturer as the palette has 16 colours, and there are a further 16 contrast values, also with only 4 that can be used. The four colours are for the background, foreground, outline, and shadow. The background colour is generally an alpha field and so your sub-picture will overlay onto your actual video with transparency, and not covering it. Common ‘fill’ colours are white, yellow, and pale blue, while black is the most common outline. Most players have a transparent shadow, although black is fairly common too.
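To make the "4 out of 16" idea a bit more concrete, here is a small Python sketch of how one sub-picture pixel could be resolved. The palette contents, the order of the four roles, and the helper name resolve_pixel are all assumptions for illustration; a real disc carries its own palette, and as said above, players are free to display it differently.

```python
# Hypothetical 16-entry RGB palette; only the entries used below are filled in.
PALETTE = [(0, 0, 0)] * 16
PALETTE[1] = (255, 255, 255)  # white fill
PALETTE[2] = (0, 0, 0)        # black outline

# Assumed order of the four roles a 2-bit pixel value can take.
ROLES = ("background", "foreground", "outline", "shadow")


def resolve_pixel(code, colour_index, contrast):
    """Map a 2-bit pixel code to (role, rgb, alpha).

    colour_index: four palette indices (0..15), one per role.
    contrast:     four contrast values (0..15), one per role; 0 is fully transparent.
    """
    rgb = PALETTE[colour_index[code]]
    alpha = contrast[code] / 15
    return ROLES[code], rgb, alpha


# Transparent background, opaque white fill, opaque black outline, no shadow.
colour_index = (0, 1, 2, 0)
contrast = (0, 15, 15, 0)
print(resolve_pixel(1, colour_index, contrast))
```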

Blu-Ray on the other hand is a bit better. It uses 24-bit colours in its sub-picture format. This allows for a rather unique ability on a Blu-Ray, if anyone was to take advantage of it (Siren Visual I am looking at you.) Given a BD disk has so much space on it, yet most times that space isn’t utilised even CLOSE to fully, one could take advantage of this 24-bit colour (+alpha) to render full ‘soft’ typesetting onto the video. A studio could open up their compositing application of choice, do their thing, and then output a PNG sequence. Convert that to SUP, mux, and you now have full soft-sub typesetting on a BD release. I have yet to see ANYONE in the industry typeset at all, regardless of method, so this would be a real bonus on release quality.

The second format Blu-Ray has, and it seems a lot of people don’t know about this, is TTXML. TTXML shouldn’t be confused with the MP4 format’s Timed Text, which is usually referred to as TTXT. TTXML is a format mostly defined by Adobe, although barely supported by any software I have seen including Adobe’s own Flash player. It is a text-based format similar to Ogg Kate or SSA, only using XML. It is rather basic and from what I can tell (limited spec) it has no vectoring capability, but I assume SVG incorporation isn’t too difficult. It is capable of the general font selection, bolding, styling like outline and shadow, stretching, and basic text animation effects like karaoke by its time function, quite similar to the ASS \t flag if more basic. I have no idea if many hardware Blu-Ray players support this format, but I’m just putting it out there that it exists.

This concludes me writing walls of text about subtitles, it’s 11:48PM and my fingers are freezing. We’ll see if anyone takes interest in the Australian Blu-Ray Industry.

I’m Not Saying Ironman 2 Was Shit But That Framedropping Thing Was Kinda Annoying

I went to see Ironman 2 tonight. I REALLY enjoyed it besides for a few hurfdurf technical moments, and one thing that really irked me concerning framerate nostalgia. As an aviomechatronic engineer, flying robots are kinda my thing, so I had a lot of fun at how crazy a lot of it was. I was a bit wtflol at the whole particle accelerator in a house, and then refracting the particle beam through the prism? What was that about? I like movies being outrageous like that though so it was enjoyable to see. What WASN’T enjoyable however was what appeared to be a bit of framerate history gone wrong.

Movie cameras have been using 24fps since the 20’s (although with a 0.1% slowdown in the digital age because people in the US are dumb) while the non-FILM cameras went with 30fps, broadcast interlaced as 60i if need be. I’m not going to pretend I know what camera the film used for all those scenes of Howard Stark, but I’ll take a guess that it used the cheaper TV type cameras that used 30fps. So we have 30fps footage being played in a 24fps movie, what is commonly called hybrid. The problem with that is that 1 in every 5 frames is gonna be dropped by the 24fps camera used by the studio, making it jerky, which was REALLY annoying.
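A quick Python sketch of why that looks jerky, assuming the simplest possible conversion where every fifth frame is just thrown away (I have no idea what the studio actually did, so treat kept_frames as a made-up illustration):

```python
def kept_frames(total_source_frames, drop_every=5):
    """Indices of the 30fps frames that survive when every fifth one is dropped."""
    return [n for n in range(total_source_frames)
            if n % drop_every != drop_every - 1]


kept = kept_frames(30)  # one second of 30fps footage
steps = [b - a for a, b in zip(kept, kept[1:])]

print(len(kept))   # 24 frames survive out of 30
print(steps[:8])   # time step between consecutive surviving frames
```

Three steps of one source frame followed by a step of two: motion runs evenly, then skips ahead, four times out of every five source frames. That uneven step is the jerk.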

I did say earlier that it was done wrong, when it really wasn’t, and I meant that in the sense that Stark had money and could afford a movie camera and thus get 24fps and not make my eyes hurt. Hybrid video isn’t really rare, especially in movies that show video in another display, but that doesn’t mean I have to like it.

Other than that though, I found the film really enjoyable, even more so than the first one. You guys should go watch it. Tony reminded me a bit of myself, only further down the track, and with a cooler rig at home. That and he was a good student and I am not a narcissist, hurf. Here’s to the Thor movie being good.

Sly Marketing Is Probably Not The Best Way To Sell A Product But I Guess It Works When Your Audience Is Retarded

A while ago, one of my fellow soldiers in the war against the industry had a chance to talk to Sly, from Madman Entertainment, about their upcoming Bluray releases. After our numerous points about them screwing over the video in DVD releases by standards conversion and a few other things, Sly made several assurances that the BD releases would have the video unchanged from the Japanese masters, including the lack of English typesetting. The company reps have mentioned this to quite a few people as I understand it.

The other day, Madman announced the title Claymore as one of its upcoming BD releases. I know a lot of people would think that video being unchanged by a shitty licensing and localising company is a good thing, and it SHOULD be, except that not all Japanese video is perfect either. I would like to see Madman releasing Utawarerumono as a BD title, with its terribly delicious 1080i MPEG2. Except I wouldn’t. The Claymore BD in Japan is a known upscale, and a terrible one at that. We’ll see how Madman shoot themselves in the foot this time.

On the side, to clear up some confusion, I do not have anything against Sly. He’s a nice guy and has always been courteous whenever I have spoken to him, although apparently he thinks I’m a dick, which is somewhat understandable. I only have a problem with the company he works for. Just sayin’.

Grain Is Not A Defect: How Eugenics Improve Video Quality

“Way to grainy/noisy for a 1080p… I’ve seen better BRRips @ 2gb…”

As the above quote shows, a lot of people are under the impression that grain is the same as noise, and is a defect. Not just in the scene, which is already known for being incredibly dumb, but also in the AMV and fansub communities. Over the past few months, fansubbers have slowly come to terms with grain (with a few notable exceptions) and at times have gone a little too far such as adding grain for no good reason. Granted adding grain is at times necessary but trolling with it is a bit much. Before the community even thinks about how much grain is necessary though, I think the various video scenes need to get over the fact that it is NOT a bad thing.

This might come as a surprise to some but grain makes up a large amount of picture quality. Most video is encoded with DCT codecs which break things into macroblocks. Without grain or anything else detailed and small, the quantizers the various codecs use will smooth out the blocks and produce solid colour bodies, something x264 overcomes with its adaptive quantizer. On content such as animation, large flat colours can be how it’s meant to look, but that is certainly not the case with live footage. Another problem is that where there are large solid blocks, a quantizer might be a bit over active and smooth out a very minor gradient, often seen as grey/white blocks and lines in the sky on a clip. Something like an adaptive quantizer really helps with that, but so does the minor amount of grain often present in sources.

Noise on the other hand is usually the product of poor capturing and is an issue in the source. If a source is noisy, then naturally you denoise it. Grain however can be reduced, added, or left alone, depending on how much there is. One simple way to tell if your clip has noise or grain is by the shape, size, and distribution of it. Grain tends to be uniform and rather fine, except for in flashback scenes in anime where it is significantly larger, and it almost always covers the entire picture. Noise often just impacts a small part of the picture, is usually bigger than grain, and the noise ‘chunks’ are irregular. Paying close attention to grain will show that it appears quite regular. Digital grain is often static as well, so it doesn’t move between frames. It’s easy to spot if you look at a slow pan, the image will pan under the grain. Noise however will move in every frame. Noise also sometimes shows up as a colour aberration.

Dirt is a type of noise, and is quite rare in modern content. It is usually found in analogue content transferred to digital, mostly on older things, and sometimes in video that has already been compressed badly. I have seen it almost nowhere in live content but in anime it’s often present around hair edges. Usually edge cleaning will fix it, and if you have dirt but no other noise there is no reason to denoise your entire clip, just mask it and clean it, or use a dedicated edge cleaner filter.

To get back on track, grain is not bad. Complaining about grain makes you look dumb and blind, and getting rid of it makes content look sterile as hell. Fine grain actually looks good, is rarely even noticed, and makes a picture look significantly more natural. There is absolutely no need to remove or tamper with it. The problem is that people are stupid. I’m not going to actually talk about selective breeding but I think the title gets the point across. There are lots of good uses for grain, such as debanding without resorting to dither, which gets banded at the quantizer anyway, and overall it does look better. Now stop fucking with it.

FLV+VP6 Seemed Like A Good Idea At The Time But Now It’s More Like That Trip To Vegas

I think everyone who actually reads this is aware that FLV is one of the worst containers ever used in the history of pretty much anything. For starters it absolutely abuses timecodes, so when I take a timecode v2 dump and look and see lots of numbers around 30fps, it’s pretty much guaranteed to actually be 30fps.

Another thing FLV likes to do is give itself some random as hell resolutions, such as 583×437. H.264 is generally a YV12 codec, that is, chroma subsampling is done at 4:2:0, which would force the resolution to be divisible by 2 (mod2), although VP6 allows pretty much whatever you want to do.
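For the non-encoders, here is a tiny Python sketch of why 4:2:0 wants mod2. Each chroma plane is half the luma size in both directions, so an odd dimension leaves you with half a chroma sample. The function name chroma_plane_size is just something I made up for this example.

```python
def chroma_plane_size(width, height):
    """Return (w, h) of each chroma plane for 4:2:0, or None if the frame isn't mod2."""
    if width % 2 or height % 2:
        return None  # an odd dimension cannot be halved cleanly
    return width // 2, height // 2


print(chroma_plane_size(583, 437))  # the FLV resolution mentioned above
print(chroma_plane_size(584, 438))  # nearest mod2 size going up
```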

The video I was asked to deal with today was VP6 at that resolution at 30fps in FLV. The problem I actually had on it was that there was some luma blending. Now I figured a simple MergeChroma(last.Trim(1,0)) (cut the first frame of the chroma so that it’s no longer a frame behind the luma, for those non-avisynth using people) would do the job however it turned out to be more difficult than I thought.

See, Japan has this love of REALLY BAD framerate conversions, and one of the worst framerate conversion methods blends the luma and chroma channels to interpolate motion and whatnot. This was most definitely the case here, and eventually I decided to just freezeframe some really bad bits and leave the rest of it as is, seeing as 30fps is hardly going to make a single frame noticeable to the regular human eye.

After messing around a bit more and with the help of Kuukunen, it was found that the blending was in a 4:5 pattern: Definite proof of blended interpolation framerate conversion from 24fps to 30fps. I think this is where I would like to give P.A. Works and whoever else worked on this a big warm FUCK YOU. Naturally I shouldn’t be ripping things but it’s available for free on their site and I happen to be involved in an English translation project for it.

Now, as I’ve been speaking about this on IRC to a few people, some of them commented “Just decimate the 5th frame seeing as that’s the bad one.” The issue with blended interpolation framerate conversions is that they don’t just blend 2 frames to make an extra one, they mess with ALL frames to preserve smoothness of motion, albeit introducing blending as well. That means that there is effectively nothing one can do about it, although Kuukunen suggested the following, which basically takes the 2 worst frames in any set of 5 and blends them to get back down to 24fps, with some funky 3-way blends.

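# Split the 30fps clip into its five phases (every 5th frame, offsets 0 to 4)
s = last
s0 = s.selectevery(5,0)
s1 = s.selectevery(5,1)
s2 = s.selectevery(5,2)
s3 = s.selectevery(5,3)
s4 = s.selectevery(5,4)
# Keep phases 0-2 as they are and blend phases 3 and 4 into one frame (5 in, 4 out)
interleave(s0,s1,s2,s3.overlay(s4,opacity=0.5))
# Flag the result as 24000/1001 fps
assumefps(24000,1001)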
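For those who don’t read avisynth, this Python sketch shows which source frames end up in each output frame if you read the script above literally. It only models the frame indices, not the actual pixel blending, and source_frames is a name made up for the example.

```python
def source_frames(n):
    """Source frame indices that feed output frame n of the script above."""
    group, pos = divmod(n, 4)  # four output frames per group of five inputs
    base = group * 5
    if pos < 3:
        return [base + pos]        # phases 0-2 pass through untouched
    return [base + 3, base + 4]    # phases 3 and 4 are blended 50/50


for n in range(8):
    print(n, source_frames(n))
```

Every fourth output frame is a blend of two already-blended source frames, which goes some way to explaining why the result still isn’t pretty.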

I can’t say I liked the result of that, but either way Japan has proven once again that it knows nothing about quality video mastering. The industry strikes again I guess. I could maybe write a letter to the studio informing them of how they’re Doing It Wrong™ and they might even send me really low res lossless clips and ask for 1080p upscaled H.264, but I don’t see that happening here.

On the side, I happen to be turning 21 today, and if anyone feels like contributing to something they should message me on Rizon ( ´∀`)

Doing It Wrong: Hardware Support, Null Frames, And Why You’re Overscanning Your Usefulness

Sure is a lot of updating from me recently. If you actually read and/or appreciate these pages upon pages of tl;dr, leave a comment with your thoughts. It’s a bit depressing to see high stats on reading and ~30 comments across the entire blarg.

Everyone knows some guy with a DivX player right? Those guys that pop in CD’s with XviD or DivX DVD rips and whatnot on them and get to watch it on their TV, who haven’t yet figured out streaming matroska or watching MP4 AVC encodes? A lot of anime encoders still like to use XviD for “hardware compatibility” reasons. Some use H.264 but still in the AVI container, like “timecop”, but they are so utterly beyond help I don’t see any point in commenting here. There seem to be an awful lot of issues with these supposed hardware compatible encodes though. I’ll attempt to explain why, and hopefully people will stop using XviD/AVI or at least come to some concessions.

The three main fuckups I see in AVI encodes are overscan, variable framerates, and to a lesser degree, resolution. I’ll start with overscan. Overscan is something that happens on older (and some newer) TV’s, effectively anything that uses a cathode ray tube (ie, not LCD or Plasma, the bulky TV’s) along with flat-panels set to overscan mode for whatever silly reason. An image on a CRT can be broken into three parts: title-safe, action-safe, and overscan. Title-safe is the innermost part of a frame, where everything is certain to remain correct. Action-safe is a slightly larger area where some things may be cut off but is usually ok, especially on the horizontal as far as subtitles go. Vertically, subtitles should always be in the title-safe region, which most often means a vertical padding margin of 5% of the vertical resolution. For example, for 720p, that would be 0.05 * 720, or 36px. About 5.5% however is the optimum reading zone for most people across almost all reading distances and font sizes. Overscan is the part that is guaranteed to be cut off.
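The margin arithmetic above is trivial, but here it is as a Python sketch so you can plug in your own resolution. The function name vertical_margin is made up, and rounding to the nearest pixel is my own assumption.

```python
def vertical_margin(height, fraction=0.05):
    """Vertical subtitle margin in pixels for a given frame height."""
    return round(height * fraction)


print(vertical_margin(720))          # the 720p example from the text
print(vertical_margin(1080))         # same 5% rule at 1080p
print(vertical_margin(720, 0.055))   # the roughly 5.5% reading zone
```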

I’m going to go and assume that anyone reading this is somewhat knowledgeable about digital subtitling and is familiar with Aegisub. One of the lesser-known features of Aegisub is the overscan mask. It can be enabled under Video -> Show Overscan Mask. A blue mask will appear over your video. The darkest blue part is the overscan mask. The lighter, inner-blue part is the action-safe mask. The clear unmasked part of the picture is the title-safe section. Your subtitles should always vertically be within the title-safe section, and horizontally at least in the action-safe section. While you would imagine it preferable to be all within the title-safe region, on widescreen video the action-safe zone is actually quite wide, and very few CRT’s will actually cut it off. It also makes your subs look less crushed and bulky, and prevents multiple lines when going a tiny bit further can be achieved on one. You must always keep clear of the overscan-mask however.

Daiz wrote a decent page on some popular XviD re-encoders that appear on some torrent sites that supposedly support hardware playback yet don’t. It shows the masks in action and illustrates how each screenshot fails to fit what is needed properly. The supposed point of most of these re-encodes is to provide compatibility for hardware players, but every single one I have seen so far breaks 1 and occasionally 2 or even 3 of the things I mentioned above.

The AVI Container doesn’t support Variable Frame Rate. There is however a nifty function called drop frames or null frames, which allows a frame to be dropped and the previous one to show through. This allows a way to ‘fake’ VFR, however it can make the framerate exceedingly high if you have somewhat arbitrary rates, as all different rates must have a common multiple, most often 120fps. By using the null-frame trick, one can make a faked VFR AVI encode. There is only one problem: Hardware players BREAK HORRIFICALLY on null frames. It is completely unsupported in every player I have ever seen it on. So either you get fucked motion, or you don’t get hardware compatibility. Alternatively you can duplicate frames up to a rate, and make your file roughly 4x bigger, but I can’t say I’ve ever seen someone do this. Usually people use lower-resolution XviD encodes, so being twice the size of an HD matroska encode yet far poorer quality is somewhat silly. This especially goes for people backing up their DVD’s: if it’s VFR, don’t ever use AVI, you will ruin your motion if you intend to watch it on a hardware player.
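Here is where the 120fps comes from, as a Python sketch. It uses nominal integer rates (24, 30, 60) rather than the real NTSC fractions to keep the arithmetic readable, and the function names are made up for the example.

```python
from functools import reduce
from math import gcd


def common_rate(rates):
    """Smallest frame rate that every rate in the list divides evenly."""
    return reduce(lambda a, b: a * b // gcd(a, b), rates)


def null_frames_per_real_frame(rate, container_rate):
    """How many null frames follow each real frame at the container rate."""
    return container_rate // rate - 1


rates = [24, 30, 60]
container = common_rate(rates)
print(container)
for r in rates:
    print(r, null_frames_per_real_frame(r, container))
```

So a 24fps section in a 120fps AVI is one real frame followed by four nulls, which is exactly the sort of thing hardware players choke on.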

Another limitation of hardware players is their strange need to be mod16, that is, the horizontal and vertical resolutions must be perfectly divisible by 16. That’s due to the way DCT codecs work and the fact that a macroblock is 16×16 in these codecs. The DivX specification’s max resolution is 720×480 and as such, the most ‘common’ resolution is 704×396 to keep to 16:9. The problem with 704×396 however is that it is not mod16. That would require 704×400, which is a lot more common now, but back when these low resolutions were mostly used it was far from it. It seems it’s gotten more of a following now, probably because the XviD codec is far more efficient at mod16 resolutions.
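A quick Python sketch of the mod16 check, using the resolutions from the paragraph above. The helper names are made up, and rounding up rather than down is my own choice for the example.

```python
def is_mod(width, height, mod=16):
    """True if both dimensions are perfectly divisible by mod."""
    return width % mod == 0 and height % mod == 0


def round_up_to_mod(value, mod=16):
    """Smallest multiple of mod that is greater than or equal to value."""
    return -(-value // mod) * mod


print(is_mod(704, 396))      # 396 / 16 = 24.75, so not mod16
print(round_up_to_mod(396))  # next mod16 height
print(is_mod(704, 400))
```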

The final issue I’d like to bring up is the use of AVC in AVI. Using something as nice as the AVC codec in something as hacky and broken as the AVI container, which doesn’t support the codec fully in the first place, is just sheer stupidity. I would like to warn everyone against using Komisar and BugMaster’s shitty x264vfw. No matter what someone says, you do NOT need AVC in AVI and you do NOT need x264vfw. If you think you need these things, you need to educate yourself further.

Theora, H.264, HTML5, and Why Software Patents Are Giving Everybody Fansub Ice Cream

DISCLAIMER: This post is mostly about debunking myths and inconsistencies. I’m going to start off by stating that I don’t know all that much about software patents, and that I really don’t care about them or legality. If the GPL Legal Team wants to send a van for me, go right ahead, I violate your shitty license daily and I’m not going to stop for some freetards and their dignity. Additionally, although I condone and support piracy in some circumstances, this does not include supporting piracy of most commercial applications, especially those of Adobe and Apple. I wholeheartedly support stealing whatever you want from Microsoft.

Recently, there has been a lot of talk of Mozilla and HTML5, specifically over the proposed video tag. Mozilla and Opera are opting for Ogg Theora while Apple and Google are going for AVC (H.264). While I am a long-time Opera user, the change away from Qt has not pleased me, and the 10.5 screenshots are ugly as sin, so it looks like I won’t be concerning myself with that any time soon. I hate GTK so Firefox isn’t really an issue either. That does of course leave me with no browser to use but I’ll figure something out. The important thing though is that Mozilla Firefox is the biggest open-source browser in the world, as far as numbers go, so using Theora as opposed to AVC is a bit of a problem, and that is where the issue that prompted this post came in.

There is what I like to think of as a communal uproar on the internet about this. A schism if you will. Everyone is divided on what Firefox should use, even though the team has said (for the moment) that they’re going with Theora. The one thing that really got to me is people assuming that OGG == Theora. This is not the case. Another problem is that people assume x264 is an open-source implementation of H.264, also not the case. This article really shows that off fairly well. While x264 is an encoder, H.264 is a standard. OGG is a container and Theora is a codec. People messing up their terms has made following articles like that one rather difficult at times.

The main problem as I see it is that people are deluded about what Theora is capable of, and about exactly how the royalties for H.264 work. Freetards say things like “contribute to Theora to make it better then” but as the codec is patent-free, it is fairly limited in what can be added to it. There is very little more that Theora can do, and nothing it can do to even approach AVC’s quality.

H.264 on the other hand has some difficult to comprehend and confusing patents involved. As long as you don’t intend to sell your stuff publicly enough to grab the MPEGLA’s attention, there is really no issue though. Not all patents are even valid outside of the US. A patent in America isn’t really going to mean much to someone in Europe as the European Patent Convention doesn’t allow software patents, and from the pool of developers I know quite a few are European. There are of course countries where AVC is patented in Europe, but I don’t think the MPEGLA is going to track someone down to backwater Norway (or anywhere else) just to yell at them for using unlicensed H.264. If you encode something with AVC and stick it on your website, they’re hardly going to come after you. It certainly does explain a bit about why Madman and other silly companies use VP6 though.

Another thing I see a lot of is people going on about “x264 is ok to use, just not H.264!” As I wrote above, they are not comparable like that at all. Additionally, some people are writing that OGG/Theora have patents too, where they don’t at all. The whole point of Xiph is to have no patents, which is mostly why it’s so shit terrible. There is of course the ffmpeg project which sort of ignores patents altogether, although they are very heavy-handed on GPL violations, especially of their own work, to the point of a wall of shame. It just shows how childish and bigoted the FOSS community can be.

What it comes down to is that H.264 has royalties and patents, Theora does not. Theora can be used in anything but as it’s a piece of shit as far as codecs go, it shouldn’t be used under any circumstances. H.264 is free for non-commercial use, something people seem to either forget or ignore, and the fees for commercial use are usually covered by your software and/or hardware, and not as Ben Schwartz writes, completely separate.

For an Operating System or commercial encoder/decoder to say that distributing your AVC files may require an additional fee is entirely correct, and there is no problem with it. If you are using that software in the first place and distributing commercially, you can afford the fees anyway, they really aren’t that high, and they have thresholds on where they begin and are scaled depending on your needs. That is to say, if you distribute only one copy, or even more than one but still relatively small, say as a video editor for functions, the fees don’t really apply. The standard does have a royalty-free period to encourage adoption, which has also just been extended.

As far as royalties go for a browser to decode things, it would be far simpler for it to rely on the system. On Windows that would mean using VFW to select a decoder, and in *nix, my guess is libavcodec. This really kills the argument that free browsers cannot decode AVC without having to pay for it. Nothing is stopping Mozilla from making a paid version of Firefox that links to MainConcept’s decoder but I hardly see the point in that: almost all video online currently is H.264 and the browser handles that perfectly well in the flashplayers, why does it need to suddenly stop handling it so well?

Furthermore, along with codec wars, people are starting to get into container wars. There has always been a cold war as far as containers go (Hi NUT) but the arguments that one cannot stream Matroska for example are just ridiculous. Almost all streaming now is just sequential downloading of the file or beginning from a specific point and then downloading, as opposed to ‘real’ streaming. Calling a container a codec or incapable of streaming just shows off people’s ignorance and stupidity.

Naturally there is some contestation in the business world, something Jason Garrett-Glaser (Dark Shikari) covers extremely well over at his blog, mostly about Adobe trying to take over the world and how everything besides Windows is irrelevant. In Ben Schwartz’s article above, Aki Jäntti (Kuukunen) makes some interesting comments about comparisons of AVC and Theora, and how optimisation affects it.

I too was present during the discussion of the --tune option in x264, and the developers really did put a lot of thought into it, especially on the Touhou setting. Something to note is that a game like Touhou is excellent for comparing codecs, its rapid changes in motion and bright simple colour make it rather good for testing at lower bitrates, which is about the only thing Theora is capable of being compared against.

I believe that most people comparing codecs aren’t really good at optimising all the codecs they’re working with, and as such most comparisons are skewed. As far as commercial use goes, x264 is on the brink of nal-hrd being committed, and once VFR works with nal-hrd I would imagine x264 could then be used for mastering standards compliant blurays, and would become a real contender. As far as visual quality is concerned, x264 is by FAR the best encoder in the world at this point in time.

Finally, to link this all back to patents and ice cream, this comic shows what I am talking about rather aptly. People in the FOSS community are so anti-patent and buying ANYTHING that they will use any free alternative, no matter what. They shun people for deciding against free things even if there is no better commercial option purely because they believe too strongly in flawed ideals. This is the main problem in the argument against AVC and for Theora in Firefox. They are too hung-up on using something free, even though it’s terrible, to notice that they CAN use H.264 without any problems. The entire argument is pointless, but it still drags on.

I think this is the longest post I’ve ever written. I completely and utterly blame stupid freetards for this, especially the Mozilla Foundation, which honestly needs to die. Firefox is a terrible browser and I’m going to be an opinionated little elitist and not use it, and there is nothing you can do about it. Long live x264 and Qt-based web browsers.

Colour Matrices And Why Typesetters Make Encoders Look Bad

It seems that a lot of people still have issues with when and how colour matrix conversions should be used. After talking to some fellow encoders and a few typesetters I figured it was about time to write something useful here again.

There are two main colour matrices used in pretty much everything: ITU-R BT.709 and ITU-R BT.601, usually written Rec.709 and Rec.601. Contrary to popular belief, both use what we call TV levels. That is, on the 8-bit luma scale, black is 16 and white is 235, as opposed to the PC levels of 0 and 255 respectively. Rec.709 (also called bt709, which I will use from now on) is used on HD material, that is anything with MORE than 576 pixels vertically, while bt601 is used on SD material, or anything up to and including 576 vertical pixels.
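For anyone who wants the numbers behind that, here is a rough Python sketch of the two matrices and the levels. The luma coefficients are the standard ones for each matrix; the function names are made up for illustration, and the 576-line cutoff is just the rule of thumb above, not something a decoder is obliged to follow.

```python
# Luma coefficients (Kr, Kb) for each matrix; Kg is 1 - Kr - Kb.
LUMA_COEFFS = {
    "bt601": (0.299, 0.114),
    "bt709": (0.2126, 0.0722),
}

# 8-bit luma range: TV (limited) levels versus PC (full) levels.
TV_BLACK, TV_WHITE = 16, 235
PC_BLACK, PC_WHITE = 0, 255


def guess_matrix(height):
    """Rule of thumb from the post: more than 576 lines means bt709."""
    return "bt709" if height > 576 else "bt601"


def tv_to_pc_luma(y):
    """Stretch a TV-range luma value to PC range, clamping anything outside it."""
    scaled = (y - TV_BLACK) * (PC_WHITE - PC_BLACK) / (TV_WHITE - TV_BLACK)
    return max(PC_BLACK, min(PC_WHITE, round(scaled)))
```

Something like guess_matrix(720) gives "bt709" and guess_matrix(576) gives "bt601", which is the same guess most players make when a stream carries no flag at all.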

The trouble people seem to have is WHERE to use these. You can say that for anything at each of those resolutions you will use what is appropriate, but what happens when someone takes a screenshot? Video is almost always encoded in the YV12 colourspace, yet screenshots are RGB. A conversion there needs to use the correct matrix, but most media players will assume bt601. For playback itself they will either read the stream info or assume based on resolution, but most of the time that isn’t so for screenshots.

As another example, what happens if you have something broadcast on an HD stream at bt709 but it’s really upscaled, and you decide to encode it at, say, 480p? Which would you use? You can use the fancy ColorMatrix() plugin for avs to convert bt709 to bt601, but why not just flag the stream? It’s my own personal practice to convert SD material because I know people will screenshot it, and if it’s an SD encode, there is no point keeping the other matrix.

Furthermore, what happens when you need to process video? Most programs that use RGB outputs, say for encoding overlay files with an alpha channel, will assume bt601; I know VirtualDub and Adobe After Effects do. Autodesk Inferno has an option to set it, although I’d assume AE does as well, but then again it isn’t something the industry likes to think about. So let’s assume we have a source at bt709, and it’s HD. We then do some processing on it and output a lossless file for someone else to typeset onto, which is then encoded as RGB32 and overlaid with an alpha channel back onto the original clip. The following is a flowchart of sorts of how this normally goes, and it results in incorrect colours:

Source (bt709 YV12) -> Processing (bt709 YV12) -> Lossless AVC (bt709 YV12) -> Typeset (bt601 RGBA) -> Overlaid (bt709 YV12)

You could of course encode your lossless as, for example, Lagarith at RGB24, but is a colourspace conversion going to happen in there? Yes. Are you going to do it yourself? No. The encoder will do it and assume bt601. The colours are then off. The correct way to do it is to convert to RGB yourself with the right matrix, so the source going into the LAGS encoder is already RGB, then pass that to the typesetter, who now has correct colours and outputs RGBA for you correctly, and everything is hunky-dory. Example avs code that goes right at the end of the script:

ConvertToRGB24(matrix="Rec709")

That’s it. LAGS now has RGB data directly, and there is no problem. Because people get that wrong though, typesetters bork colours slightly (it’s really quite unnoticeable to most people, and even then only on certain hues) and then the encoder looks stupid.
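If you want to see how slight "slightly" is, the mismatch is easy enough to reproduce with plain arithmetic. The following is a minimal Python sketch, not what any particular encoder or editor does internally: it assumes 8-bit values, full-range RGB on one side and TV-range YCbCr on the other, keeps the intermediate YCbCr as unrounded floats, and all the function names are made up for illustration.

```python
# Luma coefficients (Kr, Kb) for each matrix; Kg is 1 - Kr - Kb.
COEFFS = {"bt601": (0.299, 0.114), "bt709": (0.2126, 0.0722)}


def rgb_to_ycbcr(rgb, matrix):
    """8-bit full-range RGB -> TV-range YCbCr, left as unrounded floats."""
    kr, kb = COEFFS[matrix]
    kg = 1.0 - kr - kb
    r, g, b = (c / 255.0 for c in rgb)
    y = kr * r + kg * g + kb * b
    cb = (b - y) / (2.0 * (1.0 - kb))
    cr = (r - y) / (2.0 * (1.0 - kr))
    return (16.0 + 219.0 * y, 128.0 + 224.0 * cb, 128.0 + 224.0 * cr)


def ycbcr_to_rgb(ycbcr, matrix):
    """TV-range YCbCr -> 8-bit full-range RGB, rounded and clamped."""
    kr, kb = COEFFS[matrix]
    kg = 1.0 - kr - kb
    y = (ycbcr[0] - 16.0) / 219.0
    cb = (ycbcr[1] - 128.0) / 224.0
    cr = (ycbcr[2] - 128.0) / 224.0
    r = y + 2.0 * (1.0 - kr) * cr
    b = y + 2.0 * (1.0 - kb) * cb
    g = (y - kr * r - kb * b) / kg
    return tuple(max(0, min(255, round(255.0 * c))) for c in (r, g, b))


def round_trip(rgb, encode_matrix, decode_matrix):
    """Encode with one matrix, decode with another, as a careless tool would."""
    return ycbcr_to_rgb(rgb_to_ycbcr(rgb, encode_matrix), decode_matrix)
```

Calling round_trip((200, 40, 40), "bt709", "bt709") hands the same red back, while decoding it with "bt601" does not. A neutral grey like (128, 128, 128) survives either way, because with zero chroma the matrix has nothing to get wrong, which is why the error only shows up on certain hues.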

I think deciding which matrix to use in final encodes is really up to personal preference; as long as the stream has it correct there is no problem. All that matters is that the same colourspace is used consistently or, if it is changed, that any conversions are done correctly. Using the ColorMatrix() filter is optional: if you intend to output bt601 then it’s a great way to start off in that colourspace (it goes at the top of your avs file, before anything that changes frames in any way is run), and if not, you should make sure to flag bt709 in x264 (--colormatrix bt709) instead. If you do encode in bt601, flagging for it at encoding time is also good. I’d say for SD shows doing a bt601 encode is a better idea due to the aforementioned screenshot issue, but for HD stuff it’s OK to use bt601 as well really. Rec.709 is of course preferred for HD content.

I hope this has made it a little clearer WHERE to put conversions, and that maybe one or two people actually learned something.

Telecide Will Kill Your Children

It was recently pointed out to me that Certain Content Authoring Communities are still living in the dark ages. I occasionally speak to someone who writes their ‘guides’ but really it’s about time things were updated, especially to be clearer on The PAL Situation. Long ago in the bleak days of everything being shit, Donald Graft wrote his Telecide filter, a filter for performing Inverse Telecine. For those familiar, IVTC requires both field matching and decimation of resulting duplicate frames. Telecide is the field matcher.

While speaking to a friend named Vincent, I discovered that a video he was showing me was a little jerky, so I took a look. It was quickly apparent that every 5th frame was a duplicate. A classic sign of field matching without decimation. So I said “Hey Vincent, what the fuck did you do?” to which he replied “Oh I just followed the guide on the .org, it says just use telecide().” He was quickly informed of a longer filterchain to use which included decimation, however this had me a bit worried.
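For anyone wondering what the missing decimation step actually does, here is a toy Python sketch of the idea. It is only the general principle (in every cycle of 5 frames, throw away the one that looks most like the frame before it), not the actual algorithm of Decimate or TDecimate, and treating frames as flat lists of pixel values is an assumption made purely for the example.

```python
def decimate(frames, cycle=5):
    """Drop, from each full cycle, the frame most similar to its predecessor.

    Frames are flat sequences of pixel values (a toy stand-in for real video).
    """
    def diff(a, b):
        # Sum of absolute differences between two frames.
        return sum(abs(x - y) for x, y in zip(a, b))

    kept = []
    for start in range(0, len(frames), cycle):
        group = frames[start:start + cycle]
        if len(group) < cycle:
            kept.extend(group)  # leave a trailing partial cycle alone
            continue
        scores = []
        for i in range(len(group)):
            idx = start + i
            # The first frame of the clip has no predecessor, so never drop it.
            if idx == 0:
                scores.append(float("inf"))
            else:
                scores.append(diff(frames[idx], frames[idx - 1]))
        drop = scores.index(min(scores))
        kept.extend(f for i, f in enumerate(group) if i != drop)
    return kept
```

Feed it ten frames where every fifth one repeats its predecessor and you get eight unique frames back, which is the 5-to-4 reduction that turns field-matched telecined material back into smooth film-rate video.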

Tritical’s excellent field matcher and decimation filters (TFM+TDecimate) were written years ago and are effectively the standards inside the non-shit open source video editing crowds. I still use telecide myself for specific reasons at times however for blind IVTC it’s mostly useless. I couldn’t comprehend why people were still using it, and then it hit me. ‘AMV Editors’ are fucking retarded and have no clue as to what they are doing in the slightest. Every last one of them. Learn to RTFM guys, it’s not hard.

tl;dr amvfags suck cocks and can’t encode for shit, everyone go use TFM+TDecimate now k <3
