I love posts that peel back the abstraction layer of "images." It really highlights that modern photography is just signal processing with better marketing.
A fun tangent on the "green cast" mentioned in the post: the reason the Bayer pattern is RGGB (50% green) isn't just about color balance, but spatial resolution. The human eye is most sensitive to green light, so that channel effectively carries the majority of the luminance (brightness/detail) data.
In many advanced demosaicing algorithms, the pipeline actually reconstructs the green channel first to get a high-resolution luminance map, and then interpolates the red/blue signals—which act more like "color difference" layers—on top of it. We can get away with this because the human visual system is much more forgiving of low-resolution color data than it is of low-resolution brightness data. It’s the same psycho-visual principle that justifies 4:2:0 chroma subsampling in video compression.
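To make the "green first" idea concrete, here's a minimal numpy/scipy sketch of plain bilinear demosaicing of an RGGB mosaic (function and variable names are my own, purely for illustration; real pipelines go further and use the reconstructed green plane to guide the red/blue interpolation rather than treating each plane independently):

    import numpy as np
    from scipy.ndimage import convolve

    def demosaic_rggb(raw):
        """Naive bilinear demosaic of an RGGB mosaic (2-D array)."""
        raw = np.asarray(raw, dtype=float)
        h, w = raw.shape
        y, x = np.mgrid[0:h, 0:w]

        # Where each colour was actually sampled in the repeating 2x2 RGGB tile.
        r_mask = (y % 2 == 0) & (x % 2 == 0)
        g_mask = (y + x) % 2 == 1            # half of all sites are green
        b_mask = (y % 2 == 1) & (x % 2 == 1)

        # Green first: every missing green pixel has four measured green
        # neighbours, so a small kernel fills in a full-resolution plane.
        kg = np.array([[0, 1, 0], [1, 4, 1], [0, 1, 0]]) / 4.0
        g = convolve(raw * g_mask, kg, mode='mirror')

        # Red and blue are sampled four times more sparsely; their
        # interpolation reaches further and is effectively lower resolution.
        krb = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) / 4.0
        r = convolve(raw * r_mask, krb, mode='mirror')
        b = convolve(raw * b_mask, krb, mode='mirror')

        return np.dstack([r, g, b])

Even this naive version shows why having half the sites be green pays off: the plane carrying most of the perceived detail is interpolated over the shortest distances.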
Also, for anyone interested in how deep the rabbit hole goes, looking at the source code for dcraw (or libraw) is a rite of passage. It’s impressive how many edge cases exist just to interpret the "raw" voltages from different sensor manufacturers.
> A fun tangent on the "green cast" mentioned in the post: the reason the Bayer pattern is RGGB (50% green) isn't just about color balance, but spatial resolution. The human eye is most sensitive to green light, so that channel effectively carries the majority of the luminance (brightness/detail) data.
From the man page for ppmtopgm, which converts the classic "ppm" (portable pixmap) format to "pgm" (portable graymap):
The quantization formula ppmtopgm uses is g = .299 r + .587 g + .114 b.
You'll note the relatively high value of green there, making up nearly 60% of the luminosity of the resulting grayscale image.
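For anyone who wants to play with it, here's a minimal Python sketch contrasting that weighted sum with a naive channel average (array shapes and names are just illustrative; strictly speaking these BT.601 weights were defined for gamma-encoded video signals, but the comparison is the point):

    import numpy as np

    # The same weights quoted by the ppmtopgm man page (BT.601 luma).
    LUMA_601 = np.array([0.299, 0.587, 0.114])

    def to_gray(rgb):
        """Weighted grayscale: rgb is an HxWx3 array of values in [0, 1]."""
        return rgb @ LUMA_601

    def to_gray_naive(rgb):
        """Plain channel average, for comparison; greens come out too dark."""
        return rgb.mean(axis=-1)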
I also love the quote in there...
Quote
Cold-hearted orb that rules the night
Removes the colors from our sight
Red is gray, and yellow white
But we decide which is right
And which is a quantization error.
Funnily enough, that's not the only mistake he made in that article. His final image is noticeably different from the camera's output image because he rescaled the values in the first step. That's why the dark areas look so crushed, e.g. around the firewood carrier on the lower left or around the cat, and similarly with the highlights, e.g. the specular highlights on the ornaments.
After that, the next most important problem is the fact he operates in the wrong color space, where he's boosting raw RGB channels rather than luminance. That means that some objects appear much too saturated.
So his photo isn't "unprocessed", it's just incorrectly processed.
I didn’t read the article as implying that the final image the author arrived at was “unprocessed”. The point seemed to be that the first image was “unprocessed” but that the “unprocessed” image isn’t useful as a “photo”. You only get a proper “picture” of something after you do quite a bit of processing.
>There’s nothing that happens when you adjust the contrast or white balance in editing software that the camera hasn’t done under the hood. The edited image isn’t “faker” then the original: they are different renditions of the same data.
That's not how I read it. As in, that sentence is an incidental comment. The unprocessed version is the raw values from the sensor visible in the first picture; the processed ones are both the camera's photo and his attempt at the end.
This whole post read like an in-depth response to people who claim things like “I don’t do any processing to my photos” or feel some kind of purist shame about doing so. It’s a weird chip some amateur photographers have on their shoulders, but even pros “process” their photos and have done so all the way back to the beginning of photography.
Is it fair to recognize that there is a category difference between the processing that happens by default on every cell phone camera today, and the time and labor intensive processing performed by professionals in the time of film? What's happening today is like if you took your film to a developer and then the negatives came back with someone having airbrushed out the wrinkles and evened out skin tones. I think that photographers back in the day would have made a point of saying "hey, I didn't take my film to a lab where an artist goes in and changes stuff."
It’s fair to recognize. Personally I do not like the aesthetic decisions that Apple makes, so if I’m taking pictures on my phone I use camera apps that give me more control (Halide, Leica Lux). I also have reservations about cloning away power lines or using AI in-painting. But to your example, if you got your film scanned or printed, in all likelihood someone did go in and change some stuff. Color correction and touching the contrast etc. is routine at development labs. There is no tenable purist stance because there is no “traditional” amount of processing.
Some things are just so far outside the bounds of normal, and yet are still world-class photography. Just look at someone like Antoine d’Agata who shot an entire book using an iPhone accessory FLIR camera.
But mapping raw values to screen pixel brightness already entails an implicit transform, so arguably there is no such thing as an unprocessed photo (that you can look at).
Conversely the output of standard transforms applied to a raw Bayer sensor output might reasonably be called the "unprocessed image", since that is what the intended output of the measurement device is.
Would you consider all food in existence to be "processed", because ultimately all food is chopped up by your teeth or broken down by your saliva and stomach acid? If some descriptor applies to every single member of a set, why use the descriptor at all? It carries no semantic value.
You do need to rescale the values as the first step, but not exactly the described way (you need to subtract the data pedestal in order to get linear values).
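For illustration, a minimal sketch of that first step; the black_level and white_level numbers below are hypothetical, the real ones come from the raw file's metadata:

    import numpy as np

    def linearize(raw, black_level, white_level):
        """Subtract the sensor pedestal before scaling to [0, 1].

        Rescaling min..max without subtracting the pedestal crushes the
        shadows, since the pedestal offset isn't real signal.
        """
        lin = (raw.astype(np.float64) - black_level) / (white_level - black_level)
        return np.clip(lin, 0.0, 1.0)

    # e.g. for a hypothetical 14-bit sensor with a pedestal of 512:
    # linear = linearize(raw_mosaic, black_level=512, white_level=16383)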
When I worked at Amazon on the Kindle Special Offers team (ads on your eink Kindle while it was sleeping), the first implementation of auto-generated ads was by someone who didn't know that properly converting RGB to grayscale was a smidge more complicated than just averaging the RGB channels. So for ~6 months in 2015ish, you may have seen a bunch of ads that looked pretty rough. I think I just needed to add a flag to the FFmpeg call to get it to convert RGB to luminance before mapping it to the 4-bit grayscale needed.
True, though in the case of the Kindle they're not really intrusive (only appearing when it's off) and the price to remove them is pretty reasonable ($10 to remove them forever IIRC).
Let's not distort the past. The ads were introduced a few years later with the Kindle Keyboard, which launched with an MSRP of $140 for the base model, or $115 with ads. That was a substantial discount on a product which was already cheap when it released.
All for ads which are only visible when you aren't using the device anyway. Don't like them? Then buy other devices, pay to have them removed, get a cover to hide them, or just store it with the screen facing down when you aren't using it.
I don't think Kindle ads were available in my region in 2015, because I don't remember seeing these back then, but you were lucky to get to fix this classic mistake :-)
I remember trying out some of the home-made methods while implementing a creative work section for a school assignment. It’s surprising how "flat" the basic average looks until you actually respect the coefficients (usually some flavor of 0.21R + 0.72G + 0.07B). I bet it's even more apparent on a 4-bit display.
I remember using some photo editing software (Aperture, I think) that would let you customize the coefficients, and there were even named presets with different sets of coefficients. Ultimately you can pick any coefficients you want, and only your eyes can judge how nice they are.
>Ultimately you can pick any coefficients you want, and only your eyes can judge how nice they are.
I went to a photoshop conference. There was a session on converting color to black and white. Basically at the end the presenter said you try a bunch of ways and pick the one that looks best.
(people there were really looking for the “one true way”)
I shot a lot of black and white film in college for our paper. One of my obsolete skills was thinking how an image would look in black and white while shooting, though I never understood the people who could look at a scene and decide to use a red filter..
This is actually a real bother to me with digital — I can never get a digital photo to follow the same B&W sensitivity curve as I had with film so I can never digitally reproduce what I “saw” when I took the photo.
> I shot a lot of black and white film in college for our paper. One of my obsolete skills was thinking how an image would look in black and white while shooting, though I never understood the people who could look at a scene and decide to use a red filter..
Interesting that the "NTSC" look you describe is essentially rounded versions of the coefficients quoted in the comment mentioning ppm2pgm. I don't know the lineage of the values you used, of course, but I found it interesting nonetheless. I imagine we'll never know, but it would be cool to be able to trace the path that led to their formula, as well as the path to you arriving at yours.
The NTSC color coefficients are the grandfather of all luminance coefficients.
It had to be precisely defined because of the requirements of backwards-compatible color transmission (YIQ is the common abbreviation for the NTSC color space, I being ~reddish and Q being ~blueish). Basically, they treated B&W (technically monochrome) pictures the way B&W film and video tubes treated them: great in green, average in red, and poor in blue.
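For reference, a small numpy sketch of the (approximate) RGB-to-YIQ transform; the first row is the familiar luma formula, and the exact decimals vary a little between sources:

    import numpy as np

    # Approximate NTSC RGB -> YIQ matrix (gamma-encoded values in [0, 1]).
    RGB_TO_YIQ = np.array([
        [0.299,  0.587,  0.114],   # Y: what a monochrome set displays
        [0.596, -0.274, -0.321],   # I: roughly orange <-> cyan
        [0.211, -0.523,  0.311],   # Q: roughly purple <-> green
    ])

    def rgb_to_yiq(rgb):
        """rgb: HxWx3 array; returns the same shape in YIQ."""
        return rgb @ RGB_TO_YIQ.T

(Python's standard colorsys module ships a similar, slightly rounded rgb_to_yiq.)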
A bit unrelated: pre-color transition, the makeup used was actually slightly greenish too (which shows up nicely in monochrome).
I was actually researching why PAL YUV has the same(-ish) coefficients, while forgetting that PAL is essentially a refinement of the NTSC color standard (PAL stands for phase-alternating line, which solved much of the color drift NTSC suffered from early in its life).
It is the choice of the 3 primary colors and of the white point which determines the coefficients.
PAL and SECAM use different color primaries than the original NTSC, and a different white point, which led to different coefficients.
However, the original color primaries and white used by NTSC had become obsolete very quickly so they no longer corresponded with what the TV sets could actually reproduce.
Eventually even for NTSC a set of primary colors was used that was close to that of PAL/SECAM, which was much later standardized by SMPTE in 1987. The NTSC broadcast signal continued to use the original formula, for backwards compatibility, but the equipment processed the colors according to the updated primaries.
In 1990, Rec. 709 standardized a set of primaries intermediate between those of PAL/SECAM and of SMPTE, which was later also adopted by sRGB.
Worse, "NTSC" is not a single standard, Japan deviated it too much that the primaries are defined by their own ARIB (notably ~9000 K white point).
... okay, technically PAL and SECAM too, but only in audio (analogue Zweikanalton versus digital NICAM), bandwidth placement (channel plan and relative placement of the audio and video signals), and, uhm, teletext standard (French Antiope versus Britain's Teletext and Fastext).
Honestly, the weird 16-235 (on 8-bit) color range and 60000/1001 fps limitations stem from the original NTSC standard, which is rather frustrating nowadays, considering that neither the Japanese NTSC adaptation nor the European standards have them. Both the HDVS and HD-MAC standards define these things precisely (exactly 60 fps for HDVS and a 0-255 color range for HD-MAC*), but America being America...
* I know that HD-MAC is analog(ue), but it has an explicit digital step for transmission and it uses the whole 8 bits for the conversion!
> People don’t realize how many man hours went into those early decisions.
In my "trying to hunt down the earliest reference for the coefficients" I came across "Television standards and practice; selected papers from the Proceedings of the National television system committee and its panels" at https://archive.org/details/televisionstanda00natirich/mode/... which you may enjoy. The "problem" in trying to find the NTSC color values is that the collection of papers is from 1943... and color TV didn't become available until the 50s (there is some mention of color but I couldn't find it) - most of the questions of color are phrased with "should".
This is why I love graphics and game engines. It's this focal point of computer science, art, color theory, physics, practical implications for other systems around the globe, and humanities.
I kept a journal as a teenager when I started and later digitized it when I was in my 20s. The biggest impact was mostly SIGGRAPH papers that are now available online such as "Color Gamut Transform Pairs" (https://www.researchgate.net/publication/233784968_Color_Gam...).
I bought all the GPU Gems books, all the ShaderX books (shout out to Wolfgang Engel, his books helped me tremendously), and all the GPU pro books. Most of these are available online now but I had sagging bookshelves full of this stuff in my 20s.
Now in my late 40s, I live like an old Japanese man with minimalism and very little clutter. All my readings are digital, iPad-consumable. All my work is online, cloud based or VDI or an ssh away. I still enjoy learning but I feel like because I don't have a prestigious degree in the subject, it's better to let others teach it. I'm just glad I was able to build something with that knowledge and release it into the world.
Cool. I could have been clearer in my post; as I understand it actual NTSC circuitry used different coefficients for RGBx and RGBy values, and I didn't take time to look up the official standard. My specific pondering was based on an assumption that neither the ppm2pgm formula nor the parent's "NTSC" formula were exact equivalents to NTSC, and my "ADHD" thoughts wondered about the provenance of how each poster came to use their respective approximations. While I write this, I realize that my actual ponderings are less interesting than the responses generated because of them, so thanks everyone for your insightful responses.
Yep, used in the early MacOS color picker as well when displaying greyscale from RGB values. The three weights (which of course add to 1.0) clearly show a preference for the green channel for luminosity (as was discussed in the article).
The bit about the green over-representation in camera color filters is partially correct. Human color sensitivity varies a lot from individual to individual (and not just amongst individuals with color blindness), but general statistics indicate we are most sensitive to red light.
The main reason is that green does indeed overwhelmingly contribute to perceptual luminance (over 70% in sRGB once gamma corrected: https://www.w3.org/TR/WCAG20/#relativeluminancedef) and modern demosaicking algorithms will rely on both derived luminance and chroma information to get a good result (and increasingly spatial information, e.g. "is this region of the image a vertical edge").
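Concretely, the WCAG definition linked above linearizes the sRGB values before weighting them; a small Python sketch using that page's constants:

    def srgb_to_linear(c):
        """Undo the sRGB transfer curve for one channel value in [0, 1]."""
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

    def relative_luminance(r, g, b):
        """WCAG relative luminance of an 8-bit sRGB colour."""
        rl, gl, bl = (srgb_to_linear(v / 255.0) for v in (r, g, b))
        return 0.2126 * rl + 0.7152 * gl + 0.0722 * bl

    # relative_luminance(0, 255, 0) ~= 0.7152, relative_luminance(0, 0, 255) ~= 0.0722:
    # pure green alone carries over 70% of the maximum luminance, blue ~7%.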
Small neural networks I believe are the current state of the art (e.g. train to reverse a 16x16 color filter pattern for the given camera). What is currently in use by modern digital cameras is all trade secret stuff.
> Small neural networks I believe are the current state of the art (e.g. train to reverse a 16x16 color filter pattern for the given camera). What is currently in use by modern digital cameras is all trade secret stuff.
Considering you usually shoot RAW, and debayer and process in post, the camera hasn't done any of that.
It's only smartphones that might be doing internal AI Debayering, but they're already hallucinating most of the image anyway.
Sure - if you don't want to do demosaicing on the camera, that's fine. It doesn't mean there is not an algorithm there as an option.
If you care about trying to get an image that is as accurate as possible to the scene, then it is well within your interest to use a Convolutional Neural Network based algorithm, since these are amongst the highest performing in terms of measured PSNR (which is what nearly all demosaicing algorithms in academia are measured on). You are maybe thinking of generative AI?
Yes, people usually shoot RAW (anyone spending this much on a camera knows better) - but these cameras default to JPEG and often have dual-capture (RAW+JPEG) modes.
To be clear, they default to JPEG for the image preview on the monitor (LCD screen). Whenever viewing an image on a professional camera, you’re always seeing the resulting JPEG image.
The underlying data is always captured as a RAW file, and only discarded if you’ve configured the camera to only store the JPEG image (discarding the original RAW file after processing).
> Whenever viewing an image on a professional camera
Viewing any preview image on any camera implies a debayered version, but who says it is JPEG-encoded? Why would it need to be? Every time I browse my SD card full of persisted RAWs, is the camera unnecessarily converting to JPEG just to convert it back to bitmap display data?
> The underlying data is always captured as a RAW file, and only discarded if you’ve configured the camera to only store the JPEG image (discarding the original RAW file after processing).
Retaining only JPEG is the default configuration on all current-generation Sony and Canon mirrorless cameras: you have to go out of your way to persist RAW.
Upon further reading, I think I am wrong here. My confusion was that I had read that over 60% of the cones in one's eye are "red" cones (which is a bad generalization), and there is more nuance here.
Given equal power red, blue, or green light hitting our eyes, humans tend to rate green "brighter" in pairwise comparative surveys. That is why it is predominant in a perceptual luminance calculation converting from RGB.
Though there are many more L-cones (which react most strongly to "yellow" light, not "red"; also, "many more" varies across individuals) than M-cones (which react most strongly to a "greenish cyan"), the combination of these two cone types (which make up ~95% of the cones in the eye) means that we are able to sense green light much more efficiently than other wavelengths. S-cones (which react most strongly to "purple") are very sparse.
This is oversimplifying, but I always understood it as: our eyes can see red with very little power needed, but our eyes can differentiate more detail in green.
Water is most transparent in the middle of the "visible" spectrum (green). It absorbs red and scatters blue. The atmosphere has a lot of water as does, of course, the ocean which was the birth place of plants and eyeballs.
It would be natural for both plants and eyes to evolve to exploit the fact that there is a green notch in the water transparency curve.
Edit: after scrolling, I find more discussion on this below.
I think any explanation along those lines would have a "just-so" aspect to it. How would we go about verifying such a thing? Perhaps if we compared and contrasted the eyes of savanna apes to forest apes and saw a difference, which to my knowledge we do not. Anyway, sunlight at ground level peaks around 555nm, so it's believed that we're optimizing for that by being more sensitive to green.
Exactly - film photographers heavily process(ed) their images from the film processing through to the print. Ansel Adams wrote a few books on the topic and they’re great reads.
And different films and photo papers can have totally different looks, defined by the chemistry of the manufacturer and however _they_ want things to look.
Except for slide film. There's no real adjustment once the photo is taken (a more difficult medium than negative film, which you can adjust a little when printing).
You’re right about Ansel Adams. He “dodged and burned” extensively (lightened and darkened areas when printing.)
Photoshop kept the dodge and burn names on some tools for a while.
When we printed for our college paper, we had a dial that could adjust the printed contrast a bit on our black and white “multigrade” paper (it added red light). People would mess with the processing to get different results too (cold/sepia toning). It was hard to get exactly what you wanted, and I kind of see why digital took over.
A school photography company I worked for used a custom Kodak stock. They were unsatisfied with how Kodak's standard portrait film handled darker skin tones.
They were super careful to maintain the look across the transition from film to digital capture. Families display multiple years of school photos next to each other and they wanted a consistent look.
Yes. Writing a post like this, but for film, would be illustrative of that similarity, but significantly more challenging to represent, especially for color film. I actually don't know the whole process in enough detail to write one, and the visualizations would be difficult, but the processing is there.
You have layers of substrate with silver halide crystals, made sensitive to different frequency ranges by sensitizing dyes, rather than a regular grid of pixels; you take a photo that is not an image, but a collection of specks of metallic silver. Through a series of chemical reactions, you develop those specks. Differences in chemistry, in temperature, in agitation, in the film itself, all affect what for digital images would be described as processing. Then in printing, you have a similar process all over again.
If anything, one might argue that the digital process allows a more consistent and quantitative understanding of the actual processing being done. Analog film seems like it involves less processing only because, for most people, the processing was always a black box of sending off the film for development and printing.
True, but there may be different intentions behind the processing.
Sometimes the processing has only the goal of compensating for the defects of the image sensor and of the optical elements, in order to obtain the most accurate information about the light originally coming from the scene.
Other times the goal of the processing is just to obtain an image that appears best to the photographer, for some reason.
For casual photographers, the latter goal is typical, but in scientific or technical applications the former goal is frequently encountered.
Ideally, a "raw" image format is one where the differences between it and the original image are well characterized and there are no additional unknown image changes done for an "artistic" effect, in order to allow further processing when having either one of the previously enumerated goals.
This is also why I absolutely hate, hate, hate it when people ask me whether I "edited" a photo or whether a photo is "original", as if trying to explain away nice-looking images as fake.
The JPEGs cameras produce are heavily processed, and they are emphatically NOT "original". Taking manual control of that process to produce an alternative JPEG with different curves, mappings, calibrations, is not a crime.
As a mostly amateur photographer, it doesn't bother me if people ask that question. While I understand the point that the camera itself may be making some 'editing'-type decisions on the data first, a) in theory each camera maker has attempted to calibrate the output to some standard, and b) the public would expect two photos taken at the same time with the same model camera to look identical. That differs greatly from what can often happen in "post production" editing - you'll never find two that are identical.
> the public would expect two photos taken at the same time with the same model camera to look identical
But this is wrong. My not-too-exotic 9-year-old camera has a bunch of settings which affect the resulting image quite a bit. Without going into "picture styles", or "recipes", or whatever they're called these days, I can alter saturation, contrast, and white balance (I can even tell it to add a fixed alteration to the auto WB and tell it to "keep warm colors"). And all these settings will alter how the in-camera produced JPEG will look, no external editing required at all.
So if two people are sitting in the same spot with the same camera, who's to say they both set them up identically? And if they didn't, which produces the "non-processed" one?
I think the point is that the public doesn't really understand how these things work. Even without going to the lengths described by another commenter (local adjust so that there appears to be a ray of light in that particular spot, remove things, etc), just playing with the curves will make people think "it's processed". And what I described above is precisely what the camera itself does. So why is there a difference if I do it manually after the fact or if I tell the camera to do it for me?
>There’s nothing that happens when you adjust the contrast or white balance in editing software that the camera hasn’t done under the hood. The edited image isn’t “faker” then the original: they are different renditions of the same data.
I noticed this a lot when taking pictures in the mountains.
I used to have a high resolution phone camera from a cheaper phone and then later switched to an iPhone. The latter produced much nicer pictures, my old phone just produces very flat-looking pictures.
People say that the iPhone camera automatically edits the images to look better. And in a way I notice that too. But that’s the wrong way of looking at it; the more-edited picture from the iPhone actually corresponds more to my perception when I’m actually looking at the scene. The white of the snow and glaciers and the deep blue sky really does look amazing in real life, and when my old phone captured it as a flat and disappointing-looking photo with less postprocessing than an iPhone, it genuinely failed to capture what I can see with my eyes. And the more vibrant post-processed colours of an iPhone really do look more like what I think I’m looking at.
it's not a crime, but applying post processing in an overly generous way that goes a lot further than replicating what a human sees does take away from what makes pictures interesting vs other mediums, imho: that they're a genuine representation of something that actually happened.
if you take that away, a picture is not very interesting: it's hyperrealistic, so not super creative a lot of the time (compared to e.g. paintings), and it doesn't even require the mastery of other mediums to get that hyperrealism
Perhaps interestingly, many/most digital cameras are sensitive to IR and can record, for example, the LEDs of an infrared TV remote.
But they don't see it as IR. Instead, this infrared information just kind of irrevocably leaks into the RGB channels that we do perceive. With the unmodified camera on my Samsung phone, IR shows up kind of purple-ish. Which is... well... it's fake. Making invisible IR into visible purple is an artificially-produced artifact of the process that results in me being able to see things that are normally ~impossible for me to observe with my eyeballs.
When you generate your own "genuine" images using your digital camera(s), do you use an external IR filter? Or are you satisfied with knowing that the results are fake?
Silicon sensors (which is what you'll get in all visible-light cameras as far as I know) are all very sensitive to near-IR. Their peak sensitivity is around 900nm. The difference between cameras that can see or not see IR is the quality of their anti-IR filter.
The green filter in your Samsung phone's Bayer matrix probably blocks IR better than the blue and red ones do.
Here's a random spectral sensitivity for a silicon sensor:
But the camera is trying to emulate how it would look if your eyes were seeing it. In order for it to be 'genuine' you would need not only the camera to genuine, but also the OS, the video driver, the viewing app, the display and the image format/compression. They all do things to the image that are not genuine.
I don't think it's the same. For me personally, I don't like heavily processed images. But not in the sense that they need processing to look decent or to convey the perception of what it was like in real life; more in the sense that the edits change the reality in a significant way, so they affect the mood and the experience. For example, you take a photo on a drab cloudy day, but then edit the white balance to make it seem like golden hour, or brighten a part to make it seem like a ray of light was hitting that spot. Adjusting the exposure, touching up slightly, that's all fine, depending on what you are trying to achieve of course. But what I see on Instagram or shorts these days is people comparing their raws and edited photos, and without the edits the composition and subject would be just mediocre and uninteresting.
The “raw” and unedited photo can be just as or even more unrealistic than the edited one though.
Photographs can drop a lot of the perspective, feeling and colour you experience when you’re there. When you take a picture of a slope on a mountain, for example (say, on a ski piste), it always looks much less impressive and steep on a phone camera. Same with colours. You can be watching an amazing scene in the mountains, but when you take a photo with most cameras, the colours are duller and it just looks flatter. If a filter enhances it and makes it feel as vibrant as the real-life view, I’d argue you are making it more realistic.
The main message I get from OP’s post is precisely that there is no “real unfiltered / unedited image”; you’re always imperfectly capturing something your eyes see, but with a different balance of colours, different detector sensitivity from a real eye, etc… and some degree of postprocessing is always required to make it match what you see in real life.
This is nothing new. For example, Ansel Adams’s famous Moonrise, Hernandez photo required extensive darkroom manipulations to achieve the intended effect:
I wrote the raw Bayer to JPEG pipeline used by the phone I write this comment on. The choices on how to interpret the data are mine. Can I tweak these afterwards? :)
Which is a pleasant read, and I like the pictures. Has the Librem 5's automatic JPEG output improved since you wrote the post about photography in Croatia (https://dosowisko.net/l5/photos/)?
Yes, these are quite old. I've written a GLSL shader that acts as a simple ISP capable of real-time video processing and described it in detail here: https://source.puri.sm/-/snippets/1223
It's still pretty basic compared to hardware accelerated state-of-the-art, but I think it produces decent output in a fraction of a second on the device itself, which isn't exactly a powerhouse: https://social.librem.one/@dos/115091388610379313
Before that, I had an app for offline processing that was calling darktable-cli on the phone, but it took about 30 seconds to process a single photo with it :)
I mean, it depends: does your Bayer-to-JPEG pipeline try to detect things like 'this is a zoomed-in picture of the moon' and then do auto-fixup to put a perfect moon image there? That's why there's some need to differentiate between SOOCs now, because Samsung did that.
I know my Sony gear can't call out to AI, because the WiFi sucks like on every other Sony product and barely works inside my house, but I also know the first ILC manufacturer that tries to put AI right into RAW files is probably the first to lose part of the photography market.
That said I'm a purist to the point where I always offer RAWs for my work [0] and don't do any photoshop/etc. D/A, horizon, bright adjust/crop to taste.
Where phones can possibly do better is the smaller size and true MP structure of a cell phone camera sensor, which makes it easier to handle things like motion blur and rolling shutter.
But I have yet to see anything that gets closer to an ILC for true quality than the decade-plus-old PureView cameras on Nokia phones, probably partially because they often had sensors that were large enough.
There's only so much computation can do to simulate true physics.
[0] - I've found people -like- that. TBH, it helps that I tend to work cheap or for barter type jobs in that scene, however it winds up being something where I've gotten repeat work because they found me and a 'photoshop person' was cheaper than getting an AIO pro.
There's a difference between an unbiased (roughly speaking) pipeline and what (for example) JBIG2 did. The latter counts as "editing" and "fake" as far as I'm concerned. It may not be a crime but at least personally I think it's inherently dishonest to attempt to play such things off as "original".
And then there's all the nonsense BigTech enables out of the box today with automated AI touch ups. That definitely qualifies as fakery although the end result may be visually pleasing and some people might find it desirable.
That's completely unreasonable. Sure, the camera processes them heavily, but when you open it up and start editing in Photoshop you are changing this area over that one, or highlighting one color over another, etc., or just boosting the brightness way higher than what it looked like that day. It's a perfectly normal question to ask.
Also worth noting that manufacturers advertise the photodiode count as the sensor resolution. So if you have a 12 Mp sensor, then your green resolution is 6 Mp and your blue and red are 3 Mp each.
> The human eye is most sensitive to green light, so that channel effectively carries the majority of the luminance (brightness/detail) data
How does this affect luminance perception for deuteranopes? (Since their color blindness is caused by a deficiency of the cones that detect green wavelengths)
Protanopia and protanomaly shift luminance perception away from the longest wavelengths of visible light, which causes highly-saturated red colours to appear dark or black. Deuteranopia and deuteranomaly don't have this effect. [1]
Blue cones make little or no contribution to luminance. Red cones are sensitive across the full spectrum of visible light, but green cones have no sensitivity to the longest wavelengths [2]. Since protans don't have the "hardware" to sense long wavelengths, it's inevitable that they'd have unusual luminance perception.
I'm not sure why deutans have such a normal luminous efficiency curve (and I can't find anything in a quick literature search), but it must involve the blue cones, because there's no way to produce that curve from the red-cone response alone.
The cones are the colour sensitive portion of the retina, but only make up a small percent of all the light detecting cells. The rods (more or less the brightness detecting cells) would still function in a deuteranopic person, so their luminance perception would basically be unaffected.
Also, there’s something to be said about the fact that the eye is a squishy analog device, and so even if the medium-wavelength cones are deficient, the long-wavelength cones (red-ish) have overlap in their light sensitivities with the medium cones, so…
The rods are only active in low-light conditions; they're fully active under the moon and stars, or partially active under a dim street light. Under normal lighting conditions, every rod is fully saturated, so they make no contribution to vision. (Some recent papers have pushed back against this orthodox model of rods and cones, but it's good enough for practical use.)
This assumption that rods are "the luminance cells" is an easy mistake to make. It's particularly annoying that the rods have a sensitivity peak between the blue and green cones [1], so it feels like they should contribute to colour perception, but they just don't.
It’s not that their M-cones (middle, i.e. green) don’t work at all, their M-cones responsivity curve is just shifted to be less distinguishable from their L-cones curve, so they effectively have double (or more) the “red sensors”.
Since the human eye is most sensitive to green, it will find errors in the green channel much more easily than in the others. This is why you need _more_ green data.
Note that there are two measurement systems involved: first the camera, and then the human eyes. Your reasoning could be correct if there were only one: "the sensor is most sensitive to green light, so less sensor area is needed".
But it is not the case, we are first measuring with cameras, and then presenting the image to human eyes. Being more sensitive to a colour means that the same measurement error will lead to more observable artifacts. So to maximize visual authenticity, the best we can do is to make our cameras as sensitive to green light (relatively) as human eyes.
Oh you are right! I’m so dumb! Of course it is the camera. To have the camera have the same sensitivity, we need more green pixels! I had my neurons off.
Thanks.
Because that reasoning applies to binary signals, where sensitivity is about detection. In the case of our eyes, sensitivity means that we can distinguish many more distinct values: let's say we can see N distinct luminosity levels of monochrome green light but only N*k or N^k (with k < 1) distinct levels of blue light.
So to describe/reproduce what our eyes see, you need more detection range in the green part of the spectrum.
Yeah, was thinking the same. If we're more sensitive, why do we need double the sensors? Just have 1:1:1, and we would see more of the green anyway? Won't it be too much if we do 1:2:1, when we're already more sensitive to green?
With 1:1:1 the matrix isn't square, and if you have to double one of the channels for practical purposes then the green one is the obvious pick, as it's the most beneficial for image quality because it increases the spatial resolution where our eyes can actually notice it.
Grab a random photo and blur its blue channel a bit. You probably won't notice much difference aside from some slight discoloration. Then try the same with the green channel.
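If you want to try that experiment, here's a rough Pillow sketch (file name and blur radius are arbitrary):

    from PIL import Image, ImageFilter

    def blur_channel(path, channel):
        """Blur one channel of an RGB image, leaving the others intact.

        channel: 0 = red, 1 = green, 2 = blue.
        """
        img = Image.open(path).convert("RGB")
        bands = list(img.split())
        bands[channel] = bands[channel].filter(ImageFilter.GaussianBlur(radius=4))
        return Image.merge("RGB", bands)

    # blur_channel("photo.jpg", channel=2).show()   # blue: barely noticeable
    # blur_channel("photo.jpg", channel=1).show()   # green: clearly softer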
This ratio allows for a relatively simple 2x2 repeating pattern. That makes interpolating the values immensely simpler.
Also you don't want the red and blue to be too far apart, reconstructing the colour signal is difficult enough as it is. Moire effects are only going to get worse if you use an even sparser resolution.
the bayer pattern is one of those things that makes me irrationally angry, in the true sense, based on my ignorance of the subject
what's so special about green? oh so just because our eyes are more sensitive to green we should dedicate double the area to green in camera sensors? i mean, probably yes. but still. (⩺_⩹)
Green is in the center of the visible spectrum of light (notice the G in the middle of ROYGBIV), so evolution should theoretically optimize for green light absorption. An interesting article on why plants typically reflect that wavelength and absorb the others: https://en.wikipedia.org/wiki/Purple_Earth_hypothesis
Green is the highest energy light emitted by our sun, from any part of the entire light spectrum, which is why green appears in the middle of the visible spectrum. The visible spectrum basically exists because we "grew up" with a sun that blasts that frequency range more than any other part of the light spectrum.
I have to wonder what our planet would look like if the spectrum shifts over time. Would plants also shift their reflected light? Would eyes subtly change across species? Of course, there would probably be larger issues at play around having a survivable environment … but still, fun to ponder.
That comment does not make sense. Do you mean the sun emits its peak intensity at green? (I don't believe that is true either, but at least it would make a physically sensical statement.) To clarify why the statement does not make sense: the energy of light is directly proportional to its frequency, so saying that green is the highest-energy light the sun emits is saying the sun does not emit any light at frequencies higher than green, i.e. no blue light, no UV... That's obviously not true.
> Do you mean the sun emits its peak intensity at green (I don't believe that is true either, but at least it would be a physically sensical statement).
Yes, that's what I meant, as I was sloppy with my language, and it's definitely true.
Several reasons:
- Silicon efficiency (QE) peaks in the green.
- The green spectral response curve is close to the luminance curve humans see, like you said.
- Twice the pixels increases the effective resolution in the green/luminance channel; the color channels in YUV contribute almost no detail.
Why are YUV and other luminance-chrominance color spaces important for an RGB input? Because many processing steps and encoders work in YUV colorspaces. This wasn't really covered in the article.
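As a rough sketch of what that buys an encoder, here's the 4:2:0 idea in numpy: split into luma and chroma (full-range BT.601 here, names are mine), then keep only one chroma sample per 2x2 block:

    import numpy as np

    def rgb_to_ycbcr(rgb):
        """Full-range BT.601 luma/chroma split; rgb is HxWx3 in [0, 1]."""
        r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
        y = 0.299 * r + 0.587 * g + 0.114 * b
        cb = 0.5 * (b - y) / (1.0 - 0.114)
        cr = 0.5 * (r - y) / (1.0 - 0.299)
        return y, cb, cr

    def subsample_420(chroma):
        """Average each 2x2 block: a quarter of the chroma samples remain."""
        h, w = chroma.shape
        c = chroma[:h - h % 2, :w - w % 2]
        return c.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

The luma plane stays at full resolution; only the colour-difference planes get thinned out, which viewers barely notice.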
Not sure why it would invoke such strong sentiments but if you don’t like the bayer filter, know that some true monochrome cameras don’t use it and make every sensor pixel available to the final image.
For instance, the Leica M series have specific monochrome versions with huge resolutions and better monochrome rendering.
You can also modify some cameras and remove the filter, but the results usually need processing.
A side effect is that the now exposed sensor is more sensitive to both ends of the spectrum.
Not to mention that there are non-Bayer cameras, ranging from the Sigma Foveon and Quattro sensors (which use stacked photodiodes to separate colors in an entirely different way) to the Fuji EXR and X-Trans sensors.
I don't think that's correct. It's not "all video" - you can easily encode video without chroma subsampling - and it's not because this is how analog TV worked, but rather for the same reason why analog TV worked this way, which is the fact that it lets you encode significantly less data with barely noticeable quality loss. JPEGs do the same thing.
If the Bayer pattern makes you angry, I imagine it would really piss you off to realize that the whole concept of encoding an experienced color by a finite number of component colors is fundamentally species-specific and tied to the details of our particular color sensors.
To truly record an appearance without reference to the sensory system of our species, you would need to encode the full electromagnetic spectrum from each point. Even then, you would still need to decide on a cutoff for the spectrum.
...and hope that nobody ever told you about coherence phenomena.
Upon inspection, the author's personal website used em dashes in 2023. I hope this helped with your witch hunt.
I'm imagining a sort of Logan's Run-like scifi setup where only people with a documented em dash before November 30, 2022, i.e. D(ash)-day, are left with permission to write.
Phew. I have published work with em dashes, bulleted lists, “not just X, but Y” phrasing, and the use of “certainly”, all from the 90’s. Feel sorry for the kids, but I got mine.
> I'm imagining a sort of Logan's Run-like scifi setup where only people with a documented em dash before November 30, 2022, i.e. D(ash)-day, are left with permission to write.
At least Robespierre needed two sentences before condemning a man. Now the mob is lynching people on the basis of a single glyph.
I have been overusing em dashes and bulleted lists since the actual 80s, I'm sad to say. I spent much of the 90s manually typing "smart" quotes.
I have actually been deliberately modifying my long-time writing style and use of punctuation to look less like an LLM. I'm not sure how I feel about this.
Alt + 0151, baby! Or... however you do it on MacOS.
But now, likewise, having to bail on em dashes. My last differentiator is that I always close-set the em dash—no spaces on either side, whereas ChatGPT typically opens them (AP style).
found the guy who didn't know about em dashes before this year
also your question implies a bad assumption even if you disclaim it. if you don't want to imply a bad assumption the way to do that is to not say the words, not disclaim them
didn't even notice the em dashes to be honest, i noticed the contrast framing in the second paragraph and the "It's impressive how" for its conclusion.
as for the "assumption" bit, yeah fair enough. was just curious of AI usage online, this wasn't meant to be a dig at anyone as i know people use it for translations, cleaning up prose etc
No offense taken, but realize that a good number of us folks who learned English as a second language have been taught to write this way (especially in an academic setting). LLMs' writing is like that of people, not the other way around.
wouldn't say that... they're very distinctly not like people, that's (part of) the problem. But I don't think the difference is measured exactly in the choices of words and punctuation. It's more like... you can tell, reading AI writing, that it's not "sincere"; no person would want to say what the AI is saying, because it feels fake and disingenuous. The phrases and em dashes and whatever else are just the method for this effect. Real people use the same phrases but with real intent to communicate behind them, and the result is different in a way that is curiously easy to detect.
You know, I didn’t think about that, but you’re right. I have seen so many AI narrations where it reads the dash exactly like a hyphen, actually maybe slightly reducing the inter-word gap. Odd, the kinds of “easy” things such a complicated and advanced system gets wrong.