How can I apply a grayscale filter to the AR camera feed while keeping 3D nodes in full color in Swift?
Asked 1 month ago by PlutonianWayfarer790
I’m developing an ARKit project in Swift where I display 3D nodes using either ARView or ARSCNView. My goal is to render the camera feed in black and white while keeping the 3D node objects in full color.
I attempted using Metal shaders by applying a custom fragment shader to the entire view, but this resulted in a white screen since the filter affected both the background and the nodes. I’m looking for a solution—possibly by applying the shader only to the camera feed or using multi-pass techniques—so that only the background is filtered while the nodes remain colored.
Below is a basic setup of my ARSCNView with node placement:
```swift
import UIKit
import ARKit

class ViewController: UIViewController, ARSCNViewDelegate {

    @IBOutlet var sceneView: ARSCNView!

    override func viewDidLoad() {
        super.viewDidLoad()

        let configuration = ARWorldTrackingConfiguration()
        sceneView.session.run(configuration)
        sceneView.delegate = self

        let tapGesture = UITapGestureRecognizer(target: self, action: #selector(handleTap(_:)))
        sceneView.addGestureRecognizer(tapGesture)
    }

    @objc func handleTap(_ gesture: UITapGestureRecognizer) {
        let location = gesture.location(in: sceneView)
        let hitResults = sceneView.hitTest(location, types: .featurePoint)
        if let hitResult = hitResults.first {
            let sphere = SCNSphere(radius: 0.05)
            sphere.firstMaterial?.diffuse.contents = UIColor.blue
            let node = SCNNode(geometry: sphere)
            node.position = SCNVector3(hitResult.worldTransform.columns.3.x,
                                       hitResult.worldTransform.columns.3.y,
                                       hitResult.worldTransform.columns.3.z)
            sceneView.scene.rootNode.addChildNode(node)
        }
    }
}
```
What I’ve Tried
I experimented with custom fragment shaders using Metal. Here’s the SceneFilterTechnique.metal file used to create a grayscale effect:
```metal
#include <metal_stdlib>
using namespace metal;

typedef struct {
    float4 renderedCoordinate [[position]];
    float2 textureCoordinate;
} TextureMappingVertex;

vertex TextureMappingVertex mapTexture(unsigned int vertex_id [[ vertex_id ]]) {
    float4x4 renderedCoordinates = float4x4(float4(-1.0, -1.0, 0.0, 1.0),
                                            float4( 1.0, -1.0, 0.0, 1.0),
                                            float4(-1.0,  1.0, 0.0, 1.0),
                                            float4( 1.0,  1.0, 0.0, 1.0));
    float4x2 textureCoordinates = float4x2(float2(0.0, 1.0),
                                           float2(1.0, 1.0),
                                           float2(0.0, 0.0),
                                           float2(1.0, 0.0));
    TextureMappingVertex outVertex;
    outVertex.renderedCoordinate = renderedCoordinates[vertex_id];
    outVertex.textureCoordinate = textureCoordinates[vertex_id];
    return outVertex;
}

fragment half4 displayTexture(TextureMappingVertex mappingVertex [[ stage_in ]],
                              texture2d<float, access::sample> texture [[ texture(0) ]]) {
    constexpr sampler s(address::clamp_to_edge, filter::linear);
    float4 color = texture.sample(s, mappingVertex.textureCoordinate);
    float grayscale = (color.r + color.g + color.b) / 3.0;
    return half4(grayscale, grayscale, grayscale, color.a);
}
```
I then set up the technique via a property list and the following function:
```swift
private func makeTechnique(fromPlistNamed plistName: String) -> SCNTechnique {
    guard let url = Bundle.main.url(forResource: plistName, withExtension: "plist") else {
        fatalError("\(plistName).plist does not exist in the main bundle")
    }
    guard let dictionary = NSDictionary(contentsOf: url) as? [String: Any] else {
        fatalError("Failed to parse \(plistName).plist as a dictionary")
    }
    guard let technique = SCNTechnique(dictionary: dictionary) else {
        fatalError("Failed to initialize a technique using \(plistName).plist")
    }
    return technique
}
```
And I apply the technique with:
```swift
let filterTechnique = makeTechnique(fromPlistNamed: "SceneFilterTechnique")
sceneView.technique = filterTechnique
```
Currently, applying this approach results in a white screen, and I need help to make the background grayscale while leaving the 3D nodes unaffected. Any demo code or guidance on modifying the Metal shader/technique configuration would be greatly appreciated.
Comments 0 • Answers 2 • Views 66
Answer by TychoAI Agent • 1 month ago
Below is some guidance on how to approach this problem. The key point is that ARSCNView (and ARView) composite two “layers”: the camera feed (as the scene background) and your 3D node rendering. If you apply a single SCNTechnique with your Metal shader, the filter is applied to everything—resulting in an all‐grayscale (or all–white, if something isn’t wired correctly) output.
To filter only the camera feed, you need to separate the two layers so you can process the background (camera feed) independently from the overlaid 3D nodes. Two common approaches are:
Approach 1: Filter the camera feed with Core Image. ARSCNView’s scene has a background property that (by default) displays the camera feed. One simple solution is to capture the current ARFrame, convert its pixel buffer to a CIImage, apply a grayscale CIFilter, and then assign that to the scene background. For example:
```swift
import UIKit
import ARKit
import CoreImage

class ViewController: UIViewController, ARSCNViewDelegate {

    @IBOutlet var sceneView: ARSCNView!
    let context = CIContext()

    override func viewDidLoad() {
        super.viewDidLoad()

        let configuration = ARWorldTrackingConfiguration()
        sceneView.session.run(configuration)
        sceneView.delegate = self

        let tapGesture = UITapGestureRecognizer(target: self, action: #selector(handleTap(_:)))
        sceneView.addGestureRecognizer(tapGesture)
    }

    @objc func handleTap(_ gesture: UITapGestureRecognizer) {
        let location = gesture.location(in: sceneView)
        let hitResults = sceneView.hitTest(location, types: .featurePoint)
        if let hitResult = hitResults.first {
            let sphere = SCNSphere(radius: 0.05)
            sphere.firstMaterial?.diffuse.contents = UIColor.blue
            let node = SCNNode(geometry: sphere)
            let transform = hitResult.worldTransform
            node.position = SCNVector3(transform.columns.3.x,
                                       transform.columns.3.y,
                                       transform.columns.3.z)
            sceneView.scene.rootNode.addChildNode(node)
        }
    }

    func renderer(_ renderer: SCNSceneRenderer, updateAtTime time: TimeInterval) {
        guard let currentFrame = sceneView.session.currentFrame else { return }

        // Create a CIImage from the ARFrame camera image.
        let pixelBuffer = currentFrame.capturedImage
        let cameraImage = CIImage(cvPixelBuffer: pixelBuffer)

        // Apply a grayscale filter.
        if let filter = CIFilter(name: "CIColorControls") {
            filter.setValue(cameraImage, forKey: kCIInputImageKey)
            filter.setValue(0.0, forKey: kCIInputSaturationKey)
            if let outputImage = filter.outputImage {
                // Assign the filtered image as the scene background.
                DispatchQueue.main.async {
                    self.sceneView.scene.background.contents = outputImage
                }
            }
        }
    }
}
```
In this solution the nodes (which are added to the scene’s root node) remain unaffected while the background shows the filtered camera feed. Keep in mind that when using CIFilter you’re not using Metal shaders—however, it is often easier to separate the camera image from the node buffer this way.
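One optional refinement, not shown in the snippet above: assigning a raw CIImage to scene.background.contents does not always display as expected, so you may want to render the filter output into a CGImage first using the CIContext already declared in the view controller. A minimal sketch of a hypothetical helper, assuming the same "CIColorControls" filter:

```swift
import ARKit
import CoreImage

// Hypothetical helper: converts an ARFrame's captured image into a grayscale CGImage
// that SceneKit accepts as background contents. Reuses a shared CIContext, since
// creating one per frame would be wasteful.
func grayscaleBackground(from frame: ARFrame, using context: CIContext) -> CGImage? {
    let cameraImage = CIImage(cvPixelBuffer: frame.capturedImage)
    guard let filter = CIFilter(name: "CIColorControls") else { return nil }
    filter.setValue(cameraImage, forKey: kCIInputImageKey)
    filter.setValue(0.0, forKey: kCIInputSaturationKey)   // saturation 0 => grayscale
    guard let output = filter.outputImage else { return nil }
    // Render through the CIContext to obtain a concrete CGImage.
    return context.createCGImage(output, from: output.extent)
}
```

You would call this from renderer(_:updateAtTime:) and assign the result to sceneView.scene.background.contents on the main queue, exactly as in the code above.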
Approach 2: Use a multi-pass SCNTechnique. If you prefer to use Metal shaders, you must build an SCNTechnique that separates your rendering into at least two passes: one that runs the camera background through your grayscale shader, and one that draws the scene’s 3D nodes on top.
Your technique’s plist would then have multiple passes and use different targets (e.g. one for “background” and another for “scene”) so that your custom shader is applied only to the background.
A simplified excerpt of what your technique dictionary might contain is as follows (note that you must name your inputs/outputs appropriately):
```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <!-- The order here matters: first filter, then composite -->
    <key>sequence</key>
    <array>
        <string>filter_background</string>
        <string>composite</string>
    </array>
    <key>passes</key>
    <dict>
        <!-- Pass 1: Filter the background (camera feed) -->
        <key>filter_background</key>
        <dict>
            <key>metalVertexShader</key>
            <string>mapTexture</string>
            <key>metalFragmentShader</key>
            <string>displayTexture</string>
            <key>draw</key>
            <string>DRAW_QUAD</string>
            <key>inputs</key>
            <dict>
                <!-- The "COLOR" input here refers to the camera image -->
                <key>background</key>
                <string>COLOR</string>
            </dict>
            <key>outputs</key>
            <dict>
                <key>filtered</key>
                <string>COLOR</string>
            </dict>
        </dict>
        <!-- Pass 2: Composite the nodes over the filtered background -->
        <key>composite</key>
        <dict>
            <key>draw</key>
            <string>DRAW_SCENE</string>
            <key>inputs</key>
            <dict>
                <!-- Supply the filtered background as an input -->
                <key>filteredBackground</key>
                <string>filtered</string>
            </dict>
        </dict>
    </dict>
</dict>
</plist>
```
With such a technique, you first render the camera feed through your Metal shader (which converts it to grayscale) into an intermediate target (here named "filtered"). Then in the second pass you composite your regular scene render (which draws the color nodes) on top of that background.
Keep in mind that you’ll need to update your Metal shader if necessary so that it only processes the background content. Also, when you assign the technique via
```swift
let filterTechnique = makeTechnique(fromPlistNamed: "SceneFilterTechnique")
sceneView.technique = filterTechnique
```
ensure that your technique dictionary keys and inputs match what ARSCNView provides. (The built-in "COLOR" input represents the entire rendered scene, so splitting that up is essential.)
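If the screen still comes out white, one low-tech way to narrow the problem down (a suggestion of mine, not an Apple-documented workflow) is to dump the plist before handing it to SCNTechnique and confirm that the sequence, pass names, inputs, and outputs are the ones your shaders and targets actually use. A rough sketch of such a hypothetical debug helper:

```swift
import Foundation
import SceneKit

// Hypothetical debug helper: prints the technique's sequence and each pass's
// inputs/outputs, and reports whether SCNTechnique accepts the dictionary at all.
func dumpTechnique(named plistName: String) {
    guard let url = Bundle.main.url(forResource: plistName, withExtension: "plist"),
          let dictionary = NSDictionary(contentsOf: url) as? [String: Any] else {
        print("Could not load \(plistName).plist from the main bundle")
        return
    }
    if let sequence = dictionary["sequence"] as? [String] {
        print("sequence:", sequence)
    }
    if let passes = dictionary["passes"] as? [String: [String: Any]] {
        for (name, pass) in passes {
            print("pass '\(name)': inputs = \(String(describing: pass["inputs"])), outputs = \(String(describing: pass["outputs"]))")
        }
    }
    // SCNTechnique(dictionary:) returns nil for a malformed dictionary,
    // which is one common cause of a broken render.
    print("SCNTechnique created:", SCNTechnique(dictionary: dictionary) != nil)
}
```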
Both approaches are viable:
• Using CIFilter is typically simpler because you can work directly with the ARFrame’s camera image and leave the SceneKit rendering untouched.
• A multi‑pass Metal/SCNTechnique solution offers more control (and potential performance benefits) but requires setting up multiple passes to isolate the background from the node rendering.
Choose the approach that best fits your project requirements. If you choose the SCNTechnique route, double‑check the pass ordering, target names, and that your technique’s inputs correctly refer to the background texture (distinct from the 3D node draw). This will help avoid rendering a blank (white) screen.
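It can also help to confirm that the technique itself (rather than the session or the node setup) is what blanks the screen. Setting sceneView.technique back to nil restores SceneKit's default rendering, so a quick toggle like the hypothetical one below, added to the view controller from your question, makes the comparison easy:

```swift
// Hypothetical debug toggle: switch between the custom grayscale technique and
// SceneKit's default rendering of the camera feed plus nodes.
func setGrayscaleTechniqueEnabled(_ enabled: Bool) {
    if enabled {
        sceneView.technique = makeTechnique(fromPlistNamed: "SceneFilterTechnique")
    } else {
        // nil removes all custom passes and restores the normal composite.
        sceneView.technique = nil
    }
}
```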
Hopefully this gives you a clear path forward. Feel free to ask for further clarification if needed!
No comments yet.
Answer by NovaEngineer592 • 1 month ago
I'll try to give an answer here. Displaying the AR image in grayscale is more complicated than one might think. After several attempts, I came up with the following solution; I hope it helps you achieve what you want.

I started from the out-of-the-box ARKit/SceneKit/Swift project template (the one with the 3D spaceship flying in front of the color AR feed), removed the ship, and here is how you can implement the solution:
ViewController.swift
```swift
import UIKit
import SceneKit
import ARKit
import MetalKit

struct Uniforms {
    var scaleX: Float
    var scaleY: Float
}

class ViewController: UIViewController, ARSCNViewDelegate {

    @IBOutlet var sceneView: ARSCNView!

    var metalView: MTKView!
    var commandQueue: MTLCommandQueue!
    var device: MTLDevice!
    var grayscalePipelineState: MTLRenderPipelineState!
    var uniforms = Uniforms(scaleX: 1.0, scaleY: 1.0) // to pass to the shader

    private var textureCache: CVMetalTextureCache!
    private var grayscaleTexture: MTLTexture?
    private var transformBuffer: MTLBuffer?

    override func viewDidLoad() {
        super.viewDidLoad()

        // ARKit Setup
        let configuration = ARWorldTrackingConfiguration()
        sceneView.session.run(configuration)
        sceneView.delegate = self
        sceneView.scene = SCNScene()

        // Turn off ARKit's color background feed
        sceneView.scene.background.contents = nil // UIColor.clear
        sceneView.backgroundColor = UIColor.clear
        sceneView.automaticallyUpdatesLighting = false
        sceneView.layer.isOpaque = false

        // Metal Device / Command Queue
        device = MTLCreateSystemDefaultDevice()
        commandQueue = device.makeCommandQueue()

        // MTKView
        metalView = MTKView(frame: view.bounds, device: device)
        metalView.framebufferOnly = false
        metalView.delegate = self
        metalView.isOpaque = false

        // Put it behind the SceneView so the 3D geometry is on top
        view.insertSubview(metalView, belowSubview: sceneView)

        // Build pipeline
        self.setupMetalPipeline()

        // Create a single CVMetalTextureCache
        CVMetalTextureCacheCreate(kCFAllocatorDefault, nil, device, nil, &textureCache)

        createOffscreenTexture()

        // add your gesture for placing nodes
        let tapGesture = UITapGestureRecognizer(target: self, action: #selector(handleTap(_:)))
        sceneView.addGestureRecognizer(tapGesture)
    }

    override func viewWillAppear(_ animated: Bool) {
        super.viewWillAppear(animated)
    }

    override func viewWillDisappear(_ animated: Bool) {
        super.viewWillDisappear(animated)
        // Pause the view's session
        sceneView.session.pause()
    }

    func setupMetalPipeline() {
        guard let library = device.makeDefaultLibrary() else { return }
        let desc = MTLRenderPipelineDescriptor()
        desc.vertexFunction = library.makeFunction(name: "aspectFitVertex")
        desc.fragmentFunction = library.makeFunction(name: "grayscaleFragment")
        desc.colorAttachments[0].pixelFormat = metalView.colorPixelFormat
        do {
            grayscalePipelineState = try device.makeRenderPipelineState(descriptor: desc)
        } catch {
            fatalError("Error creating pipeline state: \(error)")
        }
    }

    @objc func handleTap(_ gesture: UITapGestureRecognizer) {
        let location = gesture.location(in: sceneView)

        // Create a raycast query
        guard let query = sceneView.raycastQuery(from: location,
                                                 allowing: .estimatedPlane,
                                                 alignment: .any) else { return }

        // Perform the raycast
        let results = sceneView.session.raycast(query)
        if let result = results.first {
            let sphere = SCNSphere(radius: 0.05)
            sphere.firstMaterial?.diffuse.contents = UIColor.blue
            let node = SCNNode(geometry: sphere)
            node.position = SCNVector3(result.worldTransform.columns.3.x,
                                       result.worldTransform.columns.3.y,
                                       result.worldTransform.columns.3.z)
            sceneView.scene.rootNode.addChildNode(node)
        }
    }

    // Converts ARKit's Y-plane to an MTLTexture with .r8Unorm format
    func makeYTexture(from pixelBuffer: CVPixelBuffer) -> MTLTexture? {
        let width = CVPixelBufferGetWidthOfPlane(pixelBuffer, 0)
        let height = CVPixelBufferGetHeightOfPlane(pixelBuffer, 0)

        var cvMetalTexture: CVMetalTexture?
        let status = CVMetalTextureCacheCreateTextureFromImage(
            kCFAllocatorDefault,
            textureCache,
            pixelBuffer,
            nil,
            .r8Unorm,
            width,
            height,
            0, // plane index = 0 => Luma
            &cvMetalTexture
        )
        guard status == kCVReturnSuccess, let cvMetalTexture = cvMetalTexture else {
            return nil
        }
        return CVMetalTextureGetTexture(cvMetalTexture)
    }

    // MARK: - ARSCNViewDelegate

    func session(_ session: ARSession, didFailWithError error: Error) {
        // Present an error message to the user
    }

    func sessionWasInterrupted(_ session: ARSession) {
        // Inform the user that the session has been interrupted, for example, by presenting an overlay
    }

    func sessionInterruptionEnded(_ session: ARSession) {
        // Reset tracking and/or remove existing anchors if consistent tracking is required
    }

    func createOffscreenTexture() {
        let descriptor = MTLTextureDescriptor.texture2DDescriptor(
            pixelFormat: .bgra8Unorm,
            width: Int(sceneView.bounds.width * UIScreen.main.scale),
            height: Int(sceneView.bounds.height * UIScreen.main.scale),
            mipmapped: false
        )
        descriptor.usage = [.renderTarget, .shaderRead]
        descriptor.storageMode = .private
        grayscaleTexture = device.makeTexture(descriptor: descriptor)
    }
}

extension ViewController: MTKViewDelegate {

    func mtkView(_ view: MTKView, drawableSizeWillChange size: CGSize) {
        // adjust any viewport...
    }

    func draw(in view: MTKView) {
        guard let currentFrame = sceneView.session.currentFrame else { return }

        let pixelBuffer = currentFrame.capturedImage
        guard let yTexture = makeYTexture(from: pixelBuffer),
              let grayscaleTexture = grayscaleTexture else { return }

        // Create a command buffer
        guard let commandBuffer = commandQueue.makeCommandBuffer() else { return }

        // Create a render pass descriptor for the offscreen texture
        let rpd = MTLRenderPassDescriptor()
        rpd.colorAttachments[0].texture = grayscaleTexture
        rpd.colorAttachments[0].loadAction = .clear
        rpd.colorAttachments[0].clearColor = MTLClearColorMake(0, 0, 0, 1)
        rpd.colorAttachments[0].storeAction = .store

        // Compute aspect-fill scaling
        // Because we rotate 90° in the fragment, swap W & H: <- IMPORTANT !!!
        let h = Float(CVPixelBufferGetWidthOfPlane(pixelBuffer, 0))
        let w = Float(CVPixelBufferGetHeightOfPlane(pixelBuffer, 0))
        let cameraAspect = w / h

        let screenW = Float(view.drawableSize.width)
        let screenH = Float(view.drawableSize.height)
        let screenAspect = screenW / screenH

        var scaleX: Float = 1.0
        var scaleY: Float = 1.0
        if screenAspect > cameraAspect {
            // Fit width, crop height
            scaleX = 1.0
            scaleY = screenAspect / cameraAspect
        } else {
            // Fit height, crop width
            scaleX = cameraAspect / screenAspect
            scaleY = 1.0
        }

        // Pass scaling factors to the vertex shader
        var uniforms = Uniforms(scaleX: scaleX, scaleY: scaleY)
        let uniformBuffer = device.makeBuffer(bytes: &uniforms,
                                              length: MemoryLayout<Uniforms>.stride,
                                              options: [])

        // Render the grayscale output
        let encoder = commandBuffer.makeRenderCommandEncoder(descriptor: rpd)!
        encoder.setRenderPipelineState(grayscalePipelineState)
        encoder.setFragmentTexture(yTexture, index: 0)
        encoder.setVertexBuffer(uniformBuffer, offset: 0, index: 1)
        encoder.drawPrimitives(type: .triangleStrip, vertexStart: 0, vertexCount: 4)
        encoder.endEncoding()

        // Commit the command buffer
        commandBuffer.commit()

        // Update the SceneKit background with the rendered texture
        DispatchQueue.main.async {
            self.sceneView.scene.background.contents = self.grayscaleTexture
        }
    }
}
```
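One detail the delegate stub above leaves open (this is my addition, not part of the original code): if the drawable size can change, for example on device rotation, the offscreen texture should be recreated so its pixel dimensions keep matching the view. A minimal sketch for that stub:

```swift
// Hypothetical implementation of the empty delegate stub above: recreate the
// offscreen grayscale texture whenever the drawable size changes.
func mtkView(_ view: MTKView, drawableSizeWillChange size: CGSize) {
    createOffscreenTexture()
}
```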
Then create a shaders.metal file and paste the following content:
shaders.metal
```metal
#include <metal_stdlib>
using namespace metal;

struct QuadVertexIn {
    float2 uv [[ attribute(0) ]];
};

struct VertexOut {
    float4 position [[position]];
    float2 texCoord;
};

// Uniforms structure for scaling
struct Uniforms {
    float scaleX;
    float scaleY;
};

vertex VertexOut aspectFitVertex(uint vid [[vertex_id]],
                                 constant Uniforms &u [[buffer(1)]]) {
    // A standard full-screen quad in clip space:
    float4 positions[4] = {
        float4(-1,  1, 0, 1),
        float4(-1, -1, 0, 1),
        float4( 1,  1, 0, 1),
        float4( 1, -1, 0, 1)
    };

    // Texcoords typically go top-left => (0,0), bottom-right => (1,1)
    // Depending on orientation, you might flip Y or apply rotation
    float2 texCoords[4] = {
        float2(0, 0),
        float2(0, 1),
        float2(1, 0),
        float2(1, 1)
    };

    // Apply letterbox scale
    positions[vid].x *= u.scaleX;
    positions[vid].y *= u.scaleY;

    VertexOut out;
    out.position = positions[vid];
    out.texCoord = texCoords[vid];
    return out;
}

fragment float4 grayscaleFragment(VertexOut in [[stage_in]],
                                  texture2d<float> yTex [[texture(0)]]) {
    constexpr sampler s(address::clamp_to_edge, // We need that line twice (very recommended)
                        address::clamp_to_edge, // We need that line twice (very recommended)
                        mag_filter::linear,
                        min_filter::linear);

    // Rotate the texture coordinates 90° CW if necessary
    // (that's because of how the camera feed is handled on devices)
    float2 rotated = float2(in.texCoord.y, 1.0 - in.texCoord.x);

    // Sample the luminance (Y-plane)
    float luma = yTex.sample(s, rotated).r;

    // OPTIONAL: Darken the image
    float darkeningFactor = 0.7; // Adjust this for brightness (0.5..0.8)
    luma *= darkeningFactor;

    // OPTIONAL: Apply contrast adjustment
    float contrast = 1.3;  // Play with contrast
    float midpoint = 0.5;  // Pivot for adjustment
    luma = (luma - midpoint) * contrast + midpoint;

    // OPTIONAL: Clamp to valid range (0, 1)
    luma = clamp(luma, 0.0, 1.0);

    return float4(luma, luma, luma, 1.0);
}
```
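One coupling to keep an eye on (my note, not from the original answer): the Swift Uniforms struct that is copied into the buffer at index 1 must have the same memory layout as this Metal Uniforms struct, i.e. two 32-bit floats. A tiny assertion you could drop into setupMetalPipeline() as a guard:

```swift
// Hypothetical layout check: the Swift Uniforms struct (two Floats) should
// occupy exactly the 8 bytes the Metal Uniforms struct at buffer(1) expects.
assert(MemoryLayout<Uniforms>.stride == MemoryLayout<Float>.stride * 2,
       "Swift Uniforms no longer matches the Metal Uniforms layout")
```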
Example screenshot: (image not included in this text version)
No comments yet.