Raymarched Volumetric Lighting in Unity URP (Part 1)

There are a lot of resources on raymarched volumetric lighting on the internet, but in this article I am going to show you, step by step, everything you need to know about my own implementation and the modifications I made over those resources. This includes not just the basic concept, but also how to make it performant with good visual quality, using downsampling, randomized sampling, bilateral blur, and near-depth upsampling. In a later part, we will learn how to modify this algorithm so it supports tinting when light passes through stained glass.

Shortcomings of this implementation:

  • Only supports a single directional light, no other lights.
  • No local density volumes or density textures.

These shortcomings are not inherent to the implementation and can be addressed at a later stage. I just haven’t needed to yet, but if I do, I will write about it :)

If you want to see the full code, you can just scroll to the end. It is not the exact code I am running, because I cleaned it up a bit for this tutorial/breakdown, so there may be some easy-to-fix compile errors.

This effect is a screen space post-process, which means it takes place after everything else is rendered, and we work with limited information on a single quad that takes up the whole screen. To implement this in URP (I live on the edge and at the time of writing I am using the latest beta Unity 2021.2b) we need to use a:

Scriptable Renderer Feature

The basic concept is that we need to pass a render texture containing everything we have rendered until this point (we are going to do this just before regular post-processing) to our shader and then back to the screen, effectively injecting our code in the middle.

To do this, we need to write a ScriptableRendererFeature:

Most of this code can be reused for all your renderer features. A ScriptableRendererFeature overrides Create() which will allow us to create our feature in the Renderer Settings. Here we set the name of our feature and the settings that will show up in the inspector.

It also overrides AddRenderPasses, which lets us inject our own code at certain points in the pipeline, passing our own ScriptableRenderPass. Keep in mind that the AddRenderPasses (and Setup because of that), Configure, and Execute methods are called every frame, even in the scene view.

In Configure() we get a temporary render texture, which will be the target of our shader. We are using an R16 render texture, which only has a red channel. This is a simplification I can make because I only support a single light: I don’t need its color until the very end, in the compositing pass, so until then we can work in black and white.

In Execute() we queue our own commands in the CommandBuffer which will run our code.

cmd.Blit(source, target) takes the source render texture, in this case containing the image the camera just rendered, and copies it into the target render texture. If the target has a different resolution, it handles the downsampling for you.

cmd.Blit(source, target, material, passIndex) does the same but it runs the downsampled source render texture through a material of your choosing, using the pass you want. This means that we can write a single shader that can perform raymarching, blur, upsample, and compositing all in a single file but divided into different stages and we select which one we want to perform.

Make sure to ALWAYS run CommandBufferPool.Release(…) or you will have a huge memory leak that will crash not just Unity but your whole computer. I am putting all the code that can potentially fail while we are writing this feature inside a try-catch block. This way, if something throws, the code in the catch block runs and the method continues properly instead of stopping halfway.

We also included a RenderPassEvent in our settings; make sure to set it to Before Post Processing.
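To tie the description above together, here is a minimal sketch of how such a feature could be laid out. It is not the exact code from this article: the class name VolumetricLightFeatureSketch, the texture name "_VolumetricTempSketch" and the choice to read the camera color target inside Execute are my own assumptions, and it targets the URP version that ships with Unity 2021.2 (newer URP versions changed parts of this API). It only runs inside a Unity project with URP installed.

```csharp
using UnityEngine;
using UnityEngine.Rendering;
using UnityEngine.Rendering.Universal;

// Hypothetical name; a sketch of the structure described above.
public class VolumetricLightFeatureSketch : ScriptableRendererFeature
{
    [System.Serializable]
    public class Settings
    {
        public Material material;
        public RenderPassEvent renderPassEvent = RenderPassEvent.BeforeRenderingPostProcessing;
    }

    public Settings settings = new Settings();

    class Pass : ScriptableRenderPass
    {
        readonly Settings settings;
        RenderTargetHandle tempTexture;

        public Pass(Settings settings)
        {
            this.settings = settings;
            tempTexture.Init("_VolumetricTempSketch"); // assumed name
        }

        public override void Configure(CommandBuffer cmd, RenderTextureDescriptor cameraTextureDescriptor)
        {
            // Single-channel target: we only need a light mask until compositing.
            cameraTextureDescriptor.msaaSamples = 1;
            cameraTextureDescriptor.depthBufferBits = 0;
            cameraTextureDescriptor.colorFormat = RenderTextureFormat.R16;
            cmd.GetTemporaryRT(tempTexture.id, cameraTextureDescriptor);
            ConfigureTarget(tempTexture.Identifier());
        }

        public override void Execute(ScriptableRenderContext context, ref RenderingData renderingData)
        {
            CommandBuffer cmd = CommandBufferPool.Get("Volumetric Light (sketch)");
            try
            {
                RenderTargetIdentifier source = renderingData.cameraData.renderer.cameraColorTarget;
                // Pass 0 of the material is assumed to be the raymarching pass.
                cmd.Blit(source, tempTexture.Identifier(), settings.material, 0);
                cmd.Blit(tempTexture.Identifier(), source);
                context.ExecuteCommandBuffer(cmd);
            }
            catch (System.Exception e)
            {
                Debug.LogException(e);
            }
            finally
            {
                // Runs whether or not something threw, so the buffer is always returned.
                CommandBufferPool.Release(cmd);
            }
        }

        public override void OnCameraCleanup(CommandBuffer cmd)
        {
            cmd.ReleaseTemporaryRT(tempTexture.id);
        }
    }

    Pass pass;

    public override void Create()
    {
        pass = new Pass(settings) { renderPassEvent = settings.renderPassEvent };
    }

    public override void AddRenderPasses(ScriptableRenderer renderer, ref RenderingData renderingData)
    {
        if (settings.material == null) return; // nothing to draw with
        renderer.EnqueuePass(pass);
    }
}
```

I used a finally block here rather than a bare try-catch so the Release call is reached on every path; that is a design choice of this sketch.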

The code we just wrote will only handle the raymarching pass, but we will add more to it; you can scroll down to the end for the complete version. Also, we are passing our screen into the _MainTex of our raymarching pass, which isn’t necessary yet, but it will be when we change this shader to support stained glass.

Raymarching Pass

The concept

The algorithm we are going to use works this way:

  • For every pixel in the screen, we cast a ray in the direction of the camera view, effectively towards higher depth.
  • This ray is divided into a limited number of steps, each one at a different distance from the camera.
  • We ask Unity if that point in space is in light or shadow of the main light. This is possible because we can sample the Shadow Map (we need to use #pragma multi_compile _ _MAIN_LIGHT_SHADOWS_CASCADE to be able to sample the shadow map).
  • If a step is lit, we calculate light scattering and add its contribution.
  • We calculate the average light scattering in that pixel by dividing the sum of the contributions by the number of steps.

The light scattering we are interested in occurs when light bounces off dust particles and aerosols in the atmosphere. The physics of this phenomenon is well known, and there are equations that model various aspects of it to different degrees. I have chosen to use a simplified Mie scattering because I am not looking for a realistic effect but a pleasant and performant one.
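As an illustration of what a "simplified Mie scattering" term can look like, here is a small CPU-side C# sketch of the Henyey-Greenstein phase function, which is a common stand-in for Mie scattering in volumetric lighting. Treating it as the function used here is an assumption on my part, and ScatteringSketch is a hypothetical name; in the shader the same arithmetic would be written in HLSL.

```csharp
using System;

public static class ScatteringSketch
{
    // Henyey-Greenstein phase function (assumed stand-in for "simplified Mie").
    // cosTheta: cosine of the angle between the view ray and the light direction.
    // g: anisotropy in (-1, 1); 0 scatters equally in every direction,
    //    positive values favor scattering along the light direction.
    public static float HenyeyGreenstein(float cosTheta, float g)
    {
        float g2 = g * g;
        float denom = 1f + g2 - 2f * g * cosTheta;
        return (1f - g2) / (4f * MathF.PI * MathF.Pow(denom, 1.5f));
    }
}
```

With g set to 0 the function returns the same value for every angle, so the effect looks the same from any viewing direction; raising g makes the rays brighter when looking toward the light.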

The requisites

Being a screen space effect, we are limited by the amount of information we have, but we have just enough. We have access to the depth information using _CameraDepthTexture, which we must enable in the pipeline settings.

The following include already declares this texture in our code and gives us a couple of functions.

#include "Packages/com.unity.render-pipelines.universal/ShaderLibrary/DeclareDepthTexture.hlsl"

We can reconstruct world position from the depth texture, which you can explore in-depth (pun intended) in this Cyan article. Luckily, Unity already provides us with a function to do just that.

ComputeWorldSpacePosition(uv, depth, UNITY_MATRIX_I_VP);
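To make that function less of a black box, here is a CPU-side C# sketch of the idea behind it: take the pixel's normalized device coordinates plus its depth, multiply by the inverse view-projection matrix, and divide by w. It uses System.Numerics rather than Unity's shader library, and the names DepthReconstructionSketch, Unproject and Project are hypothetical. Unity's real function also deals with platform details such as flipped UVs and reversed depth, which this sketch ignores.

```csharp
using System.Numerics;

public static class DepthReconstructionSketch
{
    // ndc: x and y in [-1, 1], z is the device depth of the pixel.
    // Returns the world position that projects to that point.
    public static Vector3 Unproject(Vector3 ndc, Matrix4x4 inverseViewProjection)
    {
        Vector4 world = Vector4.Transform(new Vector4(ndc, 1f), inverseViewProjection);
        // The perspective divide undoes the projection's w scaling.
        return new Vector3(world.X, world.Y, world.Z) / world.W;
    }

    // Forward projection, included only so the round trip can be checked.
    public static Vector3 Project(Vector3 worldPos, Matrix4x4 viewProjection)
    {
        Vector4 clip = Vector4.Transform(new Vector4(worldPos, 1f), viewProjection);
        return new Vector3(clip.X, clip.Y, clip.Z) / clip.W;
    }
}
```

Projecting a point and unprojecting the result should give back the original position, up to floating point error.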

The last external piece of information we need is the main light direction. For that, we add a script to the light that sets a global variable we can read from every shader.

using System.Collections;
using System.Collections.Generic;
using UnityEngine;

[ExecuteAlways]
public class MoonDirection : MonoBehaviour
{
    void Update()
    {
        // Publish this object's forward vector so every shader can read it.
        Shader.SetGlobalVector("_SunDirection", transform.forward);
    }
}

Note the [ExecuteAlways], which is necessary if you want this effect to update in the scene view.

You will also need the "_SunMoonColor", which you can set up as a global variable just like "_SunDirection", using Shader.SetGlobalVector("_SunMoonColor", /*your color*/);

With that, we have all we need to start writing our shader, which will be written in code because even if we used Shader Graph or Amplify Shader Editor we would need to write plenty of custom code.

The code

A couple of notes before we start if you have written shader code in BIRP (Built-In Render Pipeline) or followed old tutorials:

  • We are using HLSLPROGRAM instead of CGPROGRAM, you can still use CGPROGRAM in URP, but Unity has moved into HLSL for URP and chances are that you want to include Unity functions in your code. Including both UnityCG and URP libraries will result in redefinition issues. Don’t worry, because there are replacements for every UnityCG function.
  • The included Unity files define ‘real’ as float or half depending on the platform, and that’s what I use.

Our actual code starts at line 104, in our fragment pass. First, we calculate the world space position of the pixel using GetWorldPos(i.uv), which is defined in line 64. This function samples the depth texture and then uses a Unity function to reconstruct the world position. If you want to see how it looks, you can return its value right after calculating it.

real3 worldPos = GetWorldPos(i.uv);
return frac(worldPos);

Colors in shaders go from 0 to 1. Anything above that will only contribute to HDR bloom but won’t look any different on screen. If we want to inspect higher values we can use frac, which returns the fractional part of a number.

The fragment pass we’ve written outputs a single float (real), so if you want to properly see what you are doing, you can temporarily change it to a real3. If you still see a single color, use an RGB format for the render texture in the scriptable renderer feature.

From lines 110 to 128 we calculate the ray direction, its length, and the length of each step. The ray starts at the camera, whose position we can get from the global variable _WorldSpaceCameraPos. The end position is the world space position we just calculated. We then clamp the maximum ray length, because it affects our precision: with a fixed number of steps, the longer the ray, the further apart the steps are. It is better not to reinvent the wheel and to use the built-in functions, so I should probably use the following code instead.

rayLength = min(rayLength, _MaxDistance);
worldPos = startPosition + rayDirection * rayLength;

The real deal of the shader, the for loop, starts in line 132. Loops aren’t great in shaders as they benefit poorly from the parallel power of a GPU, but for a raymarching approach, it is the only way.

In line 134 we sample the light value in that position, which is essentially a mask for light and shadow. If the value is greater than zero we calculate light scattering, sum its value and advance the loop. We then average the result and return its value.
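The whole loop can be sketched on the CPU like this. It is plain C# with System.Numerics, not the shader itself: isLit is a hypothetical delegate standing in for the shadow map lookup, scatteringPerLitStep stands in for the scattering term, and RaymarchSketch is a made-up name. The point is only to show the order of operations: build the ray, clamp its length, step along it, and average.

```csharp
using System;
using System.Numerics;

public static class RaymarchSketch
{
    // Averages a per-step contribution along the ray from the camera
    // to the reconstructed world position of the pixel.
    public static float March(Vector3 cameraPos, Vector3 worldPos, int steps,
                              float maxDistance, float scatteringPerLitStep,
                              Func<Vector3, bool> isLit)
    {
        Vector3 ray = worldPos - cameraPos;
        float rayLength = ray.Length();
        if (rayLength <= 0f || steps <= 0) return 0f;
        Vector3 rayDirection = ray / rayLength;

        // Clamp the ray so distant pixels do not spread the steps too thin.
        rayLength = MathF.Min(rayLength, maxDistance);
        float stepLength = rayLength / steps;
        Vector3 step = rayDirection * stepLength;

        Vector3 currentPosition = cameraPos;
        float accumulated = 0f;
        for (int i = 0; i < steps; i++)
        {
            // Only lit points contribute; shadowed points add nothing.
            if (isLit(currentPosition)) accumulated += scatteringPerLitStep;
            currentPosition += step;
        }
        return accumulated / steps;
    }
}
```

For example, a ray of 10 steps where only the far half of the sampled points is lit returns half of the per-step scattering value.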

If we return to Unity we will see a monochrome screen with our volumetric light. If you are checking the FPS meter you can see this effect is costing a very sizeable amount of performance and the result isn’t that great, because there is a very noticeable banding issue. If we reduce the number of steps we can clearly see the shadow silhouette. If we use more steps, it will be less noticeable but it will consume much more.

There is clear banding and it is not even a single ray

Banding occurs because we are sampling at discrete points, and for a given portion of the screen, every step of each ray is mostly at a similar depth, effectively reproducing the shadow map pixel by pixel. If we somehow randomize the sampling points, the pattern will not be recognizable. This technique is inspired by Interleaved Sampling, used in Lords of the Fallen, but with huge simplification and, in my opinion, improvement.

If we randomize the starting point of each ray, instead of just starting in the camera position, we are also randomizing each subsequent step position. To get a random number we can either sample a blue noise texture or use the famous:

frac(sin(dot(p, float2(41, 289)))*45758.5453 );

, which gives us a pseudo-random value from 0 to 1 depending on an input variable p. If we add it to the starting position, randomizing in the direction of the ray, and add a new variable to control its influence, we can get rid of any patterns, albeit very noisy, but that will be solved when we blur everything out.

real rayStartOffset = random01(i.uv) * stepLength * _JitterVolumetric / 100;
real3 currentPosition = startPosition + rayStartOffset * rayDirection;

Noisy, but looking good
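If you want to check the jitter math outside Unity, here is a CPU-side C# port of the same two lines. JitterSketch, Frac, Random01 and JitteredStart are hypothetical names, and floating point results on the CPU will not match the GPU bit for bit, so treat it as a sketch of the arithmetic only.

```csharp
using System;
using System.Numerics;

public static class JitterSketch
{
    // HLSL frac(x) is x - floor(x), which keeps the result in the 0..1 range
    // even for negative inputs (unlike a plain cast-and-subtract).
    static float Frac(float x) => x - MathF.Floor(x);

    // CPU port of frac(sin(dot(p, float2(41, 289))) * 45758.5453).
    public static float Random01(Vector2 p)
    {
        float d = Vector2.Dot(p, new Vector2(41f, 289f));
        return Frac(MathF.Sin(d) * 45758.5453f);
    }

    // Pushes the ray start along the ray by a pseudo-random fraction of a step.
    // jitter is a percentage: 100 means "up to one full step".
    public static Vector3 JitteredStart(Vector3 startPosition, Vector3 rayDirection,
                                        Vector2 uv, float stepLength, float jitter)
    {
        float rayStartOffset = Random01(uv) * stepLength * jitter / 100f;
        return startPosition + rayStartOffset * rayDirection;
    }
}
```

Because the offset is applied along the ray direction, the start never leaves the ray; with a jitter of 250 the start can move up to two and a half steps forward.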

If you want to easily play around with the values, like the max distance and step amount, you can add new variables to the ScriptableRendererFeature settings.

public float intensity = 1;
public float scattering = 0;
public float steps = 25;
public float maxDistance = 75;
public float jitter = 250;

And send them to our shader just before the blit.

settings.material.SetFloat("_Scattering", settings.scattering);
settings.material.SetFloat("_Steps", settings.steps);
settings.material.SetFloat("_JitterVolumetric", settings.jitter);
settings.material.SetFloat("_MaxDistance", settings.maxDistance);
settings.material.SetFloat("_Intensity", settings.intensity);

Now you can play with them!

If you ignore the noise, you will notice that we can get away with fewer steps if we increase the jitter but, even then, it is pretty performance heavy. To get a good result, we are using between 25 and 50 steps for every single pixel of the screen. But what if we don’t do that and work on a lower resolution version of the screen instead?

Downsample

Blit can downsample for us, but chances are that you want to downsample to half or quarter resolution depending on your quality settings, and you may also want to experiment on the fly and check how it looks with or without downsampling without recompiling, so we are gonna add a new variable to our settings.

public enum DownSample { off = 1, half = 2, third = 3, quarter = 4 };
public DownSample downsampling;

Notice that I’ve overwritten the values of the enum, otherwise, it would start at zero, and you will see immediately why we don’t want that. In the Configure method of our ScriptableRenderPass, where we get a temporary render texture, we can further customize its properties.

cameraTextureDescriptor.width /= (int)settings.downsampling;
cameraTextureDescriptor.height /= (int)settings.downsampling;

Now we have a dropdown to select downsample.
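The reason for numbering the enum from 1 is easier to see in isolation. The following standalone C# sketch repeats the enum from above and wraps the division in a hypothetical helper, DownsampleSketch.Scaled; the clamp to at least one pixel is my own addition and is not part of the original code.

```csharp
using System;

public enum DownSample { off = 1, half = 2, third = 3, quarter = 4 }

public static class DownsampleSketch
{
    // Divides the camera resolution by the enum's numeric value.
    // If the enum started at 0, "off" would divide by zero here.
    public static (int width, int height) Scaled(int width, int height, DownSample downsampling)
    {
        int divider = (int)downsampling;
        // Guard (an addition of this sketch): never return a zero-sized texture.
        return (Math.Max(1, width / divider), Math.Max(1, height / divider));
    }
}
```

For a 1920x1080 camera, half gives 960x540 and third gives 640x360, so each step cuts the number of raymarched pixels by the square of the divider.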

If at any time from now on you notice that the downscaling works differently in the scene view, with even lower resolution, you will have to use the following code. I don’t know if it is a bug or version dependent, or the expected behavior, but just in case.

int divider = (int)settings.downsampling;
if (Camera.current != null)
{
    // Scene view: size the texture from the scene camera's pixel rect.
    cameraTextureDescriptor.width = (int)Camera.current.pixelRect.width / divider;
    cameraTextureDescriptor.height = (int)Camera.current.pixelRect.height / divider;
}
else
{
    cameraTextureDescriptor.width /= divider;
    cameraTextureDescriptor.height /= divider;
}

In URP, Camera.current only works in the scene view.

Now it works everywhere but, of course, it has less quality. When we eventually composite this render texture with the camera view, it will be obvious the rays were rendered in another resolution. The answer is a bilateral blur (which will get rid of the noise too) and depth-aware upsample.

Bilateral Blur

Blur works by averaging the color value of the surrounding pixels of a given pixel. But here is the problem, shaders are run in parallel, and the information a single pixel has of the rest is very limited. How do we access the surrounding color values of our raymarch? The answer is writing the raymarch pass into a render texture and passing it into another pass, this way we are just sampling a texture and we can offset our UVs.

A bilateral blur works like a regular Gaussian blur, but it takes depth into account so that it does not blur across hard edges in the depth texture. Like a Gaussian, it gives more weight to pixels that are very close and gradually reduces the weight of pixels further away.

Gaussian Blur
Bilateral Blur, probably not the best implementation
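To show how the two weights combine, here is a CPU-side C# sketch of one horizontal pass over a single row of pixels. It is not the shader from this article: BilateralBlurSketch, sigma and depthFalloff are names I made up, and the Gaussian weights are computed on the fly instead of being read from a table.

```csharp
using System;

public static class BilateralBlurSketch
{
    // values: raymarch results for one row; depths: linear depth of the same pixels.
    // samples: how many neighbors to read on each side of the center pixel.
    public static float[] BlurRow(float[] values, float[] depths, int samples,
                                  float sigma, float depthFalloff)
    {
        var result = new float[values.Length];
        for (int x = 0; x < values.Length; x++)
        {
            float sum = 0f, weightSum = 0f;
            for (int offset = -samples; offset <= samples; offset++)
            {
                // Clamp at the borders, like a clamped texture sampler would.
                int sx = Math.Clamp(x + offset, 0, values.Length - 1);

                // Spatial weight: a regular Gaussian on the pixel distance.
                float spatial = MathF.Exp(-(offset * offset) / (2f * sigma * sigma));

                // Depth weight: drops toward zero across a depth discontinuity.
                float depthDiff = MathF.Abs(depths[sx] - depths[x]) * depthFalloff;
                float range = MathF.Exp(-depthDiff * depthDiff);

                float weight = spatial * range;
                sum += values[sx] * weight;
                weightSum += weight;
            }
            // The center sample always has weight 1, so weightSum is never zero.
            result[x] = sum / weightSum;
        }
        return result;
    }
}
```

Running it on a row with a sharp depth jump leaves the values on both sides of the jump untouched, while the same row with a flat depth gets smeared like a normal Gaussian blur.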

Our blur is gonna be composed of two different passes: in the first one we blur along the X-axis and write to a texture, and in the second one we blur that texture along the Y-axis. This implementation is almost 1:1 to the one it is based on.

The comments in the code are self-explanatory. You will notice we are sampling _MainTex, which takes the value of whatever source we blit into this shader, in our case, our volumetric pass. This blur pass only accounts for the X-axis, to blur the Y-axis you can copy and paste this whole pass and change the UV offset from

real2 uv= i.uv+real2(  index*_GaussAmount/1000,0);

to

real2 uv= i.uv+real2( 0, index*_GaussAmount/1000);

Or you can pass the axis through a variable and run the blit using the same Pass but changing the axis. I’ve opted for the former.

In our ScriptableRenderPass our Blits should be looking like this:

cmd.Blit(source, tempTexture.Identifier(), settings.material, 0);
cmd.Blit(tempTexture.Identifier(), tempTexture2.Identifier(), settings.material, 1);
cmd.Blit(tempTexture2.Identifier(), source, settings.material, 2);

As you can see, we change the index in each Blit because we are using a different pass. There is also a new render texture, we need two to keep passing info from one pass to the next. Add a

RenderTargetHandle tempTexture2;

to the ScriptableRenderPass variables and the following lines in the Configure method right next to the other render texture.

tempTexture2.id = 1;
cmd.GetTemporaryRT(tempTexture2.id, cameraTextureDescriptor);
ConfigureTarget(tempTexture2.Identifier());

Both render textures use the same descriptor; we only have to make sure to change the id of the second one (zero by default).

We now need to pass our blur settings (samples and intensity) to the shader. We can add the following variables to the settings:

[System.Serializable]
public class GaussBlur
{
    public float amount;
    public float samples;
}
public GaussBlur gaussBlur = new GaussBlur();

I’ve chosen to create a small class so it can be expanded and minimized in the editor without writing a custom inspector.

settings.material.SetFloat("_GaussSamples", settings.gaussBlur.samples);
settings.material.SetFloat("_GaussAmount", settings.gaussBlur.amount);

And we pass them to our shader in the Execute method. If we return to Unity and play around with our values we will now have something close to this.

Upsampling and Compositing

We just need to composite! Almost! If we use the downsampled render texture as it is, and composite it into our scene, the pixels will be noticeable, even after blurring, because we blurred at the downsampled resolution. The trick is to use near-depth upsampling, which uses depth information to select the closest-in-depth downsampled pixel adjacent to a full resolution one, leaving our hard edges cleaner, which is where the downsampling is most noticeable.

To use this technique we need both a full resolution depth texture, which we already have as _CameraDepthTexture, and a downsampled one. Because the compositing pass works at full res, we will need to pass the downsampled version ourselves. There are probably multiple, more elegant methods than mine, but I chose to add yet another pass (fourth in my case, compositing is the third, the order doesn’t matter).

The code is nothing special, it literally just returns depth, but because we are going to use a low res target in the blit, the downsampling is automatic.

We don’t need a new render texture for this, as we can reuse one of the previous ones:

//raymarch
cmd.Blit(source, tempTexture.Identifier(), settings.material, 0);
//bilateral blur X
cmd.Blit(tempTexture.Identifier(), tempTexture2.Identifier(), settings.material, 1);
//bilateral blur Y
cmd.Blit(tempTexture2.Identifier(), tempTexture.Identifier(), settings.material, 2);
//downsample depth
cmd.Blit(source, tempTexture2.Identifier(), settings.material, 4);
cmd.SetGlobalTexture("_LowResDepth", tempTexture2.Identifier());
cmd.SetGlobalTexture("_volumetricTexture", tempTexture.Identifier());

Notice that we are writing both the downsampled depth texture and our blurred raymarch to textures because our future compositing blit needs the camera render as its source, so the other inputs need to be passed as separate textures.

Let’s add the final blit even though we still haven’t written the compositing pass.

//upsample and composite
cmd.Blit(source, tempTexture3.Identifier(), settings.material, 3);
cmd.Blit(tempTexture3.Identifier(), source);

Add yet another render texture to our custom pass, adding:

RenderTargetHandle tempTexture3;

This time around the render texture will not be like the others, as it will need all channels and full resolution, leaving our Configure method like this:

public override void Configure(CommandBuffer cmd, RenderTextureDescriptor cameraTextureDescriptor)
{
    // Keep an untouched copy for the full resolution compositing target.
    var original = cameraTextureDescriptor;
    int divider = (int)settings.downsampling;
    if (Camera.current != null)
    {
        // Scene view: size everything from the scene camera's pixel rect.
        cameraTextureDescriptor.width = (int)Camera.current.pixelRect.width / divider;
        cameraTextureDescriptor.height = (int)Camera.current.pixelRect.height / divider;
        original.width = (int)Camera.current.pixelRect.width;
        original.height = (int)Camera.current.pixelRect.height;
    }
    else
    {
        cameraTextureDescriptor.width /= divider;
        cameraTextureDescriptor.height /= divider;
    }

    // Low resolution, single channel targets for raymarch, blur and depth.
    cameraTextureDescriptor.msaaSamples = 1;
    cameraTextureDescriptor.colorFormat = RenderTextureFormat.R16;
    tempTexture2.id = 1;
    tempTexture3.id = 2;
    cmd.GetTemporaryRT(tempTexture.id, cameraTextureDescriptor);
    ConfigureTarget(tempTexture.Identifier());
    cmd.GetTemporaryRT(tempTexture2.id, cameraTextureDescriptor);
    ConfigureTarget(tempTexture2.Identifier());

    // Full resolution, all channels, used by the compositing pass.
    cmd.GetTemporaryRT(tempTexture3.id, original);
    ConfigureTarget(tempTexture3.Identifier());
    ConfigureClear(ClearFlag.All, Color.black);
}

Let’s write the compositing pass:

Branching in shaders is not great, especially in this case, where each pixel uses a different branch, but the results speak for themselves.
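The selection logic behind that branching can be sketched on the CPU in a few lines of C#. This is an illustration under my own assumptions rather than the shader from this article: NearestDepthUpsampleSketch is a hypothetical name, and the caller is expected to pass the depths and blurred volumetric values of the low resolution pixels that surround the full resolution pixel being shaded.

```csharp
using System;

public static class NearestDepthUpsampleSketch
{
    // fullResDepth: depth of the full resolution pixel being shaded.
    // lowResDepths / lowResValues: depth and volumetric value of each
    // neighboring low resolution pixel, in matching order.
    public static float Upsample(float fullResDepth, float[] lowResDepths, float[] lowResValues)
    {
        if (lowResDepths.Length == 0 || lowResDepths.Length != lowResValues.Length)
            throw new ArgumentException("Need one value per depth and at least one neighbor.");

        int best = 0;
        float bestDiff = MathF.Abs(fullResDepth - lowResDepths[0]);
        for (int i = 1; i < lowResDepths.Length; i++)
        {
            float diff = MathF.Abs(fullResDepth - lowResDepths[i]);
            if (diff < bestDiff)
            {
                bestDiff = diff;
                best = i;
            }
        }
        // Use the neighbor that most likely belongs to the same surface.
        return lowResValues[best];
    }
}
```

A pixel at depth 10 with neighbors at depths 2, 9.5, 30 and 11 takes the value of the second neighbor, because 9.5 is the closest depth.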

If you examine hard edges and try regular upsampling the difference is obvious.

We are done!

Final Code

The final ScriptableRendererFeature should be looking similar to this:

I have added a switch in the Execute method so I can choose to run specific stages for debugging purposes.

And our final shader should look similar to this.

Project example

You can find a project with slightly updated code .

Next Article

Other Possible Upgrades

  • Epipolar Sampling, for performance
  • Density volumes, for local fog
  • Point lights
  • Spotlights

Conclusion

If you want more shader and game dev content be sure to follow me on Twitter:

You can also check my itch.io page for finished projects.

Resources

This implementation is loosely based on a combination of different techniques and concepts taken from a variety of resources:

Valerio Marty

Game Design and Development student. Game Designer/Tech Artist. @ValerioMarty on Twitter