Animation Compression

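Before We Begin

While porting UE 5.5's character-related editor features into a plugin, I need to port the animation system's compressor, so this post analyzes the compression pipeline. Unreal itself compresses animation data mainly through the ACL plugin, and I expect that applying only the decoding path will be enough to make the data usable.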

Key Sampling

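Key sampling is the step that samples an animation at a fixed time interval to generate keys.
The sample rate sets how many frames per second are sampled, and for cross-platform games
the Frame Stripping feature lets you apply a different effective sample rate per platform.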
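As a rough illustration of fixed-rate sampling, the sketch below evaluates a scalar curve at a given sample rate. It is standalone C++ rather than engine code; `Key`, `SampleCurve`, and the scalar curve are hypothetical names chosen for this example, and the sample rate is assumed to be positive.

```cpp
#include <cmath>
#include <functional>
#include <vector>

// Hypothetical scalar key used only for this illustration.
struct Key
{
    float Time;
    float Value;
};

// Samples a continuous curve at a fixed rate (keys per second).
// Assumes SampleRate > 0 and Duration >= 0.
std::vector<Key> SampleCurve(const std::function<float(float)>& Curve,
                             float Duration, float SampleRate)
{
    const int NumIntervals = static_cast<int>(std::lround(Duration * SampleRate));
    std::vector<Key> Keys;
    Keys.reserve(static_cast<std::size_t>(NumIntervals) + 1);
    for (int Index = 0; Index <= NumIntervals; ++Index)
    {
        const float Time = static_cast<float>(Index) / SampleRate;
        Keys.push_back({Time, Curve(Time)});
    }
    return Keys;
}
```

Sampling the same one-second curve at 30 and at 15 keys per second yields 31 and 16 keys, which is the kind of per-platform difference a lower effective sample rate produces.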

Key Reduction

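This is a preprocessing step that selects which keys to keep before the animation key data is compressed.

Each algorithm is provided as a codec. You can pick the codec that best fits the situation, or run compression with every candidate and apply whichever gives the best result.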

https://dev.epicgames.com/documentation/en-us/unreal-engine/animation-compression-codec-reference-in-unreal-engine

Remove Constant Key

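Sampling stores each bone's transform at fixed time steps, so it keeps generating keys with the same value even for bones that never move.
That data is redundant, so by default the Key Reduction stage removes the keys of constant tracks that show no change.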
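A minimal sketch of the idea on a scalar track, assuming a simple absolute tolerance. `IsConstantTrack` and `RemoveConstantKeys` are hypothetical helpers for illustration, not the engine's functions.

```cpp
#include <cmath>
#include <vector>

// True when every value stays within Tolerance of the first value.
bool IsConstantTrack(const std::vector<float>& Values, float Tolerance)
{
    for (float Value : Values)
    {
        if (std::fabs(Value - Values.front()) > Tolerance)
        {
            return false;
        }
    }
    return true;
}

// Collapses a constant track to a single key; other tracks are left untouched.
void RemoveConstantKeys(std::vector<float>& Values, float Tolerance)
{
    if (Values.size() > 1 && IsConstantTrack(Values, Tolerance))
    {
        Values.resize(1);
    }
}
```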

Least Destructive

This option keeps every key on the track and uses the raw data as-is.

Remove Every Second Key

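Removes every second key (the 2N-th keys), which has the same effect as halving the sample rate of the key sampling stage.
It should be fast, but the larger the change between adjacent keys, the more easily the motion is distorted, so use it with care.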
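The sketch below shows the operation on a scalar track, assuming keys at even indices are kept. `RemoveEverySecondKey` here is a hypothetical standalone function, not the engine codec.

```cpp
#include <cstddef>
#include <vector>

// Keeps the keys at even indices (0, 2, 4, ...) and drops the rest.
std::vector<float> RemoveEverySecondKey(const std::vector<float>& Values)
{
    std::vector<float> Kept;
    Kept.reserve((Values.size() + 1) / 2);
    for (std::size_t Index = 0; Index < Values.size(); Index += 2)
    {
        Kept.push_back(Values[Index]);
    }
    return Kept;
}
```

With a rapidly alternating track such as {0, 10, 0, 10, 0}, only the zeros survive, so the peaks disappear entirely. That is the distortion to watch for when adjacent keys differ a lot.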

Remove Trivial Key

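A threshold is chosen, and a key is removed when the difference between two adjacent keys is smaller than that threshold.
If the threshold is very small, almost no keys are removed; if it is large, too many keys are removed and the compression introduces large errors, so choose it carefully.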
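To see how sensitive the result is to the threshold, here is a small sketch that follows the description above rather than the engine source. It keeps a key only when it differs from the last kept key by at least the threshold; `RemoveTrivialKeys` is a hypothetical name for this illustration.

```cpp
#include <cmath>
#include <vector>

// Keeps the first key, then only keys that differ from the last kept key
// by at least Threshold.
std::vector<float> RemoveTrivialKeys(const std::vector<float>& Values, float Threshold)
{
    std::vector<float> Kept;
    for (float Value : Values)
    {
        if (Kept.empty() || std::fabs(Value - Kept.back()) >= Threshold)
        {
            Kept.push_back(Value);
        }
    }
    return Kept;
}
```

On the track {0, 0.001, 0.002, 1.0, 1.0005}, a threshold of 0.01 keeps two keys, a threshold of 0 keeps all five, and a threshold of 10 keeps only the first, which loses the motion completely.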

Remove Linear Key

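Removes keys that can be reproduced, within tolerance, by linearly interpolating the surrounding keys.

The codec exposes properties for the maximum allowed position, angle, and scale error, which control how aggressively linear keys are removed.

The process is implemented by the following algorithm.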

// UE5.5 AnimCompress_RemoveLinearKeys.cpp

template <typename AdapterType>
void FilterLinearKeysTemplate(
	TArray<typename AdapterType::KeyType>& Keys,
	TArray<float>& Times,
	TArray<FTransform>& BoneAtoms,
	const TArray<float>* ParentTimes,
	const TArray<FTransform>& RawWorldBones,
	const TArray<FTransform>& NewWorldBones,
	const TArray<int32>& TargetBoneIndices,
	int32 NumFrames,
	int32 BoneIndex,
	int32 ParentBoneIndex,
	float ParentScale,
	float MaxDelta,
	float MaxTargetDelta,
	float EffectorDiffSocket,
	const TArray<FBoneData>& BoneData
	)
{
	typedef typename AdapterType::KeyType KeyType;

	const int32 KeyCount = Keys.Num();
	check( Keys.Num() == Times.Num() );
	check( KeyCount >= 1 );
	
	// generate new arrays we will fill with the final keys
	TArray<KeyType> NewKeys;
	TArray<float> NewTimes;
	NewKeys.Reset(KeyCount);
	NewTimes.Reset(KeyCount);

	// Only bother doing anything if we have some keys!
	if(KeyCount > 0)
	{
		int32 LowKey = 0;
		int32 HighKey = KeyCount-1;

		TArray<bool> KnownParentTimes;
		KnownParentTimes.SetNumUninitialized(KeyCount);
		const int32 ParentKeyCount = ParentTimes ? ParentTimes->Num() : 0;
		for (int32 TimeIndex = 0, ParentTimeIndex = 0; TimeIndex < KeyCount; TimeIndex++)
		{
			while ((ParentTimeIndex < ParentKeyCount) && (Times[TimeIndex] > (*ParentTimes)[ParentTimeIndex]))
			{
				ParentTimeIndex++;
			}

			KnownParentTimes[TimeIndex] = (ParentTimeIndex < ParentKeyCount) && (Times[TimeIndex] == (*ParentTimes)[ParentTimeIndex]);
		}

		TArray<FTransform> CachedInvRawBases;
		CachedInvRawBases.SetNumUninitialized(KeyCount);
		for (int32 FrameIndex = 0; FrameIndex < KeyCount; ++FrameIndex)
		{
			const FTransform& RawBase = RawWorldBones[(BoneIndex*NumFrames) + FrameIndex];
			CachedInvRawBases[FrameIndex] = RawBase.Inverse();
		}
		
		// copy the low key (this one is a given)
		NewTimes.Add(Times[0]);
		NewKeys.Add(Keys[0]);

		const FTransform EndEffectorDummyBoneSocket(FQuat::Identity, FVector(END_EFFECTOR_DUMMY_BONE_LENGTH_SOCKET));
		const FTransform EndEffectorDummyBone(FQuat::Identity, FVector(END_EFFECTOR_DUMMY_BONE_LENGTH));

		const float DeltaThreshold = (BoneData[BoneIndex].IsEndEffector() && (BoneData[BoneIndex].bHasSocket || BoneData[BoneIndex].bKeyEndEffector)) ? EffectorDiffSocket : MaxTargetDelta;

		// We will test within a sliding window between LowKey and HighKey.
		// Therefore, we are done when the LowKey exceeds the range
		while (LowKey + 1 < KeyCount)
		{
			int32 GoodHighKey = LowKey + 1;
			int32 BadHighKey = KeyCount;
			
			// bisect until we find the lowest acceptable high key
			while (BadHighKey - GoodHighKey >= 2)
			{
				HighKey = GoodHighKey + (BadHighKey - GoodHighKey) / 2;

				// get the parameters of the window we are testing
				const float LowTime = Times[LowKey];
				const float HighTime = Times[HighKey];
				const KeyType& LowValue = Keys[LowKey];
				const KeyType& HighValue = Keys[HighKey];
				const float Range = HighTime - LowTime;
				const float InvRange = 1.0f/Range;

				// iterate through all interpolated members of the window to
				// compute the error when compared to the original raw values
				float MaxLerpError = 0.0f;
				float MaxTargetError = 0.0f;
				for (int32 TestKey = LowKey+1; TestKey< HighKey; ++TestKey)
				{
					// get the parameters of the member being tested
					float TestTime = Times[TestKey];
					const KeyType& TestValue = Keys[TestKey];

					// compute the proposed, interpolated value for the key
					const float Alpha = (TestTime - LowTime) * InvRange;
					const KeyType LerpValue = AnimationCompressionUtils::Interpolate(LowValue, HighValue, Alpha);

					// compute the error between our interpolated value and the desired value
					float LerpError = CalcDelta(TestValue, LerpValue);

					// if the local-space lerp error is within our tolerances, we will also check the
					// effect this interpolated key will have on our target end effectors
					float TargetError = -1.0f;
					if (LerpError <= MaxDelta)
					{
						// get the raw world transform for this bone (the original world-space position)
						const int32 FrameIndex = TestKey;
						const FTransform& InvRawBase = CachedInvRawBases[FrameIndex];
						
						// generate the proposed local bone atom and transform (local space)
						FTransform ProposedTM = AdapterType::UpdateBoneAtom(BoneAtoms[FrameIndex], LerpValue);

						// convert the proposed local transform to world space using this bone's parent transform
						const FTransform& CurrentParent = ParentBoneIndex != INDEX_NONE ? NewWorldBones[(ParentBoneIndex*NumFrames) + FrameIndex] : FTransform::Identity;
						FTransform ProposedBase = ProposedTM * CurrentParent;
						
						// for each target end effector, compute the error we would introduce with our proposed key
						for (int32 TargetIndex=0; TargetIndex<TargetBoneIndices.Num(); ++TargetIndex)
						{
							// find the offset transform from the raw base to the end effector
							const int32 TargetBoneIndex = TargetBoneIndices[TargetIndex];
							FTransform RawTarget = RawWorldBones[(TargetBoneIndex*NumFrames) + FrameIndex];
							const FTransform RelTM = RawTarget * InvRawBase;

							// forecast where the new end effector would be using our proposed key
							FTransform ProposedTarget = RelTM * ProposedBase;

							// If this is an EndEffector, add a dummy bone to measure the effect of compressing the rotation.
							// Sockets and Key EndEffectors have a longer dummy bone to maintain higher precision.
							if (BoneData[TargetIndex].bHasSocket || BoneData[TargetIndex].bKeyEndEffector)
							{
								ProposedTarget = EndEffectorDummyBoneSocket * ProposedTarget;
								RawTarget = EndEffectorDummyBoneSocket * RawTarget;
							}
							else
							{
								ProposedTarget = EndEffectorDummyBone * ProposedTarget;
								RawTarget = EndEffectorDummyBone * RawTarget;
							}

							// determine the extend of error at the target end effector
							const float ThisError = (ProposedTarget.GetTranslation() - RawTarget.GetTranslation()).Size();
							TargetError = FMath::Max(TargetError, ThisError); 

							// exit early when we encounter a large delta
							const float TargetDeltaThreshold = BoneData[TargetIndex].bHasSocket ? EffectorDiffSocket : DeltaThreshold;
							if( TargetError > TargetDeltaThreshold )
							{ 
								break;
							}
						}
					}

					// If the parent has a key at this time, we'll scale our error values as requested.
					// This increases the odds that we will choose keys on the same frames as our parent bone,
					// making the skeleton more uniform in key distribution.
					if (ParentTimes)
					{
						if (KnownParentTimes[TestKey])
						{
							// our parent has a key at this time, 
							// inflate our perceived error to increase our sensitivity
							// for also retaining a key at this time
							LerpError *= ParentScale;
							TargetError *= ParentScale;
						}
					}
					
					// keep track of the worst errors encountered for both 
					// the local-space 'lerp' error and the end effector drift we will cause
					MaxLerpError = FMath::Max(MaxLerpError, LerpError);
					MaxTargetError = FMath::Max(MaxTargetError, TargetError);

					// exit early if we have failed in this span
					if (MaxLerpError > MaxDelta ||
						MaxTargetError > DeltaThreshold)
					{
						break;
					}
				}

				// determine if the span succeeded. That is, the worst errors found are within tolerances
				if ((MaxLerpError <= MaxDelta) && (MaxTargetError <= DeltaThreshold))
				{
					GoodHighKey = HighKey;
				}
				else
				{
					BadHighKey = HighKey;
				}
			}

			NewTimes.Add(Times[GoodHighKey]);
			NewKeys.Add(Keys[GoodHighKey]);

			LowKey = GoodHighKey;
		}

		// return the new key set to the caller
		Times= NewTimes;
		Keys= NewKeys;
	}
}
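The bisection at the core of the function above is easier to follow on a scalar track. The sketch below is a simplified, standalone version under stated assumptions: it keeps only the local-space lerp check (`MaxDelta`) and omits the end-effector error, the parent-key scaling, and the adapter types. `ScalarKey`, `SpanIsLinear`, and `RemoveLinearKeys` are hypothetical names, not engine API.

```cpp
#include <cmath>
#include <vector>

struct ScalarKey
{
    float Time;
    float Value;
};

// True when every key strictly between Low and High is reproduced within
// MaxDelta by linearly interpolating the two end keys.
bool SpanIsLinear(const std::vector<ScalarKey>& Keys, int Low, int High, float MaxDelta)
{
    const float InvRange = 1.0f / (Keys[High].Time - Keys[Low].Time);
    for (int Test = Low + 1; Test < High; ++Test)
    {
        const float Alpha = (Keys[Test].Time - Keys[Low].Time) * InvRange;
        const float Lerp = Keys[Low].Value + (Keys[High].Value - Keys[Low].Value) * Alpha;
        if (std::fabs(Lerp - Keys[Test].Value) > MaxDelta)
        {
            return false;
        }
    }
    return true;
}

// Same window/bisection structure as the listing above, reduced to scalars.
std::vector<ScalarKey> RemoveLinearKeys(const std::vector<ScalarKey>& Keys, float MaxDelta)
{
    std::vector<ScalarKey> NewKeys;
    const int KeyCount = static_cast<int>(Keys.size());
    if (KeyCount == 0)
    {
        return NewKeys;
    }
    NewKeys.push_back(Keys[0]); // the first key is always kept

    int LowKey = 0;
    while (LowKey + 1 < KeyCount)
    {
        int GoodHighKey = LowKey + 1; // adjacent key: nothing in between to break
        int BadHighKey = KeyCount;    // one past the end: treated as failing
        while (BadHighKey - GoodHighKey >= 2)
        {
            const int HighKey = GoodHighKey + (BadHighKey - GoodHighKey) / 2;
            if (SpanIsLinear(Keys, LowKey, HighKey, MaxDelta))
            {
                GoodHighKey = HighKey;
            }
            else
            {
                BadHighKey = HighKey;
            }
        }
        NewKeys.push_back(Keys[GoodHighKey]);
        LowKey = GoodHighKey;
    }
    return NewKeys;
}
```

For a tent-shaped track with values 0, 1, 2, 3, 2, 1, 0 at times 0 through 6, this keeps only the keys at times 0, 3, and 6. Note that bisection treats "the span is acceptable" as if it were monotonic in the span length, which is a heuristic: it finds a far acceptable key quickly, not necessarily the farthest one.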

Data Compression

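Unreal Engine supports several compression codecs. Rather than configuring a single codec and compressing every animation with it, you register the codecs you want to consider. When compression runs, the engine tries every registered candidate and automatically selects the format with the best trade-off between a high compression ratio and low error; this is the Animation Per Track Compression feature. The drawback is that trying every codec makes compression take a very long time.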

Bitwise Compression

Bitwise Compression offers five formats.

Rotation keys are quaternions with four float channels (XYZW), while Translation and Scale keys have three float channels (XYZ), so a format suited to each key type has to be chosen. Rotation keys support every format, but Translation and Scale keys support only three: uncompressed, IntervalFixed32, and Float96.

https://dev.epicgames.com/documentation/en-us/unreal-engine/animation-compression-codec-reference-in-unreal-engine#bitwisecompressionformatreference

Float96

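Translation and Scale keys are stored at full precision; Rotation keys (quaternions) are stored at full precision except for W.
Only the XYZ of the normalized quaternion is encoded, and W is recovered at decode time as $W = \sqrt{1 - X^2 - Y^2 - Z^2}$, which makes this a nearly lossless format. Taking the non-negative root is valid because q and -q represent the same rotation, so the encoder can always flip the sign to make W non-negative.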
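A minimal sketch of dropping and recovering W, assuming the encoder normalizes the quaternion and flips its sign so that W is non-negative. `Quat`, `EncodeNoW`, and `DecodeNoW` are illustrative names, not the engine's types.

```cpp
#include <array>
#include <cmath>

struct Quat
{
    float X, Y, Z, W;
};

// Normalizes Q, flips it so W >= 0 (q and -q are the same rotation),
// and keeps only X, Y, Z. Assumes Q is not the zero quaternion.
std::array<float, 3> EncodeNoW(Quat Q)
{
    const float Length = std::sqrt(Q.X * Q.X + Q.Y * Q.Y + Q.Z * Q.Z + Q.W * Q.W);
    const float Sign = (Q.W < 0.0f) ? -1.0f : 1.0f;
    const float Scale = Sign / Length;
    return {Q.X * Scale, Q.Y * Scale, Q.Z * Scale};
}

// Rebuilds W from the unit-length constraint; clamps at zero to absorb
// small negative values caused by rounding.
Quat DecodeNoW(const std::array<float, 3>& Packed)
{
    const float WSquared =
        1.0f - Packed[0] * Packed[0] - Packed[1] * Packed[1] - Packed[2] * Packed[2];
    const float W = (WSquared > 0.0f) ? std::sqrt(WSquared) : 0.0f;
    return {Packed[0], Packed[1], Packed[2], W};
}
```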

Fixed48

Each of the three XYZ channels is allocated 16 bits; the quaternion's W is dropped at encode time and restored at decode time.
The XYZ values must lie between -1 and 1, so this format is used for storing quaternions.
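The following sketch shows one way to map a channel in [-1, 1] onto 16 bits and back. The exact scale and offset the engine uses may differ; `QuantizeSNorm16` and `DequantizeSNorm16` are hypothetical helpers for this illustration.

```cpp
#include <cmath>
#include <cstdint>

// Maps a value in [-1, 1] to an unsigned 16-bit code (out-of-range input is clamped).
std::uint16_t QuantizeSNorm16(float Value)
{
    const float Clamped = std::fmax(-1.0f, std::fmin(1.0f, Value));
    return static_cast<std::uint16_t>(std::lround((Clamped * 0.5f + 0.5f) * 65535.0f));
}

// Maps a 16-bit code back to [-1, 1].
float DequantizeSNorm16(std::uint16_t Code)
{
    return (static_cast<float>(Code) / 65535.0f) * 2.0f - 1.0f;
}
```

With 65536 levels across a range of width 2, the worst-case round-trip error is about 1/65535, roughly 1.5e-5 per channel.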

Fixed32

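Works the same way as Fixed48 but uses fewer bits, so precision drops considerably. To fit the smaller size, the channels are encoded with 11/11/10 bits, again with the quaternion's W dropped.
It is an option to consider when precision is not critical, for example in animations with large changes between keys.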
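As an illustration of packing three channels into 32 bits with an 11/11/10 split, here is a standalone sketch. The bit order (X in the top 11 bits, Z in the low 10) and the helper names are assumptions made for this example, not the engine's exact layout.

```cpp
#include <cmath>
#include <cstdint>

struct Vec3
{
    float X, Y, Z;
};

// Maps [-1, 1] to an unsigned code of the given width (input is clamped).
std::uint32_t QuantizeUnit(float Value, int Bits)
{
    const float MaxCode = static_cast<float>((1u << Bits) - 1u);
    const float Clamped = std::fmax(-1.0f, std::fmin(1.0f, Value));
    return static_cast<std::uint32_t>(std::lround((Clamped * 0.5f + 0.5f) * MaxCode));
}

float DequantizeUnit(std::uint32_t Code, int Bits)
{
    const float MaxCode = static_cast<float>((1u << Bits) - 1u);
    return (static_cast<float>(Code) / MaxCode) * 2.0f - 1.0f;
}

// X: bits 31..21 (11 bits), Y: bits 20..10 (11 bits), Z: bits 9..0 (10 bits).
std::uint32_t PackXYZ_11_11_10(const Vec3& V)
{
    return (QuantizeUnit(V.X, 11) << 21) | (QuantizeUnit(V.Y, 11) << 10) | QuantizeUnit(V.Z, 10);
}

Vec3 UnpackXYZ_11_11_10(std::uint32_t Packed)
{
    return {DequantizeUnit(Packed >> 21, 11),
            DequantizeUnit((Packed >> 10) & 0x7FFu, 11),
            DequantizeUnit(Packed & 0x3FFu, 10)};
}
```

The 11-bit channels have a worst-case error of about 1/2047 and the 10-bit channel about 1/1023, which is why this format is much coarser than the 16-bit-per-channel variant.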

IntervalFixed32

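Unlike Fixed32, which can only hold values between -1 and 1, IntervalFixed32 adds a Range value to the header, so values larger than 1 can be compressed as well. This makes it usable for Translation and Scale keys, which are encoded with 10/11/11 bits per channel, while quaternions are encoded with 11/11/10 bits per channel. When the Range is smaller than 1, it also gives higher precision than Fixed32 for the same number of bits.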
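A sketch of interval-based quantization for a single channel, assuming the header stores the channel's minimum and range. `Interval`, `ComputeInterval`, `QuantizeInInterval`, and `DequantizeInInterval` are hypothetical names used only here.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Per-channel header data: the minimum value and the extent of the values.
struct Interval
{
    float Min;
    float Range;
};

// Assumes Values is not empty.
Interval ComputeInterval(const std::vector<float>& Values)
{
    const auto MinMax = std::minmax_element(Values.begin(), Values.end());
    return {*MinMax.first, *MinMax.second - *MinMax.first};
}

// Quantizes Value relative to the interval instead of a fixed [-1, 1] range.
std::uint32_t QuantizeInInterval(float Value, const Interval& I, int Bits)
{
    const float MaxCode = static_cast<float>((1u << Bits) - 1u);
    const float Normalized = (I.Range > 0.0f) ? (Value - I.Min) / I.Range : 0.0f;
    const float Clamped = std::fmax(0.0f, std::fmin(1.0f, Normalized));
    return static_cast<std::uint32_t>(std::lround(Clamped * MaxCode));
}

float DequantizeInInterval(std::uint32_t Code, const Interval& I, int Bits)
{
    const float MaxCode = static_cast<float>((1u << Bits) - 1u);
    return I.Min + (static_cast<float>(Code) / MaxCode) * I.Range;
}
```

The step size scales with the stored range: translations spanning 100 units at 11 bits have a step of about 0.05 units, while values confined to a range of 0.1 have a step of about 0.00005, far finer than a fixed [-1, 1] mapping at the same bit count.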

Float32

Unlike the Fixed formats, this stores the data as 10/11-bit floating-point values.

Animation Compression Library

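ACL combines techniques such as Range Reduction, Uniform Segmenting, Constant Tracks, and Quantization to exploit redundancy in the data as much as possible, and applies a bit rate optimized per track to improve compression efficiency. In particular, it emphasizes an accurate error metric that accounts for the hierarchical data structure and the visual mesh, and shows that this lets it maximize the compression ratio while preserving animation quality.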

https://github.com/nfrechette/acl

https://www.youtube.com/watch?v=85uOa2m_kBc&t=1607s

Key ACL Techniques

Constant Tracks

This step identifies the tracks that consist of a single, constant key.

Range Reduction

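This technique takes a track's actual range into account and divides only that range into intervals. For example, an elbow could theoretically rotate through 360 degrees, but in practice it never rotates more than 120 degrees, so the rotation range can be narrowed to 120 degrees. If the bone's rotation is stored in 4 bits (16 intervals), the precision is 22.5 degrees over 360 degrees but 7.5 degrees over 120 degrees, a threefold gain in precision.

Dividing the effective range into 16 intervals raises the precision to 7.5 degrees; in return, the minimum and maximum of that range have to be stored with the data.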
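The arithmetic in the elbow example can be checked with a one-line helper. `StepSizeDegrees` is a hypothetical name; it simply divides the range by the number of intervals, 2^Bits.

```cpp
// Size of one quantization interval when RangeDegrees is split into 2^Bits intervals.
constexpr float StepSizeDegrees(float RangeDegrees, int Bits)
{
    return RangeDegrees / static_cast<float>(1u << Bits);
}
```

`StepSizeDegrees(360.0f, 4)` gives 22.5 and `StepSizeDegrees(120.0f, 4)` gives 7.5, matching the threefold improvement described above.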

Uniform Segmenting

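This step splits an animation sequence into segments of a fixed size.
When one sequence is split into several segments, the range of bone motion within each segment is likely to be smaller than over the whole sequence, so the same number of bits can represent the values with higher precision than without segmenting.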
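To make the effect concrete, the sketch below splits a scalar track into fixed-size segments and computes each segment's range. It is an illustration only, not ACL's API; `Range` and `ComputeSegmentRanges` are hypothetical names, and `SegmentSize` is assumed to be greater than zero.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Range
{
    float Min;
    float Max;
};

// Splits Samples into consecutive segments of SegmentSize samples (the last
// one may be shorter) and returns the min/max of each. Assumes SegmentSize > 0.
std::vector<Range> ComputeSegmentRanges(const std::vector<float>& Samples, std::size_t SegmentSize)
{
    std::vector<Range> Ranges;
    for (std::size_t Start = 0; Start < Samples.size(); Start += SegmentSize)
    {
        const std::size_t End = std::min(Start + SegmentSize, Samples.size());
        const auto MinMax = std::minmax_element(Samples.begin() + Start, Samples.begin() + End);
        Ranges.push_back({*MinMax.first, *MinMax.second});
    }
    return Ranges;
}
```

For the samples {0, 1, 2, 3, 10, 11, 12, 13} with a segment size of 4, each segment spans only 3 units while the whole clip spans 13, so the same bit count covers a range more than four times narrower.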

Quantization

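This step quantizes the key values. If the precision falls below the error tolerance, the number of bits is increased, so the step amounts to finding the optimal bit count that quantizes the data within the tolerance.

ACL Implementation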

Compact Constant

Removes the redundant keys of Constant Tracks, that is, tracks made up of a single fixed key value.

Extract Clip Range

Extracts the Clip Range of the tracks that were not removed.

The Clip Range is extracted separately for Rotation, Translation, and Scale tracks.

The Clip Range is defined per track by its minimum and maximum values.

Normalize Clip

Using the Clip Range, every key value except those on Constant Tracks is normalized to that range.
After this step, every key value lies between 0 and 1.
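A minimal sketch of clip-range extraction and normalization on one scalar channel. `ClipRange`, `ExtractClipRange`, `NormalizeClip`, and `Denormalize` are illustrative names, not ACL's API, and the track is assumed to be non-empty.

```cpp
#include <algorithm>
#include <vector>

struct ClipRange
{
    float Min;
    float Extent; // Max - Min
};

// Assumes Samples is not empty.
ClipRange ExtractClipRange(const std::vector<float>& Samples)
{
    const auto MinMax = std::minmax_element(Samples.begin(), Samples.end());
    return {*MinMax.first, *MinMax.second - *MinMax.first};
}

// Rewrites every sample as a fraction of the clip range, so all values end up in [0, 1].
void NormalizeClip(std::vector<float>& Samples, const ClipRange& Clip)
{
    for (float& Sample : Samples)
    {
        Sample = (Clip.Extent > 0.0f) ? (Sample - Clip.Min) / Clip.Extent : 0.0f;
    }
}

// Inverse mapping used at decode time.
float Denormalize(float Normalized, const ClipRange& Clip)
{
    return Clip.Min + Normalized * Clip.Extent;
}
```

For samples {-30, 0, 90} the range is a minimum of -30 with an extent of 120, and the normalized samples become {0, 0.25, 1}.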

Split Clip into Segment

Splits the clip into segments that each hold a fixed number of sample keys.

This prepares the data for additional compression per segment later on.

Extract Segment Range

Extracts the range again, this time per segment.
Because all data on non-constant tracks was already normalized by the track's Clip Range, the segment range values always lie between 0 and 1.

Normalize Segment

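The keys in each segment are normalized again. Normalization happens twice because representing all the data over the full [0.0, 1.0] Clip Range with few bits runs into a precision limit, so the ranges are tabulated per segment, much like Texture Block Compression. Each segment's minimum and maximum become its own range, and the data in the segment is expressed relative to that range, which is why there are two normalization passes.

Because the whole clip goes through range reduction twice, the compression ratio is higher than in the general case, but for short clips the cost of storing the range information can cancel out the gain.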

Quantization

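Finally, this step finds the number of bits that quantizes the data within the error tolerance.
If quantizing with 2 bits exceeds the tolerance, the bit count is increased and the check is repeated.
Once the minimum bit count that satisfies the tolerance is found, the quantization step is done, and the track's key values are encoded with that bit count.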
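The search can be sketched as a simple loop over bit counts. This is a simplification under stated assumptions: the error here is measured per normalized sample, whereas ACL's actual error metric accounts for the bone hierarchy and the mesh, and the 2 to 16 bit search range is only a default chosen for this example. `QuantizeRoundTrip` and `FindLowestBitCount` are hypothetical names.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Quantizes a normalized value in [0, 1] to Bits bits and converts it back.
float QuantizeRoundTrip(float Normalized, int Bits)
{
    const float MaxCode = static_cast<float>((1u << Bits) - 1u);
    return std::round(Normalized * MaxCode) / MaxCode;
}

// Returns the smallest bit count in [MinBits, MaxBits] whose worst per-sample
// error stays within Tolerance, or -1 if none does.
int FindLowestBitCount(const std::vector<float>& NormalizedSamples, float Tolerance,
                       int MinBits = 2, int MaxBits = 16)
{
    for (int Bits = MinBits; Bits <= MaxBits; ++Bits)
    {
        float WorstError = 0.0f;
        for (float Sample : NormalizedSamples)
        {
            WorstError = std::max(WorstError, std::fabs(QuantizeRoundTrip(Sample, Bits) - Sample));
        }
        if (WorstError <= Tolerance)
        {
            return Bits;
        }
    }
    return -1;
}
```

For the normalized samples {0, 0.5, 1}, a tolerance of 0.2 is already met at 2 bits, while a tolerance of 0.01 requires 6 bits because the middle sample only lands close enough to a quantization level once there are 64 of them.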

License: Attribution, NonCommercial, NoDerivatives
